Candidate

Heng-Yuan Tung

Examination Date

11-2020

Degree

Dissertation

Degree Program

Biostatistics

Examination Committee

Donald Mercante; Lee McDaniel; Zhide Fang; Jian Li

Abstract

A polygenic risk score (PRS), which integrates multiple single nucleotide polymorphisms (SNPs), is a useful tool for disease prediction and risk classification. The conventional PRS, the sum of the individual effects of multiple SNPs, is useful but not sufficient: studies have shown that SNP-SNP interactions can improve prediction for certain complex diseases. The Additive-Additive 9 Interaction-model (AA9int) is a powerful method for measuring the SNP-SNP interactions associated with an outcome by assessing the non-hierarchical structure and various directions of the additive SNP inheritance mode. In SNP-SNP interaction analysis, the number of predictors increases dramatically because all pairwise interactions are considered, so an effective screening method is crucial for building a multivariable model for a PRS. In association studies with small sample sizes, SNP pairs with a complicated interaction pattern tend to be neglected. To address these issues, we proposed AA9Lasso, a new Lasso approach that performs multivariable-based variable selection. AA9Lasso is a modified version of the group Lasso with a new weighting of the Lasso penalty terms for the SNP pairs identified by AA9int. AA9Lasso puts a higher weighting on the SNP pairs with a complicated interaction pattern. In our simulations, AA9Lasso performed better than the conventional univariate selection approach and the standard group Lasso under most conditions. Further, we developed a cluster-based two-stage AA9Lasso method to handle highly correlated SNP pairs and reduce the computational burden. We then applied AA9Lasso to evaluate SNP-SNP interactions associated with prostate cancer aggressiveness in 1925 African American men. We tested 2415 SNPs selected from five prostate cancer related pathways: the PI3K-Akt, TGFβ, Wnt, NF-κB, and JAK-STAT pathways.
Using AA9Lasso or the uni-pair p-value method as the screening stage, followed by stepwise selection, we built two PRSs with 22 and 19 SNP pairs, respectively. The AA9Lasso PRS selected SNP pairs distinct from those selected by the uni-pair p-value approach. Our results demonstrated that the integrated PRS combining these two methods performed better than either approach alone.
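
As a rough illustration of the difference between a conventional PRS and one that includes SNP-SNP interaction terms, the sketch below computes both on toy data. It is not the AA9int or AA9Lasso procedure: the function names, the genotype matrix, the effect sizes, and the simple product coding of each pair are all assumptions made for illustration, and AA9int considers more interaction patterns than this single coding.

```python
import numpy as np

def additive_prs(genotypes, betas):
    """Conventional PRS: weighted sum of per-SNP allele counts."""
    return genotypes @ betas

def pairwise_prs(genotypes, pairs, pair_betas):
    """Interaction score: one product term per selected SNP pair.

    The product coding is a hypothetical stand-in for a single
    additive-by-additive interaction pattern.
    """
    score = np.zeros(genotypes.shape[0])
    for (i, j), beta in zip(pairs, pair_betas):
        score += beta * genotypes[:, i] * genotypes[:, j]
    return score

# Toy data (made up): 3 individuals, 4 SNPs coded as 0/1/2 allele counts.
G = np.array([[0, 1, 2, 0],
              [1, 1, 0, 2],
              [2, 0, 1, 1]], dtype=float)
betas = np.array([0.10, -0.05, 0.20, 0.00])   # hypothetical main effects
pairs = [(0, 2), (1, 3)]                      # hypothetical selected SNP pairs
pair_betas = np.array([0.30, -0.10])          # hypothetical interaction effects

main_score = additive_prs(G, betas)
interaction_score = pairwise_prs(G, pairs, pair_betas)
combined_score = main_score + interaction_score
```

In this toy example the second and third individuals change rank noticeably once the pair terms are added, which is the kind of information a main-effects-only score cannot capture.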

Comments

This dissertation is not held in the Libraries' print collection.
