Identification of Putative Early Atherosclerosis Biomarkers by Unsupervised Deconvolution of Heterogeneous Vascular Proteomes

Sarah J. Parker, Cedars Sinai Medical Center
Lulu Chen, Virginia Polytechnic Institute and State University
Weston Spivia, Cedars Sinai Medical Center
Georgia Saylor, Wake Forest University
Chunhong Mao, University of Virginia
Vidya Venkatraman, Cedars Sinai Medical Center
Ronald J. Holewinski, Cedars Sinai Medical Center
Mitra Mastali, Cedars Sinai Medical Center
Rakhi Pandey, Cedars Sinai Medical Center
Grace Athas, LSU Health Sciences Center - New Orleans
Guoqiang Yu, Virginia Polytechnic Institute and State University
Qin Fu, Cedars Sinai Medical Center
Dana Troxlair, LSU Health Sciences Center - New Orleans
Richard Vander Heide, LSU Health Sciences Center - New Orleans
David Herrington, Wake Forest University
Jennifer E. Van Eyk, Cedars Sinai Medical Center
Yue Wang, Virginia Polytechnic Institute and State University

Abstract

Coronary artery disease remains a leading cause of death in industrialized nations, and early detection of disease is critical for treating patients effectively and managing risk. Proteomic analysis of mixed tissue homogenates may obscure subtle protein changes that occur uniquely in underlying tissue subtypes. The unsupervised 'convex analysis of mixtures' (CAM) tool has previously been shown to effectively segregate cellular subtypes from mixed expression data. In this study, we hypothesized that CAM would identify proteomic information specifically informative of early atherosclerotic lesion involvement, which could yield potential markers for early disease detection. We quantified the proteome of 99 paired abdominal aorta (AA) and left anterior descending coronary artery (LAD) specimens (N = 198 specimens total) acquired during autopsy of young adults free of diagnosed cardiac disease. The CAM tool was then used to segregate protein subsets uniquely associated with different underlying tissue types, yielding markers of normal and fibrous plaque (FP) tissues in LAD and AA (N = 62 lesion markers). CAM-derived FP marker expression was validated against pathologist-estimated luminal surface involvement of FP, as well as in an orthogonal cohort of "pure" fibrous plaque, fatty streak, and normal vascular specimens. A targeted mass spectrometry (MS) assay quantified 39 of 62 CAM-FP markers in plasma from women with angiographically verified coronary artery disease (CAD, N = 46) or free from apparent CAD (control, N = 40). Elastic net variable selection with logistic regression reduced this list to 10 proteins capable of classifying CAD status in this cohort with <6% misclassification error and a mean area under the receiver operating characteristic curve of 0.992 (confidence interval 0.968-0.998) after cross-validation.
The proteomics-CAM workflow identified lesion-specific molecular biomarker candidates by distilling the most representative molecules from heterogeneous tissue types.
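
The final classification step (elastic net variable selection with logistic regression, scored by cross-validated area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' pipeline. It runs on synthetic data, and every name in it (`fit_elastic_net_logistic`, `cross_validated_auc`, `make_synthetic_cohort`), the penalty settings `lam` and `alpha`, and the use of proximal gradient descent as the solver are assumptions made for illustration. It also pools held-out scores across folds into a single AUC, which may differ from the mean AUC reported above.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5, step=0.5, n_iter=300):
    """Proximal gradient descent on
    mean log-loss + lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2).
    The intercept b is left unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss + ridge), then L1 shrinkage.
            z = w[j] - step * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(z, step * lam * alpha)
    return w, b


def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(X, y, k=5, **fit_kwargs):
    """Fit on k-1 folds, score the held-out fold, and pool held-out scores."""
    scores = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w, b = fit_elastic_net_logistic(
            [X[i] for i in train], [y[i] for i in train], **fit_kwargs
        )
        for i in test:
            scores[i] = b + sum(wj * xj for wj, xj in zip(w, X[i]))
    return roc_auc(scores, y)


def make_synthetic_cohort(n=80, p=6, seed=0):
    """Synthetic stand-in data: only the first two features carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if 2.0 * x[0] - 2.0 * x[1] + rng.gauss(0.0, 0.5) > 0 else 0 for x in X]
    return X, y


X, y = make_synthetic_cohort()
weights, intercept = fit_elastic_net_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected feature indices:", selected)
print("pooled cross-validated AUC:", round(cross_validated_auc(X, y), 3))
```

In this sketch, the L1 part of the penalty sets some coefficients exactly to zero, which is what makes elastic net act as a variable selector. Features whose coefficients stay nonzero play the role of the retained panel. Raising `lam` gives a sparser panel, and lowering `alpha` shifts the penalty toward ridge-style shrinkage. Real abundance data would typically be standardized per feature before fitting, a step skipped here because the synthetic features already have unit variance.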