Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores

Bjarni J. Vilhjálmsson*, Jian Yang, Hilary K. Finucane, Alexander Gusev, Sara Lindström, Stephan Ripke, Giulio Genovese, Po Ru Loh, Gaurav Bhatia, Ron Do, Tristan Hayeck, Hong Hee Won, Benjamin Neale, Aiden Corvin, James T.R. Walters, Kai How Farh, Peter A. Holmans, Phil Lee, Brendan Bulik-Sullivan, David A. Collier, Hailiang Huang, Tune H. Pers, Ingrid Agartz, Esben Agerbo, Margot Albus, Madeline Alexander, Farooq Amin, Silviu A. Bacanu, Martin Begemann, Richard A. Belliveau, Judit Bene, Sarah E. Bergen, Elizabeth Bevilacqua, Tim B. Bigdeli, Donald W. Black, Richard Bruggeman, Wiepke Cahn, Wei Cheng, Rene S. Kahn, Inez Myin-Germeys, Jim Van Os, Alan R. Sanders, Eric Strengman, Roel A. Ophoff, Rita K. Schmutzler, Rob B. Van Der Luijt

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

107 Citations (Scopus)

Abstract

Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
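
The contrast drawn in the abstract can be illustrated with a small sketch. It is not the LDpred software and not the authors' code: every function name, threshold, and number below is hypothetical, chosen only for illustration. The pruning step is written as a greedy, p-value-ordered filter, which is one common way to do it and is an assumption here. For the LD-adjusted weights, the sketch assumes the simplest case of a Gaussian (infinitesimal) prior on effect sizes, under which the posterior mean has the closed form (M / (N h²) · I + D)⁻¹ β̂, where D is the LD (correlation) matrix from a reference panel, M the number of markers, N the training sample size, h² the heritability explained by the markers, and β̂ the marginal effect estimates.

```python
def solve(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def prune_and_threshold(beta_hat, p_values, ld, r2_max=0.2, p_max=0.05):
    """Pruning + thresholding weights (hypothetical defaults).

    Markers are visited from most to least significant. A marker is kept if
    its p value passes p_max and its r^2 with every already-kept marker is at
    most r2_max. Discarded markers get weight 0.
    """
    order = sorted(range(len(beta_hat)), key=lambda i: p_values[i])
    kept = []
    for i in order:
        if p_values[i] > p_max:
            break  # all remaining markers are less significant
        if all(ld[i][j] ** 2 <= r2_max for j in kept):
            kept.append(i)
    return [beta_hat[i] if i in kept else 0.0 for i in range(len(beta_hat))]


def ld_adjusted_weights(beta_hat, ld, n_samples, h2):
    """Posterior mean effects under an assumed infinitesimal (Gaussian) prior.

    Solves (M / (N * h2) * I + D) w = beta_hat, so every marker keeps a
    (shrunken) weight and correlated markers share their signal.
    """
    m = len(beta_hat)
    ridge = m / (n_samples * h2)
    a = [[ld[i][j] + (ridge if i == j else 0.0) for j in range(m)] for i in range(m)]
    return solve(a, beta_hat)


def risk_score(weights, genotypes):
    """Polygenic score of one individual: weighted sum of allele counts."""
    return sum(w * g for w, g in zip(weights, genotypes))


# Toy data (made up): markers 0 and 1 are in strong LD, marker 2 is independent.
ld = [[1.0, 0.9, 0.0],
      [0.9, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
beta_hat = [0.10, 0.09, 0.02]
p_values = [1e-6, 1e-5, 0.2]

pt_weights = prune_and_threshold(beta_hat, p_values, ld)
ld_weights = ld_adjusted_weights(beta_hat, ld, n_samples=1000, h2=0.5)
```

In this toy case pruning and thresholding drops marker 1 (in LD with marker 0) and marker 2 (not significant), whereas the LD-adjusted weights retain all three markers with shrunken effects. Either weight vector can be passed to `risk_score` together with an individual's allele counts.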

Original language: English
Pages (from-to): 576-592
Number of pages: 17
Journal: American Journal of Human Genetics
Volume: 97
Issue number: 4
Publication status: Published - 1 Jan 2015
