Accurate and fast multiple-testing correction in eQTL studies

Jae Hoon Sul, Towfique Raj, Simone de Jong, Paul I W de Bakker, Soumya Raychaudhuri, Roel A Ophoff, Barbara E Stranger, Eleazar Eskin, Buhm Han

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. A permutation test is widely used for this multiple-testing correction. Because of the growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck. In this paper, we propose an efficient approach for correcting for multiple testing and assessing eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We show that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset.
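The idea of replacing permutations with draws from a multivariate normal distribution can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that association z-scores for the cis variants of one gene are jointly normal under the null with covariance equal to the LD (correlation) matrix, it uses plain Monte Carlo sampling, and it omits the small-sample correction mentioned in the abstract. The function names `cholesky` and `gene_level_p_value`, the parameter `n_samples`, and the toy LD matrices are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gene_level_p_value(min_p, ld_matrix, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(max_i |Z_i| >= z_obs) for Z ~ N(0, ld_matrix).

    min_p is the smallest per-variant two-sided p value observed for the gene
    (assumed to lie strictly between 0 and 1); ld_matrix is the variant-by-variant
    correlation matrix (an assumed input).
    """
    rng = random.Random(seed)
    # Two-sided z threshold that corresponds to the observed minimum p value.
    z_obs = statistics.NormalDist().inv_cdf(1.0 - min_p / 2.0)
    lower = cholesky(ld_matrix)
    n = len(ld_matrix)
    hits = 0
    for _ in range(n_samples):
        indep = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated null statistics: z = L @ indep, so that Cov(z) = ld_matrix.
        max_abs = max(
            abs(sum(lower[i][k] * indep[k] for k in range(i + 1)))
            for i in range(n)
        )
        if max_abs >= z_obs:
            hits += 1
    # Add-one estimate so the returned p value is never exactly zero.
    return (hits + 1) / (n_samples + 1)


if __name__ == "__main__":
    independent = [[1.0, 0.0], [0.0, 1.0]]
    in_strong_ld = [[1.0, 0.99], [0.99, 1.0]]
    print(gene_level_p_value(0.05, independent))
    print(gene_level_p_value(0.05, in_strong_ld))
```

In this sketch the cost of each draw depends only on the number of variants, not on the number of individuals, which is the property the abstract attributes to the multivariate normal approach. Two variants in strong LD behave almost like a single test, so their corrected p value stays close to the uncorrected minimum, whereas two independent variants yield a corrected value near 1 - (1 - 0.05)^2.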

Original language: English
Pages (from-to): 857-868
Number of pages: 12
Journal: American Journal of Human Genetics
Volume: 96
Issue number: 6
Publication status: Published - 4 Jun 2015

Keywords

  • Data Interpretation, Statistical
  • Gene Expression Regulation
  • Genes
  • Genetic Variation
  • Humans
  • Multivariate Analysis
  • Normal Distribution
  • Polymorphism, Single Nucleotide
  • Probability
  • Quantitative Trait Loci
  • Sample Size
  • Statistics, Nonparametric
