Leveraging allelic imbalance to refine fine-mapping for eQTL studies

Authors: Jennifer Zou aff001;  Farhad Hormozdiari aff002;  Brandon Jew aff004;  Stephane E. Castel aff005;  Tuuli Lappalainen aff005;  Jason Ernst aff001;  Jae Hoon Sul aff008;  Eleazar Eskin aff001
Author affiliations: Computer Science Department, University of California Los Angeles, Los Angeles, California, United States of America aff001;  Genetic Epidemiology and Statistical Genetics Program, Harvard University, Cambridge, Massachusetts, United States of America aff002;  Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America aff003;  Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America aff004;  New York Genome Center, New York, New York, United States of America aff005;  Department of Systems Biology, Columbia University, New York, New York, United States of America aff006;  Department of Biological Chemistry, University of California Los Angeles, Los Angeles, California, United States of America aff007;  Department of Psychiatry and Biobehavioral Sciences, University of California Los Angeles, Los Angeles, California, United States of America aff008;  Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America aff009
Published in: Leveraging allelic imbalance to refine fine-mapping for eQTL studies. PLoS Genet 15(12): e1008481. doi:10.1371/journal.pgen.1008481
Category: Research Article
doi: 10.1371/journal.pgen.1008481


Many disease risk loci identified in genome-wide association studies lie in non-coding regions of the genome. Previous studies have found enrichment of expression quantitative trait loci (eQTLs) in disease risk loci, indicating that identifying causal variants for gene expression is important for elucidating the genetic basis of not only gene expression but also complex traits. However, detecting causal variants is challenging because of the complex genetic correlation among variants, known as linkage disequilibrium (LD), and the presence of multiple causal variants within a locus. Although several fine-mapping approaches have been developed to overcome these challenges, they may produce large sets of putative causal variants when true causal variants are in high LD with many non-causal variants. In eQTL studies, an additional source of information can be used to improve fine-mapping: allelic imbalance (AIM), which measures the imbalance in gene expression between the two chromosomes of a diploid organism. In this work, we develop a novel statistical method that leverages both AIM and total expression data to detect causal variants that regulate gene expression. We illustrate through simulations and application to 10 tissues of the Genotype-Tissue Expression (GTEx) dataset that our method identifies the true causal variants with higher specificity than an approach that uses only eQTL information. Across all tissues and genes, our method achieves a median reduction rate of 11% in the number of putative causal variants. We use chromatin state data from the Roadmap Epigenomics Consortium to show that the putative causal variants identified by our method are enriched for active regions of the genome, providing orthogonal support that our method identifies causal variants with increased specificity.
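To make the "median reduction rate" concrete, the following is a minimal sketch of how such a summary could be computed. It assumes, purely for illustration, that the reduction rate for a gene is the relative decrease in the size of the putative causal set when moving from an eQTL-only analysis to the combined analysis; the abstract does not spell out the exact definition, and the function names and the example set sizes below are hypothetical, not data from the study.

```python
from statistics import median


def reduction_rate(baseline_size: int, refined_size: int) -> float:
    """Relative decrease in causal-set size (assumed definition, not from the paper)."""
    if baseline_size <= 0:
        raise ValueError("baseline_size must be positive")
    return (baseline_size - refined_size) / baseline_size


def median_reduction_rate(pairs):
    """Median of per-gene reduction rates over (baseline_size, refined_size) pairs."""
    return median(reduction_rate(b, r) for b, r in pairs)


# Hypothetical causal-set sizes per gene: (eQTL-only, eQTL + AIM).
example_pairs = [(20, 18), (10, 9), (50, 40), (8, 8), (30, 27)]
example_median = median_reduction_rate(example_pairs)
print(f"median reduction rate: {example_median:.0%}")
```

A median is used here rather than a mean so that a few genes with very large reductions do not dominate the summary.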

Keywords:

Gene expression – Gene mapping – Genetic loci – Genome-wide association studies – Chromatin – Statistical distributions – Variant genotypes



Genetics  Reproductive medicine

Article published in the journal

PLOS Genetics

2019 Issue 12
