A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study

Authors: Xinyuan Dong aff001; Yu-Ru Su aff001; Richard Barfield aff001; Stephanie A. Bien aff001; Qianchuan He aff001; Tabitha A. Harrison aff001; Jeroen R. Huyghe aff001; Temitope O. Keku aff003; Noralane M. Lindor aff004; Clemens Schafmayer aff005; Andrew T. Chan aff006; Stephen B. Gruber aff007; Mark A. Jenkins aff008; Charles Kooperberg aff001; Ulrike Peters aff001; Li Hsu aff001
Author affiliations: Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA aff001; Department of Biostatistics, University of Washington, Seattle, WA, USA aff002; Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, North Carolina, USA aff003; Department of Health Science Research, Mayo Clinic, Scottsdale, Arizona, USA aff004; Department of General Surgery, University Hospital Rostock, Rostock, Germany aff005; Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, and Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA aff006; City of Hope National Medical Center, Duarte, and Department of Preventive Medicine & USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California, USA aff007; Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia aff008
Published in: A general framework for functionally informed set-based analysis: Application to a large-scale colorectal cancer study. PLoS Genet 16(8): e1008947. doi:10.1371/journal.pgen.1008947
Category: Research Article
doi: 10.1371/journal.pgen.1008947


Genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants associated with various phenotypes, but together these variants explain only a fraction of heritability, suggesting that many variants have yet to be discovered. Recently it has been recognized that incorporating functional information about genetic variants can improve power for identifying novel loci. For example, S-PrediXcan and TWAS test the association of predicted gene expression with phenotypes using GWAS summary statistics, leveraging information on the genetic regulation of gene expression, and have found many novel loci. However, because genetic variants may affect more than one gene and act through different mechanisms, these methods likely capture only part of the total effect of these variants. In this paper, we propose a summary statistics-based mixed effects score test (sMiST) that tests for the total effect of a variant set, combining the effect mediated through genetically predicted gene expression (imputed as in S-PrediXcan and TWAS) with the direct effects of individual variants. It allows for multiple functional annotations and multiple genetically predicted mediators. It can also perform conditional association analysis while adjusting for other genetic variants (e.g., known loci for the phenotype). Extensive simulations and real-data analyses demonstrate that sMiST yields p-values that agree well with those obtained from individual-level data, but with substantially faster computation. Importantly, sMiST can be applied broadly to GWAS, as only summary statistics of genetic variant associations are required. We apply sMiST to a large-scale GWAS of colorectal cancer using summary statistics from ∼120,000 study participants and gene expression data from the Genotype-Tissue Expression (GTEx) project. We identify several novel and secondary independent genetic loci.
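As a rough illustration of the predicted-expression component that the abstract attributes to S-PrediXcan and TWAS, the sketch below computes a gene-level z-score from per-variant GWAS z-scores. It is not the sMiST statistic, which additionally tests the direct effects of individual variants. The function name `predicted_expression_z` and its arguments are hypothetical, and the sketch assumes standardized genotypes, so that the LD matrix is a correlation matrix.

```python
import numpy as np

def predicted_expression_z(weights, snp_z, ld_corr):
    """Gene-level z-score in the style of TWAS (illustrative only).

    weights : expression-prediction weights, one per variant (assumed given)
    snp_z   : GWAS z-scores for the same variants, in the same order
    ld_corr : LD correlation matrix among the variants (assumed given)
    """
    w = np.asarray(weights, dtype=float)
    z = np.asarray(snp_z, dtype=float)
    R = np.asarray(ld_corr, dtype=float)

    # Variance of the weighted sum w'z under the null is w' R w.
    var = w @ R @ w
    if var <= 0:
        raise ValueError("weights give zero predicted-expression variance")

    # Weighted combination of variant z-scores, scaled to unit variance.
    return float(w @ z / np.sqrt(var))
```

With independent variants (identity LD matrix) and equal weights, this reduces to the sum of the variant z-scores divided by the square root of the number of variants.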

Keywords:

Colorectal cancer – Covariance – Gene expression – Gene prediction – Genetic loci – Genetics – Genome-wide association studies – Test statistics


Article published in the journal

PLOS Genetics

2020, Issue 8



