UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts
Authors: Alex Diaz-Papkovich [1]; Luke Anderson-Trocmé [2]; Chief Ben-Eghan [2]; Simon Gravel [2]
Author affiliations: [1] Quantitative Life Sciences, McGill University, Montreal, Québec, Canada; [2] McGill University and Genome Quebec Innovation Centre, Montreal, Québec, Canada; [3] Department of Human Genetics, McGill University, Montreal, Quebec, Canada
Published in: PLoS Genet 15(11): e1008432. doi:10.1371/journal.pgen.1008432
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1008432
Summary
Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
Keywords:
African people – Caribbean – Data visualization – Ethnicities – Europe – Hispanic people – Chinese people – principal component analysis
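The workflow named in the summary, reducing a genotype matrix to a few dimensions so that population structure becomes visible, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the summary: the data are simulated for two toy populations, the helper names `simulate_genotypes` and `top_principal_components` are hypothetical, and reducing to principal components before a non-linear embedding is a common pre-processing choice rather than a step this page confirms.

```python
import numpy as np


def simulate_genotypes(n_per_pop=50, n_snps=500, seed=0):
    """Simulate 0/1/2 genotype counts for two toy populations (hypothetical data)."""
    rng = np.random.default_rng(seed)
    # Each population draws its own allele frequency for every SNP.
    freqs = rng.uniform(0.05, 0.95, size=(2, n_snps))
    genotypes = np.vstack(
        [rng.binomial(2, freqs[pop], size=(n_per_pop, n_snps)) for pop in range(2)]
    )
    labels = np.repeat([0, 1], n_per_pop)
    return genotypes.astype(float), labels


def top_principal_components(genotypes, n_components=10):
    """Project individuals onto the leading principal components via SVD."""
    # Center each SNP column so the components describe variation around the mean.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Rows of u * s are the individuals' coordinates on each component.
    return u[:, :n_components] * s[:n_components]


genotypes, labels = simulate_genotypes()
pcs = top_principal_components(genotypes, n_components=10)
print(pcs.shape)
```

The resulting `pcs` array (individuals by components) could then be passed to a UMAP implementation, for example `umap.UMAP(n_components=2).fit_transform(pcs)` from the umap-learn package, to obtain a two-dimensional embedding for plotting.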
This article was published in PLOS Genetics, 2019, Issue 11.