Stratification by Smoking Status Reveals an Association of Genotype with Body Mass Index in Never Smokers

We found that a single nucleotide polymorphism in the CHRNA5-A3-B4 gene cluster, which is known to influence smoking heaviness, is associated with lower body mass index (BMI) in current smokers, but higher BMI in never smokers. This difference in effects suggests that the variant influences BMI both via pathways other than smoking, and via the weight-reducing effects of smoking, in opposite directions. The overall effect on BMI would therefore be undetectable in an unstratified genome-wide association study, indicating that novel associations may be obscured by hidden population sub-structure.

Published in the journal: PLoS Genet 10(12): e1004799. doi:10.1371/journal.pgen.1004799
Category: Research Article
doi: 10.1371/journal.pgen.1004799




As obesity represents a substantial and growing threat to public health, efforts to identify the determinants of obesity are of considerable scientific and societal importance. Genome-wide association studies (GWAS) have identified numerous variants associated with body mass index (BMI) [1], but a substantial proportion of the estimated heritability remains to be accounted for. At the same time, a number of modifiable environmental factors have been identified that influence BMI, with cigarette smoking a strong lifestyle influence on BMI [2]. In a previous Mendelian randomisation analysis, we used a single nucleotide polymorphism in the CHRNA5-A3-B4 gene cluster associated with heaviness of smoking within smokers [3] to confirm the causal effect of smoking in reducing BMI [4].

We sought to extend these findings in a larger sample drawn from the Causal Analysis Research in Tobacco and Alcohol (CARTA) consortium. We used the same genetic variant, characterised by two SNPs (rs16969968 and rs1051730) which are in perfect linkage disequilibrium (LD) in samples of European ancestry, and therefore reflect the same genetic signal (hereafter rs16969968-rs1051730). This variant is associated with approximately 1% phenotypic variance in cigarettes per day and approximately 4% variance in cotinine levels (the primary metabolite of nicotine, and a more precise measure of exposure) [5], [6]. Mendelian randomisation analyses of the causal effects of smoking heaviness require stratification according to smoking status: any causal effects of the exposure (i.e., smoking heaviness) should be reflected in an association of the instrument (i.e., genotype) among current smokers only, and not never smokers (former smokers might be expected to be intermediate between current and never smokers) [7]. The never smoking group therefore enables a test of the specificity of the instrument (i.e., that the variant only affects the outcome through the exposure of interest) [8]. Critically, the rs16969968-rs1051730 variant has not been shown to be associated with smoking initiation (i.e., it does not influence risk of being an ever versus a never smoker) in previous GWAS of smoking behaviour [9], which reduces the risk of introducing collider bias when stratifying on smoking status.

In the course of these analyses, we observed an unexpected finding, which we report here. Specifically, we observed an association of rs16969968-rs1051730 with higher BMI in never smokers. This association has not previously been reported in GWAS of BMI published to date. We therefore focus on the implications of this novel finding, and not the Mendelian randomisation analysis of the causal effects of smoking on BMI.


Our total sample size comprised 148,730 never smokers, former smokers and current smokers. In the 66,809 never smokers, we observed a positive association of rs16969968-rs1051730 with BMI (Table 1), indicating an association operating via pathways other than smoking (percentage change per minor allele +0.35, 95% CI +0.18 to +0.52, P = 6.38×10⁻⁵). We also confirmed the expected inverse association of rs16969968-rs1051730 with BMI in the 38,913 current smokers (percentage change −0.74, 95% CI −0.97 to −0.51, P = 2.00×10⁻¹⁰), consistent with a causal, weight-reducing effect of cigarette smoking on BMI. There was no evidence of association in the 43,009 former smokers (percentage change −0.14, 95% CI −0.34 to +0.07, P = 0.19). An interaction test indicated that these estimates differed from each other (P = 4.95×10⁻¹³). Similar associations were observed for weight (Table 1) and waist circumference (data available on request), but not height (P ≥ 0.27 for all smoking categories). Between-study heterogeneity was low (I² values ≤36%), and there was no evidence for effect modification by sex. Critically, when data were examined without stratification by smoking status, no clear evidence of association with BMI was observed (P = 0.22), indicating that a conventional GWAS would have failed to detect this signal.
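The interaction test can be approximated from the summary statistics quoted above. The sketch below is not the authors' computation (which used study-level data in Stata); it assumes the reported 95% confidence intervals are symmetric on the percentage scale, recovers standard errors from them, and applies a Cochran Q test across the three smoking strata. The function names are illustrative only.

```python
import math

# Per-allele percentage change in BMI with 95% CI, as reported in the text.
strata = {
    "never":   (0.35, 0.18, 0.52),
    "former":  (-0.14, -0.34, 0.07),
    "current": (-0.74, -0.97, -0.51),
}

def se_from_ci(lower, upper, z=1.959964):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)

def cochran_q(estimates, standard_errors):
    """Cochran Q statistic for heterogeneity between independent estimates."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))

def chi2_sf_2df(x):
    """Upper tail of a chi-squared distribution with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)

estimates = [est for est, _, _ in strata.values()]
ses = [se_from_ci(lo, hi) for _, lo, hi in strata.values()]

q = cochran_q(estimates, ses)   # three strata, so 2 degrees of freedom
p = chi2_sf_2df(q)
print(f"Q = {q:.1f} on 2 df, P = {p:.2e}")
```

Because the inputs are rounded, the result should only be expected to match the reported interaction P value in order of magnitude.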

Tab. 1. Association of rs16969968-rs1051730 with body mass index, weight and height, stratified by smoking status.
All analyses were adjusted for age. Effect estimate represents per minor allele percentage change for BMI and weight (log transformed for analysis), and per minor allele change in cm for height.

The 0.35% per minor allele BMI increase in never smokers represents a change of approximately 0.09 kg/m². This is smaller than the effect of rs9939609 in FTO (∼0.4 kg/m²) [10] but is comparable in terms of variance explained to the other variants identified by Speliotes and colleagues [1]. As noted above, the rs16969968-rs1051730 variant has not been shown to be associated with smoking initiation in previous GWAS of smoking behaviour [9]. This is also true in our data (ever smoker versus never smoker: OR per minor allele 1.01, 95% CI 0.99 to 1.03, P = 0.50), although we observed an association with smoking cessation (current smoker versus former smoker: OR per minor allele 1.08, 95% CI 1.06 to 1.10, P = 1.44×10⁻¹²), consistent with previous studies [11]. Therefore, we do not believe that these findings are due to collider bias, whereby stratifying on the exposure measure can induce associations between instrument and outcome [12].
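The conversion from a percentage change to kg/m² can be checked with a few lines of arithmetic. This sketch assumes a mean BMI of 25 kg/m², a standard deviation of 3.8 kg/m² and a minor allele frequency of 0.33 (the values given for the sample size calculations), and uses the standard additive-model approximation for variance explained, 2p(1−p)β²/σ². It is an illustration, not a reproduction of the authors' calculation.

```python
# Assumed population parameters, taken from the sample size calculation settings.
mean_bmi = 25.0      # kg/m^2
sd_bmi = 3.8         # kg/m^2
maf = 0.33           # minor allele frequency

percent_change = 0.35                            # per minor allele, never smokers
delta_bmi = mean_bmi * percent_change / 100      # absolute change in kg/m^2

# Approximate proportion of BMI variance explained under an additive model.
variance_explained = 2 * maf * (1 - maf) * delta_bmi ** 2 / sd_bmi ** 2

print(f"Per-allele change: {delta_bmi:.4f} kg/m^2")
print(f"Variance explained: {100 * variance_explained:.3f}%")
```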


Our results indicate that rs16969968-rs1051730 may be associated with BMI in never smokers, via pathways other than smoking, as well as with heaviness of smoking among current smokers. At this stage we can only speculate as to the mechanism through which rs16969968-rs1051730 may exert a positive effect on BMI in never smokers. In GWAS, the CHRNA5-A3-B4 gene cluster was confirmed to be associated with heaviness of smoking, and downstream health outcomes including lung cancer and peripheral arterial disease [9], [13], [14]. It has been shown that the rs16969968 variant is functional and leads to an amino acid change (D398N) in the α5 nicotinic acetylcholine receptor (nAChR) subunit protein [15]. Animal models indicate that this subunit modulates tolerance to high doses of nicotine [16]. Candidate gene studies have suggested an association of rs16969968-rs1051730 with other substance use phenotypes, such as cocaine use [17], while other variants in this region have been reported to be associated with alcohol consumption [18], although the evidence for these associations is currently weak. Therefore, one possibility is that nAChRs play a role in central mechanisms mediating responding to rewarding stimuli in general, which could include natural rewards such as food.

It is also notable that rs3743075, located within the CHRNA3 gene and correlated with rs16969968-rs1051730 (r² = 0.34, D′ = 1.00), shows association (N = 974, P = 9.06×10⁻⁵) with BMI (defined as <30 kg/m² vs ≥30 kg/m²) (dbGaP Study Accession: pha003015.1). There is evidence from animal models that activation of hypothalamic α3β4 nAChRs leads to activation of pro-opiomelanocortin neurons, and subsequent activation of melanocortin 4 receptors, which have been shown to be critical for nicotine-induced decreases in food intake [19]. Therefore, another possibility is that nAChR sub-units play a role specifically in mediating food intake, through as yet undescribed mechanisms. In other words, the effects we have observed may operate via other nAChRs, and other genes in this region (namely CHRNA3 and CHRNB4) may contribute to our finding. Clearly further work is required to explore this possibility. The use of more detailed body composition measures such as percent body fat and its distribution may also serve to refine the nature of the association.

Our results, if confirmed, have important implications for the design of future GWAS. The association we observed in never smokers would essentially be undetectable in an unstratified sample, since the effect size observed in the combined sample would require approximately 791,000 participants to detect even at an uncorrected P-value of 0.05, and even then would indicate an inaccurate effect size. This is essentially because the effect of rs16969968-rs1051730 on BMI that operates via pathways other than smoking is countered by the weight-reducing effect of smoking. Therefore, since there are roughly twice as many never smokers as current smokers on average across our sample, these two effects negate each other. On the other hand, a sample of approximately 160,000 never smokers would be required to detect the effect we observed with genome-wide significance. Assuming the proportions of never, former and current smokers in our sample, this would imply a total sample size of around 350,000. While this is larger than published GWAS of BMI [1], it is achievable. Therefore, although we cannot say how frequent a scenario such as the one we observed here will be, additional variants may be identified in GWAS stratified by environmental exposures known to have pronounced effects on the phenotype of interest, such as cigarette smoking or physical activity on BMI.
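The cancellation described here can be illustrated with a crude sample-size-weighted average of the stratum-specific estimates reported in the results. This is only a back-of-the-envelope sketch, not the unstratified analysis the authors ran, and it ignores differences in precision between strata.

```python
# Stratum sizes and per-allele percentage changes in BMI, as reported in the text.
strata = {
    "never":   (66_809, 0.35),
    "former":  (43_009, -0.14),
    "current": (38_913, -0.74),
}

total_n = sum(n for n, _ in strata.values())
weighted_mean = sum(n * effect for n, effect in strata.values()) / total_n

# Never smokers outnumber current smokers by a little under two to one.
never_to_current = strata["never"][0] / strata["current"][0]

print(f"Never:current ratio = {never_to_current:.2f}")
print(f"Size-weighted mean effect = {weighted_mean:+.3f}% per minor allele")
```

The positive effect in the larger never-smoker stratum and the roughly twice-as-large negative effect in the smaller current-smoker stratum leave a combined effect close to zero.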

The pleiotropic effect of rs16969968-rs1051730 (or LD of this variant with another variant causally influencing BMI), if shown to be robust via replication, has important implications for Mendelian randomisation studies assessing the causal effects of smoking. In this case, we can be reasonably confident that the BMI-reducing effect of the variant operates through smoking because the association with BMI in current smokers is in the opposite direction to the association in never smokers. Furthermore, if the effects on BMI that operate via pathways other than smoking and the effects that operate via the weight-reducing effects of smoking are independent, then the true causal estimate of the magnitude of effect of smoking in reducing BMI is likely to be larger than estimated with this variant. However, some caution must be exercised in conducting and interpreting the results of other Mendelian randomisation analyses using this variant because rs16969968-rs1051730 may influence outcomes through its effects on BMI, instead of or in addition to smoking heaviness. One possible solution is to use genetic variants for BMI as a method of reciprocal randomization to determine the direction of causation within inter-correlated networks of mechanistic pathways (i.e., network Mendelian randomisation) [20].

A limitation of our analysis is that we were only able to control for potential population stratification indirectly in most samples, by restricting analyses to participants of self-reported European ancestry. We were not able to use other methods, such as adjustment for principal components, given that not all contributing studies hold the necessary genetic data. However, we note that the minor allele frequency of rs16969968-rs1051730 differed only slightly across studies (between 0.30 and 0.36).

Testing for gene-environment interaction in GWAS is not novel [21], and examples exist which incorporate smoking status as an environmental factor [22]. However, this remains relatively uncommon, due to methodological challenges (e.g., introducing collider bias) and sample size constraints. A key challenge is the identification of suitable environmental variables on which to stratify GWAS analyses, from the multitude available. We suggest that focusing on environmental factors that are most strongly associated with the phenotype of interest, are likely to have profound biological effects, and which can be characterised in a relatively consistent way across studies, is likely to be the best strategy. Smoking status meets all of these criteria, and the data presented here demonstrate how stratification on well-characterized environmental factors known to impact on health outcomes (such as smoking status) may reveal novel genetic associations with health outcomes. As our data indicate, these associations may operate through genetic influences on the environmental factors themselves, or through new pathways which are masked by the environmental factors.


Study populations

We used data on individuals (≥16 years) of European ancestry (ascertained via self report, or based on the genome-wide genotype data where available) from 29 studies in the Causal Analysis Research in Tobacco and Alcohol (CARTA) consortium: the 1958 Birth Cohort (1958 BC), the Avon Longitudinal Study of Parents and Children (ALSPAC, including both mothers and children), the British Regional Heart Study (BRHS), the British Women's Heart and Health Study (BWHHS), the Caerphilly Prospective Study (CaPS), the Christchurch Health and Development Study (CHDS), the Cohorte Lausannoise (CoLaus) study, the Exeter Family Study of Child Health (EFSOCH), the English Longitudinal Study of Ageing (ELSA), FINRISK, the Danish GEMINAKAR twin study, Generation Scotland, the Genomics of Overweight Young Adults (GOYA) females, GOYA males, the Helsinki Birth Cohort Study (HBCS), Health2006, Health2008, the Nord-Trøndelag health study (HUNT), Inter99, the Northern Finland Birth Cohorts (NFBC 1966 and NFBC 1986), MIDSPAN, the Danish MONICA study, the National Health and Nutrition Examination Survey (NHANES), the MRC National Survey of Health & Development (NSHD), the Netherlands Twin Registry (NTR), the Prospective Study of Pravastatin in the Elderly at Risk (PROSPER) and Whitehall II. References to these individual studies are available on request. All studies received ethics approval from local research ethics committees (see Text S1 for full details).


Within each study, individuals were genotyped for one of two single nucleotide polymorphisms (SNPs) in the CHRNA5-A3-B4 nicotinic receptor subunit gene cluster, rs16969968 or rs1051730. These SNPs are in perfect linkage disequilibrium with each other in Europeans (r² = 1.00 in HapMap 3), and therefore represent the same genetic signal. Where studies had data available for both SNPs, we used the SNP that was genotyped in the larger number of individuals.

Body mass index

Height (m), weight (kg) and waist circumference (cm) were assessed within each study, directly measured for 99% of participants, and self-reported for GOYA females (N = 1,015) and a sub-set of NTR (N = 602). Body mass index (BMI) was calculated as weight/height² (kg/m²).

Smoking status

Smoking status was self-reported (either by questionnaire or interview). Individuals were classified as current, former, or never cigarette smokers. Where information on smoking frequency was available, current smokers were restricted to individuals who smoked regularly (typically at least one cigarette per day). Where information on pipe and cigar smoking was available, individuals reporting being current or former smokers of pipes or cigars but not cigarettes were excluded from all analyses. For studies with adolescent populations (ALSPAC children and NFBC 1986), analyses were restricted to current daily smokers who reported smoking at least one cigarette per day (current smokers) and individuals who had never tried smoking (never smokers). Descriptive characteristics of smoking frequency data are provided in Text S2.

Statistical analysis

Analyses were conducted within each contributing study using Stata and R software, following the same analysis plan. Analyses were restricted to individuals with full data on smoking status and rs16969968-rs1051730 genotype. Within each study, genotype frequencies were tested for deviation from Hardy-Weinberg equilibrium (HWE) using a chi-squared test. Mendelian randomisation analyses of the association between rs16969968-rs1051730 and BMI were performed using linear regression, stratified by smoking status (never, former and current) and sex, and adjusted for age. BMI was log transformed prior to analysis. An additive genetic model was assumed on log values, so that each effect size could be exponentiated to represent the percentage increase in BMI per minor (risk) allele.
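The additive model on log BMI and the conversion of its coefficient to a percentage change can be sketched as follows. The data are a small hypothetical toy set, the regression is left unadjusted for age for brevity, and the helper names are illustrative; the studies themselves used Stata and R.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def percent_change(beta_log):
    """Percentage change in the outcome per unit of predictor, from a log-scale slope."""
    return 100 * (math.exp(beta_log) - 1)

# Hypothetical data: minor allele counts (0, 1 or 2) and BMI in kg/m^2.
genotype = [0, 0, 1, 1, 2, 2]
bmi = [24.0, 26.0, 24.2, 26.3, 24.5, 26.4]

log_bmi = [math.log(b) for b in bmi]
beta = ols_slope(genotype, log_bmi)   # additive effect on the log scale
print(f"Per-allele change in BMI: {percent_change(beta):+.2f}%")
```

For small coefficients the percentage change is close to 100 times the log-scale slope, which is why effects of this size are often read directly as percentages.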

For NHANES, which has a survey design, Taylor series linearization was implemented to estimate variances. For studies including related family members, appropriate methods were used to adjust standard errors: in GEMINAKAR, twin pair identity was included as a cluster variable in the model, and in MIDSPAN, linear mixed effects regression models fitted using restricted maximum likelihood were used to account for related individuals. ALSPAC mothers and children were analysed as separate samples; as there are related individuals across these samples, sensitivity analyses were performed excluding each of these studies in turn.

Results from individual studies were meta-analysed in Stata (version 13) using the “metan” command. As I² values were all equal to or below 36% (indicating low to moderate heterogeneity), fixed effects meta-analyses were performed. The “metareg” command was used to examine whether SNP effects varied by sex and estimates were combined as there was no evidence for effect modification by sex. Evidence for interaction between genotype and smoking status was assessed using the Cochran Q statistic. Data are available from the Institutional Data Access/Ethics Committees of the individual studies that contributed to this analysis, for researchers who meet the criteria for access to confidential data. Full details are provided in Text S3.
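A fixed effects (inverse-variance) meta-analysis with the I² heterogeneity statistic can be sketched in a few lines. This is a generic illustration of the technique, not the Stata “metan” implementation, and the per-study estimates and standard errors below are hypothetical.

```python
def fixed_effect_meta(estimates, standard_errors):
    """Inverse-variance fixed effects pooling with Cochran Q and I-squared."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = total_weight ** -0.5
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributable to heterogeneity, floored at 0.
    i_squared = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared

# Hypothetical per-study percentage changes in BMI per minor allele.
study_estimates = [0.30, 0.45, 0.25, 0.40]
study_ses = [0.20, 0.25, 0.30, 0.15]

pooled, pooled_se, q, i_squared = fixed_effect_meta(study_estimates, study_ses)
print(f"Pooled = {pooled:.3f} (SE {pooled_se:.3f}), Q = {q:.2f}, I2 = {i_squared:.0f}%")
```

More precise studies receive more weight, and the pooled standard error is always smaller than that of the most precise single study.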

Sample size calculations

Sample size calculations were performed using Quanto software. The following parameters were used: 80% power to detect associations, minor allele frequency of 0.33, mean and standard deviation for BMI of 25 kg/m² and 3.8 kg/m² respectively, and alpha values of 0.05 and 5×10⁻⁸.
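A normal-approximation version of this calculation can be written with the Python standard library. It is not Quanto's algorithm; it assumes an additive model in which the variant explains a proportion 2p(1−p)β²/σ² of BMI variance, and it takes the per-allele effect in never smokers as 0.09 kg/m², the rounded value quoted in the results. The function name is illustrative.

```python
from statistics import NormalDist

def required_sample_size(beta, maf, sd, alpha, power=0.80):
    """Approximate N needed to detect an additive per-allele effect on a quantitative trait."""
    variance_explained = 2 * maf * (1 - maf) * beta ** 2 / sd ** 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) ** 2 / variance_explained

# Never smokers: about 0.09 kg/m^2 per minor allele, at genome-wide significance.
n_never = required_sample_size(beta=0.09, maf=0.33, sd=3.8, alpha=5e-8)
print(f"Approximate never smokers required: {n_never:,.0f}")
```

Under these assumptions the result should land close to the figure of approximately 160,000 never smokers given in the discussion; halving the effect size roughly quadruples the required sample.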


1. Speliotes EK, Willer CJ, Berndt SI, Monda KL, Thorleifsson G, et al. (2010) Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 42: 937–948.

2. Bhojani FA, Tsai SP, Wendt JK, Koller KL (2014) Simulating the impact of changing trends in smoking and obesity on productivity of an industrial population: an observational study. BMJ Open 4: e004788.

3. Ware JJ, van den Bree MB, Munafo MR (2011) Association of the CHRNA5-A3-B4 gene cluster with heaviness of smoking: a meta-analysis. Nicotine Tob Res 13: 1167–1175.

4. Freathy RM, Kazeem GR, Morris RW, Johnson PC, Paternoster L, et al. (2011) Genetic variation at CHRNA5-CHRNA3-CHRNB4 interacts with smoking status to influence body mass index. Int J Epidemiol 40: 1617–1628.

5. Keskitalo K, Broms U, Heliovaara M, Ripatti S, Surakka I, et al. (2009) Association of serum cotinine level with a cluster of three nicotinic acetylcholine receptor genes (CHRNA3/CHRNA5/CHRNB4) on chromosome 15. Hum Mol Genet 18: 4007–4012.

6. Munafo MR, Timofeeva MN, Morris RW, Prieto-Merino D, Sattar N, et al. (2012) Association between genetic variants on chromosome 15q25 locus and objective measures of tobacco exposure. J Natl Cancer Inst 104: 740–748.

7. Davey Smith G (2011) Use of genetic markers and gene-diet interactions for interrogating population-level causal influences of diet on health. Genes Nutr 6: 27–43.

8. Davey Smith G, Ebrahim S (2003) 'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol 32: 1–22.

9. Tobacco and Genetics Consortium (2010) Genome-wide meta-analyses identify multiple loci associated with smoking behavior. Nat Genet 42: 441–447.

10. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, et al. (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316: 889–894.

11. Munafo MR, Johnstone EC, Walther D, Uhl GR, Murphy MF, et al. (2011) CHRNA3 rs1051730 genotype and short-term smoking cessation. Nicotine Tob Res 13: 982–988.

12. Glymour MM, Tchetgen Tchetgen EJ, Robins JM (2012) Credible Mendelian randomization studies: approaches for evaluating the instrumental variable assumptions. Am J Epidemiol 175: 332–339.

13. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, et al. (2008) Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet 40: 616–622.

14. Ware JJ, van den Bree M, Munafo MR (2012) From men to mice: CHRNA5/CHRNA3, smoking behavior and disease. Nicotine Tob Res 14: 1291–1299.

15. Bierut LJ, Stitzel JA, Wang JC, Hinrichs AL, Grucza RA, et al. (2008) Variants in nicotinic receptors and risk for nicotine dependence. Am J Psychiatry 165: 1163–1171.

16. Fowler CD, Lu Q, Johnson PM, Marks MJ, Kenny PJ (2011) Habenular alpha5 nicotinic receptor subunit signalling controls nicotine intake. Nature 471: 597–601.

17. Grucza RA, Wang JC, Stitzel JA, Hinrichs AL, Saccone SF, et al. (2008) A risk allele for nicotine dependence in CHRNA5 is a protective allele for cocaine dependence. Biol Psychiatry 64: 922–929.

18. Hallfors J, Loukola A, Pitkaniemi J, Broms U, Mannisto S, et al. (2013) Scrutiny of the CHRNA5-CHRNA3-CHRNB4 smoking behavior locus reveals a novel association with alcohol use in a Finnish population based study. Int J Mol Epidemiol Genet 4: 109–119.

19. Mineur YS, Abizaid A, Rao Y, Salas R, DiLeone RJ, et al. (2011) Nicotine decreases food intake through activation of POMC neurons. Science 332: 1330–1332.

20. Timpson NJ, Nordestgaard BG, Harbord RM, Zacho J, Frayling TM, et al. (2011) C-reactive protein levels and body mass index: elucidating direction of causation through reciprocal Mendelian randomization. Int J Obes (Lond) 35: 300–308.

21. Murcray CE, Lewinger JP, Gauderman WJ (2009) Gene-environment interaction in genome-wide association studies. Am J Epidemiol 169: 219–226.

22. Hancock DB, Artigas MS, Gharib SA, Henry A, Manichaikul A, et al. (2012) Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genet 8: e1003098.
