Disentangling group specific QTL allele effects from genetic background epistasis using admixed individuals in GWAS: An application to maize flowering

Authors: Simon Rio ^aff001; Tristan Mary-Huard ^aff001; Laurence Moreau ^aff001; Cyril Bauland ^aff001; Carine Palaffre ^aff003; Delphine Madur ^aff001; Valérie Combes ^aff001; Alain Charcosset ^aff001
Authors place of work: Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France ^aff001; MIA, INRAE, AgroParisTech, Université Paris-Saclay, 75005, Paris, France ^aff002; UE 0394 SMH, INRAE, 2297 Route de l’INRA, 40390, Saint-Martin-de-Hinx, France ^aff003
Published in the journal: Disentangling group specific QTL allele effects from genetic background epistasis using admixed individuals in GWAS: An application to maize flowering. PLoS Genet 16(3): e32767. doi:10.1371/journal.pgen.1008241
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1008241

Summary

When handling a structured population in association mapping, group-specific allele effects may be observed at quantitative trait loci (QTLs) for several reasons: (i) a different linkage disequilibrium (LD) between SNPs and QTLs across groups, (ii) group-specific genetic mutations in QTL regions, and/or (iii) epistatic interactions between QTLs and other loci that have differentiated allele frequencies between groups. We present here a new genome-wide association (GWAS) approach to identify QTLs exhibiting such group-specific allele effects. We developed genetic materials including admixed progeny from different genetic groups with known genome-wide ancestries (local admixture). A dedicated statistical methodology was developed to analyze pure and admixed individuals jointly, allowing one to disentangle the factors causing the heterogeneity of allele effects across groups. This approach was applied to maize by developing an inbred “Flint-Dent” panel including admixed individuals that was evaluated for flowering time. Several associations were detected revealing a wide range of configurations of allele effects, both at known flowering QTLs (Vgt1, Vgt2 and Vgt3) and new loci. We found several QTLs whose effect depended on the group ancestry of alleles while others interacted with the genetic background. Our GWAS approach provides useful information on the stability of QTL effects across genetic groups and can be applied to a wide range of species.

Keywords:

Plant genomics – Maize – Genetic loci – Quantitative trait loci – Molecular genetics – Alleles – Population genetics – Genome-wide association studies

Introduction

Quantitative traits are genetically determined by numerous regions of the genome, also known as quantitative trait loci (QTLs). The advent of high density genotyping of single nucleotide polymorphisms (SNPs) has opened the way to the identification of QTLs in diversity panels. These studies, referred to as genome-wide association studies (GWAS), use the linkage disequilibrium (LD) between the SNPs and causal variants at QTLs underlying the traits of interest. The panels evaluated in GWAS often include sets of individuals with complex pedigrees or genetic structure [1]. The latter is a common feature in human, animal and plant species and arises when groups of individuals cease to mate with each other and start to be subjected to different evolutionary forces, such as drift or selection [2].

Applying GWAS in a diversity panel including individuals from different groups raises the issue of spurious associations. The stratification of a population into genetic groups generates LD between loci that are differentiated between groups but not necessarily genetically linked. When a given trait is characterized by contrasted group-specific means, all these SNPs will correlate with it and may be detected as false positives. These spurious associations can be controlled efficiently by taking structure and kinship into account in the statistical model [1, 3]. This procedure will however limit the statistical power at differentiated SNPs, making them difficult to detect in multi-group GWAS, especially in the case of rare alleles [4].

In a structured population, group-specific allele effects can be observed at SNPs, and testing an overall effect using a standard GWAS model may not be effective if the QTL effect is of opposite sign in the different groups. Such effects can result from differences in LD between SNPs and QTLs across genetic groups. A different LD extent or linkage phase between linked loci can be explained by specific dynamics of population size such as bottlenecks or expansions [5, 6]. Such patterns of LD were identified in numerous species including human [7, 8], dairy and beef cattle [9, 10], pig [11], wheat [12] and maize [13–16]. A genetic mutation appearing in a QTL region may also lead to group-specific allele effects if it occurred in a founder specific to the genetic group. Several Mendelian syndromes of obesity were shown to result from mutations within specific ethnicities in human [17]. Another possibility is that QTLs interact with other loci that have differentiated allele frequencies between groups (i.e. interact with the genetic background). In human, this possibility was discussed for a candidate gene associated with a higher risk of myocardial infarction in African American than in European populations [18, 19]. Another example is a SNP in the promoter region of the HNF4A gene, which was associated with a higher risk of developing type 2 diabetes in Ashkenazi compared to United Kingdom populations [20]. This locus was later proven to interact with another gene in the Ashkenazi population [21]. In maize, evidence of QTLs with group-specific allele effects can also be found, even though the cause of these differences remains unclear. The presence of allelic series has been demonstrated for QTLs associated with flowering time, including Vgt1 [22]. A QTL with group-specific allele effects was also identified in a maize diversity panel for a phenology trait [23]. More generally, studying the stability of QTL allele effects across genetic backgrounds is an important issue. In human, it determines the ability of a genetic marker to predict the predisposition of an individual to develop a genetic disease across ethnic groups. In plant or animal breeding, it conditions the success of introgressing a favorable allele coming from a source of diversity into an elite genetic material.

Different GWAS strategies were adopted to address this issue depending on the species. In human, GWAS mostly focused on a specific genetic group, and these group-specific studies were compared later through meta-analyses [24, 25]. Some of these meta-analyses revealed highly conserved effects between populations [26, 27] while others revealed more differences [28]. In dairy cattle, the first GWAS focused on a specific breed [29–31]. More recently, multi-breed GWAS were conducted to refine QTL locations by taking advantage of the low LD extent observed in such composite populations [32–34]. In maize, the possibility to use seeds from different origins and generations led geneticists to assemble GWAS panels with a broad range of genetic materials [35–37]. These panels often include a limited proportion of admixed individuals that were derived from crosses between individuals from different genetic groups. The genomes of these admixed individuals consist of mosaics of fragments with different ancestries. Admixture events are a common feature in living species and can contribute to the successful colonization of new environments [38, 39]. In plants, innovative admixed genetic materials were created to combine high statistical power of QTL detection with a wide spectrum of genetic diversity, such as nested association mapping (NAM) [40] or multi-parent advanced generation inter-cross (MAGIC) [41]. Both NAM and MAGIC populations are of great interest to study the stability of QTL effects in a wide range of genetic backgrounds. However, they generally include a limited number of founders and do not address the stability of QTL allele effects across genetic groups.

This study aimed at evaluating the interest of producing admixed individuals, derived from a large set of parents, in order to decipher the genetic architecture of a trait using innovative GWAS models. The objectives were (i) to demonstrate the interest of multi-group analyses to identify new QTLs, (ii) to highlight the interest of applying multi-group GWAS models to identify group-specific allele effects at QTLs and (iii) to show how admixed individuals can help to disentangle the factors causing the heterogeneity of allele effects across groups: local genomic differences or epistatic interactions between QTLs and the genetic background. To our knowledge, no method has been proposed in the literature to address the last objective. This method was applied to a maize inbred population evaluated for flowering traits, including dent, flint and admixed lines. Maize flowering time is an interesting trait to analyze in quantitative genetics studies. It is considered a major adaptive trait because it tailors vegetative and reproductive growth phases to local environmental conditions.

Materials and methods

Genetic material and genotypic data

Genetic material consisted of a panel of 970 maize inbred lines assembled within the “Amaizing” project. It gathered 300 dent lines, 304 flint lines and 366 admixed doubled haploids, further referred to as admixed lines. The dent lines were those included in the “Amaizing Dent” panel [42] and the flint lines were those included in the “CF-Flint” panel [16]. The dent and flint lines aimed at representing the diversity of their respective heterotic group used in European breeding and included several breeding generations. The admixed lines were derived from 206 hybrids between flint and dent lines, mated according to a sparse factorial design (Fig 1), followed by in situ gynogenesis [43] to produce fixed admixed inbred lines. Each dent or flint line was involved in 0 to 11 hybrids (1.21 on average), each leading to 1 to 4 admixed lines (1.77 on average). In total, 171 dent lines and 172 flint lines were involved as parents of admixed lines.

**Fig. 1. Diagram of admixed lines production from hybrids obtained by mating dent and flint lines according to a sparse factorial design.**

All the flint and dent lines were genotyped using the 600K Affymetrix Maize Genotyping Array [44]. Residual heterozygous data was treated as missing and all missing values were imputed independently within each group using Beagle v.3.3.2 and default parameters [45]. The few heterozygous genotypic datapoints imputed by Beagle (0.00084% of all datapoints) were randomly assigned to homozygous genotypes. The admixed lines were genotyped with a 15K chip provided by the private company Limagrain which included a reduced set of SNPs from the 50K Illumina MaizeSNP50 BeadChip [46]. Eight check lines were genotyped with both 600K and 15K genotyping technologies to standardize the reference alleles (0/1) on the set of shared SNPs between the 600K and 15K datasets (9,015 SNPs). Admixed lines were then imputed to 600K SNPs using the following procedure, illustrated in S1 Fig. The positions of recombination breakpoints and the parental origins of the alleles for admixed lines were determined with the set of 9,015 shared SNPs. SNPs for which parental lines carry different alleles allowed us to identify the parental line that transmitted its allele to its admixed progeny. For a given admixed line, changes of parental origins of alleles along a given chromosome indicated the location of recombination breakpoints. A smoothing of parental allele origins was performed for the few SNPs indicating discordant information with respect to the chromosome block in which they were located. In this case, we considered the underlying genotypic datapoint as missing. Parental origins of alleles in admixed lines were imputed up to 600K using adjacent SNP information. If a set of SNPs to be imputed was located within a recombination interval, the new position of the breakpoint was positioned at half of that ordered set, according to the physical position of the SNPs along the chromosome (average proportions of SNPs located within such intervals was 0.93% for a given admixed individual). 
Alleles at SNPs were then imputed based on their origin using parental genotypic data. The MITE associated with the flowering QTL Vgt1 [47, 48] was also genotyped for all the individuals (0: absence, 1: presence). There was a total of 482,013 polymorphic SNPs in this dataset, for which we had information for each individual concerning the SNP allele (0/1), its ancestry (dent/flint) and the genetic background (dent/flint/admixed) in which it was observed.
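The propagation of parental origins from the shared low-density SNPs to the high-density SNPs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `impute_origins`, the data layout and the toy positions are hypothetical, and the handling of an odd number of SNPs inside a recombination interval (the extra SNP goes to the right-hand origin) is an arbitrary choice.

```python
from bisect import bisect_right

def impute_origins(known_pos, known_origin, target_pos):
    """Assign a parental origin ('D' or 'F') to each target SNP.

    known_pos / known_origin: sorted positions of informative shared SNPs and
    the parental origin inferred at each of them (hypothetical layout).
    target_pos: sorted positions of the dense SNPs to impute.
    """
    imputed = {}
    intervals = {}  # index of the left flanking known SNP -> target SNPs inside
    for pos in target_pos:
        left = bisect_right(known_pos, pos) - 1  # last known SNP at or before pos
        if left >= 0 and known_pos[left] == pos:
            imputed[pos] = known_origin[left]      # SNP is itself informative
        elif left < 0:
            imputed[pos] = known_origin[0]         # before the first known SNP
        elif left == len(known_pos) - 1:
            imputed[pos] = known_origin[-1]        # after the last known SNP
        else:
            intervals.setdefault(left, []).append(pos)
    for left, snps in intervals.items():
        left_origin, right_origin = known_origin[left], known_origin[left + 1]
        # If the flanking origins differ, the breakpoint is placed at half of
        # the ordered set of SNPs lying inside the recombination interval.
        half = len(snps) // 2
        for rank, pos in enumerate(snps):
            imputed[pos] = left_origin if rank < half else right_origin
    return [imputed[p] for p in target_pos]

# Toy example: one recombination between the known SNPs at 200 and 300.
origins = impute_origins([100, 200, 300], ["D", "D", "F"],
                         [50, 100, 150, 210, 240, 270, 290, 350])
print(origins)
```

Once origins are assigned, the allele at each dense SNP is read from the genotype of the corresponding parent.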

The dent genome proportion of the admixed lines ranged from 0.16 to 0.86 with a mean equal to 0.51 (S2 Fig). Possible selection biases were studied along the genome by comparing the observed allele frequencies with the expected allele frequencies given the pedigree. No major pattern was observed, suggesting no or minor selection biases among the admixed lines (S3 Fig). A PCoA was performed on genetic distances computed as D_l,l′ = 1 −⁠ K_l,l′, with K_l,l′ being the kinship coefficient between lines l and l′ computed following Eq (2)—see below—assuming a common genetic background for all individuals, i.e. using an average frequency of allele 1 at each locus. The flint and dent lines are clearly distinguished on the two principal coordinates, with a small overlapping region in the center of the graph, while the admixed lines fill the genetic space between the two groups (Fig 2). The same PCoA calculated using the set of 9,015 shared SNPs between the 600K and 15K datasets showed a very similar structure pattern on the first two axes, as shown in S4 Fig.

**Fig. 2. PCoA on genetic distances with coloration of individuals depending on their genetic background: dent, flint or admixed.**

LD between pairs of loci was estimated separately in the dent and the flint datasets using the squared correlation r² between loci pairs. We only considered SNPs for which at least ten individuals carried the minor allele in both dent and flint datasets. For each group, LD was calculated and averaged for sets of loci pairs characterized by a similar physical distance ranging from 0 to 2 Mbp, considering a sliding window of 1 Kbp. The inter-group LD comparison revealed a higher LD extent in the dent than in the flint genetic group (S5 Fig), which was consistent with previous studies [13–16]. As suggested by [9], the persistence of LD linkage phases across flint and dent genetic groups was evaluated by computing the correlation between the r values estimated in each group, along the same sliding window of 1 Kbp. We also studied the consistency of LD linkage phases between groups by computing the correlation between the signs of r in the two groups, coding a negative r as “0” and a positive r as “1”. LD phases were very consistent over short physical distances but began to diverge dramatically when the loci were more than 100-200 Kbp apart (S6 Fig).
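The two phase statistics can be illustrated with a small sketch. It assumes inbred genotypes coded 0/1 and ignores the binning of loci pairs by physical distance; the function names and the toy genotypes are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ld_r(genotypes, a, b):
    """Signed LD (r) between loci a and b; genotypes is a list of lines."""
    return pearson_r([g[a] for g in genotypes], [g[b] for g in genotypes])

def phase_persistence(dent, flint, pairs):
    """Correlation between the r values estimated in each group."""
    return pearson_r([ld_r(dent, a, b) for a, b in pairs],
                     [ld_r(flint, a, b) for a, b in pairs])

def sign_consistency(dent, flint, pairs):
    """Correlation between the signs of r (1 if positive, 0 otherwise)."""
    return pearson_r([1 if ld_r(dent, a, b) > 0 else 0 for a, b in pairs],
                     [1 if ld_r(flint, a, b) > 0 else 0 for a, b in pairs])

# Toy genotypes (rows = lines, columns = loci); flint has locus 2 flipped.
dent = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1]]
flint = [[g[0], g[1], 1 - g[2]] for g in dent]
pairs = [(0, 1), (0, 2), (1, 2)]
print(ld_r(dent, 0, 1) ** 2)  # r² for one pair of loci
print(phase_persistence(dent, flint, pairs), sign_consistency(dent, flint, pairs))
```

In practice both correlations would be computed within each distance window, so that persistence can be plotted against physical distance.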

Phenotypic data

All the lines were evaluated per se at Saint-Martin-de-Hinx (France) in 2015 and 2016 for male flowering (MF) and female flowering (FF), in calendar days after sowing. Each trial was a latinized alpha design where every line was evaluated two times on average. Field trials were divided into two blocks of 33 sub-blocks, each comprising 36 plots. To avoid competition between genetic backgrounds, dent, flint and admixed lines were sown in different sub-blocks. Three check lines were repeated in all sub-blocks (B73, F353 and UH007). Each plot consisted of a row of 25 plants. MF and FF were measured as a median value within the whole plot.

The contribution of Genotype x Environment (GxE) interactions to the phenotypic variance and the level of broad-sense heritability were investigated using the following model:

$$Y_{jklrc} = \mu + \beta_j + \alpha_k + G_{kl} + (G \times \beta)_{jkl} + X_{jr} + Z_{jc} + E_{jklrc}$$

where $Y_{jklrc}$ is the phenotype, $\mu$ is the intercept, $\beta_j$ is the fixed effect of trial j, $\alpha_k$ is the fixed effect of genetic background k (dent, flint, admixed, or the different checks: B73, F353 and UH007), $G_{kl}$ is the random genotype effect of line l in genetic background k (not for checks) with $\sigma^2_{G_k}$ being the genotypic variance in genetic background k, $(G \times \beta)_{jkl}$ is the random GxE interaction of line l in genetic background k for trial j, with $\sigma^2_{(G \times \beta)_{jk}}$ being the GxE variance in genetic background k for trial j, $E_{jklrc}$ is the error with $\sigma^2_{E_j}$ being the error variance for trial j, and $X_{jr}$ and $Z_{jc}$ are the row and column random effects in trial j, respectively, as defined by the field design. All random effects are independent of each other. The row and column effects were modeled as independent or using an autoregressive model (AR1), as determined based on the AIC criterion (S1 Table). Least squares means ($Y^*_{kl}$), further referred to as phenotypes ($Y_{kl}$), were computed over the whole design using the same model, with genotypes as fixed effects:

$$Y^*_{kl} = \hat{\mu} + \frac{1}{2} \sum_{j=1}^{2} \hat{\beta}_j + \hat{\alpha}_k + \hat{\gamma}_{kl}$$

where $\gamma_{kl}$ is the fixed genotype effect of line l in genetic background k. Model parameters were estimated using ASReml-R and restricted maximum likelihood (ReML) [49].

General polygenic model

In this study, the following general polygenic model was considered:

$$Y_{kl} = \mu + \alpha_k + G_{kl} + E_{kl} \qquad (1)$$

where $Y_{kl}$ is the phenotype (least squares mean) of line l in genetic background k among the N individuals of the sample, $\mu$ is the intercept, $\alpha_k$ is the genetic background effect with $k \in \{D, F, A\}$ for dent, flint and admixed genetic background, respectively, and $G_{kl}$ is the random genetic value of the line. The concatenated vector of the genetic values in each genetic background is distributed as

$$\begin{bmatrix} g_D \\ g_F \\ g_A \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} K_D \sigma^2_{G_D} & K_{D,F} \sigma_{G_{DF}} & K_{D,A} \sigma_{G_{DA}} \\ K_{F,D} \sigma_{G_{DF}} & K_F \sigma^2_{G_F} & K_{F,A} \sigma_{G_{FA}} \\ K_{A,D} \sigma_{G_{DA}} & K_{A,F} \sigma_{G_{FA}} & K_A \sigma^2_{G_A} \end{bmatrix} \right)$$

where $K_{k,k'}$ is the kinship matrix between individuals from genetic backgrounds k and k′ computed following Eq (2), $\sigma^2_{G_k}$ is the genetic variance in genetic background k, and $\sigma_{G_{kk'}}$ is the genetic covariance between genetic backgrounds k and k′. $E_{kl}$ is the error associated with line l in genetic background k, with $E_{kl} \sim \mathcal{N}(0, \sigma^2_E)$ independent and identically distributed, and $\sigma^2_E$ is the error variance.

The kinship between lines l from genetic background k and l′ from genetic background k′, K_kl,k′l′, was computed following [50]:

where W_lm is the genotype of line l at locus m coded 0/1 and f_mk is the frequency of allele 1 at locus m in genetic background k. Note that Eq (2) simplifies to the kinship estimator proposed by [51] when l and l′ belong to the same genetic background.
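A kinship coefficient of this kind can be sketched as below. The exact normalisation of Eq (2) is not reproduced here, so the denominator is an assumption: it was chosen so that, when both lines belong to the same genetic background, the expression reduces to centred cross-products divided by the summed allele variances. The function names and toy genotypes are hypothetical.

```python
from math import sqrt

def allele_freqs(genotypes):
    """Frequency of allele 1 at each locus within one genetic background."""
    n = len(genotypes)
    return [sum(g[m] for g in genotypes) / n for m in range(len(genotypes[0]))]

def kinship(w_l, f_k, w_lp, f_kp):
    """Kinship between line l (background k) and line l' (background k').

    w_l, w_lp: 0/1 genotypes of the two lines.
    f_k, f_kp: allele-1 frequencies in their respective backgrounds.
    ASSUMPTION: the denominator sums sqrt(f_k (1 - f_k) f_k' (1 - f_k')) over
    loci; the published Eq (2) may use a different normalisation.
    """
    num = sum((a - fa) * (b - fb) for a, fa, b, fb in zip(w_l, f_k, w_lp, f_kp))
    den = sum(sqrt(fa * (1 - fa) * fb * (1 - fb)) for fa, fb in zip(f_k, f_kp))
    return num / den

# Toy data: four dent and four flint lines genotyped at three loci.
dent = [[0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 0, 1]]
flint = [[1, 1, 0], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
f_dent, f_flint = allele_freqs(dent), allele_freqs(flint)

k_within = kinship(dent[0], f_dent, dent[0], f_dent)      # same background
k_between = kinship(dent[0], f_dent, flint[0], f_flint)   # across backgrounds
print(k_within, k_between)
```

Using a single average allele frequency for every line, as done for the PCoA distances D = 1 − K, corresponds to passing the same frequency vector for both arguments.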

GWAS models

In this study, three GWAS models were applied to different population samples (Table 1). The GWAS strategies were (i) to analyze dent and flint lines separately using a standard GWAS model M₁, (ii) to analyze dent and flint lines jointly using a GWAS model M₂ accounting for allele ancestry (confounded with the genetic background) and (iii) to analyze dent, flint and admixed lines using a GWAS model M₃ accounting for both allele ancestry and the genetic background of the individuals. All models aimed at detecting a SNP effect, defined as a contrast effect between alleles 0 and 1 at a given SNP.

**Tab. 1. Population sample to which each GWAS model was applied with the corresponding number of SNPs conserved for the analysis (at least 10 individuals carrying the minor allelic state).**

Standard GWAS model M₁

The first GWAS model M₁ [1] was applied separately to the dent and flint datasets. For each SNP among the M loci, one has:

where $\beta^m_i$ is the effect of the SNP allele i at locus m (Table 2). All other terms are identical to those appearing in Eq (1), and the kinship was computed following Eq (2) which simplifies to the kinship estimator proposed by [51]. The existence of a SNP effect was tested using hypothesis $H_0: \Delta^m = \beta^m_1 - \beta^m_0 = 0$.

**Tab. 2. Allelic states observed in each GWAS model, resulting from a combination of SNP alleles, their ancestry and the genetic background in which they are observed.**

Multi-group GWAS model M₂

We applied a multi-group GWAS model M₂ jointly to the flint and dent datasets, specifying the allele ancestry (confounded with the genetic background). For a given SNP m, one has:

where $\beta^m_{ij}$ is the effect of the SNP allele i with ancestry j at locus m, as defined in Table 2. All other terms are identical to those appearing in Eq (1). At a given SNP, the following hypotheses were tested:

$$H_0: \Delta^m_{D} = \beta^m_{1D} - \beta^m_{0D} = 0$$
$$H_0: \Delta^m_{F} = \beta^m_{1F} - \beta^m_{0F} = 0$$
$$H_0: \Delta^m_{D+F} = \Delta^m_{D} + \Delta^m_{F} = 0$$
$$H_0: \Delta^m_{D-F} = \Delta^m_{D} - \Delta^m_{F} = 0$$

Hypotheses $\Delta^m_{D}$ and $\Delta^m_{F}$ test the existence of a dent and a flint SNP effect, respectively. Hypothesis $\Delta^m_{D+F}$ tests for a general SNP effect while $\Delta^m_{D-F}$ tests for a divergent SNP effect between the dent and flint ancestries.
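The four M₂ hypotheses are linear combinations of the allele effects and can be written as contrast vectors. The sketch below uses a hypothetical parameter ordering and toy effect values (not estimates from the study) to show why a general test can miss a QTL whose effects have opposite signs in the two groups.

```python
# Parameter order (hypothetical): allele 0 dent, allele 1 dent, allele 0 flint, allele 1 flint.
ORDER = ("0D", "1D", "0F", "1F")

DELTA_D = (-1, 1, 0, 0)   # dent SNP effect: beta_1D - beta_0D
DELTA_F = (0, 0, -1, 1)   # flint SNP effect: beta_1F - beta_0F

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

CONTRASTS = {
    "D": DELTA_D,
    "F": DELTA_F,
    "D+F": add(DELTA_D, DELTA_F),   # general SNP effect
    "D-F": sub(DELTA_D, DELTA_F),   # divergent SNP effect between ancestries
}

def evaluate(contrast, beta):
    """Value of a linear combination of allele effects."""
    return sum(c * b for c, b in zip(contrast, beta))

# Toy allele effects with opposite signs in the two groups.
beta = (0.0, 2.0, 0.0, -2.0)
values = {name: evaluate(c, beta) for name, c in CONTRASTS.items()}
print(values)
```

With these toy values the general contrast cancels out while the divergent contrast is large, which is the configuration the divergent test is designed to detect.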

Multi-group GWAS model M₃

We applied a multi-group GWAS model M₃ jointly to the flint, dent and admixed datasets, specifying the allele ancestry and the genetic background of the individual. For a given SNP m, one has:

where $\beta^m_{ijk}$ is the effect of the SNP allele i with ancestry j at locus m in genetic background k, as defined in Table 2. All other terms are identical to those appearing in Eq (1). At a given SNP, 16 hypotheses were tested (Table 3). Hypotheses referred to as “simple” ($\Delta^m_{DD}$, $\Delta^m_{DA}$, $\Delta^m_{FA}$ and $\Delta^m_{FF}$) were tested to identify QTLs with a significant SNP effect for each combination of ancestries and genetic backgrounds. For instance, $\Delta^m_{DA}$ tests whether a dent SNP effect (differential effect between alleles 0 and 1 of dent ancestry) is significant in the admixed genetic background. Hypotheses referred to as “general” ($\Delta^m_{FF+FA}$, $\Delta^m_{DD+DA}$, $\Delta^m_{DA+FA}$, $\Delta^m_{DD+FF}$ and $\Delta^m_{DD+DA+FA+FF}$) were used to identify QTLs with a mean SNP effect over ancestries and genetic backgrounds. For instance, $\Delta^m_{FF+FA}$ tests for a general flint SNP effect in the flint and the admixed genetic backgrounds and $\Delta^m_{DD+DA+FA+FF}$ tests for a general SNP effect over ancestries and genetic backgrounds. Hypotheses referred to as “divergent” ($\Delta^m_{DA-FA}$, $\Delta^m_{DD-DA}$, $\Delta^m_{FF-FA}$, $\Delta^m_{DD-FF}$, $\Delta^m_{DA-FF}$, $\Delta^m_{DD-FA}$, $\Delta^m_{(DD+DA)-(FF+FA)}$, $\Delta^m_{(DD+FF)-(DA+FA)}$ and $\Delta^m_{(DD-DA)-(FF-FA)}$) were tested to identify QTLs with a contrasted SNP effect between ancestries and/or genetic backgrounds. For instance, $\Delta^m_{DD-DA}$ tests for a divergent dent SNP effect between the dent and the admixed genetic backgrounds, which amounts to testing an epistatic interaction between the SNP and the genetic background (see S1 Appendix for details).

**Tab. 3. Linear combinations tested with M₃ compared to hypotheses tested using other GWAS models (M₁ and M₂).**

From a biological standpoint, a QTL with contrasted SNP effects between groups can be caused by (i) a local genomic difference due to a group-specific genetic mutation for all or part of the lines and/or to group differences in LD or (ii) an interaction with the genetic background. Under the first hypothesis, one expects the effect of a SNP to depend on its ancestry but not on the genetic background (admixed or pure, see Fig 3a). Under the second hypothesis, one expects a SNP effect, for a given ancestry, to vary depending on the genetic background. One example would be a QTL with a strong SNP effect in the dent genetic background, but none in the flint genetic background, while the SNP effects would be of intermediate size for alleles of both ancestries in the admixed genetic background (see Fig 3b). Note that other complex configurations are possible, justifying the inclusion of all tests in the analysis.
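The two biological hypotheses leave different signatures on the divergent contrasts of M₃. The sketch below evaluates a few of these contrasts on toy SNP effects; the numerical values are illustrative assumptions, not estimates from the study, and the dictionary layout is hypothetical.

```python
# Simple SNP effects (allele 1 minus allele 0) for each ancestry/background pair:
# "DD" = dent allele in dent background, "DA" = dent allele in admixed background,
# "FA" = flint allele in admixed background, "FF" = flint allele in flint background.
SCENARIO_A = {"DD": 1.0, "DA": 1.0, "FA": -0.5, "FF": -0.5}  # local genomic difference
SCENARIO_B = {"DD": 2.0, "DA": 1.0, "FA": 1.0, "FF": 0.0}    # background interaction

CONTRASTS = {
    "DD-DA": {"DD": 1, "DA": -1},
    "FF-FA": {"FF": 1, "FA": -1},
    "DA-FA": {"DA": 1, "FA": -1},
    "(DD+DA)-(FF+FA)": {"DD": 1, "DA": 1, "FF": -1, "FA": -1},
    "(DD+FF)-(DA+FA)": {"DD": 1, "FF": 1, "DA": -1, "FA": -1},
    "(DD-DA)-(FF-FA)": {"DD": 1, "DA": -1, "FF": -1, "FA": 1},
}

def evaluate(coeffs, effects):
    """Value of one linear combination of the simple SNP effects."""
    return sum(weight * effects[key] for key, weight in coeffs.items())

def profile(effects):
    """Values of all listed contrasts for a given configuration of effects."""
    return {name: evaluate(coeffs, effects) for name, coeffs in CONTRASTS.items()}

profile_a = profile(SCENARIO_A)
profile_b = profile(SCENARIO_B)
for name in CONTRASTS:
    print(f"{name:>16}  (a) {profile_a[name]:5.1f}   (b) {profile_b[name]:5.1f}")
```

Under scenario (a) only the contrasts between ancestries are non-zero, whereas under scenario (b) the contrasts between pure and admixed backgrounds for a given ancestry become non-zero and the within-admixed contrast between ancestries vanishes.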

**Fig. 3. Schematic of allele effects when divergent SNP effects are observed between groups, depending on the biological hypothesis: (a) local genomic difference between groups (LD or mutation) and (b) allele effects interacting with the genetic background.**

For the three GWAS models, a SNP was discarded if its minor allelic state, as defined in Table 2, was carried by fewer than 10 individuals, or if it carried redundant genetic information (i.e. information identical to that of another SNP already included in the dataset). To avoid prohibitive computational times, a two-step strategy was adopted for the inference of models M₂ and M₃. In a first step, the parameters of the “null” model of Eq (1) were estimated. The variance parameters were then plugged into their respective covariance matrices in order to derive a genetic covariance matrix G and an error covariance matrix R. In a second step, a model was fitted that included SNP fixed effects, as defined in M₂ (or M₃), and two random effects (one genetic effect and one error effect) with covariance matrices G and R, respectively. Note that this strategy corresponds to fitting M₂ (or M₃) while keeping some variance ratios fixed to their respective values obtained in the “null” model.
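The first step of this strategy, plugging the null-model variance parameters into the kinship blocks to obtain G and R, can be sketched as follows. The function name, the dictionary layout and all numerical values are hypothetical and serve only to show how the block structure of Eq (1) is assembled.

```python
def assemble_covariances(kin, var_g, cov_g, var_e, sizes):
    """Build the genetic (G) and error (R) covariance matrices.

    kin[(k, kp)]: kinship block (list of rows) between backgrounds k and kp.
    var_g[k]: genetic variance in background k.
    cov_g[frozenset((k, kp))]: genetic covariance between backgrounds k and kp.
    var_e: error variance; sizes[k]: number of lines in background k.
    """
    order = list(sizes)
    G = []
    for k in order:
        for i in range(sizes[k]):
            row = []
            for kp in order:
                # Diagonal blocks are scaled by a variance, off-diagonal blocks
                # by the covariance between the two genetic backgrounds.
                scale = var_g[k] if k == kp else cov_g[frozenset((k, kp))]
                row.extend(scale * x for x in kin[(k, kp)][i])
            G.append(row)
    n = len(G)
    R = [[var_e if i == j else 0.0 for j in range(n)] for i in range(n)]
    return G, R

# Toy example with one line per genetic background (values are illustrative).
sizes = {"D": 1, "F": 1, "A": 1}
kin = {("D", "D"): [[1.0]], ("F", "F"): [[1.0]], ("A", "A"): [[1.0]],
       ("D", "F"): [[0.0]], ("F", "D"): [[0.0]],
       ("D", "A"): [[0.5]], ("A", "D"): [[0.5]],
       ("F", "A"): [[0.5]], ("A", "F"): [[0.5]]}
var_g = {"D": 4.0, "F": 2.0, "A": 3.0}
cov_g = {frozenset(("D", "F")): 1.0, frozenset(("D", "A")): 2.0,
         frozenset(("F", "A")): 1.5}
G, R = assemble_covariances(kin, var_g, cov_g, 0.5, sizes)
print(G)
print(R)
```

In the second step G and R are held fixed for every SNP, so only the fixed effects and an overall scaling remain to be estimated, which is what makes the genome-wide scan tractable.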

Model parameters were estimated using ReML and the linear combinations of fixed effects were tested using Wald tests, both implemented in the R-package MM4LMM [52]. P-values were computed using the $\chi^2(1)$ asymptotic null distribution of the Wald statistic, as presented in [4]. The false discovery rate (FDR) was controlled by applying the procedure of [53] jointly to the whole set of tests defined by each GWAS strategy, and repeatedly for each trait. All GWAS strategies were evaluated for their ability to control type I error and for their statistical power, using simulated phenotypes. Results are presented in S2 Appendix. In general, all models correctly controlled for false positives, and a higher power was observed for multi-group models, notably due to their ability to identify QTLs with complex configurations of effects.
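The testing step can be illustrated with a short sketch in Python rather than with the MM4LMM interface, which is not reproduced here. It computes a one-degree-of-freedom Wald test for a linear combination of fixed effects and then applies an FDR threshold. The FDR function assumes that the procedure of [53] is the Benjamini-Hochberg step-up procedure, which this text does not state; all names and numbers are hypothetical.

```python
from math import erfc, sqrt

def wald_pvalue(contrast, beta_hat, cov_beta):
    """Wald statistic and p-value for H0: contrast' beta = 0.

    cov_beta is the estimated covariance matrix of the fixed effects.
    The p-value uses the chi-square(1) null distribution:
    P(chi2_1 > w) = erfc(sqrt(w / 2)).
    """
    estimate = sum(c * b for c, b in zip(contrast, beta_hat))
    variance = sum(ci * cov_beta[i][j] * cj
                   for i, ci in enumerate(contrast)
                   for j, cj in enumerate(contrast))
    stat = estimate ** 2 / variance
    return stat, erfc(sqrt(stat / 2.0))

def benjamini_hochberg(pvalues, q):
    """Boolean rejection flags at FDR level q (assumed step-up procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            cutoff = rank  # largest rank whose p-value passes its threshold
    rejected = set(order[:cutoff])
    return [i in rejected for i in range(m)]

# Toy example: contrast between two allele effects with independent estimates.
stat, pval = wald_pvalue((-1, 1), (0.0, 2.0), [[0.5, 0.0], [0.0, 0.5]])
print(stat, pval)

toy_pvalues = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(toy_pvalues, 0.05))
print(benjamini_hochberg(toy_pvalues, 0.20))
```

Applying the correction jointly to all tests of a strategy means that the list of p-values pools every hypothesis at every SNP for that strategy and trait.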

For a given hypothesis tested, significant SNPs were clustered into QTLs if they were located within a physical window of 3 Mbp, leading to a LD below 0.05 between markers of different QTLs.
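One possible reading of this clustering rule is sketched below: significant SNPs are sorted by position and a new QTL is opened whenever the gap to the previous significant SNP on the same chromosome exceeds the window. Whether the window is applied to consecutive SNPs, as assumed here, or to the span of a whole cluster is not specified in the text; the function name and the toy positions are hypothetical.

```python
def cluster_snps(snps, window_bp=3_000_000):
    """Group significant SNPs, given as (chromosome, position), into QTLs.

    ASSUMPTION: a SNP joins the current cluster when it lies on the same
    chromosome and within window_bp of the previous significant SNP.
    """
    clusters = []
    for chrom, pos in sorted(snps):
        last = clusters[-1] if clusters else None
        if last and last[-1][0] == chrom and pos - last[-1][1] <= window_bp:
            last.append((chrom, pos))
        else:
            clusters.append([(chrom, pos)])
    return clusters

# Toy positions (not real associations).
toy_snps = [(2, 126_000_000), (2, 124_500_000), (1, 159_000_000), (2, 132_000_000)]
qtls = cluster_snps(toy_snps)
print(len(qtls), qtls)
```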

Results

Associations detected and comparison of GWAS strategies

We observed a substantial phenotypic variability within the dent, flint and admixed genetic backgrounds for both traits. The variance components estimated in the phenotypic analysis are summarized in S1 Table. GxE variances were limited and the broad sense heritabilities were high for each genetic background, ranging from 0.88 in the admixed lines to 0.96 in the dent and flint lines for both MF and FF. The model parameters estimated using the general polygenic model of Eq (1) are presented in S2 Table and showed a larger genetic variance in the dent compared to the flint and admixed genetic backgrounds.

For each GWAS model, two FDR levels, 5% and 20%, were used to declare a SNP as significantly associated. The number of significant SNPs detected and the corresponding number of QTLs are summarized in Table 4 for both traits. The location of QTLs detected using a FDR of 20% is represented along the genome in Fig 4 for MF and in S7 Fig for FF. All associations are listed in S3 and S4 Tables. Note that some SNPs were declared significant by one model (e.g. M₁) but were discarded with another model (e.g. M₃) because of the filtering on the frequency of each allelic state.

**Tab. 4. Number of SNPs associated with each trait, depending on the GWAS strategy, using a FDR of 5% and 20%.**

**Fig. 4. Position of QTLs detected with (a) M₁, (b) M₂ and (c) M₃ for MF using a FDR of 20%.**

First, a standard GWAS model M₁ was applied separately to the dent and the flint datasets. Based on a 20% FDR, 35 SNPs were associated with MF in the dent dataset while 21 SNPs were associated in the flint dataset. These SNPs can be clustered into 12 QTLs in the dent dataset and into 13 QTLs in the flint dataset. Interestingly, none of these SNPs were detected in both datasets and they only pointed to one common QTL between datasets, which was located in the vicinity of Vgt2 on chromosome 8 [15].

Secondly, dent and flint datasets were analyzed jointly using model M₂, which takes into account the dent or flint ancestry of the allele. Note that the allele ancestry is confounded with the genetic background in this model. Based on a 20% FDR, 10 SNPs were associated with MF and were significant for $\Delta^m_{D}$ (5 SNPs), $\Delta^m_{F}$ (4 SNPs) and $\Delta^m_{D+F}$ (3 SNPs). Some SNPs displayed more than one significant test, which explains why the total number of SNPs over the four tests did not sum to 10. These SNPs can be clustered into 5 QTLs that were significant for $\Delta^m_{D}$ (4 QTLs), $\Delta^m_{F}$ (2 QTLs) and $\Delta^m_{D+F}$ (2 QTLs). Some QTLs were already detected using M₁ such as the QTL located in the vicinity of Vgt3 on chromosome 3 [54, 55] detected in the dent dataset. Other QTLs were specific to M₂, like the QTL located on chromosome 1 detected using $\Delta^m_{D-F}$ for FF, or specific to M₁, such as the QTL located on chromosome 2 detected in the flint dataset. Based on a 20% FDR, a larger number of QTLs was detected with M₁ compared to M₂ for both traits.

Finally, the dent, flint and admixed lines were analyzed jointly using model M₃, which distinguished the allele ancestry and the genetic background. The existence of a dent SNP effect was tested in the dent ($\Delta^m_{DD}$) and in the admixed genetic backgrounds ($\Delta^m_{DA}$), and similarly for the flint SNP effect ($\Delta^m_{FF}$ and $\Delta^m_{FA}$). Several hypotheses on general and divergent SNP effects were also tested between ancestries and genetic backgrounds (Table 3). Based on a 20% FDR, 56 SNPs were associated with MF and were significant for $\Delta^m_{DD+DA+FA+FF}$ (19 SNPs), $\Delta^m_{FF-FA}$ (2 SNPs), $\Delta^m_{DA-FA}$ (4 SNPs) and others. These SNPs can be clustered into 17 QTLs that were significant for $\Delta^m_{DD+DA+FA+FF}$ (5 QTLs), $\Delta^m_{FF-FA}$ (2 QTLs), $\Delta^m_{DA-FA}$ (4 QTLs) and others. Some of the QTLs were already detected using M₁ and M₂ such as the QTL located in the vicinity of Vgt3 on chromosome 3, while several QTLs were specific to M₃ such as the QTL detected on chromosome 2 using $\Delta^m_{FA}$. Several QTLs were detected as showing a divergent SNP effect, including hypotheses testing an interaction with the genetic background. Based on a 20% FDR, a similar number of QTLs was detected using M₃ and M₁ for MF, and M₃ was intermediate between M₁ and M₂ for FF.

Highlighted QTLs

Among the 17 QTLs detected for MF with M₃, six QTLs were selected and studied in further detail. These QTLs had (i) at least one significant test among M₃ hypotheses based on a FDR of 20%, and (ii) a high frequency of each allelic state, with a minimum of 23 lines carrying the minor allelic state (at Vgt1). Among them, SNPs were located in the vicinity of known maize flowering QTLs: Vgt1 [22, 47, 48], Vgt2 [15] and Vgt3 [54, 55]. For all QTLs, information concerning their physical position along the genome, the frequency of each allelic state and their -log₁₀(pval) at each test is summarized in Table 5. The distribution of the phenotypes is illustrated for each allele after adjusting for the variation due to the polygenic background in Fig 5, and their location along the genome is indicated by red vertical lines in Fig 4.

**Fig. 5. Boxplots of phenotypes adjusted for polygenic background variation using relatedness (MF K corrected) for the different alleles of the six highlighted QTLs: (a) *Vgt1*, (b) *Vgt2*, (c) *Vgt3*, (d) *QTL4.1*, (e) *QTL2.1* and (f) *QTL7.2* using M₃.**

**Tab. 5. Information regarding the six highlighted QTLs.**

The SNP matching the Vgt1 region on chromosome 8 was detected as associated with MF (20% FDR) using Δ^m_{(DD+DA)−(FF+FA)} (-log₁₀(pval) = 5.96) in M₃. This QTL showed a contrasted effect between alleles of different ancestries with an apparent inversion of effects (Fig 5a). This observation was supported by a high -log₁₀(pval) for the tests related to a divergent SNP effect between ancestries: Δ^m_{D−F} (3.83), Δ^m_{DD−FF} (3.90), Δ^m_{DA−FA} (4.13) and Δ^m_{(DD+DA)−(FF+FA)} (5.96). Conversely, a low -log₁₀(pval) was detected for tests Δ^m_{DD−DA} and Δ^m_{FF−FA}, which would have otherwise suggested an interaction with the genetic background. These results support the existence of a local genomic difference at Vgt1 between the dent and the flint genetic groups for MF, but no interaction with the genetic background.

The SNP matching the Vgt2 region on chromosome 8 was detected as associated with MF (20% FDR) using Δ^m_{DD+DA+FA+FF} (-log₁₀(pval) = 6.68) in M₃. This QTL showed a conserved effect across ancestries and genetic backgrounds (Fig 5b). This observation was supported by a high -log₁₀(pval) for tests related to a general SNP effect: Δ^m_{D+F} (6.04), Δ^m_{DD+FF} (6.30), Δ^m_{DD+DA} (5.23), Δ^m_{DA+FA} (3.65) and Δ^m_{DD+DA+FA+FF} (6.68), and a low -log₁₀(pval) for tests related to divergent SNP effects (all below 1).

The SNP matching the Vgt3 region on chromosome 3 was detected as associated with MF (5% FDR) using Δ^m_{DD} (-log₁₀(pval) = 8.69) in M₃. This QTL showed a large effect in the dent genetic background, a medium effect in the admixed genetic background regardless of the allele ancestry and a small effect in the flint genetic background (Fig 5c). This observation was supported by a high -log₁₀(pval) for the tests related to the dent SNP effect in the dent genetic background: Δ^m (M₁ (Dent), 10.99), Δ^m_{D} (9.42) and Δ^m_{DD} (8.69), and a low -log₁₀(pval) for the tests related to the flint SNP effect in a flint genetic background. As for Vgt2, a high -log₁₀(pval) was also detected for tests related to a general SNP effect: Δ^m_{D+F} (7.81), Δ^m_{DD+FF} (7.11), Δ^m_{DD+DA} (6.09) and Δ^m_{DD+DA+FA+FF} (6.81). Unlike for Vgt2, however, a high -log₁₀(pval) was detected for the test related to a divergent SNP effect between the dent and the flint genetic backgrounds: Δ^m_{DD−FF} (3.47). There was also a high -log₁₀(pval) for a divergent dent SNP effect between different genetic backgrounds: Δ^m_{DD−DA} (2.28). All these results support the existence of a QTL effect that tends to be higher when the dent genome proportion increases within individuals. This suggests that Vgt3 interacts with the genetic background for MF.

The SNP matching a region further referred to as QTL4.1 on chromosome 4 was detected as associated with MF (20% FDR) using Δ^m_{DD−FF} (-log₁₀(pval) = 6.59) in M₃. This QTL is very similar to Vgt1 as it showed a contrasted effect between alleles of different ancestries with an apparent inversion of effects (Fig 5d). This observation was supported by a high -log₁₀(pval) for the tests related to a divergent SNP effect between ancestries: Δ^m_{D−F} (5.54), Δ^m_{DD−FF} (6.59) and Δ^m_{(DD+DA)−(FF+FA)} (5.38). These results support the existence of a local genomic difference at QTL4.1 between the dent and the flint genetic groups for MF, but no interaction with the genetic background.

The SNP matching a region further referred to as QTL2.1 on chromosome 2 was detected as associated with MF (5% FDR) using Δ^m_{FA} (-log₁₀(pval) = 8.99) in M₃. This QTL showed a flint effect in the admixed genetic background (Fig 5e), which was supported by a high -log₁₀(pval) for the test Δ^m_{FA} (8.99). Although there was a high -log₁₀(pval) for a general flint SNP effect across genetic backgrounds: Δ^m_{FF+FA} (6.42), a high -log₁₀(pval) was observed for a divergent SNP effect between those same alleles: Δ^m_{FF−FA} (3.98). A high -log₁₀(pval) was also observed for a divergent SNP effect between different ancestries in the admixed genetic background: Δ^m_{DA−FA} (5.44). All these results support the existence of a QTL effect existing only for alleles of flint ancestry in the admixed genetic background. This suggests that QTL2.1 is specific to flint ancestry and interacts with the genetic background for MF.

The SNP matching a region further referred to as QTL7.2 on chromosome 7 was detected as associated with MF (20% FDR) using Δ^m_{(DD−DA)−(FF−FA)} (-log₁₀(pval) = 6.20) in M₃. This QTL showed contrasted dent effects between the dent and the admixed genetic backgrounds (Fig 5f). This observation was supported by a high -log₁₀(pval) for the test related to a divergent dent SNP effect between genetic backgrounds: Δ^m_{DD−DA} (5.43). A high -log₁₀(pval) was also observed for the hypothesis testing the equality between the divergent dent SNP effect and the divergent flint SNP effect: Δ^m_{(DD−DA)−(FF−FA)} (6.20). All these results support the existence of a QTL with opposite effects between the dent and the admixed genetic backgrounds. This suggests that QTL7.2 interacts with the genetic background for MF.

Discussion

Accounting for genetic groups in GWAS

The stratification of the population sample into distinct genetic groups is a common feature in GWAS studies that challenges the methods to detect QTLs. A simple way to deal with genetic groups is to analyze them separately. In our study, a standard GWAS model M₁ was applied separately to the dent and the flint datasets. Among the QTLs detected for MF, only one was detected in both dent and flint datasets, and not at the same SNPs, while none were detected in common for FF. One may question whether observing such differences between datasets indicated group specific allele effects, or simply group differences in terms of statistical power due to a difference in allele frequency. This question often arises when GWAS is applied separately to genetic groups, as in maize [16, 56] or dairy cattle [57, 58], and is very difficult to answer except for obvious configurations such as associations at SNPs segregating only in one group.

Another way to handle genetic groups is to analyze them jointly. One possibility is to apply model M₁ while specifying genetic structure as a global fixed effect, in order to prevent the detection of spurious associations. In dairy cattle, this strategy generally improved the precision concerning QTL locations by taking advantage of the low LD extent observed in multi-group datasets. However, while [34] and [33] observed a gain in statistical power due to a larger population size, [32] detected fewer QTLs by combining breeds compared to separate analyses. They attributed this finding to the limited number of QTLs segregating within both Holstein and Jersey breeds, but also reported that QTLs detected in both breeds showed only small to medium correlations between within-breed estimates of SNP effects (e.g. 0.082 for milk yield). Applying M₁ jointly to genetic groups therefore does not directly address the question of whether QTL effects are conserved between genetic groups.

A model specifying group-specific allele effects was referred to as M₂ in this study. As with M₁, the existence of a SNP effect can be tested for each group, but M₂ also allows one to test the existence of general and divergent SNP effects between groups. In our study, this model allowed us to test for a dent (Δ^m_{D}) and a flint (Δ^m_{F}) SNP effect, along with a general (Δ^m_{D+F}) and a divergent (Δ^m_{D−F}) SNP effect between flint and dent ancestries. Note that testing Δ^m_{D+F} is similar, although not strictly equivalent, to testing a SNP effect by applying M₁ to a multi-group dataset. Using Δ^m_{D+F} = β^m_{1D} − β^m_{0D} + β^m_{1F} − β^m_{0F} in M₂, the same weights are given to allelic contrasts in the two groups. Applying M₁ to a multi-group dataset would only be equivalent to applying M₂ when considering markers with identical allele frequencies in the two groups. Using the hypotheses specifically tested in M₂ (Δ^m_{D+F} and Δ^m_{D−F}), it was possible to detect new QTLs that were not detected with M₁. In particular, a QTL detected on chromosome 1 for FF had a divergent SNP effect between the dent and flint genetic groups, suggesting the existence of group-specific QTL effects in this dataset. Some QTLs were detected in common with M₁ but each strategy allowed the detection of specific QTLs, demonstrating the complementarity between the models. In conclusion, M₂ was efficient at identifying QTLs with either conserved or specific allele effects between ancestries, but observing group-specific allele effects provided little insight regarding the cause of this specificity. Admixed individuals helped to tackle this issue.
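The M₂ hypotheses can be read as weight vectors applied to the four allele-by-ancestry effects (β^m_{0D}, β^m_{1D}, β^m_{0F}, β^m_{1F}). The sketch below illustrates this reading under stated assumptions: the effect estimates and their covariance matrix are hypothetical numbers standing in for the output of a fitted mixed model, and a one-degree-of-freedom Wald test is used purely for illustration. The names `CONTRASTS_M2` and `wald_test` are introduced here and do not come from the study.

```python
import math
import numpy as np

# Order of the effects at SNP m: (beta_0D, beta_1D, beta_0F, beta_1F)
CONTRASTS_M2 = {
    "D":   np.array([-1.0, 1.0,  0.0,  0.0]),  # dent SNP effect
    "F":   np.array([ 0.0, 0.0, -1.0,  1.0]),  # flint SNP effect
    "D+F": np.array([-1.0, 1.0, -1.0,  1.0]),  # general SNP effect
    "D-F": np.array([-1.0, 1.0,  1.0, -1.0]),  # divergent SNP effect
}

def wald_test(beta_hat, cov_beta, c):
    """One-degree-of-freedom Wald test of H0: c' beta = 0."""
    estimate = float(c @ beta_hat)
    variance = float(c @ cov_beta @ c)
    stat = estimate ** 2 / variance
    # Upper tail of a chi-square with 1 df, written with the error function
    pval = math.erfc(math.sqrt(stat / 2.0))
    return estimate, stat, pval

# Hypothetical estimates: allele 1 delays flowering in dent, hastens it in flint
beta_hat = np.array([0.0, 2.0, 0.0, -2.0])
cov_beta = 0.25 * np.eye(4)  # hypothetical covariance of the estimates

results = {name: wald_test(beta_hat, cov_beta, c)
           for name, c in CONTRASTS_M2.items()}
```

In this toy configuration the two group-specific effects cancel in the general contrast (estimate 0) and add up in the divergent contrast (estimate 4), which is why a QTL with opposite effects across ancestries can be missed by a general test yet detected by a divergent one.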

Benefits from admixed individuals

Admixed individuals were generated for this study by mating pure individuals of each group according to a sparse factorial design. Integrating these admixed individuals in GWAS can be done by simply analyzing the joint multi-group dataset using M₁ or M₂, which may lead to a gain in statistical power, due to an increase in population size. More interestingly, admixed individuals can be used to disentangle the factors causing the heterogeneity of allele effects across groups.

We developed model M₃ to distinguish the allele ancestry (dent/flint) and the genetic background (dent/flint/admixed). As shown using simulations (S2 Appendix), applying M₃ should result in a gain in statistical power by (i) testing an overall SNP effect for SNPs with conserved effects across ancestries and/or genetic backgrounds, and (ii) testing hypotheses for complex configurations between allele effects. When applied to MF, 17 QTLs were detected (20% FDR). While many of these QTLs were previously detected using M₁ and M₂, the new hypotheses tested allowed us to discover new interesting regions.
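As an illustration of how the M₃ hypotheses separate the two sources of heterogeneity, the sketch below evaluates several of the contrasts named in this article on two hypothetical effect profiles. The four inputs are the allelic contrasts for a dent allele in the dent background (DD), a dent allele in the admixed background (DA), a flint allele in the admixed background (FA) and a flint allele in the flint background (FF). All numbers are invented for illustration, and `HYPOTHESES_M3` and `evaluate` are names introduced here.

```python
import numpy as np

# Weights over the allelic contrasts, in the order (DD, DA, FA, FF)
HYPOTHESES_M3 = {
    "DD+DA+FA+FF":     [1,  1,  1,  1],  # general SNP effect
    "DD-DA":           [1, -1,  0,  0],  # dent effect: dent vs admixed background
    "FF-FA":           [0,  0, -1,  1],  # flint effect: flint vs admixed background
    "DA-FA":           [0,  1, -1,  0],  # ancestries compared in the admixed background
    "DD-FF":           [1,  0,  0, -1],  # pure dent vs pure flint
    "(DD+DA)-(FF+FA)": [1,  1, -1, -1],  # divergence between ancestries
    "(DD-DA)-(FF-FA)": [1, -1,  1, -1],  # difference of background divergences
}

def evaluate(effects):
    """Value of every contrast for a vector of (DD, DA, FA, FF) effects."""
    e = np.asarray(effects, dtype=float)
    return {name: float(np.dot(w, e)) for name, w in HYPOTHESES_M3.items()}

# Hypothetical profile 1: effect depends on allele ancestry only
ancestry_driven = evaluate([1.5, 1.5, -1.5, -1.5])

# Hypothetical profile 2: effect depends on the genetic background only
# (large in dent, intermediate in admixed whatever the ancestry, small in flint)
background_driven = evaluate([3.0, 2.0, 2.0, 1.0])
```

In the first profile the contrasts between ancestries are non-zero while those between backgrounds vanish, the pattern interpreted in this article as a local genomic difference. In the second, the ancestries do not differ within the admixed background while the contrasts between backgrounds are non-zero, the pattern interpreted as an interaction with the genetic background.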

For equivalent tests in M₁, M₂ and M₃ (e.g. Δ^m (Dent) in M₁, Δ^m_{D} in M₂ and Δ^m_{DD} in M₃), the lower number of associations detected with M₂ and M₃ compared to M₁ for real traits can be attributed to a different filtering on allele frequencies, the use of an approximate model for M₂ and M₃, and to the randomness associated with a particular experiment. Regarding false positive control, the observation of the QQ-plots of the test p-values of M₁, M₂ and M₃ did not show particular problems, as presented for MF in S8, S9, and S10 Figs and for FF in S11, S12 and S13 Figs.

The idea of exploiting admixed individuals has been proposed in the creation of NAM [40] and MAGIC [41] populations. Compared to our approach, such experimental populations include a limited number of founders, generally selected in different genetic groups. This is beneficial to increase the power of detection for alleles that were rare in parental groups. However, these populations cannot address the question of the epistatic interaction with the genetic background of the original groups. Our approach and the NAM and MAGIC designs are therefore expected to have complementary properties.

Heterogeneity of maize flowering QTL allele effects

From a global perspective, a high number of QTLs have been detected in previous maize studies [16, 22, 37, 59, 60]. When evaluating the American and European NAMs, [22] and [61] showed that flowering time is a trait controlled by a large number of QTLs, many of which display variable effects across individual recombinant populations. Consistent with these findings, our study highlights a high number of QTLs and confirms a large variation in allele effects. It provides further elements on the origin of this variation, by identifying QTLs affected by local genomic differences, epistasis with the genetic background, or both.

When doing GWAS in a multi-group population, geneticists generally assume that QTL effects are conserved between groups. Such QTLs were detected in our study with the example of the SNP associated with MF in the vicinity of Vgt2 [15] and its candidate gene: the flowering activator ZCN8 [62–64] on chromosome 8. At this SNP, all hypotheses that tested a general SNP effect had a high -log₁₀(pval), and conversely for hypotheses testing a divergent SNP effect. When simultaneously interpreting all tests, Vgt2 appeared to have an effect that is conserved between genetic groups. Such a QTL can easily be detected in a multi-group population sample using a standard GWAS model [1]. However many QTLs showed more complex patterns.

When group-specific allele effects are only due to group differences in LD or group-specific mutations at the QTL, the difference in allele effects should be conserved between the pure and the admixed genetic backgrounds. A first QTL matching this situation is Vgt1 [22, 47, 48] (candidate gene: ZmRap2.7) that was detected by a SNP located on chromosome 8. High -log₁₀(pval) values were observed when testing for a divergent SNP effect between ancestries (Δ^m_{(DD+DA)−(FA+FF)}), suggesting a local genomic difference. It remains difficult to disentangle the effect of LD from that of a genetic mutation without complementary analysis. LD was shown to be different between groups, with a higher LD extent in the dent group (S5 Fig), while LD phases appeared well-conserved at short distances (S6 Fig). However, a strong overall conservation of LD phases at short distances does not exclude a specific configuration for a given SNP-QTL pair. Note that Vgt1 was surprisingly not detected using the MITE located 548 Kbp before the detected SNP. [48] already showed the existence of other genetic variants being more associated with maize flowering than the MITE in the vicinity of Vgt1, such as CGindel587. Another QTL (QTL4.1) was detected by a SNP located on chromosome 4 and had a very similar profile to that of Vgt1. Its position is close (< 700 Kbp) to GRMZM2G126253, a candidate gene for maize flowering time proposed by [60]. To validate the hypothesis of a local genomic difference at these QTLs, one could produce near isogenic lines with the two alleles from both ancestries introgressed into a dent and a flint genetic background. A phenotypic evaluation of these individuals would give definitive proof of a local genomic difference.
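The notion of LD phase conservation between groups can be illustrated with a small sketch. It assumes that the LD phase of a SNP pair is summarized by the sign of the within-group correlation r between the two SNPs, which is one common convention and not necessarily the exact statistic behind S6 Fig. The genotype matrices are hypothetical toy data (rows are inbred lines, columns are SNPs coded 0/1), and `ld_r` and `phase_agreement` are names introduced here.

```python
import numpy as np

def ld_r(geno):
    """Pairwise correlation between SNPs (columns) within one group."""
    return np.corrcoef(np.asarray(geno, dtype=float), rowvar=False)

def phase_agreement(geno_a, geno_b, pairs):
    """For each SNP pair, True when the sign of r is the same in both groups."""
    r_a, r_b = ld_r(geno_a), ld_r(geno_b)
    return np.array([np.sign(r_a[i, j]) == np.sign(r_b[i, j]) for i, j in pairs])

# Hypothetical toy genotypes: 5 lines per group, 3 SNPs
toy_dent = np.array([[0, 0, 0],
                     [1, 1, 1],
                     [0, 0, 1],
                     [1, 1, 1],
                     [1, 0, 0]])
toy_flint = np.array([[0, 0, 1],
                      [1, 1, 0],
                      [0, 0, 1],
                      [1, 1, 0],
                      [1, 0, 1]])

# Adjacent SNP pairs; in real data pairs would be binned by physical distance
agreement = phase_agreement(toy_dent, toy_flint, pairs=[(0, 1), (1, 2)])
```

In this toy example the first pair has the same phase in both groups and the second pair has opposite phases. The second pair is the kind of configuration that, for a SNP-QTL pair, would produce an apparent inversion of SNP effects between ancestries even when the causal allele effect is identical.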

Group-specific allele effects may also be due to an interaction with the genetic background. A first QTL matching this profile was detected by a SNP in the vicinity of Vgt3 on chromosome 3 [54, 55] and its candidate gene ZmMADS69 [65]. This QTL showed an effect varying according to the genetic background: large in the dent, intermediate in the admixed and small in the flint. A high -log₁₀(pval) was observed for tests that supported this hypothesis: a dent SNP effect in the dent genetic background (Δ^m_{DD}) and a divergent dent SNP effect between genetic backgrounds (Δ^m_{DD−DA}). If this interaction with the background involves numerous loci, introgressing alleles from a dent into a flint genetic background may lead to disappointing results, as the effect would probably vanish with repeated back-cross generations. If interactions mostly involve a single locus, the effect at Vgt3 is conditioned by the allele at the other locus, so that a simultaneous introgression may be necessary to reach the desired effect. Using near isogenic lines that cumulated an early mutation at Vgt1 [66] and the early allele at Vgt3, the effect of Vgt3 was shown to vanish in the presence of the early allele of Vgt1 (A. Charcosset pers. comm.), which supports the hypothesis of Vgt3 interacting with the genetic background. Recently, [65] demonstrated the action of ZmMADS69, the candidate gene of Vgt3, as being an activator of the regulatory module ZmRap2.7-ZCN8, which are the candidate genes of Vgt1 and Vgt2, respectively. The existence of such interactions is consistent with flowering time being controlled by a network of interacting loci, as is now well established in the model species Arabidopsis [67].

Other examples of QTLs interacting with the genetic background were identified. Two of them featured a similar profile in the sense that they mainly exhibited a QTL effect in the admixed genetic background. One was located on chromosome 2 (QTL2.1) and showed a flint effect in the admixed genetic background, while the other QTL was located on chromosome 7 (QTL7.2) and showed an opposite dent effect between the dent and the admixed genetic backgrounds. Such QTLs are interesting as they are mainly revealed when creating admixed genetic material. They also suggest complex epistatic interactions between QTLs for these traits. The position of QTL2.1 is close (< 1.4 Mbp) to ereb197 and the position of QTL7.2 is close (< 100 Kbp) to dof47. Both are candidate genes for maize flowering time proposed by [60].

The existence of epistatic interactions was also evaluated globally by decomposing the genetic variance into an additive and an epistatic component, as suggested by [68]. This confirmed the existence of epistatic interactions between pairs of loci for FF and MF (S5 Table) and supported the possibility of QTLs interacting with the genetic background, resulting from epistatic interactions with loci that have differentiated allele frequencies between groups. It would be interesting to test the existence of epistatic interactions between each pair of loci. However, filtering on crossed allele frequencies between pairs of loci would lead to discarding most SNPs from the analysis. Another possibility would be to test the epistatic variance of each SNP against the polygenic background, as proposed by [69–71].
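A decomposition into additive and epistatic variance components requires one covariance matrix per component. The sketch below shows one standard construction, given here as an assumption because the exact matrices used with [68] are not restated in this section: an additive kinship computed from centred genotypes in the style of [51], and an additive-by-additive covariance obtained as its element-wise (Hadamard) product. The genotypes are simulated toy data, and the function names and the rescaling of the epistatic matrix are choices made for this illustration.

```python
import numpy as np

def additive_kinship(geno):
    """Kinship from genotypes coded 0/2 (homozygous inbred lines)."""
    geno = np.asarray(geno, dtype=float)
    p = geno.mean(axis=0) / 2.0              # allele frequencies
    z = geno - 2.0 * p                       # centre each SNP
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def epistatic_kinship(k_add):
    """Additive-by-additive covariance as the Hadamard product of K with itself,
    rescaled to a mean diagonal of 1 (a scaling choice made for this sketch)."""
    k_epi = k_add * k_add                    # element-wise product
    return k_epi / np.mean(np.diag(k_epi))

# Simulated toy genotypes (hypothetical): 8 inbred lines, 50 SNPs
rng = np.random.default_rng(0)
toy_geno = 2 * rng.integers(0, 2, size=(8, 50))

k_add = additive_kinship(toy_geno)
k_epi = epistatic_kinship(k_add)
```

The two matrices would then enter a mixed model as the covariance structures of two random genetic effects, with variance components estimated by restricted maximum likelihood and the epistatic component assessed with a likelihood-ratio test as reported in S5 Table. That estimation step is not shown here.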

Conclusion

In this study, we proposed an innovative multi-group GWAS method which accounts for and tests the heterogeneity of QTL allele effects between groups. The addition of admixed individuals to the dataset was useful to disentangle the factors causing the heterogeneity of allele effects, being either local genomic differences or epistatic interactions with the genetic background. Only homozygous inbred lines were considered in this study, but the method may be generalized to heterozygous individuals. Recently, many studies have focused on the problem of genomic prediction across genetic groups [42, 72–75]. In such scenarios, the stability of QTL effects across genetic backgrounds is an important factor impacting the prediction accuracy. It is also an important factor in the relevance of any marker-based diagnostic in complex or structured populations. Our approach opens new perspectives to investigate this stability in a wide range of species.

Supporting information

S1 Fig [tif]
Imputation diagram of admixed lines.

S2 Fig [tif]
Histogram of dent genome proportion among admixed lines.

S3 Fig [tif]
Genome-wide selection biases among admixed lines.

S4 Fig [tif]
PCoA on genetic distances using the set of 9,015 shared SNPs between the 600K and 15K datasets.

S5 Fig [tif]
LD extent.

S6 Fig [a]
Conservation of LD phases.

S7 Fig [a]
Position of QTLs detected for FF.

S8 Fig [tif]
QQ-plots of M₁ for MF.

S9 Fig [tif]
QQ-plots of M₂ for MF.

S10 Fig [tif]
QQ-plots of M₃ for MF.

S11 Fig [tif]
QQ-plots of M₁ for FF.

S12 Fig [tif]
QQ-plots of M₂ for FF.

S13 Fig [tif]
QQ-plots of M₃ for FF.

S1 Table [xlsx]
Parameters estimated in the phenotypic analysis.

S2 Table [xlsx]
Parameters estimated using the general polygenic model.

S3 Table [delta]
Information regarding significant SNPs for MF.

S4 Table [delta]
Information regarding significant SNPs for FF.

S5 Table [2]
Additive, epistatic and residual variance components for each trait with the p-value (pval) of the epistatic component using a likelihood-ratio LR test.

S1 Appendix [pdf]
Interpretation of the test .

S2 Appendix [pdf]
False discovery rate and statistical power of GWAS models.

References

1. Yu J, Pressoir G, Briggs WH, Vroh Bi I, Yamasaki M, Doebley JF, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genetics. 2006;38:203–208. doi: 10.1038/ng1702 16380716

2. Wright S. Evolution in Mendelian populations. Genetics. 1931;16:97–159. 17246615

3. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics. 2006;38:904–909. doi: 10.1038/ng1847 16862161

4. Rincent R, Moreau L, Monod H, Kuhn E, Melchinger AE, Malvar RA, et al. Recovering Power in Association Mapping Panels with Variable Levels of Linkage Disequilibrium. Genetics. 2014;197(1):375–387. doi: 10.1534/genetics.113.159731 24532779

5. Pritchard JK, Przeworski M. Linkage Disequilibrium in Humans: Models and Data. The American Journal of Human Genetics. 2001;69(1):1–14. doi: 10.1086/321275 11410837

6. Rogers AR. How Population Growth Affects Linkage Disequilibrium. Genetics. 2014;197(4):1329–1341. doi: 10.1534/genetics.114.166454 24907258

7. Sawyer SL, Mukherjee N, Pakstis AJ, Feuk L, Kidd JR, Brookes AJ, et al. Linkage disequilibrium patterns vary substantially among populations. European Journal of Human Genetics. 2005;13:677–686. doi: 10.1038/sj.ejhg.5201368 15657612

8. Evans DM, Cardon LR. A comparison of linkage disequilibrium patterns and estimated population recombination rates across multiple populations. The American Journal of Human Genetics. 2005;76:681–687. doi: 10.1086/429274 15719321

9. de Roos APWM, Hayes BJ, Spelman RJ, Goddard ME. Linkage disequilibrium and persistence of phase in Holstein-Friesian, Jersey and Angus cattle. Genetics. 2008;179:1503–1512. doi: 10.1534/genetics.107.084301 18622038

10. Porto-Neto LR, Kijas JW, Reverter A. The extent of linkage disequilibrium in beef cattle breeds using high-density SNP genotypes. Genetics Selection Evolution. 2014;46(1):22. doi: 10.1186/1297-9686-46-22

11. Badke YM, Bates RO, Ernst CW, Schwab C, Steibel JP. Estimation of linkage disequilibrium in four US pig breeds. BMC Genomics. 2012;13(1):24. doi: 10.1186/1471-2164-13-24 22252454

12. Hao C, Wang L, Ge H, Dong Y, Zhang X. Genetic Diversity and Linkage Disequilibrium in Chinese Bread Wheat (Triticum aestivum L.) Revealed by SSR Markers. PLOS ONE. 2011;6(2):1–13. doi: 10.1371/journal.pone.0017279

13. Van Inghelandt D, Reif JC, Dhillon BS, Flament P, Melchinger AE. Extent and genome-wide distribution of linkage disequilibrium in commercial maize germplasm. Theoretical and Applied Genetics. 2011;123(1):11–20. doi: 10.1007/s00122-011-1562-3 21404061

14. Technow F, Riedelsheimer C, Schrag TA, Melchinger AE. Genomic prediction of hybrid performance in maize with models incorporating dominance and population specific marker effects. Theoretical and Applied Genetics. 2012;125(6):1181–1194. doi: 10.1007/s00122-012-1905-8 22733443

15. Bouchet S, Servin B, Bertin P, Madur D, Combes V, Dumas F, et al. Adaptation of Maize to Temperate Climates: Mid-Density Genome-Wide Association Genetics and Diversity Patterns Reveal Key Genomic Regions, with a Major Contribution of the Vgt2 (ZCN8) Locus. PLOS ONE. 2013;8(8):1–17. doi: 10.1371/journal.pone.0071377

16. Rincent R, Nicolas S, Bouchet S, Altmann T, Brunel D, Revilla P, et al. Dent and Flint maize diversity panels reveal important genetic potential for increasing biomass production. Theoretical and Applied Genetics. 2014;127(11):2313–2331. doi: 10.1007/s00122-014-2379-7 25301321

17. Stryjecki C, Alyass A, Meyre D. Ethnic and population differences in the genetic predisposition to human obesity. Obesity Reviews. 2018;19(1):62–80. doi: 10.1111/obr.12604 29024387

18. Tang H. Confronting ethnicity-specific disease risk. Nature Genetics. 2006;38(1):12–15. doi: 10.1038/ng0106-13

19. Helgadottir A, Manolescu A, Helgason A, Thorleifsson G, Thorsteinsdottir U, Gudbjartsson DF, et al. A variant of the gene encoding leukotriene A4 hydrolase confers ethnicity-specific risk of myocardial infarction. Nature Genetics. 2006;38(1):68–74. doi: 10.1038/ng1692 16282974

20. Barroso I, Luan J, Wheeler E, Whittaker P, Wasson J, Zeggini E, et al. Population-Specific Risk of Type 2 Diabetes Conferred by HNF4A P2 Promoter Variants. Diabetes. 2008;57(11):3161–3165. doi: 10.2337/db08-0719 18728231

21. Neuman RJ, Wasson J, Atzmon G, Wainstein J, Yerushalmi Y, Cohen J, et al. Gene-Gene Interactions Lead to Higher Risk for Development of Type 2 Diabetes in an Ashkenazi Jewish Population. PLOS ONE. 2010;5(3):1–6. doi: 10.1371/journal.pone.0009903

22. Buckler ES, Holland JB, Bradbury PJ, Acharya CB, Brown PJ, Browne C, et al. The Genetic Architecture of Maize Flowering Time. Science. 2009;325(5941):714–718. doi: 10.1126/science.1174276 19661422

23. Durand E, Bouchet S, Bertin P, Ressayre A, Jamin P, Charcosset A, et al. Flowering Time in Maize: Linkage and Epistasis at a Major Effect Locus. Genetics. 2012;190(4):1547–1562. doi: 10.1534/genetics.111.136903 22298708

24. Evangelou E, Ioannidis JPA. Meta-analysis methods for genome-wide association studies and beyond. Nature Reviews Genetics. 2013;14:379–389. doi: 10.1038/nrg3472

25. Li YR, Keating BJ. Trans-ethnic genome-wide association studies: advantages and challenges of mapping in diverse populations. Genome Medicine. 2014;6(10):91. doi: 10.1186/s13073-014-0091-5 25473427

26. Ioannidis JPA, Ntzani EE, Trikalinos TA. ‘Racial’ differences in genetic effects for complex diseases. Nature Genetics. 2004;36(12):1312–1318. doi: 10.1038/ng1474 15543147

27. Marigorta UM, Navarro A. High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants. PLOS Genetics. 2013;9(6):1–13. doi: 10.1371/journal.pgen.1003566

28. Ntzani EE, Liberopoulos G, Manolio TA, Ioannidis JPA. Consistency of genome-wide associations across major ancestral groups. Human Genetics. 2012;131(7):1057–1071. doi: 10.1007/s00439-011-1124-4 22183176

29. Cole JB, VanRaden PM, O’Connell JR, Van Tassell CP, Sonstegard TS, Schnabel RD, et al. Distribution and location of genetic effects for dairy traits. Journal of Dairy Science. 2009;92(6):2931–2946. doi: 10.3168/jds.2008-1762 19448026

30. Hayes BJ, Pryce J, Chamberlain AJ, Bowman PJ, Goddard ME. Genetic Architecture of Complex Traits and Accuracy of Genomic Prediction: Coat Colour, Milk-Fat Percentage, and Type in Holstein Cattle as Contrasting Model Traits. PLOS Genetics. 2010;6(9):1–11. doi: 10.1371/journal.pgen.1001139

31. Cole JB, Wiggans GR, Ma L, Sonstegard TS, Lawlor TJ, Crooker BA, et al. Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary U.S. Holstein cows. BMC Genomics. 2011;12(1):408. doi: 10.1186/1471-2164-12-408 21831322

32. Raven LA, Cocks BG, Hayes BJ. Multibreed genome wide association can improve precision of mapping causative variants underlying milk production in dairy cattle. BMC Genomics. 2014;15(1):62. doi: 10.1186/1471-2164-15-62 24456127

33. van den Berg I, Boichard D, Lund MS. Comparing power and precision of within-breed and multibreed genome-wide association studies of production traits using whole-genome sequence data for 5 French and Danish dairy cattle breeds. Journal of Dairy Science. 2016;99(11):8932–8945. doi: 10.3168/jds.2016-11073 27568046

34. Sanchez MP, Govignon-Gion A, Croiseau P, Fritz S, Hozé C, Miranda G, et al. Within-breed and multi-breed GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein composition in dairy cattle. Genetics Selection Evolution. 2017;49(1):68. doi: 10.1186/s12711-017-0344-z

35. Flint-Garcia SA, Thuillet AC, Yu J, Pressoir G, Romero SM, Mitchell SE, et al. Maize association population: a high-resolution platform for quantitative trait locus dissection. The Plant Journal. 2005;44(6):1054–1064. doi: 10.1111/j.1365-313X.2005.02591.x 16359397

36. Camus-Kulandaivelu L, Veyrieras JB, Madur D, Combes V, Fourmann M, Barraud S, et al. Maize Adaptation to Temperate Climate: Relationship Between Population Structure and Polymorphism in the Dwarf8 Gene. Genetics. 2006;172(4):2449–2463. doi: 10.1534/genetics.105.048603 16415370

37. Romay MC, Millard MJ, Glaubitz JC, Peiffer JA, Swarts KL, Casstevens TM, et al. Comprehensive genotyping of the USA national maize inbred seed bank. Genome Biology. 2013;14(6):R55. doi: 10.1186/gb-2013-14-6-r55 23759205

38. Rius M, Darling JA. How important is intraspecific genetic admixture to the success of colonising populations? Trends in Ecology & Evolution. 2014;29(4):233–242. doi: 10.1016/j.tree.2014.02.003

39. Brandenburg JT, Mary-Huard T, Rigaill G, Hearne SJ, Corti H, Joets J, et al. Independent introductions and admixtures have contributed to adaptation of European maize and its American counterparts. PLOS Genetics. 2017;13(3):1–30. doi: 10.1371/journal.pgen.1006666

40. McMullen MD, Kresovich S, Villeda HS, Bradbury P, Li H, Sun Q, et al. Genetic Properties of the Maize Nested Association Mapping Population. Science. 2009;325(5941):737–740. doi: 10.1126/science.1174320 19661427

41. Cavanagh C, Morell M, Mackay I, Powell W. From mutations to MAGIC: resources for gene discovery, validation and delivery in crop plants. Current Opinion in Plant Biology. 2008;11(2):215–221. doi: 10.1016/j.pbi.2008.01.002 18295532

42. Rio S, Mary-Huard T, Moreau L, Charcosset A. Genomic selection efficiency and a priori estimation of accuracy in a structured dent maize panel. Theoretical and Applied Genetics. 2019;132(1):81–96. doi: 10.1007/s00122-018-3196-1 30288553

43. Bordes J, Dumas de Vaulx R, Lapierre A, Pollacsek M. Haplodiploidization of maize (Zea mays L.) through induced gynogenesis assisted by glossy markers and its use in breeding. Agronomie. 1997;17:291–297. doi: 10.1051/agro:19970504

44. Unterseer S, Bauer E, Haberer G, Seidel M, Knaak C, Ouzunova M, et al. A powerful tool for genome analysis in maize: development and evaluation of the high density 600 k SNP genotyping array. BMC Genomics. 2014;15(1):823. doi: 10.1186/1471-2164-15-823 25266061

45. Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. The American Journal of Human Genetics. 2009;84(2):210–23. doi: 10.1016/j.ajhg.2009.01.005 19200528

46. Ganal MW, Durstewitz G, Polley A, Bérard A, Buckler ES, Charcosset A, et al. A Large Maize (Zea mays L.) SNP Genotyping Array: Development and Germplasm Genotyping, and Genetic Mapping to Compare with the B73 Reference Genome. PLOS ONE. 2011;6(12):1–15. doi: 10.1371/journal.pone.0028334

47. Salvi S, Sponza G, Morgante M, Tomes D, Niu X, Fengler KA, et al. Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize. Proceedings of the National Academy of Sciences of the United States of America. 2007;104(27):11376–11381. doi: 10.1073/pnas.0704145104 17595297

48. Ducrocq S, Madur D, Veyrieras JB, Camus-Kulandaivelu L, Kloiber-Maitz M, Presterl T, et al. Key Impact of Vgt1 on Flowering Time Adaptation in Maize: Evidence From Association Mapping and Ecogeographical Information. Genetics. 2008;178(4):2433–2437. doi: 10.1534/genetics.107.084830 18430961

49. Butler DG, Cullis BR, Gilmour AR, Gogel BJ, Thompson R. ASReml-R Reference Manual Version 4; 2009. VSN International Ltd, Hemel Hempstead, HP1 1ES, UK.

50. Wientjes YCJ, Bijma P, Vandenplas J, Calus MPL. Multi-population Genomic Relationships for Estimating Current Genetic Variances Within and Genetic Correlations Between Populations. Genetics. 2017;207(2):503–515. doi: 10.1534/genetics.117.300152 28821589

51. VanRaden PM. Efficient Methods to Compute Genomic Predictions. Journal of Dairy Science. 2008;91(11):4414–4423. doi: 10.3168/jds.2007-0980 18946147

52. Laporte F, Charcosset A, Mary-Huard T. Efficient ReML inference in Variance Component Mixed Models using Min-Max algorithms. 2019. Forthcoming.

53. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological). 1995;57(1):289–300.

54. Salvi S, Corneti S, Bellotti M, Carraro N, Sanguineti MC, Castelletti S, et al. Genetic dissection of maize phenology using an intraspecific introgression library. BMC Plant Biology. 2011;11:4. doi: 10.1186/1471-2229-11-4 21211047

55. Salvi S, Emanuelli F, Soriano JM, Zamariola L, Giuliani S, Bovina R, et al. Cloning of Vgt3, a major QTL for flowering time in maize. In: 59th Annual Maize Genetics Conference; 2017.

56. Revilla P, Rodríguez VM, Ordás A, Rincent R, Charcosset A, Giauffret C, et al. Association mapping for cold tolerance in two large maize inbred panels. BMC Plant Biology. 2016;16(1):127. doi: 10.1186/s12870-016-0816-2 27267760

57. Buitenhuis B, Janss LL, Poulsen NA, Larsen LB, Larsen MK, Sørensen P. Genome-wide association and biological pathway analysis for milk-fat composition in Danish Holstein and Danish Jersey cattle. BMC Genomics. 2014;15(1):1112. doi: 10.1186/1471-2164-15-1112 25511820

58. Buitenhuis B, Poulsen NA, Larsen LB, Sehested J. Estimation of genetic parameters and detection of quantitative trait loci for minerals in Danish Holstein and Danish Jersey milk. BMC Genetics. 2015;16(1):52. doi: 10.1186/s12863-015-0209-9 25989905

59. Chardon F, Virlon B, Moreau L, Falque M, Joets J, Decousset L, et al. Genetic architecture of flowering time in maize as inferred from quantitative trait loci meta-analysis and synteny conservation with the rice genome. Genetics. 2004;168(4):2169–2185. doi: 10.1534/genetics.104.032375 15611184

60. Li Yx, Li C, Bradbury PJ, Liu X, Lu F, Romay CM, et al. Identification of genetic variants associated with maize flowering time using an extremely large multi-genetic background population. The Plant Journal. 2016;86(5):391–402. doi: 10.1111/tpj.13174 27012534

61. Giraud H, Lehermeier C, Bauer E, Falque M, Segura V, Bauland C, et al. Linkage Disequilibrium with Linkage Analysis of Multiline Crosses Reveals Different Multiallelic QTL for Hybrid Performance in the Flint and Dent Heterotic Groups of Maize. Genetics. 2014;198(4):1717–1734. doi: 10.1534/genetics.114.169367 25271305

62. Meng X, Muszynski MG, Danilevskaya ON. The FT-Like ZCN8 Gene Functions as a Floral Activator and Is Involved in Photoperiod Sensitivity in Maize. The Plant Cell. 2011;23(3):942–960. doi: 10.1105/tpc.110.081406 21441432

63. Lazakis CM, Coneva V, Colasanti J. ZCN8 encodes a potential orthologue of Arabidopsis FT florigen that integrates both endogenous and photoperiod flowering signals in maize. Journal of Experimental Botany. 2011;62(14):4833–4842. doi: 10.1093/jxb/err129 21730358

64. Guo L, Wang X, Zhao M, Huang C, Li C, Li D, et al. Stepwise cis-Regulatory Changes in ZCN8 Contribute to Maize Flowering-Time Adaptation. Current Biology. 2018;28(18):3005–3015. doi: 10.1016/j.cub.2018.07.029 30220503

65. Liang Y, Liu Q, Wang X, Huang C, Xu G, Hey S, et al. ZmMADS69 functions as a flowering activator through the ZmRap2.7-ZCN8 regulatory module and contributes to maize flowering time adaptation. New Phytologist. 2019;221(4):2335–2347. doi: 10.1111/nph.15512 30288760

66. Chardon F, Hourcade D, Combes V, Charcosset A. Mapping of a spontaneous mutation for early flowering time in maize highlights contrasting allelic series at two-linked QTL on chromosome 8. Theoretical and Applied Genetics. 2005;112(1):1–11. doi: 10.1007/s00122-005-0050-z 16244856

67. Bouché F, Lobet G, Tocquin P, Périlleux C. FLOR-ID: an interactive database of flowering-time gene networks in Arabidopsis thaliana. Nucleic Acids Research. 2015;44(D1):D1167–D1171. doi: 10.1093/nar/gkv1054 26476447

68. Vitezica ZG, Legarra A, Toro MA, Varona L. Orthogonal Estimates of Variances for Additive, Dominance, and Epistatic Effects in Populations. Genetics. 2017;206(3):1297–1307. doi: 10.1534/genetics.116.199406 28522540

69. Jannink JL. Identifying Quantitative Trait Locus by Genetic Background Interactions in Association Studies. Genetics. 2007;176(1):553–561. doi: 10.1534/genetics.106.062992 17179077

70. Crawford L, Zeng P, Mukherjee S, Zhou X. Detecting epistasis with the marginal epistasis test in genetic mapping studies of quantitative traits. PLOS Genetics. 2017;13(7):1–37. doi: 10.1371/journal.pgen.1006869

71. Legarra A, Vitezica ZG, Naval-Sánchez M, Henshall J, Raidan F, Li Y, et al. Association analysis of loci implied in “buffering” epistasis. bioRxiv. 2019;637579.

72. de Roos APW, Hayes BJ, Goddard ME. Reliability of Genomic Predictions Across Multiple Populations. Genetics. 2009;183(4):1545–1553. doi: 10.1534/genetics.109.104935 19822733

73. Chen L, Schenkel F, Vinsky M, Crews DH, Li C. Accuracy of predicting genomic breeding values for residual feed intake in Angus and Charolais beef cattle. Journal of Animal Science. 2013;91(10):4669–4678. doi: 10.2527/jas.2013-5715 24078618

74. Guo Z, Tucker DM, Basten CJ, Gandhi H, Ersoz E, Guo B, et al. The impact of population structure on genomic prediction in stratified populations. Theoretical and Applied Genetics. 2014;127(3):749–762. doi: 10.1007/s00122-013-2255-x 24452438

75. Duhnen A, Gras A, Teyssèdre S, Romestant M, Claustres B, Dayde J, et al. Genomic Selection for Yield and Seed Protein Content in Soybean: A Study of Breeding Program Data and Assessment of Prediction Accuracy. Crop Science. 2017;57(3):1325–1337. doi: 10.2135/cropsci2016.06.0496
