Abstract
Background
Phenotypic variation along environmental gradients has been documented among and within many species, and in some cases, genetic variation has been shown to be associated with these gradients. Bayenv is a relatively new method developed to detect patterns of polymorphisms associated with environmental gradients. Using a Bayesian Markov Chain Monte Carlo (MCMC) approach, Bayenv evaluates whether a linear model relating population allele frequencies to environmental variables is more probable than a null model based on observed frequencies of neutral markers. Although this method has been used to detect environmental adaptation in a number of species, including humans, plants, fish, and mosquitoes, stability between independent runs of this MCMC algorithm has not been characterized. In this paper, we explore the variability of results between runs and the factors contributing to it.
Results
Independent runs of the Bayenv program were carried out using genomewide singlenucleotide polymorphism (SNP) data from samples from 60 worldwide human populations following previous applications of the Bayenv method. To assess factors contributing to the method's stability, we used varying numbers of MCMC iterations and also analyzed a second modified data set that excluded two Siberian populations with extreme climate variables. Between any two runs, correlations between Bayes factors and the overlap of SNPs in the empirical p value tails were surprisingly low. Enrichments of genic versus nongenic SNPs in the empirical tails were more robust than the empirical p values; however, the significance of the enrichments for some environmental variables still varied among runs, contradicting previously published conclusions. Runs with a greater number of MCMC iterations slightly reduced runtorun variability, and excluding the Siberian populations did not have a large effect on the stability of the runs.
Conclusions
Because of high runtorun variability, we advise against making conclusions about genomewide patterns of adaptation based on only one run of the Bayenv algorithm and recommend caution in interpreting previous studies that have used only one run. Moving forward, we suggest carrying out multiple independent runs of Bayenv and averaging Bayes factors between runs to produce more stable and reliable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.
Keywords:
Environmental adaptation; Positive selection; Genomewide scans; Human adaptation; Markov chain monte carlo; Natural selectionBackground
Phenotypic variation along environmental gradients, such as limb length in animals increasing with colder climates [1], animal body size increasing in colder environments [2], and bird coloration becoming darker in more humid climates [3], was first documented in the 1800s. Since then, many associations between phenotypic variation and environmental variation, both among and between species, have been noted [420]. To define these patterns, Huxley [21] introduced the term ‘cline’, which refers to a continuous phenotypic or genetic gradient that is associated with a pattern of geographic and/or environmental heterogeneity.
Various methods have been developed to determine the contribution of environmental variation to allelic or genotypic frequency patterns. Early theory in population genetics was used to derive statistics that allowed variation in allele frequencies to distinguish selection from drift [2224]. Additionally, it has been shown that correlations between features of the environment and allele frequencies can be detected [15,2527]. Recently, the availability of genomewide data on DNA polymorphisms has spurred interest in using haplotype frequencies or allele frequency spectra to infer the existence or extent of selection across genomes [2837].
Coop et al. [38] developed a statistical procedure named Bayenv to detect genomic evidence of adaptation to specific environmental variation, and it is this method we address here. The method carries out a locusbylocus Bayesian analysis to determine whether a pattern in which population allele frequencies vary linearly with environmental variables is more probable than a pattern in which allele frequencies are assumed to be neutral and unrelated to environmental variables.
Given a set of populations with associated environmental variables and genomewide singlenucleotide polymorphism (SNP) allele frequencies, a null model is first estimated to describe how allele frequencies covary across populations. Sample allele frequencies are drawn from a set of underlying population frequencies, which are assumed to be distributed according to a multivariate normal distribution around a transformed global allele frequency with a variancecovariance matrix representing the genetic structure of the populations. These transformed frequencies are not constrained between 0 and 1. The prior on the variancecovariance matrix is an inverse Wishart distribution, and Markov chain Monte Carlo (MCMC) computation is used to explore the posterior distribution of the covariance matrix, given the ancestral allele frequency and population allele frequencies.
After the covariance of allele frequencies between populations is estimated, the posterior probability of an alternative model for an individual SNP is determined. The alternative model predicts that the transformed population allele frequencies at a SNP are normally distributed around the ancestral allele frequency according to a linear model:
where θ is the transformed allele frequency, α is the ancestral allele frequency, Y is the environmental variable, and Ω is the variancecovariance matrix that was estimated in the first step of this procedure.
For each SNP, the Bayes factor (BF) is the posterior probability of the data at that SNP under the alternative model, integrated over all parameters, then divided by the posterior probability of the data under the null model, integrated over all parameters of the null model. Using a single run of the MCMC under the null distribution and importance sampling, the Bayes factors for multiple environmental variables are estimated quickly for each of the SNPs assayed (see additional details in Coop et al. [38]). This method has been shown to have good power compared to a range of other approaches [39].
The Bayenv method has been applied to many species including humans [4044], plants [4549], fish [50,51], and mosquitoes [52]. In all cases, the objective was to determine the extent to which a species' pattern of genomic variation can be inferred to be a response to environmental variation  for example climate, geography, photoperiod, or soil type  and hence could be regarded as a signal of adaptation.
Hancock et al. [40] applied the Bayenv method to genomewide data from 61 worldwide populations for evidence of adaptation to 11 local climatic variables. Analyzing the pattern of variation of genomewide SNP allele frequencies across the populations, they found evidence for adaptation to several variables, the most significant being latitude, summer relative humidity, summer solar radiation, and winter relative humidity. They reported signs of adaptation at SNPs previously identified by genomewide association studies (GWAS) to be significantly related to phenotypes associated with environmental gradients, as well as an enrichment of genic versus nongenic SNPs among the SNPs most strongly correlated with the climate variables.
Although the Bayenv method has been applied to different species and environmental variables and has produced some results agreeing with other analytical methods, the stability of the results across multiple MCMC runs is not known. It is also unknown how the results may be affected by the inclusion of specific populations from extreme environments, which could potentially introduce bias [53]. In this study, we carried out many independent runs of the Bayenv program using data from Hancock et al. [40] to determine the degree of variability between runs and the factors that may contribute to any variability.
During the writing of this manuscript, an updated version of this program, Bayenv 2.0 [53], was released. While the Bayes factor estimation procedure is the same, it adds features that allow for the input of pooled population sequencing data and the computation of nonparametric correlations of the allele frequencies with the environmental variables. Although using this new version is advisable, the analyses of the original Bayenv program presented here remain relevant both to further applications of the method as well as to studies using results of previous Bayenv analyses (i.e., Fraser [54]).
Results
All 60 populations; 100,000 MCMC iterations
First, we analyzed the full data set of 60 populations and assessed correlations with 11 climate variables using 100,000 MCMC iterations (as in Hancock et al. [40]). We compared results between pairs of five independent runs as well as between each of our runs and Hancock et al.'s [40] published results. We report comparisons of our runs with an updated set of results made available online by the authors (which we refer to as Hancock2) that used a different number of MCMC burnins from the original set of results made available (Hancock1) (see ‘Methods’ section for details). The 16 pairs analyzed here are shown by a's in Table 1.
Table 1. Pairs of Bayenv runs compared
We found that Bayenv produced slightly different results in independent runs of the program. For a given climate variable, pairwise correlations of the logarithms of Bayes factors (BF's) were computed for all SNPs from each run, and correlations were averaged across all pairs of runs mentioned above (shown in Table 1). The averaged correlation coefficients ranged from 0.66 to 0.89 (Table 2a), depending on the climate variable. Despite analyzing only 60 of the 61 populations analyzed in Hancock et al. [40], correlations among our runs were similar to correlations with results from Hancock et al. [40]. Correlations between the two versions of Hancock et al.'s [40] published results were also comparable.
Table 2. Average correlations of the log of the Bayes factors (BF) and average correlations of the empirical p values between runs for each climate variable
Correlations of empirical p values for all SNPs between the pairs of runs listed in Table 1 were also calculated; these correlation coefficients ranged from 0.63 to 0.85 (Table 2e), depending on the climate variable. Again, all pairs of runs compared displayed similar correlation values. For both the BF and p value correlations, the longitude variable had the lowest correlation among runs while winter radiation flux had the highest.
To examine the consistency of SNPs present in the empirical tails of each run, the overlap of SNPs in the tails with p value cutoffs of 0.05, 0.01, 0.005, and 0.001 was calculated. Overlap was defined as the proportion of overlapping SNPs relative to the total number of SNPs in the tail; this proportion was averaged over all climate variables. All pairs of runs showed comparable proportions of overlaps, both when compared among the Blair runs and between Blair runs and Hancock2. These overlap proportions ranged from 0.24 in the 0.001 tails of both runs to 0.57 in the 0.05 tails of both runs when Blair1 was compared to Hancock2 (Table 3a), with similar values for overlap between all Blair runs. In general, greater SNP overlap occurred when the SNPs in one run's tail were compared to the less extreme tail in another run. (In other words, more SNPs in the 0.001 tail of run A overlapped with the SNPs in the 0.005 tail of run B than the 0.001 tail of run B, and even more overlapped with the SNPs in the 0.01 tail of run B).
Table 3. Empirical p value tail comparisons between indicated runs
To assess whether SNPs with more extreme p values had greater overlap among runs, we ordered the empirical p values from the lowest to the highest for each climate variable and binned the SNPs in groups of 1,000. Each bin was compared to the corresponding ordered bin in each of the other Blair runs. Bins of lower p values had the highest overlap and overlap decreased as the empirical p values increased. The mean overlap for the top 1,000 SNPs was 0.44, and overlap was much lower for less significant SNPs (Table 4).
Table 4. Overlap of ranked SNPs (ranked by empirical p value from lowest to highest and binned in groups of 1,000) between runs
There was also variation among the results at individual SNPs. For example, SNPs in the CORIN gene region (rs4558846, rs6447571, rs17601068) gave strong signals (log_{10}BF = 21.9, 28.7, and 20.8 and empirical p values = 2.0 × 10^{5}, 3.1 × 10^{5}, and 2.1 × 10^{5}) of association with minimum winter temperature as reported in Hancock et al. [40] (which we refer to as Hancock1), but show low signals in other runs: Blair1 (log_{10}BF = 0.32, 0.50, and 0.48 and empirical p values = 0.40, 0.57, and 0.44) and Hancock2 (log_{10}BF = 0.33, 0.07, and 0.16 and empirical p values = 0.36, 0.22, and 0.20). Some of this variability could be due to the limited burnin in Hancock1 (see ‘Methods’ section). Other SNPs, such as rs2075756 in TRIP6, show more consistency. Blair1 obtained a log_{10}BF of 4.0 and empirical p value of 7.9 × 10^{4}, while Hancock2 reported a log_{10}BF of 1.4 and empirical p value of 3.2 × 10^{3} for this SNP. This SNP shows a strong signal with absolute latitude, the variable for which we saw consistently strong signals of genic/nongenic enrichment among SNPs with low empirical p values (see below).
We then assessed whether genic SNPs were enriched among SNPs with the strongest signals by comparing the fraction of genic SNPs to nongenic SNPs in the empirical tails. This enrichment was compared across all five of our runs as well as with Hancock1 and Hancock2. The absolute enrichment values (calculated as described in the ‘Methods’ section) varied across runs and differed from Hancock's [40] published results (results not shown). The significance values of the genic/nongenic enrichments, which represent the proportion of bootstrapped samples that are also enriched, also varied across runs (i.e., for summer solar radiation) (Table 5).
Table 5. Genic/nongenic enrichments for each climate variable in the indicated empirical tails
The absolute latitude and winter relative humidity variables were almost always significant, with the exception of the 0.005 tail in Blair5; the summer precipitation variable was never significant. All three of these findings agreed with Hancock et al.'s published results [40]. However, summer solar radiation and humidity were not consistently significant, though reported to be significant in Hancock et al. [40]. Among the remaining variables, results varied among runs, and our runs showed less significance on average than Hancock et al.'s published results [40].
All 60 populations; 500,000 MCMC iterations
In order to determine whether the observed variability between runs was due to the MCMC not converging, the number of MCMC iterations was increased fivefold to 500,000. Five independent runs were performed, each with different seeds, but with the same full data set of 60 populations. All pairs of the five runs with 500,000 iterations were compared (shown as b's in Table 1).
Between independent runs, both the Bayes factors and p values showed higher correlation coefficients compared to those between independent runs with 100,000 iterations. When averaged across all pairs of runs, the Bayes factor correlation coefficients ranged from 0.82 to 0.95 (Table 2b), and the correlation coefficients of the p values ranged from 0.81 to 0.92 (Table 2f). These correlations are significantly higher than among runs of 100,000 iterations for all environmental variables: paired Wilcoxon tests of these correlations, which for each climate variable are averaged across pairs of runs, give a p value of 0.0038 for the Bayes factors and a p value of 0.0038 for the empirical p values. The SNPs in the empirical p value tails were compared as before, and although the overlap was greater between long runs than between runs with 100,000 iterations, the proportion of overlapping SNPs in the tails was still not high, ranging from 0.64 in the 0.05 tails of two runs to 0.36 in the 0.001 tails of two runs (Table 3b).
Genic/nongenic enrichment significance values also stabilized somewhat in the runs with 500,000 iterations but still did not show the same significance values from run to run (Table 5). Genic/nongenic enrichments were highly significant for the absolute latitude and relative humidity winter variables and again were not significant for the summer precipitation variable across all five runs. Interestingly, none of the five runs showed significant genic/nongenic enrichment for winter precipitation rate, except in the 0.01 tail in Blair5. As before, varying levels of significance were seen for summer solar radiation and summer relative humidity. This differs from Hancock et al. [40], who reported high significance across all tails for these environmental variables.
In searching for the source of the differences described above, biases from allele frequency, SNP ascertainment panel, and method of calculating empirical p values were investigated, but none of these potential biases was shown to explain the variability among runs. First, the data were reanalyzed using empirical p values that were not binned by allele frequency. P value overlap did not increase when binned vs. unbinned p values were compared. Second, the proportions of SNPs from each ascertainment panel in each tail were calculated, and these proportions were found to be similar to the overall proportions of SNPs in each ascertainment panel. Additionally, when each ascertainment panel was considered separately, the overlap in the tails of p values was similar to the overlap in the tails when all three ascertainment panels were considered together.
Without Siberian populations; 150,000 iterations
Because the Siberian populations, the Naukan Yup’ik and Maritime Chukchee, live in some of the most extreme climatic regions among the 60 sampled populations, it may be possible that removing these populations from the data set could affect the variability of Bayes factors, empirical p values, and genic/nongenic enrichments among independent runs. All pairs of runs using this modified data set were compared (shown as c's in Table 1). Pairwise correlations of the log of the Bayes factors (Table 2c) and p values (Table 2g) as well as fractions of overlap of SNPs in the 0.05, 0.01, 0.005, and 0.001 tails of the empirical p values (Table 3c) were comparable to the correlations between the runs of the full data set (60 populations with 100,000 iterations (Table 2a)). This suggests that the inclusion of populations with extreme environmental variables may not affect runtorun variability.
To determine whether excluding the Siberian populations affected the results of the Bayenv method, each of the runs without the Siberian populations was compared to the updated Hancock results published on the dbCline website (Hancock2) (shown by d's in Table 1). Pairwise correlations between the log of the Bayes factors of these pairs (Table 2d), averaged across all pairs, were similar to correlations between pairs of nonSiberian runs (Table 2c). However, correlations of empirical p values between Hancock2 and nonSiberian runs (Table 2h) were lower than correlations between pairs of nonSiberian runs (Table 2g) and also lower than the correlations between pairs of runs using the full data set (Table 2e). Similarly, overlap in the empirical tails is lower when nonSiberian runs are compared to the results from Hancock2 (Table 3d), demonstrating a slight difference in results when these populations were excluded.
Although genic/nongenic enrichment significance values varied across runs of the nonSiberian data set (Table 5), some patterns differed from the results that included the full data set of 60 populations. The genic/nongenic enrichment for the absolute latitude variable was less significant than in runs analyzing all populations (Table 5), showing a small effect of exclusion of the Siberian populations. In addition, winter precipitation rate was sometimes significant in the nonSiberian runs; this variable was significant in Hancock et al.'s [40] published data but not in our runs with the full population set using 500,000 MCMC iterations.
Discussion
We have explored the computational replicability of the Bayenv program [38] in determining evidence for positive selection in response to climatic variation. We used genomewide SNP data from 60 worldwide human populations and 11 climate variables as in Hancock et al. [40] and carried out independent Bayenv runs with 100,000 and 500,000 MCMC iterations, as well as with 150,000 iterations while omitting Siberian populations with extreme climatic variables. In comparing pairs of our runs as well as each of our runs with the results previously published in Hancock et al. [40], results from the Bayenv method appear to be variable across runs in their Bayes factors, empirical p values, and in the values and significances of enrichments of genic SNPs in the empirical tails. Although some runtorun variability is expected in any MCMC algorithm, we show here that the differences between runs are sufficient to lead to varying biological conclusions regarding human adaptation.
In runs with 100,000 MCMC iterations, correlations of Bayes factors between replicate runs were low (0.66–0.88) between pairs of our runs, between our runs and the results of Hancock et al. [40], and between the different versions of results posted by Hancock et al. (Table 2a). To explore the possibility that the MCMC had failed to converge, we increased the number of MCMC iterations to 500,000; although these correlations increased as a result (0.82–0.93) (Table 2b), there was still substantial runtorun variability in further analyses.
Because Bayes factors are dependent on the accuracy of the null model and, therefore, may be substantially inflated due to imperfections of this model [40], we focused further analyses on the empirical p values. Hancock et al. [40] write, ‘Since we cannot expect the null model to account fully for the effects of population structure, we emphasize that we cannot take the BFs themselves at face value, nor can they be directly compared across climate variables. Therefore, we took a conservative approach and conducted subsequent analyses by comparing each SNP to the empirical distribution.’ In our analysis, empirical p values were seen to be no more stable than the Bayes factors themselves, as correlation coefficients of the empirical p values were in the same range as the correlation coefficients of the Bayes factors (Table 2e). As before, the correlation of p values between runs improved when the MCMC iterations were increased to 500,000 (Table 2f). Although we recommend using a large number of MCMC iterations, this does not completely stabilize Bayenv results from run to run.
Amounts of SNP overlap in the tails of the empirical p value distributions showed that the relative ranking of the SNPs associated with environmental variables varied across runs, even with a larger number of MCMC iterations. While the overlap of SNPs between the extreme tails (i.e., top 0.01%) of two runs was found to be low, we recognize that this slightly overestimates runtorun variability. For example, 24% of the 0.001 tail of Hancock2 overlapped with the 0.001 tail of Blair1; that proportion increased to 83% with the 0.005 tail of Blair1 (Table 3a). This suggests that the SNPs with the lowest empirical p values have generally low p values in all runs. Analysis of the overlap of SNPs with low empirical p values versus the overlap of SNPs with high or midrange empirical p values shows that strong signals are also more reproducible from runtorun than weaker signals.
Despite this trend, a SNP's significance and relative ranking nearly always vary from one run of Bayenv to another. SNPs inferred to be significant using some predefined pvalue threshold in one run may not be found to be significant in another run. Thus, the low fractions of empirical pvalue tail overlap between runs suggest that caution is advisable in drawing inferences based on only one run, especially when making conclusions regarding individual SNPs or genes (see ‘Results’ section).
Finally, we assessed the stability of inferences regarding the enrichment of genic SNPs among SNPs strongly associated with a given climate variable. Evidence of such an enrichment has been argued as evidence of strong selection on genic regions in response to an associated environmental pressure [40,45]. Because the stability of these significances depends only on the ratio of genic to nongenic SNPs in the empirical tails, as opposed to the rank of each individual SNP, this measure could be consistent despite variability of Bayes factors and empirical p values. However, the significances of climate variables including summer maximum temperature, summer relative humidity, summer solar radiation, winter minimum temperature, winter precipitation rate, and winter solar radiation varied widely from run to run (Table 5).
Of particular interest are summer solar radiation and summer relative humidity, both claimed to be significant by Hancock et al. [40] but for which significance did not hold in our analyses. For these variables, as well as others, neither the enrichment values nor the significance of the enrichment values increased or decreased systematically when the tail cutoff was more extreme, as suggested in Hancock et al. [40]. The lack of significance for these variables in the Hancock2 analysis, which includes all 61 populations, shows that the exclusion of the data from Australian Aborigines from our analysis is not the cause for this difference in significance. Thus, replicate runs demonstrate that these two variables may have been less crucial in human adaptation than suggested previously [40].
Despite this variability between runs, certain climate variables show some consistency in genic/nongenic enrichments and agree with conclusions of Hancock et al. [40] (Table 5). Winter relative humidity is significant across all runs, including those in which the Siberian populations are excluded, suggesting that this may be a robust signal of adaptation in response to this climatic variable. Absolute latitude also shows a strong signal and is significant across all runs that include all populations. That we were able to replicate the significance of both of these variables, despite the fact that we did not include the Australian Aborigines (included in Hancock et al. [40]) underscores their potential importance in human adaptation.
We note that populations with extreme climate variables may indeed have an effect on some conclusions drawn from Bayenv analyses. For example, without the Siberian populations, an enrichment of genic SNPs was no longer significant for latitude (Table 5). Thus, while the inclusion of outlying populations may affect biological conclusions, it does not appear to affect runtorun variability (see ‘Results’ section).
In summary, our analysis of the consistency of genic enrichments between runs demonstrates the importance of performing multiple runs of Bayenv. Two independent runs can potentially lead to very different biological conclusions. Significance of a particular environmental variable in multiple Bayenv runs constitutes much stronger evidence for its relevance and importance for human adaptation.
A new version of the Bayenv method, Bayenv 2.0 [53], has recently been released. Running the new version with 100,000 MCMC iterations produces slightly more stable results among runs than the old version; however, the variability is comparable to the runs of the old version with 500,000 MCMC iterations (results not shown). Gunther et al. note that their inclusion of nonparametric analyses in Bayenv 2.0 helps to account for potential false positives due to outlying populations [53]; however, our results suggest that results run on the previous version of the Bayenv program that do not include outlying populations may still lead to varying results across runs. Thus, while we suggest using the new version, the results described in this manuscript remain relevant to the application of Bayenv 2.0.
It can be difficult to disentangle stability and specificity, and we recognize that some of the runtorun variability of this method may be due to less strong signals of environmental adaptation (i.e., winter minimum temperature as compared to latitude). However, as every run may be different, it is still advisable to carry out multiple runs when making biological conclusions, especially when the signal tested may not be strong to begin with. Averaging Bayes factors of two or more independent runs before calculating the empirical p values produces more stable empirical p values and genic/nongenic enrichments (results not shown). Additionally, individual SNPs that show large Bayes factors in at least one run could be rerun with more MCMC iterations to determine a Bayes factor value to which all iterations converge. Unfortunately, this process would not be as useful for a genomewide analysis, as a large set of SNPs is necessary for the calculation of the empirical p values and the genic/nongenic enrichments.
Conclusions
We conclude that, in using the Bayenv method, relying on the relative rankings of SNPs calculated from only one run of the program may lead to biological conclusions that might not have been reached with another independent run of the program. Neither increasing the number of MCMC iterations nor removing outlying populations completely eliminated this variability. Thus, we suggest carrying out several independent runs of the Bayenv analysis to assess variability of the Bayes factors and empirical p values across runs. We also suggest averaging Bayes factors from independent runs to produce more stable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.
Methods
Populations
Allele frequency data on 623,318 SNPs from samples of 60 human populations from the freq_data file (61pops_climate_freqs.txt) on the dbCline website ( http://genapps2.uchicago.edu:8081/dbcline/main.jsp webcite) were analyzed. The data set included samples from 52 HGDP populations plus Vasekela !Kung, Amhara, Naukan Yup’ik, Maritime Chukchee, Luhya, Maasai, Tuscans, and Gujarati. These were the same populations used by Hancock et al. [40], with the exception of the Australian Aborigines, who were not included in the allele frequency file on dbCline. To translate allele frequencies into allele counts, the sample sizes from Li et al. [55] were used for the HGDP populations and from the supplementary information of Hancock et al. [40] for the additional eight populations.
Climate variables
The climate data consisted of the 11 variables in Hancock et al. [40] that are posted on the dbCline website: latitude, absolute latitude, longitude, winter minimum temperature, summer maximum temperature, summer precipitation rate, winter precipitation rate, summer radiation flux, winter radiation flux, summer relative humidity, and winter relative humidity. Climate data were obtained for each population directly from the dbCline website (61pops_climate_freqs.txt).
Covariance matrix estimation
For the set of 60 populations, three covariance matrices, each one representing a different ascertainment panel (Illumina 250 K, Illumina 300 K, or Illumina AFR), were generated using the Bayenv program (as in Hancock et al. [40]). The program was run for 100,000 MCMC iterations, and the entries of each matrix were averaged over the last three output matrices (output every 5,000 iterations). Visual comparison showed the matrices to be qualitatively similar to those in Coop et al. [38], and the matrices were stable from run to run. For our analysis, one covariance matrix for each ascertainment panel was estimated as described here and used for all runs.
Bayenv runs
Five independent runs of the Bayenv program were carried out using the full data set of 60 populations and 11 climate variables. As with the covariance matrices, SNPs were analyzed with their corresponding ascertainment panel. Each run used a different random seed but the same input data set and 100,000 MCMC iterations. Bayes factors produced from the three ascertainment panels were combined into one file for each run. These runs are referred to as Blair1  Blair5 (see Table 1).
In a separate analysis, five independent runs of the Bayenv program were carried out as above, using the same full data set but increasing the MCMC iterations to 500,000 (referred to as LongBlair1  LongBlair5) (see Table 1). For all runs, we use the default MCMC burnin values.
Bayenv runs without Siberia
Five independent runs of the Bayenv program were carried out using a modified data set and 150,000 MCMC iterations. In this modified data set, all 11 climate variables were included, but the Siberian populations (Naukan Yup’ik and Maritime Chukchee) were excluded (referred to as W/O Siberia1  W/O Siberia5; see Table 1). The aim here was to explore how the Siberian populations, who live in some of the most extreme climatic regions, might affect the stability and conclusions drawn from Bayes factors, empirical p values, and genic/nongenic enrichment values (see below).
Hancock runs
Bayes factors and empirical p values from Hancock et al. [40] were obtained from the dbCline website. Following the Hancock et al. [40] publication, the authors posted a set of results, which we refer to as Hancock1, which used fewer burnins for the MCMC procedure and thus failed to converge (Anna Di Rienzo, personal correspondence). After further correspondence with the authors, a second set of results that used a longer burnin was posted in autumn 2013 (which we refer to as Hancock2). We examined BF correlations and empirical p value correlations and overlap between these two data sets (described below). For calculations of genic/nongenic enrichment (described below), we performed the analysis on both Hancock1 and Hancock2. In general, our analysis of the Hancock1 data set gave similar results to those published by Hancock et al. [40].
SNP p values
As in Hancock et al. [40], we calculated empirical p values by binning SNPs based on their global allele frequency (across 60 populations). For each SNP, a transformed rank statistic was computed by comparing the computed Bayes factor for the SNP with the Bayes factors from other SNPs in the same global allele frequency bin. For this process, all SNPs were analyzed together, and no distinction was made for SNPs belonging to different ascertainment panels.
Genic/nongenic enrichments
To calculate genic/nongenic enrichments, a SNP was defined as genic if it was within 10 kb of a gene and nongenic if it was greater than 50 kb from a gene (as in Hancock et al. [40], personal correspondence). Enrichment in an empirical tail was calculated with the following equation:
where n_{g} and n_{ng} are the number of genic and nongenic SNPs, respectively, in the empirical tail, and N_{g} and N_{ng} are the number of genic and nongenic SNPs, respectively, among all tested SNPs. To determine the significance of the genic/nongenic enrichments, 500kb regions from the genome were resampled with replacement (also as in Hancock et al. [40]). Then, given the p values of the SNPs in the resampled regions, we calculated enrichment values in each empirical tail. This bootstrap resampling procedure was repeated 10,000 times. Significance values were determined by counting the fraction of bootstrapped samples with enrichment values above 1. In Table 5, one star denotes that 95%–97.5% of bootstrapped samples were enriched, two stars denote 97.5%–99%, and three stars denote >99%; zero stars denotes that <95% of bootstrap samples were enriched, and thus that the genic enrichment is not significant.
Comparison of runs
Each run produces a Bayes factor for each climate variable for each SNP. From those, the empirical p values were calculated. For each climate variable, the Pearson correlations of the log of the Bayes factors and the Pearson correlations of the empirical p values were computed for each pair of runs compared (see Table 1). For each set of comparisons (i.e., Blair15, or W/O_Sib15 vs. Hancock2) the correlation coefficients were then averaged across all relevant pairs of runs (see letters of Table 1). For all pairs shown in Table 1, empirical p values were also compared by calculating the fraction of SNPs in the tail of one run that was present in the tail of the other run. In another analysis, to assess differences in run stability by empirical p value, SNPs were ordered by empirical p value from lowest to highest and binned into groups of 1,000. Overlaps between the corresponding bins of different runs (Blair15) were calculated. (This assessed what proportion of the 1,000 mostsignificant SNPs in Blair1 were present in the 1,000 mostsignificant SNPs in Blair2, what proportion of SNPs ranked 1,001–2,000 in Blair1 were present in SNPs ranked 1,001–2,000 of Blair2, and so forth.) Finally, significances of genic/nongenic enrichment values were calculated for each run, as described above, and compared.
Abbreviations
BF: Bayes factor; MCMC: Markov Chain Monte Carlo; SNP: Singlenucleotide polymorphism.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
LB participated in the design of the study and performed the statistical analysis. JG and MF participated in its design and coordination. All authors read and approved the final manuscript.
Acknowledgements
We would like to thank Jonathan Pritchard, Graham Coop, Brenna Henn, and Anna Di Rienzo for helpful discussion and comments, as well as two anonymous reviewers for their input. This work was supported in part by the Center for Computational, Evolutionary and Human Genomics at Stanford University.
References

Allen JA: The influence of physical conditions in the genesis of species.

Bergmann C: Über die Verhältnisse der wärmeökonomie der Thiere zu ihrer Grösse.

Gloger CL: Das Abändern der Vögel durch Einfluss des Klimas. August Schulz: Breslau; 1833.

Ashton KG, Tracy MC, de Queiroz A: Is Bergmann's rule valid for mammals?
Am Nat 2000, 156:390415. Publisher Full Text

Brown JH, Lee AK: Bergmann's rule and climatic adaptation in woodrats (Neotoma).
Evolution 1969, 23:329338. Publisher Full Text

Coyne JA, Beecham E: Heritability of two morphological characters within and among natural populations of Drosophila melanogaster.
Genetics 1987, 117:727737. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Freckleton RP, Harvey PH, Pagel M: Bergmann's rule and body size in mammals.
Am Nat 2003, 161:821825. PubMed Abstract  Publisher Full Text

Harcourt AH, Schreier BM: Diversity, body mass, and latitudinal gradients in primates.
Int J Primatol 2009, 30:283300. Publisher Full Text

Huey RB, Gilchrist GW, Carlson ML, Berrigan D, Serra L: Rapid evolution of a geographic cline in size in an introduced fly.
Science 2000, 287:308309. PubMed Abstract  Publisher Full Text

Jablonski NG, Chaplin G: The evolution of human skin coloration.
J Hum Evol 2000, 39:57106. PubMed Abstract  Publisher Full Text

Jain S, Bradshaw A: Evolutionary divergence among adjacent plant populations. I. The evidence and its theoretical analysis.
Heredity 1966, 21:407441. Publisher Full Text

Johnston RE, Selander RK: Evolution in the house sparrow II. Adaptive differentiation in North American populations.
Evolution 1971, 25:128. Publisher Full Text

Katzmarzyk PT, Leonard WR: Climatic influences on human body size and proportions: ecological adaptations and secular trends.
Am J Phys Anthropol 1998, 106:483503. PubMed Abstract  Publisher Full Text

Leonard WR, Sorensen MV, Galloway VA, Spencer GJ, Mosher MJ, Osipova L, Spitsyn VA: Climatic influences on basal metabolic rates among circumpolar populations.
Am J Hum Biol 2002, 14:609620. PubMed Abstract  Publisher Full Text

Mullen LM, Hoekstra HE: Natural selection along an environmental gradient: a classic cline in mouse pigmentation.
Evolution 2008, 62:15551570. PubMed Abstract  Publisher Full Text

Pool JE, Aquadro CF: The genetic basis of adaptive pigmentation variation in Drosophila melanogaster.
Mol Ecol 2007, 16:28442851. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Storz JF, Balasingh J, Bhat HR, Nathan PT, Doss DPS, Prakash AA, Kunz TH: Clinal variation in body size and sexual dimorphism in an Indian fruit bat, Cynopterus sphinx (Chiroptera: Pteropodidae).
Biol J Linnean Soc 2001, 72:1731. Publisher Full Text

Relethford JH: Apportionment of global human genetic diversity based on craniometrics and skin color.
Am J Phys Anthropol 2002, 118:393398. PubMed Abstract  Publisher Full Text

Sumner F: The analysis of a concrete case of intergradation between two subspecies.
Proc Natl Acad Sci 1929, 15:110120. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Roberts DF: Body weight, race and climate.
Am J Phys Anthropol 1953, 11:533558. PubMed Abstract  Publisher Full Text

CavalliSforza L: Population structure and human evolution.
Proc R Soc Lond B Biol Sci 1966, 164:362379. PubMed Abstract  Publisher Full Text

Endler JA: Natural Selection in the Wild. Princeton University Press: Princeton; 1986.

Lewontin R, Krakauer J: Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms.
Genetics 1973, 74:175195. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Haldane JB: The theory of a cline.
J Genet 1948, 48:277284. PubMed Abstract  Publisher Full Text

Slatkin M: Gene flow and selection in a cline.
Genetics 1973, 75:733756. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Lenormand T: Gene flow and the limits to natural selection.
Trends Ecol Evol 2002, 17:183189. Publisher Full Text

Akey JM, Zhang G, Zhang K, Jin L, Shriver MD: Interrogating a highdensity SNP map for signatures of natural selection.
Genome Res 2002, 12:18051814. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Barreiro L, Laval G, Quach H, Patin E, QuintanaMurci L: Natural selection has driven population differentiation in modern humans.
Nat Genet 2008, 40:340345. PubMed Abstract  Publisher Full Text

Bonin A: Explorative genome scan to detect candidate loci for adaptation along a gradient of altitude in the common frog (Rana temporaria).

Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, Absher D, Myers R, CavalliSforza LL, Feldman MW, Pritchard JK: The role of geography in human adaptation.
PLoS Genet 2009, 5:e1000500. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

FournierLevel A, Korte A, Cooper MD, Nordborg M, Schmitt J, Wilczek AM: A map of local adaptation in Arabidopsis thaliana.
Science 2011, 334:8689. PubMed Abstract  Publisher Full Text

Kohn MH, Pelz HJ, Wayne RK: Locusspecific genetic differentiation at Rw among warfarin resistant rat (Rattus norvegicus) populations.
Genetics 2003, 164:10551070. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Pickrell JK, Coop G, Novembre J, Kudaravalli S, Li JZ, Absher D, Srinivasan BS, Barsh GS, Myers RM, Feldman MW, Pritchard JK: Signals of recent positive selection in a worldwide sample of human populations.
Genome Res 2009, 19:826837. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage.
Science 2006, 312:16141620. PubMed Abstract  Publisher Full Text

Tang K, Thornton KR, Stoneking M: A new approach for using genome scans to detect recent positive selection in the human genome.
PLoS Biol 2007, 5:e171. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome.
PLoS Biol 2006, 4(3):e72. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Coop G, Witonsky D, Di Rienzo A, Pritchard JK: Using environmental correlations to identify loci underlying local adaptation.
Genetics 2010, 185:14111423. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

De Mita S, Thuillet AC, Gay L, Ahmadi N, Manel S, Ronfort J, Vigouroux Y: Detecting selection along environmental gradients: analysis of eight methods and their effectiveness for outbreeding and selfing populations.
Mol Ecol 2013, 22:13831399. PubMed Abstract  Publisher Full Text

Hancock AM, Witonsky DB, AlkortaAranburu G, Beall CM, Gebremedhin A, Sukernik R, Utermann G, Pritchard JK, Coop G, Di Rienzo A: Adaptations to climatemediated selective pressures in humans.
PLoS Genet 2011, 7(4):e1001375.
doi:10.1371/journal.pgen.1001375
PubMed Abstract  Publisher Full Text  PubMed Central Full Text 
Hancock AM, Witonsky DB, Ehler E, AlkortaAranburu G, Beall C, Gebremedhin A, Sukernik R, Utermann G, Pritchard J, Coop G, Di Reinzo A: Human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency.
Proc Natl Acad Sci 2010, 107(Suppl 2):89248930. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Hancock AM, Clark VJ, Qian Y, Di Rienzo A: Population genetic analysis of the uncoupling proteins supports a role for UCP3 in human cold resistance.
Mol Biol Evol 2011, 28:60114. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Hancock A, Witonsky D, Gordon AS, Eshel G, Pritchard JK, Coop G, Di Rienzo A: Adaptations to climate in candidate genes for common metabolic disorders.
PLoS Genet 2008, 4:e32. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Fumagalli M, Sironi M, Pozzoli U, FerrerAdmettla A, Pattini L, Nielsen R: Signatures of environmental genetic adaptation pinpoint pathogens as the main selective pressure through human evolution.
PLoS Genet 2011, 7(11):e1002355. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Hancock AM, Brachi B, Faure N, Horton MW, Jarymowycz LB, Sperone FG, Toomajian C, Roux F, Berelson J: Adaptation to climate across the Arabidopsis thaliana genome.
Science 2011, 334:8386. PubMed Abstract  Publisher Full Text

Eckert AJ, Bower AD, GonzalezMartinez SC, Wegrzyn JL, Coop G, Neale DB: Back to nature: ecological genomics of loblolly pine (Pinus taeda, Pinaceae).

Fang Z, Pyhäjärvi T, Weber AL, Dawe RK, Glaubitz JC, Gonzalez JJ, RossIbarra C, Doebley J, Morrell PL, RossIbarra J: Megabasescale inversion polymorphism in the wild ancestor of maize.
Genetics 2012, 191:883894. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Keller SR, Levsen N, Olson MS, Tiffin P: Local adaptation in the flowering time gene network of balsam poplar, Populus balsamifera L.
Mol Biol Evol 2012, 29:314352. PubMed Abstract  Publisher Full Text

Pyhajarvi T, Hufford MB, Mezmouk S, RossIbarra J: Complex patterns of local adaptation in teosinte.
Genome Biol Evol 2013, 5(9):15941609. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Jones FC, Chan YF, Schmutz J, Grimwood J, Brady SD, Southwick AM, Absher DM, Myers RM, Riemchen TE, Deagle BE, Schluter D, Kingsley DM: A genomewide SNP genotyping array reveals patterns of global and repeated speciespair divergence in sticklebacks.

Limborg MT, Blankenship SM, Young SF, Utter FM, Seeb LW, Hansen MHH, Seeb JE: Signatures of natural selection among lineages and habitats in Oncorhynchus mykiss.
Ecol Evol 2012, 2:118. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Cheng C, White BJ, Kamdem C, Mockaitis K, Costantini C, Hahn MW, Besansky NJ: Ecological genomics of Anopheles gambiae along a latitudinal cline in cameroon: a population resequencing approach.
Genetics 2012, 190:14171432. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Gunther T, Coop G: Robust identification of local adaptation from allele frequencies.
Genetics 2013, 195:205220. PubMed Abstract  Publisher Full Text

Fraser HB: Gene expression drives local adaptation in humans.
Genome Res 2013, 23:10891096. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, CavalliSforza LL, Myers RM: Worldwide human relationships inferred from genomewide patterns of variation.
Science 2008, 319:11001104. PubMed Abstract  Publisher Full Text