In genetic association studies, deviation from Hardy-Weinberg equilibrium (HWD) can be due to recent admixture or selection at a locus, but is most commonly due to genotyping errors. In addition to its utility for identifying potential genotyping errors in individual studies, here we report that HWD can be useful in detecting the presence, magnitude and direction of genotyping error across multiple studies. If there is a consistent genotyping error at a given locus, larger studies, in general, will show more evidence for HWD than small studies. As a result, for loci prone to genotyping errors, there will be a correlation between HWD and the study sample size. By contrast, in the absence of consistent genotyping errors, there will be a chance distribution of p-values among studies without correlation with sample size. We calculated the evidence for HWD at 17 separate polymorphic loci investigated in 325 published genetic association studies. In the full set of studies, there was a significant correlation between HWD and locus-standardised sample size (p = 0.001). For 14/17 of the individual loci, there was a positive correlation between extent of HWD and sample size, with the evidence for two loci (5-HTTLPR and CTSD) rising to the level of statistical significance. Among single nucleotide polymorphisms (SNPs), 15/23 studies that deviated significantly from Hardy-Weinberg equilibrium (HWE) did so because of a deficit of heterozygotes. The inbreeding coefficient (F(is)) is a measure of the degree and direction of deviation from HWE. Among studies investigating SNPs, there was a significant correlation between F(is) and HWD (R = 0.191; p = 0.002), indicating that the greater the deviation from HWE, the greater the deficit of heterozygotes. By contrast, for repeat variants, only one in five studies that deviated significantly from HWE showed a deficit of heterozygotes and there was no significant correlation between F(is) and HWD.
These results indicate the presence of HWD across multiple loci, with the magnitude of the deviation varying substantially from locus to locus. For SNPs, HWD tends to be due to a deficit of heterozygotes, indicating that allelic dropout may be the most prevalent genotyping error.
Keywords: meta-analysis; polymorphism; variant; deviation
Genotyping errors are an important and increasingly recognised problem in modern genetics. Traditional family-based genetic studies allow for straightforward identification of genotyping errors through a familial Mendelian inheritance check. Over the past decade, however, there has been increasing interest in case-control association studies, a type of study in which investigators generally compare a group of subjects having a particular disease with another group not having the disease, to identify a genotypic difference between the groups. Unfortunately, these association studies do not allow for simple inheritance checks to identify errors and, as a result, we have limited insight into the prevalence and nature of genotyping errors in published association studies.
The Hardy-Weinberg law states that if conditions of population equilibrium are met (random mating and negligible mutation, migration, stratification, genetic drift and selection), then genotype frequencies should fit a predictable binomial distribution calculable from the allele frequencies. Significant deviation from the predicted distribution has been used as a marker for genotyping error. Previous work has estimated that the control sample genotype distribution violates Hardy-Weinberg equilibrium (HWE) in approximately 10 per cent of published association studies [3-5]. Furthermore, exclusion of studies that violate HWE alters the results of a substantial fraction of gene association meta-analyses.
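As a brief illustration of the binomial expectation described above, the following sketch computes the genotype counts expected under HWE from a set of observed counts. The function name and the example counts are hypothetical and chosen only for illustration.

```python
def hwe_expected_counts(n_AA, n_Aa, n_aa):
    """Genotype counts expected under HWE, given observed genotype counts."""
    n = n_AA + n_Aa + n_aa
    p_A = (2 * n_AA + n_Aa) / (2 * n)  # sample frequency of allele A
    p_a = 1 - p_A                      # sample frequency of allele a
    # Binomial expansion (p_A + p_a)^2 scaled to the sample size.
    return (n * p_A ** 2, 2 * n * p_A * p_a, n * p_a ** 2)

# Hypothetical control sample: 50 AA, 40 Aa and 10 aa subjects.
expected = hwe_expected_counts(50, 40, 10)
```

For these illustrative counts the allele A frequency is 0.7, so the expected counts are approximately 49, 42 and 9, against which the observed 50, 40 and 10 can be compared.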
The inbreeding coefficient (F(is)) can be used as a measure of the degree and direction of deviation from HWE (HWD). Positive F(is) values indicate an excess of homozygotes and negative F(is) values indicate a deficit of homozygotes. Salanti and colleagues found that with a moderate level of HWD (F(is) = 0.10), only 7 per cent of association studies had at least 80 per cent power to find significant evidence for violation of HWE. Because of this low level of power, focusing on statistically significant violation of HWE in individual association studies substantially limits the insight that we can gain into potential genotyping errors from HWE analysis. A complementary approach that bypasses the problem of limited power in individual studies is the analysis of HWD patterns across a set of studies. As originally demonstrated by Weir, if a locus is prone to genotyping error, the evidence for HWD will increase with increasing sample size. By contrast, if there is no substantial genotyping error, or if the error is random, there will be no relationship between HWD and sample size. By examining a set of studies at a given locus, we can learn about the level of genotyping error present at that locus. Furthermore, by looking at the evidence across multiple loci, we can gain insight into the level and nature of genotyping error in association studies in general.
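The dependence of the evidence for HWD on sample size can be sketched with the large-sample chi-square approximation. For a biallelic locus, the one-degree-of-freedom goodness-of-fit statistic equals N multiplied by the square of F(is), so a fixed F(is) yields smaller p-values as N grows. This is an illustrative approximation only and is not the exact test used in the analyses; the function name is hypothetical.

```python
import math

def hwd_chi2_pvalue(n, f_is):
    """Approximate HWD p-value for sample size n and inbreeding coefficient f_is."""
    chi2 = n * f_is ** 2                    # chi-square statistic, 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))   # upper tail of the chi-square(1) distribution

# Same moderate deviation (F(is) = 0.10) at increasing sample sizes.
p_values = {n: hwd_chi2_pvalue(n, 0.10) for n in (100, 400, 1000)}
```

Under this approximation, a deviation of F(is) = 0.10 is far from significant at N = 100 (p of roughly 0.32) but clearly significant at N = 1,000.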
Here, we investigate: (1) the relationship between sample size and HWD across well-studied loci, and (2) the direction of deviation in a set of association studies compiled from previous meta-analyses.
Materials and methods
Genetic loci for analysis were identified through published meta-analyses. Meta-analyses were identified through PubMed at the National Library of Medicine, limiting the search to meta-analyses published between 2001 and 2005 and using the search terms: (1) association genetic; (2) association polymorphism; (3) association variant. These results were supplemented by a database of meta-analyses compiled by Ioannidis and colleagues [9,10]. Loci were subsequently chosen using the criteria: (1) biallelic markers; (2) at least ten independent studies; and (3) sample size data for all three genotype groups included in the publication. For each included study, we recorded the control group sample size for the three genotype groups (Supplementary Table 1).
Table 1. Relationship between sample size and Hardy-Weinberg exact test p-values for individual loci
The most straightforward way to assess HWD in a set of studies investigating a given locus is to pool the genotype cell counts from each of the relevant studies and assess HWD among the three pooled genotype groups. All of these studies investigated population samples with different ethnicities, however, and consequently different allele frequencies. As a result, simply combining data from different studies would find substantial HWD due to lack of heterozygotes, even in the absence of genotyping error.
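The pooling problem can be illustrated with a small numerical sketch. Two hypothetical samples, each exactly in HWE but with different allele frequencies (0.2 and 0.8, chosen purely for illustration), are combined, and the pooled heterozygote count is compared with the count expected from the pooled allele frequency.

```python
def hwe_genotype_counts(n, p_A):
    """Genotype counts (AA, Aa, aa) for a sample of size n exactly in HWE."""
    p_a = 1 - p_A
    return (n * p_A ** 2, 2 * n * p_A * p_a, n * p_a ** 2)

# Two hypothetical populations, each in HWE, with different allele frequencies.
pop1 = hwe_genotype_counts(100, 0.2)
pop2 = hwe_genotype_counts(100, 0.8)

# Naive pooling of the genotype cell counts.
pooled = tuple(x + y for x, y in zip(pop1, pop2))
n_pooled = sum(pooled)
p_pooled = (2 * pooled[0] + pooled[1]) / (2 * n_pooled)

het_observed = pooled[1]
het_expected = 2 * n_pooled * p_pooled * (1 - p_pooled)
```

In this example the pooled sample contains about 64 heterozygotes where about 100 would be expected, although neither population deviates from HWE on its own.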
We took an alternative approach to assessing HWD among a set of studies investigating a given locus. For each locus, we determined the correlation between the HWD exact test p-value of each study and study sample size. The stronger the correlation, the stronger the evidence for HWD at that locus. Given that many included studies had small homozygote minor allele cell counts (fewer than five subjects), and that the chi-square test is an unreliable test of HWD in the presence of small cell counts, an exact test was used to determine the strength of evidence for HWD.
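The per-locus procedure can be sketched as follows, under two assumptions that go beyond what is stated above: the exact test is taken to be the standard two-sided test that conditions on the allele counts and sums the probabilities of all heterozygote counts no more likely than the observed one, and the correlation is taken to be a Pearson correlation. The function names and the four sets of genotype counts are hypothetical; the counts are constructed to share the same heterozygote deficit at increasing sample sizes.

```python
import math

def hwe_exact_pvalue(n_AA, n_Aa, n_aa):
    """Two-sided exact test p-value for HWD at a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    n_A = 2 * n_AA + n_Aa          # copies of allele A
    n_a = 2 * n - n_A              # copies of allele a

    def log_prob(het):
        # Log probability of `het` heterozygotes given the allele counts.
        hom_A = (n_A - het) // 2
        hom_a = (n_a - het) // 2
        return (math.lgamma(n + 1)
                - math.lgamma(hom_A + 1) - math.lgamma(het + 1)
                - math.lgamma(hom_a + 1)
                + het * math.log(2)
                + math.lgamma(n_A + 1) + math.lgamma(n_a + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must have the same parity as the allele counts.
    hets = range(n_A % 2, min(n_A, n_a) + 1, 2)
    probs = {h: math.exp(log_prob(h)) for h in hets}
    p_obs = probs[n_Aa]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical studies of one locus (AA, Aa, aa), all with F(is) = 0.10.
studies = [(11, 18, 11), (55, 90, 55), (110, 180, 110), (275, 450, 275)]
sample_sizes = [sum(s) for s in studies]
p_values = [hwe_exact_pvalue(*s) for s in studies]
r = pearson_r(sample_sizes, p_values)
```

With a consistent deficit of heterozygotes, the p-values fall as sample size rises, so the correlation `r` is negative, which is the pattern interpreted as evidence for consistent genotyping error at a locus.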
In addition to investigating the correlation between HWD and sample size among studies investigating each individual locus, we also wanted to explore the strength and significance of this correlation across all studies, regardless of locus. A straightforward assessment of correlation between sample size and HWD, however, would be confounded by statistical artefact. Specifically, the mean sample size varies substantially across loci. Because the level of HWD varies substantially across loci (as demonstrated by our initial analyses), a correlation between sample size and HWD p-value among the set of all studies could merely represent that loci with larger mean sample sizes have greater HWD. In order to control for this potential confound, we converted the raw sample size of each study to a T-score, such that each locus had a mean standardised sample size of 50 and a standard deviation of 10. Subsequently, we calculated the strength and significance of the correlation between this locus-standardised sample size and the exact test p-value for the set of all studies.
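The locus-wise standardisation can be sketched as below. The use of the sample standard deviation (n - 1 denominator) is an assumption, as are the function name, the locus labels and the sample sizes, which are purely illustrative.

```python
from collections import defaultdict
import statistics

def locus_t_scores(studies):
    """Convert raw sample sizes to T-scores (mean 50, SD 10) within each locus.

    `studies` is a list of (locus, sample_size) pairs; the result is a list
    of T-scores in the same order.
    """
    by_locus = defaultdict(list)
    for locus, n in studies:
        by_locus[locus].append(n)
    # Per-locus mean and sample standard deviation of the raw sample sizes.
    summary = {locus: (statistics.mean(ns), statistics.stdev(ns))
               for locus, ns in by_locus.items()}
    return [50 + 10 * (n - summary[locus][0]) / summary[locus][1]
            for locus, n in studies]

# Hypothetical studies at two loci with very different typical sample sizes.
studies = [("locus_1", 60), ("locus_1", 80), ("locus_1", 100),
           ("locus_2", 500), ("locus_2", 1000), ("locus_2", 1500)]
t_scores = locus_t_scores(studies)
```

After this transformation, a study's score reflects how large it is relative to other studies of the same locus, so differences in typical sample size between loci no longer drive the pooled correlation.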
Inbreeding coefficient was calculated using the following formula:
where p = frequency; A = major allele; a = minor allele; AA = homozygous major allele; aa = homozygous minor allele. All analyses were carried out in SPSS 12.0 (SPSS Inc., Chicago, IL, USA).
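As an illustration, the sketch below uses one standard estimator of the inbreeding coefficient, F(is) = 1 - H(obs)/H(exp), where H(obs) is the observed heterozygote frequency and H(exp) = 2 p(A) p(a). This form is an assumption adopted for the example; it is algebraically identical to (p(AA) - p(A)^2) / (p(A) p(a)). The function name and genotype counts are hypothetical.

```python
def inbreeding_coefficient(n_AA, n_Aa, n_aa):
    """Estimate F(is) from genotype counts at a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    p_A = (2 * n_AA + n_Aa) / (2 * n)   # frequency of the major allele A
    p_a = 1 - p_A                       # frequency of the minor allele a
    het_expected = 2 * p_A * p_a        # heterozygote frequency under HWE
    if het_expected == 0:
        raise ValueError("F(is) is undefined for a monomorphic sample")
    het_observed = n_Aa / n
    return 1 - het_observed / het_expected

# Hypothetical samples: a deficit and an excess of heterozygotes.
f_deficit = inbreeding_coefficient(275, 450, 275)
f_excess = inbreeding_coefficient(20, 60, 20)
```

The first sample has fewer heterozygotes than expected and gives a positive value (about 0.10), while the second has more heterozygotes than expected and gives a negative value (about -0.20).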
In total, 325 studies, investigating 17 loci, fit the criteria for analysis. Twenty-eight studies (9 per cent) showed significant HWD. This proportion is in line with the results of previous studies [3-5]. The number of studies per locus ranged from ten (CYP1) to 39 (PON1 Q192R). The average sample size per locus ranged from 71 (DRD2) to 1,020 (ADD1) (Figure 1).
Figure 1. Hardy-Weinberg disequilibrium (HWD) p-value vs sample size across 325 studies.
Among individual loci, 14/17 variants showed a negative correlation between sample size and HWD p-value, indicating that the majority of studied variants show evidence of consistent genotyping error. Overall, the correlations ranged from R = 0.29 (TPH) to R = -0.59 (CTSD) and were significant for two loci (CTSD and 5-HTTLPR) (Table 1). Among the set of all 325 studies, 23 studies had a homozygote minor allele cell count = 0. The strength and significance of correlations were not substantially changed with the exclusion of these studies (data not shown).
The 325 studies investigated 15 single nucleotide polymorphism (SNP) loci (267 studies) and two repeat polymorphism loci (58 studies). The percentage of individual studies that significantly deviated from HWE was the same (9 per cent) for both the SNP and repeat polymorphism categories. Similarly, the standardised sample size-HWD correlation was statistically significant for both SNP (p = 0.018) and repeat polymorphism (p = 0.004) groups. Of the 28 studies that showed significant deviation from HWE, 23 studies were SNP studies and five were repeat polymorphism studies. Fifteen out of 23 HWE-violating SNP studies showed a deficit of heterozygotes, while only one in five HWE-violating repeat polymorphism studies showed a deficit of heterozygotes. In addition, for SNP studies, there was a significant correlation between F(is) and HWD p-value (R = 0.190; p = 0.002), while repeat polymorphisms showed no evidence of correlation (R = 0.03). In the set of all 325 studies, there was a significant correlation between standardised sample size and HWD (R = 0.18; p = 0.001) (Figure 2).
Figure 2. Mean F(is) statistic stratified by variant type.
To gain insight into the reliability of the results found among controls, and to help to differentiate between selection and genotyping error as the primary cause of HWD, we investigated the correlation between F(is) among cases (F(cases)) and controls (F(controls)) for each individual study. If the HWD among control subjects is due to selection, then we would expect the genotype that is deficient among controls to be overrepresented among cases, and thus F(is) among control and case studies would show a negative correlation. By contrast, if the HWD among control subjects is due to genotyping error, then we would expect the genotype that is deficient among controls also to be deficient among cases, and thus the inbreeding coefficients would show a positive correlation. Lastly, if the HWD among controls were due purely to chance, then we would expect no correlation whatsoever between F(is) statistics.
Looking across 12 loci and 221 studies for which we had data for both cases and controls, we found a significant positive correlation between F(controls) and F(cases) (r = 0.174; p = 0.01). Further, the correlation was in the positive direction for 11/12 loci. These findings indicate that for any given study, the direction and magnitude of HWD among cases are similar to the direction and magnitude of HWD among controls. This result is consistent with genotyping error rather than selection as the primary source of HWD, and provides further evidence that these findings are not due purely to chance.
The primary finding of this analysis was the identification of HWD across a large subset of published association studies investigating both SNP and repeat variants. Although deviation was present at most loci, the degree of deviation varied substantially across loci. At least among SNP studies, the predominant cause of this deviation was a deficit of heterozygotes.
In addition to genotyping error, other factors can contribute to HWD. For example, strong selection against a specific genotype can skew the genotypic distribution of a population. In fact, HWD among cases has been used as a test for genotype-phenotype association [12,13], and Wittke-Thompson and colleagues have demonstrated a pattern of expected deviation among cases and, under some conditions, controls for various disease models. Our finding that the HWD among cases has a strong tendency to be in the same direction as the deviation found among controls is contrary to the expected result under the selection model, however.
Population stratification is another factor that can contribute to HWD. To eliminate the possibility of ethnic differences between studies causing stratification and HWD in our study, we did not pool the three genotype counts for all studies investigating a given locus and calculate a HWD p-value from this pooled sample. Instead, for each locus, we determined the correlation between the HWD exact test p-value and study sample size. Thus, any effect of stratification in our study is not due to allele frequency differences between studies investigating the same locus. Although population stratification within individual studies may contribute to HWD in our study, there are multiple considerations that are likely to mitigate its effect. First, most studies included in our analysis utilise samples that are ethnically homogeneous. Secondly, a significant proportion of the studies formally tested and rejected the presence of population stratification in their sample. Thirdly, the consistent direction of deviation across studies and the different patterns of deviation found between SNP and repeat variants are more consistent with genotyping error than stratification as a primary cause of HWD. We cannot, however, definitively exclude stratification as a contributing cause of HWD among these studies.
Previous studies investigating the nature and consequences of genotyping error based on simulations or experimental samples specifically designed to assess genotyping error have proposed allelic dropout as one of the most frequent causes of genotypic error [2,15,16]. Intuitively, it is clear that heterozygotes, which get half a dose of each allele compared with homozygotes, may be more often missed or misclassified. In fact, even in the most sophisticated high-throughput algorithms, heterozygotes have a lower call rate than homozygotes. Our investigation of a large set of published studies is consistent with this prediction. Further, our findings are consistent with the hypothesis that genotyping error is not stochastic, but more common at certain loci [18-21]. These findings raise concerns about the level and widespread nature of genotyping errors in genetic association studies and the conclusions drawn from those studies. In light of this finding, the approach employed here could be useful to identify loci most prone to error. For example, Yonan and colleagues recently used HWD to identify genotyping errors at the 5-hydroxytryptamine transporter 5-HTTLPR variant and developed an alternate assay less prone to error.
We propose that future genetic association meta-analyses examine the correlation between sample size and HWD to determine the level of genotyping error among included studies. Further, we believe that the method and points that this analysis highlights can be of utility to investigators performing individual association studies. First, this result should caution investigators against dismissing the possibility of genotyping error merely because their sample does not show significant deviation from HWE. Instead, investigators should further examine the magnitude and direction of deviation. For instance, a large F(is) statistic in the same direction among cases and controls raises the concern for genotyping error, and should prompt investigators to perform genotyping quality checks.
Supplementary Table. Included association studies stratified by locus
The authors are very grateful to Pratima Naik for her contribution to this study and to Scott Stoltenberg and Laura Scott for advice and helpful discussion. They also thank the reviewers for their thorough and helpful comments, which helped them significantly to improve the manuscript.