Abstract
Recent genomic surveys have produced high-resolution haplotype information, but only in a small number of human populations. We report haplotype structure across 12 Mb of DNA sequence in 927 individuals representing 52 populations. The geographic distribution of haplotypes reflects human history, with a loss of haplotype diversity as distance increases from Africa. Although the extent of linkage disequilibrium (LD) varies markedly across populations, considerable sharing of haplotype structure exists, and inferred recombination hotspot locations generally match across groups. The four samples in the International HapMap Project contain the majority of common haplotypes found in most populations: averaging across populations, 83% of common 20-kb haplotypes in a population are also common in the most similar HapMap sample. Consequently, although the portability of tag SNPs based on the HapMap is reduced in low-LD Africans, the HapMap will be helpful for the design of genome-wide association mapping studies in nearly all human populations.
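To make the haplotype-sharing figure concrete, the following is a minimal sketch of how such a statistic could be computed for one genomic window. It is not the authors' pipeline. The 5% frequency cutoff for "common", the function names, and the representation of haplotypes as strings are all assumptions made for illustration.

```python
from collections import Counter


def common_haplotypes(haplotypes, min_freq=0.05):
    """Return the set of haplotypes with sample frequency >= min_freq.

    `haplotypes` is a list of phased haplotypes for one window, each encoded
    as a string of alleles. The 5% default cutoff is an assumption.
    """
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return {h for h, c in counts.items() if c / total >= min_freq}


def sharing_fraction(target, reference, min_freq=0.05):
    """Fraction of the target's common haplotypes that are also common
    in the reference sample."""
    common_target = common_haplotypes(target, min_freq)
    if not common_target:
        return float("nan")  # undefined when the target has no common haplotype
    common_ref = common_haplotypes(reference, min_freq)
    return len(common_target & common_ref) / len(common_target)


def best_reference(target, references, min_freq=0.05):
    """Pick the reference panel with the highest sharing fraction.

    `references` maps a panel name (hypothetical labels) to its haplotype list.
    """
    scores = {
        name: sharing_fraction(target, haps, min_freq)
        for name, haps in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: one window, ten haplotypes in the target population.
target = ["AAC"] * 6 + ["AGT"] * 3 + ["GGT"] * 1
references = {
    "panel_1": ["AAC"] * 8 + ["AGT"] * 2,
    "panel_2": ["GGT"] * 10,
}
name, score = best_reference(target, references)
```

In this toy window, two of the target's three common haplotypes are also common in `panel_1`, so that panel is selected. A genome-wide summary would average the fraction over many windows and then across populations.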
Acknowledgements
We thank J. DeYoung and the Southern California Genotyping Consortium for genotyping, P. Scheet for providing prepublication access to fastPHASE, C. Shtir for assistance and M. Przeworski for comments. This work was supported by grants from the Burroughs Wellcome Fund (J.K.P., N.A.R.), the Sloan Foundation (J.D.W., J.K.P., N.A.R.), the Packard Foundation (J.K.P.) and the US National Science Foundation (J.D.W., N.A.R.).
Author information
Contributions
J.D.W., N.A.R. and J.K.P. conceived and jointly supervised the study; D.F.C. performed the SNP design; N.A.R. and J.K.P. cleaned the data; and J.K.P. performed the phasing. All authors analyzed the data, with the following primary contributions: haplotype visualization, X.W.; haplotype diversity statistics and haplotype sharing with the HapMap, M.J.; recombination rate estimation, G.C.; tag SNP portability, D.F.C.; determinants of portability, M.J. and D.F.C. The supplementary information was written by D.F.C., G.C., M.J., N.A.R. and J.K.P., and the paper was written primarily by J.K.P. and N.A.R., with assistance from all other authors.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Fig. 1
Portability of tag-SNPs from seven different panels. (PDF 172 kb)
Supplementary Fig. 2
Difference in PVT between Patil and non-Patil regions (CEU tags). (PDF 32 kb)
Supplementary Fig. 3
Difference in PVT between Patil and non-Patil regions (YRI tags). (PDF 31 kb)
Supplementary Fig. 4
Relationships between tag portability and the distance at which the r² measure of linkage disequilibrium decays below 0.5, and between tag portability and FST genetic distance to the HapMap population that produces the highest tag portability. (PDF 33 kb)
Supplementary Fig. 5
Relationships between tag portability and distance at which the r² measure of linkage disequilibrium decays below 0.5, and tag portability and FST genetic distance to the HapMap population that produces the highest tag portability. (PDF 44 kb)
Supplementary Table 1
Details of SNPs used. (XLS 367 kb)
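The r² measure of linkage disequilibrium named in the supplementary figure captions can be illustrated with a short sketch. It uses the standard definition for two biallelic SNPs, r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The function name and the 0/1 allele encoding are assumptions for illustration, not the authors' code.

```python
def r_squared(haplotypes, i, j):
    """Compute r^2 between SNPs at positions i and j.

    `haplotypes` is a list of phased haplotypes, each a sequence of 0/1
    alleles (an assumed encoding). Returns NaN if either SNP is monomorphic.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # frequency of allele 1 at SNP i
    p_b = sum(h[j] for h in haplotypes) / n          # frequency of allele 1 at SNP j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


# Perfectly correlated SNPs versus independent SNPs.
linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]
r2_linked = r_squared(linked, 0, 1)
r2_unlinked = r_squared(unlinked, 0, 1)
```

A decay distance of the kind described in the captions could then be estimated by averaging r² over SNP pairs binned by physical separation and finding where the average first drops below 0.5.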
About this article
Cite this article
Conrad, D., Jakobsson, M., Coop, G. et al. A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nat Genet 38, 1251–1260 (2006). https://doi.org/10.1038/ng1911
DOI: https://doi.org/10.1038/ng1911