Nat Methods. 2009 Apr;6(4):291-5.
doi: 10.1038/nmeth.1311. Epub 2009 Mar 15.

Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes

Iwanka Kozarewa et al. Nat Methods. 2009 Apr.

Abstract

Amplification artifacts introduced during library preparation for the Illumina Genome Analyzer increase the likelihood that an appreciable proportion of these sequences will be duplicates and cause an uneven distribution of read coverage across the targeted sequencing regions. As a consequence, these unfavorable features result in difficulties in genome assembly and variation analysis from the short reads, particularly when the sequences are from genomes with base compositions at the extremes of high or low G+C content. Here we present an amplification-free method of library preparation, in which the cluster amplification step, rather than the PCR, enriches for fully ligated template strands, reducing the incidence of duplicate sequences, improving read mapping and single nucleotide polymorphism calling and aiding de novo assembly. We illustrate this by generating and analyzing DNA sequences from extremely (G+C)-poor (Plasmodium falciparum), (G+C)-neutral (Escherichia coli) and (G+C)-rich (Bordetella pertussis) genomes.

Figures

Figure 1. No-PCR library preparation
In both standard and no-PCR library preps, partially complementary (‘Y-shaped’) adapters with a 3′ T overhang are ligated onto fragmented, end-repaired, 3′ A-tailed DNA. Whereas standard adapters consist only of sections to which read 1 and read 2 sequencing primers hybridize (R1 and R2′), no-PCR adapters also contain sequences that facilitate hybridization to oligonucleotides attached to the flowcell surface (FP1 and FP2′). The standard library prep uses PCR to add these sections, and to enrich for fully ligated templates which then amplify on the flowcell surface. With the no-PCR approach, the flowcell itself is used to select for fully ligated template molecules. All no-PCR templates hybridize to the flowcell in the same orientation, because only the FP2′ sequence is reverse complementary to a flowcell oligonucleotide.
Figure 2. Distribution of genome sequence coverage
The distributions of sequence coverage across the unmasked genomes are shown for various datasets prepared with (STD) or without (NP) the PCR step. (a) % of unmasked genome against depth of genome base coverage and (b) accumulated % of unmasked genome against depth of genome base coverage for malaria strains (P. falciparum (PF) 2, 3, 88 and 85, and 3D7) with either long (L) or short (S) reads. (c) % of unmasked genome against depth of genome base coverage and (d) accumulated % of unmasked genome against depth of genome base coverage for E. coli 042 and B. pertussis ST24.
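As an illustration of the two quantities plotted in this figure, the sketch below derives a per-depth percentage and an accumulated percentage from a list of per-base coverage depths. It is not the authors' pipeline: the function name `coverage_distribution` and the input format (one integer depth per unmasked position) are assumptions, and the accumulation here runs from low depth to high, which may differ from the direction used in the paper.

```python
from collections import Counter


def coverage_distribution(depths):
    """Return (depth, percent, cumulative_percent) rows.

    `depths` is a hypothetical input: one integer coverage depth for each
    unmasked genome position. Percentages are relative to len(depths).
    """
    total = len(depths)
    counts = Counter(depths)          # number of positions at each depth
    rows = []
    cumulative = 0.0
    for depth in sorted(counts):
        percent = 100.0 * counts[depth] / total
        cumulative += percent         # share of positions at or below this depth
        rows.append((depth, percent, cumulative))
    return rows


if __name__ == "__main__":
    # Toy data only; real depths would come from a read-mapping step.
    example_depths = [0, 1, 1, 2, 2, 2, 3, 3, 4, 4]
    for depth, percent, cumulative in coverage_distribution(example_depths):
        print(f"{depth}\t{percent:.1f}\t{cumulative:.1f}")
```

A narrow, symmetric distribution around the mean depth would indicate even coverage, while a long tail of low-depth positions would indicate regions that are poorly represented in the library.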
Figure 3. Distribution of sequenced reads for different values of GC content
GC profiles for raw and mapped sequence data for the malaria strains NP-3D7-S and STD-PF2 are shown alongside simulated data (‘Shred-3D7’) for comparison. GC levels are calculated in windows of one read length, so the position of the peak in the fraction of reads depends on read length. A shift away from the simulated data curve, towards a more balanced GC composition, is evident for the STD-PF2 sequence data, indicating severe bias.
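A GC profile of this kind can be sketched by treating each read as one window, computing its GC percentage, and tabulating the fraction of reads at each value. The code below is a minimal illustration under that assumption; the function names `gc_percent` and `gc_profile` are hypothetical, and ambiguous bases such as N are simply counted as non-GC, which may not match the authors' handling.

```python
from collections import Counter


def gc_percent(read):
    """GC content of one read as an integer percentage (window = read length)."""
    read = read.upper()
    gc = sum(1 for base in read if base in "GC")
    return round(100 * gc / len(read))


def gc_profile(reads):
    """Map GC percentage -> fraction of reads with that GC percentage.

    Empty strings are skipped so that they cannot cause a division by zero.
    """
    counts = Counter(gc_percent(r) for r in reads if r)
    total = sum(counts.values())
    return {gc: n / total for gc, n in sorted(counts.items())}


if __name__ == "__main__":
    # Toy reads only; real input would be parsed from FASTQ or alignment files.
    toy_reads = ["ATGC", "AAAA", "GGCC", "ATAT"]
    for gc, fraction in gc_profile(toy_reads).items():
        print(f"{gc}% GC: {fraction:.2f} of reads")
```

Comparing the profile of sequenced reads with one computed from reads simulated uniformly across the reference shows whether library preparation has shifted the GC composition.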
Figure 4. Frequencies of duplicate sequences
Percentage of matched reads against duplication depth for sequence data derived from libraries prepared both with and without a PCR step. (a) Duplication frequencies for Plasmodium libraries; (b) duplication frequencies for E. coli and B. pertussis libraries.
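One way to tabulate duplication frequencies like these is to group mapped reads by alignment position and report, for each group size (duplication depth), the percentage of reads that fall in groups of that size. The sketch below assumes that two reads are duplicates when they share the same (reference, start, strand) tuple; that definition, the function name `duplication_profile` and the input format are illustrative assumptions, not the criteria used in the paper.

```python
from collections import Counter


def duplication_profile(alignments):
    """Map duplication depth -> percentage of mapped reads at that depth.

    `alignments` is a hypothetical list of hashable keys, one per mapped
    read, for example (reference_name, start_position, strand).
    """
    group_sizes = Counter(alignments)      # reads sharing each alignment key
    reads_at_depth = Counter()
    for size in group_sizes.values():
        reads_at_depth[size] += size       # every read in the group counts
    total = len(alignments)
    return {d: 100.0 * n / total for d, n in sorted(reads_at_depth.items())}


if __name__ == "__main__":
    # Toy alignments only; real keys would come from a mapped-read file.
    toy = [("chr1", 10, "+"), ("chr1", 10, "+"), ("chr1", 20, "-"), ("chr2", 5, "+")]
    for depth, percent in duplication_profile(toy).items():
        print(f"depth {depth}: {percent:.1f}% of mapped reads")
```

A library with few amplification duplicates would place nearly all reads at depth 1, apart from the coincidental duplicates expected at high coverage.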
