BMC Bioinformatics. 2007 Apr 20;8:132.
doi: 10.1186/1471-2105-8-132.

A verification protocol for the probe sequences of Affymetrix genome arrays reveals high probe accuracy for studies in mouse, human and rat

Rudi Alberts et al. BMC Bioinformatics. 2007.

Abstract

Background: The Affymetrix GeneChip technology uses multiple probes per gene to measure its expression level. Individual probe signals can vary widely, which hampers proper interpretation. This variation can be caused by probes that do not properly match their target gene or that match multiple genes. To determine the accuracy of Affymetrix arrays, we developed an extensive verification protocol, which for mouse arrays incorporates the NCBI RefSeq, NCBI UniGene Unique, NIA Mouse Gene Index, and UCSC mouse genome databases.

Results: Applying this protocol to Affymetrix Mouse Genome arrays (the earlier U74Av2 and the newer 430 2.0 array), the number of sequence-verified probes with perfect matches was no less than 85% and 95%, respectively; and for 74% and 85% of the probe sets all probes were sequence verified. The latter percentages increased to 80% and 94% after discarding one or two unverifiable probes per probe set, and even further to 84% and 97% when, in addition, allowing for one or two mismatches between probe and target gene. Similar results were obtained for other mouse arrays, as well as for human and rat arrays. Based on these data, refined chip definition files for all arrays are provided online. Researchers can choose the version appropriate for their study to (re)analyze expression data.
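
The verification criteria described in the Results (perfect matches, tolerance for one or two mismatches, and discarding one or two unverifiable probes per probe set) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the ungapped sliding-window comparison, and the single-strand search are all assumptions made for illustration, and the abstract does not state which alignment method the protocol uses.

```python
from typing import Sequence


def count_mismatches(probe: str, window: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(probe) != len(window):
        raise ValueError("probe and window must have the same length")
    return sum(a != b for a, b in zip(probe.upper(), window.upper()))


def probe_matches(probe: str, target: str, max_mismatches: int = 0) -> bool:
    """Return True if the probe aligns somewhere in the target without gaps
    and with at most max_mismatches mismatches (hypothetical criterion)."""
    n = len(probe)
    return any(
        count_mismatches(probe, target[i:i + n]) <= max_mismatches
        for i in range(len(target) - n + 1)
    )


def probe_set_verified(
    probes: Sequence[str],
    target: str,
    max_mismatches: int = 0,
    max_discarded: int = 0,
) -> bool:
    """Return True if all probes match the target after discarding at most
    max_discarded unverifiable probes (hypothetical criterion)."""
    unverified = sum(
        not probe_matches(p, target, max_mismatches) for p in probes
    )
    return unverified <= max_discarded


# Toy data, invented for illustration only (real probes are 25-mers).
target = "ACGTACGTTTGACCA"
probes = ["ACGTA", "TTGAC", "TTGTC"]

strict = probe_set_verified(probes, target)
after_discard = probe_set_verified(probes, target, max_discarded=1)
with_mismatch = probe_set_verified(probes, target, max_mismatches=1)
```

In this toy example the third probe differs from the target at one position, so the probe set fails the strict criterion but passes once a single probe may be discarded or a single mismatch is allowed. This mirrors how the reported percentages rise as the criteria are relaxed.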

Conclusion: The accuracy of Affymetrix probe sequences is higher than previously reported, particularly on newer arrays. Yet, refined probe set definitions have clear effects on the detection of differentially expressed genes. We demonstrate that the interpretation of the results of Affymetrix arrays is improved when the new chip definition files are used.

Figures

Figure 1
Results of the verification protocol for the U74 and 430 arrays. Three analyses were done per array: allowing only perfect matches, allowing one mismatch per probe and allowing two mismatches per probe. Per analysis, probe sets are assigned to the highest quality group (the lowermost group in the figure). So if a probe set is 'entirely verified' in RefSeq, it is assigned to this group. If it is not 'entirely verified' in RefSeq but it is 'entirely verified' in UniGene Unique, it is assigned to this second group, and so on.
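
The assignment rule in this caption can be sketched as a simple priority lookup. The caption confirms only that RefSeq ranks first and UniGene Unique second; the ordering of the remaining databases below is an assumption taken from the order in which the abstract lists them, and the function and variable names are hypothetical.

```python
from typing import Optional, Set

# Highest quality first. Only the first two positions are stated in the
# caption; the rest of the ordering is an assumption.
DATABASE_PRIORITY = [
    "RefSeq",
    "UniGene Unique",
    "NIA Mouse Gene Index",
    "UCSC genome",
]


def assign_quality_group(entirely_verified_in: Set[str]) -> Optional[str]:
    """Return the highest-priority database in which a probe set is
    'entirely verified', or None if it is verified in none of them."""
    for database in DATABASE_PRIORITY:
        if database in entirely_verified_in:
            return database
    return None


# A probe set verified in both RefSeq and UniGene Unique is counted
# only once, in the RefSeq group.
group_a = assign_quality_group({"UniGene Unique", "RefSeq"})
group_b = assign_quality_group({"UCSC genome", "UniGene Unique"})
group_c = assign_quality_group(set())
```

Assigning each probe set to exactly one group keeps the groups in the figure mutually exclusive, so their counts add up to the total number of probe sets per analysis.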
Figure 2
Number of perfectly matching probes per probe set that are 'partially verified' against the genome. (A) for 3644 probe sets of the U74 array; (B) for 10729 probe sets of the 430 array.
Figure 3
Number of cross-hybridizing probes per probe set for the 'entirely verified' probe sets. (A) for 9184 probe sets of the U74 array; (B) for 38476 probe sets of the 430 array.
