BMC Bioinformatics. 2009 Mar 5:10:77.
doi: 10.1186/1471-2105-10-77.

Development and evaluation of new mask protocols for gene expression profiling in humans and chimpanzees

Donna M Toleno et al. BMC Bioinformatics.

Abstract

Background: Cross-species gene expression analyses using oligonucleotide microarrays designed to evaluate a single species can provide spurious results due to mismatches between the interrogated transcriptome and arrayed probes. Based on the most recent human and chimpanzee genome assemblies, we developed updated and accessible probe masking methods that allow human Affymetrix oligonucleotide microarrays to be used for robust genome-wide expression analyses in both species. In this process, only data from oligonucleotide probes predicted to have robust hybridization sensitivity and specificity for both transcriptomes are retained for analysis.
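The retention rule described above can be illustrated with a minimal sketch. It assumes that a probe is kept only when it has exactly one perfect match in each genome, which is one plausible reading of the "1H_1C" label used in the figure captions. The function name `build_mask`, the input tuple layout, and the `min_probes` threshold are hypothetical and are not taken from the authors' protocol.

```python
from collections import defaultdict


def build_mask(probe_hits, min_probes=1):
    """Return {probe_set_id: [probe_id, ...]} for probes retained by the mask.

    probe_hits: iterable of (probe_set_id, probe_id, human_hits, chimp_hits),
    where the hit counts are the number of perfect genome matches per species.
    ASSUMPTION: a probe is retained only if it matches each genome exactly once.
    """
    kept = defaultdict(list)
    for probe_set_id, probe_id, human_hits, chimp_hits in probe_hits:
        if human_hits == 1 and chimp_hits == 1:
            kept[probe_set_id].append(probe_id)
    # Drop probe sets left with fewer than min_probes retained probes.
    return {ps: probes for ps, probes in kept.items() if len(probes) >= min_probes}


if __name__ == "__main__":
    hits = [
        ("ps1", "p1", 1, 1),  # unique in both genomes: retained
        ("ps1", "p2", 1, 0),  # no chimpanzee match: masked
        ("ps2", "p3", 2, 1),  # two human matches: masked
    ]
    print(build_mask(hits))
```

Raising `min_probes` gives a simple way to exclude probe sets that retain too few probes after masking, which the Results suggest is relevant to interpretation.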

Results: To characterize the utility of this resource, we applied our mask protocols to existing expression data from brains, livers, hearts, testes, and kidneys derived from both species and determined the effect of probe number on the expression scores of specific transcripts. In all five tissues, probe sets with decreasing numbers of probes showed non-linear trends towards increased variation in expression scores. The relationships between expression variation and probe number in brain data closely matched those observed in simulated expression data sets subjected to random probe masking. However, there is evidence that additional factors affect the observed relationships between gene expression scores and probe number in tissues such as liver and kidney. In parallel, we observed that decreasing the number of probes within probe sets led to linear increases in both gained and lost inferences of differential cross-species expression in all five tissues, which will affect the interpretation of expression data subject to masking.
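The random probe masking idea can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: the figure captions indicate that expression scores were computed with RMA (median polish), whereas this sketch substitutes a plain median of log2 probe intensities as the probe set summary. The names `summarize`, `iqr`, and `masked_iqr` are hypothetical.

```python
import random
import statistics


def summarize(probe_values):
    """Stand-in summary of one probe set in one sample (NOT RMA median polish)."""
    return statistics.median(probe_values)


def iqr(values):
    """Interquartile range of a list of expression scores."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def masked_iqr(probe_matrix, n_keep, rng):
    """IQR across samples after randomly keeping n_keep probes of one probe set.

    probe_matrix: one row per sample; each row lists log2 intensities for the
    probes of a single probe set, in the same probe order for every sample.
    """
    n_probes = len(probe_matrix[0])
    keep = rng.sample(range(n_probes), n_keep)  # same probes kept in all samples
    scores = [summarize([row[i] for i in keep]) for row in probe_matrix]
    return iqr(scores)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: 8 samples x 11 probes, for illustration only.
    matrix = [[rng.gauss(8.0, 1.0) for _ in range(11)] for _ in range(8)]
    for k in (11, 7, 3):
        print(k, round(masked_iqr(matrix, k, rng), 3))
```

Repeating `masked_iqr` many times for each value of `n_keep` and taking the median would give a curve comparable in spirit to the simulated trend the Results describe.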

Conclusion: We introduce a readily implemented and updated resource for human and chimpanzee transcriptome analysis through a commonly used microarray platform. Based on empirical observations derived from the analysis of five distinct data sets, we provide novel guidelines for the interpretation of masked data that take the number of probes present in a given probe set into consideration. These guidelines are applicable to other customized applications that involve masking data from specific subsets of probes.


Figures

Figure 1
Classification of the 604,258 probes within the Affymetrix U133Plus2 microarray. The pie chart depicts the relative percentage of probes comprising each of the nine probe categories described in the Methods section.
Figure 2
Classification of the 54,675 probe sets within the Affymetrix U133Plus2 microarray. The composition of probe sets with respect to probe categories is depicted. The height of each bar represents the number of probe sets (Y-axis) that contain at least one probe in the indicated category (X-axis). The gray segment of each bar represents the number of probe sets where fewer than six probes of the indicated category are present. The black segment of each bar represents the number of probe sets where more than six probes of the indicated category are present.
Figure 3
Effects of probe number on the variation of gene expression scores. Median interquartile ranges (IQRs) of expression scores (Y-axes) for the indicated tissue from all humans and chimpanzees are plotted in red relative to the number of 1H_1C probes remaining in a probe set (X-axes). Median IQRs are plotted in black for simulated data wherein different numbers of probes were randomly sampled from 1H_1C_11 probe sets (see text for details). Note that only probe sets expressed in a given tissue are considered in this analysis (see Methods). Data from even (Panels A, C, E, G, and I) and odd numbers of remaining probes (Panels B, D, F, H, and J) in probe sets are presented separately due to inherent differences in the way the median polish algorithm employed by RMA processes them. Two-slope red and black lines are provided for the actual and simulated data, respectively (see Methods). The tissue from which the data were collected is indicated in each panel.
Figure 4
Effects of probe number on the inferences of differential gene expression. The median number of gained (black) and lost (red) inferences of differential gene expression (Y-axes) in simulated data sets subjected to random probe masking relative to actual data are plotted against the number of probes in a probe set (X-axes) (see text for details). Error bars represent the observed range of inferred differential gene expression in the simulated data sets. Red and black data points are slightly offset for visual clarity. Data from probe sets with even (Panels A, C, E, G, and I) and odd numbers of remaining probes (Panels B, D, F, H, and J) are presented separately, as described in Figure 3. The tissue from which the data were collected is indicated in the lower left-hand corner of each panel. Note the different scales on the Y-axes for heart and testes.
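As a rough illustration of how gained and lost inferences might be tallied, the sketch below compares two sets of probe sets called differentially expressed, one from the actual data and one from a masked simulation. The function name `compare_calls` and the probe set identifiers are hypothetical, and the statistical test that produces the calls is outside the scope of this sketch.

```python
def compare_calls(actual_calls, masked_calls):
    """Compare differential-expression calls before and after probe masking.

    gained: probe sets called differentially expressed only in the masked data.
    lost:   probe sets called differentially expressed only in the actual data.
    """
    actual, masked = set(actual_calls), set(masked_calls)
    return {
        "gained": sorted(masked - actual),
        "lost": sorted(actual - masked),
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    result = compare_calls(["ps_a", "ps_b", "ps_c"], ["ps_b", "ps_c", "ps_d"])
    print(result)
```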
