Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
- PMID: 16284200
- PMCID: PMC1283542
- DOI: 10.1093/nar/gni179
Abstract
Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotation, which is significantly different from current knowledge. The resultant informatics problems have a profound impact on analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals approximately 30-50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
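The regrouping and comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the probe IDs, gene names, the `min_probes` threshold, and the way discrepancy is computed here (the fraction of originally called genes that are not recovered) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical probe-level annotation: probe ID -> genes the probe matches
# in an updated genome/transcript database. All IDs and names are made up.
updated_hits = {
    "p1": ["GENE_A"],
    "p2": ["GENE_A"],
    "p3": ["GENE_B"],
    "p4": ["GENE_A", "GENE_C"],  # matches two genes, so it is not gene-specific
    "p5": [],                    # no longer matches any known transcript
    "p6": ["GENE_B"],
}


def redefine_probe_sets(hits, min_probes=2):
    """Group probes by the single gene they match.

    Probes matching zero or several genes are dropped, and so are genes
    left with fewer than `min_probes` probes (an assumed threshold).
    """
    sets = defaultdict(list)
    for probe, genes in hits.items():
        if len(genes) == 1:
            sets[genes[0]].append(probe)
    return {gene: sorted(probes) for gene, probes in sets.items()
            if len(probes) >= min_probes}


def discrepancy(original_calls, redefined_calls):
    """Fraction of originally called genes missing from the redefined calls."""
    original = set(original_calls)
    if not original:
        return 0.0
    return len(original - set(redefined_calls)) / len(original)


probe_sets = redefine_probe_sets(updated_hits)

# Hypothetical lists of genes called differentially expressed under each
# probe set definition.
original_calls = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
redefined_calls = ["GENE_A", "GENE_B", "GENE_E"]
fraction_changed = discrepancy(original_calls, redefined_calls)
```

In this toy input, two of the four originally called genes are not recovered, so `fraction_changed` is 0.5. A real analysis would derive the probe-to-gene hits from sequence alignment against current databases rather than from a hand-written dictionary.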