Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics
- PMID: 25855118
- PMCID: PMC4776766
- DOI: 10.1021/pr501138h
Abstract
In this review, we apply selected imputation strategies to label-free liquid chromatography-mass spectrometry (LC-MS) proteomics datasets to evaluate their accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches on their individual merits and discuss the caveats of each approach with respect to the example LC-MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performance with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases, performing classification without imputation yielded the most accurate classification. Thus, because of the complex mechanisms of missing data in proteomics, which also vary from peptide to protein, no individual method is a single solution for imputation. On the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type and new strategies to guide users in the selection of the best imputation for their dataset and analysis objectives.
Keywords: Imputation; accuracy; classification; label free; mean-square error; peak intensity.
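As a rough illustration of the kind of accuracy evaluation the abstract describes, the sketch below hides known values in a complete intensity matrix, imputes them, and scores each strategy by root-mean-square error on the hidden entries. It is not the authors' procedure: the synthetic log2 intensities, the random (not intensity-dependent) masking, the two simple strategies (row mean and row minimum), and all function names are assumptions made for this example. The regularized expectation maximization and least-squares adaptive algorithms named above are not implemented here.

```python
import math
import random


def mask_values(matrix, per_row, rng):
    """Hide `per_row` random entries in every row (set to None).

    Returns the masked copy and the list of hidden (row, column) positions.
    """
    masked = [list(row) for row in matrix]
    hidden = []
    for i, row in enumerate(matrix):
        for j in rng.sample(range(len(row)), per_row):
            masked[i][j] = None
            hidden.append((i, j))
    return masked, hidden


def impute_rows(masked, summary):
    """Fill each None with summary(observed values in the same row)."""
    filled = []
    for row in masked:
        observed = [v for v in row if v is not None]
        if not observed:
            raise ValueError("row has no observed values to impute from")
        fill = summary(observed)
        filled.append([fill if v is None else v for v in row])
    return filled


def row_mean(values):
    return sum(values) / len(values)


def rmse(truth, imputed, positions):
    """Root-mean-square error over the hidden positions only."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in positions]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)

# Synthetic log2 intensities (assumed, not real data):
# 50 peptides x 8 samples, each peptide with its own baseline abundance.
truth = []
for _ in range(50):
    base = rng.uniform(18.0, 28.0)
    truth.append([base + rng.gauss(0.0, 0.5) for _ in range(8)])

# Hide 2 of 8 values per peptide at random, then impute and score.
masked, hidden = mask_values(truth, 2, rng)
imputed_mean = impute_rows(masked, row_mean)
imputed_min = impute_rows(masked, min)

results = {
    "row_mean": rmse(truth, imputed_mean, hidden),
    "row_min": rmse(truth, imputed_min, hidden),
}
for name, value in sorted(results.items()):
    print(f"{name}: RMSE = {value:.3f}")
```

Because the values here are hidden completely at random, this toy setup favors the row mean; under abundance-dependent missingness, which is common in LC-MS data, the ranking could differ. That sensitivity to the missingness mechanism is in line with the abstract's point that no individual method is a single solution.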
Similar articles
- Normalization and missing value imputation for label-free LC-MS analysis. BMC Bioinformatics. 2012;13 Suppl 16(Suppl 16):S5. doi: 10.1186/1471-2105-13-S16-S5. Epub 2012 Nov 5. PMID: 23176322. Free PMC article.
- Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies. J Proteome Res. 2016 Apr 1;15(4):1116-25. doi: 10.1021/acs.jproteome.5b00981. Epub 2016 Mar 1. PMID: 26906401.
- Open-source platform for the analysis of liquid chromatography-mass spectrometry (LC-MS) data. Methods Mol Biol. 2008;428:369-82. doi: 10.1007/978-1-59745-117-8_19. PMID: 18287783.
- A comparative analysis of computational approaches to relative protein quantification using peptide peak intensities in label-free LC-MS proteomics experiments. Proteomics. 2013 Feb;13(3-4):493-503. doi: 10.1002/pmic.201200269. Epub 2012 Nov 8. PMID: 23019139. Free PMC article. Review.
- An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J Proteome Res. 2008 Jan;7(1):51-61. doi: 10.1021/pr700758r. Epub 2008 Jan 4. PMID: 18173218. Review.
Cited by
- MOTS-c modulates skeletal muscle function by directly binding and activating CK2. iScience. 2024 Oct 19;27(11):111212. doi: 10.1016/j.isci.2024.111212. eCollection 2024 Nov 15. PMID: 39559755. Free PMC article.
- Embracing the informative missingness and silent gene in analyzing biologically diverse samples. Sci Rep. 2024 Nov 16;14(1):28265. doi: 10.1038/s41598-024-78076-0. PMID: 39550430. Free PMC article.
- Plasma proteomic signature of risk and prognosis of frailty in the UK Biobank. Geroscience. 2024 Nov 13. doi: 10.1007/s11357-024-01415-6. Online ahead of print. PMID: 39535692.
- Interrogating data-independent acquisition LC-MS/MS for affinity proteomics. J Proteins Proteom. 2024;15(3):281-298. doi: 10.1007/s42485-024-00166-4. Epub 2024 Sep 17. PMID: 39372605. Free PMC article.
- Imputation of cancer proteomics data with a deep model that learns from many datasets. bioRxiv [Preprint]. 2024 Aug 28:2024.08.26.609780. doi: 10.1101/2024.08.26.609780. PMID: 39253518. Free PMC article. Preprint.
Grants and funding
- P41 GM103493/GM/NIGMS NIH HHS/United States
- HHSN27220080060C/PHS HHS/United States
- P41 RR018522/RR/NCRR NIH HHS/United States
- DK071283/DK/NIDDK NIH HHS/United States
- U54 ES016015/ES/NIEHS NIH HHS/United States
- U01CA184783-01/CA/NCI NIH HHS/United States
- R21 DK071283/DK/NIDDK NIH HHS/United States
- R33 DK071283/DK/NIDDK NIH HHS/United States
- U01 CA184783/CA/NCI NIH HHS/United States