Muscle5: High-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny
- PMID: 36379955
- PMCID: PMC9664440
- DOI: 10.1038/s41467-022-34630-w
Abstract
Multiple sequence alignments are widely used to infer evolutionary relationships, enabling inferences of structure, function, and phylogeny. Standard practice is to construct one alignment by some preferred method and use it in further analysis; however, undetected alignment bias can be problematic. I describe Muscle5, a novel algorithm which constructs an ensemble of high-accuracy alignments with diverse biases by perturbing a hidden Markov model and permuting its guide tree. Confidence in an inference is assessed as the fraction of the ensemble which supports it. Applied to phylogenetic tree estimation, I show that ensembles can confidently resolve topologies which have low bootstrap support according to standard methods, and conversely that some topologies with high bootstrap support are incorrect. Applied to the phylogeny of RNA viruses, ensemble analysis shows that recently adopted taxonomic phyla are probably polyphyletic. Ensemble analysis can improve confidence assessment in any inference from an alignment.
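The confidence measure in the abstract (the fraction of the ensemble supporting an inference) can be illustrated with a minimal sketch. This is not the Muscle5 implementation: the function name `ensemble_confidence`, the representation of each replicate tree as a set of clades, and the toy data are all assumptions made for illustration, and the sketch presumes that one tree has already been estimated from each alignment in the ensemble.

```python
from typing import FrozenSet, Iterable, Set

# Hypothetical representation: a clade is the set of leaf names below a node,
# and each replicate tree is summarised by the set of clades it contains.
Clade = FrozenSet[str]


def ensemble_confidence(replicate_clades: Iterable[Set[Clade]],
                        clade: Iterable[str]) -> float:
    """Return the fraction of ensemble replicates whose tree contains `clade`."""
    target = frozenset(clade)
    replicates = list(replicate_clades)
    if not replicates:
        raise ValueError("ensemble is empty")
    # Count replicates in which the clade of interest appears.
    supporting = sum(1 for clades in replicates if target in clades)
    return supporting / len(replicates)


# Toy ensemble (invented data): four trees, one per alignment replicate.
ensemble = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "D"})},
]

# The clade {A, B} appears in three of the four replicate trees.
conf_ab = ensemble_confidence(ensemble, ["A", "B"])
print(conf_ab)
```

Under these assumptions, a clade found in every replicate receives confidence 1.0 and one found in none receives 0.0; the same counting applies to any other yes/no inference drawn from each alignment.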
© 2022. The Author(s).
Conflict of interest statement
The author declares no competing interests.
Similar articles
- Bayesian coestimation of phylogeny and sequence alignment. BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83. PMID: 15804354. Free PMC article.
- Phylogenetic Tree Estimation With and Without Alignment: New Distance Methods and Benchmarking. Syst Biol. 2017 Mar 1;66(2):218-231. doi: 10.1093/sysbio/syw074. PMID: 27633353.
- WITCH: Improved Multiple Sequence Alignment Through Weighted Consensus Hidden Markov Model Alignment. J Comput Biol. 2022 Aug;29(8):782-801. doi: 10.1089/cmb.2021.0585. Epub 2022 May 17. PMID: 35575747.
- Multiple sequence alignment: in pursuit of homologous DNA positions. Genome Res. 2007 Feb;17(2):127-35. doi: 10.1101/gr.5232407. PMID: 17272647. Review.
- Alignment methods: strategies, challenges, benchmarking, and comparative overview. Methods Mol Biol. 2012;855:203-35. doi: 10.1007/978-1-61779-582-4_7. PMID: 22407710. Review.
Cited by
- Genome assembly of Stephania longa provides insight into cepharanthine biosynthesis. Front Plant Sci. 2024 Sep 5;15:1414636. doi: 10.3389/fpls.2024.1414636. eCollection 2024. PMID: 39301160. Free PMC article.
- Chromosome-scale assembly of the wild cereal relative Elymus sibiricus. Sci Data. 2024 Jul 26;11(1):823. doi: 10.1038/s41597-024-03622-4. PMID: 39060306. Free PMC article.
- Phylogenetic reconciliation: making the most of genomes to understand microbial ecology and evolution. ISME J. 2024 Jan 8;18(1):wrae129. doi: 10.1093/ismejo/wrae129. PMID: 39001714. Free PMC article. Review.
- Denitrification genotypes of endospore-forming Bacillota. ISME Commun. 2024 Sep 4;4(1):ycae107. doi: 10.1093/ismeco/ycae107. eCollection 2024 Jan. PMID: 39263550. Free PMC article.
- Sequence-Based Antigenic Analyses of H1 Swine Influenza A Viruses from Colombia (2008-2021) Reveals Temporal and Geographical Antigenic Variations. Viruses. 2023 Sep 30;15(10):2030. doi: 10.3390/v15102030. PMID: 37896808. Free PMC article.