DOI:10.1007/3-540-44794-6_4 - Corpus ID: 15136247
Knowledge Discovery in Multi-label Phenotype Data
@inproceedings{Clare2001KnowledgeDI,
  title={Knowledge Discovery in Multi-label Phenotype Data},
  author={Amanda Clare and Ross D. King},
  booktitle={European Conference on Principles of Data Mining and Knowledge Discovery},
  year={2001},
  url={https://api.semanticscholar.org/CorpusID:15136247}
}
- A. Clare, R. King
- Published in European Conference on… 3 September 2001
- Biology, Computer Science
This work uses KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions, and learns rules which are accurate and biologically meaningful.
753 Citations
Feature selection for gene function prediction using multi-labelled lazy learning
- Yuhai Liu, Guozheng Li, Hong-yu Zhang, Mary Yang, Jack Y. Yang
- Computer Science
- Int. J. Funct. Informatics Pers. Medicine
- 2008
Experimental results on a real-world multi-label bioinformatics dataset show that ML-kNN with feature selection greatly outperforms the prior ML-kNN algorithm.
Multi-label Classification of Gene Function using MLPs
- A. Skabar, D. Wollersheim, Tim Whitfort
- Biology, Computer Science
- The 2006 IEEE International Joint Conference on…
- 2006
Comparison of the classification characteristics of the multi-output MLP with that of multiple binary classifiers reveals several differences, most notably a more rapid fall-off in sensitivity as the output cutoff value is increased.
Hierarchical multi-label classification for protein function prediction going beyond traditional approaches
- F. Fotouhi, Chandan K. Reddy, Noor Alaydie
- Computer Science, Biology
- 2012
The author proposed the HiBLADE algorithm (Hierarchical multi-label Boosting with LAbel DEpendency), a novel algorithm that takes advantage of not only the pre-established hierarchical taxonomy of the classes, but also effectively exploits the hidden correlation among the classes that is not shown through the class hierarchy, thereby improving the quality of the predictions.
A Randomized Clustering Forest Approach for Efficient Prediction of Protein Functions
- Hong Tang, Yuanyuan Wang, Shaomin Tang, Dianhui Chu, Chunshan Li
- Computer Science, Biology
- 2019
A novel ensemble MIML algorithm called multi-instance multi-label randomized clustering forest (MIMLRC-Forest) is proposed for protein function prediction; it builds a set of hierarchical clustering trees and uses a label transfer mechanism to identify the relevant function labels during the learning process.
Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization
- Min-Ling Zhang, Zhi-Hua Zhou
- Computer Science, Biology
- 2006
Applications to two real-world multilabel learning problems, i.e., functional genomics and text categorization, show that the performance of BP-MLL is superior to that of some well-established multilabel learning algorithms.
An Adaptation of Binary Relevance for Multi-Label Classification applied to Functional Genomics
- Erica Akemi Tanaka, J. A. Baranauskas
- Biology, Computer Science
- 2012
A new adaptation of the Binary Relevance method that takes into account the correlation among labels is proposed, focusing on the interpretability of the model, not only its performance.
Comparing Several Approaches for Hierarchical Classification of Proteins with Decision Trees
- Eduardo P. Costa, Ana Carolina Lorena, A. Carvalho, A. Freitas, Nicholas Holden
- Biology, Computer Science
- 2007
The main characteristics of hierarchical classification models for Bioinformatics problems are described, and three hierarchical methods based on Decision Trees are applied to protein functional classification datasets.
Multi-label Classification with ART Neural Networks
- E. Sapozhnikova
- Computer Science
- 2009 Second International Workshop on Knowledge…
- 2009
This paper investigates a novel method to solve an MC task by using an Adaptive Resonance Theory (ART) neural network; a modified Fuzzy ARTMAP algorithm, Multi-Label-FAM (ML-FAM), is applied to the classification of multi-label data.
...
37 References
Genome scale prediction of protein functional class from sequence using data mining
- R. King, Andreas Karwath, A. Clare, L. Dehaspe
- Computer Science, Biology
- 2000
Biologically interpretable rules are identified that can predict protein function even in the absence of identifiable sequence homology in Mycobacterium tuberculosis with an estimated accuracy of 60-80% and give insight into the evolutionary history of the organism.
Knowledge-based analysis of microarray gene expression data by using support vector machines.
- M. S. Brown, W. Grundy, D. Haussler
- Computer Science, Biology
- 2000
A method, based on the theory of support vector machines (SVMs), is introduced for functionally classifying genes using gene expression data from DNA microarray hybridization experiments, and for predicting functional roles for uncharacterized yeast ORFs from their expression data.
A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations
- L. Raamsdonk, B. Teusink, Stephen G. Oliver
- Biology, Chemistry
- 2001
It is demonstrated how the intracellular concentrations of metabolites can reveal phenotypes for proteins active in metabolic regulation, and this approach to functional analysis, using comparative metabolomics, is called FANCY—an abbreviation for functional analysis by co-responses in yeast.
Prediction of Enzyme Classification from Protein Sequence without the Use of Sequence Similarity
- Marie desJardins, P. Karp, Markus Krummenacker, Thomas J. Lee, C. Ouzounis
- Biology, Computer Science
- 1997
A novel approach for predicting the function of a protein from its amino-acid sequence that uses machine learning techniques to induce classifiers that predict the EC class of an enzyme from features extracted from its primary sequence.
Cluster analysis and display of genome-wide expression patterns.
- M. Eisen, P. Spellman, P. Brown, D. Botstein
- Biology, Computer Science
- 1998
A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
TRIPLES: a database of gene function in Saccharomyces cerevisiae
- Anuj Kumar, K. Cheung, P. Ross-Macdonald, P. Coelho, P. Miller, M. Snyder
- Biology, Computer Science
- Nucleic Acids Res.
- 2000
Using a novel multipurpose mini-transposon, a collection of defined mutant alleles for the analysis of disruption phenotypes, protein localization, and gene expression in Saccharomyces cerevisiae is generated and cataloged in TRIPLES, a Web-accessible database of TRansposon-Insertion Phenotypes, Localization and Expression in Saccharomyces.
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
- Ron Kohavi
- Computer Science, Mathematics
- 1995
The results indicate that for real-world datasets similar to the authors', the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
C4.5: Programs for Machine Learning
- J. R. Quinlan
- Computer Science
- 1992
A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
...