Summary
Most classification problems assign a single class to each example or instance. However, in many classification tasks each instance can be associated with one or more classes. These problems form the area known as multi-label classification. A typical example is document classification, where each document can be assigned to more than one class. This tutorial presents the most frequently used techniques for dealing with these problems in a pedagogical manner, with examples illustrating the main techniques, and proposes a taxonomy of multi-label techniques that highlights their similarities and differences.
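To make the document example concrete, the following is a minimal sketch of how multi-label data is commonly represented: each instance carries a set of classes, which can be encoded as one 0/1 indicator per class. The document names, the class names, and the helper `to_indicator_matrix` are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
# Hypothetical toy corpus: each document may carry one or more class labels.
documents = {
    "doc1": {"sports"},
    "doc2": {"politics", "economy"},
    "doc3": {"sports", "economy"},
}


def to_indicator_matrix(label_sets):
    """Return (sorted label list, {instance: 0/1 row aligned with that list})."""
    # Collect every class that appears in at least one label set.
    labels = sorted(set().union(*label_sets.values()))
    # One row per instance; entry j is 1 if the instance has labels[j].
    rows = {
        name: [1 if label in assigned else 0 for label in labels]
        for name, assigned in label_sets.items()
    }
    return labels, rows


labels, rows = to_indicator_matrix(documents)
print(labels)
for name, row in rows.items():
    print(name, row)
```

In a single-label problem every row would contain exactly one 1; here a row may contain several, which is what distinguishes the multi-label setting.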
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
de Carvalho, A.C.P.L.F., Freitas, A.A. (2009). A Tutorial on Multi-label Classification Techniques. In: Abraham, A., Hassanien, AE., Snášel, V. (eds) Foundations of Computational Intelligence Volume 5. Studies in Computational Intelligence, vol 205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01536-6_8
DOI: https://doi.org/10.1007/978-3-642-01536-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01535-9
Online ISBN: 978-3-642-01536-6
eBook Packages: Engineering, Engineering (R0)