Abstract
Naive Bayes (NB) is a well-known classification algorithm in machine learning. Because it combines probability estimates that are efficient to compute, NB is often a good choice for high-dimensional problems such as text classification. However, NB relies on the assumption that attributes are conditionally independent given the class, which is often violated in real-world applications. Accordingly, much work has been done to improve the performance of NB, including structure extension, attribute selection, attribute weighting, instance weighting and instance selection. An alternative strategy is to apply NB only over the neighbors of the instance to be classified, where the independence assumption may be more justified. However, this introduces another practical problem: high variance, because the NB model is built from the few neighbors rather than from the full training dataset. In this paper, a new learning algorithm named Weighted Lazy Naive Bayes (WLNB) is presented. WLNB is designed to address the variance issue by augmenting the nearest neighbors of a test instance. A self-adaptive evolutionary process is then applied to learn two key parameters of WLNB automatically, yielding a method named Evolutionary Weighted Lazy Naive Bayes (EWLNB). EWLNB uses Differential Evolution to search for optimal values of these parameters. Experimental evaluations on 56 UCI machine learning benchmark datasets demonstrate that EWLNB significantly outperforms NB and several other improved NB algorithms in terms of classification accuracy and class probability estimation.
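To make the general idea concrete, the following is a minimal sketch and not the authors' WLNB or EWLNB. It fits a naive Bayes model only on the distance-weighted nearest neighbors of a test instance, then uses a classic DE/rand/1/bin loop with fixed control parameters (rather than a self-adaptive variant) to tune two settings by leave-one-out error. The abstract does not say which two parameters are learned, so the neighborhood size `k` and the weight `decay` rate are assumptions, as are all function names, the Hamming distance, the exponential weighting and the toy data.

```python
import math
import random
from collections import defaultdict


def local_weighted_nb(train, labels, x, k, decay):
    """Naive Bayes fitted only on the k nearest neighbours of x, each
    weighted by exp(-decay * Hamming distance). k and decay are
    hypothetical stand-ins for the two tuned parameters."""
    dists = [sum(a != b for a, b in zip(row, x)) for row in train]
    nearest = sorted(range(len(train)), key=lambda i: dists[i])[:k]
    weights = {i: math.exp(-decay * dists[i]) for i in nearest}
    classes = sorted(set(labels))
    n_values = [len({row[j] for row in train}) for j in range(len(x))]
    class_w = defaultdict(float)  # weighted class counts
    attr_w = defaultdict(float)   # weighted (class, attribute, value) counts
    for i, w in weights.items():
        class_w[labels[i]] += w
        for j, v in enumerate(train[i]):
            attr_w[(labels[i], j, v)] += w
    total = sum(class_w.values())
    scores = {}
    for c in classes:
        # Laplace-smoothed log prior plus log conditional probabilities.
        s = math.log((class_w[c] + 1.0) / (total + len(classes)))
        for j, v in enumerate(x):
            s += math.log((attr_w[(c, j, v)] + 1.0) / (class_w[c] + n_values[j]))
        scores[c] = s
    return max(scores, key=scores.get)


def differential_evolution(objective, bounds, pop_size=10, gens=20,
                           F=0.5, CR=0.9, seed=0):
    """Minimise objective over box bounds with DE/rand/1/bin (fixed F, CR)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < CR:
                    gene = min(max(a[d] + F * (b[d] - c[d]), lo), hi)
                else:
                    gene = pop[i][d]
                trial.append(gene)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:  # greedy one-to-one replacement
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=lambda i: cost[i])
    return pop[best], cost[best]


def loo_error(params, data, labels):
    """Leave-one-out error rate for params = (k, decay)."""
    k, decay = max(1, int(round(params[0]))), params[1]
    wrong = 0
    for i in range(len(data)):
        rest, rest_y = data[:i] + data[i + 1:], labels[:i] + labels[i + 1:]
        wrong += local_weighted_nb(rest, rest_y, data[i], k, decay) != labels[i]
    return wrong / len(data)


# Toy categorical data, invented purely for illustration.
data = [("a", "x"), ("a", "y"), ("a", "x"), ("a", "y"),
        ("b", "x"), ("b", "y"), ("b", "x"), ("b", "y")]
labels = ["pos"] * 4 + ["neg"] * 4
best, err = differential_evolution(lambda p: loo_error(p, data, labels),
                                   bounds=[(1, 7), (0.0, 3.0)],
                                   pop_size=8, gens=10)
print("k =", int(round(best[0])), "decay =", round(best[1], 3), "LOO error =", err)
```

Rounding the first coordinate lets a continuous optimizer handle the integer-valued `k`. This is one simple choice among several and is not taken from the paper.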
Cite this article
Bai, Y., Bain, M. Optimizing weighted lazy learning and Naive Bayes classification using differential evolution algorithm. J Ambient Intell Human Comput 13, 3005–3024 (2022). https://doi.org/10.1007/s12652-021-03135-7
DOI: https://doi.org/10.1007/s12652-021-03135-7