Designing a model for data stream classification using reinforcement learning and stochastic gradient descent

Article type: Research Article


Authors

  • Samira Farzaneh
  • Javad Salimi Sartakhti
Department of Electrical and Computer Engineering, University of Kashan, Kashan, Iran

Abstract

Most research on online learning has focused on overcoming catastrophic forgetting, and few studies have addressed classifying data streams with suitable accuracy and running time. Moreover, because of the volume and nature of streaming data, many traditional machine learning algorithms are not efficient enough to handle it directly. In this paper, we therefore present a new model for data stream classification, built on reinforcement learning and the stochastic gradient descent algorithm, that achieves suitable accuracy and training time. An important property of reinforcement learning is that the agent can gradually adapt its behaviour to changes as they occur and incrementally extend its prior knowledge; here, the use of reinforcement learning and a suitably defined reward lets the agent perform better in its environment. The proposed algorithm is evaluated on several datasets, including a human activity recognition data stream, and is compared with several incremental algorithms in terms of accuracy and running time. The experimental results show that the proposed algorithm outperforms the other incremental algorithms in both accuracy and running time.
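
The abstract does not specify implementation details, so the following is a minimal, hypothetical sketch of the general idea only: an online logistic-regression classifier updated by stochastic gradient descent, with the step size modulated by a simple reward signal (+1 for a correct prediction, -1 otherwise) standing in for the reinforcement-learning component. The class name, reward scheme, and step-size rule are illustrative assumptions, not the authors' model.

```python
import numpy as np

class RewardModulatedSGD:
    """Toy online classifier: logistic regression trained by stochastic
    gradient descent, one sample at a time. The SGD step size is scaled
    by a simple reward signal, as a stand-in for the reinforcement-
    learning component described in the abstract (hypothetical scheme)."""

    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)  # weight vector
        self.b = 0.0                   # bias term
        self.lr = lr                   # base learning rate

    def predict_proba(self, x):
        # Sigmoid of the linear score.
        return 1.0 / (1.0 + np.exp(-(self.w @ x + self.b)))

    def partial_fit(self, x, y):
        """Predict, collect a reward, then take one SGD step on (x, y)."""
        p = self.predict_proba(x)
        pred = int(p >= 0.5)
        # Hypothetical reward: +1 if the prediction was correct, -1 otherwise.
        reward = 1.0 if pred == y else -1.0
        # Take larger steps after mistakes, smaller ones when already correct.
        step = self.lr * (2.0 if reward < 0 else 0.5)
        grad = p - y  # gradient of the log loss w.r.t. the linear score
        self.w -= step * grad * x
        self.b -= step * grad
        return pred

# Toy stream: 2-D points whose label is the sign of the first coordinate.
rng = np.random.default_rng(0)
model = RewardModulatedSGD(n_features=2)
hits = 0
for t in range(1000):
    x = rng.normal(size=2)
    y = int(x[0] > 0)
    hits += int(model.partial_fit(x, y) == y)  # test-then-train
print(f"prequential accuracy: {hits / 1000:.3f}")
```

Accuracy in this sketch is measured prequentially, i.e. each sample is predicted before it is used for training, which matches the usual test-then-train evaluation protocol for data streams.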

Keywords

  • Data stream
  • Accuracy and running time
  • Stochastic gradient descent
  • Incremental learning
  • Reinforcement learning