Predicting the academic status of admitted applicants based on educational and admission data using data mining techniques

Document Type: Original Article

Authors

1 Department of Computer Engineering, Faculty of Engineering, Mahallat Institute of Higher Education, Mahallat, Iran.

2 Department of Computer Engineering, Faculty of Electrical, Computer and Medical Engineering, Shahab Danesh Institute of Higher Education, Qom, Iran.

3 Department of Computer Engineering, Faculty of Engineering, Qom University, Qom, Iran.

Abstract

Educational data mining has become an increasingly active field of research in recent years, driven by the large volume of student data held by educational institutions. Knowledge extracted from this data can help institutions improve their teaching methods, learning processes, and decision-making. This paper predicts the academic status of students who intend to continue their studies from an associate degree to a bachelor's degree. Because the Ministry of Science plans to eliminate the entrance exam, universities face the challenge of deciding which criteria to use when selecting students. To address this issue, data mining techniques (decision tree, Naïve Bayes, neural network, support vector machine, random forest, Bagging, and Boosting) were applied to the educational information of new students. By comparing this information with that of students who graduated, dropped out, or were expelled at the bachelor's level, a more effective method for selecting students is proposed. The results indicate that random forest achieves the highest accuracy in predicting academic status, at 92.28%, while Naïve Bayes achieves the lowest, at 61.09%.
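The comparison of classifiers by accuracy described above can be illustrated with a minimal sketch. It assumes accuracy means the fraction of students whose predicted status matches their actual status; the model names, labels, and records below are made up for illustration and are not the study's data or its tooling.

```python
from typing import Dict, List, Tuple


def accuracy(actual: List[str], predicted: List[str]) -> float:
    """Fraction of records whose predicted status equals the actual status."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def rank_models(
    actual: List[str], predictions: Dict[str, List[str]]
) -> List[Tuple[str, float]]:
    """Score every model on the same held-out records, best accuracy first."""
    scores = {name: accuracy(actual, preds) for name, preds in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical held-out records: the true status of six students.
actual = ["graduate", "graduate", "dropout", "expelled", "graduate", "dropout"]

# Hypothetical predictions from two placeholder models (not real results).
predictions = {
    "model_a": ["graduate", "graduate", "dropout", "expelled", "graduate", "graduate"],
    "model_b": ["graduate", "dropout", "dropout", "graduate", "graduate", "graduate"],
}

ranking = rank_models(actual, predictions)
for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

In practice each model would be trained on one part of the data and scored on records it has not seen, so that the reported accuracy reflects prediction rather than memorization.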

Keywords

