Predictive models of academic risk in computing careers with educational data mining

DOI: https://doi.org/10.6018/red.463561
Keywords: educational data mining, academic risk, higher education

Abstract

Poor academic performance and delayed progress are recurrent problems in higher education institutions, especially at the beginning of university studies. Early detection of academic risk makes it possible to implement educational interventions that address the factors behind poor performance, which are associated with students' particular contexts. The purpose of this study was to generate predictive models of academic risk using educational data mining methods, specifically classification techniques, to build, analyze, and validate the models. The data consist of admission exam results and sociodemographic data for 415 students in the computer science majors at the Autonomous University of Yucatán (Mexico), enrolled between 2016 and 2019. The results show that the best model was obtained with the LMT (logistic model tree) classification algorithm, with a precision of 75.42% and an area under the ROC curve of 0.805. The best predictive attributes were also identified; the undergraduate entrance exam scores in particular were highly significant. The study proposes developing software tools for the early detection of academic risk, along with strategies for timely educational intervention.
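As a rough illustration of the two evaluation measures reported above, the following minimal sketch computes precision and the area under the ROC curve for a binary "at risk" classifier. It is not the study's implementation: the function names, the 0.5 decision threshold, and the eight labels and scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    This equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def precision(labels: Sequence[int], scores: Sequence[float],
              threshold: float = 0.5) -> float:
    """Fraction of examples predicted positive that are truly positive."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    true_pos = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    false_pos = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions at this threshold")
    return true_pos / (true_pos + false_pos)


# Hypothetical data: 1 = student at academic risk, 0 = not at risk.
# The scores stand in for a classifier's predicted risk probabilities.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print(f"precision = {precision(labels, scores):.4f}")
    print(f"ROC AUC   = {roc_auc(labels, scores):.4f}")
```

Precision depends on the chosen threshold, whereas the area under the ROC curve summarizes how well the scores rank at-risk students above the others across all thresholds, which is one reason the two figures are often reported together.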

Published
21-04-2021
How to Cite
Ayala Franco, E., López Martínez, R. E., & Menéndez Domínguez, V. H. (2021). Predictive models of academic risk in computing careers with educational data mining. Revista de Educación a Distancia (RED), 21(66). https://doi.org/10.6018/red.463561
Section
Learning Engineering and Instructional Engineering