Technical and Practical Implications of Generative Adversarial Networks for Open Science in Education
Supporting Agencies
- Ministerio de Ciencia, Innovación y Universidades
- Departamento de Didáctica e Investigación Educativa
- Universidad de La Laguna
- EDULLAB: Laboratorio de Educación y Nuevas Tecnologías
- Escuela de Doctorado y Estudios de Postgrado
- Gabinete de Planificación y Comunicación
- Vicerrectorado de Agenda Digital, Modernización y Campus Central
- Vicerrectorado de Innovación Docente, Calidad y Campus Anchieta
- Unidad de Docencia Virtual (UDV)
Abstract
Generative Adversarial Networks (GANs), a class of artificial intelligence models, make it possible to create synthetic, anonymised data that can support Open Science in educational research. This study experiments with generating artificial data from a dataset obtained through a survey on levels of use of digital tools and the frequency of personal activities with technology. The original data come from a sample of postgraduate students at the University of La Laguna. The results show an adequate degree of similarity between the original dataset and the one created artificially through predictive algorithms. Obtaining synthetic datasets equivalent to the originals in structure, shape and size allows the data to be released to the academic community while protecting confidential information, and tests a technique that promotes Open Science from the data collection and processing stages onward. Generative Adversarial Networks can be used in educational research to make methodological and technical procedures transparent and to disseminate datasets for academic, research and educational purposes.
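As an illustration of the kind of similarity check the abstract refers to, the sketch below compares how answers to a single survey item are distributed in an original and a synthetic sample, using total variation distance. The response values, the function names and the choice of metric are all assumptions made for illustration; the abstract does not state which similarity measure the study used.

```python
from collections import Counter


def category_shares(responses):
    """Return the relative frequency of each response category."""
    counts = Counter(responses)
    total = len(responses)
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(original, synthetic):
    """Half the sum of absolute differences between category shares.

    0 means the two samples have identical category shares;
    1 means they share no category at all.
    """
    p = category_shares(original)
    q = category_shares(synthetic)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical answers to one 1-5 frequency-of-use item (not data from the study).
original_item = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
synthetic_item = [1, 2, 3, 3, 3, 3, 4, 4, 4, 5]

distance = total_variation_distance(original_item, synthetic_item)
print(f"Total variation distance: {distance:.2f}")
```

A value near 0 for every item would suggest that the synthetic sample reproduces the marginal distributions of the original. It would say nothing about relationships between items or about re-identification risk, which need separate checks.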
Copyright (c) 2022 Anabel Bethencourt-Aguilar, Dagoberto Castellanos-Nieves, Juan José Sosa-Alonso, Manuel Area-Moreira
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish with this journal accept the following terms:
a. Authors retain their copyright and grant the journal the right of first publication of their work, which is simultaneously subject to the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license allows others to share, copy and redistribute the material in any medium or format, and to adapt, remix, transform and build upon the material, under the following terms:
Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. NonCommercial - You may not use the material for commercial purposes. ShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
b. Authors may enter into other non-exclusive licensing agreements for the distribution of the published work (e.g. depositing it in an institutional electronic archive or publishing it in a monographic volume), provided that its initial publication in this journal is acknowledged.
c. Authors are allowed and encouraged to distribute their work online (e.g. in institutional electronic archives or on their own website) before and during the submission process, as this can lead to productive exchanges and increase citations of the published work. (See The effect of open access).
d. In all cases, the Editorial Team considers the opinions expressed by the authors to be their sole responsibility.