Is "ChatGPT" capable of passing the 2022 MIR exam? Implications of artificial intelligence in medical education in Spain

DOI: https://doi.org/10.6018/edumed.556511
Keywords: ChatGPT, MIR Exam, Residency Exam, Artificial Intelligence, Medical Education, Postgraduate Training

Abstract

Artificial intelligence and natural language processing models have entered the field of medical education. Among them, ChatGPT has been used to attempt various international medical exams. However, no literature addresses this phenomenon in Europe or in Spanish-speaking countries. This paper evaluates the ability of ChatGPT to answer the questions of the 2022 MIR exam, which grants access to the Spanish postgraduate training system. To this end, a cross-sectional descriptive analysis was carried out in which ChatGPT answered all the questions of the 2022 MIR exam. ChatGPT answered 51.4% of the questions correctly, which corresponds to approximately 69 net answers on that exam. According to estimates for this year, it would have obtained position 7688, slightly below the median of those taking the exam, but high enough to pass the cut-off score and choose from a large number of specialties. These results are similar to those reported in the existing literature and slightly worse than those obtained by this tool on the American USMLE exams. The development of AI is an opportunity for medical students and residents to learn, but it also carries risks. It is essential to train future specialists in the new reality of artificial intelligence so that they can use these tools and benefit from them in a reasoned and safe manner.

Published
16-02-2023
How to Cite
Carrasco, J. P., García, E., Sánchez, D. A., Porter, E., De La Puente, L., Navarro, J., & Cerame, A. (2023). Is "ChatGPT" capable of passing the 2022 MIR exam? Implications of artificial intelligence in medical education in Spain. Spanish Journal of Medical Education, 4(1). https://doi.org/10.6018/edumed.556511
