Evaluating the Performance of DeepSeek 3, Claude Sonnet 4, and Gemini 2.5 in the Chilean Medical Licensing Examination: Observational Study.
Abstract
Introduction: Artificial intelligence (AI) tools and their continuous improvement have revolutionized medical education, but their performance in specific evaluative contexts still requires further exploration.
Methods: This study qualitatively evaluated and compared the performance of three state-of-the-art language models (Claude Sonnet 4, Gemini 2.5, and DeepSeek 3) in simulations of the National Medical Knowledge Examination (EUNACOM) in Chile. Three mock exams of 180 questions each were used, covering various medical areas and question types, including questions based on clinical cases.
Results: All three models consistently passed the exams, with Claude Sonnet 4 achieving the highest overall performance (89% accuracy) and the greatest consistency across attempts. Clinical case-based questions were answered more accurately than theoretical knowledge questions, highlighting the models' strength in contextual clinical reasoning. Claude excelled in Internal Medicine and Psychiatry, DeepSeek excelled in Surgery, and Gemini demonstrated balanced performance. However, specific gaps were identified in areas such as Public Health and clinical follow-up, suggesting the need for model-specific adjustments.
Conclusion: The findings support the educational potential of these tools but also emphasize the importance of using them ethically, under supervision, and as a complement to traditional medical training. This study contributes to understanding the emerging role of artificial intelligence in professional assessments, as well as its limitations and opportunities within the Chilean medical context.
Copyright (c) 2025 Servicio de Publicaciones de la Universidad de Murcia

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The works published in this journal are subject to the following terms:
1. The Publications Service of the University of Murcia (the publisher) retains the economic rights (copyright) of the published works and encourages and permits their reuse under the license indicated in point 2.
2. The works are published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.
3. Self-archiving conditions. Authors are allowed and encouraged to disseminate electronically the pre-print (the version before being evaluated and sent to the journal) and/or post-print (the version evaluated and accepted for publication) versions of their works before publication, since this favors earlier circulation and dissemination and, with it, a possible increase in citation and reach among the academic community.












