Mapping Applications and Outcomes of Large-Language-Model-Generated Cases in Health Professions Education: A Scoping Review.
Abstract
Objective: Large language models (LLMs) have rapidly permeated health professions education and are increasingly used to generate clinical cases and vignettes, yet the characteristics of these cases, the methods used to evaluate them, and their educational impact remain unclear. This review maps how LLMs are used to generate cases in health professions education and summarizes reported case characteristics, evaluation approaches, bias, and educational outcomes.
Methods: We conducted a scoping review following Arksey and O’Malley’s framework and reported it in accordance with PRISMA-ScR. PubMed, Web of Science, and Scopus were searched on 27 August 2025. Of 2,023 records identified, 72 full texts were assessed and 23 studies met the inclusion criteria. Data were charted with a structured extraction form.
Results: Across the 23 studies, 33 distinct LLMs were used, most commonly GPT-based models (54.5%). Cases were mainly text-based (69%), with image-based (20.7%) and audio-based (10.3%) formats also reported, and spanned 23 clinical and educational domains. Prompts were reported in 65.2% of studies, and 60.9% included a formal quality evaluation, with rated case quality ranging from high to clearly problematic. Seven studies (30.4%) identified bias or discriminatory patterns. Students participated in 39.1% of studies, but no higher-level educational outcomes such as behavior change or long-term performance were reported.
Conclusions: LLM-generated cases appear feasible and versatile across health professions education, but the supporting evidence is early and methodologically heterogeneous. Future research should standardize quality evaluation, rigorously assess learning and behavioral outcomes, and systematically audit bias in generated content.
Copyright (c) 2026 Servicio de Publicaciones de la Universidad de Murcia

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The works published in this journal are subject to the following terms:
1. The Publications Service of the University of Murcia (the publisher) retains the economic rights (copyright) of the published works and encourages and permits their reuse under the license indicated in point 2.
2. The works are published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.
3. Self-archiving conditions. Authors are allowed and encouraged to disseminate electronically the pre-print (version before being evaluated and sent to the journal) and/or post-print (version evaluated and accepted for publication) versions of their works before publication, since this favors their circulation and earlier dissemination and, with it, a possible increase in citations and reach among the academic community.












