Large language model-assisted structured reporting in Radiology residents: an implementation pilot study in Emergency Radiology.
Abstract
Objective: To evaluate whether a structured reporting system assisted by large language models (LLMs) can be practically integrated into the work of radiology residents during on-call shifts. Secondary objectives were to describe format preferences through blinded evaluation, to characterize linguistic differences between manual and LLM-assisted reports, and to identify perceived risks ahead of a confirmatory study.
Methods: A two-component pilot study was conducted. In the prospective phase, four residents generated 480 reports, alternating between manual and LLM-assisted writing (a custom GPT-4o assistant). In parallel, 200 anonymized reports from attending physicians were analyzed to contextualize the metrics. An ad hoc Likert-type survey covering six dimensions was administered, and classification and perplexity metrics were calculated as descriptive indicators.
Results: The tool was well received: median Likert scores ranged from 4.75 to 4.90 out of 5. Residents accurately distinguished which reports had been LLM-assisted (F1 = 0.92), suggesting that assisted reports carry a recognizable formal signature. Self-attribution bias was observed in the blinded preferences. Perplexity differed between residents and attending physicians (p = 0.03), suggesting greater regularity among experienced professionals.
Conclusions: The findings support the initial integration of the assistant into the on-call system. Its value lies in its scaffolding function, which standardizes communication between residents and requesting physicians, not in automating diagnostic reasoning.
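For readers unfamiliar with the two descriptive metrics named in the abstract, the following is a minimal sketch of how an F1 score and a perplexity value are conventionally computed. It is not the authors' analysis code. The function names, the toy labels, and the per-token log-probabilities are hypothetical and serve only to illustrate the standard definitions.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical toy data (1 = report judged LLM-assisted, 0 = manual).
true_labels = [1, 1, 1, 0, 0, 0, 1, 0]
rater_guess = [1, 1, 0, 0, 0, 1, 1, 0]
example_f1 = f1_score(true_labels, rater_guess)

# Hypothetical log-probabilities that a language model assigned to each token.
example_ppl = perplexity([math.log(0.5)] * 4)
```

Lower perplexity means the text is more predictable to the scoring language model, which is how the abstract's reading of "greater regularity" is usually interpreted. The study's own choice of scoring model and tokenization is not stated in the abstract.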
Copyright (c) 2026 Servicio de Publicaciones de la Universidad de Murcia

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The works published in this journal are subject to the following terms:
1. The Publications Service of the University of Murcia (the publisher) retains the economic rights (copyright) to the published works and encourages and permits their reuse under the license indicated in point 2.
2. The works are published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.
3. Self-archiving conditions. Authors are allowed and encouraged to disseminate electronically the pre-print (the version before evaluation and submission to the journal) and/or post-print (the version evaluated and accepted for publication) versions of their works before publication, since this favors their circulation and earlier dissemination and, with it, a possible increase in citation and reach within the academic community.












