Published in Vol 26 (2024)
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/66114.

Journals
- Azizoğlu M, Klyuev S. A Comparative Study on the Question-Answering Proficiency of Artificial Intelligence Models in Bladder-Related Conditions: An Evaluation of Gemini and ChatGPT 4.o. Medical Records 2025;7(1):201
- Wei Y, Zhang R, Zhang J, Qi D, Cui W. Research on Intelligent Grading of Physics Problems Based on Large Language Models. Education Sciences 2025;15(2):116
- Zeng J, Sun K, Qin P, Liu S. Enhancing ophthalmology students’ awareness of retinitis pigmentosa: assessing the efficacy of ChatGPT in AI-assisted teaching of rare diseases—a quasi-experimental study. Frontiers in Medicine 2025;12
- Acar A, Yanik E, Altin E, Kurtkaya Kocak O. Is artificial intelligence successful in the Turkish neurology board exam? Neurological Research 2025;47(5):402
- Hasei J, Nakahara R, Takeuchi K, Yoshida A, Itano T, Fujiwara T, Nakata E, Kunisada T, Ozaki T. Comparative analysis of a standard (GPT-4o) and reasoning-enhanced (o1 pro) large language model on complex clinical questions from the Japanese orthopaedic board examination. Journal of Orthopaedic Science 2025;30(3):565
- Budler L, Chen H, Chen A, Topaz M, Tam W, Bian J, Stiglic G. A Brief Review on Benchmarking for Large Language Models Evaluation in Healthcare. WIREs Data Mining and Knowledge Discovery 2025;15(2)
- Bi C, Zheng X, Zhang Y, Zhou S, Song J, Shang H, Shen B. NDDRF 2.0: An update and expansion of risk factor knowledge base for personalized prevention of neurodegenerative diseases. Alzheimer's & Dementia 2025;21(5)
- Wu D, Liu N, Ma R, Wu P. Advancements in Herpes Zoster Diagnosis, Treatment, and Management: Systematic Review of Artificial Intelligence Applications. Journal of Medical Internet Research 2025;27:e71970
- Wei J, Wang X, Huang M, Xu Y, Yang W. Evaluating the Performance of ChatGPT on Board-Style Examination Questions in Ophthalmology: A Meta-Analysis. Journal of Medical Systems 2025;49(1)
- Yan Z, Fan K, Zhang Q, Wu X, Chen Y, Wu X, Yu T, Su N, Zou Y, Chi H, Xia L, Cao Q. Comparative analysis of the performance of the large language models DeepSeek-V3, DeepSeek-R1, open AI-O3 mini and open AI-O3 mini high in urology. World Journal of Urology 2025;43(1)
- Paruzel K, Ordak M. Assessment of ChatGPT-3.5 performance on the medical genetics specialist exam. Laboratory Medicine 2025;56(6):737
- Hu D, Guo Y, Zhou Y, Flores L, Zheng K. A systematic review of early evidence on generative AI for drafting responses to patient messages. npj Health Systems 2025;2(1)
- Souto M, Fernandes A, Silva A, de Freitas Ribeiro L, de Medeiros Fernandes T. A multi-model longitudinal assessment of ChatGPT performance on medical residency examinations. Frontiers in Artificial Intelligence 2025;8
- Zhang A, Zhao E, Wang R, Zhang X, Wang J, Chen E. Multimodal large language models for medical image diagnosis: Challenges and opportunities. Journal of Biomedical Informatics 2025;169:104895
- Gao F, He Y, Chen Q, Liu F. Evaluating Psychological Competency via Chinese Q&A in Large Language Models. Applied Sciences 2025;15(16):9089
- Yang X, Chen W. The performance of ChatGPT on medical image-based assessments and implications for medical education. BMC Medical Education 2025;25(1)
- Armitage R. Potential for Editorial Committee Use of Large Language Models in Peer Review. Journal of Evaluation in Clinical Practice 2025;31(6)
- Armitage R. Artificial General Intelligence and Its Threat to Public Health. Journal of Evaluation in Clinical Practice 2025;31(6)
- Kim K, Kim B. Diagnostic Performance of Large Language Models in Multimodal Analysis of Radiolucent Jaw Lesions. International Dental Journal 2025;75(6):103910
- Reshetnikov R, Tyrov I, Vasilev Y, Shumskaya Y, Vladzymyrskyy A, Akhmedzyanova D, Bezhenova K, Varyukhina M, Sokolova M, Blokhin I, Voytenko D, Mynko O, Kodenko M, Omelyanskaya O. Assessing the quality of large generative models for basic healthcare applications. Medical Doctor and Information Technologies 2025;(3):64
- Chen H, Zeng D, Qin Y, Fan Z, Ng Yu Ci F, Klonoff D, Ji J, Zhang S, Amissah-Arthur K, Jiménez de Tavárez M, Masood S, Van Le P, Keane P, Sheng B, Wong T, Tham Y. Large language models and global health equity: a roadmap for equitable adoption in LMICs. The Lancet Regional Health - Western Pacific 2025;63:101707
- Lu Q. Development of generative artificial intelligence in medical education: a bibliometric profiling. Frontiers in Education 2025;10
- Altermatt F, Neyem A, Sumonte N, Villagrán I, Mendoza M, Lacassie H, Delfino A. Evaluating GPT-4o in high-stakes medical assessments: performance and error analysis on a Chilean anesthesiology exam. BMC Medical Education 2025;25(1)
- Ito T, Ishibashi T, Hayashi T, Kojima S, Sogabe K. Large Language Models for the National Radiological Technologist Licensure Examination in Japan: Cross-Sectional Comparative Benchmarking and Evaluation of Model-Generated Items Study. JMIR Medical Education 2025;11:e81807
- Dejean-Bouyer E, Kanlagna A, Thuau F, Perrot P, Lancien U. Performance of ChatGPT-4 on the French Board of Plastic Reconstructive and Aesthetic Surgery written exam: a descriptive study. Journal of Educational Evaluation for Health Professions 2025;22:27
- Alanazi H, Altalhi L, Alanazi N, Al Ghamdi R, Aboalela A, Shujaat S. Arabian Nights or English Days? Accuracy of Large Language Models in Answering Bilingual Dental Multiple‐Choice Questions. European Journal of Dental Education 2025
- Banskota B, Bhusal R, Yadav P, Banskota A. Artificial intelligence in orthopaedic education, training and research: a systematic review. BMC Medical Education 2025;25(1)
- Pohlmann P, Glienke M, Sandkamp R, Gratzke C, Schmal H, Schoeb D, Fuchs A. Assessing the Efficacy of Ortho GPT: A Comparative Study with Medical Students and General LLMs on Orthopedic Examination Questions. Bioengineering 2025;12(12):1290
Books/Policy Documents
- Zong H, Tao L, Li Z, Wu C, Liu Y, Zhang X. Health Information Processing. Evaluation Track Papers.
