Published in Vol 26 (2024)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/54571.
Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4

Journals

  1. Zhao B, Liu H, Liu Q, Qi W, Zhang W, Du J, Jin Y, Weng X. Breaking Boundaries in Spinal Surgery: GPT-4's Quest to Revolutionize Surgical Site Infection Management. The Journal of Infectious Diseases 2025;231(2):e345
  2. Brin D, Sorin V, Konen E, Nadkarni G, Glicksberg B, Klang E. How GPT models perform on the United States medical licensing examination: a systematic review. Discover Applied Sciences 2024;6(10)
  3. Le K, Chen J, Mai D, Le K. An Evaluation on the Potential of Large Language Models for Use in Trauma Triage. Emergency Care and Medicine 2024;1(4):350
  4. Azeroual O. Can generative AI transform data quality? a critical discussion of ChatGPT’s capabilities. Academia Engineering 2024;1(4)
  5. Camlet A, Kusiak A, Świetlik D. Application of Conversational AI Models in Decision Making for Clinical Periodontology: Analysis and Predictive Modeling. AI 2025;6(1):3
  6. Janumpally R, Nanua S, Ngo A, Youens K. Generative artificial intelligence in graduate medical education. Frontiers in Medicine 2025;11
  7. Öztürk Z, Bal C, Çelikkaya B. Evaluation of Information Provided by ChatGPT Versions on Traumatic Dental Injuries for Dental Students and Professionals. Dental Traumatology 2025;41(4):427
  8. Erdat E, Kavak E. Benchmarking LLM chatbots’ oncological knowledge with the Turkish Society of Medical Oncology’s annual board examination questions. BMC Cancer 2025;25(1)
  9. Waaler P, Hussain M, Molchanov I, Bongo L, Elvevåg B. Prompt Engineering an Informational Chatbot for Education on Mental Health Using a Multiagent Approach for Enhanced Compliance With Prompt Instructions: Algorithm Development and Validation. JMIR AI 2025;4:e69820
  10. Çalışkan E. Exploring possibilities and limits of ChatGPT: Usage in building design studies. Turkish Journal of Engineering 2025;9(3):490
  11. Demir S. Comparison of ChatGPT-4o, Google Gemini 1.5 Pro, Microsoft Copilot Pro, and Ophthalmologists in the management of uveitis and ocular inflammation: A comparative study of large language models. Journal Français d'Ophtalmologie 2025;48(4):104468
  12. Coen E, Del Fiol G, Kaphingst K, Borsato E, Shannon J, Smith H, Masino A, Allen C. Chatbot for the Return of Positive Genetic Screening Results for Hereditary Cancer Syndromes: Prompt Engineering Project. JMIR Cancer 2025;11:e65848
  13. Liu J, Segal K, Daher M, Ozolin J, Binder W, Bergen M, McDonald C, Owens B, Antoci V. Artificial intelligence versus orthopedic surgeons as an orthopedic consultant in the emergency department. Injury 2025;56(4):112297
  14. Meyer N, Meyer J. A Practical Guide to the Utilization of ChatGPT in the Emergency Department: A Systematic Review of Current Applications, Future Directions, and Limitations. Cureus 2025
  15. Saglam S, Uludag V, Karaduman Z, Arıcan M, Yücel M, Dalaslan R. Comparative evaluation of artificial intelligence models GPT-4 and GPT-3.5 in clinical decision-making in sports surgery and physiotherapy: a cross-sectional study. BMC Medical Informatics and Decision Making 2025;25(1)
  16. Yang X, Xiao Y, Liu D, Deng H, Huang J, Zhou Y, Dai C, Wu J, Liu D, Liang M, Xu C. Cross language transformation of free text into structured lobectomy surgical records from a multi center study. Scientific Reports 2025;15(1)
  17. Chen R, Zhang S, Zheng Y, Yu Q, Wang C. Enhancing treatment decision-making for low back pain: a novel framework integrating large language models with retrieval-augmented generation technology. Frontiers in Medicine 2025;12
  18. Wang W, Fu J, Zhang Y, Hu K. A Comparative Analysis of GPT-4o and ERNIE Bot in a Chinese Radiation Oncology Exam. Journal of Cancer Education 2025
  19. Niel O, Dookhun D, Caliment A. Performance evaluation of large language models in pediatric nephrology clinical decision support: a comprehensive assessment. Pediatric Nephrology 2025;40(10):3211
  20. Hao W, Chen C, Chen K, Li L, Chiu C, Yang T, Jong H, Yang H, Huang C, Liu J, Li Y. ChatGPT Performance Deteriorated in Patients with Comorbidities When Providing Cardiological Therapeutic Consultations. Healthcare 2025;13(13):1598
  21. Jin Y, Wu G, Seo J, Park S, Hur S, Aliyeva D, Park J, Kim K. AI Veterinary Assistance: Enhancing Clinical Decision-Making in Animal Healthcare. IEEE Access 2025;13:119292
  22. Liu Y, Yuan Y, Yan K, Li Y, Sacca V, Hodges S, Cannistra M, Jeong P, Wu J, Kong J. Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations. npj Digital Medicine 2025;8(1)
  23. Han M, Liu Y. Evaluating generative artificial intelligence products using fuzzy social network multi-attribute decision-making model: User perspective. Applied Soft Computing 2025;183:113715
  24. Vithanage D, Yu P, Xie Q, Xu H, Wang L, Deng C. A comprehensive evaluation of large language models for information extraction from unstructured electronic health records in residential aged care. Computers in Biology and Medicine 2025;197:111013
  25. Celikten T, Onan A. Benchmarking Large Language Models for Biomedical Literature Summarization: Abstractive Versus Extractive Paradigms. IEEE Access 2025;13:152682
  26. Jaleel A, Aziz U, Farid G, Zahid Bashir M, Mirza T, Khizar Abbas S, Aslam S, Sikander R. Evaluating the Potential and Accuracy of ChatGPT-3.5 and 4.0 in Medical Licensing and In-Training Examinations: Systematic Review and Meta-Analysis. JMIR Medical Education 2025;11:e68070
  27. Calleo Y, Pilla F. Real-Time AI Delphi: A novel method for decision-making and foresight contexts. Futures 2025;174:103703
  28. Yoon D, Kim C, Ryu Y, Lee Y, Chae Y. Performance of GPT-4 for planning acupuncture treatment: comparison with human clinician performance. Frontiers in Medicine 2025;12
  29. Nanua S, Steward R, Neely B, Datto M, Youens K. Retrieval-augmented generation for interpreting clinical laboratory regulations using large language models. Journal of Pathology Informatics 2025;19:100520
  30. Rai M, Ngaw M, Nannas N. Artificial Intelligence Performance in Introductory Biology: Passing Grades but Poor Performance at High Cognitive Complexity. Education Sciences 2025;15(10):1400
  31. Zand J, Miao J, Hommos M, Schwartz G, Taler S, Nejat P, Cheungpasitporn W, Garovic V, Zoghby Z. Performance of Large Language Models in Analyzing Common Hypertension Scenarios. Hypertension 2025
  32. Ben-Haim G, Livne A, Manor U, Hochstein D, Saban M, Blaier O, Iram Y, Balzam M, Lutenberg A, Eyade R, Qassem R, Trabelsi D, Dahari Y, Eisenmann B, Shechtman Y, Nadkarni G, Glicksberg B, Zimlichman E, Perry A, Klang E. Evaluating empathy in GPT-4-generated vs. physician-written emergency department discharge letters. DIGITAL HEALTH 2025;11
  33. Wang W, Zhou Y, Fu J, Hu K. Evaluating the Performance of DeepSeek-R1 and DeepSeek-V3 Versus OpenAI Models in the Chinese National Medical Licensing Examination: Cross-Sectional Comparative Study. JMIR Medical Education 2025;11:e73469
  34. Roberts A, Patel R, Babu S, Engelhard M, Greenberg R, Ajmera A. Can artificial intelligence pass the test? Evaluating chatbot scores on pediatric gastroenterology board‐style questions. JPGN Reports 2025
  35. Siden R, Kerman H, Gallo R, Cool J, Hom J, Goh E, Ahuja N, Heidenreich P, Shieh L, Yang D, Chen J, Rodman A, Holdsworth L. A typology of physician input approaches to using AI chatbots for clinical decision-making. npj Digital Medicine 2025

Books/Policy Documents

  1. Dehankar P, Das S. Exploration of Transformative Technologies in Healthcare 6.0.