Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4

Citation

Please cite as:

Lahat A, Sharif K, Zoabi N, Shneor Patt Y, Sharif Y, Fisher L, Shani U, Arow M, Levin R, Klang E
Assessing Generative Pretrained Transformers (GPT) in Clinical Decision-Making: Comparative Analysis of GPT-3.5 and GPT-4
J Med Internet Res 2024;26:e54571
doi: 10.2196/54571 PMID: 38935937 PMCID: 11240076

This paper is in the following e-collection/theme issue:

Artificial Intelligence; Ethics, Privacy, and Legal Issues; Natural Language Processing; Decision Support for Health Professionals; Clinical Communication, Electronic Consultation and Telehealth; Chatbots and Conversational Agents; Generative Language Models Including ChatGPT