Performance of Large Language Models in Numerical Versus Semantic Medical Knowledge: Cross-Sectional Benchmarking Study on Evidence-Based Questions and Answers

Citation

Please cite as:

Avnat E, Levy M, Herstain D, Yanko E, Ben Joya D, Tzuchman Katz M, Eshel D, Laros S, Dagan Y, Barami S, Mermelstein J, Ovadia S, Shomron N, Shalev V, Abdulnour REE
Performance of Large Language Models in Numerical Versus Semantic Medical Knowledge: Cross-Sectional Benchmarking Study on Evidence-Based Questions and Answers
J Med Internet Res 2025;27:e64452
doi: 10.2196/64452 PMID: 40658983 PMCID: 12279315

This paper is in the following e-collection/theme issue:

Artificial Intelligence; Research Instruments, Questionnaires, and Tools; Generative Language Models Including ChatGPT; AI Language Models in Health Care; Foundation Models and Their Applications in AI
