Journal Description

The Journal of Medical Internet Research (JMIR) is the pioneer open access eHealth journal, and is the flagship journal of JMIR Publications. It is a leading health services and digital health journal globally in terms of quality/visibility (Journal Impact Factor 6.0, Journal Citation Reports 2025 from Clarivate), ranking Q1 in both the 'Medical Informatics' and 'Health Care Sciences & Services' categories, and is also the largest journal in the field. The journal is ranked #1 on Google Scholar in the 'Medical Informatics' discipline. The journal focuses on emerging technologies, medical devices, apps, engineering, telehealth and informatics applications for patient education, prevention, population health and clinical care.

JMIR is indexed in all major literature indices, including National Library of Medicine (NLM)/MEDLINE, Sherpa/Romeo, PubMed, PMC, Scopus, PsycINFO, Clarivate (which includes Web of Science (WoS)/ESCI/SCIE), EBSCO/EBSCO Essentials, DOAJ, GoOA, and others. The Journal of Medical Internet Research received a Scopus CiteScore of 11.7 (2024), placing it in the 92nd percentile (#12 of 153) as a Q1 journal in the field of Health Informatics. It is a selective journal complemented by almost 30 specialty JMIR sister journals, which have a broader scope, and which together receive over 10,000 submissions a year.

As an open access journal, we are read by clinicians, allied health professionals, informal caregivers, and patients alike, and have (as with all JMIR journals) a focus on readable and applied science reporting the design and evaluation of health innovations and emerging technologies. We publish original research, viewpoints, and reviews (both literature reviews and medical device/technology/app reviews). Peer-review reports are portable across JMIR journals and papers can be transferred, so authors save time: rather than resubmitting a paper to a different journal, they can simply transfer it between journals.

We are also a leader in participatory and open science approaches, and offer the option to publish new submissions immediately as preprints, which receive DOIs for immediate citation (eg, in grant proposals), and for open peer-review purposes. We also invite patients to participate (eg, as peer-reviewers) and have patient representatives on editorial boards.

Like all JMIR journals, the journal encourages Open Science principles and strongly encourages publication of a protocol before data collection. Authors who have published a protocol in JMIR Research Protocols get a discount of 20% on the Article Processing Fee when publishing a subsequent results paper in any JMIR journal.

Be a widely cited leader in the digital health revolution and submit your paper today!

 

Recent Articles:

  • Source: Freepik; Copyright: freepik; URL: https://www.freepik.com/free-photo/close-up-man-with-joystick-couch_6528089.htm; License: Licensed by JMIR.

    Genre-Specific Gaming Addiction and Flourishing in Adolescents: Cross-Sectional Survey Study

    Abstract:

    Background: Adolescent gaming addiction (GA) has been linked to a range of adverse health outcomes. However, whether the associated health risks differ across game genres remains poorly understood. Objective: Guided by VanderWeele’s multidimensional flourishing framework, this study aims to examine genre-specific associations between GA and flourishing among adolescents. Methods: This study used a cross-sectional observational design. A total of 2194 middle school students were recruited via convenience sampling from a private tutoring center in a northwestern city in China. Eligibility criteria were (1) enrollment in participating classes at the tutoring center, (2) provision of both student and parental consent, and (3) presence during questionnaire administration. The mean age of participants was 14.53 (SD 0.76) years; 985 (44.90%) were boys and 1174 (53.51%) were girls. During class time, students completed paper-based questionnaires that assessed their demographics, gaming addiction, and flourishing. Participants listed up to 3 video games played in the past month and rated their addiction to each. Games were classified into 8 genres: action and adventure (AA), sandbox and simulation (SS), multiplayer online battle arena (MOBA), shooting, strategy, casual, sports, and role-playing. Flourishing was assessed using the Human Flourishing Index across 5 domains: happiness and life satisfaction, mental and physical health, meaning and purpose, character and virtue, and close social relationships. Results: Robust linear regression analyses (α=.05) showed that AA addiction was associated with lower overall flourishing (b=–3.11, 95% CI –4.34 to –1.88) and all 5 subdomains (happiness and life satisfaction: b=–0.46, 95% CI –0.75 to –0.17; mental and physical health: b=–0.61, 95% CI –0.88 to –0.34; meaning and purpose: b=–0.55, 95% CI –0.82 to –0.27; character and virtue: b=–0.74, 95% CI –1.06 to –0.43; and close social relationships: b=–0.62, 95% CI –0.92 to –0.32). 
MOBA addiction was associated with lower overall flourishing (b=–1.33, 95% CI –2.34 to –0.32), character and virtue (b=–0.34, 95% CI –0.59 to –0.08), and meaning and purpose (b=–0.34, 95% CI –0.56 to –0.11). SS addiction was associated with lower overall flourishing (b=–3.42, 95% CI –5.80 to –1.04), close social relationships (b=–0.86, 95% CI –1.46 to –0.27), and mental and physical health (b=–1.09, 95% CI –1.60 to –0.58). Conclusions: This study provides novel evidence that the association between GA and adolescent flourishing is genre dependent. In contrast to prior research that conceptualizes health narrowly or unidimensionally, a multidimensional perspective provides a more nuanced understanding of the health risks associated with GA. The findings advance the field by showing that addiction to AA, MOBA, and SS games is associated with greater health risks than addiction to other genres. Accordingly, prevention, education, and policy efforts should prioritize higher-risk genres to promote adolescent health.

  • AI-generated image in response to the request "Generate a photo of a 30-year-old woman who is worried after receiving a false-positive result from cancer screening" (translated from Japanese; Generator: ChatGPT; Requestor: Kosuke Sakai; Date: 2026-02-03). Source: ChatGPT; Copyright: N/A (AI-Generated image); URL: https://www.jmir.org/2026/1/e82322; License: Public Domain (CC0).

    Effectiveness of Educational Videos in Encouraging Preferences for Guideline-Based Cancer Screening in Japan: Three-Arm Pseudorandomized Controlled Trial

    Abstract:

    Background: Although cancer screening is essential for early detection and an improved prognosis, screening beyond the recommended guidelines may increase the risk of false-positive results. Consequently, educating individuals about the potential harm of non–guideline-based cancer screening is essential; however, effective communication methods remain unclear. Objective: This study aimed to evaluate the effectiveness of different types of educational videos in encouraging preferences for guideline-based cancer screening. Methods: This 3-arm pseudorandomized controlled trial was conducted in June 2025 using a Japanese online survey platform. Eligible respondents were working adults aged 30 to 60 years with no history of major cancer. Respondents were assigned to 1 of the following 3 video conditions: video A, which provided a logical explanation of false-positive risks; video B, which presented the narrative of a woman who received a false-positive result from breast cancer screening; and video C, which depicted a man who underwent unnecessary follow-up testing after tumor marker screening. The primary outcome was the preference for guideline-based cancer screening after watching the videos. The secondary outcomes included 7 self-reported video evaluation items, such as perceived relevance and clarity, assessed using a 5-point Likert scale. Differences in the primary outcome between video groups were analyzed using multivariable logistic regression with adjustment for covariates. Means and 95% CIs were calculated for each secondary outcome according to sex and video group. In addition, before-and-after changes in screening preferences were assessed using McNemar test, with a significance level of .05. Results: In total, 1200 respondents (400 per group) completed the survey. No statistically significant differences in the primary outcome were observed among the video groups. 
With reference to video A, the adjusted odds ratios for preferring guideline-based screening were 0.89 (95% CI 0.59-1.32) for video B and 0.98 (95% CI 0.65-1.46) for video C. Regarding secondary outcomes, male respondents rated video B less favorably than female respondents in terms of relevance and willingness to undergo guideline-based screening. The before-and-after comparison showed a significant change in preference for guideline-based screening (P=.04). These videos appeared to be more effective for individuals with an annual history of colorectal cancer screening than for those without such a history. Conclusions: Educational videos have the potential to influence cancer screening preferences; however, no single video format has demonstrated clear superiority. These findings underscore the importance of tailoring educational materials to the target audience characteristics. Further research is required to develop effective strategies for encouraging guideline-based cancer screening. Trial Registration: University Hospital Medical Information Network Clinical Trials Registry UMIN000060549; https://center6.umin.ac.jp/cgi-open-bin/ctr/ctr_view.cgi?recptno=R000066119

  • Source: Freepik; Copyright: freepik; URL: https://www.freepik.com/free-photo/front-view-psychologist-patient_8623674.htm; License: Licensed by JMIR.

    Text-Based Depression Estimation Using Machine Learning With Standard Labels: Systematic Review and Meta-Analysis

    Abstract:

    Background: Depression affects people’s daily lives and can even lead to suicidal behavior. Text-based depression estimation using natural language processing has emerged as a feasible approach for early mental health screening. However, most existing reviews included studies with weak depression labels, which affected the reliability of the results and further limited the practical application of automatic depression estimation models. Objective: This review aimed to evaluate the predictive performance of text-based depression models that used standard labels, and to identify text resources, text representation, model architecture, annotation source, and reporting quality contributing to performance heterogeneity. Methods: Following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 guidelines, we systematically searched 4 main databases (PubMed, Scopus, IEEE Xplore, and Web of Science) for studies published between 2014 and 2025. Studies were eligible if they developed machine learning models based on text generated by the participants and used validated scales or clinical diagnoses as depression labels. Pooled effect sizes (r) were calculated using random-effects meta-analysis with Hartung-Knapp-Sidik-Jonkman correction, and subgroup and meta-regression analyses were conducted to explore potential moderators. Results: We screened 3067 articles and included 15 models from 11 studies in the meta-analysis. The overall pooled effect size was 0.605 (95% CI 0.498-0.693), indicating a large strength of association. Subgroup analyses showed that models using embedding-based text representations achieved higher performance than those using traditional features (r=0.741, 95% CI 0.648-0.812 vs r=0.514, 95% CI 0.385-0.623; P<.001 for subgroup difference), and deep learning architectures outperformed shallow models (r=0.731, 95% CI 0.660-0.789 vs r=0.486, 95% CI 0.352-0.599; P<.001). 
Models trained with clinician diagnoses also outperformed those relying on self-report scales (r=0.688, 95% CI 0.554-0.787 vs r=0.500, 95% CI 0.340-0.631; P=.03). Reporting quality was positively associated with model performance (β=0.085, 95% CI 0.050-0.119; P<.001). The Begg–Mazumdar test (Kendall τ=0.17143, P=.37) and the Egger test (t14=1.13401, 2-tailed P=.28) indicated no evidence of small-study effects. Conclusions: Text-based depression estimation models trained with standard depression labels demonstrate solid predictive performance, with embedding features, deep model architectures, and clinician diagnosis labels showing significantly higher performance. Transparent reporting is also positively associated with model performance. This study highlights the importance of standard labels, feature representation, and reporting quality for improving model reliability. Unlike prior reviews that included weak or heterogeneous depression labels, this study offers more clinically reliable and comparable evidence. Moreover, this review provides clearer methodological guidance for developing more consistent and practically informative text-based depression screening models. Trial Registration: PROSPERO CRD420251056902; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251056902

  • Patient filling out computerized history taking. Source: Fredrik Persson; Copyright: Fredrik Persson; URL: https://www.jmir.org/2026/1/e76087/; License: Creative Commons Attribution (CC-BY).

    Computerized Self-Reported Medical History Taking to Support Early Rule Out of Major Adverse Cardiac Events in Patients With Acute Chest Pain: Post Hoc...

    Abstract:

    Background: Self-reported, computerized history taking (CHT) may enable efficient collection of medical histories for acute chest pain management. Objective: The primary aim is to determine the diagnostic performance of 4 CHT-derived chest pain risk scores for ruling out 30-day major adverse cardiac events (MACEs) or acute coronary syndrome (ACS). The secondary aim is to assess their impact on patient disposition in the emergency department (ED). Methods: This is a prospective cohort study conducted at a tertiary hospital ED in Stockholm, Sweden. Clinically stable adults (≥18 years) with chest pain and an electrocardiogram (ECG) not indicating an acute disease requiring immediate care provided medical histories via a tablet-based CHT program (Clinical Expert Operating System [CLEOS]). CHT data and ECG interpretations and troponin values were used to calculate the History, ECG, Age, Risk Factors, and Troponin (HEART) score, Danderyd HEART (D-HEART) score, Emergency Department Assessment of Chest Pain Score combined with an Accelerated Diagnostic Protocol (EDACS-ADP), and Troponin-only Manchester Acute Coronary Syndrome (T-MACS). The primary outcome was 30-day ACS; the secondary outcome was 30-day MACE (ACS, revascularization, or cardiovascular death). Results: Among 1000 participants (age: mean 55 years, SD 17 years; 456/1000, 45.60%, women), risk scores could be calculated in 838 (83.80%). Within 30 days, 65 (6.50%) participants experienced ACS, and 72 (7.20%) had a MACE. Negative predictive values were 0.99 (95% CI 0.97-1.00) for both outcomes. Sensitivity for MACE was 0.91 (95% CI 0.81-0.97) for HEART, 0.94 (95% CI 0.86-0.98) for D-HEART, 0.94 (95% CI 0.86-0.98) for EDACS-ADP, and 0.97 (95% CI 0.90-1.00) for T-MACS, with similar results for ACS. 
As many as 89 of the 528 (16.9%) patients admitted could be reclassified from “nonlow risk” to “low risk.” Among reclassified patients, 30-day MACE or ACS occurred in 0-4 cases; miss rates were below 1% for D-HEART (4/416, 0.96%) and T-MACS (2/286, 0.7%), but exceeded 1% for HEART (6/406, 1.5%) and EDACS-ADP (4/346, 1.2%). Conclusions: Automated, self-reported CHT provided sufficient data to calculate 4 chest pain risk scores in 838 of 1000 (83.80%) patients with acute chest pain, with score calculation dependent on physician-interpreted ECGs. These CHT-derived risk scores demonstrated good diagnostic performance for ruling out 30-day MACE and ACS. Performance was broadly comparable with prior studies using physician-acquired scores, although suggested safety thresholds were primarily met by D-HEART and T-MACS. The improved safety of D-HEART compared with HEART is likely attributable to the incorporation of serial 0/1-hour troponin testing. Use of CHT-derived risk scores may reclassify a substantial fraction of admitted patients as “low risk,” potentially supporting discharge decisions in selected patients, while admission may still be required for non-ACS reasons. However, any gains in discharge rates should be weighed against the possibility of missed events among reclassified patients. Multicenter studies are needed to confirm generalizability, operational feasibility, and safety. Trial Registration: ClinicalTrials.gov NCT03439449; https://clinicaltrials.gov/ct2/show/NCT03439449

  • Source: Freepik; Copyright: Freepik; URL: https://www.freepik.com/free-photo/side-view-woman-holding-smartphone_34653890.htm; License: Licensed by JMIR.

    User Profiles and Engagement in a Hypertension Self-Management App: Cross-Sectional Survey

    Abstract:

    Background: Mobile health (mHealth) technologies can improve hypertension self-management, yet real-world adoption remains limited and unequally distributed. Objective: This study aimed to characterize the profiles, usage patterns and engagement of active users of a hypertension self-management app (Hypertension.APP) in Germany, with a focus on user engagement and potential digital divides. Methods: We conducted a cross-sectional online survey among adult users of Hypertension.APP in Germany between January and September 2023. An 88-item questionnaire assessed app usage patterns, perceived utility, integration into clinical care, sociodemographic and clinical data, and digital health literacy (eHealth Literacy Scale, eHEALS; scores 16–40). Digital health literacy was categorized as low (16–23.99), moderate (24–31.99), or high (32–40). Descriptive statistics and univariable ordinal logistic regression were used to explore associations between sociodemographic and clinical variables and app usage frequency. Results: Of 254 respondents (mean age 53.6 years, 54.3% male), 44.5% had a university or technical college degree, and 44.5% reported a monthly net income higher than €2500. Most participants (88.2%) reported access to at least two digital devices. Overall, 88.2% had moderate or high digital health literacy (eHEALS ≥24). App engagement was high: 80.7% used the app at least weekly, and 52.4% used it to prepare for medical visits. However, only 20.1% reported that the app was formally integrated into their medical care, and 11.8% indicated that medication had been adjusted based on app data. In univariable ordinal logistic regression, higher education, longer duration of hypertension, and living in a small town (5,000–20,000 inhabitants) were associated with more frequent app use, whereas systolic blood pressure ≥140 mmHg was associated with less frequent use. Digital health literacy was not clearly associated with app usage frequency among current users. 
Conclusions: Users of this hypertension self-management app were predominantly well-educated, digitally literate individuals with established hypertension, reinforcing concerns about a persistent digital divide. While app usability and engagement were high, formal clinical integration remained limited. Simply making an app available is insufficient; strategies to promote equitable access, strengthen clinical integration, and support patients with lower digital health literacy are needed for mHealth to contribute effectively to hypertension management. Clinical Trial: German Clinical Trials Register (DRKS00029761)

  • Source: Freepik; Copyright: freepik; URL: https://www.freepik.com/free-photo/business-concept-with-progress-close-up_19924257.htm; License: Licensed by JMIR.

    Data-Driven Guideline Adherence in Data Representation and Compliance Measurement: Scoping Review

    Abstract:

    Background: Best practice standards, in the form of clinical practice guidelines (CPGs) and clinical pathways (CPs), aim to standardize care and improve outcomes. However, variation in clinical practice exists, and not all deviations are inappropriate. Measuring adherence to best practice standards remains challenging due to limitations in representation methods and data fidelity. Objective: This scoping review aims to survey and synthesize the existing literature on the computable representation of guideline recommendations and to explore methods for detecting and quantifying deviations from best-practice standards. Methods: We followed the Arksey and O’Malley framework and PRISMA-ScR guidelines. Five databases (Ovid Medline, EMBASE, IEEE Xplore, Web of Science, and Scopus) were searched in November 2025. Studies were included if they either (1) described or modelled a computer representation of best practice standards, or (2) assessed adherence to such standards using patient data, including patient data derived from electronic medical records (eMR) or patient event logs. Titles, abstracts, and full texts were screened using Covidence. Data were extracted on representation, clinical context, data sources, adherence metrics, and modelling techniques. A narrative synthesis was conducted to identify themes. Results: Twenty-four studies were included. Most studies were published as conference proceedings (56%). Fourteen studies (58%) included measurement of adherence to best practice standards. Cardiovascular conditions were the most common focus (n=13, 54%). Data sources included Health Level Seven (HL7) messages, structured eMR data, event logs, and Fast Healthcare Interoperability Resources (FHIR)-transformed data. Best practice standards were formalized using Business Process Model and Notation (BPMN) (n=6, 25%), ontologies (n=7, 29%), FHIR (n=4, 17%), or hybrid approaches (n=4, 17%). 
The most common method for adherence measurement was rule-based alignment of patient data with guideline components. Several studies incorporated weighted scoring to differentiate the severity of deviations. Process mining was used in a subset to detect sequence and timing variations. However, most models lacked contextual sensitivity and rarely incorporated patient-specific factors, such as comorbidities, patient acuity, or clinician rationale. Consequently, although deviations can be automatically identified, determining whether they were clinically warranted remained largely unresolved. Conclusions: Despite promising advances, challenges persist in representing best-practice standards in computer-interpretable formats and measuring adherence in a clinically meaningful way. Current approaches predominantly assess technical alignment rather than clinical relevance and are limited by data quality and standardization, thereby limiting real-world utility. This scoping review offers an innovative contribution by synthesizing evidence from two separate domains – the computable representation of best practice standards and the measurement of adherence. The findings emphasize the need for context-aware, standardized modelling and integration with clinical workflows that can distinguish warranted from unwarranted deviations. Developing such systems will be crucial to enabling scalable, transparent and real-time adherence monitoring – ultimately driving safer, patient-centered care delivery.

  • Source: Freepik; Copyright: lookstudio; URL: https://www.freepik.com/free-photo/attractive-woman-sitting-home-working_28789069.htm; License: Licensed by JMIR.

    Digital Inequalities in the Use of eHealth Services in European Public Health Care Systems: Systematic Review of Observational Studies

    Abstract:

    Background: European public healthcare systems are expanding eHealth tools such as teleconsultations, online appointment bookings, and electronic health records (EHRs) to improve efficiency and access to healthcare. However, their use depends on factors like digital skills and internet access, which are unequally distributed across socioeconomic and demographic determinants. Most existing evidence on these inequalities is qualitative or comes from outside universal healthcare systems. Objective: This systematic review aims to synthesize quantitative evidence on inequalities in access to and use of eHealth services—such as online appointment booking, teleconsultations, and access to EHRs or an eHealth portal—within European public healthcare systems, by examining differences across age, gender, socioeconomic status, education, and other social determinants of health. Methods: A systematic search was conducted across four electronic databases (PubMed, Scopus, Web of Science, and PsycINFO) for studies published in English or Spanish between 2015 and 2025. Eligible quantitative studies focused on adults aged 18 and older using public healthcare systems in European countries. Screening and data extraction were independently performed by three reviewers using Rayyan®, with disagreements resolved by a third reviewer. Extracted information included study characteristics, population details, digital health tools assessed, social determinants, and quantitative outcomes. Risk of bias was evaluated using Joanna Briggs Institute appraisal tools. Due to study heterogeneity in the digital tools assessed and inequality dimensions analyzed, a narrative synthesis was used to summarize findings by type of digital tool and social inequality factors. Results: A total of 2,366 records were retrieved through the initial search, from which 18 studies were included in the systematic review. Publication output increased notably from 2020 onwards, with most studies published between 2020 and 2025. 
Most of the research originated from northern and western Europe. The findings of this review reveal consistent social gradients in the use of digital health tools within European public healthcare systems. Older adults, individuals with lower educational or socioeconomic levels, ethnic minorities, and those with limited digital skills or poorer health status were less likely to use eHealth tools. Conclusions: Digital transformation in European public health systems has not benefited all groups equally. This review highlights persistent social inequalities in the use of key digital health tools. However, existing research has certain limitations, including heterogeneity in study designs and populations, exclusion of qualitative studies due to methodological criteria, and geographical concentration of studies primarily in Northern and Western Europe. Future research should deepen understanding of how these inequalities emerge and interact, incorporating both individual and structural factors. Emphasizing an intersectional approach and standardizing measures of digital access will be essential to develop effective, equity-focused policies that ensure inclusive digital health services for all. Clinical Trial: PROSPERO CRD420251015756; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251015756

  • Source: Freepik; Copyright: Freepik; URL: https://www.freepik.com/free-photo/medium-shot-doctor-holding-smartphone_32338375.htm; License: Licensed by JMIR.

    Problems and Barriers Regarding the Admission, Financing, and Service Provision of Digital Health Apps: Qualitative Stakeholder Survey

    Abstract:

    Background: Since their introduction with the Digital Care Act in 2019, digital health apps (Digitale Gesundheitsanwendungen [DiGA]) have been part of the German statutory healthcare system. To become a DiGA, mHealth apps have to complete a certification process covering both technical and evidence-related aspects. After completion, DiGA are added to the DiGA directory, which lists all DiGA reimbursable within German statutory health insurance (SHI). The first apps were added at the end of 2020, and the number has steadily increased since. Because the system is new, problems and barriers to optimal use have arisen, which this research article examines from different stakeholder perspectives. Objective: The aim of the survey was to identify problems and barriers in the context of certification, financing, and use of DiGA in Germany. Methods: We used semi-structured expert interviews to evaluate the perspective of stakeholders of the German healthcare system on DiGA. The interview guide was developed according to Helfferich; the interviews were transcribed and analyzed using the qualitative content approach by Mayring and Kuckartz. Results: We identified problems from stakeholder perspectives regarding the certification/admission, financing, and service provision of DiGA. The interviewed stakeholders reported problems with authorization of DiGA and the corresponding process. DiGA prices and the different negotiation positions were criticized, as well as financial challenges for smaller DiGA manufacturers. Within service provision, technical problems (eg, with activation codes or software surrounding DiGA prescription) were mentioned. Problems were also seen in insufficient knowledge and skills on the side of the patients as well as the medical providers. Conclusions: mHealth applications provide potentially disruptive innovations within the healthcare sector. 
Nevertheless, since the evidence-based and regulated use of this technology is relatively new, there are still problems and barriers limiting its optimized, patient-centered use. This study provides an overview of problems in the context of DiGA in Germany from the stakeholder perspective. Since other countries have shown interest in potentially adopting the German system, valuable implications can be drawn from this survey.

  • Source: Freepik; Copyright: halayalex; URL: https://www.freepik.com/free-photo/young-man-working-with-laptop-man-s-hands-notebook-computer-business-person-casual-clothes-street_7200995.htm; License: Licensed by JMIR.

    Screen Time and Chronic Pain Health: Mendelian Randomization Study

    Abstract:

    Background: The rapid proliferation of electronic devices has increased screen time, raising concerns about its potential health effects, including chronic pain. However, existing studies have limitations in scope and causal inference, with inconsistent findings and a lack of exploration of potential biological mechanisms. Objective: The objective of our study was to investigate the causal associations and potential shared biological mechanisms between different forms of screen time and various chronic pain phenotypes. Methods: Leveraging genome-wide association study data, we investigated the association and potential shared biological mechanisms between screen time (time spent watching television, time spent using computer, and length of mobile phone use) and chronic pain phenotypes (including multisite chronic pain [MCP], back, knee, neck or shoulder, hip pain, and headaches). Two-sample Mendelian randomization (MR), reverse MR, and multivariable Mendelian randomization (MVMR) analyses were performed to examine associations between screen time and chronic pain. Summary data–based Mendelian randomization (SMR), transcriptome-wide association study (TWAS), and colocalization analysis were used to identify the shared genes and potential biological mechanism. Results: MR analysis revealed that time spent watching television and length of mobile phone use were positively associated with several types of chronic pain, while time spent using computer showed a negative association. Specifically, time spent watching television was positively associated with the risk of MCP (P=1.05×10⁻³¹; odds ratio [OR] 1.61, 95% CI 1.49-1.74), back pain (P=2.41×10⁻⁸; OR 1.14, 95% CI 1.09-1.19), knee pain (P=7.10×10⁻⁶; OR 1.09, 95% CI 1.05-1.13), neck or shoulder pain, and hip pain. Length of mobile phone use was positively associated with the risk of MCP (P=2.15×10⁻⁵; OR 1.22, 95% CI 1.11-1.34), headaches, and neck or shoulder pain. 
However, time spent using computer was negatively associated with the risk of MCP (P<.001; OR 0.83, 95% CI 0.75-0.92), back pain, and knee pain. The reverse MR results showed that MCP was positively associated with time spent watching television (P=4.8×10–7; OR 1.27, 95% CI 1.16-1.4) and length of mobile phone use (P=3.38×10–5; OR 1.29, 95% CI 1.14-1.45), while the association with time spent using computer (P=.61; OR 0.97, 95% CI 0.87-1.09) was not statistically significant. The MVMR results failed to meet the criterion that all conditional F-statistics exceed 10. Integrative 3 analysis methods identified overlapping genes, with CEP170 emerging as a key gene consistently supported by SMR, TWAS, and colocalization analysis in the relationship between time spent using computer and MCP. Conclusions: Our findings demonstrate an association between screen time and various aspects of chronic pain. The CEP170 gene might contribute to the shared biological mechanism between time spent using computer and MCP risk. However, due to the absence of robust MVMR results, the potential influence of confounding factors cannot be ruled out. Trial Registration:
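
As a rough illustration of the two-sample MR step described in this abstract, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics and converts it to an odds ratio with a 95% CI. The abstract does not say which estimator the authors used; IVW is assumed here only because it is a common default. The function name `ivw_estimate` and all input numbers are hypothetical and are not study data.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the exposure -> outcome effect.

    Each list holds one entry per genetic variant (hypothetical inputs):
    variant-exposure effect, variant-outcome effect, and the standard
    error of the variant-outcome effect.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    # Weight of each variant's Wald ratio: (beta_exposure / se_outcome)^2
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    # Wald ratio per variant: outcome effect divided by exposure effect
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return beta, se

# Made-up summary statistics for two variants (illustration only)
beta, se = ivw_estimate([0.10, 0.20], [0.05, 0.10], [0.01, 0.01])
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
print(f"OR {odds_ratio:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

The estimate is on the log-odds scale, which is why it is exponentiated before being reported as an OR.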

  • Source: Freepik; Copyright: freepik; URL: https://www.freepik.com/free-photo/young-people-using-reels_36028875.htm; License: Licensed by JMIR.

    Interventions That Use Highly Visual Social Media Platforms to Tackle Unhealthy Body Image in Adolescents and Young Adults: Systematic Review of Randomized...

    Abstract:

    Background: Highly visual social media (HVSM) platforms such as Facebook (Meta Platforms, Inc), Instagram (Meta Platforms, Inc), TikTok (ByteDance Ltd), and Snapchat (Snap Inc) have become central to the digital lives of adolescents and young adults. While these platforms have been linked to body dissatisfaction, they are also increasingly used as vehicles for health promotion. However, the evidence on interventions delivered through HVSM to address body image issues remains fragmented. Objective: This review aimed to synthesize available evidence on interventions using HVSM platforms to reduce negative body image in adolescents and young adults. Methods: We conducted a systematic search across 5 electronic databases (Scopus, MEDLINE, APA Psynet, Embase, and Web of Science) for studies published between January 2012 and October 2025. Eligible studies included experimental or quasi-experimental designs evaluating the effect of an HVSM-based intervention on body image outcomes in individuals aged 13 to 35 years. Risk of bias was assessed using the Risk of Bias Tool 2.0 (Cochrane) and was conducted independently by 2 researchers. Results: Eight studies met the inclusion criteria with 4975 participants (2612 in intervention groups and 2363 in control groups). Most studies were conducted in high-income countries and had predominantly female participants. The interventions varied widely in format, duration, and theoretical basis. Microinterventions, brief interactive strategies such as gamified chatbots or short videos, were the most common and had moderate effects. Stimulus-based interventions using content with a positive body image or that did not focus on appearance were also identified, achieving moderate effects (ηp²<.07), as well as combined approaches that integrated digital and face-to-face components to reduce negative body image (P<.001). The use and functionality of interventions using social media platforms were also compared by gender. 
Conclusions: Body image management platforms offer an emerging avenue for implementing body image interventions in adolescents and young adults. While current evidence suggests modest benefits, the high heterogeneity among presentation formats and the variability in duration make comparisons between these studies difficult. This review synthesizes social media–delivered interventions for body image disturbance, going beyond broader digital approaches centered on websites or apps. It identifies cross-platform, putative mechanisms of action and common intervention formats, highlighting the potential of brief interventions for scalable reach and user empowerment via content curation. These findings define targets for optimization and underscore the need for platform safeguards and supportive policy and regulatory frameworks to enable safe real-world implementation, particularly for adolescents.

  • Source: Freepik; Copyright: wayhomestudio; URL: https://www.freepik.com/free-photo/woman-checks-results-fitness-training-smartwatch-listens-music-via-headphones-dressed-anorak-poses-blurred_19046120.htm; License: Licensed by JMIR.

    The Feasibility of Smartwatch Micro–Ecological Momentary Assessment for Tracking Eating Patterns of Malaysian Children and Adolescents in the South-East...

    Abstract:

    Background: Mobile phone ecological momentary assessment (EMA) methods are a well-established measure of eating and drinking behaviors, but compliance can be poor. Micro-EMA (μEMA), which collects information with a single tap response to brief questions on smartwatches, offers a novel application that may improve response rates. To our knowledge, there is no data evaluating μEMA to measure eating habits in children or in low-to-middle-income countries. Objective: In this study, we investigated the feasibility of micro-EMA to measure eating patterns in Malaysian children and adolescents. Methods: We invited 100 children and adolescents aged 7-18 years in Segamat, Malaysia, to participate in 2021-2022. Smartwatches were distributed to 83 children and adolescents who agreed to participate. Participants were asked to wear the smartwatch for 8 days and respond to 12 prompts per day, hourly, from 9AM to 8PM, asking for information on their meals, snacks, and drinks consumed. A questionnaire captured their experiences using the smartwatch and μEMA interface. Response rate (proportion of prompts responded to) assessed participants’ adherence. We explored associations between response rate with time of day, across days, age, and sex using multilevel binomial logistic regression modeling. Results: Eighty-two participants provided usable smartwatch data. The median number (IQR) of meals, drinks, and snacks per day was 2 (2-4), 3 (1-5), and 1 (0-2), respectively, on the first day of the study. The median response rate across the study was 68% (IQR 50-83). The response rate decreased across study days from 74% (68-78) on Day 1 to 40% (30-50) on Day 7 (odds ratio [OR] per study day 0.73, 95% CI 0.64-0.83). Response rate was lowest at the start of the day and highest between the hours of 12 PM and 2 PM. Female participants responded to more prompts than male participants (OR 1.72, 95% CI 1.03-2.86). 
There was no evidence of differential response by age (OR 0.73, 95% CI 0.41-1.28). Most participants (65%) rated their experience using the smartwatch positively, with 33% saying they were happy to participate in future studies using the smartwatch. For children that did not wear the smartwatch for the full study duration (n=22), discomfort was the most common complaint (41%). Conclusions: In this study of the feasibility of μEMA on smartwatches to measure eating in Malaysian children, we found the method was acceptable. However, response rates declined across study days, resulting in substantial missingness. Future studies (eg, through focus groups) should explore approaches to improving response to event prompts, trial alternative devices to increase children’s comfort, and evaluate revised protocols for reporting of intake events.
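
The prompt schedule and the response-rate definition given in this abstract (12 hourly prompts from 9 AM to 8 PM over 8 days; response rate as the proportion of prompts responded to) can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names are hypothetical and the count of 65 answered prompts is invented for the example.

```python
PROMPTS_PER_DAY = 12  # from the abstract
STUDY_DAYS = 8        # from the abstract

def prompt_hours(first_hour=9, prompts_per_day=PROMPTS_PER_DAY):
    """Hourly prompt times in 24-hour clock, starting at 9 AM."""
    return [first_hour + i for i in range(prompts_per_day)]

def response_rate(answered, prompted):
    """Proportion of delivered prompts that received a response."""
    if prompted <= 0:
        raise ValueError("prompted must be positive")
    return answered / prompted

hours = prompt_hours()                        # 9, 10, ..., 20 (8 PM)
total_prompts = PROMPTS_PER_DAY * STUDY_DAYS  # prompts per participant
rate = response_rate(65, total_prompts)       # 65 is a made-up count
print(hours[0], hours[-1], total_prompts, round(rate, 2))
```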

  • AI-generated image, in response to the request "Asian man showing his smartwatch his his female psychiatrist" (Generator: Google Pixel Studio December 21, 2025; Requestor: Adam Charles Frank). Source: Created with Google Pixel Studio; Copyright: N/A (AI Generated Image); URL: https://www.jmir.org/2026/1/e85033; License: Public Domain (CC0).

    Interactions of Technology and Obsessive-Compulsive Disorder Symptomatology in Adults: Qualitative Interview Study

    Abstract:

    Background: Obsessive-Compulsive Disorder (OCD) affects 1–3% of the population and is marked by intrusive obsessions and compulsive behaviors that impair daily functioning. As digital technologies have become ubiquitous, their features may interact with OCD symptom dimensions in ways that both exacerbate and alleviate symptoms. While case reports and clinical anecdotes suggest such interactions, systematic investigation of patients’ lived experiences with technology remains limited. Objective: This study aimed to explore how individuals with OCD perceive and navigate their interactions with modern technologies, and to identify how specific features of technology may contribute to, reinforce, or relieve obsessive-compulsive symptom cycles. Methods: We conducted semi-structured interviews (n=24) with adults self-reporting a diagnosis of OCD, recruited through online OCD communities and advocacy networks. Interviews were conducted via HIPAA-compliant Zoom between May and December 2024 (median duration: 51 minutes). Transcripts were coded in Dedoose (v9.2.22) using a constructivist grounded theory approach. Coding proceeded iteratively through open and focused coding, with theoretical saturation reached after 15 interviews. Constant comparison and analytic memoing guided the development of a conceptual framework linking technology features to OCD symptom dimensions. Results: Participants (median age 26, range 20–64; 67% female; 29% male; 4% non-binary) described technology as both a trigger for and a coping tool against OCD symptoms. 
Analysis produced four central technology-related categories: (1) information-provision platforms (e.g., social media, search engines, large language models, etc) that triggered disturbing-thought obsessions and enabled compulsive checking and reassurance-seeking; (2) gamification/quantification features (e.g., streaks, progress bars, tracking metrics) that reinforced “not-just-right” and symmetry-based compulsions; (3) notifications that provoked urges to clear, check, and maintain control, spanning both disturbing-thought and symmetry domains; and (4) user interfaces whose complexity and customizability elicited compulsive ordering, avoidance behaviors, and digital overwhelm. Conclusions: This study characterizes how interactions between OCD and digital technologies manifest across established symptom domains, most notably disturbing-thought and “not-just-right” categories. Participants overwhelmingly experienced compulsive checking, reassurance-seeking, and ordering behaviors reinforced by features such as information-provision, gamification, notifications, and user interfaces. These findings highlight the clinical relevance of technology-related compulsions and suggest value in their systematic assessment, incorporation into psychoeducation, and consideration in digital design.

Latest Submissions Open for Peer-Review:

View All Open Peer Review Articles
  • A Multidimensional Comparative Study of Multiple Mainstream Large Language Models in Perioperative Consultation for Hypospadias Surgery

    Date Submitted: Feb 12, 2026

    Open Peer Review Period: Feb 13, 2026 - Apr 10, 2026

    Background: Hypospadias is a common congenital malformation requiring surgical correction, with caregivers facing significant perioperative information needs. Large language models (LLMs) offer a potential solution for health education, yet their performance in pediatric urological consultations remains unexplored. Objective: This study aimed to evaluate the performance of five LLMs—ChatGPT-4o, Gemini-2.5-Pro, OpenEvidence, Zhipu Qingyan, and DeepSeek—in addressing core concerns of caregivers during the perioperative period for children with hypospadias, and to assess their application value and limitations in pediatric urological clinical health education. Methods: A prospective, non-interventional, cross-sectional design was employed. A question bank was developed based on literature and clinical practice, with 10 high-priority questions selected via questionnaire screening for the test set. Seven experts (6 dimensions) and 32 caregivers (4 dimensions) were prospectively recruited to evaluate responses using a double-blind forced-order ranking method (reverse scoring from 1 to 5), with reference authenticity being verified. Results: Significant differences existed across dimensions among the five models (P<.001). Gemini-2.5-Pro demonstrated overall superior performance, ranking first in both expert evaluations (median 5.0 [IQR 4.0-5.0]) and caregiver evaluations (median 4.0 [IQR 3.0-5.0]), with outstanding structural capabilities; DeepSeek ranked second (median 4.0), demonstrating relatively consistent ratings across socioeconomic strata and superior emotional support compared to ChatGPT-4o (P=.001); OpenEvidence scored lowest (nearly 50% “poor” ratings), exhibiting poor readability but reliable evidence sources. Conclusions: Gemini-2.5-Pro offers the most comprehensive quality for perioperative hypospadias consultations. DeepSeek and Zhipu Qingyan demonstrate strong rapport in home care guidance but require strict control of literature hallucination risks. 
Perioperative care for hypospadias should adopt a tiered human-machine collaboration model based on clinical risk stratification, balancing communication efficiency and health education safety.

  • Smartphone-based assessment of physical activity and cardiovascular health: Findings from the Dutch MyHeart Counts study

    Date Submitted: Feb 12, 2026

    Open Peer Review Period: Feb 13, 2026 - Apr 10, 2026

    Background: Fitness and physical activity patterns are key predictors of cardiovascular disease. Traditionally, these factors have been assessed through participant self-report, which is prone to recall bias and inaccuracy. Smartphone-based monitoring provides a scalable and objective alternative for measuring physical activity, offering improved accuracy over conventional assessment methods. Objective: To evaluate the feasibility of smartphone-based cardiovascular research in the Netherlands and to examine associations between objectively measured physical activity, perceived activity, functional capacity, life satisfaction, and cardiovascular risk. Methods: Adults in the Netherlands were recruited via the MyHeart Counts iPhone app between August 2022 and December 2023. Within the app, participants completed surveys, passively shared motion sensor data, and were invited to perform a smartphone-based 6-minute walk test (6MWT). Perceived activity was compared with sensor-measured activity and actual activity (sensor-measured with supplemented self-reported unrecorded activity). Multivariable linear regression assessed associations between activity and 6MWT performance and between activity and life satisfaction. Perceived cardiovascular risk was compared with the difference between heart age and actual age. Results: Of 518 enrolled participants (median age 58 years; 72% female), 93% shared data beyond demographics. Median engagement duration was 27 days, and 58% completed at least one full consecutive week of motion tracking. Perceived activity weakly correlated with both sensor-measured activity (ρ = 0.15, P = .01) and with actual activity (ρ = 0.15, P = .01). Median perceived activity was 3.5 hours/week, significantly higher than sensor-measured activity (0.9 hours/week; mean difference 2.9 hours, 95% CI 2.2–3.7; P < .001). 
In contrast, median actual activity was 3.2 hours/week and did not differ significantly from perceived activity (mean difference 0.7 hours, 95% CI −0.2 to 1.6; P = .11), indicating no significant over- or underestimation when unrecorded activity was accounted for. Sensor-measured physical activity was associated with longer 6MWT distance (+10.1 m per hour; 95% CI 3.9-16.4, P = 0.002). No association was observed between sensor-measured activity and life satisfaction. Perceived cardiovascular risk correlated with the difference between heart age and actual age (ρ = 0.41; P < 0.001). Conclusions: Smartphone-based cardiovascular monitoring is feasible in a European adult population and yields valid functional correlates of physical activity. However, incomplete phone carriage substantially limits sensor-only activity estimates, underscoring the need for hybrid measurement strategies. These findings support the use of smartphone platforms for scalable cardiovascular research, while highlighting persistent challenges in engagement and measurement completeness.

  • Impact of Large Language Model-Generated versus Clinician-Generated Advice on Resuscitation Preferences and Chinese-Language Readability in Advanced Cancer Patients in the Emergency Department: A Randomised Controlled Trial

    Date Submitted: Feb 12, 2026

    Open Peer Review Period: Feb 13, 2026 - Apr 10, 2026

    Background: For patients with advanced cancer in the emergency department (ED), decisions regarding life-sustaining treatments (LST) are critical and hinge on clear communication of complex prognoses. While large language models (LLMs) can synthesize clinical information, their comparative effectiveness against clinicians in shaping real patient preferences, and the readability of their outputs, remain unproven. Objective: This study aimed to determine if LLM-generated advice is non-inferior to clinician-generated advice in changing patient resuscitation preferences. Secondarily, we compared the Chinese-language readability of the advice using a validated formula with a clinical cutoff and assessed patient satisfaction. Methods: We conducted a three-arm, parallel, randomized controlled non-inferiority trial. 189 adult patients with advanced cancer in the ED were assigned to review structured advice generated by: (1) a senior clinician, (2) ChatGPT-5.0 Mini, or (3) DeepSeek. The primary outcome was the change in score on the Cancer Advanced Care Preferences Scale. Secondary outcomes included text readability score (assessed by a validated Chinese health literacy formula) and patient satisfaction. Results: A total of 189 participants were enrolled and completed the study. In the primary non-inferiority analysis, the change in resuscitation preference scores for the DeepSeek group was non-inferior to that of the clinician group (mean difference: -0.095 points, 95% CI: -0.750 to 0.560; lower limit > -1.7 margin). Similarly, ChatGPT-5.0 Mini was also non-inferior to the clinician group (mean difference: 0.349 points, 95% CI: -0.237 to 0.935; lower limit > -1.7 margin). Regarding secondary outcomes, a significant difference in readability was found among the three groups (Kruskal-Wallis H(2)=129.36, p<0.001). 
Post-hoc comparisons indicated that texts from DeepSeek had the highest median readability score (7.53, IQR: 7.39-7.62), followed by ChatGPT-5.0 Mini (5.93, IQR: 5.60-6.23), and clinician-generated texts (5.51, IQR: 5.29-5.74), with all pairwise differences being significant (p<0.001). However, no significant difference in patient satisfaction was observed across the groups (H(2)=1.10, p=.578). Conclusions: LLM-generated advice was non-inferior to clinician advice in influencing resuscitation preferences. Its superior readability, with comparable patient satisfaction, highlights the potential of LLMs as a scalable tool to support complex decision-making in time-pressured ED settings.
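
The non-inferiority decision rule quoted in this abstract (lower limit of the 95% CI above the -1.7 margin) can be written as a small check. This is a sketch of the rule as stated, not the trial's analysis code; the function name `is_non_inferior` is hypothetical, and the mean differences and CIs are the values reported in the abstract.

```python
MARGIN = 1.7  # non-inferiority margin reported in the abstract

def is_non_inferior(ci_lower, margin=MARGIN):
    """True when the lower CI limit of the mean difference exceeds -margin."""
    return ci_lower > -margin

# (mean difference, 95% CI lower, 95% CI upper), as reported in the abstract
comparisons = {
    "DeepSeek vs clinician": (-0.095, -0.750, 0.560),
    "ChatGPT-5.0 Mini vs clinician": (0.349, -0.237, 0.935),
}

for label, (diff, ci_lower, ci_upper) in comparisons.items():
    verdict = "non-inferior" if is_non_inferior(ci_lower) else "inconclusive"
    print(f"{label}: diff {diff:+.3f} ({ci_lower:.3f} to {ci_upper:.3f}) -> {verdict}")
```

Only the lower CI limit enters the decision; the upper limit would matter for a superiority claim, which the abstract does not make.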

  • Patient-generated health data in lung cancer symptom management and health promotion: clinical practice, challenges and future directions

    Date Submitted: Feb 12, 2026

    Open Peer Review Period: Feb 13, 2026 - Apr 10, 2026

    Patient-generated health data (PGHD) refers to health-related information collected by patients themselves, serving as a vital supplement to traditional clinical data. In the era of big data, the potential of PGHD in the long-term management of chronic diseases and cancer is increasingly recognised, with its clinical application becoming a key issue in the digital health field. The proliferation of smart devices and wearable technology, improvements in sensor performance, and rapid advancements in artificial intelligence have made the collection of PGHD more convenient. Existing clinical evidence preliminarily indicates that PGHD may alleviate symptom burden in lung cancer patients and enhance the quality of cancer care. However, significant challenges remain in effectively integrating PGHD with clinical data, conducting reliable analyses of vast PGHD datasets, and ultimately incorporating it into routine clinical practice. Furthermore, regulatory bodies, healthcare institutions, and device manufacturers must collaboratively establish policies and standards to safeguard patient data security and privacy. While leveraging digital tools for PGHD collection, attention must also be paid to economic costs and technical barriers to broaden coverage and promote health equity. The potential and application models of PGHD in the long-term management of lung cancer patients warrant further exploration. Against this backdrop, this paper proposes a WeChat Official Account-based model for PGHD collection and remote management, aimed at implementing sustainable symptom monitoring and health guidance for lung cancer patients. This approach seeks to advance the widespread clinical application of PGHD and further explore its potential value in promoting patient self-management and improving quality of life.

  • Attitudes and Needs of Healthcare Providers Toward Artificial Intelligence-Assisted Pediatric Palliative Care: A Mixed-Methods Study

    Date Submitted: Feb 12, 2026

    Open Peer Review Period: Feb 12, 2026 - Apr 9, 2026

    Background: While the transformative potential of artificial intelligence (AI) in healthcare is widely acknowledged, its application in highly sensitive, humanistic domains like pediatric palliative care (PPC) remains largely unexplored. Objective: To explore the attitudes and needs of healthcare providers regarding AI-assisted PPC, with the goal of informing future development and implementation of AI systems in this field. Methods: This was an explanatory sequential mixed-methods study consisting of a nationwide cross-sectional questionnaire survey (March–April 2025) followed by qualitative semi-structured interviews (August–October 2025). The quantitative study aimed to investigate PPC healthcare providers' experiences, attitudes, and needs for the application of AI. Participants included team members of all recognized PPC teams in mainland China. The qualitative study aimed to explore in greater depth the potential future roles of AI in this field, as well as the features of an ideal AI-assisted tool for PPC. Potential interviewees were recruited from the pool of quantitative survey respondents. Results: Among 352 survey respondents, most (58.24%) reported moderate familiarity with AI, with large language models being the most commonly used (79.55%). Among large language model users, over half (57.50%) reported using them for clinical purposes. Attitudes were generally positive: 67.05% believed AI's benefits would outweigh drawbacks, and 78.98% considered its implementation feasible. The most desired applications were patient/family education (78.41%) and symptom management (73.01%). Interviews with 17 providers revealed three themes: (1) clinical roles and boundaries; (2) elements for clinical integration; and (3) challenges in development and deployment. Conclusions: This study reveals that PPC providers express positive attitudes and strong demand for AI-assisted clinical work. 
Furthermore, the research clarifies appropriate roles for AI, outlines elements for clinical integration, and highlights potential challenges in development and integration. This study provides evidence for the feasibility of AI application in PPC and offers guidance for the future development and deployment of AI tools.

  • Automating Frailty Identification in Older Adults: A scoping review of Natural Language Processing and Explainable Artificial Intelligence methods

    Date Submitted: Feb 11, 2026

    Open Peer Review Period: Feb 12, 2026 - Apr 9, 2026

    Background: Frailty is a multidimensional clinical syndrome characterized by diminished physiologic reserve and increased vulnerability to stressors, thus putting older adults at higher risk of adverse outcomes (e.g., falls, mental and physical disability, hospitalization, mortality) in response to even minor stress events. Frailty can be reversed or at least attenuated if detected early, yet early identification remains challenging in primary care due to time- and resource-intensive assessment methods. Artificial intelligence (AI) offers promise in automating frailty identification at the point of care. Natural Language Processing (NLP) is particularly valuable for extracting frailty indicators from rich text data stored in electronic health records, but its limited interpretability has prompted growing interest in augmenting the NLP processes with the use of explainable AI (XAI) techniques. Although NLP and XAI methods have been applied for chronic disease identification, their use for frailty identification has not yet been systematically examined. Objective: This scoping review aimed to synthesize current evidence on the use of NLP and XAI methods for automating frailty identification in older adults. Methods: Peer-reviewed studies published in English between January 2015 and November 2025 were eligible if they applied AI, NLP, or XAI methods to identify frailty in adults aged ≥50 years using real-world health data from OECD or OECD-partner countries. Searches were performed in PubMed and Google Scholar and supplemented by screening bibliographies of identified studies. Data were extracted using a standardized form that captured study characteristics, sample size, data sources, and specific aspects of the AI models, and NLP and XAI methods used. Results: We identified 24 studies that satisfied the eligibility criteria. While all studies used AI approaches to identify frailty, only six used neural network-based models. 
Logistic regression was the most frequently used AI method (n=14), and only one study employed Bidirectional Encoder Representations from Transformers (BERT). Seven studies relied on both structured and unstructured data, two relied exclusively on structured data only, and the rest relied exclusively on unstructured data. Seven studies used NLP methods, seven used XAI methods, and only one integrated both. Only two studies reported deploying their models in real clinical settings. Conclusions: AI-based approaches show promise for automating frailty identification, yet current applications remain limited by reliance on traditional machine learning models, underuse of NLP and XAI methods, and very little real-world deployment. Future work should focus on developing explainable NLP models, facilitating access to large volumes of unstructured data, and developing standardized frameworks for the systematic evaluation of NLP and XAI methods. Coordinated efforts across clinical, technical, and regulatory domains are essential to develop scalable, transparent, and clinically meaningful AI systems for frailty identification.