The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing, and since December 2009 has offered open peer review of articles, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently under consideration by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints.

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and where the editor has not yet made a decision. (Note that this feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.)

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Background: Thyroid fine-needle aspiration biopsy (FNAB) is a commonly used diagnostic procedure in patients with suspected thyroid cancer; however, it may induce pain, anxiety, and fear during the procedure. Objective: This randomized controlled study aimed to evaluate the effect of virtual reality (VR) on pain, anxiety, and fear of pain in patients undergoing diagnostic thyroid procedures. Methods: The study was conducted between October 15, 2024, and April 30, 2025, at Gaziantep City Hospital, Türkiye. A total of 100 patients with suspected thyroid nodules were randomly assigned to either a VR intervention group (n = 50) or a control group (n = 50). Data were collected using a Patient Information Form, the Beck Anxiety Inventory (BAI), the Fear of Pain Questionnaire III (FPQ-III), and the Visual Analog Scale (VAS). Between-group comparisons were performed using ANCOVA adjusting for relevant baseline covariates, and effect sizes were calculated using Cohen’s d with 95% confidence intervals. Results: After adjustment for baseline values and relevant covariates, no statistically significant differences were found between the VR and control groups in post-intervention VAS (P =.152), BAI (P =.501), or FPQ-III scores (P =.20). Effect size analyses indicated small between-group effects across all outcomes (Cohen’s d = −0.05 to −0.29), with 95% confidence intervals including zero. Within-group analyses indicated reductions in VAS, BAI, and FPQ-III scores over time in both groups; however, these changes were not supported by statistically significant between-group differences. Conclusions: Virtual reality was not associated with statistically significant improvements in pain, anxiety, or fear of pain when compared with standard care after adjustment for baseline differences. Although small within-group improvements were observed, these findings do not support a strong independent effect of VR on procedural discomfort in this sample. 
Further well-powered randomized trials are warranted. Clinical Trial: This study was registered at ClinicalTrials.gov (NCT06792929), https://register.clinicaltrials.gov/prs/beta/studies/S000F9GN00000034/protocol/protocolSummary?fragmentId=status

  • Background: Mobile Health (mHealth) and telemedicine can improve access to healthcare services in rural Ethiopia through education, disease monitoring, and remote consultations. Adoption depends on factors such as digital literacy, trust, training, and infrastructure, while challenges include limited smartphone access and connectivity. Objective: The aim of this review was to synthesize the evidence on willingness to use mHealth and telemedicine interventions among health professionals and patients. Methods: This systematic review was conducted following the PRISMA guidelines to examine the willingness to use mHealth and telemedicine interventions and associated factors among healthcare professionals and patients in Ethiopia. A comprehensive search was carried out in databases such as MEDLINE, PubMed Central, CINAHL, and Africa-Wide Information using predefined search strategies. Only full-text, peer-reviewed studies published in English were included. Two reviewers independently screened and selected studies, extracted data using a standardized form, and resolved disagreements through consensus. Study quality was assessed using Joanna Briggs Institute checklists. Results: This review included 13 studies and indicates that patients and healthcare workers are strongly willing to use mHealth and telemedicine. The highest willingness was observed among patients with chronic conditions (59.1%–96%), who preferred simple technologies (voice calls/SMS). Healthcare professionals also indicated varying but substantial willingness to engage (46.5%–83%). Conclusions: Both patients and healthcare providers have a high degree of willingness. Younger age, higher education, urban residence, smartphone ownership, digital literacy, and perceived utility and simplicity of use were significant factors determining willingness. Additional behavioral, clinical, and environmental factors also played a supporting role. 
Clinical Trial: This systematic review was registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42024629424).

  • Background: Effective chronic disease management requires individuals to prioritize long-term health goals over immediate temptations. As chronic patients increasingly engage with online health information, it is important to understand how such engagement may relate to future-oriented cognition and self-regulatory capacity. Objective: This study examined the association between online health information seeking behavior (HISB) and self-control among adults with chronic diseases and investigated whether consideration of future consequences (CFC) was associated with this relationship. Methods: Cross-sectional survey data were collected from 11,031 adults with chronic diseases in China. Mediation analyses were conducted using SPSS macro PROCESS with 5,000 bootstrap samples while controlling for demographic, health-related, and psychological covariates. Results: HISB was positively associated with CFC (B=.07, SE=.005, p<.001). CFC was positively associated with self-control (B=.56, SE=.008, p<.001). After CFC was entered into the model, the direct association between HISB and self-control was no longer statistically significant (B=.005, SE=.004, p=.22). Bootstrap analyses indicated a significant indirect effect of HISB on self-control through CFC (B=.041, BootSE=.003, 95% CI .0348-.0479). Conclusions: The findings suggest that consideration of future consequences may help explain the association between online health information seeking and self-control among adults with chronic diseases. More broadly, digital health environments may increase the salience of future health consequences by repeatedly rendering long-term outcomes cognitively accessible in everyday life. Longitudinal and experimental research is needed to clarify causal mechanisms underlying these associations.

  • Digital Maturity in Integrated Care Systems: Development strategies – A Scoping Review

    Date Submitted: Jun 6, 2026
    Open Peer Review Period: Jun 8, 2026 - Aug 3, 2026

    Background: Digital maturity is a priority for creating efficient, patient-centered health systems, yet Integrated Care Systems (ICS) often face challenges such as a lack of interoperability and weak data governance. A systematic mapping of strategies is essential to help these organizations identify areas for improvement and define sustainable actions to ensure technology adds value for all stakeholders. Objective: To map the development strategies and interventions implemented in ICS to promote digital maturity, while identifying the associated facilitators, barriers, and recommendations described in the literature. Methods: A search was conducted on PubMed, Scopus, and Web of Science on October 17th, 2025, for English articles published since January 1st, 2015. Following Joanna Briggs Institute methodology, two independent reviewers performed study selection, data extraction, and quality appraisal using the Mixed Methods Appraisal Tool, with a third reviewer resolving any conflicts. The results were synthesized through descriptive analysis and thematic grouping. Results: Eighteen articles were included, featuring mixed-methods and case study designs, predominantly set in the United States, as well as several multi-country studies set in Europe. Most strategies were technological (telehealth, electronic health records, and care coordination tools) or structural (governance frameworks). Key facilitators included strong organizational leadership, pre-existing digital infrastructure, and stakeholder engagement, while significant barriers included a lack of interoperability and inadequate funding. Regulation was found to be an obstacle to the development and implementation of digital tools, as privacy legislation often prevents full interoperability, making it essential to use frameworks like “Privacy by Design” to address privacy concerns during the development phase of digital solutions. 
Several frameworks surfaced, with the Chronic Care Model and the eHealth Enhanced Chronic Care Model being the most prevalent. Stakeholder engagement emerged as a pivotal enabler, yet significant resistance persists due to low digital literacy, misconceptions, and an aging workforce, making it critical not only to develop formal and continuous training but also to actively involve stakeholders in problem-solving through a co-creation process. Conclusions: Developing digital maturity in ICS requires a multidimensional approach that extends beyond technological adoption to include multidisciplinary governance, national eHealth policies, and value-based funding models. Addressing low digital literacy through formal training for staff and patients is critical for the sustainability of health care systems. The review provides a foundational framework for healthcare managers and for future research and development of digital maturity guidelines in ICS.

  • Background: Media consumption is a pathway through which the public encounters health information, misinformation, and politicized interpretations of evidence, yet its relationship with knowledge across multiple health domains remains incompletely understood. Objective: We conducted a cross-sectional survey of U.S. adults recruited through CloudResearch Connect to examine associations among media source use, institutional trust, demographic characteristics, and knowledge accuracy regarding climate change, type 2 diabetes, and infectious diseases. Methods: After excluding invalid responses and a failed attention check, 509 participants were included. Knowledge was assessed with domain-specific true/false items scored as correct, incorrect, or “I don’t know,” producing climate change, chronic disease, infectious disease, and total knowledge scores. Results: Rural residence, lower income, lack of health insurance, and absence of a primary care provider were associated with lower knowledge across several domains, suggesting structural barriers to reliable health information. Trust in the CDC, physicians, and pharmacists showed the strongest and most consistent positive associations with knowledge. Political affiliation and consumption of ideologically distinct news sources were most strongly associated with climate change and infectious disease knowledge, but less so with diabetes knowledge. Conclusions: These findings suggest that public health literacy interventions should address both polarized media environments and inequitable access to trusted clinical and institutional information.

  • How Wearables Shape Sleep and Health Behaviors: A Cross-Sectional Analysis by Gender and Global Region

    Date Submitted: Jun 5, 2026
    Open Peer Review Period: Jun 7, 2026 - Aug 2, 2026

    Background: Over the last 15 years, wearable health devices have become increasingly commonplace, with ownership ranging from 30% to 50% of adults globally. Using advanced technologies, wearable devices today support both consumer wellness and medical-grade applications for a range of chronic conditions, including sleep disorders such as obstructive sleep apnea and insomnia. Despite the growing interest in wearable devices, few studies have explored global consumer adoption and perspectives, and how health and lifestyle decisions are impacted by available insights. Objective: This study aimed to examine wearable device adoption, usage patterns, motivations, confidence in wearable-generated data, and health-related behavior changes among adults, with a particular focus on differences by gender and geographic region. Methods: We conducted a cross-sectional, global electronic survey of 9980 employees from a multinational health technology company between July and August 2024 to better understand wearable device usage, perceptions, and beliefs among employees. Participants were invited to participate via digital flyers and emails, as well as via printed materials placed in the workplace. Of the total employees invited to the survey, 1589 (16%) employees were eligible, consented, and completed the survey. Descriptive statistics summarized survey responses, and chi-square tests and proportion tests were used to evaluate differences in drivers of wearable use, confidence, and beliefs by gender, region, and sleep-tracking status. Results: Respondents had a mean (SD) age of 42 (10) years, and 737 (47%) were women, with regional representation from North America (579/1570, 37%), Western Europe (395/1570, 25%), Australasia (395/1570, 25%), and Asia (201/1570, 13%). Most participants (n=1023, 64%) were current wearable device users, and 50% of them (n=513) used a wearable device to track sleep. 
Wrist-worn devices were the most common form factor (1000/1023, 98%) followed by rings (99/1023, 10%). Participants primarily used devices for wellness tracking, and while most (616/1023, 60%) felt confident in the accuracy of the data, only 15% (n=153) regularly shared data with their healthcare provider (HCP). Among those respondents with obstructive sleep apnea (312/1589, 20%), about 21% (108/513) used a wearable device to track their sleep, with similar rates of data sharing. Usage patterns differed by age, geography and gender, as did sharing of data with HCPs. Conclusions: Wearable devices are widely used to support health and wellness behaviors and are associated with self-reported changes in exercise, goal-setting, and sleep habits. Significant differences in engagement and health-related behaviors across gender and geographic regions suggest that demographic and contextual factors influence how wearable technologies are used. These findings may inform the development of more personalized and equitable digital health interventions and support greater integration of consumer-generated health data into healthcare settings.

  • Musculoskeletal health literacy requires patients to understand complex treatment options, postoperative precautions, and recovery timelines, which together help set realistic expectations for recovery. However, existing patient education materials are often text-heavy, exceed recommended reading levels, and fail to depict how functional recovery progresses over time, which may be especially limiting in safety-net settings serving populations with variable health literacy. In this paper, we describe methods for designing and developing a secure, institution-restricted gait-recovery video library for orthopedic patient education. Our library was built to close that gap with short, patient-perspective recovery videos centered on one of the most meaningful outcomes of lower-extremity surgery: functional mobility. Each video was built around a standardized Timed Up and Go (TUG) assessment recorded from frontal and sagittal views, paired with relevant radiographs and visually adapted patient-reported outcome measures (PROM), to create a multimodal, visually guided recovery pathway. This publication aims to detail the process of selecting a secure hosting platform; choosing the filming setup, recovery milestones, and key visual features to capture; maintaining patient privacy and data security; executing clinic-based filming and video editing; and building a personalized interface that allows videos to be filtered by procedure type, recovery stage, and patient characteristics.

  • Generative AI Chatbot Responses to Suicide and Self-Harm: A Systematic Review

    Date Submitted: Jun 5, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: A growing number of US adults and youth confide in generative artificial intelligence (AI) chatbots for mental health support, including disclosure of suicide and self-harm risk. While the quality, safety, and effectiveness of chatbot responses to risk disclosure have the potential to impact population-level rates of suicide and self-harm, there have been no systematic reviews of this burgeoning literature. Objective: We conducted a systematic review of studies evaluating generative AI chatbot responses to disclosure of suicide and self-harm risk. Methods: We searched six databases from January 2020 to December 2025 and identified empirical studies involving interactions with generative AI chatbots that included discussion of suicide or self-harm. Following deduplication, studies (k = 1,042) were imported into Covidence and titles and abstracts were independently screened by two reviewers, with discrepancies resolved by a third reviewer. The same methods were used to evaluate 126 full texts. Data extraction was led by one reviewer and verified by a second. Results: We identified 29 papers (14 published; 15 preprints). Most (k = 20) were solely audit studies evaluating AI chatbot responses to suicide risk disclosure. Two developed chatbots or AI evaluation frameworks, and one was a jailbreaking study (adversarially testing AI systems or attempting to circumvent chatbot safety guardrails). The remaining studies combined approaches. Across studies, proprietary, frontier model chatbots (eg, ChatGPT, Claude) provided higher quality responses to suicide and self-harm risk than open-source chatbots (eg, LLaMA, DeepSeek) and many AI companions (eg, Replika, Character.AI). All chatbots, not just proprietary models, generally performed well on empathy, validation, and support. However, chatbot responses were often generic and lacked context. 
Chatbots did not proactively assess risk and performed most poorly when risk disclosure was ambiguous or moderate, frequently failing to recognize implicit risk or escalate to human-delivered services. Furthermore, responses were inconsistent between chatbots and often required multiple conversational turns before providing referrals to crisis resources and human-delivered professional support. While there were few examples of overtly harmful responses under standard conditions, jailbreaking attempts easily led to problematic responses. Finally, no chatbot proactively recommended limiting access to lethal means such as firearms, medications, or sharps. Conclusions: Chatbots provide validation and support in response to suicide and self-harm disclosure. Overall, however, their poor risk assessment, delays in referrals to crisis resources and human-delivered support, difficulty detecting jailbreaking attempts, and general lack of adherence to clinical guidelines present safety risks. While findings are limited by the rapid versioning of AI models over time, research is needed to evaluate stakeholder perspectives on AI chatbot responses to suicide and self-harm risk disclosure. Research should also examine the short- and long-term impact of these responses on clinical outcomes, utilizing follow-up assessments in real-world or clinical settings. Clinical Trial: OSF Registries osf.io/9uva3

  • Validation of a Patient-Facing AI System for Symptom Guidance: A Simulation Study With Physician Review

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: Rapid advances in large language models (LLMs) have expanded interest in healthcare applications that require complex information processing and decision support. Digital health assistants and symptom checkers, which have historically been rule-based, are increasingly incorporating AI capabilities to support initial symptom assessment and patient triage. An accurate and reliable AI-enabled triage tool could improve patient navigation, reduce unnecessary health care utilization, and support earlier recognition of clinically serious conditions. Objective: The objective of this study was to analytically validate the performance of the Personal Health Assistant (PHA), a Large Language Model (LLM)-based patient support tool in producing appropriate recommendations using simulated patient encounters and expert physician review. Methods: We conducted a prospective analytical validation study of the PHA using synthetic patient data. The evaluation set was constructed from 772 synthetic cases generated from published triage protocols and persona-based LLM-assisted generation. Cases included patient vignette summaries with medical histories and simulated patient conversations. PHA provided guidance and recommendations based on these inputs, which were compared to an independent ground-truth derived from expert physician review. Co-primary endpoints of urgent undertriage, nonurgent undertriage, and overtriage were each evaluated against prespecified clinical performance thresholds. Results: The final analysis dataset contained 772 synthetic cases. Urgent undertriage was 28/406 (6.9%, 95% CI 4.6%-9.8%), nonurgent undertriage was 56/305 (18.4%, 95% CI 14.2%-23.2%), and overtriage was 40/364 (11.0%, 95% CI 8.0%-14.7%). Overtriage met the prespecified performance threshold (<30%), whereas urgent (<5%) and nonurgent (<15%) undertriage thresholds were not met. Conclusions: PHA maintained acceptable overtriage but did not meet prespecified undertriage targets. 
These findings support the value of structured predeployment analytical validation as an early evidence step for patient-facing AI systems and highlight its utility for iterative refinement, while also underscoring the need for prospective clinical validation in light of the inherent limitations of simulation-based studies.

  • VAGPT and Free-Response Open Coding: Using AI for Qualitative Tasks within the Department of Veterans Affairs

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    This research assesses the ability of VAGPT, a large language model authorized for use in the Department of Veterans Affairs, to identify qualitative codes and its coding reliability when applied to free-response data from an anonymous survey of servicemembers' access to posttraumatic stress disorder treatments.

  • Digital Health Inclusion in the Global South: Healthcare Providers’ Perspectives on EMR Adoption and Health Data Quality in Nigerian Tertiary Hospitals

    Date Submitted: May 31, 2026
    Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026

    Background: Digital transformation has increasingly influenced healthcare systems globally, with Electronic Medical Records (EMRs) becoming central to improving healthcare documentation, communication, and decision-making. Despite growing recognition of EMRs as tools for strengthening health data quality, healthcare institutions in many low- and middle-income countries such as Nigeria continue to experience setbacks in digital inclusion, infrastructure, and workforce readiness. In Nigeria, public tertiary hospitals still experience inconsistent EMR implementation and persistent concerns regarding the quality of patients’ health data. Objective: This study explored healthcare providers’ perspectives on EMR adoption and health data quality in selected public tertiary hospitals in North-Central Nigeria within the broader context of digital health inclusion in the Global South. Methods: The study adopted an explanatory sequential mixed-methods design comprising a quantitative phase, identification of key quantitative results, a qualitative phase, and integration and interpretation of findings. The quantitative data were collected using a structured clinical chart review checklist developed from internationally recognized health data quality dimensions and existing literature on EMR systems and health information management. The qualitative data were collected through semi-structured key informant interviews among physicians, nurses, and Health Information Management professionals purposively selected from three public tertiary hospitals with varying levels of EMR implementation. Interviews were audio-recorded, transcribed verbatim, and analyzed using thematic analysis. Results: The study revealed an overall moderate level of health data quality, with a Health Data Quality Index (HDQI) of 73%. 
Healthcare providers acknowledged the potential benefits of EMRs in improving accessibility, timeliness, comprehensiveness, relevancy, and consistency of health data. Participants identified ease of information retrieval, reduction in missing records, and improved continuity of care as major strengths of EMR systems. However, several barriers to meaningful digital inclusion emerged, including unstable electricity supply, poor internet connectivity, inadequate training, workload pressure, dual documentation practices, and limited institutional support. Providers further reported that system reliability, ease of use, and user satisfaction strongly influenced their willingness to utilize EMRs consistently. Positive attitudes toward digital systems were associated with improved documentation practices and enhanced health data quality. Conclusions: Electronic medical records adoption in Nigerian tertiary hospitals remains shaped by complex technological, organizational, and behavioural factors. Strengthening digital inclusion through reliable infrastructure, workforce capacity building, and supportive institutional policies is essential for improving sustainable EMR utilization and health data quality in resource-constrained healthcare settings.

  • The Use of Natural Language Processing to Investigate Social Isolation and Loneliness: A Scoping Review

    Date Submitted: Jun 3, 2026
    Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026

    Background: Social isolation and loneliness (SIL) are associated with critical health consequences but are difficult to measure in healthcare settings because they typically appear in unstructured text. Natural language processing (NLP) offers a promising approach to identify these constructs at scale, but its current applications in this domain have not been systematically characterized. Objective: To investigate how NLP is being used to study SIL, identify gaps, and outline priorities for advancing rigorous NLP-based measurement of these constructs in health research. Methods: A scoping review was conducted following the PRISMA-ScR guidelines. Six bibliographical databases (Ovid MEDLINE, Embase, Scopus, Web of Science, APA PsycINFO, ProQuest Dissertations & Theses Global) and two preprint servers (bioRxiv, medRxiv) were searched from inception to June 18, 2025. Reviewers independently screened abstracts and full texts, data were double-charted using a standardized form, and final results were synthesized using structured template analysis. Results: A total of 63 studies published between 2019 and 2025 met the inclusion criteria. Most were conducted in the US (27/63, 42.9%) and used cross-sectional designs (37/63, 58.7%). Studies mostly targeted older adults (31/63, 49.2%), used survey data (26/63, 41.3%), and focused on loneliness (32/63, 50.8%). Most studies (44/63, 69.8%) did not use a validated loneliness scale; among those that did, the UCLA Loneliness Scale was most common (16/63, 25.4%). Classification (24/63, 38.1%) was the most frequent NLP application. Rule-based (32/63, 50.8%) and traditional machine learning (18/63, 28.6%) approaches predominated, but large language models (16/63, 25.4%) and transformer-based models (14/63, 22.2%) increased over time. External validation was rare (2/63, 3.2%), code was shared in only 19% of studies (12/63), and 25.4% (16/63) addressed bias in their data or analysis. 
Conclusions: NLP applications for SIL are expanding rapidly but rest on narrow methodologies with limited validation measures and demographic groups. Advancing the field requires tying model development to validated measurement of the target constructs and adopting established reporting frameworks such as TRIPOD+AI and MI-CLAIM.

  • Background: Attention-deficit/hyperactivity disorder (ADHD) is a common neurodevelopmental disorder in children, characterised by core symptoms of hyperactivity, impulsivity, and inattention, in addition to cognitive impairments that compromise physical and psychological development. Digital technology-based interventions have emerged as a promising approach for ameliorating both core symptoms and cognitive impairments. However, a comprehensive evidence base supporting their efficacy is lacking. Objective: This study aimed to systematically evaluate the effects of digital interventions on core symptoms and cognitive impairments in children with ADHD. Methods: Web of Science, PubMed, EBSCOhost, and ProQuest were systematically searched using predefined inclusion and exclusion criteria. The risk of bias of the included studies was assessed using the revised Cochrane risk-of-bias tool for randomised trials. Effect sizes were pooled under a random-effects model, and heterogeneity across studies was evaluated using the I² statistic. Publication bias was assessed using Egger’s regression and Begg’s rank correlation tests. Sensitivity analyses were performed by switching from a random-effects model to a fixed-effects model to confirm the results’ robustness. Results: Thirty-seven randomised controlled trials were included, encompassing four types of digital interventions: computer-based interventions, serious video games, exergames, and virtual reality. Digital interventions significantly alleviated core symptoms and cognitive impairments, with improvements in the former primarily attributable to serious video games (P = .02) and those in the latter mainly attributable to computer-based interventions (P = .04) and serious video games (P = .02). Conclusions: Digital interventions can significantly alleviate core symptoms and cognitive impairments in children with ADHD. 
Future research should consider further optimising trial designs and conducting targeted analyses, such as subgroup analyses by symptom subtype and age stratification, to enhance intervention efficacy. Clinical Trial: PROSPERO CRD420261399517; https://www.crd.york.ac.uk/PROSPERO/view/CRD420261399517 Keywords: ADHD; children; digital technology-based interventions; core symptoms; cognitive function; meta-analysis
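For reviewers who want to sanity-check pooled estimates of the kind described in this abstract, the following is a minimal sketch of random-effects pooling with the I² statistic. It assumes the DerSimonian-Laird estimator for the between-study variance, which the abstract does not name; the function name and the input values in the usage note are hypothetical and are not taken from the submission.

```python
import math


def pool_random_effects(effects, variances):
    """Pool study effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect sizes (hypothetical inputs)
    variances -- per-study sampling variances (hypothetical inputs)
    Returns the pooled effect, its standard error, tau^2 and I^2 (percent).
    """
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures the dispersion of studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2}
```

Calling `pool_random_effects([0.1, 0.9], [0.01, 0.01])` on two made-up, strongly disagreeing studies gives a pooled effect of 0.5 with a large I², while identical effects give an I² of 0. Setting `tau2` to zero recovers the fixed-effects model used in the abstract's sensitivity analysis.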

  • MIRAPIE: Proposing a harmonising framework as a minimal community standard for biomedical provenance documentation

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026

    Background: Generating Findable, Accessible, Interoperable, and Reusable (FAIR) biomedical samples, data, and tools is costly and time-consuming. Transparency about their processing, evolution, and reuse, particularly for health data, is therefore highly desirable, and an appropriate fact-based decision framework to evaluate data (re)usability is required. Provenance information documents the processing or evolution of a data object, thereby providing an essential formal basis for such a (re)usability evaluation. When standardised, this provenance information facilitates better FAIR biomedical data. Objective: The MInimal Requirements for Automated Provenance Information Enrichment (MIRAPIE) project aims to define the minimal provenance information required for harmonised documentation of a data object's processing history and to establish the MIRAPIE approach as a community standard that assures interoperability of the collected provenance information. Methods: A hybrid consensus-finding method, adapted from the Nominal Group Technique (NGT) and Delphi, was applied within an international community setting to iteratively implement a minimal data model, an ontology, and an application guideline. The data model is based on the PROV Data Model (PROV-DM); the ontology extends the PROV Ontology (PROV-O). Results: With the MIRAPIE question, we defined a harmonising framework for provenance information in biomedicine and presumably beyond. The minimal data model, the corresponding ontology, and an accompanying guideline provide the means for standardised and possibly automated provenance documentation. Their general applicability to data, workflows, models, and even samples is shown in diverse biomedical usage scenarios. Setting up provenance documentation from scratch is supported, as are linking alternative data schemata and mapping existing provenance documentation.
Conclusions: The MIRAPIE question, minimal data model, ontology, and guideline together contribute significantly to the advancement of biomedical and especially health research by setting up a basis for contextual (re)usability evaluation. This fosters traceability of changes applied to data, workflows, tools, and samples and, in consequence, sustainable data usage and reproducibility of scientific results. The generalisation makes it possible to overcome domain-specific differences and local, national, and international boundaries. We invite the biomedical research community and health data gathering institutions to create lasting change by establishing MIRAPIE-compliant provenance information for transparent data processing and (re)usability assessment.

  • Background: Gestational diabetes mellitus (GDM) has a growing global prevalence and poses multiple short- and long-term hazards to mothers and fetuses. Conventional offline management is restricted by time and space limitations and is accompanied by poor patient compliance and delayed individualized intervention. Telemedicine has gradually been applied to GDM management, yet the conclusions of existing randomized controlled trials (RCTs) remain inconsistent, and unified quantitative evidence is lacking. Objective: This meta-analysis systematically synthesizes available RCT evidence to quantitatively evaluate the efficacy of telemedicine interventions on maternal glycemic metabolism, delivery modes, and multiple neonatal adverse outcomes in women with GDM, so as to provide evidence-based references for optimizing clinical GDM management. Methods: Relevant RCTs were comprehensively retrieved from PubMed, EMBASE, Cochrane Library, Web of Science, and Scopus up to February 9, 2026. Strict inclusion and exclusion criteria based on the PICOS framework were formulated. Two independent researchers completed literature screening, data extraction, and risk-of-bias assessment with the Cochrane RoB 2 tool, and the meta-analysis was performed in STATA 18.0. Continuous outcomes were expressed as standardized mean differences (SMDs), and dichotomous outcomes were summarized as odds ratios (ORs) with 95% CIs; fixed- or random-effects models were selected according to heterogeneity (I²). Results: Altogether, 17 eligible RCTs involving 2391 pregnant women with GDM were included.
Meta-analysis indicated that telemedicine significantly reduced fasting blood glucose (SMD=-0.55, 95% CI -0.94 to -0.16) and 2-hour postprandial blood glucose (SMD=-0.62, 95% CI -1.20 to -0.04) and lowered the risks of emergency cesarean section (OR=0.65, 95% CI 0.45 to 0.93), macrosomia (OR=0.49, 95% CI 0.35 to 0.69), neonatal hypoglycemia (OR=0.60, 95% CI 0.42 to 0.86), and neonatal respiratory distress (OR=0.61, 95% CI 0.41 to 0.92). No statistically significant improvements were observed in overall cesarean delivery rate, gestational weight gain, preterm birth incidence, or NICU admission rate. Conclusions: Telemedicine interventions effectively optimize glycemic control and decrease multiple adverse perinatal complications among GDM patients, serving as a valuable supplementary mode for routine prenatal care. Further large-sample, long-term follow-up RCTs are still required to verify its long-term maternal and infant clinical benefits. Clinical Trial: PROSPERO CRD420261403669; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251155941

  • Left behind by digital health. A seven-year population study of technology adoption, HIV status, and the emerging prevention paradox in rural South Africa

    Date Submitted: Jun 2, 2026
    Open Peer Review Period: Jun 4, 2026 - Jul 30, 2026

    Background: Digital health programs in sub-Saharan Africa often assume broad mobile reach, yet population-level evidence on who can use specific technologies, and who is excluded, remains limited. Without accurate denominators, digital interventions may reinforce inequities by missing people least engaged with conventional healthcare. Objective: We assessed technology adoption, disparities, and trajectories in a high-HIV-burden rural South African population to inform equitable digital health implementation. Methods: We analyzed 309,151 person-years from the Africa Health Research Institute demographic surveillance platform in rural KwaZulu-Natal, South Africa (2017 to 2023). We measured adoption of seven technologies (calls and SMS, internet, WhatsApp, email, mobile banking, entertainment, and health tracking) and constructed a five-tier Digital Adoption Ladder from offline (T0) to digital-health ready (T4). We quantified disparities by HIV status, gender, and their intersection using logistic regression, and tracked temporal trajectories including the COVID-19 period. Results: In 2023, 61.3% of records were classified as offline (T0) under the harmonized coding rules, and only 2.9% reached digital-health readiness (T4). Among tested individuals, people living with HIV showed higher adoption across all technologies (odds ratios 1.13 to 1.57) than HIV-negative individuals, with 56.0% connected versus 44.6%. Females also showed higher adoption than males (odds ratios 1.24 to 1.80). Intersectional analysis identified HIV-positive females as the most connected group (58.1%) and HIV-negative males as the least connected (38.4%), a 20-percentage-point gap. This pattern emerged after 2019 and defines a prevention paradox: a group important for HIV testing, PrEP, and prevention outreach is also the least reachable through digital channels. 
Conclusions: Digital health implementation should adopt a floor-up strategy: start with SMS (reaching approximately 39%), add WhatsApp where connectivity exists, and reserve apps for the small minority able to use them. HIV-negative males require targeted outreach through non-health channels to prevent digital exclusion from weakening HIV prevention.

  • Background: There are few theoretical frameworks in the literature for the strategic planning of health information systems. Demonstrating and analyzing their use in practice can lead to a broader application and evidence-based decision making. Objective: The study aimed to analyze and assess the information systems of a university hospital’s physical therapy section and a university department of physical therapy in order to plan their integration following the merger of the two facilities to form an institute for physical therapy at a German medical center. Building on this, a strategic plan for the institute’s information system is proposed. Methods: We used a methodological framework for the strategic planning of information systems in hospitals, extended it with lean management methods, and applied it at the organizational unit level. We described the static view of the organizational units’ information systems with the three-layer graph-based metamodel for health information systems (3LGM²) and the dynamic view with Business Process Model and Notation (BPMN). Information sources were interviews with personnel. Results: A strategic management plan for developing the institute’s information system has been proposed. A migration path has been established with 23 tactical projects over the next 3 years to attain the strategic management goals. Conclusions: The method for strategic planning of information systems could successfully be adapted to the organizational unit level and should therefore be applied to other departments in hospitals as well. It helps them identify weaknesses in information logistics through a systematic approach, enabling gradual improvement as part of a long-term plan.

  • Background: Artificial intelligence research in image-guided oncology has grown exponentially, yet how far the field has progressed from diagnostic assistance toward direct therapeutic execution has never been quantified. Existing bibliometric surveys categorize studies by technical architecture or clinical domain, metrics that track publication volume but not proximity to procedural deployment. Objective: We developed a hierarchical functional classification framework to map the global landscape of therapeutic AI development across five major oncological indications. Our two specific objectives were: (1) to classify publications by clinical output function along the diagnostic-to-therapeutic continuum, and (2) to quantify the translation gap using three complementary metrics, triangulated against trial and device registries. Methods: We extracted 29,277 Web of Science publications spanning five image-guided oncologic specialties (thyroid, breast, lung, prostate, and liver) published between January 2010 and April 2026. AI-related records were classified by clinical function using a three-stage protocol: keyword categorization, contextual scoring, and rule-based filtering. Inter-rater reliability, validated on 518 independently coded publications, yielded Cohen's κ of 0.92. Our framework distinguished Diagnosis AI (disease identification) from therapeutic AI, then further stratified therapeutic AI into Bridge-support AI (treatment planning, prognosis, patient selection) and True Treatment AI. True Treatment AI was defined by concurrent satisfaction of two criteria: ≥Level 2 on the Yang Surgical Autonomy Scale and ≥Stage 1 on the IDEAL Framework. Results: Of 16,937 AI-related publications identified, 14,277 (84.3%) were categorized as Diagnosis AI and only 2,660 (15.7%) as therapeutic AI. All therapeutic publications fell exclusively within the Bridge-support tier. 
None satisfied the dual-framework criteria for True Treatment AI, yielding a uniform penetration rate of 0.00% across all five oncological domains. This complete execution vacuum persisted despite an 11-fold variation in inter-domain treatment-to-diagnosis ratios. The finding held under threshold relaxation, sensitivity analyses, and independent triangulation against 3,491 ClinicalTrials.gov records and 1,430 FDA device listings. Conclusions: Each specialty should periodically profile its diagnostic-to-therapeutic translational progress. The uniform absence of True Treatment AI across 15 years and five domains indicates that this gap is structural rather than cumulative, rooted in methodological inheritance from diagnostic paradigms and in regulatory category mismatches. Closing this gap requires coordinated framework development across regulatory, research, and clinical communities, rather than incremental algorithmic improvements.

  • Background: Large language models (LLMs) are increasingly used by patients seeking medication advice. Their quality for secondary stroke prevention counseling has not been well characterized. Objective: To compare five widely used search-enabled consumer LLM interfaces on patient-facing medication counseling for secondary stroke prevention across fourteen evaluation metrics covering safety, clinical accuracy, information quality, readability, empathy, actionability, and model test-retest stability, operationalized as lexical text stability. Methods: A 56-item English-language question bank was developed from current stroke prevention guidelines and submitted to five consumer LLM interfaces (ChatGPT, Claude, Gemini, DeepSeek, Doubao) via their official web interfaces on May 1, 2026, with repeat querying on May 8, 2026 to assess model test-retest stability. All systems were accessed using a logged-in account with web search enabled via a US-based connection. Responses were independently rated by two blinded raters. Non-parametric tests with Benjamini-Hochberg correction were applied. Results: Clinical accuracy was high and uniform across models (mean 4.44-4.52/5; Friedman p = 0.578). Gemini, DeepSeek, and Doubao scored significantly higher on EQIP (70.2-70.7 vs. 63.6-64.1; p < 0.001) and DISCERN (p < 0.001) than ChatGPT and Claude. All models substantially exceeded commonly used patient-education readability benchmarks (FKGL 11.1-14.4; benchmark <=6; FRES 33.4-46.8; benchmark >=60). ChatGPT had the highest unsafe response rate (14.3% vs. 7.1-10.7%). Conclusions: In this controlled evaluation of researcher-generated questions, the tested search-enabled LLM interfaces produced broadly accurate responses for secondary stroke prevention medication counseling, but weaknesses in readability, source transparency, and safety indicate that readability optimization, source-attribution prompting, and clinical review are needed before patient-facing use.

  • Background: Artificial intelligence (AI) is rapidly reshaping healthcare, offering tools to enhance diagnostic accuracy, streamline clinical workflows, and personalize care delivery. However, real-world AI implementation remains limited, hindered by organizational, technical, and sociocultural barriers that implementation science has only begun to address systematically. Objective: This scoping review maps the intersection of AI and implementation science in healthcare, examining the types of AI technologies deployed, their intended use, and the processes by which these tools are implemented into practice. Methods: Following PRISMA-ScR guidelines, we synthesized empirical evidence from 65 studies published between December 2011 and March 2025. Searches were performed across the databases CINAHL, PubMed, PsycINFO, Scopus, and Web of Science using the terms Artificial Intelligence, Healthcare, implementation, and empirical, combined with relevant synonyms. Results: AI implementation research has expanded rapidly, predominantly in high-income countries, raising important questions about global equity. The most common application areas were automation and optimization (40%), computer vision (34%), and human language technologies (20%), primarily targeting clinical care (68%) and health systems management (25%). Most systems were designed for low-action autonomy (62%), emphasizing human-in-the-loop decision-making. Intended users were physicians (43%), nurses (26%), and radiologists (25%), while patients appeared as intended users in only 11% of implementations. Across the 65 studies, 40 barriers and 55 facilitators were identified across five themes: the AI system itself, healthcare professionals, patients, organizational context, and the macro level. Organizational factors and multidisciplinary stakeholder engagement emerged as the most critical enablers of successful adoption. 
Key barriers included insufficient AI performance, lack of transparency and explainability, limited IT infrastructure, and inadequate workflow integration. Patient-level and governance-level barriers, including data privacy and regulatory uncertainty, remained underexplored. Only 20% of studies applied theoretical implementation frameworks, and most analyses were conducted retrospectively. Mapping via the AIGENT framework revealed a disproportionate focus on workflow alignment and outcome evaluation, with comparatively little attention to early-phase activities such as needs assessment, adaptation planning, and stakeholder approvals. Conclusions: The current literature predominantly focuses on implementation evaluation and workflow alignment, while patient perspectives, governance conditions, and early implementation activities are underexplored. The finding that only 20% applied theoretical implementation frameworks, mostly retrospectively, reflects a gap between theory and practice and points to a need to apply them prospectively across the full implementation process. From a practitioner perspective, AI implementation should be seen as a sociotechnical and governance process that requires technical, contextual, and system knowledge, rather than merely a technical deployment.

  • Background: Patients who experience a stroke or transient ischemic attack (TIA) face a substantial risk of future events, making optimal management of risk factors essential for secondary prevention. Digital health interventions have demonstrated promise in enhancing the control of vascular risk factors among individuals with stroke or TIA; however, the relative efficacy of different intervention modalities in achieving risk factor control remains uncertain. Objective: This study systematically assessed and compared the impact of various digital health interventions on the control of risk factors for secondary prevention among patients with stroke or TIA, aiming to determine the most effective intervention approach. Methods: A comprehensive and systematic literature search was performed across PubMed, Cochrane Library, Embase, and Web of Science databases from January 2010 to January 2026. This review included randomized controlled trials (RCTs) evaluating distinct digital health modalities among patients who experienced a stroke or TIA. Systolic blood pressure (SBP) changes served as the primary outcome, whereas alterations in diastolic blood pressure (DBP), patient medication adherence, total cholesterol (TC), and low density lipoprotein cholesterol (LDL-C) constituted the secondary outcomes. Utilizing the RoB 2 tool, two independent reviewers evaluated the risk of bias, followed by a Bayesian random-effects network meta-analysis to synthesize both direct and indirect evidence. We ranked the interventions based on their surface under the cumulative ranking curve (SUCRA) values and appraised the certainty of evidence through the GRADE approach. The study protocol was registered prospectively in the PROSPERO database (CRD420261367782). Results: A total of 25 RCTs involving 10,752 patients and six types of electronic health technologies were included.
The results showed that, compared with usual care, combined digital technologies had a more pronounced benefit in reducing SBP (MD: −3.7, 95% CrI: −4.8 to −2.7; SUCRA: 71.95%); telephone follow-up demonstrated better effects on lowering DBP (MD: −2.4, 95% CrI: −3.7 to −1.2; SUCRA: 97.04%) and LDL-C (MD: −0.21, 95% CrI: −0.28 to −0.14; SUCRA: 55.95%). In addition, smartphone applications showed certain advantages in improving medication adherence and reducing TC (MD: −0.39, 95% CrI: −0.71 to −0.068; SUCRA: 87.93%). Conclusions: Different digital health interventions may provide distinct benefits for secondary prevention after stroke or transient ischemic attack. Combined digital technologies appeared to be more effective for reducing SBP, telephone follow-up for improving DBP and LDL-C, and smartphone applications for enhancing medication adherence and reducing TC. However, because of the limited evidence base and small study sample sizes, these findings should be interpreted conservatively. Future large-scale, high-quality trials are required to verify them. Clinical Trial: The study protocol was registered prospectively in the PROSPERO database (CRD420261367782).

  • Biometric Data From Wearable Devices in the Assessment of Premenstrual Syndrome: Prospective, Longitudinal Observational Study

    Date Submitted: Jun 1, 2026
    Open Peer Review Period: Jun 2, 2026 - Jul 28, 2026

    Background: Despite the substantial burden imposed by premenstrual syndrome (PMS) on women’s quality of life, clinical diagnosis remains dependent on subjective self-assessment. The emergence of wearable technology enables continuous collection of physiological metrics that may serve as objective indicators of PMS symptom severity. Objective: This study evaluated the feasibility of using Fitbit-derived heart rate (HR) and autonomic indices, along with interstitial glucose data obtained from FreeStyle Libre, as objective digital biomarkers for PMS diagnosis and monitoring. Methods: This prospective, longitudinal observational study enrolled 122 women aged 18-60 years in Japan. Physiological data, including HR, HR variability, and interstitial glucose levels, were collected using Fitbit Inspire 3 and FreeStyle Libre devices over 14 weeks. PMS severity was assessed using the Menstrual Distress Questionnaire (MDQ). Results: Participants with severe PMS symptoms exhibited higher autonomic nervous system (ANS) markers such as root mean square of successive differences of RR intervals (RMSSD) and the standard deviation of normal-to-normal intervals and lower sleep HRs during the luteal phase compared with those who had milder symptoms (eg, sleep RMSSD: P=.002; sleep mean HR: P=.007). Furthermore, beginning 3 days before menstruation, participants with severe PMS showed a decline in ANS markers accompanied by an upward trend in HR, whereas those with mild symptoms exhibited the opposite pattern (eg, sleep RMSSD: P=.002; sleep mean HR: P=.005). Conclusions: Sleep ANS markers and HRs serve as objective measures for assessing PMS symptoms. Continuous monitoring using wearable devices offers a promising, noninvasive method for objective PMS diagnosis and personalized health management. Clinical Trial: UMIN Clinical Trials Registry UMIN000051467

  • Background: Chronic obstructive pulmonary disease (COPD) is a major global health challenge, with the number of affected individuals projected to approach approximately 592 million by 2050. Primary healthcare institutions bear substantial responsibility for COPD screening, diagnosis, and follow-up, but often face underdiagnosis, fragmented information systems, and workforce constraints. Although digital health and artificial intelligence (AI) have shown potential in COPD management, workflow-integrated solutions tailored to primary care remain limited. Objective: To describe a designathon-based co-creation process and the subsequent development of an early-stage prototype of an AI-enabled digital workflow for COPD screening and follow-up management in primary care. Methods: This descriptive process and prototype development study followed WHO practical guidance on crowdsourcing and designathons in health research. It comprised three phases: (1) an online open call (July 22 to August 1, 2025) soliciting ideas related to AI-assisted chronic disease management and digitalized follow-up care; (2) a 3-day in-person designathon in Guangzhou involving 23 participants from five stakeholder groups (primary care physicians, implementation science scholars, AI engineers, patient representatives, and chronic disease management specialists) who worked in five interdisciplinary teams using user journey mapping and structured co-creation activities; and (3) a post-designathon translation phase in which co-created deliverables were synthesized into an early-stage WeChat Mini Program prototype named FeiChangShun. Expert rubric scoring was used to assess team deliverables generated during the designathon. Results: The online open call received 26 submissions, 25 of which met eligibility criteria. 
During the designathon, five priority pain points were identified: data silos and interoperability barriers, training–practice disconnect, communication barriers, human resource shortages, and low disease awareness. The five teams generated differentiated workflow concepts and corresponding user journey maps to address these challenges. Drawing on these co-created outputs, the research team developed an early-stage prototype comprising five core modules: voice interaction support, health education support, behavior management support, standardized workflow support, and draft document/report generation. Conclusions: This study reports a structured designathon-based co-creation process and the development of an early-stage, guideline-informed workflow prototype for COPD management in primary care. Future studies should evaluate the prototype with end users and assess implementation feasibility, safety, and clinical impact in real-world settings.

  • Needs Assessment and Evaluation of Mobile Apps for Children of Parents with a Mental Illness: A Mixed Methods Study

    Date Submitted: May 29, 2026
    Open Peer Review Period: Jun 1, 2026 - Jul 27, 2026

    Background: Children of parents with a mental illness (COPMI) have the right to receive preventive interventions to avoid developing their own mental health and socioeconomic problems. However, access to interventions varies widely within health care and social services depending on geographical location in Sweden. In this regard, mobile health (mHealth) interventions for COPMI provide a pathway for more equitable and sustainable preventive support. Objective: This project has the following aims: (1) to map and assess the quality of available mHealth apps relevant to COPMI; (2) to understand the existing needs for digital health solutions (particularly mHealth) by consulting and collaborating with an interdisciplinary reference group (scholars, child rights organizations, and IT professionals); and (3) to explore the prerequisites for development, implementation, and sustainable access for future digital solutions for COPMI. Methods: We collaborated with a reference group through a series of meetings and workshops to identify 10 free, highly ranked apps and assess their quality with the Mobile App Rating Scale (MARS). All workshop sessions were recorded and transcribed for qualitative analysis, from which we derived our main findings. Results: Three out of 10 apps scored high across the MARS dimensions of functionality, aesthetics, information, and engagement, indicating a significant lack of high-quality apps relevant for COPMI. Further findings included a preference for more general apps that are not specifically targeted to COPMI, as these could promote self-identification and reduce stigmatization.
Regarding the third aim, the results showed that it was important for an app to protect users’ privacy by allowing anonymous access to digital support, and that mobile apps should be complemented (or replaced) by web-based applications to remain accessible to children who may not be allowed to download apps without parental permission. Conclusions: The sustainability of digital solutions (web or mobile apps) for COPMI is the biggest challenge for future developments. Partnering with providers that are already established in the mental health area is key, extending their services to COPMI, while leveraging an app development infrastructure that already has sustainable processes and business models behind it. To address engagement and deployment issues, it is important to actively involve children through participatory and co-creational approaches when designing and developing mHealth solutions.

  • Background: Traditional neuropsychological assessments for cognitive decline are lengthy in-clinic evaluations by a specialist, with typical wait times of 6-8 months. This creates a substantial patient burden and prolonged diagnostic and treatment timelines. Digital cognitive assessments (DCA) offer a scalable solution to these challenges, but their validation is challenged by the scarcity of large, high-quality datasets with established ground truth. Objective: To develop a model to identify mild cognitive impairment (MCI) and probable dementia using metrics from the Digital Assessment of Cognition (DAC), a brief, remote-capable DCA. A secondary objective was to conduct a preliminary assessment of the model's validity. Methods: We applied a semi-supervised model-based clustering method to combine a large dataset (N=1189) of DAC assessments alone, with a smaller dataset pairing DAC assessments with ground-truth neuropsychological diagnoses (N=248). We examined the model's predictive validity by comparing its predictions with diagnoses on a held-out test set. We examined congruent validity by testing associations with traditional analog assessments and demographic variables. Results: We identified a 6-cluster model with 3 MCI clusters and 2 probable dementia clusters. The model identified cognitively unimpaired, MCI, and dementia groups with high accuracy (78.7%) on the held-out test dataset, and showed excellent ability to identify cognitive impairment (AUROC=0.985) and dementia (AUROC=0.932). We identified strong associations with traditional analog assessments and demographic variables. An exploratory analysis showed evidence that clusters correspond to clinically meaningful subtypes of MCI. Conclusions: These results validate prior exploratory work and demonstrate the potential for more nuanced, holistic, and scalable cognitive assessments in non-specialist settings.

  • Physicians' Job Demands and Job Resources in Digital and Intelligent Healthcare: Scale Development and Validation

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: The rapid development of digital and intelligent medical technologies is profoundly reshaping the clinical work patterns of physicians, introducing new job demands and job resources into their clinical practice. However, existing measurement instruments have not captured these specific changes and novel challenges posed by such technologies, resulting in a lack of corresponding assessment tools, which limits in-depth quantitative research in this field. Objective: This study aims to develop and validate the Job Demands Scale (JDS) and Job Resources Scale (JRS) for physicians suitable for digital and intelligent healthcare scenarios. Methods: Building upon the foundation of prior qualitative interviews and literature review, the dimensions of the scales and a corresponding pool of measurement items were constructed. The scales underwent content revision through two rounds of Delphi expert consultation (N=18) and cognitive interviews (N=6). Subsequently, an online questionnaire survey was conducted with 1,016 clinicians using convenience sampling. The psychometric properties of the scales were evaluated through item analysis, exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and reliability testing. Results: The finalized JDS comprises 22 items across six dimensions: Human-Machine Interaction Burden, Technology Output Risk, Information Security Burden, Occupational Substitution Risk, Doctor-Patient Communication Burden, and Technology Dependence Risk. The JRS consists of 23 items, also organized into six dimensions: Decision-Making Support, Risk Prevention Support, Workload Reduction Tools, Doctor-Patient Collaborative Platform, Precision Efficiency Support, and Clinical Competence Support. EFA indicated that the six factors of the JDS cumulatively explained 71.10% of the variance, and the six factors of the JRS cumulatively explained 58.98% of the variance. CFA demonstrated good model fit for both scales. 
For the JDS, the composite reliability (CR) values for the dimensions ranged from 0.758 to 0.869, and the average variance extracted (AVE) values ranged from 0.441 to 0.687. For the JRS, the CR values ranged from 0.640 to 0.792; however, the AVE values were relatively low, ranging from 0.339 to 0.490. The overall Cronbach's α coefficients for the JDS and JRS were 0.944 and 0.923, respectively. These results demonstrate that both scales possess good preliminary reliability and validity. However, their dimensional structure and discriminant validity still require further optimization. Conclusions: The JDS and JRS developed in this study exhibit good psychometric properties and hold strong potential for effectively evaluating physicians' job demands and job resources within the context of digital-intelligent healthcare. This provides a scientific basis for subsequent related research and clinical management practices. It is noteworthy that the high correlations observed between the dimensions of both scales suggest that future analyses on the impact of job demands and job resources on physicians' work in digital-intelligent healthcare settings should pay attention to the synergistic effects of these elements. Furthermore, the content of the scales should be continuously updated alongside the advancement of digital-intelligent medical technologies.
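
The composite reliability (CR) and average variance extracted (AVE) figures in this abstract are conventionally computed from standardized factor loadings. The following is a minimal Python sketch of the standard formulas, assuming standardized indicators with uncorrelated errors. The function names and the example loadings are hypothetical illustration values, not data or code from the study.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one dimension."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    For a standardized indicator, the error variance is taken as 1 - loading^2.
    """
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return total / (total + error)


# Hypothetical standardized loadings for one four-item dimension
# (illustration only, not taken from the study).
example_loadings = [0.62, 0.58, 0.55, 0.60]
print(f"AVE = {average_variance_extracted(example_loadings):.3f}")
print(f"CR  = {composite_reliability(example_loadings):.3f}")
```

With loadings near 0.6, AVE is roughly 0.35 while CR stays near 0.68, which illustrates how a dimension can show acceptable CR yet fall below the commonly cited AVE benchmark of 0.50, the pattern reported for the JRS.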

  • Smart Hospitals and Digital Health Powered by 5G and 6G Networks: A Scoping Review

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Global health systems face increasing pressure due to population aging and recurrent pandemics, requiring a transition from Health 4.0 to Health 5.0. Although 4G technologies initiated the era of remote monitoring, their limitations in bandwidth and latency hinder critical real-time applications. Fifth-generation (5G) and sixth-generation (6G) networks, integrated with artificial intelligence (AI) and the Internet of Medical Things (IoMT), have emerged as key enablers of ultra-low-latency and highly reliable services in smart healthcare ecosystems. Objective: This scoping review aimed to map and synthesize scientific evidence on the use of 5G and 6G network technologies in healthcare, particularly in smart hospitals and eHealth services, and to identify related opportunities, challenges, and research gaps. Methods: We conducted a scoping review following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) and the Arksey and O’Malley and Joanna Briggs Institute (JBI) frameworks. The research question was structured using the Population, Concept, Context (PCC) mnemonic. A systematic search was performed in Google Scholar using a comprehensive search string targeting 5G/6G, eHealth, smart hospitals, digital health, and telemedicine. Eligibility criteria included studies in English that explicitly addressed 5G or 6G infrastructure, architecture, or applications in healthcare contexts. Screening and data extraction were performed iteratively by reviewers, and studies were categorized according to implementation maturity, architectural advances, and security requirements. Results: Most identified studies were theoretical proposals (about 57%) or feasibility analyses, with a smaller proportion of real-world implementations. Practical evidence suggests that 5G can reduce emergency response times by up to 30% and enable in-transit imaging-based diagnosis, supporting the transformation of ambulances into advanced triage units. 
However, field tests report real-world 5G latency of approximately 10 ms, which is above the theoretical target of <1 ms and constrains latency-critical applications such as holographic telesurgery. Across studies, security and privacy—particularly for contextual and IoMT sensors—emerged as critical challenges, together with interoperability with legacy systems and the high cost of infrastructure. Conclusions: Advanced connectivity networks, particularly 5G and future 6G infrastructures, are positioned as foundational components for smart hospitals and digital health, supporting the transition from Health 4.0 to Health 5.0. Nonetheless, the evidence base is still dominated by conceptual works, and the full potential of these technologies is limited by technical, organizational, and economic barriers. Future work should prioritize explainable AI, end-to-end security, and sustainable business models to ensure safe, equitable, and clinically meaningful adoption of 5G/6G-enabled smart healthcare. Clinical Trial: This does not apply as it is a survey.

  • Evaluating a Web-Based Intervention for Digital Health Measurement: a Mixed-methods Study

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Despite its potential to address key challenges in primary health care, digital health measurement faces substantial implementation barriers for health care professionals. To address these barriers, professionals from 4 disciplines (physical therapy, occupational therapy, speech and language therapy, and general practitioner practice assistance) collaborated with researchers to develop an intervention. The intervention comprised a website supported by on-the-job coaching as a temporary implementation strategy during development. Objective: This study explored whether and how the intervention facilitates optimized use of digital health measurement in patient care to inform further intervention refinement. Methods: A mixed-methods formative process evaluation was conducted using a predominantly qualitative approach. Eighteen health care professionals tested the intervention in daily practice. Data collection was guided by the Medical Research Council framework, a predefined process evaluation plan, and the intervention’s initial program theory. Quantitative data (questionnaires, 7-point Global Perceived Effect measures, and monitoring lists) informed semi-structured interviews and focus groups. Data were analyzed using descriptive statistics and directed content analysis. Results: The intervention was largely implemented as intended and improved digital health measurement in patient care by enhancing participants’ capability, opportunity, and motivation. Consistent with the initial program theory, these changes triggered implementation activity at the organizational level, strengthening implementation readiness through bottom-up change processes. Intervention strategies included collaborative learning, modelling, and prompting action. These strategies operated through mechanisms such as experiential learning, in which professionals experienced the benefits and feasibility of digital health measurement, reinforcing motivation for its continued use. 
During intervention use, additional processes emerged, including champions facilitating organizational-level adoption of digital measurement by sharing knowledge and enthusiasm with colleagues. Coaching particularly supported initial intervention engagement by contextualizing generic information, stimulating interaction, and prompting action. Individual, organizational, instrumental, temporal, policy, and societal factors interacted with intervention components, strategies, and mechanisms to facilitate or constrain outcomes. Conclusions: Future refinement should strengthen key mechanisms and processes, integrate mechanisms previously supported by coaching, and develop scalable implementation strategies. As no single approach will fit all contexts, practices should tailor implementation to their local needs. The intervention’s generic framework and flexible use of core components support local adaptations.

  • Chatbot-Based Psychiatric Medication Counseling in Outpatients With Schizophrenia: Pre-Post Study

    Date Submitted: May 26, 2026
    Open Peer Review Period: May 28, 2026 - Jul 23, 2026

    Background: Chatbot-based interventions have shown promise in common mental health conditions such as depression and anxiety. However, their application in schizophrenia (SZ), particularly for psychiatric medication counseling, remains extremely limited. Objective: This study aimed to investigate the effects of a rule-based psychiatric medication counseling chatbot on clinical and patient-reported outcomes in patients with SZ. Methods: A total of 31 outpatients with SZ participated in a single-group pre–post study. Participants used a rule-based chatbot via a mobile app for 3 months. The chatbot provided structured guidance on antipsychotic medications, including side effects, management strategies, medication use, expected therapeutic effects and duration of medication. Primary outcomes included medication adherence (Adherence Rating Scale [ARS]), subjective well-being (Subjective Well-being under Neuroleptic Treatment Scale [SWN]), and side effects (Udvalg for Kliniske Undersøgelser Side Effect Rating Scale [UKU]). Secondary outcomes included psychopathology (PANSS), functioning (SOFAS), and insight (SUMD-K). Results: A total of 31 participants (mean age 33.91 years, SD 11.60; 20 males) completed the study. Medication adherence (ARS) showed a trend-level increase (4.77 vs 4.94, t=2.02, p=.057) but did not reach statistical significance. Among SWN subdomains, self-control improved significantly (mean difference 1.48, 95% CI 0.03 to 2.94, p=.045). UKU total severity scores decreased significantly (19.48 vs 14.52, mean difference −4.96, 95% CI −8.53 to −1.40, p=.008), driven by reductions in psychic (mean difference −1.97, 95% CI −3.52 to −0.41, p=.015) and miscellaneous symptom domains (mean difference −1.58, 95% CI −3.06 to −0.10, p=.037). Among secondary outcomes, PANSS positive symptoms decreased significantly (11.39 vs 10.35, mean difference −1.04, 95% CI −1.90 to −0.17, p=.021), whereas functioning (SOFAS) and insight (SUMD-K) did not change significantly. 
Older age (β=0.157, p=.017) and living with family members (β=4.304, p=.047) were associated with improvements in physical functioning, and greater chatbot use was associated with improvements in social integration (β=0.150, p=.029) and socio-occupational functioning (β=0.082, p=.024). Conclusions: A rule-based psychiatric medication counseling chatbot was associated with modest but significant improvements in subjective well-being and perceived side effect burden in patients with SZ, while its impact on medication adherence and broader clinical outcomes was limited. These findings suggest that chatbot-based interventions may serve as a useful adjunctive tool in SZ care, particularly for addressing medication-related concerns. Clinical Trial: Clinical Research Information Service (CRIS) KCT0011949; https://cris.nih.go.kr (registration number: KCT0011949)

  • Supporting Veterinary Students in Clinical Reasoning With Generative Artificial Intelligence: A Controlled Interventional Study

    Date Submitted: May 26, 2026
    Open Peer Review Period: May 28, 2026 - Jul 23, 2026

    Background: Clinical reasoning competency development is central to veterinary education. Generative artificial intelligence (GenAI) opens new possibilities for supporting students in acquiring these competencies, yet its effectiveness as a reasoning support tool in case-based learning (CBL) remains unclear. Objective: This study examined whether a commercially available GenAI chatbot could support veterinary students in CBL and evaluate its potential for clinical reasoning training. Methods: Following systematic evaluation, Microsoft Copilot was selected for its accessibility, functionality, and data protection compliance, and students were provided with a user-oriented manual including prompt instructions. In an interventional crossover study involving 60 fourth-year veterinary students at a Swiss university, participants alternated between AI-supported and traditional case-based learning (CBL) across four clinical cases. Clinical reasoning outcomes were assessed by a dedicated lecturer per case using 34 scored items, complemented by student surveys and lecturer reflections. Results: Clinical reasoning outcomes showed no meaningful evidence of a difference between AI-supported and traditional CBL groups (W = 6719, p = 0.464), with results varying across cases. Post-class surveys (n = 38) indicated that most students viewed GenAI support positively: 68% agreed the AI provided relevant inputs they had not previously considered, 58% perceived reduced task difficulty, and 61% found the AI-generated starting point effective. However, 45% also reported negative effects on case understanding and dissatisfaction with the overall learning experience. Qualitative feedback highlighted benefits such as information retrieval and stimulation of reflection, alongside limitations related to superficial or inaccurate AI outputs. Conclusions: These findings indicate that AI integration alone is insufficient to enhance clinical reasoning in case-based learning. 
Without sufficient AI literacy on top of developing clinical competencies, the cognitive demands of verifying AI-generated outputs may offset potential benefits in complex reasoning tasks. Tailoring AI integration to learner experience, scaffolding, and prior AI exposure appears relevant to realizing GenAI's potential for clinical reasoning development in veterinary education.

  • Digital Health Technologies for Psychotic Disorders: A Systematic Review and Meta-Analysis of Randomized Controlled Trials

    Date Submitted: May 26, 2026
    Open Peer Review Period: May 28, 2026 - Jul 23, 2026

    Background: Digital health technologies (DHTs) for psychosis may help address the substantial gap in access to psychological services, yet prior syntheses are limited by heterogeneous designs and populations. Objective: This systematic review and meta-analysis aimed to synthesize evidence from randomized controlled trials (RCTs) to estimate the relative effectiveness of DHTs in individuals with confirmed psychotic disorders. Methods: Web of Science, PubMed, Embase, Scopus, PsycINFO, and CENTRAL were searched from inception to January 2026. Eligible studies were RCTs enrolling adults with psychotic disorders that evaluated DHT-delivered psychological interventions targeting psychotic symptoms. Comparators included passive and active controls. Primary outcomes were positive, negative, and overall symptoms. Secondary outcomes included depression, anxiety, functioning, quality of life, dropout, and adverse events. Results: Forty-one RCTs (N = 4139) were included. Compared with passive controls, DHTs showed small to moderate significant reductions in positive (g = -0.18, 95% CI: -0.33 to -0.03; I² = 60%), negative (g = -0.32, 95% CI: -0.56 to -0.07; I² = 63%), and overall symptoms (g = -0.41, 95% CI: -0.71 to -0.10; I² = 78%) at posttreatment, with effects for positive symptoms also at follow-up. No significant effects were observed when compared with active controls. Subgroup analyses indicated significant effects for delusions but not auditory hallucinations, and stronger effects for therapist-supported than for fully automated interventions. Secondary outcomes showed small improvements at both posttreatment and follow-up in depression, anxiety, and general functioning, but not in quality of life. Heterogeneity was moderate to high in some of the analyses. Dropout rates were comparable across groups, with no consistent pattern of serious adverse events identified, although safety reporting was inconsistent. 
Conclusions: DHTs represent a promising approach, with outcomes that appear broadly comparable to face-to-face interventions, while offering potential advantages in accessibility, scalability, and flexibility. Further high-quality RCTs with active comparators and standardized safety monitoring are needed. Clinical Trial: CRD42021251108
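
The pooled effect sizes (g) and heterogeneity statistics (I²) in this abstract come from a meta-analytic pooling step. The abstract does not state which pooling model was used, so the sketch below assumes a DerSimonian-Laird random-effects model with a normal-approximation 95% CI, which is one common choice. The function name and the input values are hypothetical and are not taken from the review.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, (ci_low, ci_high), I2 in percent).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect estimate, needed to compute Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    # I2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical Hedges g values and sampling variances for three studies.
g_values = [-0.10, -0.35, -0.20]
g_variances = [0.02, 0.03, 0.025]
pooled, ci, i2 = pool_random_effects(g_values, g_variances)
print(f"g = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, I2 = {i2:.0f}%")
```

A negative pooled g with a confidence interval that excludes zero corresponds to the "significant reduction" wording used in the Results.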

  • Refined Exclusion in Medical AI: Reframing Algorithmic Fairness as Data Justice and Patient Safety Governance

    Date Submitted: May 25, 2026
    Open Peer Review Period: May 27, 2026 - Jul 22, 2026

    Medical artificial intelligence (AI) systems are often evaluated through aggregate performance metrics and output-level fairness measures. However, clinically meaningful harms may remain hidden when systems perform well on average while underperforming for data-poor, underrepresented, or structurally marginalized populations. This Viewpoint uses the concept of refined exclusion to synthesize a recurring pattern in medical AI: systems may appear technically successful at the population level while transferring uncertainty, misclassification, delayed recognition, or reduced clinical reliability to groups that are less visible within training data, validation cohorts, proxy definitions, and deployment workflows. Drawing on representative cases from population health management, chest radiograph AI, dermatology, computational pathology, and foundation model applications, we argue that refined exclusion should not be treated merely as algorithmic bias or a defect of model outputs. Rather, it reflects a data governance failure with direct implications for patient safety. Moving beyond output-centered algorithmic fairness, we propose data justice as a governance foundation for medical AI, organized across distributional, procedural, and substantive dimensions. We further outline operational checkpoints across the medical AI lifecycle, including subgroup learnability assessment, data provenance documentation, local validation, procurement-stage accountability, explainability-based proxy audits, post-deployment subgroup monitoring, and patient participation. Reframing refined exclusion as a patient safety problem shifts the central governance question from “Is this model accurate on average?” to “For whom is this system safe, reliable, and clinically accountable?”

  • Background: Patient-reported outcome measure (PROM) completion is hindered by patient-level barriers (including motor, sensory, cognitive, and motivational constraints) that risk insufficient participation and non-response bias. While technology-enabled approaches such as multimodal speech assistance hold promise for reducing these barriers, assistance is a complex interaction: it can both alleviate and introduce barriers depending on how well it aligns with patients’ routines and needs. Objective: This qualitative study explores how patients perceive the advantages and disadvantages of AI-based speech assistance for PROM collection, focusing on how assistance functionalities interact with individual barriers and completion practices. Methods: We conducted semi-structured qualitative interviews with 96 psychosomatic and neurological rehabilitation outpatients, embedded in a pragmatic cross-randomised controlled trial. Participants completed PROMs with and without an AI-based speech assistance system offering speech output, speech input, and guidance by a socially interactive agent (SIA) that was physically, virtually, or voice-only embodied. The system was iteratively refined during data collection to address usability and performance issues. We included a broad sample to reflect real-world care settings, including patients without reported barriers. Using inductive content analysis (61 codes, grouped into 4 overarching and 9 subthemes), we examined perceived advantages and disadvantages of the three main assistance functionalities and multimodal interaction. Reporting followed the COREQ guideline. Results: The speech output function emerged as the most widely valued assistance feature, with many patients reporting improved concentration, question comprehension, and deeper engagement with item content. The social agent was described as making the interaction more engaging and less monotonous while not evoking social pressure. 
Speech input was perceived as helpful by some, especially those with motor impairments or a preference for verbal expression. However, each function also introduced challenges: speech output disrupted reading routines for some, the social agent was perceived as distracting or unnecessary by others, and speech input was criticised for recognition errors, inefficiency, and privacy concerns. Conclusions: AI-based speech assistance for PROM collection offers significant potential to reduce barriers and enhance patient engagement, but its effectiveness depends on alignment with individual needs, preferences and routines. While speech output proved broadly beneficial, speech input and socially interactive agents require careful design to avoid introducing new barriers, particularly for marginalised groups. Configurable, modular assistance systems that adapt to diverse user preferences and impairments are essential for equitable implementation. Future research should focus on inclusive co-design and longitudinal studies to refine these technologies for real-world clinical use. Clinical Trial: German Clinical Trials Register ID: DRKS00035213

  • Background: Adolescent depression is clinically heterogeneous, and the presence of mixed features – defined as subthreshold manic symptoms co-occurring with a depressive episode – complicates diagnosis and treatment. Intensive longitudinal monitoring using wrist-worn actigraphy and daily ecological momentary assessment (EMA) may capture behavioral and experiential signatures that differentiate depression with mixed features (Mixed-Dep) from depression without mixed features (NoMix-Dep), but evidence in adolescents remains limited. Objective: This study aimed to examine whether multimodal digital monitoring using wrist-worn actigraphy and daily ecological momentary assessment of mood and energy can distinguish adolescents with depression with mixed features from those with depression without mixed features, and to identify dynamic energy–activity patterns specific to mixed depression. Methods: Ninety-eight adolescents (ages 12–18; 37 Mixed-Dep, 31 NoMix-Dep, 30 healthy controls) from the longitudinal Mood & Brain Circuitry in Adolescence (MBA) study wore wrist-worn actigraphy devices and completed daily mood and energy self-reports using the Mood and Energy Thermometer (MET) over two weeks. Group classification was defined based on the K-SADS-PL Mania Rating Scale. Dynamic within-person associations among mood, energy, and activity were estimated using generalized estimating equations with a first-order autoregressive working correlation structure, controlling for sleep duration, age, sex, and weekday/weekend status. Results: Both depressed groups showed lower overall activity and greater minimum activity suppression compared to healthy controls (mean activity: F = 32.67, p < 0.001), with NoMix-Dep showing lower minimum activity than Mixed-Dep (Min2: F = 17.91, p < 0.001; Min4: F = 23.37, p < 0.001). 
Mixed-Dep participants had significantly higher positive and negative energy scores (EnergyPosMax: F = 10.12, p < 0.001; EnergyNegMax: F = 91.93, p < 0.001), shorter wake after sleep onset (F = 3.67, p = 0.03), and higher sleep efficiency (F = 7.03, p < 0.01) than NoMix-Dep. Mood scores did not differ between depressed groups. Energy–mood associations were largely similar across groups. Energy–activity temporal coupling differed markedly: NoMix-Dep showed same-day congruent coupling (high energy predicted high activity), while Mixed-Dep showed an inverted lagged pattern (high energy today predicted lower activity tomorrow). Similar group-differential patterns were observed for mood–activity associations. Conclusions: An inverted, lagged energy–activity coupling represents a novel digital phenotype distinguishing mixed from non-mixed adolescent depression. Energy dysregulation, more than mood, differentiates the two depressed subgroups, with implications for scalable EMA-based screening and earlier identification of mixed features in clinical settings.

  • Background: Artificial intelligence (AI) is increasingly integrated into prostate cancer diagnostics, with the potential to improve accuracy and efficiency. However, it also raises important questions about the conditions and barriers that may influence its successful implementation in this clinical context. Objective: This study examined how patients and healthcare professionals perceive the integration of AI in prostate cancer diagnostics, with particular attention to its impact on clinical relationships and the roles of patients and physicians. Methods: A sequential explanatory mixed-methods design was used. Quantitative data were collected through an online questionnaire administered to patients with localized prostate cancer (N=51). Descriptive analyses focused on perceived benefits, willingness to use AI, and associated concerns. Qualitative data were collected through focus groups and semi-structured interviews with patients (n=16) and physicians (n=11). Data were analyzed using iterative, inductive thematic analysis. Results: Quantitative findings showed that despite recognizing the potential benefits of AI, patients remained divided regarding the use of such tools in their own care. Qualitative findings suggest that this hesitation cannot be explained solely in terms of perceived performance or utility. Rather than simply reducing complexity in clinical decision-making, AI appeared to reconfigure the certainties on which trust within the patient–physician relationship is established. This reconfiguration was reflected across epistemic, ethical, and role-related dimensions. Patients emphasized difficulties in understanding AI-generated knowledge, whereas clinicians focused on issues of reliability, validation, and clinical relevance. Ethical concerns centered on responsibility, which was consistently attributed to physicians, while errors made by AI were perceived as less acceptable than those made by clinicians. 
Role-related uncertainties were reflected in ambivalent patient positions: while some participants sought more information to remain involved in decision-making, others preferred to rely on physicians, reflecting variation in how patients engage with complex clinical information. AI was generally viewed as a supportive tool rather than a replacement for clinical judgement, while its integration was associated with evolving professional roles, including increased demands for interpretation, communication, and oversight. Conclusions: The integration of AI in prostate cancer diagnostics is shaped not only by its technical performance, but by how it reconfigures trust within the patient–physician relationship. Rather than eliminating uncertainty, AI redistributes it across knowledge, responsibility, and social roles. Ensuring that AI contributes positively to clinical practice therefore requires careful attention to clinician oversight, communication, and the relational context in which decisions are made. Clinical Trial: NCT07074405 (ClinicalTrials.gov)

  • Background: The high prevalence of sedentary lifestyles and non‑communicable diseases in Malaysia calls for scalable physical activity interventions. This study therefore leveraged social media, particularly Instagram, for exercise promotion. Objective: This pilot study examined the acceptability, observed changes, and predictors of improvement associated with an Instagram‑based exercise promotion intervention among sedentary adults in Klang Valley, Malaysia. Methods: A total of 56 sedentary adults (34 females, 22 males) were recruited; 50 completed the 12‑week intervention (mean sedentary behaviour 7.30±2.75 hours/day; retention rate 89.3%). Participants joined a private Instagram page delivering cardiorespiratory‑focused exercise content every two days. Pre‑ and post‑intervention assessments included anthropometry, body composition (InBody 370), 6‑Minute Walk Test (6MWT), and Client Satisfaction Questionnaire‑8 (CSQ‑8). Results: Significant pre‑post changes were observed in body weight (mean change -2.05±2.88 kg, P<.001), BMI (-0.83±1.11 kg/m², P<.001), body fat percentage (-2.23±1.91%, P<.001), and 6MWT distance (67.82±40.81 m, P<.001). The mean total CSQ‑8 score was 27.02±4.91 (out of 32), indicating high satisfaction. Baseline body fat percentage, baseline 6MWT distance, and gender were associated with the degree of functional change (R²=0.71). Conclusions: This pilot study suggests that an Instagram‑based intervention is acceptable and may be associated with positive health changes among sedentary adults. These findings support the need for a definitive randomised controlled trial in the future.

  • MAGR-TCM—A Knowledge Graph-Powered Multi-Agent System for Intelligent Home Health Consultation and Risk Assessment in TCM Gynecology: System Development and Evaluation

    Date Submitted: May 21, 2026
    Open Peer Review Period: May 21, 2026 - Jul 16, 2026

    Background: Large language models (LLMs) have shown considerable potential in intelligent healthcare consultation. However, their application in Traditional Chinese Medicine (TCM) gynecology remains limited by semantic gaps between colloquial patient descriptions and professional TCM reasoning, as well as risks of hallucinated medical content. Objective: We proposed MAGR-TCM, a knowledge graph-powered multi-agent retrieval-augmented generation framework for home-based TCM consultation and preliminary risk assessment. Methods: A domain-specific knowledge graph containing 10,231 entities and 32,051 relationships was constructed from 741 curated clinical case records. The framework integrates four specialized agents for question analysis, risk routing, graph reasoning, and response evaluation. Model performance was evaluated using the RAGAS framework and a double-blind expert assessment on 60 independent cases, including a safety stress test with 10 emergency "Red Flag" scenarios. Results: MAGR-TCM achieved the best overall performance compared with the baseline models, with an average RAGAS score of 0.900 and a consultation professionalism score of 0.904. The proposed framework demonstrated strong factual consistency (Faithfulness: 0.821) and comprehensive diagnostic accuracy (0.952), approaching the performance of human experts. In safety stress testing, MAGR-TCM achieved 100% emergency identification accuracy and the lowest unsafe recommendation rate (0.240) among all evaluated AI systems. Conclusions: The proposed MAGR-TCM framework demonstrates the potential of integrating knowledge graphs and multi-agent reasoning to support interpretable and safety-aware TCM consultation. The system serves as a reliable methodological prototype for intelligent home-based health management and preliminary risk assessment.

  • Short-Term Forecasting of Other Infectious Diarrhea in Chongqing, China Using a Deep Learning Attention Model: Model Development and Evaluation Study

    Date Submitted: May 18, 2026
    Open Peer Review Period: May 18, 2026 - Jul 13, 2026

    Background: Other infectious diarrhea (OID) remains an important public health concern in China because of its high incidence, marked seasonality, and substantial burden, particularly among children. Accurate short-term forecasting and early warning are important for timely public health response. However, previous OID forecasting studies have mainly relied on reported case data, and the added value of multisource indicators remains insufficiently evaluated. Objective: This study aimed to develop and evaluate a multisource CNN-BiLSTM-SE Attention model for short-term forecasting and early warning of reported other infectious diarrhea cases in Chongqing, China. Methods: Daily OID case counts in Chongqing from January 2015 to June 2025 were collected, together with meteorological variables and Baidu search indices related to infectious diarrhea. After data normalization, Pearson correlation analysis and random forest variable-importance analysis were used for predictor selection. A CNN-BiLSTM-SE Attention hybrid model was developed to integrate multisource data, extract local temporal patterns, model temporal dependencies, and recalibrate informative feature channels. Forecasting performance was evaluated using RMSE, MAE, MAPE, and R², and compared across different input settings and benchmark models. In addition, 5-day-ahead predictions were converted into binary warning signals using training-set 75th and 90th percentile thresholds, and compared with a persistence baseline. Results: Under the full-input setting, the CNN-BiLSTM-SE Attention model achieved the best predictive performance, with an R² of 0.7828, RMSE of 35.418, MAE of 25.411, and MAPE of 17.27%. Compared with the case-only model, R² increased by 0.0326, while RMSE and MAE decreased by 2.560 and 1.643, respectively. The proposed model also outperformed random forest, XGBoost, CNN, and LSTM. 
In the threshold-based early-warning evaluation, the full-input model showed better overall warning performance than the persistence baseline at both the 75th and 90th percentile thresholds. Conclusions: The CNN-BiLSTM-SE Attention hybrid model improved short-term forecasting of reported OID case counts in Chongqing. Integrating epidemiological, meteorological, and internet search data provided complementary information, suggesting potential utility for OID surveillance, forecasting, and early warning.
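The evaluation steps named in this abstract (RMSE, MAE, MAPE, and R², then conversion of forecasts into binary warnings using a training-set percentile threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the linear-interpolation percentile, and the "strictly greater than the threshold" warning rule are all assumptions, since the abstract does not specify them.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (in %), and R² for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is 0; this sketch skips those points.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    mape = (100 * sum(abs(e / a) for a, e in nonzero) / len(nonzero)
            if nonzero else float("nan"))
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # R² is undefined for a constant series (ss_tot == 0).
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def warning_signals(predicted, threshold):
    """Assumed rule: flag a warning (1) when the forecast exceeds the threshold."""
    return [int(p > threshold) for p in predicted]


# Hypothetical daily counts, used only to show the call pattern.
train_counts = [10, 20, 30, 40]
threshold_75 = percentile(train_counts, 75)
signals = warning_signals([30, 35, 32.5], threshold_75)
```

Deriving the threshold from the training set only, as the abstract describes, keeps the warning rule from depending on the test period it is evaluated on.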

  • Non-Patient Stakeholder Perspectives on the use of Gamification and Financial Incentives in mHealth for Medication Adherence: Mixed Methods Consensus Study

    Date Submitted: May 13, 2026
    Open Peer Review Period: May 14, 2026 - Jul 9, 2026

    Background: Medication nonadherence remains a major global health challenge, contributing to preventable disease, hospitalizations, and healthcare costs. Mobile health (mHealth) applications incorporating gamification and financial incentives have shown potential to improve adherence; however, most research has focused on patient perspectives, with limited understanding of how non-patient stakeholders perceive their feasibility, risks, and implementation. Understanding non-patient stakeholder perspectives in relation to patient viewpoints is essential for informing future policy development and establishing practical, industry-supported safeguards that protect consumers while enabling innovation. Objective: This study aimed to explore non-patient stakeholder perspectives on the use of gamification and financial incentives in mHealth apps for medication adherence and to integrate these with previously reported patient perspectives to inform consensus-based design and policy considerations. Methods: A mixed-methods study was conducted using a modified virtual Nominal Group Technique (vNGT). Non-patient stakeholders across healthcare, industry, and policy sectors in Australia were recruited. Data collection involved a pre-session survey followed by online focus groups. Qualitative responses were analyzed using thematic analysis supported by AI-assisted coding. Consensus statements derived from themes were rated during the focus groups. Additional prompts were used to elicit further discussion where consensus was not immediately achieved. Results: A total of 20 participants were included in the study. Six key themes were identified: tailored gamification for adherence, financial incentives as a contested motivator, designing for diversity and inclusion, usability barriers to engagement, trust through data governance, and validated and sustainable innovation. These informed 24 consensus statements, of which 54% (13/24) achieved unanimous agreement. 
Stakeholders strongly endorsed personalization, simplicity, and transparent data practices, while expressing nuanced concerns regarding the ethical use, sustainability, and potential unintended consequences of financial incentives. Compared with prior patient findings, the participants demonstrated substantial alignment on core design principles but contributed additional system-level considerations related to feasibility, scalability, and regulation. Conclusions: Non-patient stakeholders largely reinforce patient priorities while extending them with critical perspectives on implementation, governance, and sustainability. Gamification and financial incentives are viewed as potentially effective but require careful, ethically grounded design to balance engagement with long-term motivation and trust. These findings support the development of stakeholder-informed guidelines for responsible mHealth innovation and highlight the importance of integrating patient and system-level perspectives in digital health design. Future research should prioritize co-designed longitudinal studies utilizing apps with gamification and a range of incentive offers with clear redemption processes to evaluate the long-term impact on medication adherence across diverse patient populations.

  • Visualizing Health in Platform Work: A Photovoice Study Comparing Freelancers, Couriers, and Taxi Drivers in Sweden

    Date Submitted: May 13, 2026
    Open Peer Review Period: May 14, 2026 - Jul 9, 2026

    Background: The platform-based economy has expanded rapidly through the integration of digital platforms into sectors such as transportation, delivery, and freelance work. Platform labor combines features of precarious employment and digitalized work organization, encompassing both location-based and web-based work. However, the occupational health implications of platform work remain insufficiently understood, particularly regarding how risks differ across platform worker groups. Objective: This study aimed to explore how platform workers experience their working conditions and how platform work affects their health, wellbeing, and safety. Methods: A participatory photovoice study was conducted with platform-based taxi drivers, delivery couriers, and freelancers living in Stockholm. Between September and November 2022, 16 participants were recruited into three groups (5–6 participants per group). Across five sessions, participants documented their working lives through photographs and discussed them collectively, generating 105 photographs in total. Data were analyzed collaboratively to identify key themes and recommendations related to working conditions, health, and wellbeing. Results: Participants identified 14 themes representing major determinants of health, wellbeing, and safety at work, as well as 23 recommendations for improving working conditions. Workers reported exposure to both platform-specific risks, including algorithmic management and digital surveillance, and traditional occupational risks such as psychosocial strain, ergonomic challenges, and traffic-related hazards. Experiences differed substantially across platform work types. Delivery and taxi drivers reported greater exposure to physical and traffic-related risks, whereas freelancers emphasized psychosocial demands and digital work intensification. Economic insecurity and costs associated with maintaining work equipment emerged as common challenges across all groups. 
Attitudes toward flexibility, autonomy, and algorithmic management also varied between worker categories. Conclusions: This study highlights important similarities and differences in working conditions and health risks across platform work types. The findings suggest that research and occupational health interventions targeting platform workers should differentiate between specific forms of platform labor to better capture the diversity of workers’ experiences and exposures.

  • The Co-Tech Taxonomy: A Health CASCADE Framework for Evaluating Digital Technologies in Participatory Health Research and Co-Creation

    Date Submitted: May 12, 2026
    Open Peer Review Period: May 14, 2026 - Jul 9, 2026

    Background: Co-creation is increasingly used in health research, public health, and participatory initiatives to support inclusive, collaborative, and evidence-informed problem-solving. However, the integration of digital technologies into co-creation processes remains fragmented and largely ad hoc, with limited frameworks available to guide technology selection, evaluation, and development. Objective: This study aimed to develop the Co-Tech Taxonomy, an empirically grounded evaluative framework for assessing digital technologies used in co-creation and participatory digital health ecosystems. Methods: Using the Nickerson–Varshney–Muntermann (NVM) taxonomy-building method, the taxonomy was developed through the analysis of six foundational conceptual and empirical frameworks related to co-creation, participatory processes, and digital technologies. The taxonomy was subsequently refined through iterative empirical classification of 84 technologies used in co-creation contexts. Results: The final taxonomy consists of seven functional dimensions: governance, inclusivity, methodology, collaboration, engagement, data management, and cognitive support. Each dimension is operationalised across three progressive levels of co-creation alignment. The empirical mapping revealed that current digital ecosystems remain insufficiently aligned with participatory collaboration requirements, particularly regarding governance, inclusivity, and AI-supported cognitive facilitation. While communication and data-management functionalities were comparatively mature, participatory governance, collaborative decision-making, and AI explainability remained underdeveloped across most evaluated technologies. The taxonomy also enabled the development of a three-tier indicative certification model to support technology assessment and implementation. 
Conclusions: The Co-Tech Taxonomy provides a structured evaluative framework for assessing existing technologies, identifying implementation and innovation gaps, and guiding the development of more inclusive, transparent, interoperable, and AI-ready participatory digital infrastructures. The framework offers a practical foundation for strengthening digitally supported co-creation and participatory collaboration within health-related contexts.

  • Development and Preliminary Validation of CLEAR: A Framework for Evaluating Patient-Friendly AI-Generated Clinical Documentation

    Date Submitted: May 12, 2026
    Open Peer Review Period: May 14, 2026 - Jul 9, 2026

    Background: Generative artificial intelligence (GenAI) is increasingly used to produce patient-friendly clinical documentation, yet evaluation of these outputs remains inconsistent and difficult to scale. Patient-friendliness is commonly reduced to narrow readability metrics, such as Flesch-Kincaid grade level, without accounting for clinical accuracy, completeness, or the patient perspective. No standardized framework exists to evaluate the quality and safety of AI-generated patient-friendly documentation across document types or the full documentation lifecycle. Objective: To develop and preliminarily validate CLEAR (Clinical Language Evaluation and AI Documentation Review), a theoretically grounded evaluation framework for AI-generated patient-friendly clinical documentation across the generation, review, and monitoring stages of the AI documentation lifecycle. Methods: CLEAR was developed using Messick's validity framework across four stages: content validation, response process, internal structure, and consequences. Domains were identified through a targeted literature review and reviewed by a panel of six clinical and operational experts. An iterative, consensus-based process involving four board-certified internists across 10 rounds refined domain definitions and scoring instructions. Inter-rater reliability was assessed on 50 AI-generated patient-friendly discharge summaries using Cohen's kappa and Gwet's AC1 for binary domains and intraclass correlation coefficients (ICC) and Gwet's AC2 for continuous domains. Additionally, 19 semi-structured stakeholder interviews with clinicians, informaticists, institutional leaders, and patient education experts explored operational needs and implementation contexts. Results: CLEAR comprises five domains for evaluating patient-friendly AI documentation: readability, understandability, patient-centeredness, accuracy, and completeness. 
Inter-rater reliability was good to almost perfect across all subjectively scored domains per Gwet's agreement coefficients. Stakeholder interviews independently identified three operational gaps aligned with the CLEAR lifecycle: lack of structured guidance for prompt engineering, subjectivity in human review, and absence of scalable monitoring infrastructure, directly validating the framework's real-world relevance. CLEAR was applied across three illustrative implementation contexts: prompt engineering for patient-friendly echocardiogram reports, structured human review of discharge summaries, and development of LLM-as-judge automated monitoring tools. Conclusions: CLEAR provides a preliminarily validated evaluation framework designed to span the full AI documentation lifecycle, from prompt engineering through human review to automated monitoring. By conceptualizing patient-friendliness as a multidimensional construct that integrates communication quality with patient safety, CLEAR offers practical infrastructure for consistent and scalable governance of patient-facing AI documentation in healthcare systems.
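For readers unfamiliar with the agreement statistics mentioned in this abstract, the sketch below computes Cohen's kappa for two raters scoring the same items on a categorical (for example binary) domain. It is an illustration only: the function name and the example ratings are hypothetical, and Gwet's AC1/AC2 and the ICC used in the study are not implemented here.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: share of items on which the raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    p_e = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels)
    if p_e == 1:
        # Both raters used a single identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary ratings (1 = criterion met) for ten documents.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohens_kappa(rater_1, rater_2)
```

Kappa can behave poorly when one category dominates the ratings, which is a common reason for reporting it alongside Gwet's coefficients, as this abstract does.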

  • Background: House dust mite (HDM) sensitization commonly begins in early life and contributes to persistent allergic airway inflammation and asthma chronicity. Primary prevention via early-life environmental control is a key pathway to reduce HDM sensitization and asthma risk. Objective: To characterize child caregivers’ knowledge, attitudes, and practices (KAP) regarding pediatric HDM control using a hybrid literature/expert-driven and social media-driven approach, and examine associations between KAP levels, child age and caregiver social media activity. Methods: This cross-sectional study comprised two interconnected components: (1) mining of content published between August 2023 and July 2025 from five major Chinese social media platforms, analyzed via Latent Dirichlet Allocation (LDA); and (2) a social media-enhanced web-based KAP survey administered in November 2025 to child caregivers in Chongqing, a warm-humid region where HDMs dominate indoor allergens, with participants recruited via local child health facilities. In total, 132,341 social media documents and 2,275 caregivers of children <18 years were included in the analysis. The main outcomes included social media discourse patterns and domain-specific KAP levels across five dimensions: foundational knowledge (K1), recommended control knowledge (K2), attitude toward social media topics (A1), attitude toward recommended methods (A2), and control practices (P). Stratified analysis was conducted by two exposure variables: child age (≤3 years vs >3 years) and caregiver social media activity (active vs. inactive). Results: LDA topic modeling identified five distinct topic clusters in the social media content. Commercial, emotional, and misleading content collectively dominated the information landscape, accounting for 83.3% of included documents, with commercial content often systematically conflating the concepts of “disinfection” and “mite elimination”. 
Only 16.7% was classified as health educational content focusing on HDM allergy prevention. The average KAP levels of K1, K2, A1, A2, and P domains were 62.9%, 84.7%, 57.0%, 37.8%, and 25.8%, respectively. Social media emerged as the primary knowledge source (80.7%), with methodological knowledge gaps (47.5%) being the top implementation barrier. Caregivers of children ≤3 years had significantly lower self-rated knowledge (23.5% vs. 28.3%, P=.01), stronger endorsement of recommended methods, but also greater information overload (OR 1.39, 95% CI 1.15-1.67, P<.001) and decision difficulties (OR 1.23, 95% CI 1.01-1.52, P<.001). Caregivers who were active on social media showed better performance across multiple items in five domains, but also increased non-recommended practices (ultraviolet irradiation: OR 1.85, 95% CI 1.35-2.53, P<.001) and misconception acceptance (allergy impact exaggeration: OR 1.39, 95% CI 1.04-1.87, P=.03). Conclusions: Complex and suboptimal KAP levels exist, particularly among caregivers of young children (≤3 years). Social media activity is associated with both enhanced implementation of control practices and elevated misconception endorsement. These findings reveal critical educational gaps and the need for social media-based intervention. Clinical Trial: Not applicable.

  • Examining inequities in the use of Continuous Glucose Monitors Among People

    Date Submitted: May 6, 2026
    Open Peer Review Period: May 6, 2026 - Jul 1, 2026

    Background: Continuous glucose monitoring (CGM) offers clinical and behavioural benefits for people with type 2 diabetes (T2D), including improved glycaemic control and enhanced self-management. However, important evidence gaps remain regarding whether CGM use is equitably distributed across patient groups and whether Objective: To examine the relationship between CGM use among individuals with T2D and a range of patient characteristics, including socio-demographic factors linked to health inequities, digital health literacy, clinical characteristics, and service utilisation. Methods: A cross-sectional online survey was conducted in November 2024 among adults in the UK with self-reported T2D, recruited via the YouGov panel. The primary outcome was self-reported CGM use. Predictor variables included PROGRESS-Plus characteristics (age, gender, ethnicity, religion, education, occupation, household income, disability, and social engagement), digital health literacy (eHEALS scale), clinical characteristics (disease duration, current treatment, and complications), overall health status (number of long-term conditions), and healthcare utilisation (frequency of visits). Descriptive statistics and multivariable logistic regression were used to examine associations between CGM use and patient characteristics. Results: Among 403 participants, 12.7% reported CGM use. Nearly half of participants were aged 65 years or older, and 56.80% were male. Most participants were White (83.90%) and lived in urban areas. Higher odds of CGM use were observed among insulin users (OR=3.80, 95% CI: 1.6–9.22, p<0.001). No other demographic, clinical, or service utilisation variables were statistically significantly associated with CGM use. Conclusions: CGM use was primarily driven by insulin therapy, consistent with established clinical pathways within the National Health Service that prioritise access for this group. 
No significant variation was observed across demographic, socioeconomic, or health literacy-related characteristics, suggesting no clear evidence of inequalities in this sample. These findings indicate potentially equitable access, although further research in larger and more diverse populations is needed to confirm these patterns.

  • Perceptions, Responsibility, and Implementation of AI-CDSS in VTE Prevention: A Qualitative Study

    Date Submitted: May 6, 2026
    Open Peer Review Period: May 6, 2026 - Jul 1, 2026

    Background: Artificial intelligence–enabled clinical decision support systems (AI-CDSSs) are increasingly deployed for venous thromboembolism (VTE) prevention. However, healthcare professionals’ perceptions and experiences of these systems across diverse regional, occupational, and specialty contexts remain poorly understood, with limited evidence on how AI integration influences clinical workflows, responsibility allocation, and professional trust within multi‑tiered healthcare systems. Objective: This study aimed to systematically investigate healthcare professionals’ perceptions and experiences of using AI-CDSS for VTE prevention across different institutional levels and clinical roles in China. Methods: A nationwide qualitative study was conducted using semi‑structured interviews with 23 healthcare professionals from diverse institutional levels and clinical roles. Data collection proceeded until thematic saturation was reached. All interviews were transcribed verbatim and analyzed using inductive thematic analysis. Results: Five core themes were identified: (1) AI reduces workload but complicates clinical responsibility; (2) patient involvement is perceived as beneficial yet problematic; (3) digital readiness shapes implementation feasibility; (4) trust in AI varies by professional role; and (5) responsibility and risk remain ambiguous after AI introduction. Facilitating factors included clearly defined responsibility assignment, comprehensive training, incentive mechanisms, and institutional oversight. Key barriers comprised economic costs, additional workload burden, and complex hospital approval processes. Conclusions: Our findings reveal structural tensions arising from the interaction between professional roles, institutional readiness, and responsibility distribution during AI integration. 
These results underscore the need for tiered, role‑specific implementation strategies and provide practical insights for the sustainable deployment of AI in VTE prevention.

  • Background: Complex digital interventions that integrate electronic patient-reported outcome measures (ePROM) into clinical practice in cancer have the potential to improve quality of life, increase survival, and reduce health resource use and costs. Such systems can help oncology patients self-manage chemotherapy symptoms, reduce workloads for clinicians through automated decision support, and resolve problems earlier. However, there is a need for more research on the cost-effectiveness of such interventions. Objective: This review aims to (1) summarize and evaluate the quantitative and qualitative evidence related to the cost-effectiveness and economic evaluation methods of ePROM-integrated interventions, and (2) extract data and validate assumptions useful for health economic modelling of ePROM-based treatment strategies. Methods: We searched for original English-language papers published on or before March 2025 on Ovid (including MEDLINE and Embase), Scopus, and the International Health Technology Assessment Database (INAHTA) using search strings that combined terms related to ePROMs, health economics, and cancer/oncology. We included papers reporting health economic-related outcomes for ePROM interventions designed for adult cancer populations and excluded screening tools and conference abstracts. Results: We included 34 publications from 27 unique studies, and identified and analyzed 26 ePROM-integrated interventions within these. Most (23/26) of the included interventions explicitly described some form of alert handling and automated decision support based on remote ePROM monitoring. 
5/34 publications presented full cost-utility analysis results, of which 3 were characterized by high uncertainty and a lack of clear differences in costs and health outcomes between ePROMs and standard care, while 2 presented strong evidence of cost-effectiveness due to quality-of-life improvements, reduced hospitalizations, and potentially more autonomy in health-related travel (e.g., ePROM-monitored patients can drive or walk to the hospital instead of using taxis or ambulances). A further 5/34 publications reported partial health economic results (e.g., cost-consequence, budget impact), of which 1 detected no difference in strategies, while 4 reported lower health resource use and costs of ePROMs, mainly due to hospitalization reductions. 12/27 studies included a qualitative component but mostly focused on user experience and design-related themes; only 2/12 of these addressed economic-specific themes (e.g., changes in workflow and resource use due to ePROM implementation and integration), indicating some potential for time saving due to ePROM monitoring. Conclusions: There is some evidence that ePROM-integrated interventions can be cost-effective in cancer care, but the evidence base remains limited. Where evidence does exist, cost-effectiveness appears driven by reduced hospitalization and improved quality of life. Qualitative research within the included studies rarely addressed economic questions. We provide a detailed parameter extraction for use in future economic modelling and recommend research priorities, including quantitative mapping of ePROM symptom data onto health resource use patterns, and qualitative work exploring how ePROM implementation affects clinical workloads and patient-perspective costs.

  • Background: Depressive disorders are among the most prevalent psychiatric disorders globally and impose considerable individual and societal burdens. Psychotherapy, including cognitive behavioral therapy, is recommended as a first-line treatment, especially for mild to moderate depressive disorders. However, face-to-face psychotherapy is often limited by issues of accessibility and cost. Digital therapeutics (DTx) have gained increasing attention as alternatives for overcoming these hurdles. With advances in digital technology, digital placebos have been increasingly adopted as comparators in clinical trials of DTx. However, the characteristics of these clinical trials, the magnitude of digital placebo effects, and their moderators remain poorly understood. Objective: The objectives of this study were to investigate the characteristics of clinical trials using digital placebos as comparators, and to assess the magnitude of the digital placebo effects and their moderators on depressive symptoms measured by the Patient Health Questionnaire-9 (PHQ-9). Methods: Blinded randomized clinical trials (RCTs) that evaluated PHQ-9 scores and used digital placebos as comparators were identified by searching MEDLINE, Scopus, Web of Science, PsycINFO, CINAHL, Cochrane Central Register of Controlled Trials, ClinicalTrials.gov, and ISRCTN in November 2025. The characteristics of the RCTs and of the digital placebos were reviewed systematically. A meta-analysis, including subgroup analyses and meta-regressions, was conducted to investigate the magnitude and the moderators of the digital placebo effects. Results: A total of 29 articles (30 studies) with 5680 participants were included in this systematic review and meta-analysis. The most common trial design was a 2-arm, parallel-group study conducted in a single country, adopting “Replaced” and “Mobile” as the placebo approach and delivery type, respectively. 
The pooled effect size for all the included studies was Hedges’ g = 0.44 (95% CI 0.29 to 0.59) with an overall I² = 93.2%. Subgroup analyses showed a moderate-to-large and statistically significant placebo effect in the group with primary psychiatric disorders (Hedges’ g = 0.69; 95% CI 0.40 to 0.99). Meta-regressions indicated that the primary psychiatric disorder group and baseline PHQ-9 score were independent moderators of the digital placebo effects and the major contributors to the high heterogeneity (R² = 51.5%). Conclusions: Statistically significant digital placebo effects were observed on depressive symptoms, and target population and baseline PHQ-9 score were identified as independent moderators. These findings have implications for the planning of future DTx clinical trials using digital placebos for depressive symptoms.
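The heterogeneity figure quoted in this abstract is an I² statistic. As a rough illustration of how such a value is conventionally derived from Cochran's Q, the sketch below uses inverse-variance weights. It is an assumption-laden example and not the authors' analysis: the function name and the inputs are hypothetical, and the pooled value it returns is a fixed-effect estimate, whereas the abstract does not state which pooling model was used.

```python
def heterogeneity(effects, variances):
    """Return (fixed-effect pooled estimate, Cochran's Q, I² in %)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1 / v for v in variances]          # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I² = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical effect sizes and sampling variances for three studies.
pooled_g, q_stat, i2_pct = heterogeneity([0.2, 0.5, 0.8], [0.01, 0.01, 0.01])
```

An I² above 90%, as reported here, indicates that most of the observed variation between studies reflects real differences in effect size and not sampling error, which is why the moderator analyses matter for interpreting the pooled estimate.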

  • Socio-Cultural Challenges and Design Implications for Ethical AI in Healthcare: A Systematic Review

    Date Submitted: May 4, 2026
    Open Peer Review Period: May 5, 2026 - Jun 30, 2026

    Background: Artificial intelligence (AI) is increasingly embedded in healthcare, yet its benefits remain unevenly distributed due to persistent concerns regarding bias, inequity, and socio-cultural misalignment. Although existing Ethical AI frameworks typically emphasize universal principles, they often insufficiently address the socio-cultural contexts in which AI systems are developed, implemented, and used. Objective: This systematic review aimed to examine how socio-cultural factors shape ethical challenges in healthcare AI, influence the interpretation of ethical principles, and inform context-sensitive design and governance strategies. Methods: Following PRISMA 2020 guidelines, we conducted a systematic search of PubMed, IEEE Xplore, and Web of Science for studies published between 2018 and 2025. Eligible studies addressed ethical issues related to AI in healthcare through a socio-cultural lens. A thematic synthesis combining inductive and deductive coding was used to analyze reported challenges, context-dependent ethical interpretations, and proposed mitigation approaches. Results: A total of 49 studies were included. The findings show that ethical challenges in healthcare AI are deeply embedded in structural inequalities, data collection, curation, and documentation practices, institutional conditions, and cultural norms rather than being purely technical problems. Key challenges included algorithmic bias, underrepresentation of minorities in datasets, cultural and linguistic mismatches, limited transparency and trust, and systemic disparities in access to AI technologies. The reviewed literature proposed a broad range of technical, design-related, and governance-oriented strategies, but these remained fragmented and were rarely integrated systematically across the AI lifecycle. 
Based on this synthesis, the study proposes the Inclusive Ethical AI Framework (IEAF), a socio-technical framework that systematically translates socio-cultural context into context-sensitive ethical interpretations and actionable design and governance decisions across the AI lifecycle. Conclusions: The findings highlight that ethical challenges in healthcare AI are fundamentally shaped by socio-cultural context and cannot be addressed through technical solutions or universal ethical principles alone. Instead, effective and equitable AI systems require the systematic integration of socio-cultural considerations into data practices, system design, and governance across the AI lifecycle. Clinical Trial: PROSPERO CRD420251058607; prospectively registered.

  • Quality Criteria for Cancer Patient Portal Content: Framework Development and Pilot Audit Study

    Date Submitted: May 1, 2026
    Open Peer Review Period: May 1, 2026 - Jun 26, 2026

    Background: Patient-facing cancer portals are increasingly used to provide education, support interpretation of results, navigate services, and guide self-management across the cancer journey. However, variation in content quality, transparency, readability, accessibility, and governance can undermine equity, safety, and trust. Objective: To develop and present EU-CiP20 as a first-phase, evidence-informed, operational, and auditable framework of quality criteria for cancer patient portal content. Methods: We synthesised established instruments and authoritative guidance on online health information quality, health literacy and plain-language communication, transparency and conflicts of interest, patient engagement, privacy and data protection, digital governance, accessibility, and AI-related safety. Candidate criteria were harmonised from a broader evidence-mapped set (EU-CiP30) into a streamlined taxonomy (EU-CiP20) using explicit consolidation rules and an auditable mapping trail. Each category was operationalised into four observable sub-criteria and scored using a pragmatic 0-2 scale. EU-CiP20 is presented as an initial comprehensive framework to be refined in the next phase through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and a Delphi expert panel, with the aim of reducing the 20 criteria to a final operational core of approximately 10 criteria. Results: EU-CiP20 comprises five domains and 20 categories spanning accessibility and comprehensibility; evidence and content governance; relevance and personalisation; human-centred design and empowerment; and ethics, safety, and trust. In the pilot, adjusted EU-CiP20 totals ranged from 19.5% to 40.6%. The most consistent gaps were governance signals required for portal readiness, including named clinical ownership, explicit review cycles, evidence traceability, and accessibility auditability. 
Comparator tools characterised content-level strengths but did not fully capture these governance risks. Conclusions: EU-CiP20 offers a practical and auditable first-phase approach to strengthen governance of patient-facing cancer portal content. It complements existing information-quality instruments by linking readability, evidence governance, relevance, empowerment, transparency, safety, and digital trust within a single operational taxonomy. The work is not yet complete: the current 20-criteria framework will be refined through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and Delphi expert panel consensus to produce a shorter final set of approximately 10 criteria, followed by assessment of inter-rater reliability, feasibility, sensitivity to change, and real-world implementation impact.

  • Background: Diffusion of innovations theory posits that inequalities arising from the early adoption of new technologies, such as telemedicine, are likely to decrease over time. However, evidence is scarce on the evolution of inequalities related to individual telemedicine adoption over time. Objective: This study aims to assess changes in age and socioeconomic inequalities in telemedicine adoption in Japan from 2020 to 2024. Methods: We used data from a nationwide, internet-based panel survey of the general population in Japan. Participants aged 18–75 years who completed both the 2020 baseline and 2024 follow-up surveys were included. The primary outcome was self-reported telemedicine adoption (ever use at each survey). Using multivariable logistic regression models, we regressed telemedicine adoption on (1) indicators of age and socioeconomic status at baseline, (2) survey year, and (3) their interaction, adjusting for other demographic, socioeconomic, and health-related characteristics. We then estimated the adjusted prevalence of telemedicine adoption in 2020 and 2024 for each age and socioeconomic group. Results: We included 10,818 participants (mean [SD] age, 49.7 [16.8] years; 50.7% women). In 2020, 271 participants (2.5%) reported telemedicine adoption; by the 2024 follow-up survey, this increased to 840 participants (7.8%). The prevalence of telemedicine adoption was lower among older individuals, those with lower educational attainment, those with medium income (vs high income), and unemployed individuals (vs upper non-manual workers) in 2020. While the prevalence increased across groups from 2020 to 2024, the increases were smaller among older age groups (70–75 years: +1.0 percentage points [pp] vs 18–29 years: +13.2 pp; difference-in-differences, −12.1 pp; 95% CI, −18.3 to −6.0 pp). 
Similarly, increases were smaller among unemployed individuals than among upper non-manual workers (+2.8 vs +5.8 pp; difference-in-differences, −3.0 pp; 95% CI, −4.7 to −1.2 pp). Changes in the prevalence of telemedicine adoption did not vary significantly by educational attainment, urban vs rural residence, or income level. Conclusions: Despite growth in telemedicine adoption from 2020 to 2024, age-related and occupational inequalities widened, and educational inequalities persisted, underscoring the need for strategies to reduce age-related and socioeconomic barriers to telemedicine adoption.
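The arithmetic behind the reported prevalences and difference-in-differences can be sketched as follows. The helper names are hypothetical, and the inputs are the rounded figures printed in the abstract; the study's own estimates come from adjusted regression models, so values recomputed from rounded inputs can differ in the last digit (for example, −12.2 here versus the reported −12.1).

```python
def prevalence_pct(count: int, total: int) -> float:
    """Crude prevalence as a percentage."""
    return 100.0 * count / total


def diff_in_diff(change_group: float, change_reference: float) -> float:
    """Difference between two groups' changes, in percentage points."""
    return change_group - change_reference


# Crude prevalence among the 10,818 participants.
print(round(prevalence_pct(271, 10818), 1))  # 2020
print(round(prevalence_pct(840, 10818), 1))  # 2024

# Unemployed vs upper non-manual workers: +2.8 pp vs +5.8 pp.
print(round(diff_in_diff(2.8, 5.8), 1))

# Ages 70-75 vs 18-29: +1.0 pp vs +13.2 pp (rounded inputs).
print(round(diff_in_diff(1.0, 13.2), 1))
```

A negative difference-in-differences means the group's adoption grew more slowly than the reference group's, which is why the gap widened even though both groups increased.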

  • Longitudinal Modeling or Monitoring of Depression in Speech: A Systematic Review

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Background: Depressive disorders are a leading cause of disability worldwide, and more than 40% of people who experience a single depressive episode will experience recurrence. It is, therefore, essential that people living with a depressive disorder are able to access appropriate means of monitoring, to identify recurrences and enable timely interventions. Existing monitoring methods are burdensome for both clinicians and patients, but previous research into automated depression diagnosis has demonstrated links between participants’ depression severity and speech features. Longitudinal depression modeling through speech aims to build on these links and provide automated methods of long-term depression monitoring. Objective: This systematic review collates existing research into the monitoring or modeling of changes in depression severity, through its impact on speech. Methods: We searched the ProQuest, Scopus, Web of Science, PubMed, and IEEE Xplore databases for studies relating to the longitudinal modeling of depression in speech. Publications of any age were acceptable, but only English-language studies were included. All studies underwent quality appraisal using the CASP cohort study checklist. Results: We retrieved 22 relevant documents from the database searches, and a further 40 documents through citation chasing and manual searching. The observational periods employed by these studies varied from 7 days to 18 months, and sample sizes ranged from 16 to 954. Speech features such as speaking rate and pause duration show promising sensitivity to changes in depression severity. However, other features, such as average energy velocity, exhibit conflicting trends across different studies, as does the generalizability of prosodic and acoustic features between languages. Conclusions: We identified significant methodological variation within the data collection, feature extraction, and modeling stages of the studies. 
While there is evidence to suggest that speech features are sensitive to changes in depression severity, some findings are inconsistent between studies. We advocate for greater clarity and consistency in the reporting of methods to support comparisons of findings between studies and generalizability testing. Future work could explore the predictive capacity of speech to identify oncoming depressive episodes. Clinical Trial: PROSPERO CRD420251003661; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251003661.

  • Liability and Standard of Care in AI-Driven Psychiatric Practice: A European Viewpoint

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Artificial intelligence is increasingly entering psychiatric care through decision-support systems, digital phenotyping tools, suicide-risk prediction models, documentation assistants, and conversational agents. These technologies may improve access, consistency, and personalised care, yet they also redistribute clinical authority and complicate liability when harm occurs. This article examines how European law and psychiatric ethics should respond to this shift. It argues that liability in AI-driven psychiatry cannot be understood only as a product-defect issue or only as a malpractice problem. Because psychiatric practice depends on interpretation, testimony, contextual judgment, and therapeutic alliance, the relevant standard of care must remain human, even when technologically augmented. The article advocates an augmented-clinician model in which AI informs but does not replace psychiatric reasoning. After outlining the European regulatory framework, including the AI Act, the Medical Device Regulation, the General Data Protection Regulation, the revised Product Liability Directive, and the European Health Data Space Regulation, the article analyses the implications of the withdrawal of the proposed AI Liability Directive and the persistence of divergent national tort regimes. It then examines psychiatric risk vectors, including automation bias, testimonial injustice, bias in mental health datasets, therapeutic chatbots, suicide prediction tools, passive monitoring, and large language model documentation. The discussion proposes a layered accountability model that links developers, deployers, and clinicians while preserving therapeutic integrity, patient rights, and legal clarity.

  • Background: Electronic nicotine delivery systems (ENDS) are at the center of global public health debate. China is the largest producer of e-cigarettes while the U.S. has the largest consumer market, yet analyses of news coverage of ENDS comparing China and the United States (U.S.) remain limited. Objective: The primary objective of this study is to identify and compare dominant themes in ENDS-related news coverage across leading broadcast-branded digital outlets in China and the United States, and to assess how these themes and coverage volume changed over time. Methods: We conducted a thematic analysis of 470 ENDS-related stories from January 1, 2020, to July 30, 2025, from four leading broadcast news digital media platforms: CNN.com and FoxNews.com in the U.S.; CCTV.com and ifeng.com in China. Using a single theme approach, coders identified core themes for each article based on prespecified rules and a hierarchical decision structure. Frequencies and proportion of each core theme were summarized for the overall sample and stratified by country. Pearson chi-square tests and binary logistic regression models were conducted to examine cross-national differences with false discovery rate (FDR) adjusted p-values. Temporal changes in themes were examined and visualized. Results: In U.S. coverage, the most prevalent themes were policy and regulatory governance (32.1%), youth appeal, flavors, and school responses (22.4%), and health risks, harms, symptoms, and dependence (13.9%). In Chinese coverage, the most prevalent themes were commercial practices and market dynamics of ENDS (26.0%), policy and regulatory governance (23.4%), and enforcement and compliance (15.7%). Cross-national differences in themes were consistently observed between the two countries. Between 2020 and 2025, coverage in China transitioned away from commercial and market themes toward greater focus on illicit substances and enforcement, while U.S. 
coverage showed relatively stable focus on commercial market with a gradual increase in enforcement-related reporting. Conclusions: Broadcast news in China and the U.S. may actively shape how ENDS are defined as a public issue and what policy responses appear legitimate. Chinese coverage tends to stress commercial activity and enforcement, whereas U.S. coverage more often foregrounds youth risks and regulatory debates. These distinct thematic patterns may influence risk perceptions and policies in each country and are important to consider in comparative media and public health research.
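The abstract reports false discovery rate (FDR) adjusted p-values without naming the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment, the correction could look like this; the function name and the p-values are hypothetical and are not study results.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per theme comparison.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

Each raw p-value is scaled by the number of tests divided by its rank, so the smallest p-values are penalised most, and a theme counts as differing between countries only if its adjusted value stays below the chosen threshold.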

  • Digital health interventions to prevent post-traumatic arthritis after traumatic knee injury: a scoping review

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: Traumatic knee injuries (TKI) are common and associated with a 4-6 times increased risk of post-traumatic knee osteoarthritis (PTOAK) over the subsequent 15–20 year period. There is clear evidence that risk can be reduced, but long-term care availability is limited, prompting the development of digital health interventions (DHIs) such as wearable devices, telehealth innovations and mobile apps. Objective: To evaluate existing DHIs against the OPTIKNEE consensus guidelines for PTOAK prevention and investigate adoption into practice. Methods: A search of 7 online databases and the grey literature was completed from inception to 03/06/2025, complemented by hand searching government, charity and university websites for reports and technical prototype papers concerning DHIs to support care after TKI. DHI features were mapped to the OPTIKNEE recommendations, evaluated against the health-technology pathway to identify development stage, and implementation analysed using Normalisation Process Theory (NPT). Results: 81 reports, 53 peer-reviewed and 28 other, concerning 49 distinct DHIs were found. They were designed for injuries of the anterior cruciate ligament (ACL, n=12); ACL meniscus (n=15); meniscus (n=3); ACL or meniscus (n=2); bone (n=2); patella dislocation (n=1); and 14 were non-specific. No DHIs addressed all OPTIKNEE recommendations; however, the eight most complete reported 4/7 components, including exercise, information provision, patient reported outcome measures, goal setting and overall patient outcome. A remote, self-assessed strength evaluation was not reported in any DHI. NPT analysis typically demonstrated low DHI adoption levels, and no clear correlation with health technology pathway stage. The DHI with the highest adoption into routine practice, according to NPT, was ‘getUbetter’ with 56% positive scores. 
Conclusions: There are many available, or developing, DHIs but none include the content recommended by OPTIKNEE to reduce the risk of PTOAK. Further, there is negligible evidence of DHIs being adopted into usual care. There is a clear need to develop guideline-compliant DHIs to support effective prevention.

  • Providing consultation recordings to patients in German routine cancer care: A mixed-methods pilot study

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: The provision of audio recordings of medical encounters to patients, referred to as consultation recordings, is a well-established intervention to address information needs like recall and comprehension in cancer care. Despite these benefits, consultation recordings are not routine practice. Furthermore, research on consultation recordings in Germany is lacking. Objective: This study aims to pilot test consultation recordings in routine cancer care in Germany and assess feasibility of implementation and perceived effects from patients’ perspective. Methods: Using a sequential mixed methods approach, we assessed consultation recordings’ use, usability, acceptability, appropriateness, influencing factors, and perceived effects. Consultation recordings were piloted in an outpatient setting. Adult cancer patients were eligible to participate. Four weeks after the recorded consultation, participants received a quantitative questionnaire. In addition, a selection of participants were qualitatively interviewed. Quantitative data was analyzed using descriptive statistics, qualitative data using a combination of Practical Thematic Analysis and qualitative content analysis. Results: Ninety-seven consultations were audio-recorded and provided to patients. Seventy participants returned the quantitative survey (response rate 72.2%) and 16 participated in qualitative interviews. Most participants listened to the consultation recording and experienced improvements in recall, comprehension, and feeling informed. Routine implementation of consultation recordings was desired by many. The results suggest that patients perceive consultation recordings as feasible. However, we encountered organizational implementation challenges. Conclusions: This study provides initial evidence on the patient-perceived feasibility of consultation recordings in German routine cancer care. Consultation recordings have the potential to help patients navigate complex medical information. 
However, organizational implementation challenges hinder their uptake. Future research could investigate technically easier solutions suited to the German healthcare context.

  • Background: Hypertension remains a predominant global risk factor for cardiovascular disease. Conventional follow-up models frequently fail to address the requirements for real-time monitoring and sustained intervention, whereas mobile health (mHealth) offers a transformative trajectory for chronic disease management. Despite a surge in relevant literature, the diversity of intervention modalities and the fragmented nature of existing evidence necessitate a systematic synthesis. Objective: This study aimed to comprehensively evaluate the efficacy of mHealth in hypertension management through a systematic review combined with evidence mapping, identifying research gaps to provide evidence-based insights for precision nursing and future research directions. Methods: A systematic search was conducted across PubMed, Web of Science, Cochrane Library, and Embase for randomized controlled trials (RCTs) involving mHealth interventions for hypertension, with the search period extending through February 2026. Literature was screened according to PICOS criteria, and methodological quality was appraised using the Cochrane Risk of Bias tool (RoB 1.0). Visual analytics, including Sankey diagrams and bubble plots, were employed to characterize the associations between intervention modalities and clinical outcomes. The study protocol was prospectively registered on the Open Science Framework (URL: https://osf.io/2vkwu). Results: A total of 106 publications (comprising 108 RCTs) were included. Publication volume has increased significantly since 2018, with the United States (31 papers) and China (19 papers) being the primary contributors. The intervention paradigm has evolved from rudimentary SMS reminders to a "closed-loop" management model centered on "App + Remote Monitoring," which demonstrates the most robust and consistent positive evidence for blood pressure (SBP/DBP) control and goal attainment rates. 
Blood pressure parameters occupied the "core evidence layer," while therapeutic adherence and disease knowledge formed the "behavioral evidence layer". Conversely, BMI, mental health, and quality of life remained in the "peripheral evidence layer," characterized by a notably higher proportion of non-significant results. Methodological quality was generally moderate-to-high with robust randomization; however, the implementation of blinding faced prevalent high risks due to the inherent nature of the interventions. Conclusions: mHealth significantly enhances hypertension management efficacy through a digital "monitoring-feedback-adjustment" loop, yet it encounters bottlenecks in achieving profound lifestyle modifications (e.g., weight management) and psychological interventions. Clinical decision-making should prioritize multicomponent interventions featuring real-time interaction. Future research should focus on long-term (>1 year) follow-up and cost-effectiveness transformation in resource-limited settings.

  • Background: Adults may experience subjective cognitive decline (SCD). However, it is unclear whether SCD is related to measurable cognitive impairment, particularly in women aged 40 to 60 and in early dementia. Further, Medicare has mandated assessment of cognitive and memory function in individuals over 65 as part of the Medicare Annual Wellness Visit. In order to assess possible impairment and change over time, efficient, objective measures of SCD are needed. Objective: To assess the relationship between performance on an online continuous recognition task (CRT, MemTrax) and age, sex, and memory concern. Methods: This study evaluated CRT performance in participants aged 21-99 who enrolled in an online program (HAPPYneuron) to measure mental functions, including those who reported concerns about them. This program asked participants if they had complaints about their memory, and then the program offered them the opportunity to assess cognition using the CRT. This CRT instructs individuals to attend to visual stimuli (50 images) and respond as quickly as possible to repeated images (25 images). The CRT components were used to measure learning and memory (as related to HITs, response to a repeated image), executive function (as related to CRs, correctly not responding to an initial image presentation), and processing speed (HIT-RTs, average response time to HITs). Results: The analysis included 18,178 participants (5,795 males, 32%; 12,383 females, 68%), restricted to those who answered the sex, age, and memory questions. Of these, 11,786 (65%) were between 40 and 70 years of age. Females outnumbered males by over two-fold, beginning about 35 years of age, peaking at 55 years of age at over three-fold, and falling below two-fold at about 65 years of age. Approximately 30% more men complained of memory problems than those who did not, primarily 30-60 years old. About 80% more women complained of memory problems, over two-fold more than women who did not, 30-50 years old. 
The number of HITs, number of CRs, and HIT-RTs varied little between men and women. While those without memory complaints generally performed better than those with memory complaints, there was little difference in performance levels for each group between males and females. For all groups, there was a gradual reduction of performance over age for HITs and CRs and a slowing of HIT-RTs. Conclusions: Most subjects were 40-65, more than twice as many females, suggesting that these demographics have a relationship to concern about SCD. However, there was little difference between males and females for the various CRT components, though SCD was associated with impairment. Age-related declines were progressive, the largest being in slower processing speed, presumably to compensate for age-related changes in cognitive function. Present results suggest clinicians may use these metrics to quantify patient concerns expressed in the primary care setting. Clinical Trial: none
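The three CRT measures defined in this abstract (HITs, CRs, and HIT-RTs) can be sketched as a small scoring routine. The `Trial` record, the field names, and the example trials are hypothetical and are not the MemTrax implementation; only the definitions of the three measures come from the abstract.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Trial:
    is_repeat: bool                   # True if the image was shown before
    response_time_s: Optional[float]  # None means the participant did not respond


def score_crt(trials: List[Trial]) -> Dict[str, Optional[float]]:
    """Count HITs and CRs and average the response time over HITs."""
    # HIT: a response to a repeated image.
    hits = [t for t in trials if t.is_repeat and t.response_time_s is not None]
    # CR (correct rejection): no response to an initial presentation.
    crs = [t for t in trials if not t.is_repeat and t.response_time_s is None]
    hit_rt = mean(t.response_time_s for t in hits) if hits else None
    return {"hits": len(hits), "correct_rejections": len(crs), "hit_rt_s": hit_rt}


# Hypothetical five-trial session.
session = [
    Trial(is_repeat=False, response_time_s=None),  # correct rejection
    Trial(is_repeat=True, response_time_s=0.8),    # hit
    Trial(is_repeat=True, response_time_s=None),   # miss
    Trial(is_repeat=False, response_time_s=0.5),   # false alarm
    Trial(is_repeat=True, response_time_s=1.0),    # hit
]
print(score_crt(session))
```

Misses and false alarms are not counted directly here, but they follow from the totals once the numbers of repeated and initial presentations are known.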

  • Momentary Mood State Detection using Smartwatches: Algorithm Development and Validation

    Date Submitted: Apr 20, 2026
    Open Peer Review Period: Apr 21, 2026 - Jun 16, 2026

    Background: Mental health encompasses not only chronic conditions such as depression or anxiety, but also acute fluctuations in mood that unfold over minutes to hours and can disrupt daily functioning. These transient states, such as sudden fatigue, irritability, or low energy, remain largely invisible to current digital health approaches, which typically aggregate behavioral and physiological data over days or weeks to detect trait-level conditions. The ability to detect momentary mood shifts in real time carries significant clinical promise: continuous affective monitoring could enable early detection of mental health crises, support clinical decisions and clinical trials with continuous mood measurements, and improve occupational safety with detection of states like fatigue or confusion. However, affective computing research has demonstrated that while physiological signals carry information relevant to mood, most prior work relies on controlled laboratory settings where performance degrades substantially in naturalistic environments, or employs research-grade devices with proprietary sensors unavailable on consumer hardware. Bridging this gap between laboratory-validated sensing and real-world momentary mood detection is essential for translating these clinical possibilities into practice through just-in-time adaptive interventions. Objective: This study investigates whether continuous sensing from a low-cost, open-source smartwatch can support detection of multi-dimensional momentary mood states in naturalistic settings, using personalized models with on-device computation. 
Methods: We conducted a 7-day field study in which participants (N=10) wore Bangle.js 2 smartwatches that continuously collected physiological and contextual data, including heart rate, accelerometry, barometric pressure, temperature, and GPS, while prompting hourly mood self-reports using the Brunel Mood Scale (BRUMS) across six mood dimensions (tension, depression, anger, vigor, fatigue, confusion) and additional affective and physical states. All feature extraction was performed on-device. We developed personalized mood detection models using best-subset regression across multiple feature combinations. Results: Personalized models decoded momentary states with mean R² values ranging from 0.09 (pain) to 0.31 (vigor). Fatigue, happiness, vigor, and depression were the most reliably decoded dimensions (mean R² = 0.26–0.31). Cross-subject decoding was substantially lower, confirming that personalization is essential for accurate mood inference. Including privacy-preserving location features did not significantly improve prediction accuracy beyond physiological and contextual sensors alone. Conclusions: This work demonstrates that a broad range of momentary mood states can be decoded from low-cost, open-source wearable sensors as people go about their daily lives, bridging the gap between controlled laboratory studies and real-world momentary assessment. The finding that personalized models substantially outperform generalized approaches underscores the need for individual calibration in affective computing systems. The on-device, privacy-preserving architecture establishes a foundation for future closed-loop adaptive interventions in clinical and occupational contexts, including continuous monitoring of high-risk psychiatric populations, early warning systems for substance use relapse, and real-time assessment of cognitive and emotional fitness in safety-critical work environments. Clinical Trial: N/A
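Best-subset regression, as named in the Methods, fits a model for every candidate combination of features and keeps the best-scoring one. The following is a minimal sketch under stated assumptions: ordinary least squares via NumPy, adjusted R² as the selection score, a cap on subset size, and synthetic data. The abstract does not specify the authors' selection criterion or validation scheme, and the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def adjusted_r2(X: np.ndarray, y: np.ndarray) -> float:
    """Fit ordinary least squares with an intercept; return adjusted R^2."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_res = float(np.sum((y - design @ coef) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def best_subset(X: np.ndarray, y: np.ndarray, max_size: int):
    """Try every feature subset up to max_size; keep the best-scoring one."""
    best_score, best_cols = float("-inf"), ()
    for size in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), size):
            score = adjusted_r2(X[:, list(cols)], y)
            if score > best_score:
                best_score, best_cols = score, cols
    return best_cols, best_score


# Synthetic stand-in for one participant's data (not study data):
# the outcome depends on features 0 and 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = 2.0 * X[:, 0] - X[:, 2] + rng.normal(scale=0.1, size=60)
cols, score = best_subset(X, y, max_size=2)
print(cols, round(score, 3))
```

The search is exhaustive, so its cost grows quickly with the number of features. Fitting one such model per participant corresponds to the personalized setting; a cross-subject model would instead pool participants before fitting.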

  • Virtual Reality for Cognitive Mastery in Airway Trauma Management: A Prospective Randomized Controlled Trial

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Innovation in teaching methods is essential for advancing medical education, particularly for trainees developing crisis management skills. Virtual reality (VR) offers access to immersive, scalable, and accessible learning environments, but its effectiveness compared to traditional mannequin-based simulation remains underexplored. Objective: This prospective randomized controlled trial evaluates the efficacy of VR-based simulation versus traditional gold-standard mannequin-based training in enhancing medical trainees’ knowledge acquisition and application of decision-making concepts for airway trauma management. Methods: Forty medical students were randomized to either the VR (intervention) group or the Mannequin (control) group. Participants engaged in airway trauma management training using their assigned modality. Both groups completed a pre- and post-intervention test to evaluate knowledge acquisition, and undertook a mannequin-based crisis scenario one week after training to evaluate knowledge application. Results: Both groups demonstrated significant knowledge acquisition (VR: mean improvement +2.0/15, P=0.006; Mannequin: mean improvement +3.2/15, P<0.001), though no statistically significant differences were observed between groups (P=0.15). The VR group achieved self-assessed readiness and knowledge saturation faster, on average, than the Mannequin group. Both groups, on average, were successful in the post-training knowledge application test; however, the Mannequin group outperformed the VR group (mean difference: 1.58/15, P=0.021), and recognized a potential airway injury more quickly (P=0.004). Nevertheless, students in the VR group reported greater engagement and satisfaction, expressing a preference for VR as a future learning modality. 
Conclusions: Overall, VR-based simulation is a promising and engaging method for teaching airway trauma management and demonstrates comparable knowledge acquisition to traditional mannequin-based training. However, mannequin-based simulation still confers advantages for applied performance. Further studies using larger samples, multiple scenarios, and VR-based assessments are needed. Clinical Trial: ClinicalTrials.gov NCT04451590; https://clinicaltrials.gov/study/NCT04451590

  • Background: Large real-world data sources offer a unique opportunity to study the health of diverse ethnic groups. High-quality and accessible ethnicity data is needed to maximise this potential. Objective: To validate a newly developed ethnicity phenotype in the Oxford-Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC). Methods: Retrospective cross-sectional study of individuals registered at a practice within the Oxford-RCGP RSC on 4th December 2024. An updated ethnicity phenotype was implemented and validated. Ethnicity data quality was assessed by evaluating completeness, distribution, and accuracy through external validation against estimates from the 2021 UK Census. Results: Of 21,902,852 individuals, 88.63% (19,412,154) had a recorded ethnicity following the implementation of the updated ethnicity phenotype. There was a marked improvement in the recording of granular (19-point) ethnicity data, with completeness increasing from 69.06% (15,126,835) to 88.63% (19,412,154) with the updated phenotype. There was significant variation in the completeness of ethnicity data according to demographic subgroups. The proportion of individuals in each ethnicity group was within 3.56 percentage points of the 2021 Census estimates for the same ethnicity group across England. Larger relative differences were observed for non-White ethnic groups. Conclusions: The updated ethnicity phenotype provides high-quality and granular ethnicity data based on official classifications for almost 90% of individuals. The overall ethnicity breakdown in the Oxford-RCGP RSC population was broadly similar to 2021 UK Census estimates. The updated ethnicity phenotype supports secondary uses of primary care CMRs, providing high-quality and accessible ethnicity data to study the health of diverse ethnic groups.
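The Results distinguish absolute differences from the Census (in percentage points) from relative differences, which can be large for small groups even when the absolute gap is small. A minimal sketch of that distinction follows; the function names and group shares are hypothetical and are not values from the study or the Census.

```python
def absolute_diff_pp(sample_pct: float, census_pct: float) -> float:
    """Difference between two shares, in percentage points."""
    return sample_pct - census_pct


def relative_diff_pct(sample_pct: float, census_pct: float) -> float:
    """Difference expressed as a percentage of the census share."""
    return 100.0 * (sample_pct - census_pct) / census_pct


# Hypothetical (sample share, census share) pairs, in percent.
groups = {
    "larger group (hypothetical)": (80.0, 83.0),
    "smaller group (hypothetical)": (2.0, 2.5),
}
for name, (sample, census) in groups.items():
    print(
        name,
        round(absolute_diff_pp(sample, census), 2),
        round(relative_diff_pct(sample, census), 1),
    )
```

In this made-up example the smaller group sits closer to the census in percentage points yet much further away in relative terms, which is the pattern the abstract describes for non-White ethnic groups.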

  • Background: Chronic or persistent pain can limit an individual’s ability to work or be productive at work, creating substantial societal and economic burden. Despite this, evidence-based work‑related advice and support for people with chronic pain is inconsistent. The Pain‑at‑Work Toolkit was co‑created with people living with pain, health care professionals, and employers to increase knowledge of employee rights, improve access to workplace support, and provide guidance on lifestyle behaviors that facilitate pain self‑management. Objective: This study aimed to establish the feasibility of conducting a definitive cluster randomized controlled trial comparing access to the Pain‑at‑Work Toolkit plus optional occupational therapist telephone support (intervention) with support-as-usual (SAU) from the employer (control). Primary outcomes were feasibility, acceptability, usability, and safety of the digital intervention. We also assessed the feasibility of candidate primary and secondary outcomes and tested research processes required for a definitive trial. Methods: We conducted an open‑label, parallel, two‑arm pragmatic feasibility cluster randomized controlled trial with exploratory health‑economics analysis and a nested qualitative study. Eligible organizations were based in England, had ≥10 employees, and were recruited through professional networks and direct approach. Individual participants were working adults aged ≥18 years, with internet access and self‑reported chronic pain interfering with their ability to undertake or enjoy productive work. A restricted 1:1 cluster‑level randomization allocated organizations to the intervention or control arms. After organizational and individual consent, participants completed a web‑based baseline survey (T0) assessing work capacity, health and wellbeing, and health‑care resource use. Follow‑up occurred at 3 months (T1) and 6 months (T2). 
Feasibility outcomes included recruitment, intervention fidelity (delivery, reach, uptake, engagement), retention, and follow‑up completion. Qualitative interviews with employees and stakeholders at T2 explored acceptability and contextual factors influencing delivery and uptake. Results: A total of 380 employees from 18 organizations participated. Recruitment exceeded targets at both organizational and individual levels, demonstrating strong feasibility and engagement. Follow‑up completion met predefined feasibility criteria but showed variability, largely due to employee turnover, providing realistic attrition estimates for a future trial. Outcome measures showed acceptable completion rates and variability, supporting their suitability for use in a future definitive trial. Employees and stakeholders reported high acceptability of the Pain‑at‑Work Toolkit, and qualitative findings highlighted improved knowledge, confidence, and self‑management among employees. Stakeholders endorsed the Toolkit’s relevance and practicality within workplace settings. Conclusions: The feasibility trial demonstrated that the Pain‑at‑Work Toolkit and trial procedures are acceptable, scalable, and deliverable across diverse workplaces. Findings identify responsive outcome measures, emphasize the need for strengthened retention strategies, and support the Toolkit’s use as a standalone intervention. Overall, the study provides a strong foundation for progressing to a fully powered definitive trial. Clinical Trial: ClinicalTrials.gov NCT05838677; https://clinicaltrials.gov/study/NCT05838677 International Registered Report Identifier (IRRID): DERR1-10.2196/51474

  • Machine Learning–Enabled Interventions in Palliative Care: A Scoping Review

    Date Submitted: Apr 14, 2026
    Open Peer Review Period: Apr 14, 2026 - Jun 9, 2026

    Background: Machine learning-based prognostic models have been increasingly developed to support palliative and serious illness care, particularly in oncology. While predictive accuracy has improved substantially, less is known about how these models are translated into real-world interventions and whether they meaningfully influence clinical practice and patient care. Objective: This scoping review aimed to map and synthesize interventional studies that used machine learning-enabled interventions to support palliative and serious illness care, with a focus on model integration strategies and reported effects on communication processes, care planning, and downstream clinical outcomes. Methods: Following PRISMA-ScR guidelines, we conducted a scoping review of peer-reviewed English-language studies published since 2015. Searches were performed in PubMed, Embase, Web of Science, and the Cochrane Library. Eligible studies implemented machine learning-based predictions to trigger or guide real-world palliative care-related interventions, including serious illness conversations, advance care planning, or palliative care referral. Results: Eight interventional studies were included, encompassing cluster randomized trials, stepped-wedge designs, and real-world implementation studies. Machine learning-enabled interventions were consistently associated with increased documentation of serious illness conversations and advance care planning, particularly when predictive outputs were embedded within clinical workflows through behavioral nudges, automated alerts, or facilitated outreach. In contrast, effects on treatment intensity, health care utilization, and end-of-life costs were limited, inconsistent, or not observed. Conclusions: Current evidence suggests that machine learning-enabled interventions in oncology palliative care are most effective when used to support prioritization and timing of communication-related processes rather than to directly alter care trajectories or resource use. Future research should focus on implementation strategies, patient-centered outcomes, and equity-sensitive evaluation to better translate predictive insights into meaningful clinical impact.