The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration, and other "2.0" ideas to scholarly publishing. Since December 2009 it has offered open peer review, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently being considered by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and where the editor has not yet made a decision. (Note that this feature is for reviewing specific articles; if you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.)

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Background: There is tremendous enthusiasm for the use of AI in health care because of its ability to analyze existing data for preventative, diagnostic, and treatment support. Agentic AI can make access to large real world datasets for the generation of real world evidence for health care and clinical applications feasible for health care providers, researchers, and administrators without access to large analytic programming resources. Objective: The objective of this study was to understand the feasibility of using agentic AI for clinical outcomes research and population health management. Specifically, this study used an agentic AI evidence generation platform to obtain epidemiologic estimates of several diverse medical conditions with the results evaluated against existing AI frameworks. Methods: Prevalence estimates of six diverse conditions (amyotrophic lateral sclerosis (ALS), acute myeloid leukemia (AML), bladder cancer, Huntington’s disease (HD), elevated lipoprotein (a) (LP(a)), and Parkinson’s disease (PD)) were obtained using an agentic AI evidence generation platform applied to an administrative claims database with representation from every US state. Gender-specific rates were calculated within the following age categories: 0-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, and 75 years and older. Period prevalence was estimated from Jan 1, 2020 – June 30, 2025, and annual prevalence rates were estimated for each year from 2020 to 2024. Continuous enrollment for 12 months was required during the study period for inclusion. Source code generated by the platform as part of the analysis was reviewed by an independent programmer for validation of methods and programming. Results obtained throughout the process were evaluated against several existing AI application frameworks. 
Results: Accuracy: Epidemiologic estimates obtained using the agentic AI platform were consistent with published estimates for all six conditions as well as with estimates obtained from traditional programming methods. Rigor: The agentic AI platform conducted the analysis with rigor by confirming acceptable methods in published literature for the type of data source used. Code lists used for the analysis were confirmed against existing algorithms when available. Appropriate statistical methods were used to compare differences in prevalence rates by age and gender. Trust (explainability, transparency, replicability, traceability, and validation): The agentic AI platform generated all source code used for the analyses, which was reviewed and validated for accuracy and appropriateness. The analysis included a ‘human-in-the-loop’ to validate the research question, data extraction method, statistical analysis plan, and output plan prior to proceeding with each step. Conclusions: With specific design aspects to ensure responsible use, agentic AI can be invaluable to making large datasets accessible for applied clinical outcomes research and population health management analyses.

  • Socio-Cultural Challenges and Design Implications for Ethical AI in Healthcare: A Systematic Review

    Date Submitted: May 4, 2026
    Open Peer Review Period: May 5, 2026 - Jun 30, 2026

    Background: Artificial intelligence (AI) is increasingly embedded in healthcare, yet its benefits remain unevenly distributed due to persistent concerns regarding bias, inequity, and socio-cultural misalignment. Although existing Ethical AI frameworks typically emphasize universal principles, they often insufficiently address the socio-cultural contexts in which AI systems are developed, implemented, and used. Objective: This systematic review aimed to examine how socio-cultural factors shape ethical challenges in healthcare AI, influence the interpretation of ethical principles, and inform context-sensitive design and governance strategies. Methods: Following PRISMA 2020 guidelines, we conducted a systematic search of PubMed, IEEE Xplore, and Web of Science for studies published between 2018 and 2025. Eligible studies addressed ethical issues related to AI in healthcare through a socio-cultural lens. A thematic synthesis combining inductive and deductive coding was used to analyze reported challenges, context-dependent ethical interpretations, and proposed mitigation approaches. Results: A total of 49 studies were included. The findings show that ethical challenges in healthcare AI are deeply embedded in structural inequalities, data collection, curation, and documentation practices, institutional conditions, and cultural norms rather than being purely technical problems. Key challenges included algorithmic bias, underrepresentation of minorities in datasets, cultural and linguistic mismatches, limited transparency and trust, and systemic disparities in access to AI technologies. The reviewed literature proposed a broad range of technical, design-related, and governance-oriented strategies, but these remained fragmented and were rarely integrated systematically across the AI lifecycle. 
Based on this synthesis, the study proposes the Inclusive Ethical AI Framework (IEAF), a socio-technical framework that systematically translates socio-cultural context into context-sensitive ethical interpretations and actionable design and governance decisions across the AI lifecycle. Conclusions: The findings highlight that ethical challenges in healthcare AI are fundamentally shaped by socio-cultural context and cannot be addressed through technical solutions or universal ethical principles alone. Instead, effective and equitable AI systems require the systematic integration of socio-cultural considerations into data practices, system design, and governance across the AI lifecycle. Clinical Trial: PROSPERO CRD420251058607; prospectively registered.

  • Background: Up to 50% of patients treated with internet-delivered cognitive behavioral therapy (ICBT) for depression and anxiety disorders do not experience clinically significant symptom reduction. Identifying these patients prior to ICBT initiation can optimize treatment effect. Objective: The aim of this study was to enhance baseline prediction of clinically meaningful improvement in patients treated with ICBT for common psychiatric disorders in routine care, which could ultimately inform treatment allocation at intake. Methods: We developed multimodal predictive models integrating clinical, sociodemographic, and genetic data to predict clinically meaningful improvement in a sample of n=1790 patients treated with ICBT for major depressive disorder, panic disorder, and social anxiety disorder. Only data available pre-treatment were used to enable baseline prediction. We applied machine learning algorithms of varying complexity (logistic regression, random forest, XGBoost, support vector machines, soft voting, and stacking ensemble), with nested cross-validation, elastic net variable selection, multiple imputation, and temporal validation in a 20% holdout test set (n=356). The primary performance measure was area under the receiver operating characteristic curve (AUC). Results: All models showed comparable performance, with random forest achieving the best discrimination (AUCtest 0.749, 95% CI [0.698, 0.797]). Models that included data from national registers outperformed a benchmark model based on self-reported screening data (AUCtest 0.732–0.749 vs 0.695), whereas polygenic scores added no independent predictive value (DeLong test P=.966). Conclusions: These promising results provide a foundation for a future prospective trial to ascertain that baseline prediction can effectively guide tailored interventions for at-risk patients.

  • Background: Primary care encounters are increasingly constrained by limited time and documentation burden, reducing opportunities for meaningful patient–physician communication. Pre-visit planning tools can improve efficiency but are often rigid, burdensome, and inconsistently adopted. Patient-facing applications using large language models (LLMs) offer the potential for more flexible, conversational approaches to eliciting patient information prior to clinical encounters, though concerns remain regarding safety, hallucination, and workflow integration. Objective: To evaluate the feasibility of an LLM-based conversational assistant (PCP-Bot) for pre-visit planning in primary care, focusing on the quality, usability, and perceived clinical utility of generated pre-visit summaries. Methods: We conducted a prospective feasibility study using simulated primary care scenarios. PCP-Bot, implemented using ChatGPT-4o, engaged users via a voice-based conversational interface and generated structured pre-visit summaries using a schema-constrained output. Ten synthetic cases were enacted by trained non-medical researchers acting as patients, producing 30 complete dialogues and corresponding summaries. Practicing physicians (N=10) independently rated summaries across six domains (usefulness, readability, relevance, coherence, comprehensiveness, and factual accuracy) using 5-point Likert scales. Perceived usefulness (TAM-PU) was assessed among physicians, and perceived ease of use (TAM-PEU) was assessed among participants simulating patient interactions. Quantitative analyses examined interaction characteristics and their associations with summary quality. Results: PCP-Bot generated concise conversations (median 28 exchanges, IQR 26.25–31) and summaries (median 148 words, IQR 132.75–162). 
Clinician ratings were favorable across domains, including usefulness (mean 3.99, SD 0.25), relevance (mean 4.07, SD 0.21), readability (mean 4.07, SD 0.26), coherence (mean 3.94, SD 0.27), and comprehensiveness (mean 3.88, SD 0.22), with a low hallucination rate (mean 0.51, SD 0.25). Simulated patients reported high perceived ease of use (mean TAM-PEU 97.2, SD 2.48), while physicians reported moderate perceived usefulness (mean TAM-PU 61.3, SD 15.9). Longer summaries were associated with higher ratings of usefulness (r=0.39, P=.033) and comprehensiveness (r=0.39, P=.031), whereas longer patient dialogue was negatively associated with perceived relevance (r=−0.39, P=.035). Conclusions: In simulated primary care scenarios, an LLM-based conversational assistant produced concise, structured pre-visit summaries that clinicians rated favorably, supporting the feasibility of conversational pre-visit workflows. Summary quality appears sensitive to the balance between detail and conciseness. Real-world evaluation is needed to assess clinical impact, safety, equity, and integration into routine care.

  • Mapping the Use of AI Chatbots for Health Purposes: A Systematic Review Across Stakeholders

    Date Submitted: May 3, 2026
    Open Peer Review Period: May 3, 2026 - Jun 28, 2026

    Background: Artificial intelligence (AI) chatbots are increasingly shaping health communication by mediating how patients, health professionals, researchers, and health institutions seek, interpret, produce, and act on health information. Existing reviews have largely focused on a single stakeholder group or clinical domain, leaving the broader question of who uses AI chatbots for what health purposes inadequately addressed. A multi-stakeholder synthesis is needed to understand where evidence is concentrated and where critical gaps remain. Objective: This systematic review aimed to map the empirical literature on AI chatbot use for health purposes across three stakeholder groups (the general public or patients, health professionals or researchers, and health institutions) and to characterize the purposes, evidence patterns, benefits, and risks associated with such use. Methods: Following PRISMA guidelines, we searched nine databases for peer-reviewed English-language empirical journal articles. Searches combined AI chatbot–related and health-related terms and covered records available through July 2025. After title, abstract, and full-text screening, 301 articles were retained for content coding. Articles were coded for publication year, country, sample size, method, chatbot modality, health topic, stakeholder group, and purposes of AI chatbot use, using a structured codebook. Results: The 301 included articles were published between 2023 and 2025. Studies were geographically concentrated in the United States and China and predominantly examined text-based interactions, with noncommunicable or chronic diseases and general health information as the most common health topics. Ten purposes of AI chatbot use were identified across stakeholder groups. 
Among the general public or patients (n = 152), purposes included (1) seeking health information, (2) symptom assessment and self-care management, (3) emotional support, and (4) preventive and transitional care support. Among health professionals or researchers (n = 189), purposes included (5) clinical decision support, (6) enhancing continuing education, (7) facilitating health research, (8) improving administrative efficiency, and (9) supporting patient interaction and doctor–patient communication. Among health institutions (n = 6), (10) public health management was the major purpose. Conclusions: This review provides a stakeholder–purpose mapping of AI chatbot use in health care, identifying ten purposes across three stakeholder groups. The evidence base is expanding rapidly but remains concentrated in text-based, lower-acuity, and information-oriented contexts, with AI chatbots used mainly to support interpretation, decision-making, communication, and workflow tasks. Reported benefits in access, scalability, personalization, low-barrier support, and administrative efficiency are accompanied by persistent concerns about accuracy, equity, transparency, and governance. Important gaps remain at the institutional level, in higher-stakes and longitudinal deployment contexts, in voice and multimodal modalities, and in empirical evaluation of equity and ethics. These findings offer a structured foundation for future research, design, and governance of AI chatbots in health care.

  • Background: Traditional research ethics governance was built for bounded protocols, identifiable investigators, and temporally limited encounters with human participants. Data science health research (DSHR) disrupts that architecture because health data, biological materials, computational representations, and models persist, travel, combine, and acquire new uses across time. The ethical problem is not simply that individual instruments such as consent forms, IRB approvals, data access agreements, model cards, or privacy notices become outdated. It is that, across the lifecycle, the authorities that make these instruments ethically meaningful may lose jurisdiction, standing, evidentiary force, or remedial control. Conceptual contribution: This paper introduces Ethical Governance Continuity Dissolution (EGCD): the progressive and sometimes irreversible loss of domain-specific governance authority across a data, model, or biological-material lineage, such that no coherent assemblage of actors, instruments, rules, or community processes can any longer authorize, constrain, monitor, adjudicate, or remediate current use. EGCD differs from consent staleness, function creep, contextual integrity violations, algorithmic drift, and the continuity trap because it names the systemic condition in which several such failures become mutually reinforcing. Framework: Building on prior work on representational veracity and the continuity trap, we develop a six-domain authority taxonomy, diagnostic criteria, staging categories, and a prototype Ethical Continuity Dissolution Score. We then propose an Ethical Continuity Governance and Response Mechanism consisting of a continuity registry, continuity authority matrix, trigger-based lifecycle review, a Data Lifecycle Governance Officer, a Continuity Dissolution Review Board, cross-institutional audits, and community governance integration. 
Implications: EGCD provides a practical vocabulary and governance architecture for diagnosing and managing the loss of ethical governance continuity in global DSHR. The goal is not to freeze all original consent conditions indefinitely, but to prevent silent loss of governance authority as data, models, and biological materials move through increasingly complex research and translational ecosystems.

  • Quality Criteria for Cancer Patient Portal Content: Framework Development and Pilot Audit Study

    Date Submitted: May 1, 2026
    Open Peer Review Period: May 1, 2026 - Jun 26, 2026

    Background: Patient-facing cancer portals are increasingly used to provide education, support interpretation of results, navigate services, and guide self-management across the cancer journey. However, variation in content quality, transparency, readability, accessibility, and governance can undermine equity, safety, and trust. Objective: To develop and present EU-CiP20 as a first-phase, evidence-informed, operational, and auditable framework of quality criteria for cancer patient portal content. Methods: We synthesised established instruments and authoritative guidance on online health information quality, health literacy and plain-language communication, transparency and conflicts of interest, patient engagement, privacy and data protection, digital governance, accessibility, and AI-related safety. Candidate criteria were harmonised from a broader evidence-mapped set (EU-CiP30) into a streamlined taxonomy (EU-CiP20) using explicit consolidation rules and an auditable mapping trail. Each category was operationalised into four observable sub-criteria and scored using a pragmatic 0-2 scale. EU-CiP20 is presented as an initial comprehensive framework to be refined in the next phase through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and a Delphi expert panel, with the aim of reducing the 20 criteria to a final operational core of approximately 10 criteria. Results: EU-CiP20 comprises five domains and 20 categories spanning accessibility and comprehensibility; evidence and content governance; relevance and personalisation; human-centred design and empowerment; and ethics, safety, and trust. In the pilot, adjusted EU-CiP20 totals ranged from 19.5% to 40.6%. The most consistent gaps were governance signals required for portal readiness, including named clinical ownership, explicit review cycles, evidence traceability, and accessibility auditability. 
Comparator tools characterised content-level strengths but did not fully capture these governance risks. Conclusions: EU-CiP20 offers a practical and auditable first-phase approach to strengthen governance of patient-facing cancer portal content. It complements existing information-quality instruments by linking readability, evidence governance, relevance, empowerment, transparency, safety, and digital trust within a single operational taxonomy. The work is not yet complete: the current 20-criteria framework will be refined through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and Delphi expert panel consensus to produce a shorter final set of approximately 10 criteria, followed by assessment of inter-rater reliability, feasibility, sensitivity to change, and real-world implementation impact.

  • Background: Diffusion of innovations theory posits that inequalities arising from the early adoption of new technologies, such as telemedicine, are likely to decrease over time. However, evidence is scarce on the evolution of inequalities related to individual telemedicine adoption over time. Objective: This study aims to assess changes in age and socioeconomic inequalities in telemedicine adoption in Japan from 2020 to 2024. Methods: We used data from a nationwide, internet-based panel survey of the general population in Japan. Participants aged 18–75 years who completed both the 2020 baseline and 2024 follow-up surveys were included. The primary outcome was self-reported telemedicine adoption (ever use at each survey). Using multivariable logistic regression models, we regressed telemedicine adoption on (1) indicators of age and socioeconomic status at baseline, (2) survey year, and (3) their interaction, adjusting for other demographic, socioeconomic, and health-related characteristics. We then estimated the adjusted prevalence of telemedicine adoption in 2020 and 2024 for each age and socioeconomic group. Results: We included 10,818 participants (mean [SD] age, 49.7 [16.8] years; 50.7% women). In 2020, 271 participants (2.5%) reported telemedicine adoption; by the 2024 follow-up survey, this increased to 840 participants (7.8%). The prevalence of telemedicine adoption was lower among older individuals, those with lower educational attainment, those with medium income (vs high income), and unemployed individuals (vs upper non-manual workers) in 2020. While the prevalence increased across groups from 2020 to 2024, the increases were smaller among older age groups (70–75 years: +1.0 percentage points [pp] vs 18–29 years: +13.2 pp; difference-in-differences, −12.1 pp; 95% CI, −18.3 to −6.0 pp). 
Similarly, increases were smaller among unemployed individuals than among upper non-manual workers (+2.8 vs +5.8 pp; difference-in-differences, −3.0 pp; 95% CI, −4.7 to −1.2 pp). Changes in the prevalence of telemedicine adoption did not vary significantly by educational attainment, urban vs rural residence, or income level. Conclusions: Despite growth in telemedicine adoption from 2020 to 2024, age-related and occupational inequalities widened, and educational inequalities persisted, underscoring the need for strategies to reduce age-related and socioeconomic barriers to telemedicine adoption.

  • Longitudinal Modeling or Monitoring of Depression in Speech: A Systematic Review

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Background: Depressive disorders are a leading cause of disability worldwide, and more than 40% of people who experience a single depressive episode will experience recurrence. It is, therefore, essential that people living with a depressive disorder are able to access appropriate means of monitoring, to identify recurrences and enable timely interventions. Existing monitoring methods are burdensome for both clinicians and patients, but previous research into automated depression diagnosis has demonstrated links between participants’ depression severity and speech features. Longitudinal depression modeling through speech aims to build on these links and provide automated methods of long-term depression monitoring. Objective: This systematic review collates existing research into the monitoring or modeling of changes in depression severity, through its impact on speech. Methods: We searched the ProQuest, Scopus, Web of Science, PubMed and IEEE Xplore databases for studies relating to the longitudinal modeling of depression in speech. Publications of any age were acceptable, but only English-language studies were included. All studies underwent quality appraisal using the CASP cohort study checklist. Results: We retrieved 22 relevant documents from the database searches, and a further 40 documents through citation chasing and manual searching. The observational periods employed by these studies varied from 7 days to 18 months, and sample sizes ranged from 16 to 954. Speech features such as speaking rate and pause duration show promising sensitivity to changes in depression severity. However, other features, such as average energy velocity, exhibit conflicting trends across different studies, as does the generalizability of prosodic and acoustic features between languages. Conclusions: We identified significant methodological variation within the data collection, feature extraction, and modeling stages of the studies. 
While there is evidence to suggest that speech features are sensitive to changes in depression severity, some findings are inconsistent between studies. We advocate for greater clarity and consistency in the reporting of methods to support comparisons of findings between studies and generalizability testing. Future work could explore the predictive capacity of speech to identify oncoming depressive episodes. Clinical Trial: PROSPERO CRD420251003661; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251003661.

  • Online Search Behavior Related to Psoriasis in Germany: Infodemiology Study Using Google Ads and Google Trends

    Date Submitted: Apr 29, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Background: Psoriasis causes considerable physical and psychosocial burden. Despite the availability of effective topical and systemic treatments, real-world care appears to be characterized by undertreatment, and many patients have limited knowledge of modern therapeutic options. Online search behavior can provide valuable insights into unmet informational needs, concerns, and treatment interests at the population level. Objective: This study aims to analyze psoriasis-related online search behavior in Germany to identify knowledge gaps and assess whether commonly searched topics reflect guideline-recommended care. Methods: Google Ads data for psoriasis-related search terms in Germany from May 2022 to April 2024 were systematically evaluated by topic and search volume. In addition, Google Trends data from 2014 to 2024 were analyzed to assess long-term trends. Results: The colloquial term “Schuppenflechte” was searched far more frequently than “psoriasis.” In total, 1,775 search terms were identified, corresponding to up to 70 million searches over two years. Almost half of all location-specific queries referred to the head region, consistent with guideline criteria for systemic therapy. In contrast, interest in evidence-based treatments was low. Searches focused mainly on home remedies (44.5%), while biologics accounted for only 0.4%. Conclusions: Future information campaigns and educational materials should use the lay term “Schuppenflechte”. Online search behavior reveals potential gaps in awareness of guideline-recommended therapies for psoriasis. These findings highlight the need for better patient-oriented, evidence-based digital education.

  • Liability and Standard of Care in AI-Driven Psychiatric Practice: A European Viewpoint

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Artificial intelligence is increasingly entering psychiatric care through decision-support systems, digital phenotyping tools, suicide-risk prediction models, documentation assistants, and conversational agents. These technologies may improve access, consistency, and personalised care, yet they also redistribute clinical authority and complicate liability when harm occurs. This article examines how European law and psychiatric ethics should respond to this shift. It argues that liability in AI-driven psychiatry cannot be understood only as a product-defect issue or only as a malpractice problem. Because psychiatric practice depends on interpretation, testimony, contextual judgment, and therapeutic alliance, the relevant standard of care must remain human, even when technologically augmented. The article advocates an augmented-clinician model in which AI informs but does not replace psychiatric reasoning. After outlining the European regulatory framework, including the AI Act, the Medical Device Regulation, the General Data Protection Regulation, the revised Product Liability Directive, and the European Health Data Space Regulation, the article analyses the implications of the withdrawal of the proposed AI Liability Directive and the persistence of divergent national tort regimes. It then examines psychiatric risk vectors, including automation bias, testimonial injustice, bias in mental health datasets, therapeutic chatbots, suicide prediction tools, passive monitoring, and large language model documentation. The discussion proposes a layered accountability model that links developers, deployers, and clinicians while preserving therapeutic integrity, patient rights, and legal clarity.

  • Background: Illicit substance use poses severe global burdens. Digital interventions utilizing cognitive behavioral therapy offer accessible treatment alternatives, but evidence regarding their design and effectiveness remains fragmented. Objective: This review synthesized current evidence on cognitive behavioral therapy based digital interventions for illicit substance use disorders, focusing on theoretical frameworks, efficacy indicators, and methodological limitations. Methods: Following the PRISMA-ScR guidelines, we searched PubMed, Embase, Web of Science, and PsycINFO for studies published up to February 16, 2026. Eligible studies were empirical trials of digital interventions using cognitive behavioral therapy for illicit substance use disorders. Studies focusing solely on alcohol or nicotine were excluded. Results: A total of 30 studies were included. Interventions targeted polysubstance, cannabis, stimulant, and sedative use. Programs were mainly web- or app-based, often combined with motivational interviewing, contingency management, or mindfulness approaches. Core components were therapeutic modules, self-tracking, rewards, and feedback. While primary outcomes often demonstrated substance use reduction, interpretation was hindered by prevalent reliance on self-reported data, inconsistent baseline controls, and the use of bundled multicomponent packages. Conclusions: Digital interventions based on cognitive behavioral therapy show potential to reduce illicit substance use and support psychosocial recovery. To advance clinical utility, future research must implement substance-specific designs, integrate objective outcome measures, and adopt modular trial designs to isolate active therapeutic mechanisms.

  • Background: Electronic nicotine delivery systems (ENDS) are at the center of global public health debate. China is the largest producer of e-cigarettes, while the United States (U.S.) has the largest consumer market, yet analyses comparing news coverage of ENDS in China and the U.S. remain limited. Objective: The primary objective of this study is to identify and compare dominant themes in ENDS-related news coverage across leading broadcast-branded digital outlets in China and the United States, and to assess how these themes and coverage volume changed over time. Methods: We conducted a thematic analysis of 470 ENDS-related stories from January 1, 2020, to July 30, 2025, from four leading broadcast news digital media platforms: CNN.com and FoxNews.com in the U.S.; CCTV.com and ifeng.com in China. Using a single-theme approach, coders identified core themes for each article based on prespecified rules and a hierarchical decision structure. Frequencies and proportions of each core theme were summarized for the overall sample and stratified by country. Pearson chi-square tests and binary logistic regression models were conducted to examine cross-national differences with false discovery rate (FDR)-adjusted p-values. Temporal changes in themes were examined and visualized. Results: In U.S. coverage, the most prevalent themes were policy and regulatory governance (32.1%), youth appeal, flavors, and school responses (22.4%), and health risks, harms, symptoms, and dependence (13.9%). In Chinese coverage, the most prevalent themes were commercial practices and market dynamics of ENDS (26.0%), policy and regulatory governance (23.4%), and enforcement and compliance (15.7%). Cross-national differences in themes were consistently observed between the two countries. Between 2020 and 2025, coverage in China transitioned away from commercial and market themes toward greater focus on illicit substances and enforcement, while U.S. coverage showed a relatively stable focus on the commercial market with a gradual increase in enforcement-related reporting. Conclusions: Broadcast news in China and the U.S. may actively shape how ENDS are defined as a public issue and what policy responses appear legitimate. Chinese coverage tends to stress commercial activity and enforcement, whereas U.S. coverage more often foregrounds youth risks and regulatory debates. These distinct thematic patterns may influence risk perceptions and policies in each country and are important to consider in comparative media and public health research.
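
    The abstract above reports FDR-adjusted p-values without naming the adjustment procedure. As an illustration only, the following minimal sketch assumes the common Benjamini-Hochberg step-up method; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the study's FDR adjustment is not specified in the abstract;
    BH is used here only as a common example of such a procedure.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per theme comparison (not study data).
hypothetical_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted = benjamini_hochberg(hypothetical_p)

for raw, adj in zip(hypothetical_p, adjusted):
    print(f"raw={raw:.3f}  adjusted={adj:.5f}")
```

    In this hypothetical set, the third and fourth comparisons fall below 0.05 before adjustment but not after it, which is the kind of change an FDR correction can produce when many themes are tested at once.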

  • Digital health interventions to prevent post-traumatic arthritis after traumatic knee injury: a scoping review

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: Traumatic knee injuries (TKI) are common and are associated with a 4- to 6-fold increased risk of post-traumatic knee osteoarthritis (PTOAK) over the subsequent 15–20 year period. There is clear evidence that risk can be reduced, but long-term care availability is limited, prompting the development of digital health interventions (DHIs) such as wearable devices, telehealth innovations and mobile apps. Objective: To evaluate existing DHIs against the OPTIKNEE consensus guidelines for PTOAK prevention and investigate adoption into practice. Methods: A search of 7 online databases and the grey literature was completed from inception to 03/06/2025, complemented by hand searching government, charity and university websites for reports and technical prototype papers concerning DHIs to support care after TKI. DHI features were mapped to the OPTIKNEE recommendations, evaluated against the health-technology pathway to identify development stage, and implementation analysed using Normalisation Process Theory (NPT). Results: 81 reports, 53 peer-reviewed and 28 other, concerning 49 distinct DHIs were found. They were designed for injuries of the anterior cruciate ligament (ACL, n=12); ACL meniscus (n=15); meniscus (n=3); ACL or meniscus (n=2), bone (n=2), patella dislocation (n=1), and 14 were non-specific. No DHIs addressed all OPTIKNEE recommendations; however, the eight most complete reported 4/7 components, including exercise, information provision, patient reported outcome measures, goal setting and overall patient outcome. A remote, self-assessed strength evaluation was not reported in any DHI. NPT analysis typically demonstrated low DHI adoption levels, and no clear correlation with health technology pathway stage. The DHI with the highest adoption into routine practice, according to NPT, was ‘getUbetter’ with 56% positive scores.
Conclusions: There are many available, or developing, DHIs but none include the content recommended by OPTIKNEE to reduce the risk of PTOAK. Further, there is negligible evidence of DHIs being adopted into usual care. There is a clear need to develop guideline-compliant DHIs to support effective prevention.

  • Providing consultation recordings to patients in German routine cancer care: A mixed-methods pilot study

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: The provision of audio recordings of medical encounters to patients, referred to as consultation recordings, is a well-established intervention to address information needs like recall and comprehension in cancer care. Despite these benefits, consultation recordings are not routine practice. Furthermore, research on consultation recordings in Germany is lacking. Objective: This study aims to pilot test consultation recordings in routine cancer care in Germany and assess feasibility of implementation and perceived effects from patients’ perspective. Methods: Using a sequential mixed methods approach, we assessed consultation recordings’ use, usability, acceptability, appropriateness, influencing factors, and perceived effects. Consultation recordings were piloted in an outpatient setting. Adult cancer patients were eligible to participate. Four weeks after the recorded consultation, participants received a quantitative questionnaire. In addition, a selection of participants were qualitatively interviewed. Quantitative data was analyzed using descriptive statistics, qualitative data using a combination of Practical Thematic Analysis and qualitative content analysis. Results: Ninety-seven consultations were audio-recorded and provided to patients. Seventy participants returned the quantitative survey (response rate 72.2%) and 16 participated in qualitative interviews. Most participants listened to the consultation recording and experienced improvements in recall, comprehension, and feeling informed. Routine implementation of consultation recordings was desired by many. The results suggest that patients perceive consultation recordings as feasible. However, we encountered organizational implementation challenges. Conclusions: This study provides initial evidence on the patient-perceived feasibility of consultation recordings in German routine cancer care. Consultation recordings have the potential to help patients navigate complex medical information. 
However, organizational implementation challenges hinder their uptake. Future research could investigate technically easier solutions suited to the German healthcare context.

  • Background: Medication nonadherence after percutaneous coronary intervention (PCI) remains a major barrier to secondary prevention. Prior SMS-based interventions have shown inconsistent results, often limited to reminders without addressing behavioral or psychological determinants. Objective: To evaluate the effectiveness and feasibility of a theory-informed, WeChat-based text-messaging program for improving adherence and patient-reported outcomes in post-PCI patients. Methods: A nonrandomized, quasi-experimental parallel-group study with a mixed-methods design was conducted from July 2022 to April 2023 at a tertiary hospital in Hangzhou, China. Patients were allocated by ward admission to intervention or control groups. The intervention comprised a 12-week WeChat program with daily medication reminders and educational messages mapped to COM-B domains and behavior change techniques. The primary outcome was adherence (MMAS-8); secondary outcomes were medication beliefs (BMQ-Specific), self-efficacy (SEAMS), and health status (SAQ). Outcomes were measured at baseline, discharge, and 12 weeks by blinded assessors. Analyses used ANCOVA with baseline adjustment and multiple imputation. After 12 weeks, semi-structured telephone interviews with intervention participants explored usability, message clarity and relevance, reminder helpfulness, timing and tone, and unmet needs. Interviews were conducted by an independent researcher, purposively sampled until thematic saturation, and thematically analyzed from verbatim transcripts. Results: Of 180 patients screened, 92 were enrolled and 87 (94.6%) completed follow-up. At 12 weeks, adherence was higher in the intervention group (adjusted mean MMAS-8, 7.38 vs 6.20; mean difference, 1.18 [95% CI, 0.95–1.41]; P<.001). Secondary outcomes also favored the intervention, including improved beliefs, self-efficacy, and quality of life. 
Feasibility benchmarks were exceeded, with deliverability 97.9%, retention 95.7%, and acceptability scores >8/10. Interviews confirmed usability and highlighted the need for personalization. Conclusions: A theory-informed, WeChat-based program improved adherence, beliefs, self-efficacy, and quality of life after PCI. The intervention was feasible, acceptable, and scalable. Randomized trials are warranted to confirm long-term effectiveness and cost-effectiveness. Clinical Trial: Chinese Clinical Trial Registry (ChiCTR2200061353) https://www.chictr.org.cn/bin/project/edit?pid=172238

  • From Measurement Failure to Privacy Infrastructure: Reframing Contact Tracing Governance for the Next Pandemic

    Date Submitted: Apr 27, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Effective infectious disease control rests on a foundational principle: no measurement, no understanding; no understanding, no control. The COVID-19 pandemic exposed, with devastating clarity, how thoroughly this principle can fail in public health practice. Transmission chains spread invisibly; contact histories, mobility patterns, and biosignals essential for control were never systematically collected. The necessary sensors and digital technologies existed — the fundamental reason measurement failed was not the absence of technology, but the absence of privacy infrastructure that would allow people to share data with confidence. This failure has structural roots. The objects of measurement in infectious disease control are not physical phenomena but human beings, and measurement therefore inevitably engages the core of privacy: contact histories, social relationships, and bodily states. This asymmetry — whereby greater measurement precision deepens privacy intrusion — manifested acutely in COVID-19 contact tracing apps. Designs that prioritized privacy lost epidemiological utility; designs that prioritized utility were rejected through public distrust. Neither direction achieved sufficient measurement. This Viewpoint reframes the problem. Privacy protection is not a constraint that impedes infectious disease control; it is the enabling condition upon which effective measurement depends. Existing regulations and technical approaches were not designed from this premise, and have therefore been unable to break the cycle of structural distrust. As one institutional approach to filling this gap, we present VRAIO (Verifiable Record of AI Output), which integrates democratic rule-setting, metadata declaration, independent third-party verification, tamper-proof ledgers, and violation-deterrence incentives. 
When privacy infrastructure is established, the foundational scientific principle "no measurement, no understanding; no understanding, no control" will begin to operate freely in infectious disease control for the first time. This opens the path toward high-resolution epidemiology and precision intervention: a new public health paradigm that simultaneously pursues strengthened disease control and the preservation of individual autonomy and social freedom, without dependence on blanket social restrictions.

  • From Innovation to Responsibility: A Scoping-Umbrella Review of Artificial Intelligence in Mental Health

    Date Submitted: Apr 26, 2026
    Open Peer Review Period: Apr 27, 2026 - Jun 22, 2026

    Background: Artificial intelligence (AI) has rapidly transformed psychological research and mental health practice through advances in machine learning, deep learning, natural language processing, and large-scale data analytics. AI-based systems are increasingly employed to support psychological assessment, diagnosis, intervention, monitoring, and clinical decision-making. This rapid expansion has resulted in a substantial and growing body of empirical and review literature. However, despite the accelerated development of AI applications in psychology, discussions surrounding ethics, legal frameworks, and governance have not progressed at a comparable pace. Objective: Concerns related to privacy, transparency, data security, informed consent, algorithmic bias, and emotional safety remain particularly critical in psychological contexts, where AI systems may influence highly sensitive aspects of human experience. Given the rapidly evolving and heterogeneous nature of the literature, this study aimed to conduct a scoping umbrella review to map the breadth of existing evidence, identify key thematic domains, and highlight gaps in the application of AI in mental health. Methods: Abstracts of 1,827 records retrieved from Web of Science (n = 50), PubMed (n = 677), and Scopus (n = 1,100) were screened. Following a full-text assessment of 218 potentially eligible studies, a total of 182 review articles were included in the final synthesis. Results: The findings indicate that research on AI in psychology is primarily organized around five thematic domains: intervention, diagnosis, prediction, theoretical framework, and ethical issues. The intervention domain represents a substantial proportion of the literature, suggesting that AI is most frequently examined in relation to applied psychological functions. In contrast, ethical issues are comparatively underrepresented. 
This pattern reflects a broader imbalance in which the field is progressing through application-driven innovation, while ethical reflection remains relatively limited and often theoretical. Although AI-based interventions and assessment tools are expanding rapidly, only a small number of reviews have systematically examined how these systems address core ethical concerns, including informed consent, data privacy, accountability, cultural bias, and emotional safety. Furthermore, the increasing reliance on cloud-based infrastructures introduces additional challenges related to confidentiality, cross-border data transfers, third-party access, and system reliability in sensitive clinical settings. Conclusions: Taken together, these findings underscore the risk of integrating AI technologies into psychological practice without sufficient ethical, clinical, and infrastructural safeguards. Future research should prioritize the development of evidence-based and context-sensitive ethical frameworks, alongside the exploration of alternative implementation models—such as local or hybrid infrastructures—that can better balance scalability, privacy, and institutional control.

  • Background: With the rapid development and widespread application of artificial intelligence (AI) technology, AI has demonstrated high accuracy and reliability in medical practice, and patients' trust in algorithms has gradually increased. However, in clinical practice, disagreements may still arise between algorithmic recommendations and clinical expert experience, and such disagreements can affect patients' trust. To date, the impact of these disagreements on patients’ medical trust and the strategies for addressing them have not been systematically reviewed. Objective: To systematically map the impact of disagreements between AI recommendations and clinical expert judgment on patients’ medical trust, identify influencing factors based on Mayer’s integrative model of organizational trust, and summarize strategies to enhance trust. Methods: Following the Joanna Briggs Institute (JBI) scoping review methodology and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guideline, we systematically searched Web of Science, PubMed, Embase, Scopus, and EBSCO up to March 2026, limited to English-language literature. Studies focusing on patients' trust in the context of disagreements between AI and expert opinions were included. Data were charted using the Population, Concept, Context (PCC) framework. Guided by Mayer’s integrative model of organizational trust, influencing factors were analyzed through a framework synthesis approach across the dimensions of ability, benevolence, integrity, and trustor propensity. The protocol was pre-registered on OSF (Registration DOI: 10.17605/OSF.IO/AHSGD). Results: A total of 2,630 records were identified, and 26 studies were ultimately included after screening, including six qualitative studies, seven quantitative studies, three mixed-methods studies, five theoretical studies, and five review articles.
These studies were conducted across 10 countries and were published mainly between 2022 and 2026. Disagreements were concentrated in clinical diagnosis and risk assessment, treatment planning and medication decision-making, clinician–patient communication and intelligent interaction, as well as emerging application scenarios. In situations of disagreement, patients commonly expressed skepticism toward both algorithms and experts; overall, however, patients tended to trust experts more than algorithms. Data security and privacy risks, insufficient communication, AI accuracy and reliability, demographic and socioeconomic characteristics, and patients’ disease and health status were identified as high-frequency factors influencing patients’ medical trust. Six trust-enhancing strategies were extracted: transparency and explainability, patient participation and shared decision-making, clinician–patient communication and role positioning, institutional regulation and governance, education and capacity building, and privacy protection and data security. Conclusions: In situations of disagreement between AI and clinical experts, patients’ medical trust is dynamically shaped by ability, benevolence, integrity, and multiple interacting individual and contextual factors. Strengthening transparency, communication, and governance is essential for fostering trust in human–AI collaborative healthcare.

  • Deep Learning Algorithms for Predicting Intraoperative Hypotension: A Systematic Review and Meta-Analysis

    Date Submitted: Apr 25, 2026
    Open Peer Review Period: Apr 25, 2026 - Jun 20, 2026

    Background: Intraoperative hypotension (IOH) is associated with myocardial injury, acute kidney injury, perioperative stroke, and 30-day mortality, yet conventional blood pressure monitoring remains reactive rather than anticipatory. Deep learning (DL) algorithms applied to continuous physiological waveforms represent a rapidly expanding paradigm for early IOH prediction, but the comparative performance of distinct DL architectures and the influence of prediction-window length, input data modality, IOH reference standard, and analysis unit on diagnostic accuracy have not been systematically synthesised. Objective: To quantify the pooled diagnostic accuracy of DL-based IOH prediction models and to identify methodological and clinical factors that modify their performance. Methods: PubMed, Embase, Web of Science, and the Cochrane Library were searched through March 2026. Methodological quality was appraised with the PROBAST+AI tool and overall certainty of evidence with the GRADE framework. A bivariate random-effects model generated pooled sensitivity, specificity, and the area under the summary receiver operating characteristic (SROC) curve, with heterogeneity quantified by τ²(Se), τ²(Sp), and the inter-study correlation ρ. Threshold effect was tested with Spearman’s correlation, publication bias with Deeks’ test, and clinical utility with Fagan’s nomogram. Prespecified subgroup analyses (prediction window, DL architecture, input modality, IOH reference standard, analysis unit) and Bayesian random-effects meta-regression explored heterogeneity sources. Results: Twelve studies were included; nine contributed 22 validation datasets to the quantitative synthesis. The pooled sensitivity was 0.78 (95% CI 0.73–0.81), specificity 0.88 (0.82–0.92), and SROC-AUC 0.87 (0.83–0.90); the diagnostic odds ratio was 24.7 (16.1–37.9), positive likelihood ratio 6.31, and negative likelihood ratio 0.26. 
Heterogeneity was τ²(Se) = 0.25, τ²(Sp) = 1.04, and ρ = −0.28; no significant threshold effect was detected (Spearman ρ = 0.29, P = 0.20). The 5-minute window achieved the highest performance (sensitivity 0.81, 95% CI 0.77–0.85; specificity 0.91, 0.84–0.95). Meta-regression identified DL architecture as the only significant moderator of specificity (P = 0.02), with hybrid CNN-RNN exceeding pure CNN (β = 1.77, 95% CI 0.45–3.09); no covariate significantly moderated sensitivity. Deeks’ test showed no statistically significant publication bias (P = 0.06). At a 10% pre-test probability, post-test probabilities were 41% (positive) and 3% (negative). GRADE certainty was Low. Conclusions: Deep learning models for IOH prediction achieve moderate diagnostic accuracy, with hybrid CNN-RNN architectures and 5-minute prediction windows showing the most favourable performance. The universal absence of formal calibration assessment, scarce external validation, and geographic concentration of the evidence base constrain immediate clinical translation. Prospective multinational validation with mandatory calibration reporting and patient-level evaluation is required before DL-based IOH alerts can be safely integrated into perioperative decision support. Clinical Trial: PROSPERO CRD420261377604.
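
    The post-test probabilities quoted above follow from the arithmetic that Fagan's nomogram performs graphically: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. A minimal sketch of that calculation, using only the figures reported in the abstract (the function and variable names are illustrative):

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Values reported in the abstract.
PRE_TEST = 0.10   # 10% pre-test probability
LR_POS = 6.31     # positive likelihood ratio
LR_NEG = 0.26     # negative likelihood ratio

positive = post_test_probability(PRE_TEST, LR_POS)
negative = post_test_probability(PRE_TEST, LR_NEG)

print(f"after a positive prediction: {positive:.0%}")
print(f"after a negative prediction: {negative:.0%}")
```

    With these inputs the calculation reproduces the reported 41% and 3%.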

  • Background: Youth experiencing mental health and substance use (MHSU) challenges face notable barriers to accessing adequate care, including limited service availability and a lack of youth-centered, evidence-based treatments. The prevalence of digital device ownership among youth presents an opportunity to help bridge this treatment gap using these accessible tools to deliver scalable evidence-based interventions. Self-guided digital MHSU interventions (ie, self-directed, technology-delivered psychosocial interventions not requiring clinical or technical support) present interesting opportunities for service providers or youth looking for self-help-style interventions. Many self-guided digital interventions have been developed for youth, yet little guidance on the effectiveness of these interventions exists for those looking to leverage evidence-based self-guided tools for these populations. Objective: This systematic review aimed to synthesize the evidence for self-guided digital MHSU interventions developed for youth dealing with MHSU disorders. Methods: Five major databases were searched for controlled trials of self-guided digital interventions targeting MHSU disorders in youth (12 - 25 years). Search concepts included: youth, mental health/substance use, digital, intervention, effectiveness. Eligible studies included trials with passive controls (determining initial effectiveness) or active comparators (determining superiority). Data describing trial characteristics, intervention characteristics, and MHSU outcomes were extracted. Risk of bias was assessed using Cochrane Risk of Bias 2.0 tool. Findings were synthesized narratively, both by MHSU disorder and by digital modality. Results: The search yielded 15,828 unique records; 76 trials met inclusion criteria. 
Interventions targeted symptoms relating to depression, alcohol use disorder, anxiety, eating disorders, tobacco use disorder, cannabis use disorder, post-traumatic stress disorder (PTSD), suicide, obsessive-compulsive disorder, and attention-deficit/hyperactivity disorder (ADHD). Interventions used diverse modalities: web-based interventions, mobile applications, text messages, computers, chatbots, video games, and wearables. Across disorders, 33 of 74 (45%) initial effectiveness evaluations and 3 of 25 (12%) superiority evaluations were positive. Across digital modalities, 33 of 71 (46%) initial effectiveness and 3 of 21 (14%) superiority evaluations were positive. Notably, in superiority evaluations, whether classified by disorder or modality, the majority of digital interventions performed similarly to their active comparators (71% - 72%). Positive evaluations of initial effectiveness were more common for interventions targeting PTSD and tobacco use disorder, and for interventions using chatbot- or computer-based interventions. Positive evaluations were limited for ADHD- or cannabis-focused interventions. Three of 76 trials (4%) were rated as low risk of bias; the remainder had some concerns or high risk of bias. Conclusions: This review synthesizes evidence for self-guided digital MHSU interventions and demonstrates their potential to address MHSU disorders in youth. Although more rigorous evaluations are still needed, this review identifies numerous effective self-guided digital interventions that can be used to help youth struggling with MHSU disorders, and identifies trends within modalities that might be considered for future MHSU intervention development.

  • Background: Providing care to a family member or friend with a serious illness like cancer increases risk for poor physical, psychological, and functional health outcomes. Despite their critical role, family caregivers (FCGs) are rarely screened in clinical settings for the wide range of factors that may put them and the person they care for at risk for poor outcomes. Mobile health (mHealth) applications can efficiently facilitate access to high-quality health information for FCGs; however, few are clinically integrated. Objective: This study aimed to evaluate the usability of CareCheck, an mHealth-based digital risk screening tool designed to enable FCGs' self-awareness of potential caregiving-related risks for adverse health and psychosocial outcomes and to support health care professionals in personalizing interventions that address FCGs' specific risk factors. Methods: We conducted a usability testing study of CareCheck using two evaluation methods: quantitative measurement with a modified 5-item Mobile Health App Usability Questionnaire (MAUQ) and exploratory qualitative thematic analysis based on feedback from FCGs and trained staff. FCGs of individuals with gynecologic cancer were recruited through the inpatient unit and the outpatient gynecologic oncology clinic of a Comprehensive Cancer Center. Participants completed CareCheck and the usability questionnaire via the mHealth app installed on tablets. Staff observed the assessment process and provided feedback. Results: A total of 56 FCGs and 2 trained staff participated in the usability study. The mean MAUQ score was 6.49 (SD = 1.06) out of 7, indicating high usability. Qualitative analysis identified recommendations in three categories: 1) Improvements to CareCheck; 2) Perceptions of CareCheck’s Usability and Functionality; and 3) Clinical Implementation Considerations for CareCheck. Conclusions: FCGs and staff found CareCheck to be user-friendly and easy to navigate.
While further iterations are needed to refine content and optimize integration with clinical workflows, CareCheck demonstrated potential as a clinically integrated tool for identifying and addressing FCG risk for poor social, psychological, or health outcomes in gynecologic oncology care settings.

  • Background: Hypertension remains a predominant global risk factor for cardiovascular disease. Conventional follow-up models frequently fail to address the requirements for real-time monitoring and sustained intervention, whereas mobile health (mHealth) offers a transformative trajectory for chronic disease management. Despite a surge in relevant literature, the diversity of intervention modalities and the fragmented nature of existing evidence necessitate a systematic synthesis. Objective: This study aimed to comprehensively evaluate the efficacy of mHealth in hypertension management through a systematic review combined with evidence mapping, identifying research gaps to provide evidence-based insights for precision nursing and future research directions. Methods: A systematic search was conducted across PubMed, Web of Science, Cochrane Library, and Embase for randomized controlled trials (RCTs) involving mHealth interventions for hypertension, with the search period extending through February 2026. Literature was screened according to PICOS criteria, and methodological quality was appraised using the Cochrane Risk of Bias tool (RoB 1.0). Visual analytics, including Sankey diagrams and bubble plots, were employed to characterize the associations between intervention modalities and clinical outcomes. The study protocol was prospectively registered on the Open Science Framework (URL: https://osf.io/2vkwu). Results: A total of 106 publications (comprising 108 RCTs) were included. Publication volume has increased significantly since 2018, with the United States (31 papers) and China (19 papers) being the primary contributors. The intervention paradigm has evolved from rudimentary SMS reminders to a "closed-loop" management model centered on "App + Remote Monitoring," which demonstrates the most robust and consistent positive evidence for blood pressure (SBP/DBP) control and goal attainment rates.
Blood pressure parameters occupied the "core evidence layer," while therapeutic adherence and disease knowledge formed the "behavioral evidence layer." Conversely, BMI, mental health, and quality of life remained in the "peripheral evidence layer," characterized by a notably higher proportion of non-significant results. Methodological quality was generally moderate-to-high with robust randomization; however, the implementation of blinding faced prevalent high risks due to the inherent nature of the interventions. Conclusions: mHealth significantly enhances hypertension management efficacy through a digital "monitoring-feedback-adjustment" loop, yet it encounters bottlenecks in achieving profound lifestyle modifications (e.g., weight management) and psychological interventions. Clinical decision-making should prioritize multicomponent interventions featuring real-time interaction. Future research should focus on long-term (>1 year) follow-up and cost-effectiveness transformation in resource-limited settings.

  • Background: Molecular tumor boards (MTBs) generate highly technical recommendations. The language used in their protocols is rarely accessible to patients. Lay-language patient protocols could support patient-clinician communication, yet manual production is difficult to sustain in high-volume oncology settings. Large language models (LLMs) may offer scalable drafting assistance, yet clinical usability remains largely uninvestigated under real-world deployment constraints. Existing evaluations rely predominantly on synthetic data or closed-source models that are incompatible with strict data protection requirements. Objective: This study evaluated whether open-weight LLMs can provide clinically usable drafting support for German MTB patient protocols under real-world deployment constraints and developed a transferable evaluation framework for patient-facing text generation. Methods: Eight open-weight LLMs were evaluated under zero-shot (A1) and one-shot (A2) prompting with constrained decoding, which ensures section-schema compliance. Automatic evaluation used ROUGE-1, BERTScore-F1, WSTF4, and DistilBERT-based complexity using a corpus of 316 MTB protocols and 47 expert-written patient protocols. For expert evaluation, seven medical oncologists evaluated 50 protocols from the best-performing model across three ISO 9241-11 usability dimensions using fine-grained error annotation, perceived post-editing effort (PPEE), and net promoter score (NPS). Critical errors were defined as bearing the risk of patient harm. Results: Llama-3.3-70B-Instruct achieved the strongest automatic performance. Across models, A2 significantly improved most automatic metrics compared to A1. However, expert usability evaluation showed the opposite picture: the proportion of protocols containing at least one critical error doubled under A2 compared with A1 (40% vs. 20%), and the dominant error type shifted from language errors (37%) to factual errors (48%).
Overall, 6.1% of the annotated paragraphs contained errors. Median PPEE was 2 (low) and median NPS was 7. Detractors (46%) outweighed promoters (29%), which signals clinical hesitation toward routine adoption. Conclusions: Prompting strategies that improve automatic metrics can simultaneously increase the number of critical errors. Surface-level metric gains were, therefore, insufficient proxies for clinical safety. Nonetheless, the low paragraph-level error rate and favorable PPEE suggest that structured open-weight LLM generation may be a useful drafting aid in a clinician-supervised setting. The proposed evaluation framework establishes a text-quality-focused basis for future assessment of patient-facing LLM applications in real-world clinical settings.
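
As a rough illustration of how the promoter and detractor shares reported above combine into a single score, the sketch below applies the conventional net promoter calculation (percentage of promoters minus percentage of detractors). The abstract does not state its rating cut-offs or a net score, so the conventional definition and the function name `net_promoter_score` are assumptions made here for illustration only.

```python
def net_promoter_score(promoter_pct: float, detractor_pct: float) -> float:
    """Conventional NPS: percentage of promoters minus percentage of detractors."""
    return promoter_pct - detractor_pct


# Percentages as reported in the abstract above.
promoter_pct = 29
detractor_pct = 46

nps = net_promoter_score(promoter_pct, detractor_pct)

# Remaining share of raters, assuming the three categories are exhaustive.
passive_pct = 100 - promoter_pct - detractor_pct

print(nps, passive_pct)
```

Under these assumptions the net score is negative, which is consistent with the abstract's reading that detractors outweighed promoters.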

  • Background: As the digital transformation of healthcare systems accelerates, interest in the Social and Digital Determinants of digital health tools is growing. Objective: This OSF-registered scoping review maps how these determinants are assessed in Germany and France and identifies research implications. Methods: Using the PiCo framework, PubMed, Scopus, and Web of Science were searched for peer-reviewed primary studies (2015-2025; English, German, French) on access and use of digital health tools in Germany or France. Eligible studies were screened and extracted using a standardized template in the Covidence software. Descriptive and exploratory analyses were conducted in SPSS. Exploratory correlation-based heatmaps were used to visualize recurring determinants, barriers, and facilitators shaping digital health access and use. Results: Seventy studies (59 from Germany and 11 from France) were included; most were quantitative and cross-sectional. Frequently reported social determinants of health consisted of age, gender, education, geographic location, and health literacy, while central digital determinants of health included digital literacy, trust in digital tools, perceived usefulness, usability, and access to digital infrastructure. Comparative analyses revealed both shared patterns and country-specific emphases, with German studies more frequently addressing eHealth literacy and functional aspects of digital health, while French studies placed greater emphasis on social environment, housing conditions, and ethical considerations. Conclusions: Whether social or digital, most determinants were either person- or technology-centered. While reflecting an emerging field focused on individual-level factors, this emphasis risks overlooking broader, multi-level mechanisms of social inequalities that may also shape digital health access and use. Clinical Trial: not applicable.

  • Background: Digital interventions may help reduce exacerbations and increase adherence to the Chronic Obstructive Pulmonary Disease (COPD) action plan by providing opportunities for self-monitoring of symptoms and self-management behaviors, and remote support from a case manager. Objective: To compare the impact of a COPD Patient Health Portal (PHP) on action plan adherence during exacerbations vs. usual care. A secondary objective was to evaluate the association of COPD severity, sociodemographic factors, knowledge, self-efficacy, anxiety, and depression with adherence to the action plan. Methods: This trial was a 12-month parallel, 2-arm RCT. Participants recruited from a speciality clinic were assigned to either: (1) COPD PHP with self-monitoring and automated feedback and online communication with a nurse case manager and usual care, or (2) usual care, which consisted of self-management support delivered by a nurse case manager with access to online educational material (Living Well with COPD). Analyses followed the intention-to-treat principle. The primary outcome was self-reported adherence to the exacerbation action plan. Secondary outcomes included COPD self-efficacy, HRQL, healthcare utilization, anxiety, depression, technology acceptance (TAM), and PHP use. Results: Forty-nine participants were randomized to either the intervention (n=24) or control (n=25) arm. Groups were similar in age, sex, BMI, education, marital status, smoking status, spirometry and GOLD classification. While a greater proportion of intervention participants were adherent to the action plan (47%, 20/43 exacerbations) compared to the control arm (37%, 18/48 exacerbations), differences were not statistically significant. Twelve percent of exacerbations in the intervention arm led to the participant contacting a health professional compared to 4% in the control group.
The average exacerbation length among individuals who contacted a healthcare professional during the exacerbation and took a medication was significantly lower in the intervention (days=10.6 ± 2.2, n=5) vs. control (29.5 ± 6.4, n=2) group. There was no effect of treatment arm on adherence to action plan at the exacerbation level (OR 1.16, 95% CI 0.44-3.03) nor of GOLD severity group (GOLD 3 vs 2: OR 1.6 and 95% CI 0.5-5.7, GOLD 4 vs 2: OR 1.3 and 95% CI 0.4-4.2) in adjusted models. Conclusions: A 12-month PHP intervention did not significantly increase adherence to the exacerbation action plan. There was a trend for those using the PHP to have fewer exacerbations in the first 4 months and with a shorter duration when contacting their HCP resulting in taking medication.

  • Background: Adults may experience subjective cognitive decline (SCD). However, it is unclear whether SCD is related to measurable cognitive impairment, particularly in women aged 40 to 60 and in early dementia. Further, Medicare has mandated assessment of cognitive and memory function in individuals over 65 as part of the Medicare Annual Wellness Visit. In order to assess possible impairment and change over time, efficient, objective measures of SCD are needed. Objective: To assess the relationship between performance on an online continuous recognition task (CRT, MemTrax) and age, sex, and memory concern. Methods: This study evaluated CRT performance in participants aged 21-99 who enrolled in an online program (HAPPYneuron) to measure mental functions, including those who reported concerns about them. This program asked participants if they had complaints about their memory, and then the program offered them the opportunity to assess cognition using the CRT. This CRT instructs individuals to attend to visual stimuli (50 images) and respond as quickly as possible to repeated images (25 images). The CRT components were used to measure learning and memory (as related to HITs, response to a repeated image), executive function (as related to CRs, correctly not responding to an initial image presentation), and processing speed (HIT-RTs, average response time to HITs). Results: The analysis included 18,178 participants (5,795 males, 32%; 12,383 females, 68%) who answered the sex, age, and memory questions. There were 11,786 (65%) between 40 and 70 years of age. Females outnumbered males by over two-fold, beginning about 35 years of age, peaking at 55 years of age at over three-fold, and falling below two-fold at about 65 years of age. Approximately 30% more men complained of memory problems than those who did not, primarily among those 30-60 years old. About 80% more women complained of memory problems, over two-fold more than women who did not, among those 30-50 years old.
The number of HITs, number of CRs, and HIT-RTs varied little between men and women. While those without memory complaints generally performed better than those with memory complaints, there was little difference in performance levels for each group between males and females. For all groups, there was a gradual reduction of performance over age for HITs and CRs and a slowing of HIT-RTs. Conclusions: Most subjects were 40-65, more than twice as many females, suggesting that these demographics have a relationship to concern about SCD. However, there was little difference between males and females for the various CRT components, though SCD was associated with impairment. Age-related declines were progressive, the largest being in slower processing speed, presumably to compensate for age-related changes in cognitive function. Present results suggest clinicians may use these metrics to quantify patient concerns expressed in the primary care setting. Clinical Trial: none

  • Momentary Mood State Detection using Smartwatches: Algorithm Development and Validation

    Date Submitted: Apr 20, 2026
    Open Peer Review Period: Apr 21, 2026 - Jun 16, 2026

    Background: Mental health encompasses not only chronic conditions such as depression or anxiety, but also acute fluctuations in mood that unfold over minutes to hours and can disrupt daily functioning. These transient states, such as sudden fatigue, irritability, or low energy, remain largely invisible to current digital health approaches, which typically aggregate behavioral and physiological data over days or weeks to detect trait-level conditions. The ability to detect momentary mood shifts in real time carries significant clinical promise: continuous affective monitoring could enable early detection of mental health crises, support clinical decisions and clinical trials with continuous mood measurements, and improve occupational safety with detection of states like fatigue or confusion. However, affective computing research has demonstrated that while physiological signals carry information relevant to mood, most prior work relies on controlled laboratory settings where performance degrades substantially in naturalistic environments, or employs research-grade devices with proprietary sensors unavailable on consumer hardware. Bridging this gap between laboratory-validated sensing and real-world momentary mood detection is essential for translating these clinical possibilities into practice through just-in-time adaptive interventions. Objective: This study investigates whether continuous sensing from a low-cost, open-source smartwatch can support detection of multi-dimensional momentary mood states in naturalistic settings, using personalized models with on-device computation.
Methods: We conducted a 7-day field study in which participants (N=10) wore Bangle.js 2 smartwatches that continuously collected physiological and contextual data, including heart rate, accelerometry, barometric pressure, temperature, and GPS, while prompting hourly mood self-reports using the Brunel Mood Scale (BRUMS) across six mood dimensions (tension, depression, anger, vigor, fatigue, confusion) and additional affective and physical states. All feature extraction was performed on-device. We developed personalized mood detection models using best-subset regression across multiple feature combinations. Results: Personalized models decoded momentary states with mean R2 values ranging from 0.09 (pain) to 0.31 (vigor). Fatigue, happiness, vigor, and depression were the most reliably decoded dimensions (mean R2 = 0.26–0.31). Cross-subject decoding was substantially lower, confirming that personalization is essential for accurate mood inference. Including privacy-preserving location features did not significantly improve prediction accuracy beyond physiological and contextual sensors alone. Conclusions: This work demonstrates that a broad range of momentary mood states can be decoded from low-cost, open-source wearable sensors as people go about their daily lives, bridging the gap between controlled laboratory studies and real-world momentary assessment. The finding that personalized models substantially outperform generalized approaches underscores the need for individual calibration in affective computing systems. The on-device, privacy-preserving architecture establishes a foundation for future closed-loop adaptive interventions in clinical and occupational contexts, including continuous monitoring of high-risk psychiatric populations, early warning systems for substance use relapse, and real-time assessment of cognitive and emotional fitness in safety-critical work environments. Clinical Trial: N/A

  • Background: Patients with vestibular schwannoma often experience postoperative vestibular dysfunction, including vertigo, dizziness, and imbalance, which severely impair daily functioning and quality of life. Effective and accessible rehabilitation strategies are therefore essential. Objective: To evaluate the effects of a mobile health-based vestibular rehabilitation program in patients following unilateral vestibular schwannoma surgery. Methods: A prospective randomized controlled trial was conducted. A total of 60 patients who underwent unilateral vestibular schwannoma surgery at the Otology Center, Eye & ENT Hospital, Fudan University, from October 2023 to May 2025 were enrolled and randomly assigned to either the control group or the intervention group (n = 30 each). Both groups underwent a 90-day vestibular rehabilitation program comprising gaze stability exercises, balance training, and gait training. The intervention group received self-assessment, video-based guidance, symptom recording, and automated adherence monitoring via a customized mobile app, whereas the control group received face-to-face guidance and maintained their records using paper diaries. The primary outcome was the between-group difference in the change in Dizziness Handicap Inventory (DHI) score from baseline (preoperative) to 90 days postoperatively. Secondary outcomes included DHI change at 30 days, visual analog scale (VAS) scores, and incidence of vestibular symptoms. Demographic, baseline, and outcome data were collected at admission and on postoperative days 7, 30, and 90. Intention-to-treat analysis was performed; missing continuous data were handled using multiple imputation, and dichotomous variables were imputed with the last observation carried forward. Independent t‑tests or Mann-Whitney U tests were used for continuous variables, and chi‑square tests for categorical variables.
Results: The 90‑day follow-up primary outcome assessment was completed by 52 of 60 patients (86.7%), with 8 non-responders in the intervention group and 6 in the control group. No significant differences in baseline demographic or clinical data were observed between the two groups, whereas tumor size distribution differed significantly (χ² = –2.513, P = .012), with larger tumors in the intervention group. For the primary endpoint, no significant between‑group difference was observed in the change in DHI score from baseline to 90 days (P>.05). For secondary outcomes, no significant differences were found in DHI change at 30 days or VAS scores at any time point (all P>.05). However, on postoperative day 7, the incidence of postural symptoms was significantly lower in the intervention group than in the control group (53.33% vs 83.33%, χ² = 6.239, P = .012). Conclusions: The mobile health–based vestibular rehabilitation program demonstrated comparable efficacy to conventional face‑to‑face rehabilitation in improving vestibular function and may accelerate recovery during the early phase of unilateral vestibular loss. These findings support the feasibility of mHealth as an alternative approach for postoperative vestibular rehabilitation in patients with vestibular schwannoma. Clinical Trial: Chinese Clinical Trial Registry ChiCTR2200056123; https://www.chictr.org.cn/showproj.html?proj=150939
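
The day-7 comparison of postural symptoms can be checked by hand with a 2×2 Pearson chi-square test. The sketch below is illustrative only: the cell counts (16 of 30 vs 25 of 30) are inferred here from the reported percentages and the group size of 30, not taken from the abstract, and the test is computed without a continuity correction, which is an assumption about the authors' analysis. The helper names are hypothetical.

```python
import math


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def chi_square_p_value_df1(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Rows: intervention, control. Columns: symptoms present, symptoms absent.
# Counts inferred from 53.33% and 83.33% of 30 patients per group (assumption).
chi2 = pearson_chi_square_2x2(16, 14, 25, 5)
p_value = chi_square_p_value_df1(chi2)

print(round(chi2, 3), p_value)
```

With these inferred counts the statistic rounds to 6.239, the value reported in the abstract, and the p-value falls in the neighborhood of the reported P = .012.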

  • From Digital Access to Digital Assurance: Governing Equity in Digital Medicine.

    Date Submitted: Apr 20, 2026
    Open Peer Review Period: Apr 21, 2026 - Jun 16, 2026

    Digital health technologies are often promoted as means to expand access to care and reduce health disparities. Nevertheless, evidence from large-scale implementations indicates that access alone does not ensure equity. Although access initiates opportunities for care, assurance is required to sustain safety and fairness within digital health systems. Software-based and AI-enabled clinical systems may introduce new forms of exclusion, inequitable benefits, and unintended harm, as their performance varies across populations, clinical contexts, and temporal settings. The availability of a digital solution does not guarantee comparable usability or benefit across generations, due to differences in skills, trust, and access. Similarly, territorial disparities arise where internal and peripheral areas experience infrastructural discontinuities and uneven service provision, increasing the risk of amplifying pre-existing inequalities. Health equity in digital medicine should be conceptualized as a system-level property arising from governance, quality assurance, continuous clinical oversight, and the meaningful involvement of affected communities and patient organizations in governance, monitoring, and accountability processes, rather than as a downstream effect of adoption. Clinical and regulatory evidence shows that inequities often stem from lifecycle blind spots, subgroup performance asymmetries, and fragmented post-deployment accountability. An assurance-oriented approach, grounded in continuous validation, real-world monitoring, and predefined pathways for managing change, provides a clinically meaningful and systemically robust framework for achieving equity in digital medicine. 
In this context, freedom and verifiability in digital systems (defined as interoperability, open standards, independent audit capability, and, when proportionate to risk, technical inspectability and surveyability through open-source components or controlled disclosure) represent an essential ethical dimension. Actionable policy levers, including mandatory subgroup monitoring, investment in governance infrastructure, and funding for adaptive oversight frameworks, can serve as effective mechanisms for decision-makers to ensure fair digital health outcomes.

  • Patient Portal Activation Among Neurology Patients in Washington, DC: A Cross-sectional Study

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Patient portals have become essential infrastructure for healthcare delivery following the 21st Century Cures Act, yet adoption remains inequitable. Understanding demographic and geographic determinants of portal activation is critical for addressing digital health disparities, particularly among neurology patients who face unique access barriers. Objective: We examined the demographic, geographic, and neighborhood-level factors associated with patient portal activation among neurology patients at multiple geographic scales in the Washington, DC metropolitan area. Methods: We conducted a retrospective cohort study of 72,417 adult neurology patients seen at two academic medical centers sharing an electronic health record in Washington, DC (February 2021–February 2026). We examined portal activation using multivariable logistic regression and geographic analysis at four nested scales: the metropolitan catchment area, DC’s eight wards, individual census tracts (via geocoded patient addresses), and individual DC residents. Results: Portal activation was 64.7% overall. Activation varied by race/ethnicity (Non-Hispanic White 76.1%, Non-Hispanic Black 57.0%, Non-Hispanic Asian 57.6%, Hispanic 55.0%) and geography (DC Ward 2: 82.0% vs. Ward 7: 48.0%). Ward-level educational attainment (r = 0.948), broadband access (r = 0.889), and income (r = 0.811) were strongly correlated with activation. Within individual wards, Non-Hispanic White patients activated at 84–91% while Non-Hispanic Black patients activated at 48–64%, demonstrating that neighborhood resources alone do not explain disparities. Conclusions: Patient portal activation is shaped by demographic, socioeconomic, and geographic factors operating at multiple levels. Persistent within-ward racial disparities indicate that neighborhood resources alone do not explain the digital divide. 
Geographically targeted interventions must be paired with culturally tailored approaches to achieve digital health equity.

  • Virtual Reality for Cognitive Mastery in Airway Trauma Management: A Prospective Randomized Controlled Trial

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Innovation in teaching methods is essential for advancing medical education, particularly for trainees developing crisis management skills. Virtual reality (VR) offers immersive, scalable, and accessible learning environments, but its effectiveness compared to traditional mannequin-based simulation remains underexplored. Objective: This prospective randomized controlled trial evaluates the efficacy of VR-based simulation versus traditional gold-standard mannequin-based training in enhancing medical trainees’ knowledge acquisition and application of decision-making concepts for airway trauma management. Methods: Forty medical students were randomized to either the VR (intervention) group or the Mannequin (control) group. Participants engaged in airway trauma management training using their assigned modality. Both groups completed a pre- and post-intervention test to evaluate knowledge acquisition, and undertook a mannequin-based crisis scenario one week after training to evaluate knowledge application. Results: Both groups demonstrated significant knowledge acquisition (VR: mean improvement +2.0/15, P=0.006; Mannequin: mean improvement +3.2/15, P<0.001), though no statistically significant differences were observed between groups (P=0.15). The VR group achieved self-assessed readiness and knowledge saturation faster, on average, than the Mannequin group. Both groups, on average, were successful in the post-training knowledge application test; however, the Mannequin group outperformed the VR group (mean difference: 1.58/15, P=0.021) and recognized a potential airway injury more quickly (P=0.004). Nevertheless, students in the VR group reported greater engagement and satisfaction, expressing a preference for VR as a future learning modality.
Conclusions: Overall, VR-based simulation is a promising and engaging method for teaching airway trauma management and demonstrates comparable knowledge acquisition to traditional mannequin-based training. However, mannequin-based simulation still confers advantages for applied performance. Further studies using larger samples, multiple scenarios, and VR-based assessments are needed. Clinical Trial: ClinicalTrials.gov NCT04451590; https://clinicaltrials.gov/study/NCT04451590

  • Wearable Eye-Tracking Metrics From Smart Glasses for Cognitive Assessment: A Prospective Digital Health Study

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Reading performance is closely associated with cognitive function, and eye-tracking metrics have emerged as sensitive, non-invasive indicators of cognitive processes. Recent advances in wearable technologies, such as smart glasses, enable continuous and scalable measurement of eye movements in real-world settings. However, rapid, accessible, and objective tools for cognitive screening remain limited. Integrating wearable eye-tracking with multidomain cognitive assessment may provide a scalable digital approach for early detection of cognitive impairment. Objective: To evaluate the association between wearable eye-tracking metrics and cognitive performance and to assess the feasibility of a smart glasses–based reading task as a rapid digital screening tool. Methods: In this prospective observational study, Mandarin-literate adults were recruited from Taipei Veterans General Hospital between May and August 2025. Participants completed a standardized reading task while wearing J7EF Gaze smart glasses. Eight eye-tracking metrics were recorded, followed by a six-domain cognitive assessment using gaze-based interaction. Associations were analyzed via multivariable regression adjusted for age and sex. Results: A total of 134 participants were enrolled (mean age 68.2 ± 13.4 years). Age correlated with all six cognitive domains and the total score, while sex exhibited smaller, domain-specific effects. In unadjusted analyses, total reading time showed the strongest associations with all cognitive domains (p < 0.001), while fixation duration, fixation frequency, and long or ultra-long fixations showed selective associations with orientation. After adjusting for age and sex, total reading time, total fixation time, and average fixation time remained significant predictors. Conclusions: Total reading time emerged as a robust, age-independent eye-tracking marker of cognitive performance.
Fixation-related metrics showed domain-specific associations, particularly with the puzzle game hobbies domain of the cognitive assessment. Wearable smart glasses with integrated eye tracking may provide a rapid, non-invasive, and scalable approach for digital cognitive screening in clinical and real-world settings.

  • Background: Large real-world data sources offer a unique opportunity to study the health of diverse ethnic groups. High-quality and accessible ethnicity data is needed to maximise this potential. Objective: To validate a newly developed ethnicity phenotype in the Oxford-Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC). Methods: Retrospective cross-sectional study of individuals registered at a practice within the Oxford-RCGP RSC on 4th December 2024. An updated ethnicity phenotype was implemented and validated. Ethnicity data quality was assessed by evaluating completeness, distribution, and accuracy through external validation against estimates from the 2021 UK Census. Results: Of 21,902,852 individuals, 88.63% (19,412,154) had a recorded ethnicity following the implementation of the updated ethnicity phenotype. There was a marked improvement in the recording of granular (19-point) ethnicity data, with completeness increasing from 69.06% (15,126,835) to 88.63% (19,412,154) with the updated phenotype. There was significant variation in the completeness of ethnicity data according to demographic subgroups. The proportion of individuals in each ethnicity group was within 3.56 percentage points of the 2021 Census estimates for the same ethnicity group across England. Larger relative differences were observed for non-White ethnic groups. Conclusions: The updated ethnicity phenotype provides high-quality and granular ethnicity data based on official classifications for almost 90% of individuals. The overall ethnicity breakdown in the Oxford-RCGP RSC population was broadly similar to 2021 UK Census estimates. The updated ethnicity phenotype supports secondary uses of primary care CMRs, providing high-quality and accessible ethnicity data to study the health of diverse ethnic groups.

  • Background: Continuous glucose monitoring (CGM) is central to modern diabetes care, but explaining CGM patterns clearly, consistently, and empathetically remains time-intensive in practice. Large language model (LLM)–based systems may support patient-facing interpretation of CGM data, but evidence remains limited for retrieval-grounded tools evaluated against clinician-authored responses in counseling scenarios. The system was intended for structured CGM interpretation and communication support rather than autonomous therapeutic decision making. Objective: To evaluate whether a retrieval-grounded LLM-based conversational agent (CA) could support patient understanding of CGM data and preparation for routine diabetes consultations by generating responses to questions arising during CGM-informed diabetes counseling, with quality comparable to clinician-authored responses. Methods: We developed a retrieval-grounded LLM-based CA for CGM interpretation and diabetes counseling support. The system was designed to provide plain-language explanations of CGM patterns and responses to diabetes management questions while avoiding directive or individualized medical advice, such as recommending medication initiation, dose adjustment, or regimen changes. Twelve CGM-informed cases, each comprising a de-identified CGM trace, a synthetic patient vignette, and accompanying CGM visual materials, were constructed from publicly available clinical datasets. Between Oct 2025 and Feb 2026, six senior UK diabetes clinicians each reviewed 2 assigned cases and answered 24 questions (12 per case). In a blinded multi-rater evaluation, each CA-generated and clinician-authored response was independently rated by 3 clinicians on 6 quality dimensions: clinical accuracy, guideline adherence, actionability, personalization, communication clarity, and empathy. Safety flags and perceived source labels were also recorded.
The primary analysis used linear mixed-effects models with random intercepts for case and rater. Results: A total of 288 unique responses (144 CA and 144 clinician responses) were evaluated, generating 864 ratings. The CA received higher quality scores than clinician responses (mean 4.37 vs 3.58), with an estimated mean difference of 0.782 points on a 5-point scale (95% CI 0.692-0.872; P<.001). This pattern was observed across all 6 categories of patient questions. The largest estimated differences were for empathy (mean difference 1.062, 95% CI 0.948-1.177) and actionability (0.992, 95% CI 0.877-1.106). Safety flag distributions were similar between CA and clinician responses, with major concerns rare in both groups (3/432, 0.7% each). Although CA responses were longer, additional analyses adjusting for word count did not indicate that response length explained the overall quality difference. Conclusions: Retrieval-grounded LLM-based systems may have value as adjunct tools for routine CGM review, patient education, and preconsultation preparation, with potential to reduce clinician time spent on standardized interpretive tasks. However, these findings should be interpreted in light of the vignette-based design, restricted datasets, and a small clinician panel, and they do not establish suitability for autonomous therapeutic decision-making, medication adjustment, or unsupervised real-world use. Prospective validation in interactive clinical workflows is needed before implementation.
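
The response and rating counts reported in this abstract follow from the study design, and the arithmetic can be traced in a few lines. This is a sketch only: the variable names are illustrative, and it assumes one CA-generated response was paired with each clinician-authored response and that every response received exactly three ratings, as the abstract's totals suggest.

```python
# Design parameters as stated in the abstract.
clinicians = 6
cases_per_clinician = 2
questions_per_case = 12
raters_per_response = 3

# Each clinician answered 12 questions for each of 2 assigned cases.
clinician_responses = clinicians * cases_per_clinician * questions_per_case

# Assumption: one CA response is paired with every clinician response.
ca_responses = clinician_responses
total_responses = clinician_responses + ca_responses

# Every response was rated independently by 3 clinicians.
total_ratings = total_responses * raters_per_response
ratings_per_source = total_ratings // 2

# Major safety concerns were reported as 3 per source.
major_concern_pct = 100 * 3 / ratings_per_source

print(clinician_responses, total_responses, total_ratings, round(major_concern_pct, 1))
```

The derived totals (144 responses per source, 288 overall, 864 ratings, 432 ratings per source) agree with the figures given in the Results, and 3 of 432 corresponds to the reported 0.7%.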

  • When Learning Cycles Turn Vicious: A Governance Model for AI-Enabled Learning Health Systems

    Date Submitted: Mar 17, 2026
    Open Peer Review Period: Apr 16, 2026 - Jun 11, 2026

    The shared commitments trust framework codified by the National Academy of Medicine and the AI-enabled learning health system framework proposed by Ko et al. together provide the normative foundation and operational architecture that AI-driven clinical learning requires. Neither, however, specifies the governance mechanisms needed when AI-mediated learning produces harm rather than improvement. This paper identifies three governance gaps that expose AI-enabled learning health systems to compounding failure: the absence of operational controls that translate shared commitments into enforceable requirements, an accountability vacuum in which no designated actor bears responsibility at each stage of the AI learning lifecycle, and the lack of a failure detection mechanism capable of identifying when learning cycles become vicious rather than virtuous. To address these gaps, the paper proposes an integrated governance model comprising three interdependent layers. A control layer maps each shared commitment to auditable requirements with defined metrics and responsible actors. An accountability layer assigns explicit responsibility across five stages of the AI learning lifecycle with quantitative escalation triggers. A failure detection layer monitors six trust decay indicators and activates a circuit breaker mechanism when predefined thresholds are breached, enabling institutional intervention before harm compounds at machine speed. The model is offered as a practical complement to existing frameworks, providing health system leaders, policymakers, and researchers with the governance infrastructure required for safe and trustworthy AI-enabled learning at scale.

  • Background: Concurrent chemoradiotherapy (CCRT) for abdominal cancer frequently induces muscle loss, weight loss, and malnutrition. Objective: This randomized phase II trial evaluated whether a multidisciplinary, mHealth‑based multimodal rehabilitation program could preserve handgrip strength and muscle mass in abdominal cancer patients undergoing CCRT. Methods: In this prospective, multicenter, randomized, open‑label phase II trial (NCT05325554), 111 eligible patients with abdominal malignancies scheduled for CCRT were randomly assigned (1:1) to receive either multidisciplinary mHealth rehabilitation care (MRC, n=57) or standard care (SC, n=54). The MRC program was delivered by a dedicated multidisciplinary team comprising oncologists, rehabilitation physicians, nurses, clinical nutritionists, and psychologists. Using the AINST mHealth platform and wearable heart rate monitors, the team provided coordinated, individualized exercise, nutritional, and psychological interventions based on weekly assessments and real‑time data. The SC group received routine oncology care. The primary endpoint was change in handgrip strength from baseline to CCRT completion. Secondary endpoints included body weight, skeletal muscle mass, nutritional biomarkers, quality of life, psychological status, and adverse events. Results: Between February 2022 and April 2023, 111 patients were enrolled. Adherence was high, with 83.9% (47/56) of MRC patients achieving preset exercise targets. Compared with SC, the MRC group demonstrated significantly less decline in handgrip strength at all time points (all p < 0.001). The MRC group also showed better preservation of body weight (mean difference 1.3 kg, p=0.005) and a significantly lower proportion of patients with >5% weight loss (10.7% vs. 32.7%, p=0.005). Skeletal muscle mass was also better preserved (mean difference 1.3 kg, p<0.001).
The MRC group had less decline in serum albumin (p=0.009) and prealbumin (p=0.019), and lower incidences of ≥G3 leukopenia (5.4% vs. 19.2%, p=0.037) and ≥G1 thrombocytopenia (16.1% vs. 34.6%, p=0.026). Nutritional and psychological benefits persisted at 4‑week post‑CCRT follow‑up. Conclusions: A multidisciplinary, mHealth‑based multimodal rehabilitation program effectively preserves handgrip strength, muscle mass, and nutritional status while reducing treatment toxicity in abdominal cancer patients undergoing CCRT. The multicenter implementation using standardized digital tools supports its scalability and translation into real‑world clinical pathways. Clinical Trial: ClinicalTrials.gov NCT05325554. Registration Date: 03/08/2022.

  • Background: Large Language Models (LLMs) are increasingly used in qualitative research, but their reliability compared to human analysis, especially on large, non-English datasets, is unclear. Previous studies on older models (like GPT-4) show limitations in nuance and token capacity. Objective: This thesis compares the qualitative analysis capabilities of OpenAI's GPT-5 and Google's Gemini 2.5 with a traditional human analysis. The study uses a large dataset of 317 Dutch newspaper articles (860 pages) from January 1, 2020, to December 31st, 2023, investigating the sentiment towards nurses during the COVID-19 pandemic. Methods: The study employed a two-part methodology. First, a thematic comparison was conducted where the human researcher, GPT-5, and Gemini independently generated inductive coding trees from the entire corpus. Second, a comparative test was performed where all three coded a 10% (31/317) random sample using a predefined codebook. This process was iterative, requiring a second round of AI analysis with refined prompts and an article-by-article approach to ensure a valid comparison. Results: The results show that both AI models identified third-order themes (e.g., "Healthcare Heroes") that were highly consistent with the data. In the practical application, however, both AIs "over-coded", identifying more quotations than the human (approx. 180 vs. 136). Conclusions: This study reveals a fundamental divergence in analytical logic: whereas human coders prioritize interpretive significance (contextual weight), LLMs default to semantic presence (literal frequency), leading to systematic over-coding. Consequently, this article argues that LLMs should not be viewed as autonomous researchers but as high-sensitivity filtering instruments requiring human calibration. This study concludes that AI serves as a valuable assistant for qualitative researchers. Still, it requires a rigorous, iterative, and human-in-th

  • Background: Chronic or persistent pain can limit an individual’s ability to work or be productive at work, creating substantial societal and economic burden. Despite this, evidence-based work‑related advice and support for people with chronic pain is inconsistent. The Pain‑at‑Work Toolkit was co‑created with people living with pain, health care professionals, and employers to increase knowledge of employee rights, improve access to workplace support, and provide guidance on lifestyle behaviors that facilitate pain self‑management. Objective: This study aimed to establish the feasibility of conducting a definitive cluster randomized controlled trial comparing access to the Pain‑at‑Work Toolkit plus optional occupational therapist telephone support (intervention) with support-as-usual (SAU) from the employer (control). Primary outcomes were feasibility, acceptability, usability, and safety of the digital intervention. We also assessed the feasibility of candidate primary and secondary outcomes and tested research processes required for a definitive trial. Methods: We conducted an open‑label, parallel, two‑arm pragmatic feasibility cluster randomized controlled trial with exploratory health‑economics analysis and a nested qualitative study. Eligible organizations were based in England, had ≥10 employees, and were recruited through professional networks and direct approach. Individual participants were working adults aged ≥18 years, with internet access and self‑reported chronic pain interfering with their ability to undertake or enjoy productive work. A restricted 1:1 cluster‑level randomization allocated organizations to the intervention or control arms. After organizational and individual consent, participants completed a web‑based baseline survey (T0) assessing work capacity, health and wellbeing, and health‑care resource use. Follow‑up occurred at 3 months (T1) and 6 months (T2). 
Feasibility outcomes included recruitment, intervention fidelity (delivery, reach, uptake, engagement), retention, and follow‑up completion. Qualitative interviews with employees and stakeholders at T2 explored acceptability and contextual factors influencing delivery and uptake. Results: A total of 380 employees from 18 organizations participated. Recruitment exceeded targets at both organizational and individual levels, demonstrating strong feasibility and engagement. Follow‑up completion met predefined feasibility criteria but showed variability, largely due to employee turnover, providing realistic attrition estimates for a future trial. Outcome measures showed acceptable completion rates and variability, supporting their suitability for use in a future definitive trial. Employees and stakeholders reported high acceptability of the Pain‑at‑Work Toolkit, and qualitative findings highlighted improved knowledge, confidence, and self‑management among employees. Stakeholders endorsed the Toolkit’s relevance and practicality within workplace settings. Conclusions: The feasibility trial demonstrated that the Pain‑at‑Work Toolkit and trial procedures are acceptable, scalable, and deliverable across diverse workplaces. Findings identify responsive outcome measures, emphasize the need for strengthened retention strategies, and support the Toolkit’s use as a standalone intervention. Overall, the study provides a strong foundation for progressing to a fully powered definitive trial. Clinical Trial: ClinicalTrials.gov NCT05838677; https://clinicaltrials.gov/study/NCT05838677 International Registered Report Identifier (IRRID): DERR1-10.2196/51474

  • Background: Artificial Intelligence (AI) methods offer a valuable complementary approach to public health emergency management, supporting prediction, rapid threat identification, and timely decision-making alongside the already established human-led systems and processes. However, updated and comprehensive evidence on the extent and characteristics of AI use in public health emergencies, with a specific focus on infectious hazards, remains limited globally. Objective: This review aimed to map the scope, nature, and extent of AI applications in public health emergency management resulting from infectious hazards, and to characterize key implementation features. Methods: A scoping review was conducted following the Arksey and O’Malley framework, and a search was performed in three electronic databases, including PubMed, Scopus, and the IEEE Xplore Digital Library. The search period covered studies published between January 2014 and June 2024. Results: A total of 613 studies were included, of which 526 (85.8%) were AI-related methodological studies and 87 (14.2%) were reviews or other article types. Across these studies, 665 infectious-hazard records were extracted, with COVID-19 accounting for the majority (387, 58.2%), followed by influenza (8.4%), dengue (4.1%), and malaria (3.0%). Publications increased steadily from 2014 to mid-2024, with a sharp rise beginning in 2019 and peaking in 2022, aligning with the COVID-19 pandemic. Notably, studies on non–COVID-19 hazards also grew between 2019 and 2023, suggesting expanding AI applications. Among methodological studies, 31.4% used social media data, mainly from X (formerly Twitter) and Weibo. Most focused on predictive analytics and disease surveillance (58.2%), followed by risk communication (23.2%), compliance with public health measures (12.5%), and policy evaluation (6.1%). 
Data were predominantly sourced from the USA (12.1%) and China (10.5%), with limited representation from Africa, Central Asia, and the Middle East. Funding was mainly reported from organizations in China (14.4%) and the USA (14.1%), followed by Saudi Arabia and South Korea. Conclusions: The findings indicate that applications of AI in infectious disease emergencies are predominantly focused on predictive modeling and surveillance, with a considerable reliance on social media data. The United States and China emerge as the primary contributors, both as sources of data and as leading funders of this research. To promote more equitable and effective use of AI in public health emergencies, there is a critical need for increased investment in local expertise, data infrastructure, and operational capacity, particularly in low- and middle-income countries.

  • Machine Learning–Enabled Interventions in Palliative Care: A Scoping Review

    Date Submitted: Apr 14, 2026
    Open Peer Review Period: Apr 14, 2026 - Jun 9, 2026

    Background: Machine learning-based prognostic models have been increasingly developed to support palliative and serious illness care, particularly in oncology. While predictive accuracy has improved substantially, less is known about how these models are translated into real-world interventions and whether they meaningfully influence clinical practice and patient care. Objective: This scoping review aimed to map and synthesize interventional studies that used machine learning-enabled interventions to support palliative and serious illness care, with a focus on model integration strategies and reported effects on communication processes, care planning, and downstream clinical outcomes. Methods: Following PRISMA-ScR guidelines, we conducted a scoping review of peer-reviewed English-language studies published since 2015. Searches were performed in PubMed, Embase, Web of Science, and the Cochrane Library. Eligible studies implemented machine learning-based predictions to trigger or guide real-world palliative care-related interventions, including serious illness conversations, advance care planning, or palliative care referral. Results: Eight interventional studies were included, encompassing cluster randomized trials, stepped wedge designs, and real-world implementation studies. Machine learning-enabled interventions were consistently associated with increased documentation of serious illness conversations and advance care planning, particularly when predictive outputs were embedded within clinical workflows through behavioral nudges, automated alerts, or facilitated outreach. In contrast, effects on treatment intensity, health care utilization, and end-of-life costs were limited, inconsistent, or not observed.
Conclusions: Current evidence suggests that machine learning-enabled interventions in oncology palliative care are most effective when used to support prioritization and timing of communication related processes rather than to directly alter care trajectories or resource use. Future research should focus on implementation strategies, patient centered outcomes, and equity sensitive evaluation to better translate predictive insights into meaningful clinical impact.

  • Background: The Emergency Intensive Care Unit (EICU) is the core setting for the treatment of critically ill patients, where the diagnostic error rate is more than twice that of general inpatient wards, which seriously affects patient prognosis. Large Language Models (LLMs) have shown potential in clinical diagnosis, but there is still very limited evidence comparing the diagnostic efficacy of critical care-specific LLMs and general-purpose LLMs in the complex diagnostic scenarios of the EICU. Objective: This study aimed to evaluate and compare the diagnostic accuracy of a critical care-specific LLM (Qiyuan 3.0.1) and three mainstream general-purpose LLMs (GPT5.1, DeepSeek V3.1, Qwen3-32B) in EICU diseases, and to provide an evidence base for the selection of intelligent auxiliary diagnostic tools in the EICU. Methods: This was a single-center retrospective paired diagnostic accuracy study, which consecutively enrolled 184 critically ill patients admitted to the EICU of Peking University Shenzhen Hospital from April 2025 to March 2026. Standardized datasets were constructed based on the patients' clinical data, including an initial diagnosis dataset (clinical data within 24 hours after admission) and a final diagnosis dataset (complete course data from admission to discharge). A unified zero-shot learning prompt strategy was adopted, and four LLMs independently generated corresponding diagnoses in a double-blind manner. The consensus diagnosis reached by three senior intensive care physicians with more than 10 years of EICU working experience, who were blinded to the model results, was used as the gold standard. The primary endpoint was the Top-1 accuracy in the final diagnosis stage, defined as the proportion of cases where the first primary diagnosis output by the model completely matched the gold standard.
Secondary endpoints included the Top-1 accuracy in the initial diagnosis stage and the number of correct diagnoses in the Top-3 outputs in the final diagnosis stage. Cochran's Q test was used for the overall comparison of accuracy among multiple groups, and post hoc pairwise comparisons were performed using the paired McNemar test with Bonferroni correction for type I error. The Friedman non-parametric rank sum test was used for the intergroup comparison of the number of correct Top-3 diagnoses. Results: In the final diagnosis stage, the overall difference in Top-1 accuracy among the four models was statistically significant (Cochran's Q=20.32, df=3, P=4.57×10⁻⁵). The Top-1 accuracy of Qiyuan 3.0.1 was the highest (64.13%, 95%CI 56.83%-71.00%), followed by GPT5.1 (59.24%, 95%CI 51.83%-66.35%), DeepSeek V3.1 (57.07%, 95%CI 49.64%-64.28%), and Qwen3-32B had the lowest accuracy (51.63%, 95%CI 44.26%-58.98%). Post hoc pairwise comparisons showed that the Top-1 accuracy of Qiyuan 3.0.1, GPT5.1, and DeepSeek V3.1 was significantly higher than that of Qwen3-32B (all adjusted P<0.0083), while no significant difference was found in other pairwise comparisons (all adjusted P>0.0083). A similar trend was observed in the initial diagnosis stage, where only Qiyuan 3.0.1 was significantly superior to Qwen3-32B (adjusted P=0.008). The median number of correct Top-3 diagnoses for all four models was 2.0 (IQR 1.0-2.0), with no significant intergroup difference (Friedman χ²=3.34, df=3, P=0.339). Conclusions: The critical care-specific LLM Qiyuan 3.0.1 has superior Top-1 diagnostic accuracy in EICU diseases compared with some general-purpose LLMs, but the absolute diagnostic accuracy of all included models still has considerable room for improvement. LLMs have potential application value as auxiliary diagnostic tools in the EICU, but their clinical application still requires further optimization and multi-center prospective clinical trial validation.
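The post hoc threshold quoted in the abstract above (0.0083) is consistent with a Bonferroni correction across the 6 pairwise comparisons among 4 models. The following is a minimal sketch of that arithmetic together with an exact two-sided McNemar test. The overall alpha of 0.05 and the discordant-pair counts in the usage note are assumptions for illustration, not values reported in the abstract, and `mcnemar_exact` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb

# Model names as listed in the abstract.
models = ["Qiyuan 3.0.1", "GPT5.1", "DeepSeek V3.1", "Qwen3-32B"]

# Every unordered pair of models is one post hoc comparison.
pairs = list(combinations(models, 2))

# Assumption: overall type I error rate of 0.05 (not stated in the abstract).
ALPHA = 0.05
bonferroni_threshold = ALPHA / len(pairs)


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b = cases model 1 got right and model 2 got wrong,
    c = cases model 1 got wrong and model 2 got right.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For example, with hypothetical discordant counts of 0 and 10, `mcnemar_exact(0, 10)` is about 0.002, which would fall below `bonferroni_threshold`; with 5 and 5 the p-value is 1.0.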

  • Background: Passive vaccine safety surveillance systems often generate clinically incomplete adverse event following immunization (AEFI) reports, which may lack the diagnostic evidence needed for causality assessment. While the period for collecting critical clinical data is limited, specialist expertise to identify necessary evidence at the point of reporting is not often available. Currently, no existing system provides structured guidance to evaluate whether a report contains sufficient evidence for assessment or to identify the specific clinical data required. Objective: This study aimed to develop and evaluate a surveillance support system that generates actionable investigation guidance for field epidemiologists at the point of AEFI report intake. The system identifies what clinical evidence is present, what is missing, and what additional data would most impact the causality assessment. Methods: We developed Vax-Beacon, a 6-agent neuro-symbolic pipeline that processes Vaccine Adverse Event Reporting System (VAERS) reports. The system uses a large language model (LLM) for generating free-text narratives through clinical observation, a curated knowledge database for differential diagnosis matching, and deterministic code for WHO causality classification, producing structured investigation guidance for each case. We tested the system on 100 purposively curated VAERS myocarditis/pericarditis cases. Two field epidemiologists independently evaluated pipeline-generated guidance using 5-point Likert scales and open-ended feedback. Results: The pipeline processed all 100 cases without errors. WHO classification yielded A1 in 45%, C in 27%, B2 in 7%, and Unclassifiable in 21%. Brighton Level 4 early exit occurred in 20% of cases, precluding definitive classification. For these cases, the pipeline generated prioritized diagnostic checklists specifying which tests would upgrade certainty.
Cardiac biomarkers such as troponin I and CK-MB were recommended as high-priority tests and cardiac magnetic resonance imaging as a lower-priority follow-up for suspected myocarditis. The neuro-symbolic architecture ensured 100% reproducibility of all classification decisions across independent benchmark runs. In structured expert review (two field epidemiologists), Likert scores ranged from 3 to 5 (mean 4.33); both reviewers estimated 30–50% workload reduction and agreed the system is suitable as an official investigation support tool. Conclusions: Vax-Beacon demonstrates that neuro-symbolic AI can function not as a classification oracle, but as a surge-ready investigation focus tool, directing field epidemiologists to the right evidence items for known adverse events at the moment when collecting that evidence remains feasible. This principle, Designed Deference, addresses a critical gap in passive surveillance: the loss of retrievable clinical evidence between reporting and expert review.

  • Background: Rapid research responses to emerging infectious disease (EID) outbreaks depend not only on how quickly studies are launched, but also on whether their data can be combined, compared, and reused across studies. Health data standards, including shared vocabularies, terminology, and information models, are the structural prerequisite for findable, accessible, interoperable, and reusable (FAIR) data. Despite European Commission (EC) investments exceeding €130 million across the EID cohort and clinical trial consortia coordinated through the Cohort Coordination Board (CCB) and Trial Coordination Board (TCB), little empirical evidence exists on the extent to which these consortia adopt standards, the barriers they face, or what funders could do to improve implementation. Objective: To characterise health data standards adoption across EC-funded EID consortia, identify the barriers that prevent uptake, and generate evidence-based recommendations for funders to strengthen standards implementation for the rapid reuse of interoperable participant-level data in epidemic detection and response. Methods: We conducted a cross-sectional online survey from May 2023 to February 2024, developed through a literature review and stakeholder consultation, with CCB and TCB-affiliated EC-funded EID consortia. Research networks and consortia outside these boards could participate if the survey was forwarded to them. We collected information on consortium characteristics, standards use, barriers to adoption, awareness of EC-supported standardisation initiatives, and recommendations for improving uptake. Responses were analysed descriptively; open-text responses were categorised thematically. Results: Thirty-three responses, representing 15 consortia or research networks spanning over 40 countries, were collected. Most responses came from cohort consortia. Adoption of data standards was limited.
The most frequently used standards were ICD codes (n=10) and the Systematised Nomenclature of Medicine Clinical Terms (n=9); 7 respondents reported not using standards. The main barriers were insufficient experience applying standards (n=17), lack of budget (n=12), uncertainty about which standard to use (n=12), uncertainty about which standards related studies used (n=9), and inadequate tools (n=9). Awareness of EC initiatives designed to support standards adoption was strikingly low, suggesting that EC investment in standards support is not reaching its intended audience. Respondents recommended dedicated budgets, clearer guidance on preferred standards by data type, better communication of the benefits of standards adoption, stronger tooling, and funder mandates. Conclusions: Health data standards are underused across European EID consortia, representing a preventable bottleneck for pandemic preparedness despite substantial public investment. European funders can address this through the following actions recommended by major EC-funded EID Consortia: mandating dedicated standards budgets at the grant submission stage, issuing formal guidance on preferred standards by data type, investing in open-source tooling that delivers value to data generators, requiring machine-actionable data management plans, and establishing a public registry of standards adopted by funded consortia. Strengthening coordinated standards adoption is a necessary and achievable step toward the FAIR, interoperable research data infrastructure that effective pandemic response demands. Clinical Trial: Not applicable.

  • Background: The Registry of Stroke Care Quality (RES-Q) is a healthcare quality improvement platform used globally. RES-Q collects structured quality-of-care data for stroke patients, requiring clinicians to manually extract information from electronic health records or documents such as discharge summaries. This process is essential but time-consuming, particularly given the variability, length, and semi-structured nature of clinical reports. Objective: To develop and evaluate a multilingual Evidence-Based Question-Answering framework that identifies supporting text spans in clinical reports of stroke patients and proposes answer suggestions for structured clinical forms, with the goal of reducing clinician workload while preserving full human oversight. Methods: We conduct a multilingual study using 1,596 pseudonymized stroke discharge summaries in six languages, annotated with question-evidence-answer triplets. Encoder-based language models are used to extract evidence spans from the reports, while generative language models are used to predict normalized form answers based on the extracted evidence. We compare multiple training strategies: models trained on reports in a single target language, models trained jointly on reports in different languages, and models trained on original reports combined with cross-lingual data augmentations. We evaluate performance on Evidence Extraction, Answer Prediction, and end-to-end Evidence-Based Question Answering across the six languages. Results: The presented Evidence-Based Question-Answering system achieves 89% end-to-end accuracy in form filling across six languages (77% for patient-specific questions and 95% for default or unverifiable items). Evidence Extraction is the primary bottleneck, reaching 85% F1 and 79% Exact Match, whereas Answer Prediction based on extracted evidence is more stable, achieving 95% accuracy.
The performance varies by question type, and cross-lingual training generally reduces Evidence Extraction performance but has little effect on Answer Prediction. Model performance is influenced more by reporting practices and dataset characteristics than by language itself. Conclusions: Evidence-Based Question Answering over multilingual stroke discharge summaries enables human-in-the-loop validation and effective answer prediction with moderate computational resources. Evidence Extraction is the main bottleneck, while Answer Prediction is robust across languages and model sizes. The approach supports structured data collection, though generalization to new languages requires target-language training data.
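The abstract above reports F1 and Exact Match for Evidence Extraction without defining them. A common convention in extractive question answering is a token-overlap F1 and a whitespace-normalized exact match between the predicted and annotated spans. The sketch below assumes that convention; the function names and example strings are hypothetical, and the authors' actual scoring may differ.

```python
from collections import Counter


def exact_match(predicted: str, gold: str) -> bool:
    """True when both spans contain the same tokens in the same order,
    ignoring case and repeated whitespace."""
    return predicted.lower().split() == gold.lower().split()


def token_f1(predicted: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall between two spans."""
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a perfect match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

A predicted span that fully contains the annotated span but adds extra words scores below 1 on F1 and fails Exact Match, which is one way a system can show a gap between the two metrics like the one reported.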

  • Background: Patient-facing digital health tools such as mobile health (mHealth) apps, wearables, and digital therapeutics have expanded rapidly and show promise for improving chronic disease management. Despite increasing evidence of effectiveness, health systems and payers continue to face challenges integrating these tools into routine care. Objective: This study examined the decision-making processes of health system and payer leaders regarding the adoption and sustainability of patient-facing digital health tools within their organizations. Methods: We conducted semistructured interviews with nine senior leaders from a large Midwestern academic health system and affiliated payer organizations, including a provider-owned health plan and a state Medicaid program. Interviews explored digital health adoption decisions, perceived value and fit, barriers, and sustainability considerations, focusing on adoption of an evidence-based mHealth intervention for alcohol use disorder as a use case. Transcripts were analyzed using thematic analysis with inductive and deductive coding. Results: Four decision-making mechanisms shaped adoption and sustainability decisions: prioritization under organizational constraint, risk mitigation, operational fit, and value determination. These mechanisms describe how leaders navigate limited organizational capacity; reduce uncertainty and protect against clinical, financial, and operational risks; assess whether tools can integrate within existing clinical and technical systems; and determine whether anticipated and measurable benefits justify adoption and continued organizational support. Conclusions: Adoption and sustainability of patient-facing digital health tools are shaped by dynamic organizational decision-making processes that often remain invisible to patients, clinicians, researchers, and developers.
Making these processes visible may help better align digital health tools with the realities of the healthcare system to support implementation.

  • Background: The growing integration of Personalized Risk Prediction (PRP) and Artificial Intelligence (AI) substantially reshapes diagnostic and therapeutic decision-making in health care. At the same time, its responsible adoption depends not only on technical performance, but also on patients’ perspectives and acceptance. Objective: This study systematically examined patients’ perspectives across several European countries and explored how patients’ technology-related attitudes relate to their evaluations of personalized and AI-supported approaches in cardiac care. As part of the PROFID (Prevention of Sudden Cardiac Death After Myocardial Infarction by Defibrillator Implantation) project, its focus is on the ethical use of PRP and AI in the clinical context of decision-making regarding sudden cardiac death (SCD) prevention and implantable cardioverter-defibrillator (ICD) implantation. Methods: The study used a cross-sectional survey design with a standardized questionnaire including multimedia content. The target population comprised adults aged 18 years or older living in six European countries who met at least one of the following (self-reported) clinical criteria: heart failure, myocardial infarction (MI), cardiac arrest, or current ICD implantation. An exploratory factor analysis (EFA) was used to identify and evaluate internally consistent scales, and subsequent regression analyses examined associations between these scales and technological openness, sociodemographic characteristics, and patients’ views on PRP and AI in cardiac care. Results: The sample consisted of 470 participants from Germany (n=210), the Netherlands (n=86), the United Kingdom (n=145), and three other European countries (n=29; Austria, Belgium, and Spain). Overall, 51.9% (244/470) of respondents were male and 48.1% (226/470) were female. The mean age of the sample was 61.12 (SD 12.62) years.
The EFA showed six clearly interpretable factors: (1) Perceived benefits and support of PRP models in medical decision-making (MDM); (2) Perceived benefits and support of AI in MDM; (3) Transparency expectations in algorithmic decision-making; (4) Support for delegating decisions to algorithms; (5) Self-reported AI literacy; and (6) Preference for shared decision-making (SDM). The regression analyses showed that technological readiness, self-reported AI literacy, support for delegation of decisions to algorithms, transparency expectations in algorithmic decision-making, preferences for SDM, and educational attainment predicted patients’ perceived benefits and support of PRP or AI in MDM. Conclusions: The findings support existing assumptions while also highlighting additional aspects that should be considered if high-level technologies are used in decision-making processes related to ICD implantation. PRP and AI were generally perceived as useful tools to support decision-making regarding ICD indication, provided that transparency is ensured and patients remain actively involved in the decision-making process. Mandatory use and full delegation of decision-making directly to AI were broadly rejected. Generally, men showed more positive perceptions of the use of AI in MDM than women. The attributed acceptance of delegation to PRP models was significantly higher than that of delegation to AI.

  • Background: Chronic pain is a prevalent and complex condition requiring long-term, multidimensional management. Digital therapeutics (DTx) have emerged as a promising nonpharmacological intervention; however, evidence regarding their effectiveness remains inconsistent due to heterogeneity in intervention types and study designs. Objective: This study aimed to systematically review and meta-analyze the effectiveness of digital therapeutics in reducing pain among patients with chronic pain. Methods: A systematic review and meta-analysis were conducted following PRISMA 2020 guidelines. Electronic databases, including PubMed, Embase, CINAHL, and the Cochrane Library, were searched from inception to December 10, 2025. Randomized controlled trials evaluating DTx interventions in adults with chronic pain were included. The primary outcome was pain intensity, and secondary outcomes included physical function and psychological outcomes (quality of life, anxiety, depression, and pain catastrophizing). Effect sizes were calculated as standardized mean differences using a random-effects model, and risk of bias was assessed using the Cochrane Risk of Bias tool. Results: A total of 7 studies were included in the meta-analysis. Digital therapeutics demonstrated a statistically significant reduction in pain intensity (SMD = -0.87, 95% CI: -1.70 to -0.03, p = 0.04); however, heterogeneity was substantial (I² = 97%). No significant effects were observed for physical function (SMD = 0.62, 95% CI: -1.57 to 2.80, p = 0.58) or overall psychological outcomes (SMD = -0.92, 95% CI: -2.01 to 0.17, p = 0.10). Among psychological outcomes, quality of life showed a trend toward improvement (SMD = 0.28, p = 0.07), whereas anxiety, depression, and pain catastrophizing showed no significant effects and substantial heterogeneity. 
Conclusions: Digital therapeutics may contribute to reductions in pain intensity in patients with chronic pain; however, the effects on physical function and psychological outcomes remain inconsistent. The high level of heterogeneity suggests that the effectiveness of DTx varies considerably depending on intervention characteristics and study design. Further high-quality and standardized trials are needed to establish the clinical effectiveness of DTx. Clinical Trial: PROSPERO CRD 420261355510; https://www.crd.york.ac.uk/PROSPERO/view/CRD420261355510
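The pooling step described in the Methods above (standardized mean differences combined under a random-effects model, with heterogeneity reported as I²) can be sketched as follows. This is a minimal illustration only: the abstract does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator here is an assumption, and the function name and any input values are hypothetical rather than the review's data.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effects (e.g., SMDs) with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-study variance.
    `effects` and `variances` are per-study effect sizes and their variances.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q and the DerSimonian-Laird between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

With a large tau², the random-effects weights become nearly equal across studies, which is one reason a pooled estimate under very high I² (such as the 97% reported here) should be read cautiously.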

  • Impact of Patient Engagement in Remote Diabetes Management on Glycemic Outcomes: A Causal Inference Approach

    Date Submitted: Apr 8, 2026
    Open Peer Review Period: Apr 9, 2026 - Jun 4, 2026

    Background: Suboptimal glycemic control remains a major public health challenge for patients with type 2 diabetes and prediabetes. Remote glucose monitoring offers scalable support for self-management, but evidence on its real-world effectiveness and the causal impact of varying engagement levels is limited. Objective: To estimate the effect of patient engagement measured through glucose monitoring frequency on hemoglobin A1c (HbA1c). Methods: We analyzed 1,479 adults with type 2 diabetes or prediabetes enrolled in the iHealth Unified Care program, integrating Bluetooth glucose meters, a mobile app, lifestyle coaching, and primary care coordination. Engagement during the first six months was defined as the weekly frequency of glucose monitoring. The causal effect of monitoring frequency on HbA1c was estimated using marginal structural models with inverse probability weighting to address time-varying confounding. Results: At 6 months, HbA1c decreased by 0.53 (SD 1.46) percentage points (p < 0.001). We observed a dose-response relationship across engagement tiers: the highest-engagement group (16.99 measurements/week) achieved a 1.00 percentage point HbA1c reduction versus 0.34 in the lowest tier. In weighted models, each additional weekly measurement was associated with a 0.03 percentage point greater HbA1c reduction (p < 0.01). Findings were consistent in sensitivity analyses at 3 and 12 months. Conclusions: Engagement with a digitally enabled, primary care-integrated remote glucose monitoring program significantly improved glycemic outcomes across all engagement levels. Higher monitoring frequency produced greater HbA1c reductions, underscoring the importance of fostering sustained patient engagement to optimize diabetes management. Clinical Trial: Not Applicable

  • Background: Digital technology in health and social care can improve the well-being of people with long-term health conditions, but prior research has identified factors that hinder its adoption, particularly accessibility issues and challenges to its integration into everyday life. It is therefore important to fully understand both the facilitators and hindrances to the adoption of such technology. Objective: To explore the facilitators and hindrances to the adoption of digital technology in the everyday lives of adults with long-term health conditions. Methods: This scoping review systematically mapped relevant research following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). Thematic analysis was used to critically analyze the identified articles. Results: Forty-six articles were selected that examined 5,018 adults aged 18 and over. Six themes were identified: personal characteristics and preconditions; perceived usefulness in everyday life; design and technical functionality; support and guidance; human interaction; and integrity and trustworthiness. Conclusions: The findings are discussed in relation to the key constructs of the Unified Theory of Acceptance and Use of Technology (UTAUT): performance expectancy, effort expectancy, social influence, and facilitating conditions. Digital technologies support the daily lives of adults with long-term health conditions, but several challenges remain, including functionality that is not adapted to specific diseases or ages and a perceived lack of human interaction. Thus, digital technology is not a one-size-fits-all solution but should be an adaptable tool that accommodates individual preferences and contexts and that complements in-person human interactions.

  • BeProGuide: A Behavior Design Tutorial to De-Implement Low-Value Clinical Practices

    Date Submitted: Mar 30, 2026
    Open Peer Review Period: Apr 6, 2026 - Jun 1, 2026

    Background: Although concern about low-value care (LVC) practices has grown in recent years, interventions relying solely on informational or educational strategies have not proven effective in reducing them. This suggests a need to involve professionals and/or patients in collaborative decision-making processes and change strategy design. This study builds on this premise using the Fogg Behavior Model, which posits that a behavior can only occur when motivation, ability, and a prompt converge at the same time. Objective: The aim of this study is to develop a guide to designing, implementing and evaluating interventions that reduce LVC practices. We present a specific case study involving the deprescribing of benzodiazepines in primary care and use it as an example of the process to be followed to reduce other practices of this kind. Methods: This study was conducted in two primary care centers in Catalonia, Spain. A total of 31 professionals (physicians and nurses) took part in focus groups employing three techniques from the Fogg Behavior Model: Swarm of Behaviors, Focus Mapping, and Golden Behaviors. Through these techniques, we worked with participants to compile a set of actions for implementation. These actions were tailored to the conditions and capacities of their health centers and were assessed by the participants as feasible and effective in reducing benzodiazepine prescribing. Results: Based on this practical experience, we developed our ten-step BeProGuide, which outlines a series of tasks that we recommend completing in any project aimed at reducing LVC practices. This is presented as a detailed checklist to support informed decision-making. Conclusions: Our research operationalizes the Fogg Behavior Model by setting out a concrete, replicable procedure for reducing LVC clinical practices.
In doing so, it transforms this conceptual framework into an actionable methodological tool, BeProGuide, which takes the form of a step-by-step guide and detailed checklist. This guide is not only applicable in health and medicine, but can be used in other fields such as education, work and organizations, and environmental protection.

  • Background: Children with inattentive attention-deficit/hyperactivity disorder (ADHD) often present with impairments in executive functions and fine motor skills in addition to core inattentive symptoms. However, the effects of structured remote fine motor training on these outcomes remain unclear. Objective: To examine the effects of a 12-week telerehabilitation-based fine motor training program on inattention symptoms, executive functions, and fine motor performance in children with inattentive ADHD. Methods: This assessor-blinded randomized controlled trial investigated a 12-week remote fine motor training program delivered via Tencent Meeting in children aged 6-10 years with inattentive ADHD. Sixty-six children were randomly assigned to either the intervention group (n=33) or a wait-list control group (n=33). The intervention was conducted 3 times per week, 60 minutes per session, for 12 weeks. Assessments were performed at baseline, immediately postintervention, and at 3-month follow-up. The outcomes were inattention symptoms, executive functions, and fine motor skills. Linear mixed models were used for the main analysis, and mediation analysis was performed to examine whether executive functions explained changes in inattention. Results: Compared with the wait-list control group, the intervention group showed significantly greater reductions in inattention symptoms at 12 weeks (MD= −3.85, 95% CI: −5.01 to −2.68) and 3-month follow-up (MD= −2.00, 95% CI: −3.17 to −0.83). For executive functions, significant between-group differences were observed in inhibitory control, immediate memory, and cognitive flexibility at both time points (P<0.05), while delayed memory was significant at 12 weeks only (MD= 3.03, 95% CI: 0.57 to 5.49) and showed no significant between-group difference at follow-up (MD= 1.62, 95% CI: −0.84 to 4.08).
For fine motor outcomes, significant between-group differences were found in manual dexterity and hand-eye coordination at both 12 weeks and 3-month follow-up (P<0.05), and in writing skills at 12 weeks (MD= −6.85, 95% CI: −13.38 to −0.32) but not at follow-up (MD= −2.18, 95% CI: −8.71 to 4.35). Subgroup analyses suggested age-related variation in treatment response, with younger children showing more evident gains in fine motor performance and older children showing more sustained improvements in inattention and selected executive function domains. Mediation analysis showed that inhibitory control partially mediated the effect of the intervention on inattention (indirect effect: β= −0.85, 95% CI: −1.85 to −0.08). Conclusions: A 12-week remote fine motor training program may be a feasible, safe, and effective nonpharmacological intervention for children with inattentive ADHD. The intervention improved inattention symptoms, executive functions, and fine motor performance, with inhibitory control emerging as a potential mechanism underlying symptom improvement. The subgroup findings further suggest that developmental stage may influence the pattern of response, which may help guide age-tailored intervention design in future practice. Clinical Trial: Chictr.org.cn ChiCTR2200065413; https://www.chictr.org.cn/showproj.html?proj=182412

  • The Role of Incentivisation in Communities of Practice: a systematic review

    Date Submitted: Apr 5, 2026
    Open Peer Review Period: Apr 6, 2026 - Jun 1, 2026

    Background: Incentivisation is increasingly used to maintain engagement and support behaviour change within communities of practice (CoPs), yet its effectiveness across chronic disease contexts remains uncertain. Objective: To examine how incentives are integrated into CoPs and related peer-support models, and to assess their impact on participant activation, engagement, and health-related outcomes. Methods: PubMed/MEDLINE, Embase, Scopus, and CENTRAL were searched from inception to June 2025 using predefined terms relating to CoPs, incentivisation, and patient-centred outcomes. Peer-reviewed empirical studies involving incentivised CoPs or analogous peer-support interventions for adults with chronic conditions were eligible. Four reviewers independently screened studies, extracted data, and assessed risk of bias in line with PRISMA 2020 guidance. Heterogeneity in design and outcomes required narrative synthesis. Results: From 667 records, four randomised controlled trials met inclusion criteria. Financial incentives produced the greatest short-term gains in physical activity, while non-financial approaches such as gamification, points, badges, and structured peer support yielded modest improvements in step count, treatment adherence, or diet quality. No consistent effects were observed for patient activation, self-efficacy, mental health, or quality of life. Engagement moderated effectiveness, although attrition was common. Conclusions: Incentivisation can enhance short-term behavioural outcomes within CoPs, but evidence for sustained psychosocial benefit is limited. Larger, longer-term studies are needed to clarify which incentive strategies deliver durable improvements in engagement and self-management. Clinical Trial: This review was registered on PROSPERO, an international prospective register of systematic reviews (January 2026, reference CRD420251244276).

  • Restrictive Family Relationships as a Mediator of Adolescent Social Media Use Disorder

    Date Submitted: Apr 2, 2026
    Open Peer Review Period: Apr 3, 2026 - May 29, 2026

    We investigated the relationships between academic stress, family relationships, and social media use disorder (SMUD) among middle school students from an ecological systems perspective. We used a mixed-methods sequential explanatory design. Structural equation modeling showed that academic stress had a significant direct impact on SMUD. Restrictive family relationships significantly mediated this relationship. The proposed model explained 42.6% of the variance in SMUD. Qualitative findings highlighted family patterns and coping and revealed how rigid parental control and limited emotional support reinforced restrictive family dynamics, which in turn further linked academic stress to SMUD. The present research extends Ecological Systems Theory to online behavior and establishes restrictive family relationships as a key mediating mechanism in SMUD development. These findings underscore the importance of family-based interventions that promote open communication and adaptive stress management strategies to mitigate the risk of SMUD.

  • Background: Hospitalized children frequently experience pain and distress. Pain is a multidimensional experience involving both sensory and emotional components, necessitating multimodal management strategies. Socially assistive robots (SARs) have shown promise as non-pharmacological interventions in pediatric care. However, the interaction mechanisms through which SARs influence pain and emotional responses, particularly positive emotion and real-time emotional dynamics during child–robot interaction, remain underexplored. Objective: This study, titled the HAPPY (Hospitalized Assistance for Pediatric Pain Yields) study, aimed to evaluate the association between a SAR-based intervention and postoperative pain in hospitalized children and to examine whether different levels of engagement are associated with changes in real-time emotional dynamics. Methods: A single-group pretest–posttest design was conducted with 37 hospitalized children (mean age 7.35, SD 2.06 years) following tonsillectomy or adenoidectomy. The intervention was structured into three sequential phases: Phase 1 (warm-up/limited engagement), Phase 2 (educational video/passive engagement), and Phase 3 (social interaction/active engagement). Pain was assessed using the Wong-Baker FACES pain scale and observed behavioral FLACC scales. Emotional response, as valence, was measured using an automated facial expression recognition system (FaceReader 10). Changes in pain were analyzed using Wilcoxon signed-rank tests, and differences in emotional valence across phases were examined using the Friedman test with post hoc pairwise comparisons. Results: Self-reported pain significantly decreased from a median of 6 (IQR 4–6) to 4 (IQR 2–4) (P<.001), and observer-rated behavioral pain decreased from a median of 3 (IQR 2–4) to 1 (IQR 1–2) (P<.001). Overall differences in emotional valence across phases did not reach statistical significance (P=.053; Kendall’s W=0.084). 
However, a V-shaped trajectory of emotional valence was observed, with the lowest values during the passive engagement phase (Phase 2; mean –0.24, SD 0.20) and relatively higher values during the active engagement phase (Phase 3; mean –0.15, SD 0.13). Exploratory post hoc analyses indicated a significant increase in emotional valence from Phase 2 to Phase 3 (adjusted P=.012). Conclusions: SAR-based interventions were associated with reductions in postoperative pain in hospitalized children. Although overall emotional differences across phases were not statistically significant, the observed pattern suggests that active engagement may be associated with more positive emotional responses compared to passive engagement. These findings highlight the potential importance of interaction quality in SAR interventions and provide insight into the processes underlying their clinical effects in pediatric care. Clinical Trial: No

  • Externalized Living Memory: Structuring Clinical Knowledge for the Age of AI Agents

    Date Submitted: Apr 1, 2026
    Open Peer Review Period: Apr 1, 2026 - May 27, 2026

    As AI agents become increasingly capable of autonomous action in health care, a prerequisite remains underaddressed: the persistent, structured memory that makes such action contextually meaningful. Clinicians face cognitive overload not from any single task but from the erosion of decision context over time. Existing tools—personal knowledge management frameworks, LLM built-in memory, and autonomous agents—each address parts of this problem but leave gaps in auditability, portability, or contextual persistence. This Viewpoint argues that memory should precede action: before AI agents can act meaningfully, they need persistent, human-controlled context. We describe externalized living memory—a structured knowledge base that both human and AI can read and write—as it emerged from the first author's practice as a cardiovascular radiologist and division chief. The approach is organized as a layered architecture with a routing table for scalable context loading and a governance hierarchy for sustainable maintenance. We illustrate the approach through clinical vignettes, compare it with existing solutions, and discuss limitations including the small-team evidence base and maintenance costs. An open-source implementation with templates and setup instructions accompanies this paper.

  • Professionals, leaders, and institutions in healthcare and health research are rapidly adopting and integrating AI systems and chatbots into their regular work, but this poses risks for patients in the case of patient and public involvement and engagement (PPIE). AI offers economical solutions for overstretched health systems and burned-out staff, and it already shows strengths in speeding up long-term and minute research practices and in providing unique accessibility accommodations. However, AI can also be used to create personas and virtual PPIE panels, which can speak completely or partially for human patients with lived experience of conditions, thus minimising, distorting, or erasing their voices from collaborative research processes. AI poses risks through several distorting factors, including hallucinations, overconfidence, sycophancy, bias, sexism, and racism. Staley and Barron have argued that learning is the greatest outcome of PPIE. However, if researchers, professionals, and staff use AI chatbots in conjunction with or in lieu of human collaborators, the amount of learning that takes place is greatly reduced, according to AI expert and cultural critic Ethan Mollick. In conclusion, we provide a checklist to guide professionals and researchers in ethical and responsible uses of AI that preserve the voices and roles of patients, members of the public, and people with lived experience.

  • Opioid-related Drug-Drug Interactions and Harm to Hospitalized Patients: A Retrospective Multicenter Cohort Study

    Date Submitted: Mar 29, 2026
    Open Peer Review Period: Mar 30, 2026 - May 25, 2026

    Background: Opioid-related drug–drug interactions (DDIs) are common in hospitalized patients and can lead to serious harm, especially when opioids are combined with central nervous system depressants. Electronic medical records (EMRs) often trigger DDI alerts to warn clinicians of potential DDIs, but the effect of DDI alerts on clinically relevant opioid DDIs and related patient harm remains uncertain. This study evaluated whether EMR-integrated opioid DDI alerts reduce clinically relevant interactions and associated harms in routine hospital care. Objective: This study aimed to address this gap by determining whether introducing these alerts reduces the prevalence of potentially and clinically relevant opioid-related DDIs, as well as the rate of DDI-related patient harm in hospitalized patients. Methods: This retrospective cohort study was a secondary analysis of a multicenter quasi-experimental controlled pre–post evaluation of EMR implementation across five Australian hospitals. Adult inpatients were randomly selected from all patients who stayed in study hospitals for a one-week period six months before and six months after EMR implementation. Inpatients were included if they had at least one prescribed and administered opioid and one concurrent medication. Interruptive opioid DDI alerts were active only at intervention sites post-EMR. Potential DDIs were identified using Stockley’s Interaction Checker; pharmacists adjudicated clinically relevant DDIs, and clinical pharmacologists assessed DDI-related harm and causality. Clustered logistic regression with generalized estimating equations, adjusting for demographic and clinical variables, estimated the effect of alerts involving opioids on three outcomes: clinically relevant opioid DDIs (primary), any potential opioid DDI, and opioid DDI-related harm. Results: Of 1,144 patients prescribed an opioid, 847 (74.0%) had at least one potential opioid DDI and 548 (47.9%) had at least one clinically relevant DDI.
EMR alerts were associated with no significant change in clinically relevant DDIs (adjusted odds ratio 1.06, 95% CI 0.72–1.55; p=0.75). There was a significant reduction in potential opioid DDIs (adjusted odds ratio 0.55, 95% CI 0.41–0.74; p<0.001). Overall, 11 patients with a total of 38 DDIs experienced harm (0.6% of potential and 1.1% of clinically relevant DDIs), with most DDIs involving pharmacodynamic interactions with concomitant CNS depressants. Conclusion: EMR opioid DDI alerts reduced overall exposure to potential DDIs but did not decrease clinically relevant interactions or related harm. The low rate of harmful events highlights the limited clinical value of current alert systems and the burden of low-value warnings.

  • Background: Artificial intelligence (AI) is quickly becoming a key part of digital health systems in oncology, supporting activities like cancer screening, clinical decision-making, and patient care management. Although AI has the potential to enhance care quality and efficiency, its adoption at cancer centers varies widely, raising concerns about disparities in digital health access and capacity. Objective: This research investigates the multiple factors influencing AI adoption as part of digital health implementation at National Cancer Institute (NCI)-designated cancer centers across the U.S., focusing on institutional readiness, policy environment, and geographic spread. Methods: A national dataset of 75 cancer centers was assembled using public sources to track AI use in screening, treatment, and patient care. AI adoption was measured as a composite index (0-3), indicating integration across clinical areas. Spatial patterns were analyzed with Moran’s I, and multilevel ordered logistic regression models examined links between AI adoption, institutional features (like number of physicians, hospital beds, center type), and contextual factors (such as socioeconomic status and state politics). Results: No significant clustering of AI adoption was found geographically, implying limited regional diffusion. The size of the physician workforce was the most consistent predictor of AI adoption, emphasizing that organizational readiness is a key driver. Policy environment also influenced adoption: comprehensive cancer centers in Republican-controlled states showed higher AI uptake. Socioeconomic status at the community level was not significantly related. Conclusions: This study identifies institutional capacity and policy environment as primary constraints on scalable innovative digital health implementation in cancer institutions.
These results point to structural barriers to broad digital health deployment and indicate that advancing AI-enabled cancer treatment will need focused investments in institutional capacity and policy support. Without these efforts, disparities in digital health infrastructure could restrict equitable access to AI-driven innovations in oncology.
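The spatial step in the Methods above (testing for geographic clustering of AI adoption with Moran's I) can be illustrated with a small sketch of the global statistic. The abstract does not describe how the spatial weights were built or how significance was assessed, so the weights matrix below is a purely hypothetical input, and the function name is illustrative.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix.

    `weights[i][j]` is the (hypothetical) spatial weight between units i and j,
    with zeros on the diagonal. Values near 0 suggest no spatial clustering,
    positive values suggest that similar values sit near each other.
    """
    n = len(values)
    if n == 0 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")

    mean = sum(values) / n
    dev = [v - mean for v in values]

    w_total = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if w_total == 0 or denom == 0:
        raise ValueError("weights and values must both vary")

    # Cross-products of deviations for every weighted pair of units.
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / w_total) * (num / denom)
```

In practice the observed statistic is compared against its expectation under no spatial autocorrelation, typically with a permutation test, which this sketch leaves out.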

  • Digital Delivery of Lifestyle Interventions in Online Clinical Trials: An Umbrella Scoping Review

    Date Submitted: Mar 23, 2026
    Open Peer Review Period: Mar 24, 2026 - May 19, 2026

    Background: Noncommunicable diseases (NCDs) cause a substantial health and economic burden. As they are associated with modifiable behavioral risk factors such as physical inactivity and poor diet, more evidence on effective health behavior modification methods is needed. Fully online delivery of clinical trials can provide a practical and scalable way to evaluate interventions that aim to modify relevant lifestyle factors. The emergence of online delivery methods presents opportunities and challenges that need to be better understood to inform future research. Objective: This umbrella scoping review aimed to summarize current evidence on the opportunities and challenges provided by lifestyle intervention clinical trials that are delivered fully online. Methods: Evidence was synthesized from existing peer-reviewed review papers to map the digital delivery methods in online lifestyle intervention trials, focusing on technologies, recruitment, engagement and retention strategies, and reported strengths, limitations, and future directions. Using PRISMA-ScR guidelines, PubMed, EMBASE, CINAHL, Web of Science, and Scopus were searched for reviews published between January 2013 and May 2025. Predominantly (>50%) hybrid, telehealth, or acute condition focused interventions were excluded. Results: Eligible reviews (n=39) discussed digital interventions targeting diet, physical activity, or both, for lifestyle improvement, chronic disease prevention or management. The most common hardware used in online lifestyle clinical trials was smartphones and wearables, with the most frequent software modes being web-based platforms, mobile apps, and SMS. Successful engagement strategies often integrated behavior change techniques, such as goal setting, self-monitoring, personalized feedback, and human support, into the intervention design, or had behavior change techniques as a feature of the technology itself.
Reported strengths of conducting clinical trials online included improved accessibility, scalability, cost-efficiency and personalization, whereas limitations discussed were poor engagement and retention, digital literacy barriers, and rapid technological change outpacing evaluation capabilities. Interventions that used theory-based designs, particularly those using Social Cognitive Theory, the Transtheoretical Model, and the Theory of Planned Behavior, were reportedly most successful in improving behavioral outcomes. Engagement and retention varied considerably across online trials, suggesting that the success of these studies may depend less on the online delivery modality itself and more on how interventions and technologies are designed, including the integration of behavioral theory and behavior change techniques. Conclusions: This review shows that online delivery of lifestyle intervention trials is a feasible and potentially advantageous method as it can improve reach, increase scalability, be cost-efficient, and allow more personalization of the intervention. To further improve the conduct of online clinical trials, future research should address increased use of behavioral change theory, equitable access to clinical trial participation, management of data privacy and security, intervention fidelity, and use of novel technologies such as artificial intelligence in a field that is rapidly evolving. Clinical Trial: The protocol was prospectively registered with the Open Science Framework https://osf.io/umcfv; 5 September 2023

  • Background: Clinical guidelines recommend an integrated, person-centered care model with better control of modifiable risk factors and coexisting conditions in patients with atrial fibrillation (AF), but many persons with AF receive insufficient risk factor management. Digital health technologies may provide valuable support in addressing this gap. Objective: Our aim was to evaluate a co-designed digital platform for supporting person-centered management of modifiable risk factors in individuals with AF. Methods: This is a mixed-methods study including a standardized quantitative questionnaire used to score the usability of digital tools, the System Usability Scale (SUS), and a qualitative, descriptive, manifest content analysis of individual interviews. Results: Twenty-two patients hospitalized for AF were included (age 68 (48-79) years; 32% female; BMI 27.7 (20.8-35.0) kg/m²; paroxysmal/persistent AF (36%/64%); AF duration 4 (0.5-18) years). Relevant comorbidities were hypertension (77%), heart failure (36%), diabetes mellitus type 2 (14%), and ischemic heart disease (18%). Usability was rated high, with a mean SUS score of 75 (±18.2), indicating above-average user acceptance. Participants’ requirements were summarized into four main categories and ten subcategories. First, they value a clear layout with simple design and easy navigation. Second, they appreciate positive content, which is informative, inclusive and motivating. Third, they request personalized information that covers different aspects and is provided at different levels. Fourth, they desire individualized medical recommendations that are personalized and flexible but open to individual choice. Conclusions: To improve digital management of lifestyle-related risk factors and comorbidities, individuals with AF seek a solution with a clear layout, positive content, personalized information, and individualized medical recommendations.
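For readers unfamiliar with how a SUS score such as the mean of 75 reported above is derived, the standard scoring rule for one respondent can be sketched as follows. The function name is illustrative and the example responses are hypothetical, not study data.

```python
def sus_score(responses):
    """Score one System Usability Scale questionnaire on a 0-100 scale.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten ratings, each between 1 and 5")

    total = 0
    for item, rating in enumerate(responses, start=1):
        if item % 2 == 1:
            # Odd-numbered items are positively worded: contribution is rating - 1.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded: contribution is 5 - rating.
            total += 5 - rating

    # The summed contributions (0-40) are scaled to 0-100.
    return total * 2.5
```

For example, a respondent who answers 3 on every item receives a score of 50, and a study-level SUS result is the mean of such per-respondent scores.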

  • Background: Chronic primary pain is a complex condition involving biological, psychological, and behavioral mechanisms and is commonly associated with emotional distress and reduced quality of life (QoL). Digital mental health interventions (DMHIs) offer scalable and accessible solutions for delivering psychological care in chronic pain management; however, evidence regarding their effectiveness across delivery modalities and outcome domains remains heterogeneous. Objective: This systematic review aimed to (1) evaluate the effectiveness of DMHIs on clinical (pain intensity, disability) and psychological outcomes (QoL, anxiety, depression, catastrophizing, and self-efficacy) in adults with chronic primary pain; (2) examine whether specific digital delivery modalities are differentially associated with particular outcomes; and (3) identify methodological gaps to inform future research and implementation. Methods: A systematic literature search was conducted in PubMed, Scopus, PsycINFO, Cochrane Library, Web of Science, and Google Scholar following PRISMA guidelines. Two independent reviewers screened randomized controlled trials (RCTs) and assessed risk of bias using the Cochrane Risk of Bias 2.0 tool. Given substantial heterogeneity in study designs, interventions, and outcome measures, a narrative synthesis was performed. Results: Twenty-two RCTs were included. DMHIs were effective in improving psychological functioning and pain-related disability, often independently of changes in pain intensity, particularly when grounded in evidence-based psychotherapeutic frameworks such as cognitive behavioral therapy and acceptance and commitment therapy. Guided web-based interventions demonstrated the most consistent benefits, whereas unguided interventions showed smaller effects. Mobile applications and virtual reality–based interventions also showed positive effects on emotional functioning, self-management, and pain interference. 
Interventions incorporating some form of human guidance were generally associated with superior outcomes. Conclusions: DMHIs represent a promising, scalable, and person-centered approach to improving psychological well-being and functional outcomes in adults with chronic primary pain, particularly when integrated into stepped-care or hybrid care models. Clinical Trial: CRD420251010767