The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing. Since December 2009 it has offered open peer review, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently under consideration by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints.

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and the editor has not yet made a decision. (This feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.)

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Background: Medication nonadherence after percutaneous coronary intervention (PCI) remains a major barrier to secondary prevention. Prior SMS-based interventions have shown inconsistent results, often limited to reminders without addressing behavioral or psychological determinants. Objective: To evaluate the effectiveness and feasibility of a theory-informed, WeChat-based text-messaging program for improving adherence and patient-reported outcomes in post-PCI patients. Methods: A nonrandomized, quasi-experimental parallel-group study with a mixed-methods design was conducted from July 2022 to April 2023 at a tertiary hospital in Hangzhou, China. Patients were allocated by ward admission to intervention or control groups. The intervention comprised a 12-week WeChat program with daily medication reminders and educational messages mapped to COM-B domains and behavior change techniques. The primary outcome was adherence (MMAS-8); secondary outcomes were medication beliefs (BMQ-Specific), self-efficacy (SEAMS), and health status (SAQ). Outcomes were measured at baseline, discharge, and 12 weeks by blinded assessors. Analyses used ANCOVA with baseline adjustment and multiple imputation. After 12 weeks, semi-structured telephone interviews with intervention participants explored usability, message clarity and relevance, reminder helpfulness, timing and tone, and unmet needs. Interviews were conducted by an independent researcher, purposively sampled until thematic saturation, and thematically analyzed from verbatim transcripts. Results: Of 180 patients screened, 92 were enrolled and 87 (94.6%) completed follow-up. At 12 weeks, adherence was higher in the intervention group (adjusted mean MMAS-8, 7.38 vs 6.20; mean difference, 1.18 [95% CI, 0.95–1.41]; P<.001). Secondary outcomes also favored the intervention, including improved beliefs, self-efficacy, and quality of life. 
Feasibility benchmarks were exceeded, with deliverability 97.9%, retention 95.7%, and acceptability scores >8/10. Interviews confirmed usability and highlighted the need for personalization. Conclusions: A theory-informed, WeChat-based program improved adherence, beliefs, self-efficacy, and quality of life after PCI. The intervention was feasible, acceptable, and scalable. Randomized trials are warranted to confirm long-term effectiveness and cost-effectiveness. Clinical Trial: Chinese Clinical Trial Registry (ChiCTR2200061353) https://www.chictr.org.cn/bin/project/edit?pid=172238

  • From Measurement Failure to Privacy Infrastructure: Reframing Contact Tracing Governance for the Next Pandemic

    Date Submitted: Apr 27, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Effective infectious disease control rests on a foundational principle: no measurement, no understanding; no understanding, no control. The COVID-19 pandemic exposed, with devastating clarity, how thoroughly this principle can fail in public health practice. Transmission chains spread invisibly; contact histories, mobility patterns, and biosignals essential for control were never systematically collected. The necessary sensors and digital technologies existed — the fundamental reason measurement failed was not the absence of technology, but the absence of privacy infrastructure that would allow people to share data with confidence. This failure has structural roots. The objects of measurement in infectious disease control are not physical phenomena but human beings, and measurement therefore inevitably engages the core of privacy: contact histories, social relationships, and bodily states. This asymmetry — whereby greater measurement precision deepens privacy intrusion — manifested acutely in COVID-19 contact tracing apps. Designs that prioritized privacy lost epidemiological utility; designs that prioritized utility were rejected through public distrust. Neither direction achieved sufficient measurement. This Viewpoint reframes the problem. Privacy protection is not a constraint that impedes infectious disease control; it is the enabling condition upon which effective measurement depends. Existing regulations and technical approaches were not designed from this premise, and have therefore been unable to break the cycle of structural distrust. As one institutional approach to filling this gap, we present VRAIO (Verifiable Record of AI Output), which integrates democratic rule-setting, metadata declaration, independent third-party verification, tamper-proof ledgers, and violation-deterrence incentives. 
When privacy infrastructure is established, the foundational scientific principle "no measurement, no understanding; no understanding, no control" will begin to operate freely in infectious disease control for the first time. This opens the path toward high-resolution epidemiology and precision intervention: a new public health paradigm that simultaneously pursues strengthened disease control and the preservation of individual autonomy and social freedom, without dependence on blanket social restrictions.

  • From Innovation to Responsibility: A Scoping-Umbrella Review of Artificial Intelligence in Mental Health

    Date Submitted: Apr 26, 2026
    Open Peer Review Period: Apr 27, 2026 - Jun 22, 2026

    Background: Artificial intelligence (AI) has rapidly transformed psychological research and mental health practice through advances in machine learning, deep learning, natural language processing, and large-scale data analytics. AI-based systems are increasingly employed to support psychological assessment, diagnosis, intervention, monitoring, and clinical decision-making. This rapid expansion has resulted in a substantial and growing body of empirical and review literature. However, despite the accelerated development of AI applications in psychology, discussions surrounding ethics, legal frameworks, and governance have not progressed at a comparable pace. Objective: Concerns related to privacy, transparency, data security, informed consent, algorithmic bias, and emotional safety remain particularly critical in psychological contexts, where AI systems may influence highly sensitive aspects of human experience. Given the rapidly evolving and heterogeneous nature of the literature, this study aimed to conduct a scoping umbrella review to map the breadth of existing evidence, identify key thematic domains, and highlight gaps in the application of AI in mental health. Methods: Abstracts of 1,827 records retrieved from Web of Science (n = 50), PubMed (n = 677), and Scopus (n = 1,100) were screened. Following a full-text assessment of 218 potentially eligible studies, a total of 182 review articles were included in the final synthesis. Results: The findings indicate that research on AI in psychology is primarily organized around five thematic domains: intervention, diagnosis, prediction, theoretical framework, and ethical issues. The intervention domain represents a substantial proportion of the literature, suggesting that AI is most frequently examined in relation to applied psychological functions. In contrast, ethical issues are comparatively underrepresented. 
This pattern reflects a broader imbalance in which the field is progressing through application-driven innovation, while ethical reflection remains relatively limited and often theoretical. Although AI-based interventions and assessment tools are expanding rapidly, only a small number of reviews have systematically examined how these systems address core ethical concerns, including informed consent, data privacy, accountability, cultural bias, and emotional safety. Furthermore, the increasing reliance on cloud-based infrastructures introduces additional challenges related to confidentiality, cross-border data transfers, third-party access, and system reliability in sensitive clinical settings. Conclusions: Taken together, these findings underscore the risk of integrating AI technologies into psychological practice without sufficient ethical, clinical, and infrastructural safeguards. Future research should prioritize the development of evidence-based and context-sensitive ethical frameworks, alongside the exploration of alternative implementation models—such as local or hybrid infrastructures—that can better balance scalability, privacy, and institutional control.

  • Background: With the rapid development and widespread application of artificial intelligence (AI) technology, AI has demonstrated high accuracy and reliability in medical practice, and patients' trust in algorithms has gradually increased. However, in clinical practice, disagreements may still arise between algorithmic recommendations and clinical expert experience, and such disagreements can affect patients' trust. To date, the impact of these disagreements on patients’ medical trust and the strategies for addressing them have not been systematically reviewed. Objective: To systematically map the impact of disagreements between AI recommendations and clinical expert judgment on patients’ medical trust, identify influencing factors based on Mayer’s integrative model of organizational trust, and summarize strategies to enhance trust. Methods: Following the Joanna Briggs Institute (JBI) scoping review methodology and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guideline, we systematically searched Web of Science, PubMed, Embase, Scopus, and EBSCO up to March 2026, limited to English-language literature. Studies focusing on patients' trust in the context of disagreements between AI and expert opinions were included. Data were charted using the Population, Concept, Context (PCC) framework. Guided by Mayer’s integrative model of organizational trust, influencing factors were analyzed through a framework synthesis approach across the dimensions of ability, benevolence, integrity, and trustor propensity. The protocol was pre-registered on OSF (Registration DOI: 10.17605/OSF.IO/AHSGD). Results: A total of 2,630 records were identified, and 26 studies were ultimately included after screening, including six qualitative studies, seven quantitative studies, three mixed-methods studies, five theoretical studies, and five review articles. 
These studies were conducted across 10 countries and were published mainly between 2022 and 2026. Disagreements were concentrated in clinical diagnosis and risk assessment, treatment planning and medication decision-making, clinician–patient communication and intelligent interaction, as well as emerging application scenarios. In situations of disagreement, patients commonly expressed skepticism toward both algorithms and experts; overall, however, patients tended to trust experts more than algorithms. Data security and privacy risks, insufficient communication, AI accuracy and reliability, demographic and socioeconomic characteristics, and patients’ disease and health status were identified as high-frequency factors influencing patients’ medical trust. Six trust-enhancing strategies were extracted: transparency and explainability, patient participation and shared decision-making, clinician–patient communication and role positioning, institutional regulation and governance, education and capacity building, and privacy protection and data security. Conclusions: In situations of disagreement between AI and clinical experts, patients’ medical trust is dynamically shaped by ability, benevolence, integrity, and individual-contextual multiple interacting factors. Strengthening transparency, communication, and governance is essential for fostering trust in human–AI collaborative healthcare.

  • Deep Learning Algorithms for Predicting Intraoperative Hypotension: A Systematic Review and Meta-Analysis

    Date Submitted: Apr 25, 2026
    Open Peer Review Period: Apr 25, 2026 - Jun 20, 2026

    Background: Intraoperative hypotension (IOH) is associated with myocardial injury, acute kidney injury, perioperative stroke, and 30-day mortality, yet conventional blood pressure monitoring remains reactive rather than anticipatory. Deep learning (DL) algorithms applied to continuous physiological waveforms represent a rapidly expanding paradigm for early IOH prediction, but the comparative performance of distinct DL architectures and the influence of prediction-window length, input data modality, IOH reference standard, and analysis unit on diagnostic accuracy have not been systematically synthesised. Objective: To quantify the pooled diagnostic accuracy of DL-based IOH prediction models and to identify methodological and clinical factors that modify their performance. Methods: PubMed, Embase, Web of Science, and the Cochrane Library were searched through March 2026. Methodological quality was appraised with the PROBAST+AI tool and overall certainty of evidence with the GRADE framework. A bivariate random-effects model generated pooled sensitivity, specificity, and the area under the summary receiver operating characteristic (SROC) curve, with heterogeneity quantified by τ²(Se), τ²(Sp), and the inter-study correlation ρ. Threshold effect was tested with Spearman’s correlation, publication bias with Deeks’ test, and clinical utility with Fagan’s nomogram. Prespecified subgroup analyses (prediction window, DL architecture, input modality, IOH reference standard, analysis unit) and Bayesian random-effects meta-regression explored heterogeneity sources. Results: Twelve studies were included; nine contributed 22 validation datasets to the quantitative synthesis. The pooled sensitivity was 0.78 (95% CI 0.73–0.81), specificity 0.88 (0.82–0.92), and SROC-AUC 0.87 (0.83–0.90); the diagnostic odds ratio was 24.7 (16.1–37.9), positive likelihood ratio 6.31, and negative likelihood ratio 0.26. 
Heterogeneity was τ²(Se) = 0.25, τ²(Sp) = 1.04, and ρ = −0.28; no significant threshold effect was detected (Spearman ρ = 0.29, P = 0.20). The 5-minute window achieved the highest performance (sensitivity 0.81, 95% CI 0.77–0.85; specificity 0.91, 0.84–0.95). Meta-regression identified DL architecture as the only significant moderator of specificity (P = 0.02), with hybrid CNN-RNN exceeding pure CNN (β = 1.77, 95% CI 0.45–3.09); no covariate significantly moderated sensitivity. Deeks’ test showed no statistically significant publication bias (P = 0.06). At a 10% pre-test probability, post-test probabilities were 41% (positive) and 3% (negative). GRADE certainty was Low. Conclusions: Deep learning models for IOH prediction achieve moderate diagnostic accuracy, with hybrid CNN-RNN architectures and 5-minute prediction windows showing the most favourable performance. The universal absence of formal calibration assessment, scarce external validation, and geographic concentration of the evidence base constrain immediate clinical translation. Prospective multinational validation with mandatory calibration reporting and patient-level evaluation is required before DL-based IOH alerts can be safely integrated into perioperative decision support. Clinical Trial: PROSPERO CRD420261377604.
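    The post-test probabilities quoted in the abstract above can be reproduced from the reported likelihood ratios with the standard odds form of Bayes' rule, which is what Fagan's nomogram encodes graphically. The sketch below is only an illustration: the function name and constants are hypothetical, and the inputs are the rounded values from the abstract (10% pre-test probability, positive likelihood ratio 6.31, negative likelihood ratio 0.26), not the authors' underlying data or code.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds.

    Hypothetical helper for illustration; not taken from the submission.
    """
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio            # Bayes' rule in odds form
    return post_odds / (1.0 + post_odds)               # odds -> probability


# Rounded values as reported in the abstract (assumed inputs).
PRE_TEST = 0.10
LR_POSITIVE = 6.31
LR_NEGATIVE = 0.26

after_positive = post_test_probability(PRE_TEST, LR_POSITIVE)
after_negative = post_test_probability(PRE_TEST, LR_NEGATIVE)

print(f"Post-test probability after a positive alert: {after_positive:.0%}")  # 41%
print(f"Post-test probability after a negative result: {after_negative:.0%}")  # 3%
```

    Read this way, a positive alert raises the estimated chance of hypotension from 10% to about 41%, while a negative result lowers it to about 3%.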

  • Background: Youth experiencing mental health and substance use (MHSU) challenges face notable barriers to accessing adequate care, including limited service availability and a lack of youth-centered, evidence-based treatments. The prevalence of digital device ownership among youth presents an opportunity to help bridge this treatment gap using these accessible tools to deliver scalable evidence-based interventions. Self-guided digital MHSU interventions (ie, self-directed, technology-delivered psychosocial interventions not requiring clinical or technical support) present interesting opportunities for service providers or youth looking for self-help-style interventions. Many self-guided digital interventions have been developed for youth, yet little guidance on the effectiveness of these interventions exists for those looking to leverage evidence-based self-guided tools for these populations. Objective: This systematic review aimed to synthesize the evidence for self-guided digital MHSU interventions developed for youth dealing with MHSU disorders. Methods: Five major databases were searched for controlled trials of self-guided digital interventions targeting MHSU disorders in youth (12 - 25 years). Search concepts included: youth, mental health/substance use, digital, intervention, effectiveness. Eligible studies included trials with passive controls (determining initial effectiveness) or active comparators (determining superiority). Data describing trial characteristics, intervention characteristics, and MHSU outcomes were extracted. Risk of bias was assessed using Cochrane Risk of Bias 2.0 tool. Findings were synthesized narratively, both by MHSU disorder and by digital modality. Results: The search yielded 15,828 unique records; 76 trials met inclusion criteria. 
Interventions targeted symptoms relating to depression, alcohol use disorder, anxiety, eating disorders, tobacco use disorder, cannabis use disorder, post-traumatic stress disorder (PTSD), suicide, obsessive-compulsive disorder, and attention-deficit/hyperactivity disorder (ADHD). Interventions used diverse modalities: web-based interventions, mobile applications, text messages, computers, chatbots, video games, and wearables. Across disorders, 33 of 74 (45%) initial effectiveness evaluations and 3 of 25 (12%) superiority evaluations were positive. Across digital modalities, 33 of 71 (46%) initial effectiveness and 3 of 21 (14%) superiority evaluations were positive. Notably, in superiority evaluations, whether classified by disorder or modality, the majority of digital interventions performed similarly to their active comparators (71% - 72%). Positive evaluations of initial effectiveness were more common for interventions targeting PTSD and tobacco use disorder, and for interventions using chatbot- or computer-based interventions. Positive evaluations were limited for ADHD- or cannabis-focused interventions. Three of 76 trials (4%) were rated as low risk of bias; the remainder had some concerns or high risk of bias. Conclusions: This review synthesizes evidence for self-guided digital MHSU interventions and demonstrates their potential to address MHSU disorders in youth. Although more rigorous evaluations are still needed, this review identifies numerous effective self-guided digital interventions that can be used to help youth struggling with MHSU disorders, and identifies trends within modalities that might be considered for future MHSU intervention development.

  • Background: Providing care to a family member or friend with a serious illness like cancer increases risk for poor physical, psychological, and functional health outcomes. Despite their critical role, family caregivers (FCGs) are rarely screened in clinical settings for the wide range of factors that may put them and the person they care for at risk for poor outcomes. Mobile health (mHealth) applications can efficiently facilitate access to high-quality health information for FCGs; however, few are clinically integrated. Objective: This study aimed to evaluate the usability of CareCheck, an mHealth-based digital risk screening tool designed to enable FCGs' self-awareness of potential caregiving-related risks for adverse health and psychosocial outcomes and to support health care professionals in personalizing interventions that address FCGs' specific risk factors. Methods: We conducted a usability testing study of CareCheck using two evaluation methods: quantitative measurement with a modified 5-item Mobile Health App Usability Questionnaire (MAUQ) and exploratory qualitative thematic analysis based on feedback from FCGs and trained staff. FCGs of individuals with gynecologic cancer were recruited through the inpatient unit and the outpatient gynecologic oncology clinic of a Comprehensive Cancer Center. Participants completed CareCheck and the usability questionnaire via the mHealth app installed on tablets. Staff observed the assessment process and provided feedback. Results: A total of 56 FCGs and 2 trained staff participated in the usability study. The mean MAUQ score was 6.49 (SD = 1.06) out of 7, indicating high usability. Qualitative analysis identified recommendations in three categories: (1) Improvements to CareCheck; (2) Perceptions of CareCheck’s Usability and Functionality; and (3) Clinical Implementation Considerations for CareCheck. Conclusions: FCGs and staff found CareCheck to be user-friendly and easy to navigate. 
While further iterations are needed to refine content and optimize integration with clinical workflows, CareCheck demonstrated potential as a clinically integrated tool for identifying and addressing FCG risk for poor social, psychological, or health outcomes in gynecologic oncology care settings.

  • Background: Hypertension remains a predominant global risk factor for cardiovascular disease. Conventional follow-up models frequently fail to address the requirements for real-time monitoring and sustained intervention, whereas mobile health (mHealth) offers a transformative trajectory for chronic disease management. Despite a surge in relevant literature, the diversity of intervention modalities and the fragmented nature of existing evidence necessitate a systematic synthesis. Objective: This study aimed to comprehensively evaluate the efficacy of mHealth in hypertension management through a systematic review combined with evidence mapping, identifying research gaps to provide evidence-based insights for precision nursing and future research directions. Methods: A systematic search was conducted across PubMed, Web of Science, Cochrane Library, and Embase for randomized controlled trials (RCTs) involving mHealth interventions for hypertension, with the search period extending through February 2026. Literature was screened according to PICOS criteria, and methodological quality was appraised using the Cochrane Risk of Bias tool (RoB 1.0). Visual analytics, including Sankey diagrams and bubble plots, were employed to characterize the associations between intervention modalities and clinical outcomes. The study protocol was prospectively registered on the Open Science Framework (URL: https://osf.io/2vkwu). Results: A total of 106 publications (comprising 108 RCTs) were included. Publication volume has increased significantly since 2018, with the United States (31 papers) and China (19 papers) being the primary contributors. The intervention paradigm has evolved from rudimentary SMS reminders to a "closed-loop" management model centered on "App + Remote Monitoring," which demonstrates the most robust and consistent positive evidence for blood pressure (SBP/DBP) control and goal attainment rates. 
Blood pressure parameters occupied the "core evidence layer," while therapeutic adherence and disease knowledge formed the "behavioral evidence layer". Conversely, BMI, mental health, and quality of life remained in the "peripheral evidence layer," characterized by a notably higher proportion of non-significant results. Methodological quality was generally moderate-to-high with robust randomization; however, the implementation of blinding faced prevalent high risks due to the inherent nature of the interventions. Conclusions: mHealth significantly enhances hypertension management efficacy through a digital "monitoring-feedback-adjustment" loop, yet it encounters bottlenecks in achieving profound lifestyle modifications (e.g., weight management) and psychological interventions. Clinical decision-making should prioritize multicomponent interventions featuring real-time interaction. Future research should focus on long-term (>1 year) follow-up and cost-effectiveness transformation in resource-limited settings.

  • Background: Molecular tumor boards (MTBs) generate highly technical recommendations. The language used in their protocols is rarely accessible to patients. Lay-language patient protocols could support patient-clinician communication, yet manual production is difficult to sustain in high-volume oncology settings. Large language models (LLMs) may offer scalable drafting assistance, yet clinical usability remains largely uninvestigated under real-world deployment constraints. Existing evaluations rely predominantly on synthetic data or closed-source models that are incompatible with strict data protection requirements. Objective: This study evaluated whether open-weight LLMs can provide clinically usable drafting support for German MTB patient protocols under real-world deployment constraints and developed a transferable evaluation framework for patient-facing text generation. Methods: Eight open-weight LLMs were evaluated under zero-shot (A1) and one-shot (A2) prompting with constrained decoding, which ensures section-schema compliance. Automatic evaluation used ROUGE-1, BERTScore-F1, WSTF4, and DistilBERT-based complexity using a corpus of 316 MTB protocols and 47 expert-written patient protocols. For expert evaluation, seven medical oncologists evaluated 50 protocols from the best-performing model across three ISO 9241-11 usability dimensions using fine-grained error annotation, perceived post-editing effort (PPEE), and net promoter score (NPS). Critical errors were defined as bearing the risk of patient harm. Results: Llama-3.3-70B-Instruct achieved the strongest automatic performance. Across models, A2 significantly improved most automatic metrics compared to A1. However, expert usability evaluation showed the opposite picture: the proportion of protocols containing at least one critical error was twice as high under A2 as under A1 (40% vs. 20%), and the dominant error type shifted from language errors (37%) to factual errors (48%). 
Overall, 6.1% of the annotated paragraphs contained errors. Median PPEE was 2 (low) and median NPS was 7. Detractors (46%) outweighed promoters (29%), which signals clinical hesitation toward routine adoption. Conclusions: Prompting strategies that improve automatic metrics can simultaneously increase the number of critical errors. Surface-level metric gains were, therefore, insufficient proxies for clinical safety. Nonetheless, the low paragraph-level error rate and favorable PPEE suggest that structured open-weight LLM generation may be a useful drafting aid in a clinician-supervised setting. The proposed evaluation framework establishes a text-quality-focused basis for future assessment of patient-facing LLM applications in real-world clinical settings.
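    For readers less familiar with the net promoter score, the sketch below shows the conventional calculation: ratings on a 0 to 10 scale are split into promoters (9 to 10), passives (7 to 8), and detractors (0 to 6), and the score is the promoter share minus the detractor share. That the study used these standard cut-offs is an assumption; the function names and the sample ratings are hypothetical, and only the 29% and 46% shares come from the abstract above.

```python
from typing import Sequence


def net_promoter_score(ratings: Sequence[int]) -> float:
    """Return the NPS (-100..100) for 0-10 ratings using the conventional cut-offs.

    Assumes promoters are 9-10 and detractors are 0-6; the abstract does not
    state which cut-offs the authors applied.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must lie between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def nps_from_shares(promoter_pct: float, detractor_pct: float) -> float:
    """NPS computed directly from percentage shares of promoters and detractors."""
    return promoter_pct - detractor_pct


# Shares reported in the abstract: 29% promoters, 46% detractors.
print(nps_from_shares(29, 46))  # -17

# Made-up ratings, only to exercise the helper.
sample_ratings = [9, 10, 7, 6, 3, 8, 9, 5, 10, 6]
print(net_promoter_score(sample_ratings))
```

    Under these assumptions the reported shares correspond to a net score of -17, which is consistent with the abstract's reading that detractors outweighed promoters even though the median rating was 7.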

  • Background: As the digital transformation of healthcare systems accelerates, interest in the Social and Digital Determinants of digital health tools is growing. Objective: This OSF-registered scoping review maps how these determinants are assessed in Germany and France and identifies research implications. Methods: Using the PiCo framework, PubMed, Scopus, and Web of Science were searched for peer-reviewed primary studies (2015-2025; English, German, French) on access and use of digital health tools in Germany or France. Eligible studies were screened and extracted using a standardized template in the Covidence software. Descriptive and exploratory analyses were conducted in SPSS. Exploratory correlation-based heatmaps were used to visualize recurring determinants, barriers, and facilitators shaping digital health access and use. Results: Seventy studies (59 from Germany and 11 from France) were included; most were quantitative and cross-sectional. Frequently reported social determinants of health consisted of age, gender, education, geographic location, and health literacy, while central digital determinants of health included digital literacy, trust in digital tools, perceived usefulness, usability, and access to digital infrastructure. Comparative analyses revealed both shared patterns and country-specific emphases, with German studies more frequently addressing eHealth literacy and functional aspects of digital health, while French studies placed greater emphasis on social environment, housing conditions, and ethical considerations. Conclusions: Whether social or digital, most determinants were either person- or technology-centered. While reflecting an emerging field focused on individual-level factors, this emphasis risks overlooking broader, multi-level mechanisms of social inequalities that may also shape digital health access and use. Clinical Trial: not applicable.

  • Background: Digital interventions may help reduce exacerbations and increase adherence to the Chronic Obstructive Pulmonary Disease (COPD) action plan by providing opportunities for self-monitoring of symptoms and self-management behaviors, and remote support from a case manager. Objective: To compare the impact of a COPD Patient Health Portal (PHP) on action plan adherence during exacerbations vs. usual care. A secondary objective was to evaluate the association of COPD severity, sociodemographic characteristics, knowledge, self-efficacy, anxiety, and depression with adherence to the action plan. Methods: This trial was a 12-month, parallel, 2-arm RCT. Participants recruited from a specialty clinic were assigned to either (1) the COPD PHP with self-monitoring, automated feedback, and online communication with a nurse case manager, plus usual care, or (2) usual care, which consisted of self-management support delivered by a nurse case manager with access to online educational material (Living Well with COPD). Analyses followed the intent-to-treat principle. The primary outcome was self-reported adherence to the exacerbation action plan. Secondary outcomes included COPD self-efficacy, HRQL, healthcare utilization, anxiety, depression, technology acceptance (TAM), and PHP use. Results: Forty-nine participants were randomized to either intervention (n=24) or control (n=25). Groups were similar in age, sex, BMI, education, marital status, smoking status, spirometry and GOLD classification. While a greater proportion of intervention participants were adherent to the action plan (47%, 20/43 exacerbations) compared to the control arm (37%, 18/48 exacerbations), differences were not statistically significant. Twelve percent of exacerbations in the intervention arm led to the participant contacting a health professional compared to 4% in the control group. 
The average exacerbation length among individuals who contacted a healthcare professional during the exacerbation and took a medication was significantly lower in the intervention (days=10.6 ± 2.2, n=5) vs. control (29.5 ± 6.4, n=2) group. There was no effect of treatment arm on adherence to action plan at the exacerbation level (OR 1.16, 95% CI 0.44-3.03) nor of GOLD severity group (GOLD 3 vs 2: OR 1.6 and 95% CI 0.5-5.7, GOLD 4 vs 2: OR 1.3 and 95% CI 0.4-4.2) in adjusted models. Conclusions: A 12-month PHP intervention did not significantly increase adherence to the exacerbation action plan. There was a trend for those using the PHP to have fewer exacerbations in the first 4 months and with a shorter duration when contacting their HCP resulting in taking medication.
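As a rough illustration of the adherence figures in the abstract above, the following minimal sketch computes the crude per-exacerbation adherence proportions and a crude (unadjusted) odds ratio from the reported counts (20/43 vs 18/48). The function name `crude_odds_ratio` is hypothetical. The result is not expected to match the adjusted OR of 1.16 reported by the authors, whose model is not described in enough detail here to reproduce.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A vs group B from 2x2 counts."""
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    return (events_a * non_events_b) / (non_events_a * events_b)


# Counts taken from the abstract: adherent exacerbations / all exacerbations.
intervention_adherent, intervention_total = 20, 43
control_adherent, control_total = 18, 48

intervention_rate = intervention_adherent / intervention_total  # about 0.465
control_rate = control_adherent / control_total                 # exactly 0.375

# Crude OR only; it ignores covariates and repeated exacerbations per person.
crude_or = crude_odds_ratio(
    intervention_adherent, intervention_total,
    control_adherent, control_total,
)

print(f"intervention: {intervention_rate:.1%}, control: {control_rate:.1%}")
print(f"crude odds ratio: {crude_or:.2f}")
```

Because several exacerbations can come from the same participant, a crude ratio like this treats them as independent, which is one reason it can differ from an adjusted estimate.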

  • Background: Adults may experience subjective cognitive decline (SCD). However, it is unclear whether SCD is related to measurable cognitive impairment, particularly in women aged 40 to 60, or to early dementia. Further, Medicare has mandated assessment of cognitive and memory function in individuals over 65 as part of the Medicare Annual Wellness Visit. In order to assess possible impairment and change over time, efficient, objective measures of SCD are needed. Objective: To assess the relationship between performance on an online continuous recognition task (CRT, MemTrax) and age, sex, and memory concern. Methods: This study evaluated CRT performance in participants aged 21-99 who enrolled in an online program (HAPPYneuron) to measure mental functions, including those who reported concerns about them. This program asked participants if they had complaints about their memory, and then offered them the opportunity to assess cognition using the CRT. This CRT instructs individuals to attend to visual stimuli (50 images) and respond as quickly as possible to repeated images (25 images). The CRT components were used to measure learning and memory (as related to HITs, response to a repeated image), executive function (as related to CRs, correctly not responding to an initial image presentation), and processing speed (HIT-RTs, average response time to HITs). Results: The analysis included 18,178 participants (5,795 males, 32%; 12,383 females, 68%), restricted to those who answered the sex, age, and memory questions. There were 11,786 (65%) between 40 and 70 years of age. Females outnumbered males by over two-fold, beginning about 35 years of age, peaking at 55 years of age at over three-fold, and falling below two-fold at about 65 years of age. Approximately 30% more men complained of memory problems than those who did not, primarily among those 30-60 years old. About 80% more women complained of memory problems, over two-fold more than women who did not, 30-50 years old. 
The number of HITs, number of CRs, and HIT-RTs varied little between men and women. While those without memory complaints generally performed better than those with memory complaints, there was little difference in performance levels for each group between males and females. For all groups, there was a gradual reduction of performance over age for HITs and CRs and a slowing of HIT-RTs. Conclusions: Most subjects were 40-65, more than twice as many females, suggesting that these demographics have a relationship to concern about SCD. However, there was little difference between males and females for the various CRT components, though SCD was associated with impairment. Age-related declines were progressive, the largest being in slower processing speed, presumably to compensate for age-related changes in cognitive function. Present results suggest clinicians may use these metrics to quantify patient concerns expressed in the primary care setting. Clinical Trial: none
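The three CRT components defined in the abstract above (HITs, CRs, and HIT-RT) can be sketched as a small scoring routine. This is a minimal illustration under stated assumptions, not the MemTrax scoring implementation: the `Trial` record, its field names, and the `score_crt` function are hypothetical, and a missing response is represented as `None`.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Trial:
    is_repeat: bool                 # True if the image was shown earlier in the run
    response_time: Optional[float]  # seconds; None if the participant did not respond


def score_crt(trials):
    """Count HITs and correct rejections and average the HIT response time."""
    # HIT: the participant responded to a repeated image.
    hits = [t for t in trials if t.is_repeat and t.response_time is not None]
    # CR: the participant correctly withheld a response to a first presentation.
    correct_rejections = [
        t for t in trials if not t.is_repeat and t.response_time is None
    ]
    hit_rt = mean(t.response_time for t in hits) if hits else None
    return {
        "hits": len(hits),
        "correct_rejections": len(correct_rejections),
        "hit_rt": hit_rt,
    }


# Hypothetical five-trial run, for illustration only.
example = [
    Trial(is_repeat=False, response_time=None),  # correct rejection
    Trial(is_repeat=False, response_time=0.9),   # false alarm
    Trial(is_repeat=True, response_time=0.8),    # hit
    Trial(is_repeat=True, response_time=1.0),    # hit
    Trial(is_repeat=True, response_time=None),   # miss
]
print(score_crt(example))
```

Keeping false alarms and misses out of the three reported components mirrors the abstract, which describes only HITs, CRs, and HIT-RTs.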

  • Momentary Mood State Detection using Smartwatches: Algorithm Development and Validation

    Date Submitted: Apr 20, 2026
    Open Peer Review Period: Apr 21, 2026 - Jun 16, 2026

    Background: Mental health encompasses not only chronic conditions such as depression or anxiety, but also acute fluctuations in mood that unfold over minutes to hours and can disrupt daily functioning. These transient states, such as sudden fatigue, irritability, or low energy, remain largely invisible to current digital health approaches, which typically aggregate behavioral and physiological data over days or weeks to detect trait-level conditions. The ability to detect momentary mood shifts in real time carries significant clinical promise: continuous affective monitoring could enable early detection of mental health crises, support clinical decisions and clinical trials with continuous mood measurements, and improve occupational safety with detection of states like fatigue or confusion. However, affective computing research has demonstrated that while physiological signals carry information relevant to mood, most prior work relies on controlled laboratory settings where performance degrades substantially in naturalistic environments, or employs research-grade devices with proprietary sensors unavailable on consumer hardware. Bridging this gap between laboratory-validated sensing and real-world momentary mood detection is essential for translating these clinical possibilities into practice through just-in-time adaptive interventions. Objective: This study investigates whether continuous sensing from a low-cost, open-source smartwatch can support detection of multi-dimensional momentary mood states in naturalistic settings, using personalized models with on-device computation. 
Methods: We conducted a 7-day field study in which participants (N=10) wore Bangle.js 2 smartwatches that continuously collected physiological and contextual data, including heart rate, accelerometry, barometric pressure, temperature, and GPS, while prompting hourly mood self-reports using the Brunel Mood Scale (BRUMS) across six mood dimensions (tension, depression, anger, vigor, fatigue, confusion) and additional affective and physical states. All feature extraction was performed on-device. We developed personalized mood detection models using best-subset regression across multiple feature combinations. Results: Personalized models decoded momentary states with mean R2 values ranging from 0.09 (pain) to 0.31 (vigor). Fatigue, happiness, vigor, and depression were the most reliably decoded dimensions (mean R2 = 0.26–0.31). Cross-subject decoding was substantially lower, confirming that personalization is essential for accurate mood inference. Including privacy-preserving location features did not significantly improve prediction accuracy beyond physiological and contextual sensors alone. Conclusions: This work demonstrates that a broad range of momentary mood states can be decoded from low-cost, open-source wearable sensors as people go about their daily lives, bridging the gap between controlled laboratory studies and real-world momentary assessment. The finding that personalized models substantially outperform generalized approaches underscores the need for individual calibration in affective computing systems. The on-device, privacy-preserving architecture establishes a foundation for future closed-loop adaptive interventions in clinical and occupational contexts, including continuous monitoring of high-risk psychiatric populations, early warning systems for substance use relapse, and real-time assessment of cognitive and emotional fitness in safety-critical work environments. Clinical Trial: N/A

  • Background: Patients with vestibular schwannoma often experience postoperative vestibular dysfunction, including vertigo, dizziness, and imbalance, which severely impair daily functioning and quality of life. Effective and accessible rehabilitation strategies are therefore essential. Objective: To evaluate the effects of a mobile health-based vestibular rehabilitation program in patients following unilateral vestibular schwannoma surgery. Methods: A prospective randomized controlled trial was conducted. A total of 60 patients who underwent unilateral vestibular schwannoma surgery at the Otology Center, Eye & ENT Hospital, Fudan University, from October 2023 to May 2025, were enrolled and randomly assigned to either the control group or the intervention group (n = 30 each). Both groups underwent a 90-day vestibular rehabilitation program comprising gaze stability exercises, balance training, and gait training. The intervention group received self-assessment, video-based guidance, symptom recording, and automated adherence monitoring via a customized mobile app, whereas the control group received face-to-face guidance and maintained their records using paper diaries. The primary outcome was the between-group difference in the change in Dizziness Handicap Inventory (DHI) score from baseline (preoperative) to 90 days postoperatively. Secondary outcomes included DHI change at 30 days, visual analog scale (VAS) scores, and incidence of vestibular symptoms. Demographic, baseline, and outcome data were collected at admission and on postoperative days 7, 30, and 90. Intention-to-treat analysis was performed; missing continuous data were handled using multiple imputation, and dichotomous variables were imputed with the last observation carried forward. Independent t-tests or Mann-Whitney U tests were used for continuous variables, and chi-square tests for categorical variables. 
Results: The 90-day follow-up primary outcome assessment was completed by 52 of 60 patients (86.7%), with 8 non-responders in the intervention group and 6 in the control group. No significant differences in baseline demographic or clinical data were observed between the two groups, whereas tumor size distribution differed significantly (χ² = –2.513, P=.012), with larger tumors in the intervention group. For the primary endpoint, no significant between-group difference was observed in the change in DHI score from baseline to 90 days (P>.05). For secondary outcomes, no significant differences were found in DHI change at 30 days or VAS scores at any time point (all P>.05). However, on postoperative day 7, the incidence of postural symptoms was significantly lower in the intervention group than in the control group (53.33% vs 83.33%, χ² = 6.239, P = .012). Conclusions: The mobile health–based vestibular rehabilitation program demonstrated comparable efficacy to conventional face-to-face rehabilitation in improving vestibular function and may accelerate recovery during the early phase of unilateral vestibular loss. These findings support the feasibility of mHealth as an alternative approach for postoperative vestibular rehabilitation in patients with vestibular schwannoma. Clinical Trial: Chinese Clinical Trial Registry ChiCTR2200056123; https://www.chictr.org.cn/showproj.html?proj=150939

  • From Digital Access to Digital Assurance: Governing Equity in Digital Medicine.

    Date Submitted: Apr 20, 2026
    Open Peer Review Period: Apr 21, 2026 - Jun 16, 2026

    Digital health technologies are often promoted as means to expand access to care and reduce health disparities. Nevertheless, evidence from large-scale implementations indicates that access alone does not ensure equity. Although access initiates opportunities for care, assurance is required to sustain safety and fairness within digital health systems. Software-based and AI-enabled clinical systems may introduce new forms of exclusion, inequitable benefits, and unintended harm, as their performance varies across populations, clinical contexts, and temporal settings. The availability of a digital solution does not guarantee comparable usability or benefit across generations, due to differences in skills, trust, and access. Similarly, territorial disparities arise where internal and peripheral areas experience infrastructural discontinuities and uneven service provision, increasing the risk of amplifying pre-existing inequalities. Health equity in digital medicine should be conceptualized as a system-level property arising from governance, quality assurance, continuous clinical oversight, and the meaningful involvement of affected communities and patient organizations in governance, monitoring, and accountability processes, rather than as a downstream effect of adoption. Clinical and regulatory evidence shows that inequities often stem from lifecycle blind spots, subgroup performance asymmetries, and fragmented post-deployment accountability. An assurance-oriented approach, grounded in continuous validation, real-world monitoring, and predefined pathways for managing change, provides a clinically meaningful and systemically robust framework for achieving equity in digital medicine. 
In this context, freedom and verifiability in digital systems (defined as interoperability, open standards, independent audit capability, and, when proportionate to risk, technical inspectability and surveyability through open-source components or controlled disclosure) represent an essential ethical dimension. Actionable policy levers, including mandatory subgroup monitoring, investment in governance infrastructure, and funding for adaptive oversight frameworks, can serve as effective mechanisms for decision-makers to ensure fair digital health outcomes.

  • Patient Portal Activation Among Neurology Patients in Washington, DC: A Cross-sectional Study

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Patient portals have become essential infrastructure for healthcare delivery following the 21st Century Cures Act, yet adoption remains inequitable. Understanding demographic and geographic determinants of portal activation is critical for addressing digital health disparities, particularly among neurology patients who face unique access barriers. Objective: We examined the demographic, geographic, and neighborhood-level factors associated with patient portal activation among neurology patients at multiple geographic scales in the Washington, DC metropolitan area. Methods: We conducted a retrospective cohort study of 72,417 adult neurology patients seen at two academic medical centers sharing an electronic health record in Washington, DC (February 2021–February 2026). We examined portal activation using multivariable logistic regression and geographic analysis at four nested scales: the metropolitan catchment area, DC’s eight wards, individual census tracts (via geocoded patient addresses), and individual DC residents. Results: Portal activation was 64.7% overall. Activation varied by race/ethnicity (Non-Hispanic White 76.1%, Non-Hispanic Black 57.0%, Non-Hispanic Asian 57.6%, Hispanic 55.0%) and geography (DC Ward 2: 82.0% vs. Ward 7: 48.0%). Ward-level educational attainment (r = 0.948), broadband access (r = 0.889), and income (r = 0.811) were strongly correlated with activation. Within individual wards, Non-Hispanic White patients activated at 84–91% while Non-Hispanic Black patients activated at 48–64%, demonstrating that neighborhood resources alone do not explain disparities. Conclusions: Patient portal activation is shaped by demographic, socioeconomic, and geographic factors operating at multiple levels. Persistent within-ward racial disparities indicate that neighborhood resources alone do not explain the digital divide. 
Geographically targeted interventions must be paired with culturally tailored approaches to achieve digital health equity.

  • Virtual Reality for Cognitive Mastery in Airway Trauma Management: A Prospective Randomized Controlled Trial

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Innovation in teaching methods is essential for advancing medical education, particularly for trainees developing crisis management skills. Virtual reality (VR) offers immersive, scalable, and accessible learning environments, but its effectiveness compared to traditional mannequin-based simulation remains underexplored. Objective: This prospective randomized controlled trial evaluates the efficacy of VR-based simulation versus traditional gold-standard mannequin-based training in enhancing medical trainees’ knowledge acquisition and application of decision-making concepts for airway trauma management. Methods: Forty medical students were randomized to either the VR (intervention) group or the Mannequin (control) group. Participants engaged in airway trauma management training using their assigned modality. Both groups completed a pre- and post-intervention test to evaluate knowledge acquisition, and undertook a mannequin-based crisis scenario one week after training to evaluate knowledge application. Results: Both groups demonstrated significant knowledge acquisition (VR: mean improvement +2.0/15, P=0.006; Mannequin: mean improvement +3.2/15, P<0.001), though no statistically significant differences were observed between groups (P=0.15). The VR group achieved self-assessed readiness and knowledge saturation faster, on average, than the Mannequin group. Both groups, on average, were successful in the post-training knowledge application test; however, the Mannequin group outperformed the VR group (mean difference: 1.58/15, P=0.021) and recognized a potential airway injury more quickly (P=0.004). Nevertheless, students in the VR group reported greater engagement and satisfaction, expressing a preference for VR as a future learning modality. 
Conclusions: Overall, VR-based simulation is a promising and engaging method for teaching airway trauma management and demonstrates comparable knowledge acquisition to traditional mannequin-based training. However, mannequin-based simulation still confers advantages for applied performance. Further studies using larger samples, multiple scenarios, and VR-based assessments are needed. Clinical Trial: ClinicalTrials.gov NCT04451590; https://clinicaltrials.gov/study/NCT04451590

  • Wearable Eye-Tracking Metrics From Smart Glasses for Cognitive Assessment: A Prospective Digital Health Study

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Reading performance is closely associated with cognitive function, and eye-tracking metrics have emerged as sensitive, non-invasive indicators of cognitive processes. Recent advances in wearable technologies, such as smart glasses, enable continuous and scalable measurement of eye movements in real-world settings. However, rapid, accessible, and objective tools for cognitive screening remain limited. Integrating wearable eye-tracking with multidomain cognitive assessment may provide a scalable digital approach for early detection of cognitive impairment. Objective: To evaluate the association between wearable eye-tracking metrics and cognitive performance and to assess the feasibility of a smart glasses–based reading task as a rapid digital screening tool. Methods: In this prospective observational study, Mandarin-literate adults were recruited from Taipei Veterans General Hospital between May and August 2025. Participants completed a standardized reading task while wearing J7EF Gaze smart glasses. Eight eye-tracking metrics were recorded, followed by the six-domain cognitive assessment using gaze-based interaction. Associations were analyzed via multivariable regression adjusted for age and sex. Results: A total of 134 participants were enrolled (mean age 68.2 ± 13.4 years). Age correlated with all six cognitive domains and the total score, while sex exhibited smaller, domain-specific effects. In unadjusted analyses, total reading time showed the strongest associations with all cognitive domains (p < 0.001), while fixation duration, fixation frequency, and long or ultra-long fixations showed selective associations with orientation. After adjusting for age and sex, total reading time, total fixation time, and average fixation time remained significant predictors. Conclusions: Total reading time emerged as a robust, age-independent eye-tracking marker of cognitive performance. 
Fixation-related metrics showed domain-specific associations, particularly with the puzzle game hobbies domain of the cognitive assessment. Wearable smart glasses with integrated eye tracking may provide a rapid, non-invasive, and scalable approach for digital cognitive screening in clinical and real-world settings.

  • Background: Large real-world data sources offer a unique opportunity to study the health of diverse ethnic groups. High-quality and accessible ethnicity data is needed to maximise this potential. Objective: To validate a newly developed ethnicity phenotype in the Oxford-Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC). Methods: Retrospective cross-sectional study of individuals registered at a practice within the Oxford-RCGP RSC on 4th December 2024. An updated ethnicity phenotype was implemented and validated. Ethnicity data quality was assessed by evaluating completeness, distribution, and accuracy through external validation against estimates from the 2021 UK Census. Results: Of 21,902,852 individuals, 88.63% (19,412,154) had a recorded ethnicity following the implementation of the updated ethnicity phenotype. There was a marked improvement in the recording of granular (19-point) ethnicity data, with completeness increasing from 69.06% (15,126,835) to 88.63% (19,412,154) with the updated phenotype. There was significant variation in the completeness of ethnicity data according to demographic subgroups. The proportion of individuals in each ethnicity group was within 3.56 percentage points of the 2021 Census estimates for the same ethnicity group across England. Larger relative differences were observed for non-White ethnic groups. Conclusions: The updated ethnicity phenotype provides high-quality and granular ethnicity data based on official classifications for almost 90% of individuals. The overall ethnicity breakdown in the Oxford-RCGP RSC population was broadly similar to 2021 UK Census estimates. The updated ethnicity phenotype supports secondary uses of primary care CMRs, providing high-quality and accessible ethnicity data to study the health of diverse ethnic groups.
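The completeness percentages reported in the abstract above follow directly from the stated counts. The short sketch below recomputes them as a consistency check; the helper name `completeness_pct` is hypothetical, and the counts are copied from the abstract.

```python
def completeness_pct(recorded, total):
    """Percentage of individuals with a recorded value, to 2 decimal places."""
    return round(100 * recorded / total, 2)


# Counts as reported in the abstract.
REGISTERED = 21_902_852
RECORDED_BEFORE = 15_126_835  # granular ethnicity recorded, previous phenotype
RECORDED_AFTER = 19_412_154   # granular ethnicity recorded, updated phenotype

before = completeness_pct(RECORDED_BEFORE, REGISTERED)
after = completeness_pct(RECORDED_AFTER, REGISTERED)
newly_recorded = RECORDED_AFTER - RECORDED_BEFORE

print(f"before: {before}%, after: {after}%")
print(f"additional individuals with recorded ethnicity: {newly_recorded:,}")
```

Both results agree with the figures quoted in the abstract (69.06% and 88.63%).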

  • Background: Continuous glucose monitoring (CGM) is central to modern diabetes care, but explaining CGM patterns clearly, consistently, and empathetically remains time-intensive in practice. Large language model (LLM)–based systems may support patient-facing interpretation of CGM data, but evidence remains limited for retrieval-grounded tools evaluated against clinician-authored responses in counseling scenarios. The system was intended for structured CGM interpretation and communication support rather than autonomous therapeutic decision making. Objective: To evaluate whether a retrieval-grounded LLM-based conversational agent (CA) could support patient understanding of CGM data and preparation for routine diabetes consultations by generating responses to questions arising during CGM-informed diabetes counseling, with quality comparable to clinician-authored responses. Methods: We developed a retrieval-grounded LLM-based CA for CGM interpretation and diabetes counseling support. The system was designed to provide plain-language explanations of CGM patterns and responses to diabetes management questions while avoiding directive or individualized medical advice, such as recommending medication initiation, dose adjustment, or regimen changes. 12 CGM-informed cases, each comprising a de-identified CGM trace, a synthetic patient vignette, and accompanying CGM visual materials, were constructed from publicly available clinical datasets. Between Oct 2025 and Feb 2026, six senior UK diabetes clinicians each reviewed 2 assigned cases and answered 24 questions (12 per case). In a blinded multi-rater evaluation, each CA-generated and clinician-authored response was independently rated by 3 clinicians on 6 quality dimensions: clinical accuracy, guideline adherence, actionability, personalization, communication clarity, and empathy. Safety flags and perceived source labels were also recorded. 
The primary analysis used linear mixed-effects models with random intercepts for case and rater. Results: A total of 288 unique responses (144 CA and 144 clinician responses) were evaluated, generating 864 ratings. The CA received higher quality scores than clinician responses (mean 4.37 vs 3.58), with an estimated mean difference of 0.782 points on a 5-point scale (95% CI 0.692-0.872; P<.001). This pattern was observed across all 6 categories of patient questions. The largest estimated differences were for empathy (mean difference 1.062, 95% CI 0.948-1.177) and actionability (0.992, 95% CI 0.877-1.106). Safety flag distributions were similar between CA and clinician responses, with major concerns rare in both groups (3/432, 0.7% each). Although CA responses were longer, additional analyses adjusting for word count did not indicate that response length explained the overall quality difference. Conclusions: Retrieval-grounded LLM-based systems may have value as adjunct tools for routine CGM review, patient education, and preconsultation preparation, with potential to reduce clinician time spent on standardized interpretive tasks. However, these findings should be interpreted in light of the vignette-based design, restricted datasets, and a small clinician panel, and they do not establish suitability for autonomous therapeutic decision-making, medication adjustment, or unsupervised real-world use. Prospective validation in interactive clinical workflows is needed before implementation.
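The response and rating counts in the abstract above can be reconstructed from the study design as described. The sketch below is only an arithmetic check; it assumes one CA response per clinician-answered question and that every response received exactly 3 ratings, which the abstract implies but does not state outright.

```python
# Design parameters as described in the abstract.
CLINICIANS = 6
CASES_PER_CLINICIAN = 2
QUESTIONS_PER_CASE = 12
RATERS_PER_RESPONSE = 3

# Each clinician answers 12 questions for each of 2 assigned cases.
clinician_responses = CLINICIANS * CASES_PER_CLINICIAN * QUESTIONS_PER_CASE

# Assumption: the conversational agent answers the same set of questions once.
ca_responses = clinician_responses

total_responses = clinician_responses + ca_responses
total_ratings = total_responses * RATERS_PER_RESPONSE
ratings_per_arm = total_ratings // 2

# Major safety concerns were reported as 3 per arm.
major_concern_pct = round(100 * 3 / ratings_per_arm, 1)

print(clinician_responses, total_responses, total_ratings, ratings_per_arm)
print(f"major concerns per arm: {major_concern_pct}%")
```

Under these assumptions the totals match the reported 288 responses, 864 ratings, and 3/432 (0.7%) major concerns per arm.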

  • We developed and validated a patient-question–driven multimodal platform that maps natural language clinical queries to ten validated prostate cancer prediction models across five decision stages, achieving a correct model invocation rate of 96% (96/100) across 100 simulated scenarios.

  • Background: Physical exercise interventions are widely used as non-pharmacological approaches in the rehabilitation of individuals with autism spectrum disorder (ASD). However, conventional exercise-based programs remain limited in contextual control, task specificity, and participant adherence. With advances in immersive technologies, virtual reality (VR)–based physical exercise interventions have emerged as a novel paradigm that integrates motor training within controllable and interactive environments for individuals with ASD. However, their overall therapeutic efficacy has not yet been systematically evaluated. Objective: To systematically review and quantitatively evaluate the effects of VR–based physical exercise interventions on behavioral outcomes, executive function, and motor performance in children and adolescents with ASD. Methods: Following the PRISMA guidelines, seven electronic databases (PubMed, ProQuest, ScienceDirect, PsycINFO, Scopus, IEEE, and Web of Science) were systematically searched. Study quality and certainty of evidence were assessed using risk-of-bias tools (RoB 2.0 and ROBINS-I) and the GRADE framework. Data analyses were performed using R software (version 4.5.2; R Foundation for Statistical Computing). Results: Sixteen studies involving 642 children and adolescents with ASD were included in the systematic review, of which 10 met the criteria for meta-analysis. The meta-analysis showed that VR-based physical exercise interventions significantly improved executive function (SMD = 0.75, 95% CI: 0.33 to 1.16, I² = 0.00%, GRADE = moderate) and motor performance (SMD = 1.01, 95% CI: 0.02 to 2.00, I² = 72.4%, GRADE = low) in children and adolescents with ASD. No significant overall effect was observed on behavioral outcomes (SMD = 1.19, 95% CI: −8.00 to 10.38, I² = 75.7%, GRADE = very low). 
Subgroup analyses indicated that short-term interventions (5 weeks) produced more consistent improvements in motor performance (SMD = 1.31, 95% CI: 0.44 to 2.18, I² = 0.00%). Conclusions: The available evidence suggests that VR–based physical exercise interventions may improve executive function and motor performance in children and adolescents with ASD. However, evidence regarding their effects on behavioral outcomes remains insufficient and inconsistent. Overall, VR-based physical exercise interventions appear more suitable as an adjunct to conventional exercise or rehabilitation programs. Future research should prioritize high-quality randomized controlled trials with larger sample sizes and extended follow-up periods to clarify optimal intervention conditions and long-term effects.

  • From Administrative Exhaust to Clinical Foresight: A Documentation-Native, Workflow-Aware Future for Clinical AI

    Date Submitted: Mar 28, 2026
    Open Peer Review Period: Apr 16, 2026 - Jun 11, 2026

    Artificial intelligence in health care has largely been framed around prediction, automation, and generative assistance, yet nursing work remains underrepresented in how clinical intelligence is conceptualized and built. A major reason is that routine documentation and electronic health record (EHR) interaction behavior are still treated primarily as administrative burden rather than as meaningful traces of clinical work. This Viewpoint argues that medical futures studies should recognize nursing documentation as a documentation-native signal environment: one that reflects not only recordkeeping, but also surveillance intensity, workflow strain, care coordination, and emerging clinical risk. The argument is grounded in a growing body of work on documentation burden, audit-log modeling, workflow-aware clinical intelligence, and EHR-derived temporal signals, including work developed through the AIM-AHEAD CLINAQ Fellowship, IRB-exempt study #2026-027, SIIM-CAIMI25, and related Intensive Documentation Index manuscripts. The central claim is not that documentation should be intensified, but that future nursing AI should be burden-aware, workflow-aware, nurse-centered, equitable, and implementation-realistic. Rather than merely accelerating chart completion, next-generation systems should identify burden, detect strain, surface changes in surveillance behavior, and support safer decisions without increasing cognitive load. We propose documentation-native, workflow-aware nursing AI as a field-shaping agenda for medical futures studies: one that treats routine EHR behavior not as administrative exhaust, but as an underused source of operational and clinical intelligence. Such a shift would better align AI development with real nursing work, implementation constraints, and the goal of safer, more equitable care.

  • When Learning Cycles Turn Vicious: A Governance Model for AI-Enabled Learning Health Systems

    Date Submitted: Mar 17, 2026
    Open Peer Review Period: Apr 16, 2026 - Jun 11, 2026

    The shared commitments trust framework codified by the National Academy of Medicine and the AI-enabled learning health system framework proposed by Ko et al. together provide the normative foundation and operational architecture that AI-driven clinical learning requires. Neither, however, specifies the governance mechanisms needed when AI-mediated learning produces harm rather than improvement. This paper identifies three governance gaps that expose AI-enabled learning health systems to compounding failure: the absence of operational controls that translate shared commitments into enforceable requirements, an accountability vacuum in which no designated actor bears responsibility at each stage of the AI learning lifecycle, and the lack of a failure detection mechanism capable of identifying when learning cycles become vicious rather than virtuous. To address these gaps, the paper proposes an integrated governance model comprising three interdependent layers. A control layer maps each shared commitment to auditable requirements with defined metrics and responsible actors. An accountability layer assigns explicit responsibility across five stages of the AI learning lifecycle with quantitative escalation triggers. A failure detection layer monitors six trust decay indicators and activates a circuit breaker mechanism when predefined thresholds are breached, enabling institutional intervention before harm compounds at machine speed. The model is offered as a practical complement to existing frameworks, providing health system leaders, policymakers, and researchers with the governance infrastructure required for safe and trustworthy AI-enabled learning at scale.

  • A Systematic Review of Linguistic and Cultural Adaptation in AI-Based Health Communication

    Date Submitted: Mar 25, 2026
    Open Peer Review Period: Apr 16, 2026 - Jun 11, 2026

    Background: Artificial intelligence (AI) is increasingly transforming health communication by enabling scalable, multilingual, and personalised information delivery. While AI systems are often promoted as tools to improve access for culturally and linguistically diverse populations, emerging evidence suggests that their performance varies significantly across languages and contexts. In particular, concerns have been raised about a “cultural alignment deficit,” whereby AI systems achieve technical accuracy but fail to account for cultural meaning, communication norms, and contextual relevance, potentially reinforcing existing health inequities. Objective: This review aims to examine the current state of practice in linguistic and cultural adaptation in AI-enabled health communication, and to evaluate the effectiveness of these systems across different linguistic and cultural contexts. Methods: A scoping review was conducted following PRISMA-ScR guidelines. Four major databases (PsycINFO, Scopus, Web of Science, and PubMed) were searched, supplemented by Google Scholar and grey literature sources. Studies were included if they examined AI-based health communication systems with a focus on linguistic or cultural adaptation. Both empirical studies and review-type papers were included. Data were extracted using a standardised form, and findings were synthesised narratively across key dimensions, including adaptation approaches, system performance, and patient-centred outcomes. Results: A total of 104 studies were included, comprising 52 empirical studies and 52 review or conceptual papers. The findings reveal a strong Anglocentric bias, with most research conducted in English-speaking or Western contexts. Linguistic and cultural adaptation was frequently limited to surface-level translation, with minimal integration of cultural norms, metaphors, or contextual meaning. 
AI systems demonstrated improvements in readability, efficiency, and patient engagement, particularly in high-resource languages such as English and Spanish. However, performance declined in low-resource and non-English contexts, where outputs were more likely to contain inaccuracies, reduced cultural resonance, and lower perceived trustworthiness. Hybrid human–AI approaches, involving clinician or cultural expert input, were consistently associated with improved outcomes. Overall, evidence for long-term effectiveness, trust, and equity remains limited. Conclusions: AI-enabled health communication shows considerable potential to improve access and efficiency, but its benefits are unevenly distributed across linguistic and cultural contexts. Current systems remain largely constrained to surface-level adaptation, failing to achieve deeper cultural alignment. This review highlights the need to reconceptualise AI not merely as a translation tool but as a culturally competent communication partner. Embedding cultural adaptation at the design stage, improving methodological transparency, and addressing structural inequities in data and system development are critical to ensuring equitable and trustworthy AI-mediated health communication.

  • Neurocognitive, affective, and psychosocial health in older adults requires specialized focus due to the interplay of multiple comorbidities, reduced social interactions, and age-related physical decline. Traditional methods for early detection and diagnosis of these disorders rely on screening tools, surveys, interviews, assessments, and brain imaging. Emerging artificial intelligence (AI) approaches have been developed to assist clinicians with screening, diagnosis, and intervention, for example, by analyzing patterns in electronic health records (EHR), speech recognition, and digital phenotyping. This scoping review aims to describe the current landscape of AI applications in neurocognitive, affective, and psychosocial health for older adults. Specifically, we examine publication trends over the past decade, characterize the AI technologies and models being applied, and assess how clinical tasks such as detection, identification, diagnosis, prediction, monitoring, and treatment benefit from AI technologies. We followed a process that includes formulating the research question, defining eligibility criteria, searching for literature, screening against eligibility criteria, extracting and coding data, and synthesizing findings. We searched two databases (Scopus and PubMed) for papers published between January 1, 2015, and July 31, 2025, that discussed neurocognitive, affective, and psychosocial health issues in older adults (≥65 years) and AI-based applications. Data extraction focused on psychological conditions (e.g., dementia, depression, anxiety), AI techniques, and the clinical context of AI usage. Descriptive statistics and thematic analysis were used to summarize results. We identified 268 relevant publications. There is a clear increasing trend in the number of papers per year, with a notable spike around 2020–2022 corresponding to surging interest in AI and the introduction of GenAI tools.
A variety of methodologies were identified, with greater use of “shallow” machine learning models such as Support Vector Machine (SVM), Random Forest, and Logistic Regression than of “deep” learning models like XGBoost, CNN, or LSTM during the time period studied. AI techniques were predominantly used for screening and diagnosis tasks, such as early detection of cognitive impairment and automated classification of dementia or depression, with relatively fewer studies focusing on interventions or monitoring. Reported model performance was generally high, with the majority of studies achieving good accuracy and area under the curve values. The review found a growing body of literature focused on enhancing the neurocognitive, affective, and psychosocial health care of older adults through technology applications. AI shows promise for improving early identification of at-risk individuals and aiding diagnosis through analysis of complex data, which could enable timelier and more personalized interventions. However, most applications to date emphasize diagnostic prediction and risk assessment, whereas AI-driven intervention and monitoring tools remain scarce. AI applications are likely to become an increasingly integral part of mental health practice, but careful implementation and oversight will be required to ensure these tools augment rather than replace person-centered care.

  • Background: Generative artificial intelligence (GenAI) is increasingly used by health information consumers to interpret medical content and support decision-making. While these systems provide accessible, timely information, they may also produce inaccurate or misleading outputs. Effective use of GenAI, therefore, depends on users’ ability to calibrate trust based on information accuracy. However, little is known about how learned dependency on GenAI influences trust calibration in health information contexts. Objective: This study examines how learned dependency on GenAI affects health information consumers’ trust calibration in AI-generated information and whether visual attention cues (e.g., highlighting critical information) mitigate overreliance on incorrect outputs. Methods: We conducted a randomized controlled experiment with 338 participants. The study employed a 2 × 2 design manipulating (1) information accuracy (correct vs incorrect) and (2) visual attention cues (highlight vs no highlight). Participants evaluated AI-generated health information presented alongside source text. Trust was measured using a multi-item scale, and learned dependency on GenAI was assessed using a validated self-reported measure. Linear regression models were used to examine main and interaction effects. Results: Information accuracy had a strong positive effect on trust (β = 2.107, 95% CI [1.337, 2.878], p < .001), indicating that participants generally trusted correct information more than incorrect information. Learned dependency on GenAI was also positively associated with trust (β = 0.277, 95% CI [0.033, 0.521], p = .026). Importantly, the interaction between information accuracy and learned dependency was negative and significant (β = −0.399, 95% CI [−0.695, −0.104], p = .008), suggesting that higher dependency reduces users’ ability to differentiate between accurate and inaccurate information. 
In contrast, visual attention cues did not significantly affect trust (β = 0.149, 95% CI [−0.622, 0.920], p = .704), nor did they moderate the effect of dependency (β = −0.009, 95% CI [−0.305, 0.287], p = .950). Conclusions: This study demonstrates that while users generally trust accurate AI-generated health information more than inaccurate information, learned dependency weakens trust calibration, increasing susceptibility to incorrect outputs. Visual attention cues alone are insufficient to mitigate this effect. These findings highlight the need for more effective design interventions to support critical evaluation and reduce overreliance on GenAI in health information environments. Keywords: Learned Dependency; GenAI; Trust Calibration; Attention Mechanism; Automation Bias; Health Information; Human–GenAI Interaction.
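
The negative interaction reported above can be made concrete by computing the model-implied trust gap between correct and incorrect information at different dependency levels. This is a minimal sketch using only the two coefficients quoted in the abstract. It assumes an uncentered dependency score and ignores the cue terms; the scale range of the dependency measure is not given, so the example scores are hypothetical.

```python
# Coefficients as reported in the abstract (trust regressed on accuracy,
# learned dependency, and their interaction).
BETA_ACCURACY = 2.107      # correct vs incorrect information
BETA_INTERACTION = -0.399  # accuracy x learned dependency


def accuracy_effect(dependency: float) -> float:
    """Model-implied trust gap (correct minus incorrect) at a given dependency score."""
    return BETA_ACCURACY + BETA_INTERACTION * dependency


# Hypothetical dependency scores; the real scale is not stated in the abstract.
for d in (0, 1, 3, 5):
    print(f"dependency={d}: trust gap = {accuracy_effect(d):.3f}")
```

Under these assumptions the gap shrinks as dependency rises, which is the pattern the authors describe as weakened trust calibration.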

  • Background: Concurrent chemoradiotherapy (CCRT) for abdominal cancer frequently causes muscle loss, weight loss, and malnutrition. Objective: This randomized phase II trial evaluated whether a multidisciplinary, mHealth‑based multimodal rehabilitation program could preserve handgrip strength and muscle mass in abdominal cancer patients undergoing CCRT. Methods: In this prospective, multicenter, randomized, open‑label phase II trial (NCT05325554), 111 eligible patients with abdominal malignancies scheduled for CCRT were randomly assigned (1:1) to receive either multidisciplinary mHealth rehabilitation care (MRC, n=57) or standard care (SC, n=54). The MRC program was delivered by a dedicated multidisciplinary team comprising oncologists, rehabilitation physicians, nurses, clinical nutritionists, and psychologists. Using the AINST mHealth platform and wearable heart rate monitors, the team provided coordinated, individualized exercise, nutritional, and psychological interventions based on weekly assessments and real‑time data. The SC group received routine oncology care. The primary endpoint was change in handgrip strength from baseline to CCRT completion. Secondary endpoints included body weight, skeletal muscle mass, nutritional biomarkers, quality of life, psychological status, and adverse events. Results: Between February 2022 and April 2023, 111 patients were enrolled. Adherence was high, with 83.9% (47/56) of MRC patients achieving preset exercise targets. Compared with SC, the MRC group demonstrated significantly less decline in handgrip strength at all time points (all p < 0.001). The MRC group also showed better preservation of body weight (mean difference 1.3 kg, p=0.005) and a significantly lower proportion of patients with >5% weight loss (10.7% vs. 32.7%, p=0.005). Skeletal muscle mass was also better preserved (mean difference 1.3 kg, p<0.001).
The MRC group had less decline in serum albumin (p=0.009) and prealbumin (p=0.019), and lower incidences of ≥G3 leukopenia (5.4% vs. 19.2%, p=0.037) and ≥G1 thrombocytopenia (16.1% vs. 34.6%, p=0.026). Nutritional and psychological benefits persisted at 4‑week post‑CCRT follow‑up. Conclusions: A multidisciplinary, mHealth‑based multimodal rehabilitation program effectively preserves handgrip strength, muscle mass, and nutritional status while reducing treatment toxicity in abdominal cancer patients undergoing CCRT. The multicenter implementation using standardized digital tools supports its scalability and translation into real‑world clinical pathways. Clinical Trial: ClinicalTrials.gov NCT05325554. Registration date: 03/08/2022.

  • Background: Large Language Models (LLMs) are increasingly used in qualitative research, but their reliability compared to human analysis, especially on large, non-English datasets, is unclear. Previous studies on older models (like GPT-4) show limitations in nuance and token capacity. Objective: This thesis compares the qualitative analysis capabilities of OpenAI's GPT-5 and Google's Gemini 2.5 with a traditional human analysis. The study uses a large dataset of 317 Dutch newspaper articles (860 pages) from January 1, 2020, to December 31st, 2023, investigating the sentiment towards nurses during the COVID-19 pandemic. Methods: The study employed a two-part methodology. First, a thematic comparison was conducted where the human researcher, GPT-5, and Gemini independently generated inductive coding trees from the entire corpus. Second, a comparative test was performed where all three coded a 10% (31/317) random sample using a predefined codebook. This process was iterative, requiring a second round of AI analysis with refined prompts and an article-by-article approach to ensure a valid comparison. Results: The results show that both AI models identified third-order themes (e.g., "Healthcare Heroes") that were highly consistent with the data. In the practical application, however, both AIs "over-coded", identifying more quotations than the human (approx. 180 vs. 136). Conclusions: This study reveals a fundamental divergence in analytical logic: whereas human coders prioritize interpretive significance (contextual weight), LLMs default to semantic presence (literal frequency), leading to systematic over-coding. Consequently, this article argues that LLMs should not be viewed as autonomous researchers but as high-sensitivity filtering instruments requiring human calibration. This study concludes that AI serves as a valuable assistant for qualitative researchers. Still, it requires a rigorous, iterative, and human-in-th

  • Background: Chronic or persistent pain can limit an individual’s ability to work or be productive at work, creating substantial societal and economic burden. Despite this, evidence-based work‑related advice and support for people with chronic pain is inconsistent. The Pain‑at‑Work Toolkit was co‑created with people living with pain, health care professionals, and employers to increase knowledge of employee rights, improve access to workplace support, and provide guidance on lifestyle behaviors that facilitate pain self‑management. Objective: This study aimed to establish the feasibility of conducting a definitive cluster randomized controlled trial comparing access to the Pain‑at‑Work Toolkit plus optional occupational therapist telephone support (intervention) with support-as-usual (SAU) from the employer (control). Primary outcomes were feasibility, acceptability, usability, and safety of the digital intervention. We also assessed the feasibility of candidate primary and secondary outcomes and tested research processes required for a definitive trial. Methods: We conducted an open‑label, parallel, two‑arm pragmatic feasibility cluster randomized controlled trial with exploratory health‑economics analysis and a nested qualitative study. Eligible organizations were based in England, had ≥10 employees, and were recruited through professional networks and direct approach. Individual participants were working adults aged ≥18 years, with internet access and self‑reported chronic pain interfering with their ability to undertake or enjoy productive work. A restricted 1:1 cluster‑level randomization allocated organizations to the intervention or control arms. After organizational and individual consent, participants completed a web‑based baseline survey (T0) assessing work capacity, health and wellbeing, and health‑care resource use. Follow‑up occurred at 3 months (T1) and 6 months (T2). 
Feasibility outcomes included recruitment, intervention fidelity (delivery, reach, uptake, engagement), retention, and follow‑up completion. Qualitative interviews with employees and stakeholders at T2 explored acceptability and contextual factors influencing delivery and uptake. Results: A total of 380 employees from 18 organizations participated. Recruitment exceeded targets at both organizational and individual levels, demonstrating strong feasibility and engagement. Follow‑up completion met predefined feasibility criteria but showed variability, largely due to employee turnover, providing realistic attrition estimates for a future trial. Outcome measures showed acceptable completion rates and variability, supporting their suitability for use in a future definitive trial. Employees and stakeholders reported high acceptability of the Pain‑at‑Work Toolkit, and qualitative findings highlighted improved knowledge, confidence, and self‑management among employees. Stakeholders endorsed the Toolkit’s relevance and practicality within workplace settings. Conclusions: The feasibility trial demonstrated that the Pain‑at‑Work Toolkit and trial procedures are acceptable, scalable, and deliverable across diverse workplaces. Findings identify responsive outcome measures, emphasize the need for strengthened retention strategies, and support the Toolkit’s use as a standalone intervention. Overall, the study provides a strong foundation for progressing to a fully powered definitive trial. Clinical Trial: ClinicalTrials.gov NCT05838677; https://clinicaltrials.gov/study/NCT05838677 International Registered Report Identifier (IRRID): DERR1-10.2196/51474

  • Suicide-Related Responses from AI Chatbots through Consumer-Facing Interfaces and APIs: Comparative Study

    Date Submitted: Apr 11, 2026
    Open Peer Review Period: Apr 14, 2026 - Jun 9, 2026

    Background: Generative AI chatbots are frequently used for mental health-related questions, including suicide-related queries. Prior evaluations have focused on user interfaces (UI), which may include additional safety controls not present in direct application programming interface (API) access. Objective: To examine how frequently AI chatbot models provide direct responses to suicide-related prompts across access pathways and to determine whether safer behavior is intrinsic to the model or depends on UI-level safeguards. Methods: Observational cross-sectional study conducted in February and March 2026. We evaluated how five widely used consumer AI models (ChatGPT, Claude, Gemini, Grok, and Llama) responded to 30 previously vetted suicide-related prompts spanning five clinician-assigned risk levels. Each prompt was submitted 100 times through both public-facing UI and direct API access, yielding 30,000 total responses. The primary outcome was whether a response directly answered the suicide-related prompt. A direct response was defined as one that provided specific information or guidance related to the question asked, rather than refusing to answer, redirecting the user to a crisis resource, or responding only with general safety language. AI responses were categorized using a blinded large language model–based classifier. We estimated mixed-effects logistic regression models that predicted a direct response. The primary predictors were the AI model, access mode (UI vs. API), and prompt risk category. Results: 69.8% of responses to the suicide prompts were direct. Direct responses were more common through APIs than UIs (77.4% vs. 62.2%). Differences in the likelihood of a direct response were most pronounced for higher-risk prompts: among very high-risk prompts, 24.8% of API responses were direct compared with 4.6% of UI responses; among high-risk prompts, 80.1% of API responses were direct compared with 48.4% of UI responses. 
Claude and Gemini had the highest direct response rates (78.1% and 73.9%), whereas ChatGPT, Grok, and Llama had lower rates (64.6%, 64.5%, and 67.8%). In mixed-effects models, UI access was associated with lower odds of a direct response than API access (odds ratio, 0.09; 95% CI, 0.08-0.10). Higher prompt risk was associated with lower direct response probability, and access-mode differences varied by risk level and model. Conclusions: AI safety behavior depends on how the user accesses the model. The safety observed in an AI chatbot’s UI should not be assumed to generalize to direct model access through APIs or to downstream applications built on those APIs. Evaluations and policies must consider access channels to ensure comprehensive safety protections for all individuals interacting with AI chatbots.
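
The sampling design and the headline rates above can be checked with a few lines of arithmetic. This is a minimal sketch based only on figures quoted in the abstract; the variable names are illustrative. Note that the crude odds ratio computed from the two marginal rates is not expected to match the reported mixed-effects odds ratio (0.09), which comes from a model that also includes the AI model and prompt risk category.

```python
from itertools import product

models = ["ChatGPT", "Claude", "Gemini", "Grok", "Llama"]
access_modes = ["UI", "API"]
n_prompts = 30
repeats_per_prompt = 100

# Every model x access mode x prompt combination is submitted 100 times.
cells = list(product(models, access_modes, range(n_prompts)))
total_responses = len(cells) * repeats_per_prompt
print(total_responses)


def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Marginal direct-response rates reported in the abstract.
ui_rate, api_rate = 0.622, 0.774
crude_or_ui_vs_api = odds(ui_rate) / odds(api_rate)
print(round(crude_or_ui_vs_api, 2))
```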

  • Background: Artificial Intelligence (AI) methods offer a valuable complementary approach to public health emergency management, supporting prediction, rapid threat identification, and timely decision-making alongside established human-led systems and processes. However, updated and comprehensive evidence on the extent and characteristics of AI use in public health emergencies, with a specific focus on infectious hazards, remains limited globally. Objective: This review aimed to map the scope, nature, and extent of AI applications in public health emergency management resulting from infectious hazards, and to characterize key implementation features. Methods: A scoping review was conducted following the Arksey and O’Malley framework, and a search was performed in three electronic databases: PubMed, Scopus, and the IEEE Xplore Digital Library. The search period covered studies published between January 2014 and June 2024. Results: A total of 613 studies were included, of which 526 (85.8%) were AI-related methodological studies and 87 (14.2%) were reviews or other article types. Across these studies, 665 infectious-hazard records were extracted, with COVID-19 accounting for the majority (387, 58.2%), followed by influenza (8.4%), dengue (4.1%), and malaria (3.0%). Publications increased steadily from 2014 to mid-2024, with a sharp rise beginning in 2019 and peaking in 2022, aligning with the COVID-19 pandemic. Notably, studies on non–COVID-19 hazards also grew between 2019 and 2023, suggesting expanding AI applications. Among methodological studies, 31.4% used social media data, mainly from X (formerly Twitter) and Weibo. Most focused on predictive analytics and disease surveillance (58.2%), followed by risk communication (23.2%), compliance with public health measures (12.5%), and policy evaluation (6.1%).
Data were predominantly sourced from the USA (12.1%) and China (10.5%), with limited representation from Africa, Central Asia, and the Middle East. Funding was mainly reported from organizations in China (14.4%) and the USA (14.1%), followed by Saudi Arabia and South Korea. Conclusions: The findings indicate that applications of AI in infectious disease emergencies are predominantly focused on predictive modeling and surveillance, with a considerable reliance on social media data. The United States and China emerge as the primary contributors, both as sources of data and as leading funders of this research. To promote more equitable and effective use of AI in public health emergencies, there is a critical need for increased investment in local expertise, data infrastructure, and operational capacity, particularly in low- and middle-income countries.

  • Machine Learning–Enabled Interventions in Palliative Care: A Scoping Review

    Date Submitted: Apr 14, 2026
    Open Peer Review Period: Apr 14, 2026 - Jun 9, 2026

    Background: Machine learning-based prognostic models have been increasingly developed to support palliative and serious illness care, particularly in oncology. While predictive accuracy has improved substantially, less is known about how these models are translated into real-world interventions and whether they meaningfully influence clinical practice and patient care. Objective: This scoping review aimed to map and synthesize interventional studies that used machine learning-enabled interventions to support palliative and serious illness care, with a focus on model integration strategies and reported effects on communication processes, care planning, and downstream clinical outcomes. Methods: Following PRISMA-ScR guidelines, we conducted a scoping review of peer-reviewed English-language studies published since 2015. Searches were performed in PubMed, Embase, Web of Science, and the Cochrane Library. Eligible studies implemented machine learning-based predictions to trigger or guide real-world palliative care-related interventions, including serious illness conversations, advance care planning, or palliative care referral. Results: Eight interventional studies were included, encompassing cluster randomized trials, stepped wedge designs, and real-world implementation studies. Machine learning-enabled interventions were consistently associated with increased documentation of serious illness conversations and advance care planning, particularly when predictive outputs were embedded within clinical workflows through behavioral nudges, automated alerts, or facilitated outreach. In contrast, effects on treatment intensity, health care utilization, and end-of-life costs were limited, inconsistent, or not observed.
Conclusions: Current evidence suggests that machine learning-enabled interventions in oncology palliative care are most effective when used to support prioritization and timing of communication-related processes rather than to directly alter care trajectories or resource use. Future research should focus on implementation strategies, patient-centered outcomes, and equity-sensitive evaluation to better translate predictive insights into meaningful clinical impact.

  • Background: The Emergency Intensive Care Unit (EICU) is the core setting for the treatment of critically ill patients, where the diagnostic error rate is more than twice that of general inpatient wards, which seriously affects patient prognosis. Large Language Models (LLMs) have shown application potential in clinical diagnosis, but there is still very limited evidence comparing the diagnostic efficacy of critical care-specific LLMs and general-purpose LLMs in the complex diagnostic scenarios of the EICU. Objective: This study aimed to evaluate and compare the diagnostic accuracy of a critical care-specific LLM (Qiyuan 3.0.1) and three mainstream general-purpose LLMs (GPT5.1, DeepSeek V3.1, Qwen3-32B) in EICU diseases, and to provide an evidence base for selecting intelligent auxiliary diagnostic tools in the EICU. Methods: This was a single-center retrospective paired diagnostic accuracy study, which consecutively enrolled 184 critically ill patients admitted to the EICU of Peking University Shenzhen Hospital from April 2025 to March 2026. Standardized datasets were constructed based on the patients' clinical data, including an initial diagnosis dataset (clinical data within 24 hours after admission) and a final diagnosis dataset (complete course data from admission to discharge). A unified zero-shot learning prompt strategy was adopted, and four LLMs independently generated corresponding diagnoses in a double-blind manner. The consensus diagnosis reached by three senior intensive care physicians with more than 10 years of EICU working experience, who were blinded to the model results, was used as the gold standard. The primary endpoint was the Top-1 accuracy in the final diagnosis stage, defined as the proportion of cases where the first primary diagnosis output by the model completely matched the gold standard.
Secondary endpoints included the Top-1 accuracy in the initial diagnosis stage and the number of correct diagnoses in the Top-3 outputs in the final diagnosis stage. Cochran's Q test was used for the overall comparison of accuracy among multiple groups, and post hoc pairwise comparisons were performed using the paired McNemar test with Bonferroni correction for type I error. The Friedman non-parametric rank sum test was used for the intergroup comparison of the number of correct Top-3 diagnoses. Results: In the final diagnosis stage, the overall difference in Top-1 accuracy among the four models was statistically significant (Cochran's Q=20.32, df=3, P=4.57×10⁻⁵). The Top-1 accuracy of Qiyuan 3.0.1 was the highest (64.13%, 95%CI 56.83%-71.00%), followed by GPT5.1 (59.24%, 95%CI 51.83%-66.35%), DeepSeek V3.1 (57.07%, 95%CI 49.64%-64.28%), and Qwen3-32B had the lowest accuracy (51.63%, 95%CI 44.26%-58.98%). Post hoc pairwise comparisons showed that the Top-1 accuracy of Qiyuan 3.0.1, GPT5.1, and DeepSeek V3.1 was significantly higher than that of Qwen3-32B (all adjusted P<0.0083), while no significant difference was found in other pairwise comparisons (all adjusted P>0.0083). A similar trend was observed in the initial diagnosis stage, where only Qiyuan 3.0.1 was significantly superior to Qwen3-32B (adjusted P=0.008). The median number of correct Top-3 diagnoses for all four models was 2.0 (IQR 1.0-2.0), with no significant intergroup difference (Friedman χ²=3.34, df=3, P=0.339). Conclusions: The critical care-specific LLM Qiyuan 3.0.1 has superior Top-1 diagnostic accuracy in EICU diseases compared with some general-purpose LLMs, but the absolute diagnostic accuracy of all included models still has considerable room for improvement. LLMs have potential application value as auxiliary diagnostic tools in the EICU, but their clinical application still requires further optimization and multi-center prospective clinical trial validation.
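
The 0.0083 cutoff used in the post hoc comparisons above follows from a Bonferroni correction over all pairs of the four models. The following is a minimal sketch of that arithmetic together with an exact (binomial) McNemar test on discordant pairs. The function names are hypothetical, the abstract does not say which McNemar variant was used, and no discordant-pair counts are reported, so the counts in the usage line are invented for illustration only.

```python
from math import comb


def bonferroni_alpha(n_groups: int, alpha: float = 0.05) -> float:
    """Per-comparison significance level when all pairs of groups are tested."""
    n_pairs = comb(n_groups, 2)
    return alpha / n_pairs


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases model A got right and model B got wrong
    c: cases model B got right and model A got wrong
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


threshold = bonferroni_alpha(4)  # four models, six pairwise comparisons
print(round(threshold, 4))

# Hypothetical discordant counts, not taken from the study.
p_value = mcnemar_exact_p(b=25, c=8)
print(p_value < threshold)
```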

  • Background: Passive vaccine safety surveillance systems often generate clinically incomplete adverse event following immunization (AEFI) reports, which may lack the diagnostic evidence needed for causality assessment. While the period for collecting critical clinical data is limited, specialist expertise to identify necessary evidence at the point of reporting is often unavailable. No existing system provides structured guidance to evaluate whether a report contains sufficient evidence for assessment or to identify the specific clinical data required. Objective: This study aimed to develop and evaluate a surveillance support system that generates actionable investigation guidance for field epidemiologists at the point of AEFI report intake. The system identifies what clinical evidence is present, what is missing, and what additional data would most impact the causality assessment. Methods: We developed Vax-Beacon, a 6-agent neuro-symbolic pipeline that processes Vaccine Adverse Event Reporting System (VAERS) reports. The system uses a large language model (LLM) for generating free-text narratives through clinical observation, a curated knowledge database for differential diagnosis matching, and deterministic code for WHO causality classification, producing structured investigation guidance for each case. We tested the system on 100 purposively curated VAERS myocarditis/pericarditis cases. Two field epidemiologists independently evaluated pipeline-generated guidance using 5-point Likert scales and open-ended feedback. Results: The pipeline processed all 100 cases without errors. WHO classification yielded A1 in 45%, C in 27%, B2 in 7%, and Unclassifiable in 21%. Brighton Level 4 early exit occurred in 20% of cases, precluding definitive classification. For these cases, the pipeline generated prioritized diagnostic checklists specifying which tests would upgrade certainty.
Cardiac biomarkers such as troponin I and CK-MB were recommended as high-priority tests and cardiac magnetic resonance imaging as a lower-priority follow-up for suspected myocarditis. The neuro-symbolic architecture ensured 100% reproducibility of all classification decisions across independent benchmark runs. In structured expert review (two field epidemiologists), Likert scores ranged from 3 to 5 (mean 4.33); both reviewers estimated 30–50% workload reduction and agreed the system is suitable as an official investigation support tool. Conclusions: Vax-Beacon demonstrates that neuro-symbolic AI can function not as a classification oracle, but as a surge-ready investigation focus tool, directing field epidemiologists to the right evidence items for known adverse events at the moment when collecting that evidence remains feasible. This principle, Designed Deference, addresses a critical gap in passive surveillance: the loss of retrievable clinical evidence between reporting and expert review.

  • eHealth for Safe Communication When Giving Birth: A Mixed Methods Study With Pregnant Women and Mothers

    Date Submitted: Apr 6, 2026
    Open Peer Review Period: Apr 13, 2026 - Jun 8, 2026

    Background: Patient safety in obstetrics can be enhanced through theory-based communication interventions, mitigating risks of preventable adverse events (PAE). Digital interventions could be a way of scaling up and expanding interventions. Objective: To find out to what extent patient safety and individual determinants of PAE (perceived safety, communication behavior, outcome expectancies, self-efficacy, planning) can be improved through a theory-based training grounded in the health action process approach and delivered via the internet in the form of a web app. Methods: A mixed methods design was applied. Qualitative data were collected through semi-structured interviews with N = 20 app users and analyzed thematically with qualitative content analysis using MAXQDA 2020. Quantitatively, a nationwide online sample of N = 651 participants who provided informed consent was recruited, and participants were randomized to the waitlist control group (CG; n = 324) or the intervention group (IG; n = 327) with immediate access to the app. Baseline values were drawn from the pre-app survey (T0), and post-intervention values from T2 (after Module 3) for the IG, and at T1 (before Module 1) for the CG. Missing data were imputed using Multiple Imputation by Chained Equations (MICE; m = 122 datasets). All test variables were linearly rescaled to a 0–100 metric to ensure comparability. Bayesian mixed-effects models were fitted for each test variable with group×time as the primary interaction term, including informed priors from previous evidence (age, education, family status). Results: Qualitative findings revealed that participants perceived improved confidence, empathy, and preparedness for childbirth, and valued the app’s clarity, usability, and relevance to real clinical encounters. In the qualitative study, 75% of the study participants (n = 15) worked through all modules, and those who discontinued (n = 5) reported time constraints or lack of interest. 
Duration of engagement ranged from a few days to several months. Increased empathy and perspective taking were the most frequently mentioned learnings. In the quantitative study, the app produced meaningful improvements across most evaluated constructs. Posterior estimates for the group×time interaction indicated significant effects of −0.30 (95% CrI [−0.33, −0.26]) for perceived safety, 0.38 (95% CrI [0.35, 0.41]) for communication behavior, 0.21 (95% CrI [0.17, 0.24]) for self-efficacy, and 0.83 (95% CrI [0.79, 0.86]) for action planning. In contrast, there were no interaction effects of group×time for outcome expectancies (0.03, 95% CrI [−0.07, 0.01]). Conclusions: The app effectively enhanced patient safety and volitional indicators of safe communication (behavior, self-efficacy, planning) in line with the theory and hypotheses. Together, the qualitative and quantitative findings show that internet-delivered interventions can produce measurable improvements in communication behavior and perceived patient safety when designed with interactive, user-centered, and theory-based elements. Future implementations should strengthen personalization, feedback, and adaptive engagement strategies to sustain use and extend behavioral transfer beyond obstetric contexts. Clinical Trial: ClinicalTrials.gov Identifier: NCT03855735; https://classic.clinicaltrials.gov/ct2/show/NCT03855735
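
    The linear rescaling to a 0–100 metric mentioned in the Methods can be illustrated with a minimal Python sketch. The abstract does not state the original scale ranges, so the 1–5 and 0–10 scales below are hypothetical, as is the function name.

```python
def rescale_0_100(value: float, scale_min: float, scale_max: float) -> float:
    """Linearly map a score from [scale_min, scale_max] onto [0, 100]."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return (value - scale_min) / (scale_max - scale_min) * 100.0


# Hypothetical instruments (not taken from the study): a 1-5 Likert item
# and a 0-10 rating. After rescaling, both share the same 0-100 metric.
likert_score = rescale_0_100(4, 1, 5)
rating_score = rescale_0_100(5, 0, 10)
print(likert_score, rating_score)
```

    Because the mapping is linear, it changes only the units of each variable, not the ordering of participants or the shape of the distribution.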

  • Background: Rapid research responses to emerging infectious disease (EID) outbreaks depend not only on how quickly studies are launched, but also on whether their data can be combined, compared, and reused across studies. Health data standards, including shared vocabularies, terminology, and information models, are the structural prerequisite for findable, accessible, interoperable, and reusable (FAIR) data. Despite European Commission (EC) investments exceeding €130 million across the EID cohort and clinical trial consortia coordinated through the Cohort Coordination Board (CCB) and Trial Coordination Board (TCB), little empirical evidence exists on the extent to which these consortia adopt standards, the barriers they face, or what funders could do to improve implementation. Objective: To characterise health data standards adoption across EC-funded EID consortia, identify the barriers that prevent uptake, and generate evidence-based recommendations for funders to strengthen standards implementation for the rapid reuse of interoperable participant-level data in epidemic detection and response. Methods: We conducted a cross-sectional online survey from May 2023 to February 2024, developed through a literature review and stakeholder consultation, with CCB and TCB-affiliated EC-funded EID consortia. Research networks and consortia outside these boards could participate if forwarded the survey. We collected information on consortium characteristics, standards use, barriers to adoption, awareness of EC-supported standardisation initiatives, and recommendations for improving uptake. Responses were analysed descriptively; open-text responses were categorised thematically. Results: Thirty-three responses, representing 15 consortia or research networks spanning over 40 countries, were collected. Most responses came from cohort consortia. Adoption of data standards was limited. 
The most frequently used standards were ICD codes (n=10) and the Systematised Nomenclature of Medicine Clinical Terms (n=9); 7 respondents reported not using standards. The main barriers were insufficient experience applying standards (n=17), lack of budget (n=12), uncertainty about which standard to use (n=12), uncertainty about which standards related studies used (n=9), and inadequate tools (n=9). Awareness of EC initiatives designed to support standards adoption was strikingly low, suggesting that EC investment in standards support is not reaching its intended audience. Respondents recommended dedicated budgets, clearer guidance on preferred standards by data type, better communication of the benefits of standards adoption, stronger tooling, and funder mandates. Conclusions: Health data standards are underused across European EID consortia, representing a preventable bottleneck for pandemic preparedness despite substantial public investment. European funders can address this through the following actions recommended by major EC-funded EID Consortia: mandating dedicated standards budgets at the grant submission stage, issuing formal guidance on preferred standards by data type, investing in open-source tooling that delivers value to data generators, requiring machine-actionable data management plans, and establishing a public registry of standards adopted by funded consortia. Strengthening coordinated standards adoption is a necessary and achievable step toward the FAIR, interoperable research data infrastructure that effective pandemic response demands. Clinical Trial: Not applicable.

  • Background: The Registry of Stroke Care Quality (RES-Q) is a healthcare quality improvement platform used globally. RES-Q collects structured quality-of-care data for stroke patients, requiring clinicians to manually extract information from electronic health records or documents such as discharge summaries. This process is essential but time-consuming, particularly given the variability, length, and semi-structured nature of clinical reports. Objective: To develop and evaluate a multilingual Evidence-Based Question-Answering framework that identifies supporting text spans in clinical reports of stroke patients and proposes answer suggestions for structured clinical forms, with the goal of reducing clinician workload while preserving full human oversight. Methods: We conduct a multilingual study using 1,596 pseudonymized stroke discharge summaries in six languages, annotated with question-evidence-answer triplets. Encoder-based language models are used to extract evidence spans from the reports, while generative language models are used to predict normalized form answers based on the extracted evidence. We compare multiple training strategies: models trained on reports in a single target language, models trained jointly on reports in different languages, and models trained on original reports combined with cross-lingual data augmentations. We evaluate performance on Evidence Extraction, Answer Prediction, and end-to-end Evidence-Based Question Answering across the six languages. Results: The presented Evidence-Based Question-Answering system achieves 89% end-to-end accuracy in form filling across six languages (77% for patient-specific questions and 95% for default or unverifiable items). Evidence Extraction is the primary bottleneck, reaching 85% F1 and 79% Exact Match, whereas Answer Prediction based on extracted evidence is more stable, achieving 95% accuracy. 
The performance varies by question type, and cross-lingual training generally reduces Evidence Extraction performance but has little effect on Answer Prediction. Model performance is influenced more by reporting practices and dataset characteristics than by language itself. Conclusions: Evidence-Based Question Answering over multilingual stroke discharge summaries enables human-in-the-loop validation and effective answer prediction with moderate computational resources. Evidence Extraction is the main bottleneck, while Answer Prediction is robust across languages and model sizes. The approach supports structured data collection, though generalization to new languages requires target-language training data.
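
  The abstract reports F1 and Exact Match for Evidence Extraction without stating how they are computed. A minimal Python sketch of one common convention follows, assuming SQuAD-style token-overlap F1 with lowercase whitespace tokenization; the function names and the example strings are hypothetical, not taken from the study.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed, simplified normalization)."""
    return text.lower().split()


def exact_match(pred: str, gold: str) -> bool:
    """True when the predicted span equals the gold span after normalization."""
    return normalize(pred) == normalize(gold)


def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall between two spans."""
    p, g = normalize(pred), normalize(gold)
    if not p or not g:
        return float(p == g)  # both empty counts as a perfect match
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)


# Hypothetical evidence spans: the prediction is a correct but shorter span.
gold = "NIHSS score 7 on admission"
pred = "NIHSS score 7"
print(exact_match(pred, gold), round(token_f1(pred, gold), 2))
```

  Under this convention a partially overlapping span earns F1 credit but no Exact Match credit, which is one reason the F1 figure can sit above the Exact Match figure.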

  • Background: Reproducibility is a cornerstone of scientific validity, yet many biomedical studies lack sufficient transparency for independent verification. Recent advances in Large Language Models (LLMs) enable the development of autonomous agent systems capable of performing complex research tasks, offering new opportunities to assess and enhance reproducibility at scale. Objective: To evaluate the ability of LLM-based autonomous agents to reproduce key findings from published Alzheimer’s disease studies using a shared, publicly available dataset. Methods: We used the National Alzheimer’s Coordinating Center Uniform Data Set “Quick Access” dataset. Five eligible studies were identified through citation-based screening and predefined inclusion criteria. We developed a multi-agent system using GPT-4o (Autogen framework), simulating a research team to generate and execute code based on study abstracts, methods, and selected data dictionary variables. Reproducibility was evaluated at the assertion level using extracted abstract findings, with agreement defined by numerical tolerance or directional consistency. We additionally assessed statistical method alignment and overall workflow coherence. Results: A total of 35 findings were extracted across 5 studies. LLM agents reproduced a mean of 53.2% of findings, with 3/5 studies achieving majority replication. Agreement was higher for directionality and significance than for numerical estimates. Exact statistical method alignment occurred in 1/5 studies; 8/15 comparisons were partially aligned, mainly for standard methods. Domain-specific methods were often omitted or simplified. Reproduction required iterative correction (mean 35.6 steps), with code errors in 47.2% of runs but resolved autonomously. 
Failures were primarily due to incomplete reporting and incorrect implementation. Conclusions: LLM-based autonomous agents demonstrate moderate capability in reproducing published biomedical findings, particularly for studies with clear, well-specified methods. However, reproducibility is limited by incomplete reporting, challenges in implementing domain-specific methods, and breakdowns in multi-step workflow fidelity. These findings suggest that LLM agents may serve as scalable tools for preliminary reproducibility assessment, while emphasizing the need for improved methodological transparency and validation frameworks in biomedical research.
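
  The Methods define agreement by numerical tolerance or directional consistency but give no thresholds. The Python sketch below shows one way such an assertion-level check could look, assuming a hypothetical 10% relative tolerance and a caller-supplied null value (for example 1.0 for an odds ratio); it is not the authors' implementation.

```python
import math


def agreement(original: float, reproduced: float,
              rel_tol: float = 0.10, null_value: float = 0.0) -> str:
    """Classify a reproduced estimate against the published one.

    rel_tol and null_value are illustrative assumptions, not study parameters.
    """
    # Numerical agreement: the two values lie within the relative tolerance.
    if math.isclose(original, reproduced, rel_tol=rel_tol):
        return "numerical"
    # Directional agreement: both estimates fall on the same side of the null.
    if (original - null_value) * (reproduced - null_value) > 0:
        return "directional"
    return "none"


# Hypothetical odds ratios compared against a null value of 1.0.
print(agreement(1.50, 1.58, null_value=1.0))  # within 10 percent
print(agreement(1.50, 1.90, null_value=1.0))  # same direction only
print(agreement(1.50, 0.80, null_value=1.0))  # opposite direction
```

  Separating the two criteria in this way makes it possible to report, as the abstract does, that agreement was higher for direction than for numerical estimates.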

  • Background: Patient-facing digital health tools such as mobile health (mHealth) apps, wearables, and digital therapeutics have expanded rapidly and show promise for improving chronic disease management. Despite increasing evidence of effectiveness, health systems and payers continue to face challenges integrating these tools into routine care. Objective: This study examined the decision-making processes of health system and payer leaders regarding the adoption and sustainability of patient-facing digital health tools within their organizations. Methods: We conducted semi-structured interviews with nine senior leaders from a large Midwestern academic health system and affiliated payer organizations, including a provider-owned health plan and a state Medicaid program. Interviews explored digital health adoption decisions, perceived value and fit, barriers, and sustainability considerations, focusing on adoption of an evidence-based mHealth intervention for alcohol use disorder as a use case. Transcripts were analyzed using thematic analysis with inductive and deductive coding. Results: Four decision-making mechanisms shaped adoption and sustainability decisions: prioritization under organizational constraint, risk mitigation, operational fit, and value determination. These mechanisms describe how leaders navigate limited organizational capacity, reduce uncertainty and protect against clinical, financial, and operational risks, assess whether tools can integrate within existing clinical and technical systems, and determine whether anticipated and measurable benefits justify adoption and continued organizational support. Conclusions: Adoption and sustainability of patient-facing digital health tools are shaped by dynamic organizational decision-making processes that often remain invisible to patients, clinicians, researchers, and developers. 
Making these processes visible may help better align digital health tools with the realities of the healthcare system to support implementation.

  • Background: The growing integration of Personalized Risk Prediction (PRP) and Artificial Intelligence (AI) substantially re-shapes diagnostic and therapeutic decision-making in health care. At the same time, its responsible adoption depends not only on technical performance, but also on patients’ perspectives and acceptance. Objective: This study systematically examined patients’ perspectives across several European countries and explored how patients’ technology-related attitudes relate to their evaluations of personalized and AI-supported approaches in cardiac care. As part of the PROFID (Prevention of Sudden Cardiac Death After Myocardial Infarction by Defibrillator Implantation) project, its focus is on the ethical use of PRP and AI in the clinical context of decision-making regarding sudden cardiac death (SCD) prevention and implantable cardioverter-defibrillator (ICD) implantation. Methods: The study used a cross-sectional survey design with a standardized questionnaire including multimedia content. The target population comprised adults aged 18 years or older living in six European countries who met at least one of the following (self-reported) clinical criteria: heart failure, myocardial infarction (MI), cardiac arrest, or current ICD implantation. An exploratory factor analysis (EFA) was used to identify and evaluate internally consistent scales, and subsequent regression analyses examined associations between these scales and technological openness, sociodemographic characteristics, and patients’ views on PRP and AI in cardiac care. Results: The sample consisted of 470 participants from Germany (n=210), the Netherlands (n=86), the United Kingdom (n=145), and three other European countries (n=29; Austria, Belgium, and Spain). Overall, 51.9% (244/470) of respondents were male and 48.1% (226/470) were female. The mean age of the sample was 61.12 (SD 12.62) years. 
The EFA showed six clearly interpretable factors: (1) Perceived benefits and support of PRP models in medical decision-making (MDM); (2) Perceived benefits and support of AI in MDM; (3) Transparency expectations in algorithmic decision-making; (4) Support for delegating decisions to algorithms; (5) Self-reported AI literacy; and (6) Preference for shared decision-making (SDM). The regression analyses showed that technological readiness, self-reported AI literacy, support for delegation of decisions to algorithms, transparency expectations in algorithmic decision-making, preferences for SDM, and educational attainment predicted patients’ perceived benefits and support of PRP or AI in MDM. Conclusions: The findings support existing assumptions while also highlighting additional aspects that should be considered if high-level technologies are used in decision-making processes related to ICD implantation. PRP and AI were generally perceived as useful tools to support decision-making regarding ICD indication, provided that transparency is ensured and patients remain actively involved in the decision-making process. Mandatory use and full delegation of decision-making directly to AI were broadly rejected. Generally, men showed more positive perceptions of the use of AI in MDM than women. The attributed acceptance of delegation to PRP models was significantly higher than that of delegation to AI.

  • Background: Chronic pain is a prevalent and complex condition requiring long-term, multidimensional management. Digital therapeutics (DTx) have emerged as a promising nonpharmacological intervention; however, evidence regarding their effectiveness remains inconsistent due to heterogeneity in intervention types and study designs. Objective: This study aimed to systematically review and meta-analyze the effectiveness of digital therapeutics in reducing pain among patients with chronic pain. Methods: A systematic review and meta-analysis were conducted following PRISMA 2020 guidelines. Electronic databases, including PubMed, Embase, CINAHL, and the Cochrane Library, were searched from inception to December 10, 2025. Randomized controlled trials evaluating DTx interventions in adults with chronic pain were included. The primary outcome was pain intensity, and secondary outcomes included physical function and psychological outcomes (quality of life, anxiety, depression, and pain catastrophizing). Effect sizes were calculated as standardized mean differences using a random-effects model, and risk of bias was assessed using the Cochrane Risk of Bias tool. Results: A total of 7 studies were included in the meta-analysis. Digital therapeutics demonstrated a statistically significant reduction in pain intensity (SMD = -0.87, 95% CI: -1.70 to -0.03, p = 0.04); however, heterogeneity was substantial (I² = 97%). No significant effects were observed for physical function (SMD = 0.62, 95% CI: -1.57 to 2.80, p = 0.58) or overall psychological outcomes (SMD = -0.92, 95% CI: -2.01 to 0.17, p = 0.10). Among psychological outcomes, quality of life showed a trend toward improvement (SMD = 0.28, p = 0.07), whereas anxiety, depression, and pain catastrophizing showed no significant effects and substantial heterogeneity. 
Conclusions: Digital therapeutics may contribute to reductions in pain intensity in patients with chronic pain; however, the effects on physical function and psychological outcomes remain inconsistent. The high level of heterogeneity suggests that the effectiveness of DTx varies considerably depending on intervention characteristics and study design. Further high-quality and standardized trials are needed to establish the clinical effectiveness of DTx. Clinical Trial: PROSPERO CRD 420261355510; https://www.crd.york.ac.uk/PROSPERO/view/CRD420261355510
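
  The Methods name a random-effects model and the Results report I², but the abstract does not name the between-study variance estimator. The Python sketch below assumes the DerSimonian-Laird estimator, one common choice, and runs on hypothetical effect sizes and variances; none of the numbers come from the included trials.

```python
import math


def random_effects_dl(effects: list[float], variances: list[float]) -> dict:
    """Pool study effects with DerSimonian-Laird random-effects weights."""
    k = len(effects)
    w = [1.0 / v for v in variances]               # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)  # normal-approximation 95% CI
    return {"pooled": pooled, "ci": ci, "tau2": tau2, "i2": i2}


# Hypothetical standardized mean differences and sampling variances.
result = random_effects_dl([-0.2, -0.9, -1.5], [0.04, 0.04, 0.04])
print(round(result["pooled"], 2), round(result["i2"], 1))
```

  When I² is high, as reported here, tau² dominates the study weights and widens the pooled confidence interval, which is consistent with an interval whose upper bound sits close to zero.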

  • Impact of Patient Engagement in Remote Diabetes Management on Glycemic Outcomes: A Causal Inference Approach

    Date Submitted: Apr 8, 2026
    Open Peer Review Period: Apr 9, 2026 - Jun 4, 2026

    Background: Suboptimal glycemic control remains a major public health challenge for patients with type 2 diabetes and prediabetes. Remote glucose monitoring offers scalable support for self-management, but evidence on its real-world effectiveness and the causal impact of varying engagement levels is limited. Objective: To estimate the effect of patient engagement measured through glucose monitoring frequency on hemoglobin A1c (HbA1c). Methods: We analyzed 1,479 adults with type 2 diabetes or prediabetes enrolled in the iHealth Unified Care program, integrating Bluetooth glucose meters, a mobile app, lifestyle coaching, and primary care coordination. Engagement during the first six months was defined as the weekly frequency of glucose monitoring. The causal effect of monitoring frequency on HbA1c was estimated using marginal structural models with inverse probability weighting to address time-varying confounding. Results: At 6 months, HbA1c decreased by 0.53 (SD 1.46) percentage points (p < 0.001). We observed a dose-response relationship across engagement tiers: the highest-engagement group (16.99 measurements/week) achieved a 1.00 percentage point HbA1c reduction versus 0.34 in the lowest tier. In weighted models, each additional weekly measurement was associated with a 0.03 percentage point greater HbA1c reduction (p < 0.01). Findings were consistent in sensitivity analyses at 3 and 12 months. Conclusions: Engagement with a digitally enabled, primary care-integrated remote glucose monitoring program significantly improved glycemic outcomes across all engagement levels. Higher monitoring frequency produced greater HbA1c reductions, underscoring the importance of fostering sustained patient engagement to optimize diabetes management. Clinical Trial: Not Applicable

  • Background: Digital technology in health and social care can improve the well-being of people with long-term health conditions, but prior research has identified factors that hinder its adoption, particularly accessibility issues and challenges to its integration into everyday life. It is therefore important to fully understand both the facilitators and hindrances to the adoption of such technology. Objective: To explore the facilitators and hindrances to the adoption of digital technology in the everyday lives of adults with long-term health conditions. Methods: This scoping review systematically mapped relevant research following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). Thematic analysis was used to critically analyze the identified articles. Results: Forty-six articles were selected that examined 5,018 adults aged 18 and over. Six themes were identified: personal characteristics and preconditions; perceived usefulness in everyday life; design and technical functionality; support and guidance; human interaction; and integrity and trustworthiness. Conclusions: The findings are discussed in relation to the key constructs of the Unified Theory of Acceptance and Use of Technology (UTAUT): performance expectancy, effort expectancy, social influence, and facilitating conditions. Digital technologies support the daily lives of adults with long-term health conditions, but several challenges remain, including functionality that is not adapted to specific diseases or ages and a perceived lack of human interaction. Thus, digital technology is not a one-size-fits-all solution but should be an adaptable tool that accommodates individual preferences and contexts and that complements in-person human interactions.

  • BeProGuide: A Behavior Design Tutorial to De-Implement Low-Value Clinical Practices

    Date Submitted: Mar 30, 2026
    Open Peer Review Period: Apr 6, 2026 - Jun 1, 2026

    Background: Although concern about low-value care (LVC) practices has grown in recent years, interventions relying solely on informational or educational strategies have not proven effective in reducing them. This suggests a need to involve professionals and/or patients in collaborative decision-making processes and change strategy design. This study builds on this premise using the Fogg Behavior Model, which posits that a behavior can only occur when motivation, ability, and a prompt converge at the same time. Objective: The aim of this study is to develop a guide to designing, implementing and evaluating interventions that reduce LVC practices. We present a specific case study involving the deprescribing of benzodiazepines in primary care and use it as an example of the process to be followed to reduce other practices of this kind. Methods: This study was conducted in two primary care centers in Catalonia, Spain. A total of 31 professionals (physicians and nurses) took part in focus groups employing three techniques from the Fogg Behavior Model: Swarm of Behaviors, Focus Mapping, and Golden Behaviors. Through these techniques, we worked with participants to compile a set of actions for implementation. These actions were tailored to the conditions and capacities of their health centers and were assessed by the participants as feasible and effective in reducing benzodiazepine prescribing. Results: Based on this practical experience, we developed our ten-step BeProGuide, which outlines a series of tasks that we recommend completing in any project aimed at reducing LVC practices. This is presented as a detailed checklist to support informed decision-making. Conclusions: Our research operationalizes the Fogg Behavior Model by setting out a concrete, replicable procedure for reducing LVC clinical practices. 
In doing so, it transforms this conceptual framework into an actionable methodological tool, BeProGuide, which takes the form of a step-by-step guide and detailed checklist. This guide is not only applicable in health and medicine, but can be used in other fields such as education, work and organizations, and environmental protection.

  • Background: Children with inattentive attention-deficit/hyperactivity disorder (ADHD) often present with impairments in executive functions and fine motor skills in addition to core inattentive symptoms. However, the effects of structured remote fine motor training on these outcomes remain unclear. Objective: To examine the effects of a 12-week telerehabilitation-based fine motor training program on inattention symptoms, executive functions, and fine motor performance in children with inattentive ADHD. Methods: This assessor-blinded randomized controlled trial investigated a 12-week remote fine motor training program delivered via Tencent Meeting in children aged 6-10 years with inattentive ADHD. Sixty-six children were randomly assigned to either the intervention group (n=33) or a wait-list control group (n=33). The intervention was conducted 3 times per week, 60 minutes per session, for 12 weeks. Assessments were performed at baseline, immediately postintervention, and at 3-month follow-up. The outcomes were inattention symptoms, executive functions, and fine motor skills. Linear mixed models were used for the main analysis, and mediation analysis was performed to examine whether executive functions explained changes in inattention. Results: Compared with the wait-list control group, the intervention group showed significantly greater reductions in inattention symptoms at 12 weeks (MD= −3.85, 95% CI: −5.01 to −2.68) and 3-month follow-up (MD= −2.00, 95% CI: −3.17 to −0.83). For executive functions, significant between-group differences were observed in inhibitory control, immediate memory, and cognitive flexibility at both time points (P<0.05), while the difference in delayed memory was significant at 12 weeks only (MD= 3.03, 95% CI: 0.57 to 5.49) and showed no significant between-group difference at follow-up (MD= 1.62, 95% CI: −0.84 to 4.08). 
For fine motor outcomes, significant between-group differences were found in manual dexterity and hand-eye coordination at both 12 weeks and 3-month follow-up (P<0.05), and in writing skills at 12 weeks (MD= −6.85, 95% CI: −13.38 to −0.32) but not at follow-up (MD= −2.18, 95% CI: −8.71 to 4.35). Subgroup analyses suggested age-related variation in treatment response, with younger children showing more evident gains in fine motor performance and older children showing more sustained improvements in inattention and selected executive function domains. Mediation analysis showed that inhibitory control partially mediated the effect of the intervention on inattention (indirect effect: β= −0.85, 95% CI: −1.85 to −0.08). Conclusions: A 12-week remote fine motor training program may be a feasible, safe, and effective nonpharmacological intervention for children with inattentive ADHD. The intervention improved inattention symptoms, executive functions, and fine motor performance, with inhibitory control emerging as a potential mechanism underlying symptom improvement. The subgroup findings further suggest that developmental stage may influence the pattern of response, which may help guide age-tailored intervention design in future practice. Clinical Trial: Chictr.org.cn ChiCTR2200065413; https://www.chictr.org.cn/showproj.html?proj=182412

  • The Role of Incentivisation in Communities of Practice: a systematic review

    Date Submitted: Apr 5, 2026
    Open Peer Review Period: Apr 6, 2026 - Jun 1, 2026

    Background: Incentivisation is increasingly used to maintain engagement and support behaviour change within communities of practice (CoPs), yet its effectiveness across chronic disease contexts remains uncertain. Objective: To examine how incentives are integrated into CoPs and related peer-support models, and to assess their impact on participant activation, engagement, and health-related outcomes. Methods: PubMed/MEDLINE, Embase, Scopus, and CENTRAL were searched from inception to June 2025 using predefined terms relating to CoPs, incentivisation, and patient-centred outcomes. Peer-reviewed empirical studies involving incentivised CoPs or analogous peer-support interventions for adults with chronic conditions were eligible. Four reviewers independently screened studies, extracted data, and assessed risk of bias in line with PRISMA 2020 guidance. Heterogeneity in design and outcomes required narrative synthesis. Results: From 667 records, four randomised controlled trials met inclusion criteria. Financial incentives produced the greatest short-term gains in physical activity, while non-financial approaches such as gamification, points, badges, and structured peer support yielded modest improvements in step count, treatment adherence, or diet quality. No consistent effects were observed for patient activation, self-efficacy, mental health, or quality of life. Engagement moderated effectiveness, although attrition was common. Conclusions: Incentivisation can enhance short-term behavioural outcomes within CoPs, but evidence for sustained psychosocial benefit is limited. Larger, longer-term studies are needed to clarify which incentive strategies deliver durable improvements in engagement and self-management. Clinical Trial: This review was registered on PROSPERO, an international prospective register of systematic reviews (January 2026, reference CRD420251244276).

  • Restrictive Family Relationships as a Mediator of Adolescent Social Media Use Disorder

    Date Submitted: Apr 2, 2026
    Open Peer Review Period: Apr 3, 2026 - May 29, 2026

    We investigated the relationships between academic stress, family relationships, and social media use disorder (SMUD) among middle school students through the ecological systems perspective. We used a mixed-methods sequential explanatory design. Structural equation modeling showed that academic stress had a significant direct impact on SMUD. Restrictive family relationships significantly mediated this relationship. The proposed model explained 42.6% of the variance in SMUD. Qualitative findings highlighted family patterns and coping and revealed how rigid parental control and limited emotional support reinforced restrictive family dynamics, which in turn further linked academic stress to SMUD. The present research extends Ecological Systems Theory to online behavior and establishes restrictive family relationships as key mediating mechanisms in SMUD development. These findings underscore the importance of family-based interventions that promote open communication and adaptive stress management strategies to mitigate the risk of SMUD.

  • Background: Hospitalized children frequently experience pain and distress. Pain is a multidimensional experience involving both sensory and emotional components, necessitating multimodal management strategies. Socially assistive robots (SARs) have shown promise as non-pharmacological interventions in pediatric care. However, the interaction mechanisms through which SARs influence pain and emotional responses, particularly positive emotion and real-time emotional dynamics during child–robot interaction, remain underexplored. Objective: This study, titled the HAPPY (Hospitalized Assistance for Pediatric Pain Yields) study, aimed to evaluate the association between a SAR-based intervention and postoperative pain in hospitalized children and to examine whether different levels of engagement are associated with changes in real-time emotional dynamics. Methods: A single-group pretest–posttest design was conducted with 37 hospitalized children (mean age 7.35, SD 2.06 years) following tonsillectomy or adenoidectomy. The intervention was structured into three sequential phases: Phase 1 (warm-up/limited engagement), Phase 2 (educational video/passive engagement), and Phase 3 (social interaction/active engagement). Pain was assessed using the Wong-Baker FACES pain scale and observed behavioral FLACC scales. Emotional response, as valence, was measured using an automated facial expression recognition system (FaceReader 10). Changes in pain were analyzed using Wilcoxon signed-rank tests, and differences in emotional valence across phases were examined using the Friedman test with post hoc pairwise comparisons. Results: Self-reported pain significantly decreased from a median of 6 (IQR 4–6) to 4 (IQR 2–4) (P<.001), and observer-rated behavioral pain decreased from a median of 3 (IQR 2–4) to 1 (IQR 1–2) (P<.001). Overall differences in emotional valence across phases did not reach statistical significance (P=.053; Kendall’s W=0.084). 
However, a V-shaped trajectory of emotional valence was observed, with the lowest values during the passive engagement phase 2 (mean –0.24, SD 0.20) and relatively higher values during the active engagement phase 3 (mean –0.15, SD 0.13). The exploratory post hoc analyses indicated a significant increase in emotional valence from Phase 2 to Phase 3 (adjusted P=.012). Conclusions: SAR-based interventions were associated with reductions in postoperative pain in hospitalized children. Although overall emotional differences across phases were not statistically significant, the observed pattern suggests that active engagement may be associated with more positive emotional responses compared to passive engagement. These findings highlight the potential importance of interaction quality in SAR interventions and provide insight into the processes underlying their clinical effects in pediatric care. Clinical Trial: No

  • Public Perception of Health Care Before, During, and After COVID-19: A Longitudinal Analysis of Online Reviews

    Date Submitted: Apr 1, 2026
    Open Peer Review Period: Apr 1, 2026 - May 27, 2026

    Background: Online reviews of health care services represent a growing source of unsolicited, citizen-generated data that can complement traditional instruments for monitoring public perception of health systems. However, longitudinal analyses examining how citizen perception evolved before, during, and after the COVID-19 pandemic remain scarce, and existing studies have rarely differentiated between levels of care. Objective: This study aimed to examine the longitudinal evolution of public perception of a regional public health system over a ten-year period, with particular attention to differences between primary care and hospital services, and to assess whether the COVID-19 pandemic produced a temporary disruption or a more persistent structural shift in citizen evaluation of health care. Methods: A retrospective longitudinal observational study was conducted using 47,589 online reviews of 812 public health care facilities in Andalusia, Spain, collected from Google Maps and covering the period 2016–2025. Reviews were classified as positive or negative based on star ratings, validated against manual annotation using Cohen's kappa. The proportion of negative reviews was analyzed across three periods: pre-pandemic (2016–2019), pandemic (2020–2021), and post-pandemic (2022–2025). Structural breaks were identified using change-point detection analysis. Logistic regression models with robust standard errors clustered at the facility level were used to quantify differences in negative sentiment across levels of care and over time. Results: The proportion of negative reviews increased from 38.7% in the pre-pandemic period to 73.7% during the pandemic, remaining elevated at 66.5% in the post-pandemic period. Change-point detection identified March 2020 as a major structural break. 
The pandemic had markedly different effects across levels of care: negative reviews in primary care rose from 34.8% to 81.9% during the pandemic, remaining at 75.7% post-pandemic, whereas hospital care showed a more moderate increase from 43.8% to 55.6%, remaining stable thereafter. Logistic regression models confirmed that the trajectory of negative perception in primary care diverged significantly from hospital care during and after the pandemic, with interaction terms indicating substantially higher odds of negative reviews in primary care during the pandemic (OR = 5.28, 95% CI 3.95–7.07) and post-pandemic periods (OR = 3.66, 95% CI 2.66–5.03). Conclusions: The findings indicate that the COVID-19 pandemic was associated with a persistent structural shift in public perception of health care services rather than a temporary fluctuation, and that this shift was disproportionately concentrated in primary care. The sustained deterioration in citizen perception of primary care observed years after the acute crisis suggests that post-pandemic recovery strategies should explicitly address the post-crisis phase and prioritize the relational and communicative dimensions of primary care alongside structural capacity. Large-scale digital trace data offer a scalable and continuous complement to traditional patient satisfaction instruments for monitoring health system legitimacy over time.
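
As an illustration of the validation step mentioned in the abstract above (star-rating labels checked against manual annotation with Cohen's kappa), a minimal Python sketch follows. The ratings, the manual labels, and the rule that 1 or 2 stars count as negative are all hypothetical; the abstract does not state the threshold the authors used.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: share of items both raters labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def label_from_stars(stars, negative_max=2):
    """Hypothetical rule: ratings up to `negative_max` stars are negative."""
    return "negative" if stars <= negative_max else "positive"


# Hypothetical data, for illustration only.
star_ratings = [1, 5, 2, 4, 1, 5, 3, 5]
manual_labels = ["negative", "positive", "negative", "positive",
                 "negative", "positive", "negative", "positive"]

star_labels = [label_from_stars(s) for s in star_ratings]
kappa = cohens_kappa(star_labels, manual_labels)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this toy data the two labelings disagree only on the 3-star review, which shows how a borderline rating can lower agreement even when the extremes match.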

  • Externalized Living Memory: Structuring Clinical Knowledge for the Age of AI Agents

    Date Submitted: Apr 1, 2026
    Open Peer Review Period: Apr 1, 2026 - May 27, 2026

    As AI agents become increasingly capable of autonomous action in health care, a prerequisite remains underaddressed: the persistent, structured memory that makes such action contextually meaningful. Clinicians face cognitive overload not from any single task but from the erosion of decision context over time. Existing tools—personal knowledge management frameworks, LLM built-in memory, and autonomous agents—each address parts of this problem but leave gaps in auditability, portability, or contextual persistence. This Viewpoint argues that memory should precede action: before AI agents can act meaningfully, they need persistent, human-controlled context. We describe externalized living memory—a structured knowledge base that both human and AI can read and write—as it emerged from the first author's practice as a cardiovascular radiologist and division chief. The approach is organized as a layered architecture with a routing table for scalable context loading and a governance hierarchy for sustainable maintenance. We illustrate the approach through clinical vignettes, compare it with existing solutions, and discuss limitations including the small-team evidence base and maintenance costs. An open-source implementation with templates and setup instructions accompanies this paper.

  • Professionals, leaders, and institutions in healthcare and health research are rapidly adopting and integrating AI systems and chatbots into their regular work, but this poses risks for patients in the case of patient and public involvement and engagement (PPIE). AI offers economical solutions for overstretched health systems and burned-out staff, already shows strengths in speeding up more long-term and minute research practices, and provides unique accessibility accommodations. However, AI can also be used to create personas and virtual PPIE panels, which can speak completely or partially for human patients with lived experience of conditions, thus minimising, distorting, or erasing their voices from collaborative research processes. AI poses risks through several distorting factors, including hallucinations, overconfidence, sycophancy, bias, sexism, and racism. Staley and Barron have argued that learning is the greatest outcome of PPIE. However, if researchers, professionals, and staff use AI chatbots in conjunction with or in lieu of human collaborators, the amount of learning that takes place is greatly reduced, according to AI expert and cultural critic Ethan Mollick. In conclusion, we provide a checklist to guide professionals and researchers in ethical and responsible uses of AI that preserve the voices and roles of patients, members of the public, and people with lived experience.

  • Opioid-related Drug-Drug Interactions and Harm to Hospitalized Patients: A Retrospective Multicenter Cohort Study

    Date Submitted: Mar 29, 2026
    Open Peer Review Period: Mar 30, 2026 - May 25, 2026

    Background Opioid-related drug–drug interactions (DDIs) are common in hospitalized patients and can lead to serious harm, especially when opioids are combined with central nervous system depressants. Electronic medical records (EMRs) often trigger DDI alerts to warn clinicians of potential DDIs, but the effect of DDI alerts on clinically relevant opioid DDIs and related patient harm remains uncertain. This study evaluated whether EMR-integrated opioid DDI alerts reduce clinically relevant interactions and associated harms in routine hospital care. Objective This study aimed to address this gap by determining whether introducing these alerts reduces the prevalence of potentially and clinically relevant opioid-related DDIs, as well as the rate of DDI-related patient harm in hospitalized patients. Methods This retrospective cohort study was a secondary analysis of a multicenter quasi-experimental controlled pre–post evaluation of EMR implementation across five Australian hospitals. Adult inpatients were randomly selected from all patients who stayed in study hospitals for a one-week period six months before and six months after EMR implementation. Inpatients were included if they had at least one prescribed and administered opioid and one concurrent medication. Interruptive opioid DDI alerts were active only at intervention sites post-EMR. Potential DDIs were identified using Stockley’s Interaction Checker; pharmacists adjudicated clinically relevant DDIs, and clinical pharmacologists assessed DDI-related harm and causality. Clustered logistic regression with generalized estimating equations, adjusting for demographic and clinical variables, estimated the effect of alerts involving opioids on three outcomes: clinically relevant opioid DDIs (primary), any potential opioid DDI, and opioid DDI-related harm. Results Of 1,144 patients prescribed an opioid, 847 (74.0%) had at least one potential opioid DDI and 548 (47.9%) had at least one clinically relevant DDI. 
EMR alerts were associated with no significant change in clinically relevant DDIs (adjusted odds ratio 1.06, 95% CI 0.72–1.55; p=0.75). There was a significant reduction in potential opioid DDIs (adjusted odds ratio 0.55, 95% CI 0.41–0.74; p<0.001). Overall, 11 patients with a total of 38 DDIs experienced harm (0.6% of potential and 1.1% of clinically relevant DDIs), with most DDIs involving pharmacodynamic interactions with concomitant CNS depressants. Conclusion EMR opioid DDI alerts reduced overall exposure to potential DDIs but did not decrease clinically relevant interactions or related harm. The low rate of harmful events highlights the limited clinical value of current alert systems and the burden of low-value warnings.

  • Digital Health Among Educators: A Scoping Review

    Date Submitted: Mar 30, 2026
    Open Peer Review Period: Mar 30, 2026 - May 25, 2026

    Background: Digital health technologies are increasingly used to support health management, yet research focusing on educators’ engagement with digital health remains limited. Given that educators’ health literacy and digital practices influence both their own wellbeing and their students’ health behaviors, understanding their interaction with digital health tools is essential. Objective: This scoping review maps the current literature on educators’ digital health and identifies gaps in knowledge. Methods: Following the Arksey and O’Malley framework, a systematic search for peer-reviewed articles yielded 17 eligible studies. Results: Thematic analysis revealed three themes: 1) educators’ digital health literacy and its correlates; 2) educators’ experiences and challenges in using digital health technologies; and 3) professional support and interventions for educators’ digital health. Conclusions: The findings indicate uneven levels of educators’ digital health literacy, mixed experiences shaped by usability and contextual constraints, and insufficient institutional support for sustained engagement. This review highlights the need for targeted capacity-building efforts and context-sensitive interventions to enhance educators’ digital health competencies and to inform future research and policy development.

  • Operationalizing ethical review protocols for health data access: a scoping review of tools and frameworks

    Date Submitted: Mar 27, 2026
    Open Peer Review Period: Mar 30, 2026 - May 25, 2026

    Background: The European Health Data Space (EHDS) will substantially increase cross-border health data sharing in the EU, tasking Health Data Access Bodies (HDABs) with the legal and ethical assessment of health data access requests. However, ethical evaluation of data access is often criticized as opaque, inconsistent, and difficult to operationalize, relying on lists of principles with limited procedural guidance. As data access requests increase, the lack of standardized and actionable ethics review processes risks undermining transparency, consistency, and trust in health data governance. Objective: This scoping review (ScR) aimed to identify, map and synthesize existing tools and frameworks designed to operationalize ethical evaluation of health data access. It focused on tools moving beyond conceptual reflection by providing standardized, repeatable, or quantifiable processes that support actionable decision-making. Methods: The ScR was conducted in accordance with PRISMA-ScR guidelines, searching both academic databases and grey literature. Search results were imported into Covidence for screening when possible. For sources incompatible with Covidence, screening and data extraction were conducted manually, using Excel sheets. Two reviewers independently screened titles, abstracts and full texts, with disagreements resolved by consensus. To minimize bias, reviewers were aware of each other’s involvement but could not see each other’s decisions during the initial screening phase. Data were extracted and thematically analyzed to examine tool characteristics, intended users, evaluative criteria, patient involvement, use of quantification, and real-world applications. Results: The ScR retrieved 2215 results, of which 1887 were unique (82.15%). 
A total of 13 full-text studies were included and analysed based on the following criteria: the type of tool, the actor meant to use it, what it measured, its actionable components, whether it contained quantified or quantifiable components compatible with (partial) automation, and real-life applications. These 13 tools differed in their approaches, varying from multiple choice questionnaires (n=5), qualitative questionnaires (n=3) and decision aids, frameworks or matrixes (n=5). Most of these tools were directed at data users (n=10) and mainly aimed at guiding reflection or generating reports for further ethics assessment (n=7). Very few directly involved the public or patients in their development (n=3). Some provided means to classify data use into risk categories with an associated lower or higher level of ethical scrutiny (n=5), at times incorporating quantification (n=5) in the review process. Finally, most tools had limited or no documented real-world implementation (n=9). Conclusions: An increasing number of tools and frameworks aiming at standardizing and operationalizing ethical assessment in data access have been developed in recent years; however, their effectiveness remains untested in most cases. As data flows increase in the EHDS, and consequently, the need for streamlined ethics becomes more apparent, hybrid models incorporating both quantitative and deliberative components may play an important role in tackling the challenges associated with ethical data access management. Clinical Trial: N.A.

  • Background: Artificial intelligence (AI) is quickly becoming a key part of digital health systems in oncology, supporting activities like cancer screening, clinical decision-making, and patient care management. Although AI has the potential to enhance care quality and efficiency, its adoption at cancer centers varies widely, raising concerns about disparities in digital health access and capacity. Objective: This research investigates the multiple factors influencing AI adoption as part of digital health implementation at National Cancer Institute (NCI)-designated cancer centers across the U.S., focusing on institutional readiness, policy environment, and geographic spread. Methods: A national dataset of 75 cancer centers was assembled using public sources to track AI use in screening, treatment, and patient care. AI adoption was measured as a composite index (0-3), indicating integration across clinical areas. Spatial patterns were analyzed with Moran’s I, and multilevel ordered logistic regression models examined links between AI adoption, institutional features (like number of physicians, hospital beds, center type), and contextual factors (such as socioeconomic status and state politics). Results: No significant clustering of AI adoption was found geographically, implying limited regional diffusion. The size of the physician workforce was the most consistent predictor of AI adoption, emphasizing that organizational readiness is a key driver. Policy environment also influenced adoption: comprehensive cancer centers in Republican-controlled states showed higher AI uptake. Socioeconomic status at the community level was not significantly related. Conclusions: This study identifies institutional capacity and policy environment as primary constraints on scalable innovative digital health implementation in cancer institutions. 
These results point to structural barriers to broad digital health deployment and indicate that advancing AI-enabled cancer treatment will need focused investments in institutional capacity and policy support. Without these efforts, disparities in digital health infrastructure could restrict equitable access to AI-driven innovations in oncology.

  • Digital Delivery of Lifestyle Interventions in Online Clinical Trials: An Umbrella Scoping Review

    Date Submitted: Mar 23, 2026
    Open Peer Review Period: Mar 24, 2026 - May 19, 2026

    Background: Noncommunicable diseases (NCDs) cause a substantial health and economic burden. As they are associated with modifiable behavioral risk factors such as physical inactivity and poor diet, more evidence on effective health behavior modification methods is needed. Fully online delivery of clinical trials can provide a practical and scalable way to evaluate interventions that aim to modify relevant lifestyle factors. The emergence of online delivery methods presents opportunities and challenges that need to be better understood to inform future research. Objective: This umbrella scoping review aimed to summarize current evidence on the opportunities and challenges provided by lifestyle intervention clinical trials that are delivered fully online. Methods: Evidence was synthesized from existing peer-reviewed review papers to map the digital delivery methods in online lifestyle intervention trials, focusing on technologies, recruitment, engagement and retention strategies, and reported strengths, limitations, and future directions. Using PRISMA-ScR guidelines, PubMed, EMBASE, CINAHL, Web of Science, and Scopus were searched for reviews published between January 2013 and May 2025. Predominantly (>50%) hybrid, telehealth, or acute condition focused interventions were excluded. Results: Eligible reviews (n=39) discussed digital interventions targeting diet, physical activity, or both, for lifestyle improvement, chronic disease prevention or management. The most common hardware used in online lifestyle clinical trials were smartphones and wearables, with the most frequent software modes being web-based platforms, mobile apps, and SMS. Successful engagement strategies often integrated behavior change techniques, such as goal setting, self-monitoring, personalized feedback, and human support into the intervention design, or had behavior change techniques as a feature of the technology itself. 
Reported strengths of conducting clinical trials online included improved accessibility, scalability, cost-efficiency and personalization, whereas limitations discussed were poor engagement and retention, digital literacy barriers, and rapid technological change outpacing evaluation capabilities. Interventions that used theory-based designs, particularly those using Social Cognitive Theory, the Transtheoretical Model, and the Theory of Planned Behavior, were reportedly most successful in improving behavioral outcomes. Engagement and retention varied considerably across online trials, suggesting that the success of these studies may depend less on the online delivery modality itself and more on how interventions and technologies are designed, including the integration of behavioral theory and behavior change techniques. Conclusions: This review shows that online delivery of lifestyle intervention trials is a feasible and potentially advantageous method as it can improve reach, increase scalability, be cost-efficient, and allow more personalization of the intervention. To further improve the conduct of online clinical trials, future research should address increased use of behavioral change theory, equitable access to clinical trial participation, management of data privacy and security, intervention fidelity, and use of novel technologies such as artificial intelligence in a field that is rapidly evolving. Clinical Trial: The protocol was prospectively registered with the Open Science Framework https://osf.io/umcfv; 5 September 2023

  • Background: Clinical guidelines recommend an integrated, person-centered care model with better control of modifiable risk factors and coexisting conditions in patients with atrial fibrillation (AF), but many persons with AF receive insufficient risk factor management. Digital health technologies may provide valuable support in addressing this gap. Objective: Our aim was to evaluate a co-designed digital platform for supporting person-centered management of modifiable risk factors in individuals with AF. Methods: This is a mixed-methods study including a standardized quantitative questionnaire used to score the usability of digital tools, the System Usability Scale (SUS), and a qualitative, descriptive, manifest content analysis of individual interviews. Results: Twenty-two patients hospitalized for AF were included (age 68 (48-79) years; 32% female; BMI 27.7 (20.8-35.0) kg/m²; paroxysmal/persistent AF (36%/64%); AF duration 4 (0.5-18) years). Relevant comorbidities were hypertension (77%), heart failure (36%), diabetes mellitus type 2 (14%), and ischemic heart disease (18%). Usability was rated high, with a mean SUS score of 75 (±18.2), indicating above-average user acceptance. Participants’ requirements were summarized into four main categories and ten subcategories. First, they value a clear layout with simple design and easy navigation. Second, they appreciate positive content that is informative, inclusive and motivating. Third, they request personalized information on different aspects, provided at different levels. Fourth, they desire individualized medical recommendations that are personalized and flexible but open to individual choice. Conclusions: To improve digital management of lifestyle-related risk factors and comorbidities, individuals with AF seek a solution with a clear layout, positive content, personalized information, and individualized medical recommendations.
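
For readers unfamiliar with the System Usability Scale mentioned above, the standard scoring rule can be sketched in Python as follows: each of the ten items is answered on a 1-5 scale, odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the example responses are illustrative assumptions, not data or code from the study.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten item responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response
    return total * 2.5


# Hypothetical respondents, for illustration only.
respondents = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],  # best possible answers
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # neutral on every item
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],  # moderately favourable
]
scores = [sus_score(r) for r in respondents]
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

A study-level figure such as the mean reported in the abstract is the average of these per-respondent scores.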

  • Background: Attention-deficit/hyperactivity disorder (ADHD) is a prevalent neurodevelopmental condition in children and adolescents, for which conventional treatments present certain limitations. While digital therapeutics (DTx) have developed rapidly, the relative efficacy of different DTx modalities for this population remains to be established. Objective: To systematically compare the efficacy of four digital therapeutics (DTx) modalities (single-task, cognitive-motor dual-task, AI-integrated single-task, and AI-integrated cognitive-motor dual-task) on core symptoms and executive functions in children and adolescents with ADHD within the dual framework of task design and AI empowerment. Methods: We systematically searched PubMed/MEDLINE, PsycINFO, Web of Science, EMBASE, Scopus, ProQuest Dissertations and Theses, the Cochrane Library, and grey literature from ClinicalTrials.gov for randomized controlled trials published from January 2000 to February 2026, without language restrictions. A snowballing method was also employed. Risk of bias was assessed using the Cochrane RoB 2.0 tool. Data were analyzed using Bayesian network meta-analysis in R software (version 4.2.3). Heterogeneity was assessed using I² statistics, and publication bias was evaluated using Egger's test. Subgroup analyses, meta-regression, and sensitivity analyses were performed to explore sources of heterogeneity. Results: A total of 32 studies involving 2,819 patients were included. 
Network meta-analysis showed that AI-integrated cognitive-motor dual-task DTx appeared to be the most effective modality for improving core symptoms and executive functions, demonstrating the highest probability of being the best treatment on the Attention Deficit/Hyperactivity Disorder-Rating Scale (ADHD-RS) [Surface Under the Cumulative Ranking Curve (SUCRA): 57.5%; Mean Difference (MD): -3.03, 95% Confidence Interval (95% CI): -5.59 to -0.47], the Swanson, Nolan, and Pelham Rating Scale, Version IV - Inattention subscale (SNAP-IV-PI) [SUCRA: 58%; MD: -5.58, 95% CI: -8.76 to -2.39], the Swanson, Nolan, and Pelham Rating Scale, Version IV - Hyperactivity-Impulsivity subscale (SNAP-IV-PHI) [SUCRA: 81.6%; MD: -6.84, 95% CI: -10.37 to -3.31], and the Behavior Rating Inventory of Executive Function (BRIEF) [SUCRA: 67.4%; MD: -7.75, 95% CI: -10.06 to -5.43]. Moreover, this modality significantly outperformed conventional pharmacotherapy across all outcomes. Subgroup analyses revealed that intervention duration emerged as a potential source of heterogeneity for the SNAP-IV (both PI and PHI subscales) and BRIEF, while mean participant age was identified as a potential source of heterogeneity for the SNAP-IV-PI and BRIEF (all P < 0.05). Sensitivity analyses indicated that individual studies influenced heterogeneity. Of note, all outcome measures reported were based on parent versions of the scales. Conclusions: AI-integrated cognitive-motor dual-task DTx may be the most effective intervention for improving core symptoms and executive functions in children and adolescents with ADHD. Subgroup analyses suggested that intervention duration and age moderated treatment outcomes, warranting consideration in clinical practice. Clinical Trial: CRD420261304236

  • Background: Chronic primary pain is a complex condition involving biological, psychological, and behavioral mechanisms and is commonly associated with emotional distress and reduced quality of life (QoL). Digital mental health interventions (DMHIs) offer scalable and accessible solutions for delivering psychological care in chronic pain management; however, evidence regarding their effectiveness across delivery modalities and outcome domains remains heterogeneous. Objective: This systematic review aimed to (1) evaluate the effectiveness of DMHIs on clinical (pain intensity, disability) and psychological outcomes (QoL, anxiety, depression, catastrophizing, and self-efficacy) in adults with chronic primary pain; (2) examine whether specific digital delivery modalities are differentially associated with particular outcomes; and (3) identify methodological gaps to inform future research and implementation. Methods: A systematic literature search was conducted in PubMed, Scopus, PsycINFO, Cochrane Library, Web of Science, and Google Scholar following PRISMA guidelines. Two independent reviewers screened randomized controlled trials (RCTs) and assessed risk of bias using the Cochrane Risk of Bias 2.0 tool. Given substantial heterogeneity in study designs, interventions, and outcome measures, a narrative synthesis was performed. Results: Twenty-two RCTs were included. DMHIs were effective in improving psychological functioning and pain-related disability, often independently of changes in pain intensity, particularly when grounded in evidence-based psychotherapeutic frameworks such as cognitive behavioral therapy and acceptance and commitment therapy. Guided web-based interventions demonstrated the most consistent benefits, whereas unguided interventions showed smaller effects. Mobile applications and virtual reality–based interventions also showed positive effects on emotional functioning, self-management, and pain interference. 
Interventions incorporating some form of human guidance were generally associated with superior outcomes. Conclusions: DMHIs represent a promising, scalable, and person-centered approach to improving psychological well-being and functional outcomes in adults with chronic primary pain, particularly when integrated into stepped-care or hybrid care models. Clinical Trial: CRD420251010767

  • Background: Online patient reviews are widely used by consumers to assess the quality of direct-to-consumer teleconsultation (DTCT) services, particularly in settings where objective quality information is limited. However, whether these reviews validly reflect actual clinical and patient-centered care quality remains unclear. Objective: This study aimed to evaluate the validity of online physician reviews in reflecting the quality of care delivered on China’s three largest DTCT platforms. Methods: We conducted a cross-sectional study using unannounced standardized patients (USPs) to objectively assess the quality of DTCT services. Thirty-three USPs were trained to present 11 standardized clinical cases and completed 542 DTCT consultations with physicians on three major Chinese platforms. Technical quality was assessed using a clinical guideline adherence checklist, and patient-centered quality was measured using the Patient–Patient-Centered Care Chinese version (PPPC-CN) scale. Online review quality was defined as the positive review rate displayed on each physician’s profile. Agreement between online reviews and measured quality was evaluated using Intraclass Correlation Coefficients (ICCs), with additional rank correlation analyses. Results: Of the 542 consultations initiated, 530 were completed and 404 physicians had publicly available review data. Among all encounters, 53.14% (288/542) were phone-based and 46.86% (254/542) were text-based consultations. The median positive review rate was 99.9% (interquartile range [IQR], 99.4%–100%). Median guideline adherence was low (0.16; IQR, 0.08–0.26), and median patient-centered quality was modest (PPPC-CN score 2.1; IQR, 1.98–2.79). Diagnoses were completely correct in 40.92% (196/530) of consultations. Unnecessary examinations occurred in 1.7% of encounters, and medication prescribing was appropriate in 79.04%. 
The median consultation time was 13 minutes (IQR, 7–64.69), and the median registration fee was 29.9 yuan (IQR, 26.1–39.9). Agreement between positive review rate and guideline adherence (ICC = 0.002; 95% CI, −0.006 to 0.013) and between positive review rate and patient-centered quality (ICC = 0.014; 95% CI, −0.043 to 0.083) was negligible and far below accepted validity thresholds. Correlations between positive review rate and diagnostic accuracy were weak but statistically significant (Spearman ρ = 0.168; Kendall τ = 0.141; both P < 0.05). Limitations include the use of standardized cases rather than real patients and the focus on publicly visible review metrics. Conclusions: Online reviews on major platforms were overwhelmingly positive but showed almost no alignment with actual provider performance. DTCT providers demonstrated low guideline adherence and modest patient-centered quality. More research on improving the review frameworks is urgently needed to close the gap between patient feedback and service quality. Clinical Trial: The study has been approved by the Southern Medical University Ethics Committee ([2022] No. 013) and registered with the China Clinical Trial Registry (ChiCTR2200062975).

  • Background: Healthcare service quality is inherently multidimensional, yet document-level text analysis methods such as Latent Dirichlet Allocation (LDA) force patient reviews into single dominant topics. This simplification may systematically discard evaluative information when patients discuss multiple service dimensions with varying sentiments within the same review. Objective: This study compared document-level topic modeling (LDA) with GPT-based aspect-level sentiment analysis (ABSA) to address three research questions: (1) How much information is lost when collapsing multi-aspect reviews to single topics? (2) How prevalent are mixed-sentiment reviews, and what quality tensions do they reveal—both cross-aspect trade-offs and within-aspect ambivalence? (3) Do positive and negative reviews exhibit different structural patterns in aspect co-occurrence? Methods: We analyzed 2024 Google Reviews from 24 medical centers in Taiwan. Both LDA (K=7 topics) and GPT-based ABSA were applied to the same 5,467 reviews, ensuring fair comparison on identical data. The ABSA design employed structured prompts to extract aspects from seven predefined quality dimensions. Quality validation achieved Cohen κ=.82 against human annotation. Mixed-sentiment reviews were identified as those containing both positive and negative aspect evaluations, and cross-polarity couplings were analyzed to identify recurring trade-off patterns. Rating-stratified network analysis compared aspect co-occurrence patterns between positive reviews and negative reviews using Jaccard similarity. Results: Reviews discussed an average of 2.05 distinct aspects (SD=0.97), producing 51.2% information loss under LDA's single-topic assignment. Among multi-aspect reviews, 11.0% exhibited cross-aspect mixed sentiment, with Technical–Functional Divergence—praising Professional Quality while criticizing functional dimensions—appearing in 49.9% of these mixed-sentiment cases. 
Network analysis revealed differential bundling: operational dimensions co-occurred more strongly in negative reviews, whereas clinical dimensions co-occurred more strongly in positive reviews. Conclusions: Document-level topic modeling discards more than half of the evaluative information patients provide. Our findings reveal that patients cognitively decouple clinical competence from service delivery—Technical–Functional Divergence appeared in half of mixed-sentiment cases—and that positive and negative reviews organize quality dimensions differently. We recommend a complementary approach: topic modeling for exploratory discovery and ABSA for diagnostic assessment. For healthcare quality improvement, hospitals should separate clinical signals from operational signals in feedback dashboards.
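
    The abstract above reports two quantities that are easy to illustrate: the share of information lost under single-topic assignment and a Jaccard-based co-occurrence measure. The sketch below is only an illustration under stated assumptions. It assumes that "information loss" means the share of aspect mentions dropped when each review keeps one topic (which reproduces the reported 51.2% from the mean of 2.05 aspects), and that Jaccard similarity is computed per aspect pair within a group of reviews. The function names and the toy reviews are hypothetical and are not taken from the study.

```python
from typing import Iterable, Set


def single_topic_loss(mean_aspects_per_review: float) -> float:
    """Share of aspect mentions dropped if every review keeps exactly one topic.

    Assumption: loss = 1 - 1 / (mean number of aspects per review).
    """
    return 1.0 - 1.0 / mean_aspects_per_review


def aspect_jaccard(reviews: Iterable[Set[str]], a: str, b: str) -> float:
    """Jaccard co-occurrence of aspects a and b within one group of reviews.

    Numerator: reviews mentioning both aspects.
    Denominator: reviews mentioning at least one of them.
    """
    reviews = list(reviews)
    both = sum(1 for r in reviews if a in r and b in r)
    either = sum(1 for r in reviews if a in r or b in r)
    return both / either if either else 0.0


# Hypothetical toy data: each review is the set of aspects it mentions.
negative_reviews = [
    {"waiting", "billing"},
    {"waiting"},
    {"billing", "waiting"},
    {"clinical"},
]

loss = single_topic_loss(2.05)
coupling = aspect_jaccard(negative_reviews, "waiting", "billing")
```

    Running the same `aspect_jaccard` call separately on positive and negative reviews would give the kind of rating-stratified comparison the abstract describes, although the study's exact procedure may differ.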

  • Background: Continuous renal replacement therapy (CRRT) is a life-sustaining critical care intervention widely used for hemodynamically unstable individuals with acute kidney injury. Recent efforts, including standardized procedures, structured documentation, and quality monitoring, have shown small improvements in CRRT delivery and safety. However, fragmented workflows and paper-based documentation limit the sustainable implementation of these improvements in routine practice. Objective: This study aimed to design and evaluate a CRRT information system to support standardized procedures, structured documentation, and quality monitoring. Methods: A user-centered design approach, informed by Design Science Research (DSR) methodology, guided a multi-step process of identifying problems, defining objectives, and designing and evaluating the information system. The approach to design involved close collaboration within a nurse-led, 10-member multidisciplinary team comprising nephrologists, nurses, information technology specialists, and information engineers. Evaluation included six months of real-world clinical use with ongoing feedback collected through a dedicated WeChat workgroup and a System Usability Scale (SUS) survey of 27 CRRT care team members. Results: A role-based CRRT information system was developed, comprising 14 clinical modules and 6 core functions. The system embedded a continuous data-processing pipeline that enabled automated capture of treatment-related data directly from CRRT machines, creation of structured nursing documentation, and generation of quality indicators from structured data. During demonstration, workflow refinements, including dual-nurse verification and enhanced device data transmission, were incorporated following pilot testing.
Over six months of clinical use, 42 user-reported issues were identified across three domains: data retrieval and calculation, fidelity of automatically generated clinical documentation, and interface appearance. Quantitative usability survey (n=27) demonstrated excellent usability (mean SUS score 95.19, SD 5.09). Conclusions: A CRRT information system integrating standardized clinical procedures, structured documentation, and ongoing quality monitoring supported complex clinical practice and management beyond simple digitization. Workflow-aligned, data-flow–enabled design may help future critical care information systems better support clinicians working in information-intensive environments.
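
    For readers unfamiliar with how a SUS score such as the one reported above is derived, the following is a minimal sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a 0 to 100 score. The function names and the example ratings are hypothetical; the abstract does not describe the authors' own scoring code.

```python
from statistics import mean
from typing import List, Sequence


def sus_score(ratings: Sequence[int]) -> float:
    """Score one respondent's ten SUS ratings (each 1-5) on the 0-100 scale."""
    if len(ratings) != 10 or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for index, rating in enumerate(ratings):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


def mean_sus(all_ratings: List[Sequence[int]]) -> float:
    """Average SUS score across respondents."""
    return mean(sus_score(r) for r in all_ratings)


# Hypothetical respondents, not data from the study.
example = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
average = mean_sus(example)
```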

  • Digital Health Engagement Behaviors in U.S. Family Caregivers: Trend Analysis Using HINTS 2019-2022

    Date Submitted: Mar 6, 2026
    Open Peer Review Period: Mar 9, 2026 - May 4, 2026

    Background: Digital health has provided caregivers with access to supportive resources without space-time restrictions. Caregivers’ digital health engagement behaviors can help them track their own health and that of care recipients as well as communicate with others. While digital health tools have become more prevalent since COVID-19, the trend of caregiver engagement has been less explored. Objective: This study examined the trends and factors associated with selected digital health engagement behaviors in family caregivers in the United States (U.S.), using the Health Information National Trends Survey (HINTS) datasets collected in 2019, 2020, and 2022. Methods: Our cross-sectional data analysis included 1,676 family caregivers. Dependent variables were: 1) access to online medical records (caregiver’s, care recipient’s); and 2) health-related use of social media (sharing health information, interacting with others, watching health-related videos). Independent variables were survey year, demographic, socioeconomic, caregiving, and Internet technology factors. Weighted multivariable logistic regression analyses were conducted. Results: Among 1,676 caregivers (2019: n = 570; 2020: n = 412; 2022: n = 694), access to online medical records increased from 2019 to 2022. Access to caregivers’ own records rose from 48.7% to 72.6% (P<.001), and access to care recipients’ records increased from 30.8% to 44.5% (P<.001). Health-related social media use also increased, including sharing health information (22.5% vs 39.1%, P<.001), interacting with others (16.6% vs 27.0%, P<.001), and watching health-related videos (49.5% vs 60.9%, P=.005). In adjusted analyses, higher education (college graduate vs ≤high school: OR = 2.75, 95% CI 1.56–4.85, P<.001) and having health insurance (OR = 2.40, 95% CI 1.24–4.68, P=.010) were associated with access to caregivers’ records. 
Female sex (OR = 1.96, 95% CI 1.36–2.84, P<.001) and spousal caregiving (OR = 2.14, 95% CI 1.26–3.65, P=.005) were associated with access to care recipients’ records. High-speed internet access was strongly associated with digital engagement outcomes (e.g., sharing health information: OR = 3.98, 95% CI 2.15–7.35, P<.001). Conclusions: Digital health engagement, including access to online medical records and the use of social media for health-related purposes, increased among U.S. family caregivers following the COVID-19 pandemic. These findings suggest that healthcare professionals and researchers should consider multifaceted factors, such as age, race/ethnicity, geography, education, insurance coverage, and digital access, when designing and implementing digital health tools and technology-based interventions. Future research should evaluate how digital technologies, including artificial intelligence (AI) and automated system-to-system communication, can better support caregivers’ health information needs and care coordination. The varying learning curves of individuals and groups could be explored further to support effective and efficient adoption and utilization.

  • Background: Problematic digital use among youth is associated with mental health concerns, yet the affective and behavioral mechanisms linking self-esteem to problematic digital use remain insufficiently characterized. Objective: To examine whether depressive and anxiety symptoms and objectively measured smartphone behaviors are associated with the relationship between self-esteem and problematic digital use among adolescents and young adults. Methods: This cross-sectional observational study was conducted between April 2022 and January 2023 in academic institutions in Grenoble, France. Participants were 171 adolescents and young adults aged 11 to 25 years using Android smartphones who completed self-report questionnaires alongside passive smartphone monitoring. Measures included self-esteem, depressive symptoms, anxiety symptoms, recreational smartphone time, delay to first connection in the morning, and nighttime digital disconnection (digital sleep). Problematic digital use was modeled as a latent construct encompassing excessive use, emotional regulation, and reactivity and assessed using a validated self-report scale. Results: Among 171 participants (mean age 17.6 years, SD 3.0; 57% female), depressive symptoms mediated the association between self-esteem and problematic digital use (β = −0.33; 95% CI −0.45 to −0.21), with larger indirect effects than anxiety symptoms (β = −0.13; 95% CI −0.22 to −0.04). Recreational smartphone time was positively associated with problematic digital use (β = 0.28), whereas digital sleep was independently associated with lower problematic digital use (β = −0.24). Conclusions: Lower self-esteem was indirectly associated with problematic digital use primarily through depressive symptoms, which showed stronger associations than anxiety symptoms. Objective smartphone behaviors were independently associated with problematic digital use. Clinical Trial: This trial was registered at ClinicalTrials.gov (NCT07293208).

  • Social Determinants of Digital Health Intervention Use in Mainland China: A National Cross-Sectional Study

    Date Submitted: Mar 6, 2026
    Open Peer Review Period: Mar 9, 2026 - May 4, 2026

    Background: Digital health interventions (DHIs), including telemedicine and artificial intelligence–enabled health tools, are increasingly integrated into health care systems worldwide. While these technologies have the potential to improve access and efficiency, unequal access to digital resources and health capabilities may create disparities in their use. Evidence on population-level determinants of digital health use remains limited in rapidly digitalizing health systems such as China. Objective: This study aimed to examine social and structural determinants of DHI use among adults in mainland China using the World Health Organization’s Social Determinants of Health (SDoH) framework. Methods: This cross-sectional study analyzed data from a nationally representative survey conducted in mainland China in 2024 among adults aged ≥18 years. The primary outcome was self-reported ever use of digital health interventions, including telemedicine, digital health applications, and AI-enabled health tools. Explanatory variables were categorized into five SDoH domains: economic stability, education and health-related capabilities, health care access and quality, neighborhood and built environment, and social and community context. Multivariable logistic regression models were used to estimate adjusted odds ratios (aORs) and 95% confidence intervals (CIs) for associations between social determinants and DHI use. Results: Among 34,672 participants, 14,565 (42.0%) reported ever using a DHI. Higher household income (≥6001 CNY vs ≤3000 CNY: aOR, 1.37; 95% CI, 1.29–1.46), higher educational attainment (bachelor’s degree or above vs junior high school or below: aOR, 1.49; 95% CI, 1.38–1.61), higher health literacy (per SD increase: aOR, 1.10; 95% CI, 1.07–1.13), and higher eHealth literacy (per SD increase: aOR, 1.20; 95% CI, 1.17–1.24) were associated with greater odds of DHI use.
Health insurance coverage was associated with higher DHI use (aOR, 1.22; 95% CI, 1.11–1.34), whereas individuals aware of but not enrolled in family doctor services had lower odds (aOR, 0.65; 95% CI, 0.60–0.70). Difficulty paying medical expenses was associated with higher DHI use (aOR, 1.31; 95% CI, 1.22–1.41), while rural residence was associated with lower odds (aOR, 0.94; 95% CI, 0.89–1.00). Conclusions: DHI use in China is strongly associated with socioeconomic resources, health-related capabilities, and access to health care. These findings highlight the importance of addressing structural and social determinants to promote equitable adoption of digital health technologies in rapidly digitalizing health systems. Clinical Trial: NA

  • Background: Bangladeshi adolescents, who constitute a fifth of the country's population, experience barriers in accessing sexual and reproductive health (SRH) information. Previous studies have shown that mobile health (mHealth) interventions provide adolescents with timely access to evidence-based curricula, gamified and interactive content, sessions, and information. The widespread adoption of mHealth technologies among adolescents and their willingness to embrace emerging technologies are encouraging specialists to employ mHealth approaches to share health information. Despite the high mobile phone usage among adolescents in Bangladesh, there are few mHealth interventions specifically targeting their SRH needs. Objective: We aimed to assess changes in SRH knowledge and awareness among adolescents in Bangladesh following exposure to "Mukhorito", an interactive mobile app-based intervention. Methods: This pilot study employing a pre-post non-randomized experimental approach was conducted in three selected secondary schools in Feni, Bangladesh, from June 2023 to March 2024. Forty-six students from class 9 across the three schools were recruited, with a minimum of 10 per school. Bivariate analyses were performed to assess the association of SRH knowledge and awareness scores with other covariates. Covariates significantly associated with both scores were used in building the adjusted linear regression models. Results: The adjusted models indicated a significant improvement in the end-line group compared with the baseline group for both knowledge (1.2 units; 95% CI: 0.8-1.6 units) and awareness scores (1.0 units; 95% CI: 0.3-1.5 units), indicating a high level of intervention effect. Conclusions: These findings demonstrate the potential of mobile app-based innovations to improve adolescent SRH education within a national program in resource-constrained settings, especially where conventional methods may be less effective.

  • Remotely Delivered Yoga Interventions for Pain Management: A Scoping Review

    Date Submitted: Mar 5, 2026
    Open Peer Review Period: Mar 5, 2026 - Apr 30, 2026

    Background: An expanding body of evidence suggests that yoga may be beneficial for pain management across a range of conditions. At the same time, healthcare delivery has evolved rapidly with the growth of telehealth, including advances in the remote delivery of yoga interventions. However, the literature lacks a comprehensive synthesis focused specifically on remote yoga interventions for pain. Objective: This scoping review aimed to map and characterize the extent and type of the existing evidence on remotely delivered yoga interventions for pain management. Specifically, we (i) examined general study characteristics, participant populations, and intervention features, (ii) summarized reported findings on feasibility, safety, and effects on pain-related outcomes, and (iii) identified research gaps to inform future investigation and practice. Methods: A systematic search was conducted in August 2025 across six databases to identify primary studies. Studies, including study protocols, conference abstracts, and trial registrations, were eligible for inclusion if they reported primary data, across any study design, involving participants experiencing any type of pain and undergoing remotely delivered yoga interventions (also referred to as online, virtual, or tele-yoga) in any contextual setting. Eligibility was assessed independently by two reviewers through title and abstract screening followed by full-text review. Results: A total of 82 sources of evidence were included, comprising 47 peer-reviewed publications, 1 preprint, 17 conference abstracts, and 17 trial registrations. Detailed data charting was conducted for peer-reviewed publications only. Overall, 3,199 participants were represented. Fewer than half of the studies examined pain as the primary condition, while the remainder assessed pain as secondary to other medical conditions or within non-clinical populations. Interventions varied considerably in duration, frequency, delivery format, and yoga style.
Synchronous delivery was most common, Hatha yoga and its adaptations were the most frequent style, and eight- or twelve-week programs delivered twice weekly predominated. Feasibility was generally favorable, safety findings suggested a low risk of adverse events, and adherence was typically moderate to high; however, reporting across these domains was inconsistent. Given the substantial methodological heterogeneity, conclusions about efficacy are limited, although reported findings indicate potential benefits for pain-related outcomes. Conclusions: Based on the existing evidence, remotely delivered yoga for pain appears feasible and low risk, with signals of potential benefit. Future systematic reviews with formal quality appraisal and quantitative synthesis are needed to clarify effect sizes and the certainty of the evidence. Important gaps remain, including inconsistent reporting, limited comparative research on delivery formats, insufficient investigation of intervention characteristics, and underrepresentation of certain pain conditions and of low- and middle-income settings.

  • Background: People living with advanced cancer experience more frequent and severe symptoms than people living with early-stage disease. Four common and distressing symptoms are sleep difficulties, worry-anxiety, fatigue, and depression. Cognitive-behavioral therapy (CBT) and acceptance and commitment therapy (ACT) interventions are effective for managing these symptoms but are often too time-intensive for people with multiple appointments, limited energy, and competing priorities. Brief mobile health (mHealth) interventions provide an accessible alternative, particularly for those in rural communities with limited access to palliative and/or psychosocial oncology services. Objective: Based on our successful in-person/DVD-based pilot trial of a four-session, integrated CBT-ACT symptom management intervention for advanced cancer patients, Finding Our Center Under Stress (FOCUS), this study tests the feasibility and acceptability of an mHealth translation of this intervention. Methods: In this single-group feasibility trial, 11 people with advanced cancer were recruited through hospital-based oncology clinics representing four cancer types (breast, melanoma, multiple myeloma, prostate). Patients completed sociodemographic questions, initial patient-reported outcomes including sleep (ISI), anxiety (GAD-7, PSWQ), fatigue (FSI), and depression (CES-D), and a 7-day sleep diary via the mobile app. They then completed four modules focused on the self-management of sleep difficulties, worry-anxiety, fatigue, and depression. To assess feasibility, we examined recruitment, retention, and module completion. At the end of six weeks, to assess acceptability, participants completed the Internet Evaluation and Utility Scale and some participants completed a qualitative interview assessing their experience with the FOCUS app. We present quantitative and qualitative results as well as lessons learned in designing the application for this patient population.
Results: Sixty-five percent entered the trial (N=11) and seventy percent completed more than half of the app. These participants gave strong ratings for FOCUS ease of use (3/4), convenience (3.7/4), utility (3.3/4), and ease of understanding (3.83/4). All participants (10/10) said they would recommend the app to other people with cancer and would return to the app with future problems. Participants’ favorite components were video recordings of other patients and the sleep and worry/uncertainty modules. Areas for improvement based on participant feedback included video quality for some components (i.e., lighting, sound), sleep diary ease of use, and a desire for professional guidance. Conclusions: The FOCUS intervention was successfully delivered via mobile technology and was feasible and acceptable per beta testing. The FOCUS mHealth app provides an evidence-based, accessible symptom management intervention for people with advanced cancer in rural communities. In accordance with participant feedback, for FOCUS 2.0 we will enhance video segments, incorporate a telehealth component to support app usage, and further develop the interactive and motivational features of the app. Future research will explore the effectiveness of this mHealth symptom management application via a randomized controlled trial.

  • Pallvi–Family Focused Telepalliative Care: Development of a Complex Intervention

    Date Submitted: Mar 4, 2026
    Open Peer Review Period: Mar 5, 2026 - Apr 30, 2026

    Background: Telepalliative care, the use of telehealth in palliative care, has emerged as a strategy to improve access to specialist palliative services amid growing demand, workforce shortages, and increasing digitalization of health care. Although telepalliative care has demonstrated positive outcomes for patients, families, and clinicians, its integration into standard services remains inconsistent. Existing initiatives are often operationally focused and rarely grounded in programme theory or developed collaboratively with key stakeholders, limiting sustainability and contextual alignment, particularly in Nordic health systems that emphasize home-based palliative care. Objective: This study aimed to develop a family focused model of telepalliative care for clinical practice through active involvement of key stakeholders. Methods: A co-design qualitative study grounded in interpretive description was conducted. The study followed the British Medical Research Council’s guidance for the development and evaluation of complex interventions and represents the development phase. Key stakeholders, including patients, family representatives, specialized palliative care team members, community care nurses, general practitioners, voluntary representatives, IT consultants, managers, and researchers, were purposively recruited. Data were generated through four scientific workshops across two Danish sites, supplemented by participant observations of video consultations and a short questionnaire inspired by the Normalisation Measure Development (NoMAD) questionnaire. Data were analyzed using abductive thematic analysis, with qualitative and quantitative findings converged and iteratively refined through stakeholder consensus. A programme theory and logic model guided development.
Results: Eighteen stakeholders participated in the workshops, with additional input from clinicians through observations (6 consultations involving 22 participants) and questionnaires (n=10). Findings highlighted both alignment and tension between the proposed model and current clinical practice, particularly regarding when and for whom telepalliative care should be used, clinician digital competencies, and family involvement. These findings, together with insights from previous studies, informed the study's primary output: Pallvi – Family Focused Telepalliative Care, a comprehensive, theory-informed model comprising a structured consultation guide and two co-designed quick guides, one for health care professionals and one for patients and families. Pallvi integrates family focused care, shared decision-making, advance care planning, and the Calgary-Cambridge Communication Guide, operationalized across seven consultation phases. Conclusions: Through systematic stakeholder involvement and theory-driven development, this study produced a contextually and culturally aligned family focused model of telepalliative care. Pallvi addresses identified gaps in telepalliative care research by providing a structured, practical guide designed to support communication, family involvement, and cross-sectoral collaboration. Future research will focus on feasibility and implementation testing to assess acceptability, fidelity, and sustainability in clinical practice.

  • Digital Frailty in Ageing Societies: Introducing a Digital Health Vulnerability Index

    Date Submitted: Mar 3, 2026
    Open Peer Review Period: Mar 3, 2026 - Apr 28, 2026

    Digital health is now embedded in routine care through patient portals, teleconsultations, remote monitoring, digital triage, and other hybrid service models. While these changes can improve access and efficiency, they may also create new barriers for older adults who have limited cognitive, sensory, functional, or social capacity to engage with digitally mediated care. Current constructs such as digital literacy, digital exclusion, and conventional frailty only partly explain this problem because they do not fully capture the mismatch between the digital demands of healthcare systems and the real-world capabilities and supports available to patients. This Viewpoint introduces digital frailty as a clinically relevant, multidimensional state of vulnerability that arises when a person’s intrinsic capacity and available support are insufficient to meet the digital requirements of healthcare. We argue that digital frailty should be understood not as a synonym for age, disability, or low digital confidence, but as a relational and potentially modifiable mismatch between individuals and care environments. Framing the issue in this way shifts attention from blaming patients to designing safer and more equitable systems. To operationalize this concept, we propose a Digital Health Vulnerability Index as a pragmatic framework for identifying patients at risk of digitally mediated care failure. The framework focuses on four proximal domains of vulnerability, namely access, skills, confidence or trust, and support, and is paired with brief consideration of hearing, vision, and cognition to improve clinical interpretability. Rather than functioning as a static label, the index is intended as a routable mechanism to trigger proportionate responses such as assisted digital support, proxy-enabled access, simplified workflows, and analogue alternatives for safety-critical steps.
We further propose proportionate universalism as the most appropriate implementation principle, so that digital support is universal in reach but calibrated in intensity according to need. This approach has implications beyond individual assessment and extends to pathway design, procurement, governance, reimbursement, and digital inclusion policy. In ageing societies, digital vulnerability should be recognized as a determinant of functional access to care. A digitally inclusive health system therefore requires not only better technology, but also better identification, adaptation, and accountability for the patients most at risk of being left behind.

  • The Role of Measurement in Identifying High-Intensity Secure Message Senders: An Observational Study

    Date Submitted: Feb 27, 2026
    Open Peer Review Period: Mar 3, 2026 - Apr 28, 2026

    Background: The growing volume of secure messaging within the patient portal has imposed significant demands on clinicians and contributed to burnout. Little is known about the characteristics of patients who comprise high-volume message senders, and we lack a nuanced understanding of patient messaging intensity beyond measures accounting for sheer volume. Objective: Our objective was to characterize older adult patients (65+) with high secure messaging volume, examining both patient characteristics and other aspects of their messaging intensity such as messaging frequency, length, and messaging use relative to patient portal logins and healthcare encounters. Methods: We analyzed electronic medical record (EMR) and patient portal data from a large academic health system, encompassing 16,023 older adults who sent 199,952 messages over a 12-month period. We developed five measures to account for secure messaging intensity. Our primary measure of messaging intensity was based on message volume; high-volume message senders were identified using outlier analysis based on patients’ aggregate number of messages sent during the observation period. Additional measures of messaging intensity included identifying individuals with concentrated periods of messaging, message length (character count), a ratio of messages to portal logins and a ratio of messages to healthcare encounters. We compared sociodemographic characteristics, health status, and messaging intensity of high-volume secure messaging senders to other message senders. We also identified patients who were classified as high-intensity message senders based on all five measures of messaging intensity (‘super-senders’). Results: Of 16,023 older adult patients who sent at least one message during the observation period, 1,298 (8.1%) were classified as high-volume message senders; these patients accounted for 39.7% of total messages. 
High-volume message senders, compared to all other message senders, were more likely to be White (80.4% vs. 72.5%, p < 0.001), have higher comorbidity scores (2.6 vs. 1.8, p < 0.001), and higher incidence of cancer (35.8% vs. 22.8%, p < 0.001) and dementia (8.3% vs. 6.1%, p < 0.002). High-volume message senders were also more likely to be identified as having concentrated periods of messaging, to send longer messages, and to send more messages in relation to patient portal logins and healthcare encounters. A small subgroup of patients classified as high-volume senders were also classified as high-intensity across all four of the other measures of messaging intensity (59/1,298; 4.5%), the ‘super-senders’. Conclusions: High-volume message senders represent a small but distinct group of older patients who send a disproportionate share of messages to clinicians. Triangulating multiple measures of messaging intensity can provide additional context about patient messaging behavior and help identify patients who may benefit most from targeted outreach while potentially easing clinicians' inbox workload.
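
    As a rough illustration of the volume-based measure described above, the sketch below flags high-volume senders from per-patient message counts and computes their share of all messages. The abstract does not say which outlier rule was used; Tukey's upper fence (Q3 + 1.5 × IQR) is an assumption made here for illustration, and the function names and patient counts are hypothetical.

```python
from statistics import quantiles
from typing import Dict, Set


def high_volume_senders(message_counts: Dict[str, int]) -> Set[str]:
    """Return patient IDs whose message count exceeds Tukey's upper fence.

    Assumption: the outlier rule is Q3 + 1.5 * IQR; the study may have
    used a different definition.
    """
    q1, _median, q3 = quantiles(message_counts.values(), n=4)
    upper_fence = q3 + 1.5 * (q3 - q1)
    return {pid for pid, count in message_counts.items() if count > upper_fence}


def message_share(message_counts: Dict[str, int], senders: Set[str]) -> float:
    """Fraction of all messages sent by the given subset of patients."""
    total = sum(message_counts.values())
    sent = sum(message_counts[pid] for pid in senders)
    return sent / total if total else 0.0


# Hypothetical per-patient counts over an observation period.
counts = {
    "p1": 1, "p2": 2, "p3": 2, "p4": 3, "p5": 3,
    "p6": 4, "p7": 4, "p8": 5, "p9": 6, "p10": 60,
}
flagged = high_volume_senders(counts)
share = message_share(counts, flagged)
```

    A "super-sender" flag in the sense used above would then be the intersection of this set with the sets produced by the other four intensity measures.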

  • Predicting Healthcare Professionals’ Use of Telehealth in China: A Cross-Sectional Study

    Date Submitted: Mar 1, 2026
    Open Peer Review Period: Mar 3, 2026 - Apr 28, 2026

    Background: While telehealth has become a transformative tool enhancing healthcare accessibility and efficiency, adoption rates in China remain low, and the reasons for Chinese healthcare professionals’ low adoption are poorly understood. Objective: Our study investigates the key factors influencing Chinese healthcare professionals’ intention to adopt and actual use of telehealth. Based on the results from estimating an integrated telehealth use framework, we also make recommendations for improving healthcare professionals’ telehealth adoption. Methods: Data on 10,372 healthcare professionals from the 2023 Xi’an Healthcare Worker Survey were analyzed using descriptive statistics (chi-square test, group differences), reliability testing (Cronbach’s α coefficients), discriminant validity analysis (square root of average variance extracted), and fit tests. Based on our integrated telehealth use framework, structural equation modeling was employed to test hypotheses and path relationships, including multi-group analysis to examine demographic moderating effects. Results: Confirming our hypotheses on telehealth intention to use and actual use, the structural equation model showed strong fit indices. Key predictors of behavioral intention to use telehealth included effort expectancy, price value, performance expectancy, and social influence. Behavioral intention and facilitating conditions positively influenced actual use behavior, while demographic characteristics moderated specific relationships. Conclusions: Our study identifies critical factors influencing healthcare professionals’ adoption of telehealth, including performance expectancy, social influence, and facilitating conditions. It offers an integrated framework to assess behavioral intentions and provides practical insights for advancing telehealth implementation in China. Tailored strategies for diverse demographics and institutions are essential for promoting sustainable adoption.
Clinical Trial: This study was reviewed and approved by the Biomedical Ethics Committee of Xi’an Jiaotong University (approval number: XJTUAE2646).