The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing. Since December 2009 it has offered open peer review, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently under consideration by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints.

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and where the editor has not yet made a decision. This feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Background Artificial intelligence is increasingly embedded in prior authorization (PA) and utilization management (UM) systems across commercial and Medicare Advantage health plans. Emerging evidence, including American Medical Association survey data showing 94% of physicians report PA negatively impacts clinical outcomes, and Senate investigative findings linking AI-assisted adjudication to denial rates up to 16 times higher than typical benchmarks, indicates that AI-driven PA is amplifying existing harms rather than correcting them. Despite regulatory attention from the Centers for Medicare & Medicaid Services (CMS) and the Office of Inspector General (OIG), no standardized compliance framework explicitly governs the deployment of AI in PA decision-making. Objective This paper argues that existing healthcare compliance infrastructure, including OIG compliance program guidance, HIPAA nondiscrimination requirements, CMS coverage determination standards, and internal audit mechanisms, provides a largely underutilized foundation for governing AI-driven PA systems. We propose a structured Algorithmic Accountability Framework (AAF) to help health plan compliance officers and executives navigate uncertainty in AI-enabled utilization management. Methods Drawing on regulatory guidance, published denial rate analyses, American Medical Association survey data, and organizational compliance program design principles, we identify five governance domains where existing compliance infrastructure can be applied or extended to AI PA systems: (1) algorithm transparency and documentation, (2) clinical validity and human oversight, (3) disparate impact monitoring, (4) appeals process integrity, and (5) vendor oversight and contractual accountability. We further integrate a patient-agency lens drawn from the Prepare/Verify/Protect framework, positioning patients as an underutilized accountability mechanism in AI-driven PA governance. 
Results/Discussion The AAF maps each governance domain to existing regulatory obligations and operational controls that most health plans have in place today. We argue that the systemic misclassification of AI PA tools as IT or operational efficiency systems, rather than high-risk compliance matters, is the primary organizational barrier to adequate governance. Compliance officers, not data science or IT teams, hold the cross-cutting authority needed to own AI PA governance. Patient complaint, grievance, and appeal data, disaggregated by AI involvement, constitute an underutilized error-detection layer that supplements internal compliance monitoring.

  • Background: Pectus excavatum (PE) is the most common congenital chest wall deformity, corrected during adolescence using minimally invasive techniques (MIRPE). Cryoanalgesia has emerged as a valuable adjunct in reducing pain, opioid consumption, and hospital stay. Since recovery trajectories vary widely among patients, we hypothesized that integrating clinical and wearable-derived digital data may allow early identification of recovery patterns and functional improvement following MIRPE. Objective: This study was a proof of concept to test the hypothesis that continuous pre- and post-operative parameters collected via wearable devices may provide a personalized approach for identifying recovery patterns or early signs of complications. Methods: We designed a feasibility randomized controlled trial in patients undergoing MIRPE, assigned to either cryoanalgesia or thoracic epidural analgesia. All patients were invited to wear a Fitbit Sense 2 device, and digital data were recorded continuously and stored in a dedicated database. Clinical data were entered in REDCap electronic data capture tools by clinicians or through the MyCap mobile app by patients. The study aimed to assess patient compliance with wearing the device and the feasibility of integrating wearable-derived biometric data with clinical variables in adolescents undergoing MIRPE with cryo- or epidural analgesia. Reporting follows the CONSORT 2010 extension to randomized pilot and feasibility trials. Results: Patients in the cryoanalgesia group demonstrated higher compliance with wearing their devices during the day and at night throughout the study period, and higher average step counts, compared to the epidural group. Differences between groups were not statistically significant (P values > .05) regarding the Pediatric Quality of Life Inventory (PedsQL™). By the end of the study both groups followed similar recovery trends. 
Complications were detected early from the digital data collected by the wearable devices. Conclusions: This proof-of-concept trial confirms that retention and compliance are crucial for the success of the second trial phase. Leveraging Fitbit-derived data may enable a personalized pain management approach with early detection of complications. The benefits of cryoanalgesia in terms of health-related quality of life, as assessed by subjective scales, did not differ significantly from those of the standard of care (epidural-based analgesia); however, the present sample size does not allow a formal claim of equivalence. Continuous biometric data provided a more personalized follow-up. Clinical Trial: Ethical approval was obtained from the Institutional Review Board at Istituto Giannina Gaslini (approval number: 278/2021 – DB id 11421; NCT number: NCT0520182041). All procedures were conducted in accordance with the Declaration of Helsinki.

  • Make the Connection: An Analysis of Enrollment and Adherence in an Online Parenting Intervention

    Date Submitted: Jun 11, 2026
    Open Peer Review Period: Jun 12, 2026 - Aug 7, 2026

    Background: Online parenting interventions provide a unique opportunity to increase the accessibility and scalability of evidence-based parenting information that can enhance parenting practices, caregiver well-being, and child developmental outcomes. While online parenting programs can reduce access barriers, less is known about who enrolls in such programs, how participants engage with them, and whether engagement is sustained over time. This knowledge could help inform parenting program usability and engagement strategies. Objective: This study explored the demographic characteristics of caregivers who enrolled in (registered for) and adhered to (completed) an online parenting intervention protocol and examined sociodemographic factors associated with engagement. Methods: Data were drawn from a larger pragmatic randomized controlled trial evaluating the online Make the Connection® (MTC) program, which aims to promote healthy parent-child relationships. This secondary analysis focused on participants assigned to the intervention group (N = 215). Baseline sociodemographic characteristics were collected using an online survey prior to enrolling in the intervention. Descriptive statistics and logistic regression models were used to examine correlates of enrollment and adherence. Results: Participants were predominantly women (91.6%, 197/215), with a mean age of 35.6 years (SD = 5.19), and their children had a mean age of 13.9 months (SD = 10.7). Of the 215 participants assigned to the intervention, 107 (49.8%) enrolled in the program, while 108 (50.2%) did not. Younger child age was associated with a higher likelihood of enrollment (OR = 0.96, 95% CI 0.93-0.98, P = .002). Older caregiver age was associated with greater likelihood of enrollment (OR = 1.07, 95% CI 1.01-1.14, P = .017) and adherence (OR = 1.11, 95% CI 1.02-1.22, P = .020). 
Caregiver social isolation was associated with a lower likelihood of adherence (OR= 0.30, 95% CI 0.11-0.84, P =.022), but not enrollment (OR = 1.12, 95% CI 0.62-2.03, P = .706). Depressive symptoms were not significantly associated with program adherence (OR = 1.21, 95% CI 0.45-3.28, P =.708) or enrollment (OR = 0.97, 95% CI 0.90-1.05, P =.457). Conclusions: Results suggest that different factors influence enrollment and adherence in online parenting programs. Although digital delivery may reduce barriers to access, additional strategies, such as goal setting tools, personalised feedback, tailored reminders, and opportunity for peer-connection may be needed to support sustained engagement, particularly among caregivers at risk of disengagement. In this study, sociodemographic factors have been identified that can inform strategic interventions to improve engagement from caregivers most vulnerable to disengagement. Clinical Trial: NCT05770414

  • Background: Acute pancreatitis-associated acute kidney injury (AP-AKI) is linked to substantial morbidity and mortality, but the comparative performance of artificial intelligence (AI) models for early AP-AKI prediction and mortality risk stratification remains uncertain. Objective: This systematic review and network meta-analysis evaluated the diagnostic performance and comparative ranking of AI models for early prediction of AP-AKI and AP-AKI-related mortality. Methods: PubMed, Embase, Web of Science, Cochrane Library, and IEEE Xplore were searched from inception to February 23, 2026. Eligible studies developed or validated AI, machine learning, or deep learning models for AP-AKI or mortality among patients with AP-AKI and reported reconstructable diagnostic accuracy data. Two reviewers independently extracted study and model characteristics, validation methods, and 2 x 2 diagnostic data. Risk of bias was assessed with PROBAST+AI, certainty of evidence with GRADE, pooled accuracy with bivariate random-effects models, and algorithm rankings with Bayesian diagnostic network meta-analysis. Results: Fourteen studies were included. For AP-AKI prediction, 11 studies with 35 validation datasets yielded pooled sensitivity of 0.76, specificity of 0.85, and area under the receiver operating characteristic curve of 0.87; XGBoost ranked highest for sensitivity and diagnostic odds ratio. For mortality prediction, three studies with 66 validation datasets yielded pooled sensitivity of 0.73, specificity of 0.77, and area under the receiver operating characteristic curve of 0.81; support vector machine ranked highest for sensitivity and diagnostic odds ratio. 
Certainty of evidence was mostly low or very low, mainly because of heterogeneity, retrospective designs, limited external validation, and incomplete calibration or clinical utility reporting. Conclusions: AI models show promising but heterogeneous performance for AP-AKI prediction and mortality risk stratification. Prospective multicenter validation, calibration assessment, workflow evaluation, and clinical utility studies are needed before routine implementation. Clinical Trial: PROSPERO CRD420261360696.

  • Ethics of Health Data Infrastructures: Toward Continuous Governance and Public Trust

    Date Submitted: Jun 11, 2026
    Open Peer Review Period: Jun 12, 2026 - Aug 7, 2026

    In April 2026, reports of de-identified UK Biobank participant data being listed on overseas commercial online platforms highlighted concrete vulnerabilities in health data governance. Drawing on this incident, as well as broader discussions on secondary use, cross-border sharing, and public trust, this Viewpoint argues that traditional, trust-based, pre-access review models possess significant ethical and operational limitations. The core concern is not data sharing, commercial involvement, or international collaboration per se, but rather the movement of participant-contributed data beyond approved research governance into external commercial digital environments. Preserving public trust and the "social license" of health data infrastructures—defined as ongoing public acceptance of institutional data practices—requires a shift from access gatekeeping to continuous, proportionate stewardship. This should encompass Trusted Research Environments, audit logging, and ethics-by-design approaches that render secure data use practicable without encouraging insecure workarounds. As initiatives such as the European Health Data Space develop, legal alignment must be complemented by operational accountability for downstream use. Policymakers, funders, data access bodies, infrastructure custodians, and technology providers should embed auditability, user-friendly secure environments, and participant- and public-facing transparency into routine governance. Health data infrastructures can sustain scientific validity and public trust only when responsible data sharing is coupled with continuous, practical, and publicly accountable stewardship.

  • Background: Diabetes affects 11.1% of adults worldwide, and care access is limited by cost and specialist shortages. Prior meta-analyses used small samples and rarely assessed publication bias or evidence certainty. Objective: To estimate telemedicine's effect on glycated hemoglobin (HbA1c), body mass index (BMI), and self-efficacy in diabetes, explore heterogeneity, and grade evidence certainty. Methods: DATA SOURCES: PubMed, EMBASE, the Cochrane Library, and Web of Science from inception to May 15, 2026. STUDY SELECTION: Randomized clinical trials (RCTs) of telemedicine vs usual care, in-person care, or no intervention in type 1, 2, or gestational diabetes (T1DM, T2DM, GDM) reporting HbA1c, BMI, or self-efficacy. Of 7,033 records, 145 RCTs (147 arms) were included. DATA EXTRACTION AND SYNTHESIS: Two reviewers independently screened, extracted, and assessed risk of bias (Joanna Briggs Institute tool) per PRISMA 2020. Random-effects (REML) models included subgroup and meta-regression analyses by diabetes type, comorbidity, and duration. Publication bias was assessed with Egger's test, trim-and-fill, PET-PEESE, and worst-case meta-analysis; certainty was graded with GRADE. PROSPERO: CRD420261387371. MAIN OUTCOMES AND MEASURES: Mean differences (MDs) for HbA1c and BMI; standardized mean differences (SMDs) for self-efficacy; 95% CIs. Results: In 31,657 participants, telemedicine reduced HbA1c (k = 133; MD, −0.36%; 95% CI, −0.43 to −0.28) and improved self-efficacy (k = 22; SMD, 0.57; 95% CI, 0.21 to 0.94) but did not affect BMI (k = 64; MD, −0.23 kg/m²; 95% CI, −0.46 to 0.01). The HbA1c effect varied by diabetes type (P for interaction < 0.001; near-null in GDM), comorbidity (P = 0.016; largest with depression or anxiety, MD, −0.68%), and duration (P = 0.030; weaker beyond 6 months); residual I² = 76.7%. All bias corrections attenuated the HbA1c estimate (PET P = 0.197); the self-efficacy estimate did not survive trim-and-fill or worst-case analysis. 
GRADE certainty was very low for HbA1c and self-efficacy and low for BMI. Conclusions: Telemedicine was associated with modest improvements in HbA1c and self-efficacy but not BMI, strongest in T2DM, comorbid depression or anxiety, and interventions ≤3 months. However, certainty was low to very low, and bias corrections suggest the pooled effects may be inflated. Targeted, short-duration telemedicine is promising but not sufficient for broad implementation; rigorous pre-registered RCTs are needed.

  • Background: Pregnancies extending beyond the estimated date of delivery (EDD) often require repeated fetal surveillance, contributing to increased healthcare utilization. Digital health approaches enabling patient-operated remote monitoring may support more flexible models of antenatal care. This study evaluated the feasibility of a patient-operated remote fetal surveillance pathway using a smartphone-based ultrasound system in women at or beyond EDD. Objective: To assess the feasibility, usability, and acceptability of patient-performed smartphone-based ultrasound with remote physician review for fetal surveillance beyond EDD. Methods: In this prospective, single-center observational study, 50 women with singleton pregnancies ≥40+0 weeks independently performed fetal ultrasound examinations using a smartphone-coupled ultrasound device following video-based instruction only. Examinations were transmitted for remote physician review and compared with routine clinician-performed ultrasound assessments. Fetal heart activity, amniotic fluid volume, and fetal movements were assessed. Participants completed structured questionnaires evaluating usability, satisfaction, perceived safety, and anticipated impact on outpatient visit utilization. Results: Participants successfully obtained interpretable assessments of fetal heart activity in 92% of cases, amniotic fluid volume in 96%, and fetal movements in 100%. All routine clinician-performed scans were normal. Mean user satisfaction was 4.6/5. Most participants reported feeling safe using the system (91.9%) and indicated willingness to use it at home (97.9%). Additionally, 81.6% anticipated reduced outpatient visits with access to the system. No device-related adverse events occurred. Conclusions: Patient-operated smartphone-based ultrasound with remote physician review was feasible and well accepted for fetal surveillance in late-term pregnancy. 
These findings support further evaluation of patient-operated tele-ultrasound within digital antenatal care pathways in larger multicenter studies conducted in real-world settings.

  • Background: There is increasing interest in developing AI tools to identify and address individual-level social determinants of health in both health care and human service settings. These activities are part of social care integration, which at the individual level involves identifying individuals with social risks (awareness) and connecting them with relevant social care resources (assistance). Social care providers such as community health workers and social workers are deemed critical stakeholders in both settings. Chatbots have shown feasibility and acceptability for social risk screening in emergency departments and primary care centers. However, we do not know whether a screening chatbot is worth developing for safety net health care and human service settings, where visitors have greater social needs and organizations have fewer resources. Objective: The study aims to investigate the perceived value proposition of an AI-based chatbot for social risk screening from the perspectives of social care providers in safety net health care and human service organizations. Providers’ perceived value propositions of other AI-based applications for social care integration were also examined. Methods: We conducted semi-structured interviews with 19 social care providers who have experience with awareness and/or assistance from 16 safety net health care and human service organizations in Michigan. Interview questions focused on their experiences and challenges regarding awareness and assistance when applicable. A simulated screening chatbot based on ChatGPT-4o was also used to solicit their feedback on the technology. The nonadoption, abandonment, scale-up, spread, and sustainability (NASSS) framework was used to guide data analysis. Interview transcripts were first coded deductively, then inductively. Results: Social care providers perceived the screening chatbot as offering limited value. 
This is mainly because many participants engaged in assistance activities and they noted addressing social needs is a multi-step process requiring follow-up that screening chatbots do not provide. In addition, they valued cultivating trust as many patients/clients have high social needs and lack trust in the health care system, and felt chatbots present new challenges for maintaining essential trust and care quality with clients/patients. Instead of a screening chatbot, we identified that technologies to reduce documentation burden could improve providers’ efficiency and potentially increase time spent with patients/clients. This is because social risks and needs documentation generates administrative burden for providers. We also found that technologies to improve referral accuracy and engagement could improve providers’ effectiveness, as study participants have overall limited access to technologies that typically support referral-related activities. Conclusions: Social care providers in the safety net preferred AI-based applications for addressing documentation burden and social needs assistance rather than for social risk screening. Future strategies to develop AI tools for social care integration should align with social care providers’ professional values and focus on equity-centered care.

  • Characterizing Patient Portal Usage and Implications: A Data-driven Analysis on a Rural Population

    Date Submitted: Jun 10, 2026
    Open Peer Review Period: Jun 11, 2026 - Aug 6, 2026

    Background: Patient portals have become a primary channel for asynchronous patient–provider communication, yet rural-specific communication patterns remain underexplored. Objective: This study characterizes portal communication patterns in a rural-serving academic medical center over 5 years and links messaging behaviors to markers of care burden. Methods: We analyzed 370,498 patient messages and 256,295 provider responses from 10,206 patients at Dartmouth Hitchcock Medical Center (2020-2024). We also analyzed the linked structured electronic health record (EHR) data for each patient. We used validated large language model (LLM)–based classifiers for thematic analysis and message authorship identification at scale. Results: Female patients and older adults generated disproportionately high message volumes. Anxiety (47.0%), hypertension (36.1%), and lipid disorders (33.7%) were the most prevalent conditions. Information seeking dominated portal communication (33.4%). The median response time to a patient message was 10.6 hours, with patients who had dementia or cerebrovascular conditions waiting the longest. Care partner-authored messages were substantially elevated among patients with Alzheimer disease and dementia (approximately 55%-60%, vs 5% for those without). Conclusions: Rural portal communication reflects systematic disparities linked to age, sex, and clinical complexity. LLM-based analysis enables scalable characterization of thematic patterns and clarity failures that may inform AI-assisted triage.

  • Background: Sensor metadata is critical for exposure health research because it supports accurate sensor identification, deployments, data integration, interoperability, and reproducibility. Yet it is often fragmented across multiple heterogeneous sources, such as scientific literature and manufacturer guides, where key specifications are frequently reported indirectly through citation chains, making reference tracing essential for metadata enrichment and completeness. Objective: To address this bottleneck, we developed and evaluated an LLM-based automated, citation-aware pipeline that enriches sensor metadata extracted from a primary article by identifying sensor-related citation markers and extracting additional metadata from the referenced sources. Methods: We extend our prior LLM-based metadata extraction approach by (i) detecting sensor mentions in full-text articles, (ii) capturing nearby citation markers, (iii) resolving markers to full bibliographic entries in the reference list, and (iv) retrieving cited papers to extract additional sensor metadata that may be absent from the primary document and using it to enrich and complete the base metadata. Results: Across 20 primary papers, the citation extraction component achieved 74.2% precision, 92.0% recall, 82.1% F1-score, and 69.7% accuracy, and all extracted bibliographic entries were correctly matched to their source references. This component increased sensor extraction by about 261%, yielding 94 additional sensors overall. Conclusions: The developed citation-guided pipeline improved sensor discovery and metadata completeness, thereby supporting the development of richer, more complete sensor metadata repositories.

  • Background: The World Health Organization reports that chronic non-communicable diseases account for 74% of global deaths. Despite rapid advances in digital health technology, artificial intelligence (AI) tools for self-management remain deficient in two crucial elements: emotional connection with patients and trustworthiness. Interest in both topics is growing. However, research on the trustworthiness of AI and research on the emotional intelligence of AI have developed separately, and this separation is itself a considerable design gap. Self-management of chronic conditions therefore needs a design approach that integrates the two concerns. Objective: This systematic review aims to examine the extent to which the concepts of trustworthiness, emotional intelligence, situational awareness and personalization are integrated within artificial intelligence systems designed to facilitate self-management of chronic illness, and what the impact of integration is. Methods: From the beginning of February 2026, a thorough search was undertaken on 6 databases (PubMed, Scopus, IEEE Xplore, PsycINFO, Web of Science, and ACM Digital Library) along with connected papers, using Boolean strings linking together AI, chronic disease self-management, trust and emotional intelligence (tailored to each database's individual vernacular). Identified articles were screened in two phases following PRISMA 2020, using pre-determined inclusion and exclusion criteria, before critical appraisal with the MMAT v2018 and CASP tools. Results: Of 1,486 studies identified, 45 were selected through a systematic selection process based on the inclusion criteria. 
Four major theme areas emerged from the papers: technology in chronic illness self-management; trust in human-AI interaction; empathy and emotional intelligence in AI; and the ethics, equity and ethical application of AI. The quality appraisal showed that 91% (41/45) of the selected studies were rated as high quality, with an average appraisal score of 93%. In addition, 73% (33/45) of the selected papers were published during 2024 and 2025, indicating that the compiled literature is both high in quality and contemporary. Conclusions: Despite advances in AI for chronic disease management and in trust-empathy theory, these fields remain siloed. We identify 5 critical research gaps: the lack of a combined trust-empathy model, the under-specification of context awareness, the absence of equity consideration, the exclusion of overtrust consideration, and the lack of long-term studies demonstrating effectiveness and safety of emotional AI systems. Clinical Trial: Not registered. The review protocol was developed before the search but was not prospectively registered in PROSPERO or an equivalent database. This limitation is discussed in Section 4.4.

  • Background: Mobile health (mHealth) offers new possibilities for self-management among elderly patients with chronic diseases. However, age-related physiological decline, reduced cognitive function, and low digital literacy create a significant "digital divide," hindering their effective access to and use of mHealth services. Adolescents, as "digital natives," hold significant potential in helping their elderly family members adapt to digital technologies. Nevertheless, the mechanisms, action patterns, and influencing factors of their backfeeding behaviors remain unclear. Objective: This study aims to explore the conditions, action/interaction strategies, and consequences of digital backfeeding from adolescents to elderly patients with chronic diseases for mHealth adoption, and to construct a mechanism model based on grounded theory. Methods: This study followed the procedural grounded theory approach by Strauss and Corbin. From April 2025 to January 2026, using purposive and theoretical sampling, we recruited 15 adolescents (aged 14-24 years) who provided digital backfeeding to elderly relatives with chronic diseases for semi-structured in-depth interviews. We followed a three-level coding paradigm: open coding, axial coding, and selective coding. Data analysis was performed using NVivo 15 software. Two researchers independently performed all coding, and disagreements were resolved through team discussion. Results: Among the 15 participants, 11 were female, and 4 were male; 9 were students, and 6 were employed; 8 lived with their elderly patient relatives, and 6 lived separately. The primary care recipients were grandparents (9 participants), and the main chronic diseases were hypertension (10 cases), diabetes (5 cases), and heart disease (5 cases). 
Coding analysis generated 87 initial concepts, which were grouped into 32 categories, and finally integrated into 4 antecedent conditions (individual characteristics of elderly patients, family intergenerational context, technology and task environment, and health management needs), 4 action/interaction strategies (proxy operation mode, digital teaching empowerment mode, information intermediary adjustment mode, and collaborative management mode), and 4 consequence dimensions (impact on elderly patients, impact on adolescents, impact on family intergenerational relations, and impact on the backfeeding process itself). Based on these findings, a systematic digital backfeeding mechanism model was constructed. The model reveals 15 typical backfeeding pathways, including empowerment success, proxy dependence, teaching compromise, family collaboration, remote assistance, AI enhancement, and abandonment of backfeeding. Conclusions: This study is the first to systematically elucidate the core action patterns and dynamic evolution mechanisms of digital backfeeding from adolescents to elderly patients with chronic diseases for mHealth adoption. It constructs a backfeeding mechanism model based on the "conditions—action/interaction strategies—consequences" paradigm, extending the application boundaries of digital backfeeding theory to the health care domain. The findings provide an evidence-based foundation for the age-friendly transformation of mHealth and the development of intergenerational support policies in China.

  • Background: Tracheostomy is a frequently performed procedure in critical care settings, but procedures are often inconsistently coded in electronic health records (EHRs), with explicit designation as elective or emergency frequently absent. This coding ambiguity limits the ability to identify planned tracheostomy cohorts for observational research on outcomes and time toxicity. Common data models such as the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) enable large-scale federated research, but require validated computable phenotypes to ensure reliable cohort identification across heterogeneous data sources. Objective: To develop and validate a computable phenotype that identifies elective tracheostomy procedures from EHR data standardized to the OMOP CDM, enabling scalable and reproducible analysis of tracheostomy-related time toxicity in critically ill patients. Methods: We conducted a retrospective observational study using EHR data from the Johns Hopkins Health System from 2017 to 2024, comprising approximately 2.1 million patients with data mapped to the OMOP CDM. A series of cohort definitions were developed using standardized clinical code sets (International Classification of Diseases, 10th Revision [ICD-10] and Current Procedural Terminology [CPT]) from the Observational Health Data Sciences and Informatics (OHDSI) Standardized Vocabularies. To classify tracheostomy procedures lacking explicit urgency coding, we compared covariate prevalence and temporal relationships (e.g., intubation timing relative to tracheostomy) between explicitly coded elective and emergency cohorts. Six candidate computable phenotypes with stepwise inclusion and exclusion criteria were evaluated using PheValuator, a validated probabilistic phenotype evaluation tool. 
Results: Among 3552 patients with a tracheostomy procedure identified between 2017 and 2024, 2484 (69.9%) were explicitly coded as elective and 107 (3.0%) as emergency; the remaining 961 (27.1%) lacked explicit urgency classification. Covariate analysis revealed significant differences in intubation timing, drug exposures, and procedure codes between the explicitly coded groups. The best-performing computable phenotype (Cohort #202), which used inpatient visit-based attribution of planned and emergency codes, achieved a sensitivity of 0.88 (95% CI 0.84-0.91) and a positive predictive value (PPV) of 0.81 (95% CI 0.77-0.84), with an F1 score of 0.84. Conclusions: The proposed computable phenotype effectively distinguishes elective from emergency tracheostomy in structured EHR data. This approach enables large-scale, reproducible studies of tracheostomy-related time toxicity across heterogeneous OMOP-mapped data sources and provides a generalizable framework for phenotyping intent-ambiguous procedures across federated research networks.
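
As a quick consistency check on the figures above, the F1 score can be recomputed from the reported sensitivity and PPV. This is a minimal sketch that assumes the standard definition of F1 as the harmonic mean of the two; the function name is illustrative and is not taken from the study or from PheValuator.

```python
def f1_score(sensitivity: float, ppv: float) -> float:
    """Harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values reported in the abstract for Cohort #202.
f1 = f1_score(sensitivity=0.88, ppv=0.81)
print(round(f1, 2))
```

The rounded result agrees with the reported F1 of 0.84.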

  • Background: Medication-related harm is a major cause of preventable morbidity and mortality in hospitalised patients, particularly among older individuals with polypharmacy. Healthcare-specific large language models (LLMs) trained on validated pharmacological sources may provide more reliable and clinically relevant drug–drug interaction (DDI) detection than general-purpose systems. Objective: To evaluate the accuracy and processing time of Katana AI, a novel healthcare-specific large language model, compared with pharmacist-led medication review and a general-purpose large language model for DDI detection in surgical inpatients. Methods: Medication charts from surgical inpatients were prospectively reviewed between September 2025 and February 2026. DDIs identified by Katana AI, pharmacist-led review, and ChatGPT were compared against the British National Formulary (BNF) reference standard. Interactions were classified by severity and level of supporting evidence. Detection accuracy and processing time were recorded and compared. Results: Thirty-nine surgical inpatients were included, comprising 293 prescribed medications. The median age was 70 years (IQR 60–75), with 69% aged over 65 years. Katana AI identified 125 DDIs, of which 117 were clinically accurate (93.6%), compared with 85.9% for pharmacist review and substantially lower accuracy for ChatGPT. Katana AI demonstrated significantly higher accuracy than ChatGPT (p<0.001) and a modest but statistically significant improvement over pharmacist review (p=0.041). Mean processing time was 32.1 seconds for Katana AI, comparable to ChatGPT (30.7 seconds) and significantly faster than pharmacist review (227 seconds; p<0.001). Conclusions: Katana AI demonstrated high accuracy and rapid detection of clinically relevant DDIs, outperforming a general-purpose large language model and showing a modest improvement over pharmacist review. 
These findings support the potential role of healthcare-specific large language models as clinical decision-support tools to enhance medication safety and prescribing efficiency. Further multi-centre validation is warranted. Clinical Trial: N/A

  • Background: Large language models (LLMs) are increasingly evaluated for clinical decision support, but their practical value depends on post-training adaptation rather than raw benchmark performance. Fine-tuning, retrieval-augmented generation (RAG), and hybrid approaches represent the principal strategies for improving clinical reliability and relevance, yet evidence remains fragmented across specialties, model architectures, and evaluation designs. Objective: To synthesize evidence on fine-tuning, RAG, and hybrid post-training strategies for improving LLM clinical performance across healthcare tasks. Methods: We searched PubMed/MEDLINE, EMBASE, and Scopus (January 2018 – January 2026). Eligible studies evaluated transformer-based LLMs with post-training adaptation or retrieval augmentation applied to clinical datasets and reported quantitative performance outcomes. Risk of bias was assessed using PROBAST+AI. Results were synthesized descriptively by enhancement strategy. This review was prospectively registered (PROSPERO: CRD420261308522). Results: Of 1,890 records identified, 35 studies (published 2024–2026) met inclusion criteria across 12 countries and multiple clinical domains, including oncology, radiology, neurology, and emergency medicine. Three enhancement strategies were identified: SFT/PEFT (n=7, 20%), RAG (n=17, 49%), and hybrid pipelines combining fine-tuning, retrieval, and structured prompting (n=11, 31%). Fine-tuning was most effective for narrow, labeled classification tasks. External AUROC reached 0.912 for cancer detection and 0.938 for hepatocellular carcinoma from cell-free DNA signatures; macro-sensitivity was 0.918 for acute infarct detection from radiology reports; and AUC was 0.892 for major depressive disorder on UK Biobank data. RAG produced the largest gains when corpora were authoritative and aligned with the clinical task. 
Incorporating the ESC acute coronary syndrome guideline raised accuracy from 71.1% to 92.1% (GPT-4o) and from 78.9% to 94.7% (DeepSeek R1). A trauma-radiology chatbot improved injury grading accuracy from 48% to 87%, and a guideline-grounded urology pipeline reached 95.5% concordance versus 62.3% among junior clinicians. RAG reduced performance in two studies with noisy or poorly structured corpora, and reasoning-class models showed limited incremental benefit from retrieval. Hybrid systems achieved the strongest results for complex tasks. A stroke pipeline fine-tuned using LoRA reached 99.0% internal and 95.5%/79.1% external accuracy; a federated multimodal dermatology system achieved 90.2% diagnostic accuracy across 11 lesion types; and a multimodal osteonecrosis pipeline reached 96.0% expert-rated accuracy. Structured prompting (persona, chain-of-thought, task decomposition) shifted accuracy by 5–15 percentage points across studies. Only 10 studies (29%) reported external validation, and 24 (69%) lacked formal safety moderation. Conclusions: Post-training adaptation and retrieval augmentation improved clinical LLM performance across diverse tasks. Strategy-task alignment, corpus quality, and prompt design were primary determinants of benefit. Evidence remains predominantly retrospective with limited external validation. Future studies should prioritize prospective, clinically embedded evaluations incorporating safety and fairness reporting.

  • Background: Thyroid fine-needle aspiration biopsy (FNAB) is a commonly used diagnostic procedure in patients with suspected thyroid cancer; however, it may induce pain, anxiety, and fear during the procedure. Objective: This randomized controlled study aimed to evaluate the effect of virtual reality (VR) on pain, anxiety, and fear of pain in patients undergoing diagnostic thyroid procedures. Methods: The study was conducted between October 15, 2024, and April 30, 2025, at Gaziantep City Hospital, Türkiye. A total of 100 patients with suspected thyroid nodules were randomly assigned to either a VR intervention group (n = 50) or a control group (n = 50). Data were collected using a Patient Information Form, the Beck Anxiety Inventory (BAI), the Fear of Pain Questionnaire III (FPQ-III), and the Visual Analog Scale (VAS). Between-group comparisons were performed using ANCOVA adjusting for relevant baseline covariates, and effect sizes were calculated using Cohen’s d with 95% confidence intervals. Results: After adjustment for baseline values and relevant covariates, no statistically significant differences were found between the VR and control groups in post-intervention VAS (P =.152), BAI (P =.501), or FPQ-III scores (P =.20). Effect size analyses indicated small between-group effects across all outcomes (Cohen’s d = −0.05 to −0.29), with 95% confidence intervals including zero. Within-group analyses indicated reductions in VAS, BAI, and FPQ-III scores over time in both groups; however, these changes were not supported by statistically significant between-group differences. Conclusions: Virtual reality was not associated with statistically significant improvements in pain, anxiety, or fear of pain when compared with standard care after adjustment for baseline differences. Although small within-group improvements were observed, these findings do not support a strong independent effect of VR on procedural discomfort in this sample. 
Further well-powered randomized trials are warranted. Clinical Trial: This study was registered at ClinicalTrials.gov (NCT06792929), https://register.clinicaltrials.gov/prs/beta/studies/S000F9GN00000034/protocol/protocolSummary?fragmentId=status

  • Background: Mobile health (mHealth) and telemedicine can improve access to healthcare services in rural Ethiopia through education, disease monitoring, and remote consultations. Adoption depends on factors such as digital literacy, trust, training, and infrastructure, while challenges include limited smartphone availability and connectivity. Objective: The aim of this review was to synthesize the evidence on willingness to use mHealth and telemedicine interventions among health professionals and patients. Methods: This systematic review was conducted following the PRISMA guidelines to examine the willingness to use mHealth and telemedicine interventions and associated factors among healthcare professionals and patients in Ethiopia. A comprehensive search was carried out in databases such as MEDLINE, PubMed Central, CINAHL, and Africa-Wide Information using predefined search strategies. Only full-text, peer-reviewed studies published in English were included. Two reviewers independently screened and selected studies, extracted data using a standardized form, and resolved disagreements through consensus. Study quality was assessed using Joanna Briggs Institute checklists. Results: This review included 13 studies and indicates that patients and healthcare workers are strongly willing to use mHealth and telemedicine. The highest willingness was observed among patients with chronic conditions (59.1%–96%), who preferred simple technologies (voice calls/SMS). Healthcare professionals also indicated varying but substantial willingness to engage (46.5%–83%). Conclusions: Both patients and healthcare providers have a high degree of willingness. Younger age, higher education, urban residence, smartphone ownership, digital literacy, and perceived usefulness and ease of use were significant factors determining willingness. Additional behavioral, clinical, and environmental factors played a supporting role. 
Clinical Trial: This systematic review was registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42024629424).

  • Background: Effective chronic disease management requires individuals to prioritize long-term health goals over immediate temptations. As chronic patients increasingly engage with online health information, it is important to understand how such engagement may relate to future-oriented cognition and self-regulatory capacity. Objective: This study examined the association between online health information seeking behavior (HISB) and self-control among adults with chronic diseases and investigated whether consideration of future consequences (CFC) was associated with this relationship. Methods: Cross-sectional survey data were collected from 11,031 adults with chronic diseases in China. Mediation analyses were conducted using SPSS macro PROCESS with 5,000 bootstrap samples while controlling for demographic, health-related, and psychological covariates. Results: HISB was positively associated with CFC (B=.07, SE=.005, p<.001). CFC was positively associated with self-control (B=.56, SE=.008, p<.001). After CFC was entered into the model, the direct association between HISB and self-control was no longer statistically significant (B=.005, SE=.004, p=.22). Bootstrap analyses indicated a significant indirect effect of HISB on self-control through CFC (B=.041, BootSE=.003, 95% CI .0348-.0479). Conclusions: The findings suggest that consideration of future consequences may help explain the association between online health information seeking and self-control among adults with chronic diseases. More broadly, digital health environments may increase the salience of future health consequences by repeatedly rendering long-term outcomes cognitively accessible in everyday life. Longitudinal and experimental research is needed to clarify causal mechanisms underlying these associations.
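
The indirect effect reported above can be approximated from the two path coefficients. The sketch below assumes a simple single-mediator model in which the indirect effect is the product of the HISB-to-CFC and CFC-to-self-control paths; the abstract does not state which PROCESS model was used, and the variable names are illustrative only.

```python
# Path coefficients as reported (rounded) in the abstract.
a = 0.07         # HISB -> CFC
b = 0.56         # CFC -> self-control
c_prime = 0.005  # direct HISB -> self-control path with CFC in the model

# Product-of-coefficients estimate of the indirect effect (assumed model).
indirect = a * b

# Total association implied by direct plus indirect paths.
total = c_prime + indirect

print(f"indirect ~ {indirect:.4f}, total ~ {total:.4f}")
```

The product of the rounded coefficients comes to about 0.039, close to the reported 0.041; the small gap is presumably due to rounding of the published coefficients. A percentile bootstrap, as used in the study, would resample participants and recompute this product on each resample to obtain the confidence interval.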

  • Digital Maturity in Integrated Care Systems: Development strategies – A Scoping Review

    Date Submitted: Jun 6, 2026
    Open Peer Review Period: Jun 8, 2026 - Aug 3, 2026

    Background: Digital maturity is a priority for creating efficient, patient-centered health systems, yet Integrated Care Systems (ICS) often face challenges such as a lack of interoperability and weak data governance. A systematic mapping of strategies is essential to help these organizations identify areas for improvement and define sustainable actions to ensure technology adds value for all stakeholders. Objective: To map the development strategies and interventions implemented in ICS to promote digital maturity, while identifying the associated facilitators, barriers, and recommendations described in the literature. Methods: A search was conducted on PubMed, Scopus, and Web of Science on October 17, 2025, for English-language articles published since January 1, 2015. Following Joanna Briggs Institute methodology, two independent reviewers performed study selection, data extraction, and quality appraisal using the Mixed Methods Appraisal Tool, with a third reviewer resolving any conflicts. The results were synthesized through descriptive analysis and thematic grouping. Results: Eighteen articles were included, featuring mixed-methods and case study designs, predominantly set in the United States, as well as several multi-country studies set in Europe. Most strategies were technological (telehealth, electronic health records, and care coordination tools) or structural (governance frameworks). Key facilitators included strong organizational leadership, pre-existing digital infrastructure, and stakeholder engagement, while significant barriers included a lack of interoperability and inadequate funding. Regulation was found to be an obstacle to the development and implementation of digital tools, as privacy legislation often prevents full interoperability, making it essential to use frameworks like “Privacy by Design” to address privacy concerns during the development phase of digital solutions. 
Several frameworks surfaced, with the Chronic Care Model and the eHealth Enhanced Chronic Care Model being the most prevalent. Stakeholder engagement emerged as a pivotal enabler, yet significant resistance persists due to low digital literacy, misconceptions, and an aging workforce, making it critical not only to develop formal and continuous training but also to actively involve stakeholders in problem-solving through a co-creation process. Conclusions: Developing digital maturity in ICS requires a multidimensional approach that extends beyond technological adoption to include multidisciplinary governance, national eHealth policies, and value-based funding models. Addressing low digital literacy through formal training for staff and patients is critical for the health care system's sustainability. The review provides a foundational framework for healthcare managers and for future research and development of digital maturity guidelines in ICS.

  • Background: Media consumption is a pathway through which the public encounters health information, misinformation, and politicized interpretations of evidence, yet its relationship with knowledge across multiple health domains remains incompletely understood. Objective: We conducted a cross-sectional survey of U.S. adults recruited through CloudResearch Connect to examine associations among media source use, institutional trust, demographic characteristics, and knowledge accuracy regarding climate change, type 2 diabetes, and infectious diseases. Methods: After excluding invalid responses and a failed attention check, 509 participants were included. Knowledge was assessed with domain-specific true/false items scored as correct, incorrect, or “I don’t know,” producing climate change, chronic disease, infectious disease, and total knowledge scores. Results: Rural residence, lower income, lack of health insurance, and absence of a primary care provider were associated with lower knowledge across several domains, suggesting structural barriers to reliable health information. Trust in the CDC, physicians, and pharmacists showed the strongest and most consistent positive associations with knowledge. Political affiliation and consumption of ideologically distinct news sources were most strongly associated with climate change and infectious disease knowledge, but less so with diabetes knowledge. Conclusions: These findings suggest that public health literacy interventions should address both polarized media environments and inequitable access to trusted clinical and institutional information.

  • How Wearables Shape Sleep and Health Behaviors: A Cross-Sectional Analysis by Gender and Global Region

    Date Submitted: Jun 5, 2026
    Open Peer Review Period: Jun 7, 2026 - Aug 2, 2026

    Background: Over the last 15 years, wearable health devices have become increasingly commonplace, with ownership ranging from 30% to 50% of adults globally. Using advanced technologies, wearable devices today support both consumer wellness and medical-grade applications for a range of chronic conditions, including sleep disorders such as obstructive sleep apnea and insomnia. Despite the growing interest in wearable devices, few studies have explored global consumer adoption and perspectives, and how health and lifestyle decisions are impacted by available insights. Objective: This study aimed to examine wearable device adoption, usage patterns, motivations, confidence in wearable-generated data, and health-related behavior changes among adults, with a particular focus on differences by gender and geographic region. Methods: We conducted a cross-sectional, global electronic survey of 9980 employees from a multinational health technology company between July and August 2024 to better understand wearable device usage, perceptions, and beliefs. Participants were invited to participate via digital flyers and emails, as well as via printed materials placed in the workplace. Of the employees invited, 1589 (16%) were eligible, consented, and completed the survey. Descriptive statistics summarized survey responses, and chi-square tests and proportion tests were used to evaluate differences in drivers of wearable use, confidence, and beliefs by gender, region, and sleep-tracking status. Results: Respondents had a mean (SD) age of 42 (10) years, and 737 (47%) were women, with regional representation from North America (579/1570, 37%), Western Europe (395/1570, 25%), Australasia (395/1570, 25%), and Asia (201/1570, 13%). Most participants (n=1023, 64%) were current wearable device users, and 50% of them (n=513) used a wearable device to track sleep. 
Wrist-worn devices were the most common form factor (1000/1023, 98%) followed by rings (99/1023, 10%). Participants primarily used devices for wellness tracking, and while most (616/1023, 60%) felt confident in the accuracy of the data, only 15% (n=153) regularly shared data with their healthcare provider (HCP). Among those respondents with obstructive sleep apnea (312/1589, 20%), about 21% (108/513) used a wearable device to track their sleep, with similar rates of data sharing. Usage patterns differed by age, geography and gender, as did sharing of data with HCPs. Conclusions: Wearable devices are widely used to support health and wellness behaviors and are associated with self-reported changes in exercise, goal-setting, and sleep habits. Significant differences in engagement and health-related behaviors across gender and geographic regions suggest that demographic and contextual factors influence how wearable technologies are used. These findings may inform the development of more personalized and equitable digital health interventions and support greater integration of consumer-generated health data into healthcare settings.

  • Musculoskeletal health literacy requires patients to understand complex treatment options, postoperative precautions, and recovery timelines, which together help set realistic expectations for recovery. However, existing patient education materials are often text-heavy, exceed recommended reading levels, and fail to depict how functional recovery progresses over time, which may be especially limiting in safety-net settings serving populations with variable health literacy. In this paper, we describe methods for designing and developing a secure, institution-restricted gait-recovery video library for orthopedic patient education. Our library was built to close that gap with short, patient-perspective recovery videos centered on one of the most meaningful outcomes of lower-extremity surgery: functional mobility. Each video was built around a standardized Timed Up and Go (TUG) assessment recorded from frontal and sagittal views, paired with relevant radiographs and visually adapted patient-reported outcome measures (PROM), to create a multimodal, visually guided recovery pathway. This publication aims to detail the process of selecting a secure hosting platform; choosing the filming setup, recovery milestones, and key visual features to capture; maintaining patient privacy and data security; executing clinic-based filming and video editing; and building a personalized interface that allows videos to be filtered by procedure type, recovery stage, and patient characteristics.

  • Generative AI Chatbot Responses to Suicide and Self-Harm: A Systematic Review

    Date Submitted: Jun 5, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: A growing number of US adults and youth confide in generative artificial intelligence (AI) chatbots for mental health support, including disclosure of suicide and self-harm risk. While the quality, safety, and effectiveness of chatbot responses to risk disclosure have the potential to impact population-level rates of suicide and self-harm, there have been no systematic reviews of this burgeoning literature. Objective: We conducted a systematic review of studies evaluating generative AI chatbot responses to disclosure of suicide and self-harm risk. Methods: We searched six databases from January 2020 to December 2025 and identified empirical studies involving interactions with generative AI chatbots that included discussion of suicide or self-harm. Following deduplication, studies (k = 1,042) were imported into Covidence, and titles and abstracts were independently screened by two reviewers, with discrepancies resolved by a third reviewer. The same methods were used to evaluate 126 full texts. Data extraction was led by one reviewer and verified by a second. Results: We identified 29 papers (14 published; 15 preprints). Most (k = 20) were solely audit studies evaluating AI chatbot responses to suicide risk disclosure. Two developed chatbots or AI evaluation frameworks, and one was a jailbreaking study (adversarially testing AI systems or attempting to circumvent chatbot safety guardrails). The remaining studies combined approaches. Across studies, proprietary, frontier model chatbots (eg, ChatGPT, Claude) provided higher quality responses to suicide and self-harm risk than open-source chatbots (eg, LLaMA, DeepSeek) and many AI companions (eg, Replika, Character.AI). All chatbots, not just proprietary models, generally performed well on empathy, validation, and support. However, chatbot responses were often generic and lacked context. 
Chatbots did not proactively assess risk and performed most poorly when risk disclosure was ambiguous or moderate, frequently failing to recognize implicit risk or escalate to human-delivered services. Furthermore, responses were inconsistent between chatbots and often required multiple conversational turns before providing referrals to crisis resources and human-delivered professional support. While there were few examples of overtly harmful responses under standard conditions, jailbreaking attempts easily led to problematic responses. Finally, no chatbot proactively recommended limiting access to lethal means such as firearms, medications, or sharps. Conclusions: Chatbots provide validation and support in response to suicide and self-harm disclosure. Overall, however, their poor risk assessment, delays in referrals to crisis resources and human-delivered support, difficulty detecting jailbreaking attempts, and general lack of adherence to clinical guidelines present safety risks. While findings are limited by the rapid versioning of AI models over time, research is needed to evaluate stakeholder perspectives on AI chatbot responses to suicide and self-harm risk disclosure. Research should also examine the short- and long-term impact of these responses on clinical outcomes, utilizing follow-up assessments in real-world or clinical settings. Clinical Trial: OSF Registries osf.io/9uva3

  • Validation of a Patient-Facing AI System for Symptom Guidance: A Simulation Study With Physician Review

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: Rapid advances in large language models (LLMs) have expanded interest in healthcare applications that require complex information processing and decision support. Digital health assistants and symptom checkers, which have historically been rule-based, are increasingly incorporating AI capabilities to support initial symptom assessment and patient triage. An accurate and reliable AI-enabled triage tool could improve patient navigation, reduce unnecessary health care utilization, and support earlier recognition of clinically serious conditions. Objective: The objective of this study was to analytically validate the performance of the Personal Health Assistant (PHA), a Large Language Model (LLM)-based patient support tool in producing appropriate recommendations using simulated patient encounters and expert physician review. Methods: We conducted a prospective analytical validation study of the PHA using synthetic patient data. The evaluation set was constructed from 772 synthetic cases generated from published triage protocols and persona-based LLM-assisted generation. Cases included patient vignette summaries with medical histories and simulated patient conversations. PHA provided guidance and recommendations based on these inputs, which were compared to an independent ground-truth derived from expert physician review. Co-primary endpoints of urgent undertriage, nonurgent undertriage, and overtriage were each evaluated against prespecified clinical performance thresholds. Results: The final analysis dataset contained 772 synthetic cases. Urgent undertriage was 28/406 (6.9%, 95% CI 4.6%-9.8%), nonurgent undertriage was 56/305 (18.4%, 95% CI 14.2%-23.2%), and overtriage was 40/364 (11.0%, 95% CI 8.0%-14.7%). Overtriage met the prespecified performance threshold (<30%), whereas urgent (<5%) and nonurgent (<15%) undertriage thresholds were not met. Conclusions: PHA maintained acceptable overtriage but did not meet prespecified undertriage targets. 
These findings support the value of structured predeployment analytical validation as an early evidence step for patient-facing AI systems and highlight its utility for iterative refinement, while also underscoring the need for prospective clinical validation in light of the inherent limitations of simulation-based studies.
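
The threshold comparison in the Results can be reproduced from the reported counts. The sketch below compares point estimates against the prespecified thresholds only; the abstract does not say whether the study's decision rule used point estimates or confidence bounds, so that choice is an assumption here, as are the function and variable names.

```python
# (events, denominator, prespecified upper threshold) as reported in the abstract.
endpoints = {
    "urgent_undertriage": (28, 406, 0.05),
    "nonurgent_undertriage": (56, 305, 0.15),
    "overtriage": (40, 364, 0.30),
}


def meets_threshold(events: int, total: int, threshold: float) -> bool:
    """Return True when the observed rate is below the threshold (assumed rule)."""
    return events / total < threshold


met = {name: meets_threshold(*values) for name, values in endpoints.items()}

for name, (events, total, threshold) in endpoints.items():
    rate = events / total
    print(f"{name}: {rate:.1%} vs <{threshold:.0%} -> {'met' if met[name] else 'not met'}")
```

Under this rule only overtriage falls below its threshold, matching the abstract's conclusion.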

  • VAGPT and Free-Response Open Coding: Using AI for Qualitative Tasks within the Department of Veterans Affairs

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    This research assesses the ability of VAGPT, a large language model authorized for use in the Department of Veterans Affairs, to identify qualitative codes and its coding reliability when applied to free-response data from an anonymous survey of servicemembers' access to posttraumatic stress disorder treatments.

  • Background: Background: Digital transformation has increasingly influenced healthcare systems globally, with Electronic Medical Records (EMRs) becoming central to improving healthcare documentation, communication and decision-making. Despite growing recognition of EMRs as tools for strengthening health data quality, healthcare institutions in many low- and middle-income countries like Nigeria continue to experience setback as regards digital inclusion, infrastructural limitations and workforce readiness. In Nigeria, public tertiary hospitals still experience inconsistent EMR implementation and persistent concerns regarding the quality of patients’ health data. Objective: Objective: This study explored healthcare providers’ perspectives on EMR adoption and health data quality in selected public tertiary hospitals in North-Central Nigeria within the broader context of digital health inclusion in the Global South. Methods: Methods: The study adopted explanatory sequential mixed-method design. The design involved quantitative phase, identification of key quantitative results, qualitative phase, integration of findings and interpretation. The quantitative data were collected using a structured clinical chart review checklist developed from internationally recognized health data quality dimensions and existing literature on EMR systems and health information management. The qualitative data were collected through semi-structured key informant interviews among physicians, nurses and Health Information Management professionals purposively selected from three public tertiary hospitals with varying levels of EMR implementation. Interviews were audio-recorded, transcribed verbatim and analyzed using thematic analysis. Results: Results: The study revealed an overall moderate level of health data quality, with high a Health Data Quality Index (HDQI) of 73%. 
Healthcare providers acknowledged the potential benefits of EMRs in improving accessibility, timeliness, comprehensiveness, relevancy and consistency of health data. Participants identified ease of information retrieval, reduction in missing records and improved continuity of care as major strengths of EMR systems. However, several barriers to meaningful digital inclusion emerged, including unstable electricity supply, poor internet connectivity, inadequate training, workload pressure, dual documentation practices and limited institutional support. Providers further reported that system reliability, ease of use and user satisfaction strongly influenced their willingness to use EMRs consistently. Positive attitudes toward digital systems were associated with improved documentation practices and enhanced health data quality. Conclusions: Electronic medical record adoption in Nigerian tertiary hospitals remains shaped by complex technological, organizational and behavioural factors. Strengthening digital inclusion through reliable infrastructure, workforce capacity building and supportive institutional policies is essential for sustainable EMR utilization and improved health data quality in resource-constrained healthcare settings.

  • The Use of Natural Language Processing to Investigate Social Isolation and Loneliness: A Scoping Review

    Date Submitted: Jun 3, 2026
    Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026

    Background: Social isolation and loneliness (SIL) are associated with critical health consequences but are difficult to measure in healthcare settings because they typically appear in unstructured text. Natural language processing (NLP) offers a promising approach to identify these constructs at scale, but its current applications in this domain have not been systematically characterized. Objective: To investigate how NLP is being used to study SIL, identify gaps, and outline priorities for advancing rigorous NLP-based measurement of these constructs in health research. Methods: A scoping review was conducted following the PRISMA-ScR guidelines. Six bibliographical databases (Ovid MEDLINE, Embase, Scopus, Web of Science, APA PsycINFO, ProQuest Dissertations & Theses Global) and two preprint servers (bioRxiv, medRxiv) were searched from inception to June 18, 2025. Reviewers independently screened abstracts and full texts, data were double-charted using a standardized form, and final results were synthesized using structured template analysis. Results: A total of 63 studies published between 2019 and 2025 met the inclusion criteria. Most were conducted in the US (27/63, 42.9%) and used cross-sectional designs (37/63, 58.7%). Studies mostly targeted older adults (31/63, 49.2%), used survey data (26/63, 41.3%), and focused on loneliness (32/63, 50.8%). Most studies (44/63, 69.8%) did not use a validated loneliness scale; among those that did, the UCLA Loneliness Scale was most common (16/63, 25.4%). Classification (24/63, 38.1%) was the most frequent NLP application. Rule-based (32/63, 50.8%) and traditional machine learning (18/63, 28.6%) approaches predominated, but large language models (16/63, 25.4%) and transformer-based models (14/63, 22.2%) increased over time. External validation was rare (2/63, 3.2%), code was shared in only 19% of studies (12/63), and 25.4% (16/63) addressed bias in their data or analysis. 
Conclusions: NLP applications for SIL are expanding rapidly but rest on narrow methodologies with limited validation measures and demographic groups. Advancing the field requires tying model development to validated measurement of the target constructs and adopting established reporting frameworks such as TRIPOD+AI and MI-CLAIM.

  • Background: Attention-deficit/hyperactivity disorder (ADHD) is a common neurodevelopmental disorder in children, characterised by core symptoms of hyperactivity, impulsivity, and inattention, in addition to cognitive impairments that compromise physical and psychological development. Digital technology-based interventions have emerged as a promising approach for ameliorating both core symptoms and cognitive impairments. However, a comprehensive evidence base supporting their efficacy is lacking. Objective: This study aimed to systematically evaluate the effects of digital interventions on core symptoms and cognitive impairments in children with ADHD. Methods: Web of Science, PubMed, EBSCOhost, and ProQuest were systematically searched using predefined inclusion and exclusion criteria. The risk of bias of the included studies was assessed using the revised Cochrane risk-of-bias tool for randomised trials. Effect sizes were pooled under a random-effects model, and heterogeneity across studies was evaluated using the I² statistic. Publication bias was assessed using Egger’s regression and Begg’s rank correlation tests. Sensitivity analyses were performed by switching from a random-effects model to a fixed-effects model to confirm the results’ robustness. Results: Thirty-seven randomised controlled trials were included, encompassing four types of digital interventions: computer-based interventions, serious video games, exergames, and virtual reality. Digital interventions significantly alleviated core symptoms and cognitive impairments, with improvements in the former primarily attributable to serious video games (P = .02) and those in the latter mainly attributable to computer-based interventions (P = .04) and serious video games (P = .02). Conclusions: Digital interventions can significantly alleviate core symptoms and cognitive impairments in children with ADHD. 
Future research should consider further optimising trial designs and conducting targeted analyses, such as subgroup analyses by symptom subtype and age stratification, to enhance intervention efficacy. Clinical Trial: PROSPERO CRD420261399517; https://www.crd.york.ac.uk/PROSPERO/view/CRD420261399517 Keywords: ADHD; children; digital technology-based interventions; core symptoms; cognitive function; meta-analysis
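
The pooling and sensitivity steps named in the abstract above (random-effects pooling, the I² statistic, and a switch to a fixed-effects model) can be illustrated with a minimal Python sketch. The abstract does not say which between-study variance estimator was used, so the DerSimonian-Laird estimator is an assumption here, and the function name `pool_effects` and the toy inputs are hypothetical rather than taken from the submission.

```python
import math


def pool_effects(effects, variances):
    """Pool per-study effects under fixed- and random-effects models.

    Assumption: the between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method; the abstract does not name an estimator.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    k = len(effects)
    # Fixed-effects (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # DerSimonian-Laird estimate of tau^2, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    random_effect = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_random = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "fixed": fixed,
        "random": random_effect,
        "se_random": se_random,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical standardized mean differences and their variances.
    result = pool_effects([-0.40, -0.10, -0.65], [0.04, 0.02, 0.09])
    print(result)
```

Comparing `result["fixed"]` with `result["random"]` mirrors the sensitivity analysis described in the abstract: if the two pooled estimates point the same way and are of similar size, the conclusion does not hinge on the choice of model.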

  • Background: Generating Findable, Accessible, Interoperable, and Reusable (FAIR) biomedical samples, data, and tools is costly and time-consuming. Transparency about their processing or evolution and about their reuse, particularly for health data, is therefore highly desirable, and an appropriate fact-based decision framework to evaluate data (re)usability is required. Provenance information documents the processing or evolution of a data object, thereby providing an essential formal basis for such a (re)usability evaluation. When standardised, this provenance information facilitates better FAIR biomedical data. Objective: The MInimal Requirements for Automated Provenance Information Enrichment (MIRAPIE) project aims to define the minimal provenance information required for harmonised documentation of a data object's processing history and to establish the MIRAPIE approach as a community standard that assures interoperability of the collected provenance information. Methods: A hybrid consensus-finding method, adapted from the Nominal Group Technique (NGT) and Delphi, was applied within an international community setting to iteratively develop a minimal data model, an ontology, and an application guideline. The data model is based on the PROV Data Model (PROV-DM), and the ontology extends the PROV Ontology (PROV-O). Results: With the MIRAPIE question, we defined a harmonising framework for provenance information in biomedicine and presumably beyond. The minimal data model, a corresponding ontology, and an accompanying guideline provide the means for standardised and possibly automated provenance documentation. Their general applicability to data, workflows, models, and even samples is shown in diverse biomedical usage scenarios. Setting up provenance documentation from scratch is supported, as are linking alternative data schemata and mapping existing provenance documentation. 
Conclusions: The MIRAPIE question, minimal data model, ontology, and guideline together contribute significantly to the advancement of biomedical and especially health research by providing a basis for contextual (re)usability evaluation. This fosters traceability of changes applied to data, workflows, tools, and samples and, in consequence, sustainable data usage and reproducibility of scientific results. The generalisation makes it possible to overcome domain-specific differences and local, national, and international boundaries. We invite the biomedical research community and institutions that gather health data to create lasting change by establishing MIRAPIE-compliant provenance information for transparent data processing and (re)usability assessment.

  • Background: Gestational diabetes mellitus (GDM) has a growing global prevalence and poses multiple short- and long-term hazards to mothers and fetuses. Conventional offline management is restricted by time and space limitations, accompanied by poor patient compliance and delayed individualized intervention. Telemedicine has gradually been applied to GDM management, yet the conclusions of existing randomized controlled trials (RCTs) remain inconsistent and unified quantitative evidence is lacking. Objective: This meta-analysis systematically synthesizes available RCT evidence to quantitatively evaluate the efficacy of telemedicine interventions on maternal glycemic metabolism, delivery modes and multiple neonatal adverse outcomes in women with gestational diabetes mellitus, so as to provide evidence-based references for optimizing clinical GDM management schemes. Methods: Relevant RCTs were comprehensively retrieved from PubMed, EMBASE, Cochrane Library, Web of Science and Scopus up to February 9, 2026. Strict inclusion and exclusion criteria based on the PICOS framework were formulated. Two independent researchers completed literature screening, data extraction and Cochrane RoB 2 risk-of-bias assessment, and the meta-analysis was performed with STATA 18.0 software. Continuous outcomes were expressed as standardized mean differences (SMD), and dichotomous outcomes were summarized as odds ratios (OR) with 95% CI; fixed- or random-effects models were selected according to heterogeneity (I²). Results: Altogether 17 eligible RCTs involving 2391 pregnant women with GDM were included. 
Meta-analysis indicated that telemedicine significantly reduced fasting blood glucose (SMD=-0.55, 95% CI: -0.94 to -0.16) and 2-hour postprandial blood glucose (SMD=-0.62, 95% CI: -1.20 to -0.04), and lowered the risks of emergency cesarean section (OR=0.65, 95% CI: 0.45 to 0.93), macrosomia (OR=0.49, 95% CI: 0.35 to 0.69), neonatal hypoglycemia (OR=0.60, 95% CI: 0.42 to 0.86) and neonatal respiratory distress (OR=0.61, 95% CI: 0.41 to 0.92). No statistically significant improvements were observed in overall cesarean delivery rate, gestational weight gain, preterm birth incidence or NICU admission rate. Conclusions: Telemedicine interventions effectively optimize glycemic control and decrease multiple adverse perinatal complications among GDM patients, serving as a valuable supplementary mode for routine prenatal care. Further large-sample, long-term follow-up RCTs are still required to verify its long-term maternal and infant clinical benefits. Clinical Trial: PROSPERO CRD420261403669; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251155941

  • Background: Digital health programs in sub-Saharan Africa often assume broad mobile reach, yet population-level evidence on who can use specific technologies, and who is excluded, remains limited. Without accurate denominators, digital interventions may reinforce inequities by missing people least engaged with conventional healthcare. Objective: We assessed technology adoption, disparities, and trajectories in a high-HIV-burden rural South African population to inform equitable digital health implementation. Methods: We analyzed 309,151 person-years from the Africa Health Research Institute demographic surveillance platform in rural KwaZulu-Natal, South Africa (2017 to 2023). We measured adoption of seven technologies (calls and SMS, internet, WhatsApp, email, mobile banking, entertainment, and health tracking) and constructed a five-tier Digital Adoption Ladder from offline (T0) to digital-health ready (T4). We quantified disparities by HIV status, gender, and their intersection using logistic regression, and tracked temporal trajectories including the COVID-19 period. Results: In 2023, 61.3% of records were classified as offline (T0) under the harmonized coding rules, and only 2.9% reached digital-health readiness (T4). Among tested individuals, people living with HIV showed higher adoption across all technologies (odds ratios 1.13 to 1.57) than HIV-negative individuals, with 56.0% connected versus 44.6%. Females also showed higher adoption than males (odds ratios 1.24 to 1.80). Intersectional analysis identified HIV-positive females as the most connected group (58.1%) and HIV-negative males as the least connected (38.4%), a 20-percentage-point gap. This pattern emerged after 2019 and defines a prevention paradox: a group important for HIV testing, PrEP, and prevention outreach is also the least reachable through digital channels. 
Conclusions: Digital health implementation should adopt a floor-up strategy: start with SMS (reaching approximately 39%), add WhatsApp where connectivity exists, and reserve apps for the small minority able to use them. HIV-negative males require targeted outreach through non-health channels to prevent digital exclusion from weakening HIV prevention.

  • Background: There are few theoretical frameworks in the literature for the strategic planning of health information systems. Demonstrating and analyzing their use in practice can lead to broader application and evidence-based decision making. Objective: The study aimed to analyze and assess the information systems of a university hospital’s physical therapy section and a university department of physical therapy in order to plan their integration following the merger of the two facilities to form an institute for physical therapy at a German medical center. Building on this, a strategic plan for the institute’s information system is proposed. Methods: We used a methodological framework for the strategic planning of information systems in hospitals, extended it with lean management methods and applied it at the organizational unit level. We described the static view of the organizational units’ information systems using the three-layer graph-based metamodel for health information systems (3LGM²) and the dynamic view using Business Process Model and Notation (BPMN). Information sources were interviews with personnel. Results: A strategic management plan for developing the institute’s information system has been proposed. A migration path has been established with 23 tactical projects over the next 3 years to attain the strategic management goals. Conclusions: The method for strategic planning of information systems could successfully be adapted to the organizational unit level and should therefore be applied to other departments in hospitals as well. It helps them identify weaknesses in information logistics through a systematic approach, enabling gradual improvement as part of a long-term plan.

  • Background: Artificial intelligence research in image-guided oncology has grown exponentially, yet how far the field has progressed from diagnostic assistance toward direct therapeutic execution has never been quantified. Existing bibliometric surveys categorize studies by technical architecture or clinical domain, metrics that track publication volume but not proximity to procedural deployment. Objective: We developed a hierarchical functional classification framework to map the global landscape of therapeutic AI development across five major oncological indications. Our two specific objectives were: (1) to classify publications by clinical output function along the diagnostic-to-therapeutic continuum, and (2) to quantify the translation gap using three complementary metrics, triangulated against trial and device registries. Methods: We extracted 29,277 Web of Science publications spanning five image-guided oncologic specialties (thyroid, breast, lung, prostate, and liver) published between January 2010 and April 2026. AI-related records were classified by clinical function using a three-stage protocol: keyword categorization, contextual scoring, and rule-based filtering. Inter-rater reliability, validated on 518 independently coded publications, yielded Cohen's κ of 0.92. Our framework distinguished Diagnosis AI (disease identification) from therapeutic AI, then further stratified therapeutic AI into Bridge-support AI (treatment planning, prognosis, patient selection) and True Treatment AI. True Treatment AI was defined by concurrent satisfaction of two criteria: ≥Level 2 on the Yang Surgical Autonomy Scale and ≥Stage 1 on the IDEAL Framework. Results: Of 16,937 AI-related publications identified, 14,277 (84.3%) were categorized as Diagnosis AI and only 2,660 (15.7%) as therapeutic AI. All therapeutic publications fell exclusively within the Bridge-support tier. 
None satisfied the dual-framework criteria for True Treatment AI, yielding a uniform penetration rate of 0.00% across all five oncological domains. This complete execution vacuum persisted despite an 11-fold variation in inter-domain treatment-to-diagnosis ratios. The finding held under threshold relaxation, sensitivity analyses, and independent triangulation against 3,491 ClinicalTrials.gov records and 1,430 FDA device listings. Conclusions: Each specialty should periodically profile its diagnostic-to-therapeutic translational progress. The uniform absence of True Treatment AI across 15 years and five domains indicates that this gap is structural rather than cumulative, rooted in methodological inheritance from diagnostic paradigms and in regulatory category mismatches. Closing this gap requires coordinated framework development across regulatory, research, and clinical communities, rather than incremental algorithmic improvements.
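
The inter-rater reliability figure reported in the abstract above (Cohen's κ of 0.92 on 518 independently coded publications) rests on a short calculation, sketched below in Python. The function name `cohens_kappa` and the example labels are hypothetical; the sketch shows the standard unweighted two-rater statistic and is not the authors' code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")

    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels: "D" = Diagnosis AI, "B" = Bridge-support AI.
    first = ["D", "D", "B", "B", "D", "D"]
    second = ["D", "D", "B", "D", "D", "D"]
    print(round(cohens_kappa(first, second), 3))
```

Kappa discounts the agreement two raters would reach by chance given how often each uses every label, which is why it is preferred over raw percent agreement when one category (here, Diagnosis AI) dominates.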

  • Background: Large language models (LLMs) are increasingly used by patients seeking medication advice. Their quality for secondary stroke prevention counseling has not been well characterized. Objective: To compare five widely used search-enabled consumer LLM interfaces on patient-facing medication counseling for secondary stroke prevention across fourteen evaluation metrics covering safety, clinical accuracy, information quality, readability, empathy, actionability, and model test-retest stability, operationalized as lexical text stability. Methods: A 56-item English-language question bank was developed from current stroke prevention guidelines and submitted to five consumer LLM interfaces (ChatGPT, Claude, Gemini, DeepSeek, Doubao) via their official web interfaces on May 1, 2026, with repeat querying on May 8, 2026 to assess model test-retest stability. All systems were accessed using a logged-in account with web search enabled via a US-based connection. Responses were independently rated by two blinded raters. Non-parametric tests with Benjamini-Hochberg correction were applied. Results: Clinical accuracy was high and uniform across models (mean 4.44-4.52/5; Friedman p = 0.578). Gemini, DeepSeek, and Doubao scored significantly higher on EQIP (70.2-70.7 vs. 63.6-64.1; p < 0.001) and DISCERN (p < 0.001) than ChatGPT and Claude. All models substantially exceeded commonly used patient-education readability benchmarks (FKGL 11.1-14.4; benchmark <=6; FRES 33.4-46.8; benchmark >=60). ChatGPT had the highest unsafe response rate (14.3% vs. 7.1-10.7%). Conclusions: In this controlled evaluation of researcher-generated questions, the tested search-enabled LLM interfaces produced broadly accurate responses for secondary stroke prevention medication counseling, but weaknesses in readability, source transparency, and safety indicate that readability optimization, source-attribution prompting, and clinical review are needed before patient-facing use.
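
The abstract above reports non-parametric tests with Benjamini-Hochberg correction across many metrics. As a minimal sketch of that correction step only, the Python function below computes Benjamini-Hochberg adjusted p-values; the function name `benjamini_hochberg`, the 0.05 threshold, and the example p-values are assumptions for illustration and do not come from the submission.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, rejection flags) in the input order.

    Implements the step-up Benjamini-Hochberg procedure for controlling
    the false discovery rate at level alpha.
    """
    m = len(p_values)
    if m == 0:
        return [], []

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min

    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


if __name__ == "__main__":
    # Hypothetical raw p-values from four metric comparisons.
    adjusted, rejected = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
    for q, flag in zip(adjusted, rejected):
        print(f"{q:.4f} {'reject' if flag else 'keep'}")
```

With fourteen metrics tested across five interfaces, a correction of this kind keeps the expected share of false positives among the significant findings at or below the chosen level.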

  • Background: Artificial intelligence (AI) is rapidly reshaping healthcare, offering tools to enhance diagnostic accuracy, streamline clinical workflows, and personalize care delivery. However, real-world AI implementation remains limited, hindered by organizational, technical, and sociocultural barriers that implementation science has only begun to address systematically. Objective: This scoping review maps the intersection of AI and implementation science in healthcare, examining the types of AI technologies deployed, their intended use, and the processes by which these tools are implemented into practice. Methods: Following PRISMA-ScR guidelines, we synthesized empirical evidence from 65 studies published between December 2011 and March 2025. Searches were performed across the databases CINAHL, PubMed, PsycINFO, Scopus, and Web of Science using the terms Artificial Intelligence, Healthcare, implementation, and empirical, combined with relevant synonyms. Results: AI implementation research has expanded rapidly, predominantly in high-income countries, raising important questions about global equity. The most common application areas were automation and optimization (40%), computer vision (34%), and human language technologies (20%), primarily targeting clinical care (68%) and health systems management (25%). Most systems were designed for low-action autonomy (62%), emphasizing human-in-the-loop decision-making. Intended users were physicians (43%), nurses (26%), and radiologists (25%), while patients appeared as intended users in only 11% of implementations. Across the 65 studies, 40 barriers and 55 facilitators were identified across five themes: the AI system itself, healthcare professionals, patients, organizational context, and the macro level. Organizational factors and multidisciplinary stakeholder engagement emerged as the most critical enablers of successful adoption. 
Key barriers included insufficient AI performance, lack of transparency and explainability, limited IT infrastructure, and inadequate workflow integration. Patient-level and governance-level barriers, including data privacy and regulatory uncertainty, remained underexplored. Only 20% of studies applied theoretical implementation frameworks, and most analyses were conducted retrospectively. Mapping via the AIGENT framework revealed a disproportionate focus on workflow alignment and outcome evaluation, with comparatively little attention to early-phase activities such as needs assessment, adaptation planning, and stakeholder approvals. Conclusions: The current literature predominantly focuses on implementation evaluation and workflow alignment, while patient perspectives, governance conditions, and early implementation activities are underexplored. The finding that only 20% applied theoretical implementation frameworks, mostly retrospectively, reflects a gap between theory and practice and points to a need to apply them prospectively across the full implementation process. From a practitioner perspective, AI implementation should be seen as a sociotechnical and governance process that requires technical, contextual, and system knowledge, rather than merely a technical deployment.

  • Background: Patients who experience a stroke or transient ischemic attack (TIA) face a substantial risk of future events, making optimal management of risk factors essential for secondary prevention. Digital health interventions have demonstrated promise in enhancing the control of vascular risk factors among individuals with stroke or TIA; however, the relative efficacy of different intervention modalities in achieving risk factor control remains uncertain. Objective: This study systematically assessed and compared the impact of various digital health interventions on the control of risk factors for secondary prevention among patients with stroke or TIA, aiming to determine the most effective intervention approach. Methods: A comprehensive and systematic literature search was performed across PubMed, Cochrane Library, Embase, and Web of Science databases from January 2010 to January 2026. This review included randomized controlled trials (RCTs) evaluating distinct digital health modalities among patients who experienced a stroke or TIA. Systolic blood pressure (SBP) changes served as the primary outcome, whereas changes in diastolic blood pressure (DBP), patient medication adherence, total cholesterol (TC), and low-density lipoprotein cholesterol (LDL-C) constituted the secondary outcomes. Two independent reviewers evaluated the risk of bias using the RoB 2 tool, followed by a Bayesian random-effects network meta-analysis to synthesize both direct and indirect evidence. We ranked the interventions by their surface under the cumulative ranking curve (SUCRA) values and appraised the certainty of evidence through the GRADE approach. The study protocol was registered prospectively in the PROSPERO database (CRD420261367782). Results: A total of 25 RCTs involving 10,752 patients and six types of electronic health technologies were included. 
The results showed that, compared with usual care, combined digital technologies had a more pronounced benefit in reducing SBP (MD: −3.7, 95% CrI: −4.8 to −2.7; SUCRA: 71.95%); telephone follow-up demonstrated better effects on lowering DBP (MD: −2.4, 95% CrI: −3.7 to −1.2; SUCRA: 97.04%) and LDL-C (MD: −0.21, 95% CrI: −0.28 to −0.14; SUCRA: 55.95%). In addition, smartphone applications showed advantages in improving medication adherence and reducing TC (MD: −0.39, 95% CrI: −0.71 to −0.068; SUCRA: 87.93%). Conclusions: Different digital health interventions may provide distinct benefits for secondary prevention after stroke or transient ischemic attack. Combined digital technologies appeared to be more effective for reducing SBP, telephone follow-up for improving DBP and LDL-C, and smartphone applications for enhancing medication adherence and reducing TC. However, due to the limited evidence base and small study sample size, these findings should be interpreted cautiously. Future large-scale, high-quality trials are required to confirm them. Clinical Trial: The study protocol was registered prospectively in the PROSPERO database (CRD420261367782).
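
The SUCRA values quoted in the abstract above summarize, for each intervention, how close its ranking distribution is to "always best". A minimal Python sketch of the standard definition follows; the function name `sucra` and the example rank probabilities are hypothetical, and in a Bayesian network meta-analysis the rank probabilities would come from the posterior samples rather than being typed in by hand.

```python
def sucra(rank_probs):
    """Surface under the cumulative ranking curve for one treatment.

    rank_probs[j] is the probability that the treatment has rank j + 1,
    where rank 1 is best. Returns a value between 0 (always worst) and
    1 (always best).
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("need at least two competing treatments")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")

    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities over the first a - 1 ranks.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)


if __name__ == "__main__":
    # Hypothetical rank probabilities for one of three treatments.
    print(sucra([0.5, 0.5, 0.0]))
```

A SUCRA near 100% (as a percentage) means the intervention is almost certainly among the best for that outcome, while a mid-range value, such as the 55.95% reported for LDL-C, signals much weaker separation from the alternatives.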

  • Background: Despite the substantial burden imposed by premenstrual syndrome (PMS) on women’s quality of life, clinical diagnosis remains dependent on subjective self-assessment. The emergence of wearable technology enables continuous collection of physiological metrics that may serve as objective indicators of PMS symptom severity. Objective: This study evaluated the feasibility of using Fitbit-derived heart rate (HR) and autonomic indices, along with interstitial glucose data obtained from FreeStyle Libre, as objective digital biomarkers for PMS diagnosis and monitoring. Methods: This prospective, longitudinal observational study enrolled 122 women aged 18-60 years in Japan. Physiological data, including HR, HR variability, and interstitial glucose levels, were collected using Fitbit Inspire 3 and FreeStyle Libre devices over 14 weeks. PMS severity was assessed using the Menstrual Distress Questionnaire (MDQ). Results: Participants with severe PMS symptoms exhibited higher autonomic nervous system (ANS) markers such as root mean square of successive differences of RR intervals (RMSSD) and the standard deviation of normal-to-normal intervals and lower sleep HRs during the luteal phase compared with those who had milder symptoms (eg, sleep RMSSD: P=.002; sleep mean HR: P=.007). Furthermore, beginning 3 days before menstruation, participants with severe PMS showed a decline in ANS markers accompanied by an upward trend in HR, whereas those with mild symptoms exhibited the opposite pattern (eg, sleep RMSSD: P=.002; sleep mean HR: P=.005). Conclusions: Sleep ANS markers and HRs serve as objective measures for assessing PMS symptoms. Continuous monitoring using wearable devices offers a promising, noninvasive method for objective PMS diagnosis and personalized health management. Clinical Trial: UMIN Clinical Trials Registry UMIN000051467

  • Background: Chronic obstructive pulmonary disease (COPD) is a major global health challenge, with the number of affected individuals projected to approach approximately 592 million by 2050. Primary healthcare institutions bear substantial responsibility for COPD screening, diagnosis, and follow-up, but often face underdiagnosis, fragmented information systems, and workforce constraints. Although digital health and artificial intelligence (AI) have shown potential in COPD management, workflow-integrated solutions tailored to primary care remain limited. Objective: To describe a designathon-based co-creation process and the subsequent development of an early-stage prototype of an AI-enabled digital workflow for COPD screening and follow-up management in primary care. Methods: This descriptive process and prototype development study followed WHO practical guidance on crowdsourcing and designathons in health research. It comprised three phases: (1) an online open call (July 22 to August 1, 2025) soliciting ideas related to AI-assisted chronic disease management and digitalized follow-up care; (2) a 3-day in-person designathon in Guangzhou involving 23 participants from five stakeholder groups (primary care physicians, implementation science scholars, AI engineers, patient representatives, and chronic disease management specialists) who worked in five interdisciplinary teams using user journey mapping and structured co-creation activities; and (3) a post-designathon translation phase in which co-created deliverables were synthesized into an early-stage WeChat Mini Program prototype named FeiChangShun. Expert rubric scoring was used to assess team deliverables generated during the designathon. Results: The online open call received 26 submissions, 25 of which met eligibility criteria. 
During the designathon, five priority pain points were identified: data silos and interoperability barriers, training–practice disconnect, communication barriers, human resource shortages, and low disease awareness. The five teams generated differentiated workflow concepts and corresponding user journey maps to address these challenges. Drawing on these co-created outputs, the research team developed an early-stage prototype comprising five core modules: voice interaction support, health education support, behavior management support, standardized workflow support, and draft document/report generation. Conclusions: This study reports a structured designathon-based co-creation process and the development of an early-stage, guideline-informed workflow prototype for COPD management in primary care. Future studies should evaluate the prototype with end users and assess implementation feasibility, safety, and clinical impact in real-world settings.

  • Background: Children of parents with a mental illness (COPMI) have the right to receive preventive interventions to avoid developing their own mental health and socioeconomic problems. However, access to interventions within health care and social services varies widely depending on geographical location in Sweden. In this regard, mobile health (mHealth) interventions for COPMI provide a pathway for more equitable and sustainable preventive support. Objective: This project has the following aims: (1) to map and assess the quality of available mHealth apps relevant to COPMI; (2) to understand the existing needs for digital health solutions (particularly mHealth) by consulting and collaborating with an interdisciplinary reference group (scholars, child rights organizations, and IT professionals); and (3) to explore the prerequisites for development, implementation, and sustainable access for future digital solutions for COPMI. Methods: We collaborated with a reference group through a series of meetings and workshops, identifying 10 free, highly ranked apps and assessing their quality with the Mobile App Rating Scale (MARS). All the workshop sessions were recorded and transcribed for qualitative analysis, from which we derived our main findings. Results: Only three of the 10 apps scored high across the MARS dimensions of functionality, aesthetics, information and engagement, indicating a significant lack of high-quality apps relevant for COPMI. The reference group also preferred more general apps that are not specifically targeted to COPMI, as these could promote self-identification and reduce stigmatization. 
Regarding the third aim, the results showed that it was important for an app to protect users’ privacy by allowing anonymous access to digital support, and that mobile apps should be complemented (or replaced) by web-based applications to remain accessible to children who may not be allowed to download apps without parental permission. Conclusions: The sustainability of digital solutions (web or mobile apps) for COPMI is the biggest challenge for future developments. Partnering with providers that are already established in the mental health area is key, extending their services to COPMI, while leveraging an app development infrastructure that already has sustainable processes and business models behind it. To address engagement and deployment issues, it is important to actively involve children through participatory and co-creational approaches when designing and developing mHealth solutions.

  • Background: Traditional neuropsychological assessments for cognitive decline are lengthy in-clinic evaluations by a specialist, with typical wait times of 6-8 months. This creates a substantial patient burden and prolonged diagnostic and treatment timelines. Digital cognitive assessments (DCA) offer a scalable solution to these challenges, but their validation is challenged by the scarcity of large, high-quality datasets with established ground truth. Objective: To develop a model to identify mild cognitive impairment (MCI) and probable dementia using metrics from the Digital Assessment of Cognition (DAC), a brief, remote-capable DCA. A secondary objective was to conduct a preliminary assessment of the model's validity. Methods: We applied a semi-supervised model-based clustering method to combine a large dataset (N=1189) of DAC assessments alone, with a smaller dataset pairing DAC assessments with ground-truth neuropsychological diagnoses (N=248). We examined the model's predictive validity by comparing its predictions with diagnoses on a held-out test set. We examined congruent validity by testing associations with traditional analog assessments and demographic variables. Results: We identified a 6-cluster model with 3 MCI clusters and 2 probable dementia clusters. The model identified cognitively unimpaired, MCI, and dementia groups with high accuracy (78.7%) on the held-out test dataset, and showed excellent ability to identify cognitive impairment (AUROC=0.985) and dementia (AUROC=0.932). We identified strong associations with traditional analog assessments and demographic variables. An exploratory analysis showed evidence that clusters correspond to clinically meaningful subtypes of MCI. Conclusions: These results validate prior exploratory work and demonstrate the potential for more nuanced, holistic, and scalable cognitive assessments in non-specialist settings.

  • Physicians' Job Demands and Job Resources in Digital and Intelligent Healthcare: Scale Development and Validation

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: The rapid development of digital and intelligent medical technologies is profoundly reshaping the clinical work patterns of physicians, introducing new job demands and job resources into their clinical practice. However, existing measurement instruments have not captured these specific changes and novel challenges posed by such technologies, resulting in a lack of corresponding assessment tools, which limits in-depth quantitative research in this field. Objective: This study aims to develop and validate the Job Demands Scale (JDS) and Job Resources Scale (JRS) for physicians suitable for digital and intelligent healthcare scenarios. Methods: Building upon the foundation of prior qualitative interviews and literature review, the dimensions of the scales and a corresponding pool of measurement items were constructed. The scales underwent content revision through two rounds of Delphi expert consultation (N=18) and cognitive interviews (N=6). Subsequently, an online questionnaire survey was conducted with 1,016 clinicians using convenience sampling. The psychometric properties of the scales were evaluated through item analysis, exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and reliability testing. Results: The finalized JDS comprises 22 items across six dimensions: Human-Machine Interaction Burden, Technology Output Risk, Information Security Burden, Occupational Substitution Risk, Doctor-Patient Communication Burden, and Technology Dependence Risk. The JRS consists of 23 items, also organized into six dimensions: Decision-Making Support, Risk Prevention Support, Workload Reduction Tools, Doctor-Patient Collaborative Platform, Precision Efficiency Support, and Clinical Competence Support. EFA indicated that the six factors of the JDS cumulatively explained 71.10% of the variance, and the six factors of the JRS cumulatively explained 58.98% of the variance. CFA demonstrated good model fit for both scales. 
For the JDS, the composite reliability (CR) values for the dimensions ranged from 0.758 to 0.869, and the average variance extracted (AVE) values ranged from 0.441 to 0.687. For the JRS, the CR values ranged from 0.640 to 0.792; however, the AVE values were relatively low, ranging from 0.339 to 0.490. The overall Cronbach's α coefficients for the JDS and JRS were 0.944 and 0.923, respectively. These results demonstrate that both scales possess good preliminary reliability and validity. However, their dimensional structure and discriminant validity still require further optimization. Conclusions: The JDS and JRS developed in this study exhibit good psychometric properties and hold strong potential for effectively evaluating physicians' job demands and job resources within the context of digital-intelligent healthcare. This provides a scientific basis for subsequent related research and clinical management practices. It is noteworthy that the high correlations observed between the dimensions of both scales suggest that future analyses on the impact of job demands and job resources on physicians' work in digital-intelligent healthcare settings should pay attention to the synergistic effects of these elements. Furthermore, the content of the scales should be continuously updated alongside the advancement of digital-intelligent medical technologies.
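
As an illustration of the composite reliability (CR) and average variance extracted (AVE) statistics reported above, the following is a minimal sketch using the commonly cited Fornell-Larcker formulas for standardized factor loadings. The loadings in the example are hypothetical and are not taken from the study, and the submission does not state which software or exact formulas the authors used.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized loadings, so each item's error variance is 1 - loading^2.
    """
    total = sum(loadings)
    error_variance = sum(1.0 - l * l for l in loadings)
    return (total * total) / (total * total + error_variance)


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


if __name__ == "__main__":
    # Hypothetical loadings for one four-item dimension (illustration only).
    example_loadings = [0.7, 0.7, 0.7, 0.7]
    print(f"CR  = {composite_reliability(example_loadings):.3f}")
    print(f"AVE = {average_variance_extracted(example_loadings):.3f}")
```

With these formulas, a dimension can reach an acceptable CR while its AVE stays below the conventional 0.50 threshold, which is the pattern the abstract describes for the JRS.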

  • Smart Hospitals and Digital Health Powered by 5G and 6G Networks: A Scoping Review

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Global health systems face increasing pressure due to population aging and recurrent pandemics, requiring a transition from Health 4.0 to Health 5.0. Although 4G technologies initiated the era of remote monitoring, their limitations in bandwidth and latency hinder critical real-time applications. Fifth-generation (5G) and sixth-generation (6G) networks, integrated with artificial intelligence (AI) and the Internet of Medical Things (IoMT), have emerged as key enablers of ultra-low-latency and highly reliable services in smart healthcare ecosystems. Objective: This scoping review aimed to map and synthesize scientific evidence on the use of 5G and 6G network technologies in healthcare, particularly in smart hospitals and eHealth services, and to identify related opportunities, challenges, and research gaps. Methods: We conducted a scoping review following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) and the Arksey and O’Malley and Joanna Briggs Institute (JBI) frameworks. The research question was structured using the Population, Concept, Context (PCC) mnemonic. A systematic search was performed in Google Scholar using a comprehensive search string targeting 5G/6G, eHealth, smart hospitals, digital health, and telemedicine. Eligibility criteria included studies in English that explicitly addressed 5G or 6G infrastructure, architecture, or applications in healthcare contexts. Screening and data extraction were performed iteratively by reviewers, and studies were categorized according to implementation maturity, architectural advances, and security requirements. Results: Most identified studies were theoretical proposals (about 57%) or feasibility analyses, with a smaller proportion of real-world implementations. Practical evidence suggests that 5G can reduce emergency response times by up to 30% and enable in-transit imaging-based diagnosis, supporting the transformation of ambulances into advanced triage units. 
However, field tests report real-world 5G latency of approximately 10 ms, which is above the theoretical target of <1 ms and constrains latency-critical applications such as holographic telesurgery. Across studies, security and privacy—particularly for contextual and IoMT sensors—emerged as critical challenges, together with interoperability with legacy systems and the high cost of infrastructure. Conclusions: Advanced connectivity networks, particularly 5G and future 6G infrastructures, are positioned as foundational components for smart hospitals and digital health, supporting the transition from Health 4.0 to Health 5.0. Nonetheless, the evidence base is still dominated by conceptual works, and the full potential of these technologies is limited by technical, organizational, and economic barriers. Future work should prioritize explainable AI, end-to-end security, and sustainable business models to ensure safe, equitable, and clinically meaningful adoption of 5G/6G-enabled smart healthcare. Clinical Trial: This does not apply as it is a survey.

  • Evaluating a Web-Based Intervention for Digital Health Measurement: A Mixed-Methods Study

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Despite its potential to address key challenges in primary health care, digital health measurement faces substantial implementation barriers for health care professionals. To address these barriers, professionals from 4 disciplines (physical therapy, occupational therapy, speech and language therapy, and general practitioner practice assistance) collaborated with researchers to develop an intervention. The intervention comprised a website supported by coaching on the job as a temporary implementation strategy during development. Objective: This study explored whether and how the intervention facilitates optimized use of digital health measurement in patient care to inform further intervention refinement. Methods: A mixed-methods formative process evaluation was conducted using a predominantly qualitative approach. Eighteen health care professionals tested the intervention in daily practice. Data collection was guided by the Medical Research Council framework, a predefined process evaluation plan, and the intervention’s initial program theory. Quantitative data (questionnaires, 7-point Global Perceived Effect measures, and monitoring lists) informed semi-structured interviews and focus groups. Data were analyzed using descriptive statistics and directed content analysis. Results: The intervention was largely implemented as intended and improved digital health measurement in patient care by enhancing participants’ capability, opportunity, and motivation. Consistent with the initial program theory, these changes triggered implementation activity at the organizational level, strengthening implementation readiness through bottom-up change processes. Intervention strategies included collaborative learning, modelling, and prompting action. These strategies operated through mechanisms such as experiential learning, in which professionals experienced the benefits and feasibility of digital health measurement, reinforcing motivation for its continued use. 
During intervention use, additional processes emerged, including champions facilitating organizational-level adoption of digital measurement by sharing knowledge and enthusiasm with colleagues. Coaching particularly supported initial intervention engagement by contextualizing generic information, stimulating interaction, and prompting action. Individual, organizational, instrumental, temporal, policy, and societal factors interacted with intervention components, strategies, and mechanisms to facilitate or constrain outcomes. Conclusions: Future refinement should strengthen key mechanisms and processes, integrate mechanisms previously supported by coaching, and develop scalable implementation strategies. As no single approach will fit all contexts, practices should tailor implementation to their local needs. The intervention’s generic framework and flexible use of core components support local adaptations.

  • Chatbot-Based Psychiatric Medication Counseling in Outpatients With Schizophrenia: Pre-Post Study

    Date Submitted: May 26, 2026
    Open Peer Review Period: May 28, 2026 - Jul 23, 2026

    Background: Chatbot-based interventions have shown promise in common mental health conditions such as depression and anxiety. However, their application in schizophrenia (SZ), particularly for psychiatric medication counseling, remains extremely limited. Objective: This study aimed to investigate the effects of a rule-based psychiatric medication counseling chatbot on clinical and patient-reported outcomes in patients with SZ. Methods: A total of 31 outpatients with SZ participated in a single-group pre–post study. Participants used a rule-based chatbot via a mobile app for 3 months. The chatbot provided structured guidance on antipsychotic medications, including side effects, management strategies, medication use, expected therapeutic effects and duration of medication. Primary outcomes included medication adherence (Adherence Rating Scale [ARS]), subjective well-being (Subjective Well-being under Neuroleptic Treatment Scale [SWN]), and side effects (Udvalg for Kliniske Undersøgelser Side Effect Rating Scale [UKU]). Secondary outcomes included psychopathology (PANSS), functioning (SOFAS), and insight (SUMD-K). Results: A total of 31 participants (mean age 33.91 years, SD 11.60; 20 males) completed the study. Medication adherence (ARS) showed a trend-level increase (4.77 vs 4.94, t=2.02, p=.057) but did not reach statistical significance. Among SWN subdomains, self-control improved significantly (mean difference 1.48, 95% CI 0.03 to 2.94, p=.045). UKU total severity scores decreased significantly (19.48 vs 14.52, mean difference −4.96, 95% CI −8.53 to −1.40, p=.008), driven by reductions in psychic (mean difference −1.97, 95% CI −3.52 to −0.41, p=.015) and miscellaneous symptom domains (mean difference −1.58, 95% CI −3.06 to −0.10, p=.037). Among secondary outcomes, PANSS positive symptoms decreased significantly (11.39 vs 10.35, mean difference −1.04, 95% CI −1.90 to −0.17, p=.021), whereas functioning (SOFAS) and insight (SUMD-K) did not change significantly. 
Older age (β=0.157, p=.017) and living with family members (β=4.304, p=.047) were associated with improvements in physical functioning, and greater chatbot use was associated with improvements in social integration (β=0.150, p=.029) and socio-occupational functioning (β=0.082, p=.024). Conclusions: A rule-based psychiatric medication counseling chatbot was associated with modest but significant improvements in subjective well-being and perceived side effect burden in patients with SZ, while its impact on medication adherence and broader clinical outcomes was limited. These findings suggest that chatbot-based interventions may serve as a useful adjunctive tool in SZ care, particularly for addressing medication-related concerns. Clinical Trial: Clinical Research Information Service (CRIS) KCT0011949; https://cris.nih.go.kr (registration number: KCT0011949)

  • Background: Clinical reasoning competency development is central to veterinary education. Generative artificial intelligence (GenAI) opens new possibilities for supporting students in acquiring these competencies, yet its effectiveness as a reasoning support tool in case-based learning (CBL) remains unclear. Objective: This study examined whether a commercially available GenAI chatbot could support veterinary students in CBL and evaluated its potential for clinical reasoning training. Methods: Following systematic evaluation, Microsoft Copilot was selected for its accessibility, functionality, and data protection compliance, and students were provided with a user-oriented manual including prompt instructions. In an interventional crossover study involving 60 fourth-year veterinary students at a Swiss university, participants alternated between AI-supported and traditional CBL across four clinical cases. Clinical reasoning outcomes were assessed by a dedicated lecturer per case using 34 scored items, complemented by student surveys and lecturer reflections. Results: Clinical reasoning outcomes showed no meaningful evidence of a difference between AI-supported and traditional CBL groups (W = 6719, p = 0.464), with results varying across cases. Post-class surveys (n = 38) indicated that most students viewed GenAI support positively: 68% agreed the AI provided relevant inputs they had not previously considered, 58% perceived reduced task difficulty, and 61% found the AI-generated starting point effective. However, 45% also reported negative effects on case understanding and dissatisfaction with the overall learning experience. Qualitative feedback highlighted benefits such as information retrieval and stimulation of reflection, alongside limitations related to superficial or inaccurate AI outputs. Conclusions: These findings indicate that AI integration alone is insufficient to enhance clinical reasoning in case-based learning. 
Without sufficient AI literacy on top of developing clinical competencies, the cognitive demands of verifying AI-generated outputs may offset potential benefits in complex reasoning tasks. Tailoring AI integration to learner experience, scaffolding, and prior AI exposure appears relevant to realizing GenAI's potential for clinical reasoning development in veterinary education.

  • Background: Digital health technologies (DHTs) for psychosis may help address the substantial gap in access to psychological services, yet prior syntheses are limited by heterogeneous designs and populations. Objective: This systematic review and meta-analysis aimed to synthesize evidence from randomized controlled trials (RCTs) to estimate the relative effectiveness of DHTs in individuals with confirmed psychotic disorders. Methods: Web of Science, PubMed, Embase, Scopus, PsycINFO, and CENTRAL were searched from inception to January 2026. Eligible studies were RCTs enrolling adults with psychotic disorders that evaluated DHT-delivered psychological interventions targeting psychotic symptoms. Comparators included passive and active controls. Primary outcomes were positive, negative, and overall symptoms. Secondary outcomes included depression, anxiety, functioning, quality of life, dropout, and adverse events. Results: Forty-one RCTs (N = 4139) were included. Compared with passive controls, DHTs showed small to moderate significant reductions in positive (g = -0.18, 95% CI: -0.33 to -0.03; I² = 60%), negative (g = -0.32, 95% CI: -0.56 to -0.07; I² = 63%), and overall symptoms (g = -0.41, 95% CI: -0.71 to -0.10; I² = 78%) at posttreatment, with effects for positive symptoms also at follow-up. No significant effects were observed when compared with active controls. Subgroup analyses indicated significant effects for delusions but not auditory hallucinations, and stronger effects for therapist-supported than for fully automated interventions. Secondary outcomes showed small improvements at both posttreatment and follow-up in depression, anxiety, and general functioning, but not in quality of life. Heterogeneity was moderate to high in some of the analyses. Dropout rates were comparable across groups, with no consistent pattern of serious adverse events identified, although safety reporting was inconsistent. 
Conclusions: DHTs represent a promising approach, with outcomes that appear broadly comparable to face-to-face interventions, while offering potential advantages in accessibility, scalability, and flexibility. Further high-quality RCTs with active comparators and standardized safety monitoring are needed. Clinical Trial: CRD42021251108
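
For readers less familiar with the effect size reported above, the following is a minimal sketch of how a single-study Hedges g is commonly computed from group summary statistics, using the usual small-sample correction factor J = 1 - 3/(4df - 1). The input numbers are hypothetical and are not taken from any included trial; the pooling model used in the review is not described in this abstract and is not reproduced here.

```python
import math


def hedges_g(mean_tx: float, mean_ctrl: float,
             sd_tx: float, sd_ctrl: float,
             n_tx: int, n_ctrl: int) -> float:
    """Standardized mean difference with Hedges' small-sample correction.

    Negative values mean the treatment group scored lower (for symptom
    scales, fewer symptoms) than the control group.
    """
    df = n_tx + n_ctrl - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(
        ((n_tx - 1) * sd_tx ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    cohen_d = (mean_tx - mean_ctrl) / pooled_sd
    # Common approximation to the small-sample correction factor J.
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return cohen_d * correction


if __name__ == "__main__":
    # Hypothetical posttreatment symptom scores (illustration only).
    g = hedges_g(mean_tx=10.0, mean_ctrl=12.0,
                 sd_tx=4.0, sd_ctrl=4.0, n_tx=20, n_ctrl=20)
    print(f"Hedges g = {g:.3f}")
```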

  • Refined Exclusion in Medical AI: Reframing Algorithmic Fairness as Data Justice and Patient Safety Governance

    Date Submitted: May 25, 2026
    Open Peer Review Period: May 27, 2026 - Jul 22, 2026

    Medical artificial intelligence (AI) systems are often evaluated through aggregate performance metrics and output-level fairness measures. However, clinically meaningful harms may remain hidden when systems perform well on average while underperforming for data-poor, underrepresented, or structurally marginalized populations. This Viewpoint uses the concept of refined exclusion to synthesize a recurring pattern in medical AI: systems may appear technically successful at the population level while transferring uncertainty, misclassification, delayed recognition, or reduced clinical reliability to groups that are less visible within training data, validation cohorts, proxy definitions, and deployment workflows. Drawing on representative cases from population health management, chest radiograph AI, dermatology, computational pathology, and foundation model applications, we argue that refined exclusion should not be treated merely as algorithmic bias or a defect of model outputs. Rather, it reflects a data governance failure with direct implications for patient safety. Moving beyond output-centered algorithmic fairness, we propose data justice as a governance foundation for medical AI, organized across distributional, procedural, and substantive dimensions. We further outline operational checkpoints across the medical AI lifecycle, including subgroup learnability assessment, data provenance documentation, local validation, procurement-stage accountability, explainability-based proxy audits, post-deployment subgroup monitoring, and patient participation. Reframing refined exclusion as a patient safety problem shifts the central governance question from “Is this model accurate on average?” to “For whom is this system safe, reliable, and clinically accountable?”

  • Background: Patient-reported outcome measure (PROM) completion is hindered by patient-level barriers (including motor, sensory, cognitive, and motivational constraints) that risk insufficient participation and non-response bias. While technology-enabled approaches such as multimodal speech assistance hold promise for reducing these barriers, assistance is a complex interaction: it can both alleviate and introduce barriers depending on how well it aligns with patients’ routines and needs. Objective: This qualitative study explores how patients perceive the advantages and disadvantages of AI-based speech assistance for PROM collection, focusing on how assistance functionalities interact with individual barriers and completion practices. Methods: We conducted semi-structured qualitative interviews with 96 psychosomatic and neurological rehabilitation outpatients, embedded in a pragmatic cross-randomised controlled trial. Participants completed PROMs with and without an AI-based speech assistance system offering speech output, speech input, and guidance by a socially interactive agent (SIA) that was physically, virtually, or voice-only embodied. The system was iteratively refined during data collection to address usability and performance issues. We included a broad sample to reflect real-world care settings, including patients without reported barriers. Using inductive content analysis (61 codes, grouped into 4 overarching and 9 subthemes), we examined perceived advantages and disadvantages of the three main assistance functionalities and multimodal interaction. Reporting followed the COREQ guideline. Results: The speech output function emerged as the most widely valued assistance feature, with many patients reporting improved concentration, question comprehension, and deeper engagement with item content. The social agent was described as making the interaction more engaging and less monotonous, while not evoking social pressure. 
Speech input was perceived as helpful by some, especially those with motor impairments or a preference for verbal expression. However, each function also introduced challenges: speech output disrupted reading routines for some, the social agent was perceived as distracting or unnecessary by others, and speech input was criticised for recognition errors, inefficiency, and privacy concerns. Conclusions: AI-based speech assistance for PROM collection offers significant potential to reduce barriers and enhance patient engagement, but its effectiveness depends on alignment with individual needs, preferences, and routines. While speech output proved broadly beneficial, speech input and socially interactive agents require careful design to avoid introducing new barriers, particularly for marginalised groups. Configurable, modular assistance systems that adapt to diverse user preferences and impairments are essential for equitable implementation. Future research should focus on inclusive co-design and longitudinal studies to refine these technologies for real-world clinical use. Clinical Trial: German Clinical Trials Register ID: DRKS00035213

  • Background: Adolescent depression is clinically heterogeneous, and the presence of mixed features – defined as subthreshold manic symptoms co-occurring with a depressive episode – complicates diagnosis and treatment. Intensive longitudinal monitoring using wrist-worn actigraphy and daily ecological momentary assessment (EMA) may capture behavioral and experiential signatures that differentiate depression with mixed features (Mixed-Dep) from depression without mixed features (NoMix-Dep), but evidence in adolescents remains limited. Objective: This study aimed to examine whether multimodal digital monitoring using wrist-worn actigraphy and daily ecological momentary assessment of mood and energy can distinguish adolescents with depression with mixed features from those with depression without mixed features, and to identify dynamic energy–activity patterns specific to mixed depression. Methods: Ninety-eight adolescents (ages 12–18; 37 Mixed-Dep, 31 NoMix-Dep, 30 healthy controls) from the longitudinal Mood & Brain Circuitry in Adolescence (MBA) study wore wrist-worn actigraphy devices and completed daily mood and energy self-reports using the Mood and Energy Thermometer (MET) over two weeks. Group classification was defined based on the K-SADS-PL Mania Rating Scale. Dynamic within-person associations among mood, energy, and activity were estimated using generalized estimating equations with a first-order autoregressive working correlation structure, controlling for sleep duration, age, sex, and weekday/weekend status. Results: Both depressed groups showed lower overall activity and greater minimum activity suppression compared to healthy controls (mean activity: F = 32.67, p < 0.001), with NoMix-Dep showing lower minimum activity than Mixed-Dep (Min2: F = 17.91, p < 0.001; Min4: F = 23.37, p < 0.001). 
Mixed-Dep participants had significantly higher positive and negative energy scores (EnergyPosMax: F = 10.12, p < 0.001; EnergyNegMax: F = 91.93, p < 0.001), shorter wake after sleep onset (F = 3.67, p = 0.03), and higher sleep efficiency (F = 7.03, p < 0.01) than NoMix-Dep. Mood scores did not differ between depressed groups. Energy–mood associations were largely similar across groups. Energy–activity temporal coupling differed markedly: NoMix-Dep showed same-day congruent coupling (high energy predicted high activity), while Mixed-Dep showed an inverted lagged pattern (high energy today predicted lower activity tomorrow). Similar group-differential patterns were observed for mood–activity associations. Conclusions: An inverted, lagged energy–activity coupling represents a novel digital phenotype distinguishing mixed from non-mixed adolescent depression. Energy dysregulation, more than mood, differentiates the two depressed subgroups, with implications for scalable EMA-based screening and earlier identification of mixed features in clinical settings.

  • Background: Artificial intelligence (AI) is increasingly integrated into prostate cancer diagnostics, with the potential to improve accuracy and efficiency. However, it also raises important questions about the conditions and barriers that may influence its successful implementation in this clinical context. Objective: This study examined how patients and healthcare professionals perceive the integration of AI in prostate cancer diagnostics, with particular attention to its impact on clinical relationships and the roles of patients and physicians. Methods: A sequential explanatory mixed-methods design was used. Quantitative data were collected through an online questionnaire administered to patients with localized prostate cancer (N=51). Descriptive analyses focused on perceived benefits, willingness to use AI, and associated concerns. Qualitative data were collected through focus groups and semi-structured interviews with patients (n=16) and physicians (n=11). Data were analyzed using iterative, inductive thematic analysis. Results: Quantitative findings showed that despite recognizing the potential benefits of AI, patients remained divided regarding the use of such tools in their own care. Qualitative findings suggest that this hesitation cannot be explained solely in terms of perceived performance or utility. Rather than simply reducing complexity in clinical decision-making, AI appeared to reconfigure the certainties on which trust within the patient–physician relationship is established. This reconfiguration was reflected across epistemic, ethical, and role-related dimensions. Patients emphasized difficulties in understanding AI-generated knowledge, whereas clinicians focused on issues of reliability, validation, and clinical relevance. Ethical concerns centered on responsibility, which was consistently attributed to physicians, while errors made by AI were perceived as less acceptable than those made by clinicians. 
Role-related uncertainties were reflected in ambivalent patient positions: while some participants sought more information to remain involved in decision-making, others preferred to rely on physicians, reflecting variation in how patients engage with complex clinical information. AI was generally viewed as a supportive tool rather than a replacement for clinical judgement, while its integration was associated with evolving professional roles, including increased demands for interpretation, communication, and oversight. Conclusions: The integration of AI in prostate cancer diagnostics is shaped not only by its technical performance, but by how it reconfigures trust within the patient–physician relationship. Rather than eliminating uncertainty, AI redistributes it across knowledge, responsibility, and social roles. Ensuring that AI contributes positively to clinical practice therefore requires careful attention to clinician oversight, communication, and the relational context in which decisions are made. Clinical Trial: NCT07074405 (ClinicalTrials.gov)

  • Background: The high prevalence of sedentary lifestyles and non‑communicable diseases in Malaysia calls for scalable physical activity interventions. Hence, in this study, we leveraged the potential benefits of social media, particularly Instagram, for exercise promotion. Objective: This pilot study examined the acceptability, observed changes, and predictors of improvement associated with an Instagram‑based exercise promotion intervention among sedentary adults in Klang Valley, Malaysia. Methods: A total of 56 sedentary adults (34 females, 22 males) were recruited; 50 completed the 12‑week intervention (mean sedentary behaviour 7.30±2.75 hours/day; retention rate 89.3%). Participants joined a private Instagram page delivering cardiorespiratory‑focused exercise content every two days. Pre‑ and post‑intervention assessments included anthropometry, body composition (InBody 370), 6‑Minute Walk Test (6MWT), and Client Satisfaction Questionnaire‑8 (CSQ‑8). Results: Significant pre‑post changes were observed in body weight (mean change -2.05±2.88 kg, P<.001), BMI (-0.83±1.11 kg/m², P<.001), body fat percentage (-2.23±1.91%, P<.001), and 6MWT distance (67.82±40.81 m, P<.001). The mean total CSQ‑8 score was 27.02±4.91 (out of 32), indicating high satisfaction. Baseline body fat percentage, baseline 6MWT distance, and gender were associated with the degree of functional change (R²=0.71). Conclusions: This pilot study suggests that an Instagram‑based intervention is acceptable and may be associated with positive health changes among sedentary adults. These findings support the need for a definitive randomised controlled trial in the future.

  • Background: Large language models (LLMs) have shown considerable potential in intelligent healthcare consultation. However, their application in Traditional Chinese Medicine (TCM) gynecology remains limited by semantic gaps between colloquial patient descriptions and professional TCM reasoning, as well as risks of hallucinated medical content. Objective: We proposed MAGR-TCM, a knowledge graph-powered multi-agent retrieval-augmented generation framework for home-based TCM consultation and preliminary risk assessment. Methods: A domain-specific knowledge graph containing 10,231 entities and 32,051 relationships was constructed from 741 curated clinical case records. The framework integrates four specialized agents for question analysis, risk routing, graph reasoning, and response evaluation. Model performance was evaluated using the RAGAS framework and a double-blind expert assessment on 60 independent cases, including a safety stress-test with 10 emergency "Red Flag" scenarios. Results: MAGR-TCM achieved the best overall performance among baseline models, with an average RAGAS score of 0.900 and a consultation professionalism score of 0.904. The proposed framework demonstrated strong factual consistency (Faithfulness: 0.821) and comprehensive diagnostic accuracy (0.952), approaching the performance of human experts. In safety stress testing, MAGR-TCM achieved 100% emergency identification accuracy and the lowest unsafe recommendation rate (0.240) among all evaluated AI systems. Conclusions: The proposed MAGR-TCM framework demonstrates the potential of integrating knowledge graphs and multi-agent reasoning to support interpretable and safety-aware TCM consultation. The system serves as a reliable methodological prototype for intelligent home-based health management and preliminary risk assessment.

  • Background: Other infectious diarrhea (OID) remains an important public health concern in China because of its high incidence, marked seasonality, and substantial burden, particularly among children. Accurate short-term forecasting and early warning are important for timely public health response. However, previous OID forecasting studies have mainly relied on reported case data, and the added value of multisource indicators remains insufficiently evaluated. Objective: This study aimed to develop and evaluate a multisource CNN-BiLSTM-SE Attention model for short-term forecasting and early warning of reported other infectious diarrhea cases in Chongqing, China. Methods: Daily OID case counts in Chongqing from January 2015 to June 2025 were collected, together with meteorological variables and Baidu search indices related to infectious diarrhea. After data normalization, Pearson correlation analysis and random forest variable-importance analysis were used for predictor selection. A CNN-BiLSTM-SE Attention hybrid model was developed to integrate multisource data, extract local temporal patterns, model temporal dependencies, and recalibrate informative feature channels. Forecasting performance was evaluated using RMSE, MAE, MAPE, and R², and compared across different input settings and benchmark models. In addition, 5-day-ahead predictions were converted into binary warning signals using training-set 75th and 90th percentile thresholds, and compared with a persistence baseline. Results: Under the full-input setting, the CNN-BiLSTM-SE Attention model achieved the best predictive performance, with an R² of 0.7828, RMSE of 35.418, MAE of 25.411, and MAPE of 17.27%. Compared with the case-only model, R² increased by 0.0326, while RMSE and MAE decreased by 2.560 and 1.643, respectively. The proposed model also outperformed random forest, XGBoost, CNN, and LSTM. 
In the threshold-based early-warning evaluation, the full-input model showed better overall warning performance than the persistence baseline at both the 75th and 90th percentile thresholds. Conclusions: The CNN-BiLSTM-SE Attention hybrid model improved short-term forecasting of reported OID case counts in Chongqing. Integrating epidemiological, meteorological, and internet search data provided complementary information, suggesting potential utility for OID surveillance, forecasting, and early warning.
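
The evaluation steps named in this abstract (RMSE, MAE, MAPE, and R², followed by conversion of forecasts into binary warnings using a training-set percentile threshold) can be illustrated with a minimal sketch. The function names, the linear-interpolation percentile, and the strict "greater than threshold" warning rule are assumptions for illustration; the abstract does not state how the authors implemented these steps.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for two paired series."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined where the actual value is zero; this sketch skips those points.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(abs(e) / abs(a) for a, e in nonzero) / len(nonzero)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


def percentile(values, q):
    """Percentile (q in 0..100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def warning_signals(train_counts, forecasts, q=75):
    """Flag each forecast that exceeds the q-th percentile of the training counts.

    Assumption: a warning is raised only when the forecast is strictly above the threshold.
    """
    threshold = percentile(train_counts, q)
    return [f > threshold for f in forecasts]


if __name__ == "__main__":
    # Toy numbers only; these are not data from the study.
    metrics = forecast_metrics([10, 20, 30, 40], [12, 18, 33, 37])
    print(metrics)
    print(warning_signals([10, 20, 30, 40, 50], [35, 41, 40], q=75))
```

Deriving the threshold from the training set only, as the abstract describes, keeps the warning rule from being tuned on the data used to evaluate it.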

  • Background: Medication nonadherence remains a major global health challenge, contributing to preventable disease, hospitalizations, and healthcare costs. Mobile health (mHealth) applications incorporating gamification and financial incentives have shown potential to improve adherence; however, most research has focused on patient perspectives, with limited understanding of how non-patient stakeholders perceive their feasibility, risks, and implementation. Understanding non-patient stakeholder perspectives in relation to patient viewpoints is essential for informing future policy development and establishing practical, industry-supported safeguards that protect consumers while enabling innovation. Objective: This study aimed to explore non-patient stakeholder perspectives on the use of gamification and financial incentives in mHealth apps for medication adherence and to integrate these with previously reported patient perspectives to inform consensus-based design and policy considerations. Methods: A mixed-methods study was conducted using a modified virtual Nominal Group Technique (vNGT). Non-patient stakeholders across healthcare, industry, and policy sectors in Australia were recruited. Data collection involved a pre-session survey followed by online focus groups. Qualitative responses were analyzed using thematic analysis supported by AI-assisted coding. Consensus statements derived from themes were rated during the focus groups. Additional prompts were used to elicit further discussion where consensus was not immediately achieved. Results: A total of 20 participants were included in the study. Six key themes were identified: tailored gamification for adherence, financial incentives as a contested motivator, designing for diversity and inclusion, usability barriers to engagement, trust through data governance, and validated and sustainable innovation. These informed 24 consensus statements, of which 54% (13/24) achieved unanimous agreement. 
Stakeholders strongly endorsed personalization, simplicity, and transparent data practices, while expressing nuanced concerns regarding the ethical use, sustainability, and potential unintended consequences of financial incentives. Compared with prior patient findings, the participants demonstrated substantial alignment on core design principles but contributed additional system-level considerations related to feasibility, scalability, and regulation. Conclusions: Non-patient stakeholders largely reinforce patient priorities while extending them with critical perspectives on implementation, governance, and sustainability. Gamification and financial incentives are viewed as potentially effective but require careful, ethically grounded design to balance engagement with long-term motivation and trust. These findings support the development of stakeholder-informed guidelines for responsible mHealth innovation and highlight the importance of integrating patient and system-level perspectives in digital health design. Future research should prioritize co-designed longitudinal studies utilizing apps with gamification and a range of incentive offers with clear redemption processes to evaluate the long-term impact on medication adherence across diverse patient populations.

  • Background: The platform-based economy has expanded rapidly through the integration of digital platforms into sectors such as transportation, delivery, and freelance work. Platform labor combines features of precarious employment and digitalized work organization, encompassing both location-based and web-based work. However, the occupational health implications of platform work remain insufficiently understood, particularly regarding how risks differ across platform worker groups. Objective: This study aimed to explore how platform workers experience their working conditions and how platform work affects their health, wellbeing, and safety. Methods: A participatory photovoice study was conducted with platform-based taxi drivers, delivery couriers, and freelancers living in Stockholm. Between September and November 2022, 16 participants were recruited into three groups (5–6 participants per group). Across five sessions, participants documented their working lives through photographs and discussed them collectively, generating 105 photographs in total. Data were analyzed collaboratively to identify key themes and recommendations related to working conditions, health, and wellbeing. Results: Participants identified 14 themes representing major determinants of health, wellbeing, and safety at work, as well as 23 recommendations for improving working conditions. Workers reported exposure to both platform-specific risks, including algorithmic management and digital surveillance, and traditional occupational risks such as psychosocial strain, ergonomic challenges, and traffic-related hazards. Experiences differed substantially across platform work types. Delivery and taxi drivers reported greater exposure to physical and traffic-related risks, whereas freelancers emphasized psychosocial demands and digital work intensification. Economic insecurity and costs associated with maintaining work equipment emerged as common challenges across all groups. 
Attitudes toward flexibility, autonomy, and algorithmic management also varied between worker categories. Conclusions: This study highlights important similarities and differences in working conditions and health risks across platform work types. The findings suggest that research and occupational health interventions targeting platform workers should differentiate between specific forms of platform labor to better capture the diversity of workers’ experiences and exposures.

  • Background: Co-creation is increasingly used in health research, public health, and participatory initiatives to support inclusive, collaborative, and evidence-informed problem-solving. However, the integration of digital technologies into co-creation processes remains fragmented and largely ad hoc, with limited frameworks available to guide technology selection, evaluation, and development. Objective: This study aimed to develop the Co-Tech Taxonomy, an empirically grounded evaluative framework for assessing digital technologies used in co-creation and participatory digital health ecosystems. Methods: Using the Nickerson–Varshney–Muntermann (NVM) taxonomy-building method, the taxonomy was developed through the analysis of six foundational conceptual and empirical frameworks related to co-creation, participatory processes, and digital technologies. The taxonomy was subsequently refined through iterative empirical classification of 84 technologies used in co-creation contexts. Results: The final taxonomy consists of seven functional dimensions: governance, inclusivity, methodology, collaboration, engagement, data management, and cognitive support. Each dimension is operationalised across three progressive levels of co-creation alignment. The empirical mapping revealed that current digital ecosystems remain insufficiently aligned with participatory collaboration requirements, particularly regarding governance, inclusivity, and AI-supported cognitive facilitation. While communication and data-management functionalities were comparatively mature, participatory governance, collaborative decision-making, and AI explainability remained underdeveloped across most evaluated technologies. The taxonomy also enabled the development of a three-tier indicative certification model to support technology assessment and implementation. 
Conclusions: The Co-Tech Taxonomy provides a structured evaluative framework for assessing existing technologies, identifying implementation and innovation gaps, and guiding the development of more inclusive, transparent, interoperable, and AI-ready participatory digital infrastructures. The framework offers a practical foundation for strengthening digitally supported co-creation and participatory collaboration within health-related contexts.

  • Background: Generative artificial intelligence (GenAI) is increasingly used to produce patient-friendly clinical documentation, yet evaluation of these outputs remains inconsistent and difficult to scale. Patient-friendliness is commonly reduced to narrow readability metrics, such as Flesch-Kincaid grade level, without accounting for clinical accuracy, completeness, or the patient perspective. No standardized framework exists to evaluate the quality and safety of AI-generated patient-friendly documentation across document types or the full documentation lifecycle. Objective: To develop and preliminarily validate CLEAR (Clinical Language Evaluation and AI Documentation Review), a theoretically grounded evaluation framework for AI-generated patient-friendly clinical documentation across the generation, review, and monitoring stages of the AI documentation lifecycle. Methods: CLEAR was developed using Messick's validity framework across four stages: content validation, response process, internal structure, and consequences. Domains were identified through a targeted literature review and reviewed by a panel of six clinical and operational experts. An iterative, consensus-based process involving four board-certified internists across 10 rounds refined domain definitions and scoring instructions. Inter-rater reliability was assessed on 50 AI-generated patient-friendly discharge summaries using Cohen's kappa and Gwet's AC1 for binary domains and intraclass correlation coefficients (ICC) and Gwet's AC2 for continuous domains. Additionally, 19 semi-structured stakeholder interviews with clinicians, informaticists, institutional leaders, and patient education experts explored operational needs and implementation contexts. Results: CLEAR comprises five domains for evaluating patient-friendly AI documentation: readability, understandability, patient-centeredness, accuracy, and completeness. 
Inter-rater reliability was good to almost perfect across all subjectively scored domains per Gwet's agreement coefficients. Stakeholder interviews independently identified three operational gaps aligned with the CLEAR lifecycle: lack of structured guidance for prompt engineering, subjectivity in human review, and absence of scalable monitoring infrastructure, directly validating the framework's real-world relevance. CLEAR was applied across three illustrative implementation contexts: prompt engineering for patient-friendly echocardiogram reports, structured human review of discharge summaries, and development of LLM-as-judge automated monitoring tools. Conclusions: CLEAR provides a preliminarily validated evaluation framework designed to span the full AI documentation lifecycle, from prompt engineering through human review to automated monitoring. By conceptualizing patient-friendliness as a multidimensional construct that integrates communication quality with patient safety, CLEAR offers practical infrastructure for consistent and scalable governance of patient-facing AI documentation in healthcare systems.
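
For readers unfamiliar with the agreement statistics named in this abstract, the following is a minimal two-rater sketch of Cohen's kappa and Gwet's AC1 for a binary domain. It is an illustration under stated assumptions, not the authors' analysis: the study involved four internists, while this sketch handles one pair of raters, and the function names and toy ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    # Note: undefined (division by zero) if both raters use a single identical category.
    return (observed - expected) / (1 - expected)


def gwet_ac1_binary(rater_a, rater_b, positive=1):
    """Gwet's AC1 for two raters and two categories."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Average of the two raters' proportions for the positive category.
    pi = (rater_a.count(positive) / n + rater_b.count(positive) / n) / 2
    # With two categories, chance agreement reduces to 2 * pi * (1 - pi).
    expected = 2 * pi * (1 - pi)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy ratings only; these are not data from the study.
    a = [1, 1, 0, 1, 0, 1, 1, 1, 1, 1]
    b = [1, 1, 0, 1, 1, 1, 1, 1, 1, 0]
    print(cohens_kappa(a, b), gwet_ac1_binary(a, b))
```

On skewed ratings such as the toy example, kappa can be low even when raw agreement is high, which is one common reason for reporting Gwet's coefficients alongside it.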

  • Background: House dust mite (HDM) sensitization commonly begins in early life and contributes to persistent allergic airway inflammation and asthma chronicity. Primary prevention via early-life environmental control is a key pathway to reduce HDM sensitization and asthma risk. Objective: To characterize child caregivers’ knowledge, attitudes, and practices (KAP) regarding pediatric HDM control using a hybrid literature/expert-driven and social media-driven approach, and examine associations between KAP levels, child age and caregiver social media activity. Methods: This cross-sectional study comprised two interconnected components: (1) mining of content published between August 2023 and July 2025 from five major Chinese social media platforms, analyzed via Latent Dirichlet Allocation (LDA); and (2) a social media-enhanced web-based KAP survey administered in November 2025 to child caregivers in Chongqing, a warm-humid region where HDMs dominate indoor allergens, with participants recruited via local child health facilities. In total, 132,341 social media documents and 2,275 caregivers of children <18 years were included in the analysis. The main outcomes included social media discourse patterns and domain-specific KAP levels across five dimensions: foundational knowledge (K1), recommended control knowledge (K2), attitude toward social media topics (A1), attitude toward recommended methods (A2), and control practices (P). Stratified analysis was conducted by two exposure variables: child age (≤3 years vs >3 years) and caregiver social media activity (active vs. inactive). Results: LDA topic modeling identified five distinct topic clusters in the social media content. Commercial, emotional, and misleading content collectively dominated the information landscape, accounting for 83.3% of included documents, with commercial content often systematically conflating the concepts of “disinfection” and “mite elimination”. 
Only 16.7% was classified as health educational content focusing on HDM allergy prevention. The average KAP levels of K1, K2, A1, A2, and P domains were 62.9%, 84.7%, 57.0%, 37.8%, and 25.8%, respectively. Social media emerged as the primary knowledge source (80.7%), with methodological knowledge gaps (47.5%) being the top implementation barrier. Caregivers of children ≤3 years had significantly lower self-rated knowledge (23.5% vs. 28.3%, P=.01), stronger endorsement of recommended methods, but also greater information overload (OR 1.39, 95% CI 1.15-1.67, P<.001) and decision difficulties (OR 1.23, 95% CI 1.01-1.52, P<.001). Socially active caregivers showed better performance across multiple items in five domains, but also increased non-recommended practices (ultraviolet irradiation: OR 1.85, 95% CI 1.35-2.53, P<.001) and misconception acceptance (allergy impact exaggeration: OR 1.39, 95% CI 1.04-1.87, P=.03). Conclusions: Complex and suboptimal KAP levels exist, particularly among caregivers of young children (≤3 years). Social media activity is associated with both enhanced implementation of control practices and elevated misconception endorsement. These findings reveal critical educational gaps and the need for social media-based interventions. Clinical Trial: Not applicable.

  • Examining inequities in the use of Continuous Glucose Monitors Among People

    Date Submitted: May 6, 2026
    Open Peer Review Period: May 6, 2026 - Jul 1, 2026

    Background: Continuous glucose monitoring (CGM) offers clinical and behavioural benefits for people with type 2 diabetes (T2D), including improved glycaemic control and enhanced self-management. However, important evidence gaps remain regarding whether CGM use is equitably distributed across patient groups and whether Objective: To examine the relationship between CGM use among individuals with type 2 diabetes (T2D) and a range of patient characteristics, including socio-demographic factors linked to health inequities, digital health literacy, clinical characteristics, and service utilisation. Methods: A cross-sectional online survey was conducted in November 2024 among adults in the UK with self-reported type 2 diabetes (T2D), recruited via the YouGov panel. The primary outcome was self-reported CGM use. Predictor variables included PROGRESS-Plus characteristics (age, gender, ethnicity, religion, education, occupation, household income, disability, and social engagement), digital health literacy (eHEALS scale), clinical characteristics (disease duration, current treatment, and complications), overall health status (number of long-term conditions), and healthcare utilisation (frequency of visits). Descriptive statistics and multivariable logistic regression were used to examine associations between CGM use and patient characteristics. Results: Among 403 participants, 12.7% reported CGM use. Nearly half of participants were aged 65 years or older, and 56.80% were male. Most participants were White (83.90%) and lived in urban areas. Higher odds of CGM use were observed among insulin users (OR=3.80, 95% CI: 1.6–9.22, p<0.001). No other demographic, clinical, or service utilisation variables were statistically significantly associated with CGM use. Conclusions: CGM use was primarily driven by insulin therapy, consistent with established clinical pathways within the National Health Service that prioritise access for this group. 
No significant variation was observed across demographic, socioeconomic, or health literacy-related characteristics, suggesting no clear evidence of inequalities in this sample. These findings indicate potentially equitable access, although further research in larger and more diverse populations is needed to confirm these patterns.

  • Background: Complex digital interventions that integrate electronic patient-reported outcome measures (ePROM) into clinical practice in cancer have the potential to improve quality of life, increase survival, and reduce health resource use and costs. Such systems can help oncology patients self-manage chemotherapy symptoms, reduce workloads for clinicians through automated decision support, and resolve problems earlier. However, there is a need for more research on the cost-effectiveness of such interventions. Objective: This review aims to (1) summarize and evaluate the quantitative and qualitative evidence related to the cost-effectiveness and economic evaluation methods of ePROM-integrated interventions, and (2) extract data and validate assumptions useful for health economic modelling of ePROM-based treatment strategies. Methods: We searched for original English-language papers published on or before March 2025 on Ovid (including MEDLINE and Embase), Scopus, and the International Health Technology Assessment Database (INAHTA) using search strings that combined terms related to ePROMs, health economics, and cancer/oncology. We included papers reporting health economic-related outcomes for ePROM interventions designed for adult cancer populations and excluded screening tools and conference abstracts. Results: We included 34 publications from 27 unique studies, and identified and analyzed 26 ePROM-integrated interventions within these. Most (23/26) of the included interventions explicitly described some form of alert handling and automated decision support based on remote ePROM monitoring. 
5/34 publications presented full cost-utility analysis results, of which 3 were characterized by high uncertainty and a lack of clear differences in costs and health outcomes between ePROMs and standard care, while 2 presented strong evidence of cost-effectiveness due to quality-of-life improvements, reduced hospitalizations, and potentially more autonomy in health-related travel (e.g., ePROM-monitored patients can drive or walk to the hospital instead of using taxis or ambulances). A further 5/34 publications reported partial health economic results (e.g., cost-consequence, budget impact), of which 1 detected no difference in strategies, while 4 reported lower health resource use and costs of ePROMs, mainly due to hospitalization reductions. 12/27 studies included a qualitative component but mostly focused on user experience and design-related themes; only 2/12 of these addressed economic-specific themes (e.g., changes in workflow and resource use due to ePROM implementation and integration), indicating some potential for time saving due to ePROM monitoring. Conclusions: There is some evidence that ePROM-integrated interventions can be cost-effective in cancer care, but the evidence base remains limited. Where evidence does exist, cost-effectiveness appears driven by reduced hospitalization and improved quality of life. Qualitative research within the included studies rarely addressed economic questions. We provide a detailed parameter extraction for use in future economic modelling and recommend research priorities, including quantitative mapping of ePROM symptom data onto health resource use patterns, and qualitative work exploring how ePROM implementation affects clinical workloads and patient-perspective costs.

  • Background: Depressive disorders are among the most prevalent psychiatric disorders globally and impose considerable individual and societal burdens. Psychotherapy, including cognitive behavioral therapy, is recommended as a first-line treatment, especially for mild to moderate depressive disorders. However, face-to-face psychotherapy is often limited by issues of accessibility and cost. Digital therapeutics (DTx) have gained increasing attention as alternatives for overcoming these hurdles. With advances in digital technology, digital placebos have been increasingly adopted as comparators in clinical trials of DTx. However, the characteristics of these clinical trials, the magnitude of digital placebo effects, and their moderators remain poorly understood. Objective: The objectives of this study were to investigate the characteristics of clinical trials using digital placebos as comparators, and to assess the magnitude of digital placebo effects and their moderators on depressive symptoms measured by the Patient Health Questionnaire-9 (PHQ-9). Methods: Blinded randomized clinical trials (RCTs) that evaluated PHQ-9 scores and used digital placebos as comparators were identified by searching MEDLINE, Scopus, Web of Science, PsycINFO, CINAHL, Cochrane Central Register of Controlled Trials, ClinicalTrials.gov, and ISRCTN in November 2025. The characteristics of the RCTs and of the digital placebos were reviewed systematically. A meta-analysis, including subgroup analyses and meta-regressions, was conducted to investigate the magnitude and the moderators of digital placebo effects. Results: A total of 29 articles comprising 30 studies with 5680 participants were included in this systematic review and meta-analysis. The most common trial design was a 2-arm, parallel-group study conducted in a single country, adopting “Replaced” and “Mobile” as the placebo approach and delivery type, respectively. 
The pooled effect size for all the included studies was Hedges’ g = 0.44 (95% CI 0.29 to 0.59) with an overall I² = 93.2%. Subgroup analyses showed a moderate-to-large and statistically significant placebo effect in the group of primary psychiatric disorders (Hedges’ g = 0.69; 95% CI 0.40 to 0.99). Meta-regressions indicated that the group of primary psychiatric disorders and baseline PHQ-9 score were independent moderators of the digital placebo effects and the major contributing factors to the high heterogeneity (R² = 51.5%). Conclusions: Statistically significant digital placebo effects were observed on depressive symptoms, and target population and baseline PHQ-9 score were identified as independent moderators. These findings have implications for the planning of future DTx clinical trials using digital placebos for depressive symptoms.
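
As background to the effect-size and heterogeneity statistics reported here, the following minimal sketch shows how Hedges' g and I² are commonly computed. It assumes a standardized mean difference between two independent groups with the usual small-sample correction, and I² derived from Cochran's Q. The abstract does not state which variant the authors used (for example, a within-group pre-post effect), so the function names and toy inputs are illustrative assumptions only.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd  # Cohen's d
    correction = 1 - 3 / (4 * df - 1)  # common approximation of the correction factor J
    return d * correction


def i_squared(q, k):
    """I^2 (in percent) from Cochran's Q and the number of studies k."""
    df = k - 1
    if q <= 0:
        return 0.0
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, (q - df) / q) * 100


if __name__ == "__main__":
    # Toy inputs only; these are not data from the review.
    print(hedges_g(14.0, 5.0, 50, 12.0, 5.0, 50))
    print(i_squared(100.0, 11))
```

An I² above roughly 75% is conventionally read as considerable heterogeneity, which is consistent with the authors following up with subgroup analyses and meta-regressions.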

  • Socio-Cultural Challenges and Design Implications for Ethical AI in Healthcare: A Systematic Review

    Date Submitted: May 4, 2026
    Open Peer Review Period: May 5, 2026 - Jun 30, 2026

    Background: Artificial intelligence (AI) is increasingly embedded in healthcare, yet its benefits remain unevenly distributed due to persistent concerns regarding bias, inequity, and socio-cultural misalignment. Although existing Ethical AI frameworks typically emphasize universal principles, they often insufficiently address the socio-cultural contexts in which AI systems are developed, implemented, and used. Objective: This systematic review aimed to examine how socio-cultural factors shape ethical challenges in healthcare AI, influence the interpretation of ethical principles, and inform context-sensitive design and governance strategies. Methods: Following PRISMA 2020 guidelines, we conducted a systematic search of PubMed, IEEE Xplore, and Web of Science for studies published between 2018 and 2025. Eligible studies addressed ethical issues related to AI in healthcare through a socio-cultural lens. A thematic synthesis combining inductive and deductive coding was used to analyze reported challenges, context-dependent ethical interpretations, and proposed mitigation approaches. Results: A total of 49 studies were included. The findings show that ethical challenges in healthcare AI are deeply embedded in structural inequalities, data collection, curation, and documentation practices, institutional conditions, and cultural norms rather than being purely technical problems. Key challenges included algorithmic bias, underrepresentation of minorities in datasets, cultural and linguistic mismatches, limited transparency and trust, and systemic disparities in access to AI technologies. The reviewed literature proposed a broad range of technical, design-related, and governance-oriented strategies, but these remained fragmented and were rarely integrated systematically across the AI lifecycle. 
Based on this synthesis, the study proposes the Inclusive Ethical AI Framework (IEAF), a socio-technical framework that systematically translates socio-cultural context into context-sensitive ethical interpretations and actionable design and governance decisions across the AI lifecycle. Conclusions: The findings highlight that ethical challenges in healthcare AI are fundamentally shaped by socio-cultural context and cannot be addressed through technical solutions or universal ethical principles alone. Instead, effective and equitable AI systems require the systematic integration of socio-cultural considerations into data practices, system design, and governance across the AI lifecycle. Clinical Trial: PROSPERO CRD420251058607; prospectively registered.

  • Quality Criteria for Cancer Patient Portal Content: Framework Development and Pilot Audit Study

    Date Submitted: May 1, 2026
    Open Peer Review Period: May 1, 2026 - Jun 26, 2026

    Background: Patient-facing cancer portals are increasingly used to provide education, support interpretation of results, navigate services, and guide self-management across the cancer journey. However, variation in content quality, transparency, readability, accessibility, and governance can undermine equity, safety, and trust. Objective: To develop and present EU-CiP20 as a first-phase, evidence-informed, operational, and auditable framework of quality criteria for cancer patient portal content. Methods: We synthesised established instruments and authoritative guidance on online health information quality, health literacy and plain-language communication, transparency and conflicts of interest, patient engagement, privacy and data protection, digital governance, accessibility, and AI-related safety. Candidate criteria were harmonised from a broader evidence-mapped set (EU-CiP30) into a streamlined taxonomy (EU-CiP20) using explicit consolidation rules and an auditable mapping trail. Each category was operationalised into four observable sub-criteria and scored using a pragmatic 0-2 scale. EU-CiP20 is presented as an initial comprehensive framework to be refined in the next phase through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and a Delphi expert panel, with the aim of reducing the 20 criteria to a final operational core of approximately 10 criteria. Results: EU-CiP20 comprises five domains and 20 categories spanning accessibility and comprehensibility; evidence and content governance; relevance and personalisation; human-centred design and empowerment; and ethics, safety, and trust. In the pilot, adjusted EU-CiP20 totals ranged from 19.5% to 40.6%. The most consistent gaps were governance signals required for portal readiness, including named clinical ownership, explicit review cycles, evidence traceability, and accessibility auditability. 
Comparator tools characterised content-level strengths but did not fully capture these governance risks. Conclusions: EU-CiP20 offers a practical and auditable first-phase approach to strengthen governance of patient-facing cancer portal content. It complements existing information-quality instruments by linking readability, evidence governance, relevance, empowerment, transparency, safety, and digital trust within a single operational taxonomy. The work is not yet complete: the current 20-criteria framework will be refined through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and Delphi expert panel consensus to produce a shorter final set of approximately 10 criteria, followed by assessment of inter-rater reliability, feasibility, sensitivity to change, and real-world implementation impact.

  • Background: Diffusion of innovations theory posits that inequalities arising from the early adoption of new technologies, such as telemedicine, are likely to decrease over time. However, evidence is scarce on the evolution of inequalities related to individual telemedicine adoption over time. Objective: This study aims to assess changes in age and socioeconomic inequalities in telemedicine adoption in Japan from 2020 to 2024. Methods: We used data from a nationwide, internet-based panel survey of the general population in Japan. Participants aged 18–75 years who completed both the 2020 baseline and 2024 follow-up surveys were included. The primary outcome was self-reported telemedicine adoption (ever use at each survey). Using multivariable logistic regression models, we regressed telemedicine adoption on (1) indicators of age and socioeconomic status at baseline, (2) survey year, and (3) their interaction, adjusting for other demographic, socioeconomic, and health-related characteristics. We then estimated the adjusted prevalence of telemedicine adoption in 2020 and 2024 for each age and socioeconomic group. Results: We included 10,818 participants (mean [SD] age, 49.7 [16.8] years; 50.7% women). In 2020, 271 participants (2.5%) reported telemedicine adoption; by the 2024 follow-up survey, this increased to 840 participants (7.8%). The prevalence of telemedicine adoption was lower among older individuals, those with lower educational attainment, those with medium income (vs high income), and unemployed individuals (vs upper non-manual workers) in 2020. While the prevalence increased across groups from 2020 to 2024, the increases were smaller among older age groups (70–75 years: +1.0 percentage points [pp] vs 18–29 years: +13.2 pp; difference-in-differences, −12.1 pp; 95% CI, −18.3 to −6.0 pp). 
Similarly, increases were smaller among unemployed individuals than among upper non-manual workers (+2.8 vs +5.8 pp; difference-in-differences, −3.0 pp; 95% CI, −4.7 to −1.2 pp). Changes in the prevalence of telemedicine adoption did not vary significantly by educational attainment, urban vs rural residence, or income level. Conclusions: Despite growth in telemedicine adoption from 2020 to 2024, age-related and occupational inequalities widened, and educational inequalities persisted, underscoring the need for strategies to reduce age-related and socioeconomic barriers to telemedicine adoption.
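The difference-in-differences figures quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function name `diff_in_diff` is hypothetical, and the inputs are the rounded percentage-point changes from the abstract, not the authors' model estimates. Because the inputs are rounded, the age result comes out at -12.2 pp rather than the reported -12.1 pp, which is presumably computed from unrounded estimates.

```python
def diff_in_diff(change_group: float, change_reference: float) -> float:
    """Difference-in-differences in percentage points (pp):
    the change in the group of interest minus the change in the reference group."""
    return change_group - change_reference


# Rounded 2020-to-2024 changes in adjusted prevalence (pp), as quoted in the abstract.
unemployed_vs_nonmanual = diff_in_diff(2.8, 5.8)   # unemployed vs upper non-manual workers
oldest_vs_youngest = diff_in_diff(1.0, 13.2)       # 70-75 years vs 18-29 years

print(round(unemployed_vs_nonmanual, 1))
print(round(oldest_vs_youngest, 1))
```

A negative value means the group of interest gained less than the reference group, which is the sense in which the abstract describes the inequalities as widening.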

  • Longitudinal Modeling or Monitoring of Depression in Speech: A Systematic Review

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Background: Depressive disorders are a leading cause of disability worldwide, and more than 40% of people who experience a single depressive episode will experience recurrence. It is, therefore, essential that people living with a depressive disorder are able to access appropriate means of monitoring, to identify recurrences and enable timely interventions. Existing monitoring methods are burdensome for both clinicians and patients, but previous research into automated depression diagnosis has demonstrated links between participants’ depression severity and speech features. Longitudinal depression modeling through speech aims to build on these links and provide automated methods of long-term depression monitoring. Objective: This systematic review collates existing research into the monitoring or modeling of changes in depression severity, through its impact on speech. Methods: We searched the ProQuest, Scopus, Web of Science, PubMed and IEEE Xplore databases for studies relating to the longitudinal modeling of depression in speech. Publications of any age were acceptable, but only English-language studies were included. All studies underwent quality appraisal using the CASP cohort study checklist. Results: We retrieved 22 relevant documents from the database searches, and a further 40 documents through citation chasing and manual searching. The observational periods employed by these studies varied from 7 days to 18 months, and sample sizes ranged from 16 to 954. Speech features such as speaking rate and pause duration show promising sensitivity to changes in depression severity. However, other features, such as average energy velocity, exhibit conflicting trends across different studies, as does the generalizability of prosodic and acoustic features between languages. Conclusions: We identified significant methodological variation within the data collection, feature extraction, and modeling stages of the studies.
While there is evidence to suggest that speech features are sensitive to changes in depression severity, some findings are inconsistent between studies. We advocate for greater clarity and consistency in the reporting of methods to support comparisons of findings between studies and generalizability testing. Future work could explore the predictive capacity of speech to identify oncoming depressive episodes. Clinical Trial: PROSPERO CRD420251003661; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251003661.

  • Liability and Standard of Care in AI-Driven Psychiatric Practice: A European Viewpoint

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Artificial intelligence is increasingly entering psychiatric care through decision-support systems, digital phenotyping tools, suicide-risk prediction models, documentation assistants, and conversational agents. These technologies may improve access, consistency, and personalised care, yet they also redistribute clinical authority and complicate liability when harm occurs. This article examines how European law and psychiatric ethics should respond to this shift. It argues that liability in AI-driven psychiatry cannot be understood only as a product-defect issue or only as a malpractice problem. Because psychiatric practice depends on interpretation, testimony, contextual judgment, and therapeutic alliance, the relevant standard of care must remain human, even when technologically augmented. The article advocates an augmented-clinician model in which AI informs but does not replace psychiatric reasoning. After outlining the European regulatory framework, including the AI Act, the Medical Device Regulation, the General Data Protection Regulation, the revised Product Liability Directive, and the European Health Data Space Regulation, the article analyses the implications of the withdrawal of the proposed AI Liability Directive and the persistence of divergent national tort regimes. It then examines psychiatric risk vectors, including automation bias, testimonial injustice, bias in mental health datasets, therapeutic chatbots, suicide prediction tools, passive monitoring, and large language model documentation. The discussion proposes a layered accountability model that links developers, deployers, and clinicians while preserving therapeutic integrity, patient rights, and legal clarity.

  • Background: Electronic nicotine delivery systems (ENDS) are at the center of global public health debate. China is the largest producer of e-cigarettes while the United States (U.S.) has the largest consumer market, yet analyses of news coverage of ENDS comparing China and the U.S. remain limited. Objective: The primary objective of this study is to identify and compare dominant themes in ENDS-related news coverage across leading broadcast-branded digital outlets in China and the U.S., and to assess how these themes and coverage volume changed over time. Methods: We conducted a thematic analysis of 470 ENDS-related stories from January 1, 2020, to July 30, 2025, from four leading broadcast news digital media platforms: CNN.com and FoxNews.com in the U.S.; CCTV.com and ifeng.com in China. Using a single theme approach, coders identified core themes for each article based on prespecified rules and a hierarchical decision structure. Frequencies and proportions of each core theme were summarized for the overall sample and stratified by country. Pearson chi-square tests and binary logistic regression models were conducted to examine cross-national differences with false discovery rate (FDR) adjusted p-values. Temporal changes in themes were examined and visualized. Results: In U.S. coverage, the most prevalent themes were policy and regulatory governance (32.1%), youth appeal, flavors, and school responses (22.4%), and health risks, harms, symptoms, and dependence (13.9%). In Chinese coverage, the most prevalent themes were commercial practices and market dynamics of ENDS (26.0%), policy and regulatory governance (23.4%), and enforcement and compliance (15.7%). Cross-national differences in themes were consistently observed between the two countries. Between 2020 and 2025, coverage in China transitioned away from commercial and market themes toward greater focus on illicit substances and enforcement, while U.S. coverage showed a relatively stable focus on the commercial market with a gradual increase in enforcement-related reporting.
Conclusions: Broadcast news in China and the U.S. may actively shape how ENDS are defined as a public issue and what policy responses appear legitimate. Chinese coverage tends to stress commercial activity and enforcement, whereas U.S. coverage more often foregrounds youth risks and regulatory debates. These distinct thematic patterns may influence risk perceptions and policies in each country and are important to consider in comparative media and public health research.
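The abstract mentions FDR-adjusted p-values without naming the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up method (an assumption, not something the abstract states), the adjustment could look like the following. The function name `bh_adjust` and the example p-values are hypothetical and are not taken from the study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: the abstract does not say which FDR procedure was used;
    Benjamini-Hochberg is shown here only as a common choice.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values only (one per theme comparison), not study data.
example = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(example))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.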

  • Digital health interventions to prevent post-traumatic arthritis after traumatic knee injury: a scoping review

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: Traumatic knee injuries (TKI) are common and are associated with a 4-6 times increased risk of post-traumatic knee osteoarthritis (PTOAK) over the subsequent 15–20 year period. There is clear evidence that risk can be reduced, but long-term care availability is limited, prompting the development of digital health interventions (DHIs) such as wearable devices, telehealth innovations and mobile apps. Objective: To evaluate existing DHIs against the OPTIKNEE consensus guidelines for PTOAK prevention and investigate adoption into practice. Methods: A search of 7 online databases and the grey literature was completed from inception to 03/06/2025, complemented by hand searching government, charity and university websites for reports and technical prototype papers concerning DHIs to support care after TKI. DHI features were mapped to the OPTIKNEE recommendations, evaluated against the health-technology pathway to identify development stage, and implementation analysed using Normalisation Process Theory (NPT). Results: 81 reports, 53 peer-reviewed and 28 other, concerning 49 distinct DHIs were found. They were designed for injuries of the anterior cruciate ligament (ACL, n=12); ACL meniscus (n=15); meniscus (n=3); ACL or meniscus (n=2), bone (n=2), patella dislocation (n=1), and 14 were non-specific. No DHIs addressed all OPTIKNEE recommendations; however, the eight most complete reported 4/7 components, including exercise, information provision, patient reported outcome measures, goal setting and overall patient outcome. A remote, self-assessed strength evaluation was not reported in any DHI. NPT analysis typically demonstrated low DHI adoption levels, and no clear correlation with health technology pathway stage. The DHI with the highest adoption into routine practice, according to NPT, was ‘getUbetter’ with 56% positive scores.
Conclusions: There are many available, or developing, DHIs but none include the content recommended by OPTIKNEE to reduce the risk of PTOAK. Further, there is negligible evidence of DHIs being adopted into usual care. There is a clear need to develop guideline-compliant DHIs to support effective prevention.

  • Providing consultation recordings to patients in German routine cancer care: A mixed-methods pilot study

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: The provision of audio recordings of medical encounters to patients, referred to as consultation recordings, is a well-established intervention to address information needs like recall and comprehension in cancer care. Despite these benefits, consultation recordings are not routine practice. Furthermore, research on consultation recordings in Germany is lacking. Objective: This study aims to pilot test consultation recordings in routine cancer care in Germany and assess feasibility of implementation and perceived effects from patients’ perspective. Methods: Using a sequential mixed methods approach, we assessed consultation recordings’ use, usability, acceptability, appropriateness, influencing factors, and perceived effects. Consultation recordings were piloted in an outpatient setting. Adult cancer patients were eligible to participate. Four weeks after the recorded consultation, participants received a quantitative questionnaire. In addition, a selection of participants were qualitatively interviewed. Quantitative data was analyzed using descriptive statistics, qualitative data using a combination of Practical Thematic Analysis and qualitative content analysis. Results: Ninety-seven consultations were audio-recorded and provided to patients. Seventy participants returned the quantitative survey (response rate 72.2%) and 16 participated in qualitative interviews. Most participants listened to the consultation recording and experienced improvements in recall, comprehension, and feeling informed. Routine implementation of consultation recordings was desired by many. The results suggest that patients perceive consultation recordings as feasible. However, we encountered organizational implementation challenges. Conclusions: This study provides initial evidence on the patient-perceived feasibility of consultation recordings in German routine cancer care. Consultation recordings have the potential to help patients navigate complex medical information. 
However, organizational implementation challenges hinder their uptake. Future research could investigate technically easier solutions suited to the German healthcare context.

  • Background: Hypertension remains a predominant global risk factor for cardiovascular disease. Conventional follow-up models frequently fail to address the requirements for real-time monitoring and sustained intervention, whereas mobile health (mHealth) offers a transformative trajectory for chronic disease management. Despite a surge in relevant literature, the diversity of intervention modalities and the fragmented nature of existing evidence necessitate a systematic synthesis. Objective: This study aimed to comprehensively evaluate the efficacy of mHealth in hypertension management through a systematic review combined with evidence mapping, identifying research gaps to provide evidence-based insights for precision nursing and future research directions. Methods: A systematic search was conducted across PubMed, Web of Science, Cochrane Library, and Embase for randomized controlled trials (RCTs) involving mHealth interventions for hypertension, with the search period extending through February 2026. Literature was screened according to PICOS criteria, and methodological quality was appraised using the Cochrane Risk of Bias tool (RoB 1.0). Visual analytics, including Sankey diagrams and bubble plots, were employed to characterize the associations between intervention modalities and clinical outcomes. The study protocol was prospectively registered on the Open Science Framework (URL: https://osf.io/2vkwu). Results: A total of 106 publications (comprising 108 RCTs) were included. Publication volume has increased significantly since 2018, with the United States (31 papers) and China (19 papers) being the primary contributors. The intervention paradigm has evolved from rudimentary SMS reminders to a "closed-loop" management model centered on "App + Remote Monitoring," which demonstrates the most robust and consistent positive evidence for blood pressure (SBP/DBP) control and goal attainment rates.
Blood pressure parameters occupied the "core evidence layer," while therapeutic adherence and disease knowledge formed the "behavioral evidence layer". Conversely, BMI, mental health, and quality of life remained in the "peripheral evidence layer," characterized by a notably higher proportion of non-significant results. Methodological quality was generally moderate-to-high with robust randomization; however, the implementation of blinding faced prevalent high risks due to the inherent nature of the interventions. Conclusions: mHealth significantly enhances hypertension management efficacy through a digital "monitoring-feedback-adjustment" loop, yet it encounters bottlenecks in achieving profound lifestyle modifications (e.g., weight management) and psychological interventions. Clinical decision-making should prioritize multicomponent interventions featuring real-time interaction. Future research should focus on long-term (>1 year) follow-up and cost-effectiveness transformation in resource-limited settings.

  • Background: Adults may experience subjective cognitive decline (SCD). However, it is unclear whether SCD is related to measurable cognitive impairment, particularly in women aged 40 to 60 and in early dementia. Further, Medicare has mandated assessment of cognitive and memory function in individuals over 65 as part of the Medicare Annual Wellness Visit. In order to assess possible impairment and change over time, efficient, objective measures of SCD are needed. Objective: To assess the relationship between performance on an online continuous recognition task (CRT, MemTrax) and age, sex, and memory concern. Methods: This study evaluated CRT performance in participants aged 21-99 who enrolled in an online program (HAPPYneuron) to measure mental functions, including those who reported concerns about them. This program asked participants if they had complaints about their memory, and then the program offered them the opportunity to assess cognition using the CRT. This CRT instructs individuals to attend to visual stimuli (50 images) and respond as quickly as possible to repeated images (25 images). The CRT components were used to measure learning and memory (as related to HITs, response to a repeated image), executive function (as related to CRs, correctly not responding to an initial image presentation), and processing speed (HIT-RTs, average response time to HITs). Results: The analysis included 18,178 participants (5,795 males, 32%; 12,383 females, 68%), restricted to those who answered the sex, age, and memory questions. Of these, 11,786 (65%) were between 40 and 70 years of age. Females outnumbered males by over two-fold, beginning about 35 years of age, peaking at 55 years of age at over three-fold, and falling below two-fold at about 65 years of age. Approximately 30% more men complained of memory problems than those who did not, primarily among those 30-60 years old. About 80% more women complained of memory problems, over two-fold more than women who did not, 30-50 years old.
The number of HITs, number of CRs, and HIT-RTs varied little between men and women. While those without memory complaints generally performed better than those with memory complaints, there was little difference in performance levels for each group between males and females. For all groups, there was a gradual reduction of performance over age for HITs and CRs and a slowing of HIT-RTs. Conclusions: Most participants were aged 40-65, and females outnumbered males by more than two to one, suggesting that these demographics are related to concern about SCD. However, there was little difference between males and females for the various CRT components, though SCD was associated with impairment. Age-related declines were progressive, the largest being in slower processing speed, presumably to compensate for age-related changes in cognitive function. Present results suggest clinicians may use these metrics to quantify patient concerns expressed in the primary care setting. Clinical Trial: none
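The three CRT measures defined in this abstract (HITs, CRs, and HIT-RT) can be made concrete with a small scoring sketch. This is only an illustration under stated assumptions: the `Trial` record, the `score_crt` function, and the demo data are hypothetical and do not represent the MemTrax implementation or its actual data format.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Trial:
    """One image presentation (hypothetical record layout)."""
    is_repeat: bool                  # True if this image was shown earlier in the test
    response_time: Optional[float]   # seconds; None if the participant did not respond


def score_crt(trials: List[Trial]) -> dict:
    """Score a continuous recognition task using the abstract's definitions."""
    # HIT: a response to a repeated image.
    hit_times = [t.response_time for t in trials
                 if t.is_repeat and t.response_time is not None]
    # CR (correct rejection): no response to an initial image presentation.
    correct_rejections = sum(1 for t in trials
                             if not t.is_repeat and t.response_time is None)
    # HIT-RT: average response time across HITs (None if there were no HITs).
    hit_rt = mean(hit_times) if hit_times else None
    return {"hits": len(hit_times),
            "correct_rejections": correct_rejections,
            "hit_rt": hit_rt}


# Made-up demo trials, not study data.
demo = [
    Trial(is_repeat=False, response_time=None),   # correct rejection
    Trial(is_repeat=False, response_time=0.9),    # false alarm
    Trial(is_repeat=True, response_time=0.7),     # hit
    Trial(is_repeat=True, response_time=None),    # miss
    Trial(is_repeat=True, response_time=0.9),     # hit
]
scores = score_crt(demo)
print(scores["hits"], scores["correct_rejections"])
```

In the task described, a full session would contain 50 presentations, 25 of which are repeats, so HITs and CRs would each have a maximum of 25.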

  • Virtual Reality for Cognitive Mastery in Airway Trauma Management: A Prospective Randomized Controlled Trial

    Date Submitted: Apr 17, 2026
    Open Peer Review Period: Apr 18, 2026 - Jun 13, 2026

    Background: Innovation in teaching methods is essential for advancing medical education, particularly for trainees developing crisis management skills. Virtual reality (VR) offers access to immersive, scalable, and accessible learning environments, but its effectiveness compared to traditional mannequin-based simulation remains underexplored. Objective: This prospective randomized controlled trial evaluates the efficacy of VR-based simulation versus traditional gold-standard mannequin-based training in enhancing medical trainees’ knowledge acquisition and application of decision-making concepts for airway trauma management. Methods: Forty medical students were randomized to either the VR (intervention) group or the Mannequin (control) group. Participants engaged in airway trauma management training using their assigned modality. Both groups completed a pre- and post-intervention test to evaluate knowledge acquisition, and undertook a mannequin-based crisis scenario one week after training to evaluate knowledge application. Results: Both groups demonstrated significant knowledge acquisition (VR: mean improvement +2.0/15, P=0.006; Mannequin: mean improvement +3.2/15, P<0.001), though no statistically significant differences were observed between groups (P=0.15). The VR group achieved self-assessed readiness and knowledge saturation faster, on average, than the Mannequin group. Both groups, on average, were successful in the post-training knowledge application test; however, the Mannequin group outperformed the VR group (mean difference: 1.58/15, P=0.021) and recognized a potential airway injury more quickly (P=0.004). Nevertheless, students in the VR group reported greater engagement and satisfaction, expressing a preference for VR as a future learning modality.
Conclusions: Overall, VR-based simulation is a promising and engaging method for teaching airway trauma management and demonstrates comparable knowledge acquisition to traditional mannequin-based training. However, mannequin-based simulation still confers advantages for applied performance. Further studies using larger samples, multiple scenarios, and VR-based assessments are needed. Clinical Trial: ClinicalTrials.gov NCT04451590; https://clinicaltrials.gov/study/NCT04451590