The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing. Since December 2009, it has offered open peer review of articles, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently under consideration by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and where the editor has not yet made a decision. This feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

JMIR Submissions under Open Peer Review

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Background: As societies digitalize, unequal access and use of technology risk creating a digital divide where older adults often are disadvantaged. While newer generations of older adults have higher levels of access and daily internet use, digital insecurities—worries and fears regarding cyberthreats and cybercrime — are suggested to be a barrier to their digital inclusion. Little is known about the characteristics of older adults who avoid digital technology due to insecurities, and there is also limited quantitative research on how insecurities affect older adults’ use and embracement of digital technology. Objective: The aim of this study was to characterize which older adults avoid digital technology due to insecurities and to examine the association between this avoidance and their levels of digital use and embracement. Methods: This cross-sectional study utilized data from the "Healthy Ageing in the Digital Society (HeADS)" survey, involving 451 participants, mean age 69 years (age range 55-92 years), in Sweden. Avoidance was assessed across five domains (e.g., e-commerce, smart devices, chatting with strangers). Digital inclusion was analyzed using measures for independent use (second-level divide), digital usage, and the Digital Living Index (DLI), which measures digital benefit and embracement (third-level divide). Multiple linear regression models were used to analyze the associations. Results: Over 70% of participants reported at least one form of avoidance due to insecurity, with chatting with unknown individuals (60.9%) and e-commerce (31.6%) being the most common. Higher levels of avoidance were significantly associated with female gender, older age, and a lower ability to use technology independently. Importantly, digital insecurity-related avoidance was found to be an independent barrier to digital embracement, even when accounting for potential confounders. 
Furthermore, over 50% of participants expressed a strong interest in learning more about safe digital use. Conclusions: Digital insecurity-related avoidance contributes to widening the digital divide by preventing late-middle-aged and older adults from fully realizing the benefits of digital services, despite having access and basic skills. To foster digital inclusion, support interventions must move beyond technical training and focus on building confidence and providing practical strategies for navigating the digital environment safely. Many late-middle-aged and older adults see the need for such support.

  • Background: Chronic Obstructive Pulmonary Disease (COPD) is currently the third leading cause of death globally, and exacerbations are responsible for hospitalizations, costs, and mortality. The Internet of Medical Things (IoMT) is an enabling technology that could shift care for those with COPD from reactive hospital-based care to proactive home care by combining sensor networks, wearables, smart inhalers, and cloud computing. Given the dynamism of the evidence, no systematic review has yet captured the clinical, economic, patient, and implementation perspectives of IoMT interventions used in COPD home care in under-resourced areas such as the Gulf Cooperation Council (GCC) region and low- and middle-income countries (LMICs). Objective: To synthesize evidence from the last five years (2020–2025) to discuss clinical, economic, engagement, and adoption outcomes of home-based COPD IoMT, highlighting inequities in LMICs. Methods: PRISMA 2020 guidelines and a PICOTS-SD framework guided a systematic review. PubMed and the Imam Abdulrahman Bin Faisal University (IAU) E-Library (which includes Embase, CINAHL, and Scopus) were searched from 2020 to 2025. Studies included were peer-reviewed empirical studies that assessed IoMT or telemonitoring systems for the home management of COPD, including randomized controlled trials, observational and cohort studies, feasibility studies, economic studies, and qualitative studies. The methodological quality of the studies was assessed independently by two reviewers using the Mixed Methods Appraisal Tool (MMAT 2018). The results were synthesized into five thematic areas. Results: Of 655 records identified, 28 empirical studies from 13 countries were included. A total of 13 (76%) of 17 clinical effectiveness studies reported significant benefits, including reduced hospitalizations and exacerbations and improved quality of life, while one large real-world study reported a survival benefit. 
Three economic analyses were presented: two showing cost savings and one showing increased costs, with survival benefits. Adoption was fairly high, while digital illiteracy and physical discomfort were prevalent, especially among older adults. The main barriers to uptake among healthcare providers were not scepticism but governance, infrastructure, and role ambiguity. The literature primarily focuses on high-income countries in the West, with limited information from the GCC, LMICs, and the Global South, where the burden of COPD is disproportionate and rising. Conclusions: IoMT has been shown to provide clinically and economically valuable benefits for home-based COPD management. Collaborative solutions are needed to address governance, infrastructure, workforce, and patient education issues to achieve success. There is a significant gap in the equity literature, with only scarce evidence from health systems in the GCC and LMICs. To fill this gap, it is essential that research and policy align with national digital health policies and strategies, such as Saudi Arabia's Vision 2030. Clinical Trial: A protocol has not been prospectively registered, but it is pre-designed and can be requested from the corresponding author. The project is recommended for registration in PROSPERO in the future.

  • Background: Urolithiasis is a highly prevalent urological condition with recurrence rates exceeding 50% within five years. While eHealth literacy has been associated with better health outcomes, its dynamic interplay with illness cognitions (particularly illness-related helplessness, acceptance, and perceived benefits) remains poorly understood. Guided by Leventhal's Common-Sense Model, this study examined the bidirectional relationships between eHealth literacy and illness cognitions over a six-month period following urolithiasis surgery. Objective: To investigate the directionality and magnitude of longitudinal associations between eHealth literacy and three dimensions of illness cognitions (helplessness, acceptance, and perceived benefits) in postoperative urolithiasis patients. Methods: A three-wave longitudinal design was employed with 368 patients who underwent urolithiasis surgery. Data were collected at 1 month (T1), 3 months (T2), and 6 months (T3) postoperatively. eHealth literacy was measured using the eHealth Literacy Scale (eHEALS), and illness cognitions were assessed with the Illness Cognition Questionnaire (ICQ). Cross-lagged panel models (CLPMs) were estimated to examine bidirectional effects while controlling for autoregressive stability. Results: The primary CLPM (eHEALS–Helplessness) demonstrated good fit (χ²=9.94, df=6, P=0.127; CFI=0.988; TLI=0.967; RMSEA=0.050). Significant bidirectional cross-lagged effects were identified: higher eHEALS predicted lower Helplessness (T1→T2: β=−0.096, P=0.003; T2→T3: β=−0.245, P<0.001), while higher Helplessness also predicted lower eHEALS (T1→T2: β=−0.099, P=0.017; T2→T3: β=−0.139, P=0.010), revealing a bidirectional negative spiral. Supplementary models revealed that eHEALS predicted increased Acceptance (β=0.098 at T2→T3, P=0.010) and Perceived Benefits (β=0.083–0.092, P<0.01), with reverse effects being non-significant. 
Conclusions: eHealth literacy and illness helplessness are reciprocally related in a bidirectional negative spiral, while eHealth literacy exerts unidirectional effects on promoting acceptance and perceived benefits. These findings delineate a public health causal chain—eHealth literacy → adaptive illness cognitions → improved self-management → reduced recurrence risk—that supports the integration of eHealth literacy interventions into postoperative care and tiered healthcare systems to facilitate cognitive adaptation and recurrence prevention in urolithiasis.

  • Background: Type 2 diabetes mellitus (T2DM) remains one of the leading chronic diseases contributing to morbidity, mortality, and healthcare burden globally. Although Diabetes Self-Management Education (DSME) has demonstrated positive outcomes, evidence regarding the integration of artificial intelligence (AI)-supported nursing education in Indonesian hospital settings remains limited, particularly in multicenter contexts. Objective: This study aimed to examine the effectiveness of Artificial Intelligence–Integrated Diabetes Self-Management Education (AI-DSME) on glycemic control, diabetes self-care behavior, self-efficacy, quality of life, and hospital readmission among adults with T2DM in several Type B hospitals in South Sulawesi, Indonesia. Methods: A multicenter prospective cohort study was conducted from February 2025 to February 2026 in five Type B hospitals across South Sulawesi Province, Indonesia. A total of 630 adult patients with T2DM were recruited using stratified proportional random sampling. Participants received nurse-led AI-assisted DSME interventions incorporating personalized mobile education, automated reminders, nutritional recommendations, medication adherence monitoring, and family-centered counseling. Data were collected at baseline, 3 months, 6 months, and 12 months using the Summary of Diabetes Self-Care Activities (SDSCA), Diabetes Management Self-Efficacy Scale (DMSES), EQ-5D-5L, glycated hemoglobin (HbA1c), and hospital readmission records. Multivariate generalized estimating equation analysis was performed. Results: The mean age of participants was 56.8 ± 10.7 years, and 58.4% were female. Significant improvements were identified in self-care behavior scores (β = 1.92; p < 0.001), self-efficacy (β = 2.14; p < 0.001), and quality of life (β = 1.38; p < 0.001). Mean HbA1c decreased from 9.1% ± 1.8 at baseline to 7.3% ± 1.2 at 12 months (p < 0.001). Hospital readmission rates declined from 21.7% to 8.9% during follow-up. 
AI-supported individualized education demonstrated stronger effects among participants with poor baseline glycemic control and low educational attainment. Conclusions: AI-integrated DSME significantly improved glycemic outcomes, self-care practices, quality of life, and reduced readmission among adults with T2DM. Integrating digital nursing interventions into hospital-based diabetes management programs may provide scalable and sustainable solutions for chronic disease management in low- and middle-income countries.

  • Background: Healthcare organizations increasingly rely on large language model (LLM) vendors for clinical documentation and decision support, yet they face substantial uncertainty regarding costs, vendor lock-in, data sovereignty, performance reliability, and accountability. These factors create strategic challenges for healthcare executives seeking to balance AI innovation with organizational risk management [1]. Objective: This study presents the development and evaluation of an in-house, fine-tuned small language model architecture for medico-legal case processing. The system uses orchestrated domain-specific small language models (SLMs) to support auditable, deterministic performance while preserving organizational data control. Its primary goal is to automate the summarization and classification of medical-legal documentation, with particular focus on Note-to-File (NTF) records documenting physician–advisor interactions and on automating medical advice summarization and categorization. Methods: We fine-tuned open-source models on 9,772 human-coded medical-legal advice cases from selected specialties including Family Medicine, Psychiatry, Ophthalmology, Emergency Medicine, and Surgery. Using a 70/15/15 train-validation-test data split, we applied a two-phase approach: Phase I focused on automated NTF summarization, evaluated with BERTScore and ROUGE; Phase II focused on multi-label classification of medical-legal issues, evaluated with Hamming Score and per-label precision, recall, and F1. Performance was assessed on an out-of-sample test set of 1,466 cases, with stratified analyses by specialty. Subcategory models were also developed using an 80/20 split within each category. Results: In Phase I, the top models (BART-large and Pegasus-PubMed) achieved BERTScore-F1 of 0.88 and ROUGE-1 of 0.43, exceeding the predefined target of 0.35 (ROUGE-1). 
In Phase II, RoBERTa-large (355M parameters) performed best, with a Micro-F1 of 0.75 and subset accuracy of 0.43 at the category level. At the subcategory level, Hamming Scores ranged from 0.84 for Conduct to 0.97 for Patient Care. Run-to-run consistency was 97–98%. Human inter-rater validation of 273 cases showed AI-model agreement comparable to coder-to-coder agreement, and in some subcategories higher. Projected annual savings are estimated at CAD $502,200, equivalent to six full-time coders. Conclusions: Task-specific orchestration of SLMs offers a governance-friendly alternative to vendor-dependent LLMs by reducing uncertainty around cost, performance, liability, and data sovereignty. This approach appears feasible for healthcare organizations balancing AI innovation with risk management and strategic autonomy. This work is intended not only as a technical evaluation, but as an operating model for AI adoption in regulated healthcare settings.

  • Background: Mental health digital interventions (MHDIs) are increasingly viewed as promising support in university counseling centers. Yet many remain stand-alone and poorly integrated into counseling practice, leading to lower engagement. Little is known about how digital support should be integrated across counseling stages or how counselors and students with counseling experience perceive their role in hybrid care. Objective: This study aimed to examine counselors’ and students’ perspectives on how digital interventions could be integrated across the counseling journey in university counseling centers and to derive stakeholder-informed strategies for hybrid care. Methods: We conducted semi-structured interviews with 18 counselors and 24 university students with prior counseling experience. Participants responded to counseling-stage vignette scenarios and prototype features representing major MHDI functions, describing their expectations, concerns, and implementation strategies. Data were analyzed using framework analysis, combining a primary stage-based analysis of counseling experiences with a secondary function-based analysis of digital intervention needs and implementation requirements. Results: Across the counseling journey, participants viewed digital support as useful for promoting reflection before counseling, facilitating disclosure during intake, maintaining behavioral engagement between sessions, and supporting self-regulation after counseling. Students with counseling experience emphasized emotional safety, motivation, and self-understanding, whereas counselors emphasized interpretability, moral responsibility, and implementation feasibility. 
Integration was also perceived as requiring personalization according to students’ distress, readiness to engage, and understanding of the counseling process, while also supporting counselors’ professional responsibilities related to data interpretation, risk management, and alignment with existing therapeutic approaches. Participants suggested that implementation should proceed in phases, beginning with lower-burden functions and clearer role boundaries, while distinguishing student-facing supportive uses from counselor-facing interpretive uses. Conclusions: This study identified how different digital intervention modalities and functions may be integrated across the offline in-person counseling journey, and how their roles vary according to counseling stage and user context. Successful implementation may depend not only on user engagement, but also on counselors’ concerns regarding moral burden and integration into existing therapeutic practice. The findings highlight the need for staged and personalized implementation strategies based on psychotherapeutic literacy, level of distress, and readiness to engage. This study provides a stakeholder-informed strategic framework for service-level decisions about when, how, and for whom digital intervention components may be integrated within real-world counseling settings.

  • The Application of Mobile Health in Cognitive Management Among Children with Cancer: A Scoping Review

    Date Submitted: Jun 17, 2026
    Open Peer Review Period: Jun 18, 2026 - Aug 13, 2026

    Background: The incidence of cognitive late effects among children with cancer has been increasing. Mobile health (mHealth), which delivers healthcare services through portable devices, may represent an innovative and scalable approach to optimize cognitive management in this population. However, evidence regarding its effectiveness and acceptability remains limited. Objective: This study aimed to systematically explore the evidence on mHealth for cognitive management in pediatric cancer patients and to characterize its key features, including feasibility, acceptability, functionality, and cost-effectiveness. Methods: This scoping review was conducted in accordance with the Arksey and O’Malley framework and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist. Six academic databases and grey literature were searched to identify relevant studies published between January 2010 and April 2026. Reference lists of included studies were also manually screened. A random sample of abstracts and full texts was independently screened by a second reviewer to ensure consistency. Data were charted based on key study characteristics, including study design, participants, intervention content, outcome measures, and main functional components. The results were collated and synthesized using a structured spreadsheet. Results: A total of 3,120 records were identified through the literature search, of which 19 studies met the eligibility criteria. Among the included studies, app-based cognitive training was the predominant form of intervention, while other modalities included exergaming, interactive websites, and newly developed mHealth applications. Nearly all studies incorporated cognitive training as a core component; however, only a limited number included coaching support, telephone communication, or educational features, and most interventions primarily focused on training a single cognitive domain. 
Conclusions: Although mHealth shows promise for improving cognitive function in children with cancer, substantial room for improvement remains. Future research should focus on developing multidimensional integrated intervention models, addressing the specific needs of pediatric patients, incorporating multidimensional clinical outcomes, and evaluating the cost-effectiveness of interventions. Our findings provide recommendations for optimizing mHealth-based cognitive management in children with cancer and offer targeted guidance and practical insights for the development of future interventions. Clinical Trial: https://osf.io/mkbaj/overview

  • Background: Due to concerns about privacy breaches and fear of social stigma, some individuals engaging in high-risk behaviors—particularly men who have sex with men (MSM)—may avoid in-person HIV testing services and instead choose to purchase HIV self-test kits online for self-testing. Unlike traditional e-commerce platforms that directly provide self-test kits without conducting risk assessments, the “Easy Test Know” platform integrates a structured behavioral risk assessment prior to HIVST kit purchase, allowing users to be categorized into different HIV risk groups. Objective: This study aimed to evaluate the performance of a digital HIV behavioral risk stratification tool integrated with online self-testing among MSM, and to determine whether incorporating multi-dimensional behavioral indicators—particularly high-risk venue diversity score—could improve the prediction of HIV positivity compared with the platform’s existing rule-based risk scoring algorithm. Methods: This retrospective observational study utilized data from the “Easy Test Know” platform between November 2023 and February 2025. The study included MSM classified as medium or high risk by the platform algorithm. Firth logistic regression models were developed to predict HIV positivity using behavioral variables, including high-risk venue diversity score. Model performance was evaluated using area under the receiver operating characteristic curve (AUC) and calibration analyses. Results: A total of 9,961 participants were included, of whom 90.8% were classified as high risk by the platform. The overall HIV positivity rate was 0.59%, with a higher rate in the high-risk group. The platform’s baseline risk score demonstrated moderate discriminatory power for predicting HIV positivity (AUC = 0.646). High-risk venue diversity score showed a dose–response association with HIV positivity. 
The data-driven model incorporating the platform’s existing variables together with the newly derived high-risk venue diversity score improved discrimination compared with the rule-based platform algorithm (ΔAUC = +0.082). The fully adjusted model achieved the highest predictive performance (AUC = 0.752). Conclusions: Digital behavioral risk stratification demonstrated moderate ability to differentiate HIV risk among MSM using online HIV self-testing services. Compared with the platform’s original rule-based scoring system, data-driven behavioral models incorporating high-risk venue diversity score showed improved predictive performance. These findings support the potential value of data-driven approaches for optimizing digital HIV risk assessment tools.

  • Speech-Driven Reporting in Long-Term Care: A Mixed Methods Evaluation Study

    Date Submitted: Jun 17, 2026
    Open Peer Review Period: Jun 18, 2026 - Aug 13, 2026

    Background: Long-term care (LTC) faces critical challenges driven by workforce shortages, an aging population, and a growing population of people living with dementia. Administrative burdens add to this pressure, as healthcare professionals spend up to 40% of their working time on administration and documentation. Speech-driven AI reporting (SDR) may offer a technological solution to alleviate administrative reporting workload and enhance the workflow efficiency of care workers. Objective: This study aimed to empirically evaluate the effects of SDR on documentation time, transcription accuracy measured by Word Error Rate, user experiences, and the client-caregiver interaction within nursing homes and home care settings. Methods: A mixed-methods study, involving 21 healthcare organizations, was conducted in the Netherlands between January and September of 2025. It comprised an experimental evaluation study comparing speech-driven and typed reporting under controlled conditions (n=35), complemented by a cross-sectional questionnaire study among care professionals from 14 elderly care organizations (n=293). Documentation time and Word Error Rate were analyzed using linear mixed models. Associations between system use duration and user experience were examined using correlation analyses. Results: The controlled evaluation study demonstrated a significant reduction in reporting time. SDR was found to be significantly faster than typing (p < 0.01), with a significant interaction between reporting device and method (p = 0.01), being 3.5 times faster on smartphones (34 vs. 122 seconds) and 2.3 times faster on laptops (43 vs. 102 seconds). The SDR AI software demonstrated high transcription accuracy (Word Error Rate <0.05). SDR did change the reporting process: healthcare workers more often reported directly after they provided care for their clients (19.0% vs 42.1%; p<0.001) and fewer reports were made after the end of their shift. 
Also, no correlations were found between SDR use and technology acceptance aspects or perceived work pressure. Conclusions: The current SDR technology offers time savings and high accuracy regardless of the device used (smartphone or laptop). However, technological capability alone does not automatically translate to reduced perceived work pressure among care workers. The findings suggest that the challenge has shifted from technical feasibility to implementation strategy and behavioral change.
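The abstract above reports transcription accuracy as a Word Error Rate (WER) without stating how it was computed. As a rough illustration only, the sketch below uses the common definition (word-level edit distance divided by the number of reference words); the function name and the example sentences are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Common WER definition: (substitutions + deletions + insertions) / reference words.

    Assumption: simple whitespace tokenization; the study may have normalized text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word in an eight-word report.
example_wer = word_error_rate(
    "client ate breakfast and took medication at eight",
    "client ate breakfast and took medication at night",
)
```

Under this definition, a WER below 0.05 corresponds to fewer than one word error per 20 reference words.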

  • Wearable Thermal Sensing for Real-Time Smoking and Eating Activity Detection: A Confirm-Refute Study

    Date Submitted: Jun 16, 2026
    Open Peer Review Period: Jun 17, 2026 - Aug 12, 2026

    Background: Smoking and overeating are repetitive hand-to-mouth behaviors that contribute to highly prevalent yet preventable diseases. Most existing wearable systems have not been validated in free-living conditions to detect these behaviors in real time. Objective: Leveraging shared behavioral patterns of eating and smoking, we developed HabitSense, a wearable system that integrates thermal sensors, a privacy-conscious camera, and on-device algorithms to detect smoking and eating events in real time and trigger a paired smartwatch to collect contextual data using ecological momentary assessment (EMA). We evaluated the detection accuracy of HabitSense in a free-living user study. Methods: Seventeen participants (9 in the smoking cohort and 8 in the eating cohort) were instructed to wear HabitSense, a custom necklace paired with a smartwatch, during waking hours for 7 consecutive days. Two separate machine-learned algorithms processed data from the thermal sensor array and camera on-device. When HabitSense predicted a smoking or eating event, the smartwatch prompted a micro-Ecological Momentary Assessment (micro-EMA) asking the participant to confirm or refute the prediction (“Are you smoking?” yes/no; “Are you eating?” yes/no). Additionally, an integrated camera recorded video to enable visual confirmation of each predicted smoking and eating event. Results: In total, 780.6 hours of sensor data were collected, capturing 217 smoking episodes and 87 eating episodes. The necklace generated 229 smoking-event predictions, of which 209 (91%) were true positives and 20 (9%) were false positives. Eight undetected smoking episodes were identified through manual review of the video footage (3% of total episodes). Participants responded to 212 EMA smoking-event prompts (92.6%); of these responses, 206 (97.2%) were correct (i.e., participants responded “yes” during actual smoking events and vice versa). 
The necklace also generated 84 eating-event predictions, of which 67 (79.8%) were true positives and 17 (20.2%) were false positives. Twenty undetected meals were identified in video footage (23% of total meals). Conclusions: The findings suggest that the proposed system is feasible for automated and objective monitoring of contextual triggers associated with smoking relapse. HabitSense demonstrated high accuracy in smoking detection and strong response rates to smoking-triggered EMAs, supporting its potential for real-time behavioral assessment in free-living settings. For eating detection, the variability and complexity of food-related behaviors indicate that more advanced machine-learning approaches may be required, particularly for deployment on highly resource-constrained wearable devices. Future work will expand EMA queries to capture contextual factors surrounding smoking and eating episodes and leverage these data to develop just-in-time smartwatch-based interventions. Ultimately, this work aims to enable a personalized, adaptive intervention system that accounts for individual differences in behavior, a dimension often insufficiently addressed in current smoking cessation strategies.
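The detection counts in the abstract above can be read as precision (share of predictions that were correct) and recall (share of actual episodes that were detected). The sketch below recomputes both from the reported true positives, false positives, and undetected episodes; the function and variable names are illustrative assumptions, and the abstract itself does not use the terms precision or recall.

```python
def detection_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Precision and recall from event-level counts.

    precision = TP / (TP + FP): fraction of predicted events that were real.
    recall    = TP / (TP + FN): fraction of real events that were detected.
    """
    predicted_events = true_positives + false_positives
    actual_events = true_positives + false_negatives
    return {
        "predicted_events": predicted_events,
        "actual_events": actual_events,
        "precision": true_positives / predicted_events,
        "recall": true_positives / actual_events,
    }


# Counts as reported in the abstract (undetected episodes treated as false negatives).
smoking = detection_metrics(true_positives=209, false_positives=20, false_negatives=8)
eating = detection_metrics(true_positives=67, false_positives=17, false_negatives=20)

for label, metrics in (("smoking", smoking), ("eating", eating)):
    print(f"{label}: precision={metrics['precision']:.3f}, recall={metrics['recall']:.3f}")
```

With these counts, the predicted-event totals (229 and 84) and actual-episode totals (217 and 87) match the figures stated in the abstract.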

  • Trust in Generative Artificial Intelligence Chatbots for Mental Health Support: A Systematic Review

    Date Submitted: Jun 15, 2026
    Open Peer Review Period: Jun 16, 2026 - Aug 11, 2026

    Background: Generative artificial intelligence (GenAI) chatbots are increasingly used for mental health support, but trust in these emotionally vulnerable and relationally sensitive interactions remains poorly understood. Objective: This study aims to synthesize empirical evidence on how trust is conceptualized, shaped, and associated with outcomes in GenAI-based mental health support, with attention to differences across AI roles. Methods: This systematic review was conducted in accordance with the PRISMA 2020 guidelines. Peer-reviewed empirical studies were identified by searching five electronic databases. Two reviewers independently screened records, selected eligible studies, extracted data, and assessed methodological quality using the Mixed Methods Appraisal Tool. Data were synthesized descriptively and thematically. Results: Of 1180 citations retrieved, 28 studies were included. Trust was rarely explicitly defined and was most often operationalized through affective and relational indicators, including emotional comfort, perceived empathy, and psychological safety, rather than technical competence alone. Trust-related antecedents involved user vulnerability and attitudes, system reliability and emotional responsiveness, interactional continuity, and contextual constraints. Trust was associated with engagement and emotional relief, but also with relational and safety concerns, including excessive reliance on AI support, blurred role boundaries, and inadequate responses to crisis-related disclosures. Role-based synthesis suggested that lower-engagement roles (eg, functional assistants and structured facilitators) mainly involved cognitive trust, whereas more emotionally engaging or autonomous roles (eg, empathetic co-therapists and therapeutic companions) involved broader affective and alliance-like trust, together with greater relational and safety risks. 
Conclusions: Trust in GenAI-based mental health support should not be treated as a uniform or inherently desirable outcome. Role-sensitive evaluation and governance are needed to align user trust with system capability, safety boundaries, and responsibility.

  • Background: Digital psychological interventions have emerged as a promising strategy to address the growing psychosocial needs of cancer survivors. However, the specific contribution of interventions delivered with active involvement of trained mental health professionals remains insufficiently understood, particularly across different phases of cancer survivorship. Objective: This systematic review evaluates the effectiveness of professionally guided digital psychological interventions in improving psychological and symptom-related outcomes among adult cancer survivors across different phases of survivorship. Methods: Following PRISMA guidelines, a systematic search of PubMed, Scopus, and Ovid MEDLINE was conducted to identify studies published between 2013 and 2025. Eligible studies included adult cancer survivors receiving professionally guided digital psychological interventions delivered through web-based platforms or videoconferencing by trained mental health professionals. Data were extracted and synthesized narratively, and methodological quality was assessed using established risk-of-bias criteria. Results: 32 studies met the inclusion criteria, the majority of which were randomized controlled trials, with sample sizes ranging from 9 to 269 participants. Interventions included cognitive-behavioural, mindfulness-based, and supportive approaches delivered via videoconferencing or web-based platforms, with active involvement of trained mental health professionals. Most interventions were delivered synchronously (78%) and focused on acute (31%) and extended (62.5%) survivorship phases. Across studies, guided digital interventions were consistently associated with reductions in psychological distress, anxiety, and depression, as well as improvements in fear of cancer recurrence. 
Significant reductions were also observed in symptom burden, including fatigue, pain, and sleep disturbances; for example, one randomized trial reported a greater decrease in fatigue severity in the intervention group compared to controls (between-group difference = 0.48; p = 0.04). Improvements extended to quality of life and key psychological processes such as mindfulness, coping, and self-compassion. Overall methodological quality was fair to good. Conclusions: These findings suggest that professionally guided digital psychological interventions provide clinically meaningful benefits for cancer survivors, with their effectiveness likely linked to the preservation of structured therapeutic processes and active professional involvement, supporting their integration into stepped or blended models of survivorship care.

  • Background: Digital Health Technologies (DHTs) enable remote, objective and frequent assessments of motor signs in people living with Parkinson’s (PwP). It remains unclear whether participant satisfaction with daily DHT interactions relates to long-term adherence and clinical outcomes. Objective: To quantify PwP’s satisfaction with daily remote DHT monitoring and its relationships with DHT adherence, motor disease severity, and anxiety and depression. Methods: Data from 710 participants living with early-stage Parkinson’s were analyzed from the PASADENA Phase IIa (NCT03100149; n=293) and PADOVA Phase IIb (NCT04777331; n=417) studies. At baseline, PASADENA participants (H&Y 1–2) were treatment-naive or on stable MAO-Bi, PADOVA participants (H&Y 1–2) were on stable L-DOPA or MAO-Bi. Remote monitoring with the Roche PD Mobile Application included active tests (AT), surveys, and passive monitoring (PM) via a study watch and phone, over 2 years (PASADENA) or 76–172 weeks (PADOVA). Towards the end of both studies (follow-up), participants completed questionnaires on DHT acceptance (Likert-scale: 1=very negative, 5=very positive) and open feedback. “Global satisfaction” with the DHT was defined as the mean Likert score from 4 comparable questions in the PASADENA and PADOVA questionnaires. Open feedback was analyzed qualitatively. Spearman’s correlations related global satisfaction levels with overall DHT adherence, self-reported motor experiences of daily living (MDS-UPDRS Part II), clinician-rated motor sign severity (MDS-UPDRS Part III) and, for PASADENA, anxiety and depression (HADS). Results: Global satisfaction was 4.2/5 (SD=0.7) in PASADENA and 3.9/5 (SD=0.7) in PADOVA, reflecting overall positive sentiments. Most participants rated all DHT aspects positively, including study devices and app (PASADENA: 3.6±1.0; PADOVA: 3.4±1.0) and daily active testing (PASADENA: 3.9±1.3; PADOVA: 3.9±1.2). 
Respondents highlighted areas for improvement in open feedback, including technical issues (PASADENA: 20%; PADOVA: 29%), repetitive ATs/surveys (PASADENA and PADOVA: 13%) and device usability (PASADENA: 10%; PADOVA: 12%). Higher global satisfaction ratings were weakly to moderately associated with higher adherence levels in PASADENA (AT: ρ=0.40; PM smartphone [PMsp]: ρ=0.29; PM smartwatch [PMsw]: ρ=0.23; all P<.001) and PADOVA (AT: ρ=0.24; PMsp: ρ=0.18; PMsw: ρ=0.17; all P<.001). Global satisfaction showed negligible-to-weak and inconsistent associations (−0.11≤ρ≤0.14) with MDS-UPDRS Parts II and III across studies. In PASADENA, lower global satisfaction ratings were weakly associated with higher levels of anxiety (ρ=−0.15, P=.02) and depression (ρ=−0.11, P=.06). Conclusions: Participants living with early-stage Parkinson’s were generally satisfied with daily DHT testing in clinical trials over ~1.5–3 years. Satisfaction was positively associated with adherence but not robustly with motor disease severity, anxiety, or depression. Improving device design, battery life and test variety may boost satisfaction and adherence, strengthening DHT-based monitoring. Clinical Trial: ClinicalTrials.gov NCT03100149 (PASADENA), NCT04777331 (PADOVA)

  • Background: Remote digital phenotyping has expanded the scalability of cognitive neuroscience studies. However, the integrity of millisecond-level response time (RT) data relies on the client-side graphics pipeline. When local Graphics Processing Units (GPUs) become unavailable, operating systems and web browsers silently transition to software-based Central Processing Unit (CPU) renderers. The extent to which these software fallbacks corrupt behavioral metrics remains unquantified. Objective: To characterize the technical constraints of software-based rendering architectures and evaluate their systemic impact on the data validity of remote, web-based cognitive tasks. Methods: We evaluated behavioral outcomes and technical paradata across two studies using our custom Adaptive Cognitive Evaluation-Explorer (ACE-X) platform. Study 1 utilized a naturalistic longitudinal sample (N = 864,702 trials; n = 277 participants) to observe real-world performance under the legacy software-based Google SwiftShader renderer. Study 2 employed a controlled, within-subjects experimental design (N = 4,089 trials; n = 74 participants) on Windows machines running Google Chrome to isolate hardware acceleration (native GPU) against software-based rendering (Windows Advanced Rasterization Platform (WARP)). Statistical profiling was conducted using stratified outlier removal and linear mixed-effects models (LMMs) with log-transformed RTs. Results: In Study 1, software rendering with SwiftShader introduced a massive, statistically significant delay, increasing baseline reaction times by 171.23% (β = 0.9978, P < .001), yielding an average hardware penalty of 515 ms (816 ms CPU vs 301 ms GPU). Study 2 experimentally validated this behavior, showing that WARP significantly inflated reaction times by 39.60% (β = 0.3336, P < .001), yielding a baseline penalty of 107 ms (377 ms CPU vs 270 ms GPU). 
Software rendering increased visual frame instability (FPS (frames per second) Coefficient of Variation) by over 1.5 standard deviations (P < .001). Furthermore, the integration of random slopes demonstrated that individual participant reaction times varied heterogeneously in response to this hardware-induced jitter (P < .001). Conclusions: Software-based rendering pipelines act as destructive technical artifacts in digital research, introducing profound, non-uniform delays and visual stutters that mask true psychophysiological signals. Because high individual heterogeneity renders uniform post-hoc linear corrections mathematically invalid, researchers collecting high-resolution timing data on varying hardware must actively capture graphics paradata and exclude software-rendered sessions. Ultimately, these mitigation strategies must be balanced with health equity considerations, as systematic data exclusion risks underrepresenting populations with restricted access to optimized hardware or stable device configurations.
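
The percentage figures in this abstract follow from the log-transformed outcome: in a model of log RT, a coefficient β on the rendering condition corresponds to a multiplicative change of exp(β) in reaction time. The sketch below is a minimal illustration, assuming natural-log RTs and a 0/1 software-rendering indicator (neither is stated explicitly in the abstract); the function name is hypothetical.

```python
import math


def pct_change_from_log_beta(beta: float) -> float:
    """Percent change in RT implied by a coefficient on natural-log RT."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients as reported in the abstract.
study1_pct = pct_change_from_log_beta(0.9978)  # SwiftShader vs native GPU
study2_pct = pct_change_from_log_beta(0.3336)  # WARP vs native GPU

print(f"Study 1: {study1_pct:.2f}%")
print(f"Study 2: {study2_pct:.2f}%")
```

Under these assumptions the two values round to the reported 171.23% and 39.60%, and they are in line with the ratios of the reported mean RTs (816/301 and 377/270).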

  • Background: Addressing the burnout epidemic among health care providers remains one of the foremost challenges of modern health care systems. Past researchers have turned to passively gathered biometric data from wearable devices to identify key variables that predict burnout. However, this has resulted in limited success. Objective: This study aims to test the feasibility and accuracy of using a machine learning approach to classifying health care providers as at higher or lower risk of burnout based upon biometric data. Methods: We conducted a 3-month study where health care providers self-reported burnout every 6 weeks and wore a Garmin Venu 3 smartwatch to collect continuous biometric data. Participants were recruited from 3 pediatric urgent care sites and an inpatient hospitalist division of a pediatric health care system. We had a final sample of 41 participants who met our inclusion criteria requiring completion of burnout measurements at two of the three timepoints and having worn the watch for at least 70% of the study period. Participants were asked to wear the Garmin watch continuously over the course of the study. Burnout self-report surveys were delivered via email. Our main outcome was burnout as measured by the Maslach Burnout Inventory (MBI-HSS). The measure consists of three subscales that together constitute burnout: emotional exhaustion, depersonalization, and reduced personal efficacy. Results: First, we found evidence of feasibility for wearable watches to be used in situ for health care providers to gather data passively for the purposes of classifying burnout. 73% of participants met our criteria of wearing the watch 70% of the time. Second, our machine learning model achieved strong overall performance in identifying risk of burnout with strong sensitivity and specificity, demonstrating that biometric data collected by wearable devices can accurately predict burnout. 
Conclusions: Our evidence of feasibility offers promising results for a passive method of data collection to be used for classifying burnout risk in other health care settings. Further, the strong performance of our machine learning model provides a foundational step towards improving an organization’s ability to respond and mitigate burnout through thoughtful intervention strategies.

  • Background: The increasing prevalence of type 2 diabetes mellitus (T2DM) necessitates scalable prevention strategies. To date, prediabetes screening and early intervention, as diabetes prevention steps in primary health care, remain suboptimal in effectiveness and feasibility. Objective: We aimed to develop "SI-GAP," a mobile health application for prediabetes screening and intervention, and evaluate its effectiveness in Indonesian primary care. Methods: This exploratory sequential mixed-methods study consisted of a qualitative phase to design the app and a quantitative phase comprising a diagnostic test and a randomized controlled trial (RCT). The modified American Diabetes Association risk score integrated into SI-GAP was validated for prediabetes screening. Participants with confirmed prediabetes were then randomly assigned to receive either the SI-GAP app intervention or conventional care for 12 weeks. The primary outcomes were changes in HbA1c and fasting blood glucose (FBG). Results: The application-based screening instrument showed a sensitivity of 82.42% and a specificity of 31.44%. In the RCT, the SI-GAP group demonstrated significant reductions in HbA1c (from 6.05% to 5.50%; P<.001) and FBG (from 98.65 mg/dL to 94.33 mg/dL; P<.001) compared with the control group. Analysis using the Unified Theory of Acceptance and Use of Technology (UTAUT) model demonstrated that perceived utility and ease of use substantially influenced user intention. Conclusions: The SI-GAP app serves as a valid screening instrument and an effective intervention app to improve glycemic control in primary care settings. It provides a scalable digital solution for diabetes prevention in resource-limited settings. Clinical Trial: NCT04979559

  • Background: Artificial intelligence is increasingly embedded in prior authorization (PA) and utilization management (UM) systems across commercial and Medicare Advantage health plans. Emerging evidence, including American Medical Association survey data showing 94% of physicians report PA negatively impacts clinical outcomes, and Senate investigative findings linking AI-assisted adjudication to denial rates up to 16 times higher than typical benchmarks, indicates that AI-driven PA is amplifying existing harms rather than correcting them. Despite regulatory attention from the Centers for Medicare & Medicaid Services (CMS) and the Office of Inspector General (OIG), no standardized compliance framework explicitly governs the deployment of AI in PA decision-making. Objective: This paper argues that existing healthcare compliance infrastructure, including OIG compliance program guidance, HIPAA nondiscrimination requirements, CMS coverage determination standards, and internal audit mechanisms, provides a largely underutilized foundation for governing AI-driven PA systems. We propose a structured Algorithmic Accountability Framework (AAF) to help health plan compliance officers and executives navigate uncertainty in AI-enabled utilization management. Methods: Drawing on regulatory guidance, published denial rate analyses, American Medical Association survey data, and organizational compliance program design principles, we identify five governance domains where existing compliance infrastructure can be applied or extended to AI PA systems: (1) algorithm transparency and documentation, (2) clinical validity and human oversight, (3) disparate impact monitoring, (4) appeals process integrity, and (5) vendor oversight and contractual accountability. We further integrate a patient-agency lens drawn from the Prepare/Verify/Protect framework, positioning patients as an underutilized accountability mechanism in AI-driven PA governance. 
Results/Discussion: The AAF maps each governance domain to existing regulatory obligations and operational controls that most health plans have in place today. We argue that the systemic misclassification of AI PA tools as IT or operational efficiency systems, rather than high-risk compliance matters, is the primary organizational barrier to adequate governance. Compliance officers, not data science or IT teams, hold the cross-cutting authority needed to own AI PA governance. Patient complaint, grievance, and appeal data, disaggregated by AI involvement, constitute an underutilized error-detection layer that supplements internal compliance monitoring.

  • Make the Connection: An Analysis of Enrollment and Adherence in an Online Parenting Intervention

    Date Submitted: Jun 11, 2026
    Open Peer Review Period: Jun 12, 2026 - Aug 7, 2026

    Background: Online parenting interventions provide a unique opportunity to increase access to, and the scalability of, evidence-based parenting information that can enhance parenting practices, caregiver well-being, and child developmental outcomes. While online parenting programs can reduce access barriers, less is known about who enrolls in such programs, how participants engage with them, and whether engagement is sustained over time. This knowledge could help inform parenting program usability and engagement strategies. Objective: This study explored the demographic characteristics of caregivers who enrolled in (registered for) and adhered to (completed) an online parenting intervention protocol and examined sociodemographic factors associated with engagement. Methods: Data were drawn from a larger pragmatic randomized controlled trial evaluating the online Make the Connection® (MTC) program, which aims to promote healthy parent-child relationships. This secondary analysis focused on participants assigned to the intervention group (N = 215). Baseline sociodemographic characteristics were collected using an online survey prior to enrolling in the intervention. Descriptive statistics and logistic regression models were used to examine correlates of enrollment and adherence. Results: Participants were predominantly women (91.6%, 197/215), with a mean age of 35.6 years (SD = 5.19), and their children had a mean age of 13.9 months (SD = 10.7). Of the 215 participants assigned to the intervention, 107 (49.8%) enrolled in the program, while 108 (50.2%) did not. Younger child age was associated with a higher likelihood of enrollment (OR = 0.96, 95% CI 0.93-0.98, P = .002). Older caregiver age was associated with greater likelihood of enrollment (OR = 1.07, 95% CI 1.01-1.14, P = .017) and adherence (OR = 1.11, 95% CI 1.02-1.22, P = .020). 
Caregiver social isolation was associated with a lower likelihood of adherence (OR = 0.30, 95% CI 0.11-0.84, P = .022), but not enrollment (OR = 1.12, 95% CI 0.62-2.03, P = .706). Depressive symptoms were not significantly associated with program adherence (OR = 1.21, 95% CI 0.45-3.28, P = .708) or enrollment (OR = 0.97, 95% CI 0.90-1.05, P = .457). Conclusions: Results suggest that different factors influence enrollment and adherence in online parenting programs. Although digital delivery may reduce barriers to access, additional strategies, such as goal-setting tools, personalised feedback, tailored reminders, and opportunities for peer connection, may be needed to support sustained engagement, particularly among caregivers at risk of disengagement. This study identified sociodemographic factors that can inform targeted strategies to improve engagement among caregivers most vulnerable to disengagement. Clinical Trial: NCT05770414

  • Background: Acute pancreatitis-associated acute kidney injury (AP-AKI) is linked to substantial morbidity and mortality, but the comparative performance of artificial intelligence (AI) models for early AP-AKI prediction and mortality risk stratification remains uncertain. Objective: This systematic review and network meta-analysis evaluated the diagnostic performance and comparative ranking of AI models for early prediction of AP-AKI and AP-AKI-related mortality. Methods: PubMed, Embase, Web of Science, Cochrane Library, and IEEE Xplore were searched from inception to February 23, 2026. Eligible studies developed or validated AI, machine learning, or deep learning models for AP-AKI or mortality among patients with AP-AKI and reported reconstructable diagnostic accuracy data. Two reviewers independently extracted study and model characteristics, validation methods, and 2 x 2 diagnostic data. Risk of bias was assessed with PROBAST+AI, certainty of evidence with GRADE, pooled accuracy with bivariate random-effects models, and algorithm rankings with Bayesian diagnostic network meta-analysis. Results: Fourteen studies were included. For AP-AKI prediction, 11 studies with 35 validation datasets yielded pooled sensitivity of 0.76, specificity of 0.85, and area under the receiver operating characteristic curve of 0.87; XGBoost ranked highest for sensitivity and diagnostic odds ratio. For mortality prediction, three studies with 66 validation datasets yielded pooled sensitivity of 0.73, specificity of 0.77, and area under the receiver operating characteristic curve of 0.81; support vector machine ranked highest for sensitivity and diagnostic odds ratio. 
Certainty of evidence was mostly low or very low, mainly because of heterogeneity, retrospective designs, limited external validation, and incomplete calibration or clinical utility reporting. Conclusions: AI models show promising but heterogeneous performance for AP-AKI prediction and mortality risk stratification. Prospective multicenter validation, calibration assessment, workflow evaluation, and clinical utility studies are needed before routine implementation. Clinical Trial: PROSPERO CRD420261360696.
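
The abstract ranks algorithms by diagnostic odds ratio (DOR) but reports only pooled sensitivity and specificity. As a rough illustration of how these quantities relate, the sketch below derives likelihood ratios and a DOR from a sensitivity/specificity pair. It is a back-of-the-envelope calculation on the pooled point estimates, not the review's bivariate or Bayesian network model, so the values it prints are illustrative and are not results reported by the authors. The function names are hypothetical.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = LR+ / LR-, i.e. (TP/FN) / (FP/TN)."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


# Pooled point estimates quoted in the abstract.
dor_aki = diagnostic_odds_ratio(0.76, 0.85)        # AP-AKI prediction
dor_mortality = diagnostic_odds_ratio(0.73, 0.77)  # mortality prediction

print(f"Illustrative DOR, AP-AKI: {dor_aki:.2f}")
print(f"Illustrative DOR, mortality: {dor_mortality:.2f}")
```

A DOR of 1 means the test does not discriminate; larger values indicate better discrimination, which is why the metric can be used to rank models.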

  • Ethics of Health Data Infrastructures: Toward Continuous Governance and Public Trust

    Date Submitted: Jun 11, 2026
    Open Peer Review Period: Jun 12, 2026 - Aug 7, 2026

    In April 2026, reports of de-identified UK Biobank participant data being listed on overseas commercial online platforms highlighted concrete vulnerabilities in health data governance. Drawing on this incident, as well as broader discussions on secondary use, cross-border sharing, and public trust, this Viewpoint argues that traditional, trust-based, pre-access review models possess significant ethical and operational limitations. The core concern is not data sharing, commercial involvement, or international collaboration per se, but rather the movement of participant-contributed data beyond approved research governance into external commercial digital environments. Preserving public trust and the "social license" of health data infrastructures—defined as ongoing public acceptance of institutional data practices—requires a shift from access gatekeeping to continuous, proportionate stewardship. This should encompass Trusted Research Environments, audit logging, and ethics-by-design approaches that render secure data use practicable without encouraging insecure workarounds. As initiatives such as the European Health Data Space develop, legal alignment must be complemented by operational accountability for downstream use. Policymakers, funders, data access bodies, infrastructure custodians, and technology providers should embed auditability, user-friendly secure environments, and participant- and public-facing transparency into routine governance. Health data infrastructures can sustain scientific validity and public trust only when responsible data sharing is coupled with continuous, practical, and publicly accountable stewardship.

  • Background: Diabetes affects 11.1% of adults worldwide, and care access is limited by cost and specialist shortages. Prior meta-analyses used small samples and rarely assessed publication bias or evidence certainty. Objective: To estimate telemedicine's effect on glycated hemoglobin (HbA1c), body mass index (BMI), and self-efficacy in diabetes, explore heterogeneity, and grade evidence certainty. Methods: Data sources: PubMed, EMBASE, the Cochrane Library, and Web of Science were searched from inception to May 15, 2026. Study selection: Randomized clinical trials (RCTs) of telemedicine vs usual care, in-person care, or no intervention in type 1, type 2, or gestational diabetes (T1DM, T2DM, GDM) reporting HbA1c, BMI, or self-efficacy were eligible. Of 7,033 records, 145 RCTs (147 arms) were included. Data extraction and synthesis: Two reviewers independently screened, extracted, and assessed risk of bias (Joanna Briggs Institute tool) per PRISMA 2020. Random-effects (REML) models included subgroup and meta-regression analyses by diabetes type, comorbidity, and duration. Publication bias was assessed using Egger's test, trim-and-fill, PET-PEESE, and worst-case meta-analysis; certainty was graded with GRADE. PROSPERO: CRD420261387371. Main outcomes and measures: Mean differences (MDs) for HbA1c and BMI; standardized mean differences (SMDs) for self-efficacy; 95% CIs. Results: In 31,657 participants, telemedicine reduced HbA1c (k = 133; MD, −0.36%; 95% CI, −0.43 to −0.28) and improved self-efficacy (k = 22; SMD, 0.57; 95% CI, 0.21 to 0.94) but did not affect BMI (k = 64; MD, −0.23 kg/m²; 95% CI, −0.46 to 0.01). The HbA1c effect varied by diabetes type (P for interaction < 0.001; near-null in GDM), comorbidity (P = 0.016; largest with depression or anxiety, MD, −0.68%), and duration (P = 0.030; weaker beyond 6 months); residual I² = 76.7%. All bias corrections attenuated the HbA1c estimate (PET P = 0.197); the self-efficacy estimate did not survive trim-and-fill or worst-case analysis. 
GRADE certainty was very low for HbA1c and self-efficacy and low for BMI. Conclusions: Telemedicine was associated with modest improvements in HbA1c and self-efficacy but not BMI, strongest in T2DM, comorbid depression or anxiety, and interventions ≤3 months. However, certainty was low to very low, and bias corrections suggest the pooled effects may be inflated. Targeted, short-duration telemedicine is promising but not sufficient for broad implementation; rigorous pre-registered RCTs are needed.

  • Background: There is increasing interest in developing AI tools to identify and address individual-level social determinants of health in both health care and human service settings. These activities are part of social care integration, which at the individual level involves identifying individuals with social risks (awareness) and connecting them with relevant social care resources (assistance). Social care providers such as community health workers and social workers are deemed critical stakeholders in both settings. Chatbots have shown feasibility and acceptability for social risk screening in emergency departments and primary care centers. However, it is unclear whether a screening chatbot is worth developing for safety net health care and human service settings, where visitors have greater social needs and organizations have fewer resources. Objective: This study aims to investigate the perceived value proposition of an AI-based chatbot for social risk screening from the perspectives of social care providers in safety net health care and human service organizations. Providers’ perceived value propositions of other AI-based applications for social care integration were also examined. Methods: We conducted semi-structured interviews with 19 social care providers from 16 safety net health care and human service organizations in Michigan who had experience with awareness, assistance, or both. Interview questions focused on their experiences and challenges regarding awareness and assistance when applicable. A simulated screening chatbot based on ChatGPT-4o was also used to solicit their feedback on the technology. The nonadoption, abandonment, scale-up, spread, and sustainability (NASSS) framework was used to guide data analysis. Interview transcripts were first coded deductively, then inductively. Results: Social care providers perceived the screening chatbot as offering limited value. 
This is mainly because many participants engaged in assistance activities and they noted addressing social needs is a multi-step process requiring follow-up that screening chatbots do not provide. In addition, they valued cultivating trust as many patients/clients have high social needs and lack trust in the health care system, and felt chatbots present new challenges for maintaining essential trust and care quality with clients/patients. Instead of a screening chatbot, we identified that technologies to reduce documentation burden could improve providers’ efficiency and potentially increase time spent with patients/clients. This is because social risks and needs documentation generates administrative burden for providers. We also found that technologies to improve referral accuracy and engagement could improve providers’ effectiveness, as study participants have overall limited access to technologies that typically support referral-related activities. Conclusions: Social care providers in the safety net preferred AI-based applications for addressing documentation burden and social needs assistance rather than for social risk screening. Future strategies to develop AI tools for social care integration should align with social care providers’ professional values and focus on equity-centered care.

  • Characterizing Patient Portal Usage and Implications: A Data-driven Analysis on a Rural Population

    Date Submitted: Jun 10, 2026
    Open Peer Review Period: Jun 11, 2026 - Aug 6, 2026

    Background: Patient portals have become a primary channel for asynchronous patient–provider communication, yet rural-specific communication patterns remain underexplored. Objective: This study characterizes portal communication patterns in a rural-serving academic medical center over 5 years and links messaging behaviors to markers of care burden. Methods: We analyzed 370,498 patient messages and 256,295 provider responses from 10,206 patients at Dartmouth Hitchcock Medical Center (2020-2024). We also analyzed the linked structured electronic health record (EHR) data for each patient. We used validated large language model (LLM)–based classifiers for thematic analysis and message authorship identification at scale. Results: Female patients and older adults generated disproportionately high message volumes. Anxiety (47.0%), hypertension (36.1%), and lipid disorders (33.7%) were the most prevalent conditions. Information seeking dominated portal communication (33.4%). The median response time to a patient message was 10.6 hours, and patients with dementia or cerebrovascular conditions waited the longest. Care partner-authored messages were substantially more common among patients with Alzheimer disease and dementia (approximately 55%-60%) than among those without (5%). Conclusions: Rural portal communication reflects systematic disparities linked to age, sex, and clinical complexity. LLM-based analysis enables scalable characterization of thematic patterns and clarity failures that may inform AI-assisted triage.

  • Background: Sensor metadata is critical for exposure health research because it supports accurate sensor identification, deployments, data integration, interoperability, and reproducibility. Yet it is often fragmented across multiple heterogeneous sources, such as scientific literature and manufacturer guides, where key specifications are frequently reported indirectly through citation chains, making reference tracing essential for metadata enrichment and completeness. Objective: To address this bottleneck, we developed and evaluated an LLM-based automated, citation-aware pipeline that enriches sensor metadata extracted from a primary article by identifying sensor-related citation markers and extracting additional metadata from the referenced sources. Methods: We extend our prior LLM-based metadata extraction approach by (i) detecting sensor mentions in full-text articles, (ii) capturing nearby citation markers, (iii) resolving markers to full bibliographic entries in the reference list, and (iv) retrieving cited papers to extract additional sensor metadata that may be absent from the primary document and using it to enrich and complete the base metadata. Results: Across 20 primary papers, the citation extraction component achieved 74.2% precision, 92.0% recall, 82.1% F1-score, and 69.7% accuracy, and all extracted bibliographic entries were correctly matched to their source references. This component increased sensor extraction by about 261%, yielding 94 additional sensors overall. Conclusions: The developed citation-guided pipeline improved sensor discovery and metadata completeness, thereby supporting the development of richer, more complete sensor metadata repositories.
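
The four reported metrics for the citation extraction component can be cross-checked against each other. The F1-score is the harmonic mean of precision and recall. The reported accuracy is consistent with a definition that has no true negatives, TP / (TP + FP + FN), which is common for extraction tasks. That definition is an assumption here, since the abstract does not state how accuracy was computed. A minimal sketch with hypothetical function names:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def accuracy_without_true_negatives(precision: float, recall: float) -> float:
    """TP / (TP + FP + FN), rewritten in terms of precision and recall.

    Assumption: the task has no meaningful true negatives, so
    1/accuracy = 1/precision + 1/recall - 1.
    """
    return 1.0 / (1.0 / precision + 1.0 / recall - 1.0)


# Precision and recall as reported in the abstract.
precision, recall = 0.742, 0.920

f1 = f1_score(precision, recall)
acc = accuracy_without_true_negatives(precision, recall)

print(f"F1 = {f1:.3f}, accuracy (no TN) = {acc:.3f}")
```

Under this assumption both derived values agree with the reported 82.1% F1-score and 69.7% accuracy to one decimal place.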

  • Background: The World Health Organization reports that chronic noncommunicable diseases account for 74% of global deaths. Despite rapid advances in digital health technology, artificial intelligence (AI) tools for self-management remain deficient in two crucial elements: emotional connection with patients and trustworthiness. Both topics are attracting increasing interest. However, research on the trustworthiness of AI and research on the emotional intelligence of AI have developed separately, which is itself a considerable design gap. Self-management of chronic conditions therefore needs a design approach that integrates these two concerns. Objective: This systematic review aims to examine the extent to which trustworthiness, emotional intelligence, situational awareness, and personalization are integrated within AI systems designed to facilitate self-management of chronic illness, and what the impact of that integration is. Methods: From the beginning of February 2026, a thorough search was undertaken on 6 databases (PubMed, Scopus, IEEE Xplore, PsycINFO, Web of Science, and ACM Digital Library) along with connected papers, using Boolean strings combining AI, chronic disease self-management, trust, and emotional intelligence (tailored to each database's conventions). Identified articles were screened in two phases, following PRISMA 2020 and predetermined inclusion and exclusion criteria, before critical appraisal using the MMAT v2018 and CASP tools. Results: Of the 1,486 studies that entered the selection process, 45 met the inclusion criteria. 
Four major themes emerged from the papers: technology in chronic illness self-management; trust in human-AI interaction; empathy and emotional intelligence in AI; and the ethics, equity, and ethical application of AI. The quality appraisal showed that 91% (41/45) of the selected studies were rated as high quality, with an average appraisal score of 93%. In addition, 73% (33/45) of the selected papers were published during 2024 and 2025, indicating a high-quality and contemporary body of literature on the subject. Conclusions: Despite advances in AI for chronic disease management and in trust-empathy theory, these fields remain siloed. We identify 5 critical research gaps: the lack of a combined trust-empathy model, the under-specification of context awareness, the absence of equity considerations, the exclusion of overtrust considerations, and the lack of long-term studies demonstrating the effectiveness and safety of emotional AI systems. Clinical Trial: Not registered. The review protocol was developed before the search but was not prospectively registered in PROSPERO or an equivalent database. This limitation is discussed in Section 4.4.

  • Background: Mobile health (mHealth) offers new possibilities for self-management among elderly patients with chronic diseases. However, age-related physiological decline, reduced cognitive function, and low digital literacy create a significant "digital divide," hindering their effective access to and use of mHealth services. Adolescents, as "digital natives," hold significant potential in helping their elderly family members adapt to digital technologies. Nevertheless, the mechanisms, action patterns, and influencing factors of their backfeeding behaviors remain unclear. Objective: This study aims to explore the conditions, action/interaction strategies, and consequences of digital backfeeding from adolescents to elderly patients with chronic diseases for mHealth adoption, and to construct a mechanism model based on grounded theory. Methods: This study followed the procedural grounded theory approach by Strauss and Corbin. From April 2025 to January 2026, using purposive and theoretical sampling, we recruited 15 adolescents (aged 14-24 years) who provided digital backfeeding to elderly relatives with chronic diseases for semi-structured in-depth interviews. We followed a three-level coding paradigm: open coding, axial coding, and selective coding. Data analysis was performed using NVivo 15 software. Two researchers independently performed all coding, and disagreements were resolved through team discussion. Results: Among the 15 participants, 11 were female, and 4 were male; 9 were students, and 6 were employed; 8 lived with their elderly patient relatives, and 6 lived separately. The primary care recipients were grandparents (9 participants), and the main chronic diseases were hypertension (10 cases), diabetes (5 cases), and heart disease (5 cases). 
Coding analysis generated 87 initial concepts, which were grouped into 32 categories, and finally integrated into 4 antecedent conditions (individual characteristics of elderly patients, family intergenerational context, technology and task environment, and health management needs), 4 action/interaction strategies (proxy operation mode, digital teaching empowerment mode, information intermediary adjustment mode, and collaborative management mode), and 4 consequence dimensions (impact on elderly patients, impact on adolescents, impact on family intergenerational relations, and impact on the backfeeding process itself). Based on these findings, a systematic digital backfeeding mechanism model was constructed. The model reveals 15 typical backfeeding pathways, including empowerment success, proxy dependence, teaching compromise, family collaboration, remote assistance, AI enhancement, and abandonment of backfeeding. Conclusions: This study is the first to systematically elucidate the core action patterns and dynamic evolution mechanisms of digital backfeeding from adolescents to elderly patients with chronic diseases for mHealth adoption. It constructs a backfeeding mechanism model based on the "conditions—action/interaction strategies—consequences" paradigm, extending the application boundaries of digital backfeeding theory to the health care domain. The findings provide an evidence-based foundation for the age-friendly transformation of mHealth and the development of intergenerational support policies in China.

  • Background: Tracheostomy is a frequently performed procedure in critical care settings, but procedures are often inconsistently coded in electronic health records (EHRs), with explicit designation as elective or emergency frequently absent. This coding ambiguity limits the ability to identify planned tracheostomy cohorts for observational research on outcomes and time toxicity. Common data models such as the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) enable large-scale federated research, but require validated computable phenotypes to ensure reliable cohort identification across heterogeneous data sources. Objective: To develop and validate a computable phenotype that identifies elective tracheostomy procedures from EHR data standardized to the OMOP CDM, enabling scalable and reproducible analysis of tracheostomy-related time toxicity in critically ill patients. Methods: We conducted a retrospective observational study using EHR data from the Johns Hopkins Health System from 2017 to 2024, comprising approximately 2.1 million patients with data mapped to the OMOP CDM. A series of cohort definitions were developed using standardized clinical code sets (International Classification of Diseases, 10th Revision [ICD-10] and Current Procedural Terminology [CPT]) from the Observational Health Data Sciences and Informatics (OHDSI) Standardized Vocabularies. To classify tracheostomy procedures lacking explicit urgency coding, we compared covariate prevalence and temporal relationships (e.g., intubation timing relative to tracheostomy) between explicitly coded elective and emergency cohorts. Six candidate computable phenotypes with stepwise inclusion and exclusion criteria were evaluated using PheValuator, a validated probabilistic phenotype evaluation tool. 
Results: Among 3552 patients with a tracheostomy procedure identified between 2017 and 2024, 2484 (69.9%) were explicitly coded as elective and 107 (3.0%) as emergency; the remaining 961 (27.1%) lacked explicit urgency classification. Covariate analysis revealed significant differences in intubation timing, drug exposures, and procedure codes between the explicitly coded groups. The best-performing computable phenotype (Cohort #202), which used inpatient visit-based attribution of planned and emergency codes, achieved a sensitivity of 0.88 (95% CI 0.84-0.91) and a positive predictive value (PPV) of 0.81 (95% CI 0.77-0.84), with an F1 score of 0.84. Conclusions: The proposed computable phenotype effectively distinguishes elective from emergency tracheostomy in structured EHR data. This approach enables large-scale, reproducible studies of tracheostomy-related time toxicity across heterogeneous OMOP-mapped data sources and provides a generalizable framework for phenotyping intent-ambiguous procedures across federated research networks.

  • Background: Medication-related harm is a major cause of preventable morbidity and mortality in hospitalised patients, particularly among older individuals with polypharmacy. Healthcare-specific large language models (LLMs) trained on validated pharmacological sources may provide more reliable and clinically relevant drug–drug interaction (DDI) detection than general-purpose systems. Objective: To evaluate the accuracy and processing time of Katana AI, a novel healthcare-specific large language model, compared with pharmacist-led medication review and a general-purpose large language model for DDI detection in surgical inpatients. Methods: Medication charts from surgical inpatients were prospectively reviewed between September 2025 and February 2026. DDIs identified by Katana AI, pharmacist-led review, and ChatGPT were compared against the British National Formulary (BNF) reference standard. Interactions were classified by severity and level of supporting evidence. Detection accuracy and processing time were recorded and compared. Results: Thirty-nine surgical inpatients were included, comprising 293 prescribed medications. The median age was 70 years (IQR 60–75), with 69% aged over 65 years. Katana AI identified 125 DDIs, of which 117 were clinically accurate (93.6%), compared with 85.9% for pharmacist review and substantially lower accuracy for ChatGPT. Katana AI demonstrated significantly higher accuracy than ChatGPT (p<0.001) and a modest but statistically significant improvement over pharmacist review (p=0.041). Mean processing time was 32.1 seconds for Katana AI, comparable to ChatGPT (30.7 seconds) and significantly faster than pharmacist review (227 seconds; p<0.001). Conclusions: Katana AI demonstrated high accuracy and rapid detection of clinically relevant DDIs, outperforming a general-purpose large language model and showing a modest improvement over pharmacist review. 
These findings support the potential role of healthcare-specific large language models as clinical decision-support tools to enhance medication safety and prescribing efficiency. Further multi-centre validation is warranted. Clinical Trial: N/A

  • Background: Large language models (LLMs) are increasingly evaluated for clinical decision support, but their practical value depends on post-training adaptation rather than raw benchmark performance. Fine-tuning, retrieval-augmented generation (RAG), and hybrid approaches represent the principal strategies for improving clinical reliability and relevance, yet evidence remains fragmented across specialties, model architectures, and evaluation designs. Objective: To synthesize evidence on fine-tuning, RAG, and hybrid post-training strategies for improving LLM clinical performance across healthcare tasks. Methods: We searched PubMed/MEDLINE, EMBASE, and Scopus (January 2018 – January 2026). Eligible studies evaluated transformer-based LLMs with post-training adaptation or retrieval augmentation applied to clinical datasets and reported quantitative performance outcomes. Risk of bias was assessed using PROBAST+AI. Results were synthesized descriptively by enhancement strategy. This review was prospectively registered (PROSPERO: CRD420261308522). Results: Of 1,890 records identified, 35 studies (published 2024–2026) met inclusion criteria across 12 countries and multiple clinical domains, including oncology, radiology, neurology, and emergency medicine. Three enhancement strategies were identified: SFT/PEFT (n=7, 20%), RAG (n=17, 49%), and hybrid pipelines combining fine-tuning, retrieval, and structured prompting (n=11, 31%). Fine-tuning was most effective for narrow, labeled classification tasks. External AUROC reached 0.912 for cancer detection and 0.938 for hepatocellular carcinoma from cell-free DNA signatures; macro-sensitivity was 0.918 for acute infarct detection from radiology reports; and AUC was 0.892 for major depressive disorder on UK Biobank data. RAG produced the largest gains when corpora were authoritative and aligned with the clinical task. 
Incorporating the ESC acute coronary syndrome guideline raised accuracy from 71.1% to 92.1% (GPT-4o) and from 78.9% to 94.7% (DeepSeek R1). A trauma-radiology chatbot improved injury grading accuracy from 48% to 87%, and a guideline-grounded urology pipeline reached 95.5% concordance versus 62.3% among junior clinicians. RAG reduced performance in two studies with noisy or poorly structured corpora, and reasoning-class models showed limited incremental benefit from retrieval. Hybrid systems achieved the strongest results for complex tasks. A stroke pipeline fine-tuned using LoRA reached 99.0% internal and 95.5%/79.1% external accuracy; a federated multimodal dermatology system achieved 90.2% diagnostic accuracy across 11 lesion types; and a multimodal osteonecrosis pipeline reached 96.0% expert-rated accuracy. Structured prompting (persona, chain-of-thought, task decomposition) shifted accuracy by 5–15 percentage points across studies. Only 10 studies (29%) reported external validation, and 24 (69%) lacked formal safety moderation. Conclusions: Post-training adaptation and retrieval augmentation improved clinical LLM performance across diverse tasks. Strategy-task alignment, corpus quality, and prompt design were primary determinants of benefit. Evidence remains predominantly retrospective with limited external validation. Future studies should prioritize prospective, clinically embedded evaluations incorporating safety and fairness reporting.

  • Effects of virtual reality on pain, anxiety and fear during thyroid fine-needle aspiration biopsy: a randomized controlled trial

    Date Submitted: Jun 6, 2026
    Open Peer Review Period: Jun 8, 2026 - Aug 3, 2026

    Background: Thyroid fine-needle aspiration biopsy (FNAB) is a commonly used diagnostic procedure in patients with suspected thyroid cancer; however, it may induce pain, anxiety, and fear during the procedure. Objective: This randomized controlled study aimed to evaluate the effect of virtual reality (VR) on pain, anxiety, and fear of pain in patients undergoing diagnostic thyroid procedures. Methods: The study was conducted between October 15, 2024, and April 30, 2025, at Gaziantep City Hospital, Türkiye. A total of 100 patients with suspected thyroid nodules were randomly assigned to either a VR intervention group (n = 50) or a control group (n = 50). Data were collected using a Patient Information Form, the Beck Anxiety Inventory (BAI), the Fear of Pain Questionnaire III (FPQ-III), and the Visual Analog Scale (VAS). Between-group comparisons were performed using ANCOVA adjusting for relevant baseline covariates, and effect sizes were calculated using Cohen’s d with 95% confidence intervals. Results: After adjustment for baseline values and relevant covariates, no statistically significant differences were found between the VR and control groups in post-intervention VAS (P =.152), BAI (P =.501), or FPQ-III scores (P =.20). Effect size analyses indicated small between-group effects across all outcomes (Cohen’s d = −0.05 to −0.29), with 95% confidence intervals including zero. Within-group analyses indicated reductions in VAS, BAI, and FPQ-III scores over time in both groups; however, these changes were not supported by statistically significant between-group differences. Conclusions: Virtual reality was not associated with statistically significant improvements in pain, anxiety, or fear of pain when compared with standard care after adjustment for baseline differences. Although small within-group improvements were observed, these findings do not support a strong independent effect of VR on procedural discomfort in this sample. 
Further well-powered randomized trials are warranted. Clinical Trial: This study was registered at ClinicalTrials.gov (NCT06792929), https://register.clinicaltrials.gov/prs/beta/studies/S000F9GN00000034/protocol/protocolSummary?fragmentId=status

  • Background: Mobile health (mHealth) and telemedicine can improve access to healthcare services in rural Ethiopia through education, disease monitoring, and remote consultations. Adoption depends on factors such as digital literacy, trust, training, and infrastructure, while challenges include limited smartphone access and connectivity. Objective: The aim of this review was to synthesize the evidence on willingness to use mHealth and telemedicine interventions among health professionals and patients. Methods: This systematic review was conducted following the PRISMA guidelines to examine the willingness to use mHealth and telemedicine interventions and associated factors among healthcare professionals and patients in Ethiopia. A comprehensive search was carried out in databases such as MEDLINE, PubMed Central, CINAHL, and Africa-Wide Information using predefined search strategies. Only full-text, peer-reviewed studies published in English were included. Two reviewers independently screened and selected studies, extracted data using a standardized form, and resolved disagreements through consensus. Study quality was assessed using Joanna Briggs Institute checklists. Results: This review included 13 studies and indicates that patients and healthcare workers are strongly willing to use mHealth and telemedicine. The highest willingness was observed among patients with chronic conditions (59.1%–96%), who preferred simple technologies (voice calls/SMS). Healthcare professionals also indicated varying but substantial willingness (46.5%–83%). Conclusions: Both patients and healthcare providers have a high degree of willingness. Younger age, higher education, urban residence, smartphone ownership, digital literacy, and perceived utility and simplicity of use were significant factors determining willingness. Additional behavioral, clinical, and environmental factors played a supporting role. 
Clinical Trial: This systematic review was registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42024629424).

  • Online Health Information Seeking, Consideration of Future Consequences, and Self-Control Among Adults With Chronic Diseases: Cross-Sectional Survey Study

    Date Submitted: Jun 6, 2026
    Open Peer Review Period: Jun 8, 2026 - Aug 3, 2026

    Background: Effective chronic disease management requires individuals to prioritize long-term health goals over immediate temptations. As chronic patients increasingly engage with online health information, it is important to understand how such engagement may relate to future-oriented cognition and self-regulatory capacity. Objective: This study examined the association between online health information seeking behavior (HISB) and self-control among adults with chronic diseases and investigated whether consideration of future consequences (CFC) was associated with this relationship. Methods: Cross-sectional survey data were collected from 11,031 adults with chronic diseases in China. Mediation analyses were conducted using SPSS macro PROCESS with 5,000 bootstrap samples while controlling for demographic, health-related, and psychological covariates. Results: HISB was positively associated with CFC (B=.07, SE=.005, p<.001). CFC was positively associated with self-control (B=.56, SE=.008, p<.001). After CFC was entered into the model, the direct association between HISB and self-control was no longer statistically significant (B=.005, SE=.004, p=.22). Bootstrap analyses indicated a significant indirect effect of HISB on self-control through CFC (B=.041, BootSE=.003, 95% CI .0348-.0479). Conclusions: The findings suggest that consideration of future consequences may help explain the association between online health information seeking and self-control among adults with chronic diseases. More broadly, digital health environments may increase the salience of future health consequences by repeatedly rendering long-term outcomes cognitively accessible in everyday life. Longitudinal and experimental research is needed to clarify causal mechanisms underlying these associations.
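
    The indirect effect in a simple mediation model is commonly estimated as the product of the two path coefficients. The sketch below applies that product-of-coefficients rule to the rounded coefficients quoted in the abstract; the function names are hypothetical, and this is not the PROCESS macro or the authors' analysis code.

```python
def indirect_effect(a_path: float, b_path: float) -> float:
    """Product-of-coefficients estimate of an indirect (mediated) effect."""
    return a_path * b_path


def within_interval(value: float, lower: float, upper: float) -> bool:
    """Return True if value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


# Rounded coefficients as quoted in the abstract.
a_path = 0.07  # HISB -> CFC
b_path = 0.56  # CFC -> self-control

estimate = indirect_effect(a_path, b_path)
print(f"indirect effect from rounded paths = {estimate:.4f}")
print("inside reported 95% CI:", within_interval(estimate, 0.0348, 0.0479))
```

    The product of the rounded paths (about .039) sits slightly below the reported point estimate of .041 but inside the reported bootstrap interval; the difference is presumably due to rounding of the published coefficients.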

  • Digital Maturity in Integrated Care Systems: Development strategies – A Scoping Review

    Date Submitted: Jun 6, 2026
    Open Peer Review Period: Jun 8, 2026 - Aug 3, 2026

    Background: Digital maturity is a priority for creating efficient, patient-centered health systems, yet Integrated Care Systems (ICS) often face challenges such as a lack of interoperability and weak data governance. A systematic mapping of strategies is essential to help these organizations identify areas for improvement and define sustainable actions so that technology adds value for all stakeholders. Objective: To map the development strategies and interventions implemented in ICS to promote digital maturity, while identifying the associated facilitators, barriers, and recommendations described in the literature. Methods: A search was conducted on PubMed, Scopus, and Web of Science on October 17, 2025, for English-language articles published since January 1, 2015. Following Joanna Briggs Institute methodology, two independent reviewers performed study selection, data extraction, and quality appraisal (the latter using the Mixed Methods Appraisal Tool), with a third reviewer resolving any conflicts. The results were synthesized through descriptive analysis and thematic grouping. Results: Eighteen articles were included, featuring mixed-methods and case study designs, predominantly set in the United States, as well as several multi-country studies set in Europe. Most strategies were technological (telehealth, electronic health records, and care coordination tools) or structural (governance frameworks). Key facilitators included strong organizational leadership, pre-existing digital infrastructure, and stakeholder engagement, while significant barriers included a lack of interoperability and inadequate funding. Regulation was found to be an obstacle to the development and implementation of digital tools, as privacy legislation often prevents full interoperability, making it essential to use frameworks like “Privacy by Design” to address privacy concerns during the development phase of digital solutions. 
Several frameworks surfaced, with the Chronic Care Model and the eHealth Enhanced Chronic Care Model being the most prevalent. Stakeholder engagement emerged as a pivotal enabler, yet significant resistance persists due to low digital literacy, misconceptions, and an aging workforce, making it critical not only to develop formal and continuous training but also to actively involve stakeholders in problem-solving through a co-creation process. Conclusions: Developing digital maturity in ICS requires a multidimensional approach that extends beyond technological adoption to include multidisciplinary governance, national eHealth policies, and value-based funding models. Addressing low digital literacy through formal training for staff and patients is critical for the sustainability of the health care system. The review provides a foundational framework for healthcare managers and for future research and development of digital maturity guidelines in ICS.

  • Background: Media consumption is a pathway through which the public encounters health information, misinformation, and politicized interpretations of evidence, yet its relationship with knowledge across multiple health domains remains incompletely understood. Objective: We conducted a cross-sectional survey of U.S. adults recruited through CloudResearch Connect to examine associations among media source use, institutional trust, demographic characteristics, and knowledge accuracy regarding climate change, type 2 diabetes, and infectious diseases. Methods: After excluding invalid responses and a failed attention check, 509 participants were included. Knowledge was assessed with domain-specific true/false items scored as correct, incorrect, or “I don’t know,” producing climate change, chronic disease, infectious disease, and total knowledge scores. Results: Rural residence, lower income, lack of health insurance, and absence of a primary care provider were associated with lower knowledge across several domains, suggesting structural barriers to reliable health information. Trust in the CDC, physicians, and pharmacists showed the strongest and most consistent positive associations with knowledge. Political affiliation and consumption of ideologically distinct news sources were most strongly associated with climate change and infectious disease knowledge, but less so with diabetes knowledge. Conclusions: These findings suggest that public health literacy interventions should address both polarized media environments and inequitable access to trusted clinical and institutional information.

  • Musculoskeletal health literacy requires patients to understand complex treatment options, postoperative precautions, and recovery timelines, which together help set realistic expectations for recovery. However, existing patient education materials are often text-heavy, exceed recommended reading levels, and fail to depict how functional recovery progresses over time, which may be especially limiting in safety-net settings serving populations with variable health literacy. In this paper, we describe methods for designing and developing a secure, institution-restricted gait-recovery video library for orthopedic patient education. Our library was built to close that gap with short, patient-perspective recovery videos centered on one of the most meaningful outcomes of lower-extremity surgery: functional mobility. Each video was built around a standardized Timed Up and Go (TUG) assessment recorded from frontal and sagittal views, paired with relevant radiographs and visually adapted patient-reported outcome measures (PROM), to create a multimodal, visually guided recovery pathway. This publication aims to detail the process of selecting a secure hosting platform; choosing the filming setup, recovery milestones, and key visual features to capture; maintaining patient privacy and data security; executing clinic-based filming and video editing; and building a personalized interface that allows videos to be filtered by procedure type, recovery stage, and patient characteristics.

  • Generative AI Chatbot Responses to Suicide and Self-Harm: A Systematic Review

    Date Submitted: Jun 5, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: A growing number of US adults and youth confide in generative artificial intelligence (AI) chatbots for mental health support, including disclosure of suicide and self-harm risk. While the quality, safety, and effectiveness of chatbot responses to risk disclosure have the potential to impact population-level rates of suicide and self-harm, there have been no systematic reviews of this burgeoning literature. Objective: We conducted a systematic review of studies evaluating generative AI chatbot responses to disclosure of suicide and self-harm risk. Methods: We searched six databases from January 2020 to December 2025 and identified empirical studies involving interactions with generative AI chatbots that included discussion of suicide or self-harm. Following deduplication, studies (k = 1,042) were imported into Covidence and titles and abstracts were independently screened by two reviewers, with discrepancies resolved by a third reviewer. The same methods were used to evaluate 126 full texts. Data extraction was led by one reviewer and verified by a second. Results: We identified 29 papers (14 published; 15 preprints). Most (k = 20) were solely audit studies evaluating AI chatbot responses to suicide risk disclosure. Two developed chatbots or AI evaluation frameworks, and one was a jailbreaking study (adversarially testing AI systems or attempting to circumvent chatbot safety guardrails). The remaining studies combined approaches. Across studies, proprietary, frontier model chatbots (eg, ChatGPT, Claude) provided higher quality responses to suicide and self-harm risk than open-source chatbots (eg, Llama, DeepSeek) and many AI companions (eg, Replika, Character.AI). All chatbots, not just proprietary models, generally performed well on empathy, validation, and support. However, chatbot responses were often generic and lacked context. 
Chatbots did not proactively assess risk and performed most poorly when risk disclosure was ambiguous or moderate, frequently failing to recognize implicit risk or escalate to human-delivered services. Furthermore, responses were inconsistent between chatbots and often required multiple conversational turns before providing referrals to crisis resources and human-delivered professional support. While there were few examples of overtly harmful responses under standard conditions, jailbreaking attempts easily led to problematic responses. Finally, no chatbot proactively recommended limiting access to lethal means such as firearms, medications, or sharps. Conclusions: Chatbots provide validation and support in response to suicide and self-harm disclosure. Overall, however, their poor risk assessment, delays in referrals to crisis resources and human-delivered support, difficulty detecting jailbreaking attempts, and general lack of adherence to clinical guidelines present safety risks. While findings are limited by the rapid versioning of AI models over time, research is needed to evaluate stakeholder perspectives on AI chatbot responses to suicide and self-harm risk disclosure. Research should also examine the short- and long-term impact of these responses on clinical outcomes, utilizing follow-up assessments in real-world or clinical settings. Clinical Trial: OSF Registries osf.io/9uva3

  • Validation of a Patient-Facing AI System for Symptom Guidance: A Simulation Study With Physician Review

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    Background: Rapid advances in large language models (LLMs) have expanded interest in healthcare applications that require complex information processing and decision support. Digital health assistants and symptom checkers, which have historically been rule-based, are increasingly incorporating AI capabilities to support initial symptom assessment and patient triage. An accurate and reliable AI-enabled triage tool could improve patient navigation, reduce unnecessary health care utilization, and support earlier recognition of clinically serious conditions. Objective: The objective of this study was to analytically validate the performance of the Personal Health Assistant (PHA), a Large Language Model (LLM)-based patient support tool in producing appropriate recommendations using simulated patient encounters and expert physician review. Methods: We conducted a prospective analytical validation study of the PHA using synthetic patient data. The evaluation set was constructed from 772 synthetic cases generated from published triage protocols and persona-based LLM-assisted generation. Cases included patient vignette summaries with medical histories and simulated patient conversations. PHA provided guidance and recommendations based on these inputs, which were compared to an independent ground-truth derived from expert physician review. Co-primary endpoints of urgent undertriage, nonurgent undertriage, and overtriage were each evaluated against prespecified clinical performance thresholds. Results: The final analysis dataset contained 772 synthetic cases. Urgent undertriage was 28/406 (6.9%, 95% CI 4.6%-9.8%), nonurgent undertriage was 56/305 (18.4%, 95% CI 14.2%-23.2%), and overtriage was 40/364 (11.0%, 95% CI 8.0%-14.7%). Overtriage met the prespecified performance threshold (<30%), whereas urgent (<5%) and nonurgent (<15%) undertriage thresholds were not met. Conclusions: PHA maintained acceptable overtriage but did not meet prespecified undertriage targets. 
These findings support the value of structured predeployment analytical validation as an early evidence step for patient-facing AI systems and highlight its utility for iterative refinement, while also underscoring the need for prospective clinical validation in light of the inherent limitations of simulation-based studies.
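The endpoint rates above can be recomputed from the reported counts. The sketch below is illustrative only: the abstract does not state which confidence interval method was used, so the Wilson score interval here is an assumption and its bounds may differ slightly from the reported ones. Comparing the point estimate (rather than a confidence bound) with each threshold is also an assumption.

```python
from math import sqrt


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts and prespecified thresholds as reported in the abstract:
# (events, denominator, threshold the rate must stay below).
ENDPOINTS = {
    "urgent undertriage": (28, 406, 0.05),
    "nonurgent undertriage": (56, 305, 0.15),
    "overtriage": (40, 364, 0.30),
}


def summarize(endpoints):
    """Return rate, interval, and threshold status for each endpoint."""
    rows = {}
    for name, (events, n, threshold) in endpoints.items():
        rate = events / n
        rows[name] = {
            "rate": rate,
            "ci": wilson_interval(events, n),
            # Assumption: the point estimate is compared with the threshold.
            "below_threshold": rate < threshold,
        }
    return rows


summary = summarize(ENDPOINTS)
for name, row in summary.items():
    low, high = row["ci"]
    status = "met" if row["below_threshold"] else "not met"
    print(f"{name}: {row['rate']:.1%} ({low:.1%} to {high:.1%}), threshold {status}")
```

Under these assumptions only overtriage falls below its threshold, which matches the pattern described in the Results.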

  • VAGPT and Free-Response Open Coding: Using AI for Qualitative Tasks within the Department of Veterans Affairs

    Date Submitted: Jun 4, 2026
    Open Peer Review Period: Jun 6, 2026 - Aug 1, 2026

    This research assesses the ability of VAGPT, a large language model authorized for use in the Department of Veterans Affairs, to identify qualitative codes and its coding reliability when applied to free-response data from an anonymous survey of servicemembers' access to posttraumatic stress disorder treatments.

  • Background: Digital transformation has increasingly influenced healthcare systems globally, with Electronic Medical Records (EMRs) becoming central to improving healthcare documentation, communication and decision-making. Despite growing recognition of EMRs as tools for strengthening health data quality, healthcare institutions in many low- and middle-income countries such as Nigeria continue to experience setbacks related to digital inclusion, infrastructural limitations and workforce readiness. In Nigeria, public tertiary hospitals still experience inconsistent EMR implementation and persistent concerns regarding the quality of patients’ health data. Objective: This study explored healthcare providers’ perspectives on EMR adoption and health data quality in selected public tertiary hospitals in North-Central Nigeria within the broader context of digital health inclusion in the Global South. Methods: The study adopted an explanatory sequential mixed-methods design comprising a quantitative phase, identification of key quantitative results, a qualitative phase, and integration and interpretation of findings. The quantitative data were collected using a structured clinical chart review checklist developed from internationally recognized health data quality dimensions and existing literature on EMR systems and health information management. The qualitative data were collected through semi-structured key informant interviews among physicians, nurses and Health Information Management professionals purposively selected from three public tertiary hospitals with varying levels of EMR implementation. Interviews were audio-recorded, transcribed verbatim and analyzed using thematic analysis. Results: The study revealed an overall moderate level of health data quality, with a Health Data Quality Index (HDQI) of 73%.
Healthcare providers acknowledged the potential benefits of EMRs in improving accessibility, timeliness, comprehensiveness, relevancy and consistency of health data. Participants identified ease of information retrieval, reduction in missing records and improved continuity of care as major strengths of EMR systems. However, several barriers to meaningful digital inclusion emerged, including unstable electricity supply, poor internet connectivity, inadequate training, workload pressure, dual documentation practices and limited institutional support. Providers further reported that system reliability, ease of use and user satisfaction strongly influenced their willingness to utilize EMRs consistently. Positive attitudes toward digital systems were associated with improved documentation practices and enhanced health data quality. Conclusions: Electronic medical record adoption in Nigerian tertiary hospitals remains shaped by complex technological, organizational and behavioural factors. Strengthening digital inclusion through reliable infrastructure, workforce capacity building and supportive institutional policies is essential for sustaining EMR utilization and improving health data quality in resource-constrained healthcare settings.

  • The Use of Natural Language Processing to Investigate Social Isolation and Loneliness: A Scoping Review

    Date Submitted: Jun 3, 2026
    Open Peer Review Period: Jun 5, 2026 - Jul 31, 2026

    Background: Social isolation and loneliness (SIL) are associated with critical health consequences but are difficult to measure in healthcare settings because they typically appear in unstructured text. Natural language processing (NLP) offers a promising approach to identify these constructs at scale, but its current applications in this domain have not been systematically characterized. Objective: To investigate how NLP is being used to study SIL, identify gaps, and outline priorities for advancing rigorous NLP-based measurement of these constructs in health research. Methods: A scoping review was conducted following the PRISMA-ScR guidelines. Six bibliographical databases (Ovid MEDLINE, Embase, Scopus, Web of Science, APA PsycINFO, ProQuest Dissertations & Theses Global) and two preprint servers (bioRxiv, medRxiv) were searched from inception to June 18, 2025. Reviewers independently screened abstracts and full texts, data were double-charted using a standardized form, and final results were synthesized using structured template analysis. Results: A total of 63 studies published between 2019 and 2025 met the inclusion criteria. Most were conducted in the US (27/63, 42.9%) and used cross-sectional designs (37/63, 58.7%). Studies mostly targeted older adults (31/63, 49.2%), used survey data (26/63, 41.3%), and focused on loneliness (32/63, 50.8%). Most studies (44/63, 69.8%) did not use a validated loneliness scale; among those that did, the UCLA Loneliness Scale was most common (16/63, 25.4%). Classification (24/63, 38.1%) was the most frequent NLP application. Rule-based (32/63, 50.8%) and traditional machine learning (18/63, 28.6%) approaches predominated, but large language models (16/63, 25.4%) and transformer-based models (14/63, 22.2%) increased over time. External validation was rare (2/63, 3.2%), code was shared in only 19% of studies (12/63), and 25.4% (16/63) addressed bias in their data or analysis. 
Conclusions: NLP applications for SIL are expanding rapidly but rest on narrow methodologies with limited validation measures and demographic groups. Advancing the field requires tying model development to validated measurement of the target constructs and adopting established reporting frameworks such as TRIPOD+AI and MI-CLAIM.

  • Background: Attention-deficit/hyperactivity disorder (ADHD) is a common neurodevelopmental disorder in children, characterised by core symptoms of hyperactivity, impulsivity, and inattention, in addition to cognitive impairments that compromise physical and psychological development. Digital technology-based interventions have emerged as a promising approach for ameliorating both core symptoms and cognitive impairments. However, a comprehensive evidence base supporting their efficacy is lacking. Objective: This study aimed to systematically evaluate the effects of digital interventions on core symptoms and cognitive impairments in children with ADHD. Methods: Web of Science, PubMed, EBSCOhost, and ProQuest were systematically searched using predefined inclusion and exclusion criteria. The risk of bias of the included studies was assessed using the revised Cochrane risk-of-bias tool for randomised trials. Effect sizes were pooled under a random-effects model, and heterogeneity across studies was evaluated using the I² statistic. Publication bias was assessed using Egger’s regression and Begg’s rank correlation tests. Sensitivity analyses were performed by switching from a random-effects model to a fixed-effects model to confirm the robustness of the results. Results: Thirty-seven randomised controlled trials were included, encompassing four types of digital interventions: computer-based interventions, serious video games, exergames, and virtual reality. Digital interventions significantly alleviated core symptoms and cognitive impairments, with improvements in the former primarily attributable to serious video games (P = .02) and those in the latter mainly attributable to computer-based interventions (P = .04) and serious video games (P = .02). Conclusions: Digital interventions can significantly alleviate core symptoms and cognitive impairments in children with ADHD.
Future research should consider further optimising trial designs and conducting targeted analyses, such as subgroup analyses by symptom subtype and age stratification, to enhance intervention efficacy. Keywords: ADHD; children; digital technology-based interventions; core symptoms; cognitive function; meta-analysis. Clinical Trial: PROSPERO CRD420261399517; https://www.crd.york.ac.uk/PROSPERO/view/CRD420261399517
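For readers less familiar with the pooling step described in the Methods, the following is a minimal sketch of random-effects pooling with the I² statistic. The abstract does not name the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption, and the effect sizes and variances in the usage lines are hypothetical values, not data from the review.

```python
from math import sqrt


def random_effects_pool(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns the pooled effect, its standard error, tau^2, and I^2 (in %).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (truncated at zero) and I^2.
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2}


# Hypothetical standardized mean differences and their variances.
result = random_effects_pool([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(f"pooled={result['pooled']:.2f}, I2={result['i2']:.0f}%, tau2={result['tau2']:.2f}")
```

Setting tau² to zero in the same function reproduces the fixed-effects estimate, which is the kind of model switch the sensitivity analysis describes.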

  • Background: Generating Findable, Accessible, Interoperable, and Reusable (FAIR) biomedical samples, data, and tools is costly and time-consuming. Thus, transparency about their processing or evolution and reuse, particularly of health data, is highly desirable. Therefore, an appropriate fact-based decision framework to evaluate data (re)usability is required. Provenance information documents the processing or evolution of a data object, thereby providing an essential formal basis for such a (re)usability evaluation. When standardised, this provenance information facilitates better FAIR biomedical data. Objective: The MInimal Requirements for Automated Provenance Information Enrichment (MIRAPIE) project aims to define the minimal provenance information required for harmonised documentation of a data object's processing history and to establish the MIRAPIE approach as a community standard to assure interoperability of the collected provenance information. Methods: A hybrid consensus-finding method, adapted from the Nominal Group Technique (NGT) and Delphi, was applied within an international community setting to iteratively implement a minimal data model, an ontology, and an application guideline. The data model is based on the PROV Data Model (PROV-DM), and the ontology extends the PROV Ontology (PROV-O). Results: With the MIRAPIE question, we defined a harmonising framework for provenance information in biomedicine and presumably beyond. The minimal data model, a respective ontology, and an accompanying guideline provide means for standardised and possibly automated provenance documentation. Their general applicability to data, workflows, models, and even samples is shown in diverse biomedical usage scenarios. Setting up provenance documentation from scratch is supported, as are linking alternative data schemata and mapping existing provenance documentation.
Conclusions: The MIRAPIE question, minimal data model, ontology, and guideline together contribute significantly to the advancement of biomedical and especially health research, setting up a basis for a contextual (re)usability evaluation. This fosters traceability of changes applied to data, workflows, tools, and samples and, in consequence, sustainable data usage and reproducibility of scientific results. The generalisation makes it possible to overcome domain-specific differences and local, national, and international boundaries. We invite the biomedical research community and health data gathering institutions to create lasting change by establishing MIRAPIE-compliant provenance information for transparent data processing and (re)usability assessment.

  • Background: Gestational diabetes mellitus (GDM) has a growing global prevalence and poses multiple short- and long-term risks to mothers and fetuses. Conventional offline management is restricted by time and space limitations, accompanied by poor patient compliance and delayed individualized intervention. Telemedicine has gradually been applied to GDM management, yet conclusions from existing randomized controlled trials (RCTs) remain inconsistent, and unified quantitative evidence is lacking. Objective: This meta-analysis systematically synthesizes available RCT evidence to quantitatively evaluate the efficacy of telemedicine interventions on maternal glycemic metabolism, delivery modes and multiple neonatal adverse outcomes in women with gestational diabetes mellitus, so as to provide evidence-based references for optimizing clinical GDM management schemes. Methods: Relevant RCTs were retrieved from PubMed, EMBASE, Cochrane Library, Web of Science and Scopus up to February 9, 2026. Strict inclusion and exclusion criteria based on the PICOS framework were formulated. Two independent researchers completed literature screening, data extraction and risk-of-bias assessment with the Cochrane RoB 2 tool, and the meta-analysis was performed using STATA 18.0. Continuous outcomes were expressed as standardized mean difference (SMD), and dichotomous outcomes were summarized as odds ratio (OR) with 95% CI; fixed- or random-effects models were selected according to heterogeneity (I²). Results: A total of 17 eligible RCTs involving 2391 pregnant women with GDM were included.
Meta-analysis indicated that telemedicine significantly reduced fasting blood glucose (SMD −0.55, 95% CI −0.94 to −0.16) and 2-hour postprandial blood glucose (SMD −0.62, 95% CI −1.20 to −0.04), and lowered the risks of emergency cesarean section (OR 0.65, 95% CI 0.45 to 0.93), macrosomia (OR 0.49, 95% CI 0.35 to 0.69), neonatal hypoglycemia (OR 0.60, 95% CI 0.42 to 0.86) and neonatal respiratory distress (OR 0.61, 95% CI 0.41 to 0.92). No statistically significant improvements were observed in overall cesarean delivery rate, gestational weight gain, preterm birth incidence or NICU admission rate. Conclusions: Telemedicine interventions effectively optimize glycemic control and decrease multiple adverse perinatal complications among GDM patients, serving as a valuable supplementary mode for routine prenatal care. Further large-sample RCTs with long-term follow-up are still required to verify its long-term maternal and infant clinical benefits. Clinical Trial: PROSPERO CRD420261403669; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251155941

  • Background: Digital health programs in sub-Saharan Africa often assume broad mobile reach, yet population-level evidence on who can use specific technologies, and who is excluded, remains limited. Without accurate denominators, digital interventions may reinforce inequities by missing people least engaged with conventional healthcare. Objective: We assessed technology adoption, disparities, and trajectories in a high-HIV-burden rural South African population to inform equitable digital health implementation. Methods: We analyzed 309,151 person-years from the Africa Health Research Institute demographic surveillance platform in rural KwaZulu-Natal, South Africa (2017 to 2023). We measured adoption of seven technologies (calls and SMS, internet, WhatsApp, email, mobile banking, entertainment, and health tracking) and constructed a five-tier Digital Adoption Ladder from offline (T0) to digital-health ready (T4). We quantified disparities by HIV status, gender, and their intersection using logistic regression, and tracked temporal trajectories including the COVID-19 period. Results: In 2023, 61.3% of records were classified as offline (T0) under the harmonized coding rules, and only 2.9% reached digital-health readiness (T4). Among tested individuals, people living with HIV showed higher adoption across all technologies (odds ratios 1.13 to 1.57) than HIV-negative individuals, with 56.0% connected versus 44.6%. Females also showed higher adoption than males (odds ratios 1.24 to 1.80). Intersectional analysis identified HIV-positive females as the most connected group (58.1%) and HIV-negative males as the least connected (38.4%), a 20-percentage-point gap. This pattern emerged after 2019 and defines a prevention paradox: a group important for HIV testing, PrEP, and prevention outreach is also the least reachable through digital channels. 
Conclusions: Digital health implementation should adopt a floor-up strategy: start with SMS (reaching approximately 39%), add WhatsApp where connectivity exists, and reserve apps for the small minority able to use them. HIV-negative males require targeted outreach through non-health channels to prevent digital exclusion from weakening HIV prevention.

  • Background: There are few theoretical frameworks in the literature for the strategic planning of health information systems. Demonstrating and analyzing their use in practice can lead to a broader application and evidence-based decision making. Objective: The study aimed to analyze and assess the information systems of a university hospital’s physical therapy section and a university department of physical therapy in order to plan their integration following the merger of the two facilities to form an institute for physical therapy at a German medical center. Building on this, a strategic plan for the institute’s information system is proposed. Methods: We used a methodological framework for the strategic planning of information systems in hospitals, extended it with lean management methods, and applied it at the organizational unit level. We described the static view of the organizational units’ information systems using the three-layer graph-based metamodel for health information systems (3LGM²) and the dynamic view using Business Process Model and Notation (BPMN). Information sources were interviews with personnel. Results: A strategic management plan for developing the institute’s information system has been proposed. A migration path has been established with 23 tactical projects over the next 3 years to attain the strategic management goals. Conclusions: The method for strategic planning of information systems could successfully be adapted to the organizational unit level and should therefore be applied to other departments in hospitals as well. It helps them identify weaknesses in information logistics through a systematic approach, enabling gradual improvement as part of a long-term plan.

  • A Functional Taxonomy for Quantifying the Diagnostic-to-Therapeutic AI Gap in Image-Guided Oncology: Analysis of 29,277 Publications Across Five Domains

    Date Submitted: Jun 2, 2026
    Open Peer Review Period: Jun 3, 2026 - Jul 29, 2026

    Background: Artificial intelligence research in image-guided oncology has grown exponentially, yet how far the field has progressed from diagnostic assistance toward direct therapeutic execution has never been quantified. Existing bibliometric surveys categorize studies by technical architecture or clinical domain, metrics that track publication volume but not proximity to procedural deployment. Objective: We developed a hierarchical functional classification framework to map the global landscape of therapeutic AI development across five major oncological indications. Our two specific objectives were: (1) to classify publications by clinical output function along the diagnostic-to-therapeutic continuum, and (2) to quantify the translation gap using three complementary metrics, triangulated against trial and device registries. Methods: We extracted 29,277 Web of Science publications spanning five image-guided oncologic specialties (thyroid, breast, lung, prostate, and liver) published between January 2010 and April 2026. AI-related records were classified by clinical function using a three-stage protocol: keyword categorization, contextual scoring, and rule-based filtering. Inter-rater reliability, validated on 518 independently coded publications, yielded Cohen's κ of 0.92. Our framework distinguished Diagnosis AI (disease identification) from therapeutic AI, then further stratified therapeutic AI into Bridge-support AI (treatment planning, prognosis, patient selection) and True Treatment AI. True Treatment AI was defined by concurrent satisfaction of two criteria: ≥Level 2 on the Yang Surgical Autonomy Scale and ≥Stage 1 on the IDEAL Framework. Results: Of 16,937 AI-related publications identified, 14,277 (84.3%) were categorized as Diagnosis AI and only 2,660 (15.7%) as therapeutic AI. All therapeutic publications fell exclusively within the Bridge-support tier. 
None satisfied the dual-framework criteria for True Treatment AI, yielding a uniform penetration rate of 0.00% across all five oncological domains. This complete execution vacuum persisted despite an 11-fold variation in inter-domain treatment-to-diagnosis ratios. The finding held under threshold relaxation, sensitivity analyses, and independent triangulation against 3,491 ClinicalTrials.gov records and 1,430 FDA device listings. Conclusions: Each specialty should periodically profile its diagnostic-to-therapeutic translational progress. The uniform absence of True Treatment AI across 15 years and five domains indicates that this gap is structural rather than cumulative, rooted in methodological inheritance from diagnostic paradigms and in regulatory category mismatches. Closing this gap requires coordinated framework development across regulatory, research, and clinical communities, rather than incremental algorithmic improvements.
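The inter-rater reliability figure in the Methods is a Cohen's κ. As a minimal sketch of how such a value is computed from two raters' labels, the function below implements the standard unweighted formula; the label lists in the usage lines are hypothetical toy data, not the 518 coded publications from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so perfect agreement is reported by convention in this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten publications (toy data for illustration).
rater_a = ["Diagnosis"] * 6 + ["Bridge-support"] * 4
rater_b = ["Diagnosis"] * 5 + ["Bridge-support"] * 5
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

In this toy example the raters agree on 9 of 10 items, chance agreement is 0.5, and κ works out to 0.8.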

  • Background: Large language models (LLMs) are increasingly used by patients seeking medication advice. Their quality for secondary stroke prevention counseling has not been well characterized. Objective: To compare five widely used search-enabled consumer LLM interfaces on patient-facing medication counseling for secondary stroke prevention across fourteen evaluation metrics covering safety, clinical accuracy, information quality, readability, empathy, actionability, and model test-retest stability, operationalized as lexical text stability. Methods: A 56-item English-language question bank was developed from current stroke prevention guidelines and submitted to five consumer LLM interfaces (ChatGPT, Claude, Gemini, DeepSeek, Doubao) via their official web interfaces on May 1, 2026, with repeat querying on May 8, 2026 to assess model test-retest stability. All systems were accessed using a logged-in account with web search enabled via a US-based connection. Responses were independently rated by two blinded raters. Non-parametric tests with Benjamini-Hochberg correction were applied. Results: Clinical accuracy was high and uniform across models (mean 4.44-4.52/5; Friedman p = 0.578). Gemini, DeepSeek, and Doubao scored significantly higher on EQIP (70.2-70.7 vs. 63.6-64.1; p < 0.001) and DISCERN (p < 0.001) than ChatGPT and Claude. All models substantially exceeded commonly used patient-education readability benchmarks (FKGL 11.1-14.4; benchmark <=6; FRES 33.4-46.8; benchmark >=60). ChatGPT had the highest unsafe response rate (14.3% vs. 7.1-10.7%). Conclusions: In this controlled evaluation of researcher-generated questions, the tested search-enabled LLM interfaces produced broadly accurate responses for secondary stroke prevention medication counseling, but weaknesses in readability, source transparency, and safety indicate that readability optimization, source-attribution prompting, and clinical review are needed before patient-facing use.

  • Background: Artificial intelligence (AI) is rapidly reshaping healthcare, offering tools to enhance diagnostic accuracy, streamline clinical workflows, and personalize care delivery. However, real-world AI implementation remains limited, hindered by organizational, technical, and sociocultural barriers that implementation science has only begun to address systematically. Objective: This scoping review maps the intersection of AI and implementation science in healthcare, examining the types of AI technologies deployed, their intended use, and the processes by which these tools are implemented into practice. Methods: Following PRISMA-ScR guidelines, we synthesized empirical evidence from 65 studies published between December 2011 and March 2025. Searches were performed across the databases CINAHL, PubMed, PsycINFO, Scopus, and Web of Science using the terms Artificial Intelligence, Healthcare, implementation, and empirical, combined with relevant synonyms. Results: AI implementation research has expanded rapidly, predominantly in high-income countries, raising important questions about global equity. The most common application areas were automation and optimization (40%), computer vision (34%), and human language technologies (20%), primarily targeting clinical care (68%) and health systems management (25%). Most systems were designed for low-action autonomy (62%), emphasizing human-in-the-loop decision-making. Intended users were physicians (43%), nurses (26%), and radiologists (25%), while patients appeared as intended users in only 11% of implementations. Across the 65 studies, 40 barriers and 55 facilitators were identified across five themes: the AI system itself, healthcare professionals, patients, organizational context, and the macro level. Organizational factors and multidisciplinary stakeholder engagement emerged as the most critical enablers of successful adoption. 
Key barriers included insufficient AI performance, lack of transparency and explainability, limited IT infrastructure, and inadequate workflow integration. Patient-level and governance-level barriers, including data privacy and regulatory uncertainty, remained underexplored. Only 20% of studies applied theoretical implementation frameworks, and most analyses were conducted retrospectively. Mapping via the AIGENT framework revealed a disproportionate focus on workflow alignment and outcome evaluation, with comparatively little attention to early-phase activities such as needs assessment, adaptation planning, and stakeholder approvals. Conclusions: The current literature predominantly focuses on implementation evaluation and workflow alignment, while patient perspectives, governance conditions, and early implementation activities are underexplored. The finding that only 20% applied theoretical implementation frameworks, mostly retrospectively, reflects a gap between theory and practice and points to a need to apply them prospectively across the full implementation process. From a practitioner perspective, AI implementation should be seen as a sociotechnical and governance process that requires technical, contextual, and system knowledge, rather than merely a technical deployment.

  • Comparing Digital Health Interventions for Risk Factor Control in Secondary Stroke and TIA Prevention: A Systematic Review and Network Meta-analysis

    Date Submitted: Jun 1, 2026
    Open Peer Review Period: Jun 2, 2026 - Jul 28, 2026

    Background: Patients who experience a stroke or transient ischemic attack (TIA) face a substantial risk of future events, making optimal management of risk factors essential for secondary prevention. Digital health interventions have demonstrated promise in enhancing the control of vascular risk factors among individuals with stroke or TIA; however, the relative efficacy of different intervention modalities in achieving risk factor control remains uncertain. Objective: This study systematically assessed and compared the impact of various digital health interventions on the control of risk factors for secondary prevention among patients with stroke or TIA, aiming to determine the most effective intervention approach. Methods: A comprehensive and systematic literature search was performed across PubMed, Cochrane Library, Embase, and Web of Science databases from January 2010 to January 2026. This review included randomized controlled trials (RCTs) evaluating distinct digital health modalities among patients who experienced a stroke or TIA. Systolic blood pressure (SBP) changes served as the primary outcome, whereas alterations in diastolic blood pressure (DBP), patient medication adherence, total cholesterol (TC), and low density lipoprotein cholesterol (LDL-C) constituted the secondary outcomes. Utilizing the RoB 2 tool, two independent reviewers evaluated the risk of bias, followed by a Bayesian random-effects network meta-analysis to synthesize both direct and indirect evidence. We ranked the interventions based on their surface under the cumulative ranking curve (SUCRA) values and appraised the certainty of evidence through the GRADE approach. The study protocol was registered prospectively in the PROSPERO database (CRD420261367782). Results: A total of 25 RCTs involving 10,752 patients and six types of electronic health technologies were included.
The results showed that, compared with usual care, combined digital technologies had a more pronounced benefit in reducing SBP (MD: −3.7, 95% CrI: −4.8 to −2.7; SUCRA: 71.95%); telephone follow-up demonstrated better effects on lowering DBP (MD: −2.4, 95% CrI: −3.7 to −1.2; SUCRA: 97.04%) and LDL-C (MD: −0.21, 95% CrI: −0.28 to −0.14; SUCRA: 55.95%). In addition, smartphone applications showed certain advantages in improving medication adherence and reducing TC (MD: −0.39, 95% CrI: −0.71 to −0.068; SUCRA: 87.93%). Conclusions: Different digital health interventions may provide distinct benefits for secondary prevention after stroke or transient ischemic attack. Combined digital technologies appeared to be more effective for reducing SBP, telephone follow-up for improving DBP and LDL-C, and smartphone applications for enhancing medication adherence and reducing TC. However, due to the limited evidence base and small study sample sizes, these findings should be interpreted cautiously. Future large-scale, high-quality trials are required to confirm them. Clinical Trial: The study protocol was registered prospectively in the PROSPERO database (CRD420261367782).

  • Background: Chronic obstructive pulmonary disease (COPD) is a major global health challenge, with the number of affected individuals projected to approach 592 million by 2050. Primary healthcare institutions bear substantial responsibility for COPD screening, diagnosis, and follow-up, but often face underdiagnosis, fragmented information systems, and workforce constraints. Although digital health and artificial intelligence (AI) have shown potential in COPD management, workflow-integrated solutions tailored to primary care remain limited. Objective: To describe a designathon-based co-creation process and the subsequent development of an early-stage prototype of an AI-enabled digital workflow for COPD screening and follow-up management in primary care. Methods: This descriptive process and prototype development study followed WHO practical guidance on crowdsourcing and designathons in health research. It comprised three phases: (1) an online open call (July 22 to August 1, 2025) soliciting ideas related to AI-assisted chronic disease management and digitalized follow-up care; (2) a 3-day in-person designathon in Guangzhou involving 23 participants from five stakeholder groups (primary care physicians, implementation science scholars, AI engineers, patient representatives, and chronic disease management specialists) who worked in five interdisciplinary teams using user journey mapping and structured co-creation activities; and (3) a post-designathon translation phase in which co-created deliverables were synthesized into an early-stage WeChat Mini Program prototype named FeiChangShun. Expert rubric scoring was used to assess team deliverables generated during the designathon. Results: The online open call received 26 submissions, 25 of which met eligibility criteria.
During the designathon, five priority pain points were identified: data silos and interoperability barriers, training–practice disconnect, communication barriers, human resource shortages, and low disease awareness. The five teams generated differentiated workflow concepts and corresponding user journey maps to address these challenges. Drawing on these co-created outputs, the research team developed an early-stage prototype comprising five core modules: voice interaction support, health education support, behavior management support, standardized workflow support, and draft document/report generation. Conclusions: This study reports a structured designathon-based co-creation process and the development of an early-stage, guideline-informed workflow prototype for COPD management in primary care. Future studies should evaluate the prototype with end users and assess implementation feasibility, safety, and clinical impact in real-world settings.

  • Background: Children of parents with mental illness (COPMI) have the right to receive preventive interventions to avoid developing their own mental health and socioeconomic problems. However, access to interventions within health care and social services varies widely depending on geographical location in Sweden. In this regard, mobile health (mHealth) interventions for COPMI provide a pathway for more equitable and sustainable preventive support. Objective: This project has the following aims: (1) to map and assess the quality of available mHealth apps relevant to COPMI; (2) to understand the existing needs for digital health solutions (particularly mHealth) by consulting and collaborating with an interdisciplinary reference group (scholars, child rights organizations, and IT professionals); and (3) to explore the prerequisites for development, implementation, and sustainable access for future digital solutions for COPMI. Methods: We collaborated with a reference group through a series of meetings and workshops, identifying 10 free, highly ranked apps and assessing their quality with the Mobile App Rating Scale (MARS). All the workshop sessions were recorded and transcribed for further qualitative analysis, allowing us to document and derive our main findings. Results: Three out of 10 apps scored high across the MARS dimensions of functionality, aesthetics, information, and engagement, indicating a significant lack of high-quality apps relevant for COPMI. Further findings were derived, such as the preference for more general apps that are not specifically targeted to COPMI, as these could promote self-identification and reduce stigmatization.
Regarding the third aim, the result showed that it was important to find an app that protected users’ privacy by allowing anonymous access to digital support, and that mobile apps should be complemented (or replaced) by web-based applications for accessibility for children who may not be allowed to download apps without parental permission. Conclusions: The sustainability of digital solutions (web or mobile apps) for COPMI is the biggest challenge for future developments. Partnering with providers that are already established in the mental health area is key, extending their services to COPMI, while leveraging an app development infrastructure that already has sustainable processes and business models behind it. To address engagement and deployment issues, it is important to actively involve children through participatory and co-creational approaches when designing and developing mHealth solutions.

  • Background: Traditional neuropsychological assessments for cognitive decline are lengthy in-clinic evaluations by a specialist, with typical wait times of 6-8 months. This creates a substantial patient burden and prolonged diagnostic and treatment timelines. Digital cognitive assessments (DCA) offer a scalable solution to these challenges, but their validation is challenged by the scarcity of large, high-quality datasets with established ground truth. Objective: To develop a model to identify mild cognitive impairment (MCI) and probable dementia using metrics from the Digital Assessment of Cognition (DAC), a brief, remote-capable DCA. A secondary objective was to conduct a preliminary assessment of the model's validity. Methods: We applied a semi-supervised model-based clustering method to combine a large dataset (N=1189) of DAC assessments alone, with a smaller dataset pairing DAC assessments with ground-truth neuropsychological diagnoses (N=248). We examined the model's predictive validity by comparing its predictions with diagnoses on a held-out test set. We examined congruent validity by testing associations with traditional analog assessments and demographic variables. Results: We identified a 6-cluster model with 3 MCI clusters and 2 probable dementia clusters. The model identified cognitively unimpaired, MCI, and dementia groups with high accuracy (78.7%) on the held-out test dataset, and showed excellent ability to identify cognitive impairment (AUROC=0.985) and dementia (AUROC=0.932). We identified strong associations with traditional analog assessments and demographic variables. An exploratory analysis showed evidence that clusters correspond to clinically meaningful subtypes of MCI. Conclusions: These results validate prior exploratory work and demonstrate the potential for more nuanced, holistic, and scalable cognitive assessments in non-specialist settings.

  • Physicians' Job Demands and Job Resources in Digital and Intelligent Healthcare: Scale Development and Validation

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: The rapid development of digital and intelligent medical technologies is profoundly reshaping the clinical work patterns of physicians, introducing new job demands and job resources into their clinical practice. However, existing measurement instruments have not captured these specific changes and novel challenges posed by such technologies, resulting in a lack of corresponding assessment tools, which limits in-depth quantitative research in this field. Objective: This study aims to develop and validate the Job Demands Scale (JDS) and Job Resources Scale (JRS) for physicians suitable for digital and intelligent healthcare scenarios. Methods: Building upon the foundation of prior qualitative interviews and literature review, the dimensions of the scales and a corresponding pool of measurement items were constructed. The scales underwent content revision through two rounds of Delphi expert consultation (N=18) and cognitive interviews (N=6). Subsequently, an online questionnaire survey was conducted with 1,016 clinicians using convenience sampling. The psychometric properties of the scales were evaluated through item analysis, exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and reliability testing. Results: The finalized JDS comprises 22 items across six dimensions: Human-Machine Interaction Burden, Technology Output Risk, Information Security Burden, Occupational Substitution Risk, Doctor-Patient Communication Burden, and Technology Dependence Risk. The JRS consists of 23 items, also organized into six dimensions: Decision-Making Support, Risk Prevention Support, Workload Reduction Tools, Doctor-Patient Collaborative Platform, Precision Efficiency Support, and Clinical Competence Support. EFA indicated that the six factors of the JDS cumulatively explained 71.10% of the variance, and the six factors of the JRS cumulatively explained 58.98% of the variance. CFA demonstrated good model fit for both scales. 
For the JDS, the composite reliability (CR) values for the dimensions ranged from 0.758 to 0.869, and the average variance extracted (AVE) values ranged from 0.441 to 0.687. For the JRS, the CR values ranged from 0.640 to 0.792; however, the AVE values were relatively low, ranging from 0.339 to 0.490. The overall Cronbach's α coefficients for the JDS and JRS were 0.944 and 0.923, respectively. These results demonstrate that both scales possess good preliminary reliability and validity. However, their dimensional structure and discriminant validity still require further optimization. Conclusions: The JDS and JRS developed in this study exhibit good psychometric properties and hold strong potential for effectively evaluating physicians' job demands and job resources within the context of digital-intelligent healthcare. This provides a scientific basis for subsequent related research and clinical management practices. It is noteworthy that the high correlations observed between the dimensions of both scales suggest that future analyses on the impact of job demands and job resources on physicians' work in digital-intelligent healthcare settings should pay attention to the synergistic effects of these elements. Furthermore, the content of the scales should be continuously updated alongside the advancement of digital-intelligent medical technologies.
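The composite reliability (CR) and average variance extracted (AVE) values quoted in the abstract above are conventionally derived from standardized factor loadings. The following is a minimal sketch, assuming the commonly used formulas CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / n; the abstract does not state which formulas the authors applied, and the loadings below are hypothetical values chosen only for illustration.

```python
def composite_reliability(loadings):
    """CR from standardized loadings: (sum)^2 / ((sum)^2 + sum of error variances)."""
    total = sum(loadings)
    # Error variance of each item is 1 - loading^2 under standardization.
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical loadings for one four-item dimension (not taken from the study).
example_loadings = [0.7, 0.7, 0.7, 0.7]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
```

With these hypothetical loadings, CR exceeds 0.7 while AVE falls just below 0.5, which illustrates how a dimension can show acceptable composite reliability together with a relatively low AVE.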

  • Smart Hospitals and Digital Health Powered by 5G and 6G Networks: A Scoping Review

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Global health systems face increasing pressure due to population aging and recurrent pandemics, requiring a transition from Health 4.0 to Health 5.0. Although 4G technologies initiated the era of remote monitoring, their limitations in bandwidth and latency hinder critical real-time applications. Fifth-generation (5G) and sixth-generation (6G) networks, integrated with artificial intelligence (AI) and the Internet of Medical Things (IoMT), have emerged as key enablers of ultra-low-latency and highly reliable services in smart healthcare ecosystems. Objective: This scoping review aimed to map and synthesize scientific evidence on the use of 5G and 6G network technologies in healthcare, particularly in smart hospitals and eHealth services, and to identify related opportunities, challenges, and research gaps. Methods: We conducted a scoping review following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) and the Arksey and O’Malley and Joanna Briggs Institute (JBI) frameworks. The research question was structured using the Population, Concept, Context (PCC) mnemonic. A systematic search was performed in Google Scholar using a comprehensive search string targeting 5G/6G, eHealth, smart hospitals, digital health, and telemedicine. Eligibility criteria included studies in English that explicitly addressed 5G or 6G infrastructure, architecture, or applications in healthcare contexts. Screening and data extraction were performed iteratively by reviewers, and studies were categorized according to implementation maturity, architectural advances, and security requirements. Results: Most identified studies were theoretical proposals (about 57%) or feasibility analyses, with a smaller proportion of real-world implementations. Practical evidence suggests that 5G can reduce emergency response times by up to 30% and enable in-transit imaging-based diagnosis, supporting the transformation of ambulances into advanced triage units. 
However, field tests report real-world 5G latency of approximately 10 ms, which is above the theoretical target of <1 ms and constrains latency-critical applications such as holographic telesurgery. Across studies, security and privacy—particularly for contextual and IoMT sensors—emerged as critical challenges, together with interoperability with legacy systems and the high cost of infrastructure. Conclusions: Advanced connectivity networks, particularly 5G and future 6G infrastructures, are positioned as foundational components for smart hospitals and digital health, supporting the transition from Health 4.0 to Health 5.0. Nonetheless, the evidence base is still dominated by conceptual works, and the full potential of these technologies is limited by technical, organizational, and economic barriers. Future work should prioritize explainable AI, end-to-end security, and sustainable business models to ensure safe, equitable, and clinically meaningful adoption of 5G/6G-enabled smart healthcare. Clinical Trial: This does not apply as it is a survey.

  • Evaluating a Web-Based Intervention for Digital Health Measurement: a Mixed-methods Study

    Date Submitted: May 28, 2026
    Open Peer Review Period: May 31, 2026 - Jul 26, 2026

    Background: Despite its potential to address key challenges in primary health care, digital health measurement faces substantial implementation barriers for health care professionals. To address these barriers, professionals from 4 disciplines (physical therapy, occupational therapy, speech and language therapy, and general practitioner practice assistance) collaborated with researchers to develop an intervention. The intervention comprised a website supported by on-the-job coaching as a temporary implementation strategy during development. Objective: This study explored whether and how the intervention facilitates optimized use of digital health measurement in patient care to inform further intervention refinement. Methods: A mixed-methods formative process evaluation was conducted using a predominantly qualitative approach. Eighteen health care professionals tested the intervention in daily practice. Data collection was guided by the Medical Research Council framework, a predefined process evaluation plan, and the intervention’s initial program theory. Quantitative data (questionnaires, 7-point Global Perceived Effect measures, and monitoring lists) informed semi-structured interviews and focus groups. Data were analyzed using descriptive statistics and directed content analysis. Results: The intervention was largely implemented as intended and improved digital health measurement in patient care by enhancing participants’ capability, opportunity, and motivation. Consistent with the initial program theory, these changes triggered implementation activity at the organizational level, strengthening implementation readiness through bottom-up change processes. Intervention strategies included collaborative learning, modelling, and prompting action. These strategies operated through mechanisms such as experiential learning, in which professionals experienced the benefits and feasibility of digital health measurement, reinforcing motivation for its continued use.
During intervention use, additional processes emerged, including champions facilitating organizational-level adoption of digital measurement by sharing knowledge and enthusiasm with colleagues. Coaching particularly supported initial intervention engagement by contextualizing generic information, stimulating interaction, and prompting action. Individual, organizational, instrumental, temporal, policy, and societal factors interacted with intervention components, strategies, and mechanisms to facilitate or constrain outcomes. Conclusions: Future refinement should strengthen key mechanisms and processes, integrate mechanisms previously supported by coaching, and develop scalable implementation strategies. As no single approach will fit all contexts, practices should tailor implementation to their local needs. The intervention’s generic framework and flexible use of core components support local adaptations.

  • Chatbot-Based Psychiatric Medication Counseling in Outpatients With Schizophrenia: Pre-Post Study

    Date Submitted: May 26, 2026
    Open Peer Review Period: May 28, 2026 - Jul 23, 2026

    Background: Chatbot-based interventions have shown promise in common mental health conditions such as depression and anxiety. However, their application in schizophrenia (SZ), particularly for psychiatric medication counseling, remains extremely limited. Objective: This study aimed to investigate the effects of a rule-based psychiatric medication counseling chatbot on clinical and patient-reported outcomes in patients with SZ. Methods: A total of 31 outpatients with SZ participated in a single-group pre–post study. Participants used a rule-based chatbot via a mobile app for 3 months. The chatbot provided structured guidance on antipsychotic medications, including side effects, management strategies, medication use, expected therapeutic effects and duration of medication. Primary outcomes included medication adherence (Adherence Rating Scale [ARS]), subjective well-being (Subjective Well-being under Neuroleptic Treatment Scale [SWN]), and side effects (Udvalg for Kliniske Undersøgelser Side Effect Rating Scale [UKU]). Secondary outcomes included psychopathology (PANSS), functioning (SOFAS), and insight (SUMD-K). Results: A total of 31 participants (mean age 33.91 years, SD 11.60; 20 males) completed the study. Medication adherence (ARS) showed a trend-level increase (4.77 vs 4.94, t=2.02, p=.057) but did not reach statistical significance. Among SWN subdomains, self-control improved significantly (mean difference 1.48, 95% CI 0.03 to 2.94, p=.045). UKU total severity scores decreased significantly (19.48 vs 14.52, mean difference −4.96, 95% CI −8.53 to −1.40, p=.008), driven by reductions in psychic (mean difference −1.97, 95% CI −3.52 to −0.41, p=.015) and miscellaneous symptom domains (mean difference −1.58, 95% CI −3.06 to −0.10, p=.037). Among secondary outcomes, PANSS positive symptoms decreased significantly (11.39 vs 10.35, mean difference −1.04, 95% CI −1.90 to −0.17, p=.021), whereas functioning (SOFAS) and insight (SUMD-K) did not change significantly. 
Older age (β=0.157, p=.017) and living with family members (β=4.304, p=.047) were associated with improvements in physical functioning, and greater chatbot use was associated with improvements in social integration (β=0.150, p=.029) and socio-occupational functioning (β=0.082, p=.024). Conclusions: A rule-based psychiatric medication counseling chatbot was associated with modest but significant improvements in subjective well-being and perceived side effect burden in patients with SZ, while its impact on medication adherence and broader clinical outcomes was limited. These findings suggest that chatbot-based interventions may serve as a useful adjunctive tool in SZ care, particularly for addressing medication-related concerns. Clinical Trial: Clinical Research Information Service (CRIS) KCT0011949; https://cris.nih.go.kr (registration number: KCT0011949)

  • Background: Digital health technologies (DHTs) for psychosis may help address the substantial gap in access to psychological services, yet prior syntheses are limited by heterogeneous designs and populations. Objective: This systematic review and meta-analysis aimed to synthesize evidence from randomized controlled trials (RCTs) to estimate the relative effectiveness of DHTs in individuals with confirmed psychotic disorders. Methods: Web of Science, PubMed, Embase, Scopus, PsycINFO, and CENTRAL were searched from inception to January 2026. Eligible studies were RCTs enrolling adults with psychotic disorders that evaluated DHT-delivered psychological interventions targeting psychotic symptoms. Comparators included passive and active controls. Primary outcomes were positive, negative, and overall symptoms. Secondary outcomes included depression, anxiety, functioning, quality of life, dropout, and adverse events. Results: Forty-one RCTs (N = 4139) were included. Compared with passive controls, DHTs showed small to moderate significant reductions in positive (g = -0.18, 95% CI: -0.33 to -0.03; I² = 60%), negative (g = -0.32, 95% CI: -0.56 to -0.07; I² = 63%), and overall symptoms (g = -0.41, 95% CI: -0.71 to -0.10; I² = 78%) at posttreatment, with effects for positive symptoms also observed at follow-up. No significant effects were observed when compared with active controls. Subgroup analyses indicated significant effects for delusions but not auditory hallucinations, and stronger effects for therapist-supported than for fully automated interventions. Secondary outcomes showed small improvements at both posttreatment and follow-up in depression, anxiety, and general functioning, but not in quality of life. Heterogeneity was moderate to high in some of the analyses. Dropout rates were comparable across groups, with no consistent pattern of serious adverse events identified, although safety reporting was inconsistent.
Conclusions: DHTs represent a promising approach, with outcomes that appear broadly comparable to face-to-face interventions, while offering potential advantages in accessibility, scalability, and flexibility. Further high-quality RCTs with active comparators and standardized safety monitoring are needed. Clinical Trial: CRD42021251108

  • Refined Exclusion in Medical AI: Reframing Algorithmic Fairness as Data Justice and Patient Safety Governance

    Date Submitted: May 25, 2026
    Open Peer Review Period: May 27, 2026 - Jul 22, 2026

    Medical artificial intelligence (AI) systems are often evaluated through aggregate performance metrics and output-level fairness measures. However, clinically meaningful harms may remain hidden when systems perform well on average while underperforming for data-poor, underrepresented, or structurally marginalized populations. This Viewpoint uses the concept of refined exclusion to synthesize a recurring pattern in medical AI: systems may appear technically successful at the population level while transferring uncertainty, misclassification, delayed recognition, or reduced clinical reliability to groups that are less visible within training data, validation cohorts, proxy definitions, and deployment workflows. Drawing on representative cases from population health management, chest radiograph AI, dermatology, computational pathology, and foundation model applications, we argue that refined exclusion should not be treated merely as algorithmic bias or a defect of model outputs. Rather, it reflects a data governance failure with direct implications for patient safety. Moving beyond output-centered algorithmic fairness, we propose data justice as a governance foundation for medical AI, organized across distributional, procedural, and substantive dimensions. We further outline operational checkpoints across the medical AI lifecycle, including subgroup learnability assessment, data provenance documentation, local validation, procurement-stage accountability, explainability-based proxy audits, post-deployment subgroup monitoring, and patient participation. Reframing refined exclusion as a patient safety problem shifts the central governance question from “Is this model accurate on average?” to “For whom is this system safe, reliable, and clinically accountable?”

  • Designing agentic speech assistance for PROM collection: a qualitative interview study with patients on assistance functions

    Date Submitted: May 22, 2026
    Open Peer Review Period: May 25, 2026 - Jul 20, 2026

    Background: Patient-reported outcome measure (PROM) completion is hindered by patient-level barriers (including motor, sensory, cognitive, and motivational constraints) that risk insufficient participation and non-response bias. While technology-enabled approaches such as multimodal speech assistance hold promise for reducing these barriers, assistance is a complex interaction: it can both alleviate and introduce barriers depending on how well it aligns with patients’ routines and needs. Objective: This qualitative study explores how patients perceive the advantages and disadvantages of AI-based speech assistance for PROM collection, focusing on how assistance functionalities interact with individual barriers and completion practices. Methods: We conducted semi-structured qualitative interviews with 96 psychosomatic and neurological rehabilitation outpatients, embedded in a pragmatic cross-randomised controlled trial. Participants completed PROMs with and without an AI-based speech assistance system offering speech output, speech input, and guidance by a socially interactive agent (SIA) that was physically, virtually, or voice-only embodied. The system was iteratively refined during data collection to address usability and performance issues. We included a broad sample to reflect real-world care settings, including patients without reported barriers. Using inductive content analysis (61 codes, grouped into 4 overarching themes and 9 subthemes), we examined perceived advantages and disadvantages of the three main assistance functionalities and multimodal interaction. Reporting followed the COREQ guideline. Results: The speech output function emerged as the most widely valued assistance feature, with many patients reporting improved concentration, question comprehension, and deeper engagement with item content. The social agent was described as making the interaction more engaging and less monotonous while not evoking social pressure.
Speech input was perceived as helpful by some, especially those with motor impairments or a preference for verbal expression. However, each function also introduced challenges: speech output disrupted reading routines for some, the social agent was perceived as distracting or unnecessary by others, and speech input was criticised for recognition errors, inefficiency, and privacy concerns. Conclusions: AI-based speech assistance for PROM collection offers significant potential to reduce barriers and enhance patient engagement, but its effectiveness depends on alignment with individual needs, preferences and routines. While speech output proved broadly beneficial, speech input and socially interactive agents require careful design to avoid introducing new barriers, particularly for marginalised groups. Configurable, modular assistance systems that adapt to diverse user preferences and impairments are essential for equitable implementation. Future research should focus on inclusive co-design and longitudinal studies to refine these technologies for real-world clinical use. Clinical Trial: German Clinical Trials Register ID: DRKS00035213

  • Background: Adolescent depression is clinically heterogeneous, and the presence of mixed features – defined as subthreshold manic symptoms co-occurring with a depressive episode – complicates diagnosis and treatment. Intensive longitudinal monitoring using wrist-worn actigraphy and daily ecological momentary assessment (EMA) may capture behavioral and experiential signatures that differentiate depression with mixed features (Mixed-Dep) from depression without mixed features (NoMix-Dep), but evidence in adolescents remains limited. Objective: This study aimed to examine whether multimodal digital monitoring using wrist-worn actigraphy and daily ecological momentary assessment of mood and energy can distinguish adolescents with depression with mixed features from those with depression without mixed features, and to identify dynamic energy–activity patterns specific to mixed depression. Methods: Ninety-eight adolescents (ages 12–18; 37 Mixed-Dep, 31 NoMix-Dep, 30 healthy controls) from the longitudinal Mood & Brain Circuitry in Adolescence (MBA) study wore wrist-worn actigraphy devices and completed daily mood and energy self-reports using the Mood and Energy Thermometer (MET) over two weeks. Group classification was defined based on the K-SADS-PL Mania Rating Scale. Dynamic within-person associations among mood, energy, and activity were estimated using generalized estimating equations with a first-order autoregressive working correlation structure, controlling for sleep duration, age, sex, and weekday/weekend status. Results: Both depressed groups showed lower overall activity and greater minimum activity suppression compared to healthy controls (mean activity: F = 32.67, p < 0.001), with NoMix-Dep showing lower minimum activity than Mixed-Dep (Min2: F = 17.91, p < 0.001; Min4: F = 23.37, p < 0.001). 
Mixed-Dep participants had significantly higher positive and negative energy scores (EnergyPosMax: F = 10.12, p < 0.001; EnergyNegMax: F = 91.93, p < 0.001), shorter wake after sleep onset (F = 3.67, p = 0.03), and higher sleep efficiency (F = 7.03, p < 0.01) than NoMix-Dep. Mood scores did not differ between depressed groups. Energy–mood associations were largely similar across groups. Energy–activity temporal coupling differed markedly: NoMix-Dep showed same-day congruent coupling (high energy predicted high activity), while Mixed-Dep showed an inverted lagged pattern (high energy today predicted lower activity tomorrow). Similar group-differential patterns were observed for mood–activity associations. Conclusions: An inverted, lagged energy–activity coupling represents a novel digital phenotype distinguishing mixed from non-mixed adolescent depression. Energy dysregulation, more than mood, differentiates the two depressed subgroups, with implications for scalable EMA-based screening and earlier identification of mixed features in clinical settings.

  • Background: Large language models (LLMs) have shown considerable potential in intelligent healthcare consultation. However, their application in Traditional Chinese Medicine (TCM) gynecology remains limited by semantic gaps between colloquial patient descriptions and professional TCM reasoning, as well as risks of hallucinated medical content. Objective: We proposed MAGR-TCM, a knowledge graph-powered multi-agent retrieval-augmented generation framework for home-based TCM consultation and preliminary risk assessment. Methods: A domain-specific knowledge graph containing 10,231 entities and 32,051 relationships was constructed from 741 curated clinical case records. The framework integrates four specialized agents for question analysis, risk routing, graph reasoning, and response evaluation. Model performance was evaluated using the RAGAS framework and a double-blind expert assessment on 60 independent cases, including a safety stress-test with 10 emergency "Red Flag" scenarios. Results: MAGR-TCM achieved the best overall performance among baseline models, with an average RAGAS score of 0.900 and a consultation professionalism score of 0.904. The proposed framework demonstrated strong factual consistency (Faithfulness: 0.821) and comprehensive diagnostic accuracy (0.952), approaching the performance of human experts. In safety stress testing, MAGR-TCM achieved 100% emergency identification accuracy and the lowest unsafe recommendation rate (0.240) among all evaluated AI systems. Conclusions: The proposed MAGR-TCM framework demonstrates the potential of integrating knowledge graphs and multi-agent reasoning to support interpretable and safety-aware TCM consultation. The system serves as a reliable methodological prototype for intelligent home-based health management and preliminary risk assessment.

  • Background: Other infectious diarrhea (OID) remains an important public health concern in China because of its high incidence, marked seasonality, and substantial burden, particularly among children. Accurate short-term forecasting and early warning are important for timely public health response. However, previous OID forecasting studies have mainly relied on reported case data, and the added value of multisource indicators remains insufficiently evaluated. Objective: This study aimed to develop and evaluate a multisource CNN-BiLSTM-SE Attention model for short-term forecasting and early warning of reported other infectious diarrhea cases in Chongqing, China. Methods: Daily OID case counts in Chongqing from January 2015 to June 2025 were collected, together with meteorological variables and Baidu search indices related to infectious diarrhea. After data normalization, Pearson correlation analysis and random forest variable-importance analysis were used for predictor selection. A CNN-BiLSTM-SE Attention hybrid model was developed to integrate multisource data, extract local temporal patterns, model temporal dependencies, and recalibrate informative feature channels. Forecasting performance was evaluated using RMSE, MAE, MAPE, and R², and compared across different input settings and benchmark models. In addition, 5-day-ahead predictions were converted into binary warning signals using training-set 75th and 90th percentile thresholds, and compared with a persistence baseline. Results: Under the full-input setting, the CNN-BiLSTM-SE Attention model achieved the best predictive performance, with an R² of 0.7828, RMSE of 35.418, MAE of 25.411, and MAPE of 17.27%. Compared with the case-only model, R² increased by 0.0326, while RMSE and MAE decreased by 2.560 and 1.643, respectively. The proposed model also outperformed random forest, XGBoost, CNN, and LSTM. 
In the threshold-based early-warning evaluation, the full-input model showed better overall warning performance than the persistence baseline at both the 75th and 90th percentile thresholds. Conclusions: The CNN-BiLSTM-SE Attention hybrid model improved short-term forecasting of reported OID case counts in Chongqing. Integrating epidemiological, meteorological, and internet search data provided complementary information, suggesting potential utility for OID surveillance, forecasting, and early warning.
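The threshold-based early-warning step described in the abstract above (converting forecasts into binary signals using training-set percentile thresholds, and comparing against a persistence baseline) can be sketched as follows. This is an illustrative sketch only: the function names, the linear-interpolation percentile, the strict "greater than" rule, and all numbers are assumptions, not details taken from the study.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def to_warnings(forecasts, threshold):
    """Binary warning signal: 1 when a forecast exceeds the threshold (assumed rule)."""
    return [int(f > threshold) for f in forecasts]


def persistence_forecast(observed, horizon):
    """Naive baseline: the forecast for day t is the count observed on day t - horizon.

    The returned list aligns with the targets observed[horizon:].
    """
    return observed[:-horizon]


# Hypothetical training-set daily case counts (not data from the study).
train_counts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
threshold_75 = percentile(train_counts, 75)
threshold_90 = percentile(train_counts, 90)

# Hypothetical 5-day-ahead model forecasts.
model_forecasts = [60, 80, 95]
print(to_warnings(model_forecasts, threshold_75))
print(to_warnings(model_forecasts, threshold_90))
```

Applying `to_warnings` to the output of `persistence_forecast` with the same thresholds yields the baseline signals against which the model's warnings would be compared.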

  • Background: The platform-based economy has expanded rapidly through the integration of digital platforms into sectors such as transportation, delivery, and freelance work. Platform labor combines features of precarious employment and digitalized work organization, encompassing both location-based and web-based work. However, the occupational health implications of platform work remain insufficiently understood, particularly regarding how risks differ across platform worker groups. Objective: This study aimed to explore how platform workers experience their working conditions and how platform work affects their health, wellbeing, and safety. Methods: A participatory photovoice study was conducted with platform-based taxi drivers, delivery couriers, and freelancers living in Stockholm. Between September and November 2022, 16 participants were recruited into three groups (5–6 participants per group). Across five sessions, participants documented their working lives through photographs and discussed them collectively, generating 105 photographs in total. Data were analyzed collaboratively to identify key themes and recommendations related to working conditions, health, and wellbeing. Results: Participants identified 14 themes representing major determinants of health, wellbeing, and safety at work, as well as 23 recommendations for improving working conditions. Workers reported exposure to both platform-specific risks, including algorithmic management and digital surveillance, and traditional occupational risks such as psychosocial strain, ergonomic challenges, and traffic-related hazards. Experiences differed substantially across platform work types. Delivery and taxi drivers reported greater exposure to physical and traffic-related risks, whereas freelancers emphasized psychosocial demands and digital work intensification. Economic insecurity and costs associated with maintaining work equipment emerged as common challenges across all groups. 
Attitudes toward flexibility, autonomy, and algorithmic management also varied between worker categories. Conclusions: This study highlights important similarities and differences in working conditions and health risks across platform work types. The findings suggest that research and occupational health interventions targeting platform workers should differentiate between specific forms of platform labor to better capture the diversity of workers’ experiences and exposures.

  • Background: Co-creation is increasingly used in health research, public health, and participatory initiatives to support inclusive, collaborative, and evidence-informed problem-solving. However, the integration of digital technologies into co-creation processes remains fragmented and largely ad hoc, with limited frameworks available to guide technology selection, evaluation, and development. Objective: This study aimed to develop the Co-Tech Taxonomy, an empirically grounded evaluative framework for assessing digital technologies used in co-creation and participatory digital health ecosystems. Methods: Using the Nickerson–Varshney–Muntermann (NVM) taxonomy-building method, the taxonomy was developed through the analysis of six foundational conceptual and empirical frameworks related to co-creation, participatory processes, and digital technologies. The taxonomy was subsequently refined through iterative empirical classification of 84 technologies used in co-creation contexts. Results: The final taxonomy consists of seven functional dimensions: governance, inclusivity, methodology, collaboration, engagement, data management, and cognitive support. Each dimension is operationalised across three progressive levels of co-creation alignment. The empirical mapping revealed that current digital ecosystems remain insufficiently aligned with participatory collaboration requirements, particularly regarding governance, inclusivity, and AI-supported cognitive facilitation. While communication and data-management functionalities were comparatively mature, participatory governance, collaborative decision-making, and AI explainability remained underdeveloped across most evaluated technologies. The taxonomy also enabled the development of a three-tier indicative certification model to support technology assessment and implementation. 
Conclusions: The Co-Tech Taxonomy provides a structured evaluative framework for assessing existing technologies, identifying implementation and innovation gaps, and guiding the development of more inclusive, transparent, interoperable, and AI-ready participatory digital infrastructures. The framework offers a practical foundation for strengthening digitally supported co-creation and participatory collaboration within health-related contexts.

  • Background: House dust mite (HDM) sensitization commonly begins in early life and contributes to persistent allergic airway inflammation and asthma chronicity. Primary prevention via early-life environmental control is a key pathway to reduce HDM sensitization and asthma risk. Objective: To characterize child caregivers’ knowledge, attitudes, and practices (KAP) regarding pediatric HDM control using a hybrid literature/expert-driven and social media-driven approach, and examine associations between KAP levels, child age and caregiver social media activity. Methods: This cross-sectional study comprised two interconnected components: (1) mining of content published between August 2023 and July 2025 from five major Chinese social media platforms, analyzed via Latent Dirichlet Allocation (LDA); and (2) a social media-enhanced web-based KAP survey administered in November 2025 to child caregivers in Chongqing, a warm-humid region where HDMs dominate indoor allergens, with participants recruited via local child health facilities. In total, 132,341 social media documents and 2,275 caregivers of children <18 years were included in the analysis. The main outcomes included social media discourse patterns and domain-specific KAP levels across five dimensions: foundational knowledge (K1), recommended control knowledge (K2), attitude toward social media topics (A1), attitude toward recommended methods (A2), and control practices (P). Stratified analysis was conducted by two exposure variables: child age (≤3 years vs >3 years) and caregiver social media activity (active vs. inactive). Results: LDA topic modeling identified five distinct topic clusters in the social media content. Commercial, emotional, and misleading content collectively dominated the information landscape, accounting for 83.3% of included documents, with commercial content often systematically conflating the concepts of “disinfection” and “mite elimination”. 
Only 16.7% was classified as health educational content focusing on HDM allergy prevention. The average KAP levels of K1, K2, A1, A2, and P domains were 62.9%, 84.7%, 57.0%, 37.8%, and 25.8%, respectively. Social media emerged as the primary knowledge source (80.7%), with methodological knowledge gaps (47.5%) being the top implementation barrier. Caregivers of children ≤3 years had significantly lower self-rated knowledge (23.5% vs. 28.3%, P=.01), stronger endorsement of recommended methods, but also greater information overload (OR 1.39, 95% CI 1.15-1.67, P<.001) and decision difficulties (OR 1.23, 95% CI 1.01-1.52, P<.001). Socially active caregivers showed better performance across multiple items in five domains, but also increased non-recommended practices (ultraviolet irradiation: OR 1.85, 95% CI 1.35-2.53, P<.001) and misconception acceptance (allergy impact exaggeration: OR 1.39, 95% CI 1.04-1.87, P=.03). Conclusions: Complex and suboptimal KAP levels exist, particularly among caregivers of young children (≤3 years). Social media activity associates with both enhanced implementation of control practices and elevated misconception endorsement. These findings reveal critical educational gaps and the necessity of social media intervention. Clinical Trial: Not applicable.

  • Examining inequities in the use of Continuous Glucose Monitors Among People

    Date Submitted: May 6, 2026
    Open Peer Review Period: May 6, 2026 - Jul 1, 2026

    Background: Continuous glucose monitoring (CGM) offers clinical and behavioural benefits for people with type 2 diabetes (T2D), including improved glycaemic control and enhanced self-management. However, important evidence gaps remain regarding whether CGM use is equitably distributed across patient groups and whether Objective: To examine the relationship between CGM use among individuals with type 2 diabetes (T2D) and a range of patient characteristics, including socio-demographic factors linked to health inequities, digital health literacy, clinical characteristics, and service utilisation. Methods: A cross-sectional online survey was conducted in November 2024 among adults in the UK with self-reported type 2 diabetes (T2D), recruited via the YouGov panel. The primary outcome was self-reported CGM use. Predictor variables included PROGRESS-Plus characteristics (age, gender, ethnicity, religion, education, occupation, household income, disability, and social engagement), digital health literacy (eHEALS scale), clinical characteristics (disease duration, current treatment, and complications), overall health status (number of long-term conditions), and healthcare utilisation (frequency of visits). Descriptive statistics and multivariable logistic regression were used to examine associations between CGM use and patient characteristics. Results: Among 403 participants, 12.7% reported CGM use. Nearly half of participants were aged 65 years or older, and 56.80% were male. Most participants were White (83.90%) and lived in urban areas. Higher odds of CGM use were observed among insulin users (OR=3.80, 95% CI: 1.6–9.22, p<0.001). No other demographic, clinical, or service utilisation variables were statistically significantly associated with CGM use. Conclusions: CGM use was primarily driven by insulin therapy, consistent with established clinical pathways within the National Health Service that prioritise access for this group. 
No significant variation was observed across demographic, socioeconomic, or health literacy-related characteristics, suggesting no clear evidence of inequalities in this sample. These findings indicate potentially equitable access, although further research in larger and more diverse populations is needed to confirm these patterns.

  • Background: Depressive disorders are among the most prevalent psychiatric disorders globally and impose considerable individual and societal burdens. Psychotherapy, including cognitive behavioral therapy, is recommended as a first-line treatment, especially for mild to moderate depressive disorders. However, face-to-face psychotherapy is often limited by issues of accessibility and cost. Digital therapeutics (DTx) have gained increasing attention as alternatives for overcoming these hurdles. With advances in digital technology, digital placebos have been increasingly adopted as comparators in clinical trials of DTx. However, the characteristics of these clinical trials, the magnitude of digital placebo effects, and their moderators remain poorly understood. Objective: The objectives of this study were to investigate the characteristics of clinical trials using digital placebos as comparators, and to assess the magnitude of the digital placebo effects and their moderators on depressive symptoms measured by the Patient Health Questionnaire-9 (PHQ-9). Methods: Blinded randomized clinical trials (RCTs) that assessed PHQ-9 scores and used digital placebos as comparators were identified by searching MEDLINE, Scopus, Web of Science, PsycINFO, CINAHL, Cochrane Central Register of Controlled Trials, ClinicalTrials.gov, and ISRCTN in November 2025. The characteristics of the RCTs and of the digital placebos were reviewed systematically. A meta-analysis, including subgroup analyses and meta-regressions, was conducted to investigate the magnitude and moderators of the digital placebo effects. Results: 29 articles and 30 studies with 5680 participants were included in this systematic review and meta-analysis. The most common trial design was a 2-arm, parallel-group study conducted in a single country, adopting “Replaced” and “Mobile” as the placebo approach and delivery type, respectively. 
The pooled effect size for all the included studies was Hedges’ g = 0.44 (95% CI 0.29 to 0.59) with an overall I² = 93.2%. Subgroup analyses showed a moderate-to-large and statistically significant placebo effect in the group of primary psychiatric disorders (Hedges’ g = 0.69; 95% CI 0.40 to 0.99). Meta-regressions indicated that the group of primary psychiatric disorders and baseline PHQ-9 score were the independent moderators of the digital placebo effects and the major contributing factors to the high heterogeneity (R² = 51.5%). Conclusions: Statistically significant digital placebo effects were observed on depressive symptoms, and target population and baseline PHQ-9 score were identified as the independent moderators. These findings have implications for the planning of future DTx clinical trials using digital placebos for depressive symptoms.
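
For readers less familiar with the effect size reported here, the sketch below shows one common way to compute Hedges' g for two independent groups: a standardized mean difference multiplied by a small-sample correction. The abstract does not state which contrast or formula the authors used (for example, a pre-post change within placebo arms would need a different standardizer), so the function name and the input values are purely illustrative assumptions.

```python
import math


def hedges_g(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Hedges' g for two independent groups (illustrative; not the study's code)."""
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    )
    # Cohen's d: standardized mean difference.
    d = (mean_a - mean_b) / pooled_sd
    # Usual approximation of the small-sample correction factor.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return d * correction


# Hypothetical PHQ-9 summary values, for illustration only (not study data).
g = hedges_g(mean_a=14.0, sd_a=4.0, n_a=50, mean_b=12.0, sd_b=4.0, n_b=50)
print(round(g, 3))  # 0.496
```

With 50 participants per group the correction is small (about 0.8%), which is why g and Cohen's d are nearly identical in larger trials.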

  • Quality Criteria for Cancer Patient Portal Content: Framework Development and Pilot Audit Study

    Date Submitted: May 1, 2026
    Open Peer Review Period: May 1, 2026 - Jun 26, 2026

    Background: Patient-facing cancer portals are increasingly used to provide education, support interpretation of results, navigate services, and guide self-management across the cancer journey. However, variation in content quality, transparency, readability, accessibility, and governance can undermine equity, safety, and trust. Objective: To develop and present EU-CiP20 as a first-phase, evidence-informed, operational, and auditable framework of quality criteria for cancer patient portal content. Methods: We synthesised established instruments and authoritative guidance on online health information quality, health literacy and plain-language communication, transparency and conflicts of interest, patient engagement, privacy and data protection, digital governance, accessibility, and AI-related safety. Candidate criteria were harmonised from a broader evidence-mapped set (EU-CiP30) into a streamlined taxonomy (EU-CiP20) using explicit consolidation rules and an auditable mapping trail. Each category was operationalised into four observable sub-criteria and scored using a pragmatic 0-2 scale. EU-CiP20 is presented as an initial comprehensive framework to be refined in the next phase through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and a Delphi expert panel, with the aim of reducing the 20 criteria to a final operational core of approximately 10 criteria. Results: EU-CiP20 comprises five domains and 20 categories spanning accessibility and comprehensibility; evidence and content governance; relevance and personalisation; human-centred design and empowerment; and ethics, safety, and trust. In the pilot, adjusted EU-CiP20 totals ranged from 19.5% to 40.6%. The most consistent gaps were governance signals required for portal readiness, including named clinical ownership, explicit review cycles, evidence traceability, and accessibility auditability. 
Comparator tools characterised content-level strengths but did not fully capture these governance risks. Conclusions: EU-CiP20 offers a practical and auditable first-phase approach to strengthen governance of patient-facing cancer portal content. It complements existing information-quality instruments by linking readability, evidence governance, relevance, empowerment, transparency, safety, and digital trust within a single operational taxonomy. The work is not yet complete: the current 20-criteria framework will be refined through stakeholder focus groups, an online survey with affected cancer patients, expert inquiry, and Delphi expert panel consensus to produce a shorter final set of approximately 10 criteria, followed by assessment of inter-rater reliability, feasibility, sensitivity to change, and real-world implementation impact.

  • Background: Diffusion of innovations theory posits that inequalities arising from the early adoption of new technologies, such as telemedicine, are likely to decrease over time. However, evidence is scarce on the evolution of inequalities related to individual telemedicine adoption over time. Objective: This study aims to assess changes in age and socioeconomic inequalities in telemedicine adoption in Japan from 2020 to 2024. Methods: We used data from a nationwide, internet-based panel survey of the general population in Japan. Participants aged 18–75 years who completed both the 2020 baseline and 2024 follow-up surveys were included. The primary outcome was self-reported telemedicine adoption (ever use at each survey). Using multivariable logistic regression models, we regressed telemedicine adoption on (1) indicators of age and socioeconomic status at baseline, (2) survey year, and (3) their interaction, adjusting for other demographic, socioeconomic, and health-related characteristics. We then estimated the adjusted prevalence of telemedicine adoption in 2020 and 2024 for each age and socioeconomic group. Results: We included 10,818 participants (mean [SD] age, 49.7 [16.8] years; 50.7% women). In 2020, 271 participants (2.5%) reported telemedicine adoption; by the 2024 follow-up survey, this increased to 840 participants (7.8%). The prevalence of telemedicine adoption was lower among older individuals, those with lower educational attainment, those with medium income (vs high income), and unemployed individuals (vs upper non-manual workers) in 2020. While the prevalence increased across groups from 2020 to 2024, the increases were smaller among older age groups (70–75 years: +1.0 percentage points [pp] vs 18–29 years: +13.2 pp; difference-in-differences, −12.1 pp; 95% CI, −18.3 to −6.0 pp). 
Similarly, increases were smaller among unemployed individuals than among upper non-manual workers (+2.8 vs +5.8 pp; difference-in-differences, −3.0 pp; 95% CI, −4.7 to −1.2 pp). Changes in the prevalence of telemedicine adoption did not vary significantly by educational attainment, urban vs rural residence, or income level. Conclusions: Despite growth in telemedicine adoption from 2020 to 2024, age-related and occupational inequalities widened, and educational inequalities persisted, underscoring the need for strategies to reduce age-related and socioeconomic barriers to telemedicine adoption.
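
The difference-in-differences figures quoted in this abstract are differences between two groups' changes in adjusted prevalence. The short sketch below recomputes them from the rounded values given in the abstract; the function name is hypothetical, and the authors derived their estimates and confidence intervals from regression models rather than from this arithmetic.

```python
def diff_in_diff(change_group: float, change_reference: float) -> float:
    """Change in the group of interest minus change in the reference group (pp)."""
    return change_group - change_reference


# Changes in adjusted prevalence from 2020 to 2024, in percentage points,
# taken from the rounded values reported in the abstract.
age_did = diff_in_diff(change_group=1.0, change_reference=13.2)        # 70-75 vs 18-29 years
occupation_did = diff_in_diff(change_group=2.8, change_reference=5.8)  # unemployed vs upper non-manual

print(round(age_did, 1))         # -12.2
print(round(occupation_did, 1))  # -3.0
```

The occupational contrast reproduces the reported −3.0 pp. The age contrast gives −12.2 pp from the rounded inputs, against the reported −12.1 pp; a gap of this size is what one would expect if the published figure was computed from unrounded estimates, although the abstract does not say so.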

  • Longitudinal Modeling or Monitoring of Depression in Speech: A Systematic Review

    Date Submitted: Apr 30, 2026
    Open Peer Review Period: Apr 30, 2026 - Jun 25, 2026

    Background: Depressive disorders are a leading cause of disability worldwide, and more than 40% of people who experience a single depressive episode will experience recurrence. It is, therefore, essential that people living with a depressive disorder are able to access appropriate means of monitoring, to identify recurrences and enable timely interventions. Existing monitoring methods are burdensome for both clinicians and patients, but previous research into automated depression diagnosis has demonstrated links between participants’ depression severity and speech features. Longitudinal depression modeling through speech aims to build on these links and provide automated methods of long-term depression monitoring. Objective: This systematic review collates existing research into the monitoring or modeling of changes in depression severity through its impact on speech. Methods: We searched the ProQuest, Scopus, Web of Science, PubMed, and IEEE Xplore databases for studies relating to the longitudinal modeling of depression in speech. Publications of any age were acceptable, but only English-language studies were included. All studies underwent quality appraisal using the CASP cohort study checklist. Results: We retrieved 22 relevant documents from the database searches, and a further 40 documents through citation chasing and manual searching. Observation periods in these studies ranged from 7 days to 18 months, and sample sizes ranged from 16 to 954. Speech features such as speaking rate and pause duration show promising sensitivity to changes in depression severity. However, other features, such as average energy velocity, exhibit conflicting trends across studies, as does the generalizability of prosodic and acoustic features between languages. Conclusions: We identified significant methodological variation within the data collection, feature extraction, and modeling stages of the studies. 
While there is evidence to suggest that speech features are sensitive to changes in depression severity, some findings are inconsistent between studies. We advocate for greater clarity and consistency in the reporting of methods to support comparisons of findings between studies and generalizability testing. Future work could explore the predictive capacity of speech to identify oncoming depressive episodes. Clinical Trial: PROSPERO CRD420251003661; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251003661.

  • Comparative News Narratives on Electronic Nicotine Delivery Systems in Broadcast Digital Media Across China and The U.S.: A Cross-Country Thematic Analysis, 2020-2025

    Date Submitted: Jan 29, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 28, 2026

    Background: Electronic nicotine delivery systems (ENDS) are at the center of global public health debate. China is the largest producer of e-cigarettes while the U.S. has the largest consumer market, yet analyses of news coverage of ENDS comparing China and the United States (U.S.) remain limited. Objective: The primary objective of this study is to identify and compare dominant themes in ENDS-related news coverage across leading broadcast-branded digital outlets in China and the United States, and to assess how these themes and coverage volume changed over time. Methods: We conducted a thematic analysis of 470 ENDS-related stories from January 1, 2020, to July 30, 2025, from four leading broadcast news digital media platforms: CNN.com and FoxNews.com in the U.S.; CCTV.com and ifeng.com in China. Using a single theme approach, coders identified core themes for each article based on prespecified rules and a hierarchical decision structure. Frequencies and proportion of each core theme were summarized for the overall sample and stratified by country. Pearson chi-square tests and binary logistic regression models were conducted to examine cross-national differences with false discovery rate (FDR) adjusted p-values. Temporal changes in themes were examined and visualized. Results: In U.S. coverage, the most prevalent themes were policy and regulatory governance (32.1%), youth appeal, flavors, and school responses (22.4%), and health risks, harms, symptoms, and dependence (13.9%). In Chinese coverage, the most prevalent themes were commercial practices and market dynamics of ENDS (26.0%), policy and regulatory governance (23.4%), and enforcement and compliance (15.7%). Cross-national differences in themes were consistently observed between the two countries. Between 2020 and 2025, coverage in China transitioned away from commercial and market themes toward greater focus on illicit substances and enforcement, while U.S. 
coverage showed relatively stable focus on commercial market with a gradual increase in enforcement-related reporting. Conclusions: Broadcast news in China and the U.S. may actively shape how ENDS are defined as a public issue and what policy responses appear legitimate. Chinese coverage tends to stress commercial activity and enforcement, whereas U.S. coverage more often foregrounds youth risks and regulatory debates. These distinct thematic patterns may influence risk perceptions and policies in each country and are important to consider in comparative media and public health research.

  • Digital health interventions to prevent post-traumatic arthritis after traumatic knee injury: a scoping review

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: Traumatic knee injuries (TKI) are common and are associated with a 4- to 6-fold increased risk of post-traumatic knee osteoarthritis (PTOAK) over the subsequent 15–20 year period. There is clear evidence that risk can be reduced, but long-term care availability is limited, prompting the development of digital health interventions (DHIs) such as wearable devices, telehealth innovations and mobile apps. Objective: To evaluate existing DHIs against the OPTIKNEE consensus guidelines for PTOAK prevention and investigate adoption into practice. Methods: A search of 7 online databases and the grey literature was completed from inception to 03/06/2025, complemented by hand searching government, charity and university websites for reports and technical prototype papers concerning DHIs to support care after TKI. DHI features were mapped to the OPTIKNEE recommendations and evaluated against the health-technology pathway to identify development stage, and implementation was analysed using Normalisation Process Theory (NPT). Results: 81 reports, 53 peer-reviewed and 28 other, concerning 49 distinct DHIs were found. They were designed for injuries of the anterior cruciate ligament (ACL, n=12); ACL meniscus (n=15); meniscus (n=3); ACL or meniscus (n=2), bone (n=2), patella dislocation (n=1), and 14 were non-specific. No DHI addressed all OPTIKNEE recommendations; however, the eight most complete reported 4/7 components, including exercise, information provision, patient reported outcome measures, goal setting and overall patient outcome. A remote, self-assessed strength evaluation was not reported in any DHI. NPT analysis typically demonstrated low DHI adoption levels, and no clear correlation with health technology pathway stage. The DHI with the highest adoption into routine practice, according to NPT, was ‘getUbetter’ with 56% positive scores. 
Conclusions: There are many available, or developing, DHIs but none include the content recommended by OPTIKNEE to reduce the risk of PTOAK. Further, there is negligible evidence of DHIs being adopted into usual care. There is a clear need to develop guideline-compliant DHIs to support effective prevention.

  • Providing consultation recordings to patients in German routine cancer care: A mixed-methods pilot study

    Date Submitted: Apr 28, 2026
    Open Peer Review Period: Apr 28, 2026 - Jun 23, 2026

    Background: The provision of audio recordings of medical encounters to patients, referred to as consultation recordings, is a well-established intervention to address information needs like recall and comprehension in cancer care. Despite these benefits, consultation recordings are not routine practice. Furthermore, research on consultation recordings in Germany is lacking. Objective: This study aims to pilot test consultation recordings in routine cancer care in Germany and assess feasibility of implementation and perceived effects from the patients’ perspective. Methods: Using a sequential mixed methods approach, we assessed consultation recordings’ use, usability, acceptability, appropriateness, influencing factors, and perceived effects. Consultation recordings were piloted in an outpatient setting. Adult cancer patients were eligible to participate. Four weeks after the recorded consultation, participants received a quantitative questionnaire. In addition, a selection of participants were qualitatively interviewed. Quantitative data were analyzed using descriptive statistics, and qualitative data using a combination of Practical Thematic Analysis and qualitative content analysis. Results: Ninety-seven consultations were audio-recorded and provided to patients. Seventy participants returned the quantitative survey (response rate 72.2%) and 16 participated in qualitative interviews. Most participants listened to the consultation recording and experienced improvements in recall, comprehension, and feeling informed. Routine implementation of consultation recordings was desired by many. The results suggest that patients perceive consultation recordings as feasible. However, we encountered organizational implementation challenges. Conclusions: This study provides initial evidence on the patient-perceived feasibility of consultation recordings in German routine cancer care. Consultation recordings have the potential to help patients navigate complex medical information. 
However, organizational implementation challenges hinder their uptake. Future research could investigate technically easier solutions suited to the German healthcare context.