The leading peer-reviewed journal for digital medicine and health and health care in the internet age. 

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing. Since December 2009 it has offered open peer review, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently under consideration by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where submitting authors have not opted out of open peer review and where the editor has not yet made a decision. (Note that this feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.)

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

JMIR Submissions under Open Peer Review

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

  • Digital Health Technology Use Among Rehabilitation Professionals in China: A National Cross-Sectional Survey

    Date Submitted: Dec 31, 2025
    Open Peer Review Period: Jan 1, 2026 - Feb 26, 2026

    Background: The rapid expansion of rehabilitation needs in China has intensified pressure on a workforce that remains unevenly distributed. Digital health technologies offer potential to increase service reach and efficiency. However, little is known about how rehabilitation professionals currently gather and document clinical information, nor about their readiness to integrate digital tools into routine practice within China’s rapidly digitalizing health system. Objective: This study aimed to describe how rehabilitation professionals in China collect subjective and objective clinical information, document patient data in routine practice, and assess their willingness to use digital health technologies in clinical settings. Methods: We conducted a national observational cross-sectional survey using a culturally adapted questionnaire based on the World Health Organization Digital Health Interventions framework. The instrument assessed participant characteristics, information collection methods, documentation practices and willingness to adopt digital functions across rehabilitation activities. Descriptive analyses and subgroup comparisons were performed on 324 complete responses from certified rehabilitation professionals. Results: Respondents represented 20 provinces across China, and 82.7% were employed in public sector rehabilitation services, consistent with national workforce distribution patterns. Traditional methods dominated clinical work. Face to face communication was used frequently for subjective assessment by 96.3% of respondents, whereas digital channels such as email (14.2%) and telephone (5.2%) saw limited use. For objective information, visual observation (83.7%) and manual measurement tools (60.2%) remained the primary approaches, while motion capture technology (13.8%) and wearable sensors (4.0%) were rarely used. 
Documentation practices also relied heavily on analogue formats, with 82.1% using handwritten notes and 60.2% using paper templates. In contrast, willingness to adopt digital health technologies was consistently high, with more than 75% of respondents indicating readiness to use digital systems for identity verification, progress tracking and outcome measurement. Conclusions: Rehabilitation professionals in China demonstrate strong readiness to use digital health technologies, yet their routine practice remains largely paper based and analogue. These findings provide national level evidence to inform implementation strategies, workforce training and system level planning aimed at accelerating digital transformation in rehabilitation services.

  • Background: Systematic collection of social determinants of health (SDoH) data remains inconsistent across healthcare settings, despite its critical impact on patient outcomes. Large language model (LLM)-powered chatbots offer promise for scalable SDoH data collection, but rigorous, feasible evaluation methods for patient-facing applications are lacking. Objective: To describe an efficient, iterative, multidisciplinary approach for developing and evaluating a patient-facing SDoH chatbot using synthetic data and case simulation, with the goal of optimizing both chatbot performance and the evaluation rubric prior to clinical deployment. Methods: A 10-criterion evaluation rubric was adapted from established healthcare artificial intelligence (AI) frameworks and applied to 27 synthetic clinical scenarios representing diverse SDoH profiles. Scenarios were role-played by a licensed clinical social worker, and chatbot-patient interactions were independently rated by three multidisciplinary experts (social worker, nurse practitioner, physician). Quantitative analysis used descriptive statistics and percent agreement to characterize chatbot performance and rater consensus, with percent agreement selected due to high prevalence of ceiling effects in several domains. Qualitative analysis synthesized rater feedback to guide iterative refinement of both chatbot prompts and rubric domains. Results: The chatbot demonstrated robust performance in domains such as accurate interpretation (mean = 0.98, SD = 0.09), communication quality and cultural sensitivity (mean = 0.99, SD = 0.06), and adaptive questioning (mean = 0.99, SD = 0.06), with near-perfect rater agreement. Lower scores and greater variability were observed in completeness of data collection, systematic domain exploration, and safety, prompting targeted adaptations. 
Qualitative feedback highlighted the importance of distinguishing screening from clinical interviewing capabilities and informed the refinement of the rubric, including clarifying the definition of safety to focus on recognition of physical and mental health emergencies. Conclusions: This study provides a practical, replicable blueprint for pre-deployment evaluation of patient-facing SDoH chatbots, balancing rigor with feasibility. The iterative, multidisciplinary approach enabled rapid identification and remediation of performance gaps, supporting responsible integration of AI into SDoH data collection. Explicit performance thresholds and rubric refinement are essential for protecting patient trust and safety, particularly in vulnerable populations. Future work will validate findings with real patient interactions and expand stakeholder involvement. Clinical Trial: N/A

  • A Data Analytics Dashboard for Pre-MRI Safety Screening of Implantable Medical Devices

    Date Submitted: Dec 27, 2025
    Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026

    This research letter summarizes the development and deployment of a data analytics dashboard that uses natural language processing to streamline pre-MRI safety screening for implantable medical devices, resulting in a 98% reduction in manual screening workload while maintaining high diagnostic accuracy.

  • Usability Across Three mHealth Problem Solving Training Interventions for Diverse Neurodevelopmental and Neurological Populations: Multicase Usability Evaluation

    Date Submitted: Dec 27, 2025
    Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026

    Background: Mobile health (mHealth) interventions that integrate psychoeducation with structured problem-solving training (PST) hold strong potential for improving self-management of chronic conditions. Evaluating the usability of these interventions requires assessing technological, pedagogical, and sociocultural fit. However, most usability evaluations remain narrowly technocentric, focusing on interface-level metrics while neglecting pedagogical coherence, cultural responsiveness, and patient learning needs. Objective: This study aimed to characterize usability challenges and facilitators across three psychoeducational mHealth Problem Solving Training interventions and to identify technological, pedagogical, and sociocultural design features that can improve engagement, accessibility, and implementation for diverse users. Methods: A multi-method, multicase study was conducted with a total of n=14 participants who completed think-aloud usability sessions while interacting with one of three different mHealth PST interventions designed for persons with neurodevelopmental and/or neurological disability: (1) Epilepsy Journey 2.0, (2) Survivor’s Journey, and (3) Electronic Problem-Solving Training (ePST). Participants completed a presession technology comfort survey and the Comprehensive Assessment of Usability for Learning Technologies (CAUSLT) postsession. All sessions were recorded, transcribed, and analyzed thematically. CAUSLT data were analyzed using descriptive quantitative methods. Results: ePST demonstrated the highest usability (x̄ 88 out of 100, 95 percent CI 71.8 to 104.2), followed by Survivor’s Journey (x̄ 83 out of 100, 95 percent CI 57.1 to 108.9) and then Epilepsy Journey 2.0 (x̄ 79 out of 100, 95 percent CI 65.3 to 92.7). Findings revealed that usability in healthcare learning design is shaped by how effectively the technology, learning content, and contextual factors align with patients’ needs. 
Recurring challenges across interventions included unclear navigation, poor mobile responsiveness, instructional ambiguity, insufficient feedback, potential for greater inclusivity, and limited error recovery. Twelve cross-case design principles were derived, emphasizing mobile-first accessibility, cognitive load reduction, context-sensitive feedback, and empathetic, inclusive design. Conclusions: Usability challenges in mHealth PST interventions arise not only from interface level issues but also from how effectively the intervention supports users’ understanding, decision making, and real world application demands. This extends prior mHealth usability research by demonstrating that user difficulties often reflect misalignments between technological features, instructional structure, and the everyday contexts in which individuals engage with PST. Resulting design principles highlight specific, actionable priorities for developers, including mobile first optimization, clearer task scaffolding, and better feedback and error recovery. Future work should evaluate these principles in larger samples and clinical settings to determine their impact on engagement, adherence, and downstream health outcomes.

  • Digital Transformation in Healthcare: Are we on the right track?

    Date Submitted: Dec 26, 2025
    Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026

    Healthcare digital transformation is gaining increasing prominence, despite the observed challenges in its implementation. The envisioned benefits, together with the growing need for better healthcare, are motivating academia, organizations, regulatory agencies, and governments to develop more effective digital healthcare solutions. Through extensive debates among the authors and supported by a narrative literature review, this paper discusses how digital transformation is being conducted in the healthcare sector. Our discussion relies on concepts from sociotechnical systems theory, categorizing it according to three social (people, culture, and goals) and three technical (processes/procedures, infrastructure, and technology) dimensions. Overall, we argue that both social and technical dimensions present elements that have been either encouraging or discouraging the progress of healthcare digital transformation. The identification of current trends in such (on- and off-track) elements allowed the formulation of propositions for future testing and validation. This approach can help the establishment of better government policies, foster private initiatives, and shift regulatory guidelines to support a successful digital transformation in health systems. Lastly, from a research perspective, we outline some opportunities for further interdisciplinary investigation in the field, promoting advances in the understanding of healthcare digital transformation.

  • Commercialization of Online Cancer Information in South Korea: Examining Covert Promotional Cancer-related Posts Across Two Major Search Engines

    Date Submitted: Dec 25, 2025
    Open Peer Review Period: Dec 25, 2025 - Feb 19, 2026

    Background: Internet search engines serve as primary gateways for cancer information, yet the commercialization of health content within organic search results remains understudied. While covert promotional content—such as native advertising and stealth marketing—has been documented in various contexts, systematic comparisons across structurally divergent search platforms are lacking. Objective: This study examined the prevalence, distribution, and information quality characteristics of covert promotional cancer-related content across Naver and Google, South Korea's two dominant search engines, which have fundamentally different platform architectures. Methods: A two-phase cross-sectional content analysis was conducted. Phase 1 employed natural language processing to identify 33 cancer-related keywords from 1,400 preliminary posts. Phase 2 systematically collected 5,848 posts in October 2023, yielding 919 unique posts (598 from Naver and 321 from Google) that covered seven major cancer types, representing over 70% of Korean cancer incidence. Two trained coders analyzed promotional status, intensity, institutional sources, and information quality indicators (citation practices, information depth, and source attribution), with inter-coder reliability exceeding κ=.80. Chi-square tests examined the associations between platform and cancer type. Results: Covert promotional content appeared in 48.6% (447/919) of analyzed posts, with significantly higher prevalence on Google (54.2%, 174/321) than Naver (45.7%, 273/598; χ²₁=5.78, p=.016). Platform differences were pronounced: Naver promotional posts predominantly originated from blogs (96.0%, 262/273) and exhibited full promotional intensity (52.1%, 126/242), while Google posts primarily came from hospital websites (81.0%, 141/174) with simple institutional identification (57.8%, 52/90). 
Institutional source distribution varied significantly by platform (χ²₅=215.714, P<.001): traditional medicine institutions dominated Naver (99.2%, 119/120), whereas university-affiliated hospitals predominated on Google (85.0%, 96/113). Information quality differed substantially: indirect citation was more common on Google (81.6%, 142/174) than Naver (58.6%, 160/273; χ²₁=25.653, P<.001), while comparative informational depth was higher on Google (55.7%, 97/174) versus Naver (19.4%, 53/273; χ²₂=64.683, P<.001). Conclusions: Covert promotional cancer content is pervasive in Korean search results, with platform architecture systematically shaping promotional patterns, institutional sources, and information quality rather than reflecting deliberate marketing strategies. These findings underscore the need for platform-sensitive regulation and enhanced digital health literacy to protect vulnerable cancer information seekers from commercial exploitation embedded within ostensibly neutral search environments.
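
The platform comparison above (174/321 promotional posts on Google versus 273/598 on Naver) can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis code: the abstract does not say whether a continuity correction was applied, so the `yates` flag and the function name `chi_square_2x2` are assumptions made for illustration. With the Yates correction enabled, the counts reproduce the reported statistic to two decimals.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test (df = 1) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction (assumed here): shrink |ad - bc| by n/2.
        diff = max(diff - n / 2, 0.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts taken from the abstract: (promotional, non-promotional) per platform.
google = (174, 321 - 174)
naver = (273, 598 - 273)

chi2, p = chi_square_2x2(*google, *naver)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 5.78, p = 0.016
```

Without the correction (`yates=False`) the same counts give a statistic of about 6.12, so the choice matters when trying to reproduce a published value.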

  • Background: Chronic low back pain is a major global health challenge. While non-pharmacological therapies are recommended, patient compliance is often hindered by kinesiophobia. Virtual reality (VR) offers an immersive, distraction-based approach, but the comparative effectiveness of different VR modalities remains unclear. Objective: To compare and rank the efficacy of different Virtual Reality Exposure Therapy (VRET) modalities on pain intensity, functional disability, and kinesiophobia in patients with chronic low back pain (CLBP). Methods: Systematic searches were conducted in PubMed, Web of Science, Scopus, Embase, CINAHL, and the Cochrane Library from inception until June 2025. Randomized controlled trials assessing the effects of virtual reality exposure therapy on individuals with chronic low back pain were selected. Primary outcomes were pain intensity, functional disability (Oswestry Disability Index), and kinesiophobia (Tampa Scale of Kinesiophobia). The Cochrane Risk of Bias tool (RoB2) was used for quality assessment. A Bayesian Network Meta-Analysis with Standardized Mean Difference (SMD) as Effect Size was performed to synthesize evidence and rank interventions using Surface Under the Cumulative Ranking Curve (SUCRA) values. The GRADE framework was adapted to evaluate the quality of evidence. Results: 25 RCTs with a total of 2,610 participants were included in the analysis. For pain intensity, shooting games (SMD -4.40; 95% CrI -6.80 to -2.20) and VR-based equestrian training (SMD -2.00; 95% CrI -3.70 to -0.57) were significantly superior to all types of controls. Surface Under the Cumulative Ranking Curve (SUCRA) indicated that shooting games had the highest probability (98%) of being the most effective intervention for pain relief. For functional disability, no intervention demonstrated statistically significant superiority. 
For kinesiophobia, shooting games (SMD -3.40; 95% CrI -5.60 to -1.10) significantly outperformed traditional exercise controls. The quality of evidence ranged from very low to moderate across outcomes. Conclusions: VRET, particularly in the form of shooting games and VR-based equestrian training, appears effective for reducing pain in CLBP. Clinicians may prioritize the higher-ranked shooting games, reserving VR-based cognitive behavioral therapy as an alternative or adjunct for kinesiophobia. However, the benefits for functional improvement remain uncertain. Clinical Trial: PROSPERO CRD420251131116; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251131116
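
For readers unfamiliar with the SUCRA values mentioned in this abstract, the sketch below shows how a SUCRA score is commonly derived from a treatment's rank probabilities: it is the average of the cumulative probabilities of being among the top 1, 2, ..., a-1 ranks, where a is the number of treatments. The function name and the rank probabilities are hypothetical and chosen only for illustration; they are not data from the review.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    rank_probs[k] is the probability of holding rank k + 1 (best rank first);
    the probabilities are expected to sum to 1.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("SUCRA needs at least two competing treatments")
    cumulative = 0.0
    total = 0.0
    # Accumulate P(rank <= k) for k = 1 .. a - 1 and sum those values.
    for prob in rank_probs[:-1]:
        cumulative += prob
        total += cumulative
    return total / (a - 1)


# Hypothetical rank probabilities for three treatments (illustrative only).
example = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.2, 0.5, 0.3],
    "C": [0.1, 0.3, 0.6],
}
for name, probs in example.items():
    print(name, round(sucra(probs), 2))
```

A treatment that is certain to rank first scores 1 and one that is certain to rank last scores 0, which is why a SUCRA of 98% indicates a very high probability of being among the best options.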

  • Using GPT-4 to Automate the Generation of Lay Summaries for Cancer Publications: Human-centric Quantitative and Qualitative Evaluation

    Date Submitted: Dec 23, 2025
    Open Peer Review Period: Dec 24, 2025 - Feb 18, 2026

    Background: Cancer research literature is often riddled with technical jargon that is not digestible to the average person. Individuals interested in research studies may want to contribute through patient partner engagement or sample donation but find the relevant literature overwhelming. Through the generation of lay summaries, previously inaccessible research papers become easier to comprehend, especially for patient partners or data donors. With large language models (LLMs) continuing to advance, so does their capability to summarize large texts. Objective: In this study, we examined whether LLMs can produce lay summaries of scientific literature at-scale, while maintaining readability and accuracy to their source texts. Methods: We developed a tool to generate lay summaries of open-access article abstracts and their full texts with GPT-4-Turbo. Prompt development aimed for a target 8th grade reading level assessed with Flesch-Kincaid Grade Level. Human-review metrics were used to evaluate readability and accuracy when generated using abstracts versus full text articles. Results: The average Flesch-Kincaid Grade Level Score was 7.13 for abstract-based summaries and 7.39 for full text-based summaries, indicating summaries at around 7th grade reading level. Human-review metrics showed these summaries were of similar readability and accuracy when generated using abstracts versus full text articles, with mean accuracy scores from human review of 7.09 vs 7.42 out of 10 respectively. Additionally, qualitative patient-based assessment indicated these summaries would encourage participation in research studies. Conclusions: By generating lay summaries for complex and lengthy research papers, their scientific information becomes accessible to a larger audience, including patient partners interested in contributing to cancer research. 
Summaries that are easy to understand will allow participants to make informed decisions about their involvement and appreciate the impact of their contributions if and when their results are published.
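
The Flesch-Kincaid Grade Level used as the readability target in this abstract is a fixed formula over word, sentence, and syllable counts. The sketch below applies the standard published coefficients to counts supplied by the caller; the function name and the example counts are hypothetical, and counting syllables reliably is left out because tools differ in how they do it.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts of a text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts: 100 words in 5 sentences with 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"{grade:.2f}")  # 9.91
```

Shorter sentences and shorter words both lower the score, which is why prompt development aimed at an 8th grade target typically pushes toward both.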

  • Large Language Models in Colorectal Cancer: A Systematic Review

    Date Submitted: Dec 22, 2025
    Open Peer Review Period: Dec 23, 2025 - Feb 17, 2026

    Background: The growing complexity of colorectal cancer (CRC) management requires advanced tools for integrating multimodal data and clinical knowledge. Large language models (LLMs) offer a promising approach to address these challenges through sophisticated natural language processing and reasoning capabilities. Objective: This systematic review evaluates the current applications, performance, and practical implications of LLMs across the continuum of CRC care, from screening to treatment decision support. Methods: We searched six databases (PubMed, Embase, Web of Science, Scopus, CINAHL, Cochrane) up to November 1, 2025, following PRISMA guidelines. Included studies were original research investigating LLM applications specific to CRC, with extractable outcome data. Quality was assessed using QUADAS-2, PROBAST, and ROBINS-I tools by two independent reviewers. Results: Following the screening of 1,261 records, 34 studies met the inclusion criteria, all published between 2023 and 2025. The synthesis highlighted the utility of LLMs in automating data extraction from clinical texts, supporting patient education, aiding diagnostic processes, and assisting in clinical decision-making, with growing evidence of their emerging visual interpretation and multimodal capacities. The effectiveness of these models was significantly influenced by prompt design, which varied from basic zero-shot queries to specialized fine-tuning techniques. While the overall methodological quality of the included studies was deemed adequate, assessments identified recurring concerns regarding insufficient control of biases and inadequate reporting on data security measures. 
Conclusions: LLMs demonstrate tangible potential to augment CRC care, particularly in structuring unstructured data and providing clinical decision support. However, translating this potential into practice requires solutions for domain adaptation, multimodal integration, and rigorous prospective validation to ensure reliability and safety in real-world settings. Clinical Trial: PROSPERO CRD420251248261; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251248261.

  • Background: Anxiety disorders are highly prevalent among autistic adults, with 20%-65% experiencing at least one diagnosable anxiety disorder. While mindfulness-based interventions have demonstrated efficacy for anxiety reduction, treatment response varies considerably across individuals. Machine learning approaches offer potential for identifying who is most likely to benefit from smartphone-based mindfulness interventions, enabling more personalized treatment recommendations. Objective: This study aimed to develop and evaluate machine learning models to predict individual treatment response, in the form of reduced anxiety symptoms, to a smartphone-based mindfulness intervention for autistic adults. We sought to identify baseline characteristics that distinguish responders from non-responders, explore few-shot learning with large language models as a complementary approach for low-data clinical prediction, and implement a Personalized Advantage Index approach for individualized treatment recommendations. Methods: We conducted a secondary analysis of data from a randomized controlled trial comparing a 6-week smartphone-based mindfulness intervention (Healthy Minds Program) with a waitlist control condition in autistic adults. Among 73 participants who completed the intervention, we defined responders as those achieving ≥7-point reduction in State-Trait Anxiety Inventory state anxiety scores. Baseline predictors included demographic variables, autism trait measures, and self-report questionnaires assessing anxiety symptoms, perceived stress, affect, and mindfulness. We trained six machine learning models (logistic regression, Random Forest, XGBoost, TabNet, Tab-ICL, and TabPFN) using nested 10-fold cross-validation with inner 5-fold cross-validation for hyperparameter tuning. Additionally, we evaluated few-shot learning using GPT-4o models with tokenized baseline features at varying shot counts (20-70 examples). 
Model performance was evaluated using area under the receiver operating characteristic curve (AUC) for the machine learning models and classification accuracy for few-shot learning. We examined feature importance and implemented Personalized Advantage Index analysis to estimate individualized treatment benefit. Results: Random Forest achieved the highest predictive performance for state anxiety response (AUC 0.79, 95% CI 0.66-0.91), followed by TabPFN (AUC 0.78, 95% CI 0.64-0.94) and logistic regression (AUC 0.77, 95% CI 0.73-0.81). Higher baseline state anxiety (coefficient 1.20, P<.001) predicted better treatment response, while higher trait anxiety (coefficient -0.17, P=.001), older age (coefficient -0.18, P=.02), and lower childhood pretend play scores (coefficient -0.93, P=.007) were associated with poorer response. Few-shot learning with 7-feature tokenization achieved an accuracy of 0.867 (95% CI 0.81-0.92) at 70 shots, significantly outperforming the Random Forest baseline (0.733, P<.001). Prediction of trait anxiety changes was substantially weaker (AUCs 0.57-0.68), likely reflecting the inherent stability of this personality dimension. The Personalized Advantage Index demonstrated significant moderation of treatment group differences (adjusted R²=0.29), with 75% of participants predicted to benefit more from the mindfulness intervention than the waitlist control. Conclusions: Machine learning models successfully identified baseline characteristics predicting treatment response to a smartphone-based mindfulness intervention in autistic adults. Few-shot learning with large language models demonstrated superior performance to traditional machine learning when provided with compact, high-signal feature representations, offering a promising approach for clinical prediction in small-sample settings. These findings demonstrate the feasibility of precision psychiatry approaches in digital mental health interventions for autistic adults. 
While modest sample size and limited demographic diversity warrant cautious interpretation, the stable cross-validation performance suggests robust predictive patterns within similar populations. Future research should validate these models in larger, more diverse samples and explore whether algorithm-guided treatment recommendations improve outcomes compared to standard care, through prospective randomized trials.
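
The AUC values reported in this abstract can be read as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The sketch below computes AUC directly from that pairwise definition, counting ties as half a win. The function name and the scores are hypothetical; this is not the authors' evaluation pipeline, which the abstract describes as nested cross-validation.

```python
def pairwise_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of response (illustrative only).
responders = [0.9, 0.8, 0.4]
non_responders = [0.7, 0.3, 0.2]
print(round(pairwise_auc(responders, non_responders), 3))  # 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values near 0.79 indicate useful but imperfect discrimination.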

  • Gender Bias in Large Language Models for Healthcare: Assignment Consistency and Clinical Implications

    Date Submitted: Dec 22, 2025
    Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026

    Background: The integration of large language models (LLMs) into healthcare holds promise to enhance clinical decision-making, yet their susceptibility to biases remains a critical concern. Gender has long influenced physician behaviors and patient outcomes, raising concerns that LLMs assuming human-like roles, such as clinicians or medical educators, may replicate or amplify gender-related biases. Objective: To evaluate the consistency of LLM responses across different assigned genders (personas) regarding both diagnostic outputs and model judgments on the clinical relevance or necessity of patient gender. Methods: Using case studies from the New England Journal of Medicine Challenge (NEJM), we assigned genders (female, male, or unspecified) to multiple open-source and proprietary LLMs. We evaluated their response consistency across LLM-gender assignments regarding both LLM-based diagnosis and models’ judgments on the clinical relevance or necessity of patient gender. For representative models with high diagnostic accuracy, we further evaluated consistency across question difficulty tiers and clinical specialties. Results: All models showed high diagnostic consistency across assigned LLM genders (range of consistency rates: 91.45%–97.44%), though this did not always correspond to diagnostic accuracy (e.g., GPT-4.1: 97.44% consistency, 0.943 accuracy; Gemma-2B: 97.44% consistency, 0.478 accuracy). In contrast, judgments on the clinical importance of patient gender showed marked inconsistency: consistency rates ranged from 58.97% to 90.6% for relevance judgements, 78.63% to 98.29% for necessity judgements. Stratified by difficulty tier and specialty, the open-source model (LLaMA-3.1-8B) particularly showed statistically significant differences across LLM genders regarding both relevance and necessity judgements. 
Conclusions: Despite stable diagnostic outputs, LLMs varied substantially in their assessments of patient gender’s clinical importance across gendered personas. These findings present an underexplored bias that could undermine the reliability of LLMs in clinical practice, underscoring the need for routine checks of identity-assignment consistency when interacting with LLMs to ensure reliable and equitable AI-supported clinical care. Clinical Trial: not applicable

  • Background: Clinicians spend a substantial share of their working hours on documentation, contributing to workflow inefficiencies, reduced patient-facing time, and increased burnout. AI medical scribes have emerged as a promising solution to reduce this burden, yet real-world evidence remains limited and heterogeneous. Data from European health systems are especially scarce, despite growing interest in AI-enabled documentation support. Reducing clinicians’ documentation burden is a critical priority in modern health care, as excessive administrative work consumes substantial clinician time, contributes to burnout, and limits time available for direct patient care. Objective: To quantify the impact of an AI medical scribe on documentation time and clinician experience. Methods: This observational real-world evaluation was conducted between April 26th 2024 and October 27th 2025 to assess the impact of an AI medical scribe on documentation time and clinician experience using retrospective paired ratings. The study was carried out across multiple specialties in primary, secondary and hospital care within Capio Ramsay Santé, a large integrated health care provider operating in Sweden. The target population consisted of licensed clinicians actively using the AI medical scribe in routine clinical practice. Eligibility was limited to “fully onboarded” users, defined as clinicians who had used the scribe for at least 3 months, created more than 100 notes, generated at least one document or certificate, and used the conversational edit (“Add or adjust”) feature at least once. Results: With the introduction of the AI medical scribe, the estimated time spent on documentation per note decreased from 6.69 minutes to 4.72 minutes (-29%, p = 1.70e-11). 
On a five-point Likert scale, the ability to work without stress related to administrative tasks increased from a mean of 2.41 to 3.14 (p = 2.46e-8), and perceived presence with patients increased from 3.73 to 4.33 (p = 2.47e-8). The median editing time was 93 seconds, and it did not decrease significantly over continued use. Conclusions: This study shows that the clinician time savings and reductions in cognitive load and stress reported in prior US-based studies can also be achieved in a European health care system using an AI scribe. Clinical Trial: The study adhered to the Standards for Quality Improvement Reporting Excellence (SQUIRE) guideline and was preregistered on the Open Science Framework on 7 October 2025 (DOI: 10.17605/OSF.IO/YPD9E)
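
The reported reduction in per-note documentation time can be checked with a short calculation. The sketch below is only illustrative: the function name `relative_change` is an assumption introduced here, and the two input values are the means quoted in the abstract.

```python
def relative_change(before: float, after: float) -> float:
    """Return the relative change from `before` to `after` as a fraction."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


# Mean documentation time per note (minutes), as quoted in the abstract.
before_minutes = 6.69
after_minutes = 4.72

change = relative_change(before_minutes, after_minutes)
print(f"Relative change: {change:.1%}")
```

Rounded to a whole percent, this agrees with the -29% stated in the Results.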

  • Physical voice parameters and mental state changes in affective disorders – a longitudinal study using the MoodMon AI system

    Date Submitted: Dec 19, 2025
    Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026

    Background: Psychiatry needs objective technological tools to address global staffing shortages, stigma, and other systemic challenges. A long-term, naturalistic study using AI to effectively detect changes in mental state in major depressive disorder (MDD) and bipolar disorder (BD) based on physical characteristics of the voice represents a breakthrough in biomarker validation. The MoodMon system was developed along with a mobile application for smartphones. Objective: The aim of the study was to determine whether physical voice parameters would be effective as biomarkers of mental status changes in affective disorders and whether they would be useful in remote clinical monitoring of patients by psychiatrists. Methods: To evaluate the effectiveness of artificial intelligence (AI) algorithms in detecting changes in mental state based on physical voice parameters, data collected over 944 days from 75 patients diagnosed with bipolar disorder (BD) and 25 patients with major depressive disorder (MDD) were used. This makes it the longest analysis in the world covering two of the most common mental disorder diagnoses. A wealth of clinical, behavioral, and technical data was collected and used to train the MoodMon machine learning system under the supervision of human experts (experienced psychiatrists). The AI module consists of an ensemble of selected supervised learning and clustering algorithms. In the first stage, the AI was trained using objective data and clinical assessments conducted by psychiatrists, including 17-item versions of the HDRS and YMRS, as well as the CGI scale. The second stage involved further refinement of the AI using individual and population data and generating alerts when subtle changes in mental state were detected. Results: 19 of the 243 specific physical voice parameters tested were found to be most effective in detecting changes in mental status. 
The system demonstrated high performance, achieving the following sensitivity (true positive rate – TPR) and specificity (true negative rate – TNR) values for both diagnoses: TPR = 89.5%, TNR = 98.8%; BD: TPR = 89.6%, TNR = 98.9%; MDD: TPR = 89.1%, TNR = 98.5%. Voice alerts in the MoodMon system are a key tool supporting clinical decision-making. They increase the probability of a clinical visit and exert a significant influence on the likelihood of treatment modification. Conclusions: The system confirmed the presence of parameters that may serve as biomarkers of mental state changes in bipolar disorder (BD) and major depressive disorder (MDD). A key clinical implication is the increased probability of prompt treatment modification following an alert, thereby supporting the primary objective underlying the development of the MoodMon AI tool. Clinical Trial: Study: UR.D.WM.DNB.39.2021; Funder: National Centre for Research and Development, Poland. Project title: Development of a system supporting the monitoring of the course and early detection of relapses of affective disorders based on artificial intelligence algorithms. Agreement: POIR.01.01.01-00-0342/20
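
For readers less familiar with the metrics above, sensitivity (TPR) and specificity (TNR) follow directly from confusion-matrix counts. The sketch below uses hypothetical counts chosen only for illustration; they are not the study's data, and the function names are assumptions introduced here.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positive rate: share of real state changes that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True negative rate: share of stable periods correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (NOT taken from the study).
tp, fn = 90, 10    # state changes: detected vs missed
tn, fp = 980, 20   # stable periods: correctly ignored vs false alerts

print(f"TPR = {sensitivity(tp, fn):.1%}")
print(f"TNR = {specificity(tn, fp):.1%}")
```

A high TNR matters for an alerting system like the one described, since false alerts would otherwise trigger unnecessary clinical contact.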

  • Artificial Intelligence Trustworthiness in the Preoperative Period for Patients with Serious Illness: A Systematic Narrative Review.

    Date Submitted: Dec 20, 2025
    Open Peer Review Period: Dec 18, 2025 - Feb 12, 2026

    Background: Despite the promising potential of artificial intelligence (AI) in the perioperative context, the rapid pace of development and diverse implementation warrants a systematic review to consolidate existing knowledge, identify gaps, and assess the utilization of trustworthiness principles of AI integration into the perioperative period for patients with serious illness. Objective: The purpose of this study was to address deficiencies in perioperative AI literature by elucidating the extent to which equity, ethics, and safety discussions are incorporated, thereby establishing a foundation for developing robust ethical guidelines for the safe and effective integration of AI in healthcare. This study also examined the utilization of AI-enabled team augmentation in perioperative serious illness care. Methods: We searched PubMed, Embase, CENTRAL, and Scopus for studies published between 2010 and July 2024. We included studies that reported patient functional outcomes, occurred in the perioperative period (30 days before and up to 90 days after surgery), included AI integration, and included patients with serious illness (defined as: malignancy, advanced organ failure, frailty, dementia/neurodegenerative disease, or stroke). To ensure reliability and minimize bias, two independent reviewers screened all studies through the title/abstract and full-text stage; conflicts were resolved through team consensus. The abstraction form was developed iteratively and was tested through pilot abstractions. Any discrepancies identified during data extraction were resolved through discussion and consensus among the reviewers. The ROBINS-I risk of bias tool in non-randomized studies was used to assess quality. Abstraction and risk assessment occurred through a blinded, independent dual review. A narrative review was compiled with the identified studies. 
Results: Of the 10,980 articles identified through the database searches, this review yielded 81 articles that met inclusion criteria. A majority of the studies were published in China (35), with the United States (9) and South Korea (7) having the next most publications, and 80 out of 81 (98.8%) articles focused on patients with malignancy. Analysis of AI implementation strategies revealed foundational efforts toward equitable access, with six studies providing open-access tools and several more designing models with simple inputs suitable for low-resource settings (17). Seven studies mentioned their commitment to transparency (e.g., publishing code) to enhance safety and trust. However, significant ethical deficiencies persist, particularly around input data, as only two studies explicitly addressed racial or ethnic disparities, and concerns about lack of sample diversity (16) and the omission of socially relevant features (5) were frequently noted as limitations. Although no current studies considered AI-enabled team augmentation, a majority of articles described how AI could be used to prompt a team member to make a tangible action. Conclusions: Machine learning for predictive analytics and other types of AI tools in surgical outcomes offers significant potential but requires adherence to trustworthiness and safety principles to be clinically viable. By leveraging longitudinal data and continuous performance tracking, these models have the potential to positively impact diverse patient needs and healthcare systems. Future research should prioritize adhering to guidelines for equity, ethics, and safety, conduct prospective studies, incorporate more external validation of AI models, and facilitate transparent monitoring and reporting of model performance to build clinician and patient trust and to encourage broader healthcare system adoption. Clinical Trial: PROSPERO CRD42024608387

  • Background: Health services increasingly face decisions about how to integrate immersive technologies into routine practice. International guidance highlights the need for structured governance in digital health, yet extended reality (XR) initiatives are often launched through isolated pilots without a clear assessment of organisational readiness or implementation risk. Although factors influencing XR adoption are well documented, healthcare organisations and system-level decision makers still lack practical, governance-oriented tools to translate these determinants into structured strategic decisions made before implementation. Objective: To develop MCDA-XR, a strategic governance framework that translates behavioural, organisational, and technical implementation determinants into a structured decision-support process for healthcare organisations. Methods: The study followed a sequential mixed-methods design covering the first two phases of a three-stage framework development and validation project. Phase 1 (Identification) defined strategic criteria by integrating theoretical perspectives on organisational complexity, behaviour change, technology acceptance, and immersive safety, together with a targeted review of XR implementation evidence. Phase 2 (Construction) refined the framework through participatory sessions. A multidisciplinary group of 33 stakeholders, including professionals and managers from hospital and primary care settings and postgraduate students, evaluated the proposed criteria for strategic relevance and operational clarity. This process resulted in a final ten-criterion structure and the establishment of a dual-score assessment logic. Phase 3 (Validation), planned as a subsequent step, will examine the predictive value of the framework in longitudinal clinical settings. Results: The development process yielded a framework comprising ten operational criteria grouped into three conceptual domains (Human, Organisational, and Technical). 
Stakeholder ratings indicated high strategic relevance across all criteria (mean scores above 4.0 on a 5-point scale), with Safety and Comfort receiving the highest prioritisation (mean 4.6). The final instrument applies a dual-assessment approach in which each criterion is rated separately for Strategic Importance and Organisational Readiness. Mapping these dimensions enables organisations to identify priority gaps, particularly areas of high importance and low readiness, and to distinguish between manageable constraints and critical barriers requiring targeted preparatory action prior to implementation. Conclusions: MCDA-XR addresses a key governance gap in XR implementation by providing a structured way to align adoption decisions with institutional priorities and operational constraints. Rather than relying on descriptive feasibility assessments, the framework supports explicit prioritisation and action-oriented decision making at the organisational level. MCDA-XR is positioned for Phase 3 evaluation, which will examine whether its readiness profiles anticipate implementation challenges and early sustainability outcomes in real-world clinical deployments.
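
The dual-assessment logic described above (rating each criterion separately for Strategic Importance and Organisational Readiness, then looking for high-importance, low-readiness gaps) can be sketched in a few lines. This is a minimal illustration and not the authors' instrument: the thresholds, the readiness scores, and the criterion names other than "Safety and Comfort" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CriterionRating:
    name: str
    importance: float  # Strategic Importance, assumed 1-5 scale
    readiness: float   # Organisational Readiness, assumed 1-5 scale


def priority_gaps(ratings, importance_min=4.0, readiness_max=3.0):
    """Return criteria with high importance and low readiness, largest gap first.

    The cut-offs are illustrative assumptions, not values from the framework.
    """
    flagged = [
        r for r in ratings
        if r.importance >= importance_min and r.readiness <= readiness_max
    ]
    return sorted(flagged, key=lambda r: r.importance - r.readiness, reverse=True)


# Hypothetical ratings; only the 4.6 importance score echoes the abstract.
ratings = [
    CriterionRating("Safety and Comfort", importance=4.6, readiness=2.5),
    CriterionRating("Hypothetical criterion A", importance=4.2, readiness=4.0),
    CriterionRating("Hypothetical criterion B", importance=4.1, readiness=3.0),
    CriterionRating("Hypothetical criterion C", importance=3.5, readiness=2.0),
]

for r in priority_gaps(ratings):
    print(f"{r.name}: gap = {r.importance - r.readiness:.1f}")
```

Sorting by the importance-readiness difference is one possible way to rank gaps; an organisation could equally weight the two dimensions differently.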

  • Background: Online virtual worlds are platforms that allow users, represented as avatars, to meet and interact with other users in real time within 3D virtual environments. These platforms have potential utility as vehicles to deliver/receive clinical services, especially as a preference to video-conferencing-based telehealth. However, commercial virtual worlds (e.g., “Second Life”) are often deemed unsuitable due to privacy and safety concerns. Objective: The aim of this study was therefore to co-develop and test a bespoke virtual world platform to deliver routine youth mental health services. Methods: We undertook a participatory-design process to develop the platform (Orygen Virtual Worlds) involving 10 young people with lived experience of mental health difficulties, researchers, software designers and mental health clinicians. We then tested two types of clinic-led interventions delivered through the virtual world (a structured therapy group and an individual therapy) in a public youth mental health service setting in Australia. Participants were patients receiving treatment in the service. The main outcomes were acceptability and feasibility; we also measured symptom change, usability, presence and therapeutic alliance. We conducted qualitative interviews post-intervention with the participants and analysed these interviews using thematic analysis. Results: 15 young people were recruited to the structured group (27% consented from referred) and 8 were recruited to the individual therapy (36% consented from referred). Drop out was higher in the individual therapy than the structured group therapy (38% versus 80%). Acceptability ratings were high for both therapy approaches and there were no significant safety events attributed to using the platform. There were no significant pre-post differences in the symptom outcome measures in either the structured group intervention or individual therapy. 
The platform was perceived as being comfortable and safe, enjoyable, fun and interactive, and was not confusing to navigate or difficult to use. The qualitative themes included the platform being fun and engaging, making treatment more accessible, providing a safe and inclusive place, fostering connections, positively impacting wellbeing and providing a catalyst for real life functional change. Young people perceived decreased barriers, increased comfort with help-seeking and reduced social stress facilitated by the avatar, communication options (emoji, text, voice) and accessibility from home. Conclusions: Our findings indicate that online virtual world platforms, such as the one we have designed, hold considerable promise for providing interventions for young people in clinical services. Virtual worlds can provide fun and engaging experiences of therapeutic interventions for young people with mental health difficulties which are safe and inclusive, especially for harder to reach groups.

  • Background: Public awareness campaigns and testing promotion must be strengthened to eliminate infections with hepatitis B and C viruses (HBV and HCV, respectively) by 2030. Although public health campaigns using various types of advertising are widely conducted, the appropriate channels for viral hepatitis testing remain unclear. Objective: To identify web services and digital advertising channels appropriate for promoting HBV and HCV testing, segmented by prior testing history and testing intention. Methods: A nationwide cross-sectional online survey of Japanese adults aged 20–69 years was conducted. The respondents answered questions on viral hepatitis testing status, routinely used web services with 180 options, and exposure to digital advertising with 25 choices. Correspondence analysis was used to visualize the associations among the testing segments, web services, and digital advertising. The distinctiveness was quantitatively evaluated. Results: Of the 2000 respondents (1011 men, 989 women), 18.0% (359/2000) reported prior HBV and HCV testing, and 22.1% (441/2000) were unsure whether they had ever been tested. Web services characteristically associated with those who had never been tested but were willing to be tested included Lawson (convenience store) and cosme (cosmetic shopping). The corresponding digital advertising channels included in-store and storefront screens at Welcia (pharmacy chain) and Lawson (convenience store). Segment-specific patterns varied according to age group and sex. Conclusions: In Japan, the convenience store chain Lawson was a distinctively frequent touchpoint, both online and offline, among individuals who wished to undergo viral hepatitis testing. Future studies are needed to determine whether implementing awareness-raising activities through Lawson can lead to an increased uptake of testing and subsequent treatment.

  • Background: Diabetes self-management and education services can improve health outcomes, but engagement is often low. ‘Healthy Living’ is an online self-management programme for people with type 2 diabetes, based on the ‘HeLP-Diabetes’ intervention which demonstrated effectiveness in a randomised controlled trial. Healthy Living was commissioned by NHS England and rolled out nationally into routine care. The website comprises structured learning, unstructured articles (which users could access at any time), and tracking tools such as goal setting. Objective: To investigate overall usage and exposure to content of Healthy Living, including differences in usage/exposure by user characteristics. Methods: Anonymous usage data from all people (n=27,422) who activated an account between May 2020 and September 2023 were available, including (1) which website activities were accessed, (2) when activities were accessed and (3) how long users spent on each activity. User demographic and usage information was summarised. Logistic regression evaluated the association between user demographics and usage. Results: The median length of time spent on the website in total was 7·6 minutes (IQR 0·6-27·6 minutes); 12,066 (44·0%) users spent less than five minutes on the website and 3,022 (11·0%) spent one hour or more. Of those who activated an account, 69·8% accessed some website content, 40·7% completed the first section of structured education, and 4·7% completed 60% of the structured education. Usage of the unstructured aspects of the programme was low. Female gender, lower deprivation, White ethnicity, and a shorter time since diagnosis were associated with increased usage. Conclusions: This study is one of the first to provide detailed analysis of user engagement with a national digital self-management programme for type 2 diabetes. Usage of Healthy Living was generally low, in line with other digital self-management programmes. 
However, encouraging increased usage of the programme has the potential to lead to better health outcomes in people with type 2 diabetes.

  • Background: Over the past quarter-century, designers of digital behavior change tools have increasingly blended constructs from multiple theories, yet the extent to which such integration enhances intervention outcomes remains unclear. Objective: To clarify this relationship, this study systematically reviewed literature published between 1999 and 2025, focusing on IT-mediated interventions that explicitly combined at least two behavioral theories and reported intention or behavior outcomes. Methods: Following a registered protocol (PROSPERO CRD42022285741) and PRISMA guidelines, searches across seven databases identified 62 eligible studies. Results: Most investigations were quantitative (77%), featured sample sizes from 16 to 8840, and lasted under 6 months; only 9 applied randomized controlled designs. Twenty-nine theories appeared, with Self-Determination Theory (35%) and the Theory of Planned Behavior (29%) being the most prevalent, often paired with the Technology Acceptance Model or Task-Technology Fit. Integrated models consistently outperformed their single-theory counterparts. Health care and fitness interventions dominated (44%), followed by online learning (23%) and mobile commerce (11%), but long-term follow-ups and explicit mappings of theory to behavior change techniques were scarce, and overall risk-of-bias ratings were moderate. Conclusions: Findings indicate that integrated theoretical frameworks deliver measurably superior behavioral outcomes in digital environments, yet evidence remains short-term and health centric. Future research should extend evaluation horizons beyond 6 months, diversify application domains, apply more rigorous randomized designs, and articulate more transparently how theoretical constructs guide specific intervention techniques to advance replicable, theory-driven digital solutions.

  • Background: Digital health has the potential to mitigate health inequity for priority populations who are underserved or marginalised by the health system. However, there is a lack of practical guidance on how to include priority communities in the co-production of digital health technologies, particularly across the entire lifecycle of innovation including research, development, and evaluation. Objective: The aim of this scoping review was to systematically identify and assess published methods used during digital health innovation to promote equitable inclusion of priority communities at every stage of the CeHRes Roadmap for Digital Health Technologies. Methods: This review was based on the Arksey and O’Malley framework for scoping reviews. A 6-stage framework was used to execute the review. To increase the trustworthiness of the findings, an expert advisory group was consulted and their feedback incorporated into the final manuscript. The Participant, Concept and Context (PCC) framework was used to structure the inclusion criteria. Results: The review identified a total of 106 articles, 58 methods, 4 approaches, and 17 research adjustments utilised to co-produce digital health technologies with priority communities. Common methods across multiple stages included interviews, focus groups, surveys and workshops; however, the most accessible way to make equity a practical reality during health technology innovation is to appoint a priority population community advisor, or advisory group, from project inception to project closure. Visual and creative methods like photovoice, home tours and body-mapping were also employed, often by priority population researchers themselves. Research adjustments that promote patient safety and comfort, enhanced literacy, peer-support and recognize socio-cultural and demographic considerations have been employed to increase the inclusion of priority populations during digital health innovation. 
Conclusions: Embedding equity is possible using the practical methods and research adjustments identified to promote inclusive co-production. Professionals working across healthcare, health informatics, research, digital health, and technology development can utilise these findings to centre digital health equity during technology innovation. This research also recognises that co-production must draw on epistemological frameworks, or ways of thinking, which support Indigenous and other priority population knowledge systems. A solely Western lens risks reinforcing structural barriers and overlooking essential knowledge, as demonstrated by this review when the search strategy missed key scholarly works by priority population authors themselves.

  • Background: Limited public understanding of randomized controlled trials (RCTs) hinders recruitment, retention, and confidence in research. Early exposure to trial concepts may strengthen health literacy and research engagement. The Kid’s Trial was a global, decentralized, child-led study that co-created and conducted an RCT to help children understand trials and their importance and to improve critical thinking. Objective: This paper presents its design, outcomes, and methodological reflections. Methods: The Kid’s Trial employed a dedicated website with study materials guiding children through each step of designing and conducting an RCT. Each step was linked to an online survey. Materials were co-developed with two patient and public involvement groups of children and parents. Any child, aged 7 to 12 years, could take part in as many or as few steps as desired. Recruitment combined online and offline strategies, and engagement and self-reported learning were descriptively analyzed. The co-created REST (Randomized Evaluation of Sleeping with a Toy or Comfort Item) trial was a two-arm, pragmatic RCT comparing one week of sleeping with versus without a comfort item. The primary outcome was sleep-related impairment, and the secondary outcome was overall sleep quality. Analyses followed an intention-to-treat approach using mixed-effects models adjusted for baseline measures. Results: Overall, 224 children from 15 countries participated in at least one step. Participation varied: 37% (n = 82) completed one step and 21% (n = 48) completed six. The REST trial randomized 139 children, with 73% (n = 101) completing outcome surveys. Adjusted mean differences (intervention–control) were −0.53 for sleep-related impairment (95% CI −3.40 to 2.34; P=.71) and 0.28 for sleep quality (95% CI 0.01 to 0.55; P=.04), a small, uncertain difference not supported by sensitivity analyses. Post-study responses (n = 20) indicated improved understanding of RCT concepts. 
Conclusions: The Kid’s Trial demonstrates the feasibility of a decentralized, child-led RCT co-created through participatory citizen-science methods. Children can meaningfully contribute to trial design and conduct, and experiential participation may foster early trial literacy and critical thinking. Future studies should enhance engagement through community partnerships, shorter intervals between steps, and embedded learning assessments to improve inclusivity and retention.

  • Background: The health benefits of breastfeeding for both infants and parents are well-established, yet global breastfeeding rates remain below recommended levels. Parent-targeted Digital Health Interventions (DHIs), including mobile health (mHealth) and electronic health (eHealth) strategies, offer a scalable way to support breastfeeding, but their effectiveness remains uncertain. Objective: To explore the effectiveness of parent-targeted DHIs for improving breastfeeding outcomes. Methods: Seven databases (CENTRAL, CINAHL, Education Research Complete, Embase, MEDLINE, PsycINFO and Scopus) were searched on April 15, 2024, for randomised controlled trials (RCTs) involving parents of children aged under five years. Eligible interventions aimed to promote breastfeeding and were primarily delivered via digital platforms (e.g. mobile apps, text messaging and websites). Studies were excluded if the DHI exclusively targeted breastfeeding within clinical settings or focused on non-digital content. Outcomes of interest included exclusive breastfeeding, any breastfeeding, breastfeeding duration, breastfeeding self-efficacy, cost-effectiveness and adverse events. Risk of bias of the primary outcome was assessed using the Cochrane Risk of Bias 2 (RoB2) tool. Meta-analyses were conducted in accordance with Cochrane methods and results are reported following PRISMA guidelines. Results: Thirty-one (29 RCTs and 2 cluster-RCTs) studies, including 14776 participants from 17 diverse countries, were included. Nineteen of the interventions focused on mHealth strategies, nine were delivered online and five were telecommunication interventions. Risk of bias was indicated with ‘some concerns’ or ‘high risk’ for 26 (84%) studies. Pooled results indicated that DHIs can significantly improve the odds of exclusive breastfeeding (OR: 2.35, 95% CI: 1.71 to 3.23, I2=81%; 26 trials, 9884 participants); however, considerable heterogeneity was present. 
Pooled results also indicated DHIs may improve breastfeeding duration (SMD: 0.50, 95% CI: 0.30 to 0.69, I2=15%, 5 trials, 601 participants), and ‘any’ breastfeeding (OR: 1.16, 95% CI: 0.99 to 1.35, I2=7%, 14 trials, 7974 participants). Conclusions: Improvements to exclusive breastfeeding rates and breastfeeding duration are linked to major societal and health benefits for infants and mothers. Our results indicate that parent-targeted DHIs are effective for improving key breastfeeding behaviours, with evidence of their impact spanning diverse populations and contexts. Clinical Trial: PROSPERO (CRD42023492644)
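
The difference in wording between "significantly improve" (exclusive breastfeeding) and "may improve" ('any' breastfeeding) tracks whether the 95% CI for the odds ratio excludes 1. The sketch below back-calculates an approximate z statistic and two-sided P value from a reported OR and CI. It assumes a Wald-type interval that is symmetric on the log scale, which may not match the review's exact method; the function name is hypothetical.

```python
import math


def wald_test_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate z and two-sided P from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided P value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Pooled estimates quoted in the abstract.
exclusive = wald_test_from_ci(2.35, 1.71, 3.23)
any_bf = wald_test_from_ci(1.16, 0.99, 1.35)

for label, (z, p) in [("exclusive", exclusive), ("any", any_bf)]:
    verdict = "excludes 1" if p < 0.05 else "includes 1"
    print(f"{label}: z = {z:.2f}, P = {p:.3g} (CI {verdict})")
```

Under this approximation the 'any' breastfeeding estimate does not reach the conventional 5% threshold, consistent with its CI spanning 1.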

  • Background: Patients with terminal illnesses often face profound challenges when making end-of-life care decisions. Digital health technologies, particularly patient decision aids (PDAs), have emerged as a promising approach to support informed, value-concordant decision-making in hospice care contexts. Objective: This systematic review aimed to evaluate the efficacy, functional characteristics, and implementation features of digital health-based PDAs designed to support hospice care decisions among patients with terminal illness. Methods: A systematic review was conducted following PRISMA guidelines, searching nine electronic databases from inception to June 2024 for randomized controlled trials (RCTs) and controlled clinical trials (CCTs). Additional sources included the Ottawa Hospital Research Institute's Decision Aid Library Inventory, professional books, reference lists, and author contacts. Two reviewers independently performed study selection, quality assessment using Cochrane RoB 2 and ROBINS-I tools, and data extraction. PDA quality was evaluated using International Patient Decision Aids Standards (IPDASi v4.0). Results were synthesized narratively without meta-analysis due to heterogeneity. Results: Nine RCT studies involving 1466 participants were included. Methodological quality varied, with four RCTs demonstrating low/some risk of bias, while five RCTs had high risk. Among them, five PDAs met IPDASi qualifying criteria (66.7%-100% compliance), though overall compliance rates of all included PDAs were modest (37.1%-57.14%). Four core digital PDA functions were identified: (1) multimedia knowledge delivery, (2) interactive values clarification, (3) facilitated decision communication, and (4) emotional normalization. PDAs significantly improved patients' decision-making knowledge (6 studies, n=965), and decision self-efficacy (1 study, n=265). 
PDAs also positively influenced end-of-life care preferences (n=6) and actual transitions to hospice care (n=2). Conclusions: Digital health-based PDAs are valuable tools for improving decision-related outcomes and facilitating hospice care transitions. However, significant gaps remain in addressing emotional dimensions and cultural sensitivity. Future research should develop rigorously designed, culturally adapted digital PDAs that comprehensively address the cognitive-emotional aspects of end-of-life decision-making. Clinical Trial: The review was registered with PROSPERO (CRD420251031554).

  • Background: Depression and anxiety are prevalent in working-age adults. Although treatment provided by health professionals can improve symptoms and functioning, many people experiencing mental ill-health do not seek help. There have been very few effective interventions to improve help seeking in adults, with none implemented across diverse workplaces through online delivery. Objective: The primary aim of this trial was to test the effectiveness of a co-designed program for increasing professional help seeking intentions in Australian employees, relative to an active control condition. Methods: A triple-blinded two-arm cluster randomized controlled trial (N=487, control workplaces=26, intervention workplaces=25) was conducted to assess the effectiveness of Helipad, a fully automated co-designed single-session interactive program (intervention condition), compared with a standard psychoeducation program (active control condition). Workplaces (clusters) were recruited via advertising or invited directly by researchers. Participants completed a pre-test, immediate post-test, and 6-month follow-up survey sent via email assessing help-seeking intentions (primary outcome), mental illness stigma, mental health literacy, help seeking attitudes and behavior, work and activity functioning, quality of life, and symptoms of depression, anxiety, and general psychological distress. Results: A significant difference in change over time on professional help seeking intentions was found between the two conditions, F(2, 185.44)=6.89, P=.001, with planned contrasts showing that the Helipad program was effective in increasing professional help seeking intentions compared with the control at the primary endpoint of immediate post-test (t(359.35)=-3.72, P<.001). This difference was not maintained at the 6-month follow-up (t(119.76)=-1.05, P=.295). Retention rates were 71.1% at post-test and 24.9% at follow-up. 
The Helipad program was also associated with improved mental health literacy and help seeking attitudes at post-test. Helipad was not significantly superior to the control in reducing mental illness stigma or improving help seeking behavior, functioning, quality of life, or symptoms of depression, anxiety or general psychological distress (secondary outcomes) at 6-month follow-up. Conclusions: This study demonstrated that the Helipad program was effective in improving the intentions of employees to seek help from a professional compared to an active control. The program also improved mental health literacy and help-seeking attitudes, but these changes were not sustained and did not translate into observable differences in help-seeking behaviors or mental health symptoms. Selective interventions may be needed to demonstrate behavioral outcomes, and programs may be more effective when paired with organizational interventions. Clinical Trial: Australian New Zealand Clinical Trials Registry (ANZCTR) ACTRN12623000270617p

  • Psychotherapists’ Trust, Distrust, and Generative AI Practices in Psychotherapy: Qualitative Study

    Date Submitted: Dec 4, 2025
    Open Peer Review Period: Dec 8, 2025 - Feb 2, 2026

    Background: Generative artificial intelligence (GenAI) is rapidly entering mental health care, supporting both client-facing tools (e.g., chatbots for support and self-management) and clinician-facing systems (e.g., documentation and assessment aids). Whether these tools ultimately help or harm psychotherapists and their clients depends not only on their technical performance but also on how psychotherapists trust and distrust them in practice: when they are willing to rely on GenAI, when they withhold reliance, and how they manage clients’ own GenAI use. Understanding how psychotherapists negotiate their trust and distrust is essential for future responsible and ethical integration of GenAI in mental health care, where GenAI’s promising benefits, such as reducing administrative burden or enhancing clients’ accessibility, must be balanced against risks that require professional judgement rather than blanket adoption or rejection. Yet little empirical work has examined how practicing psychotherapists actively calibrate trust and distrust in GenAI across tasks and contexts, or how these judgments shape the evolving psychotherapist–client–GenAI relationship. Objective: This study aims to examine (1) what are psychotherapists' experiences with, perceptions of, and trust/distrust in GenAI in therapeutic contexts? and (2) how do psychotherapists perceive the role of GenAI within the therapeutic relationship, and how do their perceptions shape their trust and distrust in GenAI? Methods: Between January 2025 and May 2025, we conducted an interview study with 18 psychotherapists in the United States. Psychotherapists were recruited. Results: Our findings show that psychotherapists' adoption of GenAI was highly individualized and underpinned by “conditional” trust, that is, confidence that depended on maintaining professional control, aligning GenAI use with specific tasks, and considering who was using the GenAI tools.
Trust was sustained when GenAI operated in clinician-supervised, supportive roles, but diminished when control shifted, tasks became high-stakes, or GenAI appeared to encroach on the therapeutic relationship (e.g., forming emotional bonds with clients or replacing core psychotherapist functions). Participants also voiced distrust towards the broader sociotechnical ecosystem, including developers, commercial incentives, and the absence of clear organizational guidelines. Conclusions: Psychotherapists’ perspectives offer critical insights into how GenAI is currently used in their professional practice and the conditions under which they are willing to trust or distrust GenAI tools. Their experiences highlight the importance of maintaining clinician control, ensuring contextual appropriateness, and preserving the human connection central to psychotherapy. Future work should further examine how therapeutic orientation, professional experience, and client characteristics shape trust and distrust in GenAI. As GenAI becomes more embedded in mental health care, research is also needed to explore how specific GenAI system features can be responsibly designed to support clinical workflows and enhance therapeutic relationships. Organizational and policy frameworks will be essential to ensure responsible, ethically aligned, and human-centered GenAI deployment in psychotherapy.

  • Simulating the Patient's Perspective: Promise and Pitfalls of LLMs in Patient-Centric Communication

    Date Submitted: Dec 8, 2025
    Open Peer Review Period: Dec 8, 2025 - Feb 2, 2026

    Background: Large Language Models (LLMs) have shown broad applicability in medicine, including the generation of clinical documents. Beyond content creation, LLMs can also be used to evaluate the quality of medical documents. Because of LLMs' ability to simulate (or impersonate) specific personas, they can offer diverse perspectives (such as those of healthcare professionals versus patients with lower health literacy) on the clarity of medical texts. Objective: The primary objective of this research was to evaluate the ability of LLMs to simulate diverse user personas, varying by demographic profile (including educational background, gender, and visit frequency), for the task of interpreting ICU discharge summaries. The study aimed to benchmark the clarity assessments generated by these LLM personas against a baseline established by human participants with corresponding backgrounds, in order to highlight the potential and limitations of using current LLMs to create personalized health information. Methods: We evaluated the ability of LLMs to simulate diverse user personas for the task of interpreting ICU discharge summaries. LLMs were prompted to adopt personas with varied demographic profiles, including different educational backgrounds. The resulting LLM-generated assessments of the summaries’ clarity were then benchmarked against a baseline established by human participants with corresponding backgrounds. Results: LLMs demonstrated a strong ability to simulate personas based on educational attainment, accurately interpreting key medical information in 88% of cases. However, the models’ performance varied widely when other demographic variables were introduced. For instance, persona performance was highly erratic based on gender, with simulated male personas achieving 97% accuracy while female personas achieved only 44%. The inclusion of additional details, such as the frequency of prior emergency room visits, further degraded the models' performance.
Conclusions: This research highlights both the potential and the significant limitations of using LLMs to create personalized health information. While LLMs are promising for simulating user perspectives based on education, the current models exhibit unpredictable performance when tasked with incorporating other fundamental demographic traits like gender.

  • Background: Digital health literacy (DHL), the ability to seek, understand, and apply digital health information, has become increasingly important in the United Kingdom (UK), with a focus on digital transformation within the health service. While digital tools offer the potential to improve access and equity, they may also exacerbate existing health inequities if segments of the population are unable to engage with them effectively. Understanding the determinants of DHL is essential to designing inclusive digital health services. Objective: To measure DHL among adults in the UK and identify its sociodemographic, economic, and social determinants. Methods: A cross-sectional online survey was disseminated to adults in the UK in December 2024. DHL was self-reported using the validated eHealth Literacy Scale (eHEALS), which ranges from eight to 40. eHEALS score was dichotomized into high and low DHL based on a cut-off of 26. A multivariable logistic regression model was built to identify sociodemographic, economic, and social determinants of DHL. Results: The median eHEALS score was 31; 21% of participants had a low level of DHL, while 79% had a high level of DHL. Those aged 45–64 and 65 years and older, compared to the 18–45 age group, had 1.61 and 1.98 times the odds of low DHL, respectively (45–64 years odds ratio [OR]: 1.61, 95% confidence interval [CI]: 1.13 to 2.31, P=.01; 65 years and older OR: 1.98, 95% CI: 1.36 to 2.91, P<.001). Females had 0.55 times the odds of low DHL (OR: 0.55, 95% CI: 0.42 to 0.74, P<.001), and those with an undergraduate or postgraduate degree or higher had lower odds of low DHL, compared to those educated to below degree level (undergraduate degree OR: 0.49, 95% CI: 0.33 to 0.71, P<.001; postgraduate degree or higher: OR: 0.48, 95% CI: 0.32 to 0.71, P<.001). Conclusions: Among the UK population, male sex, lower educational attainment, and older age were significant predictors of low DHL. 
Inclusive educational interventions and digital health solutions, tailored towards individuals with low DHL, are needed to ensure that digital transformation in healthcare helps to narrow health inequities.

  • Background: The use of large language models (LLMs) for medical literature retrieval is gaining traction due to its potential to enhance efficiency. However, concerns regarding the accuracy and reliability of citations generated by LLMs remain inadequately addressed, with significant variations observed across different models. Objective: This study aimed to evaluate and compare the performance of four popular LLMs (Grok, Gemini, ChatGPT, DeepSeek) against manual PubMed retrieval in the context of chronic obstructive pulmonary disease (COPD) inhalation therapy, focusing on citation accuracy and relevance, and to preliminarily analyze the mechanisms behind performance disparities. Methods: We prompted each LLM to retrieve 150 distinct English references on COPD inhalation therapy, extracting key information including titles, authors, journals, publication dates, PMIDs, and DOIs. A parallel manual search was conducted on PubMed. All retrieved references were verified against PubMed, Google Scholar, and Web of Science. Verification results were categorized into five types: (1) Accurate, (2) Incomplete, (3) Incorrect, (4) Fabricated, and (5) Irrelevant. A chi-squared test was employed to assess significant differences in performance. We then explored possible mechanisms behind the differences and put forward suggestions. Results: Performance varied significantly across models. Gemini demonstrated superior accuracy, with 124 (82.67%) accurate references, 1 (0.67%) incomplete, 7 (4.67%) incorrect, 1 (0.67%) fabricated, and 17 (11.33%) irrelevant, while DeepSeek showed a high fabrication rate. For all models, incomplete information was primarily limited to titles, whereas errors were concentrated in PMIDs. Conclusions: Gemini 2.5 Pro had a significant advantage in literature retrieval for COPD inhalation therapy.
Although artificial intelligence (AI) has shown potential to assist in medical literature retrieval, there are still significant performance gaps among models. The reasons for the differences are likely multifactorial, involving the model architecture, user interaction, and semantic bias. Therefore, manual verification of citations generated by AI remains crucial for medical research. We recommend prioritizing the verification of PMIDs and titles, referring to Medical Subject Headings (MeSH), and choosing models with an agent system and advanced context management.
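
As a quick consistency check on the Gemini figures reported above, the sketch below recomputes each category's share from the raw counts. It assumes the denominator is the 150 references requested per model, as stated in the Methods; the variable names are illustrative and not taken from the study.

```python
# Counts for Gemini as reported in the abstract's Results.
gemini_counts = {
    "accurate": 124,
    "incomplete": 1,
    "incorrect": 7,
    "fabricated": 1,
    "irrelevant": 17,
}

# Assumption: every retrieved reference falls into exactly one category,
# so the counts should add up to the 150 references requested.
total = sum(gemini_counts.values())

# Share of each category, in percent, rounded to two decimals.
shares = {
    category: round(100 * count / total, 2)
    for category, count in gemini_counts.items()
}

for category, share in shares.items():
    print(f"{category}: {gemini_counts[category]} ({share}%)")
```

The counts sum to 150 and the recomputed shares match the percentages quoted in the abstract.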

  • Background: The global prevalence of overweight and obesity among children and adolescents has tripled since 1990. Currently, approximately 390 million individuals are affected, including 160 million with obesity, and an estimated 80% are projected to remain obese into adulthood. Concurrently, over 81% of 11- to 17-year-olds worldwide fail to meet recommended physical activity guidelines. Virtual reality (VR) exergaming (categorized into non-immersive, semi-immersive, and fully immersive modalities) has emerged as a promising intervention to enhance energy expenditure, improve motivation, and reduce body mass index (BMI). However, evidence regarding its application among children and adolescents remains fragmented. Following the methodological framework established by Arksey and O’Malley, this scoping review aims to systematically map the characteristics of existing interventions in order to inform future research and clinical practice. Objective: This scoping review aimed to explore the application of VR exergaming in overweight or obese school-aged children and adolescents by identifying the intervention content, outcome indicators, evaluation tools, and application effects of VR exergaming and to provide a reference for future research and clinical practice in this field. Methods: Following the Arksey and O'Malley framework, a systematic search was conducted in PubMed, Embase, Web of Science, CINAHL, the Cochrane Library, CNKI, Wanfang, and VIP databases from their inception to April 12, 2025. Two reviewers independently screened studies and extracted data using a standardized template. Results: This review included 24 studies from nine countries. Regarding the technical type of VR exergaming, three studies (12.5%) employed immersive VR (IVR), while the remaining 21 (87.5%) utilized non-immersive VR (NIVR). The intervention settings exhibited diverse characteristics, with half of the studies conducted in schools, homes, or communities.
Furthermore, most intervention cycles lasted between 6 and 20 weeks, emphasizing high-frequency training to achieve significant health promotion outcomes. The outcome measures broadly encompassed three dimensions: physiological, psychological, and behavioral. Conclusions: VR exergaming improves engagement and adherence in overweight/obese youth through enjoyable, interactive features. However, research shows regional disparities and non-standardized outcomes. Future efforts should foster multisector collaboration to enhance its role in obesity prevention.

  • Background: E-cohorts are susceptible to low participation rates, undermining representativeness. Frequent reminders can be a cost-effective strategy to increase response rate. However, their effectiveness may vary depending on the delivery mode, the content, and the formatting of messages. Objective: To compare different reminder strategies (i.e., emails and/or text messages with standard and/or institutional formatting) and evaluate their effectiveness in terms of response rate, in the context of a population-based e-cohort of healthy adults. Methods: We conducted a 4-arm randomized controlled trial nested in Le French Gut e-cohort (registration number 2021-A01439-32). In November 2024, we included participants who were enrolled online (SKEZIA platform) but not yet active participants (i.e., eligibility, consent form, and/or personal information questionnaire not completed). Randomization was stratified on time since enrolment. We sent three reminders, 72 h apart, following four experimental designs: (Group 1) standard emails only; (Group 2) text messages only; (Group 3) institutional emails only; and (Group 4) standard email, followed by text message and institutional email. Our primary outcome was the completion rate of the personal information questionnaire. We also measured the completion time (i.e., time between reminder received and questionnaire fully completed), online login rate, login time, email opening, and click-through rates. Results: At the end of the trial, out of 20,487 eligible participants, 19,525 received at least one reminder. The per-protocol completion rate was 8.4%, with a higher rate in Group 4 (9.4% vs 7.4%, 8.4% and 8.3% for Groups 1, 2, and 3, respectively; P<.001). Completion time was faster in Group 2 (mean ± SD, 6.3 ± 4.3 days) compared to Groups 3 and 4 (7.0 ± 3.5 and 7.2 ± 3.8 days; P=.003).
Online login rate was higher and login time faster in Group 4 (rate: 15.6% vs 12.1%, 14.3% and 15.3% for Groups 1, 2, and 3, respectively; P<.001; and time: 7.1 ± 4.3 days vs 6.7 ± 4.3, 6.2 ± 4.6, 6.8 ± 3.9 for Groups 1, 2, and 3, respectively; P=.002). For emails, opening rates were similar (P=.87) but click-through rate was higher for institutional emails (23.1% vs 19.6% for standard formatting; P<.001). Conclusions: A mixed-delivery mode strategy, combining emails and text messages, effectively increases response rate by 27% compared to other strategies. Institutional emails with plain design and signed by study coordinators seemed more appealing to participants than a more elaborate design. Clinical Trial: registration number 2021-A01439-32
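
To make the 27% figure easier to interpret, the following sketch computes the relative increase in per-protocol completion rate for Group 4 over each of the other groups, using only the rates reported in the Results. The abstract does not state which comparison yields 27%; treating it as Group 4 versus Group 1 is an assumption of this sketch, and `relative_increase` is an illustrative helper rather than anything used by the authors.

```python
def relative_increase(rate: float, baseline: float) -> float:
    """Relative increase of `rate` over `baseline`, in percent."""
    return (rate / baseline - 1.0) * 100.0


# Per-protocol completion rates (%) as reported in the abstract.
completion = {"Group 1": 7.4, "Group 2": 8.4, "Group 3": 8.3, "Group 4": 9.4}

# Relative gain of the mixed-mode arm (Group 4) over each single-mode arm.
increases = {
    group: round(relative_increase(completion["Group 4"], rate), 1)
    for group, rate in completion.items()
    if group != "Group 4"
}

for group, gain in increases.items():
    print(f"Group 4 vs {group}: +{gain}%")
```

Under this reading, the gain is about 27% relative to standard emails only (Group 1) and roughly 12% to 13% relative to text messages only or institutional emails only.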

  • Channel Allocation and Equity in Preventive Campaigns for Older Adults: Agent-Based Simulation Study

    Date Submitted: Nov 25, 2025
    Open Peer Review Period: Nov 26, 2025 - Jan 21, 2026

    Background: Preventive campaigns for older adults must decide how to allocate limited resources across media channels. However, these channel allocation and budget decisions rarely use explicit criteria for distributional equity or digital health strategic planning. As a result, health systems may optimize average uptake while leaving large gaps across socioeconomic groups and media-use profiles. Objective: This study aimed to develop and apply a data-driven agent-based model as a strategic planning tool for older-adult preventive campaigns, comparing channel allocation, personalization, and loss framing options under explicit budget and equity constraints. Methods: We built an agent-based simulation calibrated to national survey data on influenza vaccination and routine health screening among older adults in South Korea. Fifteen prespecified campaign scenarios varied channel allocation across television (TV), digital, and print; total exposure budgets; two equity-focused personalization strategies; and graded loss framing. Primary outcomes were final adoption and time to adoption. Equity outcomes included the minimum class-level adoption and the 90–10 gap across latent classes. Each scenario was simulated over 12 monthly steps with 100 Monte Carlo replications. We also compared scenario portfolios using logistic and clipped-linear link functions and varied the balance of media versus social reinforcement weights, the social reinforcement threshold, and network realizations in sensitivity analyses. Results: TV-only and high-budget strategies produced some of the highest mean adoption rates for both vaccination and screening but often failed to meet equity guardrails for minimum class coverage and between-class gaps. In contrast, personalization strategies that modestly reweighted exposure toward the lowest-uptake class or assigned class-tailored channel portfolios maintained or improved mean adoption. 
These strategies also substantially raised minimum class-level coverage and narrowed disparities. When efficiency and distributional equity were considered jointly, these personalized portfolios emerged as the most attractive options under fixed budget constraints. Loss framing acted as a secondary tuning lever: within the tested range, stronger loss framing yielded small, monotonic gains in adoption and shorter time to adoption without worsening equity metrics. Scenario rankings were stable across sensitivity analyses, suggesting that the main patterns reflected underlying diffusion dynamics rather than any single modeling choice. Conclusions: This agent-based simulation shows how ex ante planning for preventive campaigns can move beyond intuition by comparing channel allocation and personalization options under explicit equity and budget criteria. For campaigns targeting older adults, modest equity-oriented personalization of TV and digital exposure improved or preserved mean uptake. It also consistently improved distributional equity, whereas diversified channel mixes without personalization were less efficient and less equitable. These findings support integrating equity guardrails and channel-allocation guardrails into early-stage campaign design and prioritizing targeted personalization over simple channel diversification. Future work should validate these patterns in other populations and health systems and link simulated diffusion trajectories with observed exposure and engagement in real-world digital and traditional-media campaigns.

  • “PrEP Saves Lives!”: A Content Analysis of PrEP-Related Messages Across Facebook, Instagram and Twitter

    Date Submitted: Nov 11, 2025
    Open Peer Review Period: Nov 25, 2025 - Jan 20, 2026

    Interventions are sorely needed to address the lack of PrEP awareness and mitigate barriers related to PrEP use. One such intervention modality is social media, as PrEP awareness and communicating issues, such as access and cost, are easily addressable via clear social media messages on platforms PrEP-eligible people, and especially young people, use frequently. This study seeks to extend understanding of PrEP awareness and usage by examining PrEP-related communication across 3 popular social media platforms (Facebook, Instagram, and Twitter), and identifying message and source characteristics. In February 2023, we used CrowdTangle (a public-insights tool owned by Facebook, now known as Meta) to gather a total of 39,790 Facebook posts and 5,628 Instagram posts. We also used Twitter’s public API to collect 14,061 Twitter posts during the same time frame. Of these, we drew a random sample of social media posts from each platform [Facebook (N = 1,000), Instagram (N = 1,000), and Twitter (N = 811)] in February 2023 and analyzed them using a quantitative content analysis. Our findings showed some differences in the type of text-based content most likely to appear on each platform. We also uncovered similar patterns across all 3 platforms. Across all platforms, we observed that definitions of and indications for PrEP were the most common type of text-based content in posts likely to be shared, information about PrEP appearing in social media posts did not seem to draw from traditional sources, and men who have sex with men (MSM) represented the most frequently mentioned target population. Although our study did not detect a large presence of theory-based concepts from behavior change theory such as the reasoned action approach (RAA), across all platforms, attitude emerged most frequently, followed by self-efficacy. These findings shed light on the PrEP-related beliefs shaping young people’s perceptions and engagement. 
Such insights can guide the design of future social media–based messages, targeting the most influential beliefs to strengthen HIV prevention efforts. They also provide a foundation for advanced machine learning models capable of predicting and explaining the diffusion potential of PrEP-related content.

  • Background: Sleep disturbance is a common symptom of and potential risk factor for neurodegeneration. Remote sleep and cognitive assessments offer promise for monitoring symptoms and treatment response from patients’ homes, but the acceptability of remote sleep and circadian technology in older adults with and without cognitive impairment is not known. Objective: This qualitative study was designed to explore and describe the barriers, facilitators, and user experience of older adults with mild cognitive impairment and dementia and cognitively unimpaired older adults who participated in a longitudinal sleep and memory study designed around remote monitoring technologies. Methods: Patients with mild cognitive impairment or dementia due to probable Alzheimer’s disease or Lewy body disease and age-matched controls participated in a longitudinal remote study involving multimodal assessments of sleep and cognition including actigraphy, wireless electroencephalography, a smartphone app, web-based cognitive tasks, and serial saliva samples. Participants were asked for feedback via questionnaires during the study and invited to complete end-of-study interviews about their experiences. Questions were informed and thematic analysis was guided by the Capability, Opportunity, Motivation – Behaviour model of behaviour change and the extended Unified Theory of Acceptance and Use of Technology and focused on perceived barriers and facilitators. Results: The study identified six key themes. The first theme, ‘motivations to participate’, highlighted how participants felt the research could be helpful to themselves and others. The second theme, ‘navigating the user experience of devices’, identified comfort, security, privacy, ease of use, and reliability as fundamental in determining acceptability. 
‘Adjusting over time to study participation’, the third theme, covered changing perceptions with increased exposure and familiarity, and the importance of convenience, flexibility, and developing a routine. The fourth theme explored ‘social support as a facilitator and barrier to research participation’, looking at the influence of both the research team and relatives supporting at home. A fifth theme of ‘adherence, accuracy, and getting it right’ was also identified, as participants were motivated to provide good quality data for the study. Finally, we identified a sixth theme surrounding participants’ ‘reflections, realities, and uncertainties around sleep’, which focused on sleep hygiene and common sleeping problems in older adults, such as snoring and nocturnal awakenings. Conclusions: Older adults with and without cognitive impairment were motivated to engage in longitudinal remote sleep research, follow remote research protocols, and produce good quality data. Acceptability was related to burden and convenience, usability, and emotional responses to study tasks. When study tasks are repeated over time, care should be taken to introduce variety where possible to avoid fatigue and frustration. Study partners offer essential support for some participants, but requiring a study partner may also be an unnecessary barrier to research participation for others. Future studies should aim to identify effective strategies for recruiting diverse populations, particularly those with limited technology experience or from underserved communities, to ensure equitable participation and representation in research. Providing education on the importance of sleep for brain health and technology use may be beneficial.

  • Background: Web-based advertisements, specifically social media advertisements, are a popular recruitment avenue among research projects involving human participants. Social media recruitment has advantages over other methods (e.g., in-person recruitment), such as aiding teams in reaching the population of interest and increasing enrollment pace at a relatively low cost. Nonetheless, social media recruitment comes with the challenge of fraudulent responses, and therefore effective identity verification procedures must be put in place in order to maintain the integrity of the final sample and data. Objective: In this paper, we outline the identity verification methods (herein referred to as “checks”) used in the recruitment process for a pilot study featuring a mobile health (mHealth) intervention app for emerging adults (EAs; aged 18-25) who regularly use cannabis. Each identity verification check is examined for its rate of passing. Methods: Participants were recruited via social media advertisements that linked directly to a study eligibility screening survey. Advertisements were posted on Meta (Facebook and Instagram), Snapchat, and TikTok. Participants were enrolled if they met study inclusion criteria (e.g., aged 18-25, reported regular cannabis use), completed the baseline consent and survey, downloaded the app, and passed all identity verification checks. Identity verification checks happened at two checkpoints: directly following screening survey completion (e.g., geolocation check, duplicative IP address check, social media check) and directly following app download and login (duplicative device ID and/or push token check). Failing an identity verification check resulted in exclusion from the study. Results: Identity checks were non-exclusive such that a single eligible screening response could undergo multiple checks. 
Of the 573 eligible screening responses that went through the identity verification process, a total of 3,031 identity verification checks were completed. Of these 3,031 aggregate checks, 396 failed the verification criteria (13.1%), and therefore 396 of the 573 eligible respondents were excluded from continuation in the enrollment process (69.1%). Social media checks, wherein study staff ensured the individual’s public-facing account had personally relevant information, had the highest failure rate (61.5%). The second most common failed check was due to a duplicate device ID upon logging into the app (10.0%), followed by the geolocation check (4.9%), the duplicate IP address check (4.2%), the combination check (time zone; 4.1%), and duplicate push token check (3.2%). Conclusions: This paper describes a participant identity verification process for app-based mHealth studies using social media as a recruitment source. A combination of identity verification safeguards is suggested to maintain integrity of the study sample and data. Clinical Trial: ClinicalTrials.gov NCT05824754; University of Michigan IRB: HUM00222194
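
The two headline percentages above use different denominators (checks versus respondents). The sketch below reproduces both from the reported counts. It assumes, as the abstract's wording implies, that each of the 396 failed checks corresponds to one excluded respondent; the helper `pct` and the variable names are illustrative only.

```python
eligible_responses = 573   # eligible screening responses that were verified
total_checks = 3031        # identity verification checks completed
failed_checks = 396        # checks that failed the verification criteria


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal."""
    return round(100 * part / whole, 1)


# Failure rate at the level of individual checks.
check_failure_rate = pct(failed_checks, total_checks)

# Exclusion rate at the level of respondents, assuming one failed
# check per excluded respondent (an assumption of this sketch).
respondent_exclusion_rate = pct(failed_checks, eligible_responses)

# Average number of checks each eligible response went through.
checks_per_response = round(total_checks / eligible_responses, 2)

print(check_failure_rate, respondent_exclusion_rate, checks_per_response)
```

Because each response went through about five checks on average, a low per-check failure rate is compatible with a majority of respondents being excluded.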

  • Feasibility and Usability of a Digital Perinatal Navigator for High-Risk Pregnancies: A Mixed-Methods Study

    Date Submitted: Dec 7, 2025
    Open Peer Review Period: Nov 18, 2025 - Jan 13, 2026

    Background: The journey to parenthood involves significant physical, emotional, and psychosocial changes. Mental health challenges impact both maternal and fetal health, potentially leading to obstetric complications and developmental risks for children. Access to needed perinatal support is often limited due to individual and structural barriers. Digital health solutions can offer opportunities to provide low-threshold, personalized, and scalable support. We developed a digital navigator offering personalized guidance and connecting users to relevant support services with interactive follow-ups to self-assess their well-being. However, evidence regarding the feasibility of digital solutions in high-risk patients is limited. Objective: The aim of the study was to assess the feasibility, usability, and preliminary effectiveness of a digital perinatal navigator app designed to provide personalized support and connect pregnant individuals to relevant health and social services. Methods: The study was conducted at the University Women’s Hospital Heidelberg to assess an app-based health service program. Using convenience sampling, eligible participants tested the perinatal guide for two weeks. A convergent mixed-methods design combined qualitative interviews (n=30) and psychometric questionnaires (n=35) to evaluate feasibility, usability, and preliminary effectiveness. Statistical analysis included descriptive evaluations, paired t-tests, and Pearson correlations. Results: Participants (median age 33, median gestational age of 30 weeks) reported moderately high rates of stress, anxiety, and depressive symptoms. Usability ratings were excellent (median SUS 80; MAUQ 105). Knowledge of HSPs increased significantly (mean +1.2 points, p<.01), with modest improvements in utilization. Qualitative analysis revealed key success factors such as intuitive structure, trustworthy medical content, and personalized information.
Technical disruptions, navigation challenges, limited personalization, and incomplete regional integration of healthcare services were reported as barriers. Conclusions: The results indicate high feasibility and acceptance for our digital navigator in this high-risk population. The identified barriers are to be considered in the further development of the app and other perinatal digital care programs.

  • Background: The integration of Virtual Reality (VR) tools in mental healthcare, such as VR relaxation, shows promise for supporting stress reduction and mental well-being. However, implementation across healthcare settings remains complex and context-dependent, influenced by organizational capacity, stakeholder readiness, and external factors such as policy and funding. This study explores how barriers, facilitators, and evolving implementation strategies shape the use of VRelax, a VR relaxation tool, in primary, secondary, and tertiary mental healthcare settings in the Netherlands. Objective: To identify shared and context-specific barriers and facilitators, and to develop and refine tailored implementation strategies for the integration of VR relaxation in primary, secondary, and tertiary mental healthcare, with a focus on learning from the implementation approach used within the research process itself. Methods: A qualitative, comparative study was conducted using a participatory approach with 33 healthcare professionals and eight patients across primary, secondary, and tertiary mental healthcare settings in the Netherlands, involving 18 interviews and eight focus groups. Thematic analysis, guided by the Consolidated Framework for Implementation Research, was used to assess implementation barriers and facilitators. The Expert Recommendations for Implementing Change tool was applied to match found barriers with evidence-based implementation strategies. Results: Across all settings, key lessons emerged about what supports and hinders the implementation of VR relaxation in mental healthcare. While challenges such as equipment costs, limited staff capacity, technical issues, and lack of structural funding persisted, they also revealed opportunities for improvement. In primary care, collaboration with community organizations enabled low-threshold, accessible use. In secondary care, staff feedback refined strategies and strengthened team learning. 
In tertiary care, co-development with professionals and patients advanced person-centered care, though time constraints and fragmented organizational structures limited full adoption. Across settings, the gap between professional assumptions about patient suitability and patients’ actual enthusiasm underscores the need for shared decision-making, patient involvement, and flexible, hybrid approaches to care. Conclusions: Successful integration of VR relaxation in mental healthcare requires balancing flexibility with structured, setting-specific strategies while addressing system-wide barriers. Collaboration with community facilities, iterative refinement through staff feedback, and co-development with patients show how VR can strengthen person-centered, hybrid, and sustainable mental healthcare. These findings align with efforts to ensure accessible, appropriate, and future-ready care across all mental healthcare settings. They also underscore that effective implementation requires both localized adaptation and system-level solutions, including shared infrastructure, post-discharge continuity, and long-term funding models.

  • Background: eHealth interventions have demonstrated potential to address challenges related to health and the health care system in low- and middle-income countries. To effectively leverage eHealth in supporting health care in Ethiopia, the assessment and development of eHealth literacy of patients are essential. Objective: This study aimed to translate and culturally adapt the eHealth Literacy Questionnaire (eHLQ) to Amharic and assess its psychometric properties. Methods: We applied a systematic process of translation and cultural adaptation, including forward and backward translation, expert review, and cognitive interviews. We then conducted a cross-sectional questionnaire-based study using a convenience sample (N=300) of patients with internet access at the primary health care level between January and March 2025 in the capital and a larger city of Ethiopia. Internal consistency was assessed using Cronbach α and McDonald ω. Factor structure was assessed using Confirmatory Factor Analysis. Convergent and discriminant validity were examined by calculating Spearman correlations between each eHLQ scale and the total score of the eHealth Literacy Scale (eHEALS). Results: A total of 300 participants were included in the analysis. The mean age was 30.4 years (SD 6.8; range 18–55), and 69.7% (209/300) were women. Internal consistency was acceptable for all scales (Cronbach α=0.72–0.91; McDonald ω=0.79–0.96), except for Scale 4 (α=0.62; ω=0.70). The 7-factor model showed satisfactory fit, with a Comparative Fit Index of 0.97, Tucker-Lewis Index of 0.97, and Standardized Root Mean Square Residual of 0.07. Factor loadings exceeded 0.40 for all items except one. Strong correlations between Scales 1–3 and eHEALS (range r=0.69–0.74) supported convergent validity, while moderate correlations between Scales 5–7 and eHEALS (range r=0.66–0.67) indicated limited discriminant validity. 
Conclusions: The Amharic eHLQ demonstrated generally satisfactory psychometric properties and can be considered a valid tool for assessing eHealth literacy among patients with internet access in Ethiopia, marking the first validation of the eHLQ in Sub-Saharan Africa. Future studies could provide additional evidence to substantiate the psychometric robustness of Scale 4 (“Feeling Safe and in Control”). Overall, the Amharic eHLQ can support the development of tailored eHealth interventions in Ethiopia.
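
For readers unfamiliar with the internal consistency measure reported above, the following is a minimal sketch of Cronbach α for a single scale, computed as k/(k−1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response matrix are hypothetical; they are not eHLQ data and not the authors' code.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach alpha for one scale.

    responses: list of respondents, each a list of k item scores.
    """
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    k = len(responses[0])
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*responses))
    item_variance_sum = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents to a three-item scale.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 2],
    [5, 4, 5],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items of a scale vary together, which is why the lower value for Scale 4 is flagged in the Results.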

  • Background: Digital health technologies (DHTs) are increasingly integrated into clinical practice, yet economic evaluations remain scarce, particularly in early development stages. Within the NICE Evidence Standards Framework, Tier C DHTs comprise technologies with direct clinical implications and measurable health outcomes, for which robust economic evidence is essential. Early-stage assessments are particularly important to inform subsequent development, refinement, and adoption decisions across the digital health lifecycle. Objective: This study aims to explore the feasibility of integrating a full trial-based economic evaluation within an early-stage pilot comparing a chatbot-supported remote patient monitoring (RPM) solution for anticoagulation management with standard of care (SOC). Methods: A cost-effectiveness analysis was performed alongside a pilot crossover trial among adult cardiac surgery patients receiving vitamin K antagonists. Participants were allocated to two 6-month sequences (SOC→RPM or RPM→SOC). The intervention consisted of a rule-based chatbot integrated with home-based international normalized ratio self-testing using portable coagulometers to support communication and therapy management. Effectiveness was measured as time in therapeutic range (TTR), and costs were estimated from the Portuguese National Health Service (Serviço Nacional de Saúde, SNS) and a limited societal perspective over a 1-year horizon. Analysis (i) applied a within-patient cost-effectiveness approach to estimate incremental costs, incremental TTR, and incremental cost-effectiveness ratios (ICERs). Uncertainty was explored through non-parametric bootstrapping (5,000 replications) and deterministic sensitivity analyses. Complementary comparisons examined differences between sequences (analysis ii), between periods (analysis iii), and within each sequence (analysis iv). Results: A total of 19 patients were included in the analyses. 
In analysis (i), RPM improved anticoagulation control, with a mean within-patient increase of 10.43 percentage points in TTR. The mean incremental costs were €198.61 from the SNS perspective and €270.05 from the limited societal perspective. The corresponding ICERs were €19.03 and €25.88 per additional percentage point of TTR gained. Sensitivity analyses produced consistent estimates across parameter variations. Complementary analyses (ii–iv) suggested that RPM tended to be more cost-effective when implemented after the initial 6-month postoperative period. Conclusions: This proof-of-concept study demonstrates that full trial-based economic evaluation can be feasibly embedded within an early-stage pilot of a Tier C DHT. The intervention showed improved anticoagulation control alongside higher costs, providing initial insights on its cost-effectiveness profile. Positioned within the digital health evidence continuum, such assessments can function as a learning stage within the lifecycle. To address the persistent adoption–evidence gap, tier- and stage-aligned frameworks are needed to guide the economic evaluation of DHTs. This study contributes to that goal by providing a set of recommendations specifically for Tier C DHTs. Clinical Trial: ClinicalTrials.gov NCT06423521
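
The ICER arithmetic and the bootstrap step described in this abstract can be sketched as follows. The two ICERs are recomputed from the mean increments quoted in the Results; dividing those rounded means gives roughly 19.04 and 25.89, slightly above the published 19.03 and 25.88, a gap that may reflect rounding of the reported inputs. The function names and the per-patient pairs are hypothetical and are not the trial's data or code.

```python
import random


def icer(incremental_cost, incremental_effect):
    """Incremental cost-effectiveness ratio: extra cost per unit of extra effect."""
    if incremental_effect == 0:
        raise ValueError("incremental effect is zero; ICER is undefined")
    return incremental_cost / incremental_effect


def bootstrap_icers(pairs, n_reps=5000, seed=42):
    """Non-parametric bootstrap of the ICER over per-patient (cost, effect) pairs."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_reps):
        # Resample patients with replacement, keeping cost and effect together.
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        mean_cost = sum(cost for cost, _ in sample) / n
        mean_effect = sum(effect for _, effect in sample) / n
        if mean_effect != 0:  # skip resamples where the ratio is undefined
            estimates.append(icer(mean_cost, mean_effect))
    return estimates


# Mean increments quoted in the abstract (EUR, percentage points of TTR).
sns_icer = icer(198.61, 10.43)
societal_icer = icer(270.05, 10.43)
print(f"SNS: {sns_icer:.2f}, societal: {societal_icer:.2f}")

# Hypothetical within-patient increments (EUR, percentage points of TTR).
pairs = [(150.0, 8.0), (220.0, 12.5), (180.0, 9.0), (260.0, 14.0), (190.0, 10.0)]
estimates = sorted(bootstrap_icers(pairs))
low = estimates[int(0.025 * len(estimates))]
high = estimates[int(0.975 * len(estimates)) - 1]
print(f"95% percentile interval: {low:.2f} to {high:.2f}")
```

Resampling whole patients preserves the correlation between each patient's cost and effect, which is the usual reason for bootstrapping the ratio instead of its numerator and denominator separately.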

  • Towards a common data model to support the FAIRification of colorectal cancer screening data.

    Date Submitted: Nov 17, 2025
    Open Peer Review Period: Nov 14, 2025 - Jan 9, 2026

    The potential to combine and analyze massive data from different colorectal cancer (CRC) screening programs across Europe is a powerful tool for improving early cancer detection. However, the current landscape of CRC screening programs is characterized by significant data heterogeneity, which makes data integration challenging. Achieving such data interoperability among different CRC screening programs is crucial to leverage the maximum benefit of the existing and future data. At EOSC4Cancer, we have worked towards optimizing the secondary use of these cancer data and contributing to its FAIRification. Starting from the harmonization of the real-world in-house data models from four different European CRC screening programs, we have created a common data model that provides an initial foundational baseline to build a new interoperable data scenario.

  • Background: Inflammatory bowel disease (IBD) requires continuous self-management, yet long-term engagement remains challenging. Digital health applications can support self-monitoring and treatment adherence, but their effectiveness often declines over time. Nurse-led interventions may complement such tools by providing emotional support and personalized feedback that sustain engagement. Objective: This randomized controlled trial (RCT) evaluated the effectiveness of WITH-Jang, a mobile self-management app, integrated with WITH-Care, a nurse-led tailored intervention, in improving digital health readiness, self-management capacity, and clinical outcomes among patients with IBD. Methods: A total of 100 adults with ulcerative colitis or Crohn’s disease were randomly assigned (1:1) to either the experimental group (app + nurse-led intervention) or the control group (app only). The 12-week intervention included motivational messages, educational content, scheduled nurse consultations, and personalized health reports, followed by a 12-week app-only follow-up. Outcomes were assessed at baseline, week 4, week 12, and week 24. Measures included digital health readiness (mDiHERS), quality of life (SIBDQ), clinical indices (Mayo Score, CDAI), and app usage logs. Focus group interviews (FGIs) were conducted with participants in the experimental group at weeks 12 and 24 to explore user experiences qualitatively. Results: No statistically significant difference was found in SIBDQ scores between groups, although the intervention group showed an overall trend toward improved quality of life. mDiHERS scores correlated positively with app usage frequency (symptom, diet, and medication logging), indicating that higher digital readiness was associated with greater engagement. Digital health equity declined in the control group but remained stable in the intervention group (p = .055), suggesting a potential protective effect of nurse involvement. 
App usage was strongly associated with disease activity: participants with higher CDAI scores logged symptoms more frequently (r = 0.392, p = .0137). Despite structured support, app engagement declined after week 12, reflecting the “Law of Attrition.” FGIs revealed that nurse consultations and personalized reports were key motivators that provided reassurance, contextual feedback, and emotional support beyond the app’s technical features. Conclusions: Integrating nurse-led support into digital self-management interventions may enhance long-term engagement and mitigate attrition among patients with IBD. While digital tools can improve self-management, sustained effectiveness requires ongoing human support, emotional reinforcement, and adaptive engagement strategies. Future digital health programs should incorporate booster sessions, clinician involvement, and peer support networks to ensure lasting and equitable outcomes. Clinical Trial: KCT0010068

  • The Devil Is In The Details: A Scoping Review of Real-time Psychological Factors Using Ecological Momentary Assessment (EMA) with Continuous Glucose Monitoring (CGM)

    Date Submitted: Nov 12, 2025
    Open Peer Review Period: Nov 12, 2025 - Jan 7, 2026

    Background: Ecological momentary assessment (EMA) is a tool that captures emotional states, experiences, and behaviors in real or near-real time. Using continuous glucose monitoring (CGM) data and EMA in unison may be beneficial to understand associations between psychosocial factors and momentary glucose levels. An in-depth understanding of these relationships is crucial for future interventions targeting psychosocial factors in chronic diseases such as diabetes mellitus. Objective: The goal of this scoping review was to summarize the objectives, methodologies, and outcomes of studies analyzing concurrent psychosocial EMA and CGM data. Methods: This study was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews. A total of 106 studies were identified from PubMed, Embase, and EBSCOhost for the period from May 2009 to January 2025. Thirteen original research articles that collected and analyzed simultaneous EMA and CGM data were included. Methodological data abstracted included study characteristics, EMA protocols and outcomes, CGM outcomes, and integrated EMA and CGM study objectives. Results: Studies primarily recruited adult (92%) populations with type 1 diabetes (T1DM) (69%). EMA delivery protocols and outcomes varied significantly and included emotion, self-care behaviors, disordered eating behaviors, interpersonal interactions, cognition, sleep, workload, and impacts of hypoglycemia. Most (69%) studies analyzed blinded CGM data, and CGM outcomes included both standardized and non-standardized glucose outcomes. Integrated EMA and CGM data answered study objectives including evaluating impacts of psychosocial and lifestyle factors on momentary glucose metrics; the influence of momentary glucose on emotional states, mood, personal behaviors, sleep, and cognition; and study protocol or mobile application optimization, among others. 
Conclusions: The combination of EMA and CGM data provides an opportunity to elucidate the relationship between psychological and behavioral factors with momentary glucose. In this review, we describe a broad range of study characteristics, protocols, outcome measures, and objectives using these novel combined methodologies. Clinical Trial: Not applicable

  • Background: Parental burnout is an under-recognised syndrome characterised by emotional exhaustion, detachment from children, and reduced parental efficacy. It is associated with sleep disturbance, addictive behaviours, suicidal ideation, and increased risk of child neglect and family conflict. Despite its public-health relevance, evidence-based interventions remain limited, particularly in low- and middle-income contexts. Objective: To evaluate the efficacy and safety of a mindfulness- and compassion-based group program, Inter-Care for Parental Burnout (IBAP-BP), designed to reduce burnout symptoms in teleworking mothers. Methods: A three-arm randomised controlled trial (IBAP-BP, active control, waitlist) was conducted across Chile (December 2022–March 2023) with nine-month follow-up. Participants (n = 593) were women ≥ 18 years teleworking ≥ 1 day/week and living with ≥ 1 child. Exclusion criteria were self-reported severe psychiatric disorders. Randomisation was computer-generated and centrally concealed; data analysts were blinded. The IBAP-BP group attended eight weekly two-hour online sessions plus daily home practice integrating mindfulness and compassion. The active control performed relaxation and reflective journaling matched for duration and structure. The primary outcome was parental burnout (Parental Burnout Assessment, PBA) at nine months; secondary outcomes were mindfulness, balance of risks/resources, and adverse effects. Modified intention-to-treat analyses and multilevel structural models assessed effects over time. Results: Of 665 enrolled participants, 343 completed follow-up. At nine months, IBAP-BP produced greater reductions in parental burnout than the waitlist (mean difference = −0.81, p < 0.05; d ≈ 0.6). No significant difference was found between IBAP-BP and the active control, which showed transient improvements up to three months. Effects remained robust in sensitivity analyses. Adverse events were rare and mild across all groups. 
Mediation analyses showed inconsistent associations between mindfulness facets and outcomes. Conclusions: The culturally adapted, online IBAP-BP programme is a feasible, safe, and effective approach for reducing parental burnout in working mothers, with effects sustained over nine months. Clinical Trial: ClinicalTrials.gov Identifier: NCT05833269

  • Psychological Mechanisms of Telecommunications Fraud: Development and Validation of a Three-Stage Model in a Mixed-Methods Study

    Date Submitted: Nov 8, 2025
    Open Peer Review Period: Nov 8, 2025 - Jan 3, 2026

    Background: Telecommunications fraud has become a salient digital-health threat, producing shame, anxiety, and other mental-health harms. However, the psychological processes through which scams progress—from initial contact to loss of behavioral control—remain underexplored. Objective: To develop and validate a time-sensitive psychological model of telecommunications fraud that traces the progression from cognition to emotion to behavior, and to identify modifiable levers for prevention. Methods: We used a mixed-methods design. In Study 1, we conducted semi-structured interviews with telecom-fraud victims and analyzed transcripts with grounded theory. In Study 2, we fielded a survey to test the qualitative model’s pathways. We measured Truth-Seeking and Cognitive Maturity, Gullibility, Difficulties in Emotion Regulation, and Brief self-control. A serial mediation was estimated using Hayes’s PROCESS Model 6 with 5,000 bootstrap resamples and 95% CIs. Results: The qualitative analysis yielded a three-stage model—Credulity Priming → Affective Manipulation → Behavioral Dyscontrol—showing how scammers (1) build trust via impersonation and scripted scenarios, (2) heighten arousal to blunt analytic judgment and push heuristic processing, and (3) trigger irrational choices through emotional reversals. We further identified two emotion mechanisms—the “desire loop” (appetitive arousal) and “fear loop” (threat arousal)—and highlighted over-expectation events as catalysts of reversal and cognitive depletion. In the survey, Truth-Seeking and Cognitive Maturity were inversely associated with Gullibility, Difficulties in Emotion Regulation, and Brief self-control. Gullibility and Difficulties in Emotion Regulation formed a serial mediation pathway from Truth-Seeking and Cognitive Maturity to Brief self-control, consistent with the three-stage model. 
Conclusions: Telecommunications fraud follows a dynamic progression from trust manipulation to affective manipulation and, ultimately, behavioral dyscontrol. Critical-thinking capacities protect against victimization by lowering gullibility and improving emotion regulation. The model maps stage-specific intervention points for digital-health practice and policy: (a) strengthen verification habits and critical-thinking skills to curb credulity; (b) teach emotion-regulation strategies to resist affective manipulation; and (c) deploy just-in-time frictions (eg, secondary verification/transfer holds) to interrupt dyscontrol at transaction points. This framework integrates qualitative mechanism discovery with quantitative validation and offers actionable guidance for anti-fraud campaigns, platform design, and clinical counseling.