JMIR - Open Peer-Review: Stigmatizing Language in Gender-Expansive Patient Records: Corpus, Disparity Analysis, and NLP-based Detection, and other submissions

Latest Submissions Open for Peer Review

JMIR has been a leader in applying openness, participation, collaboration and other "2.0" ideas to scholarly publishing, and since December 2009 offers open peer review articles, allowing JMIR users to sign themselves up as peer reviewers for specific articles currently considered by the Journal (in addition to author- and editor-selected reviewers).

For a complete list of all submissions across all JMIR journals as well as partner journals, see JMIR Preprints

Note that this is not a complete list of submissions, as authors can opt out. The list below shows recently submitted articles where the submitting authors have not opted out of open peer review and where the editor has not yet made a decision. (Note that this feature is for reviewing specific articles. If you just want to sign up as a reviewer and wait for the editor to contact you when articles match your interests, please sign up as a reviewer using your profile.)

To assign yourself to an article as reviewer, you must have a user account on this site (if you don't have one, register for a free account here) and be logged in (please verify that your email address in your profile is correct).

Add yourself as a peer reviewer to any article by clicking the '+Peer-review Me!+' link under each article. Full instructions on how to complete your review will be sent to you via email shortly after. Do not sign up as peer-reviewer if you have any conflicts of interest (note that we will treat any attempts by authors to sign up as reviewer under a false identity as scientific misconduct and reserve the right to promptly reject the article and inform the host institution).

The standard turnaround time for reviews is currently 2 weeks, and the general aim is to give constructive feedback to the authors and/or to prevent publication of uninteresting or fatally flawed articles. Reviewers will be acknowledged by name if the article is published, but remain anonymous if the article is declined.

The abstracts on this page are unpublished studies - please do not cite them (yet). If you wish to cite them/wish to see them published, write your opinion in the form of a peer-review!

Tip: Include the RSS feed of the JMIR submissions on this page on your homepage, blog, or desktop RSS reader to stay informed about current submissions!

If you follow us on Twitter, we will also announce new submissions under open peer-review there.

Titles/Abstracts of Articles Currently Open for Review:

Stigmatizing Language in Gender-Expansive Patient Records: Corpus, Disparity Analysis, and NLP-based Detection
Date Submitted: Jan 8, 2026

Open Peer Review Period: Jan 8, 2026 - Mar 5, 2026
Peer Review Me
Background: Stigmatizing language (SL) in electronic health records (EHRs) can influence clinical decision-making, propagate bias across care encounters, and undermine patient trust. Gender-expansive patients (GEPs) may be particularly vulnerable to documentation-based stigma, yet large-scale quantitative evidence and fairness-aware evaluation of automated SL detection methods remain limited. Objective: To quantify stigmatizing language in clinical documentation for gender-expansive patients by introducing a labeled corpus, analyzing demographic disparities, and evaluating fairness-aware natural language processing (NLP) methods for SL detection. Methods: We developed a corpus of 780 discharge summaries from a large academic health system, annotated for SL and its subtypes. Notes were categorized by GEP versus non-GEP status. We conducted logistic regression to assess associations between GEP status and SL presence, adjusting for demographics. Multiple NLP models, including transfer learning approaches, were benchmarked for SL detection. We implemented fairness-aware thresholding to reduce subgroup performance gaps. Results: SL appeared in 61.9% of GEP notes compared to 26.1% of non-GEP notes. After adjustment, GEP status remained a significant predictor of SL (odds ratio > 4). Baseline NLP models exhibited subgroup disparities, with large gaps in accuracy, true positive rate, and false positive rate between GEP and non-GEP patients. Our fairness-aware thresholding approach reduced error disparities (ΔFPR from 21.16% to 6.65% and ΔTPR from 21.11% to 0.00%) with minimal loss of overall accuracy. Conclusions: Stigmatizing language is common in EHR documentation and disproportionately affects gender-expansive patients, with automated detection models showing persistent subgroup performance gaps.
This study introduces the first annotated corpus focused on SL in GEP documentation, quantifies demographic disparities, and demonstrates practical fairness-aware NLP strategies that can reduce error-rate inequities while preserving accuracy. These findings support equity-focused interventions and inform digital health workflows aimed at reducing stigmatizing language in EHRs.
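The abstract does not say how the fairness-aware thresholding was carried out. One common reading is that each subgroup gets its own decision threshold, chosen so that true positive rates line up. The following is a minimal sketch under that assumption; the function names, the toy scores, and the 0.75 target are all hypothetical and are not taken from the study.

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR) for one subgroup at a given decision threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return (tp / pos if pos else 0.0, fp / neg if neg else 0.0)


def threshold_for_tpr(scores, labels, target_tpr):
    """Highest observed score that, used as a threshold, reaches the target TPR."""
    for t in sorted(set(scores), reverse=True):
        if rates(scores, labels, t)[0] >= target_tpr:
            return t
    return min(scores)


def tpr_gap(groups, thresholds):
    """Absolute TPR difference between the best and worst subgroup."""
    tprs = [rates(s, y, thresholds[g])[0] for g, (s, y) in groups.items()]
    return max(tprs) - min(tprs)


# Toy model scores and gold labels (1 = note contains SL); purely illustrative.
groups = {
    "GEP": ([0.9, 0.8, 0.6, 0.4, 0.7, 0.3], [1, 1, 1, 1, 0, 0]),
    "non-GEP": ([0.9, 0.7, 0.45, 0.2, 0.4, 0.1], [1, 1, 1, 1, 0, 0]),
}

shared = {g: 0.5 for g in groups}  # one threshold for everyone
per_group = {g: threshold_for_tpr(s, y, 0.75) for g, (s, y) in groups.items()}

gap_shared = tpr_gap(groups, shared)
gap_per_group = tpr_gap(groups, per_group)
print(f"TPR gap, shared threshold: {gap_shared:.2f}")
print(f"TPR gap, per-group thresholds: {gap_per_group:.2f}")
```

Matching TPR alone does not guarantee a smaller false positive gap, so a fuller version would search thresholds against both error rates and overall accuracy.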
A Text Mining Approach to Measure Consistency in Self-Reported Situational Causes for Irritability
Date Submitted: Jan 6, 2026

Open Peer Review Period: Jan 7, 2026 - Mar 4, 2026
This manuscript needs more reviewers Peer Review Me
Background: Irritability, a transdiagnostic symptom linked to severe functional impairment and suicide risk, comprises tonic irritability (i.e., chronic irritable mood) and phasic irritability (i.e., episodic anger outbursts). However, the consistency in situational triggers for irritability (i.e., irritability-related stressors) has not been thoroughly studied. Objective: This study uses text mining to create a metric for consistency in irritability-related stressors and examines its association with daily irritability. Methods: Ninety-seven participants (47% female; age: M = 38.85, SD = 10.62; 16% ethno-racial minority) with self-reported depression completed a baseline survey and up to 18 days of daily diaries. We computed the semantic similarity between daily text descriptions of irritability-related stressors to estimate within-person consistency across days. We applied linear regression, mixed-effects linear models, and permutation regression to test the metric’s fairness, relations to daily tonic and phasic irritability separately, and associations with the intraindividual covariance between tonic and phasic irritability. Results: The constructed metric showed no demographic differences. The metric was negatively associated with baseline phasic irritability (β = 0.23, p < 0.001) and daily phasic irritability (β = -0.24, p < 0.001). Higher values of the metric were linked to greater covariance between tonic and phasic irritability (β = 0.23, p < 0.001). Conclusions: The findings indicated that the irritability-related stressors consistency index serves as a fair and unbiased measurement tool for assessing the consistency of self-reported situational causes of irritability. Furthermore, diverse situational causes are indicative of vulnerability to daily phasic irritability. When situational causes are reported to be consistent, individuals tend to experience co-occurring tonic and phasic irritability.
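The abstract does not name the text representation behind the similarity metric. As a rough illustration of the idea, the sketch below scores one person's consistency as the mean pairwise cosine similarity of their daily entries, using plain bag-of-words counts as a stand-in for whatever semantic embedding the authors used. The function names and the diary entries are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def bow(text):
    """Bag-of-words counts; a crude stand-in for a semantic embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def consistency_index(entries):
    """Mean pairwise similarity across one person's daily stressor descriptions."""
    vectors = [bow(e) for e in entries]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return float("nan")  # fewer than two diary days: undefined
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical diary entries for two participants.
same_trigger = ["argument with my boss"] * 3
varied_triggers = ["traffic jam", "noisy neighbors", "late delivery"]

score_same = consistency_index(same_trigger)
score_varied = consistency_index(varied_triggers)
print(score_same, score_varied)
```

Word counts miss paraphrases ("boss" versus "manager"), which is the gap a sentence-embedding model would close.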
Radiomics-based AI for predicting and prognosticating VETC in hepatocellular carcinoma: a systematic review and meta-analysis
Date Submitted: Jan 6, 2026

Open Peer Review Period: Jan 7, 2026 - Mar 4, 2026
This manuscript needs more reviewers Peer Review Me
Background: Vessels encapsulating tumor clusters (VETC) are a distinct vascular pattern associated with aggressive behavior and poor prognosis in hepatocellular carcinoma (HCC). Preoperative identification of VETC is crucial for treatment planning but currently relies on invasive pathological examination. Radiomics-based artificial intelligence (AI) offers a potential noninvasive solution, yet evidence regarding its diagnostic and prognostic accuracy remains unsynthesized. Objective: We aimed to systematically evaluate the diagnostic performance and prognostic value of radiomics-based AI models for noninvasively predicting VETC status in patients with HCC. Methods: We systematically searched PubMed, Embase, Web of Science, and the Cochrane Library for studies published up to July 11, 2025. Studies developing or validating AI models using medical imaging (contrast-enhanced MRI [CEMRI], contrast-enhanced CT [CECT], contrast-enhanced ultrasound [CEUS], or [18F]FDG PET/CT) to predict pathologically confirmed VETC status in HCC patients were included. Study quality was assessed using the PROBAST+AI tool. Diagnostic accuracy (sensitivity, specificity, AUC) and prognostic value for early recurrence (hazard ratio [HR]) were pooled using random-effects models. Results: Fourteen studies involving 729 patients in internal and 581 in external validation cohorts were analyzed. AI models based on CEMRI demonstrated the highest diagnostic accuracy, with a pooled AUC of 0.87 (95% CI 0.84-0.90), sensitivity of 0.82 (95% CI 0.75-0.88), and specificity of 0.77 (95% CI 0.71-0.82). Models using other modalities (CECT, PET/CT, CEUS) showed moderate to good performance. Prognostically, HCC patients classified as VETC-positive by AI had a significantly higher risk of early recurrence (pooled HR 2.34, 95% CI 1.93-2.84).
Conclusions: Radiomics-based AI models, particularly those using CEMRI, are promising for the noninvasive prediction of VETC and offer valuable prognostic stratification for early recurrence risk in HCC. However, significant heterogeneity and the retrospective nature of current studies limit the strength of evidence. Prospective, multicenter validation is required to confirm clinical utility. Clinical Trial: PROSPERO CRD420251167155
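The abstract says hazard ratios were pooled with random-effects models but does not name the estimator. The sketch below assumes the DerSimonian-Laird method on the log-HR scale, with standard errors recovered from the reported 95% CIs. The three study estimates are made up for illustration and do not come from the review.

```python
import math


def se_from_ci(ci_low, ci_high, z=1.96):
    """Standard error of log(HR) recovered from a 95% confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


def pool_random_effects(estimates):
    """DerSimonian-Laird pooling of (hr, ci_low, ci_high) tuples.

    Returns (pooled_hr, ci_low, ci_high, tau2).
    """
    y = [math.log(hr) for hr, _, _ in estimates]
    v = [se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    w = [1.0 / vi for vi in v]

    # Fixed-effect mean and Cochran's Q measure between-study spread.
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance tau2.
    w_star = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (math.exp(mu), math.exp(mu - 1.96 * se), math.exp(mu + 1.96 * se), tau2)


# Hypothetical per-study hazard ratios with 95% CIs.
studies = [(2.1, 1.4, 3.15), (2.6, 1.6, 4.2), (2.2, 1.5, 3.2)]
pooled_hr, pooled_lo, pooled_hi, tau2 = pool_random_effects(studies)
print(f"Pooled HR {pooled_hr:.2f} (95% CI {pooled_lo:.2f}-{pooled_hi:.2f}), tau^2 = {tau2:.3f}")
```

When the studies agree closely, tau² collapses to zero and the result matches a fixed-effect analysis.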
Improving Patient Outcomes through a Human-Centered Approach: The Role of Empathy, Metacognition, and Persuasive Psychological Principles
Date Submitted: Jan 2, 2026

Open Peer Review Period: Jan 5, 2026 - Mar 2, 2026
Peer Review Me
Advances in digital technologies and artificial intelligence have transformed the landscape of health education, creating opportunities for better health education techniques and accessibility to information. These developments have also accelerated the spread of one-size-fits-all approaches, which may risk disengagement and poor adherence to care plans and may compromise the irreplaceable role of human connection in facilitating behavior change. This paper introduces a human-centered framework for patient health education that integrates theoretical insights and empirical evidence to counter the limitations of AI-driven and generalized approaches. Specifically, it presents two innovative tools: the Empathy Map and the Persuasive Pattern framework. As a theoretical paper, it proposes a structured framework that aligns with patient-centered care principles and pairs appropriate use of technology with a humanistic approach. The proposed framework is built around three pillars: (1) empathy-driven needs assessment, operationalized through the Empathy Map to capture patient perspectives, barriers, and motivations; (2) metacognitive empowerment to build reflective, self-directed learning skills; and (3) persuasive psychological strategies, organized into a Persuasive Pattern framework that enhances motivation, sustains engagement, and supports long-term behavior change. This model reframes health education as a collaborative and empowering process rather than a passive transfer of information. A human-centered framework, with its Empathy Map and Persuasive Pattern model, offers a pathway to more effective, ethical, and equitable patient education. Integrating the framework components will ensure that artificial intelligence tools are applied as supportive complements rather than replacements for human empathy and relational care.
The Emerging Roles of AI in Self-Directed Stress-Management: a Systematic Review
Date Submitted: Jan 2, 2026

Open Peer Review Period: Jan 5, 2026 - Mar 2, 2026
Peer Review Me
Background: Stress is widespread and carries substantial mental health, social, and economic burdens. Yet access to clinician-led stress management remains constrained by service capacity, cost, and stigma. In response, artificial intelligence (AI)–enabled tools have rapidly proliferated as scalable, self-directed options. However, evidence on how these systems support stress management outside formal clinical settings remains fragmented. Objective: This systematic review synthesises empirical evidence on how AI-enabled technologies are used for self-directed stress management. We map the emerging functions of these tools, the psychological frameworks informing their design, the populations and settings studied, and the outcomes reported. Methods: We conducted a PRISMA-compliant systematic review of English-language studies published between 2000 and 2025. Six databases were searched (APA PsycINFO, PubMed/MEDLINE, Scopus, Web of Science Core Collection, ProQuest, and Google Scholar). Results: Out of 3,008 records identified, 35 studies met the inclusion criteria. AI-supported stress management operates through five core functions, including psychological intervention, behavioural support, psychoeducation, emotional companionship, and stress monitoring and triage, collectively enabling users to identify stress, regulate responses, and engage in self-directed coping outside formal clinical care. Conclusions: AI-enabled systems show preliminary promise for supporting self-directed stress management through multiple user-facing functions grounded in established psychological frameworks.
Fine-Tuning, Retrieval-Augmented Generation, and Hybrid Large Language Models for Postoperative Clinical Decision Support: A Comparative Analysis
Date Submitted: Jan 1, 2026

Open Peer Review Period: Jan 1, 2026 - Feb 26, 2026
This manuscript needs more reviewers Peer Review Me
Background: Large language models (LLMs) have shown growing potential for clinical decision support. However, effectively integrating domain-specific medical knowledge into LLMs while maintaining accuracy, safety, and interpretability remains a key challenge for postoperative discharge instructions and patient education. Fine-tuning (FT), retrieval-augmented generation (RAG), and hybrid FT+RAG approaches represent three prominent strategies for knowledge integration, yet their comparative performance in postoperative clinical contexts has not been systematically evaluated. Objective: We aimed to compare the clinical performance, reliability, and safety characteristics of baseline, fine-tuned, retrieval-augmented, and hybrid FT+RAG LLM configurations for postoperative clinical decision support. Methods: We conducted a controlled comparative evaluation of four LLM configurations using Google Gemini 2.5 Flash. A total of 600 postoperative question–answer pairs were used for model adaptation and validation, while 150 queries were reserved for final evaluation. Queries included routine postoperative care questions, emergency escalation scenarios, and deliberately out-of-scope questions. Model outputs were independently assessed by three blinded clinical experts for accuracy, completeness, and relevance. Automated metrics were used to evaluate readability, faithfulness, and hallucination propensity. Results: All knowledge-enhanced models significantly outperformed the baseline model in clinical accuracy (baseline 68.0% vs FT 92.7%, RAG 91.3%, FT+RAG 97.3%; p<.001). The hybrid FT+RAG model achieved the highest overall performance, including 100% precision, 96.7% recall, and the lowest hallucination rate. FT and RAG alone yielded comparable gains across accuracy, completeness, relevance, faithfulness, and hallucination reduction, with no statistically significant differences between them. 
While enhanced models produced shorter and more concise responses, they demonstrated reduced readability compared with the baseline model. Conclusions: Incorporating domain knowledge substantially improves the clinical performance of LLMs for postoperative decision support. Hybrid FT+RAG approaches provide the strongest overall accuracy and safety profile, although trade-offs in readability, interpretability, and rater variability remain. These findings support the use of knowledge-augmented LLMs in postoperative care while underscoring the need for careful governance, transparency, and human oversight prior to clinical deployment. Clinical Trial: Not applicable
Digital Health Technology Use Among Rehabilitation Professionals in China: A National Cross-Sectional Survey
Date Submitted: Dec 31, 2025

Open Peer Review Period: Jan 1, 2026 - Feb 26, 2026
Peer Review Me
Background: The rapid expansion of rehabilitation needs in China has intensified pressure on a workforce that remains unevenly distributed. Digital health technologies offer potential to increase service reach and efficiency. However, little is known about how rehabilitation professionals currently gather and document clinical information, nor about their readiness to integrate digital tools into routine practice within China’s rapidly digitalizing health system. Objective: This study aimed to describe how rehabilitation professionals in China collect subjective and objective clinical information, document patient data in routine practice, and assess their willingness to use digital health technologies in clinical settings. Methods: We conducted a national observational cross-sectional survey using a culturally adapted questionnaire based on the World Health Organization Digital Health Interventions framework. The instrument assessed participant characteristics, information collection methods, documentation practices and willingness to adopt digital functions across rehabilitation activities. Descriptive analyses and subgroup comparisons were performed on 324 complete responses from certified rehabilitation professionals. Results: Respondents represented 20 provinces across China, and 82.7% were employed in public sector rehabilitation services, consistent with national workforce distribution patterns. Traditional methods dominated clinical work. Face to face communication was used frequently for subjective assessment by 96.3% of respondents, whereas digital channels such as email (14.2%) and telephone (5.2%) saw limited use. For objective information, visual observation (83.7%) and manual measurement tools (60.2%) remained the primary approaches, while motion capture technology (13.8%) and wearable sensors (4.0%) were rarely used. Documentation practices also relied heavily on analogue formats, with 82.1% using handwritten notes and 60.2% using paper templates. 
In contrast, willingness to adopt digital health technologies was consistently high, with more than 75% of respondents indicating readiness to use digital systems for identity verification, progress tracking and outcome measurement. Conclusions: Rehabilitation professionals in China demonstrate strong readiness to use digital health technologies, yet their routine practice remains largely paper based and analogue. These findings provide national level evidence to inform implementation strategies, workforce training and system level planning aimed at accelerating digital transformation in rehabilitation services.
A Data Analytics Dashboard for Pre-MRI Safety Screening of Implantable Medical Devices
Date Submitted: Dec 27, 2025

Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026
Peer Review Me
This research letter summarizes the development and deployment of a data analytics dashboard that uses natural language processing to streamline pre-MRI safety screening for implantable medical devices, resulting in a 98% reduction in manual screening workload while maintaining high diagnostic accuracy.
Usability Across Three mHealth Problem Solving Training Interventions for Diverse Neurodevelopmental and Neurological Populations: Multicase Usability Evaluation
Date Submitted: Dec 27, 2025

Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026
This manuscript needs more reviewers Peer Review Me
Background: Mobile health (mHealth) interventions that integrate psychoeducation with structured problem-solving training (PST) hold strong potential for improving self-management of chronic conditions. Evaluating the usability of these interventions requires assessing technological, pedagogical, and sociocultural fit. However, most usability evaluations remain narrowly technocentric, focusing on interface-level metrics while neglecting pedagogical coherence, cultural responsiveness, and patient learning needs. Objective: This study aimed to characterize usability challenges and facilitators across three psychoeducational mHealth Problem Solving Training interventions and to identify technological, pedagogical, and sociocultural design features that can improve engagement, accessibility, and implementation for diverse users. Methods: A multi-method, multicase study was conducted with a total of n=14 participants who completed think-aloud usability sessions while interacting with one of three different mHealth PST interventions designed for persons with neurodevelopmental and/or neurological disability: (1) Epilepsy Journey 2.0, (2) Survivor’s Journey, and (3) Electronic Problem-Solving Training (ePST). Participants completed a presession technology comfort survey and the Comprehensive Assessment of Usability for Learning Technologies (CAUSLT) postsession. All sessions were recorded, transcribed, and analyzed thematically. CAUSLT data were analyzed using descriptive quantitative methods. Results: ePST demonstrated the highest usability (x̄ 88 out of 100, 95 percent CI 71.8 to 104.2), followed by Survivor’s Journey (x̄ 83 out of 100, 95 percent CI 57.1 to 108.9) and then Epilepsy Journey 2.0 (x̄ 79 out of 100, 95 percent CI 65.3 to 92.7). Findings revealed that usability in healthcare learning design is shaped by how effectively the technology, learning content, and contextual factors align with patients’ needs. 
Recurring challenges across interventions included unclear navigation, poor mobile responsiveness, instructional ambiguity, insufficient feedback, potential for greater inclusivity, and limited error recovery. Twelve cross-case design principles were derived, emphasizing mobile-first accessibility, cognitive load reduction, context-sensitive feedback, and empathetic, inclusive design. Conclusions: Usability challenges in mHealth PST interventions arise not only from interface level issues but also from how effectively the intervention supports users’ understanding, decision making, and real world application demands. This extends prior mHealth usability research by demonstrating that user difficulties often reflect misalignments between technological features, instructional structure, and the everyday contexts in which individuals engage with PST. Resulting design principles highlight specific, actionable priorities for developers, including mobile first optimization, clearer task scaffolding, and better feedback and error recovery. Future work should evaluate these principles in larger samples and clinical settings to determine their impact on engagement, adherence, and downstream health outcomes.
Digital Transformation in Healthcare: Are we on the right track?
Date Submitted: Dec 26, 2025

Open Peer Review Period: Dec 29, 2025 - Feb 23, 2026
This manuscript needs more reviewers Peer Review Me
The healthcare digital transformation is gaining increasing notoriety, despite the observed challenges in its implementation. The envisioned benefits together with the growing need for better healthcare are motivating academia, organizations, regulatory agencies, and governments to develop more effective digital healthcare solutions. Through extensive debates among the authors and supported by a narrative literature review, this paper discusses how digital transformation is being conducted in the healthcare sector. Our discussion relies on the concepts from the sociotechnical systems theory categorizing it according to three social (people, culture, and goals) and three technical (processes/procedures, infrastructure, and technology) dimensions. Overall, we argue that both social and technical dimensions present elements that have been either encouraging or discouraging the progress of healthcare digital transformation. The identification of current trends on such (on- and off-track) elements allowed the formulation of propositions for future testing and validation. This approach can help the establishment of better government policies, foster private initiatives, and shift regulatory guidelines to support a successful digital transformation in health systems. Lastly, from a research perspective, we outline some opportunities for further interdisciplinary investigation in the field, promoting advances in the understanding of healthcare digital transformation.
Commercialization of Online Cancer Information in South Korea: Examining Covert Promotional Cancer-related Posts Across Two Major Search Engines
Date Submitted: Dec 25, 2025

Open Peer Review Period: Dec 25, 2025 - Feb 19, 2026
This manuscript needs more reviewers Peer Review Me
Background: Internet search engines serve as primary gateways for cancer information, yet the commercialization of health content within organic search results remains understudied. While covert promotional content—such as native advertising and stealth marketing—has been documented in various contexts, systematic comparisons across structurally divergent search platforms are lacking. Objective: This study examined the prevalence, distribution, and information quality characteristics of covert promotional cancer-related content across Naver and Google, South Korea's two dominant search engines, which have fundamentally different platform architectures. Methods: A two-phase cross-sectional content analysis was conducted. Phase 1 employed natural language processing to identify 33 cancer-related keywords from 1,400 preliminary posts. Phase 2 systematically collected 5,848 posts in October 2023, yielding 919 unique posts (598 from Naver and 321 from Google) that covered seven major cancer types, representing over 70% of Korean cancer incidence. Two trained coders analyzed promotional status, intensity, institutional sources, and information quality indicators (citation practices, information depth, and source attribution), with inter-coder reliability exceeding κ=.80. Chi-square tests examined the associations between platform and cancer type. Results: Covert promotional content appeared in 48.6% (447/919) of analyzed posts, with significantly higher prevalence on Google (54.2%, 174/321) than Naver (45.7%, 273/598; χ²₁=5.78, p=.016). Platform differences were pronounced: Naver promotional posts predominantly originated from blogs (96.0%, 262/273) and exhibited full promotional intensity (52.1%, 126/242), while Google posts primarily came from hospital websites (81.0%, 141/174) with simple institutional identification (57.8%, 52/90). 
Institutional source distribution varied significantly by platform (χ²₅=215.714, P<.001): traditional medicine institutions dominated Naver (99.2%, 119/120), whereas university-affiliated hospitals predominated on Google (85.0%, 96/113). Information quality differed substantially: indirect citation was more common on Google (81.6%, 142/174) than Naver (58.6%, 160/273; χ²₁=25.653, P<.001), while comparative informational depth was higher on Google (55.7%, 97/174) versus Naver (19.4%, 53/273; χ²₂=64.683, P<.001). Conclusions: Covert promotional cancer content is pervasive in Korean search results, with platform architecture systematically shaping promotional patterns, institutional sources, and information quality rather than reflecting deliberate marketing strategies. These findings underscore the need for platform-sensitive regulation and enhanced digital health literacy to protect vulnerable cancer information seekers from commercial exploitation embedded within ostensibly neutral search environments.
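The platform comparison of promotional prevalence (174/321 on Google versus 273/598 on Naver) can be reproduced from the counts given in the abstract. The sketch below computes a 2×2 chi-square statistic by hand. Applying the Yates continuity correction is an assumption, since the abstract does not mention it, but with it the statistic comes out at about 5.78, in line with the reported value.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square for the table [[a, b], [c, d]] (1 degree of freedom).

    With yates=True, the continuity correction subtracts 0.5 from each
    |observed - expected| before squaring.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(0.0, diff - 0.5)
            chi2 += diff ** 2 / expected
    return chi2


# Counts from the abstract: promotional vs non-promotional posts per platform.
google_promo, google_total = 174, 321
naver_promo, naver_total = 273, 598

table = (
    google_promo, google_total - google_promo,
    naver_promo, naver_total - naver_promo,
)
chi2_corrected = chi_square_2x2(*table, yates=True)
chi2_uncorrected = chi_square_2x2(*table, yates=False)
print(f"With continuity correction: {chi2_corrected:.2f}")
print(f"Without correction: {chi2_uncorrected:.2f}")
```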
Efficacy of Various Virtual Reality Exposure Therapies for Chronic Low Back Pain: A Systematic Review and Network Meta-Analysis
Date Submitted: Dec 24, 2025

Open Peer Review Period: Dec 24, 2025 - Feb 18, 2026
Peer Review Me
Background: Chronic low back pain is a major global health challenge. While non-pharmacological therapies are recommended, patient compliance is often hindered by kinesiophobia. Virtual reality (VR) offers an immersive, distraction-based approach, but the comparative effectiveness of different VR modalities remains unclear. Objective: To compare and rank the efficacy of different Virtual Reality Exposure Therapy (VRET) modalities on pain intensity, functional disability, and kinesiophobia in patients with chronic low back pain (CLBP). Methods: Systematic searches were conducted in PubMed, Web of Science, Scopus, Embase, CINAHL, and the Cochrane Library from inception until June 2025. Randomized controlled trials assessing the effects of virtual reality exposure therapy on individuals with chronic low back pain were selected. Primary outcomes were pain intensity, functional disability (Oswestry Disability Index), and kinesiophobia (Tampa Scale of Kinesiophobia). The Cochrane Risk of Bias tool (RoB2) was used for quality assessment. A Bayesian Network Meta-Analysis with Standardized Mean Difference (SMD) as Effect Size was performed to synthesize evidence and rank interventions using Surface Under the Cumulative Ranking Curve (SUCRA) values. The GRADE framework was adapted to evaluate the quality of evidence. Results: 25 RCTs with a total of 2,610 participants were included in the analysis. For pain intensity, shooting games (SMD -4.40; 95% CrI -6.80 to -2.20) and VR-based equestrian training (SMD -2.00; 95% CrI -3.70 to -0.57) were significantly superior to all types of controls. Surface Under the Cumulative Ranking Curve (SUCRA) indicated that shooting games had the highest probability (98%) of being the most effective intervention for pain relief. For functional disability, no intervention demonstrated statistically significant superiority. 
For kinesiophobia, shooting games (SMD -3.40; 95% CrI -5.60 to -1.10) significantly outperformed traditional exercise controls. The quality of evidence ranged from very low to moderate across outcomes. Conclusions: VRET, particularly in the form of shooting games and VR-based equestrian training, appears effective for reducing pain in CLBP. Clinicians may prioritize the higher-ranked shooting games, reserving VR-based cognitive behavioral therapy as an alternative or adjunct for kinesiophobia. However, the benefits for functional improvement remain uncertain. Clinical Trial: PROSPERO CRD420251131116; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251131116
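For readers unfamiliar with SUCRA, the value for a treatment is the average of its cumulative ranking probabilities over the first a-1 ranks, where a is the number of treatments. A value of 1 means the treatment is certain to rank best and 0 means it is certain to rank worst. The sketch below applies that definition to made-up rank probabilities; the treatment names and numbers are hypothetical and are not the study's posterior output.

```python
def sucra(rank_probs):
    """SUCRA from a list where rank_probs[j] = P(treatment has rank j+1).

    Rank 1 is best. The probabilities for one treatment should sum to 1.
    """
    a = len(rank_probs)
    if a < 2:
        raise ValueError("SUCRA needs at least two competing treatments")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:  # the last cumulative value is always 1, so it is skipped
        cumulative += p
        total += cumulative
    return total / (a - 1)


# Hypothetical rank probabilities for three interventions.
rank_probabilities = {
    "shooting game": [0.90, 0.10, 0.00],
    "equestrian training": [0.10, 0.70, 0.20],
    "control": [0.00, 0.20, 0.80],
}

sucra_values = {name: sucra(p) for name, p in rank_probabilities.items()}
for name, value in sorted(sucra_values.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.2f}")
```

SUCRA summarizes ranking only. It says nothing about the size of the effect or the certainty of the evidence, which is why the abstract reports it alongside SMDs and GRADE ratings.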
Using GPT-4 to Automate the Generation of Lay Summaries for Cancer Publications: Human-centric Quantitative and Qualitative Evaluation
Date Submitted: Dec 23, 2025

Open Peer Review Period: Dec 24, 2025 - Feb 18, 2026
Peer Review Me
Background: Cancer research literature is often riddled with technical jargon that is not digestible to the average person. Individuals interested in research studies may want to contribute through patient partner engagement or sample donation but find the relevant literature overwhelming. Through the generation of lay summaries, previously inaccessible research papers become easier to comprehend, especially for patient partners or data donors. With large language models (LLMs) continuing to advance, so does their capability to summarize large texts. Objective: In this study, we examined whether LLMs can produce lay summaries of scientific literature at-scale, while maintaining readability and accuracy to their source texts. Methods: We developed a tool to generate lay summaries of open-access article abstracts and their full texts with GPT-4-Turbo. Prompt development aimed for a target 8th grade reading level assessed with Flesch-Kincaid Grade Level. Human-review metrics were used to evaluate readability and accuracy when generated using abstracts versus full text articles. Results: The average Flesch-Kincaid Grade Level Score was 7.13 for abstract-based summaries and 7.39 for full text-based summaries, indicating summaries at around 7th grade reading level. Human-review metrics showed these summaries were of similar readability and accuracy when generated using abstracts versus full text articles, with mean accuracy scores from human review of 7.09 vs 7.42 out of 10 respectively. Additionally, qualitative patient-based assessment indicated these summaries would encourage participation in research studies. Conclusions: By generating lay summaries for complex and lengthy research papers, their scientific information becomes accessible to a larger audience, including patient partners interested in contributing to cancer research. 
Summaries that are easy to understand will allow participants to make informed decisions about their involvement and appreciate the impact of their contributions if and when their results are published.
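For readers unfamiliar with the readability metric named in this abstract, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula. It is not the authors' tool: the syllable counter is a rough vowel-group heuristic, and the function names are hypothetical, chosen only for illustration.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula applied to raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def fk_grade(text: str) -> float:
    """Estimate the grade level of non-empty English text using the heuristic above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)
```

A score near 7 or 8 corresponds roughly to a US 7th or 8th grade reading level. Published readability tools typically use more careful syllable counting, so scores from this sketch may differ from theirs.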
Large Language Models in Colorectal Cancer: A Systematic Review
Date Submitted: Dec 22, 2025

Open Peer Review Period: Dec 23, 2025 - Feb 17, 2026
This manuscript needs more reviewers Peer Review Me
Background: The growing complexity of colorectal cancer (CRC) management requires advanced tools for integrating multimodal data and clinical knowledge. Large language models (LLMs) offer a promising approach to address these challenges through sophisticated natural language processing and reasoning capabilities. Objective: This systematic review evaluates the current applications, performance, and practical implications of LLMs across the continuum of CRC care, from screening to treatment decision support. Methods: We searched six databases (PubMed, Embase, Web of Science, Scopus, CINAHL, Cochrane) up to November 1, 2025, following PRISMA guidelines. Included studies were original research investigating LLM applications specific to CRC, with extractable outcome data. Quality was assessed using QUADAS-2, PROBAST, and ROBINS-I tools by two independent reviewers. Results: Following the screening of 1,261 records, 34 studies met the inclusion criteria, all published between 2023 and 2025. The synthesis highlighted the utility of LLMs in automating data extraction from clinical texts, supporting patient education, aiding diagnostic processes, and assisting in clinical decision-making, with growing evidence of their emerging visual interpretation and multimodal capacities. The effectiveness of these models was significantly influenced by prompt design, which varied from basic zero-shot queries to specialized fine-tuning techniques. While the overall methodological quality of the included studies was deemed adequate, assessments identified recurring concerns regarding insufficient control of biases and inadequate reporting on data security measures. 
Conclusions: LLMs demonstrate tangible potential to augment CRC care, particularly in structuring unstructured data and providing clinical decision support. However, translating this potential into practice requires solutions for domain adaptation, multimodal integration, and rigorous prospective validation to ensure reliability and safety in real-world settings. Clinical Trial: PROSPERO CRD420251248261; https://www.crd.york.ac.uk/PROSPERO/view/CRD420251248261.
Artificial Intelligence Prediction of Individual Treatment Response to Smartphone-Based Mindfulness in Autistic Adults with Anxiety Symptoms: Randomized Controlled Trial Analysis
Date Submitted: Dec 22, 2025

Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026
This manuscript needs more reviewers Peer Review Me
Background: Anxiety disorders are highly prevalent among autistic adults, with 20%-65% experiencing at least one diagnosable anxiety disorder. While mindfulness-based interventions have demonstrated efficacy for anxiety reduction, treatment response varies considerably across individuals. Machine learning approaches offer potential for identifying who is most likely to benefit from smartphone-based mindfulness interventions, enabling more personalized treatment recommendations. Objective: This study aimed to develop and evaluate machine learning models to predict individual treatment response, in the form of reduced anxiety symptoms, to a smartphone-based mindfulness intervention for autistic adults. We sought to identify baseline characteristics that distinguish responders from non-responders, explore few-shot learning with large language models as a complementary approach for low-data clinical prediction, and implement a Personalized Advantage Index approach for individualized treatment recommendations. Methods: We conducted a secondary analysis of data from a randomized controlled trial comparing a 6-week smartphone-based mindfulness intervention (Healthy Minds Program) with a waitlist control condition in autistic adults. Among 73 participants who completed the intervention, we defined responders as those achieving ≥7-point reduction in State-Trait Anxiety Inventory state anxiety scores. Baseline predictors included demographic variables, autism trait measures, and self-report questionnaires assessing anxiety symptoms, perceived stress, affect, and mindfulness. We trained six machine learning models (logistic regression, Random Forest, XGBoost, TabNet, Tab-ICL, and TabPFN) using nested 10-fold cross-validation with inner 5-fold cross-validation for hyperparameter tuning. Additionally, we evaluated few-shot learning using GPT-4o models with tokenized baseline features at varying shot counts (20-70 examples). 
Model performance was evaluated using area under the receiver operating characteristic curve (AUC) for the machine learning models and classification accuracy for few-shot learning. We examined feature importance and implemented Personalized Advantage Index analysis to estimate individualized treatment benefit. Results: Random Forest achieved the highest predictive performance for state anxiety response (AUC 0.79, 95% CI 0.66-0.91), followed by TabPFN (AUC 0.78, 95% CI 0.64-0.94) and logistic regression (AUC 0.77, 95% CI 0.73-0.81). Higher baseline state anxiety (coefficient 1.20, P<.001) predicted better treatment response, while higher trait anxiety (coefficient -0.17, P=.001), older age (coefficient -0.18, P=.02), and lower childhood pretend play scores (coefficient -0.93, P=.007) were associated with poorer response. Few-shot learning with 7-feature tokenization achieved accuracy of 0.867 (95% CI 0.81-0.92) at 70 shots, significantly outperforming the Random Forest baseline (0.733, P<.001). Prediction of trait anxiety changes was substantially weaker (AUCs 0.57-0.68), likely reflecting the inherent stability of this personality dimension. The Personalized Advantage Index demonstrated significant moderation of treatment group differences (adjusted R²=0.29), with 75% of participants predicted to benefit more from the mindfulness intervention than the waitlist control. Conclusions: Machine learning models successfully identified baseline characteristics predicting treatment response to a smartphone-based mindfulness intervention in autistic adults. Few-shot learning with large language models demonstrated superior performance to traditional machine learning when provided with compact, high-signal feature representations, offering a promising approach for clinical prediction in small-sample settings. These findings demonstrate the feasibility of precision psychiatry approaches in digital mental health interventions for autistic adults. 
While modest sample size and limited demographic diversity warrant cautious interpretation, the stable cross-validation performance suggests robust predictive patterns within similar populations. Future research should validate these models in larger, more diverse samples and explore whether algorithm-guided treatment recommendations improve outcomes compared to standard care, through prospective randomized trials.
Gender Bias and Assignment Consistency in Large Language Models for Clinical Decision-Making: Comparative Evaluation Study
Date Submitted: Dec 22, 2025

Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026
Peer Review Me
Background: The integration of large language models (LLMs) into healthcare holds promise to enhance clinical decision-making, yet their susceptibility to biases remains a critical concern. Gender has long influenced physician behaviors and patient outcomes, raising concerns that LLMs assuming human-like roles, such as clinicians or medical educators, may replicate or amplify gender-related biases. Objective: To evaluate the consistency of LLM responses across different assigned genders (personas) regarding both diagnostic outputs and model judgments on the clinical relevance or necessity of patient gender. Methods: Using case studies from the New England Journal of Medicine Challenge (NEJM), we assigned genders (female, male, or unspecified) to multiple open-source and proprietary LLMs. We evaluated their response consistency across LLM-gender assignments regarding both LLM-based diagnosis and models’ judgments on the clinical relevance or necessity of patient gender. For representative models with high diagnostic accuracy, we further evaluated consistency across question difficulty tiers and clinical specialties. Results: All models showed high diagnostic consistency across assigned LLM genders (range of consistency rates: 91.45%–97.44%), though this did not always correspond to diagnostic accuracy (e.g., GPT-4.1: 97.44% consistency, 0.943 accuracy; Gemma-2B: 97.44% consistency, 0.478 accuracy). In contrast, judgments on the clinical importance of patient gender showed marked inconsistency: consistency rates ranged from 58.97% to 90.6% for relevance judgements, 78.63% to 98.29% for necessity judgements. Stratified by difficulty tier and specialty, the open-source model (LLaMA-3.1-8B) particularly showed statistically significant differences across LLM genders regarding both relevance and necessity judgements. 
Conclusions: Despite stable diagnostic outputs, LLMs varied substantially in their assessments of patient gender’s clinical importance across gendered personas. These findings present an underexplored bias that could undermine the reliability of LLMs in clinical practice, underscoring the need for routine checks of identity-assignment consistency when interacting with LLMs to ensure reliable and equitable AI-supported clinical care. Clinical Trial: not applicable
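As an illustration of the consistency rates reported in this abstract, the sketch below computes the share of cases in which a model gives the same answer under every assigned persona. The abstract does not state the exact definition the authors used, so this all-personas-agree rule, the function name, and the toy data are assumptions.

```python
from typing import Mapping, Sequence

def consistency_rate(responses_by_persona: Mapping[str, Sequence[str]]) -> float:
    """Fraction of cases where every persona produced an identical response.

    Assumed definition: a case counts as consistent only if all personas agree.
    """
    runs = list(responses_by_persona.values())
    n_cases = len(runs[0])
    if any(len(run) != n_cases for run in runs):
        raise ValueError("every persona must answer the same number of cases")
    # zip(*runs) groups the answers case by case across personas.
    consistent = sum(1 for answers in zip(*runs) if len(set(answers)) == 1)
    return consistent / n_cases

# Hypothetical toy data: four cases, three personas; only the last case disagrees.
toy = {
    "female": ["A", "B", "C", "D"],
    "male": ["A", "B", "C", "E"],
    "unspecified": ["A", "B", "C", "D"],
}
rate = consistency_rate(toy)  # 3 of 4 cases agree
```

Consistency measured this way is independent of correctness, which matches the abstract's observation that a model can be highly consistent yet inaccurate.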
Artificial Intelligence Medical Scribe for Documentation Burden and Clinician Experience: Real-World Retrospective Observational Study
Date Submitted: Dec 20, 2025

Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026
Peer Review Me
Background: Clinicians spend a substantial share of their working hours on documentation, contributing to workflow inefficiencies, reduced patient-facing time, and increased burnout. AI medical scribes have emerged as a promising solution to reduce this burden, yet real-world evidence remains limited and heterogeneous. Data from European health systems are especially scarce, despite growing interest in AI-enabled documentation support. Reducing clinicians’ documentation burden is a critical priority in modern health care, as excessive administrative work consumes substantial clinician time, contributes to burnout, and limits time available for direct patient care. Objective: To quantify the impact of an AI medical scribe on documentation time and clinician experience. Methods: This observational real-world evaluation was conducted between April 26th 2024 and October 27th 2025 to assess the impact of an AI medical scribe on documentation time and clinician experience using retrospective paired ratings. The study was carried out across multiple specialties in primary, secondary and hospital care within Capio Ramsay Santé, a large integrated health care provider operating in Sweden. The target population consisted of licensed clinicians actively using the AI medical scribe in routine clinical practice. Eligibility was limited to “fully onboarded” users, defined as clinicians who had used the scribe for at least 3 months, created more than 100 notes, generated at least one document or certificate, and used the conversational edit (“Add or adjust”) feature at least once. Results: With the introduction of the AI medical scribe, the estimated time spent on documentation per note decreased from 6.69 minutes to 4.72 minutes (-29%, p = 1.70e-11). On a five-point Likert scale, the ability to work without stress related to administrative tasks increased from a mean of 2.41 to 3.14 (p = 2.46e-8), and perceived presence with patients increased from 3.73 to 4.33 (p = 2.47e-8). 
The median editing time was 93 seconds, and it did not decrease significantly over continued use. Conclusions: This study shows that the clinician time savings and reductions in cognitive load and stress reported in prior US-based studies can also be achieved in a European health care system using an AI scribe. Clinical Trial: The study adhered to the Standards for Quality Improvement Reporting Excellence (SQUIRE) guideline and was preregistered on the Open Science Framework on 7 October 2025 (DOI: 10.17605/OSF.IO/YPD9E)
Physical voice parameters and mental state changes in affective disorders – a longitudinal study using the MoodMon AI system
Date Submitted: Dec 19, 2025

Open Peer Review Period: Dec 22, 2025 - Feb 16, 2026
This manuscript needs more reviewers Peer Review Me
Background: Psychiatry needs objective technological tools to address global staffing shortages, stigma, and other systemic challenges. A long-term, naturalistic study using AI to effectively detect changes in mental state in major depressive disorder (MDD) and bipolar disorder (BD) based on physical characteristics of the voice represents a breakthrough in biomarker validation. The MoodMon system was developed along with a mobile application for smartphones. Objective: The aim of the study was to determine whether physical voice parameters would be effective as biomarkers of mental status changes in affective disorders and whether they would be useful in remote clinical monitoring of patients by psychiatrists. Methods: To evaluate the effectiveness of artificial intelligence (AI) algorithms in detecting changes in mental state based on physical voice parameters, data from 75 patients diagnosed with bipolar disorder (BD) and 25 patients with major depressive disorder (MDD) for 944 days were used. This makes it the longest analysis in the world covering two of the most common mental disorder diagnoses. A wealth of clinical, behavioral, and technical data was collected and used to train the MoodMon machine learning system under the supervision of human experts (experienced psychiatrists). The AI module consists of an ensemble of selected supervised learning and clustering algorithms. In the first stage, the AI was trained using objective data and clinical assessments conducted by psychiatrists, including 17-item versions of the HDRS and YMRS, as well as the CGI scale. The second stage involved further refinement of the AI using individual and population data and generating alerts when subtle changes in mental state were detected. Results: 19 of the 243 specific physical voice parameters tested were found to be most effective in detecting changes in mental status. 
The system demonstrated high performance, achieving the following sensitivity (true positive rate – TPR) and specificity (true negative rate – TNR) values for both diagnoses: TPR = 89.5%, TNR = 98.8%; BD: TPR = 89.6%, TNR = 98.9%; MDD: TPR = 89.1%, TNR = 98.5%. Voice alerts in the MoodMon system are a key tool supporting clinical decision-making. They increase the probability of a clinical visit and exert a significant influence on the likelihood of treatment modification. Conclusions: The system confirmed the presence of parameters that may serve as biomarkers of mental state changes in bipolar disorder (BD) and major depressive disorder (MDD). A key clinical implication is the increased probability of prompt treatment modification following an alert, thereby supporting the primary objective underlying the development of the MoodMon AI tool. Clinical Trial: Study: UR.D.WM.DNB.39.2021; Funder: National Centre for Research and Development, Poland. Project title: Development of a system supporting the monitoring of the course and early detection of relapses of affective disorders based on artificial intelligence algorithms. Agreement: POIR.01.01.01-00-0342/20
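For reference, the sensitivity (TPR) and specificity (TNR) figures quoted in this abstract follow the usual confusion-matrix definitions. The sketch below shows the arithmetic; the counts are hypothetical toy values, not data from the study, and the function name is illustrative only.

```python
from typing import Tuple

def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (TPR, TNR) from confusion-matrix counts.

    TPR = TP / (TP + FN): share of true state changes that were flagged.
    TNR = TN / (TN + FP): share of stable periods correctly left unflagged.
    """
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return tpr, tnr

# Hypothetical counts for illustration only (not taken from the manuscript).
tpr, tnr = sensitivity_specificity(tp=9, fn=1, tn=18, fp=2)
```

With these toy counts both values come out at 0.9. Neither metric depends on how common state changes are, so they do not by themselves indicate how often an alert is correct.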
Artificial Intelligence Trustworthiness in the Preoperative Period for Patients with Serious Illness: A Systematic Narrative Review.
Date Submitted: Dec 20, 2025

Open Peer Review Period: Dec 18, 2025 - Feb 12, 2026
This manuscript needs more reviewers Peer Review Me
Background: Despite the promising potential of artificial intelligence (AI) in the perioperative context, the rapid pace of development and diverse implementation warrant a systematic review to consolidate existing knowledge, identify gaps, and assess the utilization of trustworthiness principles of AI integration into the perioperative period for patients with serious illness. Objective: The purpose of this study was to address deficiencies in perioperative AI literature by elucidating the extent to which equity, ethics, and safety discussions are incorporated, thereby establishing a foundation for developing robust ethical guidelines for the safe and effective integration of AI in healthcare. This study also examined the utilization of AI enabled team augmentation in perioperative serious illness care. Methods: We searched PubMed, Embase, CENTRAL, and Scopus for studies published between 2010 and July 2024. We included studies that reported patient functional outcomes, occurred in the perioperative period (30 days before and up to 90 days after surgery), included AI integration, and included patients with serious illness (defined as: malignancy, advanced organ failure, frailty, dementia/neurodegenerative disease, or stroke). To ensure reliability and minimize bias, two independent reviewers screened all studies through the title/abstract and full-text stage; conflicts were resolved through team consensus. The abstraction form was developed iteratively and was tested through pilot abstractions. Any discrepancies identified during data extraction were resolved through discussion and consensus among the reviewers. The ROBINS-I risk of bias tool in non-randomized studies was used to assess quality. Abstraction and risk assessment occurred through a blinded, independent dual review. A narrative review was compiled with the identified studies. 
Results: Of the 10,980 articles identified through the database searches, this review yielded 81 articles that met inclusion criteria. A majority of the studies were published in China (35), with the United States (9) and South Korea (7) having the next most publications, and 80 out of 81 (98.8%) articles focused on patients with malignancy. Analysis of AI implementation strategies revealed foundational efforts toward equitable access, with six studies providing open-access tools and several more designing models with simple inputs suitable for low-resource settings (17). Seven studies mentioned their commitment to transparency (e.g., publishing code) to enhance safety and trust. However, significant ethical deficiencies persist, particularly around input data, as only two studies explicitly addressed racial or ethnic disparities, and concerns about lack of sample diversity (16) and the omission of socially relevant features (5) were frequently noted as limitations. Although no current studies considered AI enabled team augmentation, a majority of articles described how AI could be used to prompt a team member to make a tangible action. Conclusions: Machine learning for predictive analytics and other types of AI tools in surgical outcomes offers significant potential but requires adherence to trustworthiness and safety principles to be clinically viable. By leveraging longitudinal data and continuous performance tracking, these models have the potential to positively affect diverse patient needs and healthcare systems. Future research should prioritize adhering to guidelines for equity, ethics, and safety, conduct prospective studies, incorporate more external validation of AI models, and facilitate transparent monitoring and reporting of model performance to build clinician and patient trust and to encourage broader healthcare system adoption. Clinical Trial: PROSPERO CRD42024608387
A feasibility and acceptability pilot trial of a virtual world platform for delivering youth mental health treatment
Date Submitted: Dec 17, 2025

Open Peer Review Period: Dec 17, 2025 - Feb 11, 2026
Peer Review Me
Background: Online virtual worlds are platforms that allow users, represented as avatars, to meet and interact with other users in real time within 3D virtual environments. These platforms have potential utility as vehicles to deliver/receive clinical services, especially as a preference to video-conferencing-based telehealth. However, commercial virtual worlds (e.g.,“Second Life”) are often deemed unsuitable due to privacy and safety concerns. Objective: The aim of this study was therefore to co-develop and test a bespoke virtual world platform to deliver routine youth mental health services. Methods: We undertook a participatory-design process to develop the platform (Orygen Virtual Worlds) involving 10 young people with lived experience of mental health difficulties, researchers, software designers and mental health clinicians. We then tested two types of clinic-led interventions delivered through the virtual world (a structured therapy group and an individual therapy) in a public youth mental health service setting in Australia. Participants were patients receiving treatment in the service. The main outcomes were acceptability and feasibility; we also measured symptom change, usability, presence and therapeutic alliance. We conducted qualitative interviews post-intervention with the participants and analysed these interviews using thematic analysis. Results: 15 young people were recruited to the structured group (27% consented from referred) and 8 were recruited to the individual therapy (36% consented from referred). Drop out was higher in the individual therapy than the structured group therapy (38% versus 80%). Acceptability ratings were high for both therapy approaches and there were no significant safety events attributed to using the platform. There were no significant pre-post differences in the symptom outcome measures in either the structured group intervention or individual therapy. 
The platform was perceived as being comfortable and safe, enjoyable, fun and interactive, and was not confusing to navigate or difficult to use. The qualitative themes included the platform being fun and engaging, making treatment more accessible, providing a safe and inclusive place, fostering connections, positively impacting wellbeing and providing a catalyst for real life functional change. Young people perceived decreased barriers, increased comfort with help-seeking and reduced social stress facilitated by the avatar, communication options (emoji, text, voice) and accessibility from home. Conclusions: Our findings indicate that online virtual world platforms, such as the one we have designed, hold considerable promise for providing interventions for young people in clinical services. Virtual worlds can provide fun and engaging experiences of therapeutic interventions for young people with mental health difficulties which are safe and inclusive, especially for harder to reach groups.
Usage and exposure to content of the NHS Healthy Living programme for people with type 2 diabetes: a retrospective observational cohort study
Date Submitted: Dec 16, 2025

Open Peer Review Period: Dec 16, 2025 - Feb 10, 2026
Peer Review Me
Background: Diabetes self-management and education services can improve health outcomes, but engagement is often low. ‘Healthy Living’ is an online self-management programme for people with type 2 diabetes, based on the ‘HeLP-Diabetes’ intervention which demonstrated effectiveness in a randomised controlled trial. Healthy Living was commissioned by NHS England and rolled out nationally into routine care. The website comprises structured learning, unstructured articles (which users could access at any time), and tracking tools such as goal setting. Objective: To investigate overall usage and exposure to content of Healthy Living, including differences in usage/exposure by user characteristics. Methods: Anonymous usage data from all people (n=27,422) who activated an account between May 2020 and September 2023 were available, including (1) which website activities were accessed, (2) when activities were accessed and (3) how long users spent on each activity. User demographic and usage information was summarised. Logistic regression evaluated the association between user demographics and usage. Results: The median length of time spent on the website in total was 7·6 minutes (IQR 0·6-27·6 minutes); 12,066 (44·0%) users spent less than five minutes on the website and 3,022 (11·0%) spent one hour or more. Of those who activated an account, 69·8% accessed some website content, 40·7% completed the first section of structured education, and 4·7% completed 60% of the structured education. Usage of the unstructured aspects of the programme was low. Female gender, lower deprivation, White ethnicity, and a shorter time since diagnosis were associated with increased usage. Conclusions: This study is one of the first to provide detailed analysis of user engagement with a national digital self-management programme for type 2 diabetes. Usage of Healthy Living was generally low, in line with other digital self-management programmes. 
However, encouraging increased usage of the programme has the potential to lead to better health outcomes in people with type 2 diabetes.
Integrated Theory, Better Outcomes: A 25-Year Systematic Review of Digital Information Technology (IT)-Based Behavior-Change Tools
Date Submitted: Dec 15, 2025

Open Peer Review Period: Dec 15, 2025 - Feb 9, 2026
This manuscript needs more reviewers Peer Review Me
Background: Over the past quarter-century, designers of digital behavior change tools have increasingly blended constructs from multiple theories, yet the extent to which such integration enhances intervention outcomes remains unclear. Objective: To clarify this relationship, this study systematically reviewed literature published between 1999 and 2025, focusing on IT-mediated interventions that explicitly combined at least two behavioral theories and reported intention or behavior outcomes. Methods: Following a registered protocol (PROSPERO CRD42022285741) and PRISMA guidelines, searches across seven databases identified 62 eligible studies. Results: Most investigations were quantitative (77%), featured sample sizes from 16 to 8840, and lasted under 6 months; only 9 applied randomized controlled designs. Twenty-nine theories appeared, with Self-Determination Theory (35%) and the Theory of Planned Behavior (29%) being the most prevalent, often paired with the Technology Acceptance Model or Task-Technology Fit. Integrated models consistently outperformed their single-theory counterparts. Health care and fitness interventions dominated (44%), followed by online learning (23%) and mobile commerce (11%), but long-term follow-ups and explicit mappings of theory to behavior change techniques were scarce, and overall risk-of-bias ratings were moderate. Conclusions: Findings indicate that integrated theoretical frameworks deliver measurably superior behavioral outcomes in digital environments, yet evidence remains short-term and health centric. Future research should extend evaluation horizons beyond 6 months, diversify application domains, apply more rigorous randomized designs, and articulate more transparently how theoretical constructs guide specific intervention techniques to advance replicable, theory-driven digital solutions.
Centering equity during health technology innovation: a scoping review of methods and research adjustments to promote inclusive co-production
Date Submitted: Dec 16, 2025

Open Peer Review Period: Dec 15, 2025 - Feb 9, 2026
Peer Review Me
Background: Digital health has the potential to mitigate health inequity for priority populations who are underserved or marginalised by the health system. However, there is a lack of practical guidance on how to include priority communities in the co-production of digital health technologies, particularly across the entire lifecycle of innovation including research, development, and evaluation. Objective: The aim of this scoping review was to systematically identify and assess published methods used during digital health innovation to promote equitable inclusion of priority communities at every stage of the CeHRes Roadmap for Digital Health Technologies. Methods: This review was based on the Arksey and O’Malley framework for scoping reviews. A 6-stage framework was used to execute the review. To increase the trustworthiness of the findings, an expert advisory group was consulted and their feedback incorporated into the final manuscript. The Participant, Concept and Context (PCC) framework was used to structure the inclusion criteria. Results: The review identified a total of 106 articles, 58 methods, 4 approaches, and 17 research adjustments utilised to co-produce digital health technologies with priority communities. Common methods across multiple stages included interviews, focus groups, surveys and workshops; however, the most accessible way to make equity a practical reality during health technology innovation is to appoint a priority population community advisor, or advisory group, from project inception to project closure. Visual and creative methods like photovoice, home tours and body-mapping were also employed, often by priority population researchers themselves. Research adjustments that promote patient safety and comfort, enhance literacy and peer support, and recognize socio-cultural and demographic considerations have been employed to increase the inclusion of priority populations during digital health innovation. 
Conclusions: Embedding equity is possible using the practical methods and research adjustments identified to promote inclusive co-production. Professionals working across healthcare, health informatics, research, digital health, and technology development can utilise these findings to centre digital health equity during technology innovation. This research also recognises that co-production must draw on epistemological frameworks, or ways of thinking, which support Indigenous and other priority population knowledge systems. A solely Western lens risks reinforcing structural barriers and overlooking essential knowledge, as demonstrated by this review when the search strategy missed key scholarly works by priority population authors themselves.
The Kid’s Trial: Methods and reflections from co-creating and conducting an online, randomized trial with 7 to 12-year-old children.
Date Submitted: Dec 9, 2025

Open Peer Review Period: Dec 10, 2025 - Feb 4, 2026
Peer Review Me
Background: Limited public understanding of randomized controlled trials (RCTs) hinders recruitment, retention, and confidence in research. Early exposure to trial concepts may strengthen health literacy and research engagement. The Kid’s Trial was a global, decentralized, child-led study that co-created and conducted an RCT to help children understand trials, their importance, and improve critical thinking. Objective: This paper presents its design, outcomes, and methodological reflections. Methods: The Kid’s Trial employed a dedicated website with study materials guiding children through each step of designing and conducting an RCT. Each step was linked to an online survey. Materials were co-developed with two patient and public involvement groups of children and parents. Any child, aged 7 to 12 years, could take part in as many or as few steps as desired. Recruitment combined online and offline strategies, and engagement and self-reported learning were descriptively analyzed. The co-created REST (Randomized Evaluation of Sleeping with a Toy or Comfort Item) trial was a two-arm, pragmatic RCT comparing one week of sleeping with versus without a comfort item. The primary outcome was sleep-related impairment, and the secondary outcome was overall sleep quality. Analyses followed an intention-to-treat approach using mixed-effects models adjusted for baseline measures. Results: Overall, 224 children from 15 countries participated in at least one step. Participation varied: 37% (n = 82) completed one step and 21% (n = 48) completed six. The REST trial randomized 139 children, with 73% (n = 101) completing outcome surveys. Adjusted mean differences (intervention–control) were −0.53 for sleep-related impairment (95% CI −3.40 to 2.34; P=.71) and 0.28 for sleep quality (95% CI 0.01 to 0.55; P=.04), a small, uncertain difference not supported with sensitivity analyses. Post-study responses (n = 20) indicated improved understanding of RCT concepts. 
Conclusions: The Kid’s Trial demonstrates the feasibility of a decentralized, child-led RCT co-created through participatory citizen-science methods. Children can meaningfully contribute to trial design and conduct, and experiential participation may foster early trial literacy and critical thinking. Future studies should enhance engagement through community partnerships, shorter intervals between steps, and embedded learning assessments to improve inclusivity and retention.
Brief internet-based workplace training program to support help seeking for mental ill-health: Results of the Helipad cluster randomized controlled trial
Date Submitted: Dec 8, 2025

Open Peer Review Period: Dec 8, 2025 - Feb 2, 2026
Peer Review Me
Background: Depression and anxiety are prevalent in working-age adults. Although treatment provided by health professionals can improve symptoms and functioning, many people experiencing mental ill-health do not seek help. There have been very few effective interventions to improve help seeking in adults, with none implemented across diverse workplaces through online delivery. Objective: The primary aim of this trial was to test the effectiveness of a co-designed program for increasing professional help seeking intentions in Australian employees, relative to an active control condition. Methods: A triple-blinded two-arm cluster randomized controlled trial (N=487, control workplaces=26, intervention workplaces=25) was conducted to compare the effectiveness of Helipad, a fully automated, co-designed, single-session interactive program (intervention condition), with that of a standard psychoeducation program (active control condition). Workplaces (clusters) were recruited via advertising or invited directly by researchers. Participants completed a pre-test, immediate post-test, and 6-month follow-up survey sent via email assessing help-seeking intentions (primary outcome), mental illness stigma, mental health literacy, help seeking attitudes and behavior, work and activity functioning, quality of life, and symptoms of depression, anxiety, and general psychological distress. Results: A significant difference in change over time on professional help seeking intentions was found between the two conditions (F(2, 185.44)=6.89, P=.001), with planned contrasts showing that the Helipad program was effective in increasing professional help seeking intentions compared with the control at the primary endpoint of immediate post-test (t(359.35)=-3.72, P<.001). This difference was not maintained at the 6-month follow-up (t(119.76)=-1.05, P=.295). Retention rates were 71.1% at post-test and 24.9% at follow-up. 
The Helipad program was also associated with improved mental health literacy and help seeking attitudes at post-test. Helipad was not significantly superior to the control in reducing mental illness stigma or improving help seeking behavior, functioning, quality of life, or symptoms of depression, anxiety or general psychological distress (secondary outcomes) at 6-month follow-up. Conclusions: This study demonstrated that the Helipad program was effective in improving the intentions of employees to seek help from a professional compared to an active control. The program also improved mental health literacy and help-seeking attitudes, but these changes were not sustained and did not translate into observable differences in help-seeking behaviors or mental health symptoms. Selective interventions may be needed to demonstrate behavioral outcomes, and programs may be more effective when paired with organizational interventions. Clinical Trial: Australian New Zealand Clinical Trials Registry (ANZCTR) ACTRN12623000270617p
Simulating the Patient's Perspective: Promise and Pitfalls of LLMs in Patient-Centric Communication
Date Submitted: Dec 8, 2025

Open Peer Review Period: Dec 8, 2025 - Feb 2, 2026
Peer Review Me
Background: Large Language Models (LLMs) have shown broad applicability in medicine, including the generation of clinical documents. Beyond content creation, LLMs can also be used to evaluate the quality of medical documents. Because of LLMs' ability to simulate (or impersonate) specific personas, they can offer diverse perspectives (such as those of healthcare professionals versus patients with lower health literacy) on the clarity of medical texts. Objective: The primary objective of this research was to evaluate the ability of LLMs to simulate diverse user personas, varying by demographic profiles including educational background, gender, and visit frequency, for the task of interpreting ICU discharge summaries. The study aimed to benchmark the clarity assessments generated by these LLM personas against a baseline established by human participants with corresponding backgrounds, in order to highlight the potential and limitations of using current LLMs to create personalized health information. Methods: We evaluated the ability of LLMs to simulate diverse user personas for the task of interpreting ICU discharge summaries. LLMs were prompted to adopt personas with varied demographic profiles, including different educational backgrounds. The resulting LLM-generated assessments of the summaries’ clarity were then benchmarked against a baseline established by human participants with corresponding backgrounds. Results: LLMs demonstrated a strong ability to simulate personas based on educational attainment, accurately interpreting key medical information in 88% of cases. However, the models’ performance varied widely when other demographic variables were introduced. For instance, persona performance was highly erratic based on gender, with simulated male personas achieving 97% accuracy while female personas achieved only 44%. The inclusion of additional details, such as the frequency of prior emergency room visits, further degraded the models' performance. 
Conclusions: This research highlights both the potential and the significant limitations of using LLMs to create personalized health information. While LLMs are promising for simulating user perspectives based on education, the current models exhibit unpredictable performance when tasked with incorporating other fundamental demographic traits like gender.
Determinants of Digital Health Literacy in the United Kingdom: a Cross-sectional Study Using Nationally Representative Data
Date Submitted: Dec 7, 2025

Open Peer Review Period: Dec 7, 2025 - Feb 1, 2026
Peer Review Me
Background: Digital health literacy (DHL), the ability to seek, understand, and apply digital health information, has become increasingly important in the United Kingdom (UK), with a focus on digital transformation within the health service. While digital tools offer the potential to improve access and equity, they may also exacerbate existing health inequities if segments of the population are unable to engage with them effectively. Understanding the determinants of DHL is essential to designing inclusive digital health services. Objective: To measure DHL among adults in the UK and identify its sociodemographic, economic, and social determinants. Methods: A cross-sectional online survey was disseminated to adults in the UK in December 2024. DHL was self-reported using the validated eHealth Literacy Scale (eHEALS), which ranges from eight to 40. eHEALS score was dichotomized into high and low DHL based on a cut-off of 26. A multivariable logistic regression model was built to identify sociodemographic, economic, and social determinants of DHL. Results: The median eHEALS score was 31; 21% of participants had a low level of DHL, while 79% had a high level of DHL. Those aged 45–64 and 65 years and older, compared to the 18–45 age group, had 1.61 and 1.98 times the odds of low DHL, respectively (45–64 years odds ratio [OR]: 1.61, 95% confidence interval [CI]: 1.13 to 2.31, P=.01; 65 years and older OR: 1.98, 95% CI: 1.36 to 2.91, P<.001). Females had 0.55 times the odds of low DHL (OR: 0.55, 95% CI: 0.42 to 0.74, P<.001), and those with an undergraduate or postgraduate degree or higher had lower odds of low DHL, compared to those educated to below degree level (undergraduate degree OR: 0.49, 95% CI: 0.33 to 0.71, P<.001; postgraduate degree or higher: OR: 0.48, 95% CI: 0.32 to 0.71, P<.001). Conclusions: Among the UK population, male sex, lower educational attainment, and older age were significant predictors of low DHL. 
Inclusive educational interventions and digital health solutions, tailored towards individuals with low DHL, are needed to ensure that digital transformation in healthcare helps to narrow health inequities.
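The scoring step described in the Methods above can be illustrated with a minimal sketch. It relies on the abstract's statement that eHEALS totals range from 8 to 40 and were split at a cut-off of 26, plus the standard eHEALS format of 8 items each rated 1 to 5. The abstract does not say on which side a score of exactly 26 falls, so treating 26 and above as "high" is an assumption, and the function names are hypothetical.

```python
EHEALS_ITEMS = 8            # the eHEALS has 8 items
ITEM_MIN, ITEM_MAX = 1, 5   # each item is rated 1-5, so totals span 8 to 40


def eheals_total(item_scores):
    """Sum the 8 item ratings after checking that they are in range."""
    if len(item_scores) != EHEALS_ITEMS:
        raise ValueError("eHEALS requires exactly 8 item scores")
    if any(not ITEM_MIN <= s <= ITEM_MAX for s in item_scores):
        raise ValueError("each item score must be between 1 and 5")
    return sum(item_scores)


def dhl_level(total, cutoff=26):
    """Dichotomize a total score.

    ASSUMPTION: totals at or above the cut-off count as "high";
    the abstract only states that the cut-off was 26.
    """
    return "high" if total >= cutoff else "low"


if __name__ == "__main__":
    total = eheals_total([4, 4, 4, 4, 4, 4, 4, 3])  # hypothetical respondent
    print(total, dhl_level(total))
```

A respondent at the reported median score of 31 would be classified as having high DHL under either reading of the cut-off.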
Mixed-Delivery Mode Reminder Strategy to Tackle Down Low Participation in e-Cohorts: Randomized-Controlled Trial Nested Within Le French Gut e-Cohort
Date Submitted: Nov 28, 2025

Open Peer Review Period: Nov 27, 2025 - Jan 22, 2026
Peer Review Me
Background: E-cohorts are susceptible to low participation rates, undermining representativeness. Frequent reminders can be a cost-effective strategy to increase response rate. However, their effectiveness may vary depending on the delivery mode, the content, and the formatting of messages. Objective: To compare different reminder strategies (i.e., emails and/or text messages with standard and/or institutional formatting) and evaluate their effectiveness in terms of response rate, in the context of a population-based e-cohort of healthy adults. Methods: We conducted a 4-arm randomized-controlled trial nested in Le French Gut e-cohort (registration number 2021-A01439-32). In November 2024, we included participants who were enrolled online (SKEZIA platform) but not yet active participants (i.e., eligibility, consent form, and/or personal information questionnaire not completed). Randomization was stratified on time since enrolment. We sent three reminders, 72 hours apart, following four experimental designs: (Group 1) standard emails only; (Group 2) text messages only; (Group 3) institutional emails only; and (Group 4) standard email, followed by text message and institutional email. Our primary outcome was the completion rate of the personal information questionnaire. We also measured the completion time (i.e., time between reminder received and questionnaire fully completed), online login rate, login time, email opening, and click-through rates. Results: At the end of the trial, out of 20,487 eligible participants, 19,525 received at least one reminder. The per-protocol completion rate was 8.4%, with a higher rate in Group 4 (9.4% vs 7.4%, 8.4% and 8.3% for Groups 1, 2, and 3, respectively; P <.001). Completion time was faster in Group 2 (mean ± SD, 6.3 ± 4.3 days) compared to Groups 3 and 4 (7.0 ± 3.5 and 7.2 ± 3.8 days; P = .003). 
Online login rate was higher and login time faster in Group 4 (rate: 15.6% vs 12.1%, 14.3% and 15.3% for Groups 1, 2, and 3, respectively; P <.001; and time: 7.1 ± 4.3 days vs 6.7 ± 4.3, 6.2 ± 4.6, 6.8 ± 3.9 for Groups 1, 2, and 3, respectively; P = .002). For emails, opening rates were similar (P = .87) but click-through rate was higher for institutional emails (23.1% vs 19.6% for standard formatting; P <.001). Conclusions: A mixed-delivery mode strategy, combining emails and text messages, effectively increases response rate by 27% compared to other strategies. Institutional emails with a plain design, signed by study coordinators, seemed more appealing to participants than a more elaborate design. Clinical Trial: registration number 2021-A01439-32
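As a reading aid for the 27% figure in the Conclusions, the short sketch below computes the relative increase in completion rate for Group 4 against each other group, using only the per-protocol percentages reported in the Results. The function name is hypothetical, and treating the 27% as a relative (not absolute) difference is an interpretation, not something the abstract states.

```python
def relative_increase(new_rate, baseline_rate):
    """Relative change of new_rate over baseline_rate, in percent."""
    return (new_rate - baseline_rate) / baseline_rate * 100


# Per-protocol completion rates (%) as reported in the abstract.
group4_rate = 9.4
other_groups = {"Group 1": 7.4, "Group 2": 8.4, "Group 3": 8.3}

if __name__ == "__main__":
    for name, rate in other_groups.items():
        gain = relative_increase(group4_rate, rate)
        print(f"Group 4 vs {name}: {gain:.1f}% relative increase")
```

Under this reading, the 27% figure corresponds to the comparison with Group 1 (standard emails only); the relative gains over Groups 2 and 3 come out at roughly 12% and 13%.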
Channel Allocation and Equity in Preventive Campaigns for Older Adults: Agent-Based Simulation Study
Date Submitted: Nov 25, 2025

Open Peer Review Period: Nov 26, 2025 - Jan 21, 2026
Peer Review Me
Background: Preventive campaigns for older adults must decide how to allocate limited resources across media channels. However, these channel allocation and budget decisions rarely use explicit criteria for distributional equity or digital health strategic planning. As a result, health systems may optimize average uptake while leaving large gaps across socioeconomic groups and media-use profiles. Objective: This study aimed to develop and apply a data-driven agent-based model as a strategic planning tool for older-adult preventive campaigns, comparing channel allocation, personalization, and loss framing options under explicit budget and equity constraints. Methods: We built an agent-based simulation calibrated to national survey data on influenza vaccination and routine health screening among older adults in South Korea. Fifteen prespecified campaign scenarios varied channel allocation across television (TV), digital, and print; total exposure budgets; two equity-focused personalization strategies; and graded loss framing. Primary outcomes were final adoption and time to adoption. Equity outcomes included the minimum class-level adoption and the 90–10 gap across latent classes. Each scenario was simulated over 12 monthly steps with 100 Monte Carlo replications. We also compared scenario portfolios using logistic and clipped-linear link functions and varied the balance of media versus social reinforcement weights, the social reinforcement threshold, and network realizations in sensitivity analyses. Results: TV-only and high-budget strategies produced some of the highest mean adoption rates for both vaccination and screening but often failed to meet equity guardrails for minimum class coverage and between-class gaps. In contrast, personalization strategies that modestly reweighted exposure toward the lowest-uptake class or assigned class-tailored channel portfolios maintained or improved mean adoption. 
These strategies also substantially raised minimum class-level coverage and narrowed disparities. When efficiency and distributional equity were considered jointly, these personalized portfolios emerged as the most attractive options under fixed budget constraints. Loss framing acted as a secondary tuning lever: within the tested range, stronger loss framing yielded small, monotonic gains in adoption and shorter time to adoption without worsening equity metrics. Scenario rankings were stable across sensitivity analyses, suggesting that the main patterns reflected underlying diffusion dynamics rather than any single modeling choice. Conclusions: This agent-based simulation shows how ex ante planning for preventive campaigns can move beyond intuition by comparing channel allocation and personalization options under explicit equity and budget criteria. For campaigns targeting older adults, modest equity-oriented personalization of TV and digital exposure improved or preserved mean uptake. It also consistently improved distributional equity, whereas diversified channel mixes without personalization were less efficient and less equitable. These findings support integrating equity guardrails and channel-allocation guardrails into early-stage campaign design and prioritizing targeted personalization over simple channel diversification. Future work should validate these patterns in other populations and health systems and link simulated diffusion trajectories with observed exposure and engagement in real-world digital and traditional-media campaigns.
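The two equity outcomes named in the Methods (minimum class-level adoption and the 90–10 gap across latent classes) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the 90–10 gap, so it is taken here as the difference between the 90th and 10th percentiles of class-level adoption rates, with linear interpolation. The class names and adoption values are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def equity_metrics(class_adoption):
    """Return the minimum class-level adoption and the 90-10 gap.

    ASSUMPTION: the 90-10 gap is the 90th minus the 10th percentile
    of class-level adoption rates; the abstract gives no definition.
    """
    rates = list(class_adoption.values())
    return {
        "min_class": min(rates),
        "gap_90_10": percentile(rates, 90) - percentile(rates, 10),
    }


if __name__ == "__main__":
    # Hypothetical final adoption rates per latent class (not study data).
    scenario = {"class_a": 0.40, "class_b": 0.55, "class_c": 0.60,
                "class_d": 0.70, "class_e": 0.80}
    print(equity_metrics(scenario))
```

A scenario would pass an equity guardrail of this kind when `min_class` stays above a chosen floor and `gap_90_10` stays below a chosen ceiling; the thresholds themselves are not given in the abstract.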
“PrEP Saves Lives!”: A Content Analysis of PrEP-Related Messages Across Facebook, Instagram and Twitter
Date Submitted: Nov 11, 2025

Open Peer Review Period: Nov 25, 2025 - Jan 20, 2026
Peer Review Me
Interventions are sorely needed to address the lack of PrEP awareness and mitigate barriers related to PrEP use. One such intervention modality is social media, as PrEP awareness and communicating issues, such as access and cost, are easily addressable via clear social media messages on platforms PrEP-eligible people, and especially young people, use frequently. This study seeks to extend understanding of PrEP awareness and usage by examining PrEP-related communication across 3 popular social media platforms (Facebook, Instagram, and Twitter), and identifying message and source characteristics. In February 2023, we used CrowdTangle (a public-insights tool owned by Facebook, now known as Meta) to gather a total of 39,790 Facebook posts and 5,628 Instagram posts. We also used Twitter’s public API to collect 14,061 Twitter posts during the same time frame. Of these, we drew a random sample of social media posts from each platform [Facebook (N = 1,000), Instagram (N = 1,000), and Twitter (N = 811)] in February 2023 and analyzed them using a quantitative content analysis. Our findings showed some differences in the type of text-based content most likely to appear on each platform. We also uncovered similar patterns across all 3 platforms. Across all platforms, we observed that definitions of and indications for PrEP were the most common type of text-based content in posts likely to be shared, information about PrEP appearing in social media posts did not seem to draw from traditional sources, and men who have sex with men (MSM) represented the most frequently mentioned target population. Although our study did not detect a large presence of theory-based concepts from behavior change theory such as the reasoned action approach (RAA), across all platforms, attitude emerged most frequently, followed by self-efficacy. These findings shed light on the PrEP-related beliefs shaping young people’s perceptions and engagement. 
Such insights can guide the design of future social media–based messages, targeting the most influential beliefs to strengthen HIV prevention efforts. They also provide a foundation for advanced machine learning models capable of predicting and explaining the diffusion potential of PrEP-related content.
“Like taking part in Star Wars”: A thematic analysis of acceptability and experiences of older adults participating in remote longitudinal sleep and dementia research
Date Submitted: Nov 19, 2025

Open Peer Review Period: Nov 19, 2025 - Jan 14, 2026
This manuscript needs more reviewers Peer Review Me
Background: Sleep disturbance is a common symptom of and potential risk factor for neurodegeneration. Remote sleep and cognitive assessments offer promise for monitoring symptoms and treatment response from patients’ homes, but the acceptability of remote sleep and circadian technology in older adults with and without cognitive impairment is not known. Objective: This qualitative study was designed to explore and describe the barriers, facilitators, and user experience of older adults with mild cognitive impairment and dementia and cognitively unimpaired older adults who participated in a longitudinal sleep and memory study designed around remote monitoring technologies. Methods: Patients with mild cognitive impairment or dementia due to probable Alzheimer’s disease or Lewy body disease and age-matched controls participated in a longitudinal remote study involving multimodal assessments of sleep and cognition including actigraphy, wireless electroencephalography, a smartphone app, web-based cognitive tasks, and serial saliva samples. Participants were asked for feedback via questionnaires during the study and invited to complete end-of-study interviews about their experiences. Questions were informed and thematic analysis was guided by the Capability, Opportunity, Motivation – Behaviour model of behaviour change and the extended Unified Theory of Acceptance and Use of Technology and focused on perceived barriers and facilitators. Results: The study identified six key themes. The first theme, ‘motivations to participate’, highlighted how participants felt the research could be helpful to themselves and others. The second theme, ‘navigating the user experience of devices’, identified comfort, security, privacy, ease of use, and reliability as fundamental in determining acceptability. 
‘Adjusting over time to study participation’, the third theme, covered changing perceptions with increased exposure and familiarity, and the importance of convenience, flexibility, and developing a routine. The fourth theme explored ‘social support as a facilitator and barrier to research participation’, looking at the influence of both the research team and relatives supporting at home. A fifth theme of ‘adherence, accuracy, and getting it right’ was also identified, as participants were motivated to provide good quality data for the study. Finally, we identified a sixth theme surrounding participants’ ‘reflections, realities, and uncertainties around sleep’, which focused on sleep hygiene and common sleeping problems in older adults, such as snoring and nocturnal awakenings. Conclusions: Older adults with and without cognitive impairment were motivated to engage in longitudinal remote sleep research, follow remote research protocols, and produce good quality data. Acceptability was related to burden and convenience, usability, and emotional responses to study tasks. When study tasks are repeated over time, care should be taken to introduce variety where possible to avoid fatigue and frustration. Study partners offer essential support for some participants, but requiring a study partner may also be an unnecessary barrier to research participation for others. Future studies should aim to identify effective strategies for recruiting diverse populations, particularly those with limited technology experience or from underserved communities, to ensure equitable participation and representation in research. Providing education on the importance of sleep for brain health and technology use may be beneficial.
Methods for Participant Verification in Social Media Recruitment for a Pilot Study of an mHealth App: Lessons Learned
Date Submitted: Nov 20, 2025

Open Peer Review Period: Nov 18, 2025 - Jan 13, 2026
Peer Review Me
Background: Web-based advertisements, specifically social media advertisements, are a popular recruitment avenue among research projects involving human participants. Social media recruitment has advantages over other methods (e.g., in-person recruitment), such as aiding teams in reaching the population of interest and increasing enrollment pace at a relatively low cost. Nonetheless, social media recruitment comes with the challenge of fraudulent responses, and therefore effective identity verification procedures must be put in place in order to maintain the integrity of the final sample and data. Objective: In this paper, we outline the identity verification methods (herein referred to as “checks”) used in the recruitment process for a pilot study featuring a mobile health (mHealth) intervention app for emerging adults (EAs; aged 18-25) who regularly use cannabis. Each identity verification check is examined for its rate of passing. Methods: Participants were recruited via social media advertisements that linked directly to a study eligibility screening survey. Advertisements were posted on Meta (Facebook and Instagram), Snapchat, and TikTok. Participants were enrolled if they met study inclusion criteria (e.g., aged 18-25, reported regular cannabis use), completed the baseline consent and survey, downloaded the app, and passed all identity verification checks. Identity verification checks happened at two checkpoints: directly following screening survey completion (e.g., geolocation check, duplicative IP address check, social media check) and directly following app download and login (duplicative device ID and/or push token check). Failing an identity verification check resulted in exclusion from the study. Results: Identity checks were non-exclusive such that a single eligible screening response could undergo multiple checks. 
Of the 573 eligible screening responses that went through the identity verification process, a total of 3,031 identity verification checks were completed. Of these 3,031 aggregate checks, 396 failed the verification criteria (13.1%), and therefore 396 of the 573 eligible respondents were excluded from continuation in the enrollment process (69.1%). Social media checks, wherein study staff ensured the individual’s public-facing account had personally relevant information, had the highest failure rate (61.5%). The second most common failed check was due to a duplicate device ID upon logging into the app (10.0%), followed by the geolocation check (4.9%), the duplicate IP address check (4.2%), the combination check (time zone; 4.1%), and duplicate push token check (3.2%). Conclusions: This paper describes a participant identity verification process for app-based mHealth studies using social media as a recruitment source. A combination of identity verification safeguards is suggested to maintain integrity of the study sample and data. Clinical Trial: ClinicalTrials.gov NCT05824754; University of Michigan IRB: HUM00222194
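The two headline percentages in the Results use different denominators (all checks versus all eligible responses), which is easy to misread. The sketch below reproduces them from the counts given in the abstract; the variable and function names are hypothetical.

```python
def pct(part, whole):
    """Share of part in whole, as a percentage."""
    return 100 * part / whole


# Counts as reported in the abstract.
eligible_responses = 573
checks_completed = 3031
checks_failed = 396
respondents_excluded = 396

if __name__ == "__main__":
    # Failure rate per check (denominator: all checks performed).
    print(f"Failed checks: {pct(checks_failed, checks_completed):.1f}%")
    # Exclusion rate per respondent (denominator: eligible responses).
    print(f"Excluded respondents: {pct(respondents_excluded, eligible_responses):.1f}%")
    # Average number of checks each eligible response went through.
    print(f"Checks per response: {checks_completed / eligible_responses:.1f}")
```

The per-check failure rate is low because each eligible response went through about five checks on average, while a single failed check was enough for exclusion.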
Feasibility and Usability of a Digital Perinatal Navigator for High-Risk Pregnancies: A Mixed-Methods Study
Date Submitted: Dec 7, 2025

Open Peer Review Period: Nov 18, 2025 - Jan 13, 2026
Peer Review Me
Background: The journey to parenthood involves significant physical, emotional, and psychosocial changes. Mental health challenges impact both maternal and fetal health, potentially leading to obstetric complications and developmental risks for children. Access to needed perinatal support is often limited due to individual and structural barriers. Digital health solutions can offer opportunities to provide low-threshold, personalized, and scalable support. We developed a digital navigator offering personalized guidance and connecting users to relevant support services with interactive follow-ups to self-assess their well-being. However, evidence regarding feasibility of digital solutions in high-risk patients is limited. Objective: The aim of the study was to assess the feasibility, usability, and preliminary effectiveness of a digital perinatal navigator app designed to provide personalized support and connect pregnant individuals to relevant health and social services. Methods: The study was conducted at the University Women’s Hospital Heidelberg to assess an app-based health service program. Using convenience sampling, eligible participants tested the perinatal guide for two weeks. A convergent mixed-methods design combined qualitative interviews (n=30) and psychometric questionnaires (n=35) to evaluate feasibility, usability and preliminary effectiveness. Statistical analysis included descriptive evaluations, paired t-tests, and Pearson correlations. Results: Participants (median age 33, median gestational age of 30 weeks) reported moderately high levels of stress, anxiety and depressive symptoms. Usability ratings were excellent (median SUS 80; MAUQ 105). Knowledge of HSPs increased significantly (mean +1.2 points, p<.01), with modest improvements in utilization. Qualitative analysis revealed key success factors such as intuitive structure, trustworthy medical content, and personalized information. 
Technical disruptions, navigation challenges, limited personalization, and incomplete regional integration of healthcare services were reported as barriers. Conclusions: The results indicate high feasibility and acceptance for our digital navigator in this high-risk population. The identified barriers are to be considered in the further development of the app and other perinatal digital care programs.
Translation, Cultural Adaptation, and Psychometric Validation of the Amharic eHealth Literacy Questionnaire (eHLQ): A Cross-Sectional Study
Date Submitted: Nov 14, 2025

Open Peer Review Period: Nov 17, 2025 - Jan 12, 2026
Peer Review Me
Background: eHealth interventions have demonstrated potential to address challenges related to health and the health care system in low- and middle-income countries. To effectively leverage eHealth in supporting health care in Ethiopia, the assessment and development of eHealth literacy of patients is essential. Objective: This study aimed to translate and culturally adapt the eHealth Literacy Questionnaire (eHLQ) to Amharic and assess its psychometric properties. Methods: We applied a systematic process of translation and cultural adaptation, including forward and backward translation, expert review, and cognitive interviews. Then we conducted a cross-sectional questionnaire-based study using a convenience sample (N=300) of patients with internet access at the primary health care level between January and March 2025 in the capital and a larger city of Ethiopia. Internal consistency was assessed using Cronbach α and McDonald ω. Factor structure was assessed using Confirmatory Factor Analysis. Convergent and discriminant validity were examined by calculating Spearman correlations between each eHLQ scale and the total score of the eHealth Literacy Scale (eHEALS). Results: A total of 300 participants were included in the analysis. The mean age was 30.4 years (SD 6.8; range 18–55), and 69.7% (209/300) were women. Internal consistency was acceptable for all scales (Cronbach α=0.72–0.91; McDonald ω=0.79–0.96), except for Scale 4 (α=0.62; ω=0.70). The 7-factor model showed satisfactory fit, with a Comparative Fit Index of 0.97, Tucker-Lewis Index of 0.97, and Standardized Root Mean Square Residual of 0.07. Factor loadings exceeded 0.40 for all items except one. Strong correlations between Scales 1–3 and eHEALS (range r=0.69–0.74) supported convergent validity, while moderate correlations between Scales 5–7 and eHEALS (range r=0.66–0.67) indicated limited discriminant validity. 
Conclusions: The Amharic eHLQ demonstrated generally satisfactory psychometric properties and can be considered a valid tool for assessing eHealth literacy among patients with internet access in Ethiopia, marking the first validation of the eHLQ in Sub-Saharan Africa. Future studies could provide additional evidence to substantiate the psychometric robustness of Scale 4 (“Feeling Safe and in Control”). Overall, the Amharic eHLQ can support the development of tailored eHealth interventions in Ethiopia.
