Review
Abstract
Background: Conversational agents (CAs) are increasingly used as a promising tool for scalable, accessible, and personalized self-management support of people with a chronic disease. Studies of CAs for self-management of chronic disease operate within a multidisciplinary domain: self-management originates from (behavioral) psychology and CAs stem from intervention technology, while diseases are typically studied within the biomedical context. To ensure their effectiveness, structured evaluations and descriptions of the interventions, integrating biomedical, behavioral, and technological perspectives, are essential.
Objective: We aimed to examine the design and evaluation of CAs for self-management support of chronic diseases, focusing on their characteristics, integration of behavioral change techniques, and evaluation methods. The findings will guide future research and inform intervention design.
Methods: We conducted a systematic search in the PubMed and Embase databases to identify studies that investigated CAs for chronic disease self-management, published from January 1, 2018, to April 15, 2024. Full-text journal articles, published in English, studying the efficacy or effectiveness of a CA in the context of self-management for chronic diseases in adults were included. Data extraction was guided by conceptual frameworks to ensure comprehensive reporting of intervention and methodologies: the behavioral intervention technology model and the CONSORT-EHEALTH (Consolidated Standards of Reporting Trials of Electronic and Mobile Health Applications and Online Telehealth) checklist. Risk of bias was assessed using the Risk of Bias 2 tool and the Risk of Bias in Non-randomized Studies-of Interventions (ROBINS-I) tool (version 2).
Results: In total, 25 studies were included, primarily focusing on text-based, rule-based CAs delivered via mobile apps. The chronic diseases predominantly targeted were diabetes and cancer. Commonly identified clusters of behavior change techniques were “shaping knowledge,” “feedback and monitoring,” “natural consequences,” and “associations.” However, reporting of behavior change techniques and their delivery was lacking, and intervention descriptions were limited. Studies were mostly in the early phase, with a great variety in intervention descriptions, study methods, and outcome measures.
Conclusions: Advancing the field of CA-based interventions requires transparent intervention descriptions, rigorous methodologies, consistent use of validated scales, standardized taxonomy, and reporting aligned with standardized frameworks. Enhanced integration of artificial intelligence–driven personalization and a focus on implementation in health care settings are critical for future research.
doi:10.2196/72309
Introduction
Background
Chronic diseases are the leading cause of mortality and morbidity and impose a significant global burden on patients, their families, and the health care system [,]. Tackling these challenges requires strategies to empower patients in self-managing their disease effectively. Self-management is a promising strategy to treat chronic diseases, enabling patients to actively identify and solve illness-related problems to prevent complications or reduce disability []. Self-management interventions improve health behavior, health outcomes, and quality of life (QoL) and reduce health care use [,]. Conversational agents (CAs), web- or app-based computer programs that engage in 2-way interactions by simulating humanlike conversations, have emerged as a promising tool for self-management support [].
Self-management refers to an individual’s ability to manage symptoms, treatments, physical and psychosocial consequences, and lifestyle changes []. Self-management involves role, medical, and emotional management tasks and requires problem-solving, decision-making, resource use, patient–health care provider partnership formation, action planning, and self-tailoring skills []. To deploy these skills, patients should be knowledgeable about their illness, symptoms, and available treatments, including self-care options, to make appropriate decisions about their disease management []. CAs facilitate self-management by providing knowledge (eg, answering questions about the disease []), promoting self-management skills (eg, decision-making based on self-monitoring data), and assisting with self-management tasks (eg, by offering emotional support []). CAs can serve as multicomponent technology-based behavior change interventions by incorporating behavior change techniques (BCTs). BCTs are observable, replicable, and irreducible components designed to alter or redirect causal processes that regulate behavior []. For example, a CA designed to support mental well-being in people with chronic headaches contains the BCT “goals and planning,” and the technology delivers this technique to the user by helping them plan a mindfulness-based activity during the next week []. Early evidence suggests that CAs hold significant promise for supporting self-management of chronic diseases, showing acceptability, usability, and potential efficacy for improving self-management [,,].
CAs are present in diverse forms and functionalities, ranging from systems with pre-programmed responses to more sophisticated embodied artificial intelligence (AI) agents. The field of health care CAs is rapidly evolving, especially with the new developments in AI. A key development is the emergence of large language models (LLMs), deep learning models designed to process and generate natural language text. LLMs excel in generating meaningful humanlike creativity, reasoning, and contextually appropriate language [,], allowing users of CAs to converse with the CA as they would with other humans []. Well-known examples of CAs that use LLMs are ChatGPT and, more recently, DeepSeek. The rise of AI may further enhance the potential of CAs as self-management interventions.
CAs for chronic disease self-management operate in a multidisciplinary context, integrating intervention technology, (behavioral) psychology, and medical or biomedical sciences. Therefore, rigorous evaluation of their effectiveness in chronic disease management is complex, and a consensus on the best approach is lacking [,,,]. A comprehensive perspective that incorporates frameworks from all relevant disciplines is essential for effective evaluation []. Also, previous reviews typically focused on specific subsets of CAs—such as text-based [], voice-based [], AI-driven [,,] or embodied [] agents—resulting in an incomplete understanding of intervention content and design.
This Study
A previous scoping review [] outlined various types of CAs used in chronic condition management, along with their characteristics and study designs. Expanding on this work, we adopt a multidisciplinary perspective by integrating frameworks from diverse domains, including the CONSORT-EHEALTH (Consolidated Standards of Reporting Trials of Electronic and Mobile Health Applications and Online Telehealth) checklist, the behavioral intervention technology (BIT) model, and BCTs. Through this synthesis, we provide a comprehensive overview of CAs developed for chronic disease self-management support and their characteristics and evaluation approaches. The findings offer practical insights for developers and researchers, ultimately contributing to the design of robust, evidence-based CA health interventions.
Methods
We adhered to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for conducting systematic reviews () []. Our protocol was not registered.
Search Strategy
The search was performed in the PubMed and Embase databases, covering papers published between January 1, 2018, and April 15, 2024. We prioritized sensitivity over specificity to ensure that all potentially relevant studies were included. Therefore, we based our search string on synonyms for CA and derived no search terms from “self-management” and “chronic disease,” as these are broad concepts that are hard to capture within a set of terms. Our approach resulted in the following search string for PubMed: “Conversational robot*” [tiab] OR “conversational agent*” [tiab] OR “chat bot*” [tiab] OR “chat agent*” [tiab] OR “conversationalagent*” [tiab] OR “chatbot*” [tiab] OR “chatterbot*” [tiab] OR “virtual agent*” [tiab] OR “virtual robot*” [tiab] OR “relational agent*” [tiab] OR “conversational assistant*” [tiab] OR “speech assistant*” [tiab] OR “chat assistant*” [tiab] OR “virtual assistant*” [tiab] OR “chatassistant*” [tiab] OR “AI agent*” [tiab] OR “dialogue system*” [tiab] OR “voice assistant*” [tiab] OR “ai agent*”[Title/Abstract] OR “dialog system*”[Title/Abstract] OR “voice assistant*”[Title/Abstract]. The search string was adjusted for use in Embase. The initial search was performed in PubMed on May 1, 2023, and an updated search in PubMed was performed on April 15, 2024. To improve the comprehensiveness of our search, we added “AI agent,” “dialogue system,” and “voice assistant” to our search string in PubMed on March 25, 2025, and repeated our full search in the Embase database on March 27, 2025.
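As an illustration of how a synonym-based search string of this kind can be assembled reproducibly, the following minimal Python sketch joins the CA terms listed above with OR and a title/abstract field tag. The helper name `build_pubmed_query` and the case-insensitive deduplication step are assumptions made for illustration; they are not part of the reported search procedure.

```python
# Illustrative sketch only: assemble a title/abstract query from CA synonyms.
# The terms are copied from the search string reported above; the helper
# name and the deduplication step are hypothetical.
CA_TERMS = [
    "conversational robot*", "conversational agent*", "chat bot*", "chat agent*",
    "conversationalagent*", "chatbot*", "chatterbot*", "virtual agent*",
    "virtual robot*", "relational agent*", "conversational assistant*",
    "speech assistant*", "chat assistant*", "virtual assistant*",
    "chatassistant*", "AI agent*", "dialogue system*", "dialog system*",
    "voice assistant*",
]


def build_pubmed_query(terms, field="tiab"):
    """Join terms with OR, tagging each with a field and skipping repeats."""
    seen = set()
    unique_terms = []
    for term in terms:
        key = term.lower()  # treat "AI agent*" and "ai agent*" as the same term
        if key not in seen:
            seen.add(key)
            unique_terms.append(term)
    return " OR ".join(f'"{term}" [{field}]' for term in unique_terms)


query = build_pubmed_query(CA_TERMS)
print(query)
```

Keeping the terms in a list makes later additions, such as the terms added in March 2025, easy to track and rerun.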
Study Selection Criteria
Inclusion criteria were (1) full-text journal articles; (2) published in English; (3) published from 2018 to April 15, 2024, to account for rapid technological advancements; (4) including a CA; (5) in the context of self-management; (6) for chronic diseases; (7) in adults; and (8) reporting primary research findings evaluating efficacy, including pre- and postuse measures.
Chronic disease was defined, according to the Centers for Disease Control and Prevention, as “conditions that last 1 year or more and require ongoing medical attention or limit activities of daily living or both.” [] Our selection on self-management was guided by the definition of Barlow et al [] as an individual’s ability to manage symptoms, treatments, physical and psychosocial consequences, and lifestyle changes. We added the criterion that CAs should at least offer information or education, as these components are essential for developing the knowledge and skills necessary to manage chronic diseases effectively []. Studies investigating one-time use of a CA, virtual reality agents, mental health, addiction, or neurodiversity were considered out of the scope of this review and were therefore excluded.
Study Selection
One researcher (TFP) screened the studies’ eligibility based on title and abstract, rating them as “potentially relevant” or “not relevant.” Doubtful cases were reviewed by 2 other researchers (SWvdB and NMdV). Subsequently, the full text of potentially relevant articles was assessed by one researcher (TFP). Uncertainties were discussed with the 2 other researchers (SWvdB and NMdV).
Data Extraction and Synthesis
Overview
One researcher (TFP) extracted the data in a standardized form using the abstract, main text, supplemental material, and previous publications describing the CA under study if referred to by the authors. To generate a comprehensive overview of the CA description and study methods, we used the BIT model [] and CONSORT-EHEALTH checklist []. The BIT model defines the conceptual and technological architecture of BITs. CAs for self-management can be considered as BITs, as they function as mobile health or eHealth interventions that support users in changing behaviors and cognitions. The model includes reporting on the intervention aims (the “why”), behavior change strategies (the conceptual “how”), by what elements these strategies are implemented in technology (the “what”), when an intervention component is delivered (the “when”), and intervention characteristics to meet the user’s needs and preferences (the technical “how”). The CONSORT-EHEALTH checklist was developed to improve reporting of eHealth trials and serves as a basis for evaluating the validity and applicability of eHealth trials.
Characteristics of the CA
We described CAs using the “technical how” component of the BIT model and additional characteristics based on the CONSORT (Consolidated Standards of Reporting Trials) checklist, extracting data on mode of delivery, access, embodiment status, conversational techniques, in- and output modalities, and recommended dose.
Mode of delivery refers to how the CA is delivered to the users, such as via an app. Method of access reflects how users access the mode of delivery, such as by downloading the app through a link provided via WhatsApp. We specified conversational techniques, encompassing input processing and response generation methods, as rule-based or AI-driven. We considered the conversational techniques AI-driven if the input was processed using AI (eg, using natural language processing to understand user input and context and to recognize patterns), if AI was used to respond in a humanlike conversational manner (eg, using LLMs), or both. In contrast, rule-based systems do not adapt or learn from user input and provide preprogrammed responses. Input and output modalities refer to how users communicate with the CA (input modality, eg, text) and how the CA communicates with the user (output modality, eg, speech). Recommended dose refers to the planned or recommended amount (eg, duration or content completion, such as one lesson) or frequency (eg, once a day) of exposure to the intervention for optimal effectiveness.
Finally, we described any cointerventions, which are elements of the intervention that are not part of the CA itself.
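The rule used to label conversational techniques can be written down as a small decision function. The Python sketch below is illustrative only: the function and flag names are hypothetical, and the combined label is an assumption that mirrors how mixed systems are reported in the Results.

```python
def classify_technique(ai_input: bool, ai_response: bool,
                       scripted_parts: bool = False) -> str:
    """Label a CA's conversational technique (illustrative sketch).

    ai_input: AI processes user input (eg, natural language processing).
    ai_response: AI generates the response (eg, an LLM).
    scripted_parts: the CA also offers preprogrammed paths (eg, predefined choices).
    All three flags are hypothetical names for reviewer judgments.
    """
    ai_driven = ai_input or ai_response  # either one, or both, counts as AI-driven
    if ai_driven and scripted_parts:
        return "AI-driven and rule-based"
    if ai_driven:
        return "AI-driven"
    return "rule-based"


# Example: free-text input understood with NLP, plus predefined choices
print(classify_technique(ai_input=True, ai_response=False, scripted_parts=True))
```

A CA whose description does not allow these flags to be set would remain "unclear" or "not reported" rather than being forced into a category.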
Integration of BCTs
On the basis of the BIT model, we extracted the clinical aim (eg, reducing blood sugar), BCT clusters (eg, feedback and monitoring), and BCT delivery by technology (eg, providing self-care guidance based on monitored blood sugar levels). A BCT is an “observable, replicable, and irreducible component of an intervention designed to alter or redirect causal processes that regulate behaviour” and BCTs are categorized into clusters []. If not reported, 3 researchers (TFP, RJAvD, and SWvdB) identified BCT clusters based on intervention descriptions and BCT taxonomy [] in consensus meetings.
Evaluation Methods
To describe the evaluation method, we extracted study characteristics (the study’s aim, design, follow-up period, location, participant eligibility criteria related to technical use and skills, intervention and control conditions, and demographics [age, gender, education, and technical skills and use]) and outcome measures. To ensure a comprehensive overview, we reported outcome measures in the following categories: clinical, feasibility, and acceptability outcomes. We adhered to the authors’ classification when provided and self-categorized otherwise. Clinical outcomes included disease-specific, psychological, behavioral, and knowledge and skills measures. Feasibility included recruitment, enrollment, attrition, accrual, exclusion, and retention rates, and outcomes based on use data. We categorized user experience measures under acceptability.
Primary or Coprimary Study Results
The focus of this review is not to draw conclusions on the effectiveness, feasibility, or acceptability of CAs for chronic disease self-management support. However, to get an overview of whether CAs seem to work for the self-management of chronic diseases, we chose to report the results of the primary or coprimary outcomes.
Study Risk of Bias Assessment
We only reported primary study outcomes in this review. Risk of bias for primary outcomes that evaluated efficacy or effectiveness was assessed using the Cochrane Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) tool (version 2 []) for nonrandomized studies and the Risk of Bias 2 tool [] for randomized controlled trials. One reviewer (TFP) assessed the quality of eligible studies; the assessments were then discussed with SWvdB and NMdV.
Results
Overview
Our initial search in the PubMed database on May 1, 2023, yielded 1215 results. A subsequent search on April 15, 2024, resulted in 1114 additional hits. The expanded search of the additional terms yielded 243 hits, and the Embase database search yielded 3286 hits. In total, 5858 articles were screened. After screening, we excluded 5497 articles based on title and abstract. After full-text screening, 25 articles were included. illustrates the inclusion process details.
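The screening counts reported above can be checked with simple arithmetic. The Python sketch below uses only the numbers stated in the text; the full-text figures are derived under the assumption that every record not excluded at title and abstract screening proceeded to full-text assessment, which the text does not state explicitly.

```python
# Records identified per search, as reported in the text
records = {
    "PubMed, initial search (May 1, 2023)": 1215,
    "PubMed, updated search (April 15, 2024)": 1114,
    "PubMed, expanded terms": 243,
    "Embase": 3286,
}

screened = sum(records.values())
excluded_title_abstract = 5497
included = 25

# Derived values (assumption: all remaining records went to full-text review)
full_text_assessed = screened - excluded_title_abstract
excluded_full_text = full_text_assessed - included

print(f"Screened: {screened}")                      # Screened: 5858
print(f"Full-text assessed: {full_text_assessed}")  # Full-text assessed: 361
print(f"Excluded at full text: {excluded_full_text}")  # Excluded at full text: 336
```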

Characteristics of the CAs
summarizes the characteristics of 23 unique CAs included in this study. Two CAs were each evaluated in 2 studies: nurse Addressing Metastatic Individuals Everyday [,] and Wysa [,]. “Chatbot” was the most commonly used term (13/23, 57%) [-]. Most CAs (18/23, 78%) operated through mobile apps [,,-,-,-] as stand-alone apps or integrated into existing apps such as Telegram [] or Signal []. The majority were unembodied agents (14/23, 61%) that communicated through text [-,,,,,-]. Embodied agents used speech and animated behaviors [,,-,]. CAs were often supplemented with multimedia, such as audio recordings or images. Embodiment status of 4 CAs [,-] and output modalities for 4 CAs [,,,] were not reported.
Input modalities varied, including natural language text, voice, predefined choices, and combinations. Rule-based conversational techniques were most common (10/23, 43%) [,,,,,,,,-,]. Some CAs (4/23, 17%) relied solely on AI-driven conversational techniques [-,,], while others (4/23, 17%) used a combination of AI-driven and rule-based techniques [,,,,,]. Some descriptions of conversational techniques or input modalities were incomplete [,,,,-]. The recommended dose varied from daily interactions to use frequency and duration based on the users’ preference [,,,,-,] but was not reported for some CAs (5/23, 22%). Most CAs (15/23, 65%) were part of broader cointerventions, including additional app functionalities, physical and digital information sources, physical self-monitoring devices, and human involvement [,,,,-,,,,]. Human roles encompassed monitoring symptoms and activity, conducting weekly check-ins, and updating individualized goals [,]; monitoring of disabling conditions and addressing users’ questions []; responding to questions the chatbot could not answer and delivering contextually tailored messages []; reviewing patient data, providing personalized feedback, and answering queries []; and managing technical alerts []. Clinicians were also involved in responding to flagged participant responses [,].
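The proportions in this section follow an "n/N, %" convention with 23 unique CAs as the denominator. As a minimal sketch, the Python helper below reproduces that formatting; the function name `share` is hypothetical, and the counts are taken from the text.

```python
TOTAL_CAS = 23  # number of unique CAs reported in this review


def share(n: int, total: int = TOTAL_CAS) -> str:
    """Format a count as 'n/N, p%' with the percentage rounded to an integer."""
    pct = round(100 * n / total)
    return f"{n}/{total}, {pct}%"


# Counts as reported in the text
reported = {
    "term 'chatbot' used": 13,
    "delivered through mobile apps": 18,
    "unembodied agents": 14,
    "rule-based techniques": 10,
    "AI-driven techniques only": 4,
    "recommended dose not reported": 5,
    "part of a broader cointervention": 15,
}

for label, n in reported.items():
    print(f"{label}: {share(n)}")
```

Python's `round` resolves exact .5 ties to the nearest even integer; none of the counts above produce such a tie.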
| Name of CA, terminology used, and study | Instantiation: how (technical) | | | | | |
| | Device, mode of delivery, and method of access | Embodiment | Conversational techniques | In- and output modalities | Recommended dose | Description of cointervention |
| Atrial fibrillation | | | | | | |
| Tanya, relational agent, Guhl et al [] | Mobile app installed on smartphone | Yes | Rule-based | | Daily interactions | |
| Cancer | | | | | | |
| Chemofree, chatbot, Tawfik et al [] | Devices with the Android operating system, accessed via a link received on WhatsApp. | No | AIb-based | | Continuous access with user-initiated conversations. | N/Ac |
| Vivibot, chatbot, Greer et al [] | Device not reported, but accessible via Facebook Messenger (web- and app-based). Log-in account required. | No | Rule-based | | Structured daily interactions for 28 d. | N/A |
| Nurse AMIEd, virtual assistant, Caru et al [], Schmitz et al [], and Qiu et al [] | Tablet (Echo Show study device), via Amazon Alexa with a voice command, “Alexa, open nurse AMIE.” | Yes | AI-driven (voice) and rule-based (touch) | | Daily use, 1 lesson a day for 28 d | |
| Nameless, chatbot, Gomaa et al [] | Mobile app (text messaging) on smartphone; the app is not stored on a device. | No | Rule-based | | 3 interactions a week | N/A |
| Nameless, chatbot, Huang et al [] | Device not reported. Access through a QR code leading to the Facebook Messenger interface (web- and app-based); log-in account required. | No | Unclear | | Daily interactions | |
| Nameless, chatbot or CA, Albino de Queiroz et al [] | Via the Facebook Messenger app and webpage on a notebook, smartphone, or tablet. | No | Rule-based and AI-driven | | Not reported | |
| Chronic disease | | | | | | |
| Nameless, interactive reminder-based app, Fang et al [] | Mobile app preinstalled with user information entered on an iPad study device | Yes | Rule-based | | On the basis of users’ medication intake schedule | N/A |
| CKDf | | | | | | |
| CIM-SHEg program, chatbot, Chen et al [] | Via a preinstalled mobile app instant messaging on smartphone | No | Rule-based, possibly with AI-driven components. | | Not reported | |
| Chronic pain | | | | | | |
| Wysa, CA, Meheli et al [] and Cheng et al [] | Device not specified. Via a mobile app that runs on iOS and Android devices and can be voluntarily downloaded via a publicly available link for free. | No | AI-based | | Not reported by Meheli et al [] and ≥3 interactions a week reported by Cheng et al []. | N/A |
| Selma, chatbot, Hauser-Ulrich et al [] | Mobile app downloaded via the project webpage or directly via the App Store or Google Play Store on an iOS or Android smartphone | No | Rule-based | | Daily interactions for 8 wk | N/A |
| Diabetes | | | | | | |
| Nameless, chatbot, Krishnakumar et al [] | Mobile app downloaded from the Google Play store using a unique link sent to the user via SMS or text message on an Android smartphone | Not reported | Unclear; reports on AI-powered decision support but does not describe conversational techniques. | | Not reported | |
| Elena, virtual assistant, Roca et al [] and Roca et al [] | Mobile app Signal downloaded on smartphone and user registration by the nurse during medical appointment | No | Rule-based | | Multiple times daily | N/A |
| Laura, CA, Gong et al [] | Mobile app on a smart device with an operating system of at least iOS 8 for Apple or OS 4.2 for Android | Yes | Rule-based (touch-based selections) and AI-based (interactive voice recognition) | | Scheduled weekly appointments with CA | |
| Nameless, CA, Bruijnes et al [] | Not reported | No | AI-driven (free-text) and rule-based (predefined choices). | | Weekly interactions | N/A |
| Nameless, chatbot, Nassar et al [] | Web app via mobile devices (desktop or tablet); the app does not require a download on device. | No | Rule-based | | 1st mo: weekly and >1 mo: based on user preference | |
| HIV | | | | | | |
| Nameless, realistic talking human avatar, Dworkin et al [] | The mobile app was downloaded from the study laptop to the participants’ smartphones by the study team. | Yes | Rule-based | | Daily interactions | |
| Hypertension | | | | | | |
| Tensiobot, chatbot, Echeazarra et al [] | The mobile app Telegram is downloaded and installed on a smartphone. All assisted by the nurse (including registration) | No | Rule-based | | Interactions twice a day | |
| Nameless, chatbot, Sakane et al [] | Device not reported. Mobile app downloaded via Google Play or App Store. | Not reported | Unclearly described | | Approximately daily | |
| Irritable bowel syndrome | | | | | | |
| Zemedy, chatbot, Hunt et al [] | Mobile app downloaded via a link on an iOS or Android smartphone | No | Not reported | | Approximately weekly | |
| Kidney failure | | | | | | |
| PDj-AI chatbot, chatbot, Cheng et al [] | Mobile app LINE on a smart device | Not reported | AI-based | | Not reported | |
| Overactive bladder | | | | | | |
| CeCe, CA, Sheyn et al [] | Web app on an iOS or Android smartphone; assistance from the website via a research coordinator | No | Rule-based | | Interactions of 5-10 min daily for 8 wk | |
| Primary headaches | | | | | | |
| David or Sophie, CA, Ulrich et al [] | The mobile app self-downloaded via the study website on an Android or iOS smartphone. | No | Rule-based | | As preferred, from daily to weekly | |
aVisual modes of communication include various channels of expression (eg, visual cues such as facial expressions and body movement).
bAI: artificial intelligence.
cN/A: not applicable.
dAMIE: Addressing Metastatic Individuals Everyday.
eCTCAE: Common Terminology Criteria for Adverse Events.
fCKD: chronic kidney disease.
gCIM-SHE: Chat-based Instant Messaging Support Health Education.
hMDC: My Diabetes Coach.
iBP: blood pressure.
jPD: peritoneal dialysis.
Integration of BCTs
On the basis of intervention descriptions, we identified BCT clusters and delivery methods (). All authors except one [] reported the aim of their CA. The most frequently identified BCT clusters were “shaping knowledge” (17/23, 74% [-,-,,,,,,-]), “feedback and monitoring” (15/23, 65% [-,-,-,,,,,,]), and “natural consequences” (13/23, 57% [-,,,,,,,,-]). “Shaping knowledge” equips individuals with practical knowledge (eg, how to perform a behavior) and cognitive strategies to adopt and sustain desired behaviors. This was delivered mostly through behavioral instructions, for example, how to perform relaxation []. “Feedback and monitoring” refers to strategies to monitor or self-monitor behavior and its outcomes and to deliver feedback. Monitoring often involved (prompted) assessments of health outcomes [,,-,] or behaviors [,,]. Feedback included self-care recommendations based on the data. “Natural consequences” involves strategies that help users understand the relationship between behavior and its consequences. This was delivered through information about the necessity of self-care behaviors and cognitions [,,,] or medication intake [,,]. “Shaping knowledge” was often combined with “feedback and monitoring,” providing personalized self-care guidance based on the user’s symptoms [,,,,]. In addition, “associations” were frequently incorporated (14/23, 61% [,-,,,,,-,-]), typically as prompts to use the CA. “Social support” was delivered by facilitating contact with health care providers []; showing videos of fellow patients [,]; advising on and arranging social support and practical help from friends and relatives []; or having the CA itself act as the provider of social support [,,,,,,].
| Study, why? (aim of the CAa), and how? (conceptual, eg, BCTs) | What? (ie, the elements through which the BCTs are integrated in the technology) | ||||
| Atrial fibrillation | |||||
| Guhl et al [] | |||||
| To improve medication adherence and health-related quality of life | |||||
| Natural consequences |
| ||||
| Shaping knowledge |
| ||||
| Feedback and monitoring |
| ||||
| Goals and planning |
| ||||
| Associations |
| ||||
| No BCT; personalized experience |
| ||||
| Cancer | |||||
| Tawfik et al [] | |||||
| To improve the effectiveness of self-care behaviors and reduce the frequency, severity, and distress of chemotherapy side effects | |||||
| Shaping knowledge and natural consequencesb |
| ||||
| Greer et al [] | |||||
| To deliver cognitive and behavioral intervention to increase positive emotion | |||||
| Feedback and monitoring |
| ||||
| Shaping knowledge, repetition, and substitution |
| ||||
| No BCT; give space to vent |
| ||||
| Social support |
| ||||
| Associations |
| ||||
| Schmitz et al [] and Caru et al [] | |||||
| Schmitz et al []: to record relevant symptoms (eg, the level of pain), provide customized interventions (based on metastasis location and symptoms), and deliver nutrition tips; Caru et al []: to enhance daily step counts in women with MBCc | |||||
| Associations |
| ||||
| Natural consequences |
| ||||
| Social support |
| ||||
| Feedback and monitoring |
| ||||
| Feedback and monitoring, antecedents, and shaping knowledgeb |
| ||||
| Goal setting |
| ||||
| No BCT; personalized experience |
| ||||
| Gomaa et al [] | |||||
| To optimize the care experience for patients with gastrointestinal cancer receiving chemotherapy, ultimately contributing to improved outcomes and enhanced patient well-being | |||||
| Associations, social support, natural consequences, and shaping knowledgef |
| ||||
| Associations, shaping knowledge and natural consequencesb, feedback and monitoring, and no BCT; personalized experience |
| ||||
| Associations and feedback and monitoring |
| ||||
| Huang et al [] | |||||
| To decrease EDg use and reduce unscheduled hospitalizations by collecting patient-reported symptoms during chemotherapy treatment and automatically alerting clinicians to severe or worsening symptoms | |||||
| Associations, feedback and monitoring, and shaping knowledge |
| ||||
| Social support |
| ||||
| Feedback and monitoring and shaping knowledge |
| ||||
| Albino de Queiroz et al [] | |||||
| To encourage patients to be more involved in their treatment, improve self-management, and report on their clinical condition | |||||
| Shaping knowledgef |
| ||||
| Feedback and monitoring |
| ||||
| Associations |
| ||||
| No BCT, flagging system |
| ||||
| Chronic disease | |||||
| Fang et al [] | |||||
| To increase adherence and improve overall user satisfaction by simulating a human carer or physician through an avatar | |||||
| Associations |
| ||||
| Natural consequences |
| ||||
| No BCT; personalization |
| ||||
| CKDi | |||||
| Chen et al [] | |||||
| To improve participants’ health literacy related to communication ability and disease-specific knowledge | |||||
| Associations |
| ||||
| Shaping knowledge |
| ||||
| Reward and threat |
| ||||
| Human involvement |
| ||||
| Chronic pain | |||||
| Meheli et al [] | |||||
| To promote mental health | |||||
| —f |
| ||||
| Cheng et al [] | |||||
| To deliver therapeutic content, including cognitive behavioral therapy, cognitive restructuring, motivational interviewing, mindfulness training, deep breathing techniques, and sleep meditations to collectively improve users’ behavioral activation, pain acceptance, and sleep quality | |||||
| Natural consequences |
| ||||
| Feedback and monitoring |
| ||||
| Shaping knowledge and repetition and substitutionb |
| ||||
| Monitoring feedback |
| ||||
| Reward and threat |
| ||||
| Hauser-Ulrich et al [] | |||||
| To promote self-management of chronic pain | |||||
| Shaping knowledge |
| ||||
| Natural consequences and shaping knowledgeb |
| ||||
| Feedback and monitoring |
| ||||
| No BCT, personalized experience |
| ||||
| No BCT, technical feature |
| ||||
| Goals and planning, social support, and feedback and monitoring |
| ||||
| Associations |
| ||||
| Social support |
| ||||
| Diabetes | |||||
| Krishnakumar et al [] | |||||
| To enhance multiple behavior patterns (self-monitoring of diet, exercise, weight, and blood glucose) | |||||
| Feedback and monitoringf |
| ||||
| Roca et al [] | |||||
| To improve medication adherence | |||||
| Associations and repetition and substitution |
| ||||
| No BCT; personalized experience and technical features |
| ||||
| Goals and planning and monitoring and feedback |
| ||||
| Natural consequences |
| ||||
| Shaping knowledge |
| ||||
| No BCT; persuasive feature in increasing CA use |
| ||||
| Gong et al [] | |||||
| To provide more accessible and engaging self-management support, monitoring, and coaching | |||||
| Feedback and monitoring and shaping knowledgef |
| ||||
| No BCT; personalized experience |
| ||||
| Goals and planning |
| ||||
| Feedback and monitoring |
| ||||
| —f, k |
| ||||
| Bruijnes et al [] | |||||
| To help people deal with social problems caused by diabetes | |||||
| No BCT |
| ||||
| Goals and planning |
| ||||
| Shaping knowledge and natural consequencesb and personalized experience |
| ||||
| Comparison of outcomes |
| ||||
| Natural consequences |
| ||||
| Goals and planning |
| ||||
| Nassar et al [] | |||||
| To improve self-care confidence and to improve A1Cl | |||||
| Feedback and monitoring |
| ||||
| Natural consequences and shaping knowledgeb |
| ||||
| No BCT; patient-initiated use |
| ||||
| Associations and goals and planning |
| ||||
| Social support |
| ||||
| No BCT; flagging system |
| ||||
| HIV | |||||
| Dworkin et al [] | |||||
| To address adherence and retention in HIV care with the overarching goal of improving viral suppression | |||||
| Natural consequences and shaping knowledgeb |
| ||||
| Social support |
| ||||
| Social support and comparison of outcomes |
| ||||
| Shaping knowledge |
| ||||
| Associations |
| ||||
| No BCT; personalized experience |
| ||||
| Hypertension | |||||
| Echeazarra et al [] | |||||
| To assist patients with high blood pressure in performing home blood pressure checks | |||||
| Feedback and monitoring and associations |
| ||||
| Shaping knowledge and natural consequencesb |
| ||||
| No BCT; personalized experience and technical features |
| ||||
| Sakane et al [] | |||||
| NRm | |||||
| Shaping knowledge and natural consequencesb |
| ||||
| Feedback and monitoring |
| ||||
| IBSn | |||||
| Hunt et al [] | |||||
| To treat irritable bowel syndrome with CBT | |||||
| Natural consequencesf |
| ||||
| Antecedents and associations, natural consequences, repetition and substitution, and shaping knowledge |
| ||||
| Associations |
| ||||
| Antecedents |
| ||||
| Kidney failure | |||||
| Cheng et al [] | |||||
| To improve the self-care ability of patients undergoing peritoneal dialysis | |||||
| Shaping knowledge and natural consequencesb |
| ||||
| Associations and goals and planning |
| ||||
| No BCT; technical feature |
| ||||
| Overactive bladder | |||||
| Sheyn et al [] | |||||
| To provide first-line behavioral modification therapy for the treatment of overactive bladder | |||||
| Feedback and monitoring and goals and planning |
| ||||
| No BCT; personalized experience |
| ||||
| Shaping knowledge and repetition and substitutionf |
| ||||
| Goals and planningf |
| ||||
| Primary headaches | |||||
| Ulricho et al [] | |||||
| To improve mental well-being by promoting BCTs in behaviors, emotions, thoughts, and beliefs related to headaches while ensuring low-threshold access and scalability | |||||
| Goals and planning |
| ||||
| Feedback and monitoring |
| ||||
| Social support |
| ||||
| Shaping knowledge |
| ||||
| Natural consequences |
| ||||
| Comparison of behavior |
| ||||
| Associations |
| ||||
| Repetition and substitution |
| ||||
| Comparison of outcomes |
| ||||
| Reward and threat |
| ||||
| Regulation |
| ||||
| Antecedents |
| ||||
| Identity |
| ||||
| Scheduled consequences |
| ||||
| Self-belief |
| ||||
aCA: conversational agent.
bIndistinguishable due to lack of details on the intervention.
cMBC: metastatic breast cancer.
dAMIE: Addressing Metastatic Individuals Everyday.
eCBT: cognitive behavioral therapy.
fLikely more BCTs were involved, but due to lack of details on the intervention, these cannot be clearly identified.
gED: emergency department.
hCTCAE: Common Terminology Criteria for Adverse Events.
iCKD: chronic kidney disease.
jGP: general practitioner.
kNot available.
lA1C: hemoglobin A1c.
mNR: not reported.
nIBS: irritable bowel syndrome.
oDescribed the integrated BCTs with examples of how the BCTs were integrated into the app. Only the clusters are reported in this table, with one example per category.
One study [] described the BCTs integrated into its intervention with explicit examples of delivery by technology. All others lacked explicit reporting on which BCTs were delivered and how. The level of detail in which the CA was described varied, and ambiguous intervention descriptions hindered the classification of BCTs. For example, “recommending evidence-based elements from cognitive behavioral therapy, behavioral reinforcement, and mindfulness, among others” [] suggests the integration of many BCTs but does not allow us to identify which BCTs are delivered or how. Insufficient reporting on CA content made distinguishing “shaping knowledge” from “natural consequences” challenging [,,,,]. For instance, “providing information about medication” could refer to instructions on its use (“shaping knowledge”) or to an explanation of the effect and importance of adherence (“natural consequences”).
Evaluation Methods
Study Characteristics
In total, 24 unique studies were reviewed ( and ), as the findings of 1 study were published in 2 separate articles [,]. The most common design, used in nearly half of the studies (11/24, 46%), was the randomized controlled trial [,,,,,-,-,]. More than half (14/24, 58%) were early-phase studies, such as pilot studies, initial trials, or exploratory studies [,,,,,,,,,,,,,]. Follow-up periods ranged from 3 weeks to 1 year. More than half of the studies (13/24, 54%) reported primary or coprimary outcomes [,,,-,,,,,,,]. Primary or coprimary outcomes were predominantly clinical outcomes, and user satisfaction was assessed as a primary outcome once []. One study [] did not report which outcome was primary. Many studies (17/24, 71%) selected participants based on technical skills (eg, ability to install apps or communicate online) [,,], device access or possession [-,,,,,,], or both [,,,]. CAs were accessible to the participants primarily in addition to usual care, although this was not always specified. In one study [], the CA replaced parts of usual care. In some studies (7/24, 29%), the research team actively engaged with participants through check-ins to, among other things, troubleshoot (technical) issues [,,,,,,,]. Preuse instructions were provided in some of the studies (9/24, 38%) [,,,,,,,,,]. Control conditions were typically usual care. Three studies chose an active control group in which participants received a nurse-led education program or discussion with nurses [], a self-help book [], or intensive specific health guidance []. One study [] assigned its participants to the control or intervention condition based on CA use. Only 8% (2/24) of the studies [,] reported the technological experience of their participants.
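The proportions reported above can be checked with simple arithmetic. The following is a minimal sketch, assuming the counts as stated in the text and a denominator of 24 unique studies; the dictionary labels and the function name `percentage` are our own shorthand for illustration and do not come from the included studies.

```python
# Counts as reported in the Study Characteristics paragraph (out of 24 unique studies).
# The keys are illustrative labels chosen for this sketch only.
TOTAL_STUDIES = 24

reported_counts = {
    "randomized controlled trials": 11,
    "early-phase studies": 14,
    "primary or coprimary outcome reported": 13,
    "selection on technical skills or device access": 17,
    "research team check-ins": 7,
    "preuse instructions": 9,
    "technological experience reported": 2,
}


def percentage(count: int, total: int = TOTAL_STUDIES) -> int:
    """Return count/total as a whole-number percentage, rounding halves up.

    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    return (2 * 100 * count + total) // (2 * total)


percentages = {label: percentage(n) for label, n in reported_counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {reported_counts[label]}/{TOTAL_STUDIES} ({pct}%)")
```

Under these assumptions, the computed values match the percentages given in the text; note that 11/24 rounds to 46%, which is just under half of the studies.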
| Study, year, and country | Study design, follow-up, and study objective | Sample characteristics (age in y, mean [SD], gender, level of education, and technology skills) | Description of the study condition | |
| Atrial fibrillation | ||||
| Guhl et al [], 2020; United States | Randomized controlled trial,a 30 d, to measure acceptability and adherence and to assess its effectiveness to improve health-related QoLb and adherence. | Age: 72 (9); gender: 52% women; level of education: 28% high school and vocational, 11% some college, 28% bachelor’s degree, 33% graduate; technology skills: NRc | Reimbursement: NR, IGd (N=61): use of the app and Kardia device. Preuse instruction: trained until achieved proficiency in using the smartphone, relational agent, and Kardia device, including an orientation to the phone and app. CGe (N=59): Usual care. Preuse instruction: recording an electrocardiogram with Kardia under study personnel supervision. | |
| Cancer | ||||
| Tawfik et al [], 2023; Egypt | 3-arm randomized controlled trial, 4 mo, to examine the effects compared to nurse-led education on the effectiveness of self-care behaviors and the frequency, severity, and distress of chemotherapy side effects. | Breast cancer. Age: IG 44 (6), CG nurse-led 46 (9), and CG usual care 45 (8); gender: 100% women; level of education: 40% read and write, 16% basic education, 30% secondary diploma, 14% higher education; technology skills: NR | Reimbursement: NR. IG (N=50): received empowerment-based educational program via ChemoFreeBot. Preuse instruction: taught about the chatbot, its objectives, and that they would be chatting to an automated system. CG, nurse-led (N=50): 3 face-to-face group teaching sessions (45 min each) and a brochure. CG, usual care (N=50): discussing general knowledge of self-care behaviors regarding managing chemotherapy side effects with a nurse, varies in depth across individuals. | |
| Greer et al [], 2019; United States | Randomized controlled crossover triala, 4 wk, to evaluate the engagement and usability of Vivibot and the preliminary effects of the delivered positive psychology skills on psychosocial well-being outcomes in young adults treated for cancer. | Age: 25 (3); gender: 46% women; level of education: 0% <high school, 12% high school graduate and general educational development, 32% some college, 8% 2-y college degree, 40% 4-y college degree, 8% master’s degree; technology skills: NR | Reimbursement: US $20 Amazon gift card for each completed online survey. IG (N=25): access to Vivibot. CG (N=26): waitlist, access to Vivibot after 4 wk. | |
| Schmitz et al [], 2023; United States and Caru et al [], 2023; United States | Partial crossover randomized trial, 6 mo. Schmitz: to test the feasibility, acceptability, and initial efficacy of a supportive care intervention called Nurse AMIEf. Caru: to present step count data as exploratory evidence to document the impact of virtual-assistant technology on enhancing daily step count in women with metastatic breast cancer. | Metastatic breast cancer. Age: all 53 (11), IG 55 (10), CG 52 (8); gender: 100% women; level of education (all): high school 17%; some college 26%; 4-y degree or more 57%, level of education in IG: high school 10%; some college 33%; 4-y degree or more 57%, level of education in CG: high school 24%, some college 19%, 4-y degree or more 57%; technology skills: NR | Reimbursement: NR. IG (n=21): access to Nurse AMIE. Check-in: weekly calls during the first 3 mo, which covered general queries regarding well-being and symptom checks and troubleshooting technology issues. Preuse instruction: participants engaged in an orientation session. CG (n=21): waitlist, first 0-3 mo usual care. After 3 mo access to intervention. | |
| Gomaa et al [], 2023; United States | Single-arm studya, 2 mo, to evaluate its practical implementation, user acceptance, and its potential to enhance QoL, patient activation, and symptom distress management. | Malignant gastrointestinal cancer. Age: 61 (12); gender: 56% women; level of education: high school 18%, college graduate 47%, postgraduate school 32%; unknown 3%, technology skills: NR | Reimbursement: gift cards valued at US $20, US $30, and US $40 for successive time points. IG (n=34): Received proactive and interactive text messages 3 times a week, a chatbot interface to monitor participant symptoms, and immediate self-care feedback. Check-ins: biweekly in-person meetings or telephone calls to ensure participant progress and address any queries. In addition to usual care. | |
| Huang et al [], 2023; Taiwan | Retrospective cohort study, 1 y (follow-up per participant varies depending on chemotherapy starting date). To evaluate whether a chatbot-based collection of patient-reported symptoms during chemotherapy treatment, with automated alerts to clinicians for severe or worsening symptoms, decreases emergency department use and reduces unscheduled hospitalizations | Gynecologic cancer. Age: IG 48 (9) and CG 64 (11); gender: 100% women; level of education: NR; technology skills: NR | Reimbursement: NR, IG (N=20): access to the chatbot program. CG (N=43): usual care consisting of standard care procedures at the hospital, including discussing symptoms and documenting them in the medical record during clinical encounters between patients and their oncologists. Patients with direct concerns about symptoms were also encouraged to initiate telephone contact with cancer managers between visits. | |
| Albino de Queiroz et al [], 2023; Brazil | Prospective nonrandomized clinical study, 8 wk, to evaluate the benefits of the SMTg model to patients regarding adverse effects, treatment, and QoL. | Colorectal cancer. Age: IG 49.7 (13.4) and CG 50.8 (13.5). Gender: IG 61.5% female and CG 29.4% female. Level of education: NR. Technology skills: NR | Reimbursement: NR. IG (N=13): monitoring proposed in this study. Check-ins at wk 2 and 6 addressed user doubts and provided guidance on app and device use. Participants with low engagement received alerts encouraging them to adhere to the intervention. CG (N=17): traditional monitoring during the active cancer treatment phase. | |
| Chronic illness | ||||
| Fang et al [], 2018; Australia | Randomized controlled triala, 3 wk, to describe the solution of using an avatar-based reminder app and the results of an initial trial. | People taking supplements. Age (range): 18-70, gender: NR, level of education: NR, technology skills: NR | Reimbursement: NR. IG (N=11): access to a simple version of the app with limited interaction via a study iPad. Participants filled weekly Zip-lock bags with their supplements. CG (N=13): an electronic pillbox with alarm functions and compartments for participants to fill with their supplements weekly. | |
| Chronic kidney disease | ||||
| Chen et al [], 2023; Taiwan | Pre-post intervention designa, 3 mo, to gather the data required to develop an intervention program of CIM-SHEh for patients with chronic kidney disease. | Age: 55 (10); gender: 36% women; level of education: 36%≤high school; 64%≥college; technology skills: NR | Reimbursement: gift certificate of NT $300 (approximately US $10) after completing the pretest, intervention, and posttest. IG (n=60): access to the app. Chatbot manager actively interacts with participants during the intervention to improve their learning motivation. Preuse instruction: face-to-face meeting with the research team for advice on operating the chatbot and interacting with the instructor and other participants. | |
| Chronic pain | ||||
| Meheli et al [], 2022; United States | Retrospective observational study, 1 y, to evaluate the perceived needs, engagements, and effectiveness of the mental health app Wysa regarding mental health outcomes among real-world users who reported chronic pain and engaged with the app for support. | Age: NR; gender: NR; level of education: NR; technology skills: NR | IG: Textual snippets (n=2194) of Wysa app users that reported chronic pain. CG: textual snippets of Wysa app users without chronic pain (only used for testing the app engagement or disengagement). | |
| Cheng et al [], 2023; United States | Single-site, single-arm, prospective cohort studya, 1 mo, to identify behavioral mechanisms that may mediate changes in mental and physical health associated with the use of Wysa for chronic pain during orthopedic management of chronic musculoskeletal pain. | Age: 59 (14); gender: 70% women; level of education: NR. Technology skills with technological experience measured as self-reported general smartphone use patterns: 57% independent downloading and using apps; 20% need help with downloading and using apps; 17% smartphone users but not for apps; 7% never use a smartphone. | Reimbursement: US $40 gift card. IG (N=30): Wysa for Chronic Pain digital intervention in addition to usual orthopedic care, which includes analgesic medication, physical therapy, and interventional spine procedures, as appropriate. | |
| Hauser-Ulrich et al [], 2020; Switzerland | Randomized controlled triala, 8 wk, to describe the design and implementation of Selma and present findings from a trial that evaluated effectiveness, acceptance, and adherence. | Age: 44 (13); gender: 80% women; level of education: 12% obligatory and high school, 32% matriculation and A-level, 21% higher vocational training, 35% university. Technology skills: NR | Reimbursement: NR, IG (N=59): 8-wk digital coaching program with a fully automated CA mediating coping strategies and psychoeducation to support pain self-management. CG (N=43): wait-list, completed the introduction process with the chatbot at day 0 and received weekly motivational messages from the chatbot. Were able to access the app after the waiting time (8 wk). | |
| Diabetes | ||||
| Krishnakumar et al [], 2021; South Asia | Pre-post study, 16 wk, to investigate the real-world effectiveness of the Wellthy CARE digital therapeutic for improving glycemic control among the South Asian population of Indian origin. | Type 2 diabetes mellitus. Age: 41 (CI 49-53); gender: 31% women; level of education: NR; technology skills: NR | IG (N=102): 16-wk structured lifestyle coaching through the Wellthy CARE digital therapeutic app. | |
| Roca et al [], 2021; Spain | Pre-post study,a 9 mo, to validate the effectiveness of a health care virtual assistant, integrated within messaging platforms, with the aim of improving medication adherence in patients with comorbid type 2 diabetes mellitus and depressive disorder. | Age: 64 (9); gender: 69% women; level of education: NR, technology skills: NR | IG (N=13 patients, N=5 health care professionals): access to the app via Signal. Preuse instruction: during a medical appointment, nurses assisted with app download, explained the initial interaction with the virtual assistant, registered the patient, and configured the medication and reminder functions. | |
| Gong et al [], 2020; Australia | Randomized controlled trial, 12 mo, to evaluate the adoption, use, and effectiveness of the My Diabetes Coach program designed to support type 2 diabetes self-management in the home setting. | Age: 57 (10); gender: 42% women. Level of education: 19% secondary high school or lower, 31% technical apprenticeship or diploma, 21% bachelor’s degree; 19% postgraduate degree or higher; technology skills: NR | IG (N=93): My Diabetes Coach program and (optional) blood glucose meter with Bluetooth. Encouraged to regularly access the user guide and website and to join the discussion forums. Check-in: phone call encompassing a brief, structured interaction with the program coordinator (at 1, 4, 8, 12, and 24 wk) for technical assistance, to answer questions, and to encourage program use. CG (N=94): encouraged to continue routine diabetes self-care (including access to health care services and resources via NDSSi and diabetes not-for-profit organizations). Received a quarterly project newsletter to maintain their interest in the study. Program access after the study duration, if desired. | |
| Bruijnes et al [], 2023; unclear | Double-blinded between-subject study,a 3 wk, to determine the feasibility and preliminary efficacy of an automated CA to deliver to people with diabetes personalized psychoeducation on dealing with (psycho-) social distress related to their chronic illness. | Age: IG 37 (15), CG 40 (16), all 39 (16); gender: IG 57% women, CG 37% women, all 47% women; level of education: NR; technology skills: NR | Reimbursement: a minimum of £6 (approximately US $7.50) per hour, with increasing bonuses for completing consecutive sessions to reduce attrition. IG (N=79): 3 sessions in which the CA iterates over providing advice, evaluating the usefulness of the previous advice, and giving alternative advice, or working on another issue. Each session was separated by at least 1 wk. CG (N=77): 3 sessions in which they read or reread a self-help text from the book Diabetes Burnout to deal with interpersonal distress and friends and family distress (similar content but not personalized). Each session was separated by at least 1 wk. | |
| Nassar et al [], 2023; United States | Study design: NR, 9-12 mo, to provide support for T2DMj management, to improve self-care confidence, and to explore impact on A1C. | Age: 58 (10.6); gender: 60% women; level of education: NR; technology skills: NR | IG (N=58): patients who completed more than 1 chat. Preuse instruction: participants were informed they were not talking to a live human and that the chatbot was not monitored 24 h a day. CG (N=36): a priori enrollees who completed no chat or only a single chat, as the first chat consisted of administrative information alone and did not deliver any education content. | |
| HIV | ||||
| Dworkin et al [], 2019; United States | Pre-post study,a 3 mo, to explore the acceptability and feasibility of My Personal Health Guide and its preliminary efficacy in improving adherence. | Young HIV-positive men who have sex with men. Age (range) 29 (18-34); gender: 0% women; level of education: 33% less than a high school diploma, 67% college or more; technology skills: NR | Reimbursement: baseline and follow-up visit US $50; returning to study site to resolve issues in case of app deletion or phone loss, US $15. IG (N=43): access to health coach app. Preuse instruction: demonstration of app functions and reminder settings, encouragement of app use, and answering questions. Check-in: phone call by project staff (after mo 1 and 2) to troubleshoot technical problems. | |
| Hypertension | ||||
| Echeazarra et al [], 2021; Spain | Randomized controlled trial, 2 y, to evaluate the feasibility of developing the chatbot, assess its effectiveness on blood pressure checking and knowledge improvements on best-practice or self-management blood-pressure measurement procedures and investigate advantages of using the CA. | Age: 52.1; gender: 42% women; level of education: 25% basic studies, 22% medium studies, 31% vocational training, 13% university graduated, 9% not specified. Technology skills: NR | IG (N=55): Access to Tensiobot. Preuse instruction: the nurse helps to download and install the Telegram app and with registering to and use of Tensiobot. CG (N=57): receives a written procedure on how to self-monitor blood pressure. | |
| Sakane et al [], 2023; Japan | Randomized controlled trial,a 12 wk, to determine the efficacy of the KENPO app in facilitating weight loss in Japanese adults with obesity and hypertension. | Age: 52 (7); gender: 45% women; level of education: NR; technology skills: NR | Initial counseling by a registered and trained dietician consisting of a briefing about the patients’ health condition and lifestyle and instructions to set achievable personalized behavioral goals. Receive self-monitoring devices (Bluetooth weighing scale, pedometer, and upper arm blood pressure monitor). Check-in: e-mail support at 2, 6, and 12 wk. Kind of support not specified. IG (N=39): access to the app. CG (N=39): active control, usual support based on intensive specific health guidance. Recommended to record step count and body weight daily and measure blood pressure in the morning and evening. | |
| IBSk | ||||
| Hunt et al [], 2021; United States | Randomized controlled crossover trial, 3 mo, to evaluate the efficacy of Zemedy in applying cognitive behavioral therapy to IBS. | Age: 32 (10); gender: 75% women; level of education: NR; technology skills: NR | Reimbursement: US $20 in Amazon credit upon completion of each round of questionnaires. IG (N=62): Access to app, entire intervention delivered within the app, no human involvement. In case of technical difficulties, users could reach out to technical support. Check-in: via email at 4 wk from a research coordinator providing general encouragement to continue working through the app. CG (N=59): waitlist, access to app after 8 wk. Check-in: received a single email at 4 wk encouraging them to hang in there. | |
| Kidney failure | ||||
| Cheng et al [], 2023; Taiwan | Study design: NR, 12 mo, to improve the peritonitis incidence by assisting PDl treated patients with knowledge and quality of self-care by using AIm combined with social media | Age (percentage of participants within age range): 21-50 (42%), 51-70 (47%), >70 (11%); gender: 53% women. Level of education: elementary school (8%), junior high school (7%), senior high school (28%), university (44%), graduate school (13%); technology skills: NR | IG (use data N=440, questionnaire data N=297): introduced to the PD AI-Chatbot | |
| Overactive bladder | ||||
| Sheyn et al [], 2024; United States | Prospective observational study,a 8 wk, to evaluate the efficacy of a digital CA (CeCe) for the treatment of overactive bladder. | Age (median): 61 (IQR 52-67); gender: NR; level of education: NR; technological experience, median: MDPQ-16n: 34 (IQR 25-37), CPQo: 26 (IQR 19-30) | Reimbursement: a total of US $175 was paid over the course of the study to ensure adherence: US $50 at completion of the voiding diary and initial set of questionnaires (d 1-3), US $50 at completion of the 4-wk voiding diary and questionnaires, and US $75 at completion of the 8-wk voiding diary and questionnaires. IG (N=29): CeCe treatment program. Preuse instruction: the research coordinator assisted with setting CeCe up on the patients’ mobile phones, enrolled them in CeCe, and showed them how to use the app. | |
| Primary headaches | ||||
| Ulrich et al [], 2024; Switzerland, Germany, and Austria | Unblinded randomized controlled trial, 24 to 60 d (intervention group); 60 d (control group). To develop a smartphone-based and CA-delivered intervention for people with headache and to evaluate the smartphone-based and CA-delivered interventions’ effectiveness, engagement, and acceptance. | Age: 39 (12); gender: 91% women; level of education: no education 3%, obligatory or high school 1%, vocational training and high school 30%, higher vocational training 15%, university (of applied sciences) 52%, technology skills: NR | IG (N=110): unguided use of the intervention. CG (N=88): waitlist, unguided use of the intervention after 42 d. Received a weekly reminder from the CA during the 42-d waiting period (not described what these reminders include). | |
aReported as pilot study.
bQoL: quality of life.
cNR: not reported.
dIG: intervention group.
eCG: control group.
fAMIE: Addressing Metastatic Individuals Everyday.
gSMT: Smart Monitoring Tool.
hCIM-SHE: Chat-based Instant Messaging Support Health Education.
iNDSS: National Diabetes Service Scheme.
jT2DM: type 2 diabetes mellitus.
kIBS: irritable bowel syndrome.
lPD: peritoneal dialysis.
mAI: artificial intelligence.
nMDPQ-16: Mobile Device Proficiency Questionnaire.
oCPQ: Computer Proficiency Questionnaire.
| Study | Outcome measures | Primary outcomes | ||||
| Efficacy or effectiveness | Feasibility | Acceptability | X>Y: X significantly improved compared to Y. X=Y: No significant difference. | |||
| Atrial fibrillation | ||||||
| Guhl et al [] | Health-related QoLa: Atrial fibrillation effect on QoL measure, pre and post, self-reported at clinical site.b Medication adherence: 2 questions (forget or not take medication), pre and post, self-reported at clinical site. | Participant flow diagram (enrolment, exclusion, attrition, and retention rate). Intervention adherence: median number of conversations and days of use, and mean total interaction duration, duration per conversation, and number of completed modules, and number of reported symptoms, not described as outcome in the methods but reported in the results. | Acceptability: survey including free-text responses (eg, “overall impressions of Tanya”) and closed questions (usefulness, informativeness, trustworthiness, easiness, and repetitiveness of the CAc), post, self-reported at clinical site. | Health-related QoL: IGd>CGe | ||
| Cancer | ||||||
| Tawfik et al [] | Frequency, severity, and distress of physical and psychological chemotherapy-related side effects: Adapted version of the MSASg, pre and post, self-reported, collection NRb,f. Effectiveness of self-care behavior: Modified SCBDh, pre and post, self-reported, collection NR.b | N/Ai | Usability: CUQj, post, self-reported, collection NR. | Frequency, severity, and distress of physical and psychological chemotherapy-related side effects: IG>CG. Effectiveness of self-care behavior: IG>CG | ||
| Greer et al [] | Pre and post, self-reported in app:
| Participant recruitment and flow (enrolment, attrition, exclusion, and retention rate). Engagement (user data):
| CA feedback: Ratings of helpfulness of each lesson, and survey assessing if users would recommend to a friend and why after lesson 7, self-reported in app. | N/A | ||
| Schmitz et al []; Caru et al [] | Schmitz et al []: Assessed at baseline, 3 mo follow-up and post intervention, online via, video conference:
| CONSORTm diagram (enrolment, attrition, exclusion, and retention rate). Feasibility: Number of days out of the first 90 d of exposure to Nurse AMIEn that patients logged in. | Helpfulness of the intervention: one question, daily, in app. Usability: the CSQo, the Credibility and Expectancy Questionnaire, and the user version of the mobile app rating scale. At 3 mo, online video conferencing. Acceptability: % of the eligible patients approached who agreed to participate. | N/A | ||
| Gomaa et al [] | Pre and post, online via web-based platform or on paper during clinical visits:
| Feasibility: study accrual and attrition rates. Use with use data:
| Acceptability: intervention satisfaction with self-made 5-item ratings on a 1-5-point scale evaluating usability and acceptability, post, online via web-based platform or on paper during clinical visits. Acceptability: semistructured interviews on user experience and feedback, post, by phone. | N/A | ||
| Huang et al [] | EDp visits: adjusted incidence rate ratio of ED visits after initiation of chemotherapy, method NR.b Unscheduled hospitalizations: adjusted incidence rate ratio of unscheduled hospitalizations after initiation of chemotherapy, method NR. | Number of consultations with the intervention, use data. Percentage of consultations that require further in-person communication, use data. | Patient satisfaction (method NR, not described as outcome in the methods but reported in the results). | ED visits: IG>CG | ||
| Albino de Queiroz et al [] | QoL: QLQ-C30q and QLQ-CR29r, at 0, 4, and 8 wk, self-reported within the CA, IG only. Eating habits and physical activity: “Food Guide: how to have a healthy diet questionnaire,” pre and post intervention, self-reported within the CA, IG only. Signs, symptoms, and adverse effects: Data from the CG were extracted from the patient’s medical records, and from the IG were obtained from the patient’s self-report, not described as outcome in the methods but reported in the results. | Engagement:
| Usability: SUSs, post, self-reported within the CA. User experience: User Experience Questionnaire, post, self-reported within the CA. | N/A | ||
| Chronic illness | ||||||
| Fang et al [] | Medication adherence: adherence rate using weekly pill counts, pre and post, self-reported, collection NR. | Study recruitment | User experience: ratings (post) and interviews (pre and post), face-to-face. | N/A | ||
| Chronic kidney disease | ||||||
| Chen et al [] | Communicative literacy: survey based on the chronic kidney disease knowledge scale with 2 items added evaluating disease-specific health literacy and disease knowledge, pre and post, by face-to-face or telephonic interviews, depending on the participants’ willingness. | Study enrolment (enrolment, attrition, exclusion, and retention rate) | Acceptability: SUS evaluating usability, weeks 1, 4, and 12 and post, by face-to-face or through telephonic interviews, depending on the participants’ willingness. | N/A | ||
| Chronic pain | ||||||
| Meheli et al [] | Depression (n=69): PHQ-9t, pre and post, self-reported in app. Anxiety (n=57): GAD-7u, pre and post, self-reported in app. Collection moment varies per individual, based on first and last assessments within the 1-y study. | App engagement or disengagement: with use data.
| Perceived needs of users with chronic pain: textual snippets from conversations. | N/A | ||
| Cheng et al [] | Changes in self-reported mental and physical health with the adult PROMIS computer adaptive test Anxiety (version 1.0), Depression (version 1.0), Pain Interference (version 1.1), and Physical Function (version 2.0). Pre and post; postintervention data self-reported online and preintervention data extracted from the electronic medical record. | Inclusion flow sheet (enrolment, attrition, exclusion, and retention rate). Use of the intervention with CA interactions (digital and human coach): time-stamped participant use data. | NR | N/A | ||
| Hauser-Ulrich et al [] | Pre and post, self-reported in app:
| Participant flowchart (enrolment, attrition, exclusion, and retention rate). Intervention adherence: Use data, the ratio of conversations replied to by participants to all conversations initiated by Selma. | Acceptability: survey assessing usefulness, ease of use and enjoyment, satisfaction with intervention duration and number of messages, sufficiency of the content, net promoter score, and what users like most and what they want to see improved, post, self-reported in app. Working alliance: context-adapted German version of the Working Alliance Inventory Short Revised, post, self-reported in app. | Pain-related impairment: IG=CG | ||
| Diabetes | ||||||
| Krishnakumar et al [] | HbA1cv: Independent pathological laboratory test, pre and post.b Pre and post, self-monitoring data in app:
| Participant recruitment and retention flowchart (enrolment, attrition, exclusion, and retention rate). Use: average time spent with the chatbot and health coach with use data. Program engagement: number of interactions with the health coach and AI-powered chatbot with use data. | NR | HbA1c: post > pre | ||
| Roca et al [] | HbA1c: Laboratory blood test, at pre and post. Pre and post, telephone interview:
| Flow diagram of patient selection and completion of the pilot study (enrolment, attrition, exclusion, and retention rate) | Patient user experience: Guided interviews in primary health care centers, 3 mo to evaluate general impression (ease of learning, usefulness, use frequency, covered needs in medication reminders, and easiness of functionalities), observed problems (use, reminders, and vocabulary), and other suggestions (unmet needs and opinions). Health care provider user experience: Self-administered questionnaire assessing ease of use, usefulness, monitoring features, willingness to continue, patient motivation and dissatisfaction, at post. Patients’ use and acceptance of the virtual assistant: Use data to evaluate CA use (daily interactions, functionalities, and answered reminders), interaction preference (numeric and text), acceptance (uninstalls), and usefulness (misunderstandings) | N/A | ||
| Gong et al [] | HbA1c: Pathology blood test, requested from general practitioner, pre and post, in clinic.b Pre and post, self-report survey, online:
| Enrollment, randomization, and follow-up of study participants. (attrition, exclusion, and retention rates). Adoption and use: use data to evaluate duration of chats and number of blood glucose uploads, technical alerts, clinical alerts, and completed chats | NR | HbA1c: IG=CG. health-related QoL: IG>CG | ||
| Bruijnes et al [] | Social diabetes distress: a survey generated from subscales of the type 1 Diabetes Distress Scale and type 2 Diabetes Distress Scale adjusted to 3-wk intervention, post, self-reported online via email invitation. Feeling of being heard: FBHw questionnaire on 7-point Likert scale, post, self-reported online via email invitation. | Exclusion and attrition rate | Attitude toward the intervention: CSQ 8 items, post, self-reported online via email invitation. Usefulness of the implementation of the CA: SUS, post, self-reported online via email invitation. | N/A | ||
| Nassar et al [] | Glycemic outcomes (A1C): extracted from the electronic medical record at pre (most recent A1C measure up to 7 d after enrolment) and post (first A1C measure occurring at least 6 wk after the enrolment date).b Self-care confidence: one question “I feel confident that I can control and manage most of my health problems,” every 3 mo, self-reported in app. | Enrollment and unenrollment (ie, opting to unenroll through the chatbot or by asking the staff). Engagement: the proportion of eligible users who complete an engagement behavior, based on use data from the CA dashboard. Activation: use data evaluating the enrollment ratio (users who clicked through terms and conditions vs those who agreed to enroll), simple engagement (users who returned within 1 mo of their first chat), active engagement (users who completed more than half of their invited chats). Number of completed chats: use data. Red flags: use data evaluating the number of red flags, users with red flags, and reasons for red flags. | Satisfaction: questions (eg, helpfulness of today’s chat) and overall satisfaction with the length and frequency of the chat and quarterly survey, self-reported in app. User experience: Chat comments and posts. | A1C: IG>CG | ||
| HIVy | ||||||
| Dworkin et al [] | Medication adherence: the percentage of users with pill count–based adherence ratio >80%, pre and post, collection method NR.b Pre and post, self-reported, at clinical site:
| Enrollment, attrition, exclusion, and retention rate. Methods NR:
| User experience: Likert-scale survey (use easiness, willingness to continue using after study, and perceived value of the functions, feelings of discomfort, being embarrassed, and in control of their health and being cared for by the avatar, and whether they recommend the app to an HIV-infected friend), post, in-clinic. Extent of app use: Use data. | Medication adherence: post=pre. | ||
| Hypertension | ||||||
| Echeazarra et al [] | Monitoring adherence and quality: contrast between Holter device measurement assisted by nurse and self-monitored blood pressure, post. Blood pressure self-monitoring knowledge and skill: checklist assessed by nurse during clinical appointment, pre and post. Patient adherence: number of correct blood pressure checks (method unspecified) | NR | User satisfaction: satisfaction survey, including questions about ease of use, usefulness, preference, and stopped using the app, post, and method NR | N/A | ||
| Sakane et al [] | Weight loss and BMI: weighing scale, daily, self-monitored and uploaded from device to the cloud. Daily, self-monitored and uploaded from device to the cloud:
| CONSORT flow diagram (enrolment, attrition, exclusion, and retention rate). Device adherence: number of uploads of weight, blood pressure, and steps, self-monitoring data, weekly. | NR | N/A | ||
| IBSx | ||||||
| Hunt et al [] | Pre and post, online, self-reported survey:
| CONSORT diagram of participant flow (enrolment, attrition, exclusion, and retention rate). Received dose: number of modules completed with use data. | NR | QoL: IG>CG after intervention of 8 wk. Symptom severity: IG>CG, after intervention of 8 wk. | ||
| Kidney failure | ||||||
| Cheng [] | Infection rate: Differences in PDy-related infections, during app use, method NR. | User clicks: Number of clicks by patients by use data. | Acceptance and satisfaction OR user satisfaction and willingness to use the PD AIbb Chatbot during the COVID-19 pandemic OR Patient satisfaction: Likert-scale based satisfaction self-report survey at 3 mo.b, qualitative evaluation (NR) | Satisfaction: overall satisfaction=4.5 out of 5. High satisfaction (<4 points) was reported: (1) 93.6% of users believed that the PD AI chatbot could help reduce the length of their hospital stay during OPDaa visits; (2) 98.6% of users reported receiving health education content immediately; (3) 94.9% of users found the chatbot easy to use; and (4) 98.6% of users expressed their desire to continue using the chatbot. | ||
| Overactive bladder | ||||||
| Sheyn et al [] | QoL: Change in the International Consultation on Incontinence-Overactive Bladder QoL Questionnaire, pre and post, self-reported in app.b, self-reported in app:
| Flowchart of participant selection (enrolment, attrition, exclusion, and retention rate) | Usability: SUS, post, self-reported in app. | QoL: post>pre. | ||
| Primary headaches | ||||||
| Ulrich et al [] | Mental well-being: composite score of the PHQ-9 and GAD-7, pre and post, self-reported in app.b Pre and post, self-reported in app:
| Participant flowchart (enrolment, attrition, exclusion, and retention rate). Extent of use: use data assessing the time spent on in-app relaxation and imagination exercises, the number of inactivity reminders, and days taken to complete a coaching module. Engagement with use data: percentage of answered conversational turns between the participant and the CA (higher=higher engagement). Intended use with use data: participants that completed the outro. | Subjective experience, self-reported in app:
| Mental well-being: IG>CG | ||
aQoL: quality of life.
bPrimary and coprimary outcomes.
cCA: conversational agent.
dIG: intervention group.
eCG: control group.
fNR: not reported.
gMSAS: Memorial Symptoms Assessment Scale.
hSCBD: self-care behaviors diary.
iN/A: not applicable.
jCUQ: Chatbot Usability Questionnaire.
kPROMIS: Patient-Reported Outcomes Measurement Information System.
lSF-36: 36-Item Short Form Survey.
mCONSORT: Consolidated Standards of Reporting Trials.
nAMIE: Addressing Metastatic Individuals Everyday.
oCSQ: Client Satisfaction Questionnaire.
pED: emergency department.
qQLQ-C30: Quality of life of Cancer Patients survey.
rQLQ-CR29: Quality of Life of the Colorectal Cancer patients survey.
sSUS: System Usability Scale.
tPHQ-9: Patient Health Questionnaire-9.
uGAD-7: Generalized Anxiety Disorder-7.
vHbA1c: hemoglobin A1c.
wFBH: Feeling of Being Heard.
xIBS: irritable bowel syndrome.
yPD: peritoneal dialysis.
zAI: artificial intelligence.
aaOPD: outpatient department.
Efficacy or Effectiveness Outcomes
Clinical manifestations of the disease were assessed most often (11/24, 46%). These included physical function examined with a chair test in cancer and a survey in chronic pain [,]; symptoms of cancer, pain, headaches, overactive bladder, and irritable bowel syndrome evaluated with surveys or a diary [,,,,,,]; blood glucose levels determined with blood tests in diabetes [,,,]; BMI or weight assessed by a general practitioner in clinic or self-monitored with a weighing scale in type 2 diabetes, obesity, and hypertension [,,]; and blood pressure examined using a blood pressure monitor in obesity and hypertension []. In addition, medical complications were assessed using unspecified methods, such as infection rate from peritoneal dialysis for chronic kidney disease [] and emergency department visits and unscheduled hospitalizations of patients with cancer []. Side effects of chemotherapy for cancer were measured once with a survey [].
More general outcomes were measured with a variety of (disease-specific) surveys. General outcomes encompassed the (health-related) QoL of patients with colorectal cancer, irritable bowel syndrome, atrial fibrillation, diabetes, and an overactive bladder [,,,,] and mental health–related outcomes, with anxiety or depression being the most frequent, in patients with cancer, chronic pain, irritable bowel syndrome, diabetes, headaches, or an overactive bladder [-,,,,,-].
Self-management [] or related concepts such as self-care confidence [] and self-efficacy [,] were assessed using standardized questionnaires [,,] or by adding one question to the test battery []. In addition, 13% (3/24) of the studies measured the effect of education with surveys assessing communicative literacy [], health literacy [], and blood pressure self-monitoring knowledge and skills []. Moreover, 8.3% (2/24) of the studies evaluated the participants’ intention to change their behavior [,]. Some (8/24, 33%) studies assessed changes in behavior with step counts using a pedometer [,], healthy habits using a survey [], changes in food consumption and physical activity using a survey [], application of BCTs using a survey [], adherence to medication using pill counts [,], questions about (changes in) adherence [,], adherence to monitoring using the number of correct measures [], and monitoring quality by comparing self-assessed measurements to gold standards [,,,]. The effectiveness of self-care behaviors was assessed once []. The impact of the intervention on health care use was measured twice [,].
Feasibility Outcomes
One study [] reported accrual and attrition rates as feasibility measures. Another study [] defined feasibility as at least 50% of participants logging in on 30 of the first 90 days. Most studies did not define feasibility but assessed metrics that we classified as feasibility. The majority (17/24, 71%) reported participant flow data from which enrollment, attrition, exclusion, or retention rates could be inferred [-,,-,,,-]. Many studies (17/24, 71%) examined use data metrics to assess intervention adherence [,,], engagement [,,,,,], adoption [], activation [], intended use [], and the extent of use [,,,,,]. “Intervention adherence” was assessed by conversation replies [] and self-monitoring uploads []. Measures of “engagement” varied widely; for instance, one study [] defined “engaged sessions” as the number of completed sessions, while other studies considered the quality of data reported by the participants [] or the percentage of answered conversational turns []. One study [] examined a similar measure, the ratio of conversations answered by the user to all conversations the CA initiated, but labeled it intervention adherence. Similarly, “extent of use” was measured inconsistently, varying from the number of consultations [,] to time spent in-app []. Some used “extent of use” to assess “intervention exposure” [] and “received dosage” []. Others included use metrics without further specification of what they intended to measure [,,,] or reported what they measured without describing how []. In addition, 8.3% (2/24) of the studies did not evaluate feasibility [,].
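To illustrate why these labels matter, the two use-data measures described above can be written out explicitly. The following is a minimal sketch and not taken from any included study: the record layout (`initiated_by`, `replied`), the function names, and the reading of the login criterion as "at least 30 distinct login days within the first 90 days" are all assumptions made for illustration.

```python
from datetime import date, timedelta

def reply_ratio(conversations):
    """Ratio of CA-initiated conversations that the participant replied to.

    `conversations` is a hypothetical list of dicts with the keys
    "initiated_by" ("ca" or "user") and "replied" (bool).
    """
    initiated = [c for c in conversations if c["initiated_by"] == "ca"]
    if not initiated:
        return None  # undefined when the CA never initiated a conversation
    return sum(1 for c in initiated if c["replied"]) / len(initiated)

def meets_login_threshold(login_dates, start, window_days=90, min_days=30):
    """True if the participant logged in on at least `min_days` distinct days
    within the first `window_days` days after `start` (assumed reading)."""
    end = start + timedelta(days=window_days)
    active_days = {d for d in login_dates if start <= d < end}
    return len(active_days) >= min_days

def is_feasible(logins_by_participant, start_by_participant, min_share=0.5):
    """Feasibility criterion: at least `min_share` of participants meet the
    login threshold."""
    met = [
        meets_login_threshold(logins_by_participant[p], start_by_participant[p])
        for p in start_by_participant
    ]
    if not met:
        return False
    return sum(met) / len(met) >= min_share
```

Whether the first function is reported as "engagement" or "intervention adherence" is a labeling choice, which is why the same computation can appear under different names across studies.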
Acceptability Outcomes
Some (10/24, 42%) studies reported acceptability. Acceptability was assessed with nonstandardized surveys [,,], the extent of app use [], usability using the System Usability Scale [], or intervention satisfaction by a usability and acceptability survey combined with an interview to collect user experience and feedback []. One study [] reported “engagement and acceptance,” including intention and commitment to change behavior, working alliance, participants’ sensitivity to triggers, and tendency to avoid triggers. Another study [] reported on “patient use and acceptance of CAs,” encompassing use of CA (tools and reminders), acceptance (number of users who did not uninstall signal), and usefulness (number of times the patient was not understood by the CA). One study [] defined an acceptability threshold as 50% of patients agreeing to participate. Some of these studies included additional metrics that we classified as acceptability, such as usability [] and working alliance [] using standardized questionnaires, helpfulness of the intervention [], and perceived enjoyment using a single question [].
Some authors did not report on acceptability but reported measures that we classified as acceptability, such as usability assessed with standardized questionnaires (ie, Chatbot Usability Questionnaire [] or System Usability Scale [,,]), CA feedback [], helpfulness of the intervention [], or user experience and satisfaction [,,] using self-composed surveys consisting of (single) questions, ratings, or the User Experience Questionnaire []. In addition, 13% (3/24) of studies interviewed their participants about the user experience [,,]. One study [] referred to patient satisfaction in their results, but a description of the method was lacking, and another reported measures related to user experience but left them undefined []. The Client Satisfaction Questionnaire was used to address attitudes toward the intervention in one study [] and usability in another [,].
Primary or Coprimary Study Results and RoB
Most study findings support the effectiveness of the CAs [,,,,,,]. Their interventions were effective in elevating (health-related) QoL in patients with atrial fibrillation [], diabetes [], irritable bowel syndrome [], and overactive bladder []. Other studies showed that their interventions improved mental well-being in patients with chronic headaches [] and diminished symptom severity in patients with irritable bowel syndrome []. In cancer, the interventions led to fewer emergency department visits of patients with gynecologic cancer [] and diminished chemotherapy-related side effects and improved effectiveness of self-care behaviors in breast cancer []. In diabetes, 8.3% (2/24) of studies found improved blood glucose levels [,] while another study found no effect on blood glucose levels []. Pain-related impairment in chronic pain did not differ between the intervention and control group []. Another CA intervention did not significantly improve medication adherence in young HIV-positive men who have sex with men []. Furthermore, one study showed acceptability of their CA with a high overall user satisfaction (1/24, 4%) [].
All included studies that reported primary outcomes were rated as having a high, serious, or critical RoB [,,-,,,,,,]. Missing outcome data were a shared concern in randomized controlled trials and nonrandomized trials: 29% (7/24) of studies reported substantial missing data without providing reasons or using appropriate methods to account for the missing data [-,,,,]. In the randomized controlled trials, the most common source of high RoB was outcome measurement [,,,,,]. Given the nature of the interventions, participants were often unblinded, and primary outcomes, such as QoL, were self-reported and susceptible to influence by participants’ expectations or beliefs about the intervention. In nonrandomized trials, bias also often emerged from a failure to control for key confounding factors, such as age, digital skills, comorbidities, and disease stage [,,,].
Discussion
Principal Findings
This review provides a comprehensive overview of the current state of intervention descriptions and evaluation methods for CAs supporting people to self-manage chronic diseases. Despite promising development, significant gaps remain in intervention descriptions and evaluations. We discuss these findings in further detail, highlighting existing strengths and directions for future research.
Characteristics of CAs and Integration of BCTs
Consistent with prior reviews [,], nearly all CAs were designed for specific diseases, such as cancer and diabetes. This allows CAs to be attuned to the disease-specific needs of people. Interventions aimed at improving adherence, self-care confidence, and behavior. However, intervention descriptions were inconsistent or incomplete, particularly regarding conversational techniques. This obscures how responses are generated and how conversations flow, and it limits replicability. To illustrate, one study reported on AI-powered decision support [], but because a description of the conversational techniques is lacking, it is unclear how this functionality works. This aligns with earlier research highlighting inconsistent documentation of AI methodologies in CAs for chronic disease self-management []. Despite the rapid advancements in AI, the number of AI-driven CAs identified in our selection was lower than expected, and many AI-driven CAs relied on basic natural language processing rather than more advanced AI capabilities. This highlights the early-stage nature of AI integration in health care CA research. This is further supported by our observation that studies investigating more advanced AI-driven CAs evaluated response accuracy rather than effects on disease self-management.
BCT clusters “feedback and monitoring,” “shaping knowledge,” “natural consequences,” and “associations” were most prominently integrated. Findings from other studies examining eHealth interventions underscore the prevalence of feedback and monitoring, shaping knowledge, and associations [,]. “Associations” were often prompts to CA use, which can also be considered a persuasive feature of technology. Furthermore, CAs demonstrate the unique capacity to deliver “social support” by serving as a source of support themselves. Explicit reporting on BCT integration was often lacking. Only a few studies explicitly reported the theoretical frameworks guiding their interventions or the BCTs used, a concern echoed by a previous review on BCTs in self-management interventions for chronic obstructive pulmonary disease []. This raises the question of whether the intervention was designed using established theoretical frameworks or if such frameworks were absent. The latter could potentially result in less effective interventions. Conversely, if a theoretical basis exists, insufficient reporting generates ambiguity in two critical areas: (1) the chosen BCTs and (2) the role of technology in delivering these techniques. Clear reporting of the BCTs and their delivery, as provided by Ulrich et al [], is crucial to evaluate the theoretical underpinnings, providing insights into the mechanism behind behavior change and improving replicability and comparability.
Methodological Limitations in Intervention Evaluation
The primary findings of the studies supported the potential of CAs to improve self-management and health outcomes. A previous review similarly suggested that AI-powered chatbots may contribute to better health outcomes; however, the evidence was limited due to insufficient technical documentation []. These findings should be interpreted with caution, as all included studies exhibited a heightened RoB—a concern raised by earlier literature as well [].
The field of CA interventions for chronic disease self-management remains in an early stage [,], with many studies being exploratory. Also, the heterogeneity in study designs, variability in taxonomy, and use of broad (not unified) outcomes present a major challenge [,,]. Consequently, a meta-analysis is not feasible [,]. Many studies relied on nonstandardized acceptability surveys. Clinical outcomes were often self-reported. Although self-reported clinical outcomes are valuable for capturing patient experiences, they are also prone to subjectivity, recall, and social desirability biases []. Combining self-reported outcomes with objective measures, such as blood glucose levels [,], offers a more comprehensive perspective on the patient’s health, reduces bias, and enhances reliability. Another issue is that few studies measured outcomes related to self-management (eg, knowledge), despite their critical role in improving health. This is a significant gap that must be closed to understand how CAs contribute to improving health outcomes.
Selection bias is a concern. Participants were often required to possess a compatible device. This eligibility criterion excludes less technology-oriented individuals or individuals with lower income, thereby limiting the generalizability of findings to more diverse populations. Dworkin et al [] tried to overcome this bias by providing a study loaner phone. Moreover, minimal attention was given to patients’ technical proficiency, a factor that directly impacts engagement and outcomes, complicating the understanding of whether limited effects are due to the intervention or users’ technological comfort. At a minimum, studies should describe the technological proficiency of the study group using standardized questionnaires [].
While CAs are often proposed as solutions to health care resource scarcity, studies provided minimal details about the care environment where the interventions were implemented, and it was often unclear whether the interventions were intended to supplement or replace usual care. Also, the associated time investment of human involvement was not reported. Both hamper the understanding of the resource implications of implementing these interventions.
Limitations
This study was not without shortcomings. Only 2 databases were used in our search. Consequently, we might have missed some potentially relevant articles. However, our search in Embase and PubMed gives a good representation of the articles published within the biomedical field, which was our key focus. Also, we included a considerable number of papers, and our main findings were consistently observed throughout the articles included. Therefore, even if some articles were missing, it is unlikely that additional articles would alter our results. The article inclusion and data extraction process was conducted by a single researcher, which increases the risk of subjective bias. To mitigate this, the inclusion and selection process was frequently discussed and reviewed with 2 additional researchers whenever uncertainties about an article’s inclusion arose, and the extracted data were cross-checked with the original articles to ensure accuracy. Moreover, the categorization of BCT clusters involved a degree of interpretation, which could affect the consistency of the findings. However, by categorizing BCTs in consensus meetings with 3 researchers, we minimized this risk. The limited detail in intervention reporting may have led to an underestimation of the presence and diversity of BCTs in this analysis. This concern is supported by the observation that the only study [] that reported which BCTs were delivered and how included the most (diverse) BCT clusters. The same applies to the classification of outcome variables and conversational techniques, which requires subjective judgment. However, to enable comparability across studies, this categorization was the preferred option.
Recommendations for Future CAs in Health Care Research
To improve the rapidly evolving field of research on CAs for chronic disease self-management, our review highlights several areas for advancing the evaluation of CAs.
Standardize Reporting
Adopt frameworks like CONSORT-eHealth to ensure comprehensive documentation of the study and the BIT model for detailed reports of CA intervention components, distinguishing between BCTs and their delivery by technology. We believe this enhances replicability and comparability across studies and allows us to evaluate what (combinations of) BCTs are most effective for chronic disease self-management and how they should be delivered to the user. This contributes to more robust and transparent evidence, including a better understanding of the working mechanisms of CAs as self-management interventions. If a study aims to further characterize the dialogue management systems of CAs, the framework by Laranjo et al [] offers a valuable basis to refine CA characterization and technical evaluation.
Enhance and Broaden Methodological Rigor
Use validated, standardized evaluation methods for feasibility, acceptability, and clinical efficacy. Also, add objective disease-specific and technology measures (eg, log data for reach and engagement) alongside self-reported outcomes to strengthen the evidence of clinical effectiveness and adoption of the CA. Objective measures are essential to minimize bias, given the challenges of blinding participants in CA interventions. A shared taxonomy and clear definitions of the intended measures are key. This will improve reliability and comparability across studies and allow meta-analysis across multiple studies. As attrition is typically high in eHealth trials [,], including those in our assessment, robust strategies for handling missing data, such as sensitivity analysis [], multiple imputation [] using relevant covariates, and transparent reporting of missingness and its causes [], are essential to reduce bias. In addition, greater attention should be given to key factors, such as age, digital skills, health or eHealth literacy, comorbidities, and disease stage, which may influence CA engagement and effectiveness. Incorporate methodological frameworks from multiple domains, such as human-centered design research, to advance CA research into a more agile field of science [].
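As an illustration of transparent missingness reporting and one simple form of sensitivity analysis, the sketch below computes the share of missing outcomes per arm and the range of the between-group difference under extreme assumptions about the missing values. It is a hedged example only: it assumes a binary outcome coded 1 (improved) or 0 (not improved) with `None` for missing data, the function names are hypothetical, and this best-case and worst-case approach is not necessarily the method described in the cited references.

```python
def missingness(outcomes):
    """Fraction of participants whose outcome is missing (None)."""
    return sum(o is None for o in outcomes) / len(outcomes)

def proportion(outcomes, fill):
    """Proportion improved after replacing every missing outcome with `fill`."""
    filled = [fill if o is None else o for o in outcomes]
    return sum(filled) / len(filled)

def extreme_case_bounds(intervention, control):
    """Range of the difference in proportions (intervention minus control).

    Worst case: all missing intervention outcomes count as not improved and
    all missing control outcomes count as improved. Best case: the reverse.
    """
    worst = proportion(intervention, 0) - proportion(control, 1)
    best = proportion(intervention, 1) - proportion(control, 0)
    return worst, best
```

If the conclusion of a trial holds across the whole range returned by `extreme_case_bounds`, missing data are unlikely to explain the result; if the range spans zero, the finding depends on assumptions about participants who dropped out.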
Conduct Real-World Evaluations
This includes evaluating the role of CAs in supplementing or replacing traditional care, reporting on associated resource implications, and examining their impact on health care use. We recommend evaluating the implementation of CAs for self-management using established, well-defined outcomes with respect to BIT use in routine practice settings [] and using real-world evaluation frameworks that consider the complexity of the health care system as an evaluation context [,].
Leverage Advanced AI
Expanding the use of AI-driven personalization and decision-support functionalities may enhance user engagement and intervention efficacy. Improved personalization based on individual data, together with dynamic adaptation to and learning from the interaction, can support self-management by fostering better decision-making (eg, predicting behavioral patterns, identifying potential challenges, and proactively suggesting solutions) and self-tailoring (refining suggestions and ensuring interventions remain relevant to the user). Furthermore, incorporating sentiment analysis into CAs may enhance their capacity to deliver social support that is both empathetic and contextually appropriate [].
Acknowledgments
This research is part of the PRIME project, which was funded by the Gatsby Foundation (GAT3676) as well as by the Ministry of Economic Affairs by means of the Public-Private Partnership Allowance made available by the Top Sector Life Sciences & Health to stimulate public-private partnerships. The Center of Expertise for Parkinson & Movement Disorders was supported by a center of excellence grant from the Parkinson Foundation.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Authors' Contributions
TFP, LJWE, NMdV, and SWvdB contributed to the conceptualization of the systematic literature review. TFP, SWvdB, and NMdV were responsible for the methodology. TFP conducted data curation and led the investigation, with contributions from RAvD and SWvdB. TFP, NMdV, and SWvdB drafted the original manuscript. BRB was responsible for funding acquisition. All authors contributed to critical review and editing of the manuscript.
Conflicts of Interest
BRB serves as the coeditor in chief for the Journal of Parkinson’s Disease, serves on the editorial board of Practical Neurology and Digital Biomarkers, has received fees from serving on the scientific advisory board for the Critical Path Institute, Gyenno Science, MedRhythms, UCB, Kyowa Kirin and Zambon (paid to the Institute), has received fees for speaking at conferences from AbbVie, Bial, Biogen, GE Healthcare, Oruen, Roche, UCB and Zambon (paid to the Institute), and has received research support from Biogen, Cure Parkinson’s, Davis Phinney Foundation, Edmond J. Safra Foundation, Fred Foundation, Gatsby Foundation, Hersenstichting Nederland, Horizon 2020, IRLAB Therapeutics, Maag Lever Darm Stichting, Michael J Fox Foundation, Ministry of Agriculture, Ministry of Economic Affairs and Climate Policy, Ministry of Health, Welfare and Sport, Netherlands Organization for Scientific Research (ZonMw), Not Impossible, Parkinson Vereniging, Parkinson’s Foundation, Parkinson’s UK, Stichting Alkemade-Keuls, Stichting Parkinson NL, Stichting Woelse Waard, Health Holland/Topsector Life Sciences and Health, UCB, Verily Life Sciences, Roche and Zambon. BRB does not hold any stocks or stock options with any companies that are connected to Parkinson’s disease or to any of his clinical or research activities. All other authors declare no conflicts of interest.
PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 checklist.
DOCX File, 272 KB
Risk of bias assessment.
PDF File (Adobe PDF File), 224 KB
References
- Vandenberghe D, Albrecht J. The financial burden of non-communicable diseases in the European Union: a systematic review. Eur J Public Health. Aug 01, 2020;30(4):833-839.
- Hacker K. The burden of chronic disease. Mayo Clin Proc Innov Qual Outcomes. Jan 2024;8(1):112-119.
- Grady PA, Gough LL. Self-management: a comprehensive approach to management of chronic conditions. Am J Public Health. Aug 2014;104(8):e25-e31.
- Allegrante JP, Wells MT, Peterson JC. Interventions to support behavioral self-management of chronic diseases. Annu Rev Public Health. Apr 01, 2019;40(1):127-146.
- Lorig KR, Holman HR. Self-management education: history, definition, outcomes, and mechanisms. Ann Behav Med. Aug 2003;26(1):1-7.
- Griffin AC, Xing Z, Khairat S, Wang Y, Bailey S, Arguello J, et al. Conversational agents for chronic disease self-management: a systematic review. AMIA Annu Symp Proc. 2020;2020:504-513.
- Barlow J, Wright C, Sheasby J, Turner A, Hainsworth J. Self-management approaches for people with chronic conditions: a review. Patient Educ Couns. Oct 2002;48(2):177-187.
- Romano MF, Shih LC, Paschalidis IC, Au R, Kolachalama VB. Large language models in neurology research and future practice. Neurology. Dec 04, 2023;101(23):1058-1067.
- Michie S, Richardson M, Johnston M, Abraham C, Francis J, Hardeman W, et al. The behavior change technique taxonomy (v1) of 93 hierarchically clustered techniques: building an international consensus for the reporting of behavior change interventions. Ann Behav Med. Aug 2013;46(1):81-95.
- Mohr DC, Schueller SM, Montague E, Burns MN, Rashidi P. The behavioral intervention technology model: an integrated conceptual and technological framework for eHealth and mHealth interventions. J Med Internet Res. Jun 05, 2014;16(6):e146.
- Bin Sawad A, Narayan B, Alnefaie A, Maqbool A, Mckie I, Smith J, et al. A systematic review on healthcare artificial intelligent conversational agents for chronic conditions. Sensors (Basel). Mar 29, 2022;22(7):2625.
- Kurniawan MH, Handiyani H, Nuraini T, Hariyati RTS, Sutrisno S. A systematic review of artificial intelligence-powered (AI-powered) chatbot intervention for managing chronic illness. Ann Med. Dec 11, 2024;56(1):2302980.
- Moura L, Jones DT, Sheikh IS, Murphy S, Kalfin M, Kummer BR, et al. Implications of large language models for quality and efficiency of neurologic care: emerging issues in neurology. Neurology. Jun 11, 2024;102(11):e209497.
- Schachner T, Keller R, V Wangenheim F. Artificial intelligence-based conversational agents for chronic conditions: systematic literature review. J Med Internet Res. Sep 14, 2020;22(9):e20701.
- Patrick K, Hekler EB, Estrin D, Mohr DC, Riper H, Crane D, et al. The pace of technologic change: implications for digital health behavior intervention research. Am J Prev Med. Nov 2016;51(5):816-824.
- Bérubé C, Schachner T, Keller R, Fleisch E, V Wangenheim F, Barata F, et al. Voice-based conversational agents for the prevention and management of chronic and mental health conditions: systematic literature review. J Med Internet Res. Mar 29, 2021;23(3):e25933. [FREE Full text] [CrossRef] [Medline]
- Jiang Z, Huang X, Wang Z, Liu Y, Huang L, Luo X. Embodied conversational agents for chronic diseases: scoping review. J Med Internet Res. Jan 09, 2024;26:e47134. [FREE Full text] [CrossRef] [Medline]
- Uetova E, Hederman L, Ross R, O'Sullivan D. Exploring the characteristics of conversational agents in chronic disease management interventions: a scoping review. Digit Health. 2024;10:20552076241277693. [FREE Full text] [CrossRef] [Medline]
- Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. Mar 29, 2021;372:n71. [FREE Full text] [CrossRef] [Medline]
- About chronic diseases. Centers for Disease Control and Prevention. URL: https://www.cdc.gov/chronic-disease/about/index.html [accessed 2024-07-22]
- Eysenbach G, CONSORT-EHEALTH Group. CONSORT-EHEALTH: improving and standardizing evaluation reports of web-based and mobile health interventions. J Med Internet Res. Dec 31, 2011;13(4):e126. [FREE Full text] [CrossRef] [Medline]
- Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. Oct 12, 2016;355:i4919. [FREE Full text] [CrossRef] [Medline]
- Sterne JA, Savović J, Page MJ, Elbers RG, Blencowe NS, Boutron I, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. Aug 28, 2019;366:l4898. [FREE Full text] [CrossRef] [Medline]
- Caru M, Abdullah S, Qiu L, Kanski B, Gordon B, Truica CI, et al. Women with metastatic breast cancer don't just follow step-count trends, they exceed them: an exploratory study. Breast Cancer Res Treat. Jul 25, 2023;200(2):265-270. [FREE Full text] [CrossRef] [Medline]
- Schmitz KH, Kanski B, Gordon B, Caru M, Vasakar M, Truica CI, et al. Technology-based supportive care for metastatic breast cancer patients. Support Care Cancer. Jun 20, 2023;31(7):401. [CrossRef] [Medline]
- Cheng AL, Agarwal M, Armbrecht MA, Abraham J, Calfee RP, Goss CW. Behavioral mechanisms that mediate mental and physical health improvements in people with chronic pain who receive a digital health intervention: prospective cohort pilot study. JMIR Form Res. Nov 17, 2023;7:e51422. [FREE Full text] [CrossRef] [Medline]
- Meheli S, Sinha C, Kadaba M. Understanding people with chronic pain who use a cognitive behavioral therapy-based artificial intelligence mental health app (Wysa): mixed methods retrospective observational study. JMIR Hum Factors. Apr 27, 2022;9(2):e35671. [FREE Full text] [CrossRef] [Medline]
- Tawfik E, Ghallab E, Moustafa A. A nurse versus a chatbot ‒ the effect of an empowerment program on chemotherapy-related side effects and the self-care behaviors of women living with breast cancer: a randomized controlled trial. BMC Nurs. Apr 06, 2023;22(1):102. [FREE Full text] [CrossRef] [Medline]
- Greer S, Ramo D, Chang Y, Fu M, Moskowitz J, Haritatos J. Use of the chatbot "Vivibot" to deliver positive psychology skills and promote well-being among young people after cancer treatment: randomized controlled feasibility trial. JMIR Mhealth Uhealth. Oct 31, 2019;7(10):e15018. [FREE Full text] [CrossRef] [Medline]
- Gomaa S, Posey J, Bashir B, Basu Mallick A, Vanderklok E, Schnoll M, et al. Feasibility of a text messaging-integrated and chatbot-interfaced self-management program for symptom control in patients with gastrointestinal cancer undergoing chemotherapy: pilot mixed methods study. JMIR Form Res. Nov 10, 2023;7:e46128. [FREE Full text] [CrossRef] [Medline]
- Huang MY, Weng CS, Kuo HL, Su YC. Using a chatbot to reduce emergency department visits and unscheduled hospitalizations among patients with gynecologic malignancies during chemotherapy: a retrospective cohort study. Heliyon. May 2023;9(5):e15798. [FREE Full text] [CrossRef] [Medline]
- Chen NJ, Huang C, Fan C, Lu L, Lin F, Liao J, et al. User evaluation of a chat-based instant messaging support health education program for patients with chronic kidney disease: preliminary findings of a formative study. JMIR Form Res. Sep 19, 2023;7:e45484. [FREE Full text] [CrossRef] [Medline]
- Hauser-Ulrich S, Künzli H, Meier-Peterhans D, Kowatsch T. A smartphone-based health care chatbot to promote self-management of chronic pain (SELMA): pilot randomized controlled trial. JMIR Mhealth Uhealth. Apr 03, 2020;8(4):e15806. [FREE Full text] [CrossRef] [Medline]
- Krishnakumar A, Verma R, Chawla R, Sosale A, Saboo B, Joshi S, et al. Evaluating glycemic control in patients of south Asian origin with type 2 diabetes using a digital therapeutic platform: analysis of real-world data. J Med Internet Res. Mar 25, 2021;23(3):e17908. [FREE Full text] [CrossRef] [Medline]
- Nassar CM, Dunlea R, Montero A, Tweedt A, Magee MF. Feasibility and preliminary behavioral and clinical efficacy of a diabetes education chatbot pilot among adults with type 2 diabetes. J Diabetes Sci Technol. Jun 06, 2023;19(1):54-62. [CrossRef] [Medline]
- Echeazarra L, Pereira J, Saracho R. TensioBot: a chatbot assistant for self-managed in-house blood pressure checking. J Med Syst. Mar 15, 2021;45(4):54. [CrossRef] [Medline]
- Sakane N, Suganuma A, Domichi M, Sukino S, Abe K, Fujisaki A, et al. The effect of a mHealth app (KENPO-app) for specific health guidance on weight changes in adults with obesity and hypertension: pilot randomized controlled trial. JMIR Mhealth Uhealth. Apr 12, 2023;11:e43236. [FREE Full text] [CrossRef] [Medline]
- Hunt M, Miguez S, Dukas B, Onwude O, White S. Efficacy of Zemedy, a mobile digital therapeutic for the self-management of irritable bowel syndrome: crossover randomized controlled trial. JMIR Mhealth Uhealth. May 20, 2021;9(5):e26152. [FREE Full text] [CrossRef] [Medline]
- Cheng CI, Lin WJ, Liu HT, Chen YT, Chiang CK, Hung KY. Implementation of artificial intelligence Chatbot in peritoneal dialysis nursing care: experience from a Taiwan medical center. Nephrology (Carlton). Dec 12, 2023;28(12):655-662. [CrossRef] [Medline]
- Albino de Queiroz D, Silva Passarello R, Veloso de Moura Fé V, Rossini A, Folchini da Silveira E, Aparecida Isquierdo Fonseca de Queiroz E, et al. A wearable chatbot-based model for monitoring colorectal cancer patients in the active phase of treatment. Healthc Anal. Dec 2023;4:100257. [CrossRef]
- Guhl E, Althouse AD, Pusateri AM, Kimani E, Paasche-Orlow MK, Bickmore TW, et al. The atrial fibrillation health literacy information technology trial: pilot trial of a mobile health app for atrial fibrillation. JMIR Cardio. Sep 04, 2020;4(1):e17162. [FREE Full text] [CrossRef] [Medline]
- Fang KY, Bjering H, Ginige A. Adherence, avatars and where to from here. Stud Health Technol Inform. 2018;252:45-50. [Medline]
- Gong E, Baptista S, Russell A, Scuffham P, Riddell M, Speight J, et al. My diabetes coach, a mobile app-based interactive conversational agent to support type 2 diabetes self-management: randomized effectiveness-implementation trial. J Med Internet Res. Nov 05, 2020;22(11):e20322. [FREE Full text] [CrossRef] [Medline]
- Roca S, Lozano ML, García J, Alesanco Á. Validation of a virtual assistant for improving medication adherence in patients with comorbid type 2 diabetes mellitus and depressive disorder. Int J Environ Res Public Health. Nov 17, 2021;18(22):12056. [FREE Full text] [CrossRef] [Medline]
- Dworkin MS, Lee S, Chakraborty A, Monahan C, Hightow-Weidman L, Garofalo R, et al. Acceptability, feasibility, and preliminary efficacy of a theory-based relational embodied conversational agent mobile phone intervention to promote HIV medication adherence in young HIV-positive African American MSM. AIDS Educ Prev. Feb 2019;31(1):17-37. [CrossRef] [Medline]
- Ulrich S, Gantenbein AR, Zuber V, Von Wyl A, Kowatsch T, Künzli H. Development and evaluation of a smartphone-based chatbot coach to facilitate a balanced lifestyle in individuals with headaches (BalanceUP app): randomized controlled trial. J Med Internet Res. Jan 24, 2024;26:e50132. [FREE Full text] [CrossRef] [Medline]
- Bruijnes M, Kesteloo M, Brinkman WP. Reducing social diabetes distress with a conversational agent support system: a three-week technology feasibility evaluation. Front Digit Health. Jun 13, 2023;5:1149374. [FREE Full text] [CrossRef] [Medline]
- Sheyn D, Chakraborty N, Chen YB, Mahajan ST, Hijaz A. Use of a digital conversational agent for the management of overactive bladder. Urogynecology (Phila). Jun 01, 2024;30(6):536-544. [CrossRef] [Medline]
- Qiu L, Kanski B, Doerksen S, Winkels R, Schmitz KH, Abdullah S. Nurse AMIE: using smart speakers to provide supportive care intervention for women with metastatic breast cancer. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 2021. Presented at: CHI EA '21; May 8-13, 2021:1-7; Yokohama, Japan. URL: https://doi.org/10.1145/3411763.3451827 [CrossRef]
- Roca S, Sancho J, García J, Alesanco Á. Microservice chatbot architecture for chronic patient support. J Biomed Inform. Feb 2020;102:103305. [FREE Full text] [CrossRef] [Medline]
- Martinengo L, Jabir AI, Goh WW, Lo NY, Ho MR, Kowatsch T, et al. Conversational agents in health care: scoping review of their behavior change techniques and underpinning theory. J Med Internet Res. Oct 03, 2022;24(10):e39243. [FREE Full text] [CrossRef] [Medline]
- Asbjørnsen RA, Smedsrød ML, Solberg Nes L, Wentzel J, Varsi C, Hjelmesæth J, et al. Persuasive system design principles and behavior change techniques to stimulate motivation and adherence in electronic health interventions to support weight loss maintenance: scoping review. J Med Internet Res. Jun 21, 2019;21(6):e14265. [FREE Full text] [CrossRef] [Medline]
- Lenferink A, Brusse-Keizer M, van der Valk PD, Frith PA, Zwerink M, Monninkhof EM, et al. Self-management interventions including action plans for exacerbations versus usual care in patients with chronic obstructive pulmonary disease. Cochrane Database Syst Rev. Aug 04, 2017;8(8):CD011682. [FREE Full text] [CrossRef] [Medline]
- Laranjo L, Dunn AG, Tong HL, Kocaballi AB, Chen J, Bashir R, et al. Conversational agents in healthcare: a systematic review. J Am Med Inform Assoc. Sep 01, 2018;25(9):1248-1258. [FREE Full text] [CrossRef] [Medline]
- Eysenbach G. The law of attrition. J Med Internet Res. Mar 31, 2005;7(1):e11. [FREE Full text] [CrossRef] [Medline]
- Blankers M, Koeter MW, Schippers GM. Missing data approaches in eHealth research: simulation study and a tutorial for nonmathematically inclined researchers. J Med Internet Res. Dec 19, 2010;12(5):e54. [FREE Full text] [CrossRef] [Medline]
- Hekler EB, Klasnja P, Harlow J. Agile science. In: Gellman MD, editor. Encyclopedia of Behavioral Medicine. Cham, Switzerland. Springer; 2020:66-71.
- Hermes ED, Lyon AR, Schueller SM, Glass JE. Measuring the implementation of behavioral intervention technologies: recharacterization of established outcomes. J Med Internet Res. Jan 25, 2019;21(1):e11752. [FREE Full text] [CrossRef] [Medline]
- Kim M, Patrick K, Nebeker C, Godino J, Stein S, Klasnja P, et al. The digital therapeutics real-world evidence framework: an approach for guiding evidence-based digital therapeutics design, development, testing, and monitoring. J Med Internet Res. Mar 05, 2024;26:e49208. [FREE Full text] [CrossRef] [Medline]
- Greenhalgh T, Abimbola S. The NASSS framework - a synthesis of multiple theories of technology implementation. Stud Health Technol Inform. Jul 30, 2019;263:193-204. [CrossRef] [Medline]
Abbreviations
| AI: artificial intelligence |
| BCT: behavior change technique |
| BIT: behavioral intervention technology |
| CA: conversational agent |
| CONSORT: Consolidated Standards of Reporting Trials |
| LLM: large language model |
| PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
| QoL: quality of life |
| ROBINS-I: Risk of Bias in Non-randomized Studies-of Interventions |
Edited by J Sarvestan; submitted 07.02.25; peer-reviewed by E Sezgin, M Peeples; comments to author 14.03.25; revised version received 07.05.25; accepted 16.05.25; published 26.08.25.
Copyright
©Tessa F Peerbolte, Rozanne JA van Diggelen, Pieter van den Haak, Kim Geurts, Luc JW Evers, Bastiaan R Bloem, Nienke M de Vries, Sanne W van den Berg. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 26.08.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

