This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Conversational agents, also known as chatbots, are computer programs designed to simulate human text or verbal conversations. They are increasingly used in a range of fields, including health care. By enabling better accessibility, personalization, and efficiency, conversational agents have the potential to improve patient care.
This study aimed to review the current applications, gaps, and challenges in the literature on conversational agents in health care and provide recommendations for their future research, design, and application.
We performed a scoping review. A broad literature search was performed in MEDLINE (Medical Literature Analysis and Retrieval System Online; Ovid), EMBASE (Excerpta Medica database; Ovid), PubMed, Scopus, and Cochrane Central with the search terms “conversational agents,” “conversational AI,” “chatbots,” and associated synonyms. We also searched the gray literature using sources such as the OCLC (Online Computer Library Center) WorldCat database and ResearchGate in April 2019. Reference lists of relevant articles were checked for further articles. Screening and data extraction were performed in parallel by 2 reviewers. The included evidence was analyzed narratively by employing the principles of thematic analysis.
The literature search yielded 47 study reports (45 articles and 2 ongoing clinical trials) that matched the inclusion criteria. The identified conversational agents were largely delivered via smartphone apps (n=23) and used free text only as the main input (n=19) and output (n=30) modality. Case studies describing chatbot development (n=18) were the most prevalent, and only 11 randomized controlled trials were identified. The 3 most commonly reported conversational agent applications in the literature were treatment and monitoring, health care service support, and patient education.
The literature on conversational agents in health care is largely descriptive and aimed at treatment and monitoring and health service support. It mostly reports on text-based, artificial intelligence–driven, and smartphone app–delivered conversational agents. There is an urgent need for a robust evaluation of diverse health care conversational agents’ formats, focusing on their acceptability, safety, and effectiveness.
Conversational agents or chatbots are computer programs that simulate conversations with users [
Conversational agents cover a broad spectrum of aptitudes ranging from
In contrast, smart conversational agents do not respond with preprepared answers but with adequate suggestions instead. This is enabled by machine learning, a type of artificial intelligence (AI), which allows for broadening of the computer system’s capacity through its learning from data (in this case conversations) without being explicitly programmed [
The first conversational agent
The literature over the next few decades does not explicitly mention
Evolution of conversational agents from 1966 to 2019.
The next big milestone for conversational agents was in 2010 when Apple released
Health care, which has seen a decade of text messaging on smartphones, is an ideal candidate for conversational agent–delivered interventions. Conversational agents enable interactive, 2-way communication, and their text- or speech-based method of communication makes them suitable for a variety of target populations, ranging from young children to older people. The concept of using mobile phone messaging as a health care intervention has been present and increasingly explored in health care research since 2002 [
Our objective was to provide a comprehensive overview of the existing research literature on the use of health care–focused conversational agents. We aimed to examine how conversational agents have been employed and evaluated in the literature to date and map out their characteristics. Finally, in line with the observed gaps in the literature, we sought to provide recommendations for future conversational agent research, design, and applications.
We adopted methodological guidance from an updated version of the Arksey and O’Malley framework with suggestions proposed by Peters et al [
We used an extensive list of 63 search terms, including various synonyms for conversational agents (
To map out the current conversational agent applications in health care, we included primary research studies that had conducted an evaluation and reported findings on a conversational agent implemented for a health care–specific purpose. We excluded articles that only presented a proposal for conversational agent development, articles that mentioned conversational agents briefly or as an insignificant part of a review, as well as opinion pieces and articles where primary research was not conducted or discussed. We also excluded articles with poorly reported data on chatbot assessments, that is, those with minimal or no evaluation data. In addition, we excluded articles concerning embodied conversational agents (ECAs), relational agents, animated conversational agents, or other conversational agents with a visual or animated component.
ECAs are computer-generated virtual individuals with an animated appearance to enable face-to-face interaction between the user and the system [
Screening of articles for inclusion was performed in 2 stages, each undertaken independently by 2 reviewers: title and abstract review, followed by retrieval and review of the full texts. From the included studies, 2 reviewers independently extracted relevant information into an Excel (Microsoft) spreadsheet. We extracted data on the first author, year of publication, source of literature, title of article, type of literature, study design and methods, geographic focus, health care sector, conversational agent name, accessibility of conversational agent, dialogue technique, input and output modalities, and nature of conversational agent’s end goal. We piloted the data extraction sheet on at least five articles. Discrepancies in the extracted data were resolved through discussion and consensus between the authors.
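As an illustration of the dual-extraction step, the following minimal Python sketch flags fields on which 2 reviewers disagree so that they can be taken to a consensus discussion. It is not the procedure used in this review, which relied on an Excel spreadsheet and discussion; the function name `find_discrepancies`, the field names, and the example records are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical extraction record: field name -> extracted value.
Record = Dict[str, str]


def find_discrepancies(
    reviewer_a: Dict[str, Record],
    reviewer_b: Dict[str, Record],
) -> List[Tuple[str, str, str, str]]:
    """Return (study_id, field, value_a, value_b) for every disagreement."""
    discrepancies = []
    for study_id in sorted(set(reviewer_a) | set(reviewer_b)):
        rec_a = reviewer_a.get(study_id, {})
        rec_b = reviewer_b.get(study_id, {})
        for field in sorted(set(rec_a) | set(rec_b)):
            value_a = rec_a.get(field, "")
            value_b = rec_b.get(field, "")
            # Ignore differences in letter case and surrounding whitespace.
            if value_a.strip().lower() != value_b.strip().lower():
                discrepancies.append((study_id, field, value_a, value_b))
    return discrepancies


# Made-up example data, not taken from the included studies.
reviewer_a = {
    "study_01": {"input_modality": "free text", "study_design": "case study"},
    "study_02": {"input_modality": "fixed text", "study_design": "RCT"},
}
reviewer_b = {
    "study_01": {"input_modality": "Free text", "study_design": "case study"},
    "study_02": {"input_modality": "free text", "study_design": "RCT"},
}

for row in find_discrepancies(reviewer_a, reviewer_b):
    print(row)
```

In this made-up example, only the input modality of the second study would be flagged; the difference in capitalization for the first study is treated as agreement.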
We performed a narrative synthesis of the included literature and presented findings on (1) study specifics, such as study design, geographic focus, and type of literature; (2) conversational agent specifics (ie, conversational agent delivery channel, dialogue technique, personality, etc); (3) conversational agent content analysis; and (4) study evaluation findings.
We used the principles of thematic analysis to analyze the content, scope, and personality traits of the conversational agents. Two researchers familiarized themselves with the literature identified, generated the initial codes in relation to personality and content analysis, applied the codes to the included studies, compared their findings, and resolved any discrepancies via discussion.
The need to present information on conversational agent personality was motivated by the concepts presented in the study by de Haan et al [
The initial database searches yielded 11,401 records, and another 28 records were retrieved through additional sources such as the gray literature sources and screening of reference lists of relevant studies. A total of 196 duplicates were identified and removed, leaving 11,233 titles and abstracts that needed to be screened. Title and abstract screening led to the exclusion of 11,099 records, resulting in 134 full texts that needed to be assessed for eligibility. Of these, 87 articles were excluded, resulting in a final pool of 47 reports comprising 45 studies and 2 ongoing trials (
PRISMA flow chart.
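The record counts reported for the study selection process can be checked with simple arithmetic. The short Python sketch below recomputes each stage from the figures given in the text; the function and variable names are illustrative only.

```python
def prisma_flow(database_records, additional_records, duplicates_removed,
                excluded_on_title_abstract, excluded_on_full_text):
    """Recompute the number of records remaining at each selection stage."""
    identified = database_records + additional_records
    screened = identified - duplicates_removed
    full_texts = screened - excluded_on_title_abstract
    included = full_texts - excluded_on_full_text
    return {
        "identified": identified,
        "screened": screened,
        "full_texts": full_texts,
        "included": included,
    }


# Counts as reported in the text of this review.
flow = prisma_flow(
    database_records=11_401,
    additional_records=28,
    duplicates_removed=196,
    excluded_on_title_abstract=11_099,
    excluded_on_full_text=87,
)

# Each stage should match the figure stated in the text.
assert flow["screened"] == 11_233
assert flow["full_texts"] == 134
assert flow["included"] == 47

for stage, count in flow.items():
    print(f"{stage}: {count}")
```

The assertions pass, so the reported figures are internally consistent: 11,429 records identified, 11,233 screened, 134 full texts assessed, and 47 reports included.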
In this scoping review, 40 included studies were from high-income countries (HICs) and 6 were from low- and middle-income countries (LMICs). A total of 22 studies were from European countries, including Italy [
A variety of study designs were used in the included studies, comprising 20 case studies [
Bubble plots showing the distribution of identified study designs, types of conversational agents and healthcare topics in the included articles, plotted against the year of the publication. The scale on the right indicates that the size of the bubble is associated with the number of studies whereby the smallest denotes 1 study and the largest, 10 studies.
The types of literature included 25 journal articles [
There was an increase in the number of publications each year, from 3 in 2015 to 5 in 2016, 10 in 2017, and 23 in 2018. Some author groups were highly productive and published at least two papers within 2 years. Kowatsch et al published 3 papers between 2017 and 2018 based on their open source behavioral intervention platform MobileCoach, which allows the authors to design a text-based health care conversational agent for obesity management and behavior change [
Conversational agents were delivered through a variety of means in the included studies. Most (n=23) were smartphone apps [
A total of 8 studies made a reference to the technical details of the conversational agent development process. Some mentioned specific tools such as C and MS Access [
The conversational agents could be categorized according to whether the user input was fixed (ie, predetermined text) or unrestricted (ie, free text/speech). A total of 10 studies employed fixed text user inputs [
Similarly, output modalities largely employed text alone (n=30) [
We condensed the descriptive terms used in individual studies to present the conversational agents into a list of 9 relevant personality traits as presented in
The conversational agents in the included studies were health care professional like [
One article [
Personality codes derived for the conversational agents included in this review, adapted from de Haan et al.
Personality codes | Descriptions |
Coach like | Encouraging, motivating, and nurturing |
Conversational agent identity | Explicitly identifies as a conversational agent |
Culture specific | Speaks the native language or has native names |
Factual | Nonjudgmental, no personal opinions, and responses based on facts or observations |
Gender specific | Male and female versions available |
Health care professional like | Designed to be a doctor or expert, that is, mimics a health care professional |
Human like | Tries to emulate humans, for example, participants reported feeling like they were talking to another human or researchers used features like “typing” to make the conversation more human like |
Informal | Informal, like talking to a friend. Uses exclamations, abbreviations, and emoticons |
Knowledgeable | Content created or informed by medical experts |
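To show how a codebook of this kind might be applied, the following minimal Python sketch stores the 9 personality codes from the table and counts how many studies received each code. It is an assumption-laden illustration rather than the authors' method: the function name `tally_codes` and the study-to-code assignments are hypothetical and do not reflect the actual coding results.

```python
from collections import Counter
from typing import Dict, List

# Personality codes and short descriptions, taken from the table above.
PERSONALITY_CODES = {
    "coach like": "Encouraging, motivating, and nurturing",
    "conversational agent identity": "Explicitly identifies as a conversational agent",
    "culture specific": "Speaks the native language or has native names",
    "factual": "Nonjudgmental, no personal opinions, fact-based responses",
    "gender specific": "Male and female versions available",
    "health care professional like": "Mimics a health care professional",
    "human like": "Tries to emulate humans",
    "informal": "Informal, like talking to a friend",
    "knowledgeable": "Content created or informed by medical experts",
}


def tally_codes(coded_studies: Dict[str, List[str]]) -> Counter:
    """Count the number of studies to which each personality code was applied."""
    counts: Counter = Counter()
    for study_id, codes in coded_studies.items():
        # A code is counted at most once per study.
        for code in set(codes):
            if code not in PERSONALITY_CODES:
                raise ValueError(f"Unknown code {code!r} in {study_id}")
            counts[code] += 1
    return counts


# Hypothetical assignments, for illustration only.
coded_studies = {
    "study_01": ["coach like", "informal"],
    "study_02": ["informal", "human like", "informal"],
    "study_03": ["factual", "knowledgeable"],
}

counts = tally_codes(coded_studies)
for code in PERSONALITY_CODES:
    print(f"{code}: {counts[code]}")
```

Rejecting codes that are not in the codebook keeps the tally aligned with the agreed list, which matters when 2 researchers apply the codes separately and compare their findings afterward.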
In some studies, a health care administrator or professional was available via the conversational agent for the user to communicate with. The role of the human varied from an administrator whom the user could contact with questions via a dedicated chat channel to an individual who monitored the user’s activity on the conversational agent and provided personalized feedback. Seven studies [
All the conversational agents in this review were identified as
Five distinct themes were identified in terms of conversational agent content: treatment and monitoring (ie, treatment implementation, management, adherence, support, and monitoring), health service support (ie, connecting patients to health care services), education (ie, provision of health care–related information), lifestyle behavior change (ie, supporting users in tackling various modifiable health risk factors), and diagnosis (ie, identification of the nature of a disease or a condition). A number of included conversational agents spanned several different themes (
Overall, 17 articles reported on conversational agents that focused on treatment, monitoring, or rehabilitation of patients with specific conditions. One study reported on a conversational agent to help preserve cognitive abilities in those with Alzheimer disease [
Overall, 19 studies reported on conversational agents used to support or complement existing health care services. These tasks included remote delivery of health care services for mental health support [
We found 13 articles in which conversational agents were used primarily for educating patients or users. Education focused on topics such as sexual health [
We identified 12 studies with conversational agents for healthy lifestyle behavior change in the general population as well as overweight and obese individuals. Two studies discussed conversational agents for the management of obesity in younger patients, including adolescents [
Seven articles presented health care conversational agents with a primary purpose of establishing a diagnosis. Three articles reported on conversational agents’ triage, diagnosis, or a combination of both, mainly employing a symptom checker function [
Included studies that evaluated conversational agents reported on their accuracy (in terms of information retrieval, diagnosis, and triaging), user acceptability, and effectiveness. Some studies reported on more than 1 outcome, for example, acceptability and effectiveness. In general, evaluation data were mostly positive, with a few studies reporting the shortcomings of the conversational agent or technical issues experienced by users. Seventeen studies presented self-reported data from participants in the form of surveys, questionnaires, etc. In 16 studies, the data were objectively assessed in the form of changes in BMI, number of user interactions, etc. In 12 studies, there was a mixture of self-reported and objectively assessed outcomes. Outcomes were not reported in the 2 ongoing trials (
Eleven studies reported on the accuracy of conversational agents [
The effectiveness of health care conversational agents was assessed in 8 studies [
Eight studies noted the effectiveness of conversational agents for mental health applications [
A total of 26 studies commented on the acceptability of conversational agents (
In 4 studies, health care conversational agents were targeted at chronic conditions [
A further 3 studies were concerned with sexual health and/or HIV management [
Two studies employed an emotionally sensitive conversational agent for mental health counselling and general health information advice [
In 3 studies, conversational agents were used for healthy behavior change, specifically targeting smoking cessation, alcohol misuse treatment, and physical activity promotion [
Two studies examined the acceptability of conversational agents for health care service delivery [
One study discussed a condition-specific conversational agent application targeted at improving the quality of life and medication adherence of breast cancer patients [
Our scoping review identified 45 studies and 2 ongoing clinical trials. Although conversational agents have been widely employed in various fields, their use in health care is still in its infancy, as evidenced by the fact that much of the literature was published recently (2016-2018). Most conversational agents used text input and were machine learning based and mobile app delivered. The 3 most commonly reported themes in the health care conversational agent–related literature were treatment and monitoring, health services support, and patient education. Results from the studies evaluating conversational agents were generally positive, reporting effectiveness, accuracy, and acceptability of the conversational agent. However, there is currently a dearth of robust evaluations and a predominance of small case studies.
Our review shows that most of the health care conversational agents reported in the literature used machine learning and were long-term goal oriented. This suggests that conversational agents are evolving from conducting simple transactional tasks toward more involved end points such as long-term disease management [
Our findings show a predominance of text-based conversational agents, with only a few apps using speech as the main mode of communication. Yet, certain populations, such as older people, may be more comfortable interacting via speech, as some individuals may find the dexterity involved with typing on small keypads on smartphones challenging and time consuming. Furthermore, most conversational agents included in our review were app based. Research shows that the use of apps (which need to be downloaded and regularly updated) is often associated with high dropout rates and low utilization [
A recent systematic review on the effectiveness of ECAs and other conversational agents noted a lack of an established method for evaluating health care conversational agents and a dearth of data on adverse effects [
The health care sectors for conversational agent application identified in the review were generally very broad, with references to only a few specialties including mental health [
There is also a need for more geographically diverse research. Although our review identified 12 articles with a geographical focus in Asia, the evidence stemming from middle-income countries was scarce, and there were no studies from a low-income country. However, digital health initiatives are becoming more common in developing countries, often with a different, context-specific scope, such as ensuring access to health care using social media [
Although the studies reported accuracy, efficacy, effectiveness, and acceptability as outcomes, there were no measurements of cost, efficiency, or how the solution led to improved productivity when used instead of or to augment the work of a health professional. Therefore, it was not possible to ascertain whether the solutions developed were cost-effective compared with alternative approaches.
We conducted a comprehensive literature search of multiple databases, including gray literature sources. We prioritized sensitivity over specificity in our search strategy to capture a holistic representation of conversational agent uptake in health care. However, given the novelty of the field and the employed terminology, some unpublished studies discussed at niche conferences or meetings may have been omitted. Furthermore, although classification of the themes of our conversational agents was based on thorough analysis, team discussions, and consensus, it might not be all inclusive and may require further development with the advent of new conversational agents. In addition, although some conversational agents belong to more than 1 theme, we mostly classified them based on the dominant mode of application for the sake of clarity. Finally, we excluded articles with poorly reported data on chatbot assessments; therefore, we may have missed some health care conversational agents (
Conversational agents are an up-and-coming form of technology to be used in health care, which has yet to be robustly assessed. Most conversational agents reported in the literature to date are text based, machine learning driven, and mobile app delivered. Future research should focus on assessing the feasibility, acceptability, safety, and effectiveness of diverse conversational agent formats aligned with the target population’s needs and preferences. There is also a need for clearer guidance on health care–related conversational agents’ development and evaluation and further exploration of the role of conversational agents within existing health systems.
Search strategy.
Types of user input (blue) and output (green) in the conversational agents.
Characteristics of conversational agents reported in the included studies.
Characteristics of included studies.
List of excluded studies and reasons for exclusion.
AI: artificial intelligence
CBT: cognitive behavioral therapy
ECA: embodied conversational agent
EMBASE: Excerpta Medica database
HIC: high-income country
LMIC: low- and middle-income country
MEDLINE: Medical Literature Analysis and Retrieval System Online
NLP: natural language processing
OCLC: Online Computer Library Center
SAT: structure association technique
This research is supported by the Ageing Research Institute for Society and Education (ARISE), Nanyang Technological University, Singapore. This study is also supported by the National Research Foundation, Prime Minister’s Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) program.
LTC conceived the idea for this study. DD, BK, and LC screened the articles. DD, BK, and LC extracted and analyzed the data. DD and LC wrote the manuscript. BK, TK, JR, RA, and YLT revised the manuscript critically.
TK is affiliated with the Center for Digital Health Interventions, a joint initiative of the Department of Management, Technology, and Economics at ETH Zurich and the Institute of Technology Management at the University of St. Gallen, which is funded in part by the Swiss health insurer CSS. TK is also a cofounder of Pathmate Technologies, a university spin-off company that creates and delivers digital clinical pathways. Other authors declare that they have no competing interests.