This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Chatbots have been used in the last decade to improve access to mental health care services. Perceptions and opinions of patients influence the adoption of chatbots for health care. Many studies have been conducted to assess the perceptions and opinions of patients about mental health chatbots. To the best of our knowledge, there has been no review of the evidence surrounding perceptions and opinions of patients about mental health chatbots.
This study aims to conduct a scoping review of the perceptions and opinions of patients about chatbots for mental health.
The scoping review was carried out in line with the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) extension for scoping reviews guidelines. Studies were identified by searching 8 electronic databases (eg, MEDLINE and Embase) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. In total, 2 reviewers independently selected studies and extracted data from the included studies. Data were synthesized using thematic analysis.
Of 1072 citations retrieved, 37 unique studies were included in the review. The thematic analysis generated 10 themes from the findings of the studies: usefulness, ease of use, responsiveness, understandability, acceptability, attractiveness, trustworthiness, enjoyability, content, and comparisons.
The results demonstrated overall positive perceptions and opinions of patients about chatbots for mental health. Important issues to address in the future are the linguistic capabilities of chatbots: they have to deal adequately with unexpected user input, provide high-quality responses, and show high variability in responses. To be useful for clinical practice, chatbot content has to be harmonized with individual treatment recommendations; that is, chatbot conversations must be personalized.
Mental disorders are a growing global concern. Approximately 29% of individuals may experience such disorders in their lifetime [
Technological advancements have improved access to mental health care services [
Chatbots are programs able to converse and interact with a human using voice, text, and animation [
The adoption of new technology relies on the perceptions and opinions of users. Numerous studies have been conducted to assess the perceptions and opinions of patients about mental health chatbots [
To accomplish this objective, we conducted a scoping review, as the aim was to map the body of literature on this topic [
The following electronic databases were searched in the current review: MEDLINE (via Ovid), Embase (via Ovid), PsycINFO (via Ovid), Scopus, Cochrane Central Register of Controlled Trials, IEEE Xplore, ACM Digital Library, and Google Scholar. Given that Google Scholar usually finds several thousand references, which are ordered by their relevance to the search topic, we screened only the first 100 references [
To derive search terms, we checked previous literature reviews [
The intervention of interest in this review was chatbots that operate as stand-alone software or in a web browser (
Intervention: chatbots that operate as stand-alone software or in a web browser
Population: patients who use chatbots for improving their psychological well-being or managing their mental disorders
Outcome: patients’ perceptions and opinions about mental health chatbots
Type of publication: peer-reviewed articles, dissertations, and conference proceedings
Language: English
Intervention: chatbots integrated into robotics, serious games, SMS, or telephone systems and those that depend on human operator–generated dialog
Population: physicians or caregivers who use chatbots for improving their psychological well-being or managing mental disorders
Outcome: other outcomes
Type of publication: reviews, proposals, editorials, and conference abstracts
Language: other languages
In this review, MA and NA independently screened the titles and abstracts of all retrieved studies and independently read the full texts of studies included from the first step. AA resolved any disagreements between the reviewers. Cohen kappa was calculated to assess the intercoder agreement [
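Cohen kappa corrects raw intercoder agreement for the agreement expected by chance. As a minimal illustrative sketch (the screening labels below are hypothetical, not data from this review), the statistic can be computed as follows:

```python
# Cohen's kappa for two reviewers' include/exclude screening decisions.
# kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
# p_e is the chance agreement derived from each rater's label frequencies.
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # observed proportion of citations on which both reviewers agree
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # expected chance agreement from the marginal label frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# hypothetical screening decisions for 10 retrieved citations
a = ["include", "exclude", "exclude", "include", "exclude",
     "exclude", "include", "exclude", "exclude", "exclude"]
b = ["include", "exclude", "exclude", "include", "include",
     "exclude", "include", "exclude", "exclude", "exclude"]
print(round(cohen_kappa(a, b), 2))  # → 0.78
```

Values above roughly 0.6 are conventionally read as good agreement, which is why kappa rather than simple percentage agreement is commonly reported for study selection.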
Scoping reviews do not usually assess the risk of bias of the included studies because they have broad aims and include studies with diverse study designs [
A narrative approach was used to synthesize the data extracted from the included studies. Thematic analysis was used to generate themes based on the findings of the included studies. This data synthesis approach (ie, thematic analysis) has been applied in numerous systematic and scoping reviews [
As shown in
Flowchart of the study selection process.
As shown in
Characteristics of the included studies.
Parameters and characteristics | Studiesa, n (%)

Study design
Survey | 34 (92)
Quasi-experiment | 2 (5)
Randomized controlled trial | 1 (3)

Publication type
Journal article | 24 (65)
Conference proceeding | 12 (32)
Thesis | 1 (3)

Country
United States | 17 (46)
Australia | 3 (8)
France | 3 (8)
The Netherlands | 3 (8)
Japan | 2 (5)
Germany | 1 (3)
Korea | 1 (3)
Spain | 1 (3)
Sweden | 1 (3)
Turkey | 1 (3)
United Kingdom | 1 (3)
Romania, Spain, and Scotland | 1 (3)
Spain and Mexico | 1 (3)
Global population | 1 (3)

Year of publication
Before 2010 | 3 (8)
2010-2014 | 11 (30)
2015-2019 | 23 (62)

Sample size
≤50 | 24 (65)
51-100 | 5 (14)
101-200 | 6 (16)
>200 | 2 (5)

Age (years)
Mean (range)b | 33.4 (13-79)

Sex
Malec | 1436 (50)

Sample type
Clinical sample | 21 (57)
Nonclinical sample | 16 (43)

Settingd
Clinical | 14 (38)
Educational | 12 (32)
Community | 8 (22)

Purpose of chatbot
Therapy | 12 (32)
Training | 9 (24)
Self-management | 6 (16)
Counseling | 5 (14)
Screening | 4 (11)
Diagnosing | 1 (3)

Platform
Stand-alone software | 24 (65)
Web based | 13 (35)

Response generation
Rule based | 32 (86)
Artificial intelligence | 5 (14)

Dialogue initiative
Chatbot | 32 (86)
Both | 5 (14)

Embodiment
Yes | 30 (81)
No | 7 (19)

Targeted disorderse
Depression | 41 (23)
Autism | 6 (16)
Anxiety | 6 (16)
Any mental disorder | 6 (16)
Substance use disorder | 5 (14)
Posttraumatic stress disorder | 5 (14)
Schizophrenia | 3 (8)
Stress | 3 (8)
aPercentages were rounded and may not sum to 100.
bMean age was reported in 24 studies.
cSex was reported in 29 studies.
dSetting was reported in 34 studies.
eNumbers do not add up as several chatbots target more than one health condition.
The sample size was 50 or less in 24 studies and more than 200 in 2 studies (
The 37 included studies assessed patients’ perceptions and opinions about 32 different chatbots. Chatbots were used for therapeutic purposes (n=12), training (n=9), self-management (n=6), counseling (n=5), screening (n=4), and diagnosis (n=1;
The thematic analysis generated 10 themes from the findings of the studies: usefulness, ease of use, responsiveness, understandability, acceptability, attractiveness, trustworthiness, enjoyability, content, and comparisons. More details about these themes are elaborated in the following subsections.
In total, 20 studies investigated the usefulness of chatbots and/or their features for patients [
Users considered the following components of chatbots useful: real-time feedback [
The ease of use and usability of chatbots were assessed in 20 studies [
In 3 studies, participants faced difficulty in using the chatbot because they did not know when [
This theme brings together perceptions and opinions of participants about verbal and nonverbal responses generated by chatbots in terms of realism, repetitiveness (variability), speed, friendliness, and empathy. A total of 10 studies assessed participants’ perceptions and opinions about how real the chatbots were in terms of verbal and nonverbal responses. Although participants in 7 studies had mixed or neutral perceptions and opinions about the realism of verbal and nonverbal responses [
Most participants in several studies stated that chatbots were able to show friendly [
A total of 7 studies demonstrated that chatbot responses were repetitive [
In general, participants in 6 studies were satisfied with chatbot responses [
Participants suggested several enhancements related to the responsiveness of chatbots, such as the ability to speak [
This theme brings together perceptions and opinions of participants about the ability of chatbots to understand their verbal and nonverbal communication. Chatbot understandability for verbal responses was rated as high among participants in 3 studies [
This theme concerns participants’ acceptance of chatbots and their functionalities, together with their intention to use chatbots in the future. The acceptability of chatbots was rated high by users in 12 studies [
Furthermore, 6 studies demonstrated that people would like to use chatbots in the future [
Participants in one study rated the attractiveness of a chatbot as low [
This theme concerns participants’ trust in chatbots. In 7 studies, participants believed that chatbots are trustworthy [
Participants in 9 studies considered using chatbots enjoyable and fun [
This theme contains participants’ opinions about the content of chatbots. In 6 studies, participants were satisfied with the content of chatbots, such as videos, games, topics, suggestions, and weekly graphs [
This theme brings together participant perspectives about chatbots in comparison with other chatbots or traditional methods. Although most participants in one study preferred interacting with a chatbot rather than a human for their health care [
Participants in one study preferred that the chatbot provide real-time feedback on their nonverbal behavior rather than postsession feedback [
A chatbot without an embodied virtual agent (text-based chatbot) was compared with 2 chatbots with an embodied virtual agent (one reacted to the user with verbal and nonverbal empathic reactions, whereas the other did not) in another study [
One study compared AI chatbots with an individual or a chatbot controlled by the same individual (Wizard-of-Oz) [
In another study [
The main finding of this review is that there are features of chatbots that health care providers cannot deliver over a long period. These features have been identified as useful in mental health chatbots: real-time feedback, weekly summaries, and continuous data collection in the form of a diary. Usefulness and ease of use are the aspects of chatbots studied most comprehensively in the analyzed papers. Overall, the usefulness of mental health chatbots is perceived as high by patients. According to these studies, patients find chatbot systems easy to use. Interactional enjoyment and perceived trust are significant mediators of chatbot interaction [
This is the first review that summarizes perceptions and opinions of patients about mental health chatbots, as reported by previous studies. Palanica et al [
In their review of the landscape of psychiatric chatbots, Vaidyam [
A study assessed the use of mobile technologies in health-related areas from various perspectives [
The study results have the following practical implications. To be useful, chatbots need to be of high quality and able to respond to a user in multiple ways. A mental health chatbot must be empathic to be perceived as motivating and engaging and to establish a relationship with the user. A study by de Gennaro [
The patient-doctor or patient-therapist relationship in standard health care settings is characterized by trust and loyalty. Measures must be taken to make the chatbot-patient relationship similarly trustworthy. This could be realized by providing information on the secondary use of collected patient data and on data storage and analysis procedures. Another approach is blended therapy [
From the practical implications, we can derive the following research implications. There is still a need to improve the linguistic capabilities of mental health chatbots [
Furthermore, methods have to be developed to deal with unexpected user input and to detect critical situations. In mental health, it is crucial to react appropriately for people who are at risk of suicide or self-harm [
For evaluating mental health chatbots, benchmarks have to be created, and consistent metrics and methods have to be developed. Laranjo et al [
This review was developed, executed, and reported according to the PRISMA Extension for Scoping Reviews [
The most commonly used databases in health and information technology were searched to retrieve as many relevant studies as possible. Searching Google Scholar and carrying out backward and forward reference list checking enabled us to identify gray literature and minimize the risk of publication bias as much as possible. As no restrictions were applied regarding the study design, study setting, comparator, year of publication, and country of publication, this review can be considered comprehensive.
Selection bias in this review was minimal because study selection and data extraction were performed independently by 2 reviewers. Furthermore, the agreement between reviewers was very good for study selection and data extraction. This study is one of the few reviews that used thematic analysis to synthesize the findings of the included studies. The thematic analysis followed the highly recommended guidelines proposed by Braun and Clarke [
This review focused on chatbots that work only as stand-alone software or in a web browser (but not robotics, serious games, SMS, or telephones). Furthermore, this review was restricted to chatbots that are not controlled by human operators (Wizard-of-Oz). Therefore, perceptions and opinions of patients found in this review may differ from their perceptions and opinions about Wizard-of-Oz chatbots and/or chatbots with alternative modes of delivery. The abovementioned restrictions were applied by previous reviews about chatbots, as these features are not part of ordinary chatbots [
Owing to practical constraints, we restricted the search to studies in English, and we could not search interdisciplinary databases (eg, Web of Science and ProQuest), conduct a manual search, or contact experts. Consequently, it is likely that we have missed some English and non-English studies. Most included studies were conducted in developed countries, particularly in the United States. Therefore, the findings of this review may not be generalizable to developing countries, as patients in such countries may have different perceptions and opinions about mental health chatbots.
In this paper, we explored perceptions and opinions of patients about mental health chatbots, as reported in the existing literature. The results demonstrated that there are overall positive perceptions and opinions of patients about mental health chatbots, although there is some skepticism toward trustworthiness and usefulness. Many important aspects have been identified to be addressed in research and practice. Among them are the need to improve the linguistic capabilities of chatbots and seamless integration into the health care process. Future research will have to pick up those issues to create successful, well-perceived chatbot systems, and we will start developing corresponding concepts and methods. The research implications are also relevant for health care chatbots beyond mental health chatbots. Their consideration has the potential to improve patients’ perceptions of health care chatbots in general.
Search strategy.
Data extraction form.
The metadata and population characteristics of each included study.
Characteristics of the intervention in each included study.
AI: artificial intelligence
mHealth: mobile health
PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analyses
The publication of this study was funded by the Qatar National Library. This study was a part of a project funded by the Qatar National Research Fund (NPRP12S-0303-190204). The project title is
AA developed the protocol and conducted a search with guidance from and under the supervision of MH and BB. Study selection and data extraction were performed independently by MA and NA. AA executed the analysis, and all authors checked the validity of the generated themes. AA and KD drafted the manuscript, and it was revised critically for important intellectual content by all authors. All authors approved the manuscript for publication and agree to be accountable for all aspects of the work.
None declared.