This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Virtual humans (VH) are computer-generated characters that appear humanlike and simulate face-to-face conversations using verbal and nonverbal cues. Unlike formless conversational agents such as smart speakers or chatbots, VH combine the capabilities of a conversational agent with those of an interactive avatar (a computer-represented digital character). Although their use in patient-facing systems has garnered substantial interest, it is unknown to what extent VH are effective in health applications.
The purpose of this review was to examine the effectiveness of VH in patient-facing systems. The design and implementation characteristics of these systems were also examined.
Electronic bibliographic databases were searched for peer-reviewed articles using relevant key terms. Studies were included in the systematic review if they designed or evaluated VH in patient-facing systems. Of the included studies, those that used a randomized controlled trial to evaluate VH were included in the meta-analysis; they were then summarized using the PICOTS framework (population, intervention, comparison group, outcomes, time frame, setting). Summary effect sizes were calculated using random-effects models, and the risk of bias was assessed.
Among the 8,125 unique records identified, 53 articles describing 33 unique systems were reviewed qualitatively and systematically. Two distinct design categories emerged: simple VH and VH augmented with health sensors and trackers. Of the 53 articles, 16 (26 studies) with 44 primary and 22 secondary outcomes were included in the meta-analysis. Meta-analysis of the 44 primary outcome measures revealed a significant difference between intervention and control conditions, favoring the VH intervention (SMD .166, 95% CI .039-.292,
We offer evidence for the efficacy of VH in patient-facing systems. Considering that studies included different population and outcome types, more focused analysis is needed in the future. Future studies also need to identify what features of virtual human interventions contribute toward their effectiveness.
Patient-facing systems are digital technologies that offer health services and engage people in their health and wellbeing [
Unlike formless conversational agents such as smart speakers or chatbots, VH combine the capabilities of a conversational agent with those of an interactive avatar (a computer-represented digital character). While their humanlike physical appearance is computer-generated (ie, animated), VH are not human-controlled [
Attempts to make computer interfaces anthropomorphic are not new [
While intelligent virtual assistants and chatbots have gained mainstream popularity, VH applications are still in their infancy. As graphic rendering capacities progress and ubiquitous computing peripherals such as virtual reality (VR) and augmented reality head-mounted displays (HMDs) become inexpensive, VH will be increasingly adopted for everyday use. Indeed, similar to education and training [
To our knowledge, VH in patient-facing systems have not been surveyed before. Only recently, other types of conversational agents in health have been reviewed [
This systematic review of the English-language scholarly literature followed standard guidelines for conducting and reporting systematic reviews, including Preferred Reporting Items for Systematic Reviews and Meta-analyses [
Literature searches were performed from inception to December 31, 2019 in Google Scholar and 7 online databases: MEDLINE, EMBASE, PsycINFO, CINAHL, Cochrane Central Register of Controlled Trials, PubMed, and ACM Digital Library. Search queries covered 3 domains: (1) avatars, (2) (digital) narratives, and (3) virtual assistants (for details, see
Summary of the literature search.
The topic of our review crosscuts two disciplines — health care and computer science — and their disciplinary priorities are divergent if not orthogonal. On the one hand, health care research prioritizes reporting and replicating empirical evidence of efficacy. On the other hand, driving innovation is a key mission of computer science research. Thus, empirical investigation of these innovative designs — particularly replication of such studies — often takes a back seat. To offer a comprehensive review of the topic at hand, we first present a qualitative, systematic review of VH in patient-facing systems. Some of those articles were then included in a quantitative meta-analysis (
Studies were included in the qualitative review if they met the following criterion: designed or evaluated VH for health-related outcomes in a patient-facing system. Studies were excluded if they used VH for the training and education of health care professionals or students.
Some of the articles included in the qualitative synthesis were further included in a meta-analysis if they met the following criteria: (1) compared the effectiveness of VH in a health-related outcome in a target population against a control group with no VH; (2) studied humans of any age; and (3) reported the sample size and mean and variance of the outcome measure in control and experimental groups. Studies were excluded if they did not use a comparator that was equivalent but different from VH (eg, [
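Criterion 3 matters because a standardized mean difference (SMD) cannot be computed without the sample size, mean, and variance of each arm. The following is a minimal sketch of that calculation, assuming Hedges' g as the SMD estimator (the text here does not state which estimator was used). The function name `hedges_g` and the numeric inputs are hypothetical and are not data from any reviewed study.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for two independent groups.

    sd_t and sd_c are standard deviations, i.e. the square roots of the
    reported variances.
    """
    df = n_t + n_c - 2
    # Pooled standard deviation across the intervention and control arms.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled      # Cohen's d
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    g = j * d
    # Large-sample approximation to the variance of d, scaled by j squared.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return g, j ** 2 * var_d


# Hypothetical arm summaries, for illustration only.
g, var_g = hedges_g(mean_t=12.0, sd_t=4.0, n_t=30,
                    mean_c=10.0, sd_c=4.0, n_c=30)
```

A study that reports only a P value or a narrative result provides none of these inputs, which is why such studies could not be pooled.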
All records were first downloaded into an EndNote X8.2 library [
Data from eligible articles were extracted into a spreadsheet. For the qualitative review, data included target population, design objective, type of evaluation, principal findings, and VH characteristics. For the meta-analysis, studies were summarized using the PICOTS (population, intervention, comparison group, outcomes, time frame, setting) framework [
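The PICOTS elements can be thought of as one structured record per study. The sketch below shows one possible layout for a row of the extraction spreadsheet; the class name `PicotsRecord`, its field names, and the example values are assumptions for illustration and do not reproduce the authors' actual spreadsheet columns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicotsRecord:
    """One study summarized along the six PICOTS dimensions."""
    population: str
    intervention: str
    comparison: str
    outcomes: List[str] = field(default_factory=list)
    time_frame: str = ""
    setting: str = ""


# Hypothetical example row, not taken from any reviewed study.
example = PicotsRecord(
    population="older adults",
    intervention="virtual human exercise coach",
    comparison="control group with no virtual human",
    outcomes=["physical activity"],
    time_frame="2 months",
    setting="home",
)
```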
Articles included in the meta-analysis were assessed for quality and risk of bias using the latest criteria from the Cochrane Consumers and Communication Review Group [
All statistical analyses were performed in R using the meta [
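For readers unfamiliar with random-effects pooling, the following is a minimal sketch of the general technique in Python rather than R. It assumes the DerSimonian-Laird estimator for the between-study variance and a normal-based confidence interval; the estimator actually used in the analysis is not stated in this text. The function name `random_effects_dl` and the toy inputs are hypothetical.

```python
import math
from statistics import NormalDist


def random_effects_dl(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures dispersion of the effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z_crit = NormalDist().inv_cdf(0.975)
    # I-squared: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "ci_low": pooled - z_crit * se,
        "ci_high": pooled + z_crit * se,
        "tau2": tau2,
        "i2": i2,
    }


# Toy effect sizes and variances, for illustration only.
result = random_effects_dl([0.10, 0.30, 0.25, -0.05], [0.04, 0.05, 0.03, 0.06])
```

When the studies agree closely, the estimated between-study variance shrinks to zero and the result coincides with a fixed-effect analysis.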
A total of 16,794 search records were retrieved from the databases, and 1,985 additional records were identified from the bibliographies and Google Scholar. After removing duplicates, we screened the titles and abstracts of 8,125 articles; 380 articles involved a functioning VH system outside a game. Because the computing technology for creating VH did not exist prior to circa 2000, all studies published before 2000 were excluded. These 380 abstracts were further reviewed for their relevance to health, and 282 articles were excluded because the VH was involved in contexts like training, education, or demonstration. The remaining 98 articles underwent full-text review, and 53 articles met the inclusion and exclusion criteria for the qualitative systematic review.
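The screening counts above can be checked with simple arithmetic. The sketch below uses only the figures reported in this paragraph; the variable names are illustrative, and the derived number of full-text exclusions is an inference from those figures rather than a count stated in the text.

```python
# Counts reported in the paragraph above.
retrieved_from_databases = 16_794
retrieved_from_other_sources = 1_985
functioning_vh_abstracts = 380
excluded_for_context = 282      # training, education, or demonstration
included_in_review = 53

# Derived quantities.
total_retrieved = retrieved_from_databases + retrieved_from_other_sources
full_text_reviewed = functioning_vh_abstracts - excluded_for_context
excluded_at_full_text = full_text_reviewed - included_in_review  # inferred
```

The subtraction reproduces the 98 articles that underwent full-text review, so the reported flow is internally consistent.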
A total of 30 health-related outcomes were identified in the 53 eligible articles, targeting 25 types of populations (
In the 53 eligible articles, 33 unique systems were identified. Of these, 9 systems were used for health assessments and the rest for health interventions (
In the 53 eligible articles, 30 health-related outcomes and 25 target populations were identified.
Health outcome | Target population | Studies |
Improve quality of life | Women with overactive bladder (OAB) symptoms | [ |
Self-manage chronic conditions | Individuals with chronic atrial fibrillation (heart condition) | [ |
Self-manage chronic conditions | Individuals with spinal cord injury | [ |
Engage in physical activity | Older adults | [ |
Engage in physical activity | Individuals with Parkinson’s disease | [ |
Engage in physical activity | Inactive older adults with low socioeconomic status | [ |
Engage in physical activity | Healthy adults (no reported health conditions) | [ |
Engage in physical activity | Individuals with schizophrenia | [ |
Improve mood | Individuals with depression | [ |
Assess auditory verbal hallucinations (AVH) | Individuals with schizophrenia | [ |
Stress management | Women | [ |
Stress management | Individuals with chronic pain and depression | [ |
Healthy eating | Women | [ |
Healthy eating | Healthy adults (no reported health conditions) | [ |
Improve social skills | Children with autism spectrum disorders (ASD) | [ |
Improve social skills | Individuals with schizophrenia | [ |
Assess PTSD^b symptoms | US military service members | [ |
Assess body image disturbance (BID) | Women on diet (nonclinical) | [ |
Anxiety toward death | Older adults | [ |
Find health-related information online | Individuals with low health and computer literacy | [ |
Explain health documents | Individuals with low health literacy | [ |
Attitude toward regular physical activity | Healthy adults (no reported health conditions) | [ |
Attitude toward breastfeeding | Pregnant women in their third trimester | [ |
Attitude toward weight loss | Healthy adults (no reported health conditions) | [ |
Retention of medication knowledge | Individuals with type 2 diabetes mellitus | [ |
Attitudes toward prenatal testing for Down syndrome | Nulliparous women | [ |
Improve medication adherence | Individuals with schizophrenia | [ |
Assess emotion recognition | Adults with ASD | [ |
Assess emotion recognition | Individuals with schizophrenia | [ |
Assess emotion recognition | Children with ASD | [ |
Preconception risk assessment | Women | [ |
Assess the effects of social rejection | Individuals with psychotic disorder | [ |
Assess social attention | Children with ASD | [ |
Assist in deep breathing | Healthy adults (no reported health conditions) | [ |
Substance use counseling | Individuals with alcohol use disorder | [ |
Substance use counseling | Individuals with opioid use disorder | [ |
Patient trust | Healthy adults (no reported health conditions) | [ |
Assess social anxiety disorder | Women with high social anxiety | [ |
Alleviate social isolation | Older adults | [ |
Understand the distinction between connective and fatty tissue in the breast | Mammography-eligible middle-aged women (40-74 years old) | [ |
Pill count adherence >80% | HIV-positive African American men who have sex with men | [ |
^a Studies included in the meta-analysis.
^b PTSD: post-traumatic stress disorder.
Technology characteristics identified in the eligible studies.
Technology characteristics | Studies |
Unconstrained speech input | [ |
Computer at a community center or school | [ |
Smartphone | [ |
Head-mounted display (HMD) | [ |
Virtual reality (VR) in a PC or HMD | [ |
Mobile kiosk with a computer | [ |
Tablet | [ |
Two broad categories of virtual humans emerged from the 53 articles included in the qualitative review.
Type of use | Number of simple virtual humans | Number of virtual humans with health trackers |
Intervention | 34 [ | 9 [ |
Assessment | 7 [ | 3 [ |
Of the 34 articles that described simple VH in health-related interventions (
The VH interface in
Design approaches were varied in the rest of the studies. Dworkin et al [
Some designs augmented VH with sensor-based tracking (
The most common structure of a simple virtual human system designed for health-related interventions. BEAT: Behavior Expression Animation Toolkit.
Compared with interventions, fewer studies were found for health-related assessments (
Overall, the physical appearances of VH were primarily created using 3-dimensional (3D) character modeling and animation software, such as the Unity3D game engine. They were designed to be racially ambiguous [
Characteristics, personalities, and mannerisms of VH were manipulated to build rapport with end users [
Another study explored the use of personal stories available on the internet to personalize the VH’s message and change health behavior [
Middle-aged Caucasian and African American VH were designed to achieve racial concordance with users [
Of the 53 articles, 23 explicitly described their VH design process [
Only a few papers explicitly mentioned adopting theoretical frameworks to ground their design of VH [
Two frameworks of behavior change were used widely — the transtheoretical model (TTM) of health behavior change [
Four papers explicitly offered design guidelines for VH [
A total of 26 studies (16 articles) published between 2000 and December 31, 2019 were eligible for the meta-analysis, targeting 11 types of populations and including 10 studies with healthy adults [
The included studies comprised approximately 1400 participants across 13 health and wellbeing objectives. The PICOTS information [
As evident from
The meta-analysis of data from 26 studies (44 outcomes) revealed a significant difference between intervention and control conditions, favoring the VH intervention (SMD .166, 95% CI .039-.292, 95% prediction interval –.548 to .879,
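The reported interval estimates can be unpacked with a short calculation. The sketch below assumes the 95% CI is a symmetric, normal-based interval (an assumption; the interval method is not stated here) and uses only the three figures reported above. It recovers the implied standard error and contrasts the confidence interval with the prediction interval.

```python
from statistics import NormalDist

# Values reported in the paragraph above.
smd = 0.166
ci = (0.039, 0.292)    # 95% confidence interval
pi = (-0.548, 0.879)   # 95% prediction interval

z_crit = NormalDist().inv_cdf(0.975)        # about 1.96
se = (ci[1] - ci[0]) / (2 * z_crit)         # implied standard error
ci_midpoint = (ci[0] + ci[1]) / 2           # should sit close to the SMD

ci_excludes_zero = ci[0] > 0 or ci[1] < 0
pi_excludes_zero = pi[0] > 0 or pi[1] < 0
width_ratio = (pi[1] - pi[0]) / (ci[1] - ci[0])
```

Under these assumptions, the confidence interval excludes zero while the much wider prediction interval does not. The average effect across studies is therefore estimated to be positive, but the effect expected in any single new setting could plausibly be null or negative.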
A 3-level model (level 2: different outcome measures; level 3: different studies) did not capture a significant amount of variability in the data (
A subgroup analysis for health-related outcomes and health-related attitudes was conducted, but no significant difference was found in the overall effect between outcome types (
To explore publication bias, a funnel plot was generated. Egger’s test was not significant (
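Egger's test regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The following is a minimal sketch of the regression step only, without the significance test on the intercept. The function name `egger_regression` and both toy data sets are hypothetical.

```python
def egger_regression(effects, std_errors):
    """Return (intercept, slope) of Egger's regression.

    Ordinary least squares of effect/SE on 1/SE.
    """
    x = [1.0 / se for se in std_errors]                 # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data: identical effects regardless of study size (symmetric funnel).
sym_intercept, sym_slope = egger_regression(
    [0.2, 0.2, 0.2, 0.2], [0.05, 0.10, 0.20, 0.40])

# Toy data: effects grow with the standard error (small-study effect).
asym_intercept, asym_slope = egger_regression(
    [0.1, 0.2, 0.4, 0.8], [0.05, 0.10, 0.20, 0.40])
```

In the first toy data set the intercept is essentially zero and the slope equals the common effect; in the second the intercept is clearly positive, which is the pattern the test is designed to flag.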
Forest plot of the meta-analysis of health-related virtual human interventions from 26 studies (44 primary outcomes). a-PDHA: anonymized post-deployment health assessment; ACT: physical activity; BDI-2: Beck Depression Inventory-II; BICEP: brief informed consent evaluation protocol; DAS-SF2: Dysfunctional Attitude Scale-Short Form 2; DIET: fruit and vegetable consumption; EQ-5D-5L VAS: 5-level version of the EuroQol 5D visual analogue scale; FVS: NIH/NCI Fruit and Vegetable Scan; HRQOL: health-related quality of life; OABq: overactive bladder questionnaire; PDHA: post-deployment health assessment; PTSD: post-traumatic stress disorder; QIDS-SR: Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form; SBS: social behavior scales; SMD: standardized mean difference; SVH: social virtual human.
The observed
The quality of studies included in the meta-analysis was evaluated for risk of bias (
Eleven studies were successful in blinding participants and research personnel to the allocated interventions [
Summary of the authors’ consensus judgment about the risk of bias for each study included in the meta-analysis, by various sources of potential bias.
Risk of bias presented as percentages across all 26 studies included in the meta-analysis.
This meta-analysis found VH interventions significantly more effective than traditional interventions that did not include conversational agents (SMD .166, 95% CI .039-.292,
The efficacy of VH may also depend on the type of intervention for which they are used, such as delivering cognitive behavioral therapy vs delivering education. However, the number of current studies was insufficient to conduct such subgroup analyses. Nevertheless, the effectiveness of VH did not significantly differ between health-related outcomes and health-related attitudes.
Like prior reports [
Of course, some health applications, especially some mental health assessments, would only work with VH and could not be replaced with a voice-based or text-based conversational agent. For example, VH were used to assess the influence of unusual voices on daily activities of hallucinating patients [
Currently, the prevalence of VH in health applications appears to lag behind that in other areas, such as education and training [
Finally, we found that the input and output of VH systems have evolved significantly over the last 2 decades, drawing on the most recent technological advancements. While systems in the 2000s extensively used desktops, kiosks, 2-dimensional graphics, and constrained text input [
The limitations of this review should be noted and can be addressed by future studies. First, not all studies on VH in patient-facing systems were included in our work because some did not present sufficient quantitative information, only reported usability metrics, or did not clarify whether their avatar technology was computer-controlled or human-controlled. Including additional studies and VH designs could reinforce the results reported here or yield different ones. Second, some of the studies included in our meta-analysis had relatively small sample sizes (<20 participants); thus, additional caution is recommended when generalizing these results. Third, there was moderate heterogeneity among trials in the meta-analysis (
Although research on conversational agents began circa 2000, their design and capabilities have changed and diverged substantially as new technologies and sensors have emerged. This change is expected to continue. Future studies should account for the differences among types of conversational agents when synthesizing or generalizing findings about them. For example, does a physical appearance or nonverbal behavior increase the effectiveness of a conversational agent? In what kind of tasks? Furthermore, there is rich literature on behavior change and health behavior change theories. However, theoretical frameworks explicating how different features of VH work together in building (or disrupting) rapport with patients are lacking. As such models emerge, future studies will need to examine the relationships between model constructs with methods such as meta-analytic structural equation modeling.
VH are conversational agents with a humanlike physical appearance; autonomy in verbal and nonverbal behavior; and speech, gaze, or gesture interaction capabilities. In patient-facing systems, they can demonstrate listening and empathy and can be tailored to various sociocultural backgrounds, languages, and literacy levels. We surveyed the existing literature on VH in patient-facing systems from inception to December 2019. Of the 53 articles reviewed, a meta-analysis of 26 studies with more than 1400 participants showed that VH interventions significantly improve health outcomes compared with other traditional intervention methods. But whether a physical embodiment is crucial for a conversational agent to significantly improve health-related outcomes remains to be explored, as does any effect of the VH’s physical appearance, type of voice, or quality of movements.
Although not yet comparable to computer-animated films or high-end video games, the appearance and behavior of VH in health care are becoming increasingly sophisticated, with studies finding that users prefer more humanlike VH in health care [
Literature search details.
PICOTS information.
3D: 3-dimensional.
a-PDHA: anonymized post-deployment health assessment.
ACT: physical activity.
ASD: autism spectrum disorders.
ATN: augmented transition network.
BDI-2: Beck Depression Inventory-II.
BEAT: Behavior Expression Animation Toolkit.
BICEP: brief informed consent evaluation protocol.
DAS-SF2: Dysfunctional Attitude Scale-Short Form 2.
DIET: fruit and vegetable consumption.
EQ-5D-5L VAS: 5-level version of the EuroQol 5D visual analogue scale.
FVS: NIH/NCI Fruit and Vegetable Scan.
HMD: head-mounted display.
HRQOL: health-related quality of life.
mHealth: mobile health.
MI: motivational interviewing.
OABq: overactive bladder questionnaire.
PDHA: post-deployment health assessment.
PTSD: post-traumatic stress disorder.
QIDS-SR: Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form.
RCT: randomized controlled trial.
SBS: social behavior scales.
SMD: standardized mean difference.
SVH: social virtual human.
VH: virtual human.
VR: virtual reality.
This article’s publication was partially funded by the Research Open Access Article Publishing (ROAAP) Fund of the University of Illinois at Chicago (UIC), administered by the UIC library.
None declared.