Patient Perspectives on the Usefulness of an Artificial Intelligence–Assisted Symptom Checker: Cross-Sectional Survey Study

Background: Patients are increasingly seeking Web-based symptom checkers to obtain diagnoses. However, little is known about the characteristics of the patients who use these resources, their rationale for use, and whether they find them accurate and useful. Objective: The study aimed to examine patients’ experiences using an artificial intelligence (AI)–assisted online symptom checker. Methods: An online survey was administered from March 2, 2018, through March 15, 2018, to US users of the Isabel Symptom Checker within 6 months of their use. User characteristics, experiences of symptom checker use, experiences discussing results with physicians, and prior personal history of experiencing a diagnostic error were collected. Results: A total of 329 usable responses were obtained. The mean respondent age was 48.0 (SD 16.7) years; most were women (230/304, 75.7%) and white (271/304, 89.1%). Patients most commonly used the symptom checker to better understand the causes of their symptoms (232/304, 76.3%), followed by deciding whether to seek care (101/304, 33.2%) or where (eg, primary or urgent care: 63/304, 20.7%), obtaining medical advice without going to a doctor (48/304, 15.8%), and understanding their


Background
Patients are increasingly seeking to be more involved in their health care [1,2]. As a result, digital health care tools (both online and mobile health tools) have proliferated [3,4], and their use by patients has dramatically increased [5]. Overall, 1 in 3 US adults reported going online to try to self-diagnose a medical condition in 2013 [6]. In addition to searching the internet for health information, use of digital health care tools includes online, artificial intelligence (AI)-assisted symptom checkers for obtaining diagnoses or self-triage [7][8][9][10]. A previous report assessed the accuracy of general symptom checkers using patient vignettes [9] and found that diagnostic accuracy (defined as the correct diagnosis being listed first) was 34% and triage advice was appropriate 57% of the time. Accuracy varied considerably among symptom checkers (with a range of 5%-50%), leading to concern about their use [11,12]. Furthermore, it is unknown whether patients [7] use online symptom checkers as a replacement for seeing physicians in person. Also unknown are patients' reasons for using symptom checkers, whether they find them accurate and useful, and whether these tools provide them with any benefit.
In light of evidence that approximately 1 in 20 US adults experience a diagnostic error every year (with half incurring severe or permanent harm) [13], the National Academies of Sciences, Engineering, and Medicine recommends the use of patient engagement tools, including symptom checkers and other digital health tools, in efforts to address this issue [14]. As a part of the solution, digital health tools offer patients broader, quicker access to health information [15], but their use may differ among patient groups. Mobile phone use for looking up general health information differs across race and ethnicity (with 67% of African Americans/blacks, 73% of Hispanics, and 58% of whites reportedly doing so) [16], and patients with chronic health conditions tend to have less access to the internet [17]. It is unclear how these patterns would relate to the use of online symptom checkers, but differences in use among these groups of patients could result in disparate benefits of the tools. Furthermore, other patient characteristics, such as previous positive or negative health care experiences, could also alter use, usefulness, and experiences with such tools.
Currently, it is unclear whether patients use symptom checkers to supplement medical advice (many tool developers suggest using them in addition to speaking with physicians about the obtained results) or as a substitute for in-person health care, seeking such care only if instructed by the symptom checker. Finally, in assessing symptom checker benefits, it is vital to understand patient perspectives [18] after actual use [19] (rather than just assessing the tools' accuracy in fictitious situations, as these data may not be ecologically valid). Knowledge about both the benefits of symptom checkers and how they can be improved could maximize patient benefits and minimize unintended consequences (such as cyberchondria, anxiety, or unnecessary health care use, all of which are proposed consequences of Web-based medical tools) [20][21][22].

Objectives
To address current knowledge gaps, we examined user characteristics, experiences, and potential consequences of using a popular online AI-assisted symptom checker, the Isabel Symptom Checker [23], including subsequent discussions with physicians about its use. In addition, we compared perceptions of the symptom checker in patients who previously experienced errors in diagnosis versus those who did not, because these experiences may affect symptom checker favorability.

Description of the Isabel Symptom Checker
The Isabel Symptom Checker (Isabel Healthcare) [23] is a free Web-based, AI-assisted symptom checker intended for use by patients (as opposed to the Isabel Differential Diagnosis Generator [Isabel Healthcare] intended for clinicians) and has been shown to have better accuracy than the average symptom checker in a vignette-based study (defined as having the correct diagnosis listed first in 44% of cases compared with an average rate of 34% in the 23 symptom checkers tested) [9]. It currently has over 12,000 registered users globally, with almost 7000 in the United States (not all users register), and the symptom checker completes between 200,000 and 300,000 searches per month [24]. Patients research their symptoms by entering their age range, gender, pregnancy status, geographic location or travel history, and symptoms in everyday language. Using machine learning and a training database of 6000 disease presentations, the symptom checker uses evidence-based natural language processing techniques to create a list of likely diagnoses ranked in order of relevance for the symptoms entered. Patients can sort their likely diagnoses as a top-10 list; a full list of all relevant diagnoses; a list including only red-flag, do-not-miss diagnoses, which indicate that medical advice should be sought immediately; or as a list divided into common versus rare diagnoses. Diagnoses are linked to reference resources, allowing patients to learn more. These resources include the consumer-facing Merck Manual (Merck Sharp & Dohme Corp) [25], MedlinePlus (National Library of Medicine) [26], a patient version of UpToDate (UpToDate, Inc) [27], and the Mayo Clinic website (Mayo Foundation for Medical Education and Research) [28]. Next steps are provided where users can "contact a doctor," "find a lab test," or determine where they should go for medical care using additional triage functionality (using the "Where Now" button).
The symptom checker is freely available and provides information for both adult and pediatric patients (see Figure 1 for example screenshot).
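The list views and the vignette-based accuracy measure described above can be illustrated with a minimal sketch. It is not the Isabel implementation: the `Diagnosis` record, the numeric `relevance` score, the helper names, and the example conditions are all hypothetical and serve only to show how one ranked list can yield a top-10 view, a red-flag view, a common-versus-rare split, and a "correct diagnosis listed first" rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    # Hypothetical record; field names are assumptions, not Isabel's schema.
    name: str
    relevance: float  # assumed score: higher means more relevant
    red_flag: bool    # "do-not-miss" diagnosis needing immediate advice
    common: bool      # True for common, False for rare


def ranked(candidates):
    """Return all candidates ordered from most to least relevant."""
    return sorted(candidates, key=lambda d: d.relevance, reverse=True)


def top_n(candidates, n=10):
    """Top-n view of the ranked list (n=10 mirrors the top-10 list)."""
    return ranked(candidates)[:n]


def red_flags(candidates):
    """Only the red-flag diagnoses, still in relevance order."""
    return [d for d in ranked(candidates) if d.red_flag]


def common_vs_rare(candidates):
    """Split the ranked list into (common, rare) sublists."""
    ordered = ranked(candidates)
    return ([d for d in ordered if d.common],
            [d for d in ordered if not d.common])


def top1_accuracy(cases):
    """Share of cases whose first-listed diagnosis is the correct one.

    Each case is a pair: (ranked list of diagnosis names, correct name).
    """
    hits = sum(1 for names, truth in cases if names and names[0] == truth)
    return hits / len(cases)


# Hypothetical example data, invented for illustration only.
candidates = [
    Diagnosis("tension headache", 0.62, red_flag=False, common=True),
    Diagnosis("migraine", 0.81, red_flag=False, common=True),
    Diagnosis("subarachnoid hemorrhage", 0.35, red_flag=True, common=False),
    Diagnosis("sinusitis", 0.48, red_flag=False, common=True),
]

print([d.name for d in top_n(candidates, n=2)])
print([d.name for d in red_flags(candidates)])
```

Keeping a single ranked list and deriving each view from it means every view preserves the same relevance order, which matches the description of one list that patients can sort in different ways.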

Participants
With the help of Isabel Healthcare, we sent email invitations to all registered US users of the Isabel Symptom Checker (4000) to complete an online survey through SurveyMonkey (SurveyMonkey) [29], a commercial survey website. All of these users had registered and used the symptom checker within the last 6 months. On the basis of the limited available internal institutional funding, we were able to offer a survey incentive to only the first 385 respondents, all of whom received US $20 gift card incentives; these available funds thus determined sample size. Local institutional review board approval was obtained at Baylor College of Medicine and written consent was obtained from the participants.

Survey
The survey was created by a multidisciplinary team (authors AM, TG, CS, and HS) with expertise in patient experience, cognitive psychology, psychometrics, internal medicine, and diagnostic errors. It comprises multiple-choice questions, 5-item Likert-type questions (with choices ranging from strongly disagree to strongly agree), and 5 open-ended questions and was designed to elicit information related to 4 main areas (see Multimedia Appendix 1 for full survey): (1) user characteristics, (2) experiences of symptom checker use, (3) experiences discussing results with physicians, and (4) prior personal history of experiencing a diagnostic error (defined for them as whether or not they have ever been given either the wrong diagnosis for a health concern or not given any diagnosis for a health concern that they were seeking medical help for; this includes both multiple-choice questions and an open-ended response, where participants could detail their diagnostic error experiences).
After development, the survey was pilot tested in both paper and online forms with 5 and 13 patients, respectively, and correspondingly refined to increase readability and understandability by simplifying and clarifying the language.

Data Analysis
All data were summarized using descriptive statistics, except open-ended responses, which were coded using content analysis. In addition, we used independent t tests, chi-square tests, or Fisher exact tests, where appropriate, to compare demographics, experiences around Isabel Symptom Checker use, and subsequent interactions with physicians between users who had previously experienced diagnostic errors and those who had not. We also conducted additional subanalyses using chi-square or Fisher exact tests to determine whether certain behaviors (following the symptom checker's advice to go to the emergency department [ED] and talking to one's doctor about Isabel results) were associated with other demographics. All tests were 2-tailed, were performed using IBM SPSS Statistics 22 (IBM Corporation), and were considered significant when P<.05.
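For readers unfamiliar with the categorical tests named above, the following is a minimal sketch of a Pearson chi-square test and a two-sided Fisher exact test on a 2x2 table, written with the Python standard library only. It is not the study's analysis (which was run in SPSS), the counts are hypothetical, and the function names are ours. The chi-square version omits the continuity correction, and the Fisher version sums the probabilities of all tables no more likely than the observed one, which is one common two-sided convention.

```python
from math import comb, erfc, sqrt

ALPHA = 0.05  # significance threshold stated in the text (P<.05)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= observed * tolerance)
    return min(1.0, p_value)


# Hypothetical counts: rows are prior diagnostic error (yes/no),
# columns are discussed results with a physician (yes/no).
stat, p_chi = chi_square_2x2(10, 20, 20, 10)
print(f"chi-square = {stat:.2f}, P = {p_chi:.4f}, significant = {p_chi < ALPHA}")

# Small expected counts, where an exact test is usually preferred.
p_fisher = fisher_exact_2x2(3, 1, 1, 3)
print(f"Fisher exact P = {p_fisher:.4f}")  # 34/70, about 0.4857
```

The exact test is typically chosen over chi-square when expected cell counts are small, which is why the two are listed as alternatives "where appropriate."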

Sample
From the sample of 385 respondents, 329 provided mostly complete (>90% of the survey was complete) and relevant data (18 participants' responses were excluded for not completing >90% of the questions and 38 because they described using the tool as a medical professional for either education or diagnosing patients when elaborating on the question "What prompted you to use the Isabel Symptom Checker?" after choosing the "Other" response). Only data from the 329 nonexcluded respondents are reported. The mean time to complete the survey was 12:21 (SD 10:43) min. Patients who chose not to discuss the findings with their physicians (110/213, 51.6%) did so because of various concerns, including thinking their doctors would not approve of their use of the tool or the doctors would think the patients mistrusted them or were trying to second guess or replace them by using the tool (see Multimedia Appendix 4).
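As a quick consistency check on the counts reported in this section, the sample accounting and the reported percentage can be reproduced with a few lines of arithmetic. The variable and function names below are ours; the numbers are taken directly from the text.

```python
# Counts as reported in the text.
respondents = 385
excluded_incomplete = 18       # did not complete >90% of the questions
excluded_professionals = 38    # used the tool as a medical professional

usable = respondents - excluded_incomplete - excluded_professionals


def percent(numerator, denominator):
    """Percentage rounded to 1 decimal place, as reported in the paper."""
    return round(100 * numerator / denominator, 1)


print(usable)             # 329
print(percent(110, 213))  # 51.6
```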

User Characteristics
In the corresponding open-ended response, they most often described not discussing the results with their doctors because of worry about pushback or concerns about their physicians' reactions (21/52, 40%; see Multimedia Appendix 5 for additional findings).

Additional Behavioral Differences as Related to Demographics
Neither the likelihood of going to the ED when the symptom checker suggested it (n=25) nor the likelihood of discussing the results with their doctors (assuming they saw a doctor after using the symptom checker; n=217) was significantly related to gender, income, education, or being an underrepresented minority in our sample (see Multimedia Appendix 5 for details).

Principal Findings
Patients used an online symptom checker to learn more about what could cause their symptoms, to determine whether to seek care or where, to get medical advice without going to a doctor, or to better understand their diagnosis. Most patients thought the tool gave them useful information for their health problems and thought it provided them with insights leading them closer to correct diagnoses. Half of the patients reported positive health effects. However, the patients who discussed the findings with their physicians conveyed mixed experiences about whether physicians were interested or open about discussing symptom checker results.

Strengths
The strengths of this study are the examination of naturalistic patient experiences and the assessment of subsequent related events, which are often missing from existing digital health tool studies (most previous studies examined vignette-based assessments [30,31] or patients already presenting to their doctors [32][33][34][35] with limited follow-up) [7]. Most patients used the symptom checker between 2 weeks and 4 months before the survey, allowing for adequate time for diagnoses to evolve and related subsequent events to occur, such as the completion of diagnostic tests, referrals, treatment, and potential responses to treatment.

Limitations
However, there are several study limitations. As we rely on self-reported data, there is no validation of patient outcomes via some type of medical record audit, making it difficult to assess outcome accuracy. Nonetheless, over time, patients would have enough information to make a determination about the ultimate accuracy of the diagnosis suggested by the tool. In addition, as with all surveys, participants may be subject to acquiescence bias, the tendency to agree with most statements. However, we did not find much evidence for this: despite much agreement with positively worded questions, negatively worded questions were not similarly agreed with (people were not merely agreeing). An additional limitation is that these data represent patient perceptions of only 1 symptom checker, and it is not clear if these results would generalize to other symptom checkers, especially to those that do not utilize AI-assisted natural language searching. We also offered an incentive of a US $20 gift card to the first 385 participants, which may have skewed our sample to people who are quick to respond to emails. Our sample might also be unique: participants had a mean of 8 visits to physicians within the last 12 months, meaning they could be different (perhaps sicker) compared with the general population. However, this population may also be more likely to use such tools given their high interaction with the health care system, so these patterns are still important to understand.
In addition, our sample is overwhelmingly female and white, with a mean age of 48 years, thereby reducing our ability to examine demographic differences in terms of experiences or behavior related to symptom checker use. However, this represents user data available from Isabel Healthcare (females represented 62% of users over the last year, with 39% of users aged between 40 and 64 years). It is difficult to know if our sample is representative of typical users in other ways. Finally, this study was not designed to explain the differences in perceptions and experiences between groups who had experienced diagnostic errors versus those who had not, but only to describe them: the reasons for these differences are likely very complicated and future studies could further examine the roots of these differences.

For Additional Discussion
Some findings warrant additional consideration. For example, previous studies show that some underrepresented groups use mobile resources more for obtaining health information [16]. Perhaps these groups are using digital health tools as a substitute for other less-available health resources. Given that the long-term implications of using these tools are not understood, this could represent disparities affecting health outcomes, especially as patients in this study used the tool to triage themselves or get medical advice without going to a doctor. Nonetheless, our sample did not overwhelmingly include underrepresented groups. As such, additional research is needed to further scrutinize disparities related to symptom checker use.
Another finding worth additional consideration is that over half of the respondents reported previously experiencing diagnostic errors. Although this may seem high, this is a selected sample of symptom checker users, many of whom have had multiple interactions with the health care system. We do not intend this to be a population-based estimate. Nonetheless, the National Academies of Sciences, Engineering, and Medicine have extrapolated from large estimates that most Americans will get a wrong or late diagnosis at some point in their lives [14], and population-based surveys suggest that 12% of patients may have been misdiagnosed, so the high rate of misdiagnosis is quite possible in our sample [36]. These patients used the tool at more time points and used more online health resources in general, but they perceived their doctors to be less interested when discussing the tool's results. This could relate to the higher incidence of chronic diseases reported in this group and more negative health care experiences that often occur in patients with chronic disease [37]. Although past dissatisfaction with the health care system has been linked to increased use of the internet for health-related purposes [38][39][40], the impact of medical circumstances or past diagnostic errors on the use of alternate health resources (such as symptom checkers) remains ripe for exploration.
Our findings also highlight a disconnect between patients and physicians when it comes to the use of digital health tools. Although the sample was generally enthusiastic and satisfied with the tool, the patients felt their physicians showed mixed receptivity to the information and mixed openness to discussing it. This might discourage future use of such tools and future engagement by patients, similar to patterns seen in the contrasting patient and physician enthusiasm about email use for health communications [41].
In addition to this concern, a fear that has surfaced over the use of these tools is the potential for patients' anxiety to increase, thereby increasing health care utilization. These data show that many patients are using the tool to see whether they need to see a doctor and to help them determine where they should seek care. In addition, a previous study pointed out that this particular symptom checker never advises self-care, which may also increase health care utilization [9]. We currently do not know if such tools would lead to a significant increase in health care use. A larger sample and additional objective follow-up data would help us understand if this represents appropriate utilization of resources.
Finally, we think it is worth reflecting on the effect that such tools might have on patients' sense of confidence in their abilities to diagnose themselves. Diagnosis is a task that often involves clinical uncertainty, something physicians themselves face [42]. Undoubtedly, patients would experience more diagnostic uncertainty than physicians owing to less expertise, but as more patients use these types of tools and obtain answers without actually seeing a health care professional, it will be important to examine the effect of these tools on how patients think about self-diagnosis and any resulting consequences thereof (such as false reassurance, suggested by others [43]). This study is an initial examination of real-life symptom checker use, but as Fraser et al point out [43], the evaluation of such tools should assess them with increasing ecological validity and should examine multiple aspects: usability, effectiveness, and safety. We have begun to examine usability and effectiveness, but much more remains to be understood to thoroughly investigate all of these facets in real-world situations.

Conclusions
In conclusion, while accessing a popular online symptom checker for triage and diagnosis, patients reported receiving useful information for their diagnostic process, despite ongoing concerns about the accuracy of various types of symptom checkers [43]. Prior negative health care experiences related to misdiagnoses might affect how patients use and benefit from these tools for triage and diagnosis, an area ripe for exploration. Evaluation of long-term, objective health benefits, particularly in diverse patient groups, is needed to better understand the broader impact of symptom checkers on diagnosis and health outcomes.