Published on 23.11.20 in Vol 22, No 11 (2020): November
Preprints (earlier versions) of this paper are available at http://preprints.jmir.org/preprint/22407, first published Jul 10, 2020.
COVID-19–Related Internet Search Patterns Among People in the United States: Exploratory Analysis
Background: The internet is a well-known source of information that patients use to better inform their opinions and to guide their conversations with physicians during clinic visits. The novelty of the recent COVID-19 outbreak has led patients to turn more frequently to the internet to gather more information and to alleviate their concerns about the virus.
Objective: The aims of the study were to (1) determine the most commonly searched phrases related to COVID-19 in the United States and (2) identify the sources of information for these web searches.
Methods: Search terms related to COVID-19 were entered into Google. Questions and websites from Google web search were extracted to a database using customized software. Each question was categorized into one of 6 topics: clinical signs and symptoms, treatment, transmission, cleaning methods, activity modification, and policy. Additionally, the websites were categorized according to source: World Health Organization (WHO), Centers for Disease Control and Prevention (CDC), non-CDC government, academic, news, and other media.
Results: In total, 200 questions and websites were extracted. The most common question topic was transmission (n=63, 31.5%), followed by clinical signs and symptoms (n=54, 27.0%) and activity modification (n=31, 15.5%). Notably, the clinical signs and symptoms category captured questions about myths associated with the disease, such as whether consuming alcohol stops the coronavirus. The most common websites provided were maintained by the CDC, the WHO, and academic medical organizations. Collectively, these three sources accounted for 84.0% (n=168) of the websites in our sample.
Conclusions: In the United States, the most commonly searched topics related to COVID-19 were transmission, clinical signs and symptoms, and activity modification. Reassuringly, a sizable majority of internet sources provided were from major health organizations or from academic medical institutions.
J Med Internet Res 2020;22(11):e22407
Since its emergence in late 2019 in Wuhan, China, COVID-19, the disease caused by the novel coronavirus SARS-CoV-2, has drastically impacted daily life around the world [, ]. Changes to the public sphere include government-imposed lockdowns of businesses, schools, and universities, designed to mitigate the spread of the disease and to alleviate the significant strain on health care systems [ ]. As many continue to stay at home due to the COVID-19 pandemic, internet use has become an increasingly dominant part of daily life. In a recent poll, a majority of Americans considered the internet “essential” during this time [ ]. Nearly all major internet services have seen increased traffic since early March 2020 [ ]. Given the unprecedented nature of the pandemic, there is naturally much public uncertainty regarding COVID-19, and thus, many are turning to the internet to ask their questions and obtain information about the coronavirus.
Previous studies have shown that patients frequently use the internet to research their conditions and inform their discussions in clinic [, ]. For physicians, insight into which topics patients are curious or anxious about may help guide and structure our interactions, leading to improved patient rapport and satisfaction. Additionally, with well-publicized recent examples of misinformation originating from many sources, including places of authority, it is paramount for physicians to collectively take responsibility to provide reliable and trustworthy information based on the best available evidence [ , ]. Thus, the aims of the present study were to (1) determine the most commonly searched phrases related to COVID-19 in the United States and (2) identify the sources of information for these web searches. In doing so, we believe that we can distill the collective curiosity of the internet-using public into useful information for physicians in clinic.
Search terms related to COVID-19 were entered into Google web search using a clean-installed Google Chrome browser on May 30, 2020, in New York, NY. Google web search is by far the most widely used internet search engine in the United States. In 2018, Google introduced a natural language processing algorithm, which greatly improved the ability of the search engine to identify clusters of search queries related to any given topic [ ]. Owing to this technology, Google redirects all searches related to COVID-19, such as “COVID-19,” “coronavirus,” and “coronavirus disease,” to a centralized COVID-19 homepage. This search results page incorporates the location of the user’s search and generates a list of questions and websites that are frequently associated with the initial query. On each results page, the 200 most commonly asked questions were generated. The questions were downloaded to a database using a freely available program (Scraper, version 1.7). Each question and its web address were located on the webpage by their unique XML Path Language (XPath) strings.
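The XPath-based extraction step can be illustrated with a minimal Python sketch. This is not the Scraper program used in the study. The sample markup, the class names, the example.org addresses, and the second sample question are hypothetical stand-ins, because the text does not describe the actual page structure or XPath strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a results page.
# Class names, URLs, and the second question are assumptions for illustration.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="related-question">
      <span class="question">Can antibiotics treat the coronavirus disease?</span>
      <a class="source" href="https://example.org/page-a">Source A</a>
    </div>
    <div class="related-question">
      <span class="question">Can the coronavirus disease spread through food?</span>
      <a class="source" href="https://example.org/page-b">Source B</a>
    </div>
  </body>
</html>
"""

# Assumed XPath expressions. ElementTree supports only a subset of XPath,
# which is enough for attribute-based lookups like these.
BLOCK_XPATH = ".//div[@class='related-question']"
QUESTION_XPATH = "./span[@class='question']"
LINK_XPATH = "./a[@class='source']"


def extract_questions(page_source):
    """Return a list of (question, url) pairs found in the page."""
    root = ET.fromstring(page_source.strip())
    rows = []
    for block in root.findall(BLOCK_XPATH):
        question = block.find(QUESTION_XPATH)
        link = block.find(LINK_XPATH)
        if question is None or link is None:
            continue  # skip blocks missing either the question or the link
        rows.append((question.text.strip(), link.get("href")))
    return rows


rows = extract_questions(SAMPLE_PAGE)
for question, url in rows:
    print(question, "->", url)
```

Each extracted pair corresponds to one database row: the question text and the web address of the page that answers it.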
The questions were first categorized according to Rothwell’s classification system by a single trained reviewer [, ]. This classification system was then expanded so that each question was assigned to one of 6 topics: clinical signs and symptoms, treatment, transmission, cleaning methods, activity modification, and policy. These topics were chosen based on previously published studies that examined the web and social media concerns of users during the COVID-19 pandemic [ - ] ( ).
Activity modification consisted of questions regarding the effectiveness of various activities or lifestyle changes in preventing COVID-19. Policy included questions detailing local or national policy changes enacted in response to COVID-19, including questions about economic support. A full listing of the criteria for each topic category is listed in.
In line with previous studies, the websites were categorized according to source: World Health Organization (WHO), Centers for Disease Control and Prevention (CDC), non-CDC government, academic, news, and other media [, ] ( ). Specifically, non-CDC government websites consisted of webpages directly maintained by a national governmental entity such as the National Institutes of Health (United States) or the National Health Service (United Kingdom). Academic websites were defined as those maintained by an organization with a clear academic mission statement. Other media consisted of websites not described by one of the previous categories, including CNET, WebMD, and Wikipedia. A full listing of the criteria for each web source category is listed in .
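As a rough illustration of the source categories above, the following Python sketch assigns a web address to one of the 6 sources by its host name. It is an approximation and not the procedure used in the study, in which websites were categorized against written criteria. The domain rules are assumptions: ".gov" and nhs.uk stand in for national governmental entities, ".edu" stands in for organizations with an academic mission, and the news domain list is a hypothetical example.

```python
from urllib.parse import urlparse

# Assumed example list of news domains; not taken from the study.
NEWS_DOMAINS = {"nytimes.com", "theguardian.com"}


def host_matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def categorize_source(url):
    """Map a URL to one of the 6 source categories using host-name rules."""
    host = (urlparse(url).hostname or "").lower()
    if host_matches(host, "who.int"):
        return "WHO"
    if host_matches(host, "cdc.gov"):
        return "CDC"
    # Proxy rule (assumption): other .gov hosts and nhs.uk count as government.
    if host.endswith(".gov") or host_matches(host, "nhs.uk"):
        return "Non-CDC government"
    # Proxy rule (assumption): .edu hosts count as academic.
    if host.endswith(".edu"):
        return "Academic"
    if any(host_matches(host, d) for d in NEWS_DOMAINS):
        return "News"
    # Everything else, e.g. CNET, WebMD, and Wikipedia in the text.
    return "Other media"


for example in (
    "https://www.cdc.gov/",
    "https://www.who.int/",
    "https://www.nih.gov/",
    "https://en.wikipedia.org/",
):
    print(example, "->", categorize_source(example))
```

The CDC check is placed before the generic ".gov" rule so that CDC pages are not absorbed into the non-CDC government category. Host-name rules of this kind cannot replace reviewer judgment, for example for academic medical centers hosted on ".org" domains.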
|Question classification by topic|
|Clinical signs and symptoms|
|Website categorization by source|
|World Health Organization (WHO)|
|Centers for Disease Control and Prevention (CDC)|
In total, 200 questions and their corresponding source of information were extracted; the top 25 questions are listed in. The most common question topic was transmission (n=63, 31.5%), followed by clinical signs and symptoms (n=54, 27.0%) and activity modification (n=31, 15.5%) ( ).
Most questions regarding the transmissibility of the coronavirus asked about specific modes of transmission such as spread through food, feces, air conditioning units, delivery packages, and mosquitoes. Interestingly, the clinical signs and symptoms category captured questions about myths associated with the disease, such as whether consuming alcohol stops the coronavirus. In the activity modification category, there were many questions about staying at home, wearing masks, and managing pre-existing travel plans. The most commonly asked question, “Can antibiotics treat the coronavirus disease?”, was classified as treatment, which, in total, comprised 9% (n=18) of the searched questions.
|Variable||Frequency, n (%)|
|Questions by topic (n=200)|
|Transmission||63 (31.5)|
|Clinical signs and symptoms||54 (27.0)|
|Activity modification||31 (15.5)|
|Policy||22 (11.0)|
|Treatment||18 (9.0)|
|Cleaning methods||12 (6.0)|
|Websites by source (n=200)|
|Centers for Disease Control and Prevention||73 (36.5)|
|Academic||48 (24.0)|
|World Health Organization||47 (23.5)|
|News||13 (6.5)|
|Non-CDC government||10 (5.0)|
|Other media||9 (4.5)|
With respect to sources of information, the most common websites provided were maintained by the CDC (n=73, 36.5%), academic medical organizations (n=48, 24.0%), and the WHO (n=47, 23.5%) (). With an additional 5% (n=10) of web information provided by a government source, an overwhelming majority of information (n=178, 89%) came from highly trustworthy web sources. However, the remaining 11% (n=22) of information came from either news or other media. In particular, 4.5% (n=9) of information came from web sources classified as other media, which included potentially erroneous sources of information such as Wikipedia.
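The percentages in this paragraph follow from the reported counts, as the short Python check below shows. The count for news websites is not stated directly in the text, so the sketch derives it as the remainder of the 200 websites. That derived value is an inference from the reported totals rather than a figure quoted from the study.

```python
TOTAL_WEBSITES = 200

# Counts stated in the text.
source_counts = {
    "CDC": 73,
    "Academic": 48,
    "WHO": 47,
    "Non-CDC government": 10,
    "Other media": 9,
}

# Derived (not stated directly): news is whatever remains of the 200 websites.
source_counts["News"] = TOTAL_WEBSITES - sum(source_counts.values())


def percent(n, total=TOTAL_WEBSITES):
    """Share of the total, as a percentage rounded to 1 decimal place."""
    return round(100 * n / total, 1)


top_three = source_counts["CDC"] + source_counts["Academic"] + source_counts["WHO"]
trusted = top_three + source_counts["Non-CDC government"]
remaining = TOTAL_WEBSITES - trusted

print("CDC + academic + WHO:", top_three, percent(top_three))
print("Plus non-CDC government:", trusted, percent(trusted))
print("News + other media:", remaining, percent(remaining))
```

The same `percent` helper reproduces the per-source figures, for example 73 of 200 for the CDC.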
In the midst of an unprecedented pandemic with significant economic and public health implications, the internet is a crucial source of information that the general public uses to guide everyday life. Because information is changing rapidly and the confusion is compounded by fallacies originating from places of authority, we believe that the pandemic highlights the role of physicians in providing patients the most reliable information based on the highest quality of evidence. Thus, the present study grouped the COVID-19 questions most commonly searched by Americans into 6 topic categories and identified the sources of information presented to the general public.
Previous studies have examined search and Twitter trends related to the COVID-19 pandemic from regions around the world, including the United States, China, Italy, and Spain [- , - ]. In April, Husnayain et al [ ] examined Google search trends in Taiwan, noting that searches for handwashing drastically increased after a perceived face mask shortage in the country. More recently, Rovetta et al [ ] examined Google search trends in Italy and characterized the most common search terms in the country, including “face mask,” “disinfectant,” “symptoms of the coronavirus,” “health bulletin,” and “vaccines.” In the United States, Chen et al [ ] examined over 100 million tweets to track social media conversations about the COVID-19 pandemic. However, to our knowledge, no study has examined internet search patterns related to COVID-19 in the United States. This question is of utmost importance for several reasons. First, the United States not only has the highest COVID-19 burden in the world but is also a country where recent well-publicized examples of misinformation originated from the head of state [ ]. Second, while Twitter effectively captures a significant source of information, it is by no means comprehensive, and the platform appeals to a select audience [ ].
Thus, the present study revealed that the most commonly searched topics about COVID-19 were transmission, clinical signs and symptoms, and activity modification. Understanding what matters to our patients should compel us to be well informed on these topics. We believe that as physicians, we should collectively take responsibility to provide reliable information based on the best available evidence. Even if we do not regularly manage patients with COVID-19, at the very least we should be prepared to answer the most common clinical questions asked online, such as those about modes of transmission or the status of a vaccine. Although many answers may be obvious to us, there remain many questions that are active areas of study for which we must remain up to date. Regardless of the specific details of our individual practice, we should always have the willingness to learn and the preparation to answer these commonly asked questions.
Further, the present study revealed that the sizable majority of internet sources provided were from major health organizations or from academic medical institutions. While we find this to be encouraging, we must remember that our patients also consume information from multiple other sources. The social media “echo chamber” phenomenon is an active area of study for sociologists and computer scientists and has been shown to rapidly propagate rumors or misinformation on a mass scale [- ]. Again, we believe that physicians should take responsibility for providing the best-quality information in the domains in which we hold influence. On a larger scale, perhaps there is a role for physicians to learn and adapt techniques employed by marketers and politicians to better communicate medical information with the public. However, for most of us, we believe that our role is simply to understand the concerns of our patients regarding COVID-19, to remain informed ourselves, and to be ready to answer their questions.
There are several limitations to the present study. First, the COVID-19 pandemic is rapidly changing, and the results of the present study captured web searches as of May 2020. Due to changes in pandemic characteristics, such as the emergence of new hotspots, it is entirely plausible that the focus of web searches has changed as well. Second, the present study captures only the web searches of users in New York during May 2020, because Google’s COVID-19 homepage generates the most commonly asked questions based on the user’s location and the date of search. Thus, we are unable to compare trends across locations or over time, which should be a direction for future studies. However, other published studies have examined Google searches in other regions of the world, including China, Taiwan, and Italy [- , ]. The results of the present study should therefore be used in conjunction with those studies to provide a more comprehensive view of search patterns across the world. Lastly, Google web search was the only search engine examined, and the present study does not capture information from alternative search engines. However, as previously noted, Google is by far the most widely used search engine in the United States [ ].
People use Google web search to identify sources of information about COVID-19. In the United States, the most commonly searched topics related to COVID-19 were transmission, clinical signs and symptoms, and activity modification. Reassuringly, the majority of information in the present study came from highly reputable sources, including the CDC, academic websites, and the WHO.
Conflicts of Interest
- Haleem A, Javaid M, Vaishya R. Effects of COVID-19 pandemic in daily life. Curr Med Res Pract 2020 Mar;10(2):78-79.
- Kache T, Mrowka R. How Simulations May Help Us to Understand the Dynamics of COVID-19 Spread - Visualizing Non-Intuitive Behaviours of a Pandemic (pansim.uni-jena.de). Acta Physiol (Oxf) 2020 Jun 04:e13520.
- Vogels EA, Perrin A, Rainie L, Anderson M. 53% of Americans Say the Internet Has Been Essential During the COVID-19 Outbreak. Pew Research Center. 2020 Apr 30. URL: https://www.pewresearch.org/internet/2020/04/30/53-of-americans-say-the-internet-has-been-essential-during-the-COVID-19-outbreak [accessed 2020-07-09]
- Koeze E, Popper N. The Virus Changed the Way We Internet. The New York Times. 2020 Apr 7. URL: https://www.nytimes.com/interactive/2020/04/07/technology/coronavirus-internet-use.html [accessed 2020-07-09]
- Nuti SV, Wayda B, Ranasinghe I, Wang S, Dreyer RP, Chen SI, et al. The use of google trends in health care research: a systematic review. PLoS One 2014;9(10):e109583.
- Fraval A, Ming CY, Holcdorf D, Plunkett V, Tran P. Internet use by orthopaedic outpatients - current trends and practices. Australas Med J 2012;5(12):633-638.
- Cuan-Baltazar JY, Muñoz-Perez MJ, Robledo-Vega C, Pérez-Zepeda MF, Soto-Vega E. Misinformation of COVID-19 on the Internet: Infodemiology Study. JMIR Public Health Surveill 2020 Apr 09;6(2):e18444.
- McCarthy T. 'It will disappear': the disinformation Trump spread about the coronavirus – timeline. The Guardian. 2020 Apr 14. URL: https://www.theguardian.com/us-news/2020/apr/14/trump-coronavirus-alerts-disinformation-timeline [accessed 2020-07-09]
- Wang L, Wang J, Wang M, Li Y, Liang Y, Xu D. Using Internet search engines to obtain medical information: a comparative study. J Med Internet Res 2012;14(3):e74.
- Tripathi S, Singh C, Kumar A, Pandey C, Jain N. Bidirectional Transformer Based Multi-Task Learning for Natural Language Understanding. In: Métais E, Meziane F, Vadera S, Sugumaran M, Saraee M, editors. Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science. Cham: Springer; 2019:54-65.
- Rothwell J. In Mixed Company: Communicating in Small Groups and Teams, 10th ed. New York: Oxford University Press; 2019.
- Kanthawala S, Vermeesch A, Given B, Huh J. Answers to Health Questions: Internet Search Results Versus Online Health Community Responses. J Med Internet Res 2016 Apr 28;18(4):e95.
- Abd-Alrazaq A, Alhuwail D, Househ M, Hamdi M, Shah Z. Top Concerns of Tweeters During the COVID-19 Pandemic: Infoveillance Study. J Med Internet Res 2020 Apr 21;22(4):e19016.
- Husnayain A, Fuad A, Su EC. Applications of Google Search Trends for risk communication in infectious disease management: A case study of the COVID-19 outbreak in Taiwan. Int J Infect Dis 2020 Jun;95:221-223.
- Tao Z, Chu G, McGrath C, Hua F, Leung YY, Yang W, et al. Nature and Diffusion of COVID-19-related Oral Health Information on Chinese Social Media: Analysis of Tweets on Weibo. J Med Internet Res 2020 Jun 15;22(6):e19981.
- López-Jornet P, Camacho-Alonso F. The quality of internet sites providing information relating to oral cancer. Oral Oncol 2009 Sep;45(9):e95-e98.
- Starman JS, Gettys FK, Capo JA, Fleischli JE, Norton HJ, Karunakar MA. Quality and content of Internet-based information for ten common orthopaedic sports medicine diagnoses. J Bone Joint Surg Am 2010 Jul 07;92(7):1612-1618.
- Chen E, Lerman K, Ferrara E. Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set. JMIR Public Health Surveill 2020 May 29;6(2):e19273.
- Rovetta A, Bhagavathula AS. COVID-19-Related Web Search Behaviors and Infodemic Attitudes in Italy: Infodemiological Study. JMIR Public Health Surveill 2020 May 05;6(2):e19374.
- Hernández-García I, Giménez-Júlvez T. Assessment of Health Information About COVID-19 Prevention on the Internet: Infodemiological Study. JMIR Public Health Surveill 2020 Apr 01;6(2):e18717.
- Sinnenberg L, Buttenheim AM, Padrez K, Mancheno C, Ungar L, Merchant RM. Twitter as a Tool for Health Research: A Systematic Review. Am J Public Health 2017 Jan;107(1):e1-e8.
- Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor--Let the reader and viewer beware. JAMA 1997 Apr 16;277(15):1244-1245.
- Choi D, Chun S, Oh H, Han J, Kwon TT. Rumor Propagation is Amplified by Echo Chambers in Social Media. Sci Rep 2020 Jan 15;10(1):310.
- Scanfeld D, Scanfeld V, Larson EL. Dissemination of health information through social networks: twitter and antibiotics. Am J Infect Control 2010 Apr;38(3):182-188.
- Vosoughi S, Roy D, Aral S. The spread of true and false news online. Science 2018 Mar 09;359(6380):1146-1151.
|CDC: Centers for Disease Control and Prevention|
|WHO: World Health Organization|
|XPath: XML Path Language|
Edited by G Eysenbach; submitted 10.07.20; peer-reviewed by Q Zhu, C McGrath; comments to author 01.09.20; revised version received 18.09.20; accepted 26.10.20; published 23.11.20
©Tony S Shen, Aaron Z Chen, Patawut Bovonratwet, Carol L Shen, Edwin P Su. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 23.11.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.