Published in Vol 22, No 8 (2020): August

When Public Health Research Meets Social Media: Knowledge Mapping From 2000 to 2018



1School of Media and Communication, Shenzhen University, Shenzhen, China

2Department of Communication, Michigan State University, East Lansing, MI, United States

*these authors contributed equally

Corresponding Author:

Bolin Cao, PhD

School of Media and Communication

Shenzhen University

Rm809, Building L7, South Campus, Shenzhen University

Nanhai Avenue, Nanshan District



Phone: 86 13247393943


Background: Social media has substantially changed how people confront health issues. However, a comprehensive understanding of how social media has altered the foci and methods in public health research remains lacking.

Objective: This study aims to examine research themes, the role of social media, and research methods in social media–based public health research published from 2000 to 2018.

Methods: A dataset of 3419 valid studies was developed by searching a list of relevant keywords in the Web of Science and PubMed databases. In addition, this study employs an unsupervised text-mining technique and topic modeling to extract research themes of the published studies. Moreover, the role of social media and research methods adopted in those studies were analyzed.

Results: This study identifies 25 research themes, covering different diseases, various population groups, physical and mental health, and other significant issues. Social media assumes two major roles in public health research: it produces substantial research interest and it furnishes a research context. Social media provides substantial research interest for public health research when it is used for health intervention, studied for its human-computer interaction characteristics, used as a platform of social influence, or used for disease surveillance, risk assessment, or prevention. Social media acts as a research context for public health research when it is a mere reference, a platform to recruit participants, or a platform for data collection. While both qualitative and quantitative methods are frequently used in this emerging area, cutting-edge computational methods play a marginal role.

Conclusions: Social media enables scholars to study new phenomena and propose new research questions in public health research. Meanwhile, the methodological potential of social media in public health research needs to be further explored.

J Med Internet Res 2020;22(8):e17582



Social media has deeply penetrated people’s lives in many aspects. In developing and developed societies, social media has played a significant role in health management and disease control [1]. Social media is integrated into empirical examinations of the prevention and control of various types of diseases, including emerging, infectious, and chronic diseases [2-4]. Social media has been employed to study health phenomena among different populations or social groups such as children, pregnant women, and older adults [5,6]; different genders [7,8]; and individuals in various social classes [9].

Agencies widely use social media to fulfill different health purposes. For the general public, social media is used to satisfy its orientation toward health information, to link with health services, and to communicate with others who share the same health interests [10]. For public health professionals and organizations, social media (eg, Facebook, Grindr, mobile apps) serves as a multifunctional tool to launch interventions that efficiently reach a wide array of the population [11-14]. The large volume of mobility and discourse data on social media (eg, Twitter) can be conducive to public health management, including disease surveillance, assessment, and control [15-17].

These studies have demonstrated the increase of scholarly interest in empirical research conducted on social media platforms with public health goals, including the social media–based public health research in this study [18-21]. Although many scholars in social science and public health have contributed to this field, an overview of how social media has been integrated into public health research remains limited. Prior systematic reviews on social media–based public health research often focused on certain domains or topics. Many reviews systematically investigated the effectiveness of social media interventions for specific health outcomes, such as the promotion of safe sexual health behaviors [22], vaccine uptake [23], noncommunicable disease management [24], and HIV prevention [19]. Scholars are often dedicated to one or two domains, neglecting the fact that using social media in one field may shed light on another. Meanwhile, by focusing on one particular area, these reviews often struggle to identify similar patterns across domains and to capture an integrated picture of social media–based public health research. In addition, most existing reviews included a limited number of original articles [25,26]. Even a review of systematic reviews extracted only a few studies [18]. In this emerging and fast-growing subject field, the limited literature included may fail to provide a panoramic description of social media–based public health research.

Furthermore, most prior systematic reviews have adopted a top-down approach and therefore may have narrowed the view by overlooking certain nuances and novelties that have been emerging. This study adopts a bottom-up approach [27] to understand the growth of social media–based public health research and remains open in mapping the intellectual landscape in this area. Specifically, the study aims to address the following research questions:

  • RQ 1: What are the major publication trends of social media–based public health research since 2000?
  • RQ 2: What are the major research themes in social media–based public health research since 2000?
  • RQ 3: What role does social media play in social media–based public health research since 2000?
  • RQ 4: What are the major research methods adopted in social media–based public health research since 2000?

Data Collection

To examine how social media has been adopted and integrated into public health research, lists of disease and social media keywords were established and used to search the Web of Science (Clarivate Analytics) and PubMed databases (see Figure 1). This study focused on emerging, infectious, and chronic diseases. Specifically, 14 diseases were selected that are highly prevalent in the population or pose major public health threats according to the World Health Organization [28]: influenza, HIV, hepatitis A, hepatitis B, hepatitis E, dengue, Ebola, Middle East respiratory syndrome (MERS), asthma, diabetes, obesity, cancer, oral disease, and alcohol use. A list of disease keywords was constructed for these conditions. Then a list of social media keywords, covering both general social media categories and specific social media platforms, was established.

Figure 1. Knowledge-mapping workflow of social media–based public health research from 2000 to 2018.

The lists of disease and social media keywords were combined pairwise and used to search the titles, abstracts, and keywords of published studies in the Web of Science and PubMed databases. The publication period was limited to 2000 through 2018. The article language was limited to English, and the document type was limited to scholarly journal articles. Because the PubMed database includes numerous medical and clinical studies that are beyond the scope of this study, the search results were refined by restricting the broad subject terms to related categories, such as public health and medical informatics, to reduce noise in the search results. This search strategy identified 5271 articles in the two databases. Another round of data checking was then implemented to remove unqualified records such as non–social media–related articles, duplicate records, and clinical studies, leaving a dataset of 3419 articles for further analysis. For each of the 3419 articles, document-level information, including authors, article title, journal title, abstract, author keywords, and cited references, was retrieved from the databases.
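The pairwise combination of the two keyword lists can be sketched as follows. This is a minimal illustration only: the term subsets, the `build_queries` helper, and the generic `AND` query syntax are assumptions made for the example and do not reproduce the field-tagged syntax of Web of Science or PubMed or the full keyword lists used in the study.

```python
from itertools import product

# Illustrative subsets only (assumption); the full keyword lists are not reproduced here.
disease_terms = ["influenza", "HIV", "diabetes"]
social_media_terms = ["social media", "Facebook", "Twitter"]


def build_queries(diseases, platforms):
    """Pair every disease term with every social media term as a boolean query."""
    return [f'"{d}" AND "{p}"' for d, p in product(diseases, platforms)]


queries = build_queries(disease_terms, social_media_terms)
print(len(queries))  # 9 pairs for 3 x 3 terms
print(queries[0])    # "influenza" AND "social media"

# Screening arithmetic reported in the text: records identified vs retained.
identified, retained = 5271, 3419
removed = identified - retained
print(removed)       # 1852 records removed during data checking
```

With the full lists, the number of queries is simply the product of the two list lengths, which is why deduplication across queries and databases is needed afterward.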

Data Analysis

An automatic text-mining approach was adopted to extract research themes in the field of social media–based public health research. As the abstracts of published studies conveyed the themes or foci of the articles [29], the article abstracts were mined through latent Dirichlet allocation (LDA) topic modeling, which is a popular unsupervised text-mining technique in computational social science. LDA topic modeling helps recognize the structure of research development, current trends, and interdisciplinary landscapes of research [27]. The LDA topic modeling [30] was implemented with the tm package in the R software (R Foundation for Statistical Computing). Data preprocessing, such as removal of stop words and numbers, was performed before the LDA topic modeling.
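The preprocessing step described above (removal of stop words and numbers) can be sketched in a few lines. The study itself used R; the Python sketch below is only an illustration, and the stop-word list, the `preprocess` and `document_term_counts` helpers, and the two sample abstracts are all hypothetical. Stemming, which the truncated terms in Table 2 suggest was applied, is not included here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the study's list is not given).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on",
              "with", "was", "were", "is"}


def preprocess(abstract):
    """Lowercase the text, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", abstract.lower())  # numbers and punctuation are skipped
    return [t for t in tokens if t not in STOP_WORDS]


def document_term_counts(abstracts):
    """Return one term-frequency Counter per abstract (a simple document-term matrix)."""
    return [Counter(preprocess(a)) for a in abstracts]


# Hypothetical abstracts, for illustration only.
abstracts = [
    "A mobile app for weight loss was tested in 2016 with 120 adults.",
    "Tweets were mined for influenza surveillance in the 2013 season.",
]
dtm = document_term_counts(abstracts)
print(preprocess(abstracts[0]))  # ['mobile', 'app', 'weight', 'loss', 'tested', 'adults']
print("2016" in dtm[0])          # False
```

The resulting term counts are the kind of input that an LDA implementation consumes; the model fitting itself is not reproduced here.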

Numerous LDA topic models were estimated with various numbers of topics. These models were evaluated on the basis of three main criteria: (1) a substantial proportion of articles falls under each topic; (2) themes are independent of one another, and the lists of top terms of the topics neither overlap heavily nor contain irrelevant terms; and (3) models differing in the number of themes are compared to identify nuanced differences and determine the best theme extraction by ensuring that each term list is coherent. Finally, a topic model with 25 topics, which presented adequate discrimination between topics and convergence within a topic, was selected. Each article was classified into the research theme for which it had the highest probability score.
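The final classification step, and the first evaluation criterion (the proportion of articles under each topic), can be illustrated with a short sketch. The document-topic probabilities below are invented for 4 articles and 3 topics, and the helper names `assign_themes` and `theme_shares` are hypothetical; the study worked with 25 topics and 3419 articles.

```python
def assign_themes(doc_topic_probs):
    """Return the 1-based index of the most probable topic for each article.

    Ties are resolved in favor of the lower-numbered topic.
    """
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in doc_topic_probs]


def theme_shares(assignments, n_topics):
    """Proportion of articles assigned to each topic."""
    counts = [0] * n_topics
    for topic in assignments:
        counts[topic - 1] += 1
    return [c / len(assignments) for c in counts]


# Hypothetical document-topic probabilities (each row sums to 1).
doc_topic_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
]
assignments = assign_themes(doc_topic_probs)
print(assignments)                   # [1, 3, 2, 1]
print(theme_shares(assignments, 3))  # [0.5, 0.25, 0.25]
```

A topic that attracts a very small share of articles under this hard assignment would be a signal, under criterion 1, that the chosen number of topics is too large.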

To understand how social media has been integrated into public health research with different thematic foci, a manual content analysis was conducted on a randomly selected sample of 500 abstracts (20 from articles in each theme). A coding scheme was developed by two authors of this study. The two authors first separately coded a subsample of 60 randomly selected abstracts to construct the coding scheme. After several rounds of exploration and discussion, they achieved satisfactory intercoder reliability (as measured with Cohen kappa). The role of social media was categorized into two main types. First, social media provides a substantial research interest for public health research, which includes the use of social media for intervention, the study of human-computer interaction characteristics, social media as a platform of social influence, and its use for risk assessment or disease prevention. Second, social media is employed as a context in public health research, with social media as a mere reference, as a participant recruitment tool, or as a data source. Categories and their definitions are further illustrated in the analytical findings section (also see Multimedia Appendix 1).
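For readers unfamiliar with the reliability measure, Cohen kappa compares the observed agreement between two coders with the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The sketch below computes it for two hypothetical coders; the labels and codes are invented, and the study's actual kappa value is not reported in this section.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen kappa for two coders who labeled the same items with nominal codes."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for 6 abstracts ("interest" vs "context").
coder_a = ["interest", "interest", "context", "context", "interest", "context"]
coder_b = ["interest", "context", "context", "context", "interest", "context"]
kappa = cohen_kappa(coder_a, coder_b)
print(round(kappa, 3))  # 0.667
```

In this toy example the coders agree on 5 of 6 items (p_o ≈ 0.833) while chance agreement is 0.5, which gives a kappa of about 0.667.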

Finally, the research methods adopted by social media–based public health research were identified by searching a list of keywords associated with various research methods among the titles, abstracts, and keywords of the retrieved studies. Figure 1 summarizes the study workflow.
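The keyword-based identification of research methods can be sketched as a simple dictionary lookup over the title, abstract, and keyword text of each record. The keyword lists in `METHOD_KEYWORDS`, the `detect_methods` helper, and the sample record are hypothetical; the study's actual search terms are not listed in this section.

```python
import re

# Hypothetical keyword lists per method category (assumption, for illustration).
METHOD_KEYWORDS = {
    "survey": ["survey", "questionnaire"],
    "experiment": ["randomized", "experiment", "trial"],
    "qualitative": ["interview", "focus group"],
    "digital": ["text mining", "sentiment analysis", "machine learning"],
    "content analysis": ["content analysis"],
}


def detect_methods(text):
    """Return the method categories whose keywords occur at a word start in the text."""
    lowered = text.lower()
    found = []
    for method, keywords in METHOD_KEYWORDS.items():
        if any(re.search(r"\b" + re.escape(k), lowered) for k in keywords):
            found.append(method)
    return found


record = "A randomized controlled trial with follow-up questionnaires and focus groups."
print(detect_methods(record))  # ['survey', 'experiment', 'qualitative']
```

Because a record can match several categories, percentages derived this way need not sum to 100%, and prefix matching of this kind can produce false positives that manual checking would need to catch.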

Publication Trends in Social Media–based Public Health Research

Publication trends in the field of social media–based public health research in the past two decades were presented in 3 dimensions: growth of overall publications, growth of publications by specific diseases, and growth of journal outlets.

Empirical studies in this area were relatively limited in the first decade (2000 to 2010) and showed only a minimal increase (Figure 2). A significant annual increase was observed from 2011 to 2018. Such trends intersected with the advancement of the internet, especially that of social media. Although popular social media platforms, such as Facebook, Twitter, and Instagram, were launched by 2010, they became widely adopted worldwide only after 2010. This implies that social media–based public health research is a study area responsive to technological development. Prior research similarly found that internet research evolved along with technological development [29].

Social media has been increasingly incorporated into the studies of certain types of diseases in the past decades. Dramatic increases in research occurred on cancer, HIV, diabetes, obesity, and alcohol use after 2010. Other diseases, such as influenza, hepatitis A, hepatitis B, dengue, Ebola, MERS, asthma, and oral disease, showed a relatively slow growth rate that remained quite stable from 2000 to 2018.

For the journal outlets, a total of 799 journals published studies in this area (see Figure 3). Table 1 reports the 15 most visible journals. Among them, Journal of Medical Internet Research, which published 331 articles, is the most visible, accounting for 9.68% of the total publications. Moreover, JMIR sister journals, such as JMIR mHealth and uHealth (165 publications), JMIR Research Protocols (114 publications), and JMIR Public Health and Surveillance (49 publications), also showed great interest in this domain. Other high-ranking journals included PLoS One, BMC Public Health, Studies in Health Technology and Informatics, AIDS and Behavior, and BMJ Open, each accounting for more than 1.5% of the publications in this field.

Figure 2. Total number of articles for each disease in the Social Sciences Citation Index from 2000 to 2018.
Figure 3. Number of unique outlets that published output of social media–based public health research from 2000 to 2018.
Table 1. Top journals in social media–based public health research.
Number | Journal name | Publications, n (%)
1 | Journal of Medical Internet Research | 331 (9.7)
2 | JMIR mHealth and uHealth | 165 (4.8)
3 | PLoS One | 123 (3.6)
4 | JMIR Research Protocols | 114 (3.3)
5 | BMC Public Health | 72 (2.1)
6 | Studies in Health Technology and Informatics | 68 (2.0)
7 | AIDS and Behavior | 53 (1.5)
8 | BMJ Open | 52 (1.5)
9 | JMIR Public Health and Surveillance | 49 (1.4)
10 | Journal of Health Communication | 49 (1.4)
11 | International Journal of Medical Informatics | 37 (1.1)
12 | Journal of Diabetes Science and Technology | 37 (1.1)
13 | Health Communication | 34 (1.0)
14 | Computers in Human Behavior | 33 (1.0)
15 | International Journal of Environmental Research and Public Health | 31 (0.9)

Research Themes in Social Media–Based Public Health Research

The 25 extracted research themes were labeled on the basis of the top 15 most frequently used terms associated with each theme and the articles assigned to the theme. Table 2 presents the lists of terms under each theme. A network graph of theme-word probability of the 25 research themes is provided in Multimedia Appendix 2. Multimedia Appendix 3 displays a typical study under each theme.

The percentages in Table 2 reveal the article distribution across research themes. The share of articles under each theme varied greatly, from 2.22% (76/3419) to 7.93% (271/3419; Table 2), with men and HIV containing the most articles and reproductive cancers the fewest. The themes about mHealth (themes 1, 2, 3, 4) together contained a large body of 653 articles. Themes about substance use (themes 6, 7, 8) comprised 409 articles. Another big cluster was cancer (themes 10, 12, 13), which consisted of 385 articles.
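The group totals and percentages quoted above follow directly from the article counts in Table 2, as the short check below illustrates. The counts are taken from Table 2; the helper names `share` and `group_total` are introduced only for this example.

```python
TOTAL_ARTICLES = 3419

# Article counts per theme number, copied from Table 2 (only the themes used here).
theme_counts = {1: 208, 2: 192, 3: 103, 4: 150, 5: 271,
                6: 174, 7: 115, 8: 120, 10: 151, 12: 76, 13: 158}


def share(n, total=TOTAL_ARTICLES):
    """Percentage of all articles, rounded to 2 decimals."""
    return round(100 * n / total, 2)


def group_total(theme_ids):
    """Sum of article counts over a group of themes."""
    return sum(theme_counts[t] for t in theme_ids)


print(group_total([1, 2, 3, 4]))  # 653 (mHealth themes)
print(group_total([6, 7, 8]))     # 409 (substance use themes)
print(group_total([10, 12, 13]))  # 385 (cancer themes)
print(share(76), share(271))      # 2.22 7.93
```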

The 25 research themes were further grouped into 6 research clusters on the basis of similar concerns and associations. The first cluster was on health education, which comprises 4 themes: health education–school and students, health education–family and oral/dental health, mHealth and medical decisions, and pregnancy. Health education aims to prevent diseases through improving people’s knowledge and health efficacy. School and family, as the main scenes for the students to learn health beliefs and behaviors, have been the foci of health education. Sexual health education on condom use and pregnancy have also attracted increasing scholarly attention.

The second cluster was on health management with the help of mHealth. The themes mHealth and weight management, mHealth and diabetes management, digital campaigns in targeted populations, alcohol drinking and social media, substance use and cessation, food and asthma, and vaccination and immunization all fell into this cluster. This indicates the functional attributes of social media to help manage health problems. Social media is used to intervene in certain unhealthy behaviors and promote the adoption of healthy behaviors.

The third cluster, cancer studies, includes women’s cancer, reproductive cancers, cancer survivor care, and caregiving on social media. Cancer is one of the world’s largest health problems and a significant cause of death. Thus, continuous attention has been paid to cancer studies.

The fourth cluster, infectious diseases, has HIV as a key topic and comprises three themes: mHealth and HIV; men and HIV; and infectious disease, health campaigns, and stigma. In this line of research, social media provides breakthrough channels to reach at-risk subgroups and draws more attention to campaigns that reduce the stigma surrounding infectious diseases.

The fifth cluster was on mental health issues. This cluster consists of two themes: mental health and substance use and mental health–depression and digital technology. Mental health problems have been prominent in modern society. Digital technology is considered a cause and a solution to mental health issues.

The sixth cluster was on extended health research empowered by social media: health and human mobility, health marketing, surveillance and Twitter, and eHealth–miscellaneous. These research areas have flourished due to the availability of geographical information, mass user behavioral data, and extensive online discourse on social media platforms.

Table 2. Research themes and the top 15 keywords under each theme.
Number | Research themes | Top 15 keywords | n (%)
1 | mHealth and weight management | App, weight, loss, selfmonitor, usabl, mHealth, adher, exercise, download, dietary, BMI, Fit, Android, coach, mainten | 208 (6.08)
2 | mHealth and diabetes management | diabet, selfmanag, glucos, usabl, mHealth, adolesc, HbA1c, TDM, young, selfcar, older, glycem, insulin, selfefficacy, cardiovascular | 192 (5.62)
3 | mHealth and medical decisions | decis, mHealth, intent, peer, consum, screen, trust, cell, doctor, choic, privacy, navig, leader, read, worker | 103 (3.01)
4 | mHealth and HIV | adher, mhealth, HIV, literacy, SMS, ART, portal, selfmanag, PLWH, retent, digit, RCTs, beta, viral, nurs | 150 (4.39)
5 | Men and HIV | HIV, men, sexual, MSM, partner, PrEP, AOR, gay, condom, drug, young, STI, YMSM, websit, Latino | 271 (7.93)
6 | Alcohol drinking and social media | alcohol, drink, young, consumpt, student, Facebook, post, colleg, alcoholrel, SNS, exposur, norm, adolesc, market, peer | 174 (5.09)
7 | Substance use and cessation | smoke, cessat, smoker, drug, tobacco, quit, marijuana, cigarett, EMA, abstin, substanc, addict, ecolog, alcohol, momentary | 115 (3.36)
8 | Mental health and substance use | symptom, mental, substanc, pain, disord, veteran, drug, cope, distress, adolesc, tan, fatigu, abus, stress, pro | 120 (3.51)
9 | Mental health–depression and digital technologies | depress, pain, digit, anxiety, adolesc, symptom, suicid, cancer, disord, older, memory, genet, injury, dengu, young | 112 (3.28)
10 | Women’s cancer | cancer, breast, screen, women, vaccin, HPV, campaign, prostat, colorect, cervic, lung, imag, news, cancerrel, papillomavirus | 151 (4.42)
11 | Pregnancy | women, pregnanc, pregnant, mother, gestat, GDM, matern, worker, CHWS, child, contracept, HCV, mHealth, postpartum, antenat | 117 (3.42)
12 | Reproductive cancers | cancer, ovarian, prostat, gene, polymorph, cultur, genotyp, postop, cohort, predict, nutrit, genet, PON, BRCA, surgic | 76 (2.22)
13 | Cancer survivor care | cancer, survivor, emot, breast, psychosoci, oncolog, young, modul, wellb, survivorship, AYA, selfmanag, QOL, depress, consult | 158 (4.62)
14 | Caregiving on social media | Facebook, post, caregiv, page, blog, comment, emot, CRC, virtual, channel, Twitter, friend, profil, chat, fit | 136 (3.98)
15 | Vaccination and immunization | vaccin, influenza, predict, coverag, season, event, news, queri, flu, immun, volum, forecast, surveil, websit, outbreak | 119 (3.48)
16 | Infectious disease, health campaigns and stigma | Ebola, HIV/AIDS, campaign, epidem, stigma, outbreak, audienc, Africa, outreach, facil, news, post, neighbourhood, stori, IBD | 110 (3.22)
17 | Food and asthma | food, asthma, nutrit, game, children, intak, consumpt, dietari, exposur, veget, infant, eat, weight, beverag, feed | 146 (4.27)
18 | Health education–family and oral/dental health | parent, websit, children, oral, dental, child, readabl, read, grade, instrument, discern, childhood, rank, pediatr, page | 128 (3.74)
19 | Health education–school and students | student, nurs, physician, mHealth, skin, school, melanoma, sun, cluster, EMR, hypertens, rural, India, NCDS, CVD | 107 (3.13)
20 | Health and human mobility | map, street, neighborhood, urban, walk, audit, built, sale, resid, agreement, happi, crowdsourc, hookah, LOS, imag | 111 (3.25)
21 | Digital campaigns in targeted populations | youth, advertis, rural, Hispan, adolesc, percent, young, homeless, campaign, urban, digit, women, black, cultur, underserv | 131 (3.83)
22 | Health behavior guidelines | behavior, guidelin, usag, practition, programm, ethic, sedentary, men, websit, GPS, ICT, screen, kingdom, citat, geosoci | 86 (2.52)
23 | Health marketing and social media | video, YouTub, market, brand, product, girl, adolesc, tobacco, industry, consum, company, ecigarett, boy, cigarett, surgery | 100 (2.92)
24 | Surveillance and Twitter | tweet, Twitter, surveil, post, sentiment, ILI, influenza, outbreak, detect, drug, opinion, mention, retweet, marijuana, pandem | 165 (4.83)
25 | eHealth–miscellaneous | eHealth, phase, eat, referr, client, mHealth, telemedicin, uncertainty, COPD, reward, PLHIV, emot, EVD, static, compet | 133 (3.89)

Roles of Social Media in Public Health Research

Social Media as Research Context or Substantial Interest

Social media is integrated into public health research by providing a new research context or producing new substantial interest in public health research.

When social media was adopted as a research context, it was considered a mere reference, a platform for participant recruitment, or a data source. When social media was adopted as a mere reference, research mostly used social media as a tool to offer intervention and facilitate the health management of individuals. For the role of a platform for recruitment, research either recruited participants by distributing questionnaires or posting recruitment announcements on social media (eg, Facebook) or took users of certain social media platforms as the study target group (eg, Grindr users for studies of men who have sex with men). For the role of data source, social media contributed data in text, image, video, and app interface formats, as well as published posts and articles for meta-analysis or scoping review.

When social media produced substantial interests for public health research, social media was used for intervention; employed to study human-computer interaction characteristics; used as a platform of social influence; and used for disease surveillance, risk assessment, or prevention. Under these 4 broad categories, the role of social media is described as follows:


For public health intervention, the 4 subroles of social media in the published studies are as follows: (1) interactive intervention tool targeted at changing personal and environmental risky health factors, (2) intervention information-distributing tool (1-way and not real-time interactive), (3) source for health information seeking, such as YouTube and other platforms, and (4) usability test of social media platforms as intervention instruments.

Human-Computer Interaction Characteristics

Under this role, social media was used to serve the goal of revealing (1) the public’s attitudes toward technology and social media for health use, (2) characteristics and behaviors of social media users and groups, (3) factors affecting the health behaviors or attitudes of users on social media, and (4) consequences/influences on health behaviors caused by (popular) social media.

Platform of Social Influence

Social networking and interaction between different individuals and groups on social media could facilitate the change of health behaviors through the following approaches: (1) building online (support) groups for patients, such as cancer patient groups on Facebook, (2) promoting physician-patient communication or information seeker–provider communication, (3) enhancing health-related marketing, such as precision advertising, and (4) changing public health behavior at a macro level. All of these approaches are representations of social influence in online communities.

Disease Surveillance, Risk Assessment, or Prevention

The digital traces of online behaviors and massive online discourse granted opportunities to understand health conditions at a population level. For instance, Google trends and search query records could grant references to predict the possibility of a flu outbreak at an early stage.

Figure 4 presents the percentages of articles in each type of social media role. The results showed that within substantial interest, “an interactive intervention tool targeted at changing personal and environmental risky health factors” accounted for the largest percentage (19.2%), followed by “usability test of social media platforms as intervention instruments” (10.6%) and “a source for health information seeking” (8.0%). Four other types, “characteristics and behaviors of social media users and groups” (7.4%), “consequences/influences on health behaviors caused by (popular) social media” (6.4%), “enhancing health-related marketing” (6.4%), and “changing public health behavior as macro influence” (6.4%), each also accounted for a relatively large proportion of more than 6%.

Within the dimension of social media as research context (Figure 5), “as a mere reference,” which took social media as a research background or a research environment, played the dominant role (47.2%). The second most frequent role was content data source (ie, text/picture/video/app data sources; 24.8%). The other three were “social media as article search platform for meta-analysis or literature review” (11.6%), “for participant recruitment” (10.6%), and “as platforms to recruit their users” (1.2%).

Figure 4. Article distributions based on social media as substantial interest.
Figure 5. Article distributions based on social media as research context.

Research Methods in Social Media–Based Public Health Research

Public health research with social media data was dominated by traditional quantitative research methods, whereas cutting-edge computational methods played a minor role. Among all the articles, 30.6% employed survey methods, 24.0% employed experimental designs, 22.7% employed qualitative methods (eg, field observation, in-depth interview, and focus group), 8.3% employed digital methods (including digital tracks analysis and computational methods, such as text mining, sentiment analysis, agent-based modeling, and network modeling), and 5.6% employed traditional content analysis.

Figure 6 demonstrates that the method distributions under the 25 research themes were similar to the general distribution across the whole body of studies. Survey and experiment were the two most adopted methods, whereas the number of review articles was relatively small across all themes.

Figure 6. Research methods adopted under research topics.

Principal Findings

With a bottom-up approach, this study provided a panoramic mapping of the landscape of social media–based public health research. By analyzing publication trends, research themes, roles of social media, and research methods adopted in this emerging research area, this study concluded that (1) social media has penetrated almost all health-related processes and domains since 2010, as shown by a dramatic increase in the body of research; (2) existing social media–based public health research mainly focuses on 25 themes in 6 clusters; (3) social media generally played two roles in public health research: generating substantial research interest and providing a research context/platform; and (4) existing social media–based public health research is dominated by traditional research methods, while the share of computational methods is on the rise. The panoramic mapping can help scholars understand the state of the art in this research area and what is under- or overstudied in this field. This study can enable scholars across various disciplines to understand each other’s needs and contribute jointly to health promotion and disease control. Here, three notable issues that possess theoretical and methodological implications in social media–based public health research are elaborated.

When Public Health Research Meets Social Media: From New Phenomena to New Questions

Social media has infiltrated almost all health-related processes and domains with the rapid advancement of social and mobile media. This dramatic change has entailed many new phenomena to be explored in public health research. The findings of this study are consistent with previous studies that found that almost one-third of internet studies have focused on eHealth and mHealth since 2009 [29] and a trend toward digitization exists in health care [31]. Many traditional public health activities, such as health education, health promotion, and disease surveillance, have taken advantage of social media technologies to become digitized [32-34]. Social media has substantially altered how individuals seek and share health information, discuss health issues, and engage in health behaviors [35]. Social media also provides innovative ways to change health behaviors in various domains, such as smoking cessation, substance use, weight control, HIV prevention, and cancer screening [36-38]. Consistent with previous reviews on social media and public health studies, this study concludes that social media contributes to these public health domains by broadening the reach of health education, providing accessible online professional consultation, and improving the efficacy of access to care and medication uptake, etc [19,39]. Moreover, an upward trend of integrating social media in various public health campaigns exists due to the instrumental benefits of social media technologies, such as lower intervention cost, higher user engagement, higher efficiency, and better documentation of the process [40].

When public health research meets social media, new topics have emerged and attracted the attention of public health scholars [41]: mHealth and social media–empowered health research. “Digital campaigns in targeted populations” and “surveillance and Twitter” are typical new topics where researchers frequently examine new research questions [42,43]. For example, researchers discuss how to employ user-generated content together with geolocation information to predict an outbreak of an emerging disease or to visually map its diffusion routes and locate the at-risk population [44]. The digital trace on social and mobile media offers many possibilities to study online health behaviors such as online health information–seeking, online social support, and online medical consultation behaviors [45]. In addition, some health topics have attracted burgeoning attention in the era of social media. For instance, mental health problems have been identified as significant concerns among the 25 themes in this study. However, no conclusion has yet been reached on whether and how the adoption and use of social media alleviates or exacerbates mental health problems [46].

This relatively new domain calls for in-depth exploration. New phenomena and new questions raised by social media are of practical and theoretical significance for public health research. Timely responses to those new phenomena via scientific research can promote the advancement of the domain to keep up with technology advances and establish a realistic understanding of what social media can and cannot do in public health. Meanwhile, public health research should delve into scientific research questions behind those new phenomena and address them either by exploiting the existing body of knowledge or exploring new methods and knowledge to extend the domain.

When Public Health Research Meets Social Media: Methodological Potential to Be Further Tapped

When public health research meets social media, the dominant research methods remain traditional quantitative methods, despite the growing interest in computational methods among public health scholars. The methodological potential of social media for public health research can and should be further tapped. Specifically, the potential of social media in participant recruitment and measurement development has direct and salient implications for public health research.

Social media substantially facilitates participant recruitment in public health research. Recruiting research participants from specific groups of individuals who have sensitive health issues or are stigmatized in society, such as people living with HIV/AIDS or individuals with mental health issues, remains a significant challenge for public health scholars [19,47]. Given the size and heterogeneity of social media users, recruiting a fairly sizable number of subjects from particular social groups to participate in public health surveys and experiments should be possible. More importantly, participants recruited from online platforms such as Facebook and Amazon’s Mechanical Turk can have significant heterogeneity in their demographic characteristics (eg, age, gender, race, cultural background) and other key variables relevant to specific research contexts [48]. Nevertheless, the representativeness of participants recruited on social media needs to be empirically evaluated in particular contexts. For example, Amazon’s Mechanical Turk workers are not a generalizable population with regard to health status and behaviors in the United States [49]. Without an empirical evaluation of the representativeness of subjects recruited on social media, researchers should be cautious about generalizing their research findings. Moreover, ethical issues involved in participant recruitment via social media platforms have become more prominent and challenging. Due to the anonymity of social media users, it is extremely difficult, if not impossible, to obtain informed consent beforehand from recruited participants. When users of a social media platform accept the terms of service of the platform, can researchers assume that the users have given explicit or implicit consent to participate in any type of experiment or intervention conducted on the platform [50]? No widely accepted ethical guideline exists in this regard. A collective effort from the scientific community is needed to outline responsible and ethical conduct in this emerging research area.

Social media contributes to public health research by providing refreshed measurements of existing concepts or new observations of emerging phenomena. Rich semantic information in digital traces can provide a social telescope [51] with which to observe or infer what health information is produced, shared, and consumed by ordinary users. Multiple social and interactive relations in digital traces facilitate empirical studies on who connects with whom in various contexts. Voluminous and real-time social media data have been widely employed for epidemic surveillance or tracking emotional contagion [52,53]. A growing number of studies have employed user-generated content on social media to monitor emerging diseases at the outbreak stage to minimize consequences or to track trends in public health issues [54-56]. When public health scholars embrace new measures derived from social media data, it is necessary to empirically assess and monitor the quality of these measures by cross-validating them with established measures. The parable of Google Flu Trends illustrates the necessity of such cross-validation. When Google Flu Trends was first released, it outperformed traditional flu surveillance measures adopted by the US Centers for Disease Control and Prevention [57]. However, Google Flu Trends was later reported to overestimate flu cases in the United States [58]. Validation of empirical measures is an ongoing process in public health research and beyond [59].
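
As a rough illustration of the cross-validation described above, the following Python sketch compares a hypothetical social media–derived series with an established surveillance series. The weekly values, the function names, and the choice of Pearson correlation and mean relative bias as checks are assumptions made for illustration; they are not drawn from this study or from Google Flu Trends data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sd_x * sd_y)


def mean_relative_bias(new_measure, reference):
    """Average of (new - reference) / reference; positive means overestimation."""
    if len(new_measure) != len(reference):
        raise ValueError("series must have the same length")
    return sum((n - r) / r for n, r in zip(new_measure, reference)) / len(reference)


# Hypothetical weekly values, made up for illustration only.
established = [1.0, 2.0, 3.0, 4.0]   # eg, an established surveillance indicator
social_media = [1.5, 3.0, 4.5, 6.0]  # eg, an estimate derived from online data

r = pearson_r(social_media, established)
bias = mean_relative_bias(social_media, established)
print(f"correlation = {r:.2f}, mean relative bias = {bias:.2f}")
```

In this constructed example, the two series are perfectly correlated, yet the new measure exceeds the reference by 50% on average. This suggests that a high correlation alone may not be sufficient evidence of validity and that the size and direction of errors may also need to be monitored over time.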

When Public Health Research Meets Social Media: Unequal Status With Detached Concerns?

Social media–based public health research lies at the crossroads between public health studies and social science studies on information and communication technologies (ICTs) [60] and benefits from both perspectives. In this cooperative process, social media–based public health research elaborates on the two perspectives to different degrees. From the perspective of public health interest, many acceptability studies and randomized controlled trials have examined the effectiveness of social media in reaching different public health goals [40,61]. In these studies, social media is often considered a new functional tool to improve public health. Meanwhile, in research that further examines the influence of ICTs on public health, the core concern is who used what social media content targeted at whom through which social media platforms with what health effects [62]. In this line of research, studies typically focus on the transmission of health information, communication between health agencies, the uses of health apps, and so on [63]. The inherent concerns of these two lines of research seem to be detached, though not in conflict, in that social media facilitates the public health promotion process, and public health outcomes add value to communication through social media.

An overview of social media–based public health research shows that the dominant approach in the published studies treats public health issues as the substantive interests and ultimate outcomes rather than regarding social media as an equally important area of concern. Many articles used limited space to describe the use of social media in health promotion campaigns or projects [64,65]. The subordinate role of social media suggests that the potential of ICTs has not been fully realized in the domain of public health [41]. Empirical studies should not only focus on what social media can contribute to public health research but should also examine how and why social media can make an impact in various contexts of public health research. Doing so can substantially improve the understanding of the intended as well as unintended consequences that social media can exert on health attitudes and behaviors. It can also enable public health researchers to integrate social media further into their research designs.


Despite its strengths and contributions, this study has certain limitations. First, the study may suffer from the file drawer effect given that only studies indexed in Web of Science and PubMed were included. Empirical studies published in other outlets were not considered here. Future studies should expand the pool to include conference proceedings and articles indexed in other databases. Second, this study used numerous diseases as search terms in the initial search, but the list remains incomplete. Some important conditions, such as mental disorders, were not incorporated. Despite this, the topic modeling captured mental health as a major theme. Further research should include mental health keywords as search terms. Third, although LDA topic modeling is a well-recognized method to identify related themes through document-word matrices, the results of the topic modeling were not as neat as expected. No standard quantitative threshold exists for researchers to choose the optimal number of topics. Future studies are encouraged to replicate this study and examine the reliability of such themes.
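
One heuristic that is sometimes used to compare candidate topic solutions is topic coherence. The Python sketch below computes the UMass coherence score for a topic's top words on a toy corpus. The corpus, the word lists, and the function name are hypothetical and were not part of this study's analysis, which does not report using coherence; the sketch only illustrates how such a score could be calculated.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(w_i, w_j) + 1) / D(w_j)) over all pairs with j < i, where
    D(...) counts the documents containing all of the given words.
    Higher (less negative) scores indicate words that co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            joint = doc_freq(top_words[i], top_words[j])
            score += math.log((joint + 1) / denom)
    return score


# Toy tokenized corpus, made up for illustration only.
toy_docs = [
    ["hiv", "testing", "campaign"],
    ["hiv", "testing", "prevention"],
    ["flu", "surveillance", "twitter"],
    ["flu", "twitter", "outbreak"],
]

coherent_topic = ["hiv", "testing"]  # words that appear together
mixed_topic = ["hiv", "twitter"]     # words that never appear together

print(umass_coherence(coherent_topic, toy_docs))
print(umass_coherence(mixed_topic, toy_docs))
```

Averaging such scores over all topics for each candidate number of topics gives one possible basis for comparison, although coherence is a heuristic rather than a standard threshold and would still need to be weighed against human interpretability of the themes.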


This study examined research themes, roles of social media, and research methods in social media–based public health research published from 2000 to 2018. This research identifies 25 research themes covering different diseases, various population groups, physical and mental health topics, and other significant issues. Social media assumes two major roles in public health research: one is to produce substantial research interest for public health research and the other is to furnish a research context for public health research. Social media enables scholars to study new phenomena and propose new research questions in public health research. Meanwhile, the methodological potential of social media in public health research needs further exploration.


This study was supported by the National Social Science Fund of China (19ZDA324 and 18CXW017), the Junior Faculty Supporting Fund at Shenzhen University (QNFC1903), and a research grant awarded to TQP at Michigan State University (RC109127).

Conflicts of Interest

None declared.

Multimedia Appendix 1

Social media roles and definitions.

DOCX File, 20 KB

Multimedia Appendix 2

Network graph of topic-word probability of the 25 research themes.

PNG File, 648 KB

Multimedia Appendix 3

Representative articles of the 25 research themes.

XLSX File (Microsoft Excel File), 36 KB

  1. Tran BX, Mai HT, Nguyen LH, Nguyen CT, Latkin CA, Zhang MWB, et al. Vietnamese validation of the short version of Internet Addiction Test. Addict Behav Rep 2017 Dec;6:45-50.
  2. Bernard S, Cooke T, Cole T, Hachani L, Bernard J. Quality and readability of online information about type 2 diabetes and nutrition. JAAPA 2018 Nov;31(11):41-44.
  3. Crawford R, Rutz D, Evans D. Between Combat boots and Birkenstocks: Lessons from HIV/AIDS, SARS, H1N1 and Ebola. Public Health 2016 Dec;141:186-191.
  4. Fitzpatrick T, Zhou K, Cheng Y, Chan P, Cui F, Tang W, et al. A crowdsourced intervention to promote hepatitis B and C testing among men who have sex with men in China: study protocol for a nationwide online randomized controlled trial. BMC Infect Dis 2018 Sep 29;18(1):489.
  5. Roberts CA, Geryk LL, Sage AJ, Sleath BL, Tate DF, Carpenter DM. Adolescent, caregiver, and friend preferences for integrating social support and communication features into an asthma self-management app. J Asthma 2016 Nov;53(9):948-954.
  6. Kampmeijer R, Pavlova M, Tambor M, Golinowska S, Groot W. The use of e-health and m-health tools in health promotion and primary prevention among older adults: a systematic literature review. BMC Health Serv Res 2016 Sep 05;16 Suppl 5:290.
  7. Almutairi N, Alhabash S, Hellmueller L, Willis E. The effects of Twitter users' gender and weight on viral behavioral intentions toward obesity-related news. J Health Commun 2018;23(3):233-243.
  8. Bogaerts A, Ameye L, Bijlholt M, Amuli K, Heynickx D, Devlieger R. INTER-ACT: prevention of pregnancy complications through an e-health driven interpregnancy lifestyle intervention: study protocol of a multicentre randomised controlled trial. BMC Pregnancy Childbirth 2017 May 26;17(1):154.
  9. Lennox J, Emslie C, Sweeting H, Lyons A. The role of alcohol in constructing gender and class identities among young women in the age of social media. Int J Drug Pol 2018 Aug;58:13-21.
  10. Zhang MWB, Ho RCM, Hawa R, Sockalingam S. Analysis of the information quality of bariatric surgery smartphone applications using the Silberg scale. Obes Surg 2016 Jan;26(1):163-168.
  11. Kazemi DM, Borsari B, Levine MJ, Li S, Lamberson KA, Matta LA. A systematic review of the mHealth interventions to prevent alcohol and substance abuse. J Health Commun 2017 May;22(5):413-432.
  12. Vorderstrasse A, Lewinski A, Melkus GD, Johnson C. Social support for diabetes self-management via eHealth interventions. Curr Diab Rep 2016 Dec;16(7):56.
  13. Menefee HK, Thompson MJ, Guterbock TM, Williams IC, Valdez RS. Mechanisms of communicating health information through Facebook: implications for consumer health information technology design. J Med Internet Res 2016 Aug 11;18(8):e218.
  14. Dehlin JM, Stillwagon R, Pickett J, Keene L, Schneider JA. #PrEP4Love: an evaluation of a sex-positive HIV prevention campaign. JMIR Public Health Surveill 2019 Jun 17;5(2):e12822.
  15. Kass-Hout TA, Alhinnawi H. Social media in public health. Br Med Bull 2013;108:5-24.
  16. Thackeray R, Neiger BL, Smith AK, Van Wagenen SB. Adoption and use of social media among public health departments. BMC Public Health 2012;12:242.
  17. Zhang MW, Ho CS, Fang P, Lu Y, Ho RC. Usage of social media and smartphone application in assessment of physical and psychological well-being of individuals in times of a major air pollution crisis. JMIR Mhealth Uhealth 2014 Mar 25;2(1):e16.
  18. Giustini D, Ali SM, Fraser M, Kamel Boulos MN. Effective uses of social media in public health and medicine: a systematic review of systematic reviews. Online J Public Health Inform 2018;10(2):e215.
  19. Taggart T, Grewe ME, Conserve DF, Gliwa C, Roman IM. Social media and HIV: a systematic review of uses of social media in HIV communication. J Med Internet Res 2015 Nov 02;17(11):e248.
  20. Whitehead L, Seaton P. The effectiveness of self-management mobile phone and tablet apps in long-term condition management: a systematic review. J Med Internet Res 2016;18(5):e97.
  21. Williams G, Hamm MP, Shulhan J, Vandermeer B, Hartling L. Social media interventions for diet and exercise behaviours: a systematic review and meta-analysis of randomised controlled trials. BMJ Open 2014;4(2):e003926.
  22. Swanton R, Allom V, Mullan B. A meta-analysis of the effect of new-media interventions on sexual-health behaviours. Sex Transm Infect 2015 Feb;91(1):14-20.
  23. Odone A, Ferrari A, Spagnoli F, Visciarelli S, Shefer A, Pasquarella C, et al. Effectiveness of interventions that apply new media to improve vaccine uptake and vaccine coverage. Hum Vaccin Immunother 2015;11(1):72-82.
  24. Mita G, Mhurchu CN, Jull A. Effectiveness of social media in reducing risk factors for noncommunicable diseases: a systematic review and meta-analysis of randomized controlled trials. Nutr Rev 2016 Apr;74(4):237-247.
  25. Cartledge P, Miller M, Phillips B. The use of social-networking sites in medical education. Med Teach 2013 Oct;35(10):847-857.
  26. Willis EA, Szabo-Reed AN, Ptomey LT, Steger FL, Honas JJ, Washburn RA, et al. Do weight management interventions delivered by online social networks effectively improve body weight, body composition, and chronic disease risk factors? A systematic review. J Telemed Telecare 2017 Feb;23(2):263-272.
  27. Tran BX, Nghiem S, Sahin O, Vu TM, Ha GH, Vu GT, et al. Modeling research topics for artificial intelligence applications in medicine: latent Dirichlet allocation application study. J Med Internet Res 2019 Nov 01;21(11):e15511.
  28. WHO publishes list of top emerging diseases likely to cause major epidemics. World Health Organization. 2017. [accessed 2020-08-03]
  29. Peng T, Zhang L, Zhong Z, Zhu JJ. Mapping the landscape of internet studies: text mining of social science journal articles 2000–2009. New Media Soc 2012 Nov 26;15(5):644-664.
  30. Blei D, Ng A, Jordan M. Latent Dirichlet allocation. J Mach Learn Res 2003:1.
  31. Agarwal R, Gao G, DesRoches C, Jha AK. The digital transformation of healthcare: current status and the road ahead. Inf Sys Res 2010 Dec;21(4):796-809.
  32. Paton C, Bamidis PD, Eysenbach G, Hansen M, Cabrer M. Experience in the use of social media in medical and health education. Contribution of the IMIA Social Media Working Group. Yearb Med Inform 2011;6:21-29.
  33. Cao B, Liu C, Durvasula M, Tang W, Pan S, Saffer AJ, et al. Social media engagement and HIV testing among men who have sex with men in China: a nationwide cross-sectional survey. J Med Internet Res 2017 Jul 19;19(7):e251.
  34. Jashinsky J, Burton SH, Hanson CL, West J, Giraud-Carrier C, Barnes MD, et al. Tracking suicide risk factors through Twitter in the US. Crisis 2014;35(1):51-59.
  35. Chen X, Hao T. Quantifying and visualizing the research status of social media and health research field. Soc Web Heal Res 2019:31-51.
  36. Ramo DE, Prochaska JJ. Broad reach and targeted recruitment using Facebook for an online survey of young adult substance use. J Med Internet Res 2012;14(1):e28.
  37. Tucker J, Cao B, Li H, Tang S, Tang W, Wong N, et al. Social media interventions to promote HIV testing. Clin Infect Dis 2016 Jul 15;63(2):282-283.
  38. Chang T, Chopra V, Zhang C, Woolford S. The role of social media in online weight management: systematic review. J Med Internet Res 2013 Nov 28;15(11):e262.
  39. Patel R, Chang T, Greysen SR, Chopra V. Social media use in chronic disease: a systematic review and novel taxonomy. Am J Med 2015 Dec;128(12):1335-1350.
  40. Tang W, Wei C, Cao B, Wu D, Li K, Lu H, et al. Crowdsourcing to expand HIV testing among men who have sex with men in China: a closed cohort stepped wedge cluster randomized controlled trial. PLoS Med 2018 Aug;15(8):e1002645.
  41. Moorhead S, Hazlett D, Harrison L, Carroll J, Irwin A, Hoving C. A new dimension of health care: systematic review of the uses, benefits, and limitations of social media for health communication. J Med Internet Res 2013 Apr 23;15(4):e85.
  42. Dowshen N, Lee S, Matty LB, Castillo M, Mollen C. IknowUshould2: feasibility of a youth-driven social media campaign to promote STI and HIV testing among adolescents in Philadelphia. AIDS Behav 2015 Jun;19 Suppl 2:106-111.
  43. Vijaykumar S, Nowak G, Himelboim I, Jin Y. Virtual Zika transmission after the first U.S. case: who said what and how it spread on Twitter. Am J Infect Control 2018 Jan 04:1.
  44. Charles-Smith LE, Reynolds TL, Cameron MA, Conway M, Lau EHY, Olsen JM, et al. Using social media for actionable disease surveillance and outbreak management: a systematic literature review. PLoS One 2015;10(10):e0139701.
  45. Young SD, Zhang Q. Using search engine big data for predicting new HIV diagnoses. PLoS One 2018;13(7):e0199527.
  46. Lee K, Noh M, Koo D. Lonely people are no longer lonely on social networking sites: the mediating role of self-disclosure and social support. Cyberpsychol Behav Soc Netw 2013 Jun;16(6):413-418.
  47. Guntuku SC, Yaden DB, Kern ML, Ungar LH, Eichstaedt JC. Detecting depression and mental illness on social media: an integrative review. Curr Opin Behav Sci 2017 Dec;18:43-49.
  48. Peng T, Liang H, Zhu JJ. Introducing computational social science for Asia-Pacific communication research. Asian J Comm 2019 Apr 16;29(3):205-216.
  49. Walters K, Christakis DA, Wright DR. Are Mechanical Turk worker samples representative of health status and health behaviors in the U.S.? PLoS One 2018;13(6):e0198835.
  50. van Atteveldt W, Peng T. When communication meets computation: opportunities, challenges, and pitfalls in computational communication science. Commun Methods Meas 2018;12(2-3):81-92.
  51. Golder SA, Macy MW. Digital footprints: opportunities and challenges for online social research. Annu Rev Sociol 2014 Jul 30;40(1):129-152.
  52. Dreisbach C, Koleck TA, Bourne PE, Bakken S. A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data. Int J Med Inform 2019 May;125:37-46.
  53. Sadah SA, Shahbazi M, Wiley MT, Hristidis V. Demographic-based content analysis of web-based health-related social media. J Med Internet Res 2016 Jun 13;18(6):e148.
  54. Amankwah-Amoah J. Emerging economies, emerging challenges: mobilising and capturing value from big data. Technol Forecast Soc Change 2016 Sep;110:167-174.
  55. Huang D, Wang J. Monitoring hand, foot and mouth disease by combining search engine query data and meteorological factors. Sci Total Environ 2018 Jan 15;612:1293-1299.
  56. Ling R, Lee J. Disease monitoring and health campaign evaluation using Google search activities for HIV and AIDS, stroke, colorectal cancer, and marijuana use in Canada: a retrospective observational study. JMIR Public Health Surveill 2016 Oct 12;2(2):e156.
  57. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature 2009 Feb 19;457(7232):1012-1014.
  58. Lazer D, Kennedy R, King G, Vespignani A. Google flu trends still appears sick: an evaluation of the 2013-2014 flu season. SSRN Journal 2014:1.
  59. Anastasi A. Evolving concepts of test validation. Ann Rev Psychol 1986 Jan;37(1):1-16.
  60. Driouchi AA. ICTs for Health, Education and Socioeconomics: Regional Cases. Hershey: IGI Global; 2013.
  61. Lorig K, Ritter PL, Villa FJ, Armas J. Community-based peer-led diabetes self-management: a randomized trial. Diabetes Educ 2009;35(4):641-651.
  62. Wenxiu P. Analysis of new media communication based on Lasswell’s 5W model. J Educ Soc Res 2015 Sep 1:1.
  63. Kim J, Park S, Yoo S, Shen H. Mapping health communication scholarship: breadth, depth, and agenda of published research in Health Communication. Health Commun 2010 Sep;25(6-7):487-503.
  64. Carroll JK, Tobin JN, Luque A, Farah S, Sanders M, Cassells A, et al. "Get Ready and Empowered About Treatment" (GREAT) study: a pragmatic randomized controlled trial of activation in persons living with HIV. J Gen Intern Med 2019 Sep;34(9):1782-1789.
  65. Narang B, Park S, Norrmén-Smith IO, Lange M, Ocampo AJ, Gany FM, et al. The use of a mobile application to increase access to interpreters for cancer patients with limited English proficiency: a pilot study. Med Care 2019 Jun;57 Suppl 6 Suppl 2:S184-S189.

ICT: information and communication technology
LDA: latent Dirichlet allocation
MERS: Middle East respiratory syndrome

Edited by G Eysenbach; submitted 22.12.19; peer-reviewed by S Ali, R Ho, A Lee, T Steeb; comments to author 23.01.20; revised version received 12.05.20; accepted 25.07.20; published 13.08.20


©Yan Zhang, Bolin Cao, Yifan Wang, Tai-Quan Peng, Xiaohua Wang. Originally published in the Journal of Medical Internet Research, 13.08.2020.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.