Published on in Vol 26 (2024)

Reporting of Ethical Considerations in Qualitative Research Utilizing Social Media Data on Public Health Care: Scoping Review



1Nanfang Hospital, Southern Medical University, Guangzhou, China

2School of Nursing, Southern Medical University, Guangzhou, China

*these authors contributed equally

Corresponding Author:

Yanni Wu, PhD

Nanfang Hospital

Southern Medical University

No 1838 Guangzhou Avenue North

Baiyun District, Guangdong Province

Guangzhou, 510515


Phone: 86 02061641192


Background: The internet community has become a significant source for researchers to conduct qualitative studies analyzing users’ views, attitudes, and experiences about public health. However, few studies have assessed the ethical issues in qualitative research using social media data.

Objective: This study aims to review the reportage of ethical considerations in qualitative research utilizing social media data on public health care.

Methods: We performed a scoping review of studies mining text from internet communities and published in peer-reviewed journals from 2010 to May 31, 2023. These studies, limited to the English language, were retrieved to evaluate the rates of reporting ethical approval, informed consent, and privacy issues. We searched 5 databases, that is, PubMed, Web of Science, CINAHL, Cochrane, and Embase. Gray literature was supplemented from Google Scholar and OpenGrey websites. Studies using qualitative methods mining text from the internet community focusing on health care topics were deemed eligible. Data extraction was performed using a standardized data extraction spreadsheet. Findings were reported using PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines.

Results: After 4674 titles, abstracts, and full texts were screened, 108 studies on mining text from the internet community were included. About half of the studies were published in the United States, and the number of studies increased from 2019 to 2022. Only 59.3% (64/108) of the studies sought ethical approval, 45.3% (49/108) mentioned informed consent, and only 12.9% (14/108) of the studies explicitly obtained informed consent. Approximately 86% (12/14) of the studies that reported informed consent obtained digital informed consent from participants/administrators, while 14% (2/14) did not describe the method used to obtain informed consent. Notably, 70.3% (76/108) of the studies contained users’ written content or posts: 68% (52/76) contained verbatim quotes, while 32% (24/76) paraphrased the quotes to prevent traceability. However, 17% (4/24) of the studies that paraphrased the quotes did not report the paraphrasing methods. Moreover, 18.5% (20/108) of the studies used aggregated data analysis to protect users’ privacy. Furthermore, the rates of reporting ethical approval differed between countries (P=.02) and between papers that contained users’ written content (both direct and paraphrased quotes) and papers that did not (P<.001).

Conclusions: Our scoping review demonstrates that the reporting of ethical considerations is widely neglected in qualitative research studies using social media data; such studies should be more cautious in citing user quotes to maintain user privacy. Further, our review reveals the need for detailed information on the precautions of obtaining informed consent and paraphrasing to reduce the potential bias. A national consensus of ethical considerations such as ethical approval, informed consent, and privacy issues is needed for qualitative research of health care using social media data of internet communities.

J Med Internet Res 2024;26:e51496



Social media are web-based computer-mediated tools to collaborate, share, or exchange information, ideas, pictures, or videos in virtual communities and networks such as message boards, communities, chat rooms, forums, Twitter, and Facebook [1]. Moreover, patients and researchers can use internet communities to provide health care and disseminate health information [2,3]. Health care refers to the efforts made to improve or maintain physical, mental, or emotional well-being, including the prevention, diagnosis, and treatment of, and recovery from, illness and other physical and mental impairments [4]. Currently, with 57% of the global population having access to social media, more than 40% of patients and caregivers worldwide use internet communities for health care information needs [5]. With diverse populations accessing internet communities and sharing information about health care topics, researchers now have the opportunity to collect and analyze text about health care from a range of participants that was previously unavailable [6]. Usually, quantitative data are derived from information extraction and analyzed statistically, and the summary results presented cannot be directly linked to individual participants. In contrast, qualitative research in internet communities analyzes posts and comments thematically, which involves a more detailed and in-depth analysis and understanding of the full written content [7]. However, a controversial ethical problem has been raised: qualitative research containing internet users’ verbatim quotes could allow the original post to be traced, thereby threatening an individual’s privacy [8].
Additionally, a previous study investigated public and patients’ views regarding ethics in research using social media data and reported that internet users were aggrieved if they found any of their quotes cited in a medical research paper without their informed consent [9]. Further, besides the privacy breach caused by posts being traced, special or vulnerable groups face greater harm if the importance of technical standards for text mining and privacy protection in health care is not highlighted. For instance, unusual postings, abnormal pictures, and interactions expressed on social media by individuals with mental disorders can be detected by researchers using text mining tools without obtaining their consent [10]. The publication of research on mental disorders, including quotes from posts, carries a high risk of informational harm, which can lead to personal information being revealed and further stigmatization of the condition or disease [11]. Since 2001, researchers have debated ethical approval, informed consent, and how to ensure anonymity and preserve data privacy and confidentiality in qualitative research in the internet community [12-14].

With the rapid development of social media and internet research, some ethical guidelines or standards have been published to ensure that research based on internet communities is conducted ethically. The Association of Internet Researchers (internet research ethical guidelines 2.0 and 3.0) noted that researchers working without the direct approval of ethics review boards face additional challenges and that obtaining informed consent is impracticable in several big data projects. However, given the ethical issues of privacy breaches and the risk of discrimination, the Association of Internet Researchers recommended reserving the acquisition of informed consent to the dissemination stage by asking for informed consent from specific participants before publication of their quotes [15,16]. Furthermore, researchers should take responsibility for information confidentiality and anonymity according to the internet research ethics criteria prepared by the National Committee for Research Ethics in the Social Sciences and the Humanities, whose guidelines recommend basic research ethics norms for the analyses, reports, and evaluations that apply to all research [17]. Moreover, the National Committee for Research Ethics in the Social Sciences and the Humanities guidelines contain more details about the requirements for legal consent and the privacy standards imposed by the European Union’s General Data Protection Regulation. The General Data Protection Regulation is a European Union–wide regulation governing the processing of personal data. It defines personal data as any information relating to an identifiable person (data subject), including name, online identification number, location data, and other factors related to personal, physical, physiological, mental, or social identity [18].
The General Data Protection Regulation recommends using anonymous data and deleting identifiable information to ensure the confidentiality of the data. Consent should be obtained from the individual for use in scientific research [18,19]. The British Psychological Society guideline does not explicitly refer to the internet community but suggests that researchers may consider paraphrasing the verbatim quotes to reduce the risk of being traced or identified in qualitative research [20]. When paraphrasing, steps must be put into place to ensure that the original meaning of the message is maintained. Currently, there is no widespread consensus on ethical considerations by social media researchers.

Some researchers have tried to explore the reporting of existing ethical considerations in research papers using social media data. For instance, Sinnenberg et al [6] reported that only 32% and 12% of the papers mentioned acquiring ethical approval and informed consent, respectively, by utilizing multiple analysis methods, including surveillance, intervention, recruitment, engagement, content analysis, and network analysis with Twitter data before 2015. Thereafter, Takats et al [21] conducted an updated examination based on Sinnenberg et al’s [6] study. They found that of 367 studies using different methodological approaches, including sentiment mining, surveillance, and thematic exploration of public health research using Twitter data between 2010 and 2019, 17% of the studies included verbatim tweets and identifiable information about the internet users [21]. Similarly, Lathan et al [22] reviewed papers, including both qualitative and quantitative methods, by using Facebook data to explore public health issues and reported that only 48% and 10% of the papers obtained ethical approval and informed consent, respectively. Furthermore, in a study on research using YouTube data or comments, Tanner et al [23] found that only 26.1% of these studies sought ethical approval, only 1 paper (0.8%) sought informed consent, and 27.7% contained identifiable information. These findings indicate widespread neglect of ethical issues such as ethical approval, informed consent, and privacy issues in research papers using social media data.

Our study focuses on the ethical challenges of qualitative studies utilizing social media data. First, social media can be considered as sources for qualitative data collection because of the low cost, vast amount of available sources about health information, and users’ health behaviors, experiences, and attitudes. Second, qualitative research is context-dependent and mainly contains quotations and written content to support the viewpoint. It is acknowledged that quote materials from social media would potentially be traced back to the original posts and threaten the users’ privacy [24]. This is supported by findings reported by Ayers et al [25] who found that online searches of verbatim Twitter quotes in journal papers described as “content analyses” or “coded Twitter postings” can be traced back to individual internet users 84% of the time. Furthermore, Lathan et al [22] identified that 46% of the studies with verbatim or paraphrased quotes could be traced to the original posts in 10 minutes. Therefore, it is essential to investigate the extent to which ethical oversight is reported in qualitative studies using social media data. Moreover, qualitative research often involves personally sensitive data about health conditions and diseases; hence, anonymity and proper deidentification would be more important for researchers [26,27].

Previous studies have reviewed the ethical challenges and methodological use of social media platforms such as Twitter [6,21], Facebook [22], and YouTube [23] for health care research in both qualitative and quantitative studies. Although there is plenty of qualitative data pouring into social media such as blogs, Twitter, Facebook, and Weibo, evidence is lacking on the investigation of ethical considerations targeting qualitative data in different software and web-based discussion forums to provide a more comprehensive understanding of the ethical issues. To address the ethical considerations in qualitative research of different internet communities and draw the attention of researchers and publishers to ethical issues, we conducted this study to evaluate the ethical practices and ethical considerations of qualitative studies on health care by using data of internet communities. This review aims to (1) assess the rates of reporting institutional review board (IRB) approval and informed consent in studies focused on mining text in the internet community and social media, (2) compare these rates according to the year of publication, country conducting the research, website included in the study’s analysis, and journal’s guidelines about ethical approval for the type of study, and (3) describe whether the studies used anonymized/deidentified data.

Research Design

We conducted a scoping review to investigate how qualitative research mining social media data handles ethical approval, informed consent, and confidentiality issues. We performed this study according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines. The completed PRISMA-ScR checklist is provided in Multimedia Appendix 1.

Search Strategy

All published qualitative studies from 2010 to March 31, 2023, focusing on mining text from online community and social media sources about health care in the following databases were included in this study: PubMed, Web of Science, CINAHL, Cochrane, and Embase. A standardized search string containing Medical Subject Headings (MeSH) and non-MeSH entry terms was used in the search strategy. In addition, the reference lists of the retrieved papers and citation tracking were manually searched as a supplement to database searches to improve comprehensiveness. Gray literature was also identified through internet searches in Google Scholar and OpenGrey websites. The search strategies are represented in Multimedia Appendix 2.

Inclusion and Exclusion Criteria

We divided the criteria into 2 parts. First, at the title and abstract screening stage, eligible studies were (1) studies mining existing text and posts from the internet community and social media data focusing on health care topics, (2) studies using qualitative methods or mixed methods studies with a qualitative component to analyze data, and (3) studies written in English. Ineligible studies were those that investigated the use and dissemination of social media in health care, used social media or an internet community as an intervention tool, or used social media to conduct web-based interviews, surveys, or focus groups. We also excluded reviews, case studies, conference abstracts, commentaries, policies, guidelines, and recommendations. Second, at the full-text screening stage, eligible studies were those focused on mining text about health care topics for which the full text was available. Studies whose full text could not be obtained after contacting the authors and studies not originally published in English were excluded.

Study Selection

All search results were imported into an EndNote library, and duplicates were removed. Two researchers independently reviewed the titles and abstracts against the inclusion and exclusion criteria. Studies irrelevant to the study topic were discarded, and the full texts of the remaining studies were then screened to select eligible papers. Any disagreements were discussed and resolved by consensus or by a third researcher.

Data Extraction

Data were extracted between April 2023 and May 2023. Two researchers independently read the full texts and extracted the results using a standardized data extraction spreadsheet, including research type, first author, study objective, sample size, publication time, country where the research was conducted or country of the first author, website or internet community the studies focused on, type of data collected from social media, language of collected posts or data, privacy level of data (public or private posts), study design, research results, published journal, and information about the ethical considerations. Disagreements were resolved by consensus or by a third researcher. The information about ethical considerations was analyzed to investigate the rates of reporting ethical approval, informed consent, and privacy issues: whether IRB review was reported (IRB approval, IRB exemption, unnecessary, not mentioned) and the reason for not requiring IRB approval; whether informed consent was obtained from participants or the websites’ administrators, consent types (digital informed consent or written informed consent, informed consent not required, consent waived by IRB), and the methods used to obtain consent in each study; and whether quoting a post in papers could lead to the identification of internet users in each study. The description of users’ posts (verbatim quote, paraphrase) was recorded. We also analyzed whether posts were paraphrased in a way that maintained the original meaning, whether actions were taken to deidentify the internet users, and whether the posts contained other identifying information (ie, usernames, photos, links, hashtags) attached to the post.
Because journals publish their own ethical considerations and requirements, we also searched the submission guidelines and editorial policies on each journal’s submission website to check whether the journal provided any ethical guidance for studies using data from internet communities and social media platforms. Additional information was extracted about the details of ethical approval, informed consent, and privacy, for example, whether individuals could withdraw their quotes if they wanted to be excluded from the study at any time without any reprisal and whether the quotations were tested for deidentification via search engines. There was excellent agreement on the primary outcome between the 2 researchers (κ>.95 for all).
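To make the agreement statistic concrete, the following is a minimal Python sketch of Cohen's kappa for two reviewers coding the same studies. It assumes the reported statistic is Cohen's kappa; the function name `cohens_kappa` and the ten example codes are hypothetical and are not the review's data or the authors' procedure.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for 10 studies (illustration only).
reviewer_1 = ["approved"] * 4 + ["exempt"] * 3 + ["not_mentioned"] * 3
reviewer_2 = ["approved"] * 3 + ["exempt"] * 4 + ["not_mentioned"] * 3

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the reviewers disagree on 1 of 10 studies, so raw agreement is 90%, while kappa is lower because part of that agreement would be expected by chance.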

Data Analysis

Data were analyzed using SPSS software (IBM Corp). The chi-square test or Fisher exact test (when a cell size was less than 5) was used to test for differences in the rates of informed consent and ethical approval according to publication year, website, and country. All P values were 2-sided, and P values <.05 indicated significance.
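As an illustration of the chi-square comparison described here, the sketch below recomputes one 2×2 comparison in plain Python, using the counts reported in Table 2 for studies with and without users' written content (60/76 vs 14/32 reporting ethical approval). This is not the authors' SPSS procedure: the function name is hypothetical, and the sketch assumes a Pearson chi-square without continuity correction. For 1 degree of freedom, the P value equals erfc(sqrt(chi-square/2)).

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided P value) with 1 degree of freedom.
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("Every row and column needs at least one observation")
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: written content yes / no; columns: approval reported / not reported.
stat, p = pearson_chi_square_2x2(60, 76 - 60, 14, 32 - 14)
print(f"chi-square(1) = {stat:.1f}, P < .001: {p < 0.001}")
```

Under these assumptions the statistic rounds to 12.9, in line with the value reported in Table 2. When an expected cell count is small, the Fisher exact test mentioned above would be the more appropriate choice.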

Study Selection for the Review

We reviewed 4674 papers after removing the duplicates. After screening the titles, abstracts, and full texts, we included 108 eligible papers (Figure 1). The full list of the included papers and all the extracted information are incorporated in Multimedia Appendix 3 [28-135]. Of the 108 studies reviewed, 73 (67.6%) were qualitative studies and 35 (32.4%) were mixed methods studies. All papers had text mined from internet communities or social media for qualitative analysis. The sample size ranged from 32 to 392,962. Approximately 82.4% (89/108) of the studies were published after 2018, and there was a sharp increase in the number of studies from 2019 to 2022. Moreover, about half of the studies (55/108, 50.9%) were published in the United States. Regarding the websites for mining text, the most widely used social media platform was Twitter (42/108, 38.9%), followed by Facebook (17/108, 15.7%).

Figure 1. PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) flow diagram of the study selection process.

Ethics Approval in These Studies

Our results indicated that of the 108 studies, 78 (72.2%) reported on ethics approval. Of these 78 studies, 31 (40%) explicitly stated that ethics approval was obtained before the study was undertaken, 33 (42%) reported that an exemption was granted by the local IRB, and 14 (18%) explicitly stated that approval by the ethics committee was not required because publicly available data were collected from internet communities and social media platforms. However, 30 (27.8%) of the 108 studies did not mention IRB approval (Table 1).

Table 1. Ethical considerations in the qualitative studies using data of the internet community.
Ethical considerations              Values, n (%)
Institutional review board review sought (N=108)
  Yes (including exemption)         64 (59.3)
  No                                14 (12.9)
  Not mentioned                     30 (27.8)
Informed consent (N=108)
  Yes                               14 (12.9)
  No (not required/exemption)       35 (32.4)
  Not mentioned                     59 (54.7)
Anonymous data (N=108)
  Yes                               104 (96.3)
  No                                4 (3.7)
Studies contain internet users’ written content (n=76)
  Verbatim quote                    52 (68)
  Paraphrased                       24 (32)
Identifiable information attached to the post (links, photos, screenshots) (n=76)
  Yes                               14 (18)
  No                                62 (82)

Based on our exploration of the ethical guidelines of each journal to determine whether there were ethical requirements for studies mining social media data, only 36.1% (39/108) of the studies were published in journals that required ethical considerations for studies gathering data from social media platforms by using internet and digital technologies. Of the 39 studies published in 19 journals, 27 (69%) were published in the Journal of Medical Internet Research and its sister journals. The submission guidelines of the Journal of Medical Internet Research state that authors of manuscripts describing studies of internet, digital tools, and technologies are required to verify that they have adhered to local, national, regional, and international laws and regulations, and are required to verify that they complied with informed consent guidelines. Moreover, 2 journals also provided a specific requirement, that is, when researchers interact with individuals or obtain privacy information gathered from social media platforms, they should obtain ethics approval prior to conducting the study and informed consent from anyone who could potentially be identified. Surprisingly, there were no significant differences in the ethics approval reportage between journals with ethics approval guidelines and those that did not have ethics guidelines for researchers gathering data from social media platforms (P=.08). Notably, the rates of reporting ethics approval were different between different countries (P=.02). However, there were no statistically significant differences between the rates of reporting ethical approval and different websites or publication years (all P>.05) (Table 2).

Table 2. Reporting of ethical considerations in studies published in different publication years, countries, websites, and journals containing ethical requirements for research involving text mining and internet users’ written content.
Items (total number of studies)     Ethical approval reported     Informed consent reported
                                    Values, n (%)                 Values, n (%)
Year
  Chi-square (df); P value          17.2 (13); .11                12.1 (13); .52
  2010 (n=1)                        1 (100)                       0 (0)
  2011 (n=2)                        2 (100)                       1 (100)
  2012 (n=2)                        1 (50)                        1 (50)
  2013 (n=2)                        0 (0)                         0 (0)
  2014 (n=3)                        1 (33)                        2 (67)
  2015 (n=2)                        1 (50)                        1 (50)
  2016 (n=3)                        3 (100)                       2 (67)
  2017 (n=4)                        4 (100)                       3 (75)
  2018 (n=12)                       7 (58)                        3 (25)
  2019 (n=9)                        5 (56)                        4 (44)
  2020 (n=24)                       16 (67)                       9 (38)
  2021 (n=14)                       11 (78)                       5 (36)
  2022 (n=25)                       22 (88)                       16 (64)
  2023 (n=5)                        4 (80)                        2 (40)
Country conducting the research
  Chi-square (df); P value          28.4 (20); .02                17.8 (20); .64
  United States (n=55)              40 (73)                       23 (43)
  Australia (n=12)                  10 (83)                       6 (50)
  United Kingdom (n=8)              8 (100)                       5 (62)
  Canada (n=9)                      7 (78)                        5 (56)
  China (n=3)                       0 (0)                         0 (0)
  Netherlands (n=3)                 2 (67)                        2 (67)
  Turkey (n=2)                      2 (100)                       1 (50)
  United Arab Emirates (n=2)        1 (50)                        1 (50)
  India (n=2)                       0 (0)                         0 (0)
  Sweden (n=1)                      1 (100)                       1 (100)
  Norway (n=1)                      1 (100)                       1 (100)
  Italy (n=1)                       1 (100)                       1 (100)
  Germany (n=1)                     1 (100)                       0 (0)
  France (n=1)                      1 (100)                       0 (0)
  Finland (n=1)                     1 (100)                       1 (100)
  Bangladesh (n=1)                  1 (100)                       1 (100)
  Austria (n=1)                     1 (100)                       1 (100)
  Thailand (n=1)                    0 (0)                         0 (0)
  Saudi Arabia (n=1)                0 (0)                         0 (0)
  Singapore (n=1)                   0 (0)                         0 (0)
  Israel (n=1)                      0 (0)                         0 (0)
Website cited in the research
  Chi-square (df); P value          14.7 (11); .12                18.7 (11); .07
  Twitter (n=42)                    26 (62)                       14 (33)
  Facebook (n=17)                   12 (70)                       10 (59)
  ≥2 websites (n=14)                11 (79)                       6 (43)
  Reddit (n=9)                      8 (89)                        3 (33)
  Specialist forums (n=7)           7 (100)                       5 (57)
  Instagram (n=5)                   4 (80)                        4 (80)
  Blog (n=4)                        4 (100)                       4 (100)
  YouTube (n=4)                     3 (75)                        1 (25)
  Sina Weibo (n=3)                  0 (0)                         0 (0)
  Quora (n=1)                       1 (100)                       1 (100)
  STUMPPI (n=1)                     1 (100)                       1 (100)
  WhatsApp (n=1)                    1 (100)                       1 (100)
Whether journals contained ethical requirements for research involving text mining from internet community and social media platforms
  Chi-square (df); P value          3.5 (1); .08                  2.2 (1); .16
  Yes (n=39)                        24 (62)                       14 (36)
  No (n=69)                         54 (78)                       35 (51)
Whether studies had users’ written content
  Chi-square (df); P value          12.9 (1); <.001               2.2 (1); .15
  Yes (n=76)                        60 (79)                       38 (50)
  No (n=32)                         14 (44)                       21 (67)

Informed Consent

Of the 108 studies, 59 (54.7%) did not include any information about informed consent and 49 (45.3%) mentioned informed consent. Of the 49 studies that mentioned informed consent, 14 (13% of the 108 studies) stated that informed consent was waived by local institutional boards, and 21 (19% of the 108 studies) reported that informed consent was not required because the information was publicly available on websites or the study did not involve human participants. We interpreted both as not seeking informed consent. Only 14 (12.9%) of the 108 studies explicitly indicated that informed consent was obtained (Table 1). Among these 14 studies, 2 (14%) only provided a generic statement that informed consent was obtained but did not report how it was obtained, while 12 (86%) received digital informed consent. Of the 12 studies that reported receiving digital informed consent, 6 sought permission from the communities’ or groups’ administrators and posted a statement of the research objective on the group’s wall, 5 contacted the participants privately via email, comments below the posts, or software to gain consent, and 1 reported that it had sent a digital version of the informed consent form. Furthermore, among the studies that had obtained informed consent, 7 stated that individuals’ posts would be removed if they wanted to be excluded from the study and that they could withdraw from the study whenever they wanted. In addition, the rates of reporting informed consent did not differ significantly across publication years, countries, or websites (all P>.05) (Table 2).

Confidentiality of the Information

All data sources were obtained from anonymous websites or communities, and the majority (104/108, 96.3%) of the data sources did not contain usernames. Notably, only 3.7% (4/108) of the studies contained the participants’ usernames or pseudonyms. One study reported that pseudonyms like Sasha had been used instead of the real name. The other 3 studies contained the expression for usernames but did not state whether pseudonyms were used. Except for 9 studies that used nonnative language quotes and 3 studies that were transcribed into text via video, among the 108 included studies, 76 (70.3%) quoted at least one native language post in their reports. Additionally, 20 studies presenting aggregated analysis or composite accounts did not include any quotation or written content. Of the 76 studies containing internet users’ written content, 52 (68%) contained just verbatim-quoted participants’ posts and 24 (32%) contained paraphrased posts (Table 1). Among the 52 studies containing direct and verbatim quotations, which are likely to be traced to the original posts from users, only 17 (33%) studies took measures to deidentify the users. The 17 studies mentioned that all names or usernames were removed and personal identifying information was removed to maintain privacy, while 42% (22/52) of the studies did not mention any measures that were taken to deidentify the users and maintain confidentiality. Approximately 32% (24/76) of the studies described that they paraphrased posts and removed any explicitly identified personal information to maintain confidentiality to reduce the likelihood of users being identified via search engines. Of the 24 studies, 20 (83%) reported that the quotations were slightly modified or summarized for readability, the symbol information was removed using “…”, and key identifiable information was removed to protect privacy while maintaining the meaning of posts. Four of the 24 (17%) studies did not report the methods and details of paraphrasing. 
Notably, only 3% (2/76) of the studies containing users’ written content reported that researchers intentionally entered each quote into search engines to ensure that no quote led to the original posts. Moreover, of the 76 studies containing written content, 62 (82%) did not contain other types of identifying information attached to the posts, while 14 (18%) included other identifying data (hashtags, emojis, geolocation, photos, links, screenshots) attached to the original posts for analysis (Table 1). Of the 14 studies including other identifying information, 4 (29%) contained photos and screenshots associated with the website pages. Of the 52 studies that disclosed verbatim quotes and other identifiable information, 26 (46%) studies reported informed consent consideration, and only 8 (15%) explicitly obtained informed consent. Additionally, of the 77% (40/52) of the studies that mentioned IRB or ethical review, 38% (15/40) received IRB approval, and 63% (25/40) were granted exemption. The proportion of studies reporting ethical approval was significantly higher among studies containing users’ written content than among studies not containing users’ written content (60/76, 79% vs 14/32, 44%; P<.001) (Table 2).
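The deidentification steps that the included studies described (removing usernames and other identifying elements such as links and hashtags) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `scrub_post`, the placeholder tokens, the regular expressions, and the example post are all hypothetical and are not taken from any included study. Scrubbing of this kind does not by itself prevent a verbatim quote from being found through a search engine, which is why paraphrasing and search-engine checks are discussed separately.

```python
import re

# Hypothetical patterns for common identifying elements in social media posts.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HANDLE_RE = re.compile(r"(?<!\w)@\w+")    # @username mentions, not email addresses
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")   # hashtags


def scrub_post(text):
    """Replace links, user handles, and hashtags with neutral placeholders."""
    text = URL_RE.sub("[link]", text)
    text = HANDLE_RE.sub("[user]", text)
    text = HASHTAG_RE.sub("[hashtag]", text)
    # Collapse any repeated whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


example = "Thanks @sasha_1990 for the tip! See https://example.com/post/1 #diabetes"
print(scrub_post(example))
```

Links are replaced first so that any "#" or "@" inside a URL is not treated as a hashtag or handle.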

Principal Findings and Comparison to Prior Work

In this scoping review, we included 108 studies (Multimedia Appendix 3; [28-135]) that focused on mining text from internet community and social media data for health care research, and we reviewed the reporting of ethical considerations and outcomes in these studies. We found that the rates of reporting IRB approval and informed consent in qualitative research on health care utilizing social media data were 59.3% (64/108) and 12.9% (14/108), respectively. Our findings demonstrate that the key ethical considerations for qualitative research in online communities are insufficiently discussed and described. However, the reporting rates of ethical considerations in the papers in our scoping review were much higher than those reported in systematic reviews including multiple analysis methodologies on only 1 social media platform. For example, ethics approval and informed consent were reported in 48% and 10% of research studies using only Facebook data [22], 32% and 0% from 2006 to 2019 [21], 40% and 0.9% (only 1 paper) from 2015 to 2016 in public health research using only Twitter data [25], and 26.1% and 0.8% (only 1 paper) in research incorporating only YouTube data [23], respectively. In fact, previous studies were limited to only a few selected websites such as Twitter, Facebook, and YouTube. There is a lack of research that incorporates a variety of different social media data for comparisons. Differences in the reporting of ethical considerations may be attributed to the different methodologies adopted by studies. For example, Lathan et al [22] analyzed the ethical considerations in studies including predictive or model development, while our research focuses on the ethical considerations in qualitative studies.

Importantly, our findings indicate a need for a standardized and transparent approach to reporting ethical considerations in qualitative research on data from social media and online communities. Our research demonstrates that the rates of reporting ethics approval differ between countries (P=.02). A wide variety of national research ethics governing bodies and over 1000 laws, regulations, and standards provide oversight for human subjects research in 130 countries. A guideline is therefore needed for best ethical practices in qualitative research involving posts from social media platforms. Surprisingly, the rates of reporting ethical approval did not differ significantly according to whether journals specified ethical requirements for studies involving text mining (P=.08). This inconsistency between publication guidelines and the reporting of ethical approval and consent is similar to previous findings on the ethical standards in COVID-19 human studies [136]. Although there are journal publication guidelines for studies mining social media data, the reports of ethical approval and consent in the papers published in such journals do not exactly follow the guidelines. Consequently, this finding indicates that more ethical awareness is needed among researchers, editors, and reviewers for qualitative studies on data mining.

Besides the different laws and regulations in different countries, the inconsistency in ethics approval in published papers may arise because social media research is a highly interdisciplinary science, and computer science researchers may be less experienced with, or may pay less attention to, the key ethical issues of protecting human subjects [137]. Medical and health science researchers may have considered some ethical concerns about gathering social media data, but they may not be familiar with the relevant guidelines. For example, the Association of Internet Researchers has a detailed ethical guideline targeting social scientists conducting digital research, but it may be less well-known among medical and health care researchers. At the institutional level, Ferretti et al [138] noticed that institutional review committees, especially the individual IRBs of universities and health care systems, lack knowledge about the methodology, text mining technical standards, data security, and ethical harms for studies using big data and social media as sources. Because of this lack of knowledge, institutional ethics committees may have inconsistent ethical criteria and perspectives about web-based projects using social media data [139]. Moreover, some ethics review committees exclude research on internet communities from ethical oversight because their ethics standards are confined only to medical fields. In addition, the continuous development and dynamic change of studies using social media data make ethical review more challenging for ethical approval institutions. It is therefore necessary for ethics committee members to be trained about the ethical issues in studies mining text from social media. Inviting interdisciplinary researchers to join in the approval process would be an appropriate method to increase the awareness of ethical considerations [140,141].

Interestingly, the reporting rate of obtaining informed consent for mining social media data in qualitative studies was unexpectedly low. The most influential ethical reports such as the Nuremberg Code [142], Declaration of Helsinki [143], and the Belmont Report [144] have established the principle of informed consent in research involving humans. Our review shows that only 12.9% (14/108) of the studies explicitly obtained informed consent and 32.4% (35/108) of the studies reported that informed consent was exempted by IRB or was not required, as the information was publicly available on websites or did not involve human participants. Our results are similar to those of Wongkoblap et al [145], who reported that only 16.7% of the studies on data mining of social network data on mental health disorders received informed consent from participants prior to data analysis.

There are multiple reasons for the challenges in obtaining informed consent in an internet setting. First, it is impractical for researchers to gain individual informed consent from a large number of users in an internet community [146]. Second, members of ethics review boards lack consensus about the need for informed consent from an internet community for qualitative research under the current legal definition [147]. Moreover, there has been a debate on the criteria for human subject research when social media data are used. The federal regulation recommends that if data in the studies are obtained from public social media websites, where data are identifiable and do not require interaction with individuals, such studies do not constitute human subject research, while studies involving the identification of private information or interaction with the individual can be considered as human subject research [148]. In contrast, some researchers believe that social media and big data research are not ethically exceptional and should be treated in the same manner and with the same rules as those for traditional forms of research [149]. It therefore remains ambiguous what is appropriate or should be standard practice for obtaining informed consent.

Currently, it is challenging to maintain privacy and prevent the traceability of individuals posting content in the internet community. Our findings indicated that 70.3% (76/108) of the studies contained internet users’ written content, of which 68% (52/76) included verbatim quotations of users’ posts that could lead to identification, and 18% (14/76) of the studies included other identifiable information such as links, screenshots, and emojis linked to original posts; these findings are similar to those of Ayers et al [25] and Lathan et al [22]. Usha Lawrance et al [150] and Wilkinson and Thelwall [151] argued that using direct quotes to support findings would lead to the identification of users and breach users’ confidentiality in internet community data. Moreover, quoting social media posts or disclosing usernames violates the International Committee of Medical Journal Editors’ ethics standards, which state that identifying information such as written descriptions and photos should not be published unless the information is essential for scientific purposes and the participants give written informed consent for publication [152]. Furthermore, our study demonstrates that the proportion of studies reporting ethical approval is higher among studies containing users’ written content (both direct and paraphrased quotations) than among studies that do not include any quotation or written content (60/76, 79% vs 14/32, 44%; P<.001). A tentative explanation is that some researchers realized that ethical reportage should be stricter for qualitative papers with quotations from social media posts due to privacy and security issues. This is supported by Boyd and Crawford [153], who stated that rigorous thinking about the process of mining and anonymizing big data is required for ethics boards to ensure that people are protected.

Our findings show that 32% (24/76) of the studies intentionally paraphrased the quotes so that the original posts could not be located, and 20 studies used aggregated, anonymized data. Moreover, Wilkinson and Thelwall [151], Bond et al [154], and Markham et al [155] recommend that researchers avoid direct quotes and instead work with aggregate data sets and separate texts from their original context, which is more acceptable to participants. In addition, the British Psychological Society guidelines recommend that researchers consider paraphrasing any verbatim quotes to reduce the risk of these being traced to the source [20]. Notably, 13 of the 25 papers did not report the precautions taken when paraphrasing. This may be due to the lack of a detailed methodology and of consensus on how to paraphrase quotes while reducing bias and maintaining the original meaning.

Limitations and Strengths

Our scoping review has several limitations. First, our research was limited to qualitative studies and the qualitative parts of mixed methods studies on text mining from social media, and it is unclear whether ethical considerations are critical in quantitative studies among internet communities. Second, we were restricted to studies published in the English language and those with the full text available; therefore, we could be underestimating the number of relevant papers published in other languages. Third, the rates of reporting ethical approval, informed consent, and privacy in this research relied on self-reported data. Thus, it is possible that although certain studies did not report the process of ethical considerations, such considerations may have been followed during the research. Conversely, some studies may have mentioned ethical considerations but may not have applied them in practice. Hence, bias arising from the lack of accurate documentation must be considered.

Conclusions


Social media text mining can be a useful tool for researchers to understand patient experiences of health conditions and health care. However, as illustrated by the absence of ethical discourse in publications, our analysis indicates significant gaps in the ethical considerations and governance of qualitative research of internet posts. Therefore, a complete and consistent consensus guideline of ethical considerations in qualitative research of internet posts is needed to protect users’ data. With the continued development of text-mining techniques, researchers conducting qualitative studies that mine text from social media should be more cautious when using user quotations in order to maintain user privacy and prevent the traceability of the internet users posting content. We suggest that authors report their results by using aggregated findings or deidentified approaches such as paraphrasing instead of verbatim quotations, which can prevent internet users from being identified through search engines. In addition, authors should provide more detailed information about the precautions taken for obtaining informed consent and paraphrasing to reduce the potential bias. Furthermore, journals and editors should pay more attention to the reporting standards of ethical consideration and privacy issues in qualitative research involving social media data.

Acknowledgments


This project was funded by the National Natural Science Foundation of China (72304131) and the Outstanding Youths Development Scheme of Nanfang Hospital, Southern Medical University (2023J005). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of this manuscript. We sincerely thank the funders of this study.

Data Availability

All data extracted and analyzed during this study are presented in this paper and in the multimedia appendices.

Authors' Contributions

YW was responsible for the protocol of the research and redrafted the paper critically. YZ and JF performed literature searches. YZ, JL, and WC performed study identification and screening. ZG, SD, CZ, and JT extracted and analyzed the data from the included journals. YZ and JL wrote the first draft of the paper. All authors read and approved the final manuscript.

Conflicts of Interest

None declared.

Multimedia Appendix 1

PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist.

DOCX File, 71 KB

Multimedia Appendix 2

Search strategy for each database.

DOCX File, 28 KB

Multimedia Appendix 3

Summary of included literature.

XLSX File (Microsoft Excel File), 65 KB

References

  1. Obar JA, Wildman S. Social media definition and the governance challenge: An introduction to the special issue. Telecommunications Policy. Oct 2015;39(9):745-750. [CrossRef]
  2. Kaplan AM, Haenlein M. Users of the world, unite! The challenges and opportunities of social media. Business Horizons. Jan 2010;53(1):59-68. [CrossRef]
  3. VanDam C, Kanthawala S, Pratt W, Chai J, Huh J. Detecting clinically related content in online patient posts. J Biomed Inform. Nov 2017;75:96-106. [FREE Full text] [CrossRef] [Medline]
  4. Health care. Wikipedia. URL: [accessed 2023-06-01]
  5. Moorhead SA, Hazlett DE, Harrison L, Carroll JK, Irwin A, Hoving C. A new dimension of health care: systematic review of the uses, benefits, and limitations of social media for health communication. J Med Internet Res. Apr 23, 2013;15(4):e85. [FREE Full text] [CrossRef] [Medline]
  6. Sinnenberg L, Buttenheim AM, Padrez K, Mancheno C, Ungar L, Merchant RM. Twitter as a tool for health research: A systematic review. Am J Public Health. Jan 2017;107(1):e1-e8. [CrossRef] [Medline]
  7. Braun V, Clarke V. Using thematic analysis in psychology. Qualitative Research in Psychology. Jan 2006;3(2):77-101. [CrossRef]
  8. Hswen Y, Naslund JA, Brownstein JS, Hawkins JB. Monitoring online discussions about suicide among Twitter users with schizophrenia: exploratory study. JMIR Ment Health. Dec 13, 2018;5(4):e11483. [FREE Full text] [CrossRef] [Medline]
  9. Golder S, Ahmed S, Norman G, Booth A. Attitudes toward the ethics of research using social media: a systematic review. J Med Internet Res. Jun 06, 2017;19(6):e195. [FREE Full text] [CrossRef] [Medline]
  10. Athanasopoulou C, Sakellari E. Facebook and health information: content analysis of groups related to schizophrenia. Stud Health Technol Inform. 2015;213:255-258. [Medline]
  11. Conway M, O'Connor D. Social media, big data, and mental health: current advances and ethical implications. Curr Opin Psychol. Jun 2016;9:77-82. [FREE Full text] [CrossRef] [Medline]
  12. Eysenbach G, Till JE. Ethical issues in qualitative research on internet communities. BMJ. Nov 10, 2001;323(7321):1103-1105. [FREE Full text] [CrossRef] [Medline]
  13. Heilferty CM. Ethical considerations in the study of online illness narratives: a qualitative review. J Adv Nurs. May 2011;67(5):945-953. [CrossRef] [Medline]
  14. Roberts LD. Ethical issues in conducting qualitative research in online communities. Qualitative Research in Psychology. Apr 20, 2015;12(3):314-325. [CrossRef]
  15. Franzke A, Bechmann A, Zimmer M, Ess C, the Association of Internet Researchers. Internet research: ethical guidelines 3.0. Association of Internet Researchers. 2020. URL: [accessed 2023-05-30]
  16. Markham A, Buchanan E. Ethical decision-making and internet research: recommendations from the AoIR ethics working committee (version 2.0). AoIR. 2012. URL: [accessed 2023-06-05]
  17. The National Committee for Research Ethics in the Social Sciences and the Humanities (NESH). Ethical guidelines for internet research. 2022. URL: https://www.en/guidelines/social-sciences-humanities-law-and-theology/a-guide-to-internet-research-ethics [accessed 2023-06-13]
  18. Chapter 3: Rights of the data subject. General Data Protection Regulation (GDPR). European Commission. Official Journal of the European Union; 2018. URL: [accessed 2023-11-29]
  19. Hand DJ. Aspects of data ethics in a changing world: where are we now? Big Data. Sep 01, 2018;6(3):176-190. [FREE Full text] [CrossRef] [Medline]
  20. British Psychological Society. Ethics guidelines for internet-mediated research. 2021. URL: https:/​/explore.​​binary/​bpsworks/​64374754e0c1dd30/​5a757d7c2d39a9837f0eedec1b9ba28fa5e9e38f0bffa9950b678ab727803959/​rep155_2021.​pdf [accessed 2023-05-30]
  21. Takats C, Kwan A, Wormer R, Goldman D, Jones HE, Romero D. Ethical and methodological considerations of Twitter data for public health research: systematic review. J Med Internet Res. Nov 29, 2022;24(11):e40380. [FREE Full text] [CrossRef] [Medline]
  22. Lathan HS, Kwan A, Takats C, Tanner JP, Wormer R, Romero D, et al. Ethical considerations and methodological uses of Facebook data in public health research: A systematic review. Soc Sci Med. Apr 2023;322:115807. [FREE Full text] [CrossRef] [Medline]
  23. Tanner J, Takats C, Lathan H, Kwan A, Wormer R, Romero D, et al. Approaches to research ethics in health research on YouTube: systematic review. J Med Internet Res. Oct 04, 2023;25:e43060. [FREE Full text] [CrossRef] [Medline]
  24. Stockdale J, Cassell J, Ford E. "Giving something back": A systematic review and ethical enquiry into public views on the use of patient data for research in the United Kingdom and the Republic of Ireland. Wellcome Open Res. 2018;3:6. [FREE Full text] [CrossRef] [Medline]
  25. Ayers JW, Caputi TL, Nebeker C, Dredze M. Don't quote me: reverse identification of research participants in social media studies. NPJ Digit Med. 2018;1:30. [FREE Full text] [CrossRef] [Medline]
  26. Aitken M, de St Jorre J, Pagliari C, Jepson R, Cunningham-Burley S. Public responses to the sharing and linkage of health data for research purposes: a systematic review and thematic synthesis of qualitative studies. BMC Med Ethics. Nov 10, 2016;17(1):73. [FREE Full text] [CrossRef] [Medline]
  27. Hunter RF, Gough A, O’Kane N, McKeown G, Fitzpatrick A, Walker T, et al. Ethical issues in social media research for public health. Am J Public Health. Mar 2018;108(3):343-348. [CrossRef]
  28. Ahmed OH, Sullivan SJ, Schneiders AG, McCrory P. iSupport: do social networking sites have a role to play in concussion awareness? Disabil Rehabil. 2010;32(22):1877-1883. [CrossRef] [Medline]
  29. Bender JL, Jimenez-Marroquin M, Jadad AR. Seeking support on Facebook: a content analysis of breast cancer groups. J Med Internet Res. Feb 04, 2011;13(1):e16. [FREE Full text] [CrossRef] [Medline]
  30. Gajaria A, Yeung E, Goodale T, Charach A. Beliefs about attention-deficit/hyperactivity disorder and response to stereotypes: youth postings in Facebook groups. J Adolesc Health. Jul 2011;49(1):15-20. [CrossRef] [Medline]
  31. Degroot JM. Maintaining relational continuity with the deceased on Facebook. Omega (Westport). Nov 01, 2012;65(3):195-212. [CrossRef]
  32. Donelle L, Booth RG. Health tweets: an exploration of health promotion on twitter. Online J Issues Nurs. Sep 30, 2012;17(3):4. [FREE Full text] [Medline]
  33. Rui JR, Chen Y, Damiano A. Health organizations providing and seeking social support: a Twitter-based content analysis. Cyberpsychol Behav Soc Netw. Sep 2013;16(9):669-673. [CrossRef] [Medline]
  34. Lyles CR, López A, Pasick R, Sarkar U. "5 mins of uncomfyness is better than dealing with cancer 4 a lifetime": an exploratory qualitative analysis of cervical and breast cancer screening dialogue on Twitter. J Cancer Educ. Mar 2013;28(1):127-133. [CrossRef] [Medline]
  35. Lee JL, DeCamp M, Dredze M, Chisolm MS, Berger ZD. What are health-related users tweeting? A qualitative content analysis of health-related users and their messages on twitter. J Med Internet Res. Oct 15, 2014;16(10):e237. [FREE Full text] [CrossRef] [Medline]
  36. Ahlwardt K, Heaivilin N, Gibbs J, Page J, Gerbert B, Tsoh JY. Tweeting about pain: comparing self-reported toothache experiences with those of backaches, earaches and headaches. J Am Dent Assoc. Jul 2014;145(7):737-743. [FREE Full text] [CrossRef] [Medline]
  37. Struik LL, Baskerville NB. The role of Facebook in Crush the Crave, a mobile- and social media-based smoking cessation intervention: qualitative framework analysis of posts. J Med Internet Res. Jul 11, 2014;16(7):e170. [FREE Full text] [CrossRef] [Medline]
  38. Abdel-Razig S, Anglade P, Ibrahim H. Impact of the COVID-19 pandemic on a physician group's WhatsApp chat: qualitative content analysis. JMIR Form Res. Dec 07, 2021;5(12):e31791. [FREE Full text] [CrossRef] [Medline]
  39. Reddy A. Skincare in social media: analyzing prominent themes in online dermatologic discussions. Cureus. May 7, 2021:e14890. [CrossRef] [Medline]
  40. van der Pijl MSG, Hollander MH, van der Linden T, Verweij R, Holten L, Kingma E, et al. Left powerless: A qualitative social media content analysis of the Dutch #breakthesilence campaign on negative and traumatic experiences of labour and birth. PLoS One. 2020;15(5):e0233114. [FREE Full text] [CrossRef] [Medline]
  41. Rath L, Vijiaratnam N, Skibina O. Alemtuzumab in multiple sclerosis: lessons from social media in enhancing patient care. Int J MS Care. 2017;19(6):323-328. [CrossRef] [Medline]
  42. Bloom R, Beck S, Chou W, Reblin M, Ellington L. In their own words: experiences of caregivers of adults with cancer as expressed on social media. ONF. Sep 1, 2019;46(5):617-630. [CrossRef]
  43. Grumme V, Gordon S. Social media use by transplant recipients for support and healing. Comput Inform Nurs. Dec 2016;34(12):570-577. [CrossRef] [Medline]
  44. Haug NA, Bielenberg J, Linder SH, Lembke A. Assessment of provider attitudes toward #naloxone on Twitter. Subst Abus. 2016;37(1):35-41. [CrossRef] [Medline]
  45. Oren E, Martinez L, Hensley RE, Jain P, Ahmed T, Purnajo I, et al. Twitter communication during an outbreak of hepatitis A in San Diego, 2016–2018. Am J Public Health. Oct 2020;110(S3):S348-S355. [CrossRef]
  46. Parker C, Zomer E, Liew D, Ayton D. Characterising experiences with acute myeloid leukaemia using an Instagram content analysis. PLoS One. 2021;16(5):e0250641. [FREE Full text] [CrossRef] [Medline]
  47. Kurko T, Linden K, Kolstela M, Pietilä K, Airaksinen M. Is nicotine replacement therapy overvalued in smoking cessation? Analysis of smokers' and quitters' communication in social media. Health Expect. Dec 2015;18(6):2962-2977. [FREE Full text] [CrossRef] [Medline]
  48. Reuter K, Lee D. Perspectives toward seeking treatment among patients with psoriasis: protocol for a Twitter content analysis. JMIR Res Protoc. Feb 18, 2021;10(2):e13731. [FREE Full text] [CrossRef] [Medline]
  49. Mollema L, Harmsen IA, Broekhuizen E, Clijnk R, De Melker H, Paulussen T, et al. Disease detection or public opinion reflection? Content analysis of tweets, other social media, and online newspapers during the measles outbreak in The Netherlands in 2013. J Med Internet Res. May 26, 2015;17(5):e128. [FREE Full text] [CrossRef] [Medline]
  50. O'Hagan ET, Traeger AC, Bunzli S, Leake HB, Schabrun SM, Wand BM, et al. What do people post on social media relative to low back pain? A content analysis of Australian data. Musculoskelet Sci Pract. Aug 2021;54:102402. [CrossRef] [Medline]
  51. Zhou F, Zhang W, Cai H, Cao Y. Portrayals of 2v, 4v and 9vHPV vaccines on Chinese social media: a content analysis of hot posts on Sina Weibo. Hum Vaccin Immunother. Nov 02, 2021;17(11):4433-4441. [FREE Full text] [CrossRef] [Medline]
  52. Golder S, Bach M, O'Connor K, Gross R, Hennessy S, Gonzalez Hernandez G. Public perspectives on anti-diabetic drugs: exploratory analysis of Twitter posts. JMIR Diabetes. Jan 26, 2021;6(1):e24681. [FREE Full text] [CrossRef] [Medline]
  53. Mercier R, Senter K, Webster R, Henderson RA. Instagram users’ experiences of miscarriage. Obstet Gynecol. Jan 2020;135(1):166-173. [CrossRef]
  54. Alghamdi A, Abumelha K, Allarakia J, Al-Shehri A. Conversations and misconceptions about chemotherapy in Arabic tweets: content analysis. J Med Internet Res. Jul 29, 2020;22(7):e13979. [FREE Full text] [CrossRef] [Medline]
  55. Golder S, O'Connor K, Hennessy S, Gross R, Gonzalez-Hernandez G. Assessment of beliefs and attitudes about statins posted on Twitter: a qualitative study. JAMA Netw Open. Jun 01, 2020;3(6):e208953. [FREE Full text] [CrossRef] [Medline]
  56. Stekelenburg N, Horsham C, O'Hara M, Janda M. Using social media to determine the affective and cognitive components of tweets about sunburn. Dermatology. Feb 27, 2020;236(2):75-80. [FREE Full text] [CrossRef] [Medline]
  57. Meeking K. Patients' experiences of radiotherapy: Insights from Twitter. Radiography (Lond). Aug 2020;26(3):e146-e151. [CrossRef] [Medline]
  58. Shah S, Bradbury-Jones C, Taylor J. Using Facebook to tell stories of premature ageing and sexual and reproductive healthcare across the life course for women with cerebral palsy in the UK and USA. BMJ Open. Feb 17, 2020;10(2):e032172. [FREE Full text] [CrossRef] [Medline]
  59. Pretorius K, Choi E, Kang S, Mackert M. Sudden infant death syndrome on Facebook: qualitative descriptive content analysis to guide prevention efforts. J Med Internet Res. Jul 30, 2020;22(7):e18474. [FREE Full text] [CrossRef] [Medline]
  60. Karmegam D, Mapillairaju B. What people share about the COVID-19 outbreak on Twitter? An exploratory analysis. BMJ Health Care Inform. Nov 2020;27(3):e100133. [FREE Full text] [CrossRef] [Medline]
  61. Hairston TK, Links AR, Harris V, Tunkel DE, Walsh J, Beach MC, et al. Evaluation of parental perspectives and concerns about pediatric tonsillectomy in social media. JAMA Otolaryngol Head Neck Surg. Jan 01, 2019;145(1):45-52. [FREE Full text] [CrossRef] [Medline]
  62. Årsand E, Bradway M, Gabarron E. What are diabetes patients versus health care personnel discussing on social media? J Diabetes Sci Technol. Mar 2019;13(2):198-205. [FREE Full text] [CrossRef] [Medline]
  63. Jiang X, Jiang W, Cai J, Su Q, Zhou Z, He L, et al. Characterizing media content and effects of organ donation on a social media platform: content analysis. J Med Internet Res. Mar 12, 2019;21(3):e13058. [FREE Full text] [CrossRef] [Medline]
  64. Oser TK, Minnehan KA, Wong G, Parascando J, McGinley E, Radico J, et al. Using social media to broaden understanding of the barriers and facilitators to exercise in adults with type 1 diabetes. J Diabetes Sci Technol. May 2019;13(3):457-465. [FREE Full text] [CrossRef] [Medline]
  65. Sutton J, Vos SC, Olson MK, Woods C, Cohen E, Gibson CB, et al. Lung cancer messages on Twitter: content analysis and evaluation. J Am Coll Radiol. Jan 2018;15(1 Pt B):210-217. [CrossRef] [Medline]
  66. Thomas J, Prabhu AV, Heron DE, Beriwal S. Twitter and brachytherapy: An analysis of "tweets" over six years by patients and health care professionals. Brachytherapy. 2018;17(6):1004-1010. [CrossRef] [Medline]
  67. Kelly-Hedrick M, Grunberg PH, Brochu F, Zelkowitz P. "It's totally okay to be sad, but never lose hope": content analysis of infertility-related videos on YouTube in relation to viewer preferences. J Med Internet Res. May 23, 2018;20(5):e10199. [FREE Full text] [CrossRef] [Medline]
  68. Gage-Bouchard EA, LaValley S, Mollica M, Beaupin LK. Cancer communication on social media: examining how cancer caregivers use Facebook for cancer-related communication. Cancer Nurs. 2017;40(4):332-338. [CrossRef] [Medline]
  69. Rael CT, Pierre D, Frye V, Kessler D, Duffy L, Malos N, et al. Evaluating blood donor experiences and barriers/facilitators to blood donation in the United States using YouTube video content. Transfusion. Sep 2021;61(9):2650-2657. [FREE Full text] [CrossRef] [Medline]
  70. Fisher S, Jehassi A, Ziv M. Hidradenitis suppurativa on Facebook: thematic and content analyses of patient support group. Arch Dermatol Res. Aug 2020;312(6):421-426. [CrossRef] [Medline]
  71. Abdoli S, Hessler D, Vora A, Smither B, Stuckey H. Descriptions of diabetes burnout from individuals with Type 1 diabetes: an analysis of YouTube videos. Diabet Med. Aug 2020;37(8):1344-1351. [CrossRef] [Medline]
  72. Myneni S, Lewis B, Singh T, Paiva K, Kim SM, Cebula AV, et al. Diabetes self-management in the age of social media: large-scale analysis of peer interactions using semiautomated methods. JMIR Med Inform. Jun 30, 2020;8(6):e18441. [FREE Full text] [CrossRef] [Medline]
  73. Charlie AM, Gao Y, Heller SL. What do patients want to know? questions and concerns regarding mammography expressed through social media. J Am Coll Radiol. Oct 2018;15(10):1478-1486. [CrossRef] [Medline]
  74. Watts G, Christou P, Antonarakis G. Experiences of individuals concerning combined orthodontic and orthognathic surgical treatment: a qualitative Twitter analysis. Med Princ Pract. 2018;27(3):227-235. [FREE Full text] [CrossRef] [Medline]
  75. Pai RR, Alathur S. Assessing mobile health applications with Twitter analytics. Int J Med Inform. May 2018;113:72-84. [CrossRef] [Medline]
  76. Cheng TY, Liu L, Woo BK. Analyzing Twitter as a platform for Alzheimer-related dementia awareness: thematic analyses of tweets. JMIR Aging. Dec 10, 2018;1(2):e11542. [FREE Full text] [CrossRef] [Medline]
  77. Bridges N, Howell G, Schmied V. Exploring breastfeeding support on social media. Int Breastfeed J. 2018;13:22. [FREE Full text] [CrossRef] [Medline]
  78. Anderson JG, Hundt E, Dean M, Keim-Malpass J, Lopez RP. "The Church of Online Support": examining the use of blogs among family caregivers of persons with dementia. J Fam Nurs. Feb 2017;23(1):34-54. [CrossRef] [Medline]
  79. Kearney MD, Selvan P, Hauer MK, Leader AE, Massey PM. Characterizing HPV vaccine sentiments and content on Instagram. Health Educ Behav. Dec 2019;46(2_suppl):37-48. [CrossRef] [Medline]
  80. Davies SH, Langer MD, Klein A, Gonzalez-Hernandez G, Dowshen N. Adolescent perceptions of menstruation on Twitter: opportunities for advocacy and education. J Adolesc Health. Jul 2022;71(1):94-104. [CrossRef] [Medline]
  81. Van Diepen C, Rosales Valdes D. A content analysis on the perceptions of LGBTQ+ (centred) health care on Twitter. Health Expect. Dec 2022;25(6):3238-3245. [FREE Full text] [CrossRef] [Medline]
  82. Pleasure ZH, Frohwirth LF, Li N, Polis CB. A content analysis of Reddit users' posts about challenges to contraceptive care-seeking during COVID-19-related restrictions in the United States. J Health Commun. Oct 03, 2022;27(10):746-754. [CrossRef] [Medline]
  83. Du Y, Dennis B, Ramirez V, Li C, Wang J, Meireles CL. Experiences and disease self-management in individuals living with chronic kidney disease: qualitative analysis of the National Kidney Foundation's online community. BMC Nephrol. Mar 04, 2022;23(1):88. [FREE Full text] [CrossRef] [Medline]
  84. Sadek Attalla S, Ow NL, McNarry M, De Simoni A. Experiences of exercise in patients with asthma: a qualitative analysis of discussions in a UK asthma online community. BJGP Open. Apr 29, 2022;6(3):BJGPO.2021.0162. [CrossRef]
  85. Bartmess M, Talbot C, O'Dwyer ST, Lopez RP, Rose KM, Anderson JG. Using Twitter to understand perspectives and experiences of dementia and caregiving at the beginning of the COVID-19 pandemic. Dementia (London). Jul 2022;21(5):1734-1752. [FREE Full text] [CrossRef] [Medline]
  86. Brewer G, Centifanti L, Caicedo JC, Huxley G, Peddie C, Stratton K, et al. Experiences of mental distress during COVID-19: thematic analysis of discussion forum posts for anxiety, depression, and obsessive-compulsive disorder. Illn Crises Loss. Oct 2022;30(4):795-811. [FREE Full text] [CrossRef] [Medline]
  87. Castillo LIR, Hadjistavropoulos T, Beahm J. Social media discussions about long-term care and the COVID-19 pandemic. J Aging Stud. Dec 2022;63:101076. [FREE Full text] [CrossRef] [Medline]
  88. Colaceci S, Anderson G, Ricciuto V, Montinaro D, Alazraki G, Mena-Tudela D. Experiences of birth during COVID-19 pandemic in Italy and Spain: A thematic analysis. Int J Environ Res Public Health. Jun 18, 2022;19(12):7488. [FREE Full text] [CrossRef] [Medline]
  89. Tripathi SD, Parker PD, Prabhu AV, Thomas K, Rodriguez A. An examination of patients and caregivers on Reddit Navigating Brain Cancer: content analysis of the brain tumor Subreddit. JMIR Cancer. Jun 22, 2022;8(2):e35324. [FREE Full text] [CrossRef] [Medline]
  90. Jina-Pettersen N. Fear, neglect, coercion, and dehumanization: is inpatient psychiatric trauma contributing to a public health crisis? J Patient Exp. 2022;9:23743735221079138. [FREE Full text] [CrossRef] [Medline]
  91. Rossi NA, Devarajan K, Chokshi SN, Ochoa VJ, Benavidez M, Malaya LT, et al. Social media depictions of cochlear implants: An Instagram and TikTok analysis. Otol Neurotol. 2023;44(1):e13-e21. [CrossRef]
  92. Singh GK, Rego J, Chambers S, Fox J. Health professionals' perspectives of the role of palliative care during COVID-19: content analysis of articles and blogs posted on Twitter. Am J Hosp Palliat Care. Apr 2022;39(4):487-493. [FREE Full text] [CrossRef] [Medline]
  93. Koly KN, Tasnim Z, Ahmed S, Saba J, Mahmood R, Farin FT, et al. Mental healthcare-seeking behavior of women in Bangladesh: content analysis of a social media platform. BMC Psychiatry. Dec 19, 2022;22(1):797. [FREE Full text] [CrossRef] [Medline]
  94. Belcher R, Sim D, Meykler M, Owens-Walton J, Hassan N, Rubin R, et al. A qualitative analysis of female Reddit users' experiences with low libido: how do women perceive their changes in sexual desire? J Sex Med. Feb 27, 2023;20(3):287-297. [CrossRef] [Medline]
  95. Culp F, Wu Y, Wu D, Ren Y, Raynor P, Hung P, et al. Understanding alcohol use discourse and stigma patterns in perinatal care on Twitter. Healthcare (Basel). Nov 26, 2022;10(12):2375. [FREE Full text] [CrossRef] [Medline]
  96. Ferrey A, Ashworth G, Cabling M, Rundblad G, Ismail K. A thematic analysis of YouTube comments on a television documentary titled 'Diabulimia: The world's most dangerous eating disorder'. Diabet Med. May 2023;40(5):e15025. [CrossRef] [Medline]
  97. Watt S, Salway T, Gómez-Ramírez O, Ablona A, Barton L, Chang H, et al. Rumination, risk, and response: a qualitative analysis of sexual health anxiety among online sexual health chat service users. Sex Health. May 23, 2022;19(3):182-191. [CrossRef]
  98. Naganathan G, Bilgen I, Cleland J, Reel E, Cil T. #COVID19 and #Breastcancer: a qualitative analysis of tweets. Curr Oncol. Nov 08, 2022;29(11):8483-8500. [FREE Full text] [CrossRef] [Medline]
  99. Potgieter I, Hoare DJ, Fackrell K. Hyperacusis in children: a thematic analysis of discussions in online forums. Am J Audiol. Mar 03, 2022;31(1):166-174. [CrossRef]
  100. Manning Hutson M, Hosking SM, Mantalvanos S, Berk M, Pasco J, Dunning T. What injured workers with complex claims look for in online communities: netnographic analysis. J Med Internet Res. Apr 07, 2022;24(4):e17180. [FREE Full text] [CrossRef] [Medline]
  101. Lawless MT, Hunter SC, Pinero de Plaza MA, Archibald MM, Kitson AL. "You Are By No Means Alone": a netnographic study of self-care support in an online community for older adults. Qual Health Res. Nov 2022;32(13):1935-1951. [CrossRef] [Medline]
  102. Keim-Malpass J, Mitchell EM, Sun E, Kennedy C. Using Twitter to understand public perceptions regarding the #HPV vaccine: opportunities for public health nurses to engage in social marketing. Public Health Nurs. Jul 2017;34(4):316-323. [CrossRef] [Medline]
  103. Yamada R, Rasmussen KM, Felice JP. "What is 'enough,' and how do I make it?": a qualitative examination of questions mothers ask on social media about pumping and providing an adequate amount of milk for their infants. Breastfeed Med. 2019;14(1):17-21. [FREE Full text] [CrossRef] [Medline]
  104. Osadchiy V, Mills JN, Eleswarapu SV. Understanding patient anxieties in the social media era: qualitative analysis and natural language processing of an online male infertility community. J Med Internet Res. Mar 10, 2020;22(3):e16728. [FREE Full text] [CrossRef] [Medline]
  105. Thomas TH, Nauth-Shelley K, Thompson MA, Attai DJ, Katz MS, Graham D, et al. The needs of women treated for ovarian cancer: results from a #gyncsm Twitter chat. J Patient Cent Res Rev. 2018;5(2):149-157. [FREE Full text] [CrossRef] [Medline]
  106. Mehta N, Zhu L, Lam K, Stall NM, Savage R, Read SH, et al. Health forums and Twitter for dementia research: opportunities and considerations. J Am Geriatr Soc. Dec 2020;68(12):2881-2889. [CrossRef] [Medline]
  107. Gupta R, Ariefdjohan M. Mental illness on Instagram: a mixed method study to characterize public content, sentiments, and trends of antidepressant use. J Ment Health. Aug 2021;30(4):518-525. [CrossRef] [Medline]
  108. He L, He C, Reynolds T, Bai Q, Huang Y, Li C, et al. Why do people oppose mask wearing? A comprehensive analysis of U.S. tweets during the COVID-19 pandemic. J Am Med Inform Assoc. Jul 14, 2021;28(7):1564-1573. [FREE Full text] [CrossRef] [Medline]
  109. Loeb S, Mihalcea R, Perez-Rosas V, Xu A, Taylor J, Byrne N, et al. Leveraging social media as a thermometer to gauge patient and caregiver concerns: COVID-19 and prostate cancer. Eur Urol Open Sci. Mar 2021;25:1-4. [FREE Full text] [CrossRef] [Medline]
  110. Graf I, Gerwing H, Hoefer K, Ehlebracht D, Christ H, Braumann B. Social media and orthodontics: A mixed-methods analysis of orthodontic-related posts on Twitter and Instagram. Am J Orthod Dentofacial Orthop. Aug 2020;158(2):221-228. [CrossRef] [Medline]
  111. Boon-Itt S, Skunkan Y. Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health Surveill. Nov 11, 2020;6(4):e21978. [FREE Full text] [CrossRef] [Medline]
  112. Wang J, Zhou Y, Zhang W, Evans R, Zhu C. Concerns expressed by Chinese social media users during the COVID-19 pandemic: content analysis of Sina Weibo microblogging data. J Med Internet Res. Nov 26, 2020;22(11):e22152. [FREE Full text] [CrossRef] [Medline]
  113. Wahbeh A, Nasralah T, Al-Ramahi M, El-Gayar O. Mining physicians' opinions on social media to obtain insights into COVID-19: mixed methods analysis. JMIR Public Health Surveill. Jun 18, 2020;6(2):e19276. [FREE Full text] [CrossRef] [Medline]
  114. Çınar S, Boztepe H, Prof. The use of social media among parents of infants with cleft lip and/or palate. J Pediatr Nurs. 2020;54:e91-e96. [CrossRef] [Medline]
  115. Jiang T, Osadchiy V, Mills JN, Eleswarapu SV. Is it all in my head? Self-reported psychogenic erectile dysfunction and depression are common among young men seeking advice on social media. Urology. Aug 2020;142:133-140. [CrossRef] [Medline]
  116. Dzubur E, Khalil C, Almario CV, Noah B, Minhas D, Ishimori M, et al. Patient concerns and perceptions regarding biologic therapies in ankylosing spondylitis: insights from a large-scale survey of social media platforms. Arthritis Care Res (Hoboken). Feb 2019;71(2):323-330. [FREE Full text] [CrossRef] [Medline]
  117. Gonzalez G, Vaculik K, Khalil C, Zektser Y, Arnold C, Almario CV, et al. Women's experience with stress urinary incontinence: insights from social media analytics. Journal of Urology. May 2020;203(5):962-968. [CrossRef]
  118. Jha SR, McDonagh J, Prichard R, Newton PJ, Hickman LD, Fung E, et al. #Frailty: A snapshot Twitter report on frailty knowledge translation. Australas J Ageing. Dec 2018;37(4):309-312. [CrossRef] [Medline]
  119. Litchman ML, Snider C, Edelman LS, Wawrzynski SE, Gee PM. Diabetes online community user perceptions of successful aging with diabetes: analysis of a #DSMA Tweet chat. JMIR Aging. Jun 22, 2018;1(1):e10176. [FREE Full text] [CrossRef] [Medline]
  120. Jones J, Pradhan M, Hosseini M, Kulanthaivel A, Hosseini M. Novel approach to cluster patient-generated data into actionable topics: case study of a web-based breast cancer forum. JMIR Med Inform. Nov 29, 2018;6(4):e45. [FREE Full text] [CrossRef] [Medline]
  121. Gonzalez G, Vaculik K, Khalil C, Zektser Y, Arnold C, Almario CV, et al. Using large-scale social media analytics to understand patient perspectives about urinary tract infections: thematic analysis. J Med Internet Res. Jan 25, 2022;24(1):e26781. [FREE Full text] [CrossRef] [Medline]
  122. Sormunen T, Westerbotn M, Aanesen A, Fossum B, Karlgren K. Social media in the infertile community-using a text analysis tool to identify the topics of discussion on the multitude of infertility blogs. Womens Health (Lond). 2021;17:17455065211063280. [FREE Full text] [CrossRef] [Medline]
  123. Berkovic D, Ackerman IN, Briggs AM, Ayton D. Tweets by people with arthritis during the COVID-19 pandemic: content and sentiment analysis. J Med Internet Res. Dec 03, 2020;22(12):e24550. [FREE Full text] [CrossRef] [Medline]
  124. Della Rosa S, Sen F. Health topics on Facebook Groups: content analysis of posts in multiple sclerosis communities. Interact J Med Res. Feb 11, 2019;8(1):e10146. [FREE Full text] [CrossRef] [Medline]
  125. Awofeso N, Imam SA, Ahmed A. Content analysis of media coverage of childhood obesity topics in UAE newspapers and popular social media platforms, 2014-2017. Int J Health Policy Manag. Feb 01, 2019;8(2):81-89. [FREE Full text] [CrossRef] [Medline]
  126. Smith DJ, Mac VV, Hertzberg VS. Using Twitter for nursing research: a tweet analysis on heat illness and health. J Nurs Scholarsh. May 2021;53(3):343-350. [CrossRef] [Medline]
  127. Green BM, Van Horn KT, Gupte K, Evans M, Hayes S, Bhowmick A. Assessment of adaptive engagement and support model for people with chronic health conditions in online health communities: combined content analysis. J Med Internet Res. Jul 07, 2020;22(7):e17338. [FREE Full text] [CrossRef] [Medline]
  128. Osakwe ZT, Cortés YI. Impact of COVID-19: a text mining analysis of Twitter data in Spanish language. Hisp Health Care Int. Dec 2021;19(4):239-245. [CrossRef] [Medline]
  129. Sümeyye Yorulmaz D, Karadeniz H. Vaccination refusal debate on social media in Turkey: a content analysis of the comments on Instagram blogs. Iran J Public Health. Mar 2022;51(3):615-623. [FREE Full text] [CrossRef] [Medline]
  130. Choi E, Becker H, Kim S. A blog text analysis to explore psychosocial support in adolescents and young adults with cancer. Cancer Nurs. Dec 11, 2022;46(2):143-151. [CrossRef]
  131. Jun J, Wickersham K, Zain A, Ford R, Zhang N, Ciccarelli C, et al. Cancer and COVID-19 vaccines on Twitter: the voice and vaccine attitude of cancer community. J Health Commun. Jan 02, 2023;28(1):1-14. [CrossRef] [Medline]
  132. Yashpal S, Raghunath A, Gencerliler N, Burns LE. Exploring public perceptions of dental care affordability in the United States: mixed method analysis via Twitter. JMIR Form Res. Jul 01, 2022;6(7):e36315. [FREE Full text] [CrossRef] [Medline]
  133. Damier P, Henderson EJ, Romero-Imbroda J, Galimam L, Kronfeld N, Warnecke T. Impact of off-time on quality of life in Parkinson's patients and their caregivers: insights from social media. Parkinsons Dis. 2022;2022:1800567. [FREE Full text] [CrossRef] [Medline]
  134. Miller WR, Malloy C, Mravec M, Sposato MF, Groves D. Nursing in the spotlight: Talk about nurses and the nursing profession on Twitter during the early COVID-19 pandemic. Nurs Outlook. 2022;70(4):580-589. [FREE Full text] [CrossRef] [Medline]
  135. Hriberšek M, Eibensteiner F, Kapral L, Teufel A, Nawaz FA, Cenanovic M, et al. "Loved ones are not 'visitors' in a patient's life"-The importance of including loved ones in the patient's hospital stay: An international Twitter study of #HospitalsTalkToLovedOnes in times of COVID-19. Front Public Health. 2023;11:1100280. [FREE Full text] [CrossRef] [Medline]
  136. O'Sullivan L, Killeen RP, Doran P, Crowley RK. Adherence with reporting of ethical standards in COVID-19 human studies: a rapid review. BMC Med Ethics. Jun 28, 2021;22(1):80. [FREE Full text] [CrossRef] [Medline]
  137. Denecke K, Bamidis P, Bond C, Gabarron E, Househ M, Lau AYS, et al. Ethical issues of social media usage in healthcare. Yearb Med Inform. Aug 13, 2015;10(1):137-147. [FREE Full text] [CrossRef] [Medline]
  138. Ferretti A, Ienca M, Sheehan M, Blasimme A, Dove ES, Farsides B, et al. Ethics review of big data research: What should stay and what should be reformed? BMC Med Ethics. Apr 30, 2021;22(1):51. [FREE Full text] [CrossRef] [Medline]
  139. Kohn T, Shore C. The Ethics of University Ethics Committees: risk management and the research imagination. Death Public Univ Uncertain Futur High Educ Knowl Econ. May 2017:229-249. [CrossRef]
  140. Friesen P, Redman B, Caplan A. Of straws, camels, research regulation, and IRBs. Ther Innov Regul Sci. Jul 2019;53(4):526-534. [CrossRef] [Medline]
  141. Sellers C, Samuel G, Derrick G. Reasoning "uncharted territory": notions of expertise within ethics review panels assessing research use of social media. J Empir Res Hum Res Ethics. 2020;15(1-2):28-39. [FREE Full text] [CrossRef] [Medline]
  142. Kious B. The Nuremberg Code: its history and implications. Princet J Bioeth. 2001;4:7-19. [Medline]
  143. World Medical Association. World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA. Nov 27, 2013;310(20):2191-2194. [CrossRef] [Medline]
  144. Department of Health, Education, and Welfare, National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report. Ethical principles and guidelines for the protection of human subjects of research. J Am Coll Dent. 2014;81(3):4-13. [Medline]
  145. Wongkoblap A, Vadillo MA, Curcin V. Researching mental health disorders in the era of social media: systematic review. J Med Internet Res. Jun 29, 2017;19(6):e228. [FREE Full text] [CrossRef] [Medline]
  146. Alim S. An initial exploration of ethical research practices regarding automated data extraction from online social media user profiles. First Monday. Jul 2014;19(7):105-127. [CrossRef] [Medline]
  147. Vitak J, Proferes N, Shilton K, Ashktorab Z. Ethics regulation in social computing research: examining the role of institutional review boards. J Empir Res Hum Res Ethics. Dec 2017;12(5):372-382. [CrossRef] [Medline]
  148. Moreno MA, Goniu N, Moreno PS, Diekema D. Ethics of social media research: common concerns and practical considerations. Cyberpsychol Behav Soc Netw. Sep 2013;16(9):708-713. [FREE Full text] [CrossRef] [Medline]
  149. Metcalf J, Crawford K. Where are human subjects in Big Data research? The emerging ethics divide. Big Data & Society. Jun 01, 2016;3(1):205395171665021. [CrossRef]
  150. Usha Lawrance J, Nayahi Jesudhasan JV. Privacy preserving parallel clustering based anonymization for big data using MapReduce framework. Applied Artificial Intelligence. Oct 17, 2021;35(15):1587-1620. [CrossRef]
  151. Wilkinson D, Thelwall M. Researching personal information on the public web. Social Science Computer Review. Aug 17, 2010;29(4):387-401. [CrossRef]
  152. International Committee of Medical Journal Editors. Uniform requirements for manuscripts submitted to biomedical journals: writing and editing for biomedical publication. Croat Med J. Dec 2003;44(6):770-783. [FREE Full text] [Medline]
  153. Boyd D, Crawford K. Critical questions for big data. Information, Communication & Society. Jun 2012;15(5):662-679. [CrossRef]
  154. Bond CS, Ahmed OH, Hind M, Thomas B, Hewitt-Taylor J. The conceptual and practical ethical dilemmas of using health discussion board posts as research data. J Med Internet Res. Jun 07, 2013;15(6):e112. [FREE Full text] [CrossRef] [Medline]
  155. Markham A, Tiidenberg K, Herman A. Ethics as methods: doing ethics in the era of big data research—introduction. Social Media + Society. Jul 19, 2018;4(3):205630511878450-205630511878416. [CrossRef]

IRB: institutional review board
MeSH: Medical Subject Headings
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews

Edited by T Leung, T de Azevedo Cardoso; submitted 02.08.23; peer-reviewed by E Zibrowski, J Scheibner; comments to author 06.10.23; revised version received 29.11.23; accepted 16.04.24; published 17.05.24.


©Yujie Zhang, Jiaqi Fu, Jie Lai, Shisi Deng, Zihan Guo, Chuhan Zhong, Jianyao Tang, Wenqiong Cao, Yanni Wu. Originally published in the Journal of Medical Internet Research, 17.05.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.