This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) was spreading rapidly in South Korea at the end of February 2020 following its initial outbreak in China, making Korea the new center of global attention. The role of social media amid the current coronavirus disease (COVID-19) pandemic has often been criticized, but little systematic research has been conducted on this issue. Social media functions as a convenient source of information in pandemic situations.
Few infodemiology studies have applied network analysis in conjunction with content analysis. This study investigates information transmission networks and news-sharing behaviors regarding COVID-19 on Twitter in Korea. The real-time aggregation of social media data can serve as a starting point for designing strategic messages for health campaigns and establishing an effective communication system during this outbreak.
Korean COVID-19-related Twitter data were collected on February 29, 2020. Our final sample comprised 43,832 users and 78,233 relationships on Twitter. We generated four networks in terms of key issues regarding COVID-19 in Korea. This study comparatively investigates how COVID-19-related issues have circulated on Twitter through network analysis. Next, we classified top news channels shared via tweets. Lastly, we conducted a content analysis of news frames used in the top-shared sources.
The network analysis suggests that the spread of information was faster in the Coronavirus network than in the other networks (Corona19, Shincheonji, and Daegu). People who used the word “Coronavirus” communicated more frequently with each other. The spread of information was faster, and the diameter value was lower than for those who used other terms. Many of the news items highlighted the positive roles being played by individuals and groups, directing readers’ attention to the crisis. Ethical issues such as deviant behavior among the population and an entertainment frame highlighting celebrity donations also emerged often. There was a significant difference in the use of nonportal (n=14) and portal news (n=26) sites between the four network types. The news frames used in the top sources were similar across the networks (
Most of the popular news on Twitter had nonmedical frames. Nevertheless, the spillover effect of the news articles that delivered medical information about COVID-19 was greater than that of news with nonmedical frames. Social media network analytics cannot replace the work of public health officials; however, monitoring public conversations and media news that propagates rapidly can assist public health professionals in their complex and fast-paced decision-making processes.
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is spreading rapidly around the world, and the number of associated deaths has also been increasing. At the end of February 2020, the virus was spreading in South Korea following its initial outbreak in China, making Korea the new center of global attention. Mass infection occurred in Korea due to a closed religious group called
Social media has been criticized often amid the current coronavirus disease (COVID-19) pandemic, mainly due to its use as a medium for the quick spread of fake news [
Using an infodemiological approach, this study analyzes networking trends in public conversations and news-sharing behavior regarding COVID-19, particularly in Daegu, South Korea, on Twitter. The Pew Research Center reported that approximately 75% of Twitter users visit Twitter.com to read the news [
The Twitterverse examined in this context includes diverse messages on topics such as nationwide emergency relief efforts, media news, mass condolences, requests for central and regional governmental measures, and the provision of crucial medical information. The fact that these four networks have similar network sizes allows their conversational patterns and news diffusion to be easily compared.
Overcoming the current COVID-19 crisis may require increasingly diverse forms of data and more complex models. Handling the real-time aggregation and artificial intelligence-based analytics of social media, media news, academic publications, and other data sets is a daunting task. Nevertheless, this study could serve as a starting point for designing strategic messages for health campaigns and establishing an effective communication channel system.
Infodemiology is a growing area of research that aims to inform public health officials and develop public policies using informatics for the analysis of health data produced and consumed online [
Infodemiology studies have covered a wide range of topics. These include information search behaviors such as Ebola- or vaccination-related information [
The trustworthiness of user-created information is questionable [
In particular, user-generated content and shared health information on social media can serve as an alternative tool for syndromic surveillance [
Investigating the public’s communication framing of and approaches to health issues as observed on social media provides insights into the public’s thoughts on, perceptions about, and self-disclosures of disease-related symptoms [
Studies have focused on quantifying the search queries and tracking the volume of health-related information or user-generated content in electronic media using Google Trends, Google Health application programming interface (API), or Google Flu Trends [
We address three research questions (RQs) about Korea’s COVID-19 conversations in terms of socially disseminated Twitter messages. First, is there a difference in communication network structure among the four networks generated from four keywords (written in Korean)—Coronavirus, Corona19, Shincheonji, and Daegu—and what are the characteristics of the conversation patterns among users? Second, which news topics and media channels generate the users’ interest, and what are their characteristics? Do the most frequently mentioned news topics among the four networks display any differences in media outlet type? Third, what perspectives on news articles are observed from a media organizational point of view? In other words, do news articles with a medically oriented thematic frame have broader spillover effects on the COVID-19 issue in the Twitter context?
This study evaluates trends in Korea’s COVID-19 conversations using Twitter data. Data were collected on February 29, 2020, roughly covering the most recent weeks in the Twitter database. Using the Twitter search API embedded in NodeXL (Social Media Research Foundation) [
This study used three main methodological approaches. First, it conducted a social media network analysis to determine how COVID-19-related issues circulate on Twitter. The study traced the characteristics of information diffusion regarding COVID-19 on Twitter by generating communication networks composed of all tweets containing any of the search terms (“Coronavirus,” “Corona19,” “Shincheonji,” or “Daegu”). A communication network in this study refers to a social network that users generate by communicating with each other. Each node represents a user, and each link between users represents a conversation (ie, a retweet, reply, or mention). Four networks were generated. A network analysis was conducted to identify the multidimensional communication activities between Twitter users and grasp the nature of the information transmission networks composed of entities such as words, hyperlinks, and hashtags. Twitter users were grouped into subgroups using the Clauset–Newman–Moore cluster algorithm and visualized using the Harel–Koren Fast Multiscale layout algorithm [
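The node-and-link definition above can be sketched in a few lines of Python. This is a minimal illustration only, not the NodeXL import pipeline used in the study: the tweet records, their field names, and the helper `tweet_to_edges` are hypothetical, and treating a tweet that addresses nobody as a self-loop is an assumption made for this sketch.

```python
from collections import Counter

# Hypothetical tweet records; the field names are illustrative, not NodeXL's schema.
tweets = [
    {"user": "alice", "retweet_of": "bob", "reply_to": None, "mentions": []},
    {"user": "bob", "retweet_of": None, "reply_to": "alice", "mentions": ["carol"]},
    {"user": "carol", "retweet_of": None, "reply_to": None, "mentions": []},
]


def tweet_to_edges(tweet):
    """Return directed (source, target, relationship) edges for one tweet."""
    source = tweet["user"]
    edges = []
    if tweet["retweet_of"]:
        edges.append((source, tweet["retweet_of"], "retweet"))
    if tweet["reply_to"]:
        edges.append((source, tweet["reply_to"], "reply"))
    for target in tweet["mentions"]:
        edges.append((source, target, "mention"))
    if not edges:
        # Assumption: a tweet that addresses nobody is kept as a self-loop.
        edges.append((source, source, "tweet"))
    return edges


# Each user is a node; each retweet, reply, or mention is a directed link.
edges = [edge for tweet in tweets for edge in tweet_to_edges(tweet)]
nodes = sorted({user for src, dst, _ in edges for user in (src, dst)})
relationship_counts = Counter(kind for _, _, kind in edges)

print(nodes)
print(edges)
print(relationship_counts)
```

Keeping every edge, including repeats, preserves the information needed later to separate unique edges from edges with duplicates.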
We then classified the top news items in terms of their media channels. The media outlet of a news article can be regarded as both a form of carrier interface and a means of expression. In Korea, portals are increasingly becoming the primary point of news access. Given Korea’s unique news environment, a media organization’s presence on a portal offers an interface between its news production and its readership [
Besides news channels, we also considered news frames as points of view that guide readers’ cognitive direction. Content analysis was conducted to determine the main frames within the news stories by generating content categories that encompass the entire text corpus [
A coding scheme was developed to further subclassify the nonmedical news frames based on studies of journalists’ use of news frames [
Conflict: Reflects disagreement between parties, individuals, or groups
Human interest: Emphasizes the (positive) role of individuals and groups affecting the issue
Attribution of responsibility: Suggests that some government agencies, including politicians and public officials, are responsible for the issue
Morality: Contains a moral and ethical message
Medical: Mentions medical and health issues related to the problem
Entertainment: Covers cultural issues such as celebrity, sports, or food
We addressed RQ1 by comparing the topologies of the four networks, which are shown in
As
The frequency of unique edges was lowest in the Coronavirus network. Unique edges are those that occur only once, that is, relationships without duplicates. In other words, the Coronavirus network has the most redundant relationships, indicating that people continued to talk to each other while exchanging comments several times. Thus, it is highly likely that a “big mouth” existed in the Coronavirus network. On the other hand, the Daegu and Corona19 networks had the lowest frequencies of edges with duplicates. These fewer overlapping relationships suggest that many one-time conversations took place, forming an instant and improvised community.
There are many self-loops in the Shincheonji network, wherein tweets started and ended with the same user in a conversation thread. The Shincheonji network has the highest reciprocated vertex pair and reciprocated edge ratios, which shows that “birds of a feather flock together.” When two Twitter users talk to each other, their relationship is regarded as being reciprocated. In sharp contrast, less than one-tenth of tweets in the Coronavirus network were self-loops, revealing that different types of users were paired in comment exchanges. Similarly, the Shincheonji network also had the largest number of isolates, followed by the Daegu network. An isolate has zero connections. This result suggests that the communication patterns in region-oriented networks, Shincheonji and Daegu, differed from the networks that concerned more overall issues regarding COVID-19. The difference can be attributed to geographic variation in information-sharing behaviors on Twitter [
Korean coronavirus disease networks on Twitter. Coronavirus (top left), Corona19 (top right), Shincheonji (bottom left), and Daegu (bottom right).
Comparing user relationships across coronavirus disease networks.
Network measures | Coronavirus | Corona19 | Shincheonji | Daegu |
Nodes, n | 12,803 | 11,739 | 9589 | 9701 |
Isolates, n (%) | 368 (2.87) | 324 (2.76) | 471 (4.91) | 434 (4.47) |
Total edges, n | 18,407 | 19,772 | 20,327 | 19,727 |
Unique edges, n (%) | 14,486 (78.70) | 17,042 (86.19) | 16,879 (83.04) | 17,017 (86.26) |
Edges with duplicates, n (%) | 3921 (21.30) | 2730 (13.81) | 3448 (16.96) | 2710 (13.74) |
Self-loops, n (%) | 1450 (7.88) | 2318 (11.72) | 2754 (13.55) | 1901 (9.64) |
Reciprocated vertex pair ratio | 0.00020 | 0.00042 | 0.00353 | 0.00334 |
Reciprocated edge ratio | 0.00040 | 0.00084 | 0.00704 | 0.00666 |
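The edge measures in the table above can be illustrated with a short sketch. The definitions used here are assumptions chosen to be consistent with the reported figures (unique edges plus edges with duplicates equal total edges, and the reciprocated edge ratio equals 2r/(1+r) for a reciprocated vertex pair ratio r); the exact NodeXL conventions, for example how self-loops enter the reciprocity ratios, may differ. The function name `edge_summary` and the example edges are hypothetical.

```python
from collections import Counter


def edge_summary(edges):
    """Summarize a list of directed (source, target) edges.

    Assumed definitions for this sketch:
    - unique edge: an edge whose (source, target) pair occurs exactly once
    - edge with duplicates: every copy of a pair that occurs more than once
    - self-loop: an edge whose source equals its target
    - reciprocity is computed on distinct, non-loop directed pairs
    """
    counts = Counter(edges)
    total = len(edges)
    unique = sum(1 for edge in edges if counts[edge] == 1)
    self_loops = sum(1 for src, dst in edges if src == dst)

    distinct = {(src, dst) for src, dst in edges if src != dst}
    vertex_pairs = {frozenset(pair) for pair in distinct}
    reciprocated_pairs = sum(
        1 for pair in vertex_pairs
        if all((a, b) in distinct for a in pair for b in pair if a != b)
    )
    reciprocated_edges = sum(1 for src, dst in distinct if (dst, src) in distinct)

    return {
        "total_edges": total,
        "unique_edges": unique,
        "edges_with_duplicates": total - unique,
        "self_loops": self_loops,
        "reciprocated_vertex_pair_ratio":
            reciprocated_pairs / len(vertex_pairs) if vertex_pairs else 0.0,
        "reciprocated_edge_ratio":
            reciprocated_edges / len(distinct) if distinct else 0.0,
    }


# Hypothetical example: a and b talk to each other, a addresses c twice,
# and d posts one tweet that addresses nobody.
example_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "c"), ("d", "d")]
summary = edge_summary(example_edges)
print(summary)
```

In this example one of the two connected vertex pairs is reciprocated (ratio 0.5), and two of the three distinct directed edges have a reverse edge (ratio 2/3).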
Next, the modularity value of the Coronavirus network was the highest among the four networks, while the Daegu network had the lowest value. This result suggests that the clusters created within the Coronavirus network may be less cohesive in terms of the subgroups’ internal collectivity because the Twitter users in group A tend to be connected with other users in group B. If modularity is low, the clusters are well-defined in terms of the quality of the subgroups generated. Because the Daegu and Shincheonji networks both have lower modularity than the other networks, users classified in the same cluster rarely left their own group to talk to others in a different cluster.
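As a point of reference, the sketch below computes a modularity score for a given partition using Newman's standard definition for an undirected graph without duplicate edges or self-loops. Whether NodeXL treats directed, duplicate, or self-loop edges in the same way is an assumption, and the function name, node names, and example graph are illustrative.

```python
def modularity(edges, community_of):
    """Newman modularity Q for an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / (2m))^2,
    where m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")
    internal = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        # Each edge adds one to the degree of both of its endpoints.
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical example: two triangles joined by a single bridge edge.
triangle_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("x", "z"),
    ("c", "x"),
]
partition = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2, "z": 2}
q = modularity(triangle_edges, partition)
print(round(q, 3))
```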
A component analysis reveals that Shincheonji and Daegu network members had the largest numbers of connected components and the maximum edges in a connected component. Coronavirus network members had the largest chat room with the highest value for maximum vertices (ie, users) in a connected component, followed by the users of Corona19.
Comparing coronavirus disease network properties on Twitter.
Types | Coronavirus | Corona19 | Shincheonji | Daegu |
Maximum geodesic distance (diameter) | 12 | 16 | 14 | 14 |
Average geodesic distance | 3.865 | 5.459 | 4.432 | 4.471 |
Modularity | 0.674 | 0.667 | 0.563 | 0.530 |
Connected components, n | 584 | 582 | 799 | 828 |
Maximum vertices in a connected component, n | 10,783 | 10,135 | 8270 | 8098 |
Maximum edges in a connected component, n | 16,121 | 17,916 | 18,947 | 18,123 |
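The distance and component measures in the table above can be sketched with a plain breadth-first search. This is an illustration under stated assumptions rather than NodeXL's implementation: edges are treated as undirected, self-loops are ignored, and the average geodesic distance is taken over distinct pairs of users that can reach each other. The function names and the example graph are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, start):
    """Shortest-path length (in hops) from start to every reachable node."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def geodesic_summary(nodes, edges):
    """Connected components, diameter, and average geodesic distance."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        if a != b:  # assumption: self-loops do not affect distances
            adjacency[a].add(b)
            adjacency[b].add(a)

    components = []
    seen = set()
    path_lengths = []
    for node in adjacency:
        reachable = bfs_distances(adjacency, node)
        # Every connected pair is visited twice, which leaves the mean unchanged.
        path_lengths.extend(d for d in reachable.values() if d > 0)
        if node not in seen:
            components.append(set(reachable))
            seen.update(reachable)

    return {
        "connected_components": len(components),
        "max_vertices_in_component": max((len(c) for c in components), default=0),
        "diameter": max(path_lengths, default=0),
        "average_geodesic_distance":
            sum(path_lengths) / len(path_lengths) if path_lengths else 0.0,
    }


# Hypothetical example: a four-user chain, one isolate, and one separate pair.
example_nodes = ["a", "b", "c", "d", "e", "f", "g"]
example_links = [("a", "b"), ("b", "c"), ("c", "d"), ("f", "g")]
geodesics = geodesic_summary(example_nodes, example_links)
print(geodesics)
```

Running one search per user is simple but slow on large graphs; for networks of roughly 10,000 users it remains feasible.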
RQ2 addresses the popularity of news topics and media channels. We extracted the most cited news among the four networks. Five news items appeared twice on the top 10 list of the four Twitter networks. The most popular news concerned suspicions that the
The top 10 news stories on the Twitterverse related to Shincheonji included the term “Shincheonji” in their headlines. For example, the most popular news was that Shincheonji leader Lee Man-hee had received recognition for his service to the country from ex-President Park Geun-hye and that he was set to be buried at the National Cemetery. Thus, the top news stories shared on Twitter featured eye-catching headlines, the use of dramatic expressions, and emotional narratives.
It is noteworthy that international news, rather than domestic news, was chosen as the top news item.
Finally, we investigated the most cited news channels among the four networks. Korea’s portals and
The study computed a 4 (Corona19 vs Coronavirus vs Daegu vs Shincheonji) × 2 (portal vs nonportal) chi-square test comparing the frequency of portal vs nonportal news site use between network types. The difference was significant (χ²₃=12.747,
We also determined the origin of the news items reported in the portals. The four
This study addresses RQ3 by analyzing the news frames of media organizations that were circulated in tweets. The results of content analysis show that 17.5% (n=7/40) of the articles mentioned medical or health problems. The medical news items include discussions of the characteristics of COVID-19, warnings about the potential for beards to be infected with COVID-19, the effects of health conditions on mortality, the status of COVID-19 tests in Italy, and the difference in infection rates between Shincheonji members and ordinary citizens.
The results of an independent two-tailed
This study ranked the most popular news articles from 1 to 10. This study then calculated reverse scores to measure the spillover effects of the articles. For example, the top-ranked news item was given 10 points, and the 10th-ranked item was given 1 point.
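The reverse scoring described above amounts to a one-line transformation. The sketch below assumes a top 10 list, as in the text; the function name and its `list_length` parameter are illustrative.

```python
def reverse_score(rank, list_length=10):
    """Convert a popularity rank (1 = most shared) into a spillover score."""
    if not 1 <= rank <= list_length:
        raise ValueError(f"rank must be between 1 and {list_length}")
    return list_length + 1 - rank


# Rank 1 receives 10 points and rank 10 receives 1 point.
scores = [reverse_score(rank) for rank in range(1, 11)]
print(scores)
```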
Lastly, the study investigated the news frames included in the tweets produced across the four networks and compared the association between the network typology and the frames. The findings suggest that the “attribution of responsibility” frame was the most frequently used, followed by “human interest” and “medical” (7 times each). Both “morality” and “entertainment” were cited 6 times. “Conflict” was the least used frame. This study conducted a 4 (network types) × 6 (news frames) chi-square analysis to examine the association between network type and the news frames used in the tweets. As shown in
Chi-square results for the news frames across coronavirus disease networks.
Network type | Total (N=40), n (%) | Conflict (n=3), n (%) | Entertainment (n=6), n (%) | Human interest (n=7), n (%) | Medical (n=7), n (%) | Morality (n=6), n (%) | Attribution of responsibility (n=11), n (%) | Chi-square (df) |
 | N/Aa | N/A | N/A | N/A | N/A | N/A | N/A | 8.727 (15) |
Corona19 | 10 (25) | 1 (10)b | 2 (20) | 2 (20) | 3 (30) | 1 (10) | 1 (10) | N/A |
Coronavirus | 10 (25) | 0 (0) | 1 (10) | 2 (20) | 2 (20) | 2 (20) | 3 (30) | N/A |
Daegu | 10 (25) | 0 (0) | 2 (20) | 2 (20) | 1 (10) | 1 (10) | 4 (40) | N/A |
Shincheonji | 10 (25) | 2 (20) | 1 (10) | 1 (10) | 1 (10) | 2 (20) | 3 (30) | N/A |
aNot applicable.
bIn the calculation of the n (%) values across each row, the row total is taken as the N value.
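The chi-square statistic in the table above can be rechecked from the observed counts alone. The sketch below computes the Pearson statistic and its degrees of freedom in plain Python; the counts are copied from the four network rows of the table, and the function and variable names are illustrative. It does not compute a P value.

```python
# Observed frame counts per network, in the column order of the table:
# conflict, entertainment, human interest, medical, morality,
# attribution of responsibility.
observed = {
    "Corona19": [1, 2, 2, 3, 1, 1],
    "Coronavirus": [0, 1, 2, 2, 2, 3],
    "Daegu": [0, 2, 2, 1, 1, 4],
    "Shincheonji": [2, 1, 1, 1, 2, 3],
}


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a rows x columns count table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under independence of network type and frame.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


chi_square, df = pearson_chi_square(list(observed.values()))
print(round(chi_square, 3), df)
```

With these counts the statistic rounds to 8.727 on 15 degrees of freedom, matching the value reported in the table.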
By March 1, 2020, Korea had become one of the most SARS-CoV-2-infected countries in the world. The greater Daegu metropolitan area had Korea’s highest COVID-19 infection rate per household as well as the highest absolute rate [
Our main research platform was Twitter, but the analysis has also considered intermedia journalism that goes beyond Twitter. People use various news channels to share information, even when communicating via social media. This study found that portals were the preferred news sources on Twitter. As shown in Endo’s [
Although this study proposed and demonstrated a useful infodemiology framework by performing social network analytics to explore information diffusion related to the COVID-19 pandemic via Twitter in Korea, it is not without limitations. These limitations are inherent in Twitter’s user population. The literature suggests that only 15% of online adults are regular Twitter users [
People experiencing social disasters such as epidemics of infectious diseases are unfamiliar with their situation and find it difficult to predict what will happen next. Therefore, risk communication that delivers accurate and appropriate information is important. We found that most of the popular news on Korea’s Twitterverse had nonmedical frames. For instance, many news items reported that the initial response of government agencies was responsible for the spread of COVID-19. Many news items highlighted the positive role of individuals and groups, directing readers’ attention to the epidemic crisis. Ethical issues such as deviant behavior among the population and an entertainment frame highlighting celebrity donations also emerged often. Relatively few articles reflected discrepancies in positions or opinions among individuals or groups. Nevertheless, it must be noted that the spillover effect of the news articles that delivered medical information about COVID-19 was greater than that of news with nonmedical frames.
Augmented intelligence systems in the medical sector have been widely cited as an important approach to helping detect and clinically diagnose diseases [
API: application programming interface
COVID-19: coronavirus disease
FP: Foreign Policy
RQ: research question
SARS-CoV-2: severe acute respiratory syndrome coronavirus 2
SNS: social networking services
HWP would like to thank Dr Marc Smith and the Social Media Research Foundation for offering a valuable Twitter data set via NodeXL. SP appreciates the financial support of John Carroll University.
None declared.