This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Twitter is becoming an important tool in medicine, but there is little information on Twitter metrics. In order to recommend best practices for information dissemination and diffusion, it is important to first study and analyze the networks.
This study describes the characteristics of four medical networks and analyzes their theoretical dissemination potential, their actual dissemination, and the propagation and distribution of their tweets.
Open Twitter data was used to characterize four networks: the American Medical Association (AMA), the American Academy of Family Physicians (AAFP), the American Academy of Pediatrics (AAP), and the American College of Physicians (ACP). Data were collected between July 2012 and September 2012. Visualization was used to understand the follower overlap between the groups. Actual flow of the tweets for each group was assessed. Tweets were examined using Topsy, a Twitter data aggregator.
The theoretical information dissemination potential for the groups is large. A collective community is emerging, where large percentages of individuals are following more than one of the groups. The overlap across groups is small, indicating a limited amount of community cohesion and cross-fertilization. The AMA followers’ network is not as active as the other networks. The AMA posted the largest number of tweets while the AAP posted the fewest. The number of retweets for each organization was low, indicating dissemination that is far below its potential.
To increase the dissemination potential, medical groups should develop a more cohesive community of shared followers. Tweet content must be engaging to provide a hook for retweeting and reaching the potential audience. Next steps call for content analysis, assessment of the behavior and actions of the messengers and the recipients, and a larger-scale study that considers other medical groups using Twitter.
Social media, including Facebook and Twitter, is fast becoming an important tool in health care. In editorials, essays, and blogs, physicians have been urged to become active participants in social media as a form of engagement with the larger health community, patients, and peers [
This rapidly growing social network has approximately 500 million users worldwide, 140 million of them in the United States [
In order to make inferences about group behavior and predictions about “best practices” for dissemination and diffusion of information through these networks, it is important to analyze the networks first. Social network analysis (SNA) is a well-established technique in sociology that can be adapted and used to systematically explore virtual communities, such as those that exist within the world of medicine. Applied graph theory is an overlapping area focused on using graphs to represent structures and networks, and theory developed about graphs to explain applications in a variety of fields, from computer science, to biology and chemistry, to mathematics and linguistics, to name a few. SNA and applied graph theory have been used to analyze structural patterns of social relationships, to explore influential information brokers, and to visualize the formal or informal personal networks within and between organizations.
Online networks have been studied before in relation to their topological structure, patterns of propagation of information, homophily (the tendency of individuals to associate and bond with similar others), and the types of tie formations and decays [
Twitter can be thought of as an information sharing network because of its highly skewed distribution of followers, or listeners, and its low rate of reciprocated connections (most information sharers are not followers of their followers) [
There are several network topologies (structures or models of the network) that highlight common information diffusion patterns.
The number of edges connected to a node is referred to as the degree of the node. The direction of the edge in each network indicates the direction of information flow. The difference between these networks is the degree distribution, ie, the number of incoming and outgoing edges of each node in the network. The pattern of these edge connections defines the structure of the network and dictates how quickly a message travels.
Network configurations - Star, Random, Small World (left to right).
In star networks, the degree distribution of the nodes is heavily skewed. Every time node 0 sends a message, every node in the network receives it immediately. However, when node 8 sends a message, no one ever receives it. This is equivalent to a Twitter subnetwork where a Twitter account or Twitter user has followers who do not connect to each other (they do not have a direct communication channel to each other). When users in a subnetwork are well connected, we say that they form a “cohesive community”. The amount of cohesion is defined as the number of common neighbors a group of individuals has, divided by the total number of neighbors. This measure is a variant of a more traditional local “clustering coefficient” measure. A clustering coefficient is user specific and measures the number of triangles a user is involved in. Because connectivity is limited in a star network (no neighbors have edges between them), it does not form a cohesive community. However, for messages sent from the center node, this network is optimal for basic information dissemination since everyone receives the message right away and no extra messages are sent. Messages and ideas sent from the periphery, though, do not spread. At the same time, while star structures within a real world network are ideal for quick information diffusion from the center node, they are problematic in terms of network resilience and community development.
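The star behavior described above can be illustrated with a minimal sketch. The nine-node graph, the function names, and the choice to model edges as pointing from sender to receiver are assumptions made for illustration; they are not taken from the study's data or code. The clustering function uses the standard local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected).

```python
from collections import deque
from itertools import combinations


def hops_from(graph, source):
    """Breadth-first search: map each node reachable from source to its hop count."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def local_clustering(undirected, node):
    """Fraction of pairs of the node's neighbors that are directly connected."""
    nbrs = undirected[node]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(sorted(nbrs), 2))
    linked = sum(1 for a, b in pairs if b in undirected[a])
    return linked / len(pairs)


# Hypothetical directed star: edges follow information flow from node 0 to nodes 1..8.
star = {0: list(range(1, 9))}
for leaf in range(1, 9):
    star[leaf] = []

# Undirected view of the same star, used only for the clustering coefficient.
star_undirected = {0: set(range(1, 9))}
for leaf in range(1, 9):
    star_undirected[leaf] = {0}

center_reach = hops_from(star, 0)   # every leaf is one hop away
leaf_reach = hops_from(star, 8)     # only node 8 itself
center_clustering = local_clustering(star_undirected, 0)
```

In this toy graph the center reaches all eight leaves in one hop, a leaf reaches no one, and the center's clustering coefficient is zero because no two leaves are connected.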
In random networks, the degree of the nodes follows a normal distribution. It is unusual to see a node that has a very high or low degree and the dissemination power of each node tends toward the mean. When node 0 sends a message, it takes three steps, or three hops, before it reaches everyone in the network. When node 1 sends a message, it reaches everyone in two hops. If everyone who receives the message sends it forward, some nodes will receive the message multiple times. Since extra messages are being sent, the dissemination is not considered efficient. In other words, because of the random connectivity pattern, spreading a message requires more individuals to participate and is thus not efficient. Further, this network is not a cohesive community since only a small number of neighbors are connected to each other. In general, information networks, social networks, epidemics, and other such networks exhibit non-random connectivity patterns [
In small world networks, the diameter of the graph (the furthest distance between any two nodes in the graph) is low and the amount of cohesion is higher than in a random network. A network in which the degree of the nodes follows a power law distribution indicates a small world network. In this network, a few nodes are very well connected, but most are not. In our example, if node 0 sends a message, it takes two hops to reach everyone without extra messages (in a larger example, we would expect a small number of redundant messages). This network is more efficient than a random network. Well-connected users in this network have comparatively high dissemination power and act as hubs, but messages from the periphery can also be efficiently disseminated because the diameter of small world networks is low. This means that even though most nodes are not neighbors of each other, the number of hops needed to reach every node is small. Small world networks tend to have pockets of cohesive communities throughout the network. Another interesting property of small world networks is that they are more resilient to removal of random nodes from the network than are random networks. Because most random nodes will have a small degree, deleting them will not increase the diameter or decrease the cohesion/clustering coefficient significantly [
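As a rough illustration of the diameter measure, the sketch below computes the furthest shortest-path distance in two hand-built eight-node graphs: one with two well-connected hubs and one ring with no hubs. Both graphs and all function names are hypothetical; the hub graph is only a toy with small world flavor, not a generated small world network.

```python
from collections import deque


def hops_from(graph, source):
    """Breadth-first search: map each reachable node to its hop count from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(graph):
    """Largest shortest-path distance between any two nodes (assumes a connected graph)."""
    return max(max(hops_from(graph, node).values()) for node in graph)


def undirected(edges):
    """Build an adjacency-set dictionary from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical graph with two hubs (nodes 0 and 4) and one triangle (0, 1, 2).
hub_graph = undirected([(0, 1), (0, 2), (0, 3), (0, 4),
                        (4, 5), (4, 6), (4, 7), (1, 2)])

# Hypothetical ring of the same size, where every node has degree 2.
ring_graph = undirected([(i, (i + 1) % 8) for i in range(8)])

hub_diameter = diameter(hub_graph)
ring_diameter = diameter(ring_graph)
```

With the same number of nodes and edges, the hub graph has the smaller diameter, which is the property the paragraph above attributes to well-connected hubs.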
Many networks have been shown to follow small world properties, including social networks, protein networks, and voter networks. When celebrities are excluded, the degree distribution of nodes on Twitter approximates a power law distribution [
A network with a power law structure has the theoretical capacity to spread information, even arising from the periphery, efficiently to many users. In generated networks of this type, a message can be disseminated to everyone in the network using a simple dissemination strategy and a small number of resends (logarithmic in the network size) [
To summarize, computer science theory on information dissemination elicits three things about networks with the properties observed of the Twitter follower graph: (1) Twitter resembles a small world graph with a degree distribution that follows a power law distribution, (2) dissemination to a large number of nodes in a small amount of time is possible, and (3) these large scale disseminations can be achieved with simple resend rules (ie, they do not require sophisticated centralized planning).
It should be mentioned that there is a natural trade-off between information dissemination and community cohesion. If there is high community cohesion, members of the community will have quick access to information. However, members outside of the community will not. In contrast, if cohesion is low, information can disseminate to a broader audience. However, the actual amount of dissemination in a subnetwork without active community participants can be low. For information transmission networks, developing a community with moderate cohesion will increase the dissemination of information to a broader audience.
In this preliminary study, we sought to employ applied graph theory and a basic SNA framework in order to characterize and understand information diffusion on social media within a subset of the medical community by examining the Twitter networks of a few medical professional societies.
Social network analysis and network configuration models were used to characterize community structure and information dissemination of four professional physician groups that have a presence on Twitter. The core groups in this analysis are: the American Medical Association (AMA), the American Academy of Family Physicians (AAFP), the American Academy of Pediatrics (AAP), and the American College of Physicians (ACP). Explanations of the metrics used in this study are presented in
Description of metrics from Twitter.
Metric | What it measures | Description and purpose of metric |
Number of followers | Actual information dissemination | How many people/groups received your message? AND How many people/groups may resend (retweet) your tweet? |
Number of Level 2 followers | Level 2 information dissemination potential | How many followers (active listeners) do your followers have? |
Dissemination network size | Information dissemination potential | How many people can see your message if all of your followers retweet it? |
Number of information sharers | Active sources of information | Who are the other people or groups on Twitter that you are getting information from? |
Number of tweets | Frequency of information disseminated | How often do you share information with your followers? |
Number of retweeters | Actual number of information disseminators | How many people retweeted a particular tweet you sent? |
Retweeter network size | Number of Level 2 followers | How many people receive the tweet when some of the followers retweet it? |
The Twitter API (api.twitter.com) was used to characterize the network: determine the number of followers and the number of tweets for each of these groups. These data were collected between July 2012 and September 2012. Accounts that were disabled, private, or not recognizable by our automated programs were ignored. This amounted to less than 1% (1257/238,853) of the accounts we had access to. Similar to other studies in computer science, the number of followers was used as an indicator of actual information dissemination, the Level 2 followers as an indicator of the information dissemination potential, the number a user is following as a way to identify potential sources of information, and the number of tweets as an indicator of the frequency of information dissemination for a particular group or individual.
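To make the follower-based metrics concrete, the following sketch computes them on a small hypothetical follower map. The account names, the dictionary layout, and the decision to count each account only once in the dissemination network size are assumptions for illustration; the study does not state whether duplicates were removed, and no real Twitter API call is shown.

```python
def number_of_followers(followers_of, account):
    """Direct followers: accounts that receive the tweet as soon as it is sent."""
    return len(followers_of.get(account, set()))


def level2_followers(followers_of, account):
    """Distinct followers of the account's followers (the listeners of the listeners)."""
    level2 = set()
    for follower in followers_of.get(account, set()):
        level2 |= followers_of.get(follower, set())
    level2.discard(account)
    return level2


def dissemination_network_size(followers_of, account):
    """Distinct accounts that could see a tweet if every direct follower retweeted it."""
    reach = set(followers_of.get(account, set()))
    reach |= level2_followers(followers_of, account)
    return len(reach)


# Hypothetical follower map: key = account, value = set of accounts following it.
followers_of = {
    "org": {"a", "b", "c"},
    "a": {"d", "e"},
    "b": {"e", "f", "org"},
    "c": set(),
}

direct = number_of_followers(followers_of, "org")
potential = dissemination_network_size(followers_of, "org")
```

Here the organization has three direct followers and a dissemination network of six distinct accounts, since "e" is counted once and the organization itself is excluded.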
We approximated community cohesion for this dataset by measuring the amount of overlap in followers between the professional groups as a percentage [|A ∩ B| / min(|A|, |B|) × 100], where A and B are the follower sets of two groups. Overlap is necessary to develop a cohesive community that has common sets of followers and connections between subsets of the followers. As previously mentioned, too much overlap may reduce the amount of information that disseminates outside of the community. We also used visualization to better understand this overlap.
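The overlap percentage can be sketched in a few lines. The follower sets below are made-up identifiers, and the function name is an assumption; only the formula itself (shared followers divided by the size of the smaller follower set, times 100) comes from the text.

```python
def overlap_percent(followers_a, followers_b):
    """Shared followers as a percentage of the smaller of the two follower sets."""
    if not followers_a or not followers_b:
        return 0.0
    shared = len(followers_a & followers_b)
    return shared / min(len(followers_a), len(followers_b)) * 100


# Hypothetical follower sets for two groups.
group_a = {1, 2, 3, 4}
group_b = {3, 4, 5, 6, 7, 8}

overlap = overlap_percent(group_a, group_b)
```

Dividing by the smaller set means that a small group whose followers all follow a much larger group would score 100%, which seems to be the intended reading of the measure.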
The actual information flow of the tweets for each of these groups for a one-month period from August 1, 2012 to September 1, 2012 was analyzed. Tweets sent by the four professional physician groups were examined using Topsy, a Twitter data aggregator [
For each group, tweets sent between July 1, 2012 and September 12, 2012 were identified and assessed, including the number of retweets and details of their dissemination. This was followed by the identification of individuals who retweeted the message, the determination of the number of followers for each retweeter and the computation of the retweeter network size to measure actual information flow. Looking at how many times a message was retweeted, compared to the number of times it could have been retweeted if all of an organization’s followers retweeted the message, provides a measurement of how close the actual tweet dissemination is compared to the theoretical best.
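A minimal sketch of the two quantities described above, using hypothetical data: the retweeter network size (taken here as the number of distinct followers of the accounts that retweeted) and the share of an organization's followers who actually retweeted. Counting distinct accounts rather than summing follower counts is an assumption, as are all names in the code.

```python
def retweeter_network_size(followers_of, retweeters):
    """Distinct accounts that follow at least one of the retweeters."""
    reached = set()
    for retweeter in retweeters:
        reached |= followers_of.get(retweeter, set())
    return len(reached)


def retweet_ratio(num_retweets, num_followers):
    """Actual retweets relative to the case where every follower retweets."""
    if num_followers == 0:
        return 0.0
    return num_retweets / num_followers


# Hypothetical follower map and retweeters for one tweet.
followers_of = {
    "r1": {"u1", "u2", "u3"},
    "r2": {"u3", "u4"},
    "r3": set(),
}
retweeters = ["r1", "r2", "r3"]

network_size = retweeter_network_size(followers_of, retweeters)
ratio = retweet_ratio(len(retweeters), 1000)  # 1000 followers is a made-up figure
```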
The information dissemination potential for the followers of each group was plotted using a cumulative frequency graph in
In AAFP and ACP, it emerged that over half of the followers have a strong listener network (Level 2 follower network) with at least 100 listeners. The median number of listeners for each of the followers of AAFP, ACP, and AAP are 120, 165, and 81 respectively. In contrast, the majority of followers of AMA have smaller (quieter) listener networks, with over 50% (119,560/213,122) having fewer than 50 listeners. The potential to disseminate widely exists for each of these organizations since all of the organizations have a large percentage of listeners who are themselves information brokers.
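The listener statistics reported above (the median and the share of followers above a listener threshold) could be computed along the following lines. The listener counts are invented for illustration and do not correspond to any of the four groups.

```python
from statistics import median


def share_with_at_least(listener_counts, threshold):
    """Fraction of followers whose own follower count meets the threshold."""
    if not listener_counts:
        return 0.0
    return sum(1 for n in listener_counts if n >= threshold) / len(listener_counts)


# Hypothetical listener counts, one per follower of a group.
listener_counts = [3, 20, 45, 120, 150, 400, 1000]

median_listeners = median(listener_counts)
strong_share = share_with_at_least(listener_counts, 100)
quiet_share = 1 - share_with_at_least(listener_counts, 50)
```

Evaluating `share_with_at_least` over a range of thresholds gives the points of a cumulative frequency curve of the kind used to compare the groups.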
A detailed social network visualization of the networks’ overlap is shown in
In the networks under study, there is evidence of the beginnings of a collective community, where large percentages of individuals (13%-55%) are following more than one of the professional groups. As the illustration shows, the majority of followers are specific to one of the groups. The pink nodes are following all three of the professional groups, while the green, orange, and purple nodes are following two of the professional groups. The overall common overlap across all four groups is only 471 individuals, a very small percentage of the overall networks for these groups, indicating a limited amount of community cohesion and cross-fertilization, but still allowing for efficient channels (fewer redundant messages) for information dissemination.
Dissemination potential and professional group statistics during the study period.
Professional group | Number of followers | Number following | Number of tweets | Information dissemination potential |
American Academy of Family Physicians (AAFP) (@AAFP) | 7546 | 298 | 2788 | 6,959,092 |
American College of Physicians (ACP) (@ACPinternists) | 5955 | 2023 | 2979 | 11,228,160 |
American Academy of Pediatrics (AAP) (@AmerAcadPed) | 11,768 | 132 | 1184 | 14,496,559 |
American Medical Association (AMA) (@AmerMedicalAssn) | 213,122 | 5729 | 7065 | 122,066,397 |
Information dissemination potential for each professional physicians group - American Medical Association (AMA), American Academy of Family Physicians (AAFP), American Academy of Pediatrics (AAP), and American College of Physicians (ACP).
Follower network of American Academy of Family Physicians (yellow), American College of Physicians (red), and American Academy of Pediatrics (blue). Size of group nodes based on number of followers.
When considering information diffusion potential, it is reasonable to exclude followers who have never tweeted or retweeted, since it is likely that those individuals will not retweet a message from one of the professional groups. The more tweets a group’s followers send, the higher the likelihood for larger dissemination.
If these followers and their Level 2 followers are removed from the dissemination network, the overall information dissemination potential decreases by less than 1% for all four professional groups. This indicates that the followers who do not send any tweets/retweets have a small number of followers themselves and are not essential information brokers. When all the followers who have sent only 10 or fewer tweets are removed, then the AMA professional group information dissemination potential is reduced by over 35%. The other professional groups are still impacted by less than 1%. This is an indication that the AMA followers’ network is not as active as the other three professional networks. For the other three groups, there is a stronger correlation between the number of tweets disseminated and the number of followers.
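The filtering experiment described above can be sketched as follows: drop followers at or below a tweet-count threshold, recompute the dissemination potential, and report the percentage lost. The data, the function names, and the choice to count distinct accounts are assumptions for illustration. In this sketch a removed follower's own followers remain in the total only if they are still reachable through another retained follower.

```python
def potential(followers_of, org, keep=lambda user: True):
    """Distinct accounts reached if every retained direct follower retweets."""
    direct = {u for u in followers_of.get(org, set()) if keep(u)}
    reach = set(direct)
    for user in direct:
        reach |= followers_of.get(user, set())
    reach.discard(org)
    return len(reach)


def percent_reduction(followers_of, tweet_counts, org, max_tweets_removed):
    """Percentage of potential lost when followers with few tweets are removed."""
    full = potential(followers_of, org)
    if full == 0:
        return 0.0
    active = potential(
        followers_of, org,
        keep=lambda user: tweet_counts.get(user, 0) > max_tweets_removed,
    )
    return (full - active) / full * 100


# Hypothetical follower map and lifetime tweet counts.
followers_of = {
    "org": {"a", "b", "c", "d"},
    "a": {"x", "y", "z"},
    "b": {"y"},
    "c": set(),
    "d": {"w"},
}
tweet_counts = {"a": 500, "b": 3, "c": 0, "d": 12}

loss_never_tweeted = percent_reduction(followers_of, tweet_counts, "org", 0)
loss_ten_or_fewer = percent_reduction(followers_of, tweet_counts, "org", 10)
```

A small loss at threshold 0 and a large loss at threshold 10 would correspond to the pattern reported for the AMA network.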
Number of tweets/retweets sent by followers of the four professional groups - American Medical Association (AMA), American Academy of Family Physicians (AAFP), American Academy of Pediatrics (AAP), and American College of Physicians (ACP).
In addition to the information dissemination potential, the actual retweet propagation of a sample of tweets was assessed.
The x-axis represents each tweet where the tweets are sorted by number of retweets. The y-axis represents the number of retweets. The AMA posted the largest number of tweets (164), while the AAP posted the fewest during this time period. Each organization had a number of tweets that were not retweeted by anyone. The largest number of retweets for any of these organizations during this month was 24. Given that each of these groups has thousands of followers, this level of retweeting leads to information dissemination that is far below the information dissemination potential shown in
Finally, the dissemination of a particular tweet was considered: how does the dissemination of an actual tweet compare to the theoretic best? Here, we focus on the propagation of the tweet as opposed to the content of the tweet. Are there any tweets that are disseminating to a large fraction of this medical community?
Overall, the number of retweets and the number of individuals who received the tweet are less than 0.2% of the total population dissemination potential, with the tweet from the ACP disseminating the least.
Top tweets for each professional group.
Professional group | Tweet | Number of retweets | Actual information dissemination | Fraction of information dissemination potential |
AAFP | “Ask your Doctor if medical advice from a TV commercial is right for you…” | 10 | 9558 | 0.00137 |
ACP | “Interaction between proton-pump inhibitors clopidogrel clinically unimportant…” | 7 | 489 | 0.000044 |
AAP | “Tragedy in CO – in the wake of news about another act of gun violence, how to talk with children and teens…” | 25 | 25,482 | 0.00176 |
AMA | “September is Women in Medicine Month, a time to celebrate growing number, influence of women physicians” | 45 | 200,778 | 0.00164 |
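The last column of the table can be reproduced by dividing each tweet's actual information dissemination by the group's information dissemination potential from the earlier group statistics table. The sketch below does only that arithmetic; the variable names are illustrative.

```python
# (actual information dissemination, information dissemination potential),
# both copied from the tables in this article.
top_tweets = {
    "AAFP": (9_558, 6_959_092),
    "ACP": (489, 11_228_160),
    "AAP": (25_482, 14_496_559),
    "AMA": (200_778, 122_066_397),
}

# Fraction of the theoretical potential that each top tweet actually reached.
fractions = {
    group: actual / potential
    for group, (actual, potential) in top_tweets.items()
}

# Group whose top tweet reached the smallest share of its potential.
lowest = min(fractions, key=fractions.get)

for group, fraction in fractions.items():
    print(f"{group}: {fraction:.6f} ({fraction * 100:.4f}% of potential)")
```

Every fraction falls below 0.002, that is, below 0.2% of the potential, and the ACP tweet has the smallest value.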
Number of retweets for messages sent in August 2012. [American Medical Association (AMA), American Academy of Family Physicians (AAFP), American Academy of Pediatrics (AAP), and American College of Physicians (ACP)].
At the time of our study, the AMA had the largest number of followers—and thus, information diffusion potential—and was trailed by the AAP, AAFP, and ACP, respectively. However, each of the smaller organizations had a strong network of followers among which were individuals who themselves are potentially strong information brokers. We also began to see interconnectedness among these groups as evidenced by a group of users who follow all three smaller organizations. This preliminary analysis shows possibly large information diffusion potential, yet when we analyzed actual tweets sent, the actual dissemination was well below the calculated potential.
With the growing popularity of social media and Twitter, medical organizations are urged to engage in social media and actively share information. Therefore, it is important to determine what metrics can be used to measure the effectiveness of this as a medium. This study attempted to describe the characteristics of four medical networks and analyze their theoretical information dissemination potential, their actual information dissemination, their information sharers, and their propagation and distribution of tweets.
This study has several weaknesses. First, we captured our data at one point in time. As such, it is only a snapshot of the Twitter networks described. Social media networks tend to be dynamic, with followers added and dropped from moment to moment. So, in all likelihood, these networks look different today than they did during the study period. Additionally, the overall trend on Twitter is expansion, with an increasing number of users. In fact, at the time of manuscript revision submission, each of the organizations described in this manuscript has been shown to have a much larger following. This also applies to the data captured for the individual tweets, which may have diffused beyond the study period. Second, our analysis does not allow for an investigation of the inter-activity, interactions, and engagement within the networks. As a result, it is impossible to draw conclusions about motivation for dissemination or how content drives diffusion. Third, though the total percentage of accounts we ignored in our analysis was less than 1%, we could not determine the percentage of private accounts that we did not have access to. According to Beevolve [
This work is merely the first step toward understanding the information power and potential of several medical professional groups on Twitter.
This analysis indicates that these medical groups participate in subnetworks with small world type tendencies. This structure allows for large-scale information dissemination; however, actual dissemination is well below potential for all four professional groups. This is consistent with many other groups on Twitter. Large-scale information diffusion in Twitter is driven by information brokers who have at least a moderate number of followers, some of whom are active followers. In other words, it is more valuable to a network to be well connected to a few influential information brokers than to have a large number of first degree followers. Although having a large number of followers is beneficial, small networks can still achieve a high potential distribution if they have a few information brokers who themselves are active and well connected. Developing a community that is active (in terms of retweeting) and engaged (in terms of content, mutuality, and reciprocity) is important for strong dissemination. As demonstrated in a previous study [
The content of the messages is of course of utmost importance. Even with strong channels for dissemination, tweets must be timely and engaging in order to provide the hook for followers to retweet and begin to reach the vast potential audience.
In the past few years, reports have been published about the use of Twitter for various purposes within the field of medicine: as a support tool for patients with chronic conditions [
This study is one example of the development of theoretical models of knowledge dissemination that could have practical implications in how we use this medium to empower patients, disseminate important public health messages, or promote our ideas and specialties. As researchers attempt to characterize best practices in the use of social networks for knowledge transfer and dissemination, they will need to look at the networks themselves, conduct content analysis of the messages, and assess the behavior and actions of the messengers as well as the recipients. This calls for more multi-disciplinary research, involving experts in computer science, communications, linguistics, and cultural studies, to develop and advance this field of inquiry.
AAFP: American Academy of Family Physicians
AAP: American Academy of Pediatrics
ACP: American College of Physicians
AMA: American Medical Association
SNA: social network analysis
None declared.