This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Self-management support can improve health and reduce health care utilization by people with long-term conditions. Online communities for people with long-term conditions have the potential to influence health and the use of health care resources, and to facilitate illness self-management. Only recently, however, has evidence been reported on how such communities function and evolve, and how they support self-management of long-term conditions in practice.
The aim of this study is to gain a better understanding of the mechanisms underlying online self-management support systems by analyzing the structure and dynamics of the networks connecting users who write posts over time.
We conducted a longitudinal network analysis of anonymized data from 2 patients’ online communities from the United Kingdom: the Asthma UK and the British Lung Foundation (BLF) communities in 2006-2016 and 2012-2016, respectively.
The number of users and activity grew steadily over time, reaching 3345 users and 32,780 posts in the Asthma UK community, and 19,837 users and 875,151 posts in the BLF community. People who wrote posts in the Asthma UK forum tended to write at an interval of 1-20 days and six months, while those in the BLF community wrote at an interval of two days. In both communities, most pairs of users could reach one another either directly or indirectly through other users. Those who wrote a disproportionally large number of posts (the superusers) represented 1% of the overall population of both Asthma UK and BLF communities and accounted for 32% and 49% of the posts, respectively. Sensitivity analysis showed that the removal of superusers would cause the communities to collapse. Thus, interactions were held together by very few superusers, who posted frequently and regularly, 65% of them at least every 1.7 days in the BLF community and 70% every 3.1 days in the Asthma UK community. Their posting activity indirectly facilitated tie formation between other users. Superusers were a constantly available resource, with a mean of 80 and 20 superusers active at any one time in the BLF and Asthma UK communities, respectively. Over time, the more active users became, the more likely they were to reply to other users’ posts rather than to write new ones, shifting from a help-seeking to a help-giving role. This might suggest that superusers were more likely to provide than to seek advice.
In this study, we uncover key structural properties related to the way users interact and sustain online health communities. Superusers’ engagement plays a fundamental sustaining role and deserves research attention. Further studies are needed to explore network determinants of the effectiveness of online engagement concerning health-related outcomes. In resource-constrained health care systems, scaling up online communities may offer a potentially accessible, wide-reaching and cost-effective intervention facilitating greater levels of self-management.
Online communities have the potential to influence health and health care. Recent studies have suggested that the participation of people with long-term conditions (LTCs) in online communities (1) improves illness self-management [
There is also evidence that self-management support interventions can reduce health service utilization [
Online communities have experienced an upsurge in popularity among people with chronic respiratory conditions such as cystic fibrosis [
This form of “user-led self-management” of LTCs bears similarities with the “expert patient” model, an approach to self-management of LTCs produced by the United Kingdom (UK) Department of Health in 2001 [
On average, one in four people with an LTC who use the Internet tries to engage online with others with similar health-related concerns [
The potential future integration of online health support systems with formal health care provision should be underpinned by a better understanding of how they are used and by evidence of their effectiveness. Indeed, as suggested by the Medical Research Council [
An expanding body of literature concerned with social network analysis has examined the structural patterns of relations among interacting actors and the social mechanisms that enable them to gain access to valuable resources [
In this study, we performed a network analysis of the structure and dynamics of two online communities of people with LTCs. We chose the Asthma UK and the British Lung Foundation (BLF) communities as exemplars of such communities because their users typically suffer from chronic respiratory conditions. In particular, while Asthma UK users typically suffer from a respiratory condition characterized by variable and recurring symptoms, BLF users represent a more heterogeneous population of participants affected by different diseases linked to chronic symptoms of breathlessness (eg, COPD, pulmonary fibrosis, cystic fibrosis, and lung cancer).
What is the network structure of online communities for people with long-term conditions, and how do they function and evolve over time?
Does posting activity follow a time pattern?
Are there (a minority of) users with a special role in maintaining integration and cohesion of the community?
Do superusers write their posts uniformly over time or do they produce peaks of activity separated by periods of inactivity?
For how long do superusers remain active in an online community?
Are superusers help-seekers or help-givers?
Do superusers preferentially write posts to each other or to users who write relatively few posts?
Is there any association between users’ interaction patterns and their potential for enhancing peer self-management support in the community?
Do online health communities function and evolve in the same way as other real-world complex systems?
We aimed to uncover and understand how these communities function and evolve, and the role that some users have in maintaining integration and cohesion (see
Data were collected by HealthUnlocked [
We looked at the number of users, the number of posts and connections per user and posting frequency. A connection (ie, a tie, link, or edge) was established from one user to another when the former replied to a post by the latter (see
As a result of the small percentage of users who wrote posts to a disproportionally high number of users, the users’ activity showed long-tailed distributions. Therefore, our analysis was based not only on means and standard deviations but also on medians.
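To illustrate why medians are reported alongside means, the following minimal sketch uses hypothetical post counts (not study data) in which a single very active user dominates the total. The variable names are illustrative assumptions.

```python
from statistics import mean, median, stdev

# Hypothetical post counts per user: most users post rarely, one posts heavily.
posts_per_user = [1, 1, 2, 2, 3, 3, 4, 5, 8, 1068]

summary = {
    "mean": mean(posts_per_user),      # pulled upward by the single heavy poster
    "sd": stdev(posts_per_user),       # also inflated by the long tail
    "median": median(posts_per_user),  # reflects the typical user
}
print(summary)
```

In this toy sample the mean exceeds 100 posts per user, whereas the median is 3, which is closer to what a typical user actually contributes.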
To uncover time patterns in posting activity, we used Fourier transforms of the time series of the users’ activity [
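As a rough illustration of how a Fourier transform can reveal periodicity in posting activity, the sketch below computes a direct discrete Fourier transform (a slower but equivalent alternative to a fast Fourier transform) on a hypothetical daily series with a weekly rhythm. The function name `dominant_period` and the synthetic data are assumptions made for illustration and are not taken from the study.

```python
import cmath
import math

def dominant_period(series):
    """Return the period (in samples) of the strongest non-zero frequency."""
    n = len(series)
    avg = sum(series) / n
    centered = [x - avg for x in series]  # remove the zero-frequency component
    best_k, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k

# Hypothetical daily post counts with a weekly rhythm over 8 weeks.
daily_posts = [5 + 3 * math.cos(2 * math.pi * day / 7) for day in range(56)]
print(dominant_period(daily_posts))
```

For real posting data, several peaks of comparable power may appear, in which case the full spectrum rather than a single dominant period would need to be inspected.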
The “rich-club” coefficient is a metric designed to measure the extent to which well-connected users tend to connect with one another to a higher degree than expected by chance [
In our study, superusers were defined according to their cumulative activity over the entire observation period. In total, we identified 400 superusers. To uncover how many superusers were active within each week, we detected how many unique users, among the 400 identified over the entire period, were active within that time window.
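The two steps described above (ranking users by cumulative activity, then counting how many of the selected users were active in each week) could be sketched as follows. This is a minimal illustration under stated assumptions: posts are represented as hypothetical `(user, date)` pairs, the top fraction is taken among users who posted at least once, and the function names are invented for this example.

```python
from collections import Counter, defaultdict
from datetime import date
import math

def top_posters(posts, fraction=0.01):
    """Return the top `fraction` of posting users ranked by total post count."""
    counts = Counter(user for user, _ in posts)
    n_top = max(1, math.ceil(fraction * len(counts)))
    return {user for user, _ in counts.most_common(n_top)}

def active_per_week(posts, superusers, start):
    """Map week index -> number of distinct superusers who posted that week."""
    weekly = defaultdict(set)
    for user, day in posts:
        if user in superusers:
            weekly[(day - start).days // 7].add(user)
    return {week: len(users) for week, users in sorted(weekly.items())}

# Hypothetical posts; a large fraction is used only because the sample is tiny.
start = date(2016, 1, 1)
posts = [
    ("ann", date(2016, 1, 1)), ("ann", date(2016, 1, 2)), ("ann", date(2016, 1, 9)),
    ("bob", date(2016, 1, 3)), ("bob", date(2016, 1, 16)),
    ("cat", date(2016, 1, 4)), ("dan", date(2016, 1, 10)),
]
superusers = top_posters(posts, fraction=0.5)
weekly_active = active_per_week(posts, superusers, start)
print(superusers, weekly_active)
```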
Following Zhang et al [
Degree: the number of connections a user has established with other users through posts
Ego(-centred) network: the subset of connections linking a focal user—“ego”—directly to other users—“alters”—and connections linking these alters with each other
Largest component: the network component (see below) with the largest number of members
Network component: a subset of the network in which all members are directly or indirectly connected with one another (ie, all pairs of nodes in the subset are reachable through at least one tie) [
Node: individual user in an online community
Rich-club coefficient: the degree to which highly connected users preferentially connect to each other to a higher degree than would be expected by chance. In a community with a rich-club coefficient higher than 1, users who post to many others preferentially communicate with each other, thus forming rich clubs. Conversely, in a community with a rich-club coefficient lower than 1, users who post to many others preferentially communicate with those who post to few others, thus generating an anti-rich-club behavior [
Root post: the initial post in a thread of posts
Superusers: top 1% of users characterized by the largest number of posts written in the community over the entire observation period [
Tie, link, edge: online connection from a user to another, created when the former writes a post to the latter
Triad: a group of 3 users—nodes
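To make the definitions of tie, degree, and network component concrete, the following sketch builds a small network from hypothetical reply records and extracts its components. It assumes that ties are treated as undirected when reachability is assessed, which is a simplification made here for illustration; the function names and data are not taken from the study.

```python
from collections import defaultdict, deque

def build_network(replies):
    """Undirected adjacency sets from (replier, original_poster) pairs."""
    neighbors = defaultdict(set)
    for replier, poster in replies:
        if replier != poster:  # ignore self-replies
            neighbors[replier].add(poster)
            neighbors[poster].add(replier)
    return neighbors

def components(neighbors):
    """Connected components found by breadth-first search, largest first."""
    seen, found = set(), []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), {start}
        while queue:
            node = queue.popleft()
            for other in neighbors[node] - seen:
                seen.add(other)
                members.add(other)
                queue.append(other)
        found.append(members)
    return sorted(found, key=len, reverse=True)

# Hypothetical replies: each pair means "first user replied to second user".
replies = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "eve")]
net = build_network(replies)
degree = {user: len(others) for user, others in net.items()}
largest = components(net)[0]
print(degree, largest)
```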
The
To obtain
Before the study began, the research protocol was examined, and permission to conduct the research was obtained from the Asthma UK and BLF charities and from HealthUnlocked. The study was also examined by the institutional Research Ethics Board at Queen Mary University of London and was exempt from full review.
The data sets span, respectively, 10 years for the Asthma UK and 4 years for the BLF communities (see
Despite the shorter time span, as a result of the larger number of users, the number of posts in the BLF community was higher than in Asthma UK, namely 875,151 compared to 32,780 respectively. Moreover, BLF users wrote a higher number of posts per user and were connected with a higher number of other users when compared with people in the Asthma UK forum (see
The number of official moderators among the highly active users was negligible; there were no moderators in the top 5% of contributors to BLF and only 2 in the top 5% for Asthma UK. Thus, our network analysis predominantly reflects content originating from registered users.
When classified according to posting activity (ie, number of posts written to the forum), the top 5% of users contributed a substantial proportion of all posts: 58% and 79% in the Asthma UK and BLF communities, respectively. Superusers were those who made a high number of connections with other users in both Asthma UK and BLF communities (see nodes of large size in
Description of the Asthma UK and British Lung Foundation data sets.
Variables | Asthma UK | British Lung Foundation |
Data set time span (dd/mm/yyyy) | 02/03/2006-06/09/2016 | 13/04/2012-06/09/2016 |
Total time (weeks) | 548 | 230 |
Total number of posts, n | 32,780 | 875,151 |
Number of posts with reply, n (%) | 28,615 (87.3) | 815,184 (93.1) |
Number of posts with no reply, n (%) | 4165 (12.7) | 59,967 (6.9) |
Total number of users, n | 3345 | 19,837 |
Users who wrote ≥1 post, n (%) | 1053 (31.5) | 7814 (39.4) |
Users who wrote 1 post, n (%) | 331 (31.4) | 1186 (15.2) |
Users who wrote >1 post, n (%) | 722 (68.6) | 6628 (84.8) |
Registered users who never posted (ie, lurkers), n (%) | 2292 (68.5) | 12,023 (60.6) |
Number of posts per user, mean (SD) | 14.2 (55.0) | 66.9 (75.1) |
Number of posts per user who wrote >1 post, median (range) | 5.1 (2-1068) | 8.0 (2-8947) |
Number of posts per user who wrote >1 post, mean (SD) | 20.4 (65.6) | 88.1 (458.6) |
Posts contributed by top 1% superusers, n (%) | 10,457 (31.9) | 426,198 (48.7) |
Number of connections per user, mean (SD) | 2.1 (5.9) | 17.6 (69.0) |
Number of connections per user, median (SD) | 1.0 (5.9) | 1.0 (69.0) |
Number of connections per top 1% superuser, mean (SD) | 10.5 (16.5) | 141.0 (174.0) |
Number of connections per top 1% superuser, median (SD) | 7.0 (16.5) | 70.0 (174.0) |
Cumulative networks across the time span analyzed. Each node represents a user. (A) Asthma UK users (around 1000); (B) British Lung Foundation users (around 8000). The coloring of nodes is based on modularity membership and the size of the node is proportional to its degree (ie, the number of connections with other users).
Cumulative distributions of the number of posts as a function of time (weeks) within the Asthma UK (A) and the British Lung Foundation (B) communities. Calendar dates are reported below week numbers. Panels C and D illustrate the average number of posts per user per week within Asthma UK and British Lung Foundation, respectively.
The cumulative number of messages posted grew uniformly over time in the BLF community. By contrast, in 2015, the Asthma UK forum witnessed a substantial increase in posting activity, at a time coinciding with its move to the HealthUnlocked platform (see
The number of posts per user per week oscillated around a decreasing and an increasing trend (
As more users joined the communities and connected to one another through online posts, distinct groups of connected users started to emerge. These groups, called network components (see
Superusers represented a small minority (ie, 1%-5%) within both communities but were responsible for a high proportion of the posting activity and the functioning of the communities.
Sensitivity analysis showed that the removal of users with the largest number of connections caused the largest component to collapse (see
Periodicity of posting activity in Asthma UK (A) and the British Lung Foundation (B), measured through the Fast Fourier Transform (FFT). The component frequencies are denoted by
Fraction of users that are part of the largest component as a function of time (weeks) for Asthma UK (A) and the British Lung Foundation (B).
Sensitivity analysis: targeted removal of nodes (users) starting from the most connected ones within Asthma UK (A) and the British Lung Foundation (B).
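A targeted-removal analysis of this kind could be sketched as follows. The example assumes that nodes are removed in decreasing order of their initial degree (degrees are not recomputed after each removal) and that the outcome is the size of the largest component as a fraction of the original number of nodes; both choices, the function names, and the toy network are assumptions made for illustration.

```python
def largest_component_size(neighbors, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen, best = set(removed), 0
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        best = max(best, size)
    return best

def targeted_removal(neighbors):
    """Remove nodes from highest to lowest initial degree and track the
    largest component as a fraction of the original number of nodes."""
    order = sorted(neighbors, key=lambda n: len(neighbors[n]), reverse=True)
    removed, curve = set(), []
    for node in order:
        removed.add(node)
        curve.append(largest_component_size(neighbors, removed) / len(neighbors))
    return order, curve

# Hypothetical network: one hub ("sup") connected to four peripheral users.
network = {
    "sup": {"a", "b", "c", "d"},
    "a": {"sup", "b"},
    "b": {"sup", "a"},
    "c": {"sup"},
    "d": {"sup"},
}
order, curve = targeted_removal(network)
print(order, curve)
```

In this toy network, removing the single hub reduces the largest component from all five users to two, which mirrors on a small scale the fragmentation described above.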
Both Asthma UK and BLF communities were characterized by a low rich-club coefficient, which was consistently lower than 1 (see
Anti-rich-club behavior may suggest competition between superusers or merely the organization of the communities into groups of users characterized by different degrees of “expertise” or commitment: one group including the few committed experts and another including the vast majority of those seeking information when needed. It would, therefore, come as no surprise if the former were to communicate with the latter to a greater extent than randomly expected. We shall investigate this hypothesis further below.
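For readers unfamiliar with the metric, a normalized rich-club coefficient could be computed along the following lines. This is a sketch under stated assumptions rather than the procedure used in the study: the network is treated as simple and undirected, the chance baseline is obtained from degree-preserving double-edge swaps, and the function names, number of swaps, and number of randomizations are illustrative choices.

```python
import random

def rich_club(edges, k):
    """Unnormalized coefficient: edge density among nodes with degree > k."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    rich = {node for node, d in degree.items() if d > k}
    if len(rich) < 2:
        return 0.0
    links = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * links / (len(rich) * (len(rich) - 1))

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization by attempted double-edge swaps."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:  # randomize the orientation of the second edge
            c, d = d, c
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        # skip swaps that would create self-loops or duplicate edges
        if len(new1) < 2 or len(new2) < 2 or new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def normalized_rich_club(edges, k, n_random=100, seed=0):
    """Observed coefficient divided by its mean over randomized networks."""
    rng = random.Random(seed)
    baseline = [rich_club(rewire(edges, 10 * len(edges), rng), k)
                for _ in range(n_random)]
    expected = sum(baseline) / n_random
    return rich_club(edges, k) / expected if expected else float("nan")

# Hypothetical network: two hubs that each reply to three peripheral users
# but never to each other.
edges = [("h1", "a"), ("h1", "b"), ("h1", "c"),
         ("h2", "d"), ("h2", "e"), ("h2", "f")]
print(rich_club(edges, 1), normalized_rich_club(edges, 1))
```

A normalized value below 1 for a given degree threshold would indicate that well-connected users are linked to each other less often than in randomized networks with the same degrees.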
We have shown that the connectedness of both communities depends crucially on the presence and activities of superusers, who committed a significant amount of their time to writing posts and targeting new users. We now look at whether their activity was concentrated in relatively short periods of time or was instead uniformly distributed over time. How superusers’ involvement is distributed over time may have fundamental implications for the cohesion of the whole system precisely in light of the role these users play.
Rich-club coefficient as a function of the richness parameter (ie, users’ degree).
Number of unique users among the top 400 superusers as a function of time (weeks) within Asthma UK (A) and the British Lung Foundation (B).
We then investigated whether superusers’ posting activity was frequent and regular over time. To this end, for each of the top 5% of users by post contribution, calculated cumulatively over the entire observation period, we measured the time interval separating each pair of consecutive posts. We then computed the inter-event time distributions for both communities to assess the frequency and patterns of activity.
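The interval computation described above could be sketched as follows for a single user. The timestamps are hypothetical, the function names are illustrative, and the 1.7-day threshold is used only because it appears in the reported results.

```python
from datetime import datetime

def inter_post_days(timestamps):
    """Gaps, in days, between consecutive posts by one user."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds() / 86400
            for earlier, later in zip(ordered, ordered[1:])]

def empirical_cdf(values, threshold):
    """Fraction of values less than or equal to `threshold`."""
    return sum(v <= threshold for v in values) / len(values)

# Hypothetical posting times for one user.
stamps = [
    datetime(2016, 1, 1, 0, 0),
    datetime(2016, 1, 1, 12, 0),
    datetime(2016, 1, 3, 12, 0),
    datetime(2016, 1, 4, 12, 0),
    datetime(2016, 1, 10, 12, 0),
]
gaps = inter_post_days(stamps)
share_within_1_7_days = empirical_cdf(gaps, 1.7)
print(gaps, share_within_1_7_days)
```

Pooling such gaps across users, or summarizing them per user before pooling, yields different distributions; which of the two was used is not specified here, so the sketch is limited to a single user.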
For each user, a
Thus, superusers not only play a topologically important role in the communities, but they are also likely to provide the expertise needed to answer queries.
Next, we examine whether the ego networks of different types of users were topologically different, and what generated such differences. Users commonly started a discussion thread by writing a root post (ie, the post at level 1 of the thread). Several users could then directly respond to these posts at level 1, thus creating level-2 posts. More generally, according to the design of the communities, by posting a response to a level–(t) post, users created a level–(t+1) post. There was no limitation to how a post thread could evolve, and therefore to the complexity of the thread hierarchy. Information on post levels was made available through the post metadata. In our analysis, any post at level 2 or higher was classified as a level–2+ post. Here the analysis was restricted to the BLF forum, as the Asthma UK community was significantly smaller with simpler hierarchical levels.
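Given post levels from the metadata, the balance between root posts and level-2+ posts can be summarized per user. The following minimal sketch assumes that each post is available as a hypothetical `(author, level)` pair; the function name and data are illustrative.

```python
from collections import defaultdict

def reply_share(posts):
    """Per author, the fraction of posts that are replies (level 2 or higher)."""
    totals, replies = defaultdict(int), defaultdict(int)
    for author, level in posts:
        totals[author] += 1
        if level >= 2:
            replies[author] += 1
    return {author: replies[author] / totals[author] for author in totals}

# Hypothetical posts: level 1 is a root post, level 2 or higher is a reply.
posts = [
    ("newcomer", 1), ("newcomer", 1), ("newcomer", 2),
    ("veteran", 2), ("veteran", 3), ("veteran", 2), ("veteran", 1),
]
shares = reply_share(posts)
print(shares)
```

A share close to 1 would correspond to a predominantly help-giving pattern, and a share close to 0 to a predominantly help-seeking one.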
Cumulative distribution function (CDF) of the interposting time for the top 5% of users by post contribution within the Asthma UK (A) and the British Lung Foundation (B) communities.
By replying to other users’ posts, superusers contributed significantly to level 2 or above.
When root posters responded to the posts received, they created a more cohesive network structure. Most of these highly active users were superusers. This suggests that superusers, by posting “help-giving” posts, enabled other users to talk to each other, thus facilitating the formation of ties between them.
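Cohesion in an ego network can be quantified by counting closed triads. The sketch below assumes that a closed triad is a triangle formed by the ego and two of its alters that are themselves connected, and that ties are treated as undirected; the function name and the toy network are illustrative.

```python
from itertools import combinations

def closed_triads(neighbors, ego):
    """Count pairs of the ego's alters that are themselves connected."""
    alters = neighbors[ego]
    return sum(1 for a, b in combinations(sorted(alters), 2)
               if b in neighbors[a])

# Hypothetical ego network: alters a-b and b-c are connected, a-c are not.
network = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b"},
    "b": {"ego", "a", "c"},
    "c": {"ego", "b"},
}
n_closed = closed_triads(network, "ego")
print(n_closed)
```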
Topology of two illustrative ego networks created by a user with low (A) and high (B) posting activity in the British Lung Foundation community. Panel C shows the number of closed triads in ego networks as a function of posting activity of superusers (top 5% of users by post contribution).
In this study, we applied network analysis to two online communities for patients with chronic respiratory conditions to shed light on potential structural mechanisms underlying the role of these communities as scalable, peer-to-peer self-management support intervention systems. We found that the number of users and posts increased steadily over the years in the period of analysis. The majority of users were mutually reachable, either directly or indirectly, and formed a large connected component, which underlies the strength of the network as a means for widespread diffusion of information.
Superusers played a central role in these communities as a result of the characteristics of their posting activity and their constant online engagement. They preferentially replied to posts from peripheral users who were not equally well connected. In doing so, they additionally facilitated tie formation between users. Sensitivity analysis showed that gradual removal of superusers induced the network to collapse. Thus, superusers were responsible for holding the network together and, in particular, for ensuring the emergence of a large connected component. As a result, without superusers, there would be no effective spread of information within the community. Superusers acted as a continuously available resource over time. As users became more active within the community, they became more likely to reply to posts than to ask questions. This suggests that superusers gradually became “experts” providing others with advice and support, which is in agreement with what has recently been suggested by other qualitative studies [
Based on social network analysis, this work has started elucidating crucial mechanisms underlying the potential of online health communities to promote effective self-management support interventions, in particular regarding the role of superusers in sustaining and providing integration and cohesion to the network. By analyzing the communities over more than five years, we have shown that superusers are a resource naturally present, able to sustain a network and make it thrive over time. This could prompt future studies to understand their role as a potential scalable health care workforce [
Limitations of this study include the lack of demographic and clinical information of participants as well as verification and validation of the information shared online [
We did not investigate the reasons for the oscillating number of posts per user per week in the 2 communities, the time patterns of posting activity, or the higher and more regular number of posts by BLF users compared with Asthma UK users. Time patterns of posting activity may reflect the nature of symptoms of the underlying lung conditions (see
More research is also needed to explore the mechanisms sustaining the effectiveness of health online communities and online engagement [
Finally, 90% of people accessing patients’ online communities are passive readers who do not engage in online discussions [
Previous studies on medical online communities agree that users can benefit from the emotional support as well as the cumulative experiential information provided by others [
A qualitative study that was performed on a forum of people with stroke has shown that up to 95% of users’ intents for writing posts were met by replies [
This is in qualitative agreement with recent work on an online community for people with stroke, where superusers were shown to play an essential role in nurturing the ability of the forum to provide feedback and identify inappropriate information and health behaviors in the context of secondary prevention medications [
Finally, superusers’ engagement with the online community and their daily commitment raise questions about what motivates their behavior. Recent work has suggested that their behavior can be motivated by perceived improvements in sense of well-being [
As a result of the voluntary basis of users' contributions, self-management support through online health communities offers high potential for cost-effectiveness from the perspective of formal services. Current health care challenges [
This work has drawn on social network analysis to uncover fundamental mechanisms underlying the potential of online communities to promote effective self-management support interventions. In particular, our study contributes to a better understanding of the role played by superusers in sustaining and providing integration and cohesion to the network. By analyzing the communities over more than five years, we have shown that superusers can sustain and make the network thrive over time. The presence of both a large connected component and superusers is a crucial feature of successful health communities. It is well known that components are critical for information diffusion [
Moreover, our study has uncovered temporal patterns of posting activity. This will prompt further research aimed at investigating differences in these patterns across communities using qualitative analysis. This would include the analysis of whether users’ intents were met by replies [
Across a variety of empirical domains, it has been documented that hubs (ie, nodes with a disproportionally large number of connections) are valuable resources that help spread information widely and amplify information cascades [
This study shows that patients’ online communities share the same network features as other complex networks across a variety of empirical domains. Our analysis highlighted the special role played by superusers, their topological positions and behavior in the communities. In this sense, our results shed light on the topological mechanisms underlying the ability of patients’ online communities to provide self-management support and may, therefore, suggest levers for improving the quality of health care intervention.
At a time when health care services are working beyond capacity and patients are finding it difficult to access care, online communities provide the potential for addressing critical health care challenges. They offer a feasible way for patients with LTCs to find helpful advice and support, and a potentially cost-effective and scalable solution to the vast and rising costs associated with long-term disease management. Although our results showed no scarcity of superusers throughout the study period, ensuring that such networks become a core component of illness self-management on a broader scale requires proper research investment leading to randomized controlled studies and potentially a change in the concept of the health care team.
Asthma UK cumulative activity over the analysis time frame.
British Lung Foundation cumulative activity over the analysis time frame.
BLF: British Lung Foundation
COPD: chronic obstructive pulmonary disease
LTC: long-term condition
We would like to thank Asthma UK and British Lung Foundation for granting the permission to conduct the study. This study was funded by a Queen Mary University of London Life Science Initiative grant (supported by the Wellcome Trust Institutional Strategic Support Fund). ADS is funded by a National Institute for Health Research Academic Clinical Lectureship.
ADS conceived the study, contributed to the data analysis and interpretation, and wrote the manuscript together with PP and SJ. SJ conducted the social network analysis under the guidance of PP and NS. PP, NC, SJCT, AP, AS, RD, AA, and MJE are coinvestigators on the study and contributed to the interpretation of findings. CJG contributed to the design of the analysis and interpretation of findings. All authors commented on and agreed to the final draft of the submitted manuscript.
The views expressed are those of the author(s) and not necessarily those of the National Health Service, the National Institute for Health Research or the Department of Health. The funder had no role in study design, data collection, data analysis, data interpretation, the writing of the manuscript, or the decision to submit the manuscript for publication. MJE is the cofounder and chief medical officer of HealthUnlocked, and AA is a research officer at HealthUnlocked.