This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Low back pain (LBP) remains the leading cause of disability worldwide. A better understanding of beliefs regarding LBP and of its impact on the individual is important in order to improve outcomes. Although personal experiences of LBP have traditionally been explored through qualitative studies, social media allows access to data from a large, heterogeneous, and geographically distributed population, which is not possible using traditional qualitative or quantitative methods. Because data on social media sites are collected in an unsolicited manner, individuals are more likely to express their views and emotions freely and without constraint compared with traditional data collection methods. Thus, content analysis of social media provides a novel approach to understanding how problems such as LBP, and their impact, are perceived by those who experience them.
The objective of this study was to identify contextual variables of the LBP experience from a first-person perspective to provide insights into individuals’ beliefs and perceptions.
We analyzed 896,867 cleaned tweets about LBP between January 1, 2014, and December 31, 2018. We tested and compared latent Dirichlet allocation (LDA), Dirichlet multinomial mixture (DMM), GPU-DMM, biterm topic model, and nonnegative matrix factorization for identifying topics associated with tweets. A coherence score was determined to identify the best model. Two domain experts independently performed qualitative content analysis of the topics with the strongest coherence score and grouped them into contextual categories. The experts met and reconciled any differences and developed the final labels.
LDA outperformed all other algorithms, resulting in the highest coherence score. The best model was LDA with 60 topics, with a coherence score of 0.562. The 60 topics were grouped into 19 contextual categories. “Emotion and beliefs” had the largest proportion of total tweets (157,563/896,867, 17.6%), followed by “physical activity” (124,251/896,867, 13.85%) and “daily life” (80,730/896,867, 9%), while “food and drink,” “weather,” and “not being understood” had the smallest proportions (11,551/896,867, 1.29%; 10,109/896,867, 1.13%; and 9180/896,867, 1.02%, respectively). Of the 11 topics within “emotion and beliefs,” 113,562/157,563 (72%) had negative sentiment.
The content analysis of tweets in the area of LBP identified common themes that are consistent with findings from conventional qualitative studies but provide a more granular view of individuals’ perspectives related to LBP. This understanding has the potential to assist with developing more effective and personalized models of care to improve outcomes in those with LBP.
Low back pain (LBP) is the leading cause of disability worldwide [
Optimizing management of conditions such as LBP requires consumers to be engaged in their care. To enable this, health care providers need to have an understanding of the full context of the condition from the consumer perspective. “Contextual variables” here refer to any type of useful information about the context of an individual’s pain experience, such as physical, emotional, social, and/or occupational variables [
With the current advances in online and web technologies, social media has emerged as a new and rich source of first-person health care data [
Our study approach was to undertake content analysis of Twitter data by applying topic modeling. Content analysis is a widely used technique for qualitative research [
Twitter was used as the data source rather than other social media platforms, blog posts, or news articles because individuals use this platform for expressing and sharing their feelings and opinions on health-related topics by posting short messages that can be easily collected through application programming interfaces (APIs) or other open sources [
Our data processing and analysis consisted of 4 steps (see
Keywords used to search tweets related to low back pain.
Source | Study purpose | Keywords | Total, n |
Lee et al, 2016 [ | To quantify the risks associated with a new tweet about back pain | “painful back,” “sore back,” “back started hurting,” “buggered my back,” “hurt my back,” “I’ve got backache,” “injured my back,” “my back hurts,” “I’ve got back pain,” “pain in my back,” “put my back out,” “my back is killing me” | 12 |
Ahlwardt et al, 2014 [ | To compare self-reported toothache experiences in tweets with those of backache, earache, and headache | “backache,” “back ache,” “back aches,” “back hurt,” “back hurting,” “back hurts,” “back killin’,” “back killing,” “back pain,” “back sore” | 10 |
Campbell et al, 2013 [ | A systematic review to study the influence of employment social support in nonspecific back pain | “lumbago,” “backache,” “back ache,” “back pain,” “low back ache,” “low back pain,” “lower back pains” | 7 |
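As an illustration, keyword matching of this kind can be sketched in Python. The keyword list below is a subset of those in the table, and the function name is our own, not part of the study's code:

```python
import re

# Subset of the LBP search keywords listed in the table above.
KEYWORDS = [
    "lumbago", "backache", "back ache", "back pain",
    "my back hurts", "sore back", "back is killing me",
]

# One case-insensitive pattern with word boundaries, so that, e.g.,
# "backache" does not match inside "backaches".
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def matches_lbp_keywords(tweet: str) -> bool:
    """Return True if the tweet mentions any LBP search keyword."""
    return PATTERN.search(tweet) is not None
```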
The overall data analysis workflow. The analysis consists of four steps: (1) data preprocessing, (2) thematic analysis using topic modelling, (3) topic labeling and categorization, and (4) domain expert validation. BTM: biterm topic model; DMM: Dirichlet multinomial mixture; GPU-DMM: General Pólya Urn Dirichlet Multinomial Mixture; LDA: latent Dirichlet allocation; NMF: nonnegative matrix factorization.
We removed duplicates, retweets, URLs, and tweets related to marketing and advertisements, which reduced the data set from 7,892,210 to 2,825,645 tweets. We filtered the data further by removing tweets that did not contain first-person pronouns [
We replaced contractions with their expanded forms (eg, “didn’t” to “did not”). We converted HTML characters to ASCII characters and removed hashtags, Unicode strings (eg, “\u2026”), numbers, and punctuation. We replaced abbreviations, elongated words (eg, “gooood” to “good”), and emoticons and emojis with their equivalent English expressions. We then performed spelling correction, lowercasing, tokenization, and lemmatization; created n-grams; and removed stop words (eg, common terms such as “the” and “is”). Finally, we removed duplicates again, leaving 1,249,576 tweets.
After completing the abovementioned steps, we excluded tweets with fewer than three words because in topic modeling, document size is important for achieving high accuracy [
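A minimal Python sketch of these cleaning steps follows. The contraction table and stop word list are illustrative only, and spelling correction, lemmatization, n-gram creation, and emoji replacement are omitted:

```python
import re

# Illustrative (not exhaustive) cleaning resources.
CONTRACTIONS = {"didn't": "did not", "can't": "can not", "i'm": "i am"}
STOP_WORDS = {"the", "is", "a", "and", "to", "in", "my"}

def preprocess(tweet: str) -> list[str]:
    """Expand contractions, strip URLs, hashtags, numbers, and
    punctuation, lowercase, tokenize, and drop stop words.
    Returns [] for tweets left with fewer than three tokens,
    which are excluded from topic modeling."""
    text = tweet.lower()
    for contraction, expansion in CONTRACTIONS.items():
        text = text.replace(contraction, expansion)
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"#\w+", " ", text)          # hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # numbers, punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return tokens if len(tokens) >= 3 else []
```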
Topic modeling is a technique used to provide a summary of a large collection of documents by extracting “topics” that represent the dominant themes [
LDA is a generative probabilistic model that assumes each document can be represented by distribution over topics and each topic by distribution over words [
To use these models (except for NMF), we used a Java-based open-source library for short text topic modeling algorithms called STTM (version 1.8) [
Choosing the right number of topics is a crucial step in topic modeling because it can affect the accuracy of the results. The quantitative approach computes the coherence score and perplexity, which help determine the optimal number of topics [
As a quantitative approach, we calculated the coherence score of each model for numbers of topics ranging from 5 to 200, based on the PMI score [
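A PMI-based coherence score of this kind can be sketched as follows. This is our own simplified implementation, averaging pairwise PMI over a topic's top words using document-level co-occurrence; the exact formula used by the STTM library may differ:

```python
import math
from itertools import combinations

def pmi_coherence(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words, with probabilities
    estimated from document-level co-occurrence counts."""
    doc_sets = [set(d) for d in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        joint = prob(w1, w2)
        # Pairs that never co-occur contribute 0 in this simplified version.
        scores.append(
            math.log(joint / (prob(w1) * prob(w2) + eps)) if joint > 0 else 0.0
        )
    return sum(scores) / len(scores)
```

Higher scores indicate that a topic's top words tend to appear together in the same documents, which is what the model comparison across topic counts relies on.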
Additionally, we used a qualitative approach to select the most representative topics. We manually examined the topics, their top 20 terms, and a random sample of tweets in each topic. We also created a word cloud for each topic and evaluated the word clouds alongside their sample tweets. We identified the number of topics that provided distinct and meaningful topics; beyond this number, duplicate and overlapping topics began to appear. We combined the quantitative and qualitative approaches to select the optimal number of topics.
Topic labeling is a process of representing the meaning of a topic by assigning each topic a descriptive word or phrase [
LDA assumes that each document (tweet) is a mix of topics with different proportions [
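Assigning each tweet to its dominant topic and tallying proportions can be sketched as follows; the per-tweet topic distributions shown are hypothetical:

```python
from collections import Counter

def dominant_topic(dist):
    """Index of the highest-probability topic in one tweet's topic mix."""
    return max(range(len(dist)), key=dist.__getitem__)

# Hypothetical per-tweet topic distributions over 3 topics.
tweet_distributions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
]

counts = Counter(dominant_topic(d) for d in tweet_distributions)
proportions = {k: n / len(tweet_distributions) for k, n in counts.items()}
```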
To improve the results of thematic analysis, low-order topics can be grouped under broad, higher-order categories [
Two domain experts (FC, a rheumatologist; DU, a physiotherapist), both clinically active and engaged in research in the area of LBP, independently examined the selected topics from the previous step, in which each topic was represented by its top 20 words, to determine face validity. As previously described, in topic modeling, the top words of each topic provide its description, thereby assisting the domain experts with inferring its meaning [
The total number of collected tweets about LBP was 7,892,210 from 2,420,258 unique users from 2014 to 2018. The average number of words in each tweet increased from 2017 onward (
After performing comprehensive data preprocessing, the final number of retained tweets was 896,867, which represents 11% (896,867/7,892,210) of the original raw data we collected, with a vocabulary size of 29,539. The minimum length of tweets was 4 words and the maximum length was 20 words.
After testing the 5 topic modeling algorithms across different numbers of topics, based on the coherence score and our manual examination, we selected the best model, comprising 60 topics detected from 896,867 self-reported tweets about LBP.
The 60 topics were examined and manually given a topic label. The common and duplicate labels were then grouped into higher-order categories. Word clouds for the two categories of “pain regions” and “sleep” after combining the related topics are provided in
Independent examination of selected topics by two domain experts and reconciliation of any differences resulted in 19 contextual categories, with details presented in
The proportion of tweets for each higher-level category over the years showed that all 19 categories had been discussed by individuals with relatively similar frequency every year (see
The 19 categories and their proportions based on all tweets posted from 2014 to 2018.
The proportions of 19 categories based on the dominant topic per year.
An example of tweets for each contextual category.
Categories | Examples of tweets |
Emotion and beliefs | My back hurts, feeling sad because I wanna get up and do something ! I hate staying in bed :( |
Physical activity | I did 6 miles on my exercise bike yesterday, felt really pleased with myself, and ate healthy. My back hurts today |
Daily life | So my back hurts like hell and I can hardly sit here and do my hair. |
Symptoms | I hate it when my lower back hurts and sends shooting pains down my legs, making them ache and throb. Ugh. |
Sleep | Every time I sleep in my sis guest bedroom my back hurts, that bed is not comfortable. I’d prolly be better off sleeping on the floor |
Pain regions | today is not a good day. my back hurts, my shoulder hurts, my elbow is tingly, a little numb down to my hand and to top it off now my left knee hurts a little. |
Health care | So I have found one good physio and one good chiropracter, both same price, who would you see if you had lower back pain? |
Women | Being pregnant is literally taking everything out of me. I’m exhausted, my back is killing me and I stay moody… |
Aggravating factors | Yesterday I tried doing a back flip on my trampoline. Now, every time I walk my back hurts. When I did the back flip I landed on my head. |
Employment | Hurt my back at work yesterday and I’m working a full 12 hours tomorrow without getting paid. Lovin life right now. |
Entertainment | Watching Cirque Du Soleil: Michael Jackson my back hurts just from watching it |
Religion | Testimony Time! i want to give God the glory for healing me from a severe back pain |
Co-occurring conditions | I don’t know if my back pain is causing depression or my depression is causing back pain… |
Pharmacological therapies | I just took my very first Oxycodone for lower back pain. I think I’m in love. It didn’t just kill the pain. It assassinated it. |
Self-treatments | Coconut oil epsom salt & vapor bath oil just soothed my back pain away |
Social support | Told mom my back hurts she offered to rub my feet an back I have the best mom ever |
Food and drink | my back is killing me cant get out ov bed but need coffee |
Weather | I love cold weather but it’s really not helping with my back pain. Where is that warm summer weather attttttt. |
Not being understood | OMG no one understands the pain I'm in right now. My back is killing me. |
In this study, we identified 60 specific topics from 896,867 tweets about LBP and grouped them into 19 categories that relate to contextual variables of LBP. The top category was “emotion and beliefs,” with 157,563/896,867 tweets (17.6%), followed by “physical activity” (124,251/896,867, 13.85%) and “daily life” (80,730/896,867, 9%), while “food and drink,” “weather,” and “not being understood” had the lowest proportions of tweets (11,551/896,867, 1.29%; 10,109/896,867, 1.13%; and 9180/896,867, 1.02%, respectively). There were 11 topics within the category of “emotion and beliefs”; of 157,563 tweets in this category, 113,562 (72%) expressed negative sentiment. Our results were consistent with the general findings from traditional study methods in the area of LBP but provided more in-depth detail on the context of LBP from the individual perspective.
Our study examined contextual variables to provide a novel insight into first-person perspectives of the LBP experience and confirmed the broad areas that have previously been identified using more traditional data collection methods from qualitative and quantitative studies. For example, psychosocial factors have an important role in LBP [
Our study also highlighted areas related to the pain experience that have not been adequately explored in the literature but that play an important role in the effectiveness of LBP interventions and self-management behaviors, such as the “not being understood,” “religion,” and “food and drink” categories. Although the category of “not being understood” had the smallest proportion of tweets, with a total of 9180, its top five words were “make,” “people,” “stop,” “thing,” and “complain.” This is consistent with a previous systematic scoping review that examined what patients want from their medical care, which reported that patients felt misunderstood and wanted legitimation of their LBP [
The category of “food and drink” is novel and interesting. The tweets included words relating to the type of food (eg, pizza, chocolate, cookies and cream), mealtimes (such as breakfast and lunch), and the process of bringing or making food. Although they reflect important daily habits of eating and drinking, they may also highlight issues around pain affecting an individual’s capacity to eat and drink and/or problems associated with weight and in particular obesity [
There are well-described sex differences in the prevalence of back pain [
There are some limitations to our study. Although the keywords were taken from existing studies about LBP and approved by domain experts, some keywords, such as “back hurt” and “back pain,” were very broad. Therefore, the data collected might not have been specific to LBP. Selection of the right keywords in Twitter data analysis is very important to avoid unrelated data that could reduce the accuracy of results. Filtering and cleaning of Twitter data is also crucial for achieving high accuracy of results. In our study, we performed rigorous data cleaning, but our manual examination showed that a group of tweets contained a few lines from the lyrics of a popular hip-hop song (Bad and Boujee) by Migos, including “…So my money makin' my back ache”; one of our search keywords was “back ache.” Although there are many tools and methods available to automatically perform data cleaning, it is always necessary to manually inspect the results.
Twitter users tend to be younger and might not represent the general population; therefore, the results must be carefully interpreted [
To determine the optimal number of topics, we used the coherence score, a widely used method, and then manually examined and compared the models. This process can be further improved by using other measures such as heuristic approaches [
We also recognize that manual labeling of topics can be subjective. Two domain experts with extensive knowledge were involved in the labeling and examination of selected topics, but future work in this area could involve a larger and more diverse group of domain experts to further reduce this subjectivity.
Our findings provided useful insights into individuals’ beliefs and perspectives regarding their needs and concerns related to LBP that complement the information available in the literature. Considering the contextual factors identified in this study rather than simply focusing on a biomedical model of LBP could address the needs of patients more holistically, help with improving LBP outcomes, and increase patient satisfaction. These findings have the potential to assist health care providers and clinicians with developing more effective, personalized therapies for LBP. There is also the potential to use social media to identify any major changes in community beliefs and needs regarding LBP that can be addressed in a timelier manner.
The average number of words in tweets per year.
Coherence scores for latent Dirichlet allocation, Dirichlet multinomial mixture (DMM), General Pólya Urn Dirichlet Multinomial Mixture (GPU-DMM), biterm topic model, and nonnegative matrix factorization with the number of topics ranging from 5 to 200.
The best model selected with 60 topics and their top 20 terms.
Word clouds for the pain region and sleep categories.
Total number of tweets for each manually labelled topic.
The 19 contextual categories related to low back pain.
The total and percentage of tweets for each contextual category.
application programming interface
biterm topic model
Dirichlet multinomial mixture
General Pólya Urn Dirichlet Multinomial Mixture
low back pain
latent Dirichlet allocation
nonnegative matrix factorization
pointwise mutual information
short text topic modeling algorithm
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors. DU was supported by a National Health and Medical Research Council Career Development Fellowship (Level 2; 1142809).
PDH, FB, DU, and FC contributed to study concept and design. R contributed to data collection and topic modeling. PDH, DU, and FC contributed to topic labeling and clustering. PDH, FB, DU, and FC contributed to interpretation of data. R and PDH contributed to drafting of the initial manuscript. PDH, FB, DU, and FC contributed to critical revision of the manuscript for important intellectual content. R and PDH provided administrative, technical, or material support. All authors approved the final version of the manuscript.
None declared.