This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that nontraditional domains (eg, online forums and social media) offer an opportunity to supplement the view of an individual’s health.
The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed.
We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status.
Our investigation yielded several notable findings. First, we find that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions. Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50% of the time for 11 out of 34 (33%) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner.
It is possible to automatically detect personal health status mentions on Twitter in a scalable manner. These mentions correspond to the health issues of the Twitter users themselves, but also other individuals. Though this study did not investigate the veracity of such statements, we anticipate such information may be useful in supplementing traditional health-related sources for research purposes.
Traditional methods for collecting data in support of clinical research include prospectively collected surveys (eg, [
An increasing number of studies demonstrate that the data disseminated via social media platforms, such as Twitter, can inform health-related investigations. We review such studies in the following section, but we highlight that studies have shown, for instance, that such data can be mined to model aggregate trends about health (eg, detection of statistically significant adverse effects of pharmaceuticals [
The objective of our work is to develop a scalable framework for detecting mentions of personal health on a specific social media platform, namely Twitter. The system introduced in this paper is composed of several core processes. First, the system filters the Twitter stream for tweets that are likely to contain health-related information. Next, a subset of the tweets is labeled with respect to the type of information that is communicated (eg, health status of the author versus a metaphorical statement) and applied to train a classifier. While it is possible to label a large number of tweets given a substantial budget, it is unlikely that a classifier could be specialized for each specific health issue. For instance, imagine a researcher is interested in studying 10,000 distinct health issues, each of which requires at least 500 tweets to train a robust classifier. If the cost to label each tweet is $0.10, it would cost $500,000 to build the necessary corpora! Our framework demonstrates that a scalable classifier, which discovers health mentions across a broad range of health issues, can be composed by leveraging a mixture of tweets from various health issues, which could make large-scale investigations much more cost-effective. In doing so, however, our system is oriented toward high precision while maintaining reasonable recall.
There are three primary contributions of this paper:
Labeled Health Mention Corpus. We leverage Amazon Mechanical Turk to create a labeled corpus of tweets with health mentions for 34 health issues. These include certain high impact health issues investigated in the Medical Expenditure Panel Survey [
Health Mention Detection. We introduce a system to automatically detect personal health mentions in tweet streams. We show that this system is trainable with a relatively small number of labeled tweets from several health issues. Moreover, it can effectively detect personal health mentions across a range of health issues on Twitter. For instance, training on 2000 tweets associated with four health issues (cancer, depression, hypertension, and leukemia) can yield a classifier that achieves a precision of 0.77 on the aforementioned corpus of tweets of 34 health issues.
Health Mention Attribution. To demonstrate the potential of the data filtered from Twitter, we investigated how people reveal information about themselves and others. In doing so, we show that the likelihood an individual self-discloses depends on the health issue communicated. For example, personal health status is revealed more than 50% of the time for 11 of the 34 health issues. For certain health issues (eg, allergies, bronchitis, insomnia, migraines, and ulcers), people are more likely to disclose their own health status, while for other health issues (eg, Alzheimer’s, Down syndrome, leukemia, miscarriage, and Parkinson’s), people are more likely to disclose another person’s status.
As alluded to, various investigations have demonstrated that social media can be successfully leveraged to (1) enable individuals to discuss their health status, (2) influence an individual’s health behavior, and (3) support the analysis of aggregate trends around health activities.
First, a certain portion of studies have focused on the extent to which, as well as how, social media enables self-reports of health information. Hale et al [
Second, the previous investigations show that individuals publish information about themselves, but there is also a growing body of evidence to suggest that social media can influence an individual’s health behavior. In certain cases, exploitation of social media can bring about negative health behaviors. For instance, based on discussions about prescription abuse over Twitter, it was observed that social media may aggravate such problems [
Third, social media can be mined to learn and characterize aggregate trends with respect to health activities. For instance, it was shown that flu trends can be effectively extracted from Twitter using standard machine learning strategies [
Though social media can support a wide array of health-related investigations, there are a number of hurdles to making the associated methodologies scalable. As Curtis and colleagues [
Our work differs from the aforementioned studies in that we focus on personal health status disclosure on Twitter. We note that Mao et al [
To mine health-related information from social media, it is critical to develop a classifier. However, tweets are constrained in size and, thus, are composed of limited content. Consequently, it is essential to define and select discriminative features to support automated health status detection. In certain studies, tweets were enriched with features by referencing external sources, such as Wikipedia [
As an alternative, it has been shown that punctuation, emoji characters, hashtags, and the @username designation, as well as text (including n-grams of words or characters [
If we rely on a classifier to filter and analyze social media, then it is essential to obtain (or create) a labeled corpus to train the classifier. Crowdsourcing over Web-based platforms, such as Amazon Mechanical Turk (MT), has been employed to generate labeled gold standard corpora [
To formalize the problem, we define the notions of personal health status and mention: Definition 1 (Personal Health Status) is the health condition of a specific person regarding a health issue or symptom, and Definition 2 (Personal Health Mention) is a statement of personal health status in social media.
These definitions focus on the health information of the individuals who are potentially identifiable. For instance, tweets such as “my father is cancer free for ten years”, “I have to do chemo tomorrow”, and “my little cousin has leukemia” are representatives of personal health mentions. By contrast, “Local charity doing great work to help cancer patients” is not a personal health mention because the subject is a group of people as opposed to a specific person.
We treat the problem of personal health mention detection as binary classification. We say a tweet is positive if it reveals personal health status and negative otherwise. For example, two MT masters assigned positive labels to each of the first three tweets in
Given their brevity (140 characters at most), tweets often have limited context. Consequently, assigning a class label to a tweet is substantially more challenging than detecting whether a given tweet communicates the status of the author. The last three tweets in
In this paper, we study how people disclose personal health statuses on Twitter and present a scalable personal health mentions detection system for the Twitter stream. Specifically, we decompose this investigation into the following four hypotheses: H1: People discuss personal health status on Twitter; H2: Personal health status disclosure rate is health issue dependent; H3: The likelihood that people disclose their own versus other people’s personal health status is health issue dependent; and H4: Personal health status mention classifiers based on tweets of multiple health issues are more scalable than those based on a single health issue.
Examples of tweets related to health issues and the labels obtained through the Mechanical Turk (MT) survey.
Tweet | Label via MT (Master 1) | Label via MT (Master 2)
I’m suffering from schizophrenia and a little bit of insomnia. | author | author
Prayers for my dad would be appreciated. He has lymphoma. Thanks for the support everyone. | relative | relative
didn’t she have a miscarriage like 3 days ago? | someone else | someone else
you’re gonna give Viv a heart attack | metaphor | metaphor
Even after Bill Gates relentless support and millions of dollars poured into Malaria research, we are not successful. | viewpoint | viewpoint
Praying I don’t have pneumonia | worry | worry
Cheerios say she’ll never have to worry about dieting. Too bad with 2:1 sodium to cal, she’ll have to worry about high blood pressure. | metaphor | someone else
Yooo soo i walk out my apt and here this girl screaming for help. Apparently, she kneed her testicular cancer bf in the nuts repeatedly. | metaphor | someone else
memorial find. 10% of your bills went to leukemia and lymphoma research. when amber was around she brightened everyone’s day in one way. | viewpoint | someone else
Framework for personal health mention detection over Twitter. First, tweets are filtered into bins according to health issue topic. A portion of the tweets are supplied to a labeling service. The labeled data is then applied to train a classifier to detect personal health mentions.
To create a labeled corpus of health status mentions, we solicited annotators through MT. Specifically, we set up a survey for labeling a corpus on MT, the details of which are in
The positive class includes the labels of author, relative or friend, and someone else. The negative class consists of labels for metaphor, viewpoint, and worry.
For the purposes of this study, we created four types of datasets. The formalization of the design of these datasets is available in Table B-1 in
Given the difficulty in labeling tweets in practice, we generated three additional datasets to resolve label conflicts. The first is the conflict as positive (CAP) dataset, which treats tweets with conflicting labels as positive. The second is the conflict as negative (CAN) dataset, which treats tweets with conflicting labels as negative. The third is the TieBreak dataset, which uses a third MT master to break the tie. These datasets represent the best case, the worst case, and the general case in the real world and we rely upon them to assess the system’s scalability.
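The construction of the three datasets from paired annotations can be sketched as follows; the label strings and helper names here are illustrative assumptions, not the study's actual code:

```python
# Positive labels per the label hierarchy (author, relative or friend,
# someone else); everything else (metaphor, viewpoint, worry, ...) is negative.
POSITIVE = {"author", "relative or friend", "someone else"}

def to_class(label):
    """Map a fine-grained MT label onto the binary task."""
    return "positive" if label in POSITIVE else "negative"

def build_datasets(tweets):
    """tweets: list of (text, label_master1, label_master2, label_master3)."""
    cap, can, tiebreak = [], [], []
    for text, l1, l2, l3 in tweets:
        c1, c2 = to_class(l1), to_class(l2)
        if c1 == c2:  # the two MT masters agree on the binary class
            for ds in (cap, can, tiebreak):
                ds.append((text, c1))
        else:  # conflicting labels are resolved three different ways
            cap.append((text, "positive"))          # CAP: conflict as positive
            can.append((text, "negative"))          # CAN: conflict as negative
            tiebreak.append((text, to_class(l3)))   # TieBreak: third master decides
    return cap, can, tiebreak
```

Tweets on which the two masters agree appear identically in all three datasets; only the conflicting tweets differ across CAP, CAN, and TieBreak.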
Label hierarchy.
System scalability emphasizes the ability to detect mentions for many, potentially unknown, health issues communicated via social media, using the labeled tweets from a limited number of health issues.
To formalize the scenario, let
As depicted in
The ideal scalability test is to train an HOC-1 classifier for every health issue in
Overview of evaluation strategies for the personal health status mention classifier. Note, D={d1, d2, …, dn} is set of health issues, X is set of health issues selected to train classifier, and Y is set of health issues used to test classifier.
To assess the performance of the system, we rely upon the standard measures of precision and recall. In our setting, precision (P) corresponds to the proportion of tweets classified as positive that are in fact positive. Recall (R) corresponds to the fraction of real positive tweets that are classified as positive. Given the large volume of tweets and the often unbalanced positive/negative class ratio per health issue (see
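As a minimal, hand-rolled sketch (not the study's actual evaluation code), the measures can be computed as follows, with the area under the precision recall curve (AUPRC) approximated by average precision:

```python
def precision_recall(y_true, y_pred):
    """P and R for the positive (personal health mention) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(y_true, y_score):
    """AUPRC via average precision: the mean of the precision values
    observed each time a true positive is encountered while ranking
    tweets by classifier confidence."""
    ranked = sorted(zip(y_score, y_true), reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0
```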
The extent to which people tweet about themselves versus others when disclosing personal health status. Note that this is a stacked bar chart, such that the sum of the author and others proportions corresponds to the overall proportion of positive instances.
One of the aims in this research is to examine whether we can use classifiers trained with tweets from multiple health issues to detect personal health mentions about other health issues. Hence, it should be noted that the goal of our research is to examine the effectiveness of classifiers when supplied with a set of known (or off-the-shelf) features. We use a Multinomial Naïve Bayes (MNB) binary classifier based on four types of features associated with tweets. Alternatively, we can plug other learning algorithms, such as logistic regression or a support vector machine, into the framework as the base classifier. Previous investigations verified the effectiveness of such features [
Nouns, verbs, and pronouns. We transformed each word into its lemma form. Though pronouns are often defined as stop terms (which are discarded in traditional natural language processing), they are retained because they can disclose the personal health status of a friend or family member (eg, “My mom makes having cancer look good”).
Dependencies. These are grammatical relations [
Punctuation and Emoji. These can indicate an author’s emotion and may improve classification (eg, “my uncle is cancer free !!!!!! lol”).
HTTP LINK, #hashtags, and @username. These features represent the existence of a link, a hashtag, and an @username in a tweet, respectively.
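As a rough illustration of this feature space, the following sketch extracts the simpler feature types from a raw tweet. The regexes and token rules are our assumptions; lemmatization and the grammatical dependency features require an NLP toolkit (eg, a dependency parser) and are omitted here:

```python
import re

def tweet_features(text):
    feats = []
    # links, hashtags, and @usernames become marker features
    if re.search(r"https?://\S+", text):
        feats.append("HTTP_LINK")
    feats += ["HASHTAG"] * len(re.findall(r"#\w+", text))
    feats += ["AT_USERNAME"] * len(re.findall(r"@\w+", text))
    text = re.sub(r"https?://\S+|#\w+|@\w+", " ", text)
    # punctuation such as "!" is kept as its own feature
    feats += re.findall(r"[!?]", text)
    # word tokens, with pronouns deliberately retained
    feats += re.findall(r"[a-z']+", text.lower())
    return feats
```

The resulting feature lists can then be turned into per-tweet count vectors and supplied to an MNB learner.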
In our experiments, we highlight the evaluation of two important factors that can affect the scalability of a classifier: (1) the diversity of health issues in the training data, and (2) the quantity of training tweets. When we compare different classifiers, we focus on the former. When we test system scalability, we also evaluate the performance of classifiers trained with datasets of different sizes. The following provides details of the experiment design.
We use the 34 health issues depicted in
We use the cancer, depression, hypertension, and leukemia gold standard datasets to train each homogeneous classifier. There are two situations in which we can evaluate how the diversity of health issues in the training data influences the homogeneous classifiers. First, suppose that we aim to detect multiple health issues. Given a fixed number of training tweets, how does an HOC-N classifier (eg, trained with SYND) differ from a group of HOC-1 classifiers (eg, four HOC-1 classifiers)? Second, now imagine we wish to perform detection for only one health issue (eg, cancer). Given a fixed number of training tweets, how does an HOC-N classifier (eg, trained with SYND and tested on cancer) differ from the associated HOC-1 classifier (eg, the cancer HOC-1 classifier)?
To evaluate the diversity of health issues in training dataset, we compare HEC-1 with HEC-N (2 ≤ |
When assessing system scalability, we test the classifier on the CAN, CAP, and TieBreak datasets of D. This enables evaluation of the performance of the system in a real-world scenario. We also test classifiers trained with different numbers of tweets.
For each experiment, we stratify the tweets and generate 30 train-test sets. In doing so, (1) each set preserves the proportion of samples for each positive (negative) class, and (2) the data is partitioned, such that we train on 80% of the tweets while we test on the remaining 20%. To control the comparison, the size of the training set for each compared classifier is equivalent.
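One such stratified 80/20 split can be sketched as follows (a simplified illustration; the helper name and per-repetition seeding are our assumptions):

```python
import random

def stratified_split(data, train_frac=0.8, seed=0):
    """data: list of (tweet, label) pairs. Returns (train, test) with
    the class proportions preserved in both partitions."""
    rng = random.Random(seed)
    by_class = {}
    for item in data:
        by_class.setdefault(item[1], []).append(item)
    train, test = [], []
    for items in by_class.values():
        rng.shuffle(items)                         # randomize within each class
        cut = int(round(train_frac * len(items)))  # 80% of this class to train
        train += items[:cut]
        test += items[cut:]
    return train, test
```

Calling this 30 times with different seeds yields the 30 train-test sets used per experiment.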
We used the Twitter streaming API to filter for tweets between May 7, 2014 and July 23, 2014 that were (1) published in the contiguous United States according to their geolocation, and (2) written in the English language only. A total of 261,468,446 tweets were subject to a filter composed of keywords for 34 health issues, resulting in 281,357 tweets (0.11%) for further investigation.
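The keyword filter can be illustrated with a sketch like the following, where the keyword lists shown are a tiny, assumed subset of the real ones (the full lists appear in the appendix):

```python
# Illustrative subset only; the study used keyword lists for 34 health issues.
KEYWORDS = {
    "cancer": ["cancer", "chemo"],
    "depression": ["depression", "depressed"],
    "hypertension": ["hypertension", "high blood pressure"],
}

def bin_tweet(text):
    """Return every health-issue bin whose keywords appear in the tweet."""
    lowered = text.lower()
    return [issue for issue, words in KEYWORDS.items()
            if any(w in lowered for w in words)]
```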
To demonstrate the opportunities for a personal health mention detection system, we conducted an investigation to test H1, H2, and H3. We chose 100 tweets, at random, for each of the 34 health issues as shown along the x-axis of
To test hypothesis H2 (personal health status disclosure rate) and H3 (who the disclosure is about), we define the following null hypotheses: H2o: The rate of positive and negative tweets is independent of the health issues, and H3o: The rate of tweets disclosing the author’s health status and others’ health status is independent of the health issues.
To test these hypotheses, we used the TieBreak dataset, which (due to randomness) represents 100 samples from each of the 34 distributions regarding how people disclose health status. To test H2, we applied a chi-square test on two variables: the number of positive tweets and the number of negative tweets in each health issue’s sample. To test hypothesis H3, we applied a Spearman correlation test on two variables: the rate of tweets disclosing the author’s health status and the rate of tweets disclosing others’ health status. We set the alpha level of significance to .05.
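The H2 test reduces to a standard chi-square computation over a 2 × 34 contingency table of positive and negative counts per health issue. A hand-rolled sketch of the statistic follows (in practice a statistics package, eg, scipy.stats.chi2_contingency, would be used to obtain the P value as well):

```python
def chi_square(counts):
    """Chi-square statistic for a 2 x k table of (positive, negative)
    tweet counts per health issue; df = k - 1 (33 for the 34 issues)."""
    pos_total = sum(p for p, _ in counts)
    neg_total = sum(n for _, n in counts)
    grand = pos_total + neg_total
    stat = 0.0
    for p, n in counts:
        col = p + n
        exp_p = pos_total * col / grand  # expected positives under independence
        exp_n = neg_total * col / grand  # expected negatives under independence
        stat += (p - exp_p) ** 2 / exp_p + (n - exp_n) ** 2 / exp_n
    return stat
```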
The results reveal several notable pieces of evidence, which are related to the first three hypotheses posed above.
People disclose personal health status on Twitter for a range of health issues (H1). The disclosure rate for each of the 34 health issues is greater than 9%. There are 29 health issues with disclosure rates greater than 20% and 11 health issues with disclosure rates greater than 50%. The latter group includes: allergies (85/100), anemia (57/100), asthma (61/100), bronchitis (88/100), insomnia (70/100), kidney stones (67/100), migraines (83/100), miscarriages (52/100), pneumonia (68/100), thyroid problems (74/100), and ulcers (56/100); arthritis (48/100) falls just below this threshold.
Health status disclosure rate is dependent on the health issue, χ2(33)=697,
The likelihood that people disclose their own versus other people’s health status is dependent on the health issue,
We extracted the gold standard datasets for each of the four health issues mentioned in the Methods section.
The number of positive and negative tweets in the gold standard datasets.
Tweet | Cancer | Depression | Hypertension | Leukemia | SYNDa |
Positive | 166 | 261 | 211 | 436 | 1074 |
Negative | 697 | 461 | 551 | 423 | 2132 |
aSYND: synthetic health issue (D).
Before conducting an in-depth empirical investigation, we inspected the classifiers and their corresponding features to determine if they are intuitive. Here, we report on the top 10 informative features by training in a homogeneous classification setting with tweets of each of the five health issues (cancer, depression, hypertension, leukemia, and SYND).
The results show the effectiveness of feature selection in several ways. First, more than five of the top features are pronouns, such as I, my, and she (which was also confirmed in [
This table also provides several notable results about other behaviors exhibited when people disclose personal health status. For instance, people often include @someone in health mentions. They use links to provide additional information, such as pictures, locations, or text, or use exclamation marks to express strong feelings about personal health status.
The hypertension classifier was notable because specific health-related terminology ranked highly. In particular, the term blood is highly informative for this classifier. We suspect this is because hypertension is commonly referred to as high blood pressure.
The most informative features for homogeneous health mention classification.
Rank | Cancer | Depression | Hypertension | Leukemia | SYNDa |
1 | I | I | I | I | I |
2 | my | my | my | My | My |
3 | ! | have | |||
4 | you | HTTP LINK | ! | ||
5 | you | it | dobj_have_diagnosis | ! | Have |
6 | have | go | ! | She | HTTP LINK |
7 | she | poss_diagnosis_my | get | Have | She |
8 | He | ! | she | He | You |
9 | HTTP LINK | get | it | Battle | obj_have_diagnosis |
10 | obj_have_diagnosis | have | blood | Help | He |
aSYND: synthetic health issue (D).
In this experiment, we compared the effectiveness of homogeneous and heterogeneous classifiers, testing on tweets from each of the five health issues.
First, it should be noted that each homogeneous classifier outperforms the heterogeneous classifiers when testing on the corresponding health issue tweets, but such classifiers do not generalize. It can be seen that the leukemia HOC-1 classifier achieved the highest AUPRC. This may be due to the balance in the positive and negative classes for this health issue. However, it was observed that the homogeneous classifiers exhibited much higher variance compared to the heterogeneous classifiers. This suggests that heterogeneous classifiers may yield more stable results.
Second, the HEC-1 classifier may tend to obtain a better AUPRC when testing on health issues with a similar author-to-others disclosure rate. For instance, cancer achieved the best AUPRC when testing on leukemia tweets. Meanwhile, leukemia achieved the best AUPRC when testing on cancer tweets. Depression and hypertension also achieved the best AUPRC when testing on each other.
Third, it also shows that SYND heterogeneous classifier (HEC-N) was the second best heterogeneous classifier when testing on cancer, depression, and leukemia tweets, and the best heterogeneous classifier when testing on hypertension. Considering that the HEC-1 classifier is specialized to a certain health issue, the HEC-N classifier may provide a more scalable alternative when filtering for personal health mentions on other health issues.
AUPRC for homogeneous and heterogeneous classifiers.a
|
Cancer | Depression | Hypertension | Leukemia | SYND |
mean (SD) | |||||
Cancer | 0.732 (0.058) | 0.528 (0.018) b | 0.552 (0.014)b | 0.869 (0.009)b | 0.728 (0.009)b |
Depression | 0.441 (0.007)b | 0.663 (0.054) | 0.611 (0.014)b | 0.821 (0.006)b | 0.666 (0.006)b |
Hypertension | 0.451 (0.009)b | 0.646 (0.011) | 0.664 (0.062) | 0.726 (0.008)b | 0.616 (0.006)b |
Leukemia | 0.638 (0.011)b | 0.603 (0.011)b | 0.559 (0.019)e | 0.936 (0.019) | 0.579 (0.007)b |
SYNDf | 0.625 (0.022)e | 0.618 (0.026)d | 0.626 (0.019)c | 0.831 (0.023)b | 0.820 (0.018)
a AUPRC: area under the precision recall curve. Classifiers were trained with row health issue tweets and tested on column health issue tweets. Within each column, a hypothesis test was conducted between HOC-1 and each model that is not HOC-1 (eg, HOC-1 vs HEC-1).
b
c
d
e
fSYND: synthetic health issue (D).
AUPRC of homogeneous health mention classifiers, given the same number of training tweets.a
Classifier | Cancer | Depression | Hypertension | Leukemia |
mean (SD) | ||||
HOC-1b | 0.732 (0.058) | 0.663 (0.054) | 0.664 (0.063) | 0.936 (0.019) |
HOC-Nc | 0.723 (0.061) | 0.645 (0.053) | 0.672 (0.070) | 0.927 (0.022) |
HOC-N‡ | 0.756 (0.050) | 0.681 (0.050) | 0.702 (0.059)d | 0.940 (0.021) |
aAUPRC: area under the precision recall curve. Within each column, the hypothesis test was conducted between HOC-1 and each model that is not HOC-1 (eg, HOC-1 vs HOC-N).
bHOC-1: homogeneous classification with |X| = 1
cHOC-N: homogeneous classification with |X| > 1
d
In this experiment, we evaluated how homogeneous classifiers are influenced by (1) the number of health issues in the training set, and (2) the number of tweets used for training classifiers.
The hypothesis tests showed that only the difference between the HOC-1 and HOC-N‡ classifiers was statistically significant when testing on hypertension tweets (
This indicates that the HOC-N classifier can serve as a substitute for HOC-1 classifiers.
In this experiment, we evaluated how the number of health issues in the training set influences the heterogeneous classifiers.
This suggests hypothesis H4 may be true, provided the classifier is based on an appropriate mixture of health issues. However, determining an optimized group of health issues to achieve an HEC-N classifier with performance comparable to HEC-1 classifier is left to future investigation.
Based on these findings, we use HOC-N and HEC-N to conduct the system scalability test.
Comparison between the heterogeneous classifiers HEC-1 and HEC-N, trained on cancer, depression, hypertension, and leukemia, and tested on the remaining 30 health issues. The tweets of each test health issue are stratified with respect to their rate of observation.
After breaking ties, 43.7% of the TieBreak dataset are positive instances. Based on this proportion, there are approximately 120,260 positive instances out of 281,357 tweets in the health issue bins (or 0.046% of all the collected tweets).
We trained the SYND classifier with the gold standard datasets for cancer, depression, hypertension, and leukemia, and tested it on the other three types of datasets.
Class distribution of tweets in the datasets.
Tweets | Gold | CANa | CAPb | TieBreak |
Positives | 1082 | 1082 | 1718 | 1366 |
Negatives | 1539 | 2175 | 1539 | 1891 |
aCAN: conflict as negative
bCAP: conflict as positive
PR (precision recall) curves for testing on the gold, CAN (conflict as negative), and CAP (conflict as positive) datasets.
Performance of the SYND (synthetic health issue) classifier with a varying amount of training data.
There are several notable findings from this investigation. First, Twitter users disclose the health status of themselves and others. Second, the health status disclosure rate may depend on the health issue. Third, how people disclose their own and other people’s health status may also be health issue dependent. Fourth, tweets related to a small group of health issues can train a scalable classifier to detect health mentions in Twitter streams.
Another interesting phenomenon illustrated by the PR curves (
According to our investigation, roughly 44% of the tweets containing health issue keywords disclose personal health status. We believe this information has the potential to assist health care professionals in learning about their patients or their patients’ family medical history, information that is often missing from EMRs. This indicates that social media platforms, such as Twitter, contain a large amount of personal health-related information that may complement traditional EMRs in research and practice. We recognize that the veracity of such data must still be verified, but an opportunity exists nonetheless.
We wish to highlight several limitations of this investigation. First, two parameters for extracting tweets from Twitter streams require configuration: (1) the set of keywords invoked in the filter, and (2) the geolocation applied to discover tweets. Compared to keywords, geolocation can filter out tweets disseminated by authoritative organizations (due to the absence of “coordinates” and “place” information in these tweets), such as the American Cancer Society, and thus greatly reduce noise. However, it should be noted that invoking such a filter can also exclude the tweets of individuals who choose not to disclose their location. A second limitation exists in the survey provided to the MT masters for labeling the corpus. Specifically, we assumed the N/A option was a member of the negative class, but this could be an incorrect assumption in certain instances. Third, this investigation was restricted to only 34 health-related phenomena, which is clearly only a sample of all possible health issues. The keyword filter service can be enhanced by integrating a laymen health vocabulary [
Recent studies demonstrate that the information communicated through social media platforms, such as Twitter and Facebook, could supplement traditional medical and epidemiological research. In this paper, we showed that a health mention detection system can be designed and deployed for microblogging systems, such as Twitter. At the same time, we illustrated that the information communicated through such mentions can disclose the health status of the authors and other individuals at a wide range of rates. Our experimental investigation further showed that the combination of tweets from several health issues can yield a classifier that dominates a classifier based on the tweets of a single health issue. This may enable the system to use a small amount of training data to build a classifier that detects health status mentions across a range of health issues. We envision several opportunities for extending this work. First, we believe the scalability of the classifier may be improved by determining the minimal set of health issues and features (eg, more complicated grammar features). Second, we anticipate that the performance of the classifier could be improved by accounting for context, such as dialogue, relationships in the network, and profile information as new supplemental features. Finally, while the rate at which health status is disclosed for the author versus other individuals depends on the health issue considered, further investigation is required to determine what drives this disparity. We suspect, for instance, that it may be dependent on the sensitivity and severity of health issues, but this is only a conjecture.
Example question posed to Mechanical Turk masters.
Concordance between the system classifier and Mechanical Turk masters.
Summary of four datasets.
Keywords used to filter tweets.
area under the precision recall curve
conflict as positive
conflict as negative
electronic medical record
heterogeneous classification with |X| = 1
heterogeneous classification with |X| > 1
homogeneous classification with |X| = 1
homogeneous classification with |X| > 1
Kolmogorov-Smirnov Test
Multinomial Naïve Bayes
Mechanical Turk
none of the above
synthetic health issue
This research was sponsored in part by grants from the National Science Foundation (CCF-0424422) and the Patient Centered Outcomes Research Institute (CDRN-1306-04869). The authors would like to thank the members of the Mid-South Clinical Data Research Network for useful discussions during the development of this research.
None declared.