Published in Vol 23, No 10 (2021): October

Prediction of Readmission in Geriatric Patients From Clinical Notes: Retrospective Text Mining Study


Original Paper

1 Nanyang Business School, Nanyang Technological University, Singapore, Singapore

2 City University of Hong Kong, Hong Kong, Hong Kong

3 School of Business, Singapore University of Social Sciences, Singapore, Singapore

4 Tan Tock Seng Hospital, Singapore, Singapore

5 Geriatric Education and Research Institute, Singapore, Singapore

6 Ng Teng Fong General Hospital, Singapore, Singapore

7 Medical Informatics, National University Health System, Singapore, Singapore

Corresponding Author:

Kim Huat Goh, PhD

Nanyang Business School

Nanyang Technological University

S3-B2A-34, 50 Nanyang Avenue

Singapore, 639798


Phone: 65 67904808


Background: Prior literature suggests that psychosocial factors adversely impact health and health care utilization outcomes. However, psychosocial factors are typically not captured by the structured data in electronic medical records (EMRs) but are rather recorded as free text in different types of clinical notes.

Objective: We propose a text-mining approach to analyze EMRs to identify older adults with key psychosocial factors that predict adverse health care utilization outcomes, measured by 30-day readmission. The psychosocial factors were appended to the LACE (Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use) Index for Readmission to improve the prediction of readmission risk.

Methods: We performed a retrospective analysis using EMR notes of 43,216 hospitalization encounters in a hospital from January 1, 2017 to February 28, 2019. The mean age of the cohort was 67.51 years (SD 15.87), the mean length of stay was 5.57 days (SD 10.41), and the proportion of encounters with an intensive care unit stay was 5% (SD 22%). We employed text-mining techniques to extract psychosocial topics that are representative of these patients and tested the utility of these topics in predicting 30-day hospital readmission beyond the predictive value of the LACE Index for Readmission.

Results: The added text-mined factors improved the area under the receiver operating characteristic curve of the readmission prediction by 8.46% for geriatric patients, 6.99% for the general hospital population, and 6.64% for frequent admitters. Medical social workers and case managers captured more of the psychosocial text topics than physicians.

Conclusions: The results of this study demonstrate the feasibility of extracting psychosocial factors from EMR clinical notes and the value of these notes in improving readmission risk prediction. Psychosocial profiles of patients can be curated and quantified by text mining clinical notes, and these profiles can be successfully applied to artificial intelligence models to improve readmission risk prediction.

J Med Internet Res 2021;23(10):e26486

Introduction




Hospital readmission of older adults is a significant challenge for the individual, caregivers, and health system. For individuals, readmissions can be distressing, may compromise quality of care, and increase the risk of adverse health outcomes. For caregivers, readmission is often burdensome and increases their health care spending. As for health systems, readmissions often cause resource demands and financial costs to escalate [1]. The 30-day readmission rate among patients aged 65 years or older in Singapore has been reported to be 19% [2], which is comparable to the readmission rate of Medicare patients in the United States, most of whom are older adults [3]. Significant risk factors for hospital readmission in adults aged 65 years and older include (a) sociodemographic factors such as higher age, male gender, ethnicity, and poor living conditions; (b) health-related factors such as poor overall condition, comorbidity, functional disability, and recent hospital admissions; and (c) organizational factors such as prolonged length of stay in the index hospitalization and discharge destination [4,5]. These risk factors have been used extensively in predictive models for hospital readmission by health service researchers worldwide [6-10]. Recently, other readmission predictors such as those in the psychosocial domain have begun to receive more attention.

Psychosocial factors can be defined as “the combination and interplay of psychological and social factors that potentially influence health, injury, illness, and disease” [11]. However, a review of the medical literature suggests that different medical specialties have slightly different definitions of psychosocial factors [11-17]. Based on the various factors identified in earlier studies, we observed that psychosocial factors can be divided into three relevant dimensions: (1) individual psychological well-being, (2) social structures, and (3) resources. Individual psychological well-being factors include psychological conditions such as mood [11,18], attitude [11,19], coping mechanisms [11,17], depression [15,16,20], perceived control [13,19], and psychological distress [16,17,21]. Social structures represent the conditions of the environment in which the individual lives, including support structures [11,14,16,17], social relationships [14,18], social norms [19], and family life [22]. Finally, resources represent the means available to the individual, such as financial means, accessibility to health care [13,14], and the health service system [19].

Prior research has shown that these factors—depressive symptoms [23], poor social support, and financial stress—contribute to hospital readmission for specific patient subgroups such as those with chronic obstructive lung disease, chronic kidney disease, and heart failure [24-26]. In general, psychosocial factors could play a significant role in the hospital readmission of older adults and account for a significant proportion of the readmission risk. At the same time, psychosocial factors are indicators of a patient’s complex needs that are amenable to tailored care interventions. Such interventions can improve the patient’s clinical outcomes and reduce the utilization of health care resources.

Literature Review

There are two conceptual models in the extant literature that link psychosocial factors to hospital readmissions for older adults. The first is Andersen’s [27] Behavioral Model of Health Services Use, which posits an individual’s use of health services as a function of predisposing, enabling, and need factors. Psychosocial factors (ie, individual-level and structural-level variables) can be categorized as the model’s predisposing and enabling factors, respectively. The other is Adler and Stewart’s [28] Pathways Linking Socioeconomic Status and Health model, which suggests that environmental resources and constraints, as well as psychological influences, are mechanisms that lead to health outcomes such as hospital readmission. Individual-level and structural-level psychosocial factors map to the model’s psychological and environmental variables, respectively.

In contrast to the numerous clinically related risk factors that are stored as structured data in electronic medical records (EMRs), most psychosocial factors are recorded as free text in the patient’s clinical notes such as the initial and progress clinical notes of physicians, allied health professionals, case managers, and social workers. Such unstructured textual data in the EMR represent a potentially rich and untapped source of data related to patients’ psychosocial factors. The manual extraction of psychosocial keywords from unstructured data is challenging and impractical given the copious and ever-increasing amount of clinical notes recorded in a typical EMR system. As such, there have been systematic efforts by clinicians to capture social and behavioral data, including psychosocial information, as structured data in EMR systems [29]. However, the effectiveness of these efforts in different health care contexts remains unclear. At the same time, other researchers have begun to apply text-mining techniques to efficiently extract and analyze unstructured text data in EMR clinical notes to identify these psychosocial factors.

Text-mining techniques represent a broad range of approaches for analyzing and processing semistructured and unstructured text data to construct structured data. By using powerful algorithms applied to large textual documents such as those typically found in EMR systems, text mining can “turn text into numbers” to be used for further analysis. Topic modeling, which is a specific domain in text mining that examines individual words to identify common topics and concepts, holds significant promise for extracting psychosocial factors from EMR clinical notes.

To date, only a few text-mining studies have set out to identify individuals with psychosocial factors using EMR data [30]. As such, we have limited evidence on the effectiveness of extracting psychosocial information from EMRs for the purpose of secondary health care research or routine clinical care.


This study proposes a text-mining approach to identify older adults with key psychosocial factors obtained from clinical notes to help predict adverse health and health care utilization outcomes. To validate the efficacy of including psychosocial factors in the predictive model, we append these factors to the commonly used LACE (Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use) Index for Readmission [31] and evaluate whether they improve readmission risk prediction accuracy on an independent, hold-out sample of patients.

Methods


The study was a retrospective analysis of EMR data captured by the EPIC system over a 26-month period from January 1, 2017 to February 28, 2019. Ethical approval was provided by the Domain-Specific Review Board of the National Healthcare Group, Singapore (2018/01072).

Settings and Data Context

The sample consists of 9393 patients with 43,216 admission encounters over a 26-month period from all wards in Ng Teng Fong General Hospital, Singapore. Each clinical record was classified by the role of its author: physician, medical social worker, or case manager. Medical social workers and case managers are assigned to some patients who may require additional social support upon hospital admission. The dataset consists of two cohorts of patients. The first cohort includes 892 patients (3282 admission encounters) identified by the hospital as frequently readmitted patients (“frequent admitters” cohort). This cohort consists of patients who (1) are frequently admitted to the hospital despite having their acute medical needs met, (2) have medical conditions that require multidisciplinary care, (3) show signs of caregiver stress, (4) experience frequent falls (more than two falls in the last 12 months) and require functional management at home, or (5) face medication management issues (eg, noncompliance with their medication regimen). The second cohort consists of 9377 patients (39,934 admission encounters) randomly selected from those admitted to the hospital’s inpatient wards during the sampling period (“standard” cohort). The purpose of including the “frequent admitters” cohort was to oversample frequent readmission cases and thereby facilitate the training of the text-mining algorithm to extract psychosocial topics often associated with readmission risk. This type of oversampling is commonly applied in health care research to train machine-learning algorithms [32,33].

The combined cohorts were randomly split into a training dataset and a hold-out/test dataset such that both datasets had similar distributions of patients from the “frequent admitters” and “standard” cohorts. The training dataset comprised 30,252 admission encounters and the hold-out/test dataset comprised 12,964 admission encounters. The unit of analysis is the admission encounter.
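As an illustration of the split described above, the following minimal Python sketch performs a random split stratified by cohort, so that the training and test parts keep the same share of “frequent admitters” and “standard” encounters. The function name, the record fields, and the 30% test fraction are assumptions made for illustration (12,964 of 43,216 encounters is roughly 30%); the text does not state how the split was implemented, including whether it was made by encounter or by patient.

```python
import random
from collections import defaultdict


def stratified_split(records, key, test_fraction=0.3, seed=0):
    """Randomly split records so each group (given by key) keeps the same test share."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    train, test = [], []
    for members in groups.values():
        shuffled = members[:]          # copy so the caller's list is not reordered
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


# Toy data (hypothetical): 100 "frequent" and 900 "standard" encounters.
encounters = [{"id": i, "cohort": "frequent" if i < 100 else "standard"} for i in range(1000)]
train_set, test_set = stratified_split(encounters, key=lambda r: r["cohort"])
```

Stratifying on the cohort label is one simple way to obtain similar cohort distributions in both datasets; a split by patient rather than by encounter would need the grouping key and the shuffling unit to change accordingly.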

We used 10-fold cross-validation to train and validate the model with the training dataset. The validated model was subsequently tested against the hold-out/test dataset in four ways. First, we used the “Combined” test dataset, which has a similar distribution to the training dataset and contains proportionally more “frequent admitters,” to ensure that the model was tested on a patient distribution similar to that used for training. Second, we used the “Standard” test dataset, which was randomly drawn from patients admitted to the hospital and represents the typical patients that a hospital encounters. We used this dataset to test the generalizability of the model and to rule out overfitting. Third, we used a “Frequent” test dataset, consisting of “frequent admitters,” mainly to test the fit of the model in predicting readmission among frequent admitters, a key concern for many hospitals. Finally, we used a “Geriatrics” test dataset, comprising only geriatric patients (≥65 years old, selected from the “standard” cohort test dataset), to test whether the model works for the geriatric specialty, where it is likely to be deployed. Additional details and procedures of cohort selection are shown in Figure 1.
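The cross-validation step can be sketched as follows. This is a generic 10-fold index generator written for illustration, not the authors' code; the function name and the shuffling seed are assumptions, and in practice the folds might also be grouped by patient or stratified by outcome.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Every fold except the i-th one is used for training in round i.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# Each encounter is used for validation exactly once across the 10 rounds.
splits = list(k_fold_indices(25, k=10))
```

Each round trains on nine folds and validates on the remaining one, so the hold-out/test dataset is never touched during model selection.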

Figure 1. Sampling methodology. All values represent the admission encounters. For the “frequent admitters” cohort (3282 encounters), there are 892 unique patients and for the “standard” cohort (39,934 encounters), there are 9377 unique patients. The “geriatrics” cohort represents a subsample of patient encounters within the “standard” cohort where the age of the patient at the time of the encounter is greater than or equal to 65 years.

Data Processing and Algorithm Development

For each admission encounter, we combined the clinical notes written by authors with similar roles (eg, all notes written by physicians were combined as physician’s notes). The notes were combined based on the author’s role (physician, medical social worker, and case manager) because each role would potentially document similar issues. Hence, it is more efficient to mine the unstructured clinical notes for each role to identify common or similar topics. We combined the notes for each admission encounter instead of analyzing each note entry as the unit of analysis because a patient’s psychosocial conditions are less likely to vary for each admission encounter.
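A minimal sketch of this grouping step is shown below. The field names (`encounter_id`, `author_role`, `text`) and the role labels are hypothetical; the actual EMR export format is not described in the text.

```python
from collections import defaultdict


def combine_notes(notes):
    """Concatenate note texts per (encounter, author role) pair."""
    grouped = defaultdict(list)
    for note in notes:
        grouped[(note["encounter_id"], note["author_role"])].append(note["text"])
    # One combined document per encounter and role, in the original note order.
    return {key: " ".join(texts) for key, texts in grouped.items()}


# Toy example with made-up notes.
notes = [
    {"encounter_id": 1, "author_role": "physician", "text": "Admitted with chest pain."},
    {"encounter_id": 1, "author_role": "physician", "text": "Stable on day 2."},
    {"encounter_id": 1, "author_role": "case_manager", "text": "Lives alone."},
]
documents = combine_notes(notes)
```

The result is one document per encounter and author role, which matches the unit of analysis (the admission encounter) and lets each role's notes be mined separately.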

We then applied natural language processing text mining to the clinical notes in the training dataset. We used the latent Dirichlet allocation (LDA) topic modeling algorithm to extract the common topics present in the clinical notes and then numerically weighted each topic’s intensity (loading) in each clinical note. Each topic is represented by a vector of related words that frequently occur together across different notes. A high loading value indicates a strong presence of the topic in the clinical note. This process was performed separately for the physician, medical social worker, and case manager notes. A total of 100 topics were extracted from each set of notes based on the clinician’s role (ie, physician, medical social worker, case manager).
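To make the notion of topic loadings concrete, the sketch below fits LDA on a handful of toy notes with a small collapsed Gibbs sampler. It is only an illustration under stated assumptions: the text does not say which LDA implementation, inference method, or hyperparameters were used, and the function name, the values of `alpha` and `beta`, the whitespace tokenization, and the toy notes are all hypothetical. The study extracted 100 topics per role; the toy example uses 2.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (loadings, top_words): loadings[d][k] is the share of topic k in
    document d, and top_words[k] lists the three highest-count words of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assignments = []
    for d, doc in enumerate(docs):                         # random initial topic per token
        current = []
        for word in doc:
            k = rng.randrange(n_topics)
            current.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(current)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v, k = word_id[word], assignments[d][i]
                doc_topic[d][k] -= 1                       # remove the token's current topic
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k                      # resample and restore the counts
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    loadings = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -topic_word[k][v])[:3]]
        for k in range(n_topics)
    ]
    return loadings, top_words


toy_notes = [
    "lives alone limited family support",
    "family support caregiver stress",
    "chest pain troponin ecg",
    "ecg troponin chest pain admitted",
]
loadings, top_words = fit_lda([note.split() for note in toy_notes], n_topics=2)
```

Each row of `loadings` sums to 1 and plays the role of the per-note topic intensities that are later used as predictors; `top_words` gives the kind of word list a clinician would review when labeling a topic as psychosocial or not.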

Two geriatric specialists reviewed and classified these 100 topics into broader themes, specifically dividing them into psychosocial and nonpsychosocial issues. Additionally, we conducted four interviews with a group of medical social workers and case managers to triangulate whether this classification was appropriate. It is important to note that this added classification into broader themes by clinicians served solely to facilitate the reporting and interpretation of results. These broader themes were not used in the subsequent development of the readmission risk model; only the LDA topic loadings were used in the training of the readmission risk model. Further details of the text mining procedure are provided in Multimedia Appendix 1.

We combined the topic’s intensity (loadings) for each set of notes with structured predictors of readmission established in the LACE Index for Readmission as predictors for estimating readmission risk. As readmission risk is a function of various factors beyond psychosocial factors, we incorporated the LACE index to take into account some of the factors reported in the literature. The LACE index is a score commonly used to predict a patient’s 30-day hospital readmission risk [31]. The index consists of the following variables: (1) the length of stay (L), (2) the acuity of the current or previous admission (A), (3) comorbidities of the patient as measured by the Charlson Comorbidity Index score (C), and (4) the number of visits to the emergency department in the preceding 6 months (E).
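One way to picture the combined predictor set is as a single feature row per encounter: the structured variables followed by the topic loadings for each author role. The sketch below assumes the structured variables listed in Table 2 and 100 loadings per role; the field names, the role labels, and the choice to fill zeros when an encounter has no notes from a given role are assumptions made for illustration, not details given in the text.

```python
STRUCTURED_FIELDS = (
    "age", "male", "length_of_stay_days", "charlson_index",
    "ed_admission", "icu_stay", "ed_visits_6m",
)
ROLES = ("physician", "medical_social_worker", "case_manager")


def build_feature_row(structured, loadings_by_role, n_topics=100):
    """Concatenate structured variables with the topic loadings of each role."""
    row = [float(structured[field]) for field in STRUCTURED_FIELDS]
    for role in ROLES:
        loadings = loadings_by_role.get(role)
        if loadings is None:
            # Assumption: encounters without notes from this role get zero loadings.
            loadings = [0.0] * n_topics
        if len(loadings) != n_topics:
            raise ValueError(f"expected {n_topics} loadings for role {role!r}")
        row.extend(loadings)
    return row


# Toy encounter (made-up values) with physician notes only.
example = build_feature_row(
    {"age": 78, "male": 1, "length_of_stay_days": 6, "charlson_index": 1,
     "ed_admission": 1, "icu_stay": 0, "ed_visits_6m": 2},
    {"physician": [0.01] * 100},
)
```

With 7 structured variables and 3 sets of 100 loadings, each row has 307 entries; the baseline model would use only the first 7.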

The readmission risk model was fitted using the gradient boosting trees (GBT) algorithm to predict the outcome of readmission within the next 30 days from the discharge date of the current admission. GBT uses an ensemble of multiple trees to generate more accurate prediction models for classification and regression. The algorithm’s premise is to build a series of trees, where each tree is trained with the objective to correct the misclassification errors of the previous tree in the series.
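The idea of trees correcting their predecessors can be sketched with a deliberately small gradient boosting classifier built from one-split trees (stumps). This is a teaching sketch, not the model used in the study: the text does not name the GBT implementation, tree depth, number of trees, or learning rate, so every such value below is an assumption, and leaf values use the mean residual rather than the Newton step found in production libraries.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return None if best is None else best[1:]


def fit_gbt(X, y, n_trees=50, learning_rate=0.3):
    """Boost stumps on the log-loss gradient; y must contain both 0 and 1."""
    prior = sum(y) / len(y)
    base_score = math.log(prior / (1 - prior))   # start from the log-odds of the outcome
    scores = [base_score] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Residual = observed label minus current predicted probability.
        residuals = [label - sigmoid(score) for label, score in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:                        # no feature can be split any further
            break
        j, threshold, left_value, right_value = stump
        stumps.append(stump)
        scores = [
            score + learning_rate * (left_value if row[j] <= threshold else right_value)
            for score, row in zip(scores, X)
        ]
    return base_score, learning_rate, stumps


def predict_proba(model, row):
    base_score, learning_rate, stumps = model
    score = base_score + sum(
        learning_rate * (left if row[j] <= threshold else right)
        for j, threshold, left, right in stumps
    )
    return sigmoid(score)


# Toy data: the outcome depends on the first feature only.
X = [[0, 1], [1, 0], [2, 1], [3, 0], [4, 1], [5, 0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gbt(X, y)
```

Each new stump is fitted to what the current ensemble still gets wrong, which is the sense in which every tree "corrects" the previous ones; the predicted probability is then compared with the observed 30-day readmission outcome.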

We tested the model’s predictive accuracy using the four different hold-out test samples described above. To assess the predictive value of the clinical notes, we fitted a LACE baseline readmission model without using the topics from the notes. We then compared this baseline model against models that include the physician notes and social notes (ie, medical social worker notes and case manager notes) jointly and separately.

Results

Evaluating the Predictive Value of Psychosocial Information

As expected, we observed that physicians record fewer psychosocial issues than medical social workers and case managers (Table 1). The more detailed distribution of the specific topics extracted is provided in Tables A1-A3 of Multimedia Appendix 1.

The descriptive statistics of the variables used in the readmission risk model for each test cohort are provided in Table 2.

Table 1. Distribution of psychosocial topics (N=100).

| Role of author | Proportion of psychosocial topics, n (%) | Proportion of nonpsychosocial topics, n (%) |
| --- | --- | --- |
| Physician | 25 (25) | 75 (75) |
| Medical social worker | 100 (100) | 0 (0) |
| Case manager | 88 (88) | 12 (12) |

Table 2. Descriptive statistics of variables in the LACE (Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use) readmission model (patient encounter level).

| Variable | Frequent cohort^a, mean (SD) | Standard cohort^b, mean (SD) | Geriatrics cohort^c, mean (SD) | Combined cohort^d, mean (SD) |
| --- | --- | --- | --- | --- |
| Age (years) | 72.94 (13.24) | 67.07 (15.98) | 77.62 (8.09) | 67.51 (15.87) |
| Gender (1: Male, 0: Female) | 0.50 (0.50) | 0.55 (0.50) | 0.50 (0.50) | 0.54 (0.50) |
| Length of stay (days) | 6.73 (13.18) | 5.47 (10.15) | 6.65 (10.81) | 5.57 (10.41) |
| Charlson Comorbidity Index | 0.47 (1.39) | 0.41 (1.18) | 0.49 (1.38) | 0.42 (1.20) |
| Emergency department admission (1: Yes, 0: No) | 0.50 (0.50) | 0.57 (0.50) | 0.58 (0.49) | 0.56 (0.50) |
| Intensive care unit stay (1: Yes, 0: No) | 0.03 (0.17) | 0.05 (0.22) | 0.04 (0.21) | 0.05 (0.22) |
| Emergency department visits in last 6 months | 2.86 (3.07) | 1.39 (2.77) | 1.47 (2.44) | 1.50 (2.82) |

^a Patients identified by the hospital as frequent readmission patients.

^b Sample of a typical hospital patient.

^c Subset of patients in the “Standard” sample who are 65 years of age or older.

^d Combination of the “Frequent” and “Standard” samples.

The area under the receiver operating characteristic curve (AUROC) of the LACE baseline predictive model ranged from 0.8288 to 0.8397 (Table 3) for the four different test cohorts (Frequent, Standard, Geriatrics, and Combined). The baseline model only considered common factors identified in the prior literature as associated with readmission risk and did not include psychosocial factors extracted from the clinical notes. The receiver operating characteristic curve is a plot representing the diagnostic ability of a binary classifier as the discriminatory threshold (ie, the cut-off value used to classify a case as one state or the other) is varied. For each threshold value, the true positive rate (sensitivity) is plotted against the corresponding false positive rate (1 − specificity). Thus, the AUROC represents the overall performance of the classifier across all thresholds.
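The definitions above can be made concrete with a short sketch. It computes the AUROC through its pairwise interpretation (the probability that a randomly chosen readmitted case receives a higher score than a randomly chosen non-readmitted case, counting ties as one half) and the two rates at a single threshold. The function names and the toy scores are illustrative assumptions, not values from the study.

```python
def auroc(y_true, scores):
    """AUROC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def rates_at_threshold(y_true, scores, threshold):
    """Return (true positive rate, false positive rate) when score >= threshold is flagged."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, fp / n_neg


# Toy example: 1 = readmitted within 30 days, 0 = not readmitted.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
area = auroc(labels, predicted)                          # 0.75
tpr, fpr = rates_at_threshold(labels, predicted, 0.5)    # (0.5, 0.0)
```

Sweeping the threshold from high to low and plotting each (false positive rate, true positive rate) pair traces the curve; the pairwise count used here gives the same area without drawing it.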

Adding the text-mined notes from the medical social workers and case managers increased the AUROC of the model to 0.8573-0.8707. Further appending the clinical notes from physicians increased the AUROC to 0.8952-0.9100.

Table 3. Results of the readmissions prediction model.
Models compared: LACE^d baseline; LACE baseline + social^i; LACE baseline + physician^j + social.

^a AUROC: area under the receiver operating characteristic curve.

^b PPV: positive predictive value.

^c NPV: negative predictive value.

^d LACE: Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use.

^e Hold-out sample of patients identified by the hospital as frequent readmission patients.

^f Hold-out sample of a typical hospital patient.

^g Subset of patients in the “Standard” sample who are 65 years or older.

^h Combination of “Frequent” and “Standard” hold-out samples.

^i Social represents the text-mined notes that medical social workers and case managers provided.

^j Physician represents the text-mined notes provided by physicians.

Comparison Across Patient Profiles

The addition of textual information improved the AUROC of the readmission model. This improvement was more pronounced for geriatric patients than for the other patient cohorts (Table 4). For geriatric patients, notes from the medical social workers and case managers improved the AUROC by 4.32%. Combining these notes with physician notes raised the improvement to 8.46% over the baseline LACE readmission model.

Table 4. Improvements of prediction (area under receiver operating characteristic curve) over the baseline LACE (Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use) model for different test cohorts.

| Notes | Frequent cohort^a | Standard cohort^b | Geriatrics cohort^c | Combined cohort^d |
| --- | --- | --- | --- | --- |
| Social and physician^f | 6.64% | 6.99% | 8.46% | 6.72% |

^a Hold-out sample of patients identified by the hospital as frequent readmission patients.

^b Hold-out sample of a typical hospital patient.

^c Subset of patients in the “Standard” sample who are 65 years or older.

^d Combination of “Frequent” and “Standard” hold-out samples.

^e The readmission model with clinical notes from the medical social workers and case managers.

^f The readmission model with clinical notes from the physicians, medical social workers, and case managers.

Discussion

Principal Findings

The AUROC of our readmission risk model was higher than the typical accuracy of readmission predictive models, which ranges from 0.66 to 0.83 as reported in an earlier review of 30 studies [6]. The results also suggest that the performance of the readmission predictive algorithm is relatively similar across all four cohorts (frequent admitters, standard, geriatrics, and the combination of the frequent and standard groups). Thus, this model can be applied to geriatric patients as the typical pool of patients who require additional management for readmission risks. Further, when taking into account the psychosocial information captured by nonphysicians (ie, medical social workers and case managers) by adding social topics, the prediction accuracy improved by 0.0285-0.0432. When we added the physicians’ textual clinical notes, the AUROC further increased by 0.0362-0.0414 in different cohorts.

Overall, the results show that with the addition of text-mined clinical notes from physicians and other clinicians, the AUROC of readmission prediction improves by 0.0664 to 0.0842, suggesting the added benefits of extracting psychosocial information from textual clinical notes in predicting readmission risk.
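Because gains in AUROC can be expressed either as a difference or as a proportion of the baseline, the short sketch below shows both calculations side by side. The two AUROC values are made-up numbers chosen for easy arithmetic, not results from this study, and the function names are illustrative.

```python
def absolute_gain(baseline_auroc, new_auroc):
    """Difference between the two AUROC values."""
    return new_auroc - baseline_auroc


def relative_gain(baseline_auroc, new_auroc):
    """Difference expressed as a fraction of the baseline AUROC."""
    return (new_auroc - baseline_auroc) / baseline_auroc


# Hypothetical values: a baseline AUROC of 0.80 raised to 0.88.
gain_abs = absolute_gain(0.80, 0.88)   # about 0.08
gain_rel = relative_gain(0.80, 0.88)   # about 0.10, ie, 10% over baseline
```

The two measures diverge more as the baseline moves away from 1, so it helps to state which one a reported improvement refers to.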

This study shows that clinicians could leverage natural language processing to gain more information from the EMR system beyond the traditional structured data commonly used to predict readmission risk. Specifically, this study establishes a proof of concept for the use of text-mining techniques with EMR unstructured free text to identify psychosocial predictors of hospital readmission, particularly among geriatric patients. In doing so, our findings support the viability of the psychosocial approach in potentially reducing readmission rates. Thus, our study represents a T2 translational stage (to patients) of research, paving the way toward the T3 translational stage (to practice). In terms of development along the translational pathway, the next phase will focus on proof of value of embedding text-mining techniques in prediction models used to identify the risk of early readmission among hospitalized patients. The purpose of this phase is to perform a comprehensive geriatric assessment for high-risk patients with the goal of offering tailored care management. By managing patients’ specific physical and psychosocial needs, we should observe improvement in the quality of care and a reduction in unnecessary health care utilization. In this way, precious health care resources can be optimally allocated to patients who will obtain the greatest benefit. This strategy is particularly relevant for older hospitalized patients, who are more likely to have unmet psychosocial needs and for whom our augmented risk prediction model performs the best. To achieve proof of value, future research could use quasiexperimental designs to compare the feasibility and effectiveness of a product that combines text-mined psychosocial factors in a state-of-the-art prediction model with those of a product that only has a prediction model.

Beyond the application of text-mining techniques to the prediction of hospital readmission, this study also presents the broader and extended possibility of using the same technical approach developed for the EMR to identify a set of underdiagnosed clinical conditions in older adults, which will have an important influence on their health and health care utilization outcomes.

Conclusions


Psychosocial profiles of patients can be curated and quantified by text mining clinical notes, and these profiles can be successfully applied to artificial intelligence models to predict readmission risks. The use of text mining improved the accuracy of predicting readmission, and this improvement was greater for geriatric patients than for other patient cohorts.

Acknowledgments


This project is funded by the Social Science Research Council, Singapore (grant number MOE2017-SSRTG-030) and the Ageing Research Institute for Society and Education-Geriatric Education & Research Institute, Singapore (grant number AG2018001).

Conflicts of Interest

None declared.

Multimedia Appendix 1

Detailed methods of text mining approach and Tables A1-A3.

PDF File (Adobe PDF File), 113 KB

References

  1. Felix HC, Seaberg B, Bursac Z, Thostenson J, Stewart MK. Why do patients keep coming back? Results of a readmitted patient survey. Soc Work Health Care 2015 Jan 14;54(1):1-15.
  2. Lim E, Matthew N, Mok W, Chowdhury S, Lee D. Using hospital readmission rates to track the quality of care in public hospitals in Singapore. BMC Health Serv Res 2011 Oct 19;11(S1):A16.
  3. Jencks SF, Williams MV, Coleman EA. Rehospitalizations among patients in the Medicare fee-for-service program. N Engl J Med 2009 Apr 02;360(14):1418-1428.
  4. Pedersen MK, Meyer G, Uhrenfeldt L. Risk factors for acute care hospital readmission in older persons in Western countries. JBI Database Syst Rev Implement Rep 2017;15(2):454-485.
  5. García-Pérez L, Linertová R, Lorenzo-Riera A, Vázquez-Díaz JR, Duque-González B, Sarría-Santamera A. Risk factors for hospital readmissions in elderly patients: a systematic review. QJM 2011 Aug 10;104(8):639-651.
  6. Kansagara D, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, et al. Risk prediction models for hospital readmission: a systematic review. JAMA 2011 Oct 19;306(15):1688-1698.
  7. Futoma J, Morris J, Lucas J. A comparison of models for predicting early hospital readmissions. J Biomed Inform 2015 Aug;56:229-238.
  8. Zhou H, Della PR, Roberts P, Goh L, Dhaliwal SS. Utility of models to predict 28-day or 30-day unplanned hospital readmissions: an updated systematic review. BMJ Open 2016 Jun 27;6(6):e011060.
  9. Jamei M, Nisnevich A, Wetchler E, Sudat S, Liu E. Predicting all-cause risk of 30-day hospital readmission using artificial neural networks. PLoS One 2017;12(7):e0181173.
  10. Low LL, Liu N, Wang S, Thumboo J, Ong MEH, Lee KH. Predicting 30-day readmissions in an Asian population: building a predictive model by incorporating markers of hospitalization severity. PLoS One 2016 Dec 9;11(12):e0167413.
  11. Rosenberger PH, Jokl P, Ickovics J. Psychosocial factors and surgical outcomes: an evidence-based literature review. J Am Acad Orthop Surg 2006 Jul;14(7):397-405.
  12. Bonde JPE. Psychosocial factors at work and risk of depression: a systematic review of the epidemiological evidence. Occup Environ Med 2008 Jul 01;65(7):438-445.
  13. Kimmel PL. Psychosocial factors in dialysis patients. Kidney Int 2001 Apr;59(4):1599-1613.
  14. Kronborg H, Vaeth M. The influence of psychosocial factors on the duration of breastfeeding. Scand J Public Health 2004 Sep 05;32(3):210-216.
  15. Singh-Manoux A. Psychosocial factors and public health. J Epidemiol Community Health 2003 Aug 01;57(8):553-556; discussion 554.
  16. Strike PC, Steptoe A. Psychosocial factors in the development of coronary artery disease. Prog Cardiovasc Dis 2004 Jan;46(4):337-347.
  17. Zaza C, Baine N. Cancer pain and psychosocial factors. J Pain Symptom Manag 2002 Nov;24(5):526-542.
  18. Lutgendorf SK, Costanzo ES. Psychoneuroimmunology and health psychology: An integrative model. Brain Behav Immun 2003 Aug;17(4):225-232.
  19. Bradley EH, McGraw SA, Curry L, Buckser A, King KL, Kasl SV, et al. Expanding the Andersen model: the role of psychosocial factors in long-term care use. Health Serv Res 2002 Oct;37(5):1221-1242.
  20. Bigos SJ, Battié MC, Spengler DM, Fisher LD, Fordyce WE, Hansson TH, et al. A prospective study of work perceptions and psychosocial factors affecting the report of back injury. Spine 1991 Jan;16(1):1-6.
  21. Paarlberg K, Vingerhoets AJ, Passchier J, Dekker GA, Van Geijn HP. Psychosocial factors and pregnancy outcome: A review with emphasis on methodological issues. J Psychosom Res 1995 Jul;39(5):563-595.
  22. Welin C, Lappas G, Wilhelmsen L. Independent importance of psychosocial factors for prognosis after myocardial infarction. J Intern Med 2000 Jun;247(6):629-639.
  23. Kartha A, Anthony D, Manasseh CS, Greenwald JL, Chetty VK, Burgess JF, et al. Depression is a risk factor for rehospitalization in medical inpatients. Prim Care Companion J Clin Psychiatry 2007 Aug 15;9(4):256-262.
  24. Coventry PA, Gemmell I, Todd CJ. Psychosocial risk factors for hospital readmission in COPD patients on early discharge services: a cohort study. BMC Pulm Med 2011 Nov 04;11(1):49.
  25. Flythe JE, Hilbert J, Kshirsagar AV, Gilet CA. Psychosocial factors and 30-day hospital readmission among individuals receiving maintenance dialysis: a prospective study. Am J Nephrol 2017 Apr 14;45(5):400-408.
  26. Retrum JH, Boggs J, Hersh A, Wright L, Main DS, Magid DJ, et al. Patient-identified factors related to heart failure readmissions. Circ Cardiovasc Qual Outcomes 2013 Mar;6(2):171-177.
  27. Andersen RM. Revisiting the behavioral model and access to medical care: does it matter? J Health Soc Behav 1995 Mar;36(1):1-10.
  28. Adler N, Stewart J. Health disparities across the lifespan: meaning, methods, and mechanisms. Ann N Y Acad Sci 2010 Feb;1186:5-23.
  29. Committee On The Recommended Social And Behavioral Domains And Measures For Electronic Health Records. Capturing social and behavioral domains in electronic health records. Washington, DC: National Academies Press; 2014.
  30. Ohno-Machado L. Realizing the full potential of electronic health records: the role of natural language processing. J Am Med Inform Assoc 2011 Sep 01;18(5):539.
  31. van Walraven C, Dhalla IA, Bell C, Etchells E, Stiell IG, Zarnke K, et al. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community. CMAJ 2010 Apr 06;182(6):551-557.
  32. Carnielli CM, Macedo CCS, De Rossi T, Granato DC, Rivera C, Domingues RR, et al. Combining discovery and targeted proteomics reveals a prognostic signature in oral cancer. Nat Commun 2018 Sep 05;9(1):3598-3617.
  33. Xia B, Zhao D, Wang G, Zhang M, Lv J, Tomoiaga AS, et al. Machine learning uncovers cell identity regulator by histone code. Nat Commun 2020 Jun 01;11(1):2696-2612.

Abbreviations

AUROC: area under the receiver operating characteristic curve
EMR: electronic medical record
GBT: gradient boosting trees
LACE: Length of stay, Acuity of the admission, Comorbidity of the patient, and Emergency department use
LDA: latent Dirichlet allocation

Edited by R Kukafka; submitted 28.12.20; peer-reviewed by M Perez-Zepeda, S Mooijaart; comments to author 16.03.21; revised version received 30.06.21; accepted 27.07.21; published 19.10.21


©Kim Huat Goh, Le Wang, Adrian Yong Kwang Yeow, Yew Yoong Ding, Lydia Shu Yi Au, Hermione Mei Niang Poh, Ke Li, Joannas Jie Lin Yeow, Gamaliel Yu Heng Tan. Originally published in the Journal of Medical Internet Research, 19.10.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.