Health Outcomes from Home Hospitalization: Multisource Predictive Modeling

Background: Home hospitalization is widely accepted as a cost-effective alternative to conventional hospitalization for selected patients. A recent analysis of the home hospitalization and early discharge (HH/ED) program at Hospital Clínic de Barcelona over a 10-year period demonstrated high levels of acceptance by patients and professionals, as well as health care value generation at the provider and health-system levels. However, health risk assessment was identified as an unmet need with the potential to enhance clinical decision making.
Objective: The objective of this study is to generate and assess predictive models of mortality and in-hospital admission at entry and at HH/ED discharge.
Methods: Predictive modeling of mortality and in-hospital admission was done for the period from January 2009 to December 2015 in 2 different scenarios: at entry into the HH/ED program and at discharge. Multisource predictive variables, including standard clinical data, patients’ functional features, and population health risk assessment, were considered.
Results: We studied 1925 HH/ED patients by applying a random forest classifier, as it showed the best performance. Average results of the area under the receiver operating characteristic curve (AUROC; sensitivity/specificity) for the prediction of mortality were 0.88 (0.81/0.76) and 0.89 (0.81/0.81) at entry and at home hospitalization discharge, respectively; the AUROC (sensitivity/specificity) values for in-hospital admission were 0.71 (0.67/0.64) and 0.70 (0.71/0.61) at entry and at home hospitalization discharge, respectively.
Conclusions: The results showed potential for feeding clinical decision support systems that help health professionals select candidates for the HH/ED program and that guide transitions toward community-based care at HH discharge.
(J Med Internet Res 2020;22(10):e21367) doi: 10.2196/21367


Home Hospitalization and Early Discharge at the Hospital Clinic of Barcelona
Home hospitalization (HH)/early discharge (ED) programs [1][2][3][4][5][6] show substantial heterogeneity across sites in terms of service workflows and organizational aspects. Overall, however, they have demonstrated maturity and health care value generation [7], such that it is well accepted that HH/ED constitutes an effective alternative to inpatient care for a select group of patients requiring hospital admission.
The characteristics of the deployment and adoption of the HH/ED program at Hospital Clinic of Barcelona (HCB) were recently described in a report [8]. In this report, HH/ED is defined as a service providing acute, home-based, short-term, complex interventions aimed at substituting conventional hospitalization fully with HH [7,9] or partially with ED [10]. The service at HCB is delivered by trained hospital personnel, and it is provided for a period that is no longer than the expected length of hospital stay for the patient's diagnosis-related group [11]. The hospital retains full clinical, fiscal, and legal responsibility. Virtual beds are used to support the required administrative and clinical processes. The report concluded that HH/ED for acute medical and surgical patients in a real-world setting was safe, generated health care efficiencies, and was well accepted by 98% of patients and professionals [8]. Moreover, the study stressed the potential of HH/ED to strengthen care coordination between highly specialized hospital-based care and home-based services involving different levels of complexity [8].
Currently, the HH/ED program at HCB is a mainstream, mature service that is offered 24 hours a day, 7 days a week, all year round, with 48 virtual beds available per day. It is the first choice for eligible patients requiring hospital admission when attended in the Emergency Department, and it serves the entire health district of Barcelona Eixample-Esquerra, which has 540,000 inhabitants.
It is well accepted that the key health outcomes that define the success of hospitalization at home [8] are mortality and unplanned emergency room consultations that lead to in-hospital admissions, either during the home hospitalization episode or during the 30-day period after discharge. This study relies on the assumption that multisource predictive modeling facilitating clinical decision support at 2 key time points, (1) at entry and (2) at HH/ED discharge, could be useful to enhance service outcomes. Risk assessment at entry may contribute to reducing undesirable events during the episode of HH/ED, whereas the assessment of unexpected events after discharge will likely contribute to improving transitional care [12,13] and better definition of personalized care pathways within a care continuum scenario [14].

The Use of Multisource Predictive Modeling for Enhanced Risk Assessment
This study was designed to elaborate and assess the potential of a machine learning approach to the prediction of mortality and hospital admission at entry and at discharge from HH/ED.
A distinctive feature of the study is the use of various data sources to estimate the 2 outcomes: mortality and hospital re-admission to conventional inpatient care. In addition to classical clinical and biological information obtained from electronic medical records (EMR), we have also considered the inclusion of Catalan population-health risk assessment scoring, known as Adjusted Morbidity Groups (GMA) [15,16], and purposely collected data on patients' performance and frailty.
The GMA is an open, publicly owned algorithm that does not rely on expert-based fixed coefficients. Such characteristics provide a high degree of flexibility for multisource predictive modeling and good potential for transferability to other sites, as demonstrated through its validation and current use in 13 of the 17 health regions in Spain, encompassing approximately 38,000,000 citizens [15]. It has been fully operational since 2015 for health policy purposes and for clinicians in primary care workstations, providing yearly updated risk stratification with a population health orientation. It takes into account multimorbidity and complexity, that is, impact on health care, using data across health care tiers stored in the Catalan Health Surveillance System.
The approach adopted in this study was based on the hypothesis that the application of holistic strategies for subject-specific risk prediction and stratification, which consider multisource covariates influencing patient health, could increase predictive accuracy and facilitate clinical decision-making based on sound estimates of individual prognosis [17]. Developed predictive models were evaluated on a real-world database, which included all cases admitted to HH/ED at HCB from January 2009 to December 2015.

Dataset
Retrospective data from 1936 patients admitted to the HH/ED program at HCB from January 2009 to December 2015 (Table 1S in Multimedia Appendix 1) were considered in the analyses carried out to elaborate the predictive modeling of mortality and hospital re-admission at 2 time points: (1) at entry into HH/ED, and (2) at discharge from the HH/ED program. HH/ED at HCB is run as a transversal program, under the responsibility of the medical and nurse directors of the Hospital, serving the different clinical specialties. Patients included in the HH/ED show a broad spectrum of primary diagnoses, as displayed in Table 1S in Multimedia Appendix 1.
The potential covariates considered for predictive modeling purposes (Table 2S in Multimedia Appendix 1) encompassed 3 dimensions: (1) standard clinical and biological information obtained from EMRs; (2) patients' functional performance and frailty data, specifically collected to characterize these patients; and (3) GMA scoring indicating multimorbidity, complexity, and patients' allocation into the population-health risk stratification pyramid.

Ethical Approval
The Ethical Committee for Human Research at HCB approved the study, and all participants signed an informed consent prior to any procedure. The program was registered at ClinicalTrials.gov: NCT03130283.
Figure 1 illustrates the global methodology proposed to identify patients at risk of re-admission or death after HH discharge; the elaboration of predictive modeling followed 3 successive steps: (1) feature selection, (2) data preprocessing, and (3) classification.

Feature Selection
Feature selection refers to different processes involving data cleaning, selection of variables to be considered for predictive modeling, as well as selection of the final set of patients included in the analyses.

Data Preprocessing
In order to handle the impact of missing values, a robust method was designed for mixed-type data imputation. To this end, the missForest algorithm was applied to the whole dataset [18]. Moreover, we applied a rediscretization of some categorical variables to avoid under-represented categories.
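The rediscretization step described above can be illustrated with a minimal sketch that merges under-represented categories into a single level. The 5% threshold, the label `other`, the function name `merge_rare_categories`, and the toy diagnosis column are assumptions made for illustration; the study does not report the rule it applied, and this sketch does not reproduce the missForest imputation.

```python
from collections import Counter


def merge_rare_categories(values, min_fraction=0.05, other_label="other"):
    """Replace categories rarer than min_fraction with other_label.

    min_fraction and other_label are illustrative assumptions, not values
    reported in the study.
    """
    counts = Counter(values)
    total = len(values)
    rare = {cat for cat, n in counts.items() if n / total < min_fraction}
    return [other_label if v in rare else v for v in values]


# Hypothetical categorical column with two under-represented levels.
diagnoses = ["cardiac"] * 50 + ["respiratory"] * 45 + ["renal"] * 3 + ["hepatic"] * 2
merged = merge_rare_categories(diagnoses)
merged_counts = Counter(merged)
print(merged_counts)
```

Merging rare levels in this way keeps every patient in the dataset while avoiding categories that are too small to be learned from reliably.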

Classification
Different strategies were considered for the elaboration of predictive models in this study. Specifically, 3 of them were explored in detail (Multimedia Appendix 1); that is, logistic regression and 2 machine learning approaches: a decision tree and random forest classifiers.
For model training, the dataset was divided 10 times into (1) a training subset, comprising 75% of randomly selected cases, and (2) a validation subset with the remaining 25% of cases. For each data partition, the model was trained using 4-fold cross-validation on the training subset. As successful cases (ie, survivors not requiring hospital admission) were far more numerous, the effect of class imbalance was reduced by applying a random stratified-sampling strategy [19].
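The partitioning scheme can be sketched as follows, under one possible reading of the text: 10 independent 75%/25% partitions, each with 4 cross-validation folds on the training subset, and every split preserving the proportion of the minority class. The helper names, the fixed seed, and the toy labels are assumptions; the sketch does not attempt to reproduce the exact resampling procedure of [19].

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction, rng):
    """Split indices into train/validation sets, preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, valid = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train.extend(shuffled[:cut])
        valid.extend(shuffled[cut:])
    return sorted(train), sorted(valid)


def stratified_folds(indices, labels, n_folds, rng):
    """Deal the indices of each class round-robin into n_folds folds."""
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(n_folds)]
    for class_indices in by_class.values():
        shuffled = class_indices[:]
        rng.shuffle(shuffled)
        for position, idx in enumerate(shuffled):
            folds[position % n_folds].append(idx)
    return folds


# Toy labels: 1 = unsuccessful case (minority), 0 = successful case.
labels = [1] * 40 + [0] * 360
rng = random.Random(0)  # fixed seed: an assumption, for reproducibility only

partitions = []
for _ in range(10):  # 10 independent 75%/25% partitions
    train_idx, valid_idx = stratified_split(labels, 0.75, rng)
    folds = stratified_folds(train_idx, labels, 4, rng)  # 4 folds on the training subset
    partitions.append((train_idx, valid_idx, folds))
```

Each element of `partitions` holds the training indices, the validation indices, and the 4 folds used for tuning; a model would be fitted on the folds and scored once on the validation indices.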
Model performance was assessed by computing the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and score metrics in the validation subset. Score is a measure of prediction accuracy and is defined as the weighted harmonic mean of the sensitivity and specificity of the model. The final performance of the models was assessed as the average performance of all independent validations.
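As an illustration of these metrics, the sketch below computes sensitivity, specificity, their weighted harmonic mean, and a rank-based AUROC in plain Python. The equal default weights are an assumption, since this section does not state the weights used for the score, and the toy labels and predicted risks are invented for the example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def weighted_harmonic_mean(sens, spec, w_sens=1.0, w_spec=1.0):
    """Weighted harmonic mean; equal weights are an assumption."""
    if sens == 0 or spec == 0:
        return 0.0
    return (w_sens + w_spec) / (w_sens / sens + w_spec / spec)


def auroc(y_true, scores):
    """Probability that a random event outranks a random non-event (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation subset (invented values).
y_true = [1, 1, 0, 0, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]
y_pred = [1 if r >= 0.35 else 0 for r in risk]

sens, spec = sensitivity_specificity(y_true, y_pred)
score = weighted_harmonic_mean(sens, spec)
area = auroc(y_true, risk)

# Equal-weight harmonic mean of the reported 0.81/0.76 pair, for illustration only.
example_score = weighted_harmonic_mean(0.81, 0.76)  # about 0.784
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model cannot reach a high score by favoring sensitivity at the expense of specificity, or the reverse.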
As indicated above, the methodology was applied to predict 2 types of events: (1) mortality, and (2) in-hospital admission up to 30 days after HH/ED discharge. Risk assessment was conducted in 2 different scenarios: (1) at entry into the HH/ED program, and (2) at discharge. Accordingly, the analyses led to 4 different risk models (RM): (1) RM1 predicts the need for conventional hospitalization, assessed at entry into the HH/ED program; (2) RM2 predicts mortality during the study period, assessed at entry; (3) RM3 predicts conventional hospital admissions, assessed at HH/ED discharge; and (4) RM4 predicts mortality during the study period, assessed at HH/ED discharge. The risk of mortality and re-admission during HH at entry was not assessed due to the scarcity of unsuccessful cases during HH/ED.

Study Population
All 1936 patients admitted to the HH/ED program at HCB during the study period were included in the research. However, the analyses conducted in the study were based on 1925 cases; 4 cases were discarded for having unrecoverable wrong data and 7 for having missing mandatory data. The mean age of the study group was 70.85 (SD 14.88) years; 1201 (62.4%) were men and 724 (37.6%) were women. The list of main diagnoses is depicted in Table 1S in Multimedia Appendix 1. Up to 64 variables, grouped into the 3 categories indicated above, were considered in the analyses (Table 2S in Multimedia Appendix 1).
To characterize different subpopulations of risk, patients were classified as undergoing successful and unsuccessful home hospitalization stays based on their re-admission and mortality during the study period and 30 days after hospital discharge (Tables 1-2). Of the 1925 patients admitted to the HH/ED program, 3 (0.2%) patients died and 96 (5.0%) cases were eventually readmitted to the hospital due to complications of heterogeneous origin during HH/ED. Of the remaining 1922 patients, within 30 days after HH/ED discharge, 37 (1.9%) patients died and 210 (10.9%) cases were identified as falling into the unsuccessful groups when analyzing re-admission risk. Tables 1 and 2 summarize the baseline characteristics of both study groups, according to mortality and re-admission, respectively.
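The cohort counts and percentages reported above can be cross-checked with a few lines of arithmetic. The variable names are illustrative; all numbers are taken from the text, and the post-discharge denominator of 1922 follows the statement that those events were counted among the remaining patients.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


admitted = 1936
excluded_wrong_data = 4
excluded_missing_data = 7
analysed = admitted - excluded_wrong_data - excluded_missing_data  # 1925

men, women = 1201, 724

# Events during the HH/ED episode (denominator: all analysed cases).
deaths_during = 3
readmissions_during = 96

# Events within 30 days after discharge (denominator: remaining patients).
remaining = analysed - deaths_during  # 1922
deaths_after = 37
readmissions_after = 210

summary = {
    "men": pct(men, analysed),
    "women": pct(women, analysed),
    "deaths_during": pct(deaths_during, analysed),
    "readmissions_during": pct(readmissions_during, analysed),
    "deaths_after": pct(deaths_after, remaining),
    "readmissions_after": pct(readmissions_after, remaining),
}
print(analysed, remaining, summary)
```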
Mortality was higher in elderly patients (P<.001) and in those with greater comorbidity, as reflected by GMA (P=.02) and the Charlson Comorbidity Index (P=.001), especially in those with cardiovascular (P<.001) and oncologic (P=.019) disorders. Mortality was lower in postoperative patients (P<.001) and in those with respiratory diseases (P=.005). Interestingly, in-hospital re-admission was only slightly associated with higher age (P=.003) and with greater complexity of comorbid conditions, as reflected by GMA (P<.001) and the Charlson Comorbidity Index (P<.001), without well-defined associations with the characteristics of the main diagnosis.

Predictive Modeling
Different modeling approaches were considered for this purpose, including logistic regression, decision trees, and random forests. The averaged AUROC of each modeling approach that was considered is presented in Table 3.
Among the different modeling strategies developed, random forest classifier (Figure 2) showed the best performance averaged over the 4 risk scenarios. Table 4 summarizes the performance of the 4 predictive models proposed in the study for in-hospital admission (RM1 and RM3) and for mortality (RM2 and RM4); Multimedia Appendix 2 depicts the relative weight, expressed as the mean decrease in accuracy (MDA) [20], of the 10 most relevant variables for each of the 4 predictive models.
Table 3. Area under the receiver operating characteristic curve (AUROC; sensitivity/specificity) performance of the modeling strategies explored.
For risk of mortality (Multimedia Appendix 2, panels B and D), red blood cell distribution width (RDW) and physical status at entry (assessed using the SF-36 questionnaire [21]) showed the highest impact in the models.
Notably, when the model was enriched with information acquired during HH/ED (Multimedia Appendix 2, panels C and D), several variables gained importance, such as hospital admissions during HH/ED, length of current hospitalization period, and nursing home visits.
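The idea behind the mean decrease in accuracy can be illustrated with a simplified permutation sketch: the values of one variable are shuffled, and the resulting drop in accuracy is averaged over several repetitions. Random forest implementations typically compute this per tree on out-of-bag samples; the version below instead uses a single evaluation set and an arbitrary `predict` function, and the toy model, data, and parameter values are all assumptions.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    return sum(1 for r, y in zip(rows, labels) if predict(r) == y) / len(rows)


def mean_decrease_in_accuracy(predict, rows, labels, column, n_repeats=20, seed=0):
    """Average accuracy drop after shuffling one column (simplified MDA)."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / len(drops)


def toy_predict(row):
    """Hypothetical model: uses column 0 only and ignores column 1."""
    return 1 if row[0] > 0.5 else 0


# Toy data: column 0 is informative, column 1 is noise.
rows = [[i / 10, ((i * 7) % 10) / 10] for i in range(10)]
labels = [toy_predict(r) for r in rows]

mda_informative = mean_decrease_in_accuracy(toy_predict, rows, labels, column=0)
mda_noise = mean_decrease_in_accuracy(toy_predict, rows, labels, column=1)
```

A variable the model does not use yields a decrease of exactly zero, whereas shuffling an informative variable degrades accuracy, which is what allows variables to be ranked by relevance.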

Principal Findings
The current research developed and internally validated 4 machine learning models predicting the risk of in-hospital admission and mortality, up to 30 days after discharge from the service, for patients undergoing home-based hospitalization at HCB from 2009 to 2015. Predictions of the 2 undesirable events were performed at 2 specific time points: at entry and at discharge from home-based hospitalization.
The study design was formulated and adopted under the hypothesis that robust predictions could be useful for clinical decision making: (1) at entry, to decide patients' admission into the HH/ED service (RM1 and RM2); and (2) at patients' discharge from HH, to personalize care paths for transitional care and to enhance vertical integration between specialized care and community-based services (RM3 and RM4).
A unique aspect of this research is that predictors considered in the analyses encompass 3 different categories of variables (Table 2S in Multimedia Appendix 1): (1) clinical data and biological information [22][23][24] extracted from patients' electronic medical records; (2) additional variables often not considered in the clinical records specifically collected in the research protocol to reflect patients' functional capacities and health care resources; and (3) information from GMA, the population-based, health-risk assessment tool developed and implemented in Catalonia (ES) [15,16,25].
We understand that the multisource approach adopted in this research was the most appropriate to elaborate predictive modeling in a group of patients undergoing HH/ED that is highly heterogeneous in terms of clinical diagnosis and frailty status [8]. The results depicted in Table 4, in terms of AUROC and score values, indicate reasonably good performance of the predictive models as compared to recent studies on similar scenarios [26], demonstrating the feasibility of the proposed approach and the advantages of applying machine learning in clinical risk prediction contexts compared with more traditional approaches based on standard multiple regression analyses [27]. Moreover, Multimedia Appendix 2 (panels A-D) shows a high relative contribution of variables that are not usually part of the standard clinical or biological information recorded in the EMR. Overall, our results indicate that our multisource approach significantly contributes to enhanced health risk assessment with a potentially high impact on clinical decision support.

Limitations of the Study and Lessons Learned for Clinical Application
We have not been able to identify literature on predictive modeling specifically addressing HH/ED. This may be partly due to the heterogeneity of orientations and characteristics of the ongoing HH/ED programs among sites. This heterogeneity limits the potential for generalization of the results of this research to other sites. However, we understand that the multisource approach undertaken in this study shows enormous potential for risk assessment regarding mortality and early re-admissions after hospitalization in general, and may show high applicability beyond the field of HH/ED. The predictive modeling undertaken in the study should be useful for defining the characteristics of personalized care paths of transitional care after hospital discharge. As indicated above, the results can have a high impact on shaping the interactions between specialized and community-based care in patients with high risk for hospital re-admissions.
A major general limitation of machine learning approaches such as the one proposed here is the fact that they can be considered "black-box" solutions that are difficult for clinicians to interpret. Our work, however, is based on random forest models that provide interpretable information regarding variable importance (Multimedia Appendix 2, panels A-D) and even model visualization, thus facilitating the understanding of their predictions. We believe that the clinical interpretation of the predictors may require different approaches; for example, variables like age and diagnosis should be individually assessed for clinical judgment, while others, like the different GMA parametrizations (including the Charlson Index), should be assessed by taking the category as a whole (and likewise, abnormalities in some blood test variables). On the other hand, this study indicates that the impact of patients' functional status on outcomes is high. However, some of the measurements included in this category are not scalable in the clinical scenario (eg, SF-36). Therefore, our results indicate that surrogates with higher applicability [28,29] should likely be considered for inclusion in real-life clinical settings. This could be achieved through patients' self-tracking equipment (eg, apps) that provides information on different dimensions characterizing the functional performance of the patient, namely physical and psychological status, wellbeing, activation, etc.
It is acknowledged that generalizing the use of new clinical scores generated from predictive modeling requires external validation on other patient cohorts or in different timeframes, and even the development of impact studies in real-world settings [30]. Apart from being costly, such a validation process can show limitations partly due to rapidly evolving clinical environments, as is the case for HH/ED at HCB, which was expanded to the entire health district of Eixample-Esquerra during 2018. The new scenario implies great changes in the clinical environment, patients' characteristics, and data sources, prompting the need for designing dynamic models in the context of learning health systems (LHS) [31,32]. It is of note that within a mature digital health scenario, the multisource predictive modeling approach could be enriched with other sources of data, such as patient self-reported data and data from social care. The lack of digital maturity of the current ecosystems constitutes a limiting factor for now, but in the near future, risk assessment tools are expected to improve in terms of robustness, potential for generalization of the results, and incorporation of a dynamic predictive approach.

Steps Toward Dynamic Learning Health Systems
There is little doubt about the high potential shown by the digital transformation of health as part of a large-scale adoption of integrated care. It is acknowledged, however, that practical applications of this vision face major limitations when it comes to accessing and mining health data stored in distributed silos of information. Even so, it seems clear that integrating and analyzing highly complex data would open new avenues for digital health in the clinical arena.
The integration of biomedical research information systems with in-place electronic health records in hospitals and in primary care centers having interoperability with patients' self-tracking information would enable the development of innovative, dynamic predictive modeling approaches, opening up entirely new and fascinating scenarios for an interplay between clinical practice and biomedical research [33,34]. We have identified 4 main interrelated enablers of this scenario [15,17,35]: (1) cloud-based tools and services allowing secure analysis of patient-centric distributed and multi-disciplinary health-related information; (2) systems medicine approaches to generate clinical predictive modeling to feed clinical decision support systems and patient decision support systems; (3) implementation and evaluation strategies for real-world implementation and assessment of cloud-based services, and (4) governance, regulatory aspects, and service adoption throughout the health care systems; these are all key to harnessing the strengths and opportunities of LHS.
Combined actions involving organizational changes with the engagement of all stakeholders, selective adoption of novel biomedical and digital tools, and the achievement of financial sustainability through enhanced accountability and entrepreneurial actions should pave the way toward the transition to LHS.

Conclusions
This study demonstrates the potential of the proposed multisource machine learning models for the prediction of risk of re-admissions and deaths in patients undergoing home-based hospitalization in a real-world setting. Further steps beyond this study include the development of dynamic clinical decision support systems allowing progression toward sustainable patient-centered health care services.