This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Considerable effort has been devoted to the development of artificial intelligence, including machine learning–based predictive analytics (MLPA) for use in health care settings. The growth of MLPA could be fueled by payment reforms that hold health care organizations responsible for providing high-quality, cost-effective care. Policy analysts, ethicists, and computer scientists have identified unique ethical and regulatory challenges from the use of MLPA in health care. However, little is known about the types of MLPA health care products available on the market today or their stated goals.
This study aims to better characterize available MLPA health care products by identifying and describing claims about products recently or currently in use in US health care settings that are marketed as tools to improve health care efficiency by improving quality of care while reducing costs.
We conducted systematic database searches of relevant business news and academic research to identify MLPA products for health care efficiency meeting our inclusion and exclusion criteria. We used content analysis to generate MLPA product categories and characterize the organizations marketing the products.
We identified 106 products and characterized them based on publicly available information in terms of the types of predictions made and the size, type, and clinical training of the leadership of the companies marketing them. We identified 5 categories of predictions made by MLPA products based on publicly available product marketing materials: disease onset and progression, treatment, cost and utilization, admissions and readmissions, and decompensation and adverse events.
Our findings provide a foundational reference to inform the analysis of specific ethical and regulatory challenges arising from the use of MLPA to improve health care efficiency.
Machine learning–based predictive analytics (MLPA) products are emerging as a strategy for controlling rising health care costs [
Incentives for health systems to adopt products focused on health care efficiency stem—at least in part—from federal policies and payment structures that encourage value-based care under the Affordable Care Act and value-based purchasing and bundling programs instituted by the Centers for Medicare & Medicaid Services (CMS). For example, the CMS Hospital Readmissions Reduction Program reduces reimbursements to hospitals with excess unplanned 30-day hospital readmissions for certain health conditions [
However, experts recognize that although MLPA could improve the efficiency of delivered care, its use in the health care domain poses distinct ethical challenges because of its lack of transparency, its continuous adaptation without human intervention, and its potential for systematic error leading to unfair decisions or actions [
These cases also highlight the importance and challenges of oversight of these complex software products. Unlike drugs and medical devices that the Food and Drug Administration (FDA) typically regulates, MLPA-based products are constantly and inherently mutable, complicating the definition of the final product. The US FDA is actively testing a regulatory framework for software as a medical device through a precertification pilot program. The framework shifts the emphasis away from the evaluation of completed products to the evaluation of processes that demonstrate a “culture of quality and organizational excellence” [
The main objective of this study is to map the landscape of currently available MLPA products marketed with the aim of improving health care efficiency. The study also seeks to characterize organizations developing these MLPA products, with the subsequent goal of identifying relevant ethical, regulatory, and policy implications.
We sought to identify MLPA products based on publicly available marketing information. To identify these products, we assessed 4 databases: LexisNexis, PubMed, Web of Knowledge, and Indeed.com. PubMed references frequently omitted the details needed to judge whether a product was currently in use, and many of its results duplicated Web of Knowledge results. On this basis, we eliminated PubMed and conducted our research using the other 3 databases. LexisNexis searches returned the highest number of nonduplicative results; Indeed.com (the world’s largest job listing website) and Web of Knowledge were used because they returned additional nonduplicative results. Search terms such as “hospitals,” “health care organizations,” “machine learning,” and “predictive analytics” were used (see
We first removed all duplicates and any results that did not mention specific products (eg, congressional transcripts). For the identified products, we conducted additional targeted searches as needed to determine whether specific products met the eligibility criteria. A product was eligible if its marketing materials made the following claims: (1) the MLPA product made health care–related predictions, (2) the product primarily aimed to improve health care quality and reduce costs (ie, improve health care efficiency), (3) the product used electronic health record (EHR)–sourced data, and (4) the product had been implemented by an identifiable US health system or provider, whether or not it was used on a routine basis. In addition, we excluded products if, based on marketing language, they (1) lacked a predictive element, (2) were not directly related to improving the quality of delivered care (eg, managing appointment schedules), or (3) solely used patient data that were not EHR-sourced (eg, data from a wearable device).
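The screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` record, its field names, and the example products are hypothetical, and in the study each judgment was made by reading marketing materials rather than by software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate product, coded by hand from its marketing materials.

    All field names are hypothetical; they mirror the four inclusion claims.
    """
    name: str
    makes_predictions: bool         # claim 1: makes health care-related predictions
    targets_quality_and_cost: bool  # claim 2: aims to improve quality and reduce costs
    uses_ehr_data: bool             # claim 3: uses EHR-sourced data
    implemented_in_us: bool         # claim 4: implemented by an identifiable US provider


def is_eligible(c: Candidate) -> bool:
    """A product is eligible only if all four claims are present."""
    return (
        c.makes_predictions
        and c.targets_quality_and_cost
        and c.uses_ehr_data
        and c.implemented_in_us
    )


def screen(candidates: list[Candidate]) -> list[Candidate]:
    """Drop duplicate product names (first occurrence wins), then apply eligibility."""
    seen: set[str] = set()
    eligible: list[Candidate] = []
    for c in candidates:
        key = c.name.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        if is_eligible(c):
            eligible.append(c)
    return eligible


# Invented records for illustration only; these are not products from the study.
candidates = [
    Candidate("Product A", True, True, True, True),
    Candidate("product a", True, True, True, True),    # duplicate search hit
    Candidate("Product B", False, True, True, True),   # no predictive element
    Candidate("Product C", True, False, True, True),   # eg, appointment scheduling
    Candidate("Product D", True, True, False, True),   # eg, wearable data only
    Candidate("Product E", True, True, True, False),   # not yet implemented
]

eligible_names = [c.name for c in screen(candidates)]
print(eligible_names)
```

Treating the exclusion criteria as the negation of the first three inclusion claims is a simplification made for this sketch; in practice, borderline products required targeted follow-up searches.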
For all remaining products, we used the product website to collect additional information about the product characteristics and the organization that developed it. Characteristics included health care partners using the product, sources of data used to create and train the MLPA algorithms, and the type and size of the organization marketing the product. We also characterized the companies by the number of employees, the type of business, and whether the chief executives or board members had a clinical degree, including doctor of medicine, registered nurse, or other.
We used content analysis to generate MLPA product categories based on the type of prediction made [
From 1288 articles and other sources, we found 106 MLPA products developed by 96 companies that met our inclusion and exclusion criteria (
Identification of machine learning–based predictive analytics products.
Characteristics of companies developing machine learning–based predictive analytics products (N=96).
Characteristics and categories | Values, n (%)
Company size |
Small (1-50 employees) | 34 (35)
Medium (51-1000 employees) | 25 (26)
Large (more than 1000 employees) | 37 (39)
Company type |
Computer software company, health care | 68 (71)
Computer software company, general | 14 (15)
Health insurer | 6 (6)
Provider (hospital or health system) | 8 (8)
CEO^a with a clinical degree |
Yes | 15 (16)
No | 81 (84)
C-suite or board member with a clinical degree |
Yes | 62 (65)
No | 34 (35)
^a CEO: chief executive officer.
Many organizations did not meet the inclusion criteria because their products were not yet implemented by an identifiable health system or provider. Other organizations were similarly ineligible because the marketing language did not claim that their product used MLPA for predicting how to reduce cost and improve quality of care. Of the organizations, 92% (88/96) developed 1 product that met the inclusion criteria, whereas 8% (8/96) had more than one product. Companies were broadly distributed in terms of size. The vast majority, 85% (82/96), were computer software companies, of which 83% (68/82) specialized in health care–related products. The remaining 15% (14/96) of MLPA developers were health insurers, hospitals, or health systems.
Although chief executive officers (CEOs) of 84% (81/96) of companies did not have a clinical degree, 65% (62/96) listed a C-suite or board member who did. Of the software companies specializing in health care, 16% (11/68) had a clinician CEO, and 72% (49/68) had a clinician C-suite or board member. Computer software companies specializing in health care made up 94% (32/34) of small organizations with 50 or fewer employees. Among the large general computer software companies, none had a clinician as CEO, 75% (9/12) had a chief medical officer, and 8% (1/12) had a clinician C-suite or board member. All providers (hospitals or health systems) were large organizations with more than 1000 employees. Of the providers, 50% (4/8) had a clinician as CEO, and all providers had a clinician C-suite or board member.
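The percentages in the two preceding paragraphs follow directly from the reported counts. As a minimal check, the sketch below recomputes each one, assuming rounding to the nearest whole percent; the `pct` helper and the dictionary labels are ours and are not part of the study's methods.

```python
def pct(numerator: int, denominator: int) -> int:
    """Return a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# label: (numerator, denominator, percentage reported in the text)
reported = {
    "companies with exactly 1 eligible product": (88, 96, 92),
    "companies with more than 1 eligible product": (8, 96, 8),
    "computer software companies": (82, 96, 85),
    "software companies specializing in health care": (68, 82, 83),
    "insurers, hospitals, or health systems": (14, 96, 15),
    "CEO without a clinical degree": (81, 96, 84),
    "clinician C-suite or board member": (62, 96, 65),
    "health care software: clinician CEO": (11, 68, 16),
    "health care software: clinician C-suite or board": (49, 68, 72),
    "small organizations that are health care software": (32, 34, 94),
    "large general software: chief medical officer": (9, 12, 75),
    "large general software: clinician C-suite or board": (1, 12, 8),
    "providers with a clinician CEO": (4, 8, 50),
}

# Collect any figure whose recomputed value differs from the reported one.
mismatches = {
    label: (pct(n, d), r)
    for label, (n, d, r) in reported.items()
    if pct(n, d) != r
}

# Subtotals used in the text: 68 + 14 software companies, 6 + 8 other developers.
software_total = 68 + 14
other_total = 6 + 8

print(mismatches, software_total, other_total)
```

Every reported figure is reproduced under this rounding assumption, and the software (82) and nonsoftware (14) subtotals sum to the 96 companies in the sample.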
We identified 5 categories of predictions made by MLPA products based on the publicly available product marketing materials: disease onset and progression, treatment, cost and utilization, admissions and readmissions, and decompensation and adverse events (
Of the products, 67% (71/106) were assigned to more than one category. A full list of products and their assigned categories can be found in
Categories of predictions made by MLPA products.
MLPA^a prediction category^b | Examples of specific predictions | Example quotes from product descriptions provided by developers
Disease onset and progression predictions (n=62) | Patient outcome; unspecified diseases; chronic illnesses; specified diseases; mortality; comorbidities | “Enables early prediction of disease onset.” “Clinicians can now see red flags for admitted patients at elevated risk of mortality three to five days in advance.”
Treatment predictions (n=48) | Best course of treatment; candidates for palliative care or hospice; untreated or undertreated individuals (often referred to as | “Identify members earlier in their disease progression who are likely going to be overmedicalized during the last 6-12 months of life.” “Helps clinicians make data-driven decisions about a patient’s care plan.”
Cost and utilization predictions (n=38) | High-cost members of a population; high utilizers in a population; risk stratification; cost of caring for a specific patient; Medicare’s predicted risk | “Predict health care cost for individuals for customer specified time periods.”
Decompensation and adverse events predictions (n=34) | Hypotensive event; sepsis; hemodynamic instability; inpatient or outpatient decompensation; postoperative complications or surgical site infections; risk of adverse event; adverse medication reactions; hospital-acquired infection; hospital-acquired pressure injury | “Identify patients at risk of surgical site infection.” “A respiratory failure detection algorithm...can highlight patients at a higher risk of prolonged ventilation up to 48 hours before onset.”
Admissions and readmissions predictions (n=33) | Readmission risk; avoidable hospital admission or readmission or ED^c use; unplanned ICU^d admission or readmission; ED presentation volume; hospitalization; patient flow; length of stay or risk of an extended length of stay; discharge date; disposition at the end of hospitalization | “Predicted output is the % chance that the patient will not return/be readmitted.” “Using only six vital signs and patient age, our machine learning tool more accurately predicted down-transfer success.”
^a MLPA: machine learning–based predictive analytics.
^b Totals do not add up to 106 because categories are not mutually exclusive.
^c ED: emergency department.
^d ICU: intensive care unit.
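Because a product can fall into several categories, the category counts sum to more than the number of products. The sketch below illustrates this with the reported category totals and a small invented example; the three toy products and their assignments are hypothetical and are not taken from the study's product list.

```python
from collections import Counter

# Category totals as reported in the table above.
reported_counts = {
    "disease onset and progression": 62,
    "treatment": 48,
    "cost and utilization": 38,
    "decompensation and adverse events": 34,
    "admissions and readmissions": 33,
}
n_products = 106

total_assignments = sum(reported_counts.values())
mean_categories_per_product = round(total_assignments / n_products, 2)

# Toy multi-label data (hypothetical): each product maps to a set of categories.
toy_products = {
    "Product A": {"disease onset and progression", "admissions and readmissions"},
    "Product B": {"treatment"},
    "Product C": {"cost and utilization", "treatment", "admissions and readmissions"},
}

# Count every (product, category) assignment, so totals can exceed the product count.
toy_counts = Counter(cat for cats in toy_products.values() for cat in cats)
toy_multi = sum(1 for cats in toy_products.values() if len(cats) > 1)

print(total_assignments, mean_categories_per_product)
print(dict(toy_counts), toy_multi)
```

With the reported totals, the 106 products account for 215 category assignments, or roughly 2 categories per product on average.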
A total of 62 products were used to predict disease onset and progression (see
CitiusTech is a large private health care information technology company. Medictiv is a statistical analysis tool advertised as having machine learning capabilities to analyze longitudinal electronic health record–sourced data to predict the onset and progression of various unspecified diseases. Medictiv also advertises specific use cases for chronic kidney disease (CKD) and diabetes. For CKD, Medictiv uses longitudinal patient and laboratory data to predict disease progression risk for CKD stage 3 patients. For diabetes, Medictiv uses data available within 72 hours of admission, including laboratory results, demographic data, comorbidities, and health insurance claims to predict patients’ length of stay, risk of readmission, and risk stratification [
A total of 48 products made predictions related to patient treatment (see
Identifi is Evolent Health’s value-based care product, which aims to reduce costs and improve the quality of delivered care. Identifi’s machine learning–based predictive analytics algorithms use clinical, social, and administrative data to predict the best course of treatment for a patient and identify gaps in a patient’s care. They also make predictions about patient outcomes, risk of readmission, and risk stratification. Evolent Health is a public health care company with between 1000 and 5000 employees. It advertises Identifi to providers and health plans [
This category comprises products whose MLPA algorithms predict the cost or utilization of health care (n=38; see
Waystar uses social determinants of health, along with hospital and consumer data, to stratify the patient population according to risk and cost [
The products in this category (n=34) were designed to act as early warning systems for the occurrence of adverse events or decompensations (see
Dascena’s InSight is a paradigmatic application of machine learning–based predictive analytics used to provide an early warning of an adverse event. Dascena is a small, private company with fewer than 50 employees. The InSight algorithm warns of sepsis onset using vital sign data located in patients’ electronic health records, which is typical of products in this category. InSight provides physicians with real-time alerts and advertises the ability to forecast a patient’s condition 4 hours in the future [
In this category (n=33), predicting the risk of readmission was the most common application (n=21), where the marketing language had to explicitly state
Midas Readmission Penalty Forecaster is a common product developed in response to the Centers for Medicare & Medicaid Services Hospital Readmissions Reduction Program [
Our results provide an overview of emerging MLPA products applied to improve health care efficiency and a systematic categorization of actual applications of this technology in patient care. The products identified as being currently in use are predominantly marketed by computer software companies. Our results also provide a systematic framework for mapping the characteristics of organizations operating in the field of MLPA in health care and the products they produce, based on the specific predictions that these products are intended to provide in a health care setting.
The potential for MLPA to transform health care has generated much anticipation about this technology’s capacity to improve health care quality and reduce costs. Bates et al [
Our results also suggest that MLPA products are increasingly being used in response to CMS reimbursement policies. The readmissions predictions may reflect a response to the recent CMS Hospital Readmissions Reduction Program, which reduces payments to hospitals with excessive numbers of readmissions [
Our results also provide an essential framework for considering various approaches to regulation in this diverse and rapidly changing marketplace. The FDA is currently developing a framework that incorporates the level of risk to the patient in its review process. Having a systematic framework of categories that may reflect varying degrees or types of risk to patients (eg, treatment recommendations vs prediction of health care costs) may therefore be important. Traditionally, software products have not been subjected to the level of regulatory scrutiny applied to drugs or medical devices, nor has the technology sector established processes for identifying or evaluating ethical issues that may arise from their products. Developing an effective regulatory framework requires an understanding of various stakeholders and organizations involved in this marketplace, potential sources of conflict, and the resources necessary for success. In examining MLPA products, which inherently change and adapt as they incorporate new data, regulators may need to consider the extent to which business requirements—including production schedules, fundraising, and profit goals—are aligned with the design process.
In addition, further examination is needed regarding the role of clinical expertise within these companies in light of the FDA’s self-regulation approach in evaluating companies based on
Our study has several limitations. Our results are limited by our reliance on publicly available web-based information, such as product websites, press releases, and health system websites. Products developed by nonprofit health systems, academic institutions, or large insurers may not have been readily identifiable, as their products are often not marketed externally. Therefore, we are less likely to have identified products developed by a health system or health insurer that are not sold for use in other systems. Another limitation is that the predictions were categorized based on the marketing language used by the companies to describe their own products, so the actual extent to which these products do what they are marketed to do remains unclear. In addition, we do not know how often the tools are used by the health care system where they are implemented. Some may be used frequently and others rarely.
There is a rapidly emerging set of products that utilize MLPA with the dual goals of improving health care and addressing cost containment. These goals address critically important needs of the health care system in the United States. Improving care quality and outcomes is not necessarily at odds with lowering costs. There is an underlying ethical tension, however, when health care efficiency is improved by reducing cost with possible negative effects on quality. How MLPA developers perceive these trade-offs and whether reliance on such tools may exacerbate discrimination based on underlying biases is difficult to assess using currently available data. The significant role of the software and technology companies, which might have little experience in understanding clinical care, using health data, or applying medical ethics or law, suggests that regulatory approaches that rely on self-regulation and organizational culture may be challenging for the evaluation of MLPA products. More research on the process of developing these novel tools is needed to further assess the implications for policy and regulation.
Search terms.
Categorization of machine learning–based predictive analytics products.
CEO: chief executive officer
CMS: Centers for Medicare & Medicaid Services
EHR: electronic health record
FDA: Food and Drug Administration
MLPA: machine learning–based predictive analytics
This study was supported by a grant from the Greenwall Foundation,
None declared.