Background: Positive economic impact is a key decision factor in making the case for or against investing in an artificial intelligence (AI) solution in the health care industry. It is most relevant for the care provider and insurer as well as for the pharmaceutical and medical technology sectors. Although the broad economic impact of digital health solutions in general has been assessed many times in literature and the benefit for patients and society has also been analyzed, the specific economic impact of AI in health care has been addressed only sporadically.
Objective: This study aimed to systematically review and summarize the cost-effectiveness studies dedicated to AI in health care and to assess whether they meet the established quality criteria.
Methods: In a first step, the quality criteria for economic impact studies were defined based on the established and adapted criteria schemes for cost impact assessments. In a second step, a systematic literature review based on qualitative and quantitative inclusion and exclusion criteria was conducted to identify relevant publications for an in-depth analysis of the economic impact assessment. In a final step, the quality of the identified economic impact studies was evaluated based on the defined quality criteria for cost-effectiveness studies.
Results: Very few publications have thoroughly addressed the economic impact assessment, and the economic assessment quality of the reviewed publications on AI shows severe methodological deficits. Only 6 out of 66 publications could be included in the second step of the analysis based on the inclusion criteria. Out of these 6 studies, none comprised a methodologically complete cost impact analysis. There are two areas for improvement in future studies. First, the initial investment and operational costs for the AI infrastructure and service need to be included. Second, alternatives to achieve similar impact must be evaluated to provide a comprehensive comparison.
Conclusions: This systematic literature analysis proved that the existing impact assessments show methodological deficits and that upcoming evaluations require more comprehensive economic analyses to enable economic decisions for or against implementing AI technology in health care.
In times of value-based health care and also because of the high share of the health care industry in the overall economy, economic impact assessment is of increasing importance. For instance, health care expenditures account for approximately US $3.5 trillion out of US $19.4 trillion (18%) of the overall gross domestic product (GDP) in the United States and for approximately US $0.4 trillion out of US $3.7 trillion (11.5%) of the overall GDP in Germany [, ]. Accordingly, the cost impact of digital health applications has also been analyzed in several studies.
In 2002, in a review of cost-effectiveness studies in the context of telemedicine interventions, Whitten et al revealed that only 55 out of 612 identified articles presented actual cost-benefit data, which were required to be included in a detailed review. In addition, after analyzing these articles, the authors concluded that the provided evidence was not sufficient to assess whether telemedicine represents a cost-effective means of delivering health care [ ].
More than a decade later, in 2014, Elbert et al  described in a review of systematic reviews and meta-analyses regarding electronic health (eHealth) interventions in somatic diseases that out of 31 reviews, 7 papers concluded that digital health is effective or cost-effective, 13 underlined that evidence is promising, and the other 11 found only limited or inconsistent proof. They also highlighted that the development and evaluation of strategies to implement effective or cost-effective eHealth initiatives in daily practice needed to be significantly enhanced [ ].
In another systematic review study on the economic evaluations of eHealth technologies from 2018, Sanyal et al  analyzed multiple databases with publications between 2010 and 2016. On the basis of 11 studies that fulfilled the inclusion criteria, the authors found that most of the studies demonstrated efficacy and cost-effectiveness of an intervention using a randomized control trial and statistical modeling. However, there was insufficient information provided on the feasibility of adopting these modeling technologies. Thus, the paper emphasizes that the current level of evidence is inconclusive and that more research is needed to evaluate possible long-term cost benefits [ ].
Research in this segment has been continuously intensified, and in several studies, the digital health cost-effectiveness, for example, of telemedicine for remote orthopedic consultations , digital behavioral interventions for type 2 diabetes and hypertension [ ], and internet-based interventions for mental health [ ] was analyzed in detail.
As significant medical quality enhancements and cost-saving improvements through artificial intelligence (AI) as one of the key emerging technologies in digital health are expected, the economic impact assessment of AI in health care has a crucial role for all stakeholders in health care and, thus, needs to be analyzed in detail.
It was systematically investigated whether the existing cost-effectiveness evaluations meet the established quality criteria to enable comprehensive decision making regarding the implementation of AI in health care. On the basis of these thorough economic assessments, the necessary information to decide for or against the application of AI in hospitals, industry, and payer context will be provided.
A systematic literature review was performed as described in the following sections.
A literature search was conducted utilizing the PubMed database and using the search terms provided in the table below.
|Search term||PubMed query||Number of hits|
|Artificial intelligence OR machine learning AND cost effectiveness||(Artificial intelligence [title/abstract] OR machine learning [title/abstract]) AND cost effectiveness [title/abstract]||54|
|Artificial intelligence OR machine learning AND economic impact||(Artificial intelligence [title/abstract] OR machine learning [title/abstract]) AND economic impact [title/abstract]||9|
|Artificial intelligence OR machine learning AND cost saving||(Artificial intelligence [title/abstract] OR machine learning [title/abstract]) AND cost saving [title/abstract]||3|
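The three queries follow one pattern: two AI terms joined by OR, combined by AND with one economic term, each restricted to the title/abstract field. A minimal Python sketch of that pattern is given here for illustration. The helper name `build_query` and the constant names are assumptions, not part of the study, and the hit counts are copied from the table rather than fetched from PubMed.

```python
# Illustrative sketch: rebuilds the PubMed query strings listed in the table.
# build_query and the constant names are hypothetical, not from the study.
AI_TERMS = ["Artificial intelligence", "machine learning"]
ECONOMIC_TERMS = ["cost effectiveness", "economic impact", "cost saving"]
FIELD_TAG = "[title/abstract]"


def build_query(ai_terms, economic_term, field_tag=FIELD_TAG):
    """Return '(A [tag] OR B [tag]) AND C [tag]' for the given terms."""
    ai_part = " OR ".join(f"{term} {field_tag}" for term in ai_terms)
    return f"({ai_part}) AND {economic_term} {field_tag}"


# Hit counts as reported in the table (not retrieved live).
REPORTED_HITS = {"cost effectiveness": 54, "economic impact": 9, "cost saving": 3}

queries = {term: build_query(AI_TERMS, term) for term in ECONOMIC_TERMS}
total_hits = sum(REPORTED_HITS.values())

if __name__ == "__main__":
    for term, query in queries.items():
        print(f"{REPORTED_HITS[term]:>3}  {query}")
    print("Total reported hits:", total_hits)
```

The summed counts give the 66 PubMed hits that form the starting set for screening.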
The search terms Artificial Intelligence and Machine Learning are not exhaustive for the overall segment; for example, Decision trees, Support vector machines, or Deep neural networks could also have been used as search terms for the database queries. Nonetheless, as decisions based on economic impact are mostly made on a strategic managerial and medical level without a specific technological background, the most frequently used search terms regarding AI in health care have been used. In addition, it is highly probable that papers about, for example, deep neural networks would also include such terms as artificial intelligence, support vector machines, and machine learning at least in the abstract. Finally, a Google Trends analysis was used to compare the most frequently used search terms regarding AI in health care over the last 12 months globally: the terms Artificial Intelligence and Machine Learning have been used the most by far, as illustrated in the Google Trends screenshot provided with this article.
For the publications identified through the PubMed searches, the titles, abstracts, and full texts have been reviewed. Publications were included in the subsequent analysis if they were (1) published journal articles, (2) written in English, and (3) published no more than 5 years ago. With regard to the content, the publications were included if they focused on at least one of the following content sectors: (1) a comprehensive description of an AI functionality, (2) an evaluation of the economic efficiency and outcomes of the AI functionality, and (3) quantitative outcomes of the AI functionality in at least one health care system. Furthermore, only publications describing concrete medical and economic outcomes, such as cost savings per patient per year, and reviews or meta-analyses comparing AI solutions have been included.
Exclusion criteria for an article were defined as follows: (1) the title did not cover a topic related to AI in health care; (2) neither the title nor the abstract contained a description of an AI application in health care; or (3) the title, abstract, or full text did not elaborate on the quantitative economic outcome of an AI in health care application in any health care system. In contrast to previous review approaches, such as those chosen by Elbert et al or Ekeland et al [ ], the third exclusion criterion was applied. Although this significantly limited the number of cost-effectiveness studies included, it made it possible to compare the different cost-effectiveness analysis approaches and not only the health- or process-related outcomes without quantified economic impact from a national or international health care perspective.
After identifying potential studies for inclusion via the PubMed search, as previously described, the evaluation took place in two steps. First, all titles, abstracts, and full texts were screened for the fulfillment of the inclusion and exclusion criteria. Second, publications viable for inclusion were assessed with a quality criteria catalog, which is explained in the section Quality Criteria for Economic Impact Assessment.
Quality Criteria for Economic Impact Assessment
A combined criteria catalog for cost-effectiveness studies was designed. In addition to the authors' own criteria, evaluation aspects from classical health care effectiveness studies and digital health assessments were considered [, ]. The quality criteria are summarized in the table below.
|Description of cost-effectiveness of AI[a] solution||Level of detail of cost-effectiveness explanation||Authors|
|Hypothesis formulation||Analysis if a comprehensive question has been formulated that allows AI cost-effectiveness evaluation (eg, comparing the AI approach with the recommended guideline routine)||Study by Haycox and Walley|
|Cost-effectiveness perspective||Impact of change in the cost of stand-alone functionality vs overall reduction of burden of care||Study by Haycox and Walley|
|Consideration of cost alternative||Analysis if the cost-saving results could also have been achieved with an alternative strategy||Study by Haycox and Walley|
|Benefit today||Net present value of the AI service, including upfront investments and running costs||Study by Haycox and Walley|
|Verification of base case||Analysis of cost-effectiveness of the AI solution based on benchmarking with base case data||Study by Sanyal et al|
[a] AI: artificial intelligence.
Quality Criteria Evaluation
Quality criteria have been applied to assess the economic impact assessments on a scale of 1 to 3 (1=superficial coverage, 2=solid coverage, and 3=detailed explanation). As outlined above, 6 publications have been assessed regarding the described quality criteria for economic impact evaluation. An overview of the analysis of the publications [- ] is given in the accompanying figure, Analysis of the included economic impact studies.
Quality Assessment Results
We first conclude that the level of detail of description of the cost-effectiveness measurement was overall high as the descriptions were for the most part precise and detailed, for instance, “for an incremental cost effectiveness threshold of €25,000/quality-adjusted life year, it was demonstrated that the AI tool would have led to slightly worse outcomes (1.98%), but with decreased cost (5.42%)” . Overall, 5 out of the 6 publications had a very high level of detail, and only 1 study had a medium level of detail in the general description (only a positive/negative cost-saving impact description and no further outcome explanations have been provided [ ]).
Second, the hypothesis formulation (eg, cost saving through machine learning–based prediction models to identify optimal heart failure patients for disease management programs to avoid 30-day readmissions ) was clear and accurate across all publications. All comprised well-explained and coherent hypothesis formulations.
Third, the cost-effectiveness perspective had a health care system context in all cases, although additional perspectives could have been included, such as ambulatory care or nursing perspectives. Furthermore, 5 studies demonstrated a comprehensive health care system perspective, whereas 1 could have been extended from a hospital to an overall system view.
Fourth, the cost alternative consideration, that is, the analysis of whether the cost-saving results could also have been achieved alternatively, was mostly missing. Only 2 papers elaborated on the different alternatives in detail, for example, differentiating on the levels of risks of the respective patient groups or different treatment options. Besides these 2 publications [, ] that covered various alternatives to achieve a similar cost saving, the remaining 4 publications did not elaborate on such cost alternative considerations at all.
Fifth, the benefit achieved today, that is, in terms of a net present value (NPV) including not only the benefits but also the necessary investment for the AI implementation and the operational costs of an AI service delivery, was not covered in any of the 6 studies. Only 1 study compared AI vs non-AI scenarios but without providing an NPV calculation. Hence, all 6 studies included a quantification of economic outcomes but failed to calculate an overall NPV.
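To make the missing calculation concrete, the following minimal Python sketch shows a standard NPV computation of the kind this criterion asks for: the upfront investment is subtracted, and each year's benefit minus operating cost is discounted back to today. The function name, parameter names, and all input figures are illustrative assumptions; none of them are taken from the reviewed studies.

```python
from typing import Sequence


def net_present_value(
    initial_investment: float,
    annual_benefits: Sequence[float],
    annual_operating_costs: Sequence[float],
    discount_rate: float,
) -> float:
    """NPV = -I0 + sum over years t of (B_t - C_t) / (1 + r)^t, with t starting at 1."""
    if len(annual_benefits) != len(annual_operating_costs):
        raise ValueError("benefits and operating costs must cover the same years")
    npv = -initial_investment
    yearly = zip(annual_benefits, annual_operating_costs)
    for year, (benefit, cost) in enumerate(yearly, start=1):
        npv += (benefit - cost) / (1.0 + discount_rate) ** year
    return npv


if __name__ == "__main__":
    # Hypothetical figures for an AI service over a 5-year horizon.
    value = net_present_value(
        initial_investment=500_000.0,
        annual_benefits=[200_000.0] * 5,
        annual_operating_costs=[50_000.0] * 5,
        discount_rate=0.03,
    )
    print(f"NPV: {value:,.0f}")
```

A positive result would indicate that the discounted net savings exceed the upfront investment under the chosen horizon and discount rate; both of these choices strongly influence the outcome and would need to be justified in an actual study.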
Finally, the verification of the base case was conducted using different approaches across the 6 studies. Mostly solid data sources have been collected in dedicated AI-focused studies based on, for example, comparison of cost with/without the algorithm, reimbursement code analysis, or benchmarking of the result with the reported performance of other clinics. All papers presented a cost-effectiveness measurement based on a comprehensive comparison dataset.
One additional aspect that emerged throughout the analysis was the measurement of resource usage, which was conducted in almost all papers via a top-down approach, meaning from an overall health care perspective but not from a single cost split per task. In this way, important cost drivers of potentially hidden stakeholders could have been missed (eg, additional workload for ambulatory care if a hospital treatment is altered).
Overall, the outcomes of the analysis described above can be split into two result categories, namely, general feedback from the analysis and detailed assessment of the studies that have been included in the review process based on the study’s inclusion and exclusion criteria.
Generally, only a few publications can be found for the economic impact assessment of AI in health care. On the basis of the different search terms that include the most frequently searched phrases by far in this segment (Artificial Intelligence and Machine Learning) in combination with the economic impact (Cost effectiveness, Economic impact, Cost saving), there were only 66 PubMed hits. As AI strategies and consequent decision-making processes depend on solid data as the basis for decision making, this is a significant challenge for both the management and medical staff, for example, when general pro and contra decisions and specific implementations regarding AI are discussed.
When accounting for the details given in the identified AI in health care publications, the economic assessment quality shows several deficits that need to be overcome in the future. Only 6 out of the 66 publications (9%) could be included in the detailed assessment. Out of these 6 studies, none comprised a complete cost-benefit analysis; rather, they all focused on fragmented cost or cost-saving aspects.
Room for improvement has been identified in two main areas:
- First, initial investment and operational costs for the AI infrastructure and service need to be included in the assessment. This is a core element for any strategic decision-making process, and the complete initial and operational investment costs for an AI solution must be compared with the expected economic benefits to provide concrete decision-making support.
- Second, further options to achieve similar impact must be evaluated to reach a sufficient basis for comprehensive and transparent decision-making, allowing comparisons among different strategic and investment options (eg, a genetic sequencing process or different medical expertise allocation for a diagnosis and treatment outcome improvement could also be applied instead of an AI-driven patient screening).
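One common way to put an AI solution and its alternatives on the same footing is to compare them by incremental cost-effectiveness ratio (ICER) and by net monetary benefit at a willingness-to-pay threshold, such as the €25,000 per quality-adjusted life year quoted above. The Python sketch below illustrates this under stated assumptions: the `Strategy` class, the function names, and all example figures are hypothetical, and the cost field is assumed to already include the share of initial investment and running costs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Strategy:
    name: str
    total_cost: float  # per patient, incl. investment share and running costs
    qalys: float       # quality-adjusted life years per patient


def net_monetary_benefit(strategy: Strategy, threshold: float) -> float:
    """Value of health gained at the threshold minus the cost of the strategy."""
    return threshold * strategy.qalys - strategy.total_cost


def icer(candidate: Strategy, comparator: Strategy) -> float:
    """Incremental cost per additional QALY of candidate vs comparator."""
    delta_effect = candidate.qalys - comparator.qalys
    if delta_effect == 0:
        raise ValueError("ICER is undefined when both strategies have equal effect")
    return (candidate.total_cost - comparator.total_cost) / delta_effect


def rank_strategies(strategies: List[Strategy], threshold: float) -> List[Strategy]:
    """Order strategies from highest to lowest net monetary benefit."""
    return sorted(
        strategies,
        key=lambda s: net_monetary_benefit(s, threshold),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical example values, not taken from any reviewed study.
    options = [
        Strategy("Usual care", 1000.0, 1.00),
        Strategy("AI-driven screening", 1200.0, 1.02),
        Strategy("Alternative (eg, reallocated expertise)", 1100.0, 1.01),
    ]
    for option in rank_strategies(options, threshold=25_000.0):
        print(option.name, round(net_monetary_benefit(option, 25_000.0)))
```

Ranking all options together, rather than comparing the AI solution with usual care alone, is what makes the comparison among strategic and investment options transparent.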
The conducted review has a rather narrow focus on economics and business perspectives of AI in health care. However, the literature review revealed further significant success factors for AI, for example, regarding the legal framework, such as compliance with data security, protection, and privacy policies, and also universally accepted technological requirements to enable comprehensive data collection and to analyze content while complying with data privacy requirements. Despite the benefits in assisting diagnostic and therapeutic decisions, so far, no standards for these legal and technological issues have been defined, and these aspects should be analyzed in future research with a broader focus.
Furthermore, aside from the sole economic quantitative aspects, the qualitative aspects of AI in health care for patients and the society require further research. For instance, in rural areas where the availability of primary care physicians is limited, AI can replace processes through focused test support, for example, for type 2 diabetes, thus addressing the challenges of demographic change . The comparison between AI and physicians with regard to diagnosis performance demonstrated that AI can deliver equal results, for example, in image recognition–related fields [ ]. This can, among others, also support a reallocation of medical capacities. In addition, AI can also enable a shift from a generalized to a more personalized treatment. AI-steered outcome prediction and clinical decision support processes are already used today, for instance, for patients in radiation therapy [ ].
Prior reviews in the digital health segment categorized the results into groups, for example, computerized decision support system, Web-based physical activity intervention, internet-delivered cognitive behavioral therapy, and telehealth. In addition, user’s age was differentiated (eg, children vs old patients), and shortcomings such as a missing difference between short- and long-term cost savings were highlighted . They also covered challenges that go beyond the cost-effectiveness aspect and mentioned, for instance, that the way to implement digital health in daily practice is still unclear [ ] or that patient perspectives and collaborative approaches among a variety of stakeholders are needed [ ].
Note that the focus on AI in health care required considering novel factors and a refined search strategy as compared with typical reviews on digital health, which led to different results. First, in contrast to other reviews, Google Trends was used and proved to be an effective tool to narrow the search space for a representative collection of results. On the basis of the Google Trends analysis, the key phrases Artificial Intelligence and Machine Learning could be identified as the most frequently used terms by far. Second, the review covered a higher percentage of included studies after applying the defined inclusion and exclusion criteria (9.1% [6/66] of the analyzed papers were included), whereas prior reviews had lower inclusion rates: 9.0% (55/612) in the study by Whitten et al, 1.9% (31/1657) in the study by Elbert et al [ ], or 0.7% (11/1625) in the study by Sanyal et al [ ]. There were two reasons for this: (1) AI as a subsegment of digital health in business and industry is still not covered well in scientific publications and (2) high importance was placed on quantitatively reported outcomes, which were required as an inclusion criterion. Third, the evaluation of cost-effectiveness studies has been conducted with a quality criteria catalog from a management perspective. As AI implementation is cost- and labor-intensive and decisions are not exclusively driven by medical improvement rates, the business management perspective has been chosen as the decision-making basis, as it is crucial for positive implementation decisions and subsequent wide-scale applications. This perspective includes classical cost factors (one-time and running expenses) as well as decisions among different strategies to deliver cutting-edge health services.
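The inclusion rates compared here are simple ratios of included to identified publications. As a brief illustration, the Python sketch below recomputes them from the counts quoted above, rounded to one decimal place. The dictionary and function names are assumptions made for this example.

```python
# Counts (included, identified) as quoted in the text; names are illustrative.
REVIEW_COUNTS = {
    "This review (AI in health care)": (6, 66),
    "Whitten et al (telemedicine)": (55, 612),
    "Elbert et al (eHealth)": (31, 1657),
    "Sanyal et al (eHealth)": (11, 1625),
}


def inclusion_rate(included: int, identified: int) -> float:
    """Share of identified publications that were included, in percent."""
    if identified <= 0:
        raise ValueError("identified must be positive")
    return 100.0 * included / identified


if __name__ == "__main__":
    for review, (included, identified) in REVIEW_COUNTS.items():
        rate = inclusion_rate(included, identified)
        print(f"{review}: {included}/{identified} = {rate:.1f}%")
```

Reporting the underlying counts next to each percentage keeps such comparisons verifiable, since rounding alone can shift a value by a full percentage point.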
Current research covers impact assessments of AI in health care rather moderately and shows qualitative deficits in methodology. Future cost-effectiveness analyses need to increase in number and quality. They should include initial investment and running costs as well as the comparison with alternative technologies. This way a comprehensive and clearly segmented cost-benefit evaluation can be provided, which will serve as a sufficient basis for decision making regarding AI implementations.
The authors would like to thank Julia Menacher for her editorial support in creating this publication. JB is grateful for financial support from Villum Young Investigator grant number 13154. In addition, some of the work from JB was funded by H2020 project RepoTrial (number 777111). Contributions by JP are funded by the Bavarian State Ministry of Science and the Arts in the framework of the Center Digitisation.Bavaria (Zentrum Digitalisierung.Bayern).
Conflicts of Interest
Screenshot of a Google Trends analysis of search terms related to artificial intelligence in health care globally over the last 12 months (conducted on October 9, 2019). PNG file, 176 KB
Analysis of the included economic impact studies. PNG file, 396 KB
- Phelps CE. Health Economics. Abingdon, Oxfordshire: Routledge; 2017.
- Statistisches Bundesamt. Health Expenditure in 2017: +4.7% URL: https://www.destatis.de/DE/Presse/Pressemitteilungen/2019/03/PD19_109_23611.html [accessed 2019-10-03]
- Whitten PS, Mair FS, Haycox A, May CR, Williams TL, Hellmich S. Systematic review of cost effectiveness studies of telemedicine interventions. Br Med J 2002 Jun 15;324(7351):1434-1437.
- Elbert N, van Os-Medendorp H, van Renselaar W, Ekeland A, Hakkaart-van Roijen L, Raat H, et al. Effectiveness and cost-effectiveness of eHealth interventions in somatic diseases: a systematic review of systematic reviews and meta-analyses. J Med Internet Res 2014 Apr 16;16(4):e110.
- Sanyal C, Stolee P, Juzwishin D, Husereau D. Economic evaluations of eHealth technologies: a systematic review. PLoS One 2018;13(6):e0198112.
- Buvik A, Bergmo TS, Bugge E, Smaabrekke A, Wilsgaard T, Olsen JA. Cost-effectiveness of Telemedicine in remote orthopedic consultations: randomized controlled trial. J Med Internet Res 2019 Feb 19;21(2):e11330.
- Nordyke RJ, Appelbaum K, Berman MA. Estimating the impact of novel digital therapeutics in Type 2 diabetes and hypertension: health economic analysis. J Med Internet Res 2019 Oct 9;21(10):e15814.
- Le LK, Sanci L, Chatterton ML, Kauer S, Buhagiar K, Mihalopoulos C. The cost-effectiveness of an internet intervention to facilitate mental health help-seeking by young adults: randomized controlled trial. J Med Internet Res 2019 Jul 22;21(7):e13065.
- Google Trends. Compare URL: https://trends.google.com/trends/explore?cat=45&q=Artificial%20Intelligence,Machine%20learning,Neural%20network,Linear%20regression,Support%20vector%20%20 machine [accessed 2019-10-09]
- Ekeland AG, Bowes A, Flottorp S. Effectiveness of telemedicine: a systematic review of reviews. Int J Med Inform 2010 Nov;79(11):736-771.
- Haycox A, Walley T. Pharmacoeconomics: evaluating the evaluators. Br J Clin Pharmacol 1997 May;43(5):451-456.
- Lee HK, Jin R, Feng Y, Bain PA, Goffinet J, Baker C, et al. An analytical framework for TJR readmission prediction and cost-effective intervention. IEEE J Biomed Health Inform 2019 Jul;23(4):1760-1772.
- Gönel A. Clinical biochemistry test eliminator providing cost-effectiveness with five algorithms. Acta Clin Belg 2018 Dec 25:1-5.
- Bremer V, Becker D, Kolovos S, Funk B, van Breda W, Hoogendoorn M, et al. Predicting therapy success and costs for personalized treatment recommendations using baseline characteristics: Data-driven analysis. J Med Internet Res 2018 Aug 21;20(8):e10275.
- Grover D, Bauhoff S, Friedman J. Using supervised learning to select audit targets in performance-based financing in health: an example from Zambia. PLoS One 2019;14(1):e0211262.
- Lee I, Monahan S, Serban N, Griffin P, Tomar S. Estimating the cost savings of preventive dental services delivered to Medicaid-enrolled children in six southeastern states. Health Serv Res 2018 Oct;53(5):3592-3616.
- Golas SB, Shibahara T, Agboola S, Otaki H, Sato J, Nakae T, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak 2018 Jun 22;18(1):44.
- Spänig S, Emberger-Klein A, Sowa J, Canbay A, Menrad K, Heider D. The virtual doctor: an interactive clinical-decision-support system based on deep learning for non-invasive prediction of diabetes. Artif Intell Med 2019 Sep;100:101706.
- Shen J, Zhang CJ, Jiang B, Chen J, Song J, Liu Z, et al. Artificial Intelligence versus clinicians in disease diagnosis: systematic review. JMIR Med Inform 2019 Aug 16;7(3):e10010.
- Pillai M, Adapa K, Das SK, Mazur L, Dooley J, Marks LB, et al. Using Artificial Intelligence to improve the quality and safety of radiation therapy. J Am Coll Radiol 2019 Sep;16(9 Pt B):1267-1272.
|AI: artificial intelligence|
|eHealth: electronic health|
|GDP: gross domestic product|
|NPV: net present value|
Edited by G Eysenbach; submitted 01.11.19; peer-reviewed by D Heider, N Przulj, E van der Velde; comments to author 07.12.19; revised version received 18.12.19; accepted 19.12.19; published 20.02.20
Copyright
©Justus Wolff, Josch Pauling, Andreas Keck, Jan Baumbach. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 20.02.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.