This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
In December 2019, COVID-19 broke out in Wuhan, China, leading to national and international disruptions in health care, business, education, transportation, and nearly every aspect of our daily lives. Artificial intelligence (AI) has been leveraged amid the COVID-19 pandemic; however, little is known about its use for supporting public health efforts.
This scoping review aims to explore how AI technology is being used during the COVID-19 pandemic, as reported in the literature. To our knowledge, it is the first review to describe and summarize the features of the identified AI techniques and the data sets used for their development and validation.
A scoping review was conducted following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). We searched the most commonly used electronic databases (eg, MEDLINE, EMBASE, and PsycInfo) between April 10 and 12, 2020. Search terms were selected based on the target intervention (ie, AI) and the target disease (ie, COVID-19). Two reviewers independently conducted study selection and data extraction. A narrative approach was used to synthesize the extracted data.
We included 82 of the 435 retrieved studies. The most common use of AI was diagnosing COVID-19 cases based on various indicators. AI was also employed in drug and vaccine discovery or repurposing and for assessing their safety. Further, the included studies used AI for forecasting the epidemic development of COVID-19 and predicting its potential hosts and reservoirs. Researchers used AI for patient outcome–related tasks such as assessing the severity of COVID-19 and predicting mortality risk, its associated factors, and the length of hospital stay. AI was also used for infodemiology, to raise awareness of water, sanitation, and hygiene practices. The most prominent AI technique used was the convolutional neural network, followed by the support vector machine.
The included studies showed that AI has the potential to fight against COVID-19. However, many of the proposed methods are not yet clinically accepted. Thus, the most rewarding research will be on methods promising value beyond COVID-19. More efforts are needed for developing standardized reporting protocols or guidelines for studies on AI.
COVID-19 broke out in Wuhan, Hubei Province, China in December 2019 [
Leveraging digital tools and technologies to combat COVID-19 can augment public health strategies [
AI enables machines to become intelligent, understand queries, sift through and connect mountains of data points, and draw actionable conclusions [
Soon after the COVID-19 pandemic spread across the world, several governments, research institutes, and technology companies issued calls to action urging researchers to develop AI applications to assist with COVID-19–related research [
A full review of the AI field is beyond the scope of this review, and we refer the reader to some surveys (eg, [
AI can analyze big data sets by aggregating and sifting through large volumes of health care data (including patient data) to generate insights that enable predictive analysis. Obtaining these insights quickly helps clinicians and other stakeholders in the health care ecosystem make effective, safe, and timely decisions to better serve patients and public health policy. There has been a steady rise in the number of studies regarding the use of AI techniques to address the COVID-19 pandemic [
To achieve the objective of this study while ensuring both replicable and transparent methods, we conducted a scoping review following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) [
In this review, we performed search queries between April 10 and 12, 2020, on the following online databases: MEDLINE (via Ovid), EMBASE (via Ovid), PsycInfo (via Ovid), IEEE Xplore, ACM Digital Library, arXiv, medRxiv, bioRxiv, Scopus, and Google Scholar. In the case of Google Scholar and due to the volume of returned hits, only the first 100 results were considered, as we found that, beyond this, results quickly lose relevance and applicability. In addition to searching bibliographic databases, we screened the reference list of the included studies and relevant reviews to look for other relevant studies that could be added to this review (ie, backward reference list checking).
The search terms we used to identify relevant studies were specified from the available literature and by referring to subject matter experts. These terms were selected based on the target intervention (eg, AI, machine learning, and deep learning) and the target disease (eg, coronavirus, COVID-19, and 2019-nCoV). Details about the exact search strings used in this study are provided in
In this review, we focused on any AI-based technology or approach used for any purpose related to the COVID-19 pandemic, such as diagnosis, epidemiological predictions, treatment and vaccine discovery, and prediction of patient outcomes. However, we excluded studies providing an overview or proposing a potential AI technique for COVID-19, or studies that were purely discussed from a research perspective.
We considered studies published in English between December 25, 2019, and April 12, 2020, such as peer-reviewed articles, theses, dissertations, conference proceedings, and preprints, while excluding other publications such as reviews, conference abstracts, proposals, editorials, and commentaries. We did not enforce any restrictions on the country of publication, study design, comparator, or outcomes.
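The search and eligibility criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term lists contain just the examples named in the text (the exact search strings are in the search strategy appendix), and the record fields (`language`, `published`, `type`) and helper names are hypothetical, not part of the review's actual workflow.

```python
from datetime import date

# Example terms named in the text; the full search strings are NOT reproduced here.
INTERVENTION_TERMS = ["artificial intelligence", "machine learning", "deep learning"]
DISEASE_TERMS = ["coronavirus", "COVID-19", "2019-nCoV"]

# Publication types and date window stated in the eligibility criteria.
INCLUDED_TYPES = {"peer-reviewed article", "thesis", "dissertation",
                  "conference proceeding", "preprint"}
WINDOW_START = date(2019, 12, 25)
WINDOW_END = date(2020, 4, 12)


def or_group(terms):
    """Join terms with OR, quoting multiword phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(intervention_terms, disease_terms):
    """Combine the intervention block and the disease block with AND."""
    return or_group(intervention_terms) + " AND " + or_group(disease_terms)


def is_eligible(record):
    """Apply the language, date, and publication-type criteria to one record."""
    return (
        record["language"] == "English"
        and WINDOW_START <= record["published"] <= WINDOW_END
        and record["type"] in INCLUDED_TYPES
    )


query = build_query(INTERVENTION_TERMS, DISEASE_TERMS)

# Hypothetical records, for illustration only.
preprint = {"language": "English", "published": date(2020, 3, 15), "type": "preprint"}
editorial = {"language": "English", "published": date(2020, 3, 15), "type": "editorial"}
print(query)
print(is_eligible(preprint), is_eligible(editorial))
```

Combining one OR block per concept with AND mirrors the intervention-plus-disease structure of the search terms. In practice, each database has its own query syntax, so a string like this would need adapting per source.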
Two reviewers, namely, authors AAA and MA, independently screened the titles and abstracts of the identified studies. They independently read the full text of studies that passed the
After extracting the data from the identified studies, we used a narrative approach to synthesize it. Specifically, we classified and described AI techniques used in the included studies in terms of their purposes (eg, diagnosis and drug and vaccine development), AI area or branch (eg, traditional machine learning and deep learning), AI models and algorithms (eg, decision tree, random forest, and naive Bayes), and platform (ie, computer and mobile). Further, we described the data sets used for development and validation of AI models in terms of data sources (eg, public databases and clinical settings); type of data (eg, radiology images, biological data, and laboratory data); size of the data set; type of validation; and proportion of training, validation, and test data sets. We used Microsoft Excel (Microsoft Corporation) to manage data synthesis.
We retrieved 435 studies through searching the identified bibliographic databases (
Flowchart of the study selection process.
Among the included studies, 72 were preprints and 10 were published articles in peer-reviewed journals (
Characteristics of the included studies.
Characteristics | Studies (N=82), n
Publication status |
Preprint | 72
Published | 10
Month of publication |
February | 13
March | 53
April | 16
Country of publication |
China | 41
US | 9
India | 6
Turkey | 5
Canada | 4
UK | 3
Bangladesh | 2
Austria | 1
Egypt | 1
Greece | 1
Hong Kong | 1
Hungary | 1
Japan | 1
Korea | 1
Netherlands | 1
Pakistan | 1
Qatar | 1
Sudan | 1
Switzerland | 1
Publications by months and country.
As shown in
Purposes and uses of artificial intelligence against COVID-19.
Purposes/uses | Studies (N=82), n
Diagnosis |
CT^a images | 15
X-ray images | 12
Laboratory tests | 2
Genome sequence | 1
Respiratory patterns | 1
Treatment and vaccine discovery |
Drug discovery | 9
Vaccine discovery | 4
Protein structure | 4
Drug repurposing | 2
Treatment safety | 1
Epidemiological modeling |
Epidemic development | 14
Potential reservoirs | 3
Patient outcome–related tasks |
Severity | 6
Progression to severe | 4
Mortality risk | 2
Risk factors | 1
Hospital stay | 1
Infodemiology |
Raising awareness | 1
^a CT: computed tomography.
In 20 studies [
There were 17 studies that used AI for epidemiological modeling tasks [
In 14 studies [
AI has also been used for infodemiology [
In 29 studies [
Features of AI-based techniques used for COVID-19.
Features | Studies (N=82), n
AI^a branch^b |
Deep learning | 60
Machine learning | 29
Natural language processing | 3
AI models and algorithms^c |
Convolutional neural network | 37
Support vector machine | 10
Random forest | 9
Decision tree | 9
Logistic regression | 9
Recurrent neural network | 8
Artificial neural network (unspecified) | 6
Transfer learning | 4
Autoencoders | 4
Deep neural network | 3
K-nearest neighbors | 3
Least absolute shrinkage and selection operator | 3
Polynomial neural network | 3
Multilayer perceptron | 2
Advance deep Q-learning network | 2
AdaBoost | 1
Auto-regressive integrated moving average model | 1
Bayesian analysis | 1
Bidirectional encoder representations from transformers | 1
Continuous bag of words | 1
Eureqa modeling | 1
Genetic algorithm | 1
Generative adversarial network | 1
Generalized logistic growth model | 1
Holistic agent-based model | 1
Linear discriminant analysis | 1
Linear regression | 1
Language model | 1
Multi-task deep model | 1
Naive Bayes | 1
Porter stemming | 1
Reinforcement learning | 1
Skip-gram model | 1
Time series forecasting | 1
Universal-sentence-encoder-large | 1
Vector auto average | 1
Platform |
Computer | 81
Mobile | 1
^a AI: artificial intelligence.
^b Numbers do not add up as AI techniques in some studies were based on more than one AI branch.
^c Numbers do not add up as several studies used more than one AI model or algorithm.
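The footnotes above note that category counts can exceed the number of studies because a single study may fall into several categories. For example, the AI branch rows sum to 60 + 29 + 3 = 92, although only 82 studies were included. The following minimal sketch shows the effect with a few hypothetical study records. The study names and their assigned branches are invented for illustration and do not correspond to any included study.

```python
from collections import Counter

# Hypothetical records: each study maps to the set of AI branches it used.
studies = {
    "study_1": {"deep learning"},
    "study_2": {"deep learning", "machine learning"},
    "study_3": {"machine learning"},
    "study_4": {"deep learning", "natural language processing"},
}

# Count each branch once per study that used it.
branch_counts = Counter(
    branch for branches in studies.values() for branch in branches
)

n_studies = len(studies)
total_mentions = sum(branch_counts.values())

print(branch_counts)
print(f"{n_studies} studies, {total_mentions} branch assignments")
```

Because studies 2 and 4 each contribute to two rows, the column total (6) is larger than the number of studies (4), which is the same pattern seen in the table.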
In 60 studies, AI techniques used against COVID-19 were based on deep learning models and algorithms [
In 2 studies [
As shown in
Features of data sets used for development and validation of artificial intelligence models.
Features | Studies (N=82), n
Data sources^a |
Public databases | 52
Clinical settings | 24
Government sources | 9
Literature | 6
News websites | 2
Participants | 1
Type of data^b |
Radiology image | 35
Biological data | 23
Epidemiological data | 15
Clinical data | 11
Laboratory data | 8
Demographic data | 5
Guidelines | 1
News articles | 1
Data set size^c |
<1000 | 26
1000-9999 | 16
≥10,000 | 8
Type of validation^d,e |
Train-test split | 25
K-fold cross-validation | 18
External validation | 11
Proportion of training set (%)^f |
≤25 | 3
26-50 | 2
51-75 | 16
>75 | 28
Proportion of validation set (%)^g |
≤25 | 8
26-50 | 3
51-75 | 0
>75 | 0
Proportion of test set (%)^h |
≤25 | 35
26-50 | 10
51-75 | 3
>75 | 1
^a Numbers do not add up as several studies collected their data from more than one data source.
^b Numbers do not add up as several studies collected more than one type of data.
^c Data set size was reported in 50 studies.
^d Type of validation was reported in 53 studies.
^e Numbers do not add up as 1 study used two different types of validation.
^f Proportion of the training set was reported in 49 studies.
^g Proportion of the validation set was reported in 11 studies.
^h Proportion of the test set was reported in 49 studies.
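The footnotes above give, for each data set feature, how many of the 82 included studies reported it. As a quick arithmetic check, these counts can be converted into reporting shares. The sketch below uses only counts stated in the footnotes; the variable and function names are illustrative.

```python
N_INCLUDED = 82  # number of included studies

# Number of studies reporting each feature, taken from the table footnotes.
reported_counts = {
    "data set size": 50,
    "type of validation": 53,
    "training set proportion": 49,
    "validation set proportion": 11,
    "test set proportion": 49,
}


def share_reporting(n_reported, n_total=N_INCLUDED):
    """Percentage of studies reporting a feature, rounded to one decimal."""
    return round(100 * n_reported / n_total, 1)


reporting_share = {
    feature: share_reporting(n) for feature, n in reported_counts.items()
}

for feature, pct in reporting_share.items():
    print(f"{feature}: {pct}%")
```

The shares for data set size (61.0%) and type of validation (64.6%) agree with the percentages quoted in the discussion of reporting quality.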
The types of data collected from these data sources were as follows: radiology images (eg, CT and x-ray) [
The data set size was reported by 50 studies, ranging from 31 to 3,000,000. The data set size was less than 1000 samples in half of these studies [
Validation of models was reported in 53 studies. Three types of validation were used in the included studies: train-test split [
The training set proportion of the total data set was reported in 49 studies. The proportion of the training set ranged from ≤25% in 3 studies [
The validation set proportion of the total data set was reported in 11 studies; it ranged from ≤25% in 8 studies [
The test set proportion of the total data set was reported in 49 studies, ranging from ≤25% in 35 studies [
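To make the validation types and set proportions above concrete, the following sketch shows a three-way train/validation/test split and k-fold cross-validation in plain Python. It is not taken from any included study. The 80/10/10 proportions, the fold count, and the function names are assumptions chosen for illustration. External validation is not shown, since it simply means evaluating a trained model on data from a separate source.

```python
import random


def train_val_test_split(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and cut them into train, validation, and test subsets.

    The test set receives whatever remains after the first two cuts.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))

folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print("held out:", test_idx)
```

A single split is cheap but depends on one random partition. K-fold cross-validation reuses every sample for both training and testing, which is often preferred when the data set is small, as was the case in many of the included studies.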
In this study, we conducted a scoping review of the use of AI against COVID-19. We found a lack of publications in December 2019 and January 2020. This is not surprising, given that SARS-CoV-2 was only identified on January 7 [
In the included studies, AI was used for five purposes: diagnosis, treatment and vaccine discovery, epidemiological modeling, patient outcome–related tasks, and infodemiology. None of the included studies used AI for other purposes such as contact tracing of individuals, providing training to students and health care professionals, or robotics to deal with suspected and quarantined cases.
Most of the AI techniques used in the included studies were based on deep learning approaches such as CNN and RNN. All but 1 study used desktop machines, workstations, and clusters as opposed to mobile platforms. This can be explained by the computational demands of training AI models. Although all major mobile phone manufacturers equip their flagship models with AI coprocessors, these coprocessors accelerate inference, a computationally much lighter task. In addition,
Data sources used in the included studies usually came from the public domain (eg, NCBI, GitHub, Kaggle) and proprietary databases (less common). Radiology images were the most commonly used type of data, followed by biological data. The number of samples was still comparably small (less than 1000 in half of the studies). The diversity and size of data indicate a lack of publicly available data despite COVID-19 cases having surpassed 5 million at the time of writing. We, therefore, second Wynants et al [
Although this review explores the use of AI against COVID-19, some applications could prove useful far beyond this pandemic. For instance, Kiwibot designs autonomous medical delivery robots to minimize interpersonal contact [
In the past, fundamental AI research was focused mainly on faster (or even feasible) training. We believe that, in the future, this must be complemented with public education. AI mistrust, which stems from our still limited understanding of how AI works at the deepest level, further raises ethical questions that need to be answered before AI will be uniformly accepted. We also found that AI features and results were reported in an inconsistent manner, potentially fueling AI mistrust and making a direct comparison between studies difficult. Of the 82 included studies, only 64.6% (n=53) disclosed the type of validation, 61% (n=50) mentioned the data set size, and more than 7% (n=6) did not even specify the type of AI used. It is therefore important that we as a community develop a standardized reporting protocol to slow down the barrage of poorly conducted COVID-19 studies that threaten to overwhelm serious scientists (1916 related papers were retrieved before April 5, 2020 by Wynants et al [
We found that, understandably, the landscape of studies is still dominated by Chinese institutions, which carries the potential for cultural, technological, and geospatial biases. However, we see a recent move toward a more balanced landscape (see
Given the current “infodemic” [
Given that this review includes all AI techniques used for the COVID-19 pandemic regardless of their characteristics, study design, study setting, and country of publication, it may be considered the most comprehensive review in this research area. This helps readers understand how AI is being leveraged amid the COVID-19 pandemic. In comparison with similar reviews [
In contrast to other reviews, we searched the most commonly used databases in health and information technology fields to identify as many relevant studies as possible. Thus, the number of studies included in this review was much higher than in other reviews [
Given that our review excludes proposals of AI techniques, it is likely that we missed other applications of AI for COVID-19. This review, therefore, might not identify all potential uses of AI for the current pandemic. Owing to practical constraints, the search was restricted to English studies. Therefore, we probably missed several studies written in other languages, especially Chinese. The search query did not include terms related to specific types of models or algorithms such as CNN, RNN, and SVM. Thus, it is likely that we missed some studies that used such terms in their title and abstract instead of the terms that we used (ie, AI, machine learning, and deep learning). The findings of this review are mostly based on preprints, which are more likely to have inaccurate or missing information. Therefore, the accuracy of the information in the included studies may affect the accuracy of our findings.
In this study, we provide a scoping review of 82 studies on AI against COVID-19. Given that many of the proposed methods are not yet clinically accepted, we remark that the most rewarding research will be on methods promising value beyond COVID-19. We believe that mobile phones offer unexploited potential, but more research in the direction of energy-efficient and federated learning is needed. We also believe that the use of NLP to assess effective communication of nonpharmaceutical interventions is a largely unexplored research direction, especially since data driving this research is available in the public domain, unlike much of the data produced by clinical studies. For AI to gain broad acceptance, standardized reporting protocols to be followed by studies on AI are needed. Likewise, more research on AI ethics and explainable AI is needed, paired with public education initiatives.
Overview of artificial intelligence–based techniques.
Search strategy.
Interrater agreement matrices for study selection steps.
Data extraction form.
Characteristics of the included studies and features of artificial intelligence techniques used for COVID-19.
Features of data sets used for the development and validation of artificial intelligence models.
AI: artificial intelligence
CNN: convolutional neural network
CT: computed tomography
NCBI: National Center for Biotechnology Information
NLP: natural language processing
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews
RNN: recurrent neural network
SVM: support vector machine
None declared.