Published in Vol 23, No 3 (2021): March

Preprints (earlier versions) of this paper are available.
Artificial Intelligence Techniques That May Be Applied to Primary Care Data to Facilitate Earlier Diagnosis of Cancer: Systematic Review

Original Paper

1 Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, Cambridge, United Kingdom

2 Wolfson Institute for Preventive Medicine, Queen Mary University of London, London, United Kingdom

3 Centre for Cancer Research and Department of General Practice, University of Melbourne, Victoria, Australia

4 College of Medicine and Health, University of Exeter, Exeter, United Kingdom

5 Center for Innovations in Quality, Effectiveness and Safety, Michael E DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, Houston, TX, United States

6 Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht, Netherlands

Corresponding Author:

Owain T Jones, MPhil

Primary Care Unit

Department of Public Health & Primary Care

University of Cambridge

2 Wort's Causeway

Cambridge, CB1 8RN

United Kingdom

Phone: 44 1223762554


Background: More than 17 million people worldwide, including 360,000 people in the United Kingdom, were diagnosed with cancer in 2018. Cancer prognosis and disease burden are highly dependent on the disease stage at diagnosis. Most people diagnosed with cancer first present in primary care settings, where improved assessment of the (often vague) presenting symptoms of cancer could lead to earlier detection and improved outcomes for patients. There is accumulating evidence that artificial intelligence (AI) can assist clinicians in making better clinical decisions in some areas of health care.

Objective: This study aimed to systematically review AI techniques that may facilitate earlier diagnosis of cancer and could be applied to primary care electronic health record (EHR) data. The quality of the evidence, the phase of development the AI techniques have reached, the gaps that exist in the evidence, and the potential for use in primary care were evaluated.

Methods: We searched MEDLINE, Embase, SCOPUS, and Web of Science databases from January 01, 2000, to June 11, 2019, and included all studies providing evidence for the accuracy or effectiveness of applying AI techniques for the early detection of cancer, which may be applicable to primary care EHRs. We included all study designs in all settings and languages. These searches were extended through a scoping review of AI-based commercial technologies. The main outcomes assessed were measures of diagnostic accuracy for cancer.

Results: We identified 10,456 studies; 16 studies met the inclusion criteria, representing the data of 3,862,910 patients. A total of 13 studies described the initial development and testing of AI algorithms, and 3 studies described the validation of an AI algorithm in independent data sets. One study was based on prospectively collected data; only 3 studies were based on primary care data. We found no data on implementation barriers or cost-effectiveness. Risk of bias assessment highlighted a wide range of study quality. The additional scoping review of commercial AI technologies identified 21 technologies, only 1 meeting our inclusion criteria. Meta-analysis was not undertaken because of the heterogeneity of AI modalities, data set characteristics, and outcome measures.

Conclusions: AI techniques have been applied to EHR-type data to facilitate early diagnosis of cancer, but their use in primary care settings is still at an early stage of maturity. Further evidence is needed on their performance using primary care data, implementation barriers, and cost-effectiveness before widespread adoption into routine primary care clinical practice can be recommended.

J Med Internet Res 2021;23(3):e23483




Cancer control is a global health priority, with 17 million new cases diagnosed worldwide in 2018. In high-income countries such as the United Kingdom, approximately half the population over the age of 50 years will be diagnosed with cancer in their lifetime [1]. Although the National Health Service (NHS) currently spends approximately £1 billion (US $1.37 billion) on cancer diagnostics per year [2], the United Kingdom lags behind comparable European nations in cancer survival rates [3].

In gatekeeper health care systems such as the United Kingdom, most people diagnosed with cancer first present in primary care [4], where general practitioners evaluate (often vague) presenting symptoms and decide on an appropriate management strategy, including investigations, specialist referral, or reassurance. More accurate assessment of these symptoms, especially for patients with multiple consultations, could lead to earlier diagnosis of cancer and improved outcomes for patients, including improved survival rates [5,6].

There is accumulating evidence that artificial intelligence (AI) can assist clinicians in making better clinical decisions, or even replace human judgment, in certain areas of health care. This is due to the increasing availability of health care data and the rapid development of big data analytic methods. There has been increasing interest in the application of AI in medical diagnosis, including machine learning and automated analysis approaches. Recent studies have applied AI to patient symptoms to improve diagnosis [7,8], to retinal images for the diagnosis of diabetic retinopathy [9], to mammography images for breast cancer diagnosis [10,11], to computed tomography (CT) scans for the diagnosis of intracranial hemorrhages [12], and to images of blood films for the diagnosis of acute lymphoblastic leukemia [13].

Few AI techniques are currently implemented in routine clinical care. This may be due to uncertainty over the suitability of current regulations to assess the safety and efficacy of AI systems [14-16], a lack of evidence about the cost-effectiveness and acceptability of AI systems [14], challenges to implementation into existing electronic health records (EHRs) and routine clinical care, and uncertainty over the ethics of using AI systems. A recent review of AI and primary care reported that research on AI for primary care is at an early stage of maturity [17], although research on AI-driven tools such as symptom checkers for patient and clinical users is more mature [18-21].

The CanTest framework [22] (Figure 1) establishes the developmental phases required to ensure that new diagnostic tests or technologies are fit for purpose when introduced into clinical practice. It provides a roadmap for developers and policy makers to bridge the gap from the development of a diagnostic test or technology to its successful implementation. We used this framework to guide the assessment of the studies identified in this review.

Figure 1. The CanTest Framework [22].


Few studies of AI-based techniques for the early detection of cancer have been undertaken in primary care settings [17]. Therefore, the aim of this systematic review is to identify AI techniques that facilitate the early detection of cancer and could be applied to primary care EHR data. We also aim to summarize the diagnostic accuracy measures used to evaluate existing studies and evaluate the quality of the evidence, the phase of development the AI technologies have reached, the gaps that exist in the evidence, and the potential for use in primary care. As many commercial technological developments are not documented in academic publications, we also performed a parallel scoping review of commercially available AI-based technologies for the early detection of cancer that may be suitable for implementation in primary care settings.

Search Strategy and Selection Criteria

This study was conducted in accordance with PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analysis) guidelines [23], and the protocol was registered with PROSPERO (an international prospective register of systematic reviews) before conducting the review (CRD42020176674) [24]. All aspects of the protocol were reviewed by the senior research team.

We included all primary research articles published in peer-reviewed journals, without language restrictions, from January 01, 2000, to June 11, 2019. Studies were included if they provided evidence around the accuracy, utility, acceptability, or cost-effectiveness of applying AI techniques to facilitate the early detection of cancer and could be applied to primary care EHRs (ie, to the types of data found in primary care EHRs) [22]. We included AI techniques based on any type of data that were relevant to primary care settings, including coded data and free text. We included all types of study design, as we anticipated that there would be few relevant randomized controlled trials. We kept our search terms broad to not miss relevant studies and carefully considered evidence from any health care system to assess whether the evidence could be applied to primary care settings.

As our aim is to identify AI techniques that would be applicable in primary care clinical settings, we excluded studies that incorporated data not typically available in primary care EHRs in the early diagnostic stages (eg, histopathology images, magnetic resonance imaging, or CT scan images). We also excluded studies that only described the development of an AI technique without any testing or evaluation data, studies that did not incorporate an element of machine learning (ie, with training and testing or validation steps), studies that used AI techniques for biomarker discovery alone, and studies that were based on sample sizes of less than 50 cases or controls. Machine learning techniques and neural networks have been described since the 1960s [25,26]; however, they were initially limited by computing power and data availability. We chose to start our search in 2000, as this was when the earliest research describing the new wave of machine learning techniques emerged [27].

We searched MEDLINE, Embase, SCOPUS, and Web of Science bibliographic databases, using keywords related to AI, cancer, and early detection. We extended these systematic searches through manual searching of the reference lists of the included studies. We contacted study authors, where required. Where studies were not published in English, we identified suitably qualified native speakers to help assess these studies. We performed a parallel scoping review to look for commercially developed AI technologies that were not identified through the systematic searches and were therefore likely to be unpublished and not scientifically evaluated. This included manually searching commercial research archives and networks (eg, arXiv [28], Google [29], Microsoft [30], and IBM [31]), reviewing the computer-based technologies identified in 3 recent reviews [19-21], and manually searching for further technologies mentioned in the text or references of the studies and websites included in these reviews.

Following duplicate removal, 1 author (OJ) screened titles and abstracts to identify studies that fit the inclusion criteria. Of the titles and abstracts, 17.42% (1838/10,456) were checked by 2 other authors (SS and NC); interrater reliability was excellent at 96.24% (1769/1838). Any disagreements were discussed by the core research team (OJ, SS, NC, and FW), and a consensus was reached. Three reviewers (OJ, SS, and NC) independently assessed the full-text articles for inclusion in the review. Any disagreements were resolved by a consensus-based decision.
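The agreement figure reported for double screening is a simple percent agreement (agreed decisions divided by double-screened records). As a minimal sketch, the Python below recomputes it from the reported counts and, for comparison only, shows Cohen's kappa, a chance-corrected measure that the review does not report. The 2x2 counts passed to `cohens_kappa` are hypothetical: they were chosen only so that they sum to 1838 records with 1769 agreements. Both function names are illustrative.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Share of double-screened records on which both reviewers agreed, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


def cohens_kappa(both_include: int, a_only: int, b_only: int, both_exclude: int) -> float:
    """Cohen's kappa for two reviewers making include/exclude decisions."""
    n = both_include + a_only + b_only + both_exclude
    observed = (both_include + both_exclude) / n
    a_include = (both_include + a_only) / n   # reviewer A's include rate
    b_include = (both_include + b_only) / n   # reviewer B's include rate
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


# Counts reported in the text: 1769 agreements out of 1838 double-screened records.
agreement = percent_agreement(1769, 1838)

# HYPOTHETICAL breakdown of the same 1838 records (not taken from the review).
kappa = cohens_kappa(both_include=40, a_only=30, b_only=39, both_exclude=1729)

print(f"percent agreement: {agreement:.2f}%")
print(f"kappa (hypothetical breakdown): {kappa:.2f}")
```

Because most screened records are excluded by both reviewers, percent agreement and kappa can differ markedly, which is why the breakdown matters when the two are compared.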

Data Analysis

Data extraction was undertaken independently by at least two reviewers (OJ, SS, and NC) into a predesigned data extraction spreadsheet. The research team met regularly to reach consensus by discussing and resolving any differences in data extraction. One author (OJ) amalgamated the data extraction spreadsheets, summarizing the data where possible.

The main summary measures collected included sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the receiver operating characteristic (AUROC) curve, and any other diagnostic accuracy measures of the AI techniques. Secondary outcomes included the types of AI used, the type of data used to train and test the algorithms, and how these algorithms were evaluated. We also collected data, where identified, on cost-effectiveness and patient or clinician acceptability.
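For readers less familiar with these measures, the following sketch shows how sensitivity, specificity, PPV, and NPV are derived from the four cells of a 2x2 table of test result against reference standard. The class name and the counts are hypothetical and are not drawn from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 table of index test result versus reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# HYPOTHETICAL counts: 100 patients with cancer, 900 without.
cm = ConfusionMatrix(tp=90, fp=40, fn=10, tn=860)
print(f"sensitivity={cm.sensitivity():.3f} specificity={cm.specificity():.3f}")
print(f"PPV={cm.ppv():.3f} NPV={cm.npv():.3f}")
```

Sensitivity and specificity are properties of the test at a given threshold, whereas PPV and NPV also depend on how common the disease is in the tested population.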

Risk of bias assessment was undertaken for all full-text papers by 2 independent researchers (OJ and NC) using the quality assessment of diagnostic accuracy studies-2 (QUADAS-2) critical appraisal tool [32]. OJ assessed all studies, and 50% (40/79) of them were cross-checked by NC. Any disagreements in the assessment were resolved by consensus discussion.

The studies identified were heterogeneous, employing various AI techniques and using different outcome measures for evaluation. Hence, a meta-analysis of the data was not possible, and we chose to use a narrative synthesis approach, following established guidance on its methodology [33]. We aimed to summarize the findings of the identified studies using primarily a textual approach, while also providing an overview of the quantitative outcome measures used in the studies. Once data extraction was completed, we explored the relationships that emerged within the data.

Full details of our review question, search strategy, inclusion or exclusion criteria, and data extraction methodology are described in Multimedia Appendices 1 [1-5,7-9,11-13,34-38] and 2, and the full list of excluded studies is provided in Multimedia Appendix 3 [34,39-114].

A total of 13,004 articles were identified in database searches (including 2548 duplicates), and 793 articles underwent full-text review. Of the 79 articles that were related to EHRs, 16 met the inclusion criteria and were included in this analysis (Figure 2), representing the data of 3,862,910 patients. No articles identified through other sources or reference lists met the inclusion criteria.
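The counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text; the function name and dictionary keys are illustrative, and the derived exclusion counts assume that every record not carried forward to the next stage was excluded at that stage.

```python
def prisma_flow(identified: int, duplicates: int, full_text: int,
                ehr_related: int, included: int) -> dict:
    """Derive intermediate counts for a PRISMA-style flow from the stage totals."""
    screened = identified - duplicates
    return {
        "screened": screened,
        # Assumption: records not sent to full-text review were excluded at screening.
        "excluded_on_title_abstract": screened - full_text,
        "full_text_not_ehr_related": full_text - ehr_related,
        "ehr_related_excluded": ehr_related - included,
        "included": included,
    }


# Stage totals as reported in the text.
flow = prisma_flow(identified=13_004, duplicates=2_548,
                   full_text=793, ehr_related=79, included=16)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

The derived number of screened records matches the 10,456 titles and abstracts reported in the screening step.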

Figure 2. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analysis) flow diagram for studies included in the review. AI: artificial intelligence.

Tables 1 and 2 show the main study characteristics for the 16 included studies, including the modality of AI used. Supplementary information on the variables included in the AI techniques is available in Multimedia Appendix 4 [34,39-53]. We categorized the variables included into the following categories: demographics, symptoms, comorbidities, lifestyle history, examination findings, blood results, and other. Most studies (n=13) described the initial development and testing of an AI technique [39-51]. Three studies validated the AI technique developed by Kinar et al [48] in independent data sets from 3 different countries (Israel, United States, and United Kingdom) [34,52,53].

Table 1. Study details including modality of artificial intelligence and adopted comparison or control.

Study | Authors’ origin | Cancer | Modality of artificial intelligence | Comparison or control (Histopathology, Specialist, Not stated, Other)

Development studies
Alzubi et al, 2019 [39] | Jordan and | Lung cancer | WONN-MLBa | Xb; c; 1d
Chang et al, 2009 [40] | Taiwan | Pancreatic | BPNNe; LRf | X; 2g; 3h
Cooper et al, 2018 [41] | United
Cowley et al, 2013 [42] | United | | BPANNl | X; 2; 5m
Daqqa et al, 2017 [43] | Gaza, Palestine | Leukemia | SVMn; DTo; K-NNp | X; 2
Goryński et al, 2014 [44] | Poland | Lung cancer | MLP-ANNq | X; X
Hart et al, 2018 [45] | United States | Lung cancer | BPANN | X; 2; 6r
Kalra et al, 2003 [46] | United States | Prostate cancer | BPNN | X; 2; 3
Kang et al, 2017 [47] | China | Any cancer | BPNN; CVT; SVM; DT | X; X; 2
Kinar et al, 2016 [48] | Israel and United States
Kop et al, 2016 [49] | The
Miotto et al, 2016 [50] | United States | Multiple diseases and cancers | DNNv; RF | X; 2; 3
Payandeh et al, 2009 [51] | Iran | CMLw and lymphoproliferative disorders | MLP-ANN | X; X; 3

Validation studies
Birks et al, 2017 [52] | United
Hornbrook et al, 2017 [34] | United States | Colorectal
Kinar et al, 2017 [53] | Israel | Colorectal

aWONN-MLB: weight optimized neural network with maximum likelihood boosting.

bX: corresponding control used in this study.

cNot used in this study.

d1: previously developed artificial intelligence methods.

eBPNN: back propagation neural network.

fLR: logistic regression.

g2: other artificial intelligence methods developed by this author.

h3: other statistical (ie, non-artificial intelligence) techniques.

iANN: artificial neural network.

jCVT: cross-validation techniques.

k4: colonoscopy.

lBPANN: back propagation artificial neural network.

m5: primary care clinicians.

nSVM: support vector machine.

oDT: decision tree.

pK-NN: K-nearest neighbor.

qMLP-ANN: multilayer perceptron artificial neural network.

r6: screening tests (eg, low-dose computed tomography scan and fecal occult blood test).

sRF: random forest.

tGBM: gradient boosting model.

uCART: classification and regression trees.

vDNN: deep neural network.

wCML: chronic myeloid leukemia.

The study authors originated from a variety of countries, including the United States (n=5), countries in the Middle East (n=5), Europe (n=5), and Asia (n=3), with some studies involving multiple countries. The AI techniques were most commonly developed to identify colorectal cancer (n=7) [34,41,42,48,49,52,53], although they also addressed lung cancer (n=3) [39,44,45], hematological cancers (n=2) [43,51], pancreatic cancer (n=1) [40], prostate cancer (n=1) [46], and multiple cancers (n=2) [47,50].

Neural networks were the dominant technique employed (n=10) [39-42,44-47,50,51], with many neural network subtypes mentioned. The study by Miotto et al [50] was the only study to include a processed form of the free text notes in the data used by the AI technique, although the work described by Kop et al [49] was developed in a subsequent study to include clinical free text data [115].

The majority of studies (n=9) used a combination of histopathological diagnoses and expert opinion as the control for their study [34,41,44,47-49,51-53]. The clinical control group was unclear in 2 studies [40,45]. Many studies used multiple AI techniques and then compared them with each other (n=8) [40,42,43,45-47,49,50]. Some studies used non-AI techniques, such as logistic regression and screening tests, as comparators for the performance of the AI technique that was being developed [40,41,45,46,48-51].

Table 2. Study details: patient variables.

Study | Patient variables (Age, Sex, Demographics, Symptoms, Comorbidities, Lifestyle, Examination, FBCa, Other blood tests, Otherb)

Development studies
Alzubi et al, 2019 [39] | Xc; d; X; X; X; X
Chang et al, 2009 [40] | X; X; X; X; X; X; X
Cooper et al, 2018 [41] | X; X; X; X
Cowley et al, 2013 [42] | X; X; X; X
Daqqa et al, 2017 [43] | X
Goryński et al, 2014 [44] | X; X; X; X; X; X; X; X; X; X
Hart et al, 2018 [45] | X; X; X; X; X; X
Kalra et al, 2003 [46] | X; X; X; X; X; X
Kang et al, 2017 [47] | X; X; X; X; X; X
Kinar et al, 2016 [48] | X; X; X
Kop et al, 2016 [49] | X; X; X; X; X; X; X; X; X
Miotto et al, 2016 [50] | X; X; X; X; X; X; X
Payandeh et al, 2009 [51] | X

Validation studies
Birks et al, 2017 [52] | X; X; X
Hornbrook et al, 2017 [34] | X; X; X
Kinar et al, 2017 [53] | X; X; X

aFBC: full blood count.

bMore detail on other variables included is available in Multimedia Appendix 4.

cX: corresponding variable used in this study.

dNot used in this study.

Most of the studies (n=12) included blood test results, all suitable for use in primary care settings. Age was also commonly included (n=12). Other variables used were sex (n=10), demographics (n=5), symptoms (n=7), comorbidities (n=8), lifestyle history (n=7), examination findings (n=6), medication or prescription history (n=3), spirometry results (n=2), urine dipstick results (n=1), fecal immunochemical test results (n=1), x-ray text reports (n=1), and referrals (n=1).

Table 3 shows the study designs and populations. Most studies used data sets originating from specialist care settings (n=7) [39,40,42-44,46,51], with only 3 studies using solely primary care patient data [41,49,52]. Kinar et al [48] included a follow-up validation study based on The Health Improvement Network (THIN) database, also using primary care data. Several studies used a mixture of primary and secondary care patient data (n=5) [34,47,48,50,53].

Table 3. Study population and study design.

Study details | Population from health care setting | Database used | Disease positive population (patients) | Disease negative population (patients) | Training set (patients) | Testing set (patients)

Development studies
Alzubi et al, 2019 [39] | Specialist care | Wroclaw Thoracic Surgery Centre | 1200 in total; numbers of disease positive and negative unclear | 1200 in total; numbers of disease positive and negative unclear | N/Sa | 1000
Chang et al, 2009 [40] | Specialist care (unclear) | “a certain medical center” | 194 | 157b | 234 | 117
Cooper et al, 2018 [41] | Primary care | NHSc Bowel Cancer Screening Programme comparative study [116] | 549 | 1261 | N/S | N/S
Cowley et al, 2013 [42] | Specialist care | 2-week wait colorectal referrals to Castle Hill Hospital | 74 | 703 | 777 | 100
Daqqa et al, 2017 [43] | Specialist care | Complete Blood Count test repository, European Gaza Hospital | 2000 | 2000 | N/S | N/S
Goryński et al, 2014 [44] | Specialist care | Patients treated at Kuyavia and Pomerania Centre of pulmonology | 103 | 90 | 97 | 48
Hart et al, 2018 [45] | Other (survey) | National Health Interview Survey | 649 | 488,418 | 342,347 | 146,719
Kalra et al, 2003 [46] | Specialist care | Men whose samples were tested at 6 sites in the United Statesd | 348 | N/S | 218 | 144
Kang et al, 2017 [47] | Mixed | Database of Ci Ming Health Checkup Center | 650 | 1650 | N/S | N/S
Kinar et al, 2016 [48]e | Mixed | Maccabi Health Services EMRsf linked to the Israel Cancer Registry | 2437 | 463,670 | 466,107 | 139,205
Kop et al, 2016 [49] | Primary care | 6 anonymized data sets from 3 urban regions, each covering a GPg recording system | 1292 | 263,879 | N/S | N/S
Miotto et al, 2016 [50] | Mixed | Mount Sinai Data Warehouse | 276,214 patients with 78 diseases | 276,214 patients with 78 diseases | 200,000 | 76,214
Payandeh et al, 2009 [51] | Specialist care | Blood test results from patients at the Taleghani Hospital | 450 | N/S | 360 | 132

Validation studies
Birks et al, 2017 [52] | Primary care | Clinical Practice Research Datalink | 5141 | 2,220,108 | N/Ah | N/A
Hornbrook et al, 2017 [34] | Mixed | Kaiser Permanente North West EHRi system, Kaiser Permanente Tumor Registry | 900 | 16,195 | N/A | N/A
Kinar et al, 2017 [53] | Mixed | Maccabi Health Services EMRs, linked to the Israel Cancer Registry | 133 | 112,451 | N/A | N/A

aN/S: not stated.

bCases of acute pancreatitis.

cNHS: National Health Service.

dHospitals included: Northwest Prostate Institute Seattle, the University of Washington Seattle, the Johns Hopkins Hospital Baltimore, Memorial Sloan-Kettering Cancer Institute New York, Brigham and Women’s Hospital Boston, and The University of Texas MD Anderson Cancer Center

eNB: this study also included a small validation study in the Health Improvement Network database in the United Kingdom (n=25,613)

fEMR: electronic medical record.

gGP: general practitioner.

hN/A: not applicable

iEHR: electronic health record.

Almost all the studies used different data sets, with the exception of the Maccabi Health Services EHR, which was used in 2 studies [48,53]. The data set sizes ranged from 193 to 2,225,249 patients, with a mean of 241,585 (SD 555,953), median of 3,150, and IQR of 267,237 patients. The wide range is primarily due to the large data set used by Birks et al [52]. Of the 13 development studies, 3 provided no information on the control population used [39,46,51]. Five of the development studies did not provide full information on how they partitioned their data set for the training and testing of the algorithm [39,41,43,47,49]. Five studies appeared to have independent training and testing data sets, with most split in ratios ranging from 60:40 to 70:30 [40,44-46,50].
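To illustrate what an independent training and testing partition involves, the sketch below shuffles patient identifiers once and cuts the list at a chosen fraction, so that no patient appears in both sets. It is a generic hold-out split, not the procedure of any included study; the function name, the seed, and the integer identifiers are assumptions. The example sizes mirror the 234 training and 117 testing patients listed for Chang et al [40] in Table 3.

```python
import random
from typing import Iterable, List, Tuple


def split_patients(patient_ids: Iterable[int], train_fraction: float = 0.7,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition patient identifiers into disjoint training and testing sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)          # fixed seed keeps the split reproducible
    cut = round(len(ids) * train_fraction)    # number of patients in the training set
    return ids[:cut], ids[cut:]


# 351 patients split 2:1, giving 234 for training and 117 for testing.
train, test = split_patients(range(351), train_fraction=2 / 3)
print(len(train), len(test))
```

Splitting at the patient level, rather than at the level of individual records, is what keeps the testing set independent when a patient contributes more than one record.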

Three studies [34,52,53] validated a previously developed AI technique [48] in independent data sets. Kinar et al [48] reported both the initial development of an AI technique and a subsequent validation study in an independent data set. The study by Cooper et al [41] was the only study that developed an AI technique based on prospectively collected clinical data, with the data originating from a pilot study of fecal immunochemical testing by the NHS Bowel Cancer Screening Programme [116].

Table 4 summarizes the main reported outcome measures. Specificity (n=11), AUROC (n=11), and sensitivity (n=10) were the most frequently reported; others included PPV (n=6), NPV (n=5), diagnostic accuracy (n=4), and odds ratios (n=3). Specificity results range from 80.6% [45] to 100% [51], sensitivity results from 0% [51] to 96.7% [40], and AUROC results from 0.55 [45] to 0.9896 [44].
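AUROC, one of the most frequently reported measures, can be read as the probability that a randomly chosen patient with cancer receives a higher risk score than a randomly chosen patient without cancer, with ties counted as one half. The sketch below computes it directly from that definition. The scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def auroc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


# HYPOTHETICAL risk scores produced by some model.
cancer = [0.90, 0.80, 0.35]
no_cancer = [0.70, 0.30, 0.20, 0.10]
print(f"AUROC = {auroc(cancer, no_cancer):.3f}")
```

A value of 0.5 corresponds to a model that ranks patients no better than chance, and 1.0 to perfect separation of the two groups.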

Table 4. Outcome measures.
Study | Cancer type | Outcome measures for each modality of AIa
Development studies

Alzubi et al, 2019 [39] | Lung cancer
  • Specificity: 92%, Accuracy: 93%
  • False positive rate: 9%, F-1 score: 92%

Chang et al, 2009 [40] | Pancreatic cancer
  • Sensitivity: BPNNb 88.3%, genetic algorithm LRc 96.7%, stepwise LR 96.7%
  • Specificity: BPNN 84.2%, genetic algorithm LR 82.5%, stepwise LR 73.7%
  • AUROCd: BPNN 0.895, genetic algorithm LR 0.921, stepwise LR 0.882

Cooper et al, 2018 [41] | Colorectal cancer
  • Sensitivity: 35.15% (at FITe threshold 160 µg/g)
  • Specificity: 85.57%
  • PPVf: 51.47%, NPVg: 75.19%, AUROC: 0.69, cancer detection rate: 10.66%

Cowley et al, 2013 [42] | Colorectal cancer
  • Sensitivity: 90%
  • Specificity: 96%
  • PPV: 62%, NPV: 99%

Daqqa et al, 2017 [43] | Leukemia
  • Sensitivity: SVMh 69.7%, K-NNi 60.0%, decision tree 62.4%
  • Specificity: SVM 81.5%, K-NN 82.8%, decision tree 87.1%
  • PPV: SVM 71.3%, K-NN 68.1%, decision tree 76.1%
  • NPV: SVM 80.4%, K-NN 74.1%, decision tree 87.1%
  • Accuracy: SVM 76.82%, K-NN 72.15%, decision tree 77.3%
  • F-measure: SVM 70%, K-NN 60%, decision tree 67%

Goryński et al, 2014 [44] | Lung cancer
  • AUROC: 0.9896

Hart et al, 2018 [45] | Lung cancer
  • Sensitivity: ANNj 75.30%
  • Specificity: ANN 80.60%
  • AUROC: ANN 0.86, RFk 0.81, SVM 0.55

Kalra et al, 2003 [46] | Prostate cancer
  • Specificity: 92%
  • AUROC: 0.825

Kang et al, 2017 [47] | Any cancer
  • Sensitivity: DNNl 64.07%, SVM 54.46%, decision tree 60.00%
  • Specificity: DNN 94.77%, SVM 95.27%, decision tree 91.50%
  • AUROC: DNN 0.882, SVM 0.928, decision tree 0.824
  • Accuracy: DNN 86.00%, SVM 83.83%, decision tree 83.60%
  • Using fuzzy interval of threshold with DNN achieves sensitivity 90.20%, specificity 94.22%, accuracy 93.22%

Kinar et al, 2016 [48] | Colorectal cancer
  • Specificity: Testing set 88% overall (at a sensitivity of 50%). Higher for proximal colon tumors. Validation set 94% (at a sensitivity of 50%)
  • AUROC: Testing set 0.82, validation set 0.81
  • ORm 26 at false +ve rate of 0.5% (testing set), OR 40 at false +ve rate of 0.5% (validation set). Algorithm identified 48% more CRCn cases than gFOBTo

Kop et al, 2016 [49] | Colorectal cancer
  • Sensitivity: CARTp 53.9%, RF 63.7%, LR 64.2%
  • PPV: CART 2.6%, RF 3%, LR 3%
  • AUROC: CART 0.885, RF 0.889, LR 0.891
  • F1-score: CART 0.049, RF 0.057, LR 0.058.
  • Drugs for constipation most important predictor of CRC, followed by iron deficiency anemia

Miotto et al, 2016 [50] | Multiple diseases and cancers
  • Specificity: 92%
  • AUROC: 0.773 for classification of all diseases (cancer and other diagnoses). Rectal or anal cancer 0.887, liver or intrahepatic bile duct cancer 0.886, prostate cancer 0.859, multiple myeloma 0.849, ovarian cancer 0.824, bladder cancer 0.818, testicular cancer 0.811, pancreatic cancer 0.795, leukemia 0.774, uterine cancer 0.771, non-Hodgkin lymphoma 0.771, bronchial or lung cancer 0.770, colon cancer 0.767, breast cancer 0.762, kidney or renal pelvis cancer 0.753, brain or nervous system cancer 0.742, Hodgkin disease 0.731, cervical cancer 0.675
  • Accuracy index: 0.929 overall for classification of all diseases
  • F-score: 0.181 for classification of all diseases
  • Deep patient obtained approximately 55% correct predictions when suggesting 3 or more diseases per patient, regardless of time interval

Payandeh et al, 2009 [51] | CMLq and lymphoproliferative disorders
  • Sensitivity: CML 0%, lymphoproliferative disorder 0%
  • Specificity: CML 100%, lymphoproliferative disorder 99.2%
  • PPV: CML 0%, lymphoproliferative disorder 0%
  • NPV: CML 99.2%, lymphoproliferative disorder 100%
  • Error % for convoluted neural network 0.33, error % for LR 0.78
Validation studies

Birks et al, 2017 [52] | Colorectal cancer
  • AUROC: analyzed at various time intervals before diagnosis, 3-6 months 0.844, 18-24 months 0.776

Hornbrook et al, 2017 [34] | Colorectal cancer
  • Sensitivity: 0-180 days (test to diagnosis): 50-75 years: 34.5%, 40-89 years: 39.9%; 181-360 days: 50-75 years: 18.8%, 40-89 years: 27.4%
  • AUROC: 0.80, OR: 34.7 at 99% specificity, 19.7 at 97%, 14.6 at 95%, 10.0 at 90%

Kinar et al, 2017 [53] | Colorectal cancer
  • Sensitivity: 17.0% at 1% +ve rate, 24.4% at 3% +ve rate
  • PPV: 2.1% at 1% +ve rate, 1.0% at 3% +ve rate
  • NPV: 99.9% at 1% +ve rate, 99.9% at 3% +ve rate
  • OR: 21.8 at 1% +ve rate, 10.9 at 3% +ve rate

aAI: artificial intelligence.

bBPNN: back propagation neural network.

cLR: logistic regression.

dAUROC: area under the receiver operating characteristic.

eFIT: fecal immunochemical test.

fPPV: positive predictive value.

gNPV: negative predictive value.

hSVM: support vector machine.

iK-NN: K-nearest neighbor.

jANN: artificial neural network.

kRF: random forest.

lDNN: deep neural network.

mOR: odds ratio.

nCRC: colorectal cancer.

ogFOBT: guaiac fecal occult blood test.

pCART: classification and regression trees.

qCML: chronic myeloid leukemia.
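The F1 scores in Table 4 are the harmonic mean of PPV (precision) and sensitivity (recall), which explains why a model with reasonable sensitivity can still have a very low F1 score when PPV is low. As a rough check under that standard definition, the sketch below recomputes the F1 scores for Kop et al [49] from the PPV and sensitivity values in the table. The function name is illustrative, and small differences from the published figures are expected because the inputs are rounded.

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) for Kop et al [49], as listed in Table 4.
kop_models = {
    "CART": (0.026, 0.539, 0.049),
    "RF": (0.030, 0.637, 0.057),
    "LR": (0.030, 0.642, 0.058),
}

for name, (ppv, sens, reported) in kop_models.items():
    print(f"{name}: recomputed F1 = {f1_score(ppv, sens):.3f}, reported = {reported:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, the recomputed values stay close to twice the PPV regardless of the sensitivity.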

We looked for other secondary outcomes, including implementation barriers to AI techniques in primary care settings, but did not find any evidence related to patient or clinician acceptability or cost-effectiveness.

Table 5 shows the outcomes of the risk of bias assessment using the QUADAS-2 tool. The studies demonstrated a wide range in quality; however, no studies were excluded based on their risk of bias assessment. The identified limitations were acknowledged in the relative contribution of the studies to the conclusions of the review.

Table 5. Critical appraisal results using the Quality Assessment of Diagnostic Accuracy Studies-2 tool.
Study | Risk of bias (Index test, Reference standard, Flow and timing) | Applicability concerns (Patient, Index test, Reference standard)
Alzubi et al, 2019 [39]abc
Birks et al, 2017 [52]
Chang et al, 2009 [40]
Cooper et al, 2018 [41]
Cowley et al, 2013 [42]
Daqqa et al, 2017 [43]
Goryński et al, 2014 [44]
Hart et al, 2018 [45]
Hornbrook et al, 2017 [34]
Kalra et al, 2003 [46]
Kang et al, 2017 [47]
Kinar et al, 2016 [48]
Kinar et al, 2017 [53]
Kop et al, 2016 [49]
Miotto et al, 2016 [50]
Payandeh et al, 2009 [51]

aHigh risk.

bLow risk.

cUnclear risk.

Table 6 summarizes the computer-based technologies identified in our parallel scoping review of commercial AI technologies. We identified 21 commercial computer-based technologies. Of these, 11 were clinician-facing differential diagnosis technologies that did not appear to be integrated into the EHR [117-127]. Ten of the technologies were linked to, or integrated into, the EHR in some way [8,128-136]. Nine of the technologies did not use AI algorithms incorporating an element of machine learning, as was required in our inclusion criteria [118,120-127]. It was also not clear from the websites and studies of 3 further technologies whether they met our AI inclusion criteria [117,130,134]. There were 8 technologies that met our inclusion criteria for AI (Abtrace [128], Babylon [8], Cthesigns [129], Isabel [131], Medial EarlySign [132], symcat [119], symptomate [135], and the unnamed technology evaluated by Liang et al [136]). Only the Medial EarlySign tool was evaluated for its performance in the diagnosis or triage of potential cancer [132]; 4 of the studies developing and validating this technology were included in this systematic review [34,48,52,53]. Cthesigns is specifically designed to aid the early diagnosis of cancer but has not been the subject of any studies we could identify [129].

Table 6. Summary of the scoping review of commercial artificial intelligence technologies.
Technology identified (origin), websites, and associated academic studies | Not AIa | Not cancer | Not primary care based | Not early detection or diagnosis | Early research | Not published | Not primary research | <50 cases or controls
Abtrace (United Kingdom)

Abtrace website [128]bXc
Babylon (United Kingdom)

Babylon health website [8]

Zhelezniak et al [137]XXXX

Douglas et al [138]XXXX

Smith et al [139]XXXX

National Health Service 111 powered by Babylon - Outcomes Evaluation [140]XX

Middleton et al [141]XX
Cthesigns (United Kingdom)

Cthesigns website [129]X
Diagnosis Pro (United States)

No website identified

Bond et al [117]N/CdX
DocResponse (United States)

Docresponse website [130]N/CX
DxPlain (United States)

Dxplain website [118]N/C

Barnett et al [142]XXX

Barnett et al [143]XX

Bauer et al [144]XX

Berner et al [145]XXX

Bond et al [117]XXX

Elhanan et al [146]XX

Elkin et al [147]XXX

Feldman et al [148]XXXX

Hammersley et al [149]XXX

Hoffer et al [150]XX

London et al [151]XX
Iliad (United States)

No website identified

Berner et al [145]XXX

Elstein et al [152]XXXX

Friedman et al [153]XXX

Gozum et al [154]XXX

Graber et al [155]XXX

Heckerling et al [120]XXX

Lange et al [156]XX

Lau et al [157]X

Li et al [158]XXXX

Lincoln et al [159]XXXX

Murphy et al [160]XXXX

Wolf et al [161]XXXX
Internist-1 (United States)

No website identified

Miller et al [121]XXXX

Miller et al [122]XXX
Isabel (United Kingdom)

Isabel healthcare website – Isabel pro [131]

Bond et al [117]X

Ramnarayan et al [162]X

Ramnarayan et al [163]X

Carlson et al [164]X

Graber et al [165]X

Graber et al [166]X

Ramnarayan et al [167]X

Bavdekar et al [168]X

Ramnarayan et al [169]X

Semigran et al [20]X

Meyer et al [170]X
Meditel (United States)

No website identified

Berner et al [145]XXX

Hammersley et al [149]XXX

Waxman et al [171]XXX

Wexler et al [123]XXXX
Medial EarlySign (United States/Israel)

Earlysign website [132]

Kinar et al [53]e

Birks et al [52]e

Hornbrook et al [34]e

Goshen et al [172]

Zack et al [173]X

Cahn et al [174]X
Multilevel Diagnosis Decision Support System (Spain)

No website identified

Rodriguez-Gonzalez et al [124]XXX
Online webGP (United Kingdom; later became eConsult)

Emis health online-triage website [175]f

Hurleygroup website [176]g

Edwards et al [133]XXX

Carter et al [177]XXX

Cowie et al [178]XXX
Pepid (United States)

Pepid website [125]hN/C

Bond et al [117]XXX
Problem Knowledge Couplers (PKC; United States)

No website identified

Apkon et al [126]XX
Quick Medical Reference (QMR) (United States; developed from Internist-1)

No website identified

Arene et al [179]XXX

Bacchus et al [180]XXX

Bankowitz et al [181]XXX

Berner et al [145]XXX

Berner et al [182]XX

Friedman et al [153]XXX

Gozum et al [154]XXX

Graber et al [155]XXX

Miller et al [122]XXX

Lemaire et al [183]XX
Reconsider (United States)

No website identified

Nelson et al [127]XXX
Symcat (United States)

Symcat website [119]X
Symptify (United States)

Symptify website [134]N/CX
Symptomate (Poland)

Symptomate website [135]X

No website identified

Liang H et al [136]XX

aAI: artificial intelligence.

bNot applicable or no data.

cStudy excluded for the reason specified in the column label.

dN/C: not clear.

eThese studies met the inclusion criteria of the systematic review and were therefore included.

fEdwards et al [133] suggests that this Egton Medical Information Systems (EMIS) application is powered by the eConsult system.

gCarter et al [177] suggests that this is the group who developed webGP.

hSeveral published studies are linked in the research section of the website, none involved use of the differential diagnosis or decision support tools. Some case studies audited the use of these tools.

Principal Findings

We identified 16 studies reporting AI techniques that could facilitate the early detection of cancer and could be applied to the types of data found in primary care EHRs. However, heterogeneity of AI modalities, data set characteristics, outcome measures, conduct of these studies, and quality assessment meant that we were unable to draw strong conclusions about the utility of these techniques in primary care settings. There was a notable paucity of evidence on performance using primary care data. Coupled with the lack of evidence on implementation barriers or cost-effectiveness, this may help explain why AI techniques have not been adopted widely into primary care clinical practice to date. The study by Kinar et al [48] and its subsequent validation in independent data sets [34,52,53], including primary care data sets, is a valuable example of a staged evaluation of an AI technique from early development, via validation data sets, to evaluation in the population for intended use [22]. The work by Kop and collaborators [49,115,184] also represents a good example of the staged development of an AI technique, with sequential peer-reviewed, published evaluations at each stage.

We also identified 21 commercial AI technologies, many of which have not been evaluated and reported in peer-reviewed, published studies. Many other technologies that were patient-facing and designed for the triage of symptoms were identified but had not been applied to EHRs. Eight of the 21 technologies appeared to be based on newer machine learning AI techniques; most of the others appeared to be driven by knowledge-based decision tree algorithms. Only one of the identified technologies has been evaluated specifically for cancer, although it may be more efficacious for these technologies to be very general in scope and to be widely used, rather than to have a narrow focus on cancer alone. With wider adoption, these technologies have a greater potential for raising patient and clinician awareness of cancer. However, it remains important to fully understand their diagnostic accuracy and safety, including for the triage of potential cancer symptoms. AI technologies applied to EHRs are potentially useful for primary care clinicians; however, they need to be designed in a way that is appropriate for the type and origin of the data found in primary care EHRs and to have been thoroughly and transparently evaluated in the population the technology is intended for.

Strengths and Limitations

The strengths of this systematic review include the following: a broad and inclusive search strategy to avoid missing studies; guidance of an international expert panel in the development of the protocol and search strategy; independent screening, quality assessment, and data extraction processes; adherence to PRISMA guidance; and a parallel scoping review for commercial AI technologies. As only a few heterogeneous studies were identified, it was not possible to synthesize the data and evaluate the utility of these AI techniques. Furthermore, only one commercially available AI technology was identified via the systematic review. Many of the technologies identified in the parallel scoping review lacked sufficient academic detailing and evidence for their accuracy or safety. This is a rapidly evolving research area, which will require further review over time.


Worldwide, there is a great deal of interest in AI techniques and their potential in medicine, not least in the United Kingdom where politicians and NHS leaders have publicly prioritized the incorporation of AI into clinical settings. Our findings support those of Kueper et al [17], namely, that although some AI techniques have good initial validation reports, they have not yet been through the steps for full application in clinical practice. Validation using independent data is preferable to splitting a single data set [185] and could be the next step in the development of many AI techniques identified in this review. Much of the research is at an early stage, with variable reporting and conduct, and requires further validation in prospective clinical settings and assessment of cost-effectiveness after clinical implementation before it can be incorporated into daily practice safely and effectively [186].

Consensus is required on how AI techniques designed for clinical use should be developed and validated to ensure their safety for patients and clinicians in their intended settings. Good internal and external validity is required in these experiments to avoid bias, most notably spectrum bias [187] and distributional shift [16], and to ensure that the appropriate data are used to develop the AI technique in keeping with its anticipated clinical setting and diagnostic function. The CanTest framework provides an outline for further studies aiming to develop this evidence base for AI techniques in clinical settings; to prove their safety and efficacy to commissioners, clinicians, and patients; and to enable them to be implemented in clinical practice [22]. Prospective evaluation in the clinical setting for which the AI technique is intended is essential: AI techniques aimed at primary care clinics must be evaluated in primary care settings, where cancer prevalence is low compared with specialist settings, so that their future performance is estimated accurately [187,188]. Further research around the acceptability of AI techniques for patients and clinicians and their cost-effectiveness will also be important to facilitate rapid implementation. Once these AI techniques are ready for implementation, they will require careful design to ensure effective integration into health information systems [189]. Data governance and protection must also be addressed, as they may present significant barriers to the implementation of these technologies [190,191].
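
The dependence of a technique's apparent performance on the prevalence of cancer in the evaluation setting can be illustrated with a short worked example. The following is a minimal sketch, not an analysis from this review: the sensitivity, specificity, and prevalence figures are hypothetical values chosen only for illustration, and the function name `predictive_values` is our own. It also assumes that sensitivity and specificity stay fixed across settings, which spectrum effects may violate in practice.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test, using Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Expected proportions of the tested population in each cell
    # of the 2x2 diagnostic accuracy table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical test characteristics (assumptions, not review data).
    SENSITIVITY = 0.90
    SPECIFICITY = 0.90

    # Hypothetical prevalences for two evaluation settings.
    settings = [
        ("specialist clinic, assumed 20% prevalence", 0.20),
        ("primary care, assumed 1% prevalence", 0.01),
    ]
    for label, prevalence in settings:
        ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"{label}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumed figures, the positive predictive value falls from roughly 0.69 in the higher-prevalence setting to roughly 0.08 in the lower-prevalence setting, although the test itself is unchanged. This is one reason why results obtained in specialist populations may not transfer to primary care.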

In conclusion, AI techniques have the potential to aid the interpretation of patient-reported symptoms and clinical signs and to support clinical management, doctor-patient communication, and informed decision making. Ultimately, in the context of early cancer detection, these techniques may help reduce missed diagnostic opportunities and improve safety netting. However, although there are a few good examples of staged validation of these AI techniques, most of the research is at an early stage. We found numerous examples of the implementation of AI technologies without any or sufficient evidence for their accuracy or safety. Further research is required to build up the evidence base for AI techniques applied to EHRs and to reassure commissioners, clinicians, and patients that they are safe and effective enough to be incorporated into routine clinical practice.

Acknowledgments


This research was funded by the National Institute for Health Research (NIHR) Policy Research Programme, conducted through the Policy Research Unit in Cancer Awareness, Screening, and Early Diagnosis, PR-PRU-1217-21601. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. This work was also supported by the CanTest Collaborative (funded by Cancer Research UK C8640/A23385), of which FW and WH are directors and JE, HS, and NdW are associate directors. HS is additionally supported by the Houston Veterans Administration Health Services Research and Development Center for Innovations in Quality, Effectiveness, and Safety (CIN13-413) and the Agency for Healthcare Research and Quality (R01HS27363). The funding sources had no role in the study design, data collection, data analysis, data interpretation, writing of the report, or the decision to submit for publication. The authors would like to thank Isla Kuhn, Reader Services Librarian, University of Cambridge Medical Library, for her help in developing the search strategy.

Authors' Contributions

OJ developed the protocol, completed the search, screened the articles for inclusion, extracted the data, synthesized the findings, interpreted the results, and drafted the manuscript. NC screened the articles for inclusion, extracted the data, and critically revised the manuscript. SS screened the articles for inclusion, extracted the data, and critically revised the manuscript. WH developed the protocol, interpreted the results, and critically revised the manuscript. SD, JE, HS, and NdW critically revised the manuscript. FW developed the protocol, synthesized the findings, interpreted the results, and critically revised the manuscript. All authors approved the final version.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Protocol for the study.

DOCX File, 34 KB

Multimedia Appendix 2

Search strategies.

DOCX File, 16 KB

Multimedia Appendix 3

Results of the full-text article review.

DOCX File, 38 KB

Multimedia Appendix 4

Supplementary information to Table 1.

DOCX File, 36 KB

  1. Cancer statistics for the UK. Cancer Research UK.   URL: [accessed 2020-11-30]
  2. Hamilton W. Diagnosing symptomatic cancer in the NHS. Br Med J 2015 Oct 13;351:5311. [CrossRef] [Medline]
  3. Coleman M, Forman D, Bryant H, Butler J, Rachet B, Maringe C, et al. Cancer survival in Australia, Canada, Denmark, Norway, Sweden, and the UK, 1995–2007 (the International Cancer Benchmarking Partnership): an analysis of population-based cancer registry data. The Lancet 2011 Jan;377(9760):127-138. [CrossRef]
  4. Hiom SC. Diagnosing cancer earlier: reviewing the evidence for improving cancer survival. Br J Cancer 2015 Mar 31;112 Suppl 1(S1):S1-S5 [FREE Full text] [CrossRef] [Medline]
  5. Garbe C, Peris K, Hauschild A, Saiag P, Middleton M, Bastholt L, European Dermatology Forum (EDF), European Association of Dermato-Oncology (EADO), European Organisation for Research and Treatment of Cancer (EORTC). Diagnosis and treatment of melanoma. European consensus-based interdisciplinary guideline - update 2016. Eur J Cancer 2016 Aug;63:201-217. [CrossRef] [Medline]
  6. Lyratzopoulos G, Wardle J, Rubin G. Rethinking diagnostic delay in cancer: how difficult is the diagnosis? Br Med J 2014 Dec 09;349(dec09 3):7400-7400 [FREE Full text] [CrossRef] [Medline]
  7. Isabel Differential Diagnosis Generator. Isabel Healthcare. 2018.   URL: [accessed 2020-11-30]
  8. Artificial intelligence. Babylon Health.   URL: [accessed 2020-11-30]
  9. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. J Am Med Assoc 2016 Dec 13;316(22):2402-2410. [CrossRef] [Medline]
  10. McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, et al. International evaluation of an AI system for breast cancer screening. Nature 2020 Jan 1;577(7788):89-94. [CrossRef] [Medline]
  11. Li Z, Yu L, Wang X, Yu H, Gao Y, Ren Y, et al. Diagnostic performance of mammographic texture analysis in the differential diagnosis of benign and malignant breast tumors. Clin Breast Cancer 2018 Aug;18(4):621-627. [CrossRef] [Medline]
  12. Chilamkurthy S, Ghosh R, Tanamala S, Biviji M, Campeau NG, Venugopal VK, et al. Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study. The Lancet 2018 Dec;392(10162):2388-2396. [CrossRef]
  13. Shafique S, Tehsin S. Acute lymphoblastic leukemia detection and classification of its subtypes using pretrained deep convolutional neural networks. Technol Cancer Res Treat 2018 Jan 01;17:1533033818802789 [FREE Full text] [CrossRef] [Medline]
  14. Preparing the healthcare workforce to deliver the digital future. The Topol Review.: NHS Health Education England; 2019.   URL: [accessed 2020-11-30]
  15. Artificial intelligence and primary care. Royal College of General Practitioners.   URL: [accessed 2020-11-30]
  16. Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K. Artificial intelligence, bias and clinical safety. BMJ Qual Saf 2019 Mar 12;28(3):231-237 [FREE Full text] [CrossRef] [Medline]
  17. Kueper JK, Terry AL, Zwarenstein M, Lizotte DJ. Artificial intelligence and primary care research: a scoping review. Ann Fam Med 2020 May 01;18(3):250-258 [FREE Full text] [CrossRef] [Medline]
  18. Millenson M, Baldwin J, Zipperer L, Singh H. Beyond Dr. Google: the evidence on consumer-facing digital tools for diagnosis. Diagnosis (Berl) 2018 Sep 25;5(3):95-105 [FREE Full text] [CrossRef] [Medline]
  19. Riches N, Panagioti M, Alam R, Cheraghi-Sohi S, Campbell S, Esmail A, et al. The Effectiveness of Electronic Differential Diagnoses (DDX) Generators: a systematic review and meta-analysis. PLoS One 2016 Mar 8;11(3):0148991 [FREE Full text] [CrossRef] [Medline]
  20. Semigran H, Linder J, Gidengil C, Mehrotra A. Evaluation of symptom checkers for self diagnosis and triage: audit study. Br Med J 2015 Jul 08;351:3480 [FREE Full text] [CrossRef] [Medline]
  21. Chambers D, Cantrell AJ, Johnson M, Preston L, Baxter SK, Booth A, et al. Digital and online symptom checkers and health assessment/triage services for urgent health problems: systematic review. BMJ Open 2019 Aug 01;9(8):027743 [FREE Full text] [CrossRef] [Medline]
  22. Walter FM, Thompson MJ, Wellwood I, Abel GA, Hamilton W, Johnson M, et al. Evaluating diagnostic strategies for early detection of cancer: the CanTest framework. BMC Cancer 2019 Jun 14;19(1):586 [FREE Full text] [CrossRef] [Medline]
  23. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev 2015 Jan 01;4(1):1 [FREE Full text] [CrossRef] [Medline]
  24. Jones O, Ranmuthu C, Prathivadi K, Saji S, Calanzani N, Emery J, et al. Establishing which modalities of artificial intelligence (AI) for the early detection and diagnosis of cancer are ready for implementation in primary care: a systematic review. Prospero: International prospective register of systematic reviews 2020 [FREE Full text] [CrossRef]
  25. McCarthy J, Minsky M, Rochester N, Shannon C. A proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955. AI Magazine,27(4), 12. 2006.   URL: [accessed 2021-01-25]
  26. Muehlhauser L. What should we learn from past AI forecasts? Open Philanthropy Project. 2016.   URL: https:/​/www.​​focus/​global-catastrophic-risks/​potential-risks-advanced-artificial-intelligence/​what-should-we-learn-past-ai-forecasts [accessed 2021-01-25]
  27. AI in the UK?: Ready, Willing and Able. HOUSE OF LORDS: Select Committee on Artificial Intelligence. 2018.   URL: [accessed 2021-01-25]
  28. e-Print archive. Cornell University.   URL: [accessed 2020-11-30]
  29. Research. Google AI - research.   URL: [accessed 2020-11-30]
  30. Emerging technology, computer, and software research. Microsoft Research.   URL: [accessed 2020-11-30]
  31. Artificial intelligence. IBM Research.   URL: [accessed 2020-11-30]
  32. Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011 Oct 18;155(8):529-536 [FREE Full text] [CrossRef] [Medline]
  33. Popay J, Roberts H, Sowden A, Petticrew M, Arai L, Rodgers M, et al. Guidance on the conduct of narrative synthesis in systematic reviews. ESRC Methods Program Swindon.: University of Lancaster; 2006.   URL: [accessed 2021-01-25]
  34. Hornbrook MC, Goshen R, Choman E, O'Keeffe-Rosetti M, Kinar Y, Liles EG, et al. Early colorectal cancer detected by machine learning model using gender, age, and complete blood count data. Dig Dis Sci 2017 Oct 23;62(10):2719-2727. [CrossRef] [Medline]
  35. Survival three times higher when cancer is diagnosed early. Cancer Research UK.   URL: https:/​/www.​​about-us/​cancer-news/​press-release/​2015-08-10-survival-three-times-higher-when-cancer-is-diagnosed-early [accessed 2019-12-17]
  36. NHS Long Term Plan.   URL: [accessed 2021-02-08]
  37. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017 Feb 02;542(7639):115-118. [CrossRef] [Medline]
  38. Marchetti MA, Codella NC, Dusza SW, Gutman DA, Helba B, Kalloo A, International Skin Imaging Collaboration. Results of the 2016 International Skin Imaging Collaboration International Symposium on Biomedical Imaging challenge: Comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images. J Am Acad Dermatol 2018 Feb;78(2):270-277.e1 [FREE Full text] [CrossRef] [Medline]
  39. Alzubi JA, Bharathikannan B, Tanwar S, Manikandan R, Khanna A, Thaventhiran C. Boosted neural network ensemble classification for lung cancer disease diagnosis. Applied Soft Computing 2019 Jul;80:579-591. [CrossRef]
  40. Chang C, Hsu M. The study that applies artificial intelligence and logistic regression for assistance in differential diagnostic of pancreatic cancer. Expert Systems with Applications 2009 Sep;36(7):10663-10672. [CrossRef]
  41. Cooper JA, Parsons N, Stinton C, Mathews C, Smith S, Halloran SP, et al. Risk-adjusted colorectal cancer screening using the FIT and routine screening data: development of a risk prediction model. Br J Cancer 2018 Jan 2;118(2):285-293 [FREE Full text] [CrossRef] [Medline]
  42. Cowley J. The use of knowledge discovery databases in the identification of patients with colorectal cancer. University of Hull [Dissertation]. 2012 Jul 01.   URL: [accessed 2020-11-30]
  43. Daqqa K, Maghari A. Prediction and diagnosis of leukemia using classification algorithms. In: Proceedings of 8th International Conference on Information Technology. 2017 Presented at: 8th International Conference on Information Technology (ICIT); May 17-18, 2017; Amman, Jordan p. 638-643. [CrossRef]
  44. Goryński K, Safian I, Gradzki W. Artificial neural networks approach to early lung cancer detection. Cent Eur J Med 2014;9(5):632-641. [CrossRef]
  45. Hart GR, Roffman DA, Decker R, Deng J. A multi-parameterized artificial neural network for lung cancer risk prediction. PLoS One 2018 Oct 24;13(10):0205264 [FREE Full text] [CrossRef] [Medline]
  46. Kalra P, Togami J, Bansal BSG, Partin AW, Brawer MK, Babaian RJ, et al. A neurocomputational model for prostate carcinoma detection. Cancer 2003 Nov 01;98(9):1849-1854 [FREE Full text] [CrossRef] [Medline]
  47. Kang G, Ni Z. Research on early risk predictive model and discriminative feature selection of cancer based on real-world routine physical examination data. In: Proceedings of IEEE Int Conf Bioinforma Biomed BIBM. 2016 Presented at: IEEE Int Conf Bioinforma Biomed BIBM; 2016; Shenzhen, China p. 1512-1519. [CrossRef]
  48. Kinar Y, Kalkstein N, Akiva P, Levin B, Half EE, Goldshtein I, et al. Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts: a binational retrospective study. J Am Med Inform Assoc 2016 Sep 15;23(5):879-890 [FREE Full text] [CrossRef] [Medline]
  49. Kop R, Hoogendoorn M, Teije AT, Büchner FL, Slottje P, Moons LM, et al. Predictive modeling of colorectal cancer using a dedicated pre-processing pipeline on routine electronic medical records. Comput Biol Med 2016 Sep 01;76:30-38. [CrossRef] [Medline]
  50. Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep 2016 May 17;6(1):26094 [FREE Full text] [CrossRef] [Medline]
  51. Payandeh M, Aeinfar M, Aeinfar V, Hayati M. A new method for diagnosis and predicting blood disorder and cancer using artificial intelligence (Artificial Neural Networks). Int J Hematol Stem Cell Res 2009;3(4):25-33. [CrossRef]
  52. Birks J, Bankhead C, Holt TA, Fuller A, Patnick J. Evaluation of a prediction model for colorectal cancer: retrospective analysis of 2.5 million patient records. Cancer Med 2017 Oct 21;6(10):2453-2460 [FREE Full text] [CrossRef] [Medline]
  53. Kinar Y, Akiva P, Choman E, Kariv R, Shalev V, Levin B, et al. Performance analysis of a machine learning flagging system used to identify a group of individuals at a high risk for colorectal cancer. PLoS One 2017 Feb 9;12(2):0171759 [FREE Full text] [CrossRef] [Medline]
  54. Malik N, Idris W, Gunawan TS, Olanrewaju RF, Ibrahim SN. Classification of normal and crackles respiratory sounds into healthy and lung cancer groups. Int J Electr Comput Eng 2018 Jun 01;8(3):1530. [CrossRef]
  55. Adams K, Sideris M, Papagrigoriadis S. Lunchtime Posters-Can we make “Straight to Test” decisions in Two Week Wait (2WW) patients with the help of an Artificial Neural Network (ANN)? Colorectal Dis 2014 Aug 22;16:41-68. [CrossRef]
  56. Ahmed A, Shah M, Wahid A, ul Islam S, Abbasi MK, Asghar MN. Big data analytics using neural networks for earlier cancer detection. J Med Imaging Hlth Inform 2017 Oct 01;7(6):1469-1474. [CrossRef]
  57. Ahmed K, Emran AA, Jesmin T, Mukti RF, Rahman MZ, Ahmed F. Early detection of lung cancer risk using data mining. Asian Pac J Cancer Prev 2013;14(1):595-598 [FREE Full text] [CrossRef] [Medline]
  58. Ahmen U, Rasool G, Zafar S, Maqbool HF. Fuzzy Rule Based Diagnostic System to Detect the Lung Cancer. 2018 Presented at: 2018 International Conference on Computing, Electronic and Electrical Engineering (ICE Cube); November 12-13, 2018; Quetta, Pakistan   URL: [CrossRef]
  59. Alaa A, Moon K, Hsu W, van der Schaar M. ConfidentCare: a clinical decision support system for personalized breast cancer screening. IEEE Trans Multimedia 2016 Oct;18(10):1942-1955. [CrossRef]
  60. Alharbi A, Tchier F, Rashidi M. Using a GeneticFuzzy algorithm as a computer aided breast cancer diagnostic tool. Asian Pac J Cancer Prev 2016;17(7):3651-3658 [FREE Full text] [Medline]
  61. Ayeldeen H, Elfattah MA, Shaker O, Hassanien AE, Kim TH. Case-Based Retrieval Approach of Clinical Breast Cancer Patients. 2015 Presented at: 2015 3rd International Conference on Computer, Information and Application; May 21-23, 2015; Yeosu, South Korea. [CrossRef]
  62. Balachandran K. An efficient optimization based lung cancer pre-diagnosis system with aid of Feed Forward Back Propagation Neural Network (FFBNN). J Theor Appl Inf Technol 2013 Oct;56(2):263-271 [FREE Full text]
  63. Bhar JA, George V, Malik B. Cloud Computing with Machine Learning Could Help Us in the Early Diagnosis of Breast Cancer. 2015 Presented at: Second International Conference on Advances in Computing and Communication Engineering; May 1-2, 2015; Dehradun, India. [CrossRef]
  64. Chauhan R, Kaur H, Sharma S. A Feature Based Approach for Medical Databases. In: Proceedings of the International Conference on Advances in Information Communication Technology & Computing. 2016 Presented at: AICTC '16; Aug 2016; Bikaner, India. [CrossRef]
  65. Chen Y, Joo EM. Biomedical diagnosis and prediction using parsimonious fuzzy neural networks. 2012 Presented at: 38th Annual Conference on IEEE Industrial Electronics Society; December 24, 2012; Montreal, QC, Canada. [CrossRef]
  66. Choudhury T, Kumar V, Nigam D, Vashisht V. Intelligent Classification of Lung & Oral Cancer through Diverse Data Mining Algorithms. Presented at: 2016 International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE); 2016; Ghaziabad p. 133-138. [CrossRef]
  67. Çınar M, Engin M, Engin EZ, Ziya Ateşçi Y. Early prostate cancer diagnosis by using artificial neural networks and support vector machines. Expert Systems with Applications 2009 Apr;36(3):6357-6361. [CrossRef]
  68. Del Grossi AA, De Mattos Senefonte HC, Quaglio VG. Prostate cancer biopsy recommendation through use of machine learning classification techniques. In: Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Switzerland: Springer; 2014:710-721.
  69. Durga S, Kasturi K. Lung disease prediction system using data mining techniques. J Adv Res in Dynamical and Contr Sys 2017;9(5):62-66 [FREE Full text]
  70. Elhoseny M, Bian G, Lakshmanaprabu S, Shankar K, Singh AK, Wu W. Effective features to classify ovarian cancer data in internet of medical things. Computer Networks 2019 Aug;159:147-156. [CrossRef]
  71. Elshazly HI, Elkorany AM, Hassanien AE. Ensemble-based classifiers for prostate cancer diagnosis. In: Proceedings of the 9th International Computer Engineering Conference: Today Information Society What’s Next?, ICENCO 2013.: IEEE Computer Society; 2013 Presented at: 9th International Computer Engineering Conference: Today Information Society What’s Next?, ICENCO 2013; 2013 p. 49-54. [CrossRef]
  72. Fan Y, Chaovalitwongse WA. Optimizing feature selection to improve medical diagnosis. Ann Oper Res 2009 Jan 6;174(1):169-183. [CrossRef]
  73. Gaebel J, Cypko MA, Lemke HU. Accessing patient information for probabilistic patient models using existing standards. Stud Health Technol Inform 2016;223:107-112. [Medline]
  74. Gao Z, Gong J, Qin Q, Lin J. [Application of support vector machine in the detection of early cancer]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2005 Oct;22(5):1045-1048. [Medline]
  75. Gelnarová E, Šafařík L. Comparison of three statistical classifiers on a prostate cancer data. Neural Network World 2005;15(4):311-318 [FREE Full text]
  76. Ghaderzadeh M. Clinical decision support system for early detection of prostate cancer from benign hyperplasia of prostate. In: Proceedings of the 14th World Congress on Medical and Health Informatics, Pts 1 and 2. 2013 Presented at: Proceedings of the 14th World Congress on Medical and Health Informatics, Pts 1 and 2; 2013; Netherlands p. 928. [CrossRef]
  77. Ghany KKA, Ayeldeen H, Zawbaa HM, Shaker O, IEEE. A rough set-based reasoner for medical diagnosis. In: Proceedings of the International Conference on Green Computing and Internet of Things. 2015 Presented at: International Conference on Green Computing and Internet of Things; 2015; Beni-Suef University, Egypt p. 429-434. [CrossRef]
  78. Ghany KKA, Ayeldeen H, Zawbaa HM, Shaker O, Ayedeen G, IEEE. Diagnosis of breast cancer using secured classifiers. In: Proceedings of the International Conference on Electrical and Computing Technologies and Applications. 2017 Presented at: International Conference on Electrical and Computing Technologies and Applications; 2017; Beni-Suef University, Egypt p. 680-684. [CrossRef]
  79. Gorunescu F, Gorunescu M, El-Darzi E, Ene M, Gorunescu S. Statistical comparison of a probabilistic neural network approach in hepatic cancer diagnosis. In: EUROCON 2005 - The International Conference on Computer as a Tool. 2005 Presented at: EUROCON 2005 - The International Conference on Computer as a Tool; 2005; Belgrade; Yugoslavia p. 237-240. [CrossRef]
  80. Gorunescu F, Belciug S. Boosting backpropagation algorithm by stimulus-sampling: application in computer-aided medical diagnosis. J Biomed Inform 2016 Oct;63:74-81 [FREE Full text] [CrossRef] [Medline]
  81. Gorunescu M, Gorunescu F, Revett K. A neural computing-based approach for the early detection of hepatocellular carcinoma. Proceedings of World Academy of Science, Engineering and Technology 2006;17:65 [FREE Full text]
  82. Govinda K, Singla K, Jain K. Fuzzy based uncertainty modeling of Cancer Diagnosis System. In: Proceedings of the International Conference on Intelligent Sustainable Systems (ICISS). 2017 Presented at: International Conference on Intelligent Sustainable Systems (ICISS); Dec 7-8, 2017; Palladam, India p. 740-743. [CrossRef]
  83. Halpern Y, Horng SK, Choi Y, Sontag D. Electronic medical record phenotyping using the anchor and learn framework. J Am Med Inform Assoc 2016 Jul;23(4):731-740 [FREE Full text] [CrossRef] [Medline]
  84. Hart GR, Roffman DA, Decker R. Scientific abstracts and sessions. Med Phys 2018 Jun 11;45(6):e120-e706. [CrossRef]
  85. Hornbrook MC, Goshen R, Choman E, O'Keeffe-Rosetti M, Kinar Y, Liles EG, et al. Correction to: early colorectal cancer detected by machine learning model using gender, age, and complete blood count data. Dig Dis Sci 2018 Jan;63(1):270. [CrossRef] [Medline]
  86. Hsu JL, Hung PC, Lin HY, Hsieh CH. Applying under-sampling techniques and cost-sensitive learning methods on risk assessment of breast cancer. J Med Syst 2015 Apr;39(4):210. [CrossRef] [Medline]
  87. Ilhan HO, Celik E. The mesothelioma disease diagnosis with artificial intelligence methods. In: Proceedings of the 10th International Conference on Application of Information and Communication Technologies (AICT). 2016 Presented at: 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT); Oct 12-14, 2016; Baku, Azerbaijan. [CrossRef]
  88. Ji Z, Wang B. Identifying potential clinical syndromes of hepatocellular carcinoma using PSO-based hierarchical feature selection algorithm. Biomed Res Int 2014;2014:127572 [FREE Full text] [CrossRef] [Medline]
  89. Kong Q, Wang D, Wang Y, Jin Y, Jiang B. Multi-objective neural network-based diagnostic model of prostatic cancer. System Engineering Theory and Practice 2018;38(2):532-544. [CrossRef]
  90. Kou L, Yuan Y, Sun J, Lin Y. Prediction of cancer based on mobile cloud computing and SVM. In: Proceedings of the International Conference on Dependable Systems and Their Applications (DSA). 2017 Presented at: 2017 International Conference on Dependable Systems and Their Applications (DSA); 2017; Beijing, China. [CrossRef]
  91. Kshivets O. P2.11-13 Precise early detection of lung cancer and blood cell circuit. J Thoracic Oncol 2018 Oct;13(10):S783. [CrossRef]
  92. Liu S, Gaudiot J, Cristini V. Prototyping virtual cancer therapist (VCT): a software engineering approach. Conf Proc IEEE Eng Med Biol Soc 2006;2006:5424-5427. [CrossRef] [Medline]
  93. Liu Y, Pan Q, Zhou Z. Improved feature selection algorithm for prognosis prediction of primary liver cancer. In: Intelligence Science II. Switzerland: Springer; 2018:422-430.
  94. Meng J, Zhang R, Chen D. Utilizing narrative text from electronic health records for early warning model of chronic disease. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC). 2018 Presented at: 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC); Oct 7-10, 2018; Miyazaki, Japan. [CrossRef]
  95. Mesrabadi HA, Faez K. Improving early prostate cancer diagnosis by using Artificial Neural Networks and Deep Learning. In: Proceedings of the 4th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS). 2018 Presented at: 2018 4th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS); Dec 25-27, 2018; Tehran, Iran. [CrossRef]
  96. Morgado P, Vicente H, Abelha A, Machado J, Neves J, Neves J. A case-based approach to colorectal cancer detection. In: Information Science and Applications 2017. Singapore: Springer; 2017:433-442.
  97. Nalluri MR, Roy DS. Hybrid disease diagnosis using multiobjective optimization with evolutionary parameter optimization. J Healthc Eng 2017;2017:5907264 [FREE Full text] [CrossRef] [Medline]
  98. Nikitaev VG, Pronichev AN, Nagornov OV, Zaytsev SM, Polyakov EV, Romanov NA, et al. Decision support system in urologic cancer diagnosis. J Phys : Conf Ser 2019 Apr 16;1189:012032. [CrossRef]
  99. Polat K, Senturk U. A novel ML approach to prediction of breast cancer: combining of mad normalization, KMC based feature weighting and AdaBoostM1 classifier. In: 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT). 2018 Presented at: 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT); Oct 19-21, 2018; Ankara, Turkey. [CrossRef]
  100. Rahman A, Muniyandi RC. Feature selection from colon cancer dataset for cancer classification using Artificial Neural Network. Int J Adv Sci Engi and Info Tech. 2018.   URL: https:/​/www.​​publication/​328924307_Feature_selection_from_colon_cancer_dataset_for_cancer_classification_using_Artificial_Neural_Network [accessed 2021-02-08]
  101. Ramya Devi M, Gomathy B. An intelligent system for the detection of breast cancer using feature selection and PCA methods. Int J Appl Engi Res. 2015.   URL: https:/​/www.​​publication/​283232820_An_intelligent_system_for_the_detection_of_breast_cancer_using_feature_selection_and_PCA_methods [accessed 2021-02-08]
  102. Richter AN, Khoshgoftaar TM. Melanoma risk modeling from limited positive samples. Netw Model Anal Health Inform Bioinforma 2019 Apr 4;8(1):-. [CrossRef]
  103. Safdari R, Arpanahi H, Langarizadeh M, Ghazisaiedi M, Dargahi H, Zendehdel K. Design a fuzzy rule-based expert system to aid earlier diagnosis of gastric cancer. Acta Inform Med 2018;26(1):19. [CrossRef]
  104. Shalev V, Kinar Y, Kalkstein N, Akiva P, Half E, Goldshtein I, et al. Computational analysis of blood counts significantly increases detection rate of gastric and colorectal cancers: PR0195 Esophageal, Gastric and Duodenal Disorders. J Gastroenterol and Hepatol 2013:761-762.
  105. Sobar, Machmud R, Wijaya A. Behavior determinant based cervical cancer early detection with machine learning algorithm. Advanced Science Letters 2016;22(10):3120-3123. [CrossRef]
  106. Soliman THA, Mohamed R, Sewissy AA. A hybrid analytical hierarchical process and deep neural networks approach for classifying breast cancer. In: Proceedings of the 11th International Conference on Computer Engineering & Systems (ICCES). 2016 Presented at: 2016 11th International Conference on Computer Engineering & Systems (ICCES); Dec 20-21, 2016; Cairo, Egypt. [CrossRef]
  107. Sushma Rani N, Srinivasa Rao P, Parimala P. An efficient statistical computation technique for health care big data using R. IOP Conf Ser : Mater Sci Eng 2017 Sep 07;225:012159. [CrossRef]
  108. Wang D, Quek C, See Ng G. Ovarian cancer diagnosis using a hybrid intelligent system with simple yet convincing rules. Applied Soft Computing 2014 Jul;20:25-39. [CrossRef]
  109. Wang G, Teoh JYC, Choi KS. Diagnosis of prostate cancer in a Chinese population by using machine learning methods. Annu Int Conf IEEE Eng Med Biol Soc 2018 Jul;2018:1-4. [CrossRef] [Medline]
  110. Xu W, Zhang R, Qimin E, Liu J, Laing C. [The clinical application of data mining in laryngeal cancer]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi 2015 Jul;29(14):1272-1275. [Medline]
  111. Yasodha P, Ananthanarayanan NR. Analysing big data to build knowledge based system for early detection of ovarian cancer. Indian J Sci and Tech 2015;8(14). [CrossRef]
  112. Zangooei MH, Habibi J, Alizadehsani R. Disease Diagnosis with a hybrid method SVR using NSGA-II. Neurocomputing 2014 Jul;136:14-29. [CrossRef]
  113. Zhang L, Wang H, Liang J, Wang J. Decision support in cancer base on fuzzy adaptive PSO for feedforward neural network training. In: Proceedings of the International Symposium on Computer Science and Computational Technology. 2008 Presented at: 2008 International Symposium on Computer Science and Computational Technology; Dec 20-22, 2008; Shanghai, China. [CrossRef]
  114. Zhang Z, Zhang H, Bast Jr RC. An application of artificial neural networks in ovarian cancer early detection. In: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. 2000 Presented at: IEEE-INNS-ENNS International Joint Conference on Neural Networks; July 27, 2000; Como, Italy   URL:
  115. Hoogendoorn M, Szolovits P, Moons LM, Numans ME. Utilizing uncoded consultation notes from electronic medical records for predictive modeling of colorectal cancer. Artif Intell Med 2016 May;69:53-61 [FREE Full text] [CrossRef] [Medline]
  116. Moss S, Mathews C, Day TJ, Smith S, Seaman HE, Snowball J, et al. Increased uptake and improved outcomes of bowel cancer screening with a faecal immunochemical test: results from a pilot study within the national screening programme in England. Gut 2017 Sep 07;66(9):1631-1644. [CrossRef] [Medline]
  117. Bond WF, Schwartz LM, Weaver KR, Levick D, Giuliano M, Graber ML. Differential diagnosis generators: an evaluation of currently available computer programs. J Gen Intern Med 2012 Feb 26;27(2):213-219 [FREE Full text] [CrossRef] [Medline]
  118. DXplain. DXplain.   URL: [accessed 2020-11-30]
  119. Symcat Symptom Checker. Symcat Symptom Checker.   URL: [accessed 2020-11-30]
  120. Heckerling PS, Elstein AS, Terzian CG, Kushner MS. The effect of incomplete knowledge on the diagnoses of a computer consultant system. Med Inform (Lond) 1991 Jul 12;16(4):363-370. [CrossRef] [Medline]
  121. Miller RA, Pople HE, Myers JD. Internist-I, an experimental computer-based diagnostic consultant for general internal medicine. N Engl J Med 1982 Aug 19;307(8):468-476. [CrossRef]
  122. Miller RA, McNeil MA, Challinor SM, Masarie FE, Myers JD. The Internist-1/quick medical reference project--status report. West J Med 1986 Dec;145(6):816-822 [FREE Full text] [Medline]
  123. Wexler JR, Swender PT, Tunnessen WW, Oski FA. Impact of a system of computer-assisted diagnosis. Initial evaluation of the hospitalized patient. Am J Dis Child 1975 Feb 01;129(2):203-205. [CrossRef] [Medline]
  124. Rodríguez-González A, Torres-Niño J, Mayer MA, Alor-Hernandez G, Wilkinson MD. Analysis of a multilevel diagnosis decision support system and its implications: a case study. Comput Math Methods Med 2012 Sep;2012(9):367345-367333 [FREE Full text] [CrossRef] [Medline]
  125. Clinical decision support. PEPID.   URL: [accessed 2020-11-30]
  126. Apkon M, Mattera JA, Lin Z, Herrin J, Bradley EH, Carbone M, et al. A randomized outpatient trial of a decision-support information technology tool. Arch Intern Med 2005 Nov 14;165(20):2388-2394. [CrossRef] [Medline]
  127. Nelson SJ, Blois MS, Tuttle MS, Erlbaum M, Harrison P, Kim H, et al. Evaluating RECONSIDER. J Med Syst 1985 Dec;9(5-6):379-388. [CrossRef]
  128. Our Solutions!. Abtrace.   URL: [accessed 2020-11-30]
  129. C the signs. C the Signs.   URL: [accessed 2020-11-30]
  130. DocResponse. DocResponse.   URL: [accessed 2020-11-30]
  131. Isabel Pro - the DDx Generator. Isabel Healthcare.   URL: [accessed 2020-11-30]
  132. Medial EarlySign. Medial EarlySign.   URL: [accessed 2020-11-30]
  133. Edwards HB, Marques E, Hollingworth W, Horwood J, Farr M, Bernard E, et al. Use of a primary care online consultation system, by whom, when and why: evaluation of a pilot observational study in 36 general practices in South West England. BMJ Open 2017 Nov 22;7(11):e016901 [FREE Full text] [CrossRef] [Medline]
  134. Simptify.   URL: [accessed 2020-11-30]
  135. Check your symptoms online. Symptomate.   URL: [accessed 2020-11-30]
  136. Liang H, Tsui BY, Ni H, Valentim CCS, Baxter SL, Liu G, et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat Med 2019 Mar;25(3):433-438. [CrossRef] [Medline]
  137. Zhelezniak V, Busbridge D, Shen A, Smith S, Hammerla N. Decoding decoders: finding optimal representation spaces for unsupervised similarity tasks. ICLR 2018 Work Track. Preprint posted online September 5, 2018.   URL: [accessed 2020-11-30]
  138. Douglas L, Zarov I, Gourgoulias K, Lucas C, Hart C, Baker A, et al. A universal marginalizer for amortized inference in generative models. NIPS 2017 Work Adv Approx Bayesian Inference. Preprint posted online November 2, 2017.   URL: [accessed 2020-11-30]
  139. Smith S, Turban D, Hamblin S, Hammerla N. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. Preprint posted online February 13, 2017.   URL: [accessed 2020-11-30]
  140. NHS 111 Powered by Babylon - outcomes evaluation. Babylon Health. 2017.   URL: [accessed 2020-11-30]
  141. Middleton K, Butt M, Hammerla N, Hamblin S, Mehta K, Parsa A. Sorting out symptoms: design and evaluation of the 'babylon check' automated triage system. Preprint posted online June 7, 2016.   URL: [accessed 2020-11-30]
  142. Barnett GO, Cimino JJ, Hupp JA, Hoffer EP. DXplain: an evolving diagnostic decision-support system. J Am Med Assoc 1987 Jul 03;258(1):67. [CrossRef]
  143. Barnett GO, Famiglietti KT, Kim RJ, Hoffer EP, Feldman MJ. DXplain on the internet. Proc AMIA Symp 1998:607-611 [FREE Full text] [Medline]
  144. Bauer BA, Lee M, Bergstrom L, Wahner-Roedler DL, Bundrick J, Litin S, et al. Internal medicine resident satisfaction with a diagnostic decision support system (DXplain) introduced on a teaching hospital service. Proc AMIA Symp 2002:31-35 [FREE Full text] [Medline]
  145. Berner ES, Webster GD, Shugerman AA, Jackson JR, Algina J, Baker AL, et al. Performance of four computer-based diagnostic systems. N Engl J Med 1994 Jun 23;330(25):1792-1796. [CrossRef]
  146. Elhanan G, Socratous SA, Cimino JJ. Integrating DXplain into a clinical information system using the World Wide Web. Proc AMIA Annu Fall Symp 1996:348-352 [FREE Full text] [Medline]
  147. Elkin PL, Liebow M, Bauer BA, Chaliki S, Wahner-Roedler D, Bundrick J, et al. The introduction of a diagnostic decision support system (DXplain™) into the workflow of a teaching hospital service can decrease the cost of service for diagnostically challenging Diagnostic Related Groups (DRGs). Int J Med Inform 2010 Nov;79(11):772-777 [FREE Full text] [CrossRef] [Medline]
  148. Feldman MJ, Octo Barnett G. An approach to evaluating the accuracy of DXplain. Computer Methods and Programs in Biomedicine 1991 Aug;35(4):261-266. [CrossRef]
  149. Hammersley J, Cooney K. Evaluating the utility of available differential diagnosis systems. Proc Annu Symp Comput Appl Med Care 1988 Nov 9:229-231 [FREE Full text]
  150. Hoffer EP, Feldman MJ, Kim RJ, Famiglietti KT, Barnett GO. DXplain: patterns of use of a mature expert system. AMIA Annu Symp Proc 2005:321-325 [FREE Full text] [Medline]
  151. London S. DXplain: a web-based diagnostic decision support system for medical students. Medical Reference Services Quarterly 1998 May 07;17(2):17-28. [CrossRef]
  152. Elstein A, Friedman C, Wolf F, Murphy G, Miller J, Fine P, et al. Effects of a decision support system on the diagnostic accuracy of users: a preliminary report. J Am Med Inform Assoc 1996;3(6):422-428 [FREE Full text] [CrossRef] [Medline]
  153. Friedman CP, Elstein AS, Wolf FM, Murphy GC, Franz TM, Heckerling PS, et al. Enhancement of clinicians' diagnostic reasoning by computer-based consultation: a multisite study of 2 systems. J Am Med Assoc 1999 Nov 17;282(19):1851-1856. [CrossRef] [Medline]
  154. Gozum ME. Emulating cognitive diagnostic skills without clinical experience: a report of medical students using Quick Medical Reference and Iliad in the diagnosis of difficult clinical cases. Proc Annu Symp Comput Appl Med Care 1994:991 [FREE Full text] [Medline]
  155. Graber MA, VanScoy D. How well does decision support software perform in the emergency department? Emerg Med J 2003 Sep 01;20(5):426-428 [FREE Full text] [CrossRef] [Medline]
  156. Lange L, Haak S, Lincoln M. Use of Iliad to improve diagnostic performance of nurse practitioner students. J Nurs Educ 1997;36(1):36-45. [CrossRef]
  157. Lau L, Warner H, Poulsen A. Research review: a computer-based diagnostic model for individual case review. Top Health Inf Manage 1995 Feb;15(3):67-79. [Medline]
  158. Li YC, Haug PJ, Lincoln MJ, Turner CW, Pryor TA, Warner HH. Assessing the behavioral impact of a diagnostic decision support system. Proc Annu Symp Comput Appl Med Care 1995:805-809 [FREE Full text] [Medline]
  159. Lincoln MJ, Turner CW, Haug PJ, Warner HR, Williamson JW, Bouhaddou O, et al. Iliad training enhances medical students' diagnostic skills. J Med Syst 1991 Feb;15(1):93-110. [CrossRef]
  160. Murphy GC, Friedman CP, Elstein AS, Wolf FM, Miller T, Miller JG. The influence of a decision support system on the differential diagnosis of medical practitioners at three levels of training. Proc AMIA Annu Fall Symp 1996:219-223 [FREE Full text] [Medline]
  161. Wolf FM, Friedman CP, Elstein AS, Miller JG, Murphy GC, Heckerling P, et al. Changes in diagnostic decision-making after a computerized decision support consultation based on perceptions of need and helpfulness: a preliminary report. Proc AMIA Annu Fall Symp 1997:263-267 [FREE Full text] [Medline]
  162. Ramnarayan P, Roberts GC, Coren M, Nanduri V, Tomlinson A, Taylor PM, et al. Assessment of the potential impact of a reminder system on the reduction of diagnostic errors: a quasi-experimental study. BMC Med Inform Decis Mak 2006 Apr 28;6(1):22 [FREE Full text] [CrossRef] [Medline]
  163. Ramnarayan P, Winrow A, Coren M, Nanduri V, Buchdahl R, Jacobs B, et al. Diagnostic omission errors in acute paediatric practice: impact of a reminder system on decision-making. BMC Med Inform Decis Mak 2006 Nov 06;6:37 [FREE Full text] [CrossRef] [Medline]
  164. Carlson J, Abel M, Bridges D, Tomkowiak J. The impact of a diagnostic reminder system on student clinical reasoning during simulated case studies. Simulation in Healthcare: J Society Simul Healthcare 2011;6(1):11-17. [CrossRef]
  165. Graber ML. Taking steps towards a safer future: measures to promote timely and accurate medical diagnosis. Am J Med 2008 May;121(5 Suppl):S43-S46. [CrossRef] [Medline]
  166. Graber ML, Tompkins D, Holland JJ. Resources medical students use to derive a differential diagnosis. Med Teach 2009 Jun 27;31(6):522-527. [CrossRef] [Medline]
  167. Ramnarayan P, Cronje N, Brown R, Negus R, Coode B, Moss P, et al. Validation of a diagnostic reminder system in emergency medicine: a multi-centre study. Emerg Med J 2007 Sep 01;24(9):619-624 [FREE Full text] [CrossRef] [Medline]
  168. Bavdekar S, Pawar M. Evaluation of an Internet-Delivered Pediatric Diagnosis Support System (ISABEL®) in a Tertiary Care Center in India. Indian Pediatr 2005;42(11):91. [Medline]
  169. Ramnarayan P, Britto J. Paediatric clinical decision support systems. Arch Dis Child 2002 Nov;87(5):361-362 [FREE Full text] [CrossRef] [Medline]
  170. Meyer AND, Giardina TD, Spitzmueller C, Shahid U, Scott TMT, Singh H. Patient perspectives on the usefulness of an artificial intelligence-assisted symptom checker: cross-sectional survey study. J Med Internet Res 2020 Jan 30;22(1):14679 [FREE Full text] [CrossRef] [Medline]
  171. Waxman HS, Worley WE. Computer-assisted adult medical diagnosis: subject review and evaluation of a new microcomputer-based system. Medicine (Baltimore) 1990 May;69(3):125-136. [Medline]
  172. Goshen R, Choman E, Ran A, Muller E, Kariv R, Chodick G, et al. Computer-assisted flagging of individuals at high risk of colorectal cancer in a large health maintenance organization using the colonflag test. JCO Clinical Cancer Informatics 2018 Dec(2):1-8. [CrossRef]
  173. Zack CJ, Senecal C, Kinar Y, Metzger Y, Bar-Sinai Y, Widmer RJ, et al. Leveraging machine learning techniques to forecast patient prognosis after percutaneous coronary intervention. JACC Cardiovasc Interv 2019 Jul 22;12(14):1304-1311 [FREE Full text] [CrossRef] [Medline]
  174. Cahn A, Shoshan A, Sagiv T, Yesharim R, Goshen R, Shalev V, et al. Prediction of progression from pre-diabetes to diabetes: development and validation of a machine learning model. Diabetes Metab Res Rev 2020 Feb 14;36(2):e3252. [CrossRef] [Medline]
  175. EMIS Health - online triage. EMIS Health.   URL: [accessed 2020-11-30]
  176. Hurley Group. Hurley Group.   URL: [accessed 2020-11-30]
  177. Carter M, Fletcher E, Sansom A, Warren FC, Campbell JL. Feasibility, acceptability and effectiveness of an online alternative to face-to-face consultation in general practice: a mixed-methods study of webGP in six Devon practices. BMJ Open 2018 Feb 15;8(2):018688 [FREE Full text] [CrossRef] [Medline]
  178. Cowie J, Calveley E, Bowers G, Bowers J. Evaluation of a digital consultation and self-care advice tool in primary care: a multi-methods study. Int J Environ Res Public Health 2018 May 02;15(5):896 [FREE Full text] [CrossRef] [Medline]
  179. Arene I, Ahmed W, Fox M, Barr CE, Fisher K. Evaluation of quick medical reference (QMR) as a teaching tool. MD Comput 1998;15(5):323-326. [Medline]
  180. Bacchus CM, Quinton C, O’Rourke K, Detsky AS. A randomized crossover trial of quick medical reference (QMR) as a teaching tool for medical interns. J Gen Intern Med 1994 Nov;9(11):616-621. [CrossRef]
  181. Bankowitz R, McNeil M, Challinor S, Parker R, Kapoor W, Miller R. A computer-assisted medical diagnostic consultation service. Implementation and prospective evaluation of a prototype. Ann Intern Med 1989 May 15;110(10):824-832 [FREE Full text] [CrossRef] [Medline]
  182. Berner ES, Maisiak RS, Cobbs CG, Taunton OD. Effects of a decision support system on physicians' diagnostic performance. J Am Med Inform Assoc 1999 Sep 01;6(5):420-427 [FREE Full text] [CrossRef] [Medline]
  183. Lemaire JB, Schaefer JP, Martin LA, Faris P, Ainslie MD, Hull RD. Effectiveness of the Quick Medical Reference as a diagnostic tool. Can Med Asso J 1999 Sep 21;161(6):725-728 [FREE Full text] [Medline]
  184. Kop R, Hoogendoorn M, Moons L, Numans M, ten Teije A. On the advantage of using dedicated data mining techniques to predict colorectal cancer. In: Holmes J, Bellazzi R, Sacchi L, Peek N, editors. Artificial Intelligence in Medicine. AIME 2015. Lecture Notes in Computer Science, vol 9105. Switzerland: Springer International Publishing; 2015:133-142.
  185. Altman DG, Vergouwe Y, Royston P, Moons KGM. Prognosis and prognostic research: validating a prognostic model. Br Med J 2009 May 28;338(may28 1):605-605. [CrossRef] [Medline]
  186. Singh H, Sittig DF. A sociotechnical framework for Safety-Related Electronic Health Record Research Reporting: The SAFER Reporting framework. Annals of Internal Medicine 2020 Jun 02;172(11_Supplement):92-100. [CrossRef]
  187. Usher-Smith JA, Sharp SJ, Griffin SJ. The spectrum effect in tests for risk prediction, screening, and diagnosis. Br Med J 2016 Jun 22;353:3139 [FREE Full text] [CrossRef] [Medline]
  188. Kanagasingam Y, Xiao D, Vignarajan J, Preetham A, Tay-Kearney M, Mehrotra A. Evaluation of artificial intelligence-based grading of diabetic retinopathy in primary care. JAMA Netw Open 2018 Sep 07;1(5):182665 [FREE Full text] [CrossRef] [Medline]
  189. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med 2019 Oct 29;17(1):195 [FREE Full text] [CrossRef] [Medline]
  190. Forcier M, Gallois H, Mullan S, Joly Y. Integrating artificial intelligence into health care through data access: can the GDPR act as a beacon for policymakers? J Law Biosci 2019 Oct;6(1):317-335 [FREE Full text] [CrossRef] [Medline]
  191. European Parliamentary Research Service (EPRS). The impact of the General Data Protection Regulation (GDPR) on artificial intelligence. STUDY: Panel for the Future of Science and Technology. 2020.   URL: [accessed 2020-11-30]

AI: artificial intelligence
AUROC: area under the receiver operating characteristic curve
CT: computed tomography
EHR: electronic health record
NHS: National Health Service
NIHR: National Institute for Health Research
NPV: negative predictive value
PPV: positive predictive value
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
QUADAS-2: quality assessment of diagnostic accuracy studies-2

Edited by G Eysenbach; submitted 13.08.20; peer-reviewed by Y Liang, Y Chu, R Verheij; comments to author 01.10.20; revised version received 05.11.20; accepted 30.11.20; published 03.03.21


©Owain T Jones, Natalia Calanzani, Smiji Saji, Stephen W Duffy, Jon Emery, Willie Hamilton, Hardeep Singh, Niek J de Wit, Fiona M Walter. Originally published in the Journal of Medical Internet Research, 03.03.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.