%0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e24870 %T Machine Learning for Mental Health in Social Media: Bibliometric Study %A Kim,Jina %A Lee,Daeun %A Park,Eunil %+ Department of Applied Artificial Intelligence, Sungkyunkwan University, 312 International Hall, Sungkyunkwan-ro 25-2, Seoul, 03063, Republic of Korea, 82 2 740 1864, eunilpark@skku.edu %K bibliometric analysis %K machine learning %K mental health %K social media %D 2021 %7 8.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Social media platforms provide an easily accessible and time-saving communication approach for individuals with mental disorders compared to face-to-face meetings with medical providers. Recently, machine learning (ML)-based mental health exploration using large-scale social media data has attracted significant attention. Objective: We aimed to provide a bibliometric analysis and discussion on research trends of ML for mental health in social media. Methods: Publications addressing social media and ML in the field of mental health were retrieved from the Scopus and Web of Science databases. We analyzed the publication distribution to measure productivity on sources, countries, institutions, authors, and research subjects, and visualized the trends in this field using a keyword co-occurrence network. The research methodologies of previous studies with high citations are also thoroughly described. Results: We obtained a total of 565 relevant papers published from 2015 to 2020. In the last 5 years, the number of publications has demonstrated continuous growth with Lecture Notes in Computer Science and Journal of Medical Internet Research as the two most productive sources based on Scopus and Web of Science records. In addition, notable methodological approaches with data resources presented in high-ranking publications were investigated. Conclusions: The results of this study highlight continuous growth in this research area. 
Moreover, we retrieved three main discussion points from a comprehensive overview of highly cited publications that provide new in-depth directions for both researchers and practitioners. %R 10.2196/24870 %U https://www.jmir.org/2021/3/e24870 %U https://doi.org/10.2196/24870 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e22951 %T Natural Language Processing and Machine Learning for Identifying Incident Stroke From Electronic Health Records: Algorithm Development and Validation %A Zhao,Yiqing %A Fu,Sunyang %A Bielinski,Suzette J %A Decker,Paul A %A Chamberlain,Alanna M %A Roger,Veronique L %A Liu,Hongfang %A Larson,Nicholas B %+ Department of Health Sciences Research, Mayo Clinic, 205 3rd Ave SW, Rochester, MN, 55905, United States, 1 507 293 1700, Larson.Nicholas@mayo.edu %K stroke %K natural language processing %K electronic health records %K machine learning %D 2021 %7 8.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Stroke is an important clinical outcome in cardiovascular research. However, the ascertainment of incident stroke is typically accomplished via time-consuming manual chart abstraction. Current phenotyping efforts using electronic health records for stroke focus on case ascertainment rather than incident disease, which requires knowledge of the temporal sequence of events. Objective: The aim of this study was to develop a machine learning–based phenotyping algorithm for incident stroke ascertainment based on diagnosis codes, procedure codes, and clinical concepts extracted from clinical notes using natural language processing. Methods: The algorithm was trained and validated using an existing epidemiology cohort consisting of 4914 patients with atrial fibrillation (AF) with manually curated incident stroke events. Various combinations of feature sets and machine learning classifiers were compared. 
Using a heuristic rule based on the composition of concepts and codes, we further detected the stroke subtype (ischemic stroke/transient ischemic attack or hemorrhagic stroke) of each identified stroke. The algorithm was further validated using a cohort (n=150) drawn by stratified sampling from a population in Olmsted County, Minnesota (N=74,314). Results: Among the 4914 patients with AF, 740 had validated incident stroke events. The best-performing stroke phenotyping algorithm used clinical concepts, diagnosis codes, and procedure codes as features in a random forest classifier. Among patients with stroke codes in the general population sample, the best-performing model achieved a positive predictive value of 86% (43/50; 95% CI 0.74-0.93) and a negative predictive value of 96% (96/100). For subtype identification, we achieved an accuracy of 83% in the AF cohort and 80% in the general population sample. Conclusions: We developed and validated a machine learning–based algorithm that performed well for identifying incident stroke and for determining the type of stroke. The algorithm also performed well on a sample from a general population, further demonstrating its generalizability and potential for adoption by other institutions. 
%R 10.2196/22951 %U https://www.jmir.org/2021/3/e22951 %U https://doi.org/10.2196/22951 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 3 %P e25121 %T Predictive Modeling of 30-Day Emergency Hospital Transport of German Patients Using a Personal Emergency Response: Retrospective Study and Comparison with the United States %A op den Buijs,Jorn %A Pijl,Marten %A Landgraf,Andreas %+ Philips Research, High Tech Campus 34, Eindhoven, 5656 AE, Netherlands, 31 631926890, jorn.op.den.buijs@philips.com %K emergency hospital transport %K predictive modeling %K personal emergency response system %K population health management %K emergency transport %K emergency response system %K emergency response %K health management %D 2021 %7 8.3.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Predictive analytics based on data from remote monitoring of elderly via a personal emergency response system (PERS) in the United States can identify subscribers at high risk for emergency hospital transport. These risk predictions can subsequently be used to proactively target interventions and prevent avoidable, costly health care use. It is, however, unknown if PERS-based risk prediction with targeted interventions could also be applied in the German health care setting. Objective: The objectives were to develop and validate a predictive model of 30-day emergency hospital transport based on data from a German PERS provider and compare the model with our previously published predictive model developed on data from a US PERS provider. Methods: Retrospective data of 5805 subscribers to a German PERS service were used to develop and validate an extreme gradient boosting predictive model of 30-day hospital transport, including predictors derived from subscriber demographics, self-reported medical conditions, and a 2-year history of case data. 
Models were trained on 80% (4644/5805) of the data, and performance was evaluated on an independent test set of 20% (1161/5805). Results were compared with our previously published prediction model developed on a data set of PERS users in the United States. Results: German PERS subscribers were on average aged 83.6 years; 64.0% (743/1161) were female, and 65.4% (759/1161) reported 3 or more chronic conditions. A total of 1.4% (350/24,847) of subscribers had one or more emergency transports in 30 days in the test set, which was significantly lower than in the US data set (2455/109,966, 2.2%). Performance of the predictive model of emergency hospital transport, as evaluated by area under the receiver operating characteristic curve (AUC), was 0.749 (95% CI 0.721-0.777), which was similar to the US prediction model (AUC=0.778 [95% CI 0.769-0.788]). The top 1% (12/1161) of predicted high-risk patients were 10.7 times more likely to experience an emergency hospital transport in 30 days than the overall German PERS population. This lift was comparable to a model lift of 11.9 obtained by the US predictive model. Conclusions: Despite differences in emergency care use, subscriber data collected via PERS can be used to predict use outcomes in different international settings. These predictive analytic tools can be used by health care organizations to extend population health management into the home by identifying high-risk patients and delivering timelier targeted interventions to them. This could lead to overall improved patient experience, higher quality of care, and more efficient resource use. 
%M 33682679 %R 10.2196/25121 %U https://medinform.jmir.org/2021/3/e25121 %U https://doi.org/10.2196/25121 %U http://www.ncbi.nlm.nih.gov/pubmed/33682679 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e26646 %T Future Medical Artificial Intelligence Application Requirements and Expectations of Physicians in German University Hospitals: Web-Based Survey %A Maassen,Oliver %A Fritsch,Sebastian %A Palm,Julia %A Deffge,Saskia %A Kunze,Julian %A Marx,Gernot %A Riedel,Morris %A Schuppert,Andreas %A Bickenbach,Johannes %+ Department of Intensive Care Medicine, University Hospital RWTH Aachen, Pauwelsstraße 30, Aachen, 52074, Germany, 49 2418080444, oliver.maassen@rwth-aachen.de %K artificial intelligence %K AI %K machine learning %K algorithms %K clinical decision support %K physician %K requirement %K expectation %K hospital care %D 2021 %7 5.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The increasing development of artificial intelligence (AI) systems in medicine driven by researchers and entrepreneurs goes along with enormous expectations for medical care advancement. AI might change the clinical practice of physicians from almost all medical disciplines and in most areas of health care. While expectations for AI in medicine are high, practical implementations of AI for clinical practice are still scarce in Germany. Moreover, physicians’ requirements and expectations of AI in medicine and their opinion on the usage of anonymized patient data for clinical and biomedical research have not been investigated widely in German university hospitals. Objective: This study aimed to evaluate physicians’ requirements and expectations of AI in medicine and their opinion on the secondary usage of patient data for (bio)medical research (eg, for the development of machine learning algorithms) in university hospitals in Germany. 
Methods: A web-based survey was conducted addressing physicians of all medical disciplines in 8 German university hospitals. Answers were given using Likert scales and general demographic responses. Physicians were asked to participate locally via email in the respective hospitals. Results: The online survey was completed by 303 physicians (female: 121/303, 39.9%; male: 173/303, 57.1%; no response: 9/303, 3.0%) from a wide range of medical disciplines and work experience levels. Most respondents either had a positive (130/303, 42.9%) or a very positive attitude (82/303, 27.1%) towards AI in medicine. There was a significant association between the personal rating of AI in medicine and the self-reported technical affinity level (H4=48.3, P<.001). A vast majority of physicians expected the future of medicine to be a mix of human and artificial intelligence (273/303, 90.1%) but also requested a scientific evaluation before the routine implementation of AI-based systems (276/303, 91.1%). Physicians were most optimistic that AI applications would identify drug interactions (280/303, 92.4%) to improve patient care substantially but were quite reserved regarding AI-supported diagnosis of psychiatric diseases (62/303, 20.5%). Of the respondents, 82.5% (250/303) agreed that there should be open access to anonymized patient databases for medical and biomedical research. Conclusions: Physicians in stationary patient care in German university hospitals show a generally positive attitude towards using most AI applications in medicine. Along with this optimism come several expectations and hopes that AI will assist physicians in clinical decision making. Especially in fields of medicine where huge amounts of data are processed (eg, imaging procedures in radiology and pathology) or data are collected continuously (eg, cardiology and intensive care medicine), physicians’ expectations of AI to substantially improve future patient care are high. 
In the study, the greatest potential was seen in the application of AI for the identification of drug interactions, assumedly due to the rising complexity of drug administration to polymorbid, polypharmacy patients. However, for the practical usage of AI in health care, regulatory and organizational challenges still have to be mastered. %M 33666563 %R 10.2196/26646 %U https://www.jmir.org/2021/3/e26646 %U https://doi.org/10.2196/26646 %U http://www.ncbi.nlm.nih.gov/pubmed/33666563 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e23483 %T Artificial Intelligence Techniques That May Be Applied to Primary Care Data to Facilitate Earlier Diagnosis of Cancer: Systematic Review %A Jones,Owain T %A Calanzani,Natalia %A Saji,Smiji %A Duffy,Stephen W %A Emery,Jon %A Hamilton,Willie %A Singh,Hardeep %A de Wit,Niek J %A Walter,Fiona M %+ Primary Care Unit, Department of Public Health & Primary Care, University of Cambridge, 2 Wort's Causeway, Cambridge, CB1 8RN, United Kingdom, 44 1223762554, otj24@medschl.cam.ac.uk %K artificial intelligence %K machine learning %K electronic health records %K primary health care %K early detection of cancer %D 2021 %7 3.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: More than 17 million people worldwide, including 360,000 people in the United Kingdom, were diagnosed with cancer in 2018. Cancer prognosis and disease burden are highly dependent on the disease stage at diagnosis. Most people diagnosed with cancer first present in primary care settings, where improved assessment of the (often vague) presenting symptoms of cancer could lead to earlier detection and improved outcomes for patients. There is accumulating evidence that artificial intelligence (AI) can assist clinicians in making better clinical decisions in some areas of health care. 
Objective: This study aimed to systematically review AI techniques that may facilitate earlier diagnosis of cancer and could be applied to primary care electronic health record (EHR) data. The quality of the evidence, the phase of development the AI techniques have reached, the gaps that exist in the evidence, and the potential for use in primary care were evaluated. Methods: We searched MEDLINE, Embase, SCOPUS, and Web of Science databases from January 01, 2000, to June 11, 2019, and included all studies providing evidence for the accuracy or effectiveness of applying AI techniques for the early detection of cancer, which may be applicable to primary care EHRs. We included all study designs in all settings and languages. These searches were extended through a scoping review of AI-based commercial technologies. The main outcomes assessed were measures of diagnostic accuracy for cancer. Results: We identified 10,456 studies; 16 studies met the inclusion criteria, representing the data of 3,862,910 patients. A total of 13 studies described the initial development and testing of AI algorithms, and 3 studies described the validation of an AI algorithm in independent data sets. One study was based on prospectively collected data; only 3 studies were based on primary care data. We found no data on implementation barriers or cost-effectiveness. Risk of bias assessment highlighted a wide range of study quality. The additional scoping review of commercial AI technologies identified 21 technologies, only 1 meeting our inclusion criteria. Meta-analysis was not undertaken because of the heterogeneity of AI modalities, data set characteristics, and outcome measures. Conclusions: AI techniques have been applied to EHR-type data to facilitate early diagnosis of cancer, but their use in primary care settings is still at an early stage of maturity. 
Further evidence is needed on their performance using primary care data, implementation barriers, and cost-effectiveness before widespread adoption into routine primary care clinical practice can be recommended. %M 33656443 %R 10.2196/23483 %U https://www.jmir.org/2021/3/e23483 %U https://doi.org/10.2196/23483 %U http://www.ncbi.nlm.nih.gov/pubmed/33656443 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 3 %P e18607 %T A Chatbot for Perinatal Women’s and Partners’ Obstetric and Mental Health Care: Development and Usability Evaluation Study %A Chung,Kyungmi %A Cho,Hee Young %A Park,Jin Young %+ Department of Psychiatry, Yonsei University College of Medicine, Yongin Severance Hospital, Yonsei University Health System, 363, Dongbaekjukjeon-daero, Giheung-gu, Yongin-si, Republic of Korea, 82 31 5189 8148, empathy@yuhs.ac %K chatbot %K mobile phone %K instant messaging %K mobile health %K perinatal care %K usability %K user experience %K usability testing %D 2021 %7 3.3.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: To motivate people to adopt medical chatbots, the establishment of a specialized medical knowledge database that fits their personal interests is of great importance in developing a chatbot for perinatal care, particularly with the help of health professionals. Objective: The objectives of this study are to develop and evaluate a user-friendly question-and-answer (Q&A) knowledge database–based chatbot (Dr. Joy) for perinatal women’s and their partners’ obstetric and mental health care by applying a text-mining technique and implementing contextual usability testing (UT), respectively, thus determining whether this medical chatbot built on mobile instant messenger (KakaoTalk) can provide its male and female users with good user experience. Methods: Two men aged 38 and 40 years and 13 women aged 27 to 43 years in pregnancy preparation or different pregnancy stages were enrolled. 
All participants completed the 7-day-long UT, during which they were given the daily tasks of asking Dr. Joy at least 3 questions at any time and place and then giving the chatbot either positive or negative feedback with emoji, using at least one feature of the chatbot, and finally, sending a facilitator all screenshots for the history of the day’s use via KakaoTalk before midnight. One day after the UT completion, all participants were asked to fill out a questionnaire on the evaluation of usability, perceived benefits and risks, intention to seek and share health information on the chatbot, and strengths and weaknesses of its use, as well as demographic characteristics. Results: Despite the relatively high score for ease of learning (EOL), the results of the Spearman correlation indicated that EOL was not significantly associated with usefulness (ρ=0.26; P=.36), ease of use (ρ=0.19; P=.51), satisfaction (ρ=0.21; P=.46), or total usability scores (ρ=0.32; P=.24). Unlike EOL, all 3 subfactors and the total usability had significant positive associations with each other (all ρ>0.80; P<.001). Furthermore, perceived risks exhibited no significant negative associations with perceived benefits (ρ=−0.29; P=.30) or intention to seek (SEE; ρ=−0.28; P=.32) or share (SHA; ρ=−0.24; P=.40) health information on the chatbot via KakaoTalk, whereas perceived benefits exhibited significant positive associations with both SEE and SHA. Perceived benefits were more strongly associated with SEE (ρ=0.94; P<.001) than with SHA (ρ=0.70; P=.004). Conclusions: This study demonstrates the potential for uptake of this newly developed Q&A knowledge database–based KakaoTalk chatbot for obstetric and mental health care. Because Dr. Joy offered high-quality content with both utilitarian and hedonic value, its male and female users could be encouraged to use medical chatbots in a convenient, easy-to-use, and enjoyable manner. To boost their continued usage intention for Dr. 
Joy, its Q&A sets need to be periodically updated to satisfy user intent by monitoring both male and female user utterances. %M 33656442 %R 10.2196/18607 %U https://medinform.jmir.org/2021/3/e18607 %U https://doi.org/10.2196/18607 %U http://www.ncbi.nlm.nih.gov/pubmed/33656442 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e26997 %T Preferences for Artificial Intelligence Clinicians Before and During the COVID-19 Pandemic: Discrete Choice Experiment and Propensity Score Matching Study %A Liu,Taoran %A Tsang,Winghei %A Xie,Yifei %A Tian,Kang %A Huang,Fengqiu %A Chen,Yanhui %A Lau,Oiying %A Feng,Guanrui %A Du,Jianhao %A Chu,Bojia %A Shi,Tingyu %A Zhao,Junjie %A Cai,Yiming %A Hu,Xueyan %A Akinwunmi,Babatunde %A Huang,Jian %A Zhang,Casper J P %A Ming,Wai-Kit %+ Department of Public Health and Preventive Medicine, School of Medicine, Jinan University, 601 Huangpu W Ave, Tianhe District, Guangzhou, 510632, China, 86 85228852, wkming@connect.hku.hk %K propensity score matching %K discrete latent traits %K patients’ preferences %K artificial intelligence %K COVID-19 %K preference %K discrete choice %K choice %K traditional medicine %K public health %K resource %K patient %K diagnosis %K accuracy %D 2021 %7 2.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Artificial intelligence (AI) methods can potentially be used to relieve the pressure that the COVID-19 pandemic has exerted on public health. In cases of medical resource shortages caused by the pandemic, changes in people’s preferences for AI clinicians and traditional clinicians are worth exploring. Objective: We aimed to quantify and compare people’s preferences for AI clinicians and traditional clinicians before and during the COVID-19 pandemic, and to assess whether people’s preferences were affected by the pressure of the pandemic. Methods: We used the propensity score matching method to match two different groups of respondents with similar demographic characteristics. 
Respondents were recruited in 2017 and 2020. A total of 2048 respondents (2017: n=1520; 2020: n=528) completed the questionnaire and were included in the analysis. Multinomial logit models and latent class models were used to assess people’s preferences for different diagnosis methods. Results: In total, 84.7% (1115/1317) of respondents in the 2017 group and 91.3% (482/528) of respondents in the 2020 group were confident that AI diagnosis methods would outperform human clinician diagnosis methods in the future. Both groups of matched respondents believed that the most important attribute of diagnosis was accuracy, and they preferred to receive combined diagnoses from both AI and human clinicians (2017: odds ratio [OR] 1.645, 95% CI 1.535-1.763; P<.001; 2020: OR 1.513, 95% CI 1.413-1.621; P<.001; reference: clinician diagnoses). The latent class model identified three classes with different attribute priorities. In class 1, preferences for combined diagnoses and accuracy remained constant in 2017 and 2020, and high accuracy (eg, 100% accuracy in 2017: OR 1.357, 95% CI 1.164-1.581) was preferred. In class 2, the matched data from 2017 were similar to those from 2020; combined diagnoses from both AI and human clinicians (2017: OR 1.204, 95% CI 1.039-1.394; P=.011; 2020: OR 2.009, 95% CI 1.826-2.211; P<.001; reference: clinician diagnoses) and an outpatient waiting time of 20 minutes (2017: OR 1.349, 95% CI 1.065-1.708; P<.001; 2020: OR 1.488, 95% CI 1.287-1.721; P<.001; reference: 0 minutes) were consistently preferred. In class 3, the respondents in the 2017 and 2020 groups preferred different diagnosis methods; respondents in the 2017 group preferred clinician diagnoses, whereas respondents in the 2020 group preferred AI diagnoses. In the latent class, which was stratified according to sex, all male and female respondents in the 2017 and 2020 groups believed that accuracy was the most important attribute of diagnosis. 
Conclusions: Individuals’ preferences for receiving clinical diagnoses from AI and human clinicians were generally unaffected by the pandemic. Respondents believed that accuracy and expense were the most important attributes of diagnosis. These findings can be used to guide policies that are relevant to the development of AI-based health care. %M 33556034 %R 10.2196/26997 %U https://www.jmir.org/2021/3/e26997 %U https://doi.org/10.2196/26997 %U http://www.ncbi.nlm.nih.gov/pubmed/33556034 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 3 %P e24134 %T A Novel Mobile App (Heali) for Disease Treatment in Participants With Irritable Bowel Syndrome: Randomized Controlled Pilot Trial %A Rafferty,Aaron J %A Hall,Rick %A Johnston,Carol S %+ College of Health Solutions, Arizona State University, HLTHN 532 Phoenix Downtown Campus, Phoenix, AZ, 85004, United States, 1 602 496 2539, Carol.johnston@asu.edu %K irritable bowel syndrome %K artificial intelligence %K mobile app %K low FODMAP diet %K randomized controlled trial %D 2021 %7 2.3.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: A diet high in fermentable, oligo-, di-, monosaccharides and polyols (FODMAPs) has been shown to exacerbate symptoms of irritable bowel syndrome (IBS). Previous literature reports significant improvement in IBS symptoms with initiation of a low FODMAP diet (LFD) and monitored reintroduction. However, dietary adherence to the LFD is difficult, with patients stating that the information given by health care providers is often generalized and nonspecific, requiring them to search for supplementary information to fit their needs. Objective: The aim of our study was to determine whether Heali, a novel artificial intelligence dietary mobile app, can improve adherence to the LFD, IBS symptom severity, and quality of life outcomes in adults with IBS or IBS-like symptoms over a 4-week period. 
Methods: Participants were randomized into 2 groups: the control group (CON), in which participants received educational materials, and the experimental group (APP), in which participants received access to the mobile app and educational materials. Over the course of this unblinded online trial, all participants completed a battery of 5 questionnaires at baseline and at the end of the trial to document IBS symptoms, quality of life, LFD knowledge, and LFD adherence. Results: We enrolled 58 participants in the study (29 in each group), and 25 participants completed the study in its entirety (11 and 14 for the CON and APP groups, respectively). Final per-protocol analyses showed greater improvement in quality of life score for the APP group compared to the CON group (31.1 and 11.8, respectively; P=.04). Reduction in total IBS symptom severity score was 24% greater for the APP group versus the CON group. Although this did not achieve significance (–170 vs –138, respectively; P=.37), the reduction in the subscore for bowel habit dissatisfaction was 2-fold greater for the APP group than for the CON group (P=.05). Conclusions: This initial study provides preliminary evidence that Heali may provide therapeutic benefit to its users, specifically improvements in quality of life and bowel habits. Although this study was underpowered, its findings warrant further research in a larger sample of participants to test the efficacy of Heali app use to improve outcomes for patients with IBS. 
Trial Registration: ClinicalTrials.gov NCT04256551; https://clinicaltrials.gov/ct2/show/NCT04256551 %M 33650977 %R 10.2196/24134 %U https://www.jmir.org/2021/3/e24134 %U https://doi.org/10.2196/24134 %U http://www.ncbi.nlm.nih.gov/pubmed/33650977 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 3 %P e25635 %T Machine Learning Approach to Predict the Probability of Recurrence of Renal Cell Carcinoma After Surgery: Prediction Model Development Study %A Kim,HyungMin %A Lee,Sun Jung %A Park,So Jin %A Choi,In Young %A Hong,Sung-Hoo %+ Department of Urology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University, 222, Banpo-daero, Seocho-gu, Seoul, Republic of Korea, 82 2 2258 6228, toomey@catholic.ac.kr %K renal cell carcinoma %K recurrence %K machine learning %K naïve Bayes %K algorithm %K cancer %K surgery %K web-based %K database %K prediction %K probability %K carcinoma %K kidney %K model %K development %D 2021 %7 1.3.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Renal cell carcinoma (RCC) has a high recurrence rate of 20% to 30% after nephrectomy for clinically localized disease, and more than 40% of patients eventually die of the disease, making regular monitoring and constant management of utmost importance. Objective: The objective of this study was to develop an algorithm that predicts the probability of recurrence of RCC within 5 and 10 years of surgery. Methods: Data from 6849 Korean patients with RCC were collected from eight tertiary care hospitals listed in the KOrean Renal Cell Carcinoma (KORCC) web-based database. To predict RCC recurrence, analytical data from 2814 patients were extracted from the database. Eight machine learning algorithms were used to predict the probability of RCC recurrence, and the results were compared. Results: Within 5 years of surgery, the highest area under the receiver operating characteristic curve (AUROC) was obtained from the naïve Bayes (NB) model, with a value of 0.836. 
Within 10 years of surgery, the highest AUROC was obtained from the NB model, with a value of 0.784. Conclusions: An algorithm was developed that predicts the probability of RCC recurrence within 5 and 10 years using the KORCC database, a large-scale RCC cohort in Korea. It is expected that the developed algorithm will help clinicians manage prognosis and establish customized treatment strategies for patients with RCC after surgery. %M 33646127 %R 10.2196/25635 %U https://medinform.jmir.org/2021/3/e25635 %U https://doi.org/10.2196/25635 %U http://www.ncbi.nlm.nih.gov/pubmed/33646127 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e23458 %T Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study %A Ikemura,Kenji %A Bellin,Eran %A Yagi,Yukako %A Billett,Henny %A Saada,Mahmoud %A Simone,Katelyn %A Stahl,Lindsay %A Szymanski,James %A Goldstein,D Y %A Reyes Gil,Morayma %+ Department of Pathology, Albert Einstein College of Medicine, Montefiore Medical Center, 111 E 210th St, The Bronx, NY, 10467, United States, 1 9493703777, kikemura@montefiore.org %K automated machine learning %K COVID-19 %K biomarker %K ranking %K decision support tool %K machine learning %K decision support %K Shapley additive explanation %K partial dependence plot %K dimensionality reduction %D 2021 %7 26.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: During a pandemic, it is important for clinicians to stratify patients and decide who receives limited medical resources. Machine learning models have been proposed to accurately predict COVID-19 disease severity. Previous studies have typically tested only one machine learning algorithm and limited performance evaluation to area under the curve analysis. To obtain the best results possible, it may be important to test different machine learning algorithms to find the best prediction model. 
Objective: In this study, we aimed to use automated machine learning (autoML) to train various machine learning algorithms. We selected the model that best predicted patients’ chances of surviving a SARS-CoV-2 infection. In addition, we identified which variables (ie, vital signs, biomarkers, comorbidities, etc) were the most influential in generating an accurate model. Methods: Data were retrospectively collected from all patients who tested positive for COVID-19 at our institution between March 1 and July 3, 2020. We collected 48 variables from each patient within 36 hours before or after the index time (ie, real-time polymerase chain reaction positivity). Patients were followed for 30 days or until death. Patients’ data were used to build 20 machine learning models with various algorithms via autoML. The performance of machine learning models was measured by analyzing the area under the precision-recall curve (AUPRC). Subsequently, we established model interpretability via Shapley additive explanation and partial dependence plots to identify and rank variables that drove model predictions. Afterward, we conducted dimensionality reduction to extract the 10 most influential variables. AutoML models were retrained by only using these 10 variables, and the output models were evaluated against the model that used 48 variables. Results: Data from 4313 patients were used to develop the models. The best model that was generated by using autoML and 48 variables was the stacked ensemble model (AUPRC=0.807). The two best independent models were the gradient boost machine and extreme gradient boost models, which had an AUPRC of 0.803 and 0.793, respectively. The deep learning model (AUPRC=0.73) was substantially inferior to the other models. 
The 10 most influential variables for generating high-performing models were systolic and diastolic blood pressure, age, pulse oximetry level, blood urea nitrogen level, lactate dehydrogenase level, D-dimer level, troponin level, respiratory rate, and Charlson comorbidity score. After the autoML models were retrained with these 10 variables, the stacked ensemble model still had the best performance (AUPRC=0.791). Conclusions: We used autoML to develop high-performing models that predicted the survival of patients with COVID-19. In addition, we identified important variables that correlated with mortality. This is proof of concept that autoML is an efficient, effective, and informative method for generating machine learning–based clinical decision support tools. %M 33539308 %R 10.2196/23458 %U https://www.jmir.org/2021/2/e23458 %U https://doi.org/10.2196/23458 %U http://www.ncbi.nlm.nih.gov/pubmed/33539308 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 2 %P e22976 %T Using Machine Learning to Collect and Facilitate Remote Access to Biomedical Databases: Development of the Biomedical Database Inventory %A Rosado,Eduardo %A Garcia-Remesal,Miguel %A Paraiso-Medina,Sergio %A Pazos,Alejandro %A Maojo,Victor %+ Biomedical Informatics Group, School of Computer Science, Universidad Politecnica de Madrid, Campus de Montegancedo, s/n, Madrid, 28660, Spain, 34 699059254, vmaojo@fi.upm.es %K biomedical databases %K natural language processing %K deep learning %K internet %K biomedical knowledge %D 2021 %7 25.2.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. Objective: To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. 
BiDI provides an index of data resources and a path to access them seamlessly. Methods: We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. Results: The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to “omics” and the other related to the COVID-19 pandemic. Conclusions: BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others). 
%M 33629960 %R 10.2196/22976 %U https://medinform.jmir.org/2021/2/e22976 %U https://doi.org/10.2196/22976 %U http://www.ncbi.nlm.nih.gov/pubmed/33629960 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e20298 %T A Risk Prediction Model Based on Machine Learning for Cognitive Impairment Among Chinese Community-Dwelling Elderly People With Normal Cognition: Development and Validation Study %A Hu,Mingyue %A Shu,Xinhui %A Yu,Gang %A Wu,Xinyin %A Välimäki,Maritta %A Feng,Hui %+ Xiangya Nursing School, Central South University, Yuelu District, 172 Tongzipo Road, Changsha, China, 86 15173121969, feng.hui@csu.edu.cn %K prediction model %K cognitive impairment %K machine learning %K nomogram %D 2021 %7 24.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Identifying cognitive impairment early enough could support timely intervention that may hinder or delay the trajectory of cognitive impairment, thus increasing the chances for successful cognitive aging. Objective: We aimed to build a prediction model based on machine learning for cognitive impairment among Chinese community-dwelling elderly people with normal cognition. Methods: A prospective cohort of 6718 older people from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) register, followed between 2008 and 2011, was used to develop and validate the prediction model. Participants were included if they were aged 60 years or above, were community-dwelling elderly people, and had a cognitive Mini-Mental State Examination (MMSE) score ≥18. They were excluded if they were diagnosed with a severe disease (eg, cancer and dementia) or were living in institutions. Cognitive impairment was identified using the Chinese version of the MMSE. Several machine learning algorithms (random forest, XGBoost, naïve Bayes, and logistic regression) were used to assess the 3-year risk of developing cognitive impairment. 
Optimal cutoffs and adjusted parameters were explored in validation data, and the model was further evaluated in test data. A nomogram was established to vividly present the prediction model. Results: The mean age of the participants was 80.4 years (SD 10.3 years), and 50.85% (3416/6718) were female. During a 3-year follow-up, 991 (14.8%) participants were identified with cognitive impairment. Among 45 features, the following four features were finally selected to develop the model: age, instrumental activities of daily living, marital status, and baseline cognitive function. The concordance index of the model constructed by logistic regression was 0.814 (95% CI 0.781-0.846). Older people with normal cognitive functioning having a nomogram score of less than 170 were considered to have a low 3-year risk of cognitive impairment, and those with a score of 170 or greater were considered to have a high 3-year risk of cognitive impairment. Conclusions: This simple and feasible cognitive impairment prediction model could identify community-dwelling elderly people at the greatest 3-year risk for cognitive impairment, which could help community nurses in the early identification of dementia. 
%M 33625369 %R 10.2196/20298 %U https://www.jmir.org/2021/2/e20298 %U https://doi.org/10.2196/20298 %U http://www.ncbi.nlm.nih.gov/pubmed/33625369 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e22841 %T Patients’ Preferences for Artificial Intelligence Applications Versus Clinicians in Disease Diagnosis During the SARS-CoV-2 Pandemic in China: Discrete Choice Experiment %A Liu,Taoran %A Tsang,Winghei %A Huang,Fengqiu %A Lau,Oi Ying %A Chen,Yanhui %A Sheng,Jie %A Guo,Yiwei %A Akinwunmi,Babatunde %A Zhang,Casper JP %A Ming,Wai-Kit %+ Department of Public Health and Preventive Medicine, School of Medicine, Jinan University, West Huangpu Road 601, Guangzhou, 510000, China, 86 14715485116, wkming@connect.hku.hk %K discrete choice experiment %K artificial intelligence %K patient preference %K multinomial logit analysis %K questionnaire %K latent-class conditional logit %K app %K human clinicians %K diagnosis %K COVID-19 %K China %D 2021 %7 23.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Misdiagnosis, arbitrary charges, annoying queues, and clinic waiting times among others are long-standing phenomena in the medical industry across the world. These factors can contribute to patient anxiety about misdiagnosis by clinicians. However, with the increasing growth in use of big data in biomedical and health care communities, the performance of artificial intelligence (AI) techniques of diagnosis is improving and can help avoid medical practice errors, including under the current circumstance of COVID-19. Objective: This study aims to visualize and measure patients’ heterogeneous preferences from various angles of AI diagnosis versus clinicians in the context of the COVID-19 epidemic in China. 
We also aim to illustrate the different decision-making factors of the latent class of a discrete choice experiment (DCE) and prospects for the application of AI techniques in judgment and management during the pandemic of SARS-CoV-2 and in the future. Methods: A DCE approach was the main analysis method applied in this paper. Attributes from different dimensions were hypothesized: diagnostic method, outpatient waiting time, diagnosis time, accuracy, follow-up after diagnosis, and diagnostic expense. A questionnaire was then formed. With the data collected from the DCE questionnaire, we applied Sawtooth software to construct a generalized multinomial logit (GMNL) model, mixed logit model, and latent class model. Moreover, we calculated the variables’ coefficients, standard error, P value, and odds ratio (OR) and formed a utility report to present the importance and weighted percentage of attributes. Results: A total of 55.8% of the respondents (428 out of 767) opted for AI diagnosis regardless of the description of the clinicians. In the GMNL model, we found that people prefer the 100% accuracy level the most (OR 4.548, 95% CI 4.048-5.110, P<.001). For the latent class model, the most acceptable model consists of 3 latent classes of respondents. The attributes with the most substantial effects and highest percentage weights are the accuracy (39.29% in general) and expense of diagnosis (21.69% in general), especially the preferences for the diagnosis “accuracy” attribute, which is constant across classes. For class 1 and class 3, people prefer the AI + clinicians method (class 1: OR 1.247, 95% CI 1.036-1.463, P<.001; class 3: OR 1.958, 95% CI 1.769-2.167, P<.001). For class 2, people prefer the AI method (OR 1.546, 95% CI 0.883-2.707, P=.37). The OR of levels of attributes increases with the increase of accuracy across all classes. 
Conclusions: Latent class analysis was prominent and useful in quantifying preferences for attributes of diagnosis choice. People’s preferences for the “accuracy” and “diagnostic expenses” attributes are palpable. AI will have a potential market. However, accuracy and diagnosis expenses need to be taken into consideration. %M 33493130 %R 10.2196/22841 %U https://www.jmir.org/2021/2/e22841 %U https://doi.org/10.2196/22841 %U http://www.ncbi.nlm.nih.gov/pubmed/33493130 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e23026 %T Learning From Past Respiratory Infections to Predict COVID-19 Outcomes: Retrospective Study %A Sang,Shengtian %A Sun,Ran %A Coquet,Jean %A Carmichael,Harris %A Seto,Tina %A Hernandez-Boussard,Tina %+ Department of Medicine, Biomedical Informatics, Stanford University, 1265 Welch Rd, 245, Stanford, CA, 94305-5479, United States, 1 650 725 5507, boussard@stanford.edu %K COVID-19 %K invasive mechanical ventilation %K all-cause mortality %K machine learning %K artificial intelligence %K respiratory %K infection %K outcome %K data %K feasibility %K framework %D 2021 %7 22.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: For the clinical care of patients with well-established diseases, randomized trials, literature, and research are supplemented with clinical judgment to understand disease prognosis and inform treatment choices. In the void created by a lack of clinical experience with COVID-19, artificial intelligence (AI) may be an important tool to bolster clinical judgment and decision making. However, a lack of clinical data restricts the design and development of such AI tools, particularly in preparation for an impending crisis or pandemic. Objective: This study aimed to develop and test the feasibility of a “patients-like-me” framework to predict the deterioration of patients with COVID-19 using a retrospective cohort of patients with similar respiratory diseases. 
Methods: Our framework used COVID-19–like cohorts to design and train AI models that were then validated on the COVID-19 population. The COVID-19–like cohorts included patients diagnosed with bacterial pneumonia, viral pneumonia, unspecified pneumonia, influenza, and acute respiratory distress syndrome (ARDS) at an academic medical center from 2008 to 2019. In total, 15 training cohorts were created using different combinations of the COVID-19–like cohorts with the ARDS cohort for exploratory purposes. In this study, two machine learning models were developed: one to predict invasive mechanical ventilation (IMV) within 48 hours for each hospitalized day, and one to predict all-cause mortality at the time of admission. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value, and negative predictive value. We established model interpretability by calculating SHapley Additive exPlanations (SHAP) scores to identify important features. Results: Compared to the COVID-19–like cohorts (n=16,509), the patients hospitalized with COVID-19 (n=159) were significantly younger, with a higher proportion of patients of Hispanic ethnicity, a lower proportion of patients with smoking history, and fewer patients with comorbidities (P<.001). Patients with COVID-19 had a lower IMV rate (15.1 versus 23.2, P=.02) and shorter time to IMV (2.9 versus 4.1 days, P<.001) compared to the COVID-19–like patients. In the COVID-19–like training data, the top models achieved excellent performance (AUROC>0.90). Validating in the COVID-19 cohort, the top-performing model for predicting IMV was the XGBoost model (AUROC=0.826) trained on the viral pneumonia cohort. Similarly, the XGBoost model trained on all 4 COVID-19–like cohorts without ARDS achieved the best performance (AUROC=0.928) in predicting mortality. 
Important predictors included demographic information (age), vital signs (oxygen saturation), and laboratory values (white blood cell count, cardiac troponin, albumin, etc). Our models had class imbalance, which resulted in high negative predictive values and low positive predictive values. Conclusions: We provided a feasible framework for modeling patient deterioration using existing data and AI technology to address data limitations during the onset of a novel, rapidly changing pandemic. %M 33534724 %R 10.2196/23026 %U https://www.jmir.org/2021/2/e23026 %U https://doi.org/10.2196/23026 %U http://www.ncbi.nlm.nih.gov/pubmed/33534724 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e21037 %T Automated Computer Vision Assessment of Hypomimia in Parkinson Disease: Proof-of-Principle Pilot Study %A Abrami,Avner %A Gunzler,Steven %A Kilbane,Camilla %A Ostrand,Rachel %A Ho,Bryan %A Cecchi,Guillermo %+ IBM Research – Computational Biology Center, 1101 Kitchawan Rd, Yorktown Heights, NY, 10598, United States, 1 1 914 945 1815, gcecchi@us.ibm.com %K Parkinson disease %K hypomimia %K computer vision %K telemedicine %D 2021 %7 22.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Facial expressions require the complex coordination of 43 different facial muscles. Parkinson disease (PD) affects facial musculature leading to “hypomimia” or “masked facies.” Objective: We aimed to determine whether modern computer vision techniques can be applied to detect masked facies and quantify drug states in PD. Methods: We trained a convolutional neural network on images extracted from videos of 107 self-identified people with PD, along with 1595 videos of controls, in order to detect PD hypomimia cues. This trained model was applied to clinical interviews of 35 PD patients in their on and off drug motor states, and seven journalist interviews of the actor Alan Alda obtained before and after he was diagnosed with PD. 
Results: The algorithm achieved a test set area under the receiver operating characteristic curve of 0.71 on 54 subjects to detect PD hypomimia, compared to a value of 0.75 for trained neurologists using the Unified Parkinson Disease Rating Scale-III Facial Expression score. Additionally, the model accuracy to classify the on and off drug states in the clinical samples was 63% (22/35), in contrast to an accuracy of 46% (16/35) when using clinical rater scores. Finally, each of Alan Alda’s seven interviews was successfully classified as occurring before (versus after) his diagnosis, with 100% accuracy (7/7). Conclusions: This proof-of-principle pilot study demonstrated that computer vision holds promise as a valuable tool for PD hypomimia and for monitoring a patient’s motor state in an objective and noninvasive way, particularly given the increasing importance of telemedicine. %M 33616535 %R 10.2196/21037 %U https://www.jmir.org/2021/2/e21037 %U https://doi.org/10.2196/21037 %U http://www.ncbi.nlm.nih.gov/pubmed/33616535 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 10 %N 2 %P e26552 %T Investigating the Ethical and Data Governance Issues of Artificial Intelligence in Surgery: Protocol for a Delphi Study %A Lam,Kyle %A Iqbal,Fahad M %A Purkayastha,Sanjay %A Kinross,James M %+ Imperial College London, 10th Floor QEQM Building, Praed St, London, , United Kingdom, 44 796 490 4213, k.lam@imperial.ac.uk %K artificial intelligence %K digital surgery %K Delphi %K ethics %K data governance %K digital technology %K operating room %K surgery %D 2021 %7 22.2.2021 %9 Protocol %J JMIR Res Protoc %G English %X Background: The rapid uptake of digital technology into the operating room has the potential to improve patient outcomes, increase efficiency of the use of operating rooms, and allow surgeons to progress quickly up learning curves. These technologies are, however, dependent on huge amounts of data, and the consequences of their mismanagement are significant. 
While the field of artificial intelligence ethics is able to provide a broad framework for those designing and implementing these technologies into the operating room, there is a need to determine and address the ethical and data governance challenges of using digital technology in this unique environment. Objective: The objectives of this study are to define the term digital surgery and gain expert consensus on the key ethical and data governance issues, barriers, and future research goals of the use of artificial intelligence in surgery. Methods: Experts from the fields of surgery, ethics and law, policy, artificial intelligence, and industry will be invited to participate in a 4-round consensus Delphi exercise. In the first round, participants will supply free-text responses across 4 key domains: ethics, data governance, barriers, and future research goals. They will also be asked to provide their understanding of the term digital surgery. In subsequent rounds, statements will be grouped, and participants will be asked to rate the importance of each issue on a 9-point Likert scale ranging from 1 (not at all important) to 9 (critically important). Consensus is defined a priori as a score of 7 to 9 by 70% of respondents and 1 to 3 by less than 30% of respondents. A final online meeting round will be held to discuss inclusion of statements and draft a consensus document. Results: Full ethical approval has been obtained for the study by the local research ethics committee at Imperial College, London (20IC6136). We anticipate round 1 to commence in January 2021. Conclusions: The results of this study will define the term digital surgery, identify the key issues and barriers, and shape future research in this area. 
International Registered Report Identifier (IRRID): PRR1-10.2196/26552 %M 33616543 %R 10.2196/26552 %U https://www.researchprotocols.org/2021/2/e26552 %U https://doi.org/10.2196/26552 %U http://www.ncbi.nlm.nih.gov/pubmed/33616543 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e24221 %T Use and Control of Artificial Intelligence in Patients Across the Medical Workflow: Single-Center Questionnaire Study of Patient Perspectives %A Lennartz,Simon %A Dratsch,Thomas %A Zopfs,David %A Persigehl,Thorsten %A Maintz,David %A Große Hokamp,Nils %A Pinto dos Santos,Daniel %+ Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Kerpener Straße 62, Cologne, 50937, Germany, 49 22147896063, daniel.pinto-dos-santos@uk-koeln.de %K artificial intelligence %K clinical implementation %K questionnaire %K survey %D 2021 %7 17.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Artificial intelligence (AI) is gaining increasing importance in many medical specialties, yet data on patients’ opinions on the use of AI in medicine are scarce. Objective: This study aimed to investigate patients’ opinions on the use of AI in different aspects of the medical workflow and the level of control and supervision under which they would deem the application of AI in medicine acceptable. Methods: Patients scheduled for computed tomography or magnetic resonance imaging voluntarily participated in an anonymized questionnaire between February 10, 2020, and May 24, 2020. Patient information, confidence in physicians vs AI in different clinical tasks, opinions on the control of AI, preference in cases of disagreement between AI and physicians, and acceptance of the use of AI for diagnosing and treating diseases of different severity were recorded. Results: In total, 229 patients participated. 
Patients favored physicians over AI for all clinical tasks except for treatment planning based on current scientific evidence. In case of disagreement between physicians and AI regarding diagnosis and treatment planning, most patients preferred the physician’s opinion to AI (96.2% [153/159] vs 3.8% [6/159] and 94.8% [146/154] vs 5.2% [8/154], respectively; P=.001). AI supervised by a physician was considered more acceptable than AI without physician supervision at diagnosis (confidence rating 3.90 [SD 1.20] vs 1.64 [SD 1.03], respectively; P=.001) and therapy (3.77 [SD 1.18] vs 1.57 [SD 0.96], respectively; P=.001). Conclusions: Patients favored physicians over AI in most clinical tasks and strongly preferred an application of AI with physician supervision. However, patients acknowledged that AI could help physicians integrate the most recent scientific evidence into medical care. Application of AI in medicine should be disclosed and controlled to protect patient interests and meet ethical standards. 
%M 33595451 %R 10.2196/24221 %U http://www.jmir.org/2021/2/e24221/ %U https://doi.org/10.2196/24221 %U http://www.ncbi.nlm.nih.gov/pubmed/33595451 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 2 %P e24572 %T Development and Validation of a Machine Learning Approach for Automated Severity Assessment of COVID-19 Based on Clinical and Imaging Data: Retrospective Study %A Quiroz,Juan Carlos %A Feng,You-Zhen %A Cheng,Zhong-Yuan %A Rezazadegan,Dana %A Chen,Ping-Kang %A Lin,Qi-Ting %A Qian,Long %A Liu,Xiao-Fang %A Berkovsky,Shlomo %A Coiera,Enrico %A Song,Lei %A Qiu,Xiaoming %A Liu,Sidong %A Cai,Xiang-Ran %+ Centre for Health Informatics, Australian Institute of Health Innovation, Faculty of Medicine, Health and Human Sciences, Macquarie University, 75 Talvera Road, Macquarie Park, 2113, Australia, 61 29852729, sidong.liu@mq.edu.au %K algorithm %K clinical data %K clinical features %K COVID-19 %K CT scans %K development %K imaging %K imbalanced data %K machine learning %K oversampling %K severity assessment %K validation %D 2021 %7 11.2.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: COVID-19 has overwhelmed health systems worldwide. It is important to identify severe cases as early as possible, such that resources can be mobilized and treatment can be escalated. Objective: This study aims to develop a machine learning approach for automated severity assessment of COVID-19 based on clinical and imaging data. Methods: Clinical data—including demographics, signs, symptoms, comorbidities, and blood test results—and chest computed tomography scans of 346 patients from 2 hospitals in the Hubei Province, China, were used to develop machine learning models for automated severity assessment in diagnosed COVID-19 cases. We compared the predictive power of the clinical and imaging data from multiple machine learning models and further explored the use of four oversampling methods to address the imbalanced classification issue. 
Features with the highest predictive power were identified using the Shapley Additive Explanations framework. Results: Imaging features had the strongest impact on the model output, while a combination of clinical and imaging features yielded the best performance overall. The identified predictive features were consistent with those reported previously. Although oversampling yielded mixed results, it achieved the best model performance in our study. Logistic regression models differentiating between mild and severe cases achieved the best performance for clinical features (area under the curve [AUC] 0.848; sensitivity 0.455; specificity 0.906), imaging features (AUC 0.926; sensitivity 0.818; specificity 0.901), and a combination of clinical and imaging features (AUC 0.950; sensitivity 0.764; specificity 0.919). The synthetic minority oversampling method further improved the performance of the model using combined features (AUC 0.960; sensitivity 0.845; specificity 0.929). Conclusions: Clinical and imaging features can be used for automated severity assessment of COVID-19 and can potentially help triage patients with COVID-19 and prioritize care delivery to those at a higher risk of severe disease. 
%M 33534723 %R 10.2196/24572 %U http://medinform.jmir.org/2021/2/e24572/ %U https://doi.org/10.2196/24572 %U http://www.ncbi.nlm.nih.gov/pubmed/33534723 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e24246 %T A Machine Learning Prediction Model of Respiratory Failure Within 48 Hours of Patient Admission for COVID-19: Model Development and Validation %A Bolourani,Siavash %A Brenner,Max %A Wang,Ping %A McGinn,Thomas %A Hirsch,Jamie S %A Barnaby,Douglas %A Zanos,Theodoros P %+ Feinstein Institutes for Medical Research, Northwell Health, 350 Community Dr, Room 1257, Manhasset, NY, 11030, United States, 1 5165620484, tzanos@northwell.edu %K artificial intelligence %K prognostic %K model %K pandemic %K severe acute respiratory syndrome coronavirus 2 %K modeling %K development %K validation %K COVID-19 %K machine learning %D 2021 %7 10.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Predicting early respiratory failure due to COVID-19 can help triage patients to higher levels of care, allocate scarce resources, and reduce morbidity and mortality by appropriately monitoring and treating the patients at greatest risk for deterioration. Given the complexity of COVID-19, machine learning approaches may support clinical decision making for patients with this disease. Objective: Our objective is to derive a machine learning model that predicts respiratory failure within 48 hours of admission based on data from the emergency department. Methods: Data were collected from patients with COVID-19 who were admitted to Northwell Health acute care hospitals and were discharged, died, or spent a minimum of 48 hours in the hospital between March 1 and May 11, 2020. Of 11,525 patients, 933 (8.1%) were placed on invasive mechanical ventilation within 48 hours of admission. Variables used by the models included clinical and laboratory data commonly collected in the emergency department. 
We trained and validated three predictive models (two based on XGBoost and one that used logistic regression) using cross-hospital validation. We compared model performance among all three models as well as an established early warning score (Modified Early Warning Score) using receiver operating characteristic curves, precision-recall curves, and other metrics. Results: The XGBoost model had the highest mean accuracy (0.919; area under the curve=0.77), outperforming the other two models as well as the Modified Early Warning Score. Important predictor variables included the type of oxygen delivery used in the emergency department, patient age, Emergency Severity Index level, respiratory rate, serum lactate, and demographic characteristics. Conclusions: The XGBoost model had high predictive accuracy, outperforming other early warning scores. The clinical plausibility and predictive ability of XGBoost suggest that the model could be used to predict 48-hour respiratory failure in admitted patients with COVID-19. %M 33476281 %R 10.2196/24246 %U http://www.jmir.org/2021/2/e24246/ %U https://doi.org/10.2196/24246 %U http://www.ncbi.nlm.nih.gov/pubmed/33476281 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e23693 %T Fast and Accurate Detection of COVID-19 Along With 14 Other Chest Pathologies Using a Multi-Level Classification: Algorithm Development and Validation Study %A Albahli,Saleh %A Yar,Ghulam Nabi Ahmad Hassan %+ Department of Information Technology, College of Computer, Qassim University, Buraydah, 51452, Saudi Arabia, 966 163012604, salbahli@qu.edu.sa %K COVID-19 %K chest x-ray %K convolutional neural network %K data augmentation %K biomedical imaging %K automatic detection %D 2021 %7 10.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: COVID-19 has spread very rapidly, and it is important to build a system that can detect it in order to help an overwhelmed health care system. 
Many research studies on chest diseases rely on the strengths of deep learning techniques. Although some of these studies used state-of-the-art techniques and were able to deliver promising results, these techniques are not very useful if they can detect only one type of disease without detecting the others. Objective: The main objective of this study was to achieve a fast and more accurate diagnosis of COVID-19. This study proposes a diagnostic technique that classifies COVID-19 x-ray images from normal x-ray images and those specific to 14 other chest diseases. Methods: In this paper, we propose a novel, multilevel pipeline, based on deep learning models, to detect COVID-19 along with other chest diseases based on x-ray images. This pipeline reduces the burden of a single network to classify a large number of classes. The deep learning models used in this study were pretrained on the ImageNet dataset, and transfer learning was used for fast training. The lungs and heart were segmented from the whole x-ray images and passed onto the first classifier that checks whether the x-ray is normal, COVID-19 affected, or characteristic of another chest disease. If it is neither a COVID-19 x-ray image nor a normal one, then the second classifier comes into action and classifies the image as one of the other 14 diseases. Results: We show how our model uses state-of-the-art deep neural networks to achieve classification accuracy for COVID-19 along with 14 other chest diseases and normal cases based on x-ray images, which is competitive with currently used state-of-the-art models. Due to the lack of data in some classes such as COVID-19, we applied 10-fold cross-validation through the ResNet50 model. Our classification technique thus achieved an average training accuracy of 96.04% and test accuracy of 92.52% for the first level of classification (ie, 3 classes). 
For the second level of classification (ie, 14 classes), our technique achieved a maximum training accuracy of 88.52% and test accuracy of 66.634% by using ResNet50. We also found that when all the 16 classes were classified at once, the overall accuracy for COVID-19 detection decreased, which in the case of ResNet50 was 88.92% for training data and 71.905% for test data. Conclusions: Our proposed pipeline can detect COVID-19 with a higher accuracy along with detecting 14 other chest diseases based on x-ray images. This is achieved by dividing the classification task into multiple steps rather than classifying them collectively. %M 33529154 %R 10.2196/23693 %U http://www.jmir.org/2021/2/e23693/ %U https://doi.org/10.2196/23693 %U http://www.ncbi.nlm.nih.gov/pubmed/33529154 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e22320 %T The Need for Ethnoracial Equity in Artificial Intelligence for Diabetes Management: Review and Recommendations %A Pham,Quynh %A Gamble,Anissa %A Hearn,Jason %A Cafazzo,Joseph A %+ Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Health Sciences Building, 155 College Street, Toronto, ON, M5T 1P8, Canada, 1 4163404800 ext 4765, q.pham@uhn.ca %K diabetes %K artificial intelligence %K digital health %K ethnoracial equity %K ethnicity %K race %D 2021 %7 10.2.2021 %9 Viewpoint %J J Med Internet Res %G English %X There is clear evidence to suggest that diabetes does not affect all populations equally. Among adults living with diabetes, those from ethnoracial minority communities—foreign-born, immigrant, refugee, and culturally marginalized—are at increased risk of poor health outcomes. Artificial intelligence (AI) is actively being researched as a means of improving diabetes management and care; however, several factors may predispose AI to ethnoracial bias. 
To better understand whether diabetes AI interventions are being designed in an ethnoracially equitable manner, we conducted a secondary analysis of 141 articles included in a 2018 review by Contreras and Vehi entitled “Artificial Intelligence for Diabetes Management and Decision Support: Literature Review.” Two members of our research team independently reviewed each article and selected those reporting ethnoracial data for further analysis. Only 10 articles (7.1%) were ultimately selected for secondary analysis in our case study. Of the 131 excluded articles, 118 (90.1%) failed to mention participants’ ethnic or racial backgrounds. The included articles reported ethnoracial data under various categories, including race (n=6), ethnicity (n=2), race/ethnicity (n=3), and percentage of Caucasian participants (n=1). Among articles specifically reporting race, the average distribution was 69.5% White, 17.1% Black, and 3.7% Asian. Only 2 articles reported inclusion of Native American participants. Given the clear ethnic and racial differences in diabetes biomarkers, prevalence, and outcomes, the inclusion of ethnoracial training data is likely to improve the accuracy of predictive models. Such considerations are imperative in AI-based tools, which are predisposed to negative biases due to their black-box nature and proneness to distributional shift. Based on our findings, we propose a short questionnaire to assess ethnoracial equity in research describing AI-based diabetes interventions. At this unprecedented time in history, AI can either mitigate or exacerbate disparities in health care. Future accounts of the infancy of diabetes AI must reflect our early and decisive action to confront ethnoracial inequities before they are coded into our systems and perpetuate the very biases we aim to eliminate. 
If we take deliberate and meaningful steps now toward training our algorithms to be ethnoracially inclusive, we can architect innovations in diabetes care that are bound by the diverse fabric of our society. %M 33565982 %R 10.2196/22320 %U http://www.jmir.org/2021/2/e22320/ %U https://doi.org/10.2196/22320 %U http://www.ncbi.nlm.nih.gov/pubmed/33565982 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 2 %P e22164 %T Identifying Myocardial Infarction Using Hierarchical Template Matching–Based Myocardial Strain: Algorithm Development and Usability Study %A Bhalodiya,Jayendra Maganbhai %A Palit,Arnab %A Giblin,Gerard %A Tiwari,Manoj Kumar %A Prasad,Sanjay K %A Bhudia,Sunil K %A Arvanitis,Theodoros N %A Williams,Mark A %+ Institute of Digital Healthcare, Warwick Manufacturing Group, University of Warwick, Gibbet Hill Rd, Coventry, CV4 7AL, United Kingdom, 44 7448404975, jayendra.bhalodiya@warwick.ac.uk %K left ventricle %K myocardial infarction %K myocardium %K strain %D 2021 %7 10.2.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Myocardial infarction (MI; location and extent of infarction) can be determined by late enhancement cardiac magnetic resonance (CMR) imaging, which requires the injection of a potentially harmful gadolinium-based contrast agent (GBCA). Alternatively, emerging research in the area of myocardial strain has shown potential to identify MI using strain values. Objective: This study aims to identify the location of MI by developing an applied algorithmic method of circumferential strain (CS) values, which are derived through a novel hierarchical template matching (HTM) method. Methods: HTM-based CS H-spread from end-diastole to end-systole was used to develop an applied method. Grid-tagging magnetic resonance imaging was used to calculate strain values in the left ventricular (LV) myocardium, followed by the 16-segment American Heart Association model. 
The data set was used with k-fold cross-validation to estimate the percentage reduction of H-spread among infarcted and noninfarcted LV segments. A total of 43 participants (38 MI and 5 healthy) who underwent CMR imaging were retrospectively selected. Infarcted segments detected by using this method were validated by comparison with late enhancement CMR, and the diagnostic performance of the applied algorithmic method was evaluated with a receiver operating characteristic curve test. Results: The H-spread of the CS was reduced in infarcted segments compared with noninfarcted segments of the LV. The reductions were 30% in basal segments, 30% in midventricular segments, and 20% in apical LV segments. The diagnostic accuracy of detection, using the reported method, was represented by area under the curve values, which were 0.85, 0.82, and 0.87 for basal, midventricular, and apical slices, respectively, demonstrating good agreement with the late-gadolinium enhancement–based detections. Conclusions: The proposed applied algorithmic method has the potential to accurately identify the location of infarcted LV segments without the administration of late-gadolinium enhancement. Such an approach adds the potential to safely identify MI, potentially reduce patient scanning time, and extend the utility of CMR in patients who are contraindicated for the use of GBCA. 
%M 33565992 %R 10.2196/22164 %U https://medinform.jmir.org/2021/2/e22164 %U https://doi.org/10.2196/22164 %U http://www.ncbi.nlm.nih.gov/pubmed/33565992 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 2 %P e25935 %T Collaborating in the Time of COVID-19: The Scope and Scale of Innovative Responses to a Global Pandemic %A Bernardo,Theresa %A Sobkowich,Kurtis Edward %A Forrest,Russell Othmer %A Stewart,Luke Silva %A D'Agostino,Marcelo %A Perez Gutierrez,Enrique %A Gillis,Daniel %+ Department of Population Medicine, University of Guelph, 50 Stone Rd E, Guelph, ON, N1G 2W1, Canada, 1 519 824 4120 ext 54184, theresabernardo@gmail.com %K crowdsourcing %K artificial intelligence %K collaboration %K personal protective equipment %K big data %K AI %K COVID-19 %K innovation %K information sharing %K communication %K teamwork %K knowledge %K dissemination %D 2021 %7 9.2.2021 %9 Viewpoint %J JMIR Public Health Surveill %G English %X The emergence of COVID-19 spurred the formation of myriad teams to tackle every conceivable aspect of the virus and thwart its spread. Enabled by global digital connectedness, collaboration has become a constant theme throughout the pandemic, resulting in the expedition of the scientific process (including vaccine development), rapid consolidation of global outbreak data and statistics, and experimentation with novel partnerships. To document the evolution of these collaborative efforts, the authors collected illustrative examples as the pandemic unfolded, supplemented with publications from the JMIR COVID-19 Special Issue. Over 60 projects rooted in collaboration are categorized into five main themes: knowledge dissemination, data propagation, crowdsourcing, artificial intelligence, and hardware design and development. They highlight the numerous ways that citizens, industry professionals, researchers, and academics have come together worldwide to consolidate information and produce products to combat the COVID-19 pandemic. 
Initially, researchers and citizen scientists scrambled to access quality data within an overwhelming quantity of information. As global curated data sets emerged, derivative works such as visualizations or models were developed that depended on consistent data and would fail when there were unanticipated changes. Crowdsourcing was used to collect and analyze data, aid in contact tracing, and produce personal protective equipment by sharing open designs for 3D printing. An international consortium of entrepreneurs and researchers created a ventilator based on an open-source design. A coalition of nongovernmental organizations and governmental organizations, led by the White House Office of Science and Technology Policy, created a shared open resource of over 200,000 research publications about COVID-19 and subsequently offered cash prizes for the best solutions to 17 key questions involving artificial intelligence. A thread of collaboration wove throughout the pandemic response, which will shape future efforts. Novel partnerships will cross boundaries to create better processes, products, and solutions to consequential societal challenges. 
%M 33503001 %R 10.2196/25935 %U http://publichealth.jmir.org/2021/2/e25935/ %U https://doi.org/10.2196/25935 %U http://www.ncbi.nlm.nih.gov/pubmed/33503001 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 5 %N 2 %P e25184 %T Preliminary Screening for Hereditary Breast and Ovarian Cancer Using a Chatbot Augmented Intelligence Genetic Counselor: Development and Feasibility Study %A Sato,Ann %A Haneda,Eri %A Suganuma,Nobuyasu %A Narimatsu,Hiroto %+ Department of Genetic Medicine, Kanagawa Cancer Center, 2-3-2 Nakao, Asahi-ku, Yokohama, Kanagawa, 241-8515, Japan, 81 045 520 2222, hiroto-narimatsu@umin.org %K artificial intelligence %K augmented intelligence %K hereditary cancer %K familial cancer %K IBM Watson %K preliminary screening %K cancer %K genetics %K chatbot %K screening %K feasibility %D 2021 %7 5.2.2021 %9 Original Paper %J JMIR Form Res %G English %X Background: Breast cancer is the most common form of cancer in Japan; genetic background and hereditary breast and ovarian cancer (HBOC) are implicated. The key to HBOC diagnosis involves screening to identify high-risk individuals. However, genetic medicine is still developing; thus, many patients who may potentially benefit from genetic medicine have not yet been identified. Objective: This study’s objective is to develop a chatbot system that uses augmented intelligence for HBOC screening to determine whether patients meet the National Comprehensive Cancer Network (NCCN) BRCA1/2 testing criteria. Methods: The system was evaluated by a doctor specializing in genetic medicine and certified genetic counselors. We prepared 3 scenarios and created a conversation with the chatbot to reflect each one. Then we evaluated chatbot feasibility, the required time, the medical accuracy of conversations and family history, and the final result. Results: The times required for the conversation were 7 minutes for scenario 1, 15 minutes for scenario 2, and 16 minutes for scenario 3. 
Scenarios 1 and 2 met the BRCA1/2 testing criteria, but scenario 3 did not, and this result was consistent with the findings of 3 experts who retrospectively reviewed conversations with the chatbot according to the 3 scenarios. A family history comparison ascertained by the chatbot with the actual scenarios revealed that each result was consistent with each scenario. From a genetic medicine perspective, no errors were noted by the 3 experts. Conclusions: This study demonstrated that chatbot systems could be applied to preliminary genetic medicine screening for HBOC. %M 33544084 %R 10.2196/25184 %U https://formative.jmir.org/2021/2/e25184 %U https://doi.org/10.2196/25184 %U http://www.ncbi.nlm.nih.gov/pubmed/33544084 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e25187 %T Machine Learning–Based Early Warning Systems for Clinical Deterioration: Systematic Scoping Review %A Muralitharan,Sankavi %A Nelson,Walter %A Di,Shuang %A McGillion,Michael %A Devereaux,PJ %A Barr,Neil Grant %A Petch,Jeremy %+ Centre for Data Science and Digital Health, Hamilton Health Sciences, 293 Wellington St. N, Hamilton, ON, L8L 8E7, Canada, 1 2897882965, sankavi_22@hotmail.com %K machine learning %K early warning systems %K clinical deterioration %K ambulatory care %K acute care %K remote patient monitoring %K vital signs %K sepsis %K cardiorespiratory instability %K risk prediction %D 2021 %7 4.2.2021 %9 Review %J J Med Internet Res %G English %X Background: Timely identification of patients at a high risk of clinical deterioration is key to prioritizing care, allocating resources effectively, and preventing adverse outcomes. Vital signs–based, aggregate-weighted early warning systems are commonly used to predict the risk of outcomes related to cardiorespiratory instability and sepsis, which are strong predictors of poor outcomes and mortality. 
Machine learning models, which can incorporate trends and capture relationships among parameters that aggregate-weighted models cannot, have recently been showing promising results. Objective: This study aimed to identify, summarize, and evaluate the available research, current state of utility, and challenges with machine learning–based early warning systems using vital signs to predict the risk of physiological deterioration in acutely ill patients, across acute and ambulatory care settings. Methods: PubMed, CINAHL, Cochrane Library, Web of Science, Embase, and Google Scholar were searched for peer-reviewed, original studies with keywords related to “vital signs,” “clinical deterioration,” and “machine learning.” Included studies used patient vital signs along with demographics and described a machine learning model for predicting an outcome in acute and ambulatory care settings. Data were extracted following PRISMA, TRIPOD, and Cochrane Collaboration guidelines. Results: We identified 24 peer-reviewed studies from 417 articles for inclusion; 23 studies were retrospective, while 1 was prospective in nature. Care settings included general wards, intensive care units, emergency departments, step-down units, medical assessment units, postanesthetic wards, and home care. Machine learning models including logistic regression, tree-based methods, kernel-based methods, and neural networks were most commonly used to predict the risk of deterioration. The area under the curve for models ranged from 0.57 to 0.97. Conclusions: In studies that compared performance, reported results suggest that machine learning–based early warning systems can achieve greater accuracy than aggregate-weighted early warning systems but several areas for further research were identified. While these models have the potential to provide clinical decision support, there is a need for standardized outcome measures to allow for rigorous evaluation of performance across models. 
Further research needs to address the interpretability of model outputs by clinicians, clinical efficacy of these systems through prospective study design, and their potential impact in different clinical settings. %M 33538696 %R 10.2196/25187 %U https://www.jmir.org/2021/2/e25187 %U https://doi.org/10.2196/25187 %U http://www.ncbi.nlm.nih.gov/pubmed/33538696 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 2 %P e23436 %T Hidden Variables in Deep Learning Digital Pathology and Their Potential to Cause Batch Effects: Prediction Model Study %A Schmitt,Max %A Maron,Roman Christoph %A Hekler,Achim %A Stenzinger,Albrecht %A Hauschild,Axel %A Weichenthal,Michael %A Tiemann,Markus %A Krahl,Dieter %A Kutzner,Heinz %A Utikal,Jochen Sven %A Haferkamp,Sebastian %A Kather,Jakob Nikolas %A Klauschen,Frederick %A Krieghoff-Henning,Eva %A Fröhling,Stefan %A von Kalle,Christof %A Brinker,Titus Josef %+ Digital Biomarkers for Oncology Group, National Center for Tumor Diseases, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, Heidelberg, 69120, Germany, 49 6221 3219304, titus.brinker@dkfz.de %K artificial intelligence %K machine learning %K deep learning %K neural networks %K convolutional neural networks %K pathology %K clinical pathology %K digital pathology %K pitfalls %K artifacts %D 2021 %7 2.2.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: An increasing number of studies within digital pathology show the potential of artificial intelligence (AI) to diagnose cancer using histological whole slide images, which requires large and diverse data sets. While diversification may result in more generalizable AI-based systems, it can also introduce hidden variables. If neural networks are able to distinguish/learn hidden variables, these variables can introduce batch effects that compromise the accuracy of classification systems. 
Objective: The objective of the study was to analyze the learnability of an exemplary selection of hidden variables (patient age, slide preparation date, slide origin, and scanner type) that are commonly found in whole slide image data sets in digital pathology and could create batch effects. Methods: We trained four separate convolutional neural networks (CNNs) to learn four variables using a data set of digitized whole slide melanoma images from five different institutes. For robustness, each CNN training and evaluation run was repeated multiple times, and a variable was only considered learnable if the lower bound of the 95% confidence interval of its mean balanced accuracy was above 50.0%. Results: A mean balanced accuracy above 50.0% was achieved for all four tasks, even when considering the lower bound of the 95% confidence interval. Performance between tasks showed wide variation, ranging from 56.1% (slide preparation date) to 100% (slide origin). Conclusions: Because all of the analyzed hidden variables are learnable, they have the potential to create batch effects in dermatopathology data sets, which negatively affect AI-based classification systems. Practitioners should be aware of these and similar pitfalls when developing and evaluating such systems and address these and potentially other batch effect variables in their data sets through sufficient data set stratification. 
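The learnability criterion described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `ci_lower_bound` and `is_learnable`, the normal-approximation interval (z = 1.96), and the example accuracies are all assumptions, since the abstract does not state how the confidence interval was computed.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # assumption: normal approximation for the 95% interval


def ci_lower_bound(balanced_accuracies):
    """Lower bound of the 95% CI of the mean over repeated runs."""
    n = len(balanced_accuracies)
    if n < 2:
        raise ValueError("need at least two runs to estimate a CI")
    standard_error = stdev(balanced_accuracies) / sqrt(n)
    return mean(balanced_accuracies) - Z_95 * standard_error


def is_learnable(balanced_accuracies, chance_level=0.5):
    """A variable counts as learnable only if the CI lower bound exceeds chance."""
    return ci_lower_bound(balanced_accuracies) > chance_level


# Hypothetical balanced accuracies from five repeated runs (not study data).
clearly_above_chance = [0.55, 0.57, 0.56, 0.58, 0.54]
near_chance = [0.48, 0.53, 0.51, 0.49, 0.52]
print(is_learnable(clearly_above_chance))
print(is_learnable(near_chance))
```

Testing the lower bound rather than the mean keeps a variable from being labeled learnable when its mean sits just above 50% only because of run-to-run noise.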
%M 33528370 %R 10.2196/23436 %U https://www.jmir.org/2021/2/e23436 %U https://doi.org/10.2196/23436 %U http://www.ncbi.nlm.nih.gov/pubmed/33528370 %0 Journal Article %@ 2562-7600 %I JMIR Publications %V 4 %N 1 %P e23933 %T Predicted Influences of Artificial Intelligence on Nursing Education: Scoping Review %A Buchanan,Christine %A Howitt,M Lyndsay %A Wilson,Rita %A Booth,Richard G %A Risling,Tracie %A Bamford,Megan %+ Registered Nurses' Association of Ontario, 500-4211 Yonge Street, Toronto, ON, M2P 2A9, Canada, 1 800 268 7199 ext 281, cbuchanan@rnao.ca %K nursing %K artificial intelligence %K education %K review %D 2021 %7 28.1.2021 %9 Review %J JMIR Nursing %G English %X Background: It is predicted that artificial intelligence (AI) will transform nursing across all domains of nursing practice, including administration, clinical care, education, policy, and research. Increasingly, researchers are exploring the potential influences of AI health technologies (AIHTs) on nursing in general and on nursing education more specifically. However, little emphasis has been placed on synthesizing this body of literature. Objective: A scoping review was conducted to summarize the current and predicted influences of AIHTs on nursing education over the next 10 years and beyond. Methods: This scoping review followed a previously published protocol from April 2020. Using an established scoping review methodology, the databases of MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Embase, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Centre, Scopus, Web of Science, and Proquest were searched. In addition to the use of these electronic databases, a targeted website search was performed to access relevant grey literature. Abstracts and full-text studies were independently screened by two reviewers using prespecified inclusion and exclusion criteria. 
Included literature focused on nursing education and digital health technologies that incorporate AI. Data were charted using a structured form and narratively summarized into categories. Results: A total of 27 articles were identified (20 expository papers, six studies with quantitative or prototyping methods, and one qualitative study). The population included nurses, nurse educators, and nursing students at the entry-to-practice, undergraduate, graduate, and doctoral levels. A variety of AIHTs were discussed, including virtual avatar apps, smart homes, predictive analytics, virtual or augmented reality, and robots. The two key categories derived from the literature were (1) influences of AI on nursing education in academic institutions and (2) influences of AI on nursing education in clinical practice. Conclusions: Curricular reform is urgently needed within nursing education programs in academic institutions and clinical practice settings to prepare nurses and nursing students to practice safely and efficiently in the age of AI. Additionally, nurse educators need to adopt new and evolving pedagogies that incorporate AI to better support students at all levels of education. Finally, nursing students and practicing nurses must be equipped with the requisite knowledge and skills to effectively assess AIHTs and safely integrate those deemed appropriate to support person-centered compassionate nursing care in practice settings. 
International Registered Report Identifier (IRRID): RR2-10.2196/17490 %R 10.2196/23933 %U https://nursing.jmir.org/2021/1/e23933/ %U https://doi.org/10.2196/23933 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 1 %P e24973 %T Deep Learning Models for Predicting Severe Progression in COVID-19-Infected Patients: Retrospective Study %A Ho,Thao Thi %A Park,Jongmin %A Kim,Taewoo %A Park,Byunggeon %A Lee,Jaehee %A Kim,Jin Young %A Kim,Ki Beom %A Choi,Sooyoung %A Kim,Young Hwan %A Lim,Jae-Kwang %A Choi,Sanghun %+ School of Mechanical Engineering, Kyungpook National University, 80 Daehak-ro, Buk-gu, Daegu, 41566, Republic of Korea, 82 53 950 5578, s-choi@knu.ac.kr %K COVID-19 %K deep learning %K artificial neural network %K convolutional neural network %K lung CT %D 2021 %7 28.1.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Many COVID-19 patients rapidly progress to respiratory failure with a broad range of severities. Identification of high-risk cases is critical for early intervention. Objective: The aim of this study is to develop deep learning models that can rapidly identify high-risk COVID-19 patients based on computed tomography (CT) images and clinical data. Methods: We analyzed 297 COVID-19 patients from five hospitals in Daegu, South Korea. A mixed artificial convolutional neural network (ACNN) model, combining an artificial neural network for clinical data and a convolutional neural network for 3D CT imaging data, was developed to classify these cases as either high risk of severe progression (ie, event) or low risk (ie, event-free). Results: Using the mixed ACNN model, we were able to obtain high classification performance using novel coronavirus pneumonia lesion images (ie, 93.9% accuracy, 80.8% sensitivity, 96.9% specificity, and 0.916 area under the curve [AUC] score) and lung segmentation images (ie, 94.3% accuracy, 74.7% sensitivity, 95.9% specificity, and 0.928 AUC score) for event versus event-free groups. 
Conclusions: Our study successfully differentiated high-risk cases among COVID-19 patients using imaging and clinical features. The developed model can be used as a predictive tool for interventions in aggressive therapies. %M 33455900 %R 10.2196/24973 %U http://medinform.jmir.org/2021/1/e24973/ %U https://doi.org/10.2196/24973 %U http://www.ncbi.nlm.nih.gov/pubmed/33455900 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 1 %P e24924 %T Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study %A Wang,Hanxue %A Cui,Wenjuan %A Guo,Yunchang %A Du,Yi %A Zhou,Yuanchun %+ Computer Network Information Center, Chinese Academy of Sciences, No 4, South Fourth Street, Zhongguancun, Haidian District, Beijing, 100190, China, 86 15810134970, duyi@cnic.cn %K foodborne disease %K pathogens prediction %K machine learning %D 2021 %7 26.1.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Foodborne diseases have a high global incidence; thus, they place a heavy burden on public health and the social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne diseases caused by different pathogens lack specificity in their clinical features, and there is a low proportion of actual clinical pathogen detection in real life. Objective: We aimed to analyze foodborne disease case data, select appropriate features based on analysis results, and use machine learning methods to classify foodborne disease pathogens to predict foodborne disease pathogens for cases where the pathogen is not known or tested. Methods: We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationships between these features and the foodborne disease pathogens using a variety of machine learning methods to classify foodborne disease pathogens. 
We compared the results of four models to obtain the pathogen prediction model with the highest accuracy. Results: The gradient boost decision tree model obtained the highest accuracy, with accuracy approaching 69% in identifying 4 pathogens: Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating the importance of features such as time of illness, geographical longitude and latitude, and diarrhea frequency, we found that these features play important roles in classifying foodborne disease pathogens. Conclusions: Data analysis can reflect the distribution of some features of foodborne diseases and the relationships among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases. %M 33496675 %R 10.2196/24924 %U http://medinform.jmir.org/2021/1/e24924/ %U https://doi.org/10.2196/24924 %U http://www.ncbi.nlm.nih.gov/pubmed/33496675 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 1 %P e19739 %T An Application of Machine Learning to Etiological Diagnosis of Secondary Hypertension: Retrospective Study Using Electronic Medical Records %A Diao,Xiaolin %A Huo,Yanni %A Yan,Zhanzheng %A Wang,Haibin %A Yuan,Jing %A Wang,Yuxin %A Cai,Jun %A Zhao,Wei %+ Department of Information Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, 167 Beilishi Road, Beijing, 100037, China, 86 1 333 119 2899, zw@fuwai.com %K secondary hypertension %K etiological diagnosis %K machine learning %K prediction model %D 2021 %7 25.1.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: Secondary hypertension is a kind of hypertension with a definite etiology and may be cured. 
Patients with suspected secondary hypertension can benefit from timely detection and treatment and, conversely, will have a higher risk of morbidity and mortality than those with primary hypertension. Objective: The aim of this study was to develop and validate machine learning (ML) prediction models of common etiologies in patients with suspected secondary hypertension. Methods: The analyzed data set was retrospectively extracted from electronic medical records of patients discharged from Fuwai Hospital between January 1, 2016, and June 30, 2019. A total of 7532 unique patients were included and divided into 2 data sets by time: 6302 patients in 2016-2018 as the training data set for model building and 1230 patients in 2019 as the validation data set for further evaluation. Extreme Gradient Boosting (XGBoost) was adopted to develop 5 models to predict 4 etiologies of secondary hypertension and occurrence of any of them (named as composite outcome), including renovascular hypertension (RVH), primary aldosteronism (PA), thyroid dysfunction, and aortic stenosis. Both univariate logistic analysis and Gini Impurity were used for feature selection. Grid search and 10-fold cross-validation were used to select the optimal hyperparameters for each model. Results: Validation of the composite outcome prediction model showed good performance with an area under the receiver-operating characteristic curve (AUC) of 0.924 in the validation data set, while the 4 prediction models of RVH, PA, thyroid dysfunction, and aortic stenosis achieved AUC of 0.938, 0.965, 0.959, and 0.946, respectively, in the validation data set. A total of 79 clinical indicators were identified in all and finally used in our prediction models. The result of subgroup analysis on the composite outcome prediction model demonstrated high discrimination with AUCs all higher than 0.890 among all age groups of adults. 
Conclusions: The ML prediction models in this study showed good performance in detecting 4 etiologies of patients with suspected secondary hypertension; thus, they may potentially facilitate clinical diagnosis decision making of secondary hypertension in an intelligent way. %M 33492233 %R 10.2196/19739 %U http://medinform.jmir.org/2021/1/e19739/ %U https://doi.org/10.2196/19739 %U http://www.ncbi.nlm.nih.gov/pubmed/33492233 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 1 %P e20123 %T Risk Stratification for Early Detection of Diabetes and Hypertension in Resource-Limited Settings: Machine Learning Analysis %A Boutilier,Justin J %A Chan,Timothy C Y %A Ranjan,Manish %A Deo,Sarang %+ Department of Industrial and Systems Engineering, University of Wisconsin-Madison, 1513 University Avenue, Madison, WI, 53706, United States, 1 6082630350, jboutilier@wisc.edu %K machine learning %K diabetes %K hypertension %K screening %K global health %D 2021 %7 21.1.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: The impending scale up of noncommunicable disease screening programs in low- and middle-income countries coupled with limited health resources require that such programs be as accurate as possible at identifying patients at high risk. Objective: The aim of this study was to develop machine learning–based risk stratification algorithms for diabetes and hypertension that are tailored for the at-risk population served by community-based screening programs in low-resource settings. Methods: We trained and tested our models by using data from 2278 patients collected by community health workers through door-to-door and camp-based screenings in the urban slums of Hyderabad, India between July 14, 2015 and April 21, 2018. 
We determined the best models for predicting short-term (2-month) risk of diabetes and hypertension (a model for diabetes and a model for hypertension) and compared these models to previously developed risk scores from the United States and the United Kingdom by using prediction accuracy as characterized by the area under the receiver operating characteristic curve (AUC) and the number of false negatives. Results: We found that models based on random forest had the highest prediction accuracy for both diseases and were able to outperform the US and UK risk scores in terms of AUC by 35.5% for diabetes (improvement of 0.239 from 0.671 to 0.910) and 13.5% for hypertension (improvement of 0.094 from 0.698 to 0.792). For a fixed screening specificity of 0.9, the random forest model was able to reduce the expected number of false negatives by 620 patients per 1000 screenings for diabetes and 220 patients per 1000 screenings for hypertension. This improvement reduces the cost of incorrect risk stratification by US $1.99 (or 35%) per screening for diabetes and US $1.60 (or 21%) per screening for hypertension. Conclusions: In the next decade, health systems in many countries are planning to spend significant resources on noncommunicable disease screening programs and our study demonstrates that machine learning models can be leveraged by these programs to effectively utilize limited resources by improving risk stratification. 
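The relative AUC improvements reported in the Results follow from the absolute AUC values given in parentheses. The sketch below only reproduces that arithmetic; the function name `relative_improvement` is an assumption, and because the inputs are the rounded AUCs from the abstract, the recomputed percentages may differ slightly from the reported 35.5% and 13.5%.

```python
def relative_improvement(baseline_auc, model_auc):
    """Relative gain of the model AUC over the baseline AUC, as a fraction."""
    return (model_auc - baseline_auc) / baseline_auc


# Rounded AUC values as reported in the abstract.
diabetes = relative_improvement(0.671, 0.910)
hypertension = relative_improvement(0.698, 0.792)
print(f"diabetes: {diabetes:.1%}, hypertension: {hypertension:.1%}")
```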
%M 33475518 %R 10.2196/20123 %U http://www.jmir.org/2021/1/e20123/ %U https://doi.org/10.2196/20123 %U http://www.ncbi.nlm.nih.gov/pubmed/33475518 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 1 %P e24618 %T Development of Social Support Networks by Patients With Depression Through Online Health Communities: Social Network Analysis %A Lu,Yingjie %A Luo,Shuwen %A Liu,Xuan %+ School of Business, East China University of Science and Technology, Meilong Road 130, Shanghai, 200237, China, 86 2164252489, xuanliu@ecust.edu.cn %K online depression community %K social support network %K exponential random graph model %K informational support %K emotional support %K mental health %K depression %K social network %D 2021 %7 7.1.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: In recent years, people with mental health problems are increasingly using online social networks to receive social support. For example, in online depression communities, patients can share their experiences, exchange valuable information, and receive emotional support to help them cope with their disease. Therefore, it is critical to understand how patients with depression develop online social support networks to exchange informational and emotional support. Objective: Our aim in this study was to investigate which user attributes have significant effects on the formation of informational and emotional support networks in online depression communities and to further examine whether there is an association between the two social networks. Methods: We used social network theory and constructed exponential random graph models to help understand the informational and emotional support networks in online depression communities. A total of 74,986 original posts were retrieved from 1077 members in an online depression community in China from April 2003 to September 2017 and the available data were extracted. 
An informational support network of 1077 participant nodes and 6557 arcs and an emotional support network of 1077 participant nodes and 6430 arcs were constructed to examine the endogenous (purely structural) effects and exogenous (actor-relation) effects on each support network separately, as well as the cross-network effects between the two networks. Results: We found significant effects of two important structural features, reciprocity and transitivity, on the formation of both the informational support network (r=3.6247, P<.001, and r=1.6232, P<.001, respectively) and the emotional support network (r=4.4111, P<.001, and r=0.0177, P<.001, respectively). The results also showed significant effects of some individual factors on the formation of the two networks. No significant effects of homophily were found for gender (r=0.0783, P=.20, and r=0.1122, P=.25, respectively) in the informational or emotional support networks. There was no tendency for users who had great influence (r=0.3253, P=.05) or wrote more posts (r=0.3896, P=.07) or newcomers (r=–0.0452, P=.66) to form informational support ties more easily. However, users who spent more time online (r=0.6680, P<.001) or provided more replies to other posts (r=0.5026, P<.001) were more likely to form informational support ties. Users who had a big influence (r=0.8325, P<.001), spent more time online (r=0.5839, P<.001), wrote more posts (r=2.4025, P<.001), or provided more replies to other posts (r=0.2259, P<.001) were more likely to form emotional support ties, and newcomers (r=–0.4224, P<.001) were less likely than old-timers to receive emotional support. In addition, we found that there was a significant entrainment effect (r=0.7834, P<.001) and a nonsignificant exchange effect (r=–0.2757, P=.32) between the two networks. 
Conclusions: This study makes several important theoretical contributions to the research on online depression communities and has important practical implications for the managers of online depression communities and the users involved in these communities. %M 33279878 %R 10.2196/24618 %U http://medinform.jmir.org/2021/1/e24618/ %U https://doi.org/10.2196/24618 %U http://www.ncbi.nlm.nih.gov/pubmed/33279878 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 10 %N 1 %P e21453 %T Natural Language Processing–Based Virtual Cofacilitator for Online Cancer Support Groups: Protocol for an Algorithm Development and Validation Study %A Leung,Yvonne W %A Wouterloot,Elise %A Adikari,Achini %A Hirst,Graeme %A de Silva,Daswin %A Wong,Jiahui %A Bender,Jacqueline L %A Gancarz,Mathew %A Gratzer,David %A Alahakoon,Damminda %A Esplen,Mary Jane %+ de Souza Institute, University Health Network, 222 St Patrick St Rm 503, Toronto, ON, M5T 1V4, Canada, 1 844 758 6891, yvonne.leung@desouzainstitute.com %K artificial intelligence %K cancer %K online support groups %K emotional distress %K natural language processing %K participant engagement %D 2021 %7 7.1.2021 %9 Protocol %J JMIR Res Protoc %G English %X Background: Cancer and its treatment can significantly impact the short- and long-term psychological well-being of patients and families. Emotional distress and depressive symptomatology are often associated with poor treatment adherence, reduced quality of life, and higher mortality. Cancer support groups, especially those led by health care professionals, provide a safe place for participants to discuss fear, normalize stress reactions, share solidarity, and learn about effective strategies to build resilience and enhance coping. However, in-person support groups may not always be accessible to individuals; geographic distance is one of the barriers for access, and compromised physical condition (eg, fatigue, pain) is another. 
Emerging evidence supports the effectiveness of online support groups in reducing access barriers. Text-based and professional-led online support groups have been offered by Cancer Chat Canada. Participants join the group discussion using text in real time. However, therapist leaders report some challenges leading text-based online support groups in the absence of visual cues, particularly in tracking participant distress. With multiple participants typing at the same time, the nuances of the text messages or red flags for distress can sometimes be missed. Recent advances in artificial intelligence such as deep learning–based natural language processing offer potential solutions. This technology can be used to analyze online support group text data to track participants’ expressed emotional distress, including fear, sadness, and hopelessness. Artificial intelligence allows session activities to be monitored in real time and alerts the therapist to participant disengagement. Objective: We aim to develop and evaluate an artificial intelligence–based cofacilitator prototype to track and monitor online support group participants’ distress through real-time analysis of text-based messages posted during synchronous sessions. Methods: An artificial intelligence–based cofacilitator will be developed to identify participants who are at risk of increased emotional distress and track participant engagement and in-session group cohesion levels, providing real-time alerts for the therapist to follow up; generate postsession participant profiles that contain discussion content keywords and emotion profiles for each session; and automatically suggest tailored resources to participants according to their needs. The study is designed to be conducted in 4 phases consisting of (1) development based on a subset of data and an existing natural language processing framework, (2) performance evaluation using human scoring, (3) beta testing, and (4) user experience evaluation. 
Results: This study received ethics approval in August 2019. Phase 1, development of an artificial intelligence–based cofacilitator, was completed in January 2020. As of December 2020, phase 2 is underway. The study is expected to be completed by September 2021. Conclusions: An artificial intelligence–based cofacilitator offers a promising new mode of delivery of person-centered online support groups tailored to individual needs. International Registered Report Identifier (IRRID): DERR1-10.2196/21453 %M 33410754 %R 10.2196/21453 %U https://www.researchprotocols.org/2021/1/e21453 %U https://doi.org/10.2196/21453 %U http://www.ncbi.nlm.nih.gov/pubmed/33410754 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 1 %P e19928 %T Utilization of Self-Diagnosis Health Chatbots in Real-World Settings: Case Study %A Fan,Xiangmin %A Chao,Daren %A Zhang,Zhan %A Wang,Dakuo %A Li,Xiaohua %A Tian,Feng %+ School of Computer Science and Information Systems, Pace University, 1 Pace Plaza, New York, NY, 10078, United States, 1 9147733254, zzhang@pace.edu %K self-diagnosis %K chatbot %K conversational agent %K human–artificial intelligence interaction %K artificial intelligence %K diagnosis %K case study %K eHealth %K real world %K user experience %D 2021 %7 6.1.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Artificial intelligence (AI)-driven chatbots are increasingly being used in health care, but most chatbots are designed for a specific population and evaluated in controlled settings. There is little research documenting how health consumers (eg, patients and caregivers) use chatbots for self-diagnosis purposes in real-world scenarios. Objective: The aim of this research was to understand how health chatbots are used in a real-world context, what issues and barriers exist in their usage, and how the user experience of this novel technology can be improved. 
Methods: We employed a data-driven approach to analyze the system log of a widely deployed self-diagnosis chatbot in China. Our data set consisted of 47,684 consultation sessions initiated by 16,519 users over 6 months. The log data included a variety of information, including users’ nonidentifiable demographic information, consultation details, diagnostic reports, and user feedback. We conducted both statistical analysis and content analysis on this heterogeneous data set. Results: The chatbot users spanned all age groups, including middle-aged and older adults. Users consulted the chatbot on a wide range of medical conditions, including those that often entail considerable privacy and social stigma issues. Furthermore, we distilled 2 prominent issues in the use of the chatbot: (1) a considerable number of users dropped out in the middle of their consultation sessions, and (2) some users pretended to have health concerns and used the chatbot for nontherapeutic purposes. Finally, we identified a set of user concerns regarding the use of the chatbot, including insufficient actionable information and perceived inaccurate diagnostic suggestions. Conclusions: Although health chatbots are considered to be convenient tools for enhancing patient-centered care, there are issues and barriers impeding the optimal use of this novel technology. Designers and developers should employ user-centered approaches to address the issues and user concerns to achieve the best uptake and utilization. We conclude the paper by discussing several design implications, including making the chatbots more informative, easy-to-use, and trustworthy, as well as improving the onboarding experience to enhance user engagement. 
%M 33404508 %R 10.2196/19928 %U https://www.jmir.org/2021/1/e19928 %U https://doi.org/10.2196/19928 %U http://www.ncbi.nlm.nih.gov/pubmed/33404508 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 12 %P e21965 %T Automatically Explaining Machine Learning Prediction Results on Asthma Hospital Visits in Patients With Asthma: Secondary Analysis %A Luo,Gang %A Johnson,Michael D %A Nkoy,Flory L %A He,Shan %A Stone,Bryan L %+ Department of Biomedical Informatics and Medical Education, University of Washington, Building C, Box 358047, 850 Republican Street, Seattle, WA, 98195, United States, 1 2062214596, gangluo@cs.wisc.edu %K asthma %K forecasting %K machine learning %K patient care management %D 2020 %7 31.12.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Asthma is a major chronic disease that poses a heavy burden on health care. To facilitate the allocation of care management resources aimed at improving outcomes for high-risk patients with asthma, we recently built a machine learning model to predict asthma hospital visits in the subsequent year in patients with asthma. Our model is more accurate than previous models. However, like most machine learning models, it offers no explanation of its prediction results. This creates a barrier for use in care management, where interpretability is desired. Objective: This study aims to develop a method to automatically explain the prediction results of the model and recommend tailored interventions without lowering the performance measures of the model. Methods: Our data were imbalanced, with only a small portion of data instances linking to future asthma hospital visits. To handle imbalanced data, we extended our previous method of automatically offering rule-formed explanations for the prediction results of any machine learning model on tabular data without lowering the model’s performance measures. 
In a secondary analysis of the 334,564 data instances from Intermountain Healthcare between 2005 and 2018 used to form our model, we employed the extended method to automatically explain the prediction results of our model and recommend tailored interventions. The patient cohort consisted of all patients with asthma who received care at Intermountain Healthcare between 2005 and 2018, and resided in Utah or Idaho as recorded at the visit. Results: Our method explained the prediction results for 89.7% (391/436) of the patients with asthma who, per our model’s correct prediction, were likely to incur asthma hospital visits in the subsequent year. Conclusions: This study is the first to demonstrate the feasibility of automatically offering rule-formed explanations for the prediction results of any machine learning model on imbalanced tabular data without lowering the performance measures of the model. After further improvement, our asthma outcome prediction model coupled with the automatic explanation function could be used by clinicians to guide the allocation of limited asthma care management resources and the identification of appropriate interventions. 
%M 33382379 %R 10.2196/21965 %U http://medinform.jmir.org/2020/12/e21965/ %U https://doi.org/10.2196/21965 %U http://www.ncbi.nlm.nih.gov/pubmed/33382379 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e22422 %T Deep Neural Network for Reducing the Screening Workload in Systematic Reviews for Clinical Guidelines: Algorithm Validation Study %A Yamada,Tomohide %A Yoneoka,Daisuke %A Hiraike,Yuta %A Hino,Kimihiro %A Toyoshiba,Hiroyoshi %A Shishido,Akira %A Noma,Hisashi %A Shojima,Nobuhiro %A Yamauchi,Toshimasa %+ University Institute for Population Health, King’s College London, Addison House, Guys Campus, London, SE1 1UL, United Kingdom, 44 (0)20 7848 6625, bqx07367@yahoo.co.jp %K machine learning %K evidence-based medicine %K systematic review %K meta-analysis %K clinical guideline %K deep learning %K neural network %D 2020 %7 30.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Performing systematic reviews is a time-consuming and resource-intensive process. Objective: We investigated whether a machine learning system could perform systematic reviews more efficiently. Methods: All systematic reviews and meta-analyses of interventional randomized controlled trials cited in recent clinical guidelines from the American Diabetes Association, American College of Cardiology, American Heart Association (2 guidelines), and American Stroke Association were assessed. After reproducing the primary screening data set according to the published search strategy of each, we extracted correct articles (those actually reviewed) and incorrect articles (those not reviewed) from the data set. These 2 sets of articles were used to train a neural network–based artificial intelligence engine (Concept Encoder, Fronteo Inc). The primary endpoint was work saved over sampling at 95% recall (WSS@95%). Results: Among 145 candidate reviews of randomized controlled trials, 8 reviews fulfilled the inclusion criteria. 
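Work saved over sampling at recall R is commonly defined as WSS@R = (TN + FN)/N - (1 - R), where TN and FN are the articles the screening tool excludes correctly and incorrectly, and N is the total number of articles screened. The abstract does not state the formula used, so the sketch below assumes this common definition; the function name and the example counts are hypothetical and are not taken from the study.

```python
def wss_at_recall(n_total: int, true_neg: int, false_neg: int,
                  recall: float = 0.95) -> float:
    """Work saved over sampling: fraction of articles a reviewer can skip,
    minus the fraction that random sampling would skip at the same recall."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return (true_neg + false_neg) / n_total - (1.0 - recall)


# Hypothetical screening run: 1000 articles, 705 of them ranked below the
# cut-off that achieves 95% recall (700 truly irrelevant, 5 missed).
example_wss = wss_at_recall(n_total=1000, true_neg=700, false_neg=5)
print(f"WSS@95% = {example_wss:.3f}")
```

A value near 0 means the tool saves no more work than random sampling would; a value near 0.95 is the ceiling at 95% recall.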
For these 8 reviews, the machine learning system significantly reduced the literature screening workload by at least 6-fold versus that of manual screening based on WSS@95%. When machine learning was initiated using 2 correct articles that were randomly selected by a researcher, a 10-fold reduction in workload was achieved versus that of manual screening based on the WSS@95% value, with high sensitivity for eligible studies. The area under the receiver operating characteristic curve increased dramatically every time the algorithm learned a correct article. Conclusions: Concept Encoder achieved a 10-fold reduction of the screening workload for systematic review after learning from 2 randomly selected studies on the target topic. However, few meta-analyses of randomized controlled trials were included. Concept Encoder could facilitate the acquisition of evidence for clinical guidelines. %M 33262102 %R 10.2196/22422 %U https://www.jmir.org/2020/12/e22422 %U https://doi.org/10.2196/22422 %U http://www.ncbi.nlm.nih.gov/pubmed/33262102 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e25442 %T An Artificial Intelligence Model to Predict the Mortality of COVID-19 Patients at Hospital Admission Time Using Routine Blood Samples: Development and Validation of an Ensemble Model %A Ko,Hoon %A Chung,Heewon %A Kang,Wu Seong %A Park,Chul %A Kim,Do Wan %A Kim,Seong Eun %A Chung,Chi Ryang %A Ko,Ryoung Eun %A Lee,Hooseok %A Seo,Jae Ho %A Choi,Tae-Young %A Jaimes,Rafael %A Kim,Kyung Won %A Lee,Jinseok %+ Biomedical Engineering, Wonkwang University, Iksan Daero, Iksan, 54538, Republic of Korea, 82 1638506970, gonasago@gmail.com %K COVID-19 %K artificial intelligence %K blood samples %K mortality prediction %D 2020 %7 23.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: COVID-19, which is accompanied by acute respiratory distress, multiple organ failure, and death, has spread worldwide much faster than previously thought. 
However, at present, it has limited treatments. Objective: To overcome this issue, we developed an artificial intelligence (AI) model of COVID-19, named EDRnet (ensemble learning model based on deep neural network and random forest models), to predict in-hospital mortality using a routine blood sample at the time of hospital admission. Methods: We selected 28 blood biomarkers and used the age and gender information of patients as model inputs. To improve the mortality prediction, we adopted an ensemble approach combining deep neural network and random forest models. We trained our model with a database of blood samples from 361 COVID-19 patients in Wuhan, China, and applied it to 106 COVID-19 patients in three Korean medical institutions. Results: In the testing data sets, EDRnet provided high sensitivity (100%), specificity (91%), and accuracy (92%). To extend the number of patient data points, we developed a web application (BeatCOVID19) where anyone can access the model to predict mortality and can register his or her own blood laboratory results. Conclusions: Our new AI model, EDRnet, accurately predicts the mortality rate for COVID-19. It is publicly available and aims to help health care providers fight COVID-19 and improve patients’ outcomes. 
%M 33301414 %R 10.2196/25442 %U http://www.jmir.org/2020/12/e25442/ %U https://doi.org/10.2196/25442 %U http://www.ncbi.nlm.nih.gov/pubmed/33301414 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 12 %P e23082 %T Model-Based Reasoning of Clinical Diagnosis in Integrative Medicine: Real-World Methodological Study of Electronic Medical Records and Natural Language Processing Methods %A Geng,Wenye %A Qin,Xuanfeng %A Yang,Tao %A Cong,Zhilei %A Wang,Zhuo %A Kong,Qing %A Tang,Zihui %A Jiang,Lin %+ Department of Integrative Medicine, Fudan University Huashan Hospital, No 12 Urumuqi Mid Road, Shanghai, China, 86 021 5288 8236, dr_zhtang@yeah.net %K model-based reasoning %K integrative medicine %K electronic medical records %K natural language processing %D 2020 %7 21.12.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Integrative medicine is a form of medicine that combines practices and treatments from alternative medicine with conventional medicine. The diagnosis in integrative medicine involves the clinical diagnosis based on modern medicine and syndrome pattern diagnosis. Electronic medical records (EMRs) are the systematized collection of patients’ health information stored in a digital format that can be shared across different health care settings. Although syndrome and sign information or related information can be extracted from the EMR and content texts can be mapped to computable vectors using natural language processing techniques, application of artificial intelligence techniques to support physicians in medical practices remains a major challenge. Objective: The purpose of this study was to investigate model-based reasoning (MBR) algorithms for the clinical diagnosis in integrative medicine based on EMRs and natural language processing. We also estimated the associations among the factors of sample size, number of syndrome pattern types, and diagnosis in modern medicine using the MBR algorithms. 
Methods: A total of 14,075 medical records of clinical cases were extracted from the EMRs as the development data set, and an external test data set consisting of 1000 medical records of clinical cases was extracted from independent EMRs. MBR methods based on word embedding, machine learning, and deep learning algorithms were developed for the automatic diagnosis of syndrome pattern in integrative medicine. MBR algorithms combining rule-based reasoning (RBR) were also developed. Standard evaluation metrics consisting of accuracy, precision, recall, and F1 score were used to estimate the performance of the methods. The association analyses were conducted on the sample size, number of syndrome pattern types, and diagnosis of lung diseases with the best algorithms. Results: The Word2Vec convolutional neural network (CNN) MBR algorithms showed high performance (accuracy of 0.9586 in the test data set) in the syndrome pattern diagnosis of lung diseases. The Word2Vec CNN MBR combined with RBR also showed high performance (accuracy of 0.9229 in the test data set). The diagnosis of lung diseases could enhance the performance of the Word2Vec CNN MBR algorithms. Each group’s sample size and the syndrome pattern type affected the performance of these algorithms. Conclusions: The MBR methods based on Word2Vec and CNN showed high performance in the syndrome pattern diagnosis of lung diseases in integrative medicine. The parameters of each group’s sample size, syndrome pattern type, and diagnosis of lung diseases were associated with the performance of the methods. 
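The four evaluation metrics named in the Methods (accuracy, precision, recall, and F1 score) can be computed from predicted and reference labels. The sketch below is a simplified binary illustration only; the study itself classifies multiple syndrome patterns and does not describe its averaging scheme. The function name and the label vectors are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels, not data from the study.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
print(metrics)
```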
Trial Registration: ClinicalTrials.gov NCT03274908; https://clinicaltrials.gov/ct2/show/NCT03274908 %M 33346740 %R 10.2196/23082 %U http://medinform.jmir.org/2020/12/e23082/ %U https://doi.org/10.2196/23082 %U http://www.ncbi.nlm.nih.gov/pubmed/33346740 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e19127 %T Technical Aspects of Developing Chatbots for Medical Applications: Scoping Review %A Safi,Zeineb %A Abd-Alrazaq,Alaa %A Khalifa,Mohamed %A Househ,Mowafa %+ Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, P.O. Box 34110, Doha Al Luqta St, Ar-Rayyan, Doha, Qatar, 974 55708549, mhouseh@hbku.edu.qa %K chatbots %K conversational agents %K medical applications %K scoping review %K technical aspects %D 2020 %7 18.12.2020 %9 Review %J J Med Internet Res %G English %X Background: Chatbots are applications that can conduct natural language conversations with users. In the medical field, chatbots have been developed and used to serve different purposes. They provide patients with timely information that can be critical in some scenarios, such as access to mental health resources. Since the development of the first chatbot, ELIZA, in the late 1960s, much effort has followed to produce chatbots for various health purposes developed in different ways. Objective: This study aimed to explore the technical aspects and development methodologies associated with chatbots used in the medical field to explain the best methods of development and support chatbot development researchers on their future work. Methods: We searched for relevant articles in 8 literature databases (IEEE, ACM, Springer, ScienceDirect, Embase, MEDLINE, PsycINFO, and Google Scholar). We also performed forward and backward reference checking of the selected articles. Study selection was performed by one reviewer, and 50% of the selected studies were randomly checked by a second reviewer. 
A narrative approach was used for result synthesis. Chatbots were classified based on the different technical aspects of their development. The main chatbot components were identified in addition to the different techniques for implementing each module. Results: The original search returned 2481 publications, of which we identified 45 studies that matched our inclusion and exclusion criteria. The most common language of communication between users and chatbots was English (n=23). We identified 4 main modules: text understanding module, dialog management module, database layer, and text generation module. The most common technique for developing text understanding and dialog management is the pattern matching method (n=18 and n=25, respectively). The most common text generation method is fixed output (n=36). Very few studies relied on generating original output. Most studies kept a medical knowledge base to be used by the chatbot for different purposes throughout the conversations. A few studies kept conversation scripts and collected user data and previous conversations. Conclusions: Many chatbots have been developed for medical use, at an increasing rate. There is a recent, apparent shift toward adopting machine learning–based approaches for developing chatbot systems. Further research can be conducted to link clinical outcomes to different chatbot development techniques and technical characteristics. 
%M 33337337 %R 10.2196/19127 %U http://www.jmir.org/2020/12/e19127/ %U https://doi.org/10.2196/19127 %U http://www.ncbi.nlm.nih.gov/pubmed/33337337 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 12 %P e22649 %T Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach %A Rashidian,Sina %A Abell-Hart,Kayley %A Hajagos,Janos %A Moffitt,Richard %A Lingam,Veena %A Garcia,Victor %A Tsai,Chao-Wei %A Wang,Fusheng %A Dong,Xinyu %A Sun,Siao %A Deng,Jianyuan %A Gupta,Rajarsi %A Miller,Joshua %A Saltz,Joel %A Saltz,Mary %+ Department of Computer Science, Stony Brook University, 2212 Computer Science, Stony Brook, NY, 11794, United States, 1 631 632 8470, srashidian@cs.stonybrook.edu %K electronic health records %K diabetes %K deep learning %D 2020 %7 17.12.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Diabetes affects more than 30 million patients across the United States. With such a large disease burden, even a small error in classification can be significant. Currently billing codes, assigned at the time of a medical encounter, are the “gold standard” reflecting the actual diseases present in an individual, and thus in aggregate reflect disease prevalence in the population. These codes are generated by highly trained coders and by health care providers but are not always accurate. Objective: This work provides a scalable deep learning methodology to more accurately classify individuals with diabetes across multiple health care systems. Methods: We leveraged a long short-term memory-dense neural network (LSTM-DNN) model to identify patients with or without diabetes using data from 5 acute care facilities with 187,187 patients and 275,407 encounters, incorporating data elements including laboratory test results, diagnostic/procedure codes, medications, demographic data, and admission information. 
Furthermore, a blinded physician panel reviewed discordant cases, providing an estimate of the total impact on the population. Results: When predicting the documented diagnosis of diabetes, our model achieved an 84% F1 score, 96% area under the curve–receiver operating characteristic curve, and 91% average precision on a heterogeneous data set from 5 distinct health facilities. However, in 81% of cases where the model disagreed with the documented phenotype, a blinded physician panel agreed with the model. Taken together, this suggests that 4.3% of our studied population have either missing or improper diabetes diagnosis. Conclusions: This study demonstrates that deep learning methods can improve clinical phenotyping even when patient data are noisy, sparse, and heterogeneous. %M 33331828 %R 10.2196/22649 %U http://medinform.jmir.org/2020/12/e22649/ %U https://doi.org/10.2196/22649 %U http://www.ncbi.nlm.nih.gov/pubmed/33331828 %0 Journal Article %@ 2562-7600 %I JMIR Publications %V 3 %N 1 %P e23939 %T Predicted Influences of Artificial Intelligence on the Domains of Nursing: Scoping Review %A Buchanan,Christine %A Howitt,M Lyndsay %A Wilson,Rita %A Booth,Richard G %A Risling,Tracie %A Bamford,Megan %+ Registered Nurses' Association of Ontario, 500-4211 Yonge Street, Toronto, ON, M2P 2A9, Canada, 1 800 268 7199 ext 281, cbuchanan@rnao.ca %K nursing %K artificial intelligence %K machine learning %K robotics %K patient-centered care %K review %D 2020 %7 17.12.2020 %9 Review %J JMIR Nursing %G English %X Background: Artificial intelligence (AI) is set to transform the health system, yet little research to date has explored its influence on nurses—the largest group of health professionals. Furthermore, there has been little discussion on how AI will influence the experience of person-centered compassionate care for patients, families, and caregivers. 
Objective: This review aims to summarize the extant literature on the emerging trends in health technologies powered by AI and their implications on the following domains of nursing: administration, clinical practice, policy, and research. This review summarizes the findings from 3 research questions, examining how these emerging trends might influence the roles and functions of nurses and compassionate nursing care over the next 10 years and beyond. Methods: Using an established scoping review methodology, MEDLINE, CINAHL, EMBASE, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Center, Scopus, Web of Science, and ProQuest databases were searched. In addition to the electronic database searches, a targeted website search was performed to access relevant gray literature. Abstracts and full-text studies were independently screened by 2 reviewers using prespecified inclusion and exclusion criteria. Included articles focused on nursing and digital health technologies that incorporate AI. Data were charted using structured forms and narratively summarized. Results: A total of 131 articles were retrieved from the scoping review for the 3 research questions that were the focus of this manuscript (118 from database sources and 13 from targeted websites). Emerging AI technologies discussed in the review included predictive analytics, smart homes, virtual health care assistants, and robots. The results indicated that AI has already begun to influence nursing roles, workflows, and the nurse-patient relationship. In general, robots are not viewed as replacements for nurses. There is a consensus that health technologies powered by AI may have the potential to enhance nursing practice. Consequently, nurses must proactively define how person-centered compassionate care will be preserved in the age of AI. 
Conclusions: Nurses have a shared responsibility to influence decisions related to the integration of AI into the health system and to ensure that this change is introduced in a way that is ethical and aligns with core nursing values such as compassionate care. Furthermore, nurses must advocate for patient and nursing involvement in all aspects of the design, implementation, and evaluation of these technologies. International Registered Report Identifier (IRRID): RR2-10.2196/17490 %R 10.2196/23939 %U https://nursing.jmir.org/2020/1/e23939/ %U https://doi.org/10.2196/23939 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e24478 %T Computing SARS-CoV-2 Infection Risk From Symptoms, Imaging, and Test Data: Diagnostic Model Development %A D'Ambrosia,Christopher %A Christensen,Henrik %A Aronoff-Spencer,Eliah %+ Division of Infectious Diseases and Global Public Health, School of Medicine, University of California San Diego, 9500 Gilman Drive 0711, San Diego, CA, 92101, United States, 1 6462348153, earonoffspencer@health.ucsd.edu %K health %K informatics %K computation %K COVID-19 %K infection %K risk %K symptom %K imaging %K diagnostic %K probability %K machine learning %K Bayesian %K model %D 2020 %7 16.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Assigning meaningful probabilities of SARS-CoV-2 infection risk presents a diagnostic challenge across the continuum of care. Objective: The aim of this study was to develop and clinically validate an adaptable, personalized diagnostic model to assist clinicians in ruling in and ruling out COVID-19 in potential patients. We compared the diagnostic performance of probabilistic, graphical, and machine learning models against a previously published benchmark model. Methods: We integrated patient symptoms and test data using machine learning and Bayesian inference to quantify individual patient risk of SARS-CoV-2 infection. 
We trained models with 100,000 simulated patient profiles based on 13 symptoms and estimated local prevalence, imaging, and molecular diagnostic performance from published reports. We tested these models with consecutive patients who presented with a COVID-19–compatible illness at the University of California San Diego Medical Center over the course of 14 days starting in March 2020. Results: We included 55 consecutive patients with fever (n=43, 78%) or cough (n=42, 77%) presenting for ambulatory (n=11, 20%) or hospital care (n=44, 80%). In total, 51% (n=28) were female and 49% (n=27) were aged <60 years. Common comorbidities included diabetes (n=12, 22%), hypertension (n=15, 27%), cancer (n=9, 16%), and cardiovascular disease (n=7, 13%). Of these, 69% (n=38) were confirmed via reverse transcription-polymerase chain reaction (RT-PCR) to be positive for SARS-CoV-2 infection, and 20% (n=11) had repeated negative nucleic acid testing and an alternate diagnosis. Bayesian inference network, distance metric learning, and ensemble models discriminated between patients with SARS-CoV-2 infection and alternate diagnoses with sensitivities of 81.6%-84.2%, specificities of 58.8%-70.6%, and accuracies of 61.4%-71.8%. After integrating imaging and laboratory test statistics with the predictions of the Bayesian inference network, changes in diagnostic uncertainty at each step in the simulated clinical evaluation process were highly sensitive to location, symptom, and diagnostic test choices. Conclusions: Decision support models that incorporate symptoms and available test results can help providers diagnose SARS-CoV-2 infection in real-world settings. 
%M 33301417 %R 10.2196/24478 %U http://www.jmir.org/2020/12/e24478/ %U https://doi.org/10.2196/24478 %U http://www.ncbi.nlm.nih.gov/pubmed/33301417 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e18418 %T Limitations of Deep Learning Attention Mechanisms in Clinical Research: Empirical Case Study Based on the Korean Diabetic Disease Setting %A Kim,Junetae %A Lee,Sangwon %A Hwang,Eugene %A Ryu,Kwang Sun %A Jeong,Hanseok %A Lee,Jae Wook %A Hwangbo,Yul %A Choi,Kui Son %A Cha,Hyo Soung %+ Cancer Data Center, National Cancer Control Institute, National Cancer Center, 809 Madu 1(il)-dong, Ilsandong-gu, Goyang-si, Gyeonggi-do, 10408, Republic of Korea, 82 31 920 1892, kkido@ncc.re.kr %K attention %K deep learning %K explainable artificial intelligence %K uncertainty awareness %K Bayesian deep learning %K artificial intelligence %K health data %D 2020 %7 16.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Despite excellent prediction performance, noninterpretability has undermined the value of applying deep-learning algorithms in clinical practice. To overcome this limitation, attention mechanism has been introduced to clinical research as an explanatory modeling method. However, potential limitations of using this attractive method have not been clarified to clinical researchers. Furthermore, there has been a lack of introductory information explaining attention mechanisms to clinical researchers. Objective: The aim of this study was to introduce the basic concepts and design approaches of attention mechanisms. In addition, we aimed to empirically assess the potential limitations of current attention mechanisms in terms of prediction and interpretability performance. Methods: First, the basic concepts and several key considerations regarding attention mechanisms were identified. 
Second, four approaches to attention mechanisms were suggested according to a two-dimensional framework based on the degrees of freedom and uncertainty awareness. Third, the prediction performance, probability reliability, concentration of variable importance, consistency of attention results, and generalizability of attention results to conventional statistics were assessed in the diabetic classification modeling setting. Fourth, the potential limitations of attention mechanisms were considered. Results: Prediction performance was very high for all models. Probability reliability was high in models with uncertainty awareness. Variable importance was concentrated in several variables when uncertainty awareness was not considered. The consistency of attention results was high when uncertainty awareness was considered. The generalizability of attention results to conventional statistics was poor regardless of the modeling approach. Conclusions: The attention mechanism is an attractive technique with potential to be very promising in the future. However, it may not yet be desirable to rely on this method to assess variable importance in clinical settings. Therefore, along with theoretical studies enhancing attention mechanisms, more empirical studies investigating potential limitations should be encouraged. 
%M 33325832 %R 10.2196/18418 %U http://www.jmir.org/2020/12/e18418/ %U https://doi.org/10.2196/18418 %U http://www.ncbi.nlm.nih.gov/pubmed/33325832 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e20756 %T Artificial Intelligence in the Fight Against COVID-19: Scoping Review %A Abd-Alrazaq,Alaa %A Alajlani,Mohannad %A Alhuwail,Dari %A Schneider,Jens %A Al-Kuwari,Saif %A Shah,Zubair %A Hamdi,Mounir %A Househ,Mowafa %+ Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, PO Box 5825, Doha Al Luqta St, Ar-Rayyan, Doha, Qatar, 974 55708549, mhouseh@hbku.edu.qa %K artificial intelligence %K machine learning %K deep learning %K natural language processing %K coronavirus %K COVID-19 %K 2019-nCoV %K SARS-CoV-2 %D 2020 %7 15.12.2020 %9 Review %J J Med Internet Res %G English %X Background: In December 2019, COVID-19 broke out in Wuhan, China, leading to national and international disruptions in health care, business, education, transportation, and nearly every aspect of our daily lives. Artificial intelligence (AI) has been leveraged amid the COVID-19 pandemic; however, little is known about its use for supporting public health efforts. Objective: This scoping review aims to explore how AI technology is being used during the COVID-19 pandemic, as reported in the literature. Thus, it is the first review that describes and summarizes features of the identified AI techniques and data sets used for their development and validation. Methods: A scoping review was conducted following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). We searched the most commonly used electronic databases (eg, MEDLINE, EMBASE, and PsycInfo) between April 10 and 12, 2020. These terms were selected based on the target intervention (ie, AI) and the target disease (ie, COVID-19). 
Two reviewers independently conducted study selection and data extraction. A narrative approach was used to synthesize the extracted data. Results: We considered 82 studies out of the 435 retrieved studies. The most common use of AI was diagnosing COVID-19 cases based on various indicators. AI was also employed in drug and vaccine discovery or repurposing and for assessing their safety. Further, the included studies used AI for forecasting the epidemic development of COVID-19 and predicting its potential hosts and reservoirs. Researchers used AI for patient outcome–related tasks such as assessing the severity of COVID-19, predicting mortality risk, its associated factors, and the length of hospital stay. AI was used for infodemiology to raise awareness to use water, sanitation, and hygiene. The most prominent AI technique used was convolutional neural network, followed by support vector machine. Conclusions: The included studies showed that AI has the potential to fight against COVID-19. However, many of the proposed methods are not yet clinically accepted. Thus, the most rewarding research will be on methods promising value beyond COVID-19. More efforts are needed for developing standardized reporting protocols or guidelines for studies on AI. 
%M 33284779 %R 10.2196/20756 %U http://www.jmir.org/2020/12/e20756/ %U https://doi.org/10.2196/20756 %U http://www.ncbi.nlm.nih.gov/pubmed/33284779 %0 Journal Article %@ 2291-9279 %I JMIR Publications %V 8 %N 4 %P e24049 %T The Impact of Artificial Intelligence on the Chess World %A Duca Iliescu,Delia Monica %+ Transilvania University of Brasov, Bdul Eroilor 29, Brasov, Romania, 40 268413000, delia.duca@unitbv.ro %K artificial intelligence %K games %K chess %K AlphaZero %K MuZero %K cheat detection %K coronavirus %D 2020 %7 10.12.2020 %9 Viewpoint %J JMIR Serious Games %G English %X This paper focuses on key areas in which artificial intelligence has affected the chess world, including cheat detection methods, which are especially necessary recently, as there has been an unexpected rise in the popularity of online chess. Many major chess events that were to take place in 2020 have been canceled, but the global popularity of chess has in fact grown in recent months due to easier conversion of the game from offline to online formats compared with other games. Still, though a game of chess can be easily played online, there are some concerns about the increased chances of cheating. Artificial intelligence can address these concerns. 
%M 33300493 %R 10.2196/24049 %U http://games.jmir.org/2020/4/e24049/ %U https://doi.org/10.2196/24049 %U http://www.ncbi.nlm.nih.gov/pubmed/33300493 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e18097 %T Evaluation of Four Artificial Intelligence–Assisted Self-Diagnosis Apps on Three Diagnoses: Two-Year Follow-Up Study %A Ćirković,Aleksandar %+ Schulgasse 21, Weiden, 92637, Germany, 49 1788603753, aleksandar.cirkovic@mailbox.org %K artificial intelligence %K machine learning %K mobile apps %K medical diagnosis %K mHealth %D 2020 %7 4.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Consumer-oriented mobile self-diagnosis apps have been developed using undisclosed algorithms, presumably based on machine learning and other artificial intelligence (AI) technologies. The US Food and Drug Administration now discerns apps with learning AI algorithms from those with stable ones and treats the former as medical devices. To the author’s knowledge, no self-diagnosis app testing has been performed in the field of ophthalmology so far. Objective: The objective of this study was to test apps that were previously mentioned in the scientific literature on a set of diagnoses in a deliberate time interval, comparing the results and looking for differences that hint at “nonlocked” learning algorithms. Methods: Four apps from the literature were chosen (Ada, Babylon, Buoy, and Your.MD). A set of three ophthalmology diagnoses (glaucoma, retinal tear, dry eye syndrome) representing three levels of urgency was used to simultaneously test the apps’ diagnostic efficiency and treatment recommendations in this specialty. Two years was the chosen time interval between the tests (2018 and 2020). Scores were awarded by one evaluating physician using a defined scheme. Results: Two apps (Ada and Your.MD) received significantly higher scores than the other two. 
All apps either worsened in their results between 2018 and 2020 or remained unchanged at a low level. The variation in the results over time indicates “nonlocked” learning algorithms using AI technologies. None of the apps provided correct diagnoses and treatment recommendations for all three diagnoses in 2020. Two apps (Babylon and Your.MD) asked significantly fewer questions than the other two (P<.001). Conclusions: “Nonlocked” algorithms are used by self-diagnosis apps. The diagnostic efficiency of the tested apps seems to worsen over time, with some apps being more capable than others. Systematic studies on a wider scale are necessary for health care providers and patients to correctly assess the safety and efficacy of such apps and for correct classification by health care regulating authorities. %M 33275113 %R 10.2196/18097 %U https://www.jmir.org/2020/12/e18097 %U https://doi.org/10.2196/18097 %U http://www.ncbi.nlm.nih.gov/pubmed/33275113 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 9 %N 12 %P e22996 %T An Artificial Intelligence–Based, Personalized Smartphone App to Improve Childhood Immunization Coverage and Timelines Among Children in Pakistan: Protocol for a Randomized Controlled Trial %A Kazi,Abdul Momin %A Qazi,Saad Ahmed %A Khawaja,Sadori %A Ahsan,Nazia %A Ahmed,Rao Moueed %A Sameen,Fareeha %A Khan Mughal,Muhammad Ayub %A Saqib,Muhammad %A Ali,Sikander %A Kaleemuddin,Hussain %A Rauf,Yasir %A Raza,Mehreen %A Jamal,Saima %A Abbasi,Munir %A Stergioulas,Lampros K %+ Department of Pediatrics and Child Health, Aga Khan University, Stadium Road, PO Box 3500, Karachi, 74800, Pakistan, 92 2134864232, momin.kazi@aku.edu %K artificial intelligence %K AI %K routine childhood immunization %K EPI %K LMICs %K mHealth %K Pakistan %K personalized messages %K routine immunization %K smartphone apps %K vaccine-preventable illnesses %D 2020 %7 4.12.2020 %9 Protocol %J JMIR Res Protoc %G English %X Background: The immunization uptake rates in Pakistan are much 
lower than desired. Major reasons include lack of awareness, parental forgetfulness regarding schedules, and misinformation regarding vaccines. In light of the COVID-19 pandemic and distancing measures, routine childhood immunization (RCI) coverage has been adversely affected, as caregivers avoid tertiary care hospitals or primary health centers. Innovative and cost-effective measures must be taken to understand and deal with the issue of low immunization rates. However, only a few smartphone-based interventions have been carried out in low- and middle-income countries (LMICs) to improve RCI. Objective: The primary objectives of this study are to evaluate whether a personalized mobile app can improve children’s on-time visits at 10 and 14 weeks of age for RCI as compared with standard care and to determine whether an artificial intelligence model can be incorporated into the app. Secondary objectives are to determine the perceptions and attitudes of caregivers regarding childhood vaccinations and to understand the factors that might influence the effect of a mobile phone–based app on vaccination improvement. Methods: A mixed methods randomized controlled trial was designed with intervention and control arms. The study will be conducted at the Aga Khan University Hospital vaccination center. Caregivers of newborns or infants visiting the center for their children’s 6-week vaccination will be recruited. The intervention arm will have access to a smartphone app with text, voice, video, and pictorial messages regarding RCI. This app will be developed based on the findings of the pretrial qualitative component of the study, in addition to no-show study findings, which will explore caregivers’ perceptions about RCI and a mobile phone–based app in improving RCI coverage. Results: Pretrial qualitative in-depth interviews were conducted in February 2020. Enrollment of study participants for the randomized controlled trial is in process. 
Study exit interviews will be conducted at the 14-week immunization visits, provided the caregivers visit the immunization facility at that time, or over the phone when the children are 18 weeks of age. Conclusions: This study will generate useful insights into the feasibility, acceptability, and usability of an Android-based smartphone app for improving RCI in Pakistan and in LMICs. Trial Registration: ClinicalTrials.gov NCT04449107; https://clinicaltrials.gov/ct2/show/NCT04449107 International Registered Report Identifier (IRRID): DERR1-10.2196/22996 %M 33274726 %R 10.2196/22996 %U https://www.researchprotocols.org/2020/12/e22996 %U https://doi.org/10.2196/22996 %U http://www.ncbi.nlm.nih.gov/pubmed/33274726 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 12 %P e24048 %T Development and External Validation of a Machine Learning Tool to Rule Out COVID-19 Among Adults in the Emergency Department Using Routine Blood Tests: A Large, Multicenter, Real-World Study %A Plante,Timothy B %A Blau,Aaron M %A Berg,Adrian N %A Weinberg,Aaron S %A Jun,Ik C %A Tapson,Victor F %A Kanigan,Tanya S %A Adib,Artur B %+ Larner College of Medicine at the University of Vermont, 360 S Park Drive, Suite 206B, Colchester, VT, 05446, United States, 1 802 656 3688, timothy.plante@uvm.edu %K COVID-19 %K SARS-CoV-2 %K machine learning %K artificial intelligence %K electronic medical records %K laboratory results %K development %K validation %K testing %K model %K emergency department %D 2020 %7 2.12.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. 
Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients. Objective: We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments. Methods: Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV). Results: Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively. 
Conclusions: A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing. %M 33226957 %R 10.2196/24048 %U https://www.jmir.org/2020/12/e24048 %U https://doi.org/10.2196/24048 %U http://www.ncbi.nlm.nih.gov/pubmed/33226957 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e23930 %T Machine Learning Electronic Health Record Identification of Patients with Rheumatoid Arthritis: Algorithm Pipeline Development and Validation Study %A Maarseveen,Tjardo D %A Meinderink,Timo %A Reinders,Marcel J T %A Knitza,Johannes %A Huizinga,Tom W J %A Kleyer,Arnd %A Simon,David %A van den Akker,Erik B %A Knevel,Rachel %+ Department of Rheumatology, Leiden University Medical Center, C1-R k. 41, Albinusdreef 2, Leiden, 2333 ZA, Netherlands, 31 611307780, R.Knevel@lumc.nl %K Supervised machine learning %K Electronic Health Records %K Natural Language Processing %K Support Vector Machine %K Gradient Boosting %K Rheumatoid Arthritis %D 2020 %7 30.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Financial codes are often used to extract diagnoses from electronic health records. This approach is prone to false positives. Alternatively, queries are constructed, but these are highly center and language specific. A tantalizing alternative is the automatic identification of patients by employing machine learning on format-free text entries. Objective: The aim of this study was to develop an easily implementable workflow that builds a machine learning algorithm capable of accurately identifying patients with rheumatoid arthritis from format-free text fields in electronic health records. Methods: Two electronic health record data sets were employed: Leiden (n=3000) and Erlangen (n=4771). 
Using a portion of the Leiden data (n=2000), we compared 6 different machine learning methods and a naïve word-matching algorithm using 10-fold cross-validation. Performances were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPRC), and F1 score was used as the primary criterion for selecting the best method to build a classifying algorithm. We selected the optimal threshold of positive predictive value for case identification based on the output of the best method in the training data. This validation workflow was subsequently applied to a portion of the Erlangen data (n=4293). For testing, the best performing methods were applied to remaining data (Leiden n=1000; Erlangen n=478) for an unbiased evaluation. Results: For the Leiden data set, the word-matching algorithm demonstrated mixed performance (AUROC 0.90; AUPRC 0.33; F1 score 0.55), and 4 methods significantly outperformed word-matching, with support vector machines performing best (AUROC 0.98; AUPRC 0.88; F1 score 0.83). Applying this support vector machine classifier to the test data resulted in a similarly high performance (F1 score 0.81; positive predictive value [PPV] 0.94), and with this method, we could identify 2873 patients with rheumatoid arthritis in less than 7 seconds out of the complete collection of 23,300 patients in the Leiden electronic health record system. For the Erlangen data set, gradient boosting performed best (AUROC 0.94; AUPRC 0.85; F1 score 0.82) in the training set, and applied to the test data, resulted once again in good results (F1 score 0.67; PPV 0.97). Conclusions: We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, allowing research on very large populations for limited costs. Our approach is language and center independent and could be applied to any type of diagnosis. 
We have developed our pipeline into a universally applicable and easy-to-implement workflow to equip centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size covering different countries for low cost from already available data in electronic health record systems. %M 33252349 %R 10.2196/23930 %U http://medinform.jmir.org/2020/11/e23930/ %U https://doi.org/10.2196/23930 %U http://www.ncbi.nlm.nih.gov/pubmed/33252349 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e20549 %T Use Characteristics and Triage Acuity of a Digital Symptom Checker in a Large Integrated Health System: Population-Based Descriptive Study %A Morse,Keith E %A Ostberg,Nicolai P %A Jones,Veena G %A Chan,Albert S %+ Department of Pediatrics, Stanford University School of Medicine, 750 Welch Road, Suite 315, Palo Alto, CA, 94304, United States, 1 650 723 5711, kmorse@stanfordchildrens.org %K symptom checker %K chatbot %K computer-assisted diagnosis %K diagnostic self-evaluation %K artificial intelligence %K self-care %K COVID-19 %D 2020 %7 30.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Pressure on the US health care system has been increasing due to a combination of aging populations, rising health care expenditures, and most recently, the COVID-19 pandemic. Responses to this pressure are hindered in part by reliance on a limited supply of highly trained health care professionals, creating a need for scalable technological solutions. Digital symptom checkers are artificial intelligence–supported software tools that use a conversational “chatbot” format to support rapid diagnosis and consistent triage. The COVID-19 pandemic has brought new attention to these tools due to the need to avoid face-to-face contact and preserve urgent care capacity. 
However, evidence-based deployment of these chatbots requires an understanding of user demographics and associated triage recommendations generated by a large general population. Objective: In this study, we evaluate the user demographics and levels of triage acuity provided by a symptom checker chatbot deployed in partnership with a large integrated health system in the United States. Methods: This population-based descriptive study included all web-based symptom assessments completed on the website and patient portal of the Sutter Health system (24 hospitals in Northern California) from April 24, 2019, to February 1, 2020. User demographics were compared to relevant US Census population data. Results: A total of 26,646 symptom assessments were completed during the study period. Most assessments (17,816/26,646, 66.9%) were completed by female users. The mean user age was 34.3 years (SD 14.4 years), compared to a median age of 37.3 years of the general population. The most common initial symptom was abdominal pain (2060/26,646, 7.7%). A substantial number of assessments (12,357/26,646, 46.4%) were completed outside of typical physician office hours. Most users were advised to seek medical care on the same day (7299/26,646, 27.4%) or within 2-3 days (6301/26,646, 23.6%). Over a quarter of the assessments indicated a high degree of urgency (7723/26,646, 29.0%). Conclusions: Users of the symptom checker chatbot were broadly representative of our patient population, although they skewed toward younger and female users. The triage recommendations were comparable to those of nurse-staffed telephone triage lines. Although the emergence of COVID-19 has increased the interest in remote medical assessment tools, it is important to take an evidence-based approach to their deployment. 
%M 33170799 %R 10.2196/20549 %U https://www.jmir.org/2020/11/e20549 %U https://doi.org/10.2196/20549 %U http://www.ncbi.nlm.nih.gov/pubmed/33170799 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e19416 %T A Human-Algorithm Integration System for Hip Fracture Detection on Plain Radiography: System Development and Validation Study %A Cheng,Chi-Tung %A Chen,Chih-Chi %A Cheng,Fu-Jen %A Chen,Huan-Wu %A Su,Yi-Siang %A Yeh,Chun-Nan %A Chung,I-Fang %A Liao,Chien-Hung %+ Department of Trauma and Emergency Surgery, Linkou Chang Gung Memorial Hospital, Chang Gung University, Trauma Center, 5 Fuxin Street, Kweishan District, Taoyuan, 33328, Taiwan, 886 975365628, surgymet@gmail.com %K hip fracture %K neural network %K computer %K artificial intelligence %K algorithms %K human augmentation %K deep learning %K diagnosis %D 2020 %7 27.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Hip fracture is the most common type of fracture in elderly individuals. Numerous deep learning (DL) algorithms for plain pelvic radiographs (PXRs) have been applied to improve the accuracy of hip fracture diagnosis. However, their efficacy is still undetermined. Objective: The objective of this study is to develop and validate a human-algorithm integration (HAI) system to improve the accuracy of hip fracture diagnosis in a real clinical environment. Methods: The HAI system with hip fracture detection ability was developed using a deep learning algorithm trained on trauma registry data and 3605 PXRs from August 2008 to December 2016. To compare their diagnostic performance before and after HAI system assistance using an independent testing dataset, 34 physicians were recruited. We analyzed the physicians’ accuracy, sensitivity, specificity, and agreement with the algorithm; we also performed subgroup analyses according to physician specialty and experience. 
Furthermore, we applied the HAI system in the emergency departments of different hospitals to validate its value in the real world. Results: With the support of the algorithm, which achieved 91% accuracy, the diagnostic performance of physicians was significantly improved in the independent testing dataset, as was revealed by the sensitivity (physician alone, median 95%; HAI, median 99%; P<.001), specificity (physician alone, median 90%; HAI, median 95%; P<.001), accuracy (physician alone, median 90%; HAI, median 96%; P<.001), and human-algorithm agreement [physician alone κ, median 0.69 (IQR 0.63-0.74); HAI κ, median 0.80 (IQR 0.76-0.82); P<.001]. With the help of the HAI system, the primary physicians showed significant improvement in their diagnostic performance to levels comparable to those of consulting physicians, and both the experienced and less-experienced physicians benefited from the HAI system. After the HAI system had been applied in 3 departments for 5 months, 587 images were examined. The sensitivity, specificity, and accuracy of the HAI system for detecting hip fractures were 97%, 95.7%, and 96.08%, respectively. Conclusions: HAI currently impacts health care, and integrating this technology into emergency departments is feasible. The developed HAI system can enhance physicians’ hip fracture diagnostic performance. %M 33245279 %R 10.2196/19416 %U http://medinform.jmir.org/2020/11/e19416/ %U https://doi.org/10.2196/19416 %U http://www.ncbi.nlm.nih.gov/pubmed/33245279 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e23472 %T Deep Learning–Based Detection of Early Renal Function Impairment Using Retinal Fundus Images: Model Development and Validation %A Kang,Eugene Yu-Chuan %A Hsieh,Yi-Ting %A Li,Chien-Hung %A Huang,Yi-Jin %A Kuo,Chang-Fu %A Kang,Je-Ho %A Chen,Kuan-Jen %A Lai,Chi-Chun %A Wu,Wei-Chi %A Hwang,Yih-Shiou %+ Department of Ophthalmology, Chang Gung Memorial Hospital, Linkou Medical Center, No. 
5, Fu-Hsin Rd., Taoyuan, 333, Taiwan, 886 3 3281200 ext 8666, yihshiou.hwang@gmail.com %K deep learning %K renal function %K retinal fundus image %K diabetes %K renal %K kidney %K retinal %K eye %K imaging %K impairment %K detection %K development %K validation %K model %D 2020 %7 26.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Retinal imaging has been applied for detecting eye diseases and cardiovascular risks using deep learning–based methods. Furthermore, retinal microvascular and structural changes were found in renal function impairments. However, a deep learning–based method using retinal images for detecting early renal function impairment has not yet been well studied. Objective: This study aimed to develop and evaluate a deep learning model for detecting early renal function impairment using retinal fundus images. Methods: This retrospective study enrolled patients who underwent renal function tests with color fundus images captured at any time between January 1, 2001, and August 31, 2019. A deep learning model was constructed to detect impaired renal function from the images. Early renal function impairment was defined as estimated glomerular filtration rate <90 mL/min/1.73 m². Model performance was evaluated with respect to the receiver operating characteristic curve and area under the curve (AUC). Results: In total, 25,706 retinal fundus images were obtained from 6212 patients for the study period. The images were divided at an 8:1:1 ratio. The training, validation, and testing data sets respectively contained 20,787, 2189, and 2730 images from 4970, 621, and 621 patients. There were 10,686 and 15,020 images determined to indicate normal and impaired renal function, respectively. The AUC of the model was 0.81 in the overall population. In subgroups stratified by serum hemoglobin A1c (HbA1c) level, the AUCs were 0.81, 0.84, 0.85, and 0.87 for the HbA1c levels of ≤6.5%, >6.5%, >7.5%, and >10%, respectively. 
Conclusions: The deep learning model in this study enables the detection of early renal function impairment using retinal fundus images. The model was more accurate for patients with elevated serum HbA1c levels. %M 33139242 %R 10.2196/23472 %U http://medinform.jmir.org/2020/11/e23472/ %U https://doi.org/10.2196/23472 %U http://www.ncbi.nlm.nih.gov/pubmed/33139242 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e18563 %T Automated Diagnosis of Various Gastrointestinal Lesions Using a Deep Learning–Based Classification and Retrieval Framework With a Large Endoscopic Database: Model Development and Validation %A Owais,Muhammad %A Arsalan,Muhammad %A Mahmood,Tahir %A Kang,Jin Kyu %A Park,Kang Ryoung %+ Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul, 04620, Republic of Korea, 82 10 3111 7022, parkgr@dgu.edu %K artificial intelligence %K endoscopic video retrieval %K content-based medical image retrieval %K polyp detection %K deep learning %K computer-aided diagnosis %D 2020 %7 26.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The early diagnosis of various gastrointestinal diseases can lead to effective treatment and reduce the risk of many life-threatening conditions. Unfortunately, various small gastrointestinal lesions are undetectable during early-stage examination by medical experts. In previous studies, various deep learning–based computer-aided diagnosis tools have been used to make a significant contribution to the effective diagnosis and treatment of gastrointestinal diseases. However, most of these methods were designed to detect a limited number of gastrointestinal diseases, such as polyps, tumors, or cancers, in a specific part of the human gastrointestinal tract. Objective: This study aimed to develop a comprehensive computer-aided diagnosis tool to assist medical experts in diagnosing various types of gastrointestinal diseases. 
Methods: Our proposed framework comprises a deep learning–based classification network followed by a retrieval method. In the first step, the classification network predicts the disease type for the current medical condition. Then, the retrieval part of the framework shows the relevant cases (endoscopic images) from the previous database. These past cases help the medical expert validate the current computer prediction subjectively, which ultimately results in better diagnosis and treatment. Results: All the experiments were performed using 2 endoscopic data sets with a total of 52,471 frames and 37 different classes. The optimal performances obtained by our proposed method in accuracy, F1 score, mean average precision, and mean average recall were 96.19%, 96.99%, 98.18%, and 95.86%, respectively. The overall performance of our proposed diagnostic framework substantially outperformed state-of-the-art methods. Conclusions: This study provides a comprehensive computer-aided diagnosis framework for identifying various types of gastrointestinal diseases. The results show the superiority of our proposed method over various other recent methods and illustrate its potential for clinical diagnosis and treatment. Our proposed network can be applicable to other classification domains in medical imaging, such as computed tomography scans, magnetic resonance imaging, and ultrasound sequences. %M 33242010 %R 10.2196/18563 %U http://www.jmir.org/2020/11/e18563/ %U https://doi.org/10.2196/18563 %U http://www.ncbi.nlm.nih.gov/pubmed/33242010 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e20031 %T Web- and Artificial Intelligence–Based Image Recognition For Sperm Motility Analysis: Verification Study %A Tsai,Vincent FS %A Zhuang,Bin %A Pong,Yuan-Hung %A Hsieh,Ju-Ton %A Chang,Hong-Chiang %+ Department of Urology, National Taiwan University Hospital, 7, Zhung-Shan S. 
Road, Taipei, 100, Taiwan, 886 223123456 ext 62135, bird8873@gmail.com %K Male infertility %K semen analysis %K home sperm test %K smartphone %K artificial intelligence %K cloud computing %K telemedicine %D 2020 %7 19.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Human sperm quality fluctuates over time. Therefore, it is crucial for couples preparing for natural pregnancy to monitor sperm motility. Objective: This study verified the performance of an artificial intelligence–based image recognition and cloud computing sperm motility testing system (Bemaner, Createcare) composed of microscope and microfluidic modules and designed to adapt to different types of smartphones. Methods: Sperm videos were captured and uploaded to the cloud with an app. Analysis of sperm motility was performed by an artificial intelligence–based image recognition algorithm, and the results were then displayed. According to the number of motile sperm in the vision field, 47 (deidentified) videos of sperm were scored using 6 grades (0-5) by a male-fertility expert with 10 years of experience. Pearson product-moment correlation was calculated between the grades and the results (concentration of total sperm, concentration of motile sperm, and motility percentage) computed by the system. Results: Good correlation was demonstrated between the grades and results computed by the system for concentration of total sperm (r=0.65, P<.001), concentration of motile sperm (r=0.84, P<.001), and motility percentage (r=0.90, P<.001). Conclusions: This smartphone-based sperm motility test (Bemaner) accurately measures motility-related parameters and could potentially be applied toward the following fields: male infertility detection, sperm quality test during preparation for pregnancy, and infertility treatment monitoring. With frequent at-home testing, more data can be collected to help make clinical decisions and to conduct epidemiological research. 
%M 33211025 %R 10.2196/20031 %U http://medinform.jmir.org/2020/11/e20031/ %U https://doi.org/10.2196/20031 %U http://www.ncbi.nlm.nih.gov/pubmed/33211025 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e24163 %T Development of an Artificial Intelligence–Based Automated Recommendation System for Clinical Laboratory Tests: Retrospective Analysis of the National Health Insurance Database %A Islam,Md Mohaimenul %A Yang,Hsuan-Chia %A Poly,Tahmina Nasrin %A Li,Yu-Chuan Jack %+ Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wu-Hsing St., Taipei 110, Taipei, Taiwan, 886 2 27361661 ext 7600, jaak88@gmail.com %K artificial intelligence %K deep learning %K clinical decision-support system %K laboratory test %K patient safety %D 2020 %7 18.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Laboratory tests are considered an essential part of patient safety as patients’ screening, diagnosis, and follow-up are solely based on laboratory tests. Diagnosis of patients could be wrong, missed, or delayed if laboratory tests are performed erroneously. However, recognizing the value of correct laboratory test ordering remains underestimated by policymakers and clinicians. Nowadays, artificial intelligence methods such as machine learning and deep learning (DL) have been extensively used as powerful tools for pattern recognition in large data sets. Therefore, developing an automated laboratory test recommendation tool using available data from electronic health records (EHRs) could support current clinical practice. Objective: The objective of this study was to develop an artificial intelligence–based automated model that can provide laboratory tests recommendation based on simple variables available in EHRs. Methods: A retrospective analysis of the National Health Insurance database between January 1, 2013, and December 31, 2013, was performed. 
We reviewed the record of all patients who visited the cardiology department at least once and were prescribed laboratory tests. The data set was split into training and testing sets (80:20) to develop the DL model. In the internal validation, 25% of data were randomly selected from the training set to evaluate the performance of this model. Results: We used the area under the receiver operating characteristic curve, precision, recall, and hamming loss as comparative measures. A total of 129,938 prescriptions were used in our model. The DL-based automated recommendation system for laboratory tests achieved a significantly higher area under the receiver operating characteristic curve (AUROCmacro and AUROCmicro of 0.76 and 0.87, respectively). Using a low cutoff, the model identified appropriate laboratory tests with 99% sensitivity. Conclusions: The developed artificial intelligence model based on DL exhibited good discriminative capability for predicting laboratory tests using routinely collected EHR data. Utilization of DL approaches can facilitate optimal laboratory test selection for patients, which may in turn improve patient safety. However, future study is recommended to assess the cost-effectiveness for implementing this model in real-world clinical settings. 
%M 33206057 %R 10.2196/24163 %U https://medinform.jmir.org/2020/11/e24163 %U https://doi.org/10.2196/24163 %U http://www.ncbi.nlm.nih.gov/pubmed/33206057 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e23315 %T Economic Value of Data and Analytics for Health Care Providers: Hermeneutic Systematic Literature Review %A von Wedel,Philip %A Hagist,Christian %+ Chair of Economic and Social Policy, WHU - Otto Beisheim School of Management, Burgplatz 2, Vallendar, 56179, Germany, 49 02616509 ext 255, philip.wedel@whu.edu %K digital health %K health information technology %K healthcare provider economics %K electronic health records %K data analytics %K artificial intelligence %D 2020 %7 18.11.2020 %9 Review %J J Med Internet Res %G English %X Background: The benefits of data and analytics for health care systems and single providers are an increasingly investigated field in digital health literature. Electronic health records (EHRs), for example, can improve quality of care. Emerging analytics tools based on artificial intelligence show the potential to assist physicians in day-to-day workflows. Yet, single health care providers also need information regarding the economic impact when deciding on potential adoption of these tools. Objective: This paper examines the question of whether data and analytics provide economic advantages or disadvantages for health care providers. The goal is to provide a comprehensive overview including a variety of technologies beyond computer-based patient records. Ultimately, findings are also intended to determine whether economic barriers for adoption by providers could exist. Methods: A systematic literature search of the PubMed and Google Scholar online databases was conducted, following the hermeneutic methodology that encourages iterative search and interpretation cycles. 
After applying inclusion and exclusion criteria to 165 initially identified studies, 50 were included for qualitative synthesis and topic-based clustering. Results: The review identified 5 major technology categories, namely EHRs (n=30), computerized clinical decision support (n=8), advanced analytics (n=5), business analytics (n=5), and telemedicine (n=2). Overall, 62% (31/50) of the reviewed studies indicated a positive economic impact for providers either via direct cost or revenue effects or via indirect efficiency or productivity improvements. When differentiating between categories, however, an ambiguous picture emerged for EHR, whereas analytics technologies like computerized clinical decision support and advanced analytics predominantly showed economic benefits. Conclusions: The research question of whether data and analytics create economic benefits for health care providers cannot be answered uniformly. The results indicate ambiguous effects for EHRs, here representing data, and mainly positive effects for the significantly less studied analytics field. The mixed results regarding EHRs can create an economic barrier for adoption by providers. This barrier can translate into a bottleneck to positive economic effects of analytics technologies relying on EHR data. Ultimately, more research on economic effects of technologies other than EHRs is needed to generate a more reliable evidence base. 
%M 33206056 %R 10.2196/23315 %U http://www.jmir.org/2020/11/e23315/ %U https://doi.org/10.2196/23315 %U http://www.ncbi.nlm.nih.gov/pubmed/33206056 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e19805 %T Deep Learning Methodology for Differentiating Glioma Recurrence From Radiation Necrosis Using Multimodal Magnetic Resonance Imaging: Algorithm Development and Validation %A Gao,Yang %A Xiao,Xiong %A Han,Bangcheng %A Li,Guilin %A Ning,Xiaolin %A Wang,Defeng %A Cai,Weidong %A Kikinis,Ron %A Berkovsky,Shlomo %A Di Ieva,Antonio %A Zhang,Liwei %A Ji,Nan %A Liu,Sidong %+ Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, 75 Talavera Road, Macquarie Park, Sydney, 2113, Australia, 61 29852729, dr.sidong.liu@gmail.com %K recurrent tumor %K radiation necrosis %K progression %K pseudoprogression %K multimodal MRI %K deep learning %D 2020 %7 17.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: The radiological differential diagnosis between tumor recurrence and radiation-induced necrosis (ie, pseudoprogression) is of paramount importance in the management of glioma patients. Objective: This research aims to develop a deep learning methodology for automated differentiation of tumor recurrence from radiation necrosis based on routine magnetic resonance imaging (MRI) scans. Methods: In this retrospective study, 146 patients who underwent radiation therapy after glioma resection and presented with suspected recurrent lesions at the follow-up MRI examination were selected for analysis. Routine MRI scans were acquired from each patient, including T1, T2, and gadolinium-contrast-enhanced T1 sequences. Of those cases, 96 (65.8%) were confirmed as glioma recurrence on postsurgical pathological examination, while 50 (34.2%) were diagnosed as necrosis. 
A light-weighted deep neural network (DNN) (ie, efficient radionecrosis neural network [ERN-Net]) was proposed to learn radiological features of gliomas and necrosis from MRI scans. Sensitivity, specificity, accuracy, and area under the curve (AUC) were used to evaluate performance of the model in both image-wise and subject-wise classifications. Preoperative diagnostic performance of the model was also compared to that of the state-of-the-art DNN models and five experienced neurosurgeons. Results: DNN models based on multimodal MRI outperformed single-modal models. ERN-Net achieved the highest AUC in both image-wise (0.915) and subject-wise (0.958) classification tasks. The evaluated DNN models achieved an average sensitivity of 0.947 (SD 0.033), specificity of 0.817 (SD 0.075), and accuracy of 0.903 (SD 0.026), which were significantly better than the tested neurosurgeons (P=.02 in sensitivity and P<.001 in specificity and accuracy). Conclusions: Deep learning offers a useful computational tool for the differential diagnosis between recurrent gliomas and necrosis. The proposed ERN-Net model, a simple and effective DNN model, achieved excellent performance on routine MRI scans and showed a high clinical applicability. 
%M 33200991 %R 10.2196/19805 %U http://medinform.jmir.org/2020/11/e19805/ %U https://doi.org/10.2196/19805 %U http://www.ncbi.nlm.nih.gov/pubmed/33200991 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e15185 %T Physicians’ Perceptions of the Use of a Chatbot for Information Seeking: Qualitative Study %A Koman,Jason %A Fauvelle,Khristina %A Schuck,Stéphane %A Texier,Nathalie %A Mebarki,Adel %+ Sanofi Aventis, 82, avenue Raspail, Gentilly Cedex, 94255, France, 33 772219558, khristina.fauvelle@sanofi.com %K health %K digital health %K innovation %K conversational agent %K decision support system %K qualitative research %K chatbot %K bot %K medical drugs %K prescription %K risk minimization measures %D 2020 %7 10.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Seeking medical information can be an issue for physicians. In the specific context of medical practice, chatbots are hypothesized to present additional value for providing information quickly, particularly as far as drug risk minimization measures are concerned. Objective: This qualitative study aimed to elicit physicians’ perceptions of a pilot version of a chatbot used in the context of drug information and risk minimization measures. Methods: General practitioners and specialists were recruited across France to participate in individual semistructured interviews. Interviews were recorded, transcribed, and analyzed using a horizontal thematic analysis approach. Results: Eight general practitioners and 2 specialists participated. The tone and ergonomics of the pilot version were appreciated by physicians. However, all participants emphasized the importance of getting exhaustive, trustworthy answers when interacting with a chatbot. Conclusions: The chatbot was perceived as a useful and innovative tool that could easily be integrated into routine medical practice and could help health professionals when seeking information on drug and risk minimization measures. 
%M 33170134 %R 10.2196/15185 %U http://www.jmir.org/2020/11/e15185/ %U https://doi.org/10.2196/15185 %U http://www.ncbi.nlm.nih.gov/pubmed/33170134 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 9 %N 11 %P e21659 %T Artificial Intelligence–Powered Smartphone App to Facilitate Medication Adherence: Protocol for a Human Factors Design Study %A Roosan,Don %A Chok,Jay %A Karim,Mazharul %A Law,Anandi V %A Baskys,Andrius %A Hwang,Angela %A Roosan,Moom R %+ Department of Pharmacy Practice and Administration, College of Pharmacy, Western University of Health Sciences, 309 E 2nd St, Pomona, CA, 91766, United States, 1 9094698778, droosan@westernu.edu %K artificial intelligence %K smartphone app %K patient cognition %K complex medication information %K medication adherence %K machine learning %K mobile phone %D 2020 %7 9.11.2020 %9 Protocol %J JMIR Res Protoc %G English %X Background: Medication Guides consisting of crucial interactions and side effects are extensive and complex. Due to the exhaustive information, patients do not retain the necessary medication information, which can result in hospitalizations and medication nonadherence. A gap exists in understanding patients’ cognition of managing complex medication information. However, advancements in technology and artificial intelligence (AI) allow us to understand patient cognitive processes to design an app to better provide important medication information to patients. Objective: Our objective is to improve the design of an innovative AI- and human factor–based interface that supports patients’ medication information comprehension that could potentially improve medication adherence. Methods: This study has three aims. 
Aim 1 has three phases: (1) an observational study to understand patient perception of fear and biases regarding medication information, (2) an eye-tracking study to understand the attention locus for medication information, and (3) a psychological refractory period (PRP) paradigm study to understand functionalities. Observational data will be collected, such as audio and video recordings, gaze mapping, and time from PRP. A total of 50 patients, aged 18-65 years, who started at least one new medication, for which we developed visualization information, and who have a cognitive status of 34 during cognitive screening using the TICS-M test and health literacy level will be included in this aim of the study. In Aim 2, we will iteratively design and evaluate an AI-powered medication information visualization interface as a smartphone app with the knowledge gained from each component of Aim 1. The interface will be assessed through two usability surveys. A total of 300 patients, aged 18-65 years, with diabetes, cardiovascular diseases, or mental health disorders, will be recruited for the surveys. Data from the surveys will be analyzed through exploratory factor analysis. In Aim 3, in order to test the prototype, there will be a two-arm study design. This aim will include 900 patients, aged 18-65 years, with internet access, without any cognitive impairment, and with at least two medications. Patients will be sequentially randomized. Three surveys will be used to assess the primary outcome of medication information comprehension and the secondary outcome of medication adherence at 12 weeks. Results: Preliminary data collection will be conducted in 2021, and results are expected to be published in 2022. Conclusions: This study will lead the future of AI-based, innovative, digital interface design and aid in improving medication comprehension, which may improve medication adherence. 
The results from this study will also open up future research opportunities in understanding how patients manage complex medication information and will inform the format and design for innovative, AI-powered digital interfaces for Medication Guides. International Registered Report Identifier (IRRID): PRR1-10.2196/21659 %M 33164898 %R 10.2196/21659 %U http://www.researchprotocols.org/2020/11/e21659/ %U https://doi.org/10.2196/21659 %U http://www.ncbi.nlm.nih.gov/pubmed/33164898 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 11 %P e21252 %T Patient Triage by Topic Modeling of Referral Letters: Feasibility Study %A Spasic,Irena %A Button,Kate %+ School of Computer Science & Informatics, Cardiff University, 5 The Parade, Cardiff, CF24 3AA, United Kingdom, 44 02920870320, spasici@cardiff.ac.uk %K natural language processing %K machine learning %K data science %K medical informatics %K computer-assisted decision making %D 2020 %7 6.11.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Musculoskeletal conditions are managed within primary care, but patients can be referred to secondary care if a specialist opinion is required. The ever-increasing demand for health care resources emphasizes the need to streamline care pathways with the ultimate aim of ensuring that patients receive timely and optimal care. Information contained in referral letters underpins the referral decision-making process but is yet to be explored systematically for the purposes of treatment prioritization for musculoskeletal conditions. Objective: This study aims to explore the feasibility of using natural language processing and machine learning to automate the triage of patients with musculoskeletal conditions by analyzing information from referral letters. Specifically, we aim to determine whether referral letters can be automatically assorted into latent topics that are clinically relevant, that is, considered relevant when prescribing treatments. 
Here, clinical relevance is assessed by posing 2 research questions. Can latent topics be used to automatically predict treatment? Can clinicians interpret latent topics as cohorts of patients who share common characteristics or experiences such as medical history, demographics, and possible treatments? Methods: We used latent Dirichlet allocation to model each referral letter as a finite mixture over an underlying set of topics and model each topic as an infinite mixture over an underlying set of topic probabilities. The topic model was evaluated in the context of automating patient triage. Given a set of treatment outcomes, a binary classifier was trained for each outcome using previously extracted topics as the input features of the machine learning algorithm. In addition, a qualitative evaluation was performed to assess the human interpretability of topics. Results: The prediction accuracy of binary classifiers outperformed the stratified random classifier by a large margin, indicating that topic modeling could be used to predict the treatment, thus effectively supporting patient triage. The qualitative evaluation confirmed the high clinical interpretability of the topic model. Conclusions: The results established the feasibility of using natural language processing and machine learning to automate triage of patients with knee or hip pain by analyzing information from their referral letters. %M 33155985 %R 10.2196/21252 %U https://medinform.jmir.org/2020/11/e21252 %U https://doi.org/10.2196/21252 %U http://www.ncbi.nlm.nih.gov/pubmed/33155985 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e20251 %T Engaging Unmotivated Smokers to Move Toward Quitting: Design of Motivational Interviewing–Based Chatbot Through Iterative Interactions %A Almusharraf,Fahad %A Rose,Jonathan %A Selby,Peter %+ The Edward S. Rogers Sr. 
Department of Electrical & Computer Engineering, Faculty of Applied Science & Engineering, University of Toronto, 10 King's College Road, Toronto, ON, M5S 3G4, Canada, 1 4169786992, jonathan.rose@ece.utoronto.ca %K smoking cessation %K motivational interviewing %K chatbot %K natural language processing %D 2020 %7 3.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: At any given time, most smokers in a population are ambivalent with no motivation to quit. Motivational interviewing (MI) is an evidence-based technique that aims to elicit change in ambivalent smokers. MI practitioners are scarce and expensive, and smokers are difficult to reach. Smokers are potentially reachable through the web, and if an automated chatbot could emulate an MI conversation, it could form the basis of a low-cost and scalable intervention motivating smokers to quit. Objective: The primary goal of this study is to design, train, and test an automated MI-based chatbot capable of eliciting reflection in a conversation with cigarette smokers. This study describes the process of collecting training data to improve the chatbot’s ability to generate MI-oriented responses, particularly reflections and summary statements. The secondary goal of this study is to observe the effects on participants through voluntary feedback given after completing a conversation with the chatbot. Methods: An interdisciplinary collaboration between an MI expert and experts in computer engineering and natural language processing (NLP) co-designed the conversation and algorithms underlying the chatbot. A sample of 121 adult cigarette smokers in 11 successive groups were recruited from a web-based platform for a single-arm prospective iterative design study. The chatbot was designed to stimulate reflections on the pros and cons of smoking using MI’s running head start technique. 
Participants were also asked to confirm the chatbot’s classification of their free-form responses to measure the classification accuracy of the underlying NLP models. Each group provided responses that were used to train the chatbot for the next group. Results: A total of 6568 responses from 121 participants in 11 successive groups over 14 weeks were received. From these responses, we were able to isolate 21 unique reasons for and against smoking and the relative frequency of each. The gradual collection of responses as inputs and smoking reasons as labels over the 11 iterations improved the F1 score of the classification within the chatbot from 0.63 in the first group to 0.82 in the final group. The mean time spent by each participant interacting with the chatbot was 21.3 (SD 14.0) min (minimum 6.4 and maximum 89.2). We also found that 34.7% (42/121) of participants enjoyed the interaction with the chatbot, and 8.3% (10/121) of participants noted explicit smoking cessation benefits from the conversation in voluntary feedback that did not solicit this explicitly. Conclusions: Recruiting ambivalent smokers through the web is a viable method to train a chatbot to increase accuracy in reflection and summary statements, the building blocks of MI. A new set of 21 smoking reasons (both for and against) has been identified. Initial feedback from smokers on the experience shows promise toward using it in an intervention. 
%M 33141095 %R 10.2196/20251 %U https://www.jmir.org/2020/11/e20251 %U https://doi.org/10.2196/20251 %U http://www.ncbi.nlm.nih.gov/pubmed/33141095 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e19548 %T Classification of Depression Through Resting-State Electroencephalogram as a Novel Practice in Psychiatry: Review %A Čukić,Milena %A López,Victoria %A Pavón,Juan %+ HealthInc 3EGA, Amsterdam Health and Technology Institute, Koningin Wilhelminaplein 644, Amsterdam, 1062 KS, Netherlands, 31 615178926, micu@3ega.nl %K computational psychiatry %K physiological complexity %K machine learning %K theory-driven approach %K resting-state EEG %K personalized medicine %K computational neuroscience %K unwarranted optimism %D 2020 %7 3.11.2020 %9 Review %J J Med Internet Res %G English %X Background: Machine learning applications in health care have increased considerably in the recent past, and this review focuses on an important application in psychiatry related to the detection of depression. Since the advent of computational psychiatry, research based on functional magnetic resonance imaging has yielded remarkable results, but these tools tend to be too expensive for everyday clinical use. Objective: This review focuses on an affordable data-driven approach based on electroencephalographic recordings. Web-based applications via public or private cloud-based platforms would be a logical next step. We aim to compare several different approaches to the detection of depression from electroencephalographic recordings using various features and machine learning models. Methods: To detect depression, we reviewed published detection studies based on resting-state electroencephalogram with final machine learning, and to predict therapy outcomes, we reviewed a set of interventional studies using some form of stimulation in their methodology. Results: We reviewed 14 detection studies and 12 interventional studies published between 2008 and 2019. 
As direct comparison was not possible due to the large diversity of theoretical approaches and methods used, we compared them based on the steps in analysis and accuracies yielded. In addition, we compared possible drawbacks in terms of sample size, feature extraction, feature selection, classification, internal and external validation, and possible unwarranted optimism and reproducibility. In addition, we suggested desirable practices to avoid misinterpretation of results and optimism. Conclusions: This review shows the need for larger data sets and more systematic procedures to improve the use of the solution for clinical diagnostics. Therefore, regulation of the pipeline and standard requirements for methodology used should become mandatory to increase the reliability and accuracy of the complete methodology for it to be translated to modern psychiatry. %M 33141088 %R 10.2196/19548 %U https://www.jmir.org/2020/11/e19548 %U https://doi.org/10.2196/19548 %U http://www.ncbi.nlm.nih.gov/pubmed/33141088 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 10 %P e18273 %T Exploring Eating Disorder Topics on Twitter: Machine Learning Approach %A Zhou,Sicheng %A Zhao,Yunpeng %A Bian,Jiang %A Haynos,Ann F %A Zhang,Rui %+ Institute for Health Informatics, University of Minnesota, 8-100 Phillips-Wangensteen Building, 516 Delaware Street SE, Minneapolis, MN, 55455, United States, 1 612 626 4209, zhan1386@umn.edu %K eating disorders %K topic modeling %K text classification %K social media %K public health %D 2020 %7 30.10.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Eating disorders (EDs) are a group of mental illnesses that have an adverse effect on both mental and physical health. As social media platforms (eg, Twitter) have become an important data source for public health research, some studies have qualitatively explored the ways in which EDs are discussed on these platforms. 
Initial results suggest that such research offers a promising method for further understanding this group of diseases. Nevertheless, an efficient computational method is needed to further identify and analyze tweets relevant to EDs on a larger scale. Objective: This study aims to develop and validate a machine learning–based classifier to identify tweets related to EDs and to explore factors (ie, topics) related to EDs using a topic modeling method. Methods: We collected potential ED-relevant tweets using keywords from previous studies and annotated these tweets into different groups (ie, ED relevant vs irrelevant and then promotional information vs laypeople discussion). Several supervised machine learning methods, such as convolutional neural network (CNN), long short-term memory (LSTM), support vector machine, and naïve Bayes, were developed and evaluated using annotated data. We used the classifier with the best performance to identify ED-relevant tweets and applied a topic modeling method—Correlation Explanation (CorEx)—to analyze the content of the identified tweets. To validate these machine learning results, we also collected a cohort of ED-relevant tweets on the basis of manually curated rules. Results: A total of 123,977 tweets were collected during the set period. We randomly annotated 2219 tweets for developing the machine learning classifiers. We developed a CNN-LSTM classifier to identify ED-relevant tweets published by laypeople in 2 steps: first relevant versus irrelevant (F1 score=0.89) and then promotional versus published by laypeople (F1 score=0.90). A total of 40,790 ED-relevant tweets were identified using the CNN-LSTM classifier. We also identified another set of tweets (ie, 17,632 ED-relevant and 83,557 ED-irrelevant tweets) posted by laypeople using manually specified rules. Using CorEx on all ED-relevant tweets, the topic model identified 162 topics. 
Overall, the coherence rate for topic modeling was 77.07% (1264/1640), indicating a high quality of the produced topics. The topics were further reviewed and analyzed by a domain expert. Conclusions: A developed CNN-LSTM classifier could improve the efficiency of identifying ED-relevant tweets compared with the traditional manual-based method. The CorEx topic model was applied on the tweets identified by the machine learning–based classifier and the traditional manual approach separately. Highly overlapping topics were observed between the 2 cohorts of tweets. The produced topics were further reviewed by a domain expert. Some of the topics identified by the potential ED tweets may provide new avenues for understanding this serious set of disorders. %M 33124997 %R 10.2196/18273 %U http://medinform.jmir.org/2020/10/e18273/ %U https://doi.org/10.2196/18273 %U http://www.ncbi.nlm.nih.gov/pubmed/33124997 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 10 %P e21222 %T Predictive Models for Neonatal Follow-Up Serum Bilirubin: Model Development and Validation %A Chou,Joseph H %+ Massachusetts General Hospital, 55 Fruit Street, Founders 526E, Boston, MA, 02114-2696, United States, 1 617 724 9040, jchou2@mgh.harvard.edu %K infant, newborn %K neonatology %K jaundice, neonatal %K hyperbilirubinemia, neonatal %K machine learning %K supervised machine learning %K data science %K medical informatics %K decision support techniques %K models, statistical %K predictive models %D 2020 %7 29.10.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Hyperbilirubinemia affects many newborn infants and, if not treated appropriately, can lead to irreversible brain injury. Objective: This study aims to develop predictive models of follow-up total serum bilirubin measurement and to compare their accuracy with that of clinician predictions. Methods: Subjects were patients born between June 2015 and June 2019 at 4 hospitals in Massachusetts. 
The prediction target was a follow-up total serum bilirubin measurement obtained <72 hours after a previous measurement. Birth before versus after February 2019 was used to generate a training set (27,428 target measurements) and a held-out test set (3320 measurements), respectively. Multiple supervised learning models were trained. To further assess model performance, predictions on the held-out test set were also compared with corresponding predictions from clinicians. Results: The best predictive accuracy on the held-out test set was obtained with the multilayer perceptron (ie, neural network, mean absolute error [MAE] 1.05 mg/dL) and XGBoost (MAE 1.04 mg/dL) models. A limited number of predictors were sufficient for constructing models with the best performance and avoiding overfitting: current bilirubin measurement, last rate of rise, proportion of time under phototherapy, time to next measurement, gestational age at birth, current age, and fractional weight change from birth. Clinicians made a total of 210 prospective predictions. The neural network model accuracy on this subset of predictions had an MAE of 1.06 mg/dL compared with clinician predictions with an MAE of 1.38 mg/dL (P<.0001). In babies born at 35 weeks of gestation or later, this approach was also applied to predict the binary outcome of subsequently exceeding consensus guidelines for phototherapy initiation and achieved an area under the receiver operating characteristic curve of 0.94 (95% CI 0.91 to 0.97). Conclusions: This study developed predictive models for neonatal follow-up total serum bilirubin measurements that outperform clinicians. This may be the first report of models that predict specific bilirubin values, are not limited to near-term patients without risk factors, and take into account the effect of phototherapy. 
%M 33118947 %R 10.2196/21222 %U http://medinform.jmir.org/2020/10/e21222/ %U https://doi.org/10.2196/21222 %U http://www.ncbi.nlm.nih.gov/pubmed/33118947 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e21801 %T Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing %A Izquierdo,Jose Luis %A Ancochea,Julio %A Soriano,Joan B %+ Hospital Universitario de La Princesa, Diego de León 62, Madrid, 28005, Spain, 34 618867769, jbsoriano2@gmail.com %K artificial intelligence %K big data %K COVID-19 %K electronic health records %K tachypnea %K SARS-CoV-2 %K predictive model %D 2020 %7 28.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Many factors involved in the onset and clinical course of the ongoing COVID-19 pandemic are still unknown. Although big data analytics and artificial intelligence are widely used in the realms of health and medicine, researchers are only beginning to use these tools to explore the clinical characteristics and predictive factors of patients with COVID-19. Objective: Our primary objectives are to describe the clinical characteristics and determine the factors that predict intensive care unit (ICU) admission of patients with COVID-19. Determining these factors using a well-defined population can increase our understanding of the real-world epidemiology of the disease. Methods: We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyze the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the Servicio de Salud de Castilla-La Mancha (SESCAM) Health Care Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. 
We extracted related clinical information regarding diagnosis, progression, and outcome for all COVID-19 cases. Results: A total of 10,504 patients with a clinical or polymerase chain reaction–confirmed diagnosis of COVID-19 were identified; 5519 (52.5%) were male, with a mean age of 58.2 years (SD 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnea; however, all three symptoms occurred in fewer than half of the cases. Overall, 6.1% (83/1353) of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnea was the most parsimonious predictor of ICU admission; patients younger than 56 years, without tachypnea, and with a temperature <39 °C (or >39 °C without respiratory crackles) were not admitted to the ICU. In contrast, patients with COVID-19 aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnea and delayed their visit to the emergency department after being seen in primary care. Conclusions: Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnea with or without respiratory crackles) predicts whether patients with COVID-19 will require ICU admission. 
%M 33090964 %R 10.2196/21801 %U http://www.jmir.org/2020/10/e21801/ %U https://doi.org/10.2196/21801 %U http://www.ncbi.nlm.nih.gov/pubmed/33090964 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e20891 %T Federated Learning on Clinical Benchmark Data: Performance Assessment %A Lee,Geun Hyeong %A Shin,Soo-Yong %+ Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, 115 Irwon-ro, Gangnam-gu, Seoul, 06355, Republic of Korea, 82 2 3410 1449, sy.shin@skku.edu %K federated learning %K medical data %K privacy protection %K machine learning %K deep learning %D 2020 %7 26.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Federated learning (FL) is a newly proposed machine-learning method that uses a decentralized dataset. Since data transfer is not necessary for the learning process in FL, there is a significant advantage in protecting personal privacy. Therefore, many studies are being actively conducted in the applications of FL for diverse areas. Objective: The aim of this study was to evaluate the reliability and performance of FL using three benchmark datasets, including a clinical benchmark dataset. Methods: To evaluate FL in a realistic setting, we implemented FL using a client-server architecture with Python. The implemented client-server version of the FL software was deployed to Amazon Web Services. Modified National Institute of Standards and Technology (MNIST), Medical Information Mart for Intensive Care-III (MIMIC-III), and electrocardiogram (ECG) datasets were used to evaluate the performance of FL. To test FL in a realistic setting, the MNIST dataset was split into 10 different clients, with one digit for each client. In addition, we conducted four different experiments according to basic, imbalanced, skewed, and a combination of imbalanced and skewed data distributions. 
We also compared the performance of FL to that of the state-of-the-art method with respect to in-hospital mortality using the MIMIC-III dataset. Likewise, we conducted experiments comparing basic and imbalanced data distributions using MIMIC-III and ECG data. Results: FL on the basic MNIST dataset with 10 clients achieved an area under the receiver operating characteristic curve (AUROC) of 0.997 and an F1-score of 0.946. The experiment with the imbalanced MNIST dataset achieved an AUROC of 0.995 and an F1-score of 0.921. The experiment with the skewed MNIST dataset achieved an AUROC of 0.992 and an F1-score of 0.905. Finally, the combined imbalanced and skewed experiment achieved an AUROC of 0.990 and an F1-score of 0.891. The basic FL on in-hospital mortality using MIMIC-III data achieved an AUROC of 0.850 and an F1-score of 0.944, while the experiment with the imbalanced MIMIC-III dataset achieved an AUROC of 0.850 and an F1-score of 0.943. For ECG classification, the basic FL achieved an AUROC of 0.938 and an F1-score of 0.807, and the imbalanced ECG dataset achieved an AUROC of 0.943 and an F1-score of 0.807. Conclusions: FL demonstrated comparable performance on different benchmark datasets. In addition, FL demonstrated reliable performance in cases where the distribution was imbalanced, skewed, and extreme, reflecting the real-life scenario in which data distributions from various hospitals are different. FL can achieve high performance while maintaining privacy protection because there is no requirement to centralize the data. 
%M 33104011 %R 10.2196/20891 %U http://www.jmir.org/2020/10/e20891/ %U https://doi.org/10.2196/20891 %U http://www.ncbi.nlm.nih.gov/pubmed/33104011 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e20346 %T The Effectiveness of Artificial Intelligence Conversational Agents in Health Care: Systematic Review %A Milne-Ives,Madison %A de Cock,Caroline %A Lim,Ernest %A Shehadeh,Melissa Harper %A de Pennington,Nick %A Mole,Guy %A Normando,Eduardo %A Meinert,Edward %+ Centre for Health Technology, University of Plymouth, 8 Kirkby Place, Room 2, Plymouth, PL4 6DT, United Kingdom, 44 7824446808, edward.meinert@plymouth.ac.uk %K artificial intelligence %K avatar %K chatbot %K conversational agent %K digital health %K intelligent assistant %K speech recognition software %K virtual assistant %K virtual coach %K virtual health care %K virtual nursing %K voice recognition software %D 2020 %7 22.10.2020 %9 Review %J J Med Internet Res %G English %X Background: The high demand for health care services and the growing capability of artificial intelligence have led to the development of conversational agents designed to support a variety of health-related activities, including behavior change, treatment support, health monitoring, training, triage, and screening support. Automation of these tasks could free clinicians to focus on more complex work and increase the accessibility to health care services for the public. An overarching assessment of the acceptability, usability, and effectiveness of these agents in health care is needed to collate the evidence so that future development can target areas for improvement and potential for sustainable adoption. Objective: This systematic review aims to assess the effectiveness and usability of conversational agents in health care and identify the elements that users like and dislike to inform future research and development of these agents. 
Methods: PubMed, Medline (Ovid), EMBASE (Excerpta Medica dataBASE), CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, and the Association for Computing Machinery Digital Library were systematically searched for articles published since 2008 that evaluated unconstrained natural language processing conversational agents used in health care. EndNote (version X9, Clarivate Analytics) reference management software was used for initial screening, and full-text screening was conducted by 1 reviewer. Data were extracted, and the risk of bias was assessed by one reviewer and validated by another. Results: A total of 31 studies were selected and included a variety of conversational agents, including 14 chatbots (2 of which were voice chatbots), 6 embodied conversational agents (3 of which were interactive voice response calls, virtual patients, and speech recognition screening systems), 1 contextual question-answering agent, and 1 voice recognition triage system. Overall, the evidence reported was mostly positive or mixed. Usability and satisfaction performed well (27/30 and 26/31), and positive or mixed effectiveness was found in three-quarters of the studies (23/30). However, there were several limitations of the agents highlighted in specific qualitative feedback. Conclusions: The studies generally reported positive or mixed evidence for the effectiveness, usability, and satisfactoriness of the conversational agents investigated, but qualitative user perceptions were more mixed. The quality of many of the studies was limited, and improved study design and reporting are necessary to more accurately evaluate the usefulness of the agents in health care and identify key areas for improvement. Further research should also analyze the cost-effectiveness, privacy, and security of the agents. 
International Registered Report Identifier (IRRID): RR2-10.2196/16934 %M 33090118 %R 10.2196/20346 %U http://www.jmir.org/2020/10/e20346/ %U https://doi.org/10.2196/20346 %U http://www.ncbi.nlm.nih.gov/pubmed/33090118 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e22550 %T Deep Learning With Electronic Health Records for Short-Term Fracture Risk Identification: Crystal Bone Algorithm Development and Validation %A Almog,Yasmeen Adar %A Rai,Angshu %A Zhang,Patrick %A Moulaison,Amanda %A Powell,Ross %A Mishra,Anirban %A Weinberg,Kerry %A Hamilton,Celeste %A Oates,Mary %A McCloskey,Eugene %A Cummings,Steven R %+ Digital Health & Innovation, Amgen Inc, 1 Amgen Center Drive, MS 38-3B, Thousand Oaks, CA, 91320, United States, 1 4243463036, yalmog@amgen.com %K fracture %K bone %K osteoporosis %K low bone mass %K prediction %K natural language processing %K NLP %K machine learning %K deep learning %K artificial intelligence %K AI %K electronic health record %K EHR %D 2020 %7 16.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Fractures as a result of osteoporosis and low bone mass are common and give rise to significant clinical, personal, and economic burden. Even after a fracture occurs, high fracture risk remains widely underdiagnosed and undertreated. Common fracture risk assessment tools utilize a subset of clinical risk factors for prediction, and often require manual data entry. Furthermore, these tools predict risk over the long term and do not explicitly provide short-term risk estimates necessary to identify patients likely to experience a fracture in the next 1-2 years. Objective: The goal of this study was to develop and evaluate an algorithm for the identification of patients at risk of fracture in a subsequent 1- to 2-year period. 
In order to address the aforementioned limitations of current prediction tools, this approach focused on a short-term timeframe, automated data entry, and the use of longitudinal data to inform the predictions. Methods: Using retrospective electronic health record data from over 1,000,000 patients, we developed Crystal Bone, an algorithm that applies machine learning techniques from natural language processing to the temporal nature of patient histories to generate short-term fracture risk predictions. Similar to how language models predict the next word in a given sentence or the topic of a document, Crystal Bone predicts whether a patient’s future trajectory might contain a fracture event, or whether the signature of the patient’s journey is similar to that of a typical future fracture patient. A holdout set with 192,590 patients was used to validate accuracy. Experimental baseline models and human-level performance were used for comparison. Results: The model accurately predicted 1- to 2-year fracture risk for patients aged over 50 years (area under the receiver operating characteristic curve [AUROC] 0.81). These algorithms outperformed the experimental baselines (AUROC 0.67) and showed meaningful improvements when compared to retrospective approximation of human-level performance by correctly identifying 9649 of 13,765 (70%) at-risk patients who did not receive any preventative bone-health-related medical interventions from their physicians. Conclusions: These findings indicate that it is possible to use a patient’s unique medical history as it changes over time to predict the risk of short-term fracture. Validating and applying such a tool within the health care system could enable automated and widespread prediction of this risk and may help with identification of patients at very high risk of fracture. 
%M 32956069 %R 10.2196/22550 %U http://www.jmir.org/2020/10/e22550/ %U https://doi.org/10.2196/22550 %U http://www.ncbi.nlm.nih.gov/pubmed/32956069 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 10 %P e19878 %T Application of an Artificial Intelligence Trilogy to Accelerate Processing of Suspected Patients With SARS-CoV-2 at a Smart Quarantine Station: Observational Study %A Liu,Ping-Yen %A Tsai,Yi-Shan %A Chen,Po-Lin %A Tsai,Huey-Pin %A Hsu,Ling-Wei %A Wang,Chi-Shiang %A Lee,Nan-Yao %A Huang,Mu-Shiang %A Wu,Yun-Chiao %A Ko,Wen-Chien %A Yang,Yi-Ching %A Chiang,Jung-Hsien %A Shen,Meng-Ru %+ Department of Obstetrics and Gynecology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, 138 Sheng-Li Rd, Tainan, 70401, Taiwan, 886 6 2353535 ext 5505, mrshen@mail.ncku.edu.tw %K SARS-CoV-2 %K COVID-19 %K artificial intelligence %K smart device assisted decision making %K quarantine station %D 2020 %7 14.10.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: As the COVID-19 epidemic increases in severity, the burden of quarantine stations outside emergency departments (EDs) at hospitals is increasing daily. To address the high screening workload at quarantine stations, all staff members with medical licenses are required to work shifts in these stations. Therefore, it is necessary to simplify the workflow and decision-making process for physicians and surgeons from all subspecialties. Objective: The aim of this paper is to demonstrate how the National Cheng Kung University Hospital artificial intelligence (AI) trilogy of diversion to a smart quarantine station, AI-assisted image interpretation, and a built-in clinical decision-making algorithm improves medical care and reduces quarantine processing times. Methods: This observational study on the emerging COVID-19 pandemic included 643 patients. 
An “AI trilogy” of diversion to a smart quarantine station, AI-assisted image interpretation, and a built-in clinical decision-making algorithm on a tablet computer was applied to shorten the quarantine survey process and reduce processing time during the COVID-19 pandemic. Results: The use of the AI trilogy facilitated the processing of suspected cases of COVID-19 with or without symptoms; also, travel, occupation, contact, and clustering histories were obtained with the tablet computer device. A separate AI-mode function that could quickly recognize pulmonary infiltrates on chest x-rays was merged into the smart clinical assisting system (SCAS), and this model was subsequently trained with COVID-19 pneumonia cases from the GitHub open source data set. The detection rates for posteroanterior and anteroposterior chest x-rays were 55/59 (93%) and 5/11 (45%), respectively. The SCAS algorithm was continuously adjusted based on updates to the Taiwan Centers for Disease Control public safety guidelines for faster clinical decision making. Our ex vivo study demonstrated the efficiency of disinfecting the tablet computer surface by wiping it twice with 75% alcohol sanitizer. To further analyze the impact of the AI application in the quarantine station, we subdivided the station group into groups with or without AI. Compared with the conventional ED (n=281), the survey time at the quarantine station (n=1520) was significantly shortened; the median survey time at the ED was 153 minutes (95% CI 108.5-205.0), vs 35 minutes at the quarantine station (95% CI 24-56; P<.001). Furthermore, the use of the AI application in the quarantine station reduced the survey time in the quarantine station; the median survey time without AI was 101 minutes (95% CI 40-153), vs 34 minutes (95% CI 24-53) with AI in the quarantine station (P<.001). 
Conclusions: The AI trilogy improved our medical care workflow by shortening the quarantine survey process and reducing the processing time, which is especially important during an emerging infectious disease epidemic. %M 33001832 %R 10.2196/19878 %U http://www.jmir.org/2020/10/e19878/ %U https://doi.org/10.2196/19878 %U http://www.ncbi.nlm.nih.gov/pubmed/33001832 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 10 %P e18287 %T Construction of a Digestive System Tumor Knowledge Graph Based on Chinese Electronic Medical Records: Development and Usability Study %A Xiu,Xiaolei %A Qian,Qing %A Wu,Sizhu %+ Institute of Medical Information/Medical Library, Chinese Academy of Medical Sciences & Peking Union Medical College, No 3 Yabao Road, Chaoyang District, Beijing, 100020, China, 86 18510495073, wu.sizhu@imicams.ac.cn %K Chinese electronic medical records %K knowledge graph %K digestive system tumor %K graph evaluation %D 2020 %7 7.10.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: With the increasing incidences and mortality of digestive system tumor diseases in China, ways to use clinical experience data in Chinese electronic medical records (CEMRs) to determine potentially effective relationships between diagnosis and treatment have become a priority. As an important part of artificial intelligence, a knowledge graph is a powerful tool for information processing and knowledge organization that provides an ideal means to solve this problem. Objective: This study aimed to construct a semantic-driven digestive system tumor knowledge graph (DSTKG) to represent the knowledge in CEMRs with fine granularity and semantics. Methods: This paper focuses on the knowledge graph schema and semantic relationships that were the main challenges for constructing a Chinese tumor knowledge graph. The DSTKG was developed through a multistep procedure. As an initial step, a complete DSTKG construction framework based on CEMRs was proposed. 
Then, this research built a knowledge graph schema containing 7 classes and 16 kinds of semantic relationships and accomplished the DSTKG by knowledge extraction, named entity linking, and drawing the knowledge graph. Finally, the quality of the DSTKG was evaluated from 3 aspects: data layer, schema layer, and application layer. Results: Experts agreed that the DSTKG was good overall (mean score 4.20). Especially for the aspects of “rationality of schema structure,” “scalability,” and “readability of results,” the DSTKG performed well, with scores of 4.72, 4.67, and 4.69, respectively, which were much higher than the average. However, the small amount of data in the DSTKG negatively affected its “practicability” score. Compared with other Chinese tumor knowledge graphs, the DSTKG can represent more granular entities, properties, and semantic relationships. In addition, the DSTKG was flexible, allowing personalized customization to meet the designer's focus on specific interests in the digestive system tumor. Conclusions: We constructed a granular semantic DSTKG. It could provide guidance for the construction of a tumor knowledge graph and provide a preliminary step for the intelligent application of knowledge graphs based on CEMRs. Additional data sources and stronger research on assertion classification are needed to gain insight into the DSTKG’s potential. 
%M 33026359 %R 10.2196/18287 %U http://medinform.jmir.org/2020/10/e18287/ %U https://doi.org/10.2196/18287 %U http://www.ncbi.nlm.nih.gov/pubmed/33026359 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e22845 %T Artificial Intelligence Chatbot Behavior Change Model for Designing Artificial Intelligence Chatbots to Promote Physical Activity and a Healthy Diet: Viewpoint %A Zhang,Jingwen %A Oh,Yoo Jung %A Lange,Patrick %A Yu,Zhou %A Fukuoka,Yoshimi %+ Department of Communication, University of California, Davis, One Shields Avenue, Davis, CA, 95616, United States, 1 530 754 1472, jwzzhang@ucdavis.edu %K chatbot %K conversational agent %K artificial intelligence %K physical activity %K diet %K intervention %K behavior change %K natural language processing %K communication %D 2020 %7 30.9.2020 %9 Viewpoint %J J Med Internet Res %G English %X Background: Chatbots empowered by artificial intelligence (AI) can increasingly engage in natural conversations and build relationships with users. Applying AI chatbots to lifestyle modification programs is one of the promising areas to develop cost-effective and feasible behavior interventions to promote physical activity and a healthy diet. Objective: The purposes of this perspective paper are to present a brief literature review of chatbot use in promoting physical activity and a healthy diet, describe the AI chatbot behavior change model our research team developed based on extensive interdisciplinary research, and discuss ethical principles and considerations. Methods: We conducted a preliminary search of studies reporting chatbots for improving physical activity and/or diet in four databases in July 2020. We summarized the characteristics of the chatbot studies and reviewed recent developments in human-AI communication research and innovations in natural language processing. 
Based on the identified gaps and opportunities, as well as our own clinical and research experience and findings, we propose an AI chatbot behavior change model. Results: Our review found a lack of understanding around theoretical guidance and practical recommendations on designing AI chatbots for lifestyle modification programs. The proposed AI chatbot behavior change model consists of the following four components to provide such guidance: (1) designing chatbot characteristics and understanding user background; (2) building relational capacity; (3) building persuasive conversational capacity; and (4) evaluating mechanisms and outcomes. The rationale and evidence supporting the design and evaluation choices for this model are presented in this paper. Conclusions: As AI chatbots become increasingly integrated into various digital communications, our proposed theoretical framework is the first step to conceptualize the scope of utilization in health behavior change domains and to synthesize all possible dimensions of chatbot features to inform intervention design and evaluation. There is a need for more interdisciplinary work to continue developing AI techniques to improve a chatbot’s relational and persuasive capacities to change physical activity and diet behaviors with strong ethical principles. 
%M 32996892 %R 10.2196/22845 %U https://www.jmir.org/2020/9/e22845 %U https://doi.org/10.2196/22845 %U http://www.ncbi.nlm.nih.gov/pubmed/32996892 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e21849 %T Development of a Social Network for People Without a Diagnosis (RarePairs): Evaluation Study %A Kühnle,Lara %A Mücke,Urs %A Lechner,Werner M %A Klawonn,Frank %A Grigull,Lorenz %+ Hannover Medical School, Carl-Neuberg-Straße 1, Hannover, 30625, Germany, 49 511532 ext 3220, muecke.urs@mh-hannover.de %K rare disease %K diagnostic support tool %K prototype %K social network %K machine learning %K artificial intelligence %D 2020 %7 29.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Diagnostic delay in rare disease (RD) is common, occasionally lasting up to more than 20 years. In attempting to reduce it, diagnostic support tools have been studied extensively. However, social platforms have not yet been used for systematic diagnostic support. This paper illustrates the development and prototypic application of a social network using scientifically developed questions to match individuals without a diagnosis. Objective: The study aimed to outline, create, and evaluate a prototype tool (a social network platform named RarePairs), helping patients with undiagnosed RDs to find individuals with similar symptoms. The prototype includes a matching algorithm, bringing together individuals with similar disease burden in the lead-up to diagnosis. Methods: We divided our project into 4 phases. In phase 1, we used known data and findings in the literature to understand and specify the context of use. In phase 2, we specified the user requirements. In phase 3, we designed a prototype based on the results of phases 1 and 2, as well as incorporating a state-of-the-art questionnaire with 53 items for recognizing an RD. 
Lastly, we evaluated this prototype with a data set of 973 questionnaires from individuals suffering from different RDs using 24 distance calculating methods. Results: Based on a step-by-step construction process, the digital patient platform prototype, RarePairs, was developed. In order to match individuals with similar experiences, it uses answer patterns generated by a specifically designed questionnaire (Q53). A total of 973 questionnaires answered by patients with RDs were used to construct and test an artificial intelligence (AI) algorithm like the k-nearest neighbor search. With this, we found matches for every single one of the 973 records. The cross-validation of those matches showed that the algorithm outperforms random matching significantly. Statistically, for every data set the algorithm found at least one other record (match) with the same diagnosis. Conclusions: Diagnostic delay is torturous for patients without a diagnosis. Shortening the delay is important for both doctors and patients. Diagnostic support using AI can be promoted differently. The prototype of the social media platform RarePairs might be a low-threshold patient platform, and proved suitable to match and connect different individuals with comparable symptoms. This exchange promoted through RarePairs might be used to speed up the diagnostic process. Further studies include its evaluation in a prospective setting and implementation of RarePairs as a mobile phone app. 
%M 32990634 %R 10.2196/21849 %U http://www.jmir.org/2020/9/e21849/ %U https://doi.org/10.2196/21849 %U http://www.ncbi.nlm.nih.gov/pubmed/32990634 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e20645 %T Marrying Medical Domain Knowledge With Deep Learning on Electronic Health Records: A Deep Visual Analytics Approach %A Li,Rui %A Yin,Changchang %A Yang,Samuel %A Qian,Buyue %A Zhang,Ping %+ The Ohio State University, Lincoln Tower 310A, 1800 Cannon Drive, Columbus, OH, 43210, United States, 1 614 293 9286, zhang.10631@osu.edu %K electronic health records %K interpretable deep learning %K knowledge graph %K visual analytics %D 2020 %7 28.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Deep learning models have attracted significant interest from health care researchers during the last few decades. There have been many studies that apply deep learning to medical applications and achieve promising results. However, there are three limitations to the existing models: (1) most clinicians are unable to interpret the results from the existing models, (2) existing models cannot incorporate complicated medical domain knowledge (eg, a disease causes another disease), and (3) most existing models lack visual exploration and interaction. Both the electronic health record (EHR) data set and the deep model results are complex and abstract, which impedes clinicians from exploring and communicating with the model directly. Objective: The objective of this study is to develop an interpretable and accurate risk prediction model as well as an interactive clinical prediction system to support EHR data exploration, knowledge graph demonstration, and model interpretation. Methods: A domain-knowledge–guided recurrent neural network (DG-RNN) model is proposed to predict clinical risks. The model takes medical event sequences as input and incorporates medical domain knowledge by attending to a subgraph of the whole medical knowledge graph. 
A global pooling operation and a fully connected layer are used to output the clinical outcomes. The middle results and the parameters of the fully connected layer are helpful in identifying which medical events cause clinical risks. DG-Viz is also designed to support EHR data exploration, knowledge graph demonstration, and model interpretation. Results: We conducted both risk prediction experiments and a case study on a real-world data set. A total of 554 patients with heart failure and 1662 control patients without heart failure were selected from the data set. The experimental results show that the proposed DG-RNN outperforms the state-of-the-art approaches by approximately 1.5%. The case study demonstrates how our medical physician collaborator can effectively explore the data and interpret the prediction results using DG-Viz. Conclusions: In this study, we present DG-Viz, an interactive clinical prediction system, which brings together the power of deep learning (ie, a DG-RNN–based model) and visual analytics to predict clinical risks and visually interpret the EHR prediction results. Experimental results and a case study on heart failure risk prediction tasks demonstrate the effectiveness and usefulness of the DG-Viz system. This study will pave the way for interactive, interpretable, and accurate clinical risk predictions. 
%M 32985996 %R 10.2196/20645 %U http://www.jmir.org/2020/9/e20645/ %U https://doi.org/10.2196/20645 %U http://www.ncbi.nlm.nih.gov/pubmed/32985996 %0 Journal Article %@ 2371-4379 %I JMIR Publications %V 5 %N 3 %P e18660 %T The Diabits App for Smartphone-Assisted Predictive Monitoring of Glycemia in Patients With Diabetes: Retrospective Observational Study %A Kriventsov,Stan %A Lindsey,Alexander %A Hayeri,Amir %+ Bio Conscious Technologies Inc, 555 W Hastings St, Suite #1200, Vancouver, BC, V6B 4N6, Canada, 1 604 729 4747, stan@bioconscious.tech %K blood glucose predictions %K type 1 diabetes %K artificial intelligence %K machine learning %K digital health %K mobile phone %D 2020 %7 22.9.2020 %9 Original Paper %J JMIR Diabetes %G English %X Background: Diabetes mellitus, which causes dysregulation of blood glucose in humans, is a major public health challenge. Patients with diabetes must monitor their glycemic levels to keep them in a healthy range. This task is made easier by using continuous glucose monitoring (CGM) devices and relaying their output to smartphone apps, thus providing users with real-time information on their glycemic fluctuations and possibly predicting future trends. Objective: This study aims to discuss various challenges of predictive monitoring of glycemia and examines the accuracy and blood glucose control effects of Diabits, a smartphone app that helps patients with diabetes monitor and manage their blood glucose levels in real time. Methods: Using data from CGM devices and user input, Diabits applies machine learning techniques to create personalized patient models and predict blood glucose fluctuations up to 60 min in advance. These predictions give patients an opportunity to take pre-emptive action to maintain their blood glucose values within the reference range. 
In this retrospective observational cohort study, the predictive accuracy of Diabits and the correlation between daily use of the app and blood glucose control metrics were examined based on real app users’ data. Moreover, the accuracy of predictions on the 2018 Ohio T1DM (type 1 diabetes mellitus) data set was calculated and compared against other published results. Results: On the basis of more than 6.8 million data points, 30-min Diabits predictions evaluated using Parkes Error Grid were found to be 86.89% (5,963,930/6,864,130) clinically accurate (zone A) and 99.56% (6,833,625/6,864,130) clinically acceptable (zones A and B), whereas 60-min predictions were 70.56% (4,843,605/6,864,130) clinically accurate and 97.49% (6,692,165/6,864,130) clinically acceptable. By analyzing daily use statistics and CGM data for the 280 most long-standing users of Diabits, it was established that under free-living conditions, many common blood glucose control metrics improved with increased frequency of app use. For instance, the average blood glucose for the days these users did not interact with the app was 154.0 (SD 47.2) mg/dL, with 67.52% of the time spent in the healthy 70 to 180 mg/dL range. For days with 10 or more Diabits sessions, the average blood glucose decreased to 141.6 (SD 42.0) mg/dL (P<.001), whereas the time in euglycemic range increased to 74.28% (P<.001). On the Ohio T1DM data set of 6 patients with type 1 diabetes, 30-min predictions of the base Diabits model had an average root mean square error of 18.68 (SD 2.19) mg/dL, which is an improvement over the published state-of-the-art results for this data set. Conclusions: Diabits accurately predicts future glycemic fluctuations, potentially making it easier for patients with diabetes to maintain their blood glucose in the reference range. Furthermore, an improvement in glucose control was observed on days with more frequent Diabits use. 
%M 32960180 %R 10.2196/18660 %U http://diabetes.jmir.org/2020/3/e18660/ %U https://doi.org/10.2196/18660 %U http://www.ncbi.nlm.nih.gov/pubmed/32960180 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e19897 %T A Personalized Voice-Based Diet Assistant for Caregivers of Alzheimer Disease and Related Dementias: System Development and Validation %A Li,Juan %A Maharjan,Bikesh %A Xie,Bo %A Tao,Cui %+ University of Texas Health Science Center at Houston, 7000 Fannin Street Suite 600, Houston, TX, 77030, United States, 1 7135003981, cui.tao@uth.tmc.edu %K Alzheimer disease %K dementia %K diet %K knowledge %K ontology %K voice assistant %D 2020 %7 21.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The world’s aging population is increasing, with an expected increase in the prevalence of Alzheimer disease and related dementias (ADRD). Proper nutrition and good eating behavior show promise for preventing and slowing the progression of ADRD and consequently improving patients with ADRD’s health status and quality of life. Most ADRD care is provided by informal caregivers, so assisting caregivers to manage patients with ADRD’s diet is important. Objective: This study aims to design, develop, and test an artificial intelligence–powered voice assistant to help informal caregivers manage the daily diet of patients with ADRD and learn food and nutrition-related knowledge. Methods: The voice assistant is being implemented in several steps: construction of a comprehensive knowledge base with ontologies that define ADRD diet care and user profiles, and is extended with external knowledge graphs; management of conversation between users and the voice assistant; personalized ADRD diet services provided through a semantics-based knowledge graph search and reasoning engine; and system evaluation in use cases with additional qualitative evaluations. Results: A prototype voice assistant was evaluated in the lab using various use cases. 
Preliminary qualitative test results demonstrate reasonable rates of dialogue success and recommendation correctness. Conclusions: The voice assistant provides a natural, interactive interface for users, and it does not require the user to have a technical background, which may facilitate senior caregivers’ use in their daily care tasks. This study suggests the feasibility of using the intelligent voice assistant to help caregivers manage patients with ADRD’s diet. %M 32955452 %R 10.2196/19897 %U http://www.jmir.org/2020/9/e19897/ %U https://doi.org/10.2196/19897 %U http://www.ncbi.nlm.nih.gov/pubmed/32955452 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e21983 %T Artificial Intelligence for the Prediction of Helicobacter Pylori Infection in Endoscopic Images: Systematic Review and Meta-Analysis Of Diagnostic Test Accuracy %A Bang,Chang Seok %A Lee,Jae Jun %A Baik,Gwang Ho %+ Department of Internal Medicine, Hallym University College of Medicine, Sakju-ro 77, Chuncheon, , Republic of Korea, 82 33 240 5000, csbang@hallym.ac.kr %K artificial intelligence %K convolutional neural network %K deep learning %K machine learning %K endoscopy %K Helicobacter pylori %D 2020 %7 16.9.2020 %9 Review %J J Med Internet Res %G English %X Background: Helicobacter pylori plays a central role in the development of gastric cancer, and prediction of H pylori infection by visual inspection of the gastric mucosa is an important function of endoscopy. However, there are currently no established methods of optical diagnosis of H pylori infection using endoscopic images. Definitive diagnosis requires endoscopic biopsy. Artificial intelligence (AI) has been increasingly adopted in clinical practice, especially for image recognition and classification. Objective: This study aimed to evaluate the diagnostic test accuracy of AI for the prediction of H pylori infection using endoscopic images. Methods: Two independent evaluators searched core databases. 
Eligible studies used endoscopic images of H pylori infection, applied AI to predict H pylori infection, and reported diagnostic performance. Systematic review and diagnostic test accuracy meta-analysis were performed. Results: Ultimately, 8 studies were identified. Pooled sensitivity, specificity, diagnostic odds ratio, and area under the curve of AI for the prediction of H pylori infection were 0.87 (95% CI 0.72-0.94), 0.86 (95% CI 0.77-0.92), 40 (95% CI 15-112), and 0.92 (95% CI 0.90-0.94), respectively, in the 1719 patients (385 patients with H pylori infection vs 1334 controls). Meta-regression identified methodological quality and the number of patients included in each study as sources of heterogeneity. There was no evidence of publication bias. The accuracy of the AI algorithm reached 82% for discrimination between noninfected images and posteradication images. Conclusions: An AI algorithm is a reliable tool for endoscopic diagnosis of H pylori infection. The limitations, namely the lack of external validation and the restriction of the included studies to Asia, should be overcome. 
Trial Registration: PROSPERO CRD42020175957; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=175957 %M 32936088 %R 10.2196/21983 %U http://www.jmir.org/2020/9/e21983/ %U https://doi.org/10.2196/21983 %U http://www.ncbi.nlm.nih.gov/pubmed/32936088 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 9 %P e18689 %T An Intelligent Mobile-Enabled System for Diagnosing Parkinson Disease: Development and Validation of a Speech Impairment Detection System %A Zhang,Liang %A Qu,Yue %A Jin,Bo %A Jing,Lu %A Gao,Zhan %A Liang,Zhanhua %+ Department of Neurology, The First Affiliated Hospital of Dalian Medical University, No.222 Zhongshan Road, Dalian, 116011, China, 86 18098876262, jinglu131129@126.com %K Parkinson disease %K speech disorder %K remote diagnosis %K artificial intelligence %K mobile phone app %K mobile health %D 2020 %7 16.9.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Parkinson disease (PD) is one of the most common neurological diseases. At present, because the exact cause is still unclear, accurate diagnosis and progression monitoring remain challenging. In recent years, exploring the relationship between PD and speech impairment has attracted widespread attention in the academic world. Most of the studies successfully validated the effectiveness of some vocal features. Moreover, the noninvasive nature of speech signal–based testing has pioneered a new way for telediagnosis and telemonitoring. In particular, there is an increasing demand for artificial intelligence–powered tools in the digital health era. Objective: This study aimed to build a real-time speech signal analysis tool for PD diagnosis and severity assessment. Further, the underlying system should be flexible enough to integrate any machine learning or deep learning algorithm. 
Methods: At its core, the system we built consists of two parts: (1) speech signal processing: both traditional and novel speech signal processing technologies have been employed for feature engineering, which can automatically extract a few linear and nonlinear dysphonia features, and (2) application of machine learning algorithms: some classical regression and classification algorithms from the machine learning field have been tested; we then chose the most efficient algorithms and relevant features. Results: Experimental results showed that our system had an outstanding ability to both diagnose and assess severity of PD. By using both linear and nonlinear dysphonia features, the accuracy reached 88.74% and recall reached 97.03% in the diagnosis task. Meanwhile, mean absolute error was 3.7699 in the assessment task. The system has already been deployed within a mobile app called No Pa. Conclusions: This study performed diagnosis and severity assessment of PD from the perspective of speech disorder detection. The efficiency and effectiveness of the algorithms indirectly validated the practicality of the system. In particular, the system reflects the necessity of a publicly accessible PD diagnosis and assessment system that can perform telediagnosis and telemonitoring of PD. This system can also optimize doctors’ decision-making processes regarding treatments. 
%M 32936086 %R 10.2196/18689 %U http://medinform.jmir.org/2020/9/e18689/ %U https://doi.org/10.2196/18689 %U http://www.ncbi.nlm.nih.gov/pubmed/32936086 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e20641 %T Automatic Grading of Stroke Symptoms for Rapid Assessment Using Optimized Machine Learning and 4-Limb Kinematics: Clinical Validation Study %A Park,Eunjeong %A Lee,Kijeong %A Han,Taehwa %A Nam,Hyo Suk %+ Department of Neurology, Yonsei University College of Medicine, 50 Yonsei-ro, Seodaemoon-gu, Seoul, 03722, Republic of Korea, 82 222280245, hsnam@yuhs.ac %K machine learning %K artificial intelligence %K sensors %K kinematics %K stroke %K telemedicine %D 2020 %7 16.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Subtle abnormal motor signs are indications of serious neurological diseases. Although neurological deficits require fast initiation of treatment in a restricted time, it is difficult for nonspecialists to detect and objectively assess the symptoms. In the clinical environment, diagnoses and decisions are based on clinical grading methods, including the National Institutes of Health Stroke Scale (NIHSS) score or the Medical Research Council (MRC) score, which have been used to measure motor weakness. Objective grading in various environments is necessitated for consistent agreement among patients, caregivers, paramedics, and medical staff to facilitate rapid diagnoses and dispatches to appropriate medical centers. Objective: In this study, we aimed to develop an autonomous grading system for stroke patients. We investigated the feasibility of our new system to assess motor weakness and grade NIHSS and MRC scores of 4 limbs, similar to the clinical examinations performed by medical staff. Methods: We implemented an automatic grading system composed of a measuring unit with wearable sensors and a grading unit with optimized machine learning. 
Inertial sensors were attached to measure subtle weaknesses caused by paralysis of upper and lower limbs. We collected 60 instances of data with kinematic features of motor disorders from neurological examination and demographic information of stroke patients with NIHSS 0 or 1 and MRC 7, 8, or 9 grades in a stroke unit. Training data with 240 instances were generated using a synthetic minority oversampling technique to complement the imbalanced number of data between classes and low number of training data. We trained 2 representative machine learning algorithms, an ensemble and a support vector machine (SVM), to implement auto-NIHSS and auto-MRC grading. The optimized algorithms performed a 5-fold cross-validation and were searched by Bayes optimization in 30 trials. The trained model was tested with the 60 original hold-out instances for performance evaluation in accuracy, sensitivity, specificity, and area under the receiver operating characteristics curve (AUC). Results: The proposed system can grade NIHSS scores with an accuracy of 83.3% and an AUC of 0.912 using an optimized ensemble algorithm, and it can grade with an accuracy of 80.0% and an AUC of 0.860 using an optimized SVM algorithm. The auto-MRC grading achieved an accuracy of 76.7% and a mean AUC of 0.870 in SVM classification and an accuracy of 78.3% and a mean AUC of 0.877 in ensemble classification. Conclusions: The automatic grading system quantifies proximal weakness in real time and assesses symptoms through automatic grading. The pilot outcomes demonstrated the feasibility of remote monitoring of motor weakness caused by stroke. The system can facilitate consistent grading with instant assessment and expedite dispatches to appropriate hospitals and treatment initiation by sharing auto-MRC and auto-NIHSS scores between prehospital and hospital responses as an objective observation. 
%M 32936079 %R 10.2196/20641 %U http://www.jmir.org/2020/9/e20641/ %U https://doi.org/10.2196/20641 %U http://www.ncbi.nlm.nih.gov/pubmed/32936079 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e19133 %T Social Reminiscence in Older Adults’ Everyday Conversations: Automated Detection Using Natural Language Processing and Machine Learning %A Ferrario,Andrea %A Demiray,Burcu %A Yordanova,Kristina %A Luo,Minxia %A Martin,Mike %+ Department of Management, Technology, and Economics, ETH Zurich, Weinbergstrasse 56/58, Zurich, 8092, Switzerland, 41 44 632 86 24, aferrario@ethz.ch %K aging %K dementia %K reminiscence %K real-life conversations %K electronically activated recorder (EAR) %K natural language processing %K machine learning %K imbalanced learning %D 2020 %7 15.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Reminiscence is the act of thinking or talking about personal experiences that occurred in the past. It is a central task of old age that is essential for healthy aging, and it serves multiple functions, such as decision-making and introspection, transmitting life lessons, and bonding with others. The study of social reminiscence behavior in everyday life can be used to generate data and detect reminiscence from general conversations. Objective: The aims of this original paper are to (1) preprocess coded transcripts of conversations in German of older adults with natural language processing (NLP), and (2) implement and evaluate learning strategies using different NLP features and machine learning algorithms to detect reminiscence in a corpus of transcripts. 
Methods: The methods in this study comprise (1) collecting and coding of transcripts of older adults’ conversations in German, (2) preprocessing transcripts to generate NLP features (bag-of-words models, part-of-speech tags, pretrained German word embeddings), and (3) training machine learning models to detect reminiscence using random forests, support vector machines, and adaptive and extreme gradient boosting algorithms. The data set comprises 2214 transcripts, including 109 transcripts with reminiscence. Due to class imbalance in the data, we introduced three learning strategies: (1) class-weighted learning, (2) a meta-classifier consisting of a voting ensemble, and (3) data augmentation with the Synthetic Minority Oversampling Technique (SMOTE) algorithm. For each learning strategy, we performed cross-validation on a random sample of the training data set of transcripts. We computed the area under the curve (AUC), the average precision (AP), precision, recall, as well as F1 score and specificity measures on the test data, for all combinations of NLP features, algorithms, and learning strategies. Results: Class-weighted support vector machines on bag-of-words features outperformed all other classifiers (AUC=0.91, AP=0.56, precision=0.5, recall=0.45, F1=0.48, specificity=0.98), followed by support vector machines on SMOTE-augmented data and word embeddings features (AUC=0.89, AP=0.54, precision=0.35, recall=0.59, F1=0.44, specificity=0.94). For the meta-classifier strategy, adaptive and extreme gradient boosting algorithms trained on word embeddings and bag-of-words outperformed all other classifiers and NLP features; however, the performance of the meta-classifier learning strategy was lower compared to other strategies, with highly imbalanced precision-recall trade-offs. 
Conclusions: This study provides evidence of the applicability of NLP and machine learning pipelines for the automated detection of reminiscence in older adults’ everyday conversations in German. The methods and findings of this study could be relevant for designing unobtrusive computer systems for the real-time detection of social reminiscence in the everyday life of older adults and classifying their functions. With further improvements, these systems could be deployed in health interventions aimed at improving older adults’ well-being by promoting self-reflection and suggesting coping strategies to be used in the case of dysfunctional reminiscence cases, which can undermine physical and mental health. %M 32866108 %R 10.2196/19133 %U http://www.jmir.org/2020/9/e19133/ %U https://doi.org/10.2196/19133 %U http://www.ncbi.nlm.nih.gov/pubmed/32866108 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e21573 %T An Innovative Artificial Intelligence–Based App for the Diagnosis of Gestational Diabetes Mellitus (GDM-AI): Development Study %A Shen,Jiayi %A Chen,Jiebin %A Zheng,Zequan %A Zheng,Jiabin %A Liu,Zherui %A Song,Jian %A Wong,Sum Yi %A Wang,Xiaoling %A Huang,Mengqi %A Fang,Po-Han %A Jiang,Bangsheng %A Tsang,Winghei %A He,Zonglin %A Liu,Taoran %A Akinwunmi,Babatunde %A Wang,Chi Chiu %A Zhang,Casper J P %A Huang,Jian %A Ming,Wai-Kit %+ Department of Public Health and Preventive Medicine, School of Medicine, Jinan University, Guangzhou, China, 86 14715485116, wkming@connect.hku.hk %K AI %K application %K disease diagnosis %K maternal health care %K artificial intelligence %K app %K women %K rural %K innovation %K diabetes %K gestational diabetes %K diagnosis %D 2020 %7 15.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Gestational diabetes mellitus (GDM) can cause adverse consequences to both mothers and their newborns. 
However, pregnant women living in low- and middle-income areas or countries often fail to receive early clinical interventions at local medical facilities due to restricted availability of GDM diagnosis. The outstanding performance of artificial intelligence (AI) in disease diagnosis in previous studies demonstrates its promising applications in GDM diagnosis. Objective: This study aims to investigate the implementation of a well-performing AI algorithm in GDM diagnosis in a setting that requires less medical equipment and fewer staff, and to establish an app based on the AI algorithm. This study also explores possible progress if our app is widely used. Methods: An AI model that included 9 algorithms was trained on data from 12,304 consenting pregnant outpatients who received a test for GDM in the obstetrics and gynecology department of the First Affiliated Hospital of Jinan University, a local hospital in South China, between November 2010 and October 2017. GDM was diagnosed according to American Diabetes Association (ADA) 2011 diagnostic criteria. Age and fasting blood glucose were chosen as critical parameters. For validation, we performed k-fold cross-validation (k=5) on the internal dataset and external validation on a dataset of 1655 cases from the Prince of Wales Hospital, the affiliated teaching hospital of the Chinese University of Hong Kong, a non-local hospital. Accuracy, sensitivity, and other criteria were calculated for each algorithm. Results: The areas under the receiver operating characteristic curve (AUROC) on the external validation dataset for support vector machine (SVM), random forest, AdaBoost, k-nearest neighbors (kNN), naive Bayes (NB), decision tree, logistic regression (LR), eXtreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT) were 0.780, 0.657, 0.736, 0.669, 0.774, 0.614, 0.769, 0.742, and 0.757, respectively. SVM also retained high performance in other criteria. 
The specificity for SVM remained at 100% in the external validation set with an accuracy of 88.7%. Conclusions: Our prospective and multicenter study is the first clinical study that supports the GDM diagnosis for pregnant women in resource-limited areas, using only fasting blood glucose value, patients’ age, and a smartphone connected to the internet. Our study proved that SVM can achieve accurate diagnosis with less operation cost and higher efficacy. Our study (referred to as GDM-AI study, ie, the study of AI-based diagnosis of GDM) also shows our app has a promising future in improving the quality of maternal health for pregnant women, precision medicine, and long-distance medical care. We recommend that future work expand the dataset scope and replicate the process to validate the performance of the AI algorithms. %M 32930674 %R 10.2196/21573 %U https://www.jmir.org/2020/9/e21573 %U https://doi.org/10.2196/21573 %U http://www.ncbi.nlm.nih.gov/pubmed/32930674 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e20701 %T Artificial Intelligence-Based Conversational Agents for Chronic Conditions: Systematic Literature Review %A Schachner,Theresa %A Keller,Roman %A v Wangenheim,Florian %+ Department of Management, Technology, and Economics, ETH Zurich, WEV G 228, Weinbergstr 56/58, Zurich, Switzerland, 41 446325209, tschachner@ethz.ch %K artificial intelligence %K conversational agents %K chatbots %K healthcare %K chronic diseases %K systematic literature review %D 2020 %7 14.9.2020 %9 Review %J J Med Internet Res %G English %X Background: A rising number of conversational agents or chatbots are equipped with artificial intelligence (AI) architecture. They are increasingly prevalent in health care applications such as those providing education and support to patients with chronic diseases, one of the leading causes of death in the 21st century. AI-based chatbots enable more effective and frequent interactions with such patients. 
Objective: The goal of this systematic literature review is to review the characteristics, health care conditions, and AI architectures of AI-based conversational agents designed specifically for chronic diseases. Methods: We conducted a systematic literature review using PubMed MEDLINE, EMBASE, PsycInfo, CINAHL, ACM Digital Library, ScienceDirect, and Web of Science. We applied a predefined search strategy using the terms “conversational agent,” “healthcare,” “artificial intelligence,” and their synonyms. We updated the search results using Google alerts, and screened reference lists for other relevant articles. We included primary research studies that involved the prevention, treatment, or rehabilitation of chronic diseases, involved a conversational agent, and included any kind of AI architecture. Two independent reviewers conducted screening and data extraction, and Cohen kappa was used to measure interrater agreement. A narrative approach was applied for data synthesis. Results: The literature search found 2052 articles, out of which 10 papers met the inclusion criteria. The small number of identified studies together with the prevalence of quasi-experimental studies (n=7) and prevailing prototype nature of the chatbots (n=7) revealed the immaturity of the field. The reported chatbots addressed a broad variety of chronic diseases (n=6), showcasing a tendency to develop specialized conversational agents for individual chronic conditions. However, comparison of these chatbots within and between chronic diseases is lacking. In addition, the reported evaluation measures were not standardized, and the addressed health goals showed a large range. Together, these study characteristics complicated comparability and opened room for future research. 
While natural language processing represented the most used AI technique (n=7) and the majority of conversational agents allowed for multimodal interaction (n=6), the identified studies demonstrated broad heterogeneity, lack of depth of reported AI techniques and systems, and inconsistent usage of taxonomy of the underlying AI software, further aggravating comparability and generalizability of study results. Conclusions: The literature on AI-based conversational agents for chronic conditions is scarce and mostly consists of quasi-experimental studies with chatbots in prototype stage that use natural language processing and allow for multimodal user interaction. Future research could profit from evidence-based evaluation of the AI-based conversational agents and comparison thereof within and between different chronic health conditions. Besides increased comparability, the quality of chatbots developed for specific chronic conditions and their subsequent impact on the target patients could be enhanced by more structured development and standardized evaluation processes. 
%M 32924957 %R 10.2196/20701 %U http://www.jmir.org/2020/9/e20701/ %U https://doi.org/10.2196/20701 %U http://www.ncbi.nlm.nih.gov/pubmed/32924957 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 9 %P e18091 %T Artificial Intelligence and Its Effect on Dermatologists’ Accuracy in Dermoscopic Melanoma Image Classification: Web-Based Survey Study %A Maron,Roman C %A Utikal,Jochen S %A Hekler,Achim %A Hauschild,Axel %A Sattler,Elke %A Sondermann,Wiebke %A Haferkamp,Sebastian %A Schilling,Bastian %A Heppt,Markus V %A Jansen,Philipp %A Reinholz,Markus %A Franklin,Cindy %A Schmitt,Laurenz %A Hartmann,Daniela %A Krieghoff-Henning,Eva %A Schmitt,Max %A Weichenthal,Michael %A von Kalle,Christof %A Fröhling,Stefan %A Brinker,Titus J %+ Digital Biomarkers for Oncology Group (DBO), National Center for Tumor Diseases (NCT), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, Heidelberg, 69120, Germany, 49 62213219304, titus.brinker@nct-heidelberg.de %K artificial intelligence %K machine learning %K deep learning %K neural network %K dermatology %K diagnosis %K nevi %K melanoma %K skin neoplasm %D 2020 %7 11.9.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Early detection of melanoma can be lifesaving but this remains a challenge. Recent diagnostic studies have revealed the superiority of artificial intelligence (AI) in classifying dermoscopic images of melanoma and nevi, concluding that these algorithms should assist a dermatologist’s diagnoses. Objective: The aim of this study was to investigate whether AI support improves the accuracy and overall diagnostic performance of dermatologists in the dichotomous image–based discrimination between melanoma and nevus. 
Methods: Twelve board-certified dermatologists were presented disjoint sets of 100 unique dermoscopic images of melanomas and nevi (total of 1200 unique images), and they had to classify the images based on personal experience alone (part I) and with the support of a trained convolutional neural network (CNN, part II). Additionally, dermatologists were asked to rate their confidence in their final decision for each image. Results: While the mean specificity of the dermatologists based on personal experience alone remained almost unchanged (70.6% vs 72.4%; P=.54) with AI support, the mean sensitivity and mean accuracy increased significantly (59.4% vs 74.6%; P=.003 and 65.0% vs 73.6%; P=.002, respectively) with AI support. Out of the 10% (10/94; 95% CI 8.4%-11.8%) of cases where dermatologists were correct and AI was incorrect, dermatologists on average changed to the incorrect answer for 39% (4/10; 95% CI 23.2%-55.6%) of cases. When dermatologists were incorrect and AI was correct (25/94, 27%; 95% CI 24.0%-30.1%), dermatologists changed their answers to the correct answer for 46% (11/25; 95% CI 33.1%-58.4%) of cases. Additionally, the dermatologists’ average confidence in their decisions increased when the CNN confirmed their decision and decreased when the CNN disagreed, even when the dermatologists were correct. Reported values are based on the mean of all participants. Whenever absolute values are shown, the denominator and numerator are approximations as every dermatologist ended up rating a varying number of images due to a quality control step. Conclusions: The findings of our study show that AI support can improve the overall accuracy of the dermatologists in the dichotomous image–based discrimination between melanoma and nevus. 
This supports the argument for AI-based tools to aid clinicians in skin lesion classification and provides a rationale for studies of such classifiers in real-life settings, wherein clinicians can integrate additional information such as patient age and medical history into their decisions. %M 32915161 %R 10.2196/18091 %U https://www.jmir.org/2020/9/e18091 %U https://doi.org/10.2196/18091 %U http://www.ncbi.nlm.nih.gov/pubmed/32915161 %0 Journal Article %@ 2561-7605 %I JMIR Publications %V 3 %N 2 %P e19554 %T Artificial Intelligence–Powered Digital Health Platform and Wearable Devices Improve Outcomes for Older Adults in Assisted Living Communities: Pilot Intervention Study %A Wilmink,Gerald %A Dupey,Katherine %A Alkire,Schon %A Grote,Jeffrey %A Zobel,Gregory %A Fillit,Howard M %A Movva,Satish %+ CarePredict, 324 South University Drive, Plantation, FL, 33324, United States, 1 6153644985, jerry.wilmink@gmail.com %K health technology %K artificial intelligence %K AI %K preventive %K senior technology %K assisted living %K long-term services %K long-term care providers %D 2020 %7 10.9.2020 %9 Original Paper %J JMIR Aging %G English %X Background: Wearables and artificial intelligence (AI)–powered digital health platforms that utilize machine learning algorithms can autonomously measure a senior’s change in activity and behavior and may be useful tools for proactive interventions that target modifiable risk factors. Objective: The goal of this study was to analyze how a wearable device and AI-powered digital health platform could provide improved health outcomes for older adults in assisted living communities. Methods: Data from 490 residents from six assisted living communities were analyzed retrospectively over 24 months. The intervention group (+CP) consisted of 3 communities that utilized CarePredict (n=256), and the control group (–CP) consisted of 3 communities (n=234) that did not utilize CarePredict. 
The following outcomes were measured and compared to baseline: hospitalization rate, fall rate, length of stay (LOS), and staff response time. Results: The residents of the +CP and –CP communities exhibited no statistically significant difference in age (P=.64), sex (P=.63), or staff service hours per resident (P=.94). The data show that the +CP communities exhibited a 39% lower hospitalization rate (P=.02), a 69% lower fall rate (P=.01), and a 67% greater length of stay (P=.03) than the –CP communities. The staff alert acknowledgment and reach resident times also improved in the +CP communities by 37% (P=.02) and 40% (P=.02), respectively. Conclusions: The AI-powered digital health platform provides the community staff with actionable information regarding each resident’s activities and behavior, which can be used to identify older adults who are at an increased risk for a health decline. Staff can use these data to intervene much earlier, protecting seniors from conditions that, if left untreated, could result in hospitalization. In summary, the use of wearables and an AI-powered digital health platform can contribute to improved health outcomes for seniors in assisted living communities. The accuracy of the system will be further validated in a larger trial. 
%M 32723711 %R 10.2196/19554 %U http://aging.jmir.org/2020/2/e19554/ %U https://doi.org/10.2196/19554 %U http://www.ncbi.nlm.nih.gov/pubmed/32723711 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 9 %P e18142 %T Neural Network–Based Algorithm for Adjusting Activity Targets to Sustain Exercise Engagement Among People Using Activity Trackers: Retrospective Observation and Algorithm Development Study %A Mohammadi,Ramin %A Atif,Mursal %A Centi,Amanda Jayne %A Agboola,Stephen %A Jethwani,Kamal %A Kvedar,Joseph %A Kamarthi,Sagar %+ Northeastern University, 360 Huntington Ave, Boston, MA, 02115, United States, 1 6173733070, sagar@coe.neu.edu %K activity tracker %K exercise engagement %K dynamic activity target %K neural network %K activity target prediction %K machine learning %D 2020 %7 8.9.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: It is well established that lack of physical activity is detrimental to the overall health of an individual. Modern-day activity trackers enable individuals to monitor their daily activities to meet and maintain targets. This is expected to promote activity encouraging behavior, but the benefits of activity trackers attenuate over time due to waning adherence. One of the key approaches to improving adherence to goals is to motivate individuals to improve on their historic performance metrics. Objective: The aim of this work was to build a machine learning model to predict an achievable weekly activity target by considering (1) patterns in the user’s activity tracker data in the previous week and (2) behavior and environment characteristics. By setting realistic goals, ones that are neither too easy nor too difficult to achieve, activity tracker users can be encouraged to continue to meet these goals, and at the same time, to find utility in their activity tracker. Methods: We built a neural network model that prescribes a weekly activity target for an individual that can be realistically achieved. 
The inputs to the model were user-specific personal, social, and environmental factors, daily step count from the previous 7 days, and an entropy measure that characterized the pattern of daily step count. Data for training and evaluating the machine learning model were collected over a duration of 9 weeks. Results: Of 30 individuals who were enrolled, data from 20 participants were used. The model predicted target daily count with a mean absolute error of 1545 (95% CI 1383-1706) steps for an 8-week period. Conclusions: Artificial intelligence applied to physical activity data combined with behavioral data can be used to set personalized goals in accordance with the individual’s level of activity and thereby improve adherence to a fitness tracker; this could be used to increase engagement with activity trackers. A follow-up prospective study is ongoing to determine the performance of the engagement algorithm. %M 32897235 %R 10.2196/18142 %U https://mhealth.jmir.org/2020/9/e18142 %U https://doi.org/10.2196/18142 %U http://www.ncbi.nlm.nih.gov/pubmed/32897235 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 9 %P e18930 %T Human- Versus Machine Learning–Based Triage Using Digitalized Patient Histories in Primary Care: Comparative Study %A Entezarjou,Artin %A Bonamy,Anna-Karin Edstedt %A Benjaminsson,Simon %A Herman,Pawel %A Midlöv,Patrik %+ Center for Primary Health Care Research, Department of Clinical Sciences in Malmö/Family Medicine, Lund University, Box 50332, Malmö, 202 13, Sweden, 46 40391400, artin.entezarjou@med.lu.se %K machine learning %K artificial intelligence %K decision support %K primary care %K triage %D 2020 %7 3.9.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Smartphones have made it possible for patients to digitally report symptoms before physical primary care visits. Using machine learning (ML), these data offer an opportunity to support decisions about the appropriate level of care (triage). 
Objective: The purpose of this study was to explore the interrater reliability between human physicians and an automated ML-based triage method. Methods: After testing several models, a naïve Bayes triage model was created using data from digital medical histories, capable of classifying digital medical history reports as either in need of urgent physical examination or not in need of urgent physical examination. The model was tested on 300 digital medical history reports and classification was compared with the majority vote of an expert panel of 5 primary care physicians (PCPs). Reliability between raters was measured using both Cohen κ (adjusted for chance agreement) and percentage agreement (not adjusted for chance agreement). Results: Interrater reliability as measured by Cohen κ was 0.17 when comparing the majority vote of the reference group with the model. Agreement was 74% (138/186) for cases judged not in need of urgent physical examination and 42% (38/90) for cases judged to be in need of urgent physical examination. No specific features linked to the model’s triage decision could be identified. Between physicians within the panel, Cohen κ was 0.2. Intrarater reliability when 1 physician retriaged 50 reports resulted in Cohen κ of 0.55. Conclusions: Low interrater and intrarater agreement in triage decisions among PCPs limits the possibility to use human decisions as a reference for ML to automate triage in primary care. 
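Note: The two reliability measures named in the Methods above can be sketched as follows. This is a minimal illustration only, not the authors' code; the function names and the toy labels are hypothetical, and the example assumes two raters (for instance, a panel majority vote and a model) labeling the same reports.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters chose the same label (not chance adjusted)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen kappa: observed agreement corrected for agreement expected by chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters independently pick the same label.
    p_expected = sum(counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters always use one identical label")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels, invented for illustration (not data from the study).
panel = ["urgent", "urgent", "not", "not", "not", "not", "urgent", "not", "not", "not"]
model = ["urgent", "not", "not", "not", "urgent", "not", "not", "not", "not", "urgent"]

print(f"percentage agreement: {percent_agreement(panel, model):.2f}")
print(f"Cohen kappa: {cohen_kappa(panel, model):.3f}")
```

In this toy data the raters agree on 60% of reports, yet κ is close to 0.05, because most of that agreement is what two raters who each label 70% of reports as not urgent would reach by chance.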
%M 32880578 %R 10.2196/18930 %U https://medinform.jmir.org/2020/9/e18930 %U https://doi.org/10.2196/18930 %U http://www.ncbi.nlm.nih.gov/pubmed/32880578 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 7 %N 9 %P e19348 %T Utilizing Machine Learning on Internet Search Activity to Support the Diagnostic Process and Relapse Detection in Young Individuals With Early Psychosis: Feasibility Study %A Birnbaum,Michael Leo %A Kulkarni,Prathamesh "Param" %A Van Meter,Anna %A Chen,Victor %A Rizvi,Asra F %A Arenare,Elizabeth %A De Choudhury,Munmun %A Kane,John M %+ The Zucker Hillside Hospital, Northwell Health, 75-59 263rd Street, Glen Oaks, NY, 11004, United States, 1 718 470 8305, Mbirnbaum@northwell.edu %K schizophrenia spectrum disorders %K internet search activity %K Google %K diagnostic prediction %K relapse prediction %K machine learning %K digital data %K digital phenotyping %K digital biomarkers %D 2020 %7 1.9.2020 %9 Original Paper %J JMIR Ment Health %G English %X Background: Psychiatry is nearly entirely reliant on patient self-reporting, and there are few objective and reliable tests or sources of collateral information available to help diagnostic and assessment procedures. Technology offers opportunities to collect objective digital data to complement patient experience and facilitate more informed treatment decisions. Objective: We aimed to develop computational algorithms based on internet search activity designed to support diagnostic procedures and relapse identification in individuals with schizophrenia spectrum disorders. Methods: We extracted 32,733 time-stamped search queries across 42 participants with schizophrenia spectrum disorders and 74 healthy volunteers between the ages of 15 and 35 (mean 24.4 years, 44.0% male), and built machine-learning diagnostic and relapse classifiers utilizing the timing, frequency, and content of online search activity. 
Results: Classifiers predicted a diagnosis of schizophrenia spectrum disorders with an area under the curve value of 0.74 and predicted a psychotic relapse in individuals with schizophrenia spectrum disorders with an area under the curve of 0.71. Compared with healthy participants, those with schizophrenia spectrum disorders made fewer searches and their searches consisted of fewer words. Prior to a relapse hospitalization, participants with schizophrenia spectrum disorders were more likely to use words related to hearing, perception, and anger, and were less likely to use words related to health. Conclusions: Online search activity holds promise for gathering objective and easily accessed indicators of psychiatric symptoms. Utilizing search activity as collateral behavioral health information would represent a major advancement in efforts to capitalize on objective digital data to improve mental health monitoring. %M 32870161 %R 10.2196/19348 %U https://mental.jmir.org/2020/9/e19348 %U https://doi.org/10.2196/19348 %U http://www.ncbi.nlm.nih.gov/pubmed/32870161 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 8 %P e21056 %T Impact of a Commercial Artificial Intelligence–Driven Patient Self-Assessment Solution on Waiting Times at General Internal Medicine Outpatient Departments: Retrospective Study %A Harada,Yukinori %A Shimizu,Taro %+ Department of Diagnostic and Generalist Medicine, Dokkyo Medical University, Kitakobayashi 880, Mibu, 321-0293, Japan, 81 282861111, shimizutaro7@gmail.com %K artificial intelligence %K automated medical history taking system %K eHealth %K interrupted time-series analysis %K waiting time %D 2020 %7 31.8.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Patient waiting time at outpatient departments is directly related to patient satisfaction and quality of care, particularly in patients visiting the general internal medicine outpatient departments for the first time. 
Moreover, reducing wait time from arrival in the clinic to the initiation of an examination is key to reducing patients’ anxiety. The use of automated medical history–taking systems in general internal medicine outpatient departments is a promising strategy to reduce waiting times. Recently, Ubie Inc in Japan developed AI Monshin, an artificial intelligence–based, automated medical history–taking system for general internal medicine outpatient departments. Objective: We hypothesized that replacing the use of handwritten self-administered questionnaires with the use of AI Monshin would reduce waiting times in general internal medicine outpatient departments. Therefore, we conducted this study to examine whether the use of AI Monshin reduced patient waiting times. Methods: We retrospectively analyzed the waiting times of patients visiting the general internal medicine outpatient department at a Japanese community hospital without an appointment from April 2017 to April 2020. AI Monshin was implemented in April 2019. We compared the median waiting time before and after implementation by conducting an interrupted time-series analysis of the median waiting time per month. We also conducted supplementary analyses to explain the main results. Results: We analyzed 21,615 visits. The median waiting time after AI Monshin implementation (74.4 minutes, IQR 57.1) was not significantly different from that before AI Monshin implementation (74.3 minutes, IQR 63.7) (P=.12). In the interrupted time-series analysis, the underlying linear time trend (–0.4 minutes per month; P=.06; 95% CI –0.9 to 0.02), level change (40.6 minutes; P=.09; 95% CI –5.8 to 87.0), and slope change (–1.1 minutes per month; P=.16; 95% CI –2.7 to 0.4) were not statistically significant. 
In a supplemental analysis of data from 9054 of 21,615 visits (41.9%), the median examination time after AI Monshin implementation (6.0 minutes, IQR 5.2) was slightly but significantly longer than that before AI Monshin implementation (5.7 minutes, IQR 5.0) (P=.003). Conclusions: The implementation of an artificial intelligence–based, automated medical history–taking system did not reduce waiting time for patients visiting the general internal medicine outpatient department without an appointment, and there was a slight increase in the examination time after implementation; however, the system may have enhanced the quality of care by supporting the optimization of staff assignments. %M 32865504 %R 10.2196/21056 %U http://medinform.jmir.org/2020/8/e21056/ %U https://doi.org/10.2196/21056 %U http://www.ncbi.nlm.nih.gov/pubmed/32865504 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 8 %P e19962 %T Predicting Early Warning Signs of Psychotic Relapse From Passive Sensing Data: An Approach Using Encoder-Decoder Neural Networks %A Adler,Daniel A %A Ben-Zeev,Dror %A Tseng,Vincent W-S %A Kane,John M %A Brian,Rachel %A Campbell,Andrew T %A Hauser,Marta %A Scherer,Emily A %A Choudhury,Tanzeem %+ Cornell Tech, 2 W Loop Rd, New York, NY, 10044, United States, 1 2155953769, daa243@cornell.edu %K psychotic disorders %K schizophrenia %K mHealth %K mental health %K mobile health %K smartphone applications %K machine learning %K passive sensing %K digital biomarkers %K digital phenotyping %K artificial intelligence %K deep learning %K mobile phone %D 2020 %7 31.8.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Schizophrenia spectrum disorders (SSDs) are chronic conditions, but the severity of symptomatic experiences and functional impairments vacillate over the course of illness. 
Developing unobtrusive remote monitoring systems to detect early warning signs of impending symptomatic relapses would allow clinicians to intervene before the patient’s condition worsens. Objective: In this study, we aim to create the first models, exclusively using passive sensing data from a smartphone, to predict behavioral anomalies that could indicate early warning signs of a psychotic relapse. Methods: Data used to train and test the models were collected during the CrossCheck study. Hourly features derived from smartphone passive sensing data were extracted from 60 patients with SSDs (42 nonrelapse and 18 relapse >1 time throughout the study) and used to train models and test performance. We trained 2 types of encoder-decoder neural network models and a clustering-based local outlier factor model to predict behavioral anomalies that occurred within the 30-day period before a participant's date of relapse (the near relapse period). Models were trained to recreate participant behavior on days of relative health (DRH, outside of the near relapse period), following which a threshold to the recreation error was applied to predict anomalies. The neural network model architecture and the percentage of relapse participant data used to train all models were varied. Results: A total of 20,137 days of collected data were analyzed, with 726 days of data (3.6%) within any 30-day near relapse period. The best performing model used a fully connected neural network autoencoder architecture and achieved a median sensitivity of 0.25 (IQR 0.15-1.00) and specificity of 0.88 (IQR 0.14-0.96; a median 108% increase in behavioral anomalies near relapse). We conducted a post hoc analysis using the best performing model to identify behavioral features that had a medium-to-large effect (Cohen d>0.5) in distinguishing anomalies near relapse from DRH among 4 participants who relapsed multiple times throughout the study. 
Qualitative validation using clinical notes collected during the original CrossCheck study showed that the identified features from our analysis were presented to clinicians during relapse events. Conclusions: Our proposed method predicted a higher rate of anomalies in patients with SSDs within the 30-day near relapse period and can be used to uncover individual-level behaviors that change before relapse. This approach will enable technologists and clinicians to build unobtrusive digital mental health tools that can predict incipient relapse in SSDs. %M 32865506 %R 10.2196/19962 %U https://mhealth.jmir.org/2020/8/e19962 %U https://doi.org/10.2196/19962 %U http://www.ncbi.nlm.nih.gov/pubmed/32865506 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 8 %P e19870 %T Using Dual Neural Network Architecture to Detect the Risk of Dementia With Community Health Data: Algorithm Development and Validation Study %A Shen,Xiao %A Wang,Guanjin %A Kwan,Rick Yiu-Cho %A Choi,Kup-Sze %+ Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, 852 3400 3214, hskschoi@polyu.edu.hk %K cognitive screening %K dementia risk %K dual neural network %K predictive models %K primary care %D 2020 %7 31.8.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Recent studies have revealed lifestyle behavioral risk factors that can be modified to reduce the risk of dementia. As modification of lifestyle takes time, early identification of people with high dementia risk is important for timely intervention and support. As cognitive impairment is a diagnostic criterion of dementia, cognitive assessment tools are used in primary care to screen for clinically unevaluated cases. Among them, Mini-Mental State Examination (MMSE) is a very common instrument. However, MMSE is a questionnaire that is administered when symptoms of memory decline have occurred. 
Early administration at the asymptomatic stage and repeated measurements would lead to a practice effect that degrades the effectiveness of MMSE when it is used at later stages. Objective: The aim of this study was to exploit machine learning techniques to assist health care professionals in detecting high-risk individuals by predicting the results of MMSE using elderly health data collected from community-based primary care services. Methods: A health data set of 2299 samples was adopted in the study. The input data were divided into two groups of different characteristics (ie, client profile data and health assessment data). The predictive output was the result of two-class classification of the normal and high-risk cases that were defined based on MMSE. A dual neural network (DNN) model was proposed to obtain the latent representations of the two groups of input data separately, which were then concatenated for the two-class classification. Mean and k-nearest neighbor were used separately to tackle missing data, whereas a cost-sensitive learning (CSL) algorithm was proposed to deal with class imbalance. The performance of the DNN was evaluated by comparing it with that of conventional machine learning methods. Results: A total of 16 predictive models were built using the elderly health data set. Among them, the proposed DNN with CSL outperformed in the detection of high-risk cases. The area under the receiver operating characteristic curve, average precision, sensitivity, and specificity reached 0.84, 0.88, 0.73, and 0.80, respectively. Conclusions: The proposed method has the potential to serve as a tool to screen for elderly people with cognitive impairment and predict high-risk cases of dementia at the asymptomatic stage, providing health care professionals with early signals that can prompt suggestions for a follow-up or a detailed diagnosis. 
%M 32865498 %R 10.2196/19870 %U https://medinform.jmir.org/2020/8/e19870 %U https://doi.org/10.2196/19870 %U http://www.ncbi.nlm.nih.gov/pubmed/32865498 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e19918 %T Is Artificial Intelligence Better Than Human Clinicians in Predicting Patient Outcomes? %A Lee,Joon %+ Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, 3280 Hospital Dr NW, TRW 5E17, Calgary, AB, T2N 4Z6, Canada, 1 403 220 2968, joonwu.lee@ucalgary.ca %K patient outcome prediction %K artificial intelligence %K machine learning %K human-generated predictions %K human-AI symbiosis %D 2020 %7 26.8.2020 %9 Viewpoint %J J Med Internet Res %G English %X In contrast with medical imaging diagnostics powered by artificial intelligence (AI), in which deep learning has led to breakthroughs in recent years, patient outcome prediction poses an inherently challenging problem because it focuses on events that have not yet occurred. Interestingly, the performance of machine learning–based patient outcome prediction models has rarely been compared with that of human clinicians in the literature. Human intuition and insight may be sources of underused predictive information that AI will not be able to identify in electronic data. Both human and AI predictions should be investigated together with the aim of achieving a human-AI symbiosis that synergistically and complementarily combines AI with the predictive abilities of clinicians. 
%M 32845249 %R 10.2196/19918 %U http://www.jmir.org/2020/8/e19918/ %U https://doi.org/10.2196/19918 %U http://www.ncbi.nlm.nih.gov/pubmed/32845249 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 3 %P e20794 %T Big Data, Natural Language Processing, and Deep Learning to Detect and Characterize Illicit COVID-19 Product Sales: Infoveillance Study on Twitter and Instagram %A Mackey,Tim Ken %A Li,Jiawei %A Purushothaman,Vidya %A Nali,Matthew %A Shah,Neal %A Bardier,Cortni %A Cai,Mingxiang %A Liang,Bryan %+ Department of Anesthesiology and Division of Infectious Diseases and Global Public Health, School of Medicine, University of California, San Diego, 8950 Villa La Jolla Drive, A124, La Jolla, CA, 92037, United States, 1 951 491 4161, tmackey@ucsd.edu %K COVID-19 %K coronavirus %K infectious disease %K social media %K surveillance %K infoveillance %K infodemiology %K infodemic %K fraud %K cybercrime %D 2020 %7 25.8.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: The coronavirus disease (COVID-19) pandemic is perhaps the greatest global health challenge of the last century. Accompanying this pandemic is a parallel “infodemic,” including the online marketing and sale of unapproved, illegal, and counterfeit COVID-19 health products including testing kits, treatments, and other questionable “cures.” Enabling the proliferation of this content is the growing ubiquity of internet-based technologies, including popular social media platforms that now have billions of global users. Objective: This study aims to collect, analyze, identify, and enable reporting of suspected fake, counterfeit, and unapproved COVID-19–related health care products from Twitter and Instagram. 
Methods: This study is conducted in two phases beginning with the collection of COVID-19–related Twitter and Instagram posts using a combination of web scraping on Instagram and filtering the public streaming Twitter application programming interface for keywords associated with suspect marketing and sale of COVID-19 products. The second phase involved data analysis using natural language processing (NLP) and deep learning to identify potential sellers that were then manually annotated for characteristics of interest. We also visualized illegal selling posts on a customized data dashboard to enable public health intelligence. Results: We collected a total of 6,029,323 tweets and 204,597 Instagram posts filtered for terms associated with suspect marketing and sale of COVID-19 health products from March to April for Twitter and February to May for Instagram. After applying our NLP and deep learning approaches, we identified 1271 tweets and 596 Instagram posts associated with questionable sales of COVID-19–related products. Generally, product introduction came in two waves, with the first consisting of questionable immunity-boosting treatments and a second involving suspect testing kits. We also detected a low volume of pharmaceuticals that have not been approved for COVID-19 treatment. Other major themes detected included products offered in different languages, various claims of product credibility, completely unsubstantiated products, unapproved testing modalities, and different payment and seller contact methods. Conclusions: Results from this study provide initial insight into one front of the “infodemic” fight against COVID-19 by characterizing what types of health products, selling claims, and types of sellers were active on two popular social media platforms at earlier stages of the pandemic. This cybercrime challenge is likely to continue as the pandemic progresses and more people seek access to COVID-19 testing and treatment. 
This data intelligence can help public health agencies, regulatory authorities, legitimate manufacturers, and technology platforms better remove and prevent this content from harming the public. %M 32750006 %R 10.2196/20794 %U http://publichealth.jmir.org/2020/3/e20794/ %U https://doi.org/10.2196/20794 %U http://www.ncbi.nlm.nih.gov/pubmed/32750006 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 8 %P e18189 %T Artificial Intelligence for Caregivers of Persons With Alzheimer’s Disease and Related Dementias: Systematic Literature Review %A Xie,Bo %A Tao,Cui %A Li,Juan %A Hilsabeck,Robin C %A Aguirre,Alyssa %+ School of Nursing, The University of Texas at Austin, 1710 Red River, Austin, TX, 78712, United States, 1 512 232 5788, boxie@utexas.edu %K Alzheimer disease %K dementia %K caregiving %K technology %K artificial intelligence %D 2020 %7 20.8.2020 %9 Review %J JMIR Med Inform %G English %X Background: Artificial intelligence (AI) has great potential for improving the care of persons with Alzheimer’s disease and related dementias (ADRD) and the quality of life of their family caregivers. To date, however, systematic review of the literature on the impact of AI on ADRD management has been lacking. Objective: This paper aims to (1) identify and examine literature on AI that provides information to facilitate ADRD management by caregivers of individuals diagnosed with ADRD and (2) identify gaps in the literature that suggest future directions for research. Methods: Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines for conducting systematic literature reviews, during August and September 2019, we performed 3 rounds of selection. First, we searched predetermined keywords in PubMed, Cumulative Index to Nursing and Allied Health Literature Plus with Full Text, PsycINFO, IEEE Xplore Digital Library, and the ACM Digital Library. This step generated 113 nonduplicate results. 
Next, we screened the titles and abstracts of the 113 papers according to inclusion and exclusion criteria, after which 52 papers were excluded and 61 remained. Finally, we screened the full text of the remaining papers to ensure that they met the inclusion or exclusion criteria; 31 papers were excluded, leaving a final sample of 30 papers for analysis. Results: Of the 30 papers, 20 reported studies that focused on using AI to assist in activities of daily living. A limited number of specific daily activities were targeted. The studies’ aims suggested three major purposes: (1) to test the feasibility, usability, or perceptions of prototype AI technology; (2) to generate preliminary data on the technology’s performance (primarily accuracy in detecting target events, such as falls); and (3) to understand user needs and preferences for the design and functionality of to-be-developed technology. The majority of the studies were qualitative, with interviews, focus groups, and observation being their most common methods. Cross-sectional surveys were also common, but with small convenience samples. Sample sizes ranged from 6 to 106, with the vast majority on the low end. The majority of the studies were descriptive, exploratory, and lacking theoretical guidance. Many studies reported positive outcomes in favor of their AI technology’s feasibility and satisfaction; some studies reported mixed results on these measures. Performance of the technology varied widely across tasks. Conclusions: These findings call for more systematic designs and evaluations of the feasibility and efficacy of AI-based interventions for caregivers of people with ADRD. These gaps in the research would be best addressed through interdisciplinary collaboration, incorporating complementary expertise from the health sciences and computer science/engineering–related fields. 
%M 32663146 %R 10.2196/18189 %U http://medinform.jmir.org/2020/8/e18189/ %U https://doi.org/10.2196/18189 %U http://www.ncbi.nlm.nih.gov/pubmed/32663146 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e22590 %T Social Network Analysis of COVID-19 Sentiments: Application of Artificial Intelligence %A Hung,Man %A Lauren,Evelyn %A Hon,Eric S %A Birmingham,Wendy C %A Xu,Julie %A Su,Sharon %A Hon,Shirley D %A Park,Jungweon %A Dang,Peter %A Lipsky,Martin S %+ College of Dental Medicine, Roseman University of Health Sciences, 10894 South River Front Parkway, South Jordan, UT, 84095-3538, United States, 1 801 878 1270, mhung@roseman.edu %K COVID-19 %K coronavirus %K sentiment %K social network %K Twitter %K infodemiology %K infodemic %K pandemic %K crisis %K public health %K business economy %K artificial intelligence %D 2020 %7 18.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The coronavirus disease (COVID-19) pandemic led to substantial public discussion. Understanding these discussions can help institutions, governments, and individuals navigate the pandemic. Objective: The aim of this study is to analyze discussions on Twitter related to COVID-19 and to investigate the sentiments toward COVID-19. Methods: This study applied machine learning methods in the field of artificial intelligence to analyze data collected from Twitter. Using tweets originating exclusively in the United States and written in English during the 1-month period from March 20 to April 19, 2020, the study examined COVID-19–related discussions. Social network and sentiment analyses were also conducted to determine the social network of dominant topics and whether the tweets expressed positive, neutral, or negative sentiments. Geographic analysis of the tweets was also conducted. Results: There were a total of 14,180,603 likes, 863,411 replies, 3,087,812 retweets, and 641,381 mentions in tweets during the study timeframe. 
Out of 902,138 tweets analyzed, sentiment analysis classified 434,254 (48.2%) tweets as having a positive sentiment, 187,042 (20.7%) as neutral, and 280,842 (31.1%) as negative. The study identified 5 dominant themes among COVID-19–related tweets: health care environment, emotional support, business economy, social change, and psychological stress. Alaska, Wyoming, New Mexico, Pennsylvania, and Florida were the states expressing the most negative sentiment while Vermont, North Dakota, Utah, Colorado, Tennessee, and North Carolina conveyed the most positive sentiment. Conclusions: This study identified 5 prevalent themes of COVID-19 discussion with sentiments ranging from positive to negative. These themes and sentiments can clarify the public’s response to COVID-19 and help officials navigate the pandemic. %M 32750001 %R 10.2196/22590 %U http://www.jmir.org/2020/8/e22590/ %U https://doi.org/10.2196/22590 %U http://www.ncbi.nlm.nih.gov/pubmed/32750001 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e20007 %T Artificial Intelligence for Rapid Meta-Analysis: Case Study on Ocular Toxicity of Hydroxychloroquine %A Michelson,Matthew %A Chow,Tiffany %A Martin,Neil A %A Ross,Mike %A Tee Qiao Ying,Amelia %A Minton,Steven %+ Evid Science, 2361 Rosencrans Ave Ste 348, El Segundo, CA, 90245-4929, United States, 1 626 765 1903, mmichelson@evidscience.com %K meta-analysis %K rapid meta-analysis %K artificial intelligence %K drug %K analysis %K hydroxychloroquine %K toxic %K COVID-19 %K treatment %K side effect %K ocular %K eye %D 2020 %7 17.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Rapid access to evidence is crucial in times of an evolving clinical crisis. To that end, we propose a novel approach to answer clinical queries, termed rapid meta-analysis (RMA). 
Unlike traditional meta-analysis, RMA balances a quick time to production with reasonable data quality assurances, leveraging artificial intelligence (AI) to strike this balance. Objective: We aimed to evaluate whether RMA can generate meaningful clinical insights, but crucially, in a much faster processing time than traditional meta-analysis, using a relevant, real-world example. Methods: The development of our RMA approach was motivated by a currently relevant clinical question: is ocular toxicity and vision compromise a side effect of hydroxychloroquine therapy? At the time of designing this study, hydroxychloroquine was a leading candidate in the treatment of coronavirus disease (COVID-19). We then leveraged AI to pull and screen articles, automatically extract their results, review the studies, and analyze the data with standard statistical methods. Results: By combining AI with human analysis in our RMA, we generated a meaningful, clinical result in less than 30 minutes. The RMA identified 11 studies considering ocular toxicity as a side effect of hydroxychloroquine and estimated the incidence to be 3.4% (95% CI 1.11%-9.96%). The heterogeneity across individual study findings was high, which should be taken into account in interpretation of the result. Conclusions: We demonstrate that a novel approach to meta-analysis using AI can generate meaningful clinical insights in a much shorter time period than traditional meta-analysis. %M 32804086 %R 10.2196/20007 %U http://www.jmir.org/2020/8/e20007/ %U https://doi.org/10.2196/20007 %U http://www.ncbi.nlm.nih.gov/pubmed/32804086 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e17211 %T How Can Artificial Intelligence Make Medicine More Preemptive? 
%A Iqbal,Usman %A Celi,Leo Anthony %A Li,Yu-Chuan Jack %+ Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, No 172-1, Sec 2, Keelung Rd, Daan District, Taipei, 10675, Taiwan, 886 6638 2736 ext 7601, jack@tmu.edu.tw %K artificial intelligence %K digital health %K eHealth %K health care technology %K medical innovations %K health information technology %K advanced care systems %D 2020 %7 11.8.2020 %9 Viewpoint %J J Med Internet Res %G English %X In this paper, we propose the idea that artificial intelligence (AI) is ushering in a new era of “Earlier Medicine,” which is a predictive approach for disease prevention based on AI modeling and big data. The flourishing health care technological landscape is showing great potential—from diagnosis and prescription automation to the early detection of disease through efficient and cost-effective patient data screening tools that benefit from the predictive capabilities of AI. Monitoring the trajectories of both in- and outpatients has proven to be a task AI can perform to a reliable degree. Predictions can be a significant advantage to health care if they are accurate and prompt and can be personalized and acted upon efficiently. This is where AI plays a crucial role in “Earlier Medicine” implementation. 
%M 32780024 %R 10.2196/17211 %U https://www.jmir.org/2020/8/e17211 %U https://doi.org/10.2196/17211 %U http://www.ncbi.nlm.nih.gov/pubmed/32780024 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e19104 %T Approaches Based on Artificial Intelligence and the Internet of Intelligent Things to Prevent the Spread of COVID-19: Scoping Review %A Adly,Aya Sedky %A Adly,Afnan Sedky %A Adly,Mahmoud Sedky %+ Faculty of Oral and Dental Medicine, Cairo University, Cairo University Road, Cairo, , Egypt, 20 1145559778, dr.mahmoud.sedky@gmail.com %K SARS-CoV-2 %K COVID-19 %K novel coronavirus %K artificial intelligence %K internet of things %K telemedicine %K machine learning %K modeling %K simulation %K robotics %D 2020 %7 10.8.2020 %9 Review %J J Med Internet Res %G English %X Background: Artificial intelligence (AI) and the Internet of Intelligent Things (IIoT) are promising technologies to prevent the concerningly rapid spread of coronavirus disease (COVID-19) and to maximize safety during the pandemic. With the exponential increase in the number of COVID-19 patients, it is highly possible that physicians and health care workers will not be able to treat all cases. Thus, computer scientists can contribute to the fight against COVID-19 by introducing more intelligent solutions to achieve rapid control of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes the disease. Objective: The objectives of this review were to analyze the current literature, discuss the applicability of reported ideas for using AI to prevent and control COVID-19, and build a comprehensive view of how current systems may be useful in particular areas. This may be of great help to many health care administrators, computer scientists, and policy makers worldwide. 
Methods: We conducted an electronic search of articles in the MEDLINE, Google Scholar, Embase, and Web of Knowledge databases to formulate a comprehensive review that summarizes different categories of the most recently reported AI-based approaches to prevent and control the spread of COVID-19. Results: Our search identified the 10 most recent AI approaches that were suggested to provide the best solutions for maximizing safety and preventing the spread of COVID-19. These approaches included detection of suspected cases, large-scale screening, monitoring, interactions with experimental therapies, pneumonia screening, use of the IIoT for data and information gathering and integration, resource allocation, predictions, modeling and simulation, and robotics for medical quarantine. Conclusions: We found few or almost no studies regarding the use of AI to examine COVID-19 interactions with experimental therapies, the use of AI for resource allocation to COVID-19 patients, or the use of AI and the IIoT for COVID-19 data and information gathering/integration. Moreover, the adoption of other approaches, including use of AI for COVID-19 prediction, use of AI for COVID-19 modeling and simulation, and use of AI robotics for medical quarantine, should be further emphasized by researchers because these important approaches lack sufficient numbers of studies. Therefore, we recommend that computer scientists focus on these approaches, which are still not being adequately addressed. 
%M 32584780 %R 10.2196/19104 %U https://www.jmir.org/2020/8/e19104 %U https://doi.org/10.2196/19104 %U http://www.ncbi.nlm.nih.gov/pubmed/32584780 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e17158 %T Conversational Agents in Health Care: Scoping Review and Conceptual Analysis %A Tudor Car,Lorainne %A Dhinagaran,Dhakshenya Ardhithy %A Kyaw,Bhone Myint %A Kowatsch,Tobias %A Joty,Shafiq %A Theng,Yin-Leng %A Atun,Rifat %+ Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, 11 Mandalay Road, Singapore, 65 69041258, lorainne.tudor.car@ntu.edu.sg %K conversational agents %K chatbots %K artificial intelligence %K machine learning %K mobile phone %K health care %K scoping review %D 2020 %7 7.8.2020 %9 Review %J J Med Internet Res %G English %X Background: Conversational agents, also known as chatbots, are computer programs designed to simulate human text or verbal conversations. They are increasingly used in a range of fields, including health care. By enabling better accessibility, personalization, and efficiency, conversational agents have the potential to improve patient care. Objective: This study aimed to review the current applications, gaps, and challenges in the literature on conversational agents in health care and provide recommendations for their future research, design, and application. Methods: We performed a scoping review. A broad literature search was performed in MEDLINE (Medical Literature Analysis and Retrieval System Online; Ovid), EMBASE (Excerpta Medica database; Ovid), PubMed, Scopus, and Cochrane Central with the search terms “conversational agents,” “conversational AI,” “chatbots,” and associated synonyms. We also searched the gray literature using sources such as the OCLC (Online Computer Library Center) WorldCat database and ResearchGate in April 2019. Reference lists of relevant articles were checked for further articles. 
Screening and data extraction were performed in parallel by 2 reviewers. The included evidence was analyzed narratively by employing the principles of thematic analysis. Results: The literature search yielded 47 study reports (45 articles and 2 ongoing clinical trials) that matched the inclusion criteria. The identified conversational agents were largely delivered via smartphone apps (n=23) and used free text only as the main input (n=19) and output (n=30) modality. Case studies describing chatbot development (n=18) were the most prevalent, and only 11 randomized controlled trials were identified. The 3 most commonly reported conversational agent applications in the literature were treatment and monitoring, health care service support, and patient education. Conclusions: The literature on conversational agents in health care is largely descriptive and aimed at treatment and monitoring and health service support. It mostly reports on text-based, artificial intelligence–driven, and smartphone app–delivered conversational agents. There is an urgent need for a robust evaluation of diverse health care conversational agents’ formats, focusing on their acceptability, safety, and effectiveness. %M 32763886 %R 10.2196/17158 %U http://www.jmir.org/2020/8/e17158/ %U https://doi.org/10.2196/17158 %U http://www.ncbi.nlm.nih.gov/pubmed/32763886 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e15394 %T Applying Machine Learning Models with An Ensemble Approach for Accurate Real-Time Influenza Forecasting in Taiwan: Development and Validation Study %A Cheng,Hao-Yuan %A Wu,Yu-Chun %A Lin,Min-Hau %A Liu,Yu-Lun %A Tsai,Yue-Yang %A Wu,Jo-Hua %A Pan,Ke-Han %A Ke,Chih-Jung %A Chen,Chiu-Mei %A Liu,Ding-Ping %A Lin,I-Feng %A Chuang,Jen-Hsiang %+ Taiwan Centers for Disease Control, 9F, No. 6, Linsen S. 
Road, Zhong-zheng District, Taipei, 100, Taiwan, 886 2 2395 9825, jhchuang@cdc.gov.tw %K influenza %K Influenza-like illness %K forecasting %K machine learning %K artificial intelligence %K epidemic forecasting %K surveillance %D 2020 %7 5.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Variable seasonal influenza activity in subtropical areas such as Taiwan causes problems in epidemic preparedness. The Taiwan Centers for Disease Control has maintained real-time national influenza surveillance systems since 2004. Beyond timely monitoring, epidemic forecasting using the national influenza surveillance data can provide pivotal information for public health response. Objective: We aimed to develop predictive models using machine learning to provide real-time influenza-like illness forecasts. Methods: Using surveillance data of influenza-like illness visits from emergency departments (from the Real-Time Outbreak and Disease Surveillance System), outpatient departments (from the National Health Insurance database), and the records of patients with severe influenza with complications (from the National Notifiable Disease Surveillance System), we developed 4 machine learning models (autoregressive integrated moving average, random forest, support vector regression, and extreme gradient boosting) to produce weekly influenza-like illness predictions for a given week and 3 subsequent weeks. We established a framework of the machine learning models and used an ensemble approach called stacking to integrate these predictions. We trained the models using historical data from 2008-2014. We evaluated their predictive ability during 2015-2017 for each of the 4-week time periods using Pearson correlation, mean absolute percentage error (MAPE), and hit rate of trend prediction. 
A dashboard website was built to visualize the forecasts, and the results of real-world implementation of this forecasting framework in 2018 were evaluated using the same metrics. Results: All models could accurately predict the timing and magnitudes of the seasonal peaks in the then-current week (nowcast) (ρ=0.802-0.965; MAPE: 5.2%-9.2%; hit rate: 0.577-0.756), 1-week (ρ=0.803-0.918; MAPE: 8.3%-11.8%; hit rate: 0.643-0.747), 2-week (ρ=0.783-0.867; MAPE: 10.1%-15.3%; hit rate: 0.669-0.734), and 3-week forecasts (ρ=0.676-0.801; MAPE: 12.0%-18.9%; hit rate: 0.643-0.786), especially the ensemble model. In real-world implementation in 2018, the forecasting performance was still accurate in nowcasts (ρ=0.875-0.969; MAPE: 5.3%-8.0%; hit rate: 0.582-0.782) and remained satisfactory in 3-week forecasts (ρ=0.721-0.908; MAPE: 7.6%-13.5%; hit rate: 0.596-0.904). Conclusions: This machine learning and ensemble approach can make accurate, real-time influenza-like illness forecasts for a 4-week period, and thus, facilitate decision making. %M 32755888 %R 10.2196/15394 %U https://www.jmir.org/2020/8/e15394 %U https://doi.org/10.2196/15394 %U http://www.ncbi.nlm.nih.gov/pubmed/32755888 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e18228 %T Artificial Intelligence in Health Care: Bibliometric Analysis %A Guo,Yuqi %A Hao,Zhichao %A Zhao,Shichong %A Gong,Jiaqi %A Yang,Fan %+ Social Welfare Program, School of Public Administration, Dongbei University of Finance and Economics, 217 Jianshan Street, Shahekou District, Dalian, China, 86 411 84710562, fyang10@dufe.edu.cn %K health care %K artificial intelligence %K bibliometric analysis %K telehealth %K neural networks %K machine learning %D 2020 %7 29.7.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: As a critical driving power to promote health care, the health care–related artificial intelligence (AI) literature is growing rapidly. 
Objective: The purpose of this analysis is to provide a dynamic and longitudinal bibliometric analysis of health care–related AI publications. Methods: The Web of Science (Clarivate PLC) was searched to retrieve all existing and highly cited AI-related health care research papers published in English up to December 2019. Based on bibliometric indicators, a search strategy was developed to screen the title for eligibility, using the abstract and full text where needed. The growth rate of publications, characteristics of research activities, publication patterns, and research hotspot tendencies were computed using the HistCite software. Results: The search identified 5235 hits, of which 1473 publications were included in the analyses. Publication output increased an average of 17.02% per year since 1995, but the growth rate of research papers significantly increased to 45.15% from 2014 to 2019. The major health problems studied in AI research are cancer, depression, Alzheimer disease, heart failure, and diabetes. Artificial neural networks, support vector machines, and convolutional neural networks have the highest impact on health care. Nucleosides, convolutional neural networks, and tumor markers have remained research hotspots through 2019. Conclusions: This analysis provides a comprehensive overview of the AI-related research conducted in the field of health care, which helps researchers, policy makers, and practitioners better understand the development of health care–related AI research and possible practice implications. Future AI research should be dedicated to filling in the gaps between AI health care research and clinical applications. 
%M 32723713 %R 10.2196/18228 %U http://www.jmir.org/2020/7/e18228/ %U https://doi.org/10.2196/18228 %U http://www.ncbi.nlm.nih.gov/pubmed/32723713 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e18082 %T Automatic Recognition, Segmentation, and Sex Assignment of Nocturnal Asthmatic Coughs and Cough Epochs in Smartphone Audio Recordings: Observational Field Study %A Barata,Filipe %A Tinschert,Peter %A Rassouli,Frank %A Steurer-Stey,Claudia %A Fleisch,Elgar %A Puhan,Milo Alan %A Brutsche,Martin %A Kotz,David %A Kowatsch,Tobias %+ Center for Digital Health Interventions, Department of Management, Technology, and Economics, ETH Zurich, Weinbergstrasse 56/57, Zurich, 8092, Switzerland, 41 446323509, fbarata@ethz.ch %K asthma %K cough recognition %K cough segmentation %K sex assignment %K deep learning %K smartphone %K mobile phone %D 2020 %7 14.7.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Asthma is one of the most prevalent chronic respiratory diseases. Despite increased investment in treatment, little progress has been made in the early recognition and treatment of asthma exacerbations over the last decade. Nocturnal cough monitoring may provide an opportunity to identify patients at risk for imminent exacerbations. Recently developed approaches enable smartphone-based cough monitoring. These approaches, however, have not undergone longitudinal overnight testing nor have they been specifically evaluated in the context of asthma. Also, the problem of distinguishing partner coughs from patient coughs when two or more people are sleeping in the same room using contact-free audio recordings remains unsolved. Objective: The objective of this study was to evaluate the automatic recognition and segmentation of nocturnal asthmatic coughs and cough epochs in smartphone-based audio recordings that were collected in the field. 
We also aimed to distinguish partner coughs from patient coughs in contact-free audio recordings by classifying coughs based on sex. Methods: We used a convolutional neural network model that we had developed in previous work for automated cough recognition. We further used techniques (such as ensemble learning, minibatch balancing, and thresholding) to address the imbalance in the data set. We evaluated the classifier in a classification task and a segmentation task. The cough-recognition classifier served as the basis for the cough-segmentation classifier from continuous audio recordings. We compared automated cough and cough-epoch counts to human-annotated cough and cough-epoch counts. We employed Gaussian mixture models to build a classifier for cough and cough-epoch signals based on sex. Results: We recorded audio data from 94 adults with asthma (overall: mean 43 years; SD 16 years; female: 54/94, 57%; male 40/94, 43%). Audio data were recorded by each participant in their everyday environment using a smartphone placed next to their bed; recordings were made over a period of 28 nights. Out of 704,697 sounds, we identified 30,304 sounds as coughs. A total of 26,166 coughs occurred without a 2-second pause between coughs, yielding 8238 cough epochs. The ensemble classifier performed well with a Matthews correlation coefficient of 92% in a pure classification task and achieved comparable cough counts to that of human annotators in the segmentation of coughing. The count difference between automated and human-annotated coughs was a mean –0.1 (95% CI –12.11, 11.91) coughs. The count difference between automated and human-annotated cough epochs was a mean 0.24 (95% CI –3.67, 4.15) cough epochs. The Gaussian mixture model cough epoch–based sex classification performed best yielding an accuracy of 83%. Conclusions: Our study showed longitudinal nocturnal cough and cough-epoch recognition from nightly recorded smartphone-based audio from adults with asthma. 
The model distinguishes partner cough from patient cough in contact-free recordings by identifying cough and cough-epoch signals that correspond to the sex of the patient. This research represents a step towards enabling passive and scalable cough monitoring for adults with asthma. %M 32459641 %R 10.2196/18082 %U https://www.jmir.org/2020/7/e18082 %U https://doi.org/10.2196/18082 %U http://www.ncbi.nlm.nih.gov/pubmed/32459641 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e16649 %T Public Perception of Artificial Intelligence in Medical Care: Content Analysis of Social Media %A Gao,Shuqing %A He,Lingnan %A Chen,Yue %A Li,Dan %A Lai,Kaisheng %+ School of Journalism and Communication, Jinan University, 601 Whampoa Ave W, Guangzhou, , China, 86 020 38374980, kaishenglai@126.com %K artificial intelligence %K public perception %K social media %K content analysis %K medical care %D 2020 %7 13.7.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: High-quality medical resources are in high demand worldwide, and the application of artificial intelligence (AI) in medical care may help alleviate the crisis related to this shortage. The development of the medical AI industry depends to a certain extent on whether industry experts have a comprehensive understanding of the public’s views on medical AI. Currently, the opinions of the general public on this matter remain unclear. Objective: The purpose of this study is to explore the public perception of AI in medical care through a content analysis of social media data, including specific topics that the public is concerned about; public attitudes toward AI in medical care and the reasons for them; and public opinion on whether AI can replace human doctors. Methods: Through an application programming interface, we collected a data set from the Sina Weibo platform comprising more than 16 million users throughout China by crawling all public posts from January to December 2017. 
Based on this data set, we identified 2315 posts related to AI in medical care and classified them through content analysis. Results: Among the 2315 identified posts, we found three types of AI topics discussed on the platform: (1) technology and application (n=987, 42.63%), (2) industry development (n=706, 30.50%), and (3) impact on society (n=622, 26.87%). Out of 956 posts where public attitudes were expressed, 59.4% (n=568), 34.4% (n=329), and 6.2% (n=59) of the posts expressed positive, neutral, and negative attitudes, respectively. The immaturity of AI technology (27/59, 46%) and a distrust of related companies (15/59, 25%) were the two main reasons for the negative attitudes. Across 200 posts that mentioned public attitudes toward replacing human doctors with AI, 47.5% (n=95) and 32.5% (n=65) of the posts expressed that AI would completely or partially replace human doctors, respectively. In comparison, 20.0% (n=40) of the posts expressed that AI would not replace human doctors. Conclusions: Our findings indicate that people are most concerned about AI technology and applications. Generally, the majority of people held positive attitudes and believed that AI doctors would completely or partially replace human ones. Compared with previous studies on medical doctors, the general public has a more positive attitude toward medical AI. Lack of trust in AI and the absence of the humanistic care factor are essential reasons why some people still have a negative attitude toward medical AI. We suggest that practitioners may need to pay more attention to promoting the credibility of technology companies and meeting patients’ emotional needs instead of focusing merely on technical issues. 
%M 32673231 %R 10.2196/16649 %U http://www.jmir.org/2020/7/e16649/ %U https://doi.org/10.2196/16649 %U http://www.ncbi.nlm.nih.gov/pubmed/32673231 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e16021 %T Effectiveness and Safety of Using Chatbots to Improve Mental Health: Systematic Review and Meta-Analysis %A Abd-Alrazaq,Alaa Ali %A Rababeh,Asma %A Alajlani,Mohannad %A Bewick,Bridgette M %A Househ,Mowafa %+ College of Science and Engineering, Hamad Bin Khalifa University, Liberal Arts and Sciences Building, Education City, Ar Rayyan, Doha, Qatar, 974 55708549, mhouseh@hbku.edu.qa %K chatbots %K conversational agents %K mental health %K mental disorders %K depression %K anxiety %K effectiveness %K safety %D 2020 %7 13.7.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The global shortage of mental health workers has prompted the utilization of technological advancements, such as chatbots, to meet the needs of people with mental health conditions. Chatbots are systems that are able to converse and interact with human users using spoken, written, and visual language. While numerous studies have assessed the effectiveness and safety of using chatbots in mental health, no reviews have pooled the results of those studies. Objective: This study aimed to assess the effectiveness and safety of using chatbots to improve mental health through summarizing and pooling the results of previous studies. Methods: A systematic review was carried out to achieve this objective. The search sources were 7 bibliographic databases (eg, MEDLINE, EMBASE, PsycINFO), the search engine “Google Scholar,” and backward and forward reference list checking of the included studies and relevant reviews. Two reviewers independently selected the studies, extracted data from the included studies, and assessed the risk of bias. Data extracted from studies were synthesized using narrative and statistical methods, as appropriate. 
Results: Of 1048 citations retrieved, we identified 12 studies examining the effect of using chatbots on 8 outcomes. Weak evidence demonstrated that chatbots were effective in improving depression, distress, stress, and acrophobia. In contrast, according to similar evidence, there was no statistically significant effect of using chatbots on subjective psychological wellbeing. Results were conflicting regarding the effect of chatbots on the severity of anxiety and positive and negative affect. Only two studies assessed the safety of chatbots and concluded that they are safe in mental health, as no adverse events or harms were reported. Conclusions: Chatbots have the potential to improve mental health. However, the evidence in this review was not sufficient to definitively conclude this due to a lack of evidence that their effect is clinically important, a lack of studies assessing each outcome, a high risk of bias in those studies, and conflicting results for some outcomes. Further studies are required to draw solid conclusions about the effectiveness and safety of chatbots. 
Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42019141219; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42019141219 %M 32673216 %R 10.2196/16021 %U http://www.jmir.org/2020/7/e16021/ %U https://doi.org/10.2196/16021 %U http://www.ncbi.nlm.nih.gov/pubmed/32673216 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e18697 %T Diagnosing Parkinson Disease Through Facial Expression Recognition: Video Analysis %A Jin,Bo %A Qu,Yue %A Zhang,Liang %A Gao,Zhan %+ Dongbei University of Finance and Economics, 217 Jianshan St, Shahekou District, Dalian, China, 86 15524709655, liang.zhang@dufe.edu.cn %K Parkinson disease %K face landmarks %K machine learning %K artificial intelligence %D 2020 %7 10.7.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The number of patients with neurological diseases is currently increasing annually, which presents tremendous challenges for both patients and doctors. With the advent of advanced information technology, digital medical care is gradually changing the medical ecology. Numerous people are exploring new ways to receive a consultation, track their diseases, and receive rehabilitation training in more convenient and efficient ways. In this paper, we explore the use of facial expression recognition via artificial intelligence to diagnose a typical neurological system disease, Parkinson disease (PD). Objective: This study proposes methods to diagnose PD through facial expression recognition. Methods: We collected videos of facial expressions of people with PD and matched controls. We used relative coordinates and positional jitter to extract facial expression features (facial expression amplitude and shaking of small facial muscle groups) from the key points returned by Face++. Algorithms from traditional machine learning and advanced deep learning were utilized to diagnose PD. 
Results: The experimental results showed that our models can achieve outstanding facial expression recognition ability for PD diagnosis. Applying a long short-term memory neural network to the positions of the key features, precision and F1 values of 86% and 75%, respectively, can be reached. Further, utilizing a support vector machine algorithm for the facial expression amplitude features and shaking of the small facial muscle groups, an F1 value of 99% can be achieved. Conclusions: This study contributes to the digital diagnosis of PD based on facial expression recognition. The disease diagnosis model was validated through our experiment. The results can help doctors understand the real-time dynamics of the disease and even conduct remote diagnosis. %M 32673247 %R 10.2196/18697 %U https://www.jmir.org/2020/7/e18697 %U https://doi.org/10.2196/18697 %U http://www.ncbi.nlm.nih.gov/pubmed/32673247 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 7 %P e17558 %T A Physical Activity and Diet Program Delivered by Artificially Intelligent Virtual Health Coach: Proof-of-Concept Study %A Maher,Carol Ann %A Davis,Courtney Rose %A Curtis,Rachel Grace %A Short,Camille Elizabeth %A Murphy,Karen Joy %+ Alliance for Research in Exercise, Nutrition and Activity, Allied Health and Human Performance, University of South Australia, GPO Box 2471, Adelaide, 5001, Australia, 61 883022315, carol.maher@unisa.edu.au %K virtual assistant %K chatbot %K Mediterranean diet %K physical activity %K lifestyle %D 2020 %7 10.7.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Poor diet and physical inactivity are leading modifiable causes of death and disease. Advances in artificial intelligence technology present tantalizing opportunities for creating virtual health coaches capable of providing personalized support at scale. 
Objective: This proof-of-concept study aimed to test the feasibility (recruitment and retention) and preliminary efficacy of a physical activity and Mediterranean-style dietary intervention (MedLiPal) delivered via an artificially intelligent virtual health coach. Methods: This 12-week single-arm pre-post study took place in Adelaide, Australia, from March to August 2019. Participants were inactive community-dwelling adults aged 45 to 75 years, recruited through news stories, social media posts, and flyers. The program included access to an artificially intelligent chatbot, Paola, who guided participants through a computer-based individualized introductory session, weekly check-ins, and goal setting, and was available 24/7 to answer questions. Participants used a Garmin Vivofit4 tracker to monitor daily steps, a website with educational materials and recipes, and a printed diet and activity log sheet. Primary outcomes included feasibility (based on recruitment and retention) and preliminary efficacy for changing physical activity and diet. Secondary outcomes were body composition (based on height, weight, and waist circumference) and blood pressure. Results: Over 4 weeks, 99 potential participants registered expressions of interest, with 81 of those screened meeting eligibility criteria. Participants completed a mean of 109.8 (95% CI 1.9-217.7) more minutes of physical activity at week 12 compared with baseline. Mediterranean diet scores increased from a mean of 3.8 out of 14 at baseline, to 9.6 at 12 weeks (mean improvement 5.7 points, 95% CI 4.2-7.3). After 12 weeks, participants lost an average of 1.3 kg (95% CI –2.5 to –0.1 kg) and 2.1 cm from their waist circumference (95% CI –3.5 to –0.7 cm). There were no significant changes in blood pressure. Feasibility was excellent in terms of recruitment, retention (90% at 12 weeks), and safety (no adverse events). 
Conclusions: An artificially intelligent virtual assistant-led lifestyle-modification intervention was feasible and achieved measurable improvements in physical activity, diet, and body composition at 12 weeks. Future research examining artificially intelligent interventions at scale, and for other health purposes, is warranted. %M 32673246 %R 10.2196/17558 %U https://mhealth.jmir.org/2020/7/e17558 %U https://doi.org/10.2196/17558 %U http://www.ncbi.nlm.nih.gov/pubmed/32673246 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 7 %P e17216 %T Development and Clinical Evaluation of a Web-Based Upper Limb Home Rehabilitation System Using a Smartwatch and Machine Learning Model for Chronic Stroke Survivors: Prospective Comparative Study %A Chae,Sang Hoon %A Kim,Yushin %A Lee,Kyoung-Soub %A Park,Hyung-Soon %+ Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea, 82 42 350 3038, hyungspark@kaist.ac.kr %K home-based rehabilitation %K artificial intelligence %K machine learning %K wearable device %K smartwatch %K chronic stroke %D 2020 %7 9.7.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Recent advancements in wearable sensor technology have shown the feasibility of remote physical therapy at home. In particular, the current COVID-19 pandemic has revealed the need and opportunity of internet-based wearable technology in future health care systems. Previous research has shown the feasibility of human activity recognition technologies for monitoring rehabilitation activities in home environments; however, few comprehensive studies ranging from development to clinical evaluation exist. 
Objective: This study aimed to (1) develop a home-based rehabilitation (HBR) system that can recognize and record the type and frequency of rehabilitation exercises conducted by the user using a smartwatch and smartphone app equipped with a machine learning (ML) algorithm and (2) evaluate the efficacy of the home-based rehabilitation system through a prospective comparative study with chronic stroke survivors. Methods: The HBR system involves an off-the-shelf smartwatch, a smartphone, and custom-developed apps. A convolutional neural network was used to train the ML algorithm for detecting home exercises. To determine the most accurate way for detecting the type of home exercise, we compared accuracy results with the data sets of personal or total data and accelerometer, gyroscope, or accelerometer combined with gyroscope data. From March 2018 to February 2019, we conducted a clinical study with two groups of stroke survivors. In total, 17 and 6 participants were enrolled for statistical analysis in the HBR group and control group, respectively. To measure clinical outcomes, we performed the Wolf Motor Function Test (WMFT), Fugl-Meyer Assessment of Upper Extremity, grip power test, Beck Depression Inventory, and range of motion (ROM) assessment of the shoulder joint at 0, 6, and 12 weeks, and at a follow-up assessment 6 weeks after retrieving the HBR system. Results: The ML model created with personal data involving accelerometer combined with gyroscope data (5590/5601, 99.80%) was the most accurate compared with accelerometer (5496/5601, 98.13%) or gyroscope data (5381/5601, 96.07%). In the comparative study, the drop-out rates in the control and HBR groups were 40% (4/10) and 22% (5/22) at 12 weeks and 100% (10/10) and 45% (10/22) at 18 weeks, respectively. The HBR group (n=17) showed a significant improvement in the mean WMFT score (P=.02) and ROM of flexion (P=.004) and internal rotation (P=.001). 
The control group (n=6) showed a significant change only in shoulder internal rotation (P=.03). Conclusions: This study found that a home care system using a commercial smartwatch and ML model can facilitate participation in home training and improve the functional score of the WMFT and shoulder ROM of flexion and internal rotation in the treatment of patients with chronic stroke. This strategy can possibly be a cost-effective tool for the home care treatment of stroke survivors in the future. Trial Registration: Clinical Research Information Service KCT0004818; https://tinyurl.com/y92w978t %M 32480361 %R 10.2196/17216 %U http://mhealth.jmir.org/2020/7/e17216/ %U https://doi.org/10.2196/17216 %U http://www.ncbi.nlm.nih.gov/pubmed/32480361 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 7 %P e14500 %T Identifying the Medical Lethality of Suicide Attempts Using Network Analysis and Deep Learning: Nationwide Study %A Kim,Bora %A Kim,Younghoon %A Park,C Hyung Keun %A Rhee,Sang Jin %A Kim,Young Shin %A Leventhal,Bennett L %A Ahn,Yong Min %A Paik,Hyojung %+ Center for Supercomputing Applications, Division of Supercomputing, Korea Institute of Science and Technology Information (KISTI), 245 Daehak-ro, Yuseong-gu, Daejeon, 305-806, Republic of Korea, 1 82 42 869 1004, hyojungpaik@kisti.re.kr %K suicide %K deep learning %K network %K antecedent behaviors %D 2020 %7 9.7.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Suicide is one of the leading causes of death among young and middle-aged people. However, little is understood about the behaviors leading up to actual suicide attempts and whether these behaviors are specific to the nature of suicide attempts. Objective: The goal of this study was to examine the clusters of behaviors antecedent to suicide attempts to determine if they could be used to assess the potential lethality of the attempt. 
To accomplish this goal, we developed a deep learning model using the relationships among behaviors antecedent to suicide attempts and the attempts themselves. Methods: This study used data from the Korea National Suicide Survey. We identified 1112 individuals who attempted suicide and completed a psychiatric evaluation in the emergency room. The 15-item Beck Suicide Intent Scale (SIS) was used for assessing antecedent behaviors, and the medical outcomes of the suicide attempts were measured by assessing lethality with the Columbia Suicide Severity Rating Scale (C-SSRS; lethal suicide attempt >3 and nonlethal attempt ≤3). Results: Using scores from the SIS, individuals who had lethal and nonlethal attempts comprised two different network nodes with the edges representing the relationships among nodes. Among the antecedent behaviors, the conception of a method’s lethality predicted suicidal behaviors with severe medical outcomes. The vectorized relationship values among the elements of antecedent behaviors in our deep learning model (E-GONet) increased performances, such as F1 and area under the precision-recall gain curve (AUPRG), for identifying lethal attempts (up to 3% for F1 and 32% for AUPRG), as compared with other models (mean F1: 0.81 for E-GONet, 0.78 for linear regression, and 0.80 for random forest; mean AUPRG: 0.73 for E-GONet, 0.41 for linear regression, and 0.69 for random forest). Conclusions: The relationships among behaviors antecedent to suicide attempts can be used to understand the suicidal intent of individuals and help identify the lethality of potential suicide attempts. Such a model may be useful in prioritizing cases for preventive intervention. 
%M 32673253 %R 10.2196/14500 %U http://medinform.jmir.org/2020/7/e14500/ %U https://doi.org/10.2196/14500 %U http://www.ncbi.nlm.nih.gov/pubmed/32673253 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 7 %P e17707 %T Artificial Intelligence and Health Technology Assessment: Anticipating a New Level of Complexity %A Alami,Hassane %A Lehoux,Pascale %A Auclair,Yannick %A de Guise,Michèle %A Gagnon,Marie-Pierre %A Shaw,James %A Roy,Denis %A Fleet,Richard %A Ag Ahmed,Mohamed Ali %A Fortin,Jean-Paul %+ Institut national d'excellence en santé et services sociaux, 2021, Avenue Union, Montréal, QC, H3A 2S9, Canada, 1 514 873 2563 ext 24404, hassane.alami@umontreal.ca %K artificial intelligence %K health technology assessment %K eHealth %K health care %K medical device %K patient %K health services %D 2020 %7 7.7.2020 %9 Viewpoint %J J Med Internet Res %G English %X Artificial intelligence (AI) is seen as a strategic lever to improve access, quality, and efficiency of care and services and to build learning and value-based health systems. Many studies have examined the technical performance of AI within an experimental context. These studies provide limited insights into the issues that its use in a real-world context of care and services raises. To help decision makers address these issues in a systemic and holistic manner, this viewpoint paper relies on the health technology assessment core model to contrast the expectations of the health sector toward the use of AI with the risks that should be mitigated for its responsible deployment. The analysis adopts the perspective of payers (ie, health system organizations and agencies) because of their central role in regulating, financing, and reimbursing novel technologies. This paper suggests that AI-based systems should be seen as a health system transformation lever, rather than a discrete set of technological devices. 
Their use could bring significant changes and impacts at several levels: technological, clinical, human and cognitive (patient and clinician), professional and organizational, economic, legal, and ethical. The assessment of AI’s value proposition should thus go beyond technical performance and cost logic by performing a holistic analysis of its value in a real-world context of care and services. To guide AI development, generate knowledge, and draw lessons that can be translated into action, the right political, regulatory, organizational, clinical, and technological conditions for innovation should be created as a first step. %M 32406850 %R 10.2196/17707 %U https://www.jmir.org/2020/7/e17707 %U https://doi.org/10.2196/17707 %U http://www.ncbi.nlm.nih.gov/pubmed/32406850 %0 Journal Article %@ 2369-3762 %I JMIR Publications %V 6 %N 1 %P e19285 %T Artificial Intelligence Education and Tools for Medical and Health Informatics Students: Systematic Review %A Sapci,A Hasan %A Sapci,H Aylin %+ Adelphi University, Nexus Building, 1 South Avenue, Garden City, NY, 11530, United States, 1 5168338156, sapci@adelphi.edu %K artificial intelligence %K education %K machine learning %K deep learning %K medical education %K health informatics %K systematic review %D 2020 %7 30.6.2020 %9 Review %J JMIR Med Educ %G English %X Background: The use of artificial intelligence (AI) in medicine will generate numerous application possibilities to improve patient care, provide real-time data analytics, and enable continuous patient monitoring. Clinicians and health informaticians should become familiar with machine learning and deep learning. Additionally, they should have a strong background in data analytics and data visualization to use, evaluate, and develop AI applications in clinical practice. Objective: The main objective of this study was to evaluate the current state of AI training and the use of AI tools to enhance the learning experience. 
Methods: A comprehensive systematic review was conducted to analyze the use of AI in medical and health informatics education, and to evaluate existing AI training practices. PRISMA-P (Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols) guidelines were followed. The studies that focused on the use of AI tools to enhance medical education and the studies that investigated teaching AI as a new competency were categorized separately to evaluate recent developments. Results: This systematic review revealed that recent publications recommend the integration of AI training into medical and health informatics curricula. Conclusions: To the best of our knowledge, this is the first systematic review exploring the current state of AI education in both medicine and health informatics. Since AI curricula have not been standardized and competencies have not been determined, a framework for specialized AI training in medical and health informatics education is proposed. %M 32602844 %R 10.2196/19285 %U http://mededu.jmir.org/2020/1/e19285/ %U https://doi.org/10.2196/19285 %U http://www.ncbi.nlm.nih.gov/pubmed/32602844 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 6 %P e19202 %T Medical Emergency Resource Allocation Model in Large-Scale Emergencies Based on Artificial Intelligence: Algorithm Development %A Du,Lin %+ School of Information Science and Engineering, Qilu Normal University, No 33, Shanshi East Road, Jinan, China, 86 13793161610, dul1028@163.com %K medical emergency %K resource allocation model %K distribution model %K large-scale emergencies %K artificial intelligence %D 2020 %7 25.6.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Before major emergencies occur, the government needs to prepare various emergency supplies in advance. To do this, it should consider the coordinated storage of different types of materials while ensuring that emergency materials are not missed or superfluous. 
Objective: This paper aims to improve the dispatch and transportation efficiency of emergency materials under a model in which the government makes full use of Internet of Things technology and artificial intelligence technology. Methods: The paper established a model for emergency material preparation and dispatch based on queueing theory and further established a workflow system for emergency material preparation, dispatch, and transportation based on a Petri net, resulting in a highly efficient emergency material preparation and dispatch simulation system framework. Results: A decision support platform was designed to integrate all the algorithms and principles proposed. Conclusions: The resulting framework can effectively coordinate the workflow of emergency material preparation and dispatch, helping to shorten the total time of emergency material preparation, dispatch, and transportation. %M 32584262 %R 10.2196/19202 %U http://medinform.jmir.org/2020/6/e19202/ %U https://doi.org/10.2196/19202 %U http://www.ncbi.nlm.nih.gov/pubmed/32584262 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 4 %N 6 %P e18890 %T Adherence of the #Here4U App – Military Version to Criteria for the Development of Rigorous Mental Health Apps %A Linden,Brooke %A Tam-Seto,Linna %A Stuart,Heather %+ Health Services and Policy Research Institute, Queen's University, 21 Arch Street, Kingston, ON, K7L 3L3, Canada, 1 613 533 6387, brooke.linden@queensu.ca %K mental health services %K telemedicine %K mHealth %K chatbot %K e-solutions %K Canadian Armed Forces %K military health %K mobile phone %D 2020 %7 17.6.2020 %9 Original Paper %J JMIR Form Res %G English %X Background: Over the past several years, the emergence of mobile mental health apps has increased as a potential solution for populations who may face logistical and social barriers to traditional service delivery, including individuals connected to the military. 
Objective: The goal of the #Here4U App – Military Version is to provide evidence-informed mental health support to members of Canada’s military community, leveraging artificial intelligence in the form of IBM Canada’s Watson Assistant to carry on unique text-based conversations with users, identify presenting mental health concerns, and refer users to self-help resources or recommend professional health care where appropriate. Methods: As the availability and use of mental health apps has increased, so too has the list of recommendations and guidelines for efficacious development. We describe the development and testing conducted between 2018 and 2020 and assess the quality of the #Here4U App against 16 criteria for rigorous mental health app development, as identified by Bakker and colleagues in 2016. Results: The #Here4U App – Military Version met the majority of Bakker and colleagues’ criteria, with those unmet considered not applicable to this particular product or out of scope for research conducted to date. Notably, a formal evaluation of the efficacy of the app is a major priority moving forward. Conclusions: The #Here4U App – Military Version is a promising new mental health e-solution for members of the Canadian Armed Forces community, filling many of the gaps left by traditional service delivery. 
%M 32554374 %R 10.2196/18890 %U https://formative.jmir.org/2020/6/e18890 %U https://doi.org/10.2196/18890 %U http://www.ncbi.nlm.nih.gov/pubmed/32554374 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 6 %P e18301 %T Technical Metrics Used to Evaluate Health Care Chatbots: Scoping Review %A Abd-Alrazaq,Alaa %A Safi,Zeineb %A Alajlani,Mohannad %A Warren,Jim %A Househ,Mowafa %A Denecke,Kerstin %+ Institute for Medical Informatics, Bern University of Applied Sciences, Quellgasse 21, 2502 Biel, Bern, Switzerland, 41 76 409 97 61, kerstin.denecke@bfh.ch %K chatbots %K conversational agents %K health care %K evaluation %K metrics %D 2020 %7 5.6.2020 %9 Review %J J Med Internet Res %G English %X Background: Dialog agents (chatbots) have a long history of application in health care, where they have been used for tasks such as supporting patient self-management and providing counseling. Their use is expected to grow with increasing demands on health systems and improving artificial intelligence (AI) capability. Approaches to the evaluation of health care chatbots, however, appear to be diverse and haphazard, resulting in a potential barrier to the advancement of the field. Objective: This study aims to identify the technical (nonclinical) metrics used by previous studies to evaluate health care chatbots. Methods: Studies were identified by searching 7 bibliographic databases (eg, MEDLINE and PsycINFO) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. The studies were independently selected by two reviewers who then extracted data from the included studies. Extracted data were synthesized narratively by grouping the identified metrics into categories based on the aspect of chatbots that the metrics evaluated. Results: Of the 1498 citations retrieved, 65 studies were included in this review. 
Chatbots were evaluated using 27 technical metrics, which were related to chatbots as a whole (eg, usability, classifier performance, speed), response generation (eg, comprehensibility, realism, repetitiveness), response understanding (eg, chatbot understanding as assessed by users, word error rate, concept error rate), and esthetics (eg, appearance of the virtual agent, background color, and content). Conclusions: The technical metrics of health chatbot studies were diverse, with survey designs and global usability metrics dominating. The lack of standardization and paucity of objective measures make it difficult to compare the performance of health chatbots and could inhibit advancement of the field. We suggest that researchers more frequently include metrics computed from conversation logs. In addition, we recommend the development of a framework of technical metrics with recommendations for specific circumstances for their inclusion in chatbot studies. %M 32442157 %R 10.2196/18301 %U http://www.jmir.org/2020/6/e18301/ %U https://doi.org/10.2196/18301 %U http://www.ncbi.nlm.nih.gov/pubmed/32442157 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 4 %N 6 %P e16670 %T Patient Perception of Plain-Language Medical Notes Generated Using Artificial Intelligence Software: Pilot Mixed-Methods Study %A Bala,Sandeep %A Keniston,Angela %A Burden,Marisha %+ College of Medicine, University of Central Florida, 6850 Lake Nona Blvd, Orlando, FL, , United States, 1 321 299 8429, ucfbala@knights.ucf.edu %K artificial intelligence %K patient education %K natural language processing %K OpenNotes %K Open Notes %K patient-physician relationship %K simplified notes %K plain-language notes %D 2020 %7 5.6.2020 %9 Original Paper %J JMIR Form Res %G English %X Background: Clinicians’ time with patients has become increasingly limited due to regulatory burden, documentation and billing, administrative responsibilities, and market forces. 
These factors limit clinicians’ time to deliver thorough explanations to patients. OpenNotes began as a research initiative exploring the ability of sharing medical notes with patients to help patients understand their health care. Providing patients access to their medical notes has been shown to have many benefits, including improved patient satisfaction and clinical outcomes. OpenNotes has since evolved into a national movement that helps clinicians share notes with patients. However, a significant barrier to the widespread adoption of OpenNotes has been clinicians’ concerns that OpenNotes may cost additional time to correct patient confusion over medical language. Recent advances in artificial intelligence (AI) technology may help resolve this concern by converting medical notes to plain language with minimal time required of clinicians. Objective: This pilot study assesses patient comprehension and perceived benefits, concerns, and insights regarding an AI-simplified note through comprehension questions and guided interview. Methods: Synthea, a synthetic patient generator, was used to generate a standardized medical-language patient note which was then simplified using AI software. A multiple-choice comprehension assessment questionnaire was drafted with physician input. Study participants were recruited from inpatients at the University of Colorado Hospital. Participants were randomly assigned to be tested for their comprehension of the standardized medical-language version or AI-generated plain-language version of the patient note. Following this, participants reviewed the opposite version of the note and participated in a guided interview. A Student t test was performed to assess for differences in comprehension assessment scores between plain-language and medical-language note groups. Multivariate modeling was performed to assess the impact of demographic variables on comprehension. Interview responses were thematically analyzed. 
Results: Twenty patients agreed to participate. The mean number of comprehension assessment questions answered correctly was found to be higher in the plain-language group compared with the medical-language group; however, the Student t test was found to be underpowered to determine if this was significant. Age, ethnicity, and health literacy were found to have a significant impact on comprehension scores by multivariate modeling. Thematic analysis of guided interviews highlighted patients’ perceived benefits, concerns, and suggestions regarding such notes. Major themes of benefits were that simplified plain-language notes may (1) be more useable than unsimplified medical-language notes, (2) improve the patient-clinician relationship, and (3) empower patients through an enhanced understanding of their health care. Conclusions: AI software may translate medical notes into plain-language notes that are perceived as beneficial by patients. Limitations included sample size, inpatient-only setting, and possible confounding factors. Larger studies are needed to assess comprehension. Insight from patient responses to guided interviews can guide the future study and development of this technology. 
%M 32442148 %R 10.2196/16670 %U https://formative.jmir.org/2020/6/e16670 %U https://doi.org/10.2196/16670 %U http://www.ncbi.nlm.nih.gov/pubmed/32442148 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 6 %P e18677 %T Application of an Isolated Word Speech Recognition System in the Field of Mental Health Consultation: Development and Usability Study %A Fu,Weifeng %+ Liberal Arts College, Hunan Normal University, 36 Lushan Road, Changsha, 410081, China, 86 18973101748, fwf1126@hunnu.edu.cn %K speech recognition %K isolated words %K mental health %K small vocabulary %K HMM %K hidden Markov model %K programming %D 2020 %7 3.6.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Speech recognition is a technology that enables machines to understand human language. Objective: In this study, speech recognition of isolated words from a small vocabulary was applied to the field of mental health counseling. Methods: A software platform was used to establish a human-machine chat for psychological counselling. The software uses voice recognition technology to decode the user's voice information. The software system analyzes and processes the user's voice information according to many internal related databases, and then gives the user accurate feedback. For users who need psychological treatment, the system provides them with psychological education. Results: The speech recognition system included features such as speech extraction, endpoint detection, feature value extraction, training data, and speech recognition. Conclusions: The Hidden Markov Model was adopted, based on multithread programming under a VC2005 compilation environment, to realize the parallel operation of the algorithm and improve the efficiency of speech recognition. After the design was completed, simulation debugging was performed in the laboratory. The experimental results showed that the designed program met the basic requirements of a speech recognition system. 
%M 32384054 %R 10.2196/18677 %U https://medinform.jmir.org/2020/6/e18677 %U https://doi.org/10.2196/18677 %U http://www.ncbi.nlm.nih.gov/pubmed/32384054 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 5 %P e16896 %T Artificial Intelligence–Assisted System in Postoperative Follow-up of Orthopedic Patients: Exploratory Quantitative and Qualitative Study %A Bian,Yanyan %A Xiang,Yongbo %A Tong,Bingdu %A Feng,Bin %A Weng,Xisheng %+ Department of Orthopedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, No 1 Shuaifuyuan, Dongcheng District, Beijing, 100073, China, 86 13021159994, doctorwxs@163.com %K artificial intelligence %K conversational agent %K follow-up %K cost-effectiveness %D 2020 %7 26.5.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Patient follow-up is an essential part of hospital ward management. With the development of deep learning algorithms, individual follow-up assignments might be completed by artificial intelligence (AI). We developed an AI-assisted follow-up conversational agent that can simulate the human voice and select an appropriate follow-up time for quantitative, automatic, and personalized patient follow-up. Patient feedback and voice information could be collected and converted into text data automatically. Objective: The primary objective of this study was to compare the cost-effectiveness of AI-assisted follow-up to manual follow-up of patients after surgery. The secondary objective was to compare the feedback from AI-assisted follow-up to feedback from manual follow-up. Methods: The AI-assisted follow-up system was adopted in the Orthopedic Department of Peking Union Medical College Hospital in April 2019. A total of 270 patients were followed up through this system. Prior to that, 2656 patients were followed up by phone calls manually. 
Patient characteristics, telephone connection rate, follow-up rate, feedback collection rate, time spent, and feedback composition were compared between the two groups of patients. Results: There was no statistically significant difference in age, gender, or disease between the two groups. There was no significant difference in telephone connection rate (manual: 2478/2656, 93.3%; AI-assisted: 249/270, 92.2%; P=.50) or successful follow-up rate (manual: 2301/2478, 92.9%; AI-assisted: 231/249, 92.8%; P=.96) between the two groups. The time spent on 100 patients in the manual follow-up group was about 9.3 hours. In contrast, the time spent on the AI-assisted follow-up was close to 0 hours. The feedback rate in the AI-assisted follow-up group was higher than that in the manual follow-up group (manual: 68/2656, 2.5%; AI-assisted: 28/270, 10.3%; P<.001). The composition of feedback was different in the two groups. Feedback from the AI-assisted follow-up group mainly included nursing, health education, and hospital environment content, while feedback from the manual follow-up group mostly included medical consultation content. Conclusions: The effectiveness of AI-assisted follow-up was not inferior to that of manual follow-up. Human resource costs are saved by AI. AI can help obtain comprehensive feedback from patients, although its depth and pertinence of communication need to be improved. 
%M 32452807 %R 10.2196/16896 %U http://www.jmir.org/2020/5/e16896/ %U https://doi.org/10.2196/16896 %U http://www.ncbi.nlm.nih.gov/pubmed/32452807 %0 Journal Article %@ 2369-1999 %I JMIR Publications %V 6 %N 1 %P e15859 %T Assessing Breast Cancer Survivors’ Perceptions of Using Voice-Activated Technology to Address Insomnia: Feasibility Study Featuring Focus Groups and In-Depth Interviews %A Arem,Hannah %A Scott,Remle %A Greenberg,Daniel %A Kaltman,Rebecca %A Lieberman,Daniel %A Lewin,Daniel %+ Department of Epidemiology, Milken Institute School of Public Health, George Washington University, 950 New Hampshire Ave NW, Rm 514, Washington, DC, 20052, United States, 1 2029944676, hannaharem@gwu.edu %K artificial intelligence %K breast neoplasms %K survivors %K insomnia %K cognitive behavioral therapy %K mobile phones %D 2020 %7 26.5.2020 %9 Original Paper %J JMIR Cancer %G English %X Background: Breast cancer survivors (BCSs) are a growing population with a higher prevalence of insomnia than women of the same age without a history of cancer. Cognitive behavioral therapy for insomnia (CBT-I) has been shown to be effective in this population, but it is not widely available to those who need it. Objective: This study aimed to better understand BCSs’ experiences with insomnia and to explore the feasibility and acceptability of delivering CBT-I using a virtual assistant (Amazon Alexa). Methods: We first conducted a formative phase with 2 focus groups and 3 in-depth interviews to understand BCSs’ perceptions of insomnia as well as their interest in and comfort with using a virtual assistant to learn about CBT-I. We then developed a prototype incorporating participant preferences and CBT-I components and demonstrated it in group and individual settings to BCSs to evaluate acceptability, interest, perceived feasibility, educational potential, and usability of the prototype. 
We also collected open-ended feedback on the content and used frequencies to describe the quantitative data. Results: We recruited 11 BCSs with insomnia in the formative phase and 14 BCSs in the prototype demonstration. In formative work, anxiety, fear, and hot flashes were identified as causes of insomnia. After prototype demonstration, nearly 79% (11/14) of participants reported an interest in and perceived feasibility of using the virtual assistant to record sleep patterns. Approximately two-thirds of the participants thought lifestyle modification (9/14, 64%) and sleep restriction (9/14, 64%) would be feasible and were interested in this feature of the program (10/14, 71% and 9/14, 64%, respectively). Relaxation exercises were rated as interesting and feasible using the virtual assistant by 71% (10/14) of the participants. Usability was rated as better than average, and all women reported that they would recommend the program to friends and family. Conclusions: This virtual assistant prototype delivering CBT-I components by using a smart speaker was rated as feasible and acceptable, suggesting that this prototype should be fully developed and tested for efficacy in the BCS population. If efficacy is shown in this population, the prototype should also be adapted for other high-risk populations. 
%M 32348274 %R 10.2196/15859 %U http://cancer.jmir.org/2020/1/e15859/ %U https://doi.org/10.2196/15859 %U http://www.ncbi.nlm.nih.gov/pubmed/32348274 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 5 %P e17647 %T Clinical Desire for an Artificial Intelligence–Based Surgical Assistant System: Electronic Survey–Based Study %A Park,Soo Jin %A Lee,Eun Ji %A Kim,Se Ik %A Kong,Seong-Ho %A Jeong,Chang Wook %A Kim,Hee Seung %+ Department of Obstetrics and Gynecology, Seoul National University College of Medicine, 101 Daehak-Ro, Jongno-Gu, Seoul, 03080, Republic of Korea, 82 02 2072 4863, bboddi0311@gmail.com %K artificial intelligence %K solo surgery %K laparoscopic surgery %D 2020 %7 15.5.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Techniques utilizing artificial intelligence (AI) are rapidly growing in medical research and development, especially in the operating room. However, the application of AI in the operating room has been limited to small tasks or software, such as clinical decision systems. It still largely depends on human resources and technology involving the surgeons’ hands. Therefore, we conceptualized AI-based solo surgery (AISS) defined as laparoscopic surgery conducted by only one surgeon with support from an AI-based surgical assistant system, and we performed an electronic survey on the clinical desire for such a system. Objective: This study aimed to evaluate the experiences of surgeons who have performed laparoscopic surgery, the limitations of conventional laparoscopic surgical systems, and the desire for an AI-based surgical assistant system for AISS. Methods: We performed an online survey for gynecologists, urologists, and general surgeons from June to August 2017. The questionnaire consisted of six items about experience, two about limitations, and five about the clinical desire for an AI-based surgical assistant system for AISS. 
Results: A total of 508 surgeons who have performed laparoscopic surgery responded to the survey. Most of the surgeons needed two or more assistants during laparoscopic surgery, and the rate was higher among gynecologists (251/278, 90.3%) than among general surgeons (123/173, 71.1%) and urologists (35/57, 61.4%). The majority of responders answered that the skillfulness of surgical assistants was “very important” or “important.” The most uncomfortable aspect of laparoscopic surgery was unskilled movement of the camera (431/508, 84.8%) and instruments (303/508, 59.6%). About 40% (199/508, 39.1%) of responders answered that the AI-based surgical assistant system could substitute 41%-60% of the current workforce, and 83.3% (423/508) showed willingness to buy the system. Furthermore, the most reasonable price was US $30,000-50,000. Conclusions: Surgeons who perform laparoscopic surgery may feel discomfort with the conventional laparoscopic surgical system in terms of assistant skillfulness, and they may think that the skillfulness of surgical assistants is essential. They desire to alleviate present inconveniences with the conventional laparoscopic surgical system and to perform a safe and comfortable operation by using an AI-based surgical assistant system for AISS. 
%M 32412421 %R 10.2196/17647 %U http://medinform.jmir.org/2020/5/e17647/ %U https://doi.org/10.2196/17647 %U http://www.ncbi.nlm.nih.gov/pubmed/32412421 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 5 %P e17620 %T Health Care Employees’ Perceptions of the Use of Artificial Intelligence Applications: Survey Study %A Abdullah,Rana %A Fakieh,Bahjat %+ Information Systems Department, King Abdulaziz University, Al-Solaimaniah District, Jeddah, 21589, Saudi Arabia, 966 126952000 ext 67438, bfakieh@kau.edu.sa %K artificial intelligence %K employees %K healthcare sector %K perception %K Saudi Arabia %D 2020 %7 14.5.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The advancement of health care information technology and the emergence of artificial intelligence have yielded tools to improve the quality of various health care processes. Few studies have investigated employee perceptions of artificial intelligence implementation in Saudi Arabia and the Arabian world. In addition, few studies have investigated the effect of employee knowledge and job title on the perception of artificial intelligence implementation in the workplace. Objective: The aim of this study was to explore health care employee perceptions and attitudes toward the implementation of artificial intelligence technologies in health care institutions in Saudi Arabia. Methods: An online questionnaire was published, and responses were collected from 250 employees, including doctors, nurses, and technicians at 4 of the largest hospitals in Riyadh, Saudi Arabia. Results: The results of this study showed that 3.11 of 4 respondents feared artificial intelligence would replace employees and had a general lack of knowledge regarding artificial intelligence. In addition, most respondents were unaware of the advantages and most common challenges to artificial intelligence applications in the health sector, indicating a need for training. 
The results also showed that technicians were the most frequently impacted by artificial intelligence applications due to the nature of their jobs, which do not require much direct human interaction. Conclusions: The Saudi health care sector presents an advantageous market potential that should be attractive to researchers and developers of artificial intelligence solutions. %M 32406857 %R 10.2196/17620 %U http://www.jmir.org/2020/5/e17620/ %U https://doi.org/10.2196/17620 %U http://www.ncbi.nlm.nih.gov/pubmed/32406857 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 4 %P e17234 %T Identification of the Facial Features of Patients With Cancer: A Deep Learning–Based Pilot Study %A Liang,Bin %A Yang,Na %A He,Guosheng %A Huang,Peng %A Yang,Yong %+ Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, 17 Panjiayuannanli Rd, Chaoyang District, Beijing, 100021, China, 86 1087788663, leangbin@gmail.com %K convolutional neural network %K facial features %K cancer patient %K deep learning %K cancer %D 2020 %7 29.4.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Cancer has become the second leading cause of death globally. Most cancer cases are due to genetic mutations, which affect metabolism and result in facial changes. Objective: In this study, we aimed to identify the facial features of patients with cancer using the deep learning technique. Methods: Images of faces of patients with cancer were collected to build the cancer face image data set. A face image data set of people without cancer was built by randomly selecting images from the publicly available MegaAge data set according to the sex and age distribution of the cancer face image data set. 
Each face image was preprocessed to obtain an upright centered face chip, following which the background was filtered out to exclude the effects of irrelevant factors. A residual neural network was constructed to classify cancer and noncancer cases. Transfer learning, minibatches, few epochs, L2 regularization, and random dropout training strategies were used to prevent overfitting. Moreover, guided gradient-weighted class activation mapping was used to reveal the relevant features. Results: A total of 8124 face images of patients with cancer (men: n=3851, 47.4%; women: n=4273, 52.6%) were collected from January 2018 to January 2019. The ages of the patients ranged from 1 year to 70 years (median age 52 years). The average faces of both male and female patients with cancer displayed more obvious facial adiposity than the average faces of people without cancer, which was supported by a landmark comparison. When testing the data set, the training process was terminated after 5 epochs. The area under the receiver operating characteristic curve was 0.94, and the accuracy rate was 0.82. The main relevant feature of cancer cases was facial skin, while the relevant features of noncancer cases were extracted from the complementary face region. Conclusions: In this study, we built a face data set of patients with cancer and constructed a deep learning model to classify the faces of people with and those without cancer. We found that facial skin and adiposity were closely related to the presence of cancer. 
%M 32347802 %R 10.2196/17234 %U http://www.jmir.org/2020/4/e17234/ %U https://doi.org/10.2196/17234 %U http://www.ncbi.nlm.nih.gov/pubmed/32347802 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 4 %P e17125 %T A Deep Artificial Neural Network−Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation %A Falissard,Louis %A Morgand,Claire %A Roussel,Sylvie %A Imbaud,Claire %A Ghosn,Walid %A Bounebache,Karim %A Rey,Grégoire %+ Inserm (Institut National de la Santé et de la Recherche Médicale) - CépiDc (Centre d'epidémiologie sur les causes médicales de Décès), 80 Rue du Général Leclerc, Le Kremlin Bicêtre, 94270, France, 33 679649178, louis.falissard@gmail.com %K machine learning %K deep learning %K mortality statistics %K underlying cause of death %D 2020 %7 28.4.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans with potential assistance from expert systems, such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrepancies, thus severely impairing the comparability of death statistics at the international level. The recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems that were typically considered out of reach without human assistance; they require a considerable amount of data to learn from, which is typically their main limiting factor. However, the CépiDc (Centre d’épidémiologie sur les causes médicales de Décès) stores an exhaustive database of death certificates at the French national scale, amounting to several millions of training examples available for the machine learning practitioner. 
Objective: This article investigates the application of deep neural network methods to coding underlying causes of death. Methods: The investigated dataset was based on data from every French death certificate from 2000 to 2015, containing information such as the subject’s age and gender, as well as the chain of events leading to his or her death, for a total of around 8 million observations. The task of automatically coding the subject’s underlying cause of death was then formulated as a predictive modelling problem. A deep neural network−based model was then designed and fit to the dataset. Its error rate was then assessed on an exterior test dataset and compared to the current state-of-the-art (ie, the Iris software). Statistical significance of the proposed approach’s superiority was assessed via bootstrap. Results: The proposed approach resulted in a test accuracy of 97.8% (95% CI 97.7-97.9), which constitutes a significant improvement over the current state-of-the-art and its accuracy of 74.5% (95% CI 74.0-75.0) assessed on the same test example. Such an improvement opens up a whole field of new applications, from nosologist-level batch-automated coding to international and temporal harmonization of cause of death statistics. A typical example of such an application is demonstrated by recoding French overdose-related deaths from 2000 to 2010. Conclusions: This article shows that deep artificial neural networks are perfectly suited to the analysis of electronic health records and can learn a complex set of medical rules directly from voluminous datasets, without any explicit prior knowledge. Although not entirely free from mistakes, the derived algorithm constitutes a powerful decision-making tool that is able to handle structured medical data with an unprecedented performance. 
We strongly believe that the methods developed in this article are highly reusable in a variety of settings related to epidemiology, biostatistics, and the medical sciences in general. %M 32343252 %R 10.2196/17125 %U http://medinform.jmir.org/2020/4/e17125/ %U https://doi.org/10.2196/17125 %U http://www.ncbi.nlm.nih.gov/pubmed/32343252 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 9 %N 4 %P e17490 %T Nursing in the Age of Artificial Intelligence: Protocol for a Scoping Review %A Buchanan,Christine %A Howitt,M Lyndsay %A Wilson,Rita %A Booth,Richard G %A Risling,Tracie %A Bamford,Megan %+ Registered Nurses' Association of Ontario, 158 Pearl Street, Toronto, ON, M5H 1L3, Canada, 1 800 268 7199 ext 281, cbuchanan@rnao.ca %K nursing %K artificial intelligence %K machine learning %K robotics %K compassionate care %K scoping review %D 2020 %7 16.4.2020 %9 Protocol %J JMIR Res Protoc %G English %X Background: It is predicted that digital health technologies that incorporate artificial intelligence will transform health care delivery in the next decade. Little research has explored how emerging trends in artificial intelligence–driven digital health technologies may influence the relationship between nurses and patients. Objective: The purpose of this scoping review is to summarize the findings from 4 research questions regarding emerging trends in artificial intelligence–driven digital health technologies and their influence on nursing practice across the 5 domains outlined by the Canadian Nurses Association framework: administration, clinical care, education, policy, and research. Specifically, this scoping review will examine how emerging trends will transform the roles and functions of nurses over the next 10 years and beyond. 
Methods: Using an established scoping review methodology, MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Embase, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Centre, Scopus, Web of Science, and Proquest databases were searched. In addition to the electronic database searches, a targeted website search will be performed to access relevant grey literature. Abstracts and full-text studies will be independently screened by 2 reviewers using prespecified inclusion and exclusion criteria. Included literature will focus on nursing and digital health technologies that incorporate artificial intelligence. Data will be charted using a structured form and narratively summarized. Results: Electronic database searches have retrieved 10,318 results. The scoping review and subsequent briefing paper will be completed by the fall of 2020. Conclusions: A symposium will be held to share insights gained from this scoping review with key thought leaders and a cross section of stakeholders from administration, clinical care, education, policy, and research as well as patient advocates. The symposium will provide a forum to explore opportunities for action to advance the future of nursing in a technological world and, more specifically, nurses’ delivery of compassionate care in the age of artificial intelligence. Results from the symposium will be summarized in the form of a briefing paper and widely disseminated to relevant stakeholders. 
International Registered Report Identifier (IRRID): DERR1-10.2196/17490 %M 32297873 %R 10.2196/17490 %U http://www.researchprotocols.org/2020/4/e17490/ %U https://doi.org/10.2196/17490 %U http://www.ncbi.nlm.nih.gov/pubmed/32297873 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 4 %P e15876 %T Leveraging Eye Tracking to Prioritize Relevant Medical Record Data: Comparative Machine Learning Study %A King,Andrew J %A Cooper,Gregory F %A Clermont,Gilles %A Hochheiser,Harry %A Hauskrecht,Milos %A Sittig,Dean F %A Visweswaran,Shyam %+ Department of Biomedical Informatics, University of Pittsburgh, The Offices at Baum, 5607 Baum Blvd., Suite 523, Pittsburgh, PA, United States, 1 412 648 7119, shv3@pitt.edu %K electronic medical record system %K eye tracking %K machine learning %K intensive care unit %K information-seeking behavior %D 2020 %7 2.4.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Electronic medical record (EMR) systems capture large amounts of data per patient and present that data to physicians with little prioritization. Without prioritization, physicians must mentally identify and collate relevant data, an activity that can lead to cognitive overload. To mitigate cognitive overload, a Learning EMR (LEMR) system prioritizes the display of relevant medical record data. Relevant data are those that are pertinent to a context—defined as the combination of the user, clinical task, and patient case. To determine which data are relevant in a specific context, a LEMR system uses supervised machine learning models of physician information-seeking behavior. Since obtaining information-seeking behavior data via manual annotation is slow and expensive, automatic methods for capturing such data are needed. Objective: The goal of the research was to propose and evaluate eye tracking as a high-throughput method to automatically acquire physician information-seeking behavior useful for training models for a LEMR system. 
Methods: Critical care medicine physicians reviewed intensive care unit patient cases in an EMR interface developed for the study. Participants manually identified patient data that were relevant in the context of a clinical task: preparing a patient summary to present at morning rounds. We used eye tracking to capture each physician’s gaze dwell time on each data item (eg, blood glucose measurements). Manual annotations and gaze dwell times were used to define target variables for developing supervised machine learning models of physician information-seeking behavior. We compared the performance of manual selection and gaze-derived models on an independent set of patient cases. Results: A total of 68 pairs of manual selection and gaze-derived machine learning models were developed from training data and evaluated on an independent evaluation data set. A paired Wilcoxon signed-rank test showed similar performance of manual selection and gaze-derived models on area under the receiver operating characteristic curve (P=.40). Conclusions: We used eye tracking to automatically capture physician information-seeking behavior and used it to train models for a LEMR system. The models that were trained using eye tracking performed like models that were trained using manual annotations. These results support further development of eye tracking as a high-throughput method for training clinical decision support systems that prioritize the display of relevant medical record data. 
%M 32238342 %R 10.2196/15876 %U https://www.jmir.org/2020/4/e15876 %U https://doi.org/10.2196/15876 %U http://www.ncbi.nlm.nih.gov/pubmed/32238342 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 9 %N 1 %P e16606 %T Use of Artificial Intelligence for Medical Literature Search: Randomized Controlled Trial Using the Hackathon Format %A Schoeb,Dominik %A Suarez-Ibarrola,Rodrigo %A Hein,Simon %A Dressler,Franz Friedrich %A Adams,Fabian %A Schlager,Daniel %A Miernik,Arkadiusz %+ Medical Center – Department of Urology, Faculty of Medicine, University of Freiburg, Freiburg, Germany, 49 076127025823, dominik.stefan.schoeb@uniklinik-freiburg.de %K artificial intelligence %K literature review %K medical information technology %D 2020 %7 30.3.2020 %9 Original Paper %J Interact J Med Res %G English %X Background: Mapping out the research landscape around a project is often time consuming and difficult. Objective: This study evaluates a commercial artificial intelligence (AI) search engine (IRIS.AI) for its applicability in an automated literature search on a specific medical topic. Methods: To evaluate the AI search engine in a standardized manner, the concept of a science hackathon was applied. Three groups of researchers were tasked with performing a literature search on a clearly defined scientific project. All participants had a high level of expertise for this specific field of research. Two groups were given access to the AI search engine IRIS.AI. All groups were given the same amount of time for their search and were instructed to document their results. Search results were summarized and ranked according to a predetermined scoring system. Results: The final scoring awarded 49 and 39 points out of 60 to AI groups 1 and 2, respectively, and the control group received 46 points. A total of 20 scientific studies with high relevance were identified, and 5 highly relevant studies (“spot on”) were reported by each group. 
Conclusions: AI technology is a promising approach to facilitate literature searches and the management of medical libraries. In this study, however, the application of AI technology led to a more focused literature search without a significant improvement in the number of results. %M 32224481 %R 10.2196/16606 %U http://www.i-jmr.org/2020/1/e16606/ %U https://doi.org/10.2196/16606 %U http://www.ncbi.nlm.nih.gov/pubmed/32224481 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 3 %P e16235 %T User Experiences of Social Support From Companion Chatbots in Everyday Contexts: Thematic Analysis %A Ta,Vivian %A Griffith,Caroline %A Boatfield,Carolynn %A Wang,Xinyu %A Civitello,Maria %A Bader,Haley %A DeCero,Esther %A Loggarakis,Alexia %+ Lake Forest College, 555 N Sheridan Rd, Lake Forest, IL, 60045, United States, 1 682 203 0820, vpta538@gmail.com %K artificial intelligence %K social support %K artificial agents %K chatbots %K interpersonal relations %D 2020 %7 6.3.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Previous research suggests that artificial agents may be a promising source of social support for humans. However, the bulk of this research has been conducted in the context of social support interventions that specifically address stressful situations or health improvements. Little research has examined social support received from artificial agents in everyday contexts. Objective: Considering that social support manifests in not only crises but also everyday situations and that everyday social support forms the basis of support received during more stressful events, we aimed to investigate the types of everyday social support that can be received from artificial agents. Methods: In Study 1, we examined publicly available user reviews (N=1854) of Replika, a popular companion chatbot. In Study 2, a sample (n=66) of Replika users provided detailed open-ended responses regarding their experiences of using Replika. 
We conducted thematic analysis on both datasets to gain insight into the kind of everyday social support that users receive through interactions with Replika. Results: Replika provides some level of companionship that can help curtail loneliness, provide a “safe space” in which users can discuss any topic without the fear of judgment or retaliation, increase positive affect through uplifting and nurturing messages, and provide helpful information/advice when normal sources of informational support are not available. Conclusions: Artificial agents may be a promising source of everyday social support, particularly companionship, emotional, informational, and appraisal support, but not as tangible support. Future studies are needed to determine who might benefit from these types of everyday social support the most and why. These results could potentially be used to help address global health issues or other crises early on in everyday situations before they potentially manifest into larger issues. %M 32141837 %R 10.2196/16235 %U http://www.jmir.org/2020/2/e16235/ %U https://doi.org/10.2196/16235 %U http://www.ncbi.nlm.nih.gov/pubmed/32141837 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 2 %P e16866 %T The Economic Impact of Artificial Intelligence in Health Care: Systematic Review %A Wolff,Justus %A Pauling,Josch %A Keck,Andreas %A Baumbach,Jan %+ TUM School of Life Sciences Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, Freising, 85354, Germany, 49 40329012 0, justus.wolff@syte-institute.com %K telemedicine %K artificial intelligence %K machine learning %K cost-benefit analysis %D 2020 %7 20.2.2020 %9 Review %J J Med Internet Res %G English %X Background: Positive economic impact is a key decision factor in making the case for or against investing in an artificial intelligence (AI) solution in the health care industry. 
It is most relevant for the care provider and insurer as well as for the pharmaceutical and medical technology sectors. Although the broad economic impact of digital health solutions in general has been assessed many times in literature and the benefit for patients and society has also been analyzed, the specific economic impact of AI in health care has been addressed only sporadically. Objective: This study aimed to systematically review and summarize the cost-effectiveness studies dedicated to AI in health care and to assess whether they meet the established quality criteria. Methods: In a first step, the quality criteria for economic impact studies were defined based on the established and adapted criteria schemes for cost impact assessments. In a second step, a systematic literature review based on qualitative and quantitative inclusion and exclusion criteria was conducted to identify relevant publications for an in-depth analysis of the economic impact assessment. In a final step, the quality of the identified economic impact studies was evaluated based on the defined quality criteria for cost-effectiveness studies. Results: Very few publications have thoroughly addressed the economic impact assessment, and the economic assessment quality of the reviewed publications on AI shows severe methodological deficits. Only 6 out of 66 publications could be included in the second step of the analysis based on the inclusion criteria. Out of these 6 studies, none comprised a methodologically complete cost impact analysis. There are two areas for improvement in future studies. First, the initial investment and operational costs for the AI infrastructure and service need to be included. Second, alternatives to achieve similar impact must be evaluated to provide a comprehensive comparison. 
Conclusions: This systematic literature analysis proved that the existing impact assessments show methodological deficits and that upcoming evaluations require more comprehensive economic analyses to enable economic decisions for or against implementing AI technology in health care. %M 32130134 %R 10.2196/16866 %U http://www.jmir.org/2020/2/e16866/ %U https://doi.org/10.2196/16866 %U http://www.ncbi.nlm.nih.gov/pubmed/32130134 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 2 %P e17061 %T Detection of Postictal Generalized Electroencephalogram Suppression: Random Forest Approach %A Li,Xiaojin %A Tao,Shiqiang %A Jamal-Omidi,Shirin %A Huang,Yan %A Lhatoo,Samden D %A Zhang,Guo-Qiang %A Cui,Licong %+ School of Biomedical Informatics, University of Texas Health Science Center, 7000 Fannin St, Houston, TX, 77030, United States, 1 7135003791, licong.cui@uth.tmc.edu %K epilepsy %K generalized tonic-clonic seizure %K postictal generalized EEG suppression %K EEG %K random forest %D 2020 %7 14.2.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Sudden unexpected death in epilepsy (SUDEP) is second only to stroke in neurological events resulting in years of potential life lost. Postictal generalized electroencephalogram (EEG) suppression (PGES) is a period of suppressed brain activity often occurring after generalized tonic-clonic seizure, a most significant risk factor for SUDEP. Therefore, PGES has been considered as a potential biomarker for SUDEP risk. Automatic PGES detection tools can address the limitations of labor-intensive, and sometimes inconsistent, visual analysis. A successful approach to automatic PGES detection must overcome computational challenges involved in the detection of subtle amplitude changes in EEG recordings, which may contain physiological and acquisition artifacts. 
Objective: This study aimed to present a random forest approach for automatic PGES detection using multichannel human EEG recordings acquired in epilepsy monitoring units. Methods: We used a combination of temporal, frequency, wavelet, and interchannel correlation features derived from EEG signals to train a random forest classifier. We also constructed and applied confidence-based correction rules based on PGES state changes. Motivated by practical utility, we introduced a new, time distance–based evaluation method for assessing the performance of PGES detection algorithms. Results: The time distance–based evaluation showed that our approach achieved a 5-second tolerance-based positive prediction rate of 0.95 for artifact-free signals. For signals with different artifact levels, our prediction rates varied from 0.68 to 0.81. Conclusions: We introduced a feature-based, random forest approach for automatic PGES detection using multichannel EEG recordings. Our approach achieved increasingly better time distance–based performance with reduced signal artifact levels. Further study is needed for PGES detection algorithms to perform well irrespective of the levels of signal artifacts. 
%M 32130173 %R 10.2196/17061 %U https://medinform.jmir.org/2020/2/e17061 %U https://doi.org/10.2196/17061 %U http://www.ncbi.nlm.nih.gov/pubmed/32130173 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 1 %P e15510 %T Longitudinal Risk Prediction of Chronic Kidney Disease in Diabetic Patients Using a Temporal-Enhanced Gradient Boosting Machine: Retrospective Cohort Study %A Song,Xing %A Waitman,Lemuel R %A Yu,Alan SL %A Robbins,David C %A Hu,Yong %A Liu,Mei %+ University of Kansas Medical Center, Department of Internal Medicine, Division of Medical Informatics, 3901 Rainbow Boulevard, Kansas City, KS, 66160, United States, 1 9139456446, meiliu@kumc.edu %K diabetic kidney disease %K diabetic nephropathy %K chronic kidney disease %K machine learning %D 2020 %7 31.1.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Artificial intelligence–enabled electronic health record (EHR) analysis can revolutionize medical practice from the diagnosis and prediction of complex diseases to making recommendations in patient care, especially for chronic conditions such as chronic kidney disease (CKD), which is one of the most frequent complications in patients with diabetes and is associated with substantial morbidity and mortality. Objective: The longitudinal prediction of health outcomes requires effective representation of temporal data in the EHR. In this study, we proposed a novel temporal-enhanced gradient boosting machine (GBM) model that dynamically updates and ensembles learners based on new events in patient timelines to improve the prediction accuracy of CKD among patients with diabetes. Methods: Using a broad spectrum of deidentified EHR data on a retrospective cohort of 14,039 adult patients with type 2 diabetes and GBM as the base learner, we validated our proposed Landmark-Boosting model against three state-of-the-art temporal models for rolling predictions of 1-year CKD risk. 
Results: The proposed model uniformly outperformed other models, achieving an area under receiver operating curve of 0.83 (95% CI 0.76-0.85), 0.78 (95% CI 0.75-0.82), and 0.82 (95% CI 0.78-0.86) in predicting CKD risk with automatic accumulation of new data in later years (years 2, 3, and 4 since diabetes mellitus onset, respectively). The Landmark-Boosting model also maintained the best calibration across moderate- and high-risk groups and over time. The experimental results demonstrated that the proposed temporal model can not only accurately predict 1-year CKD risk but also improve performance over time with additionally accumulated data, which is essential for clinical use to improve renal management of patients with diabetes. Conclusions: Incorporation of temporal information in EHR data can significantly improve predictive model performance and will particularly benefit patients who follow up with their physicians as recommended. %M 32012067 %R 10.2196/15510 %U http://medinform.jmir.org/2020/1/e15510/ %U https://doi.org/10.2196/15510 %U http://www.ncbi.nlm.nih.gov/pubmed/32012067 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 1 %P e14679 %T Patient Perspectives on the Usefulness of an Artificial Intelligence–Assisted Symptom Checker: Cross-Sectional Survey Study %A Meyer,Ashley N D %A Giardina,Traber D %A Spitzmueller,Christiane %A Shahid,Umber %A Scott,Taylor M T %A Singh,Hardeep %+ Center for Innovations in Quality, Effectiveness and Safety, Michael E DeBakey Veterans Affairs Medical Center and Baylor College of Medicine, 2002 Holcombe Blvd #152, Houston, TX, United States, 1 7134404660, ameyer@bcm.edu %K clinical decision support systems %K technology %K diagnosis %K patient safety %K symptom checker %K computer-assisted diagnosis %D 2020 %7 30.1.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Patients are increasingly seeking Web-based symptom checkers to obtain diagnoses. 
However, little is known about the characteristics of the patients who use these resources, their rationale for use, and whether they find them accurate and useful. Objective: The study aimed to examine patients’ experiences using an artificial intelligence (AI)–assisted online symptom checker. Methods: An online survey was administered between March 2, 2018, and March 15, 2018, to US users of the Isabel Symptom Checker within 6 months of their use. User characteristics, experiences of symptom checker use, experiences discussing results with physicians, and prior personal history of experiencing a diagnostic error were collected. Results: A total of 329 usable responses was obtained. The mean respondent age was 48.0 (SD 16.7) years; most were women (230/304, 75.7%) and white (271/304, 89.1%). Patients most commonly used the symptom checker to better understand the causes of their symptoms (232/304, 76.3%), followed by deciding whether to seek care (101/304, 33.2%) or where (eg, primary or urgent care: 63/304, 20.7%), obtaining medical advice without going to a doctor (48/304, 15.8%), and understanding their diagnoses better (39/304, 12.8%). Most patients reported receiving useful information for their health problems (274/304, 90.1%), with half reporting positive health effects (154/302, 51.0%). Most patients perceived it to be useful as a diagnostic tool (253/301, 84.1%), as a tool providing insights leading them closer to correct diagnoses (231/303, 76.2%), and reported they would use it again (278/304, 91.4%). Patients who discussed findings with their physicians (103/213, 48.4%) more often felt physicians were interested (42/103, 40.8%) than not interested in learning about the tool’s results (24/103, 23.3%) and more often felt physicians were open (62/103, 60.2%) than not open (21/103, 20.4%) to discussing the results. 
Compared with patients who had not previously experienced diagnostic errors (missed or delayed diagnoses: 123/304, 40.5%), patients who had previously experienced diagnostic errors (181/304, 59.5%) were more likely to use the symptom checker to determine where they should seek care (15/123, 12.2% vs 48/181, 26.5%; P=.002), but they less often felt that physicians were interested in discussing the tool’s results (20/34, 59% vs 22/69, 32%; P=.04). Conclusions: Despite ongoing concerns about symptom checker accuracy, a large patient-user group perceived an AI-assisted symptom checker as useful for diagnosis. Formal validation studies evaluating symptom checker accuracy and effectiveness in real-world practice could provide additional useful information about their benefit. %M 32012052 %R 10.2196/14679 %U http://www.jmir.org/2020/1/e14679/ %U https://doi.org/10.2196/14679 %U http://www.ncbi.nlm.nih.gov/pubmed/32012052 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 1 %P e15645 %T The Detection of Opioid Misuse and Heroin Use From Paramedic Response Documentation: Machine Learning for Improved Surveillance %A Prieto,José Tomás %A Scott,Kenneth %A McEwen,Dean %A Podewils,Laura J %A Al-Tayyib,Alia %A Robinson,James %A Edwards,David %A Foldy,Seth %A Shlay,Judith C %A Davidson,Arthur J %+ Division of Scientific Education and Professional Development, Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta, GA, 30333, United States, 1 3036024487, josetomasprieto@gmail.com %K naloxone %K emergency medical services %K natural language processing %K heroin %K substance-related disorders %K opioid crisis %K artificial intelligence %D 2020 %7 3.1.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Timely, precise, and localized surveillance of nonfatal events is needed to improve response and prevention of opioid-related problems in an evolving opioid crisis in the United States. 
Records of naloxone administration found in prehospital emergency medical services (EMS) data have helped estimate opioid overdose incidence, including nonhospital, field-treated cases. However, as naloxone is often used by EMS personnel in unconsciousness of unknown cause, attributing naloxone administration to opioid misuse and heroin use (OM) may misclassify events. Better methods are needed to identify OM. Objective: This study aimed to develop and test a natural language processing method that would improve identification of potential OM from paramedic documentation. Methods: First, we searched Denver Health paramedic trip reports from August 2017 to April 2018 for keywords naloxone, heroin, and both combined, and we reviewed narratives of identified reports to determine whether they constituted true cases of OM. Then, we used this human classification as reference standard and trained 4 machine learning models (random forest, k-nearest neighbors, support vector machines, and L1-regularized logistic regression). We selected the algorithm that produced the highest area under the receiver operating curve (AUC) for model assessment. Finally, we compared positive predictive value (PPV) of the highest performing machine learning algorithm with PPV of searches of keywords naloxone, heroin, and combination of both in the binary classification of OM in unseen September 2018 data. Results: In total, 54,359 trip reports were filed from August 2017 to April 2018. Approximately 1.09% (594/54,359) indicated naloxone administration. Among trip reports with reviewer agreement regarding OM in the narrative, 57.6% (292/516) were considered to include information revealing OM. Approximately 1.63% (884/54,359) of all trip reports mentioned heroin in the narrative. Among trip reports with reviewer agreement, 95.5% (784/821) were considered to include information revealing OM. Combined results accounted for 2.39% (1298/54,359) of trip reports. 
Among trip reports with reviewer agreement, 77.79% (907/1166) were considered to include information consistent with OM. The reference standard used to train and test machine learning models included details of 1166 trip reports. L1-regularized logistic regression was the highest performing algorithm (AUC=0.94; 95% CI 0.91-0.97) in identifying OM. Tested on 5983 unseen reports from September 2018, the keyword naloxone inaccurately identified and underestimated probable OM trip report cases (63 cases; PPV=0.68). The keyword heroin yielded more cases with improved performance (129 cases; PPV=0.99). Combined keyword and L1-regularized logistic regression classifier further improved performance (146 cases; PPV=0.99). Conclusions: A machine learning application enhanced the effectiveness of finding OM among documented paramedic field responses. This approach to refining OM surveillance may lead to improved first-responder and public health responses toward prevention of overdoses and other opioid-related problems in US communities. %M 31899451 %R 10.2196/15645 %U https://www.jmir.org/2020/1/e15645 %U https://doi.org/10.2196/15645 %U http://www.ncbi.nlm.nih.gov/pubmed/31899451 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 1 %P e13244 %T Applicability of the User Engagement Scale to Mobile Health: A Survey-Based Quantitative Study %A Holdener,Marianne %A Gut,Alain %A Angerer,Alfred %+ Winterthur Institute of Health Economics, School of Management and Law, Zurich University of Applied Sciences, Gertrudstrasse 15, Winterthur, 8401, Switzerland, 41 798145158, marianneholdener@bluemail.ch %K mobile health %K mhealth %K mobile apps %K user engagement %K measurement %K user engagement scale %K chatbot %D 2020 %7 3.1.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: There has recently been exponential growth in the development and use of health apps on mobile phones. 
As with most mobile apps, however, the majority of users abandon them quickly and after minimal use. One of the most critical factors for the success of a health app is how to support users’ commitment to their health. Despite increased interest from researchers in mobile health, few studies have examined the measurement of user engagement with health apps. Objective: User engagement is a multidimensional, complex phenomenon. The aim of this study was to understand the concept of user engagement and, in particular, to demonstrate the applicability of a user engagement scale (UES) to mobile health apps. Methods: To determine the measurability of user engagement in a mobile health context, a UES was employed, which is a psychometric tool to measure user engagement with a digital system. This was adapted to Ada, developed by Ada Health, an artificial intelligence–powered personalized health guide that helps people understand their health. A principal component analysis (PCA) with varimax rotation was conducted on 30 items. In addition, sum scores as means of each subscale were calculated. Results: Survey data from 73 Ada users were analyzed. PCA was determined to be suitable, as verified by the sampling adequacy of Kaiser-Meyer-Olkin=0.858, a significant Bartlett test of sphericity (χ²(300)=1127.1; P<.001), and communalities mostly within the 0.7 range. Although 5 items had to be removed because of low factor loadings, the results of the remaining 25 items revealed 4 attributes: perceived usability, aesthetic appeal, reward, and focused attention. Ada users showed the highest engagement level with perceived usability, with a value of 294, followed by aesthetic appeal, reward, and focused attention. Conclusions: Although the UES was deployed in German and adapted to another digital domain, PCA yielded consistent subscales and a 4-factor structure. This indicates that user engagement with health apps can be assessed with the German version of the UES. 
These results can benefit related mobile health app engagement research and may be of importance to marketers and app developers. %M 31899454 %R 10.2196/13244 %U https://mhealth.jmir.org/2020/1/e13244 %U https://doi.org/10.2196/13244 %U http://www.ncbi.nlm.nih.gov/pubmed/31899454 %0 Journal Article %@ 2561-7605 %I JMIR Publications %V 2 %N 2 %P e15381 %T Exploring Older Adults’ Beliefs About the Use of Intelligent Assistants for Consumer Health Information Management: A Participatory Design Study %A Martin-Hammond,Aqueasha %A Vemireddy,Sravani %A Rao,Kartik %+ Department of Human-Centered Computing, School of Informatics and Computing, Indiana University-Purdue University Indianapolis, 535 West Michigan St, Indianapolis, IN, 46202, United States, 1 3172787686, aqumarti@iupui.edu %K intelligent assistants %K artificial intelligence %K chatbots %K conversational agents %K digital health %K elderly %K aging in place %K participatory design %K co-design %K health information seeking %D 2019 %7 11.12.2019 %9 Original Paper %J JMIR Aging %G English %X Background: Intelligent assistants (IAs), also known as intelligent agents, use artificial intelligence to help users achieve a goal or complete a task. IAs represent a potential solution for providing older adults with individualized assistance at home, for example, to reduce social isolation, serve as memory aids, or help with disease management. However, to design IAs for health that are beneficial and accepted by older adults, it is important to understand their beliefs about IAs, how they would like to interact with IAs for consumer health, and how they desire to integrate IAs into their homes. Objective: We explore older adults’ mental models and beliefs about IAs, the tasks they want IAs to support, and how they would like to interact with IAs for consumer health. For the purpose of this study, we focus on IAs in the context of consumer health information management and search. 
Methods: We present findings from an exploratory, qualitative study that investigated older adults’ perspectives of IAs that aid with consumer health information search and management tasks. Eighteen older adults participated in a multiphase, participatory design workshop in which we engaged them in discussion, brainstorming, and design activities that helped us identify their current challenges managing and finding health information at home. We also explored their beliefs and ideas for an IA to assist them with consumer health tasks. We used participatory design activities to identify areas in which they felt IAs might be useful, but also to uncover the reasoning behind the ideas they presented. Discussions were audio-recorded and later transcribed. We compiled design artifacts collected during the study to supplement researcher transcripts and notes. Thematic analysis was used to analyze data. Results: We found that participants saw IAs as potentially useful for providing recommendations, facilitating collaboration between themselves and other caregivers, and for alerts of serious illness. However, they also desired familiar and natural interactions with IAs (eg, using voice) that could, if need be, provide fluid and unconstrained interactions, reason about their symptoms, and provide information or advice. Other participants discussed the need for flexible IAs that could be used by those with low technical resources or skills. Conclusions: From our findings, we present a discussion of three key components of participants’ mental models, including the people, behaviors, and interactions they described that were important for IAs for consumer health information management and seeking. We then discuss the role of access, transparency, caregivers, and autonomy in design for addressing participants’ concerns about privacy and trust as well as its role in assisting others that may interact with an IA on the older adults’ behalf. 
International Registered Report Identifier (IRRID): RR2-10.1145/3240925.3240972 %M 31825322 %R 10.2196/15381 %U http://aging.jmir.org/2019/2/e15381/ %U https://doi.org/10.2196/15381 %U http://www.ncbi.nlm.nih.gov/pubmed/31825322 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 4 %P e13430 %T Impact of Automatic Query Generation and Quality Recognition Using Deep Learning to Curate Evidence From Biomedical Literature: Empirical Study %A Afzal,Muhammad %A Hussain,Maqbool %A Malik,Khalid Mahmood %A Lee,Sungyoung %+ Department of Computer Science and Engineering, Kyung Hee University, Room 313, Yongin, 446-701, Republic of Korea, 82 312012514, sylee@oslab.khu.ac.kr %K data curation %K evidence-based medicine %K clinical decision support systems %K precision medicine %K biomedical research %K machine learning %K deep learning %D 2019 %7 9.12.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: The quality of health care is continuously improving and is expected to improve further because of the advancement of machine learning and knowledge-based techniques along with innovation and availability of wearable sensors. With these advancements, health care professionals are now becoming more interested and involved in seeking scientific research evidence from external sources for decision making relevant to medical diagnosis, treatments, and prognosis. Not much work has been done to develop methods for unobtrusive and seamless curation of data from the biomedical literature. Objective: This study aimed to design a framework that can enable bringing quality publications intelligently to the users’ desk to assist medical practitioners in answering clinical questions and fulfilling their informational needs. 
Methods: The proposed framework consists of methods for efficient biomedical literature curation, including the automatic construction of a well-built question, the recognition of evidence quality by proposing an extended quality recognition model (E-QRM), and the ranking and summarization of the extracted evidence. Results: Unlike previous works, the proposed framework systematically integrates the echelons of biomedical literature curation by including methods for searching queries, content quality assessments, and ranking and summarization. Using an ensemble approach, our high-impact classifier E-QRM obtained significantly higher accuracy than the existing quality recognition model (1723/1894, 90.97% vs 1462/1894, 77.21%). Conclusions: Our proposed methods and evaluation demonstrate the validity and rigorousness of the results, which can be used in different applications, including evidence-based medicine, precision medicine, and medical education. %M 31815673 %R 10.2196/13430 %U http://medinform.jmir.org/2019/4/e13430/ %U https://doi.org/10.2196/13430 %U http://www.ncbi.nlm.nih.gov/pubmed/31815673 %0 Journal Article %@ 2369-3762 %I JMIR Publications %V 5 %N 2 %P e16048 %T Introducing Artificial Intelligence Training in Medical Education %A Paranjape,Ketan %A Schinkel,Michiel %A Nannan Panday,Rishi %A Car,Josip %A Nanayakkara,Prabath %+ Amsterdam University Medical Center, De Boelelaan 1117, 1081 HV, Amsterdam, Netherlands, 31 3174108035, ketanp@alumni.gsb.stanford.edu %K algorithm %K artificial intelligence %K black box %K deep learning %K machine learning %K medical education %K continuing education %K data sciences %K curriculum %D 2019 %7 3.12.2019 %9 Viewpoint %J JMIR Med Educ %G English %X Health care is evolving and with it the need to reform medical education. As the practice of medicine enters the age of artificial intelligence (AI), the use of data to improve clinical decision making will grow, pushing the need for skillful medicine-machine interaction. 
As the rate of medical knowledge grows, technologies such as AI are needed to enable health care professionals to effectively use this knowledge to practice medicine. Medical professionals need to be adequately trained in this new technology, its advantages to improve cost, quality, and access to health care, and its shortfalls such as transparency and liability. AI needs to be seamlessly integrated across different aspects of the curriculum. In this paper, we have addressed the state of medical education at present and have recommended a framework on how to evolve the medical education curriculum to include AI. %M 31793895 %R 10.2196/16048 %U http://mededu.jmir.org/2019/2/e16048/ %U https://doi.org/10.2196/16048 %U http://www.ncbi.nlm.nih.gov/pubmed/31793895 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 11 %P e15406 %T Artificial Intelligence Technologies for Coping with Alarm Fatigue in Hospital Environments Because of Sensory Overload: Algorithm Development and Validation %A Fernandes,Chrystinne Oliveira %A Miles,Simon %A Lucena,Carlos José Pereira De %A Cowan,Donald %+ Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio Datacenter, 4th Fl, 225 Marquês de São Vicente St, Rio de Janeiro, 22451-900, Brazil, 55 21 3527 1510, chrystinne@gmail.com %K alert fatigue health personnel %K health information systems %K patient monitoring %K alert systems %K artificial intelligence %D 2019 %7 26.11.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Informed estimates claim that 80% to 99% of alarms set off in hospital units are false or clinically insignificant, representing a cacophony of sounds that do not present a real danger to patients. These false alarms can lead to an alert overload that causes a health care provider to miss important events that could be harmful or even life-threatening. 
As health care units become more dependent on monitoring devices for patient care purposes, the alarm fatigue issue has to be addressed as a major concern for the health care team as well as to enhance patient safety. Objective: The main goal of this paper was to propose a feasible solution for the alarm fatigue problem by using an automatic reasoning mechanism to decide how to notify members of the health care team. The aim was to reduce the number of notifications sent by determining whether or not to group a set of alarms that occur over a short period of time to deliver them together, without compromising patient safety. Methods: This paper describes: (1) a model for supporting reasoning algorithms that decide how to notify caregivers to avoid alarm fatigue; (2) an architecture for health systems that support patient monitoring and notification capabilities; and (3) a reasoning algorithm that specifies how to notify caregivers by deciding whether to aggregate a group of alarms to avoid alarm fatigue. Results: Experiments were used to demonstrate that providing a reasoning system can reduce the notifications received by the caregivers by up to 99.3% (582/586) of the total alarms generated. Our experiments were evaluated through the use of a dataset comprising patient monitoring data and vital signs recorded during 32 surgical cases where patients underwent anesthesia at the Royal Adelaide Hospital. We present the results of our algorithm by using graphs we generated using the R language, where we show whether the algorithm decided to deliver an alarm immediately or after a delay. Conclusions: The experimental results strongly suggest that this reasoning algorithm is a useful strategy for avoiding alarm fatigue. Although we evaluated our algorithm in an experimental environment, we tried to reproduce the context of a clinical environment by using real-world patient data. 
Our future work is to reproduce the evaluation study based on more realistic clinical conditions by increasing the number of patients, monitoring parameters, and types of alarm. %M 31769762 %R 10.2196/15406 %U http://www.jmir.org/2019/11/e15406/ %U https://doi.org/10.2196/15406 %U http://www.ncbi.nlm.nih.gov/pubmed/31769762 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 11 %P e16295 %T The Real Era of the Art of Medicine Begins with Artificial Intelligence %A Meskó,Bertalan %+ The Medical Futurist Institute, Povl Bang-Jensen u. 2/B1. 4/1, Budapest, 1118, Hungary, 36 703807260, berci@medicalfuturist.com %K future %K artificial intelligence %K digital health %K technology %K art of medicine %D 2019 %7 18.11.2019 %9 Viewpoint %J J Med Internet Res %G English %X Physicians have been performing the art of medicine for hundreds of years, and since the ancient era, patients have turned to physicians for help, advice, and cures. When the fathers of medicine started writing down their experience, knowledge, and observations, treating medical conditions became a structured process, with textbooks and professors sharing their methods over generations. After evidence-based medicine was established as the new form of medical science, the art and science of medicine had to be connected. As a result, by the end of the 20th century, health care had become highly dependent on technology. From electronic medical records, telemedicine, three-dimensional printing, algorithms, and sensors, technology has started to influence medical decisions and the lives of patients. While digital health technologies might be considered a threat to the art of medicine, I argue that advanced technologies, such as artificial intelligence, will initiate the real era of the art of medicine. Through the use of reinforcement learning, artificial intelligence could become the stethoscope of the 21st century. 
If we embrace these tools, the real art of medicine will begin now with the era of artificial intelligence. %M 31738169 %R 10.2196/16295 %U http://www.jmir.org/2019/11/e16295/ %U https://doi.org/10.2196/16295 %U http://www.ncbi.nlm.nih.gov/pubmed/31738169 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 8 %N 11 %P e14245 %T Real-Time Detection of Behavioral Anomalies of Older People Using Artificial Intelligence (The 3-PEGASE Study): Protocol for a Real-Life Prospective Trial %A Piau,Antoine %A Lepage,Benoit %A Bernon,Carole %A Gleizes,Marie-Pierre %A Nourhashemi,Fati %+ Gérontopôle, University Hospital of Toulouse, 4 Rue du Pont St Pierre, Toulouse, F-31400, France, 33 659561628, antoinepiau@hotmail.com %K frailty %K monitoring %K sensors %K artificial intelligence %K older adults %K participatory design %D 2019 %7 18.11.2019 %9 Protocol %J JMIR Res Protoc %G English %X Background: Most frail older persons are living at home, and we face difficulties in achieving seamless monitoring to detect adverse health changes. Even more important, this lack of follow-up could have a negative impact on the living choices made by older individuals and their care partners. People could give up their homes for the more reassuring environment of a medicalized living facility. We have developed a low-cost unobtrusive sensor-based solution to trigger automatic alerts in case of an acute event or subtle changes over time. It could facilitate older adults’ follow-up in their own homes, and thus support independent living. Objective: The primary objective of this prospective open-label study is to evaluate the relevance of the automatic alerts generated by our artificial intelligence–driven monitoring solution as judged by the recipients: older adults, caregivers, and professional support workers. The secondary objective is to evaluate its ability to detect subtle functional and cognitive decline and major medical events. 
Methods: The primary outcome will be evaluated for each successive 2-month follow-up period to estimate the progression of our learning algorithm performance over time. In total, 25 frail or disabled participants, aged 75 years and above and living alone in their own homes, will be enrolled for a 6-month follow-up period. Results: The first phase with 5 participants for a 4-month feasibility period has been completed and the expected completion date for the second phase of the study (20 participants for 6 months) is July 2020. Conclusions: The originality of our real-life project lies in the choice of the primary outcome and in our user-centered evaluation. We will evaluate the relevance of the alerts and the algorithm performance over time according to the end users. The first-line recipients of the information are the older adults and their care partners rather than health care professionals. Despite the fast pace of electronic health devices development, few studies have addressed the specific everyday needs of older adults and their families. 
Trial Registration: ClinicalTrials.gov NCT03484156; https://clinicaltrials.gov/ct2/show/NCT03484156 International Registered Report Identifier (IRRID): PRR1-10.2196/14245 %M 31738180 %R 10.2196/14245 %U http://www.researchprotocols.org/2019/11/e14245/ %U https://doi.org/10.2196/14245 %U http://www.ncbi.nlm.nih.gov/pubmed/31738180 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 11 %P e16607 %T Unlocking the Power of Artificial Intelligence and Big Data in Medicine %A Lovis,Christian %+ Division of Medical Information Sciences, University Hospitals of Geneva, Gabrielle Perret Gentil 4, Geneva, 1205, Switzerland, 41 22 37 26201, Christian.Lovis@hcuge.ch %K medical informatics %K artificial intelligence %K big data %D 2019 %7 8.11.2019 %9 Viewpoint %J J Med Internet Res %G English %X Data-driven science and its corollaries in machine learning and the wider field of artificial intelligence have the potential to drive important changes in medicine. However, medicine is not a science like any other: It is deeply and tightly bound with a large and wide network of legal, ethical, regulatory, economical, and societal dependencies. As a consequence, scientific and technological progress in handling information and its further processing and cross-linking for decision support and predictive systems must be accompanied by parallel changes in the global environment, with numerous stakeholders, including citizens and society. What can be seen at first glance as a barrier and a mechanism slowing down the progression of data science must, however, be considered an important asset. Only global adoption can transform the potential of big data and artificial intelligence into effective breakthroughs in handling health and medicine. This requires science and society, scientists and citizens, to progress together. 
%M 31702565 %R 10.2196/16607 %U https://www.jmir.org/2019/11/e16607 %U https://doi.org/10.2196/16607 %U http://www.ncbi.nlm.nih.gov/pubmed/31702565 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 11 %P e15360 %T The Personalization of Conversational Agents in Health Care: Systematic Review %A Kocaballi,Ahmet Baki %A Berkovsky,Shlomo %A Quiroz,Juan C %A Laranjo,Liliana %A Tong,Huong Ly %A Rezazadegan,Dana %A Briatore,Agustina %A Coiera,Enrico %+ Australian Institute of Health Innovation, Faculty of Medicine and Health Sciences, Macquarie University, Level 6, 75 Talavera Road, Sydney, 2109, Australia, 61 298502465, baki.kocaballi@mq.edu.au %K conversational interfaces %K conversational agents %K dialogue systems %K personalization %K customization %K adaptive systems %K health care %D 2019 %7 7.11.2019 %9 Review %J J Med Internet Res %G English %X Background: The personalization of conversational agents with natural language user interfaces is seeing increasing use in health care applications, shaping the content, structure, or purpose of the dialogue between humans and conversational agents. Objective: The goal of this systematic review was to understand the ways in which personalization has been used with conversational agents in health care and characterize the methods of its implementation. Methods: We searched on PubMed, Embase, CINAHL, PsycInfo, and ACM Digital Library using a predefined search strategy. The studies were included if they: (1) were primary research studies that focused on consumers, caregivers, or health care professionals; (2) involved a conversational agent with an unconstrained natural language interface; (3) tested the system with human subjects; and (4) implemented personalization features. Results: The search found 1958 publications. After abstract and full-text screening, 13 studies were included in the review. Common examples of personalized content included feedback, daily health reports, alerts, warnings, and recommendations. The personalization features were implemented without a theoretical framework of customization and with limited evaluation of its impact. While conversational agents with personalization features were reported to improve user satisfaction, user engagement and dialogue quality, the role of personalization in improving health outcomes was not assessed directly. 
Conclusions: Most of the studies in our review implemented the personalization features without theoretical or evidence-based support for them and did not leverage the recent developments in other domains of personalization. Future research could incorporate personalization as a distinct design factor with a more careful consideration of its impact on health outcomes and its implications on patient safety, privacy, and decision-making. %M 31697237 %R 10.2196/15360 %U https://www.jmir.org/2019/11/e15360 %U https://doi.org/10.2196/15360 %U http://www.ncbi.nlm.nih.gov/pubmed/31697237 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 11 %P e15511 %T Modeling Research Topics for Artificial Intelligence Applications in Medicine: Latent Dirichlet Allocation Application Study %A Tran,Bach Xuan %A Nghiem,Son %A Sahin,Oz %A Vu,Tuan Manh %A Ha,Giang Hai %A Vu,Giang Thu %A Pham,Hai Quang %A Do,Hoa Thi %A Latkin,Carl A %A Tam,Wilson %A Ho,Cyrus S H %A Ho,Roger C M %+ Institute for Preventive Medicine and Public Health, Hanoi Medical University, No 1 Ton That Tung Street, Hanoi, 100000, Vietnam, 84 98 222 8662, bach.ipmph@gmail.com %K artificial intelligence %K applications %K medicine %K scientometric %K bibliometric %K latent Dirichlet allocation %D 2019 %7 1.11.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Artificial intelligence (AI)–based technologies develop rapidly and have myriad applications in medicine and health care. However, there is a lack of comprehensive reporting on the productivity, workflow, topics, and research landscape of AI in this field. Objective: This study aimed to evaluate the global development of scientific publications and constructed interdisciplinary research topics on the theory and practice of AI in medicine from 1977 to 2018. Methods: We obtained bibliographic data and abstract contents of publications published between 1977 and 2018 from the Web of Science database. 
A total of 27,451 eligible articles were analyzed. Research topics were classified by latent Dirichlet allocation, and principal component analysis was used to identify the construct of the research landscape. Results: The applications of AI have mainly impacted clinical settings (enhanced prognosis and diagnosis, robot-assisted surgery, and rehabilitation), data science and precision medicine (collecting individual data for precision medicine), and policy making (raising ethical and legal issues, especially regarding privacy and confidentiality of data). However, AI applications have not been commonly used in resource-poor settings due to the limit in infrastructure and human resources. Conclusions: The application of AI in medicine has grown rapidly and focuses on three leading platforms: clinical practices, clinical material, and policies. AI might be one of the methods to narrow down the inequality in health care and medicine between developing and developed countries. Technology transfer and support from developed countries are essential measures for the advancement of AI application in health care in developing countries. 
%M 31682577 %R 10.2196/15511 %U https://www.jmir.org/2019/11/e15511 %U https://doi.org/10.2196/15511 %U http://www.ncbi.nlm.nih.gov/pubmed/31682577 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 7 %N 11 %P e14452 %T Development of a Deep Learning Model for Dynamic Forecasting of Blood Glucose Level for Type 2 Diabetes Mellitus: Secondary Analysis of a Randomized Controlled Trial %A Faruqui,Syed Hasib Akhter %A Du,Yan %A Meka,Rajitha %A Alaeddini,Adel %A Li,Chengdong %A Shirinkam,Sara %A Wang,Jing %+ Center on Smart and Connected Health Technologies, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX, United States, 1 210 450 8561, wangj1@uthscsa.edu %K type 2 diabetes %K long short-term memory (LSTM)-based recurrent neural networks (RNNs) %K glucose level prediction %K mobile health lifestyle data %D 2019 %7 1.11.2019 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Type 2 diabetes mellitus (T2DM) is a major public health burden. Self-management of diabetes including maintaining a healthy lifestyle is essential for glycemic control and to prevent diabetes complications. Mobile-based health data can play an important role in the forecasting of blood glucose levels for lifestyle management and control of T2DM. Objective: The objective of this work was to dynamically forecast daily glucose levels in patients with T2DM based on their daily mobile health lifestyle data including diet, physical activity, weight, and glucose level from the day before. Methods: We used data from 10 T2DM patients who were overweight or obese in a behavioral lifestyle intervention using mobile tools for daily monitoring of diet, physical activity, weight, and blood glucose over 6 months. We developed a deep learning model based on long short-term memory–based recurrent neural networks to forecast the next-day glucose levels in individual patients. 
The neural network used several layers of computational nodes to model how mobile health data (food intake including consumed calories, fat, and carbohydrates; exercise; and weight) were progressing from one day to another from noisy data. Results: The model was validated based on a data set of 10 patients who had been monitored daily for over 6 months. The proposed deep learning model demonstrated considerable accuracy in predicting the next-day glucose level based on the Clarke Error Grid and a ±10% range of the actual values. Conclusions: Using machine learning methodologies may leverage mobile health lifestyle data to develop effective individualized prediction plans for T2DM management. However, predicting future glucose levels is challenging as glucose level is determined by multiple factors. Future study with more rigorous study design is warranted to better predict future glucose levels for T2DM management. %M 31682586 %R 10.2196/14452 %U https://mhealth.jmir.org/2019/11/e14452 %U https://doi.org/10.2196/14452 %U http://www.ncbi.nlm.nih.gov/pubmed/31682586 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 4 %P e15980 %T Cohort Selection for Clinical Trials From Longitudinal Patient Records: Text Mining Approach %A Spasic,Irena %A Krzeminski,Dominik %A Corcoran,Padraig %A Balinsky,Alexander %+ School of Computer Science & Informatics, Cardiff University, 5 The Parade, Cardiff, CF24 3AA, United Kingdom, 44 02920870320, spasici@cardiff.ac.uk %K natural language processing %K machine learning %K electronic medical records %K clinical trial %K eligibility determination %D 2019 %7 31.10.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: Clinical trials are an important step in introducing new interventions into clinical practice by generating data on their safety and efficacy. Clinical trials need to ensure that participants are similar so that the findings can be attributed to the interventions studied and not to some other factors. 
Therefore, each clinical trial defines eligibility criteria, which describe characteristics that must be shared by the participants. Unfortunately, the complexities of eligibility criteria may not allow them to be translated directly into readily executable database queries. Instead, they may require careful analysis of the narrative sections of medical records. Manual screening of medical records is time consuming, thus negatively affecting the timeliness of the recruitment process. Objective: Track 1 of the 2018 National Natural Language Processing Clinical Challenge focused on the task of cohort selection for clinical trials, aiming to answer the following question: Can natural language processing be applied to narrative medical records to identify patients who meet eligibility criteria for clinical trials? The task required the participating systems to analyze longitudinal patient records to determine if the corresponding patients met the given eligibility criteria. We aimed to describe a system developed to address this task. Methods: Our system consisted of 13 classifiers, one for each eligibility criterion. All classifiers used a bag-of-words document representation model. To prevent the loss of relevant contextual information associated with such representation, a pattern-matching approach was used to extract context-sensitive features. They were embedded back into the text as lexically distinguishable tokens, which were consequently featured in the bag-of-words representation. Supervised machine learning was chosen wherever a sufficient number of both positive and negative instances was available to learn from. A rule-based approach focusing on a small set of relevant features was chosen for the remaining criteria. Results: The system was evaluated using microaveraged F measure. 
Overall, 4 machine learning algorithms, including support vector machine, logistic regression, naïve Bayesian classifier, and gradient tree boosting (GTB), were evaluated on the training data using 10-fold cross-validation. GTB demonstrated the most consistent performance. Its performance peaked when oversampling was used to balance the training data. The final evaluation was performed on previously unseen test data. On average, the F measure of 89.04% was comparable to 3 of the top ranked performances in the shared task (91.11%, 90.28%, and 90.21%). With an F measure of 88.14%, we significantly outperformed these systems (81.03%, 78.50%, and 70.81%) in identifying patients with advanced coronary artery disease. Conclusions: The holdout evaluation provides evidence that our system was able to identify eligible patients for the given clinical trial with high accuracy. Our approach demonstrates how rule-based knowledge infusion can improve the performance of machine learning algorithms even when trained on a relatively small dataset. 
%M 31674914 %R 10.2196/15980 %U http://medinform.jmir.org/2019/4/e15980/ %U https://doi.org/10.2196/15980 %U http://www.ncbi.nlm.nih.gov/pubmed/31674914 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 10 %P e16222 %T Trust Me, I’m a Chatbot: How Artificial Intelligence in Health Care Fails the Turing Test %A Powell,John %+ Nuffield Department of Primary Care Health Sciences, Medical Sciences Division, University of Oxford, Radcliffe Observatory Quarter, 43 Woodstock Road, Oxford, OX2 6GG, United Kingdom, 44 1865617768 ext 617768, john.powell@phc.ox.ac.uk %K artificial intelligence %K machine learning %K medical informatics %K digital health %K ehealth %K chatbots %K conversational agents %D 2019 %7 28.10.2019 %9 Viewpoint %J J Med Internet Res %G English %X Over the next decade, one issue which will dominate sociotechnical studies in health informatics is the extent to which the promise of artificial intelligence in health care will be realized, along with the social and ethical issues which accompany it. A useful thought experiment is the application of the Turing test to user-facing artificial intelligence systems in health care (such as chatbots or conversational agents). In this paper I argue that many medical decisions require value judgements and the doctor-patient relationship requires empathy and understanding to arrive at a shared decision, often handling large areas of uncertainty and balancing competing risks. Arguably, medicine requires wisdom more than intelligence, artificial or otherwise. Artificial intelligence therefore needs to supplement rather than replace medical professionals, and identifying the complementary positioning of artificial intelligence in medical consultation is a key challenge for the future. In health care, artificial intelligence needs to pass the implementation game, not the imitation game. 
%M 31661083 %R 10.2196/16222 %U http://www.jmir.org/2019/10/e16222/ %U https://doi.org/10.2196/16222 %U http://www.ncbi.nlm.nih.gov/pubmed/31661083 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 6 %N 10 %P e14166 %T Conversational Agents in the Treatment of Mental Health Problems: Mixed-Method Systematic Review %A Gaffney,Hannah %A Mansell,Warren %A Tai,Sara %+ School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, 2nd Floor, Zochonis Building, Manchester, M13 9PL, United Kingdom, 44 161 306 0400, hannah.gaffney-2@postgrad.manchester.ac.uk %K artificial intelligence %K mental health %K stress, psychological %K psychiatry %K therapy, computer-assisted %K conversational agent %K chatbot %K digital health %D 2019 %7 18.10.2019 %9 Review %J JMIR Ment Health %G English %X Background: The use of conversational agent interventions (including chatbots and robots) in mental health is growing at a fast pace. Recent reviews have focused exclusively on a subset of embodied conversational agent interventions despite other modalities aiming to achieve the common goal of improved mental health. Objective: This study aimed to review the use of conversational agent interventions in the treatment of mental health problems. Methods: We performed a systematic search using relevant databases (MEDLINE, EMBASE, PsycINFO, Web of Science, and Cochrane library). Studies that reported on an autonomous conversational agent that simulated conversation and reported on a mental health outcome were included. Results: A total of 13 studies were included in the review. Among them, 4 full-scale randomized controlled trials (RCTs) were included. The rest were feasibility studies, pilot RCTs, and quasi-experimental studies. Interventions were diverse in design and targeted a range of mental health problems using a wide variety of therapeutic orientations. All included studies reported reductions in psychological distress postintervention. 
Furthermore, 5 controlled studies demonstrated significant reductions in psychological distress compared with inactive control groups. In addition, 3 controlled studies comparing interventions with active control groups failed to demonstrate superior effects. Broader utility in promoting well-being in nonclinical populations was unclear. Conclusions: The efficacy and acceptability of conversational agent interventions for mental health problems are promising. However, a more robust experimental design is required to demonstrate efficacy and efficiency. A focus on streamlining interventions, demonstrating equivalence to other treatment modalities, and elucidating mechanisms of action has the potential to increase acceptance by users and clinicians and maximize reach. %M 31628789 %R 10.2196/14166 %U https://mental.jmir.org/2019/10/e14166 %U https://doi.org/10.2196/14166 %U http://www.ncbi.nlm.nih.gov/pubmed/31628789 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 10 %P e14316 %T Psychosocial Factors Affecting Artificial Intelligence Adoption in Health Care in China: Cross-Sectional Study %A Ye,Tiantian %A Xue,Jiaolong %A He,Mingguang %A Gu,Jing %A Lin,Haotian %A Xu,Bin %A Cheng,Yu %+ Department of Medical Humanities, The Seventh Affiliated Hospital, Sun Yat-sen University, No 628, Zhenyuan Rd Guangming (New) Dist, Shenzhen, 518107, China, 86 02084114275, chengyu@mail.sysu.edu.cn %K artificial intelligence %K adoption %K technology acceptance model %K structural equation model %K intention %K subjective norms %K trust %K moderation %D 2019 %7 17.10.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Poor quality primary health care is a major issue in China, particularly in blindness prevention. Artificial intelligence (AI) could provide early screening and accurate auxiliary diagnosis to improve primary care services and reduce unnecessary referrals, but the application of AI in medical settings is still an emerging field. 
Objective: This study aimed to investigate the general public’s acceptance of ophthalmic AI devices, with reference to those already used in China, and the interrelated influencing factors that shape people’s intention to use these devices. Methods: We proposed a model of ophthalmic AI acceptance based on technology acceptance theories and variables from other health care–related studies. The model was verified via a 32-item questionnaire with 7-point Likert scales completed by 474 respondents (nationally random sampled). Structural equation modeling was used to evaluate item and construct reliability and validity via a confirmatory factor analysis, and the model’s path effects, significance, goodness of fit, and mediation and moderation effects were analyzed. Results: Standardized factor loadings of items were between 0.583 and 0.876. Composite reliability of 9 constructs ranged from 0.673 to 0.841. The discriminant validity of all constructs met the Fornell and Larcker criteria. Model fit indicators such as standardized root mean square residual (0.057), comparative fit index (0.915), and root mean squared error of approximation (0.049) demonstrated good fit. Intention to use (R2=0.515) is significantly affected by subjective norms (beta=.408; P<.001), perceived usefulness (beta=.336; P=.03), and resistance bias (beta=–.237; P=.02). Subjective norms and perceived behavior control had an indirect impact on intention to use through perceived usefulness and perceived ease of use. Eye health consciousness had an indirect positive effect on intention to use through perceived usefulness. Trust had a significant moderation effect (beta=–.095; P=.049) on the effect path of perceived usefulness to intention to use. Conclusions: The item, construct, and model indicators indicate reliable interpretation power and help explain the levels of public acceptance of ophthalmic AI devices in China. 
The influence of subjective norms can be linked to Confucian culture, collectivism, authoritarianism, and conformity mentality in China. Overall, the use of AI in diagnostics and clinical laboratory analysis is underdeveloped, and the Chinese public are generally mistrustful of medical staff and the Chinese medical system. Stakeholders such as doctors and AI suppliers should therefore avoid making misleading or over-exaggerated claims in the promotion of AI health care products. %M 31625950 %R 10.2196/14316 %U http://www.jmir.org/2019/10/e14316/ %U https://doi.org/10.2196/14316 %U http://www.ncbi.nlm.nih.gov/pubmed/31625950 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 4 %P e14806 %T A Deep Learning Approach for Managing Medical Consumable Materials in Intensive Care Units via Convolutional Neural Networks: Technical Proof-of-Concept Study %A Peine,Arne %A Hallawa,Ahmed %A Schöffski,Oliver %A Dartmann,Guido %A Fazlic,Lejla Begic %A Schmeink,Anke %A Marx,Gernot %A Martin,Lukas %+ Department of Intensive Care Medicine and Intermediate Care, University Hospital Rheinisch-Westfälische Technische Hochschule Aachen, Pauwelsstr 30, Aachen, 52074, Germany, 49 241 800, apeine@ukaachen.de %K convolutional neural networks %K deep learning, critical care %K intensive care %K image recognition %K medical economics %K medical consumables %K artificial intelligence %K machine learning %D 2019 %7 10.10.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: High numbers of consumable medical materials (eg, sterile needles and swabs) are used during the daily routine of intensive care units (ICUs) worldwide. Although medical consumables largely contribute to total ICU hospital expenditure, many hospitals do not track the individual use of materials. 
Current tracking solutions meeting the specific requirements of the medical environment, like barcodes or radio frequency identification, require specialized material preparation and high infrastructure investment. This impedes the accurate prediction of consumption, leads to high storage maintenance costs caused by large inventories, and hinders scientific work due to inaccurate documentation. Thus, new cost-effective and contactless methods for object detection are urgently needed. Objective: The goal of this work was to develop and evaluate a contactless visual recognition system for tracking medical consumable materials in ICUs using a deep learning approach on a distributed client-server architecture. Methods: We developed Consumabot, a novel client-server optical recognition system for medical consumables, based on the convolutional neural network model MobileNet implemented in Tensorflow. The software was designed to run on single-board computer platforms as a detection unit. The system was trained to recognize 20 different materials in the ICU, while 100 sample images of each consumable material were provided. We assessed the top-1 recognition rates in the context of different real-world ICU settings: materials presented to the system without visual obstruction, 50% covered materials, and scenarios of multiple items. We further performed an analysis of variance with repeated measures to quantify the effect of adverse real-world circumstances. Results: Consumabot reached a >99% reliability of recognition after about 60 steps of training and 150 steps of validation. A desirable low cross entropy of <0.03 was reached for the training set after about 100 iteration steps and after 170 steps for the validation set. The system showed a high top-1 mean recognition accuracy in a real-world scenario of 0.85 (SD 0.11) for objects presented to the system without visual obstruction. 
Recognition accuracy was lower, but still acceptable, in scenarios where the objects were 50% covered (P<.001; mean recognition accuracy 0.71; SD 0.13) or multiple objects of the target group were present (P=.01; mean recognition accuracy 0.78; SD 0.11), compared to a nonobstructed view. The approach met the criteria of absence of explicit labeling (eg, barcodes, radio frequency labeling) while maintaining a high standard for quality and hygiene with minimal consumption of resources (eg, cost, time, training, and computational power). Conclusions: Using a convolutional neural network architecture, Consumabot consistently achieved good results in the classification of consumables and thus is a feasible way to recognize and register medical consumables directly to a hospital’s electronic health record. The system shows limitations when the materials are partially covered, because the identifying characteristics of the consumables are then not presented to the system. Further development of the assessment in different medical circumstances is needed. 
%M 31603430 %R 10.2196/14806 %U http://medinform.jmir.org/2019/4/e14806/ %U https://doi.org/10.2196/14806 %U http://www.ncbi.nlm.nih.gov/pubmed/31603430 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 4 %P e14401 %T Characterizing Artificial Intelligence Applications in Cancer Research: A Latent Dirichlet Allocation Analysis %A Tran,Bach Xuan %A Latkin,Carl A %A Sharafeldin,Noha %A Nguyen,Katherina %A Vu,Giang Thu %A Tam,Wilson W S %A Cheung,Ngai-Man %A Nguyen,Huong Lan Thi %A Ho,Cyrus S H %A Ho,Roger C M %+ Institute for Preventive Medicine and Public Health, Hanoi Medical University, No 1 Ton That Tung Street, Hanoi, 100000, Vietnam, 84 982228662, bach.ipmph@gmail.com %K scientometrics %K cancer %K artificial intelligence %K global %K mapping %D 2019 %7 15.9.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: Artificial intelligence (AI)–based therapeutics, devices, and systems are vital innovations in cancer control; particularly, they allow for diagnosis, screening, precise estimation of survival, informing therapy selection, and scaling up treatment services in a timely manner. Objective: The aim of this study was to analyze the global trends, patterns, and development of interdisciplinary landscapes in AI and cancer research. Methods: An exploratory factor analysis was conducted to identify research domains emerging from abstract contents. The Jaccard similarity index was utilized to identify the most frequently co-occurring terms. Latent Dirichlet Allocation was used for classifying papers into corresponding topics. Results: From 1991 to 2018, the number of studies examining the application of AI in cancer care has grown to 3555 papers covering therapeutics, capacities, and factors associated with outcomes. Topics with the highest volume of publications include (1) machine learning, (2) comparative effectiveness evaluation of AI-assisted medical therapies, and (3) AI-based prediction. 
Noticeably, this classification has revealed topics examining the incremental effectiveness of AI applications, the quality of life, and functioning of patients receiving these innovations. The growing research productivity and expansion of multidisciplinary approaches are largely driven by machine learning, artificial neural networks, and AI in various clinical practices. Conclusions: The research landscapes show that the development of AI in cancer care is focused on not only improving prediction in cancer screening and AI-assisted therapeutics but also on improving other corresponding areas such as precision and personalized medicine and patient-reported outcomes. %M 31573929 %R 10.2196/14401 %U https://medinform.jmir.org/2019/4/e14401 %U https://doi.org/10.2196/14401 %U http://www.ncbi.nlm.nih.gov/pubmed/31573929 %0 Journal Article %@ 2369-1999 %I JMIR Publications %V 5 %N 2 %P e12163 %T Developing Machine Learning Algorithms for the Prediction of Early Death in Elderly Cancer Patients: Usability Study %A Sena,Gabrielle Ribeiro %A Lima,Tiago Pessoa Ferreira %A Mello,Maria Julia Gonçalves %A Thuler,Luiz Claudio Santos %A Lima,Jurema Telles Oliveira %+ Department of Geriatric Oncology, Instituto de Medicina Integral Prof Fernando Figueira, Rua dos Coelhos 300, Recife, 50070-902, Brazil, 55 81 21224100, gabriellesena8@gmail.com %K geriatric assessment %K aged %K machine learning %K medical oncology %K death %D 2019 %7 26.9.2019 %9 Original Paper %J JMIR Cancer %G English %X Background: The importance of classifying cancer patients into high- or low-risk groups has led many research teams, from the biomedical and bioinformatics fields, to study the application of machine learning (ML) algorithms. The International Society of Geriatric Oncology recommends the use of the comprehensive geriatric assessment (CGA), a multidisciplinary tool to evaluate health domains, for the follow-up of elderly cancer patients. 
However, no applications of ML have been proposed using CGA to classify elderly cancer patients. Objective: The aim of this study was to propose and develop predictive models, using ML and CGA, to estimate the risk of early death in elderly cancer patients. Methods: The ability of ML algorithms to predict early mortality in a cohort involving 608 elderly cancer patients was evaluated. The CGA was conducted during admission by a multidisciplinary team and included the following questionnaires: mini-mental state examination (MMSE), geriatric depression scale-short form, international physical activity questionnaire-short form, timed up and go, Katz index of independence in activities of daily living, Charlson comorbidity index, Karnofsky performance scale (KPS), polypharmacy, and mini nutritional assessment-short form (MNA-SF). The 10-fold cross-validation algorithm was used to evaluate all possible combinations of these questionnaires to estimate the risk of early death, considered when occurring within 6 months of diagnosis, in a variety of ML classifiers, including Naive Bayes (NB), decision tree algorithm J48 (J48), and multilayer perceptron (MLP). On each fold of evaluation, tiebreaking is handled by choosing the smallest set of questionnaires. Results: It was possible to select CGA questionnaire subsets with high predictive capacity for early death, which were either statistically similar (NB) or higher (J48 and MLP) when compared with the use of all questionnaires investigated. These results show that CGA questionnaire selection can improve accuracy rates and decrease the time spent to evaluate elderly cancer patients. Conclusions: A simplified predictive model aiming to estimate the risk of early death in elderly cancer patients is proposed herein, minimally composed by the MNA-SF and KPS. We strongly recommend that these questionnaires be incorporated into regular geriatric assessment of older patients with cancer. 
%M 31573896 %R 10.2196/12163 %U https://cancer.jmir.org/2019/2/e12163 %U https://doi.org/10.2196/12163 %U http://www.ncbi.nlm.nih.gov/pubmed/31573896 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 3 %P e14830 %T Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)–Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study %A Li,Fei %A Jin,Yonghao %A Liu,Weisong %A Rawat,Bhanu Pratap Singh %A Cai,Pengshan %A Yu,Hong %+ Department of Computer Science, University of Massachusetts Lowell, 1 University Avenue, Lowell, MA, United States, 1 978 934 6132, Hong_Yu@uml.edu %K natural language processing %K entity normalization %K deep learning %K electronic health record note %K BERT %D 2019 %7 12.09.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: The bidirectional encoder representations from transformers (BERT) model has achieved great success in many natural language processing (NLP) tasks, such as named entity recognition and question answering. However, little prior work has explored the use of this model for an important task in the biomedical and clinical domains, namely entity normalization. Objective: We aim to investigate the effectiveness of BERT-based models for biomedical or clinical entity normalization. In addition, our second objective is to investigate whether the domains of training data influence the performances of BERT-based models as well as the degree of influence. Methods: Our data comprised 1.5 million unlabeled electronic health record (EHR) notes. We first fine-tuned BioBERT on this large collection of unlabeled EHR notes. This generated our BERT-based model trained using 1.5 million electronic health record notes (EhrBERT). 
We then further fine-tuned EhrBERT, BioBERT, and BERT on three annotated corpora for biomedical and clinical entity normalization: the Medication, Indication, and Adverse Drug Events (MADE) 1.0 corpus, the National Center for Biotechnology Information (NCBI) disease corpus, and the Chemical-Disease Relations (CDR) corpus. We compared our models with two state-of-the-art normalization systems, namely MetaMap and disease name normalization (DNorm). Results: EhrBERT achieved 40.95% F1 in the MADE 1.0 corpus for mapping named entities to the Medical Dictionary for Regulatory Activities and the Systematized Nomenclature of Medicine—Clinical Terms (SNOMED-CT), which have about 380,000 terms. In this corpus, EhrBERT outperformed MetaMap by 2.36% in F1. For the NCBI disease corpus and CDR corpus, EhrBERT also outperformed DNorm by improving the F1 scores from 88.37% and 89.92% to 90.35% and 93.82%, respectively. Compared with BioBERT and BERT, EhrBERT outperformed them on the MADE 1.0 corpus and the CDR corpus. Conclusions: Our work shows that BERT-based models have achieved state-of-the-art performance for biomedical and clinical entity normalization. BERT-based models can be readily fine-tuned to normalize any kind of named entities. 
%M 31516126 %R 10.2196/14830 %U http://medinform.jmir.org/2019/3/e14830/ %U https://doi.org/10.2196/14830 %U http://www.ncbi.nlm.nih.gov/pubmed/31516126 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 7 %N 8 %P e11966 %T Deep Learning Intervention for Health Care Challenges: Some Biomedical Domain Considerations %A Tobore,Igbe %A Li,Jingzhen %A Yuhang,Liu %A Al-Handarish,Yousef %A Kandwal,Abhishek %A Nie,Zedong %A Wang,Lei %+ Center for Medical Robotics and Minimally Invasive Surgical Devices, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University, Xili Town, Nanshan District, Shenzhen, China, 86 755 86585213, zd.nie@siat.ac.cn %K machine learning %K deep learning %K big data %K mHealth %K medical imaging %K electronic health record %K biologicals %K biomedical %K ECG %K EEG %K artificial intelligence %D 2019 %7 02.08.2019 %9 Viewpoint %J JMIR Mhealth Uhealth %G English %X The use of deep learning (DL) for the analysis and diagnosis of biomedical and health care problems has received unprecedented attention in the last decade. The technique has recorded a number of achievements for unearthing meaningful features and accomplishing tasks that were hitherto difficult to solve by other methods and human experts. Currently, biological and medical devices, treatment, and applications are capable of generating large volumes of data in the form of images, sounds, text, graphs, and signals, creating the concept of big data. The innovation of DL is a developing trend in the wake of big data for data representation and analysis. DL is a type of machine learning algorithm that has deeper (or more) hidden layers of similar function cascaded into the network and has the capability to make meaning from medical big data. Current transformation drivers to achieve personalized health care delivery will be possible with the use of mobile health (mHealth). 
DL can provide the analysis for the deluge of data generated from mHealth apps. This paper reviews the fundamentals of DL methods and presents a general view of the trends in DL by capturing literature from PubMed and the Institute of Electrical and Electronics Engineers database publications that implement different variants of DL. We highlight the implementation of DL in health care, which we categorize into biological system, electronic health record, medical image, and physiological signals. In addition, we discuss some inherent challenges of DL affecting biomedical and health domain, as well as prospective research directions that focus on improving health management by promoting the application of physiological signals and modern internet technology. %M 31376272 %R 10.2196/11966 %U https://mhealth.jmir.org/2019/8/e11966/ %U https://doi.org/10.2196/11966 %U http://www.ncbi.nlm.nih.gov/pubmed/31376272 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 3 %P e14499 %T Projection Word Embedding Model With Hybrid Sampling Training for Classifying ICD-10-CM Codes: Longitudinal Observational Study %A Lin,Chin %A Lou,Yu-Sheng %A Tsai,Dung-Jang %A Lee,Chia-Cheng %A Hsu,Chia-Jung %A Wu,Ding-Chung %A Wang,Mei-Chuen %A Fang,Wen-Hui %+ Department of Family and Community Medicine, Tri-Service General Hospital, National Defense Medical Center, No. 325, Section 2, Chenggong Road, Neihu District, Taipei, 11490, Taiwan, 886 02 87923100 ext 18448, rumaf.fang@gmail.com %K word embedding %K convolutional neural network %K artificial intelligence %K natural language processing %K electronic health records %D 2019 %7 23.7.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: Most current state-of-the-art models for searching the International Classification of Diseases, Tenth Revision Clinical Modification (ICD-10-CM) codes use word embedding technology to capture useful semantic properties. However, they are limited by the quality of initial word embeddings. 
Word embedding trained by electronic health records (EHRs) is considered the best, but the vocabulary diversity is limited by previous medical records. Thus, we require a word embedding model that maintains the vocabulary diversity of open internet databases and the medical terminology understanding of EHRs. Moreover, we need to consider the particularity of the disease classification, wherein discharge notes present only positive disease descriptions. Objective: We aimed to propose a projection word2vec model and a hybrid sampling method. In addition, we aimed to conduct a series of experiments to validate the effectiveness of these methods. Methods: We compared the projection word2vec model and traditional word2vec model using two corpora sources: English Wikipedia and PubMed journal abstracts. We used seven published datasets to measure the medical semantic understanding of the word2vec models and used these embeddings to identify the three–character-level ICD-10-CM diagnostic codes in a set of discharge notes. On the basis of embedding technology improvement, we also tried to apply the hybrid sampling method to improve accuracy. The 94,483 labeled discharge notes from the Tri-Service General Hospital of Taipei, Taiwan, from June 1, 2015, to June 30, 2017, were used. To evaluate the model performance, 24,762 discharge notes from July 1, 2017, to December 31, 2017, from the same hospital were used. Moreover, 74,324 additional discharge notes collected from seven other hospitals were tested. The F-measure, which is the major global measure of effectiveness, was adopted. Results: In medical semantic understanding, the original EHR embeddings and PubMed embeddings exhibited superior performance to the original Wikipedia embeddings. After projection training technology was applied, the projection Wikipedia embeddings exhibited an obvious improvement but did not reach the level of original EHR embeddings or PubMed embeddings. 
In the subsequent ICD-10-CM coding experiment, the model that used both projection PubMed and Wikipedia embeddings had the highest testing mean F-measure (0.7362 and 0.6693 in Tri-Service General Hospital and the seven other hospitals, respectively). Moreover, the hybrid sampling method was found to improve the model performance (F-measure=0.7371/0.6698). Conclusions: The word embeddings trained using EHR and PubMed could understand medical semantics better, and the proposed projection word2vec model improved the ability of medical semantics extraction in Wikipedia embeddings. Although the improvement from the projection word2vec model in the real ICD-10-CM coding task was not substantial, the models could effectively handle emerging diseases. The proposed hybrid sampling method enables the model to behave like a human expert. %R 10.2196/14499 %U http://medinform.jmir.org/2019/3/e14499/ %U https://doi.org/10.2196/14499 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 7 %P e13659 %T Artificial Intelligence and the Implementation Challenge %A Shaw,James %A Rudzicz,Frank %A Jamieson,Trevor %A Goldfarb,Avi %+ Women's College Hospital, Institute for Health System Solutions and Virtual Care, 76 Grenville Street, Toronto, ON, M5G2A2, Canada, 1 4163236400, jay.shaw@wchospital.ca %K artificial intelligence %K machine learning %K implementation science %K ethics %D 2019 %7 10.07.2019 %9 Viewpoint %J J Med Internet Res %G English %X Background: Applications of artificial intelligence (AI) in health care have garnered much attention in recent years, but the implementation issues posed by AI have not been substantially addressed. Objective: In this paper, we have focused on machine learning (ML) as a form of AI and have provided a framework for thinking about use cases of ML in health care. 
We have structured our discussion of challenges in the implementation of ML in comparison with other technologies using the framework of Nonadoption, Abandonment, and Challenges to the Scale-Up, Spread, and Sustainability of Health and Care Technologies (NASSS). Methods: After providing an overview of AI technology, we describe use cases of ML as falling into the categories of decision support and automation. We suggest these use cases apply to clinical, operational, and epidemiological tasks and that the primary function of ML in health care in the near term will be decision support. We then outline unique implementation issues posed by ML initiatives in the categories addressed by the NASSS framework, specifically including meaningful decision support, explainability, privacy, consent, algorithmic bias, security, scalability, the role of corporations, and the changing nature of health care work. Results: Ultimately, we suggest that the future of ML in health care remains positive but uncertain, as support from patients, the public, and a wide range of health care stakeholders is necessary to enable its meaningful implementation. Conclusions: If the implementation science community is to facilitate the adoption of ML in ways that stand to generate widespread benefits, the issues raised in this paper will require substantial attention in the coming years. 
%M 31293245 %R 10.2196/13659 %U https://www.jmir.org/2019/7/e13659/ %U https://doi.org/10.2196/13659 %U http://www.ncbi.nlm.nih.gov/pubmed/31293245 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 7 %P e13664 %T Reducing Patient Loneliness With Artificial Agents: Design Insights From Evolutionary Neuropsychiatry %A Loveys,Kate %A Fricchione,Gregory %A Kolappa,Kavitha %A Sagar,Mark %A Broadbent,Elizabeth %+ Department of Psychological Medicine, The University of Auckland, Auckland City Hospital, Level 12 Support Building, 85 Park Road, Grafton, Auckland, 1023, New Zealand, 64 9 373 7599 ext 84340, k.loveys@auckland.ac.nz %K loneliness %K neuropsychiatry %K biological evolution %K psychological bonding %K interpersonal relations %K artificial intelligence %K social support %K eHealth %D 2019 %7 08.07.2019 %9 Viewpoint %J J Med Internet Res %G English %X Loneliness is a growing public health issue that substantially increases the risk of morbidity and mortality. Artificial agents, such as robots, embodied conversational agents, and chatbots, present an innovation in care delivery and have been shown to reduce patient loneliness by providing social support. However, similar to doctor and patient relationships, the quality of a patient’s relationship with an artificial agent can impact support effectiveness as well as care engagement. Incorporating mammalian attachment-building behavior in neural network processing as part of an agent’s capabilities may improve relationship quality and engagement between patients and artificial agents. We encourage developers of artificial agents intended to relieve patient loneliness to incorporate design insights from evolutionary neuropsychiatry. 
%M 31287067 %R 10.2196/13664 %U https://www.jmir.org/2019/7/e13664/ %U https://doi.org/10.2196/13664 %U http://www.ncbi.nlm.nih.gov/pubmed/31287067 %0 Journal Article %@ 2369-3762 %I JMIR Publications %V 5 %N 1 %P e13930 %T Applications and Challenges of Implementing Artificial Intelligence in Medical Education: Integrative Review %A Chan,Kai Siang %A Zary,Nabil %+ Mohammed Bin Rashid University of Medicine and Health Sciences, Building 14, Dubai Healthcare City, PO Box 505055, Dubai, United Arab Emirates, 971 83571846, nabil.zary@icloud.com %K medical education %K evaluation of AIED systems %K real world applications of AIED systems %K artificial intelligence %D 2019 %7 15.6.2019 %9 Review %J JMIR Med Educ %G English %X Background: Since the advent of artificial intelligence (AI) in 1955, the applications of AI have increased over the years within a rapidly changing digital landscape where public expectations are on the rise, fed by social media, industry leaders, and medical practitioners. However, there has been little interest in AI in medical education until the last two decades, with only a recent increase in the number of publications and citations in the field. To our knowledge, thus far, a limited number of articles have discussed or reviewed the current use of AI in medical education. Objective: This study aims to review the current applications of AI in medical education as well as the challenges of implementing AI in medical education. Methods: Medline (Ovid), EBSCOhost Education Resources Information Center (ERIC) and Education Source, and Web of Science were searched with explicit inclusion and exclusion criteria. Full text of the selected articles was analyzed using the Extension of Technology Acceptance Model and the Diffusions of Innovations theory. Data were subsequently pooled together and analyzed quantitatively. Results: A total of 37 articles were identified. 
Three primary uses of AI in medical education were identified: learning support (n=32), assessment of students’ learning (n=4), and curriculum review (n=1). The main reasons for use of AI are its ability to provide feedback and a guided learning pathway and to decrease costs. Subgroup analysis revealed that medical undergraduates are the primary target audience for AI use. In addition, 34 articles described the challenges of AI implementation in medical education; two main reasons were identified: difficulty in assessing the effectiveness of AI in medical education and technical challenges while developing AI applications. Conclusions: The primary use of AI in medical education was for learning support mainly due to its ability to provide individualized feedback. Little emphasis was placed on curriculum review and assessment of students’ learning due to the lack of digitalization and sensitive nature of examinations, respectively. Big data manipulation also warrants the need to ensure data integrity. Methodological improvements are required to increase AI adoption by addressing the technical difficulties of creating an AI application and using novel methods to assess the effectiveness of AI. To better integrate AI into the medical profession, measures should be taken to introduce AI into the medical school curriculum for medical professionals to better understand AI algorithms and maximize its use. 
%M 31199295 %R 10.2196/13930 %U http://mededu.jmir.org/2019/1/e13930/ %U https://doi.org/10.2196/13930 %U http://www.ncbi.nlm.nih.gov/pubmed/31199295 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 5 %P e13216 %T Your Robot Therapist Will See You Now: Ethical Implications of Embodied Artificial Intelligence in Psychiatry, Psychology, and Psychotherapy %A Fiske,Amelia %A Henningsen,Peter %A Buyx,Alena %+ Institute for History and Ethics of Medicine, Technical University of Munich School of Medicine, Technical University of Munich, Ismaninger Straße 22, Munich, 81675, Germany, 49 8941404041, a.fiske@tum.de %K artificial intelligence %K robotics %K ethics %K psychiatry %K psychology %K psychotherapy %K medicine %D 2019 %7 09.05.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Research in embodied artificial intelligence (AI) has increasing clinical relevance for therapeutic applications in mental health services. With innovations ranging from ‘virtual psychotherapists’ to social robots in dementia care and autism disorder, to robots for sexual disorders, artificially intelligent virtual and robotic agents are increasingly taking on high-level therapeutic interventions that used to be offered exclusively by highly trained, skilled health professionals. In order to enable responsible clinical implementation, ethical and social implications of the increasing use of embodied AI in mental health need to be identified and addressed. Objective: This paper assesses the ethical and social implications of translating embodied AI applications into mental health care across the fields of Psychiatry, Psychology and Psychotherapy. Building on this analysis, it develops a set of preliminary recommendations on how to address ethical and social challenges in current and future applications of embodied AI. 
Methods: Based on a thematic literature search and established principles of medical ethics, an analysis of the ethical and social aspects of currently embodied AI applications was conducted across the fields of Psychiatry, Psychology, and Psychotherapy. To enable a comprehensive evaluation, the analysis was structured around the following three steps: assessment of potential benefits; analysis of overarching ethical issues and concerns; discussion of specific ethical and social issues of the interventions. Results: From an ethical perspective, important benefits of embodied AI applications in mental health include new modes of treatment, opportunities to engage hard-to-reach populations, better patient response, and freeing up time for physicians. Overarching ethical issues and concerns include: harm prevention and various questions of data ethics; a lack of guidance on development of AI applications, their clinical integration and training of health professionals; ‘gaps’ in ethical and regulatory frameworks; the potential for misuse including using the technologies to replace established services, thereby potentially exacerbating existing health inequalities. Specific challenges identified and discussed in the application of embodied AI include: matters of risk-assessment, referrals, and supervision; the need to respect and protect patient autonomy; the role of non-human therapy; transparency in the use of algorithms; and specific concerns regarding long-term effects of these applications on understandings of illness and the human condition. Conclusions: We argue that embodied AI is a promising approach across the field of mental health; however, further research is needed to address the broader ethical and societal concerns of these technologies to negotiate best research and medical practices in innovative mental health care. 
We conclude by indicating areas of future research and developing recommendations for high-priority areas in need of concrete ethical guidance. %M 31094356 %R 10.2196/13216 %U https://www.jmir.org/2019/5/e13216/ %U https://doi.org/10.2196/13216 %U http://www.ncbi.nlm.nih.gov/pubmed/31094356 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 5 %P e11030 %T Data-Driven Blood Glucose Pattern Classification and Anomalies Detection: Machine-Learning Applications in Type 1 Diabetes %A Woldaregay,Ashenafi Zebene %A Årsand,Eirik %A Botsis,Taxiarchis %A Albers,David %A Mamykina,Lena %A Hartvigsen,Gunnar %+ Department of Computer Science, University of Tromsø – The Arctic University of Norway, Realfagbygget, Hansine Hansens vei 54, Tromsø, Norway, 47 77646444, ashenafi.z.woldaregay@uit.no %K type 1 diabetes %K blood glucose dynamics %K anomalies detection %K machine learning %D 2019 %7 01.05.2019 %9 Review %J J Med Internet Res %G English %X Background: Diabetes mellitus is a chronic metabolic disorder that results in abnormal blood glucose (BG) regulation. The BG level is preferably maintained close to normality through self-management practices, which involve actively tracking BG levels and taking proper actions including adjusting diet and insulin medications. BG anomalies could be defined as any undesirable reading because of either a precisely known reason (normal cause variation) or an unknown reason (special cause variation) to the patient. Recently, machine-learning applications have been widely introduced within diabetes research in general and BG anomaly detection in particular. However, irrespective of their expanding and increasing popularity, there is a lack of up-to-date reviews that materialize the current trends in modeling options and strategies for BG anomaly classification and detection in people with diabetes. 
Objective: This review aimed to identify, assess, and analyze the state-of-the-art machine-learning strategies and their hybrid systems focusing on BG anomaly classification and detection including glycemic variability (GV), hyperglycemia, and hypoglycemia in type 1 diabetes within the context of personalized decision support systems and BG alarm events applications, which are important constituents for optimal diabetes self-management. Methods: A rigorous literature search was conducted between September 1 and October 1, 2017, and October 15 and November 5, 2018, through various Web-based databases. Peer-reviewed journals and articles were considered. Information from the selected literature was extracted based on predefined categories, which were based on previous research and further elaborated through brainstorming. Results: The initial results were vetted using the title, abstract, and keywords and retrieved 496 papers. After a thorough assessment and screening, 47 articles remained, which were critically analyzed. The interrater agreement was measured using a Cohen kappa test, and disagreements were resolved through discussion. The state-of-the-art classes of machine learning have been developed and tested up to the task and achieved promising performance including artificial neural network, support vector machine, decision tree, genetic algorithm, Gaussian process regression, Bayesian neural network, deep belief network, and others. Conclusions: Despite the complexity of BG dynamics, there are many attempts to capture hypoglycemia and hyperglycemia incidences and the extent of an individual’s GV using different approaches. Recently, the advancement of diabetes technologies and continuous accumulation of self-collected health data have paved the way for popularity of machine learning in these tasks. According to the review, most of the identified studies used a theoretical threshold, which suffers from inter- and intrapatient variation. 
Therefore, future studies should consider the difference among patients and also track its temporal change over time. Moreover, studies should also give more emphasis on the types of inputs used and their associated time lag. Generally, we foresee that these developments might encourage researchers to further develop and test these systems on a large-scale basis. %M 31042157 %R 10.2196/11030 %U https://www.jmir.org/2019/5/e11030/ %U https://doi.org/10.2196/11030 %U http://www.ncbi.nlm.nih.gov/pubmed/31042157 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 2 %P e13445 %T The Use of Artificially Intelligent Self-Diagnosing Digital Platforms by the General Public: Scoping Review %A Aboueid,Stephanie %A Liu,Rebecca H %A Desta,Binyam Negussie %A Chaurasia,Ashok %A Ebrahim,Shanil %+ Applied Health Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G5, Canada, 1 6134061899, seaboueid@uwaterloo.ca %K diagnosis %K artificial intelligence %K symptom checkers %K diagnostic self evaluation %K self-care %D 2019 %7 01.05.2019 %9 Review %J JMIR Med Inform %G English %X Background: Self-diagnosis is the process of diagnosing or identifying a medical condition in oneself. Artificially intelligent digital platforms for self-diagnosis are becoming widely available and are used by the general public; however, little is known about the body of knowledge surrounding this technology. Objective: The objectives of this scoping review were to (1) systematically map the extent and nature of the literature and topic areas pertaining to digital platforms that use computerized algorithms to provide users with a list of potential diagnoses and (2) identify key knowledge gaps. Methods: The following databases were searched: PubMed (Medline), Scopus, Association for Computing Machinery Digital Library, Institute of Electrical and Electronics Engineers, Google Scholar, Open Grey, and ProQuest Dissertations and Theses. 
The search strategy was developed and refined with the assistance of a librarian and consisted of 3 main concepts: (1) self-diagnosis; (2) digital platforms; and (3) public or patients. The search generated 2536 articles, of which 217 were duplicates. Following the Tricco et al 2018 checklist, 2 researchers screened the titles and abstracts (n=2316) and full texts (n=104), independently. A total of 19 articles were included for review, and data were retrieved following a data-charting form that was pretested by the research team. Results: The included articles were mainly conducted in the United States (n=10) or the United Kingdom (n=4). Among the articles, topic areas included accuracy or correspondence with a doctor’s diagnosis (n=6), commentaries (n=2), regulation (n=3), sociological (n=2), user experience (n=2), theoretical (n=1), privacy and security (n=1), ethical (n=1), and design (n=1). Individuals who do not have access to health care and perceive themselves to have a stigmatizing condition are more likely to use this technology. The accuracy of this technology varied substantially based on the disease examined and platform used. Women and those with higher education were more likely to choose the right diagnosis out of the potential list of diagnoses. Regulation of this technology is lacking in most parts of the world; however, regulations are currently under development. Conclusions: There are prominent research gaps in the literature surrounding the use of artificially intelligent self-diagnosing digital platforms. Given the variety of digital platforms and the wide array of diseases they cover, measuring accuracy is cumbersome. More research is needed to understand the user experience and inform regulations. 
%M 31042151 %R 10.2196/13445 %U http://medinform.jmir.org/2019/2/e13445/ %U https://doi.org/10.2196/13445 %U http://www.ncbi.nlm.nih.gov/pubmed/31042151 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 4 %P e13822 %T Detecting Developmental Delay and Autism Through Machine Learning Models Using Home Videos of Bangladeshi Children: Development and Validation Study %A Tariq,Qandeel %A Fleming,Scott Lanyon %A Schwartz,Jessey Nicole %A Dunlap,Kaitlyn %A Corbin,Conor %A Washington,Peter %A Kalantarian,Haik %A Khan,Naila Z %A Darmstadt,Gary L %A Wall,Dennis Paul %+ Division of Systems Medicine, Department of Pediatrics, Stanford University, 1265 Welch Road, Palo Alto, CA, 94305, United States, 1 6173946031, dpwall@stanford.edu %K autism %K autism spectrum disorder %K machine learning %K developmental delays %K clinical resources %K Bangladesh %K Biomedical Data Science %D 2019 %7 24.04.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Autism spectrum disorder (ASD) is currently diagnosed using qualitative methods that measure between 20-100 behaviors, can span multiple appointments with trained clinicians, and take several hours to complete. In our previous work, we demonstrated the efficacy of machine learning classifiers to accelerate the process by collecting home videos of US-based children, identifying a reduced subset of behavioral features that are scored by untrained raters using a machine learning classifier to determine children’s “risk scores” for autism. We achieved an accuracy of 92% (95% CI 88%-97%) on US videos using a classifier built on five features. Objective: Using videos of Bangladeshi children collected from Dhaka Shishu Children’s Hospital, we aim to scale our pipeline to another culture and other developmental delays, including speech and language conditions. 
Methods: Although our previously published and validated pipeline and set of classifiers perform reasonably well on Bangladeshi videos (75% accuracy, 95% CI 71%-78%), this work improves on that accuracy through the development and application of a powerful new technique for adaptive aggregation of crowdsourced labels. We enhance both the utility and performance of our model by building two classification layers: The first layer distinguishes between typical and atypical behavior, and the second layer distinguishes between ASD and non-ASD. In each of the layers, we use a unique rater weighting scheme to aggregate classification scores from different raters based on their expertise. We also determine Shapley values for the most important features in the classifier to understand how the classifiers’ process aligns with clinical intuition. Results: Using these techniques, we achieved an accuracy (area under the curve [AUC]) of 76% (SD 3%) and sensitivity of 76% (SD 4%) for identifying atypical children from among developmentally delayed children, and an accuracy (AUC) of 85% (SD 5%) and sensitivity of 76% (SD 6%) for identifying children with ASD from those predicted to have other developmental delays. Conclusions: These results show promise for using a mobile video-based and machine learning–directed approach for early and remote detection of autism in Bangladeshi children. This strategy could provide important resources for developmental health in developing countries with few clinical resources for diagnosis, helping children get access to care at an early age. Future research aimed at extending the application of this approach to identify a range of other conditions and determine the population-level burden of developmental disabilities and impairments will be of high value. 
%M 31017583 %R 10.2196/13822 %U http://www.jmir.org/2019/4/e13822/ %U https://doi.org/10.2196/13822 %U http://www.ncbi.nlm.nih.gov/pubmed/31017583 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 4 %P e12887 %T Physicians’ Perceptions of Chatbots in Health Care: Cross-Sectional Web-Based Survey %A Palanica,Adam %A Flaschner,Peter %A Thommandram,Anirudh %A Li,Michael %A Fossat,Yan %+ Labs Department, Klick Health, Klick Inc, 175 Bloor St E, Suite 300, Toronto, ON, M4W 3R8, Canada, 1 416 214 4977, apalanica@klick.com %K physician satisfaction %K health care %K telemedicine %K mobile health %K health surveys %D 2019 %7 05.04.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Many potential benefits for the uses of chatbots within the context of health care have been theorized, such as improved patient education and treatment compliance. However, little is known about the perspectives of practicing medical physicians on the use of chatbots in health care, even though these individuals are the traditional benchmark of proper patient care. Objective: This study aimed to investigate the perceptions of physicians regarding the use of health care chatbots, including their benefits, challenges, and risks to patients. Methods: A total of 100 practicing physicians across the United States completed a Web-based, self-report survey to examine their opinions of chatbot technology in health care. Descriptive statistics and frequencies were used to examine the characteristics of participants. Results: A wide variety of positive and negative perspectives were reported on the use of health care chatbots, including the importance to patients for managing their own health and the benefits on physical, psychological, and behavioral health outcomes. 
More consistent agreement occurred with regard to administrative benefits associated with chatbots; many physicians believed that chatbots would be most beneficial for scheduling doctor appointments (78%, 78/100), locating health clinics (76%, 76/100), or providing medication information (71%, 71/100). Conversely, many physicians believed that chatbots cannot effectively care for all of the patients’ needs (76%, 76/100), cannot display human emotion (72%, 72/100), and cannot provide detailed diagnosis and treatment because of not knowing all of the personal factors associated with the patient (71%, 71/100). Many physicians also stated that health care chatbots could be a risk to patients if they self-diagnose too often (74%, 74/100) and do not accurately understand the diagnoses (74%, 74/100). Conclusions: Physicians believed in both costs and benefits associated with chatbots, depending on the logistics and specific roles of the technology. Chatbots may have a beneficial role to play in health care to support, motivate, and coach patients as well as for streamlining organizational tasks; in essence, chatbots could become a surrogate for nonmedical caregivers. However, concerns remain on the inability of chatbots to comprehend the emotional state of humans as well as in areas where expert medical knowledge and intelligence is required. 
%M 30950796 %R 10.2196/12887 %U https://www.jmir.org/2019/4/e12887/ %U https://doi.org/10.2196/12887 %U http://www.ncbi.nlm.nih.gov/pubmed/30950796 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 4 %P e12286 %T Applications of Machine Learning in Real-Life Digital Health Interventions: Review of the Literature %A Triantafyllidis,Andreas K %A Tsanas,Athanasios %+ Information Technologies Institute, Centre for Research and Technology Hellas, 6th km Charilaou-Thermi Rd, GR 57001 Thermi, Thessaloniki, 60361, Greece, 30 2310 498100, atriand@gmail.com %K machine learning %K data mining %K artificial intelligence %K digital health %K review %K telemedicine %D 2019 %7 05.04.2019 %9 Review %J J Med Internet Res %G English %X Background: Machine learning has attracted considerable research interest toward developing smart digital health interventions. These interventions have the potential to revolutionize health care and lead to substantial outcomes for patients and medical professionals. Objective: Our objective was to review the literature on applications of machine learning in real-life digital health interventions, aiming to improve the understanding of researchers, clinicians, engineers, and policy makers in developing robust and impactful data-driven interventions in the health care domain. Methods: We searched the PubMed and Scopus bibliographic databases with terms related to machine learning, to identify real-life studies of digital health interventions incorporating machine learning algorithms. We grouped those interventions according to their target (ie, target condition), study design, number of enrolled participants, follow-up duration, primary outcome and whether this had been statistically significant, machine learning algorithms used in the intervention, and outcome of the algorithms (eg, prediction). 
Results: Our literature search identified 8 interventions incorporating machine learning in a real-life research setting, of which 3 (37%) were evaluated in a randomized controlled trial and 5 (63%) in a pilot or experimental single-group study. The interventions targeted depression prediction and management, speech recognition for people with speech disabilities, self-efficacy for weight loss, detection of changes in biopsychosocial condition of patients with multiple morbidity, stress management, treatment of phantom limb pain, smoking cessation, and personalized nutrition based on glycemic response. The average number of enrolled participants in the studies was 71 (range 8-214), and the average follow-up study duration was 69 days (range 3-180). Of the 8 interventions, 6 (75%) showed statistical significance (at the P=.05 level) in health outcomes. Conclusions: This review found that digital health interventions incorporating machine learning algorithms in real-life studies can be useful and effective. Given the low number of studies identified in this review and that they did not follow a rigorous machine learning evaluation methodology, we urge the research community to conduct further studies in intervention settings following evaluation principles and demonstrating the potential of machine learning in clinical practice. 
%M 30950797 %R 10.2196/12286 %U https://www.jmir.org/2019/4/e12286/ %U https://doi.org/10.2196/12286 %U http://www.ncbi.nlm.nih.gov/pubmed/30950797 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 8 %N 2 %P e12100 %T Artificial Intelligence in Clinical Health Care Applications: Viewpoint %A van Hartskamp,Michael %A Consoli,Sergio %A Verhaegh,Wim %A Petkovic,Milan %A van de Stolpe,Anja %+ Philips Research, HTC11, p247, High Tech Campus, Eindhoven, 5656AE, Netherlands, 31 612784841, anja.van.de.stolpe@philips.com %K artificial intelligence %K deep learning %K clinical data %K Bayesian modeling %K medical informatics %D 2019 %7 05.04.2019 %9 Viewpoint %J Interact J Med Res %G English %X The idea of artificial intelligence (AI) has a long history. It turned out, however, that reaching intelligence at human levels is more complicated than originally anticipated. Currently, we are experiencing a renewed interest in AI, fueled by an enormous increase in computing power and an even larger increase in data, in combination with improved AI technologies like deep learning. Healthcare is considered the next domain to be revolutionized by artificial intelligence. While AI approaches are excellently suited to develop certain algorithms, for biomedical applications there are specific challenges. We propose six recommendations—the 6Rs—to improve AI projects in the biomedical space, especially clinical health care, and to facilitate communication between AI scientists and medical doctors: (1) Relevant and well-defined clinical question first; (2) Right data (ie, representative and of good quality); (3) Ratio between number of patients and their variables should fit the AI method; (4) Relationship between data and ground truth should be as direct and causal as possible; (5) Regulatory ready; enabling validation; and (6) Right AI method. 
%M 30950806 %R 10.2196/12100 %U https://www.i-jmr.org/2019/2/e12100/ %U https://doi.org/10.2196/12100 %U http://www.ncbi.nlm.nih.gov/pubmed/30950806 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 3 %P e12422 %T Physician Confidence in Artificial Intelligence: An Online Mobile Survey %A Oh,Songhee %A Kim,Jae Heon %A Choi,Sung-Woo %A Lee,Hee Jeong %A Hong,Jungrak %A Kwon,Soon Hyo %+ Division of Nephrology, Department of Internal Medicine, Soonchunhyang University Hospital, Daesagwanro 59 Youngsangu, Seoul, 04401, Republic of Korea, 82 01034413147, ksoonhyo@schmc.ac.kr %K artificial intelligence %K AI %K awareness %K physicians %D 2019 %7 25.03.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: It is expected that artificial intelligence (AI) will be used extensively in the medical field in the future. Objective: The purpose of this study is to investigate the awareness of AI among Korean doctors and to assess physicians’ attitudes toward the medical application of AI. Methods: We conducted an online survey composed of 11 closed-ended questions using Google Forms. The survey consisted of questions regarding the recognition of and attitudes toward AI, the development direction of AI in medicine, and the possible risks of using AI in the medical field. Results: A total of 669 participants completed the survey. Only 40 (5.9%) answered that they had good familiarity with AI. However, most participants considered AI useful in the medical field (558/669, 83.4% agreement). The advantage of using AI was seen as the ability to analyze vast amounts of high-quality, clinically relevant data in real time. Respondents agreed that the area of medicine in which AI would be most useful is disease diagnosis (558/669, 83.4% agreement). One possible problem cited by the participants was that AI would not be able to assist in unexpected situations owing to inadequate information (196/669, 29.3%). 
Less than half of the participants (294/669, 43.9%) agreed that AI is diagnostically superior to human doctors. Only 237 (35.4%) answered that they agreed that AI could replace them in their jobs. Conclusions: This study suggests that Korean doctors and medical students have favorable attitudes toward AI in the medical field. The majority of physicians surveyed believed that AI will not replace their roles in the future. %M 30907742 %R 10.2196/12422 %U http://www.jmir.org/2019/3/e12422/ %U https://doi.org/10.2196/12422 %U http://www.ncbi.nlm.nih.gov/pubmed/30907742 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 3 %P e12802 %T Artificial Intelligence and the Future of Primary Care: Exploratory Qualitative Study of UK General Practitioners’ Views %A Blease,Charlotte %A Kaptchuk,Ted J %A Bernstein,Michael H %A Mandl,Kenneth D %A Halamka,John D %A DesRoches,Catherine M %+ General Medicine and Primary Care, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, Boston, MA, 02215, United States, 1 617 754 1457, cblease@bidmc.harvard.edu %K artificial intelligence %K attitudes %K future %K general practice %K machine learning %K opinions %K primary care %K qualitative research %K technology %D 2019 %7 20.03.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: The potential for machine learning to disrupt the medical profession is the subject of ongoing debate within biomedical informatics and related fields. Objective: This study aimed to explore general practitioners’ (GPs’) opinions about the potential impact of future technology on key tasks in primary care. Methods: In June 2018, we conducted a Web-based survey of 720 UK GPs’ opinions about the likelihood of future technology to fully replace GPs in performing 6 key primary care tasks, and, if respondents considered replacement for a particular task likely, to estimate how soon the technological capacity might emerge. 
This study involved qualitative descriptive analysis of written responses (“comments”) to an open-ended question in the survey. Results: Comments were classified into 3 major categories in relation to primary care: (1) limitations of future technology, (2) potential benefits of future technology, and (3) social and ethical concerns. Perceived limitations included the beliefs that communication and empathy are exclusively human competencies; many GPs also considered clinical reasoning and the ability to provide value-based care as necessitating physicians’ judgments. Perceived benefits of technology included expectations about improved efficiencies, in particular with respect to the reduction of administrative burdens on physicians. Social and ethical concerns encompassed multiple, divergent themes including the need to train more doctors to overcome workforce shortfalls and misgivings about the acceptability of future technology to patients. However, some GPs believed that the failure to adopt technological innovations could incur harms to both patients and physicians. Conclusions: This study presents timely information on physicians’ views about the scope of artificial intelligence (AI) in primary care. Overwhelmingly, GPs considered the potential of AI to be limited. These views differ from the predictions of biomedical informaticians. More extensive, stand-alone qualitative work would provide a more in-depth understanding of GPs’ views. 
%M 30892270 %R 10.2196/12802 %U http://www.jmir.org/2019/3/e12802/ %U https://doi.org/10.2196/12802 %U http://www.ncbi.nlm.nih.gov/pubmed/30892270 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 3 %P e11990 %T Detecting Hypoglycemia Incidents Reported in Patients’ Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance %A Chen,Jinying %A Lalor,John %A Liu,Weisong %A Druhl,Emily %A Granillo,Edgard %A Vimalananda,Varsha G %A Yu,Hong %+ Department of Population and Quantitative Health Sciences, University of Massachusetts Medical School, Albert Sherman Center, 9th Floor, 368 Plantation Street, Worcester, MA, 01605, United States, 1 508 856 6063, jinying.chen@umassmed.edu %K secure messaging %K natural language processing %K hypoglycemia %K supervised machine learning %K imbalanced data %K adverse event detection %K drug-related side effects and adverse reactions %D 2019 %7 11.03.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety. Objective: We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients’ secure messages. Methods: An expert in public health annotated 3000 secure message threads between patients with diabetes and US Department of Veterans Affairs clinical teams as containing patient-reported hypoglycemia incidents or not. A physician independently annotated 100 threads randomly selected from this dataset to determine interannotator agreement. 
We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates 3 machine learning algorithms widely used for text classification: linear support vector machines, random forest, and logistic regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.80%) messages were annotated as positive, we investigated cost-sensitive learning and oversampling methods to mitigate the challenge of imbalanced data. Results: The interannotator agreement was Cohen kappa=.976. Using cross-validation, logistic regression with cost-sensitive learning achieved the best performance (area under the receiver operating characteristic curve=0.954, sensitivity=0.693, specificity=0.974, F1 score=0.590). Cost-sensitive learning and the ensembled synthetic minority oversampling technique improved the sensitivity of the baseline systems substantially (by 0.123 to 0.728 absolute gains). Our results show that a variety of features contributed to the best performance of HypoDetect. Conclusions: Despite the challenge of data imbalance, HypoDetect achieved promising results for the task of detecting hypoglycemia incidents from secure messages. The system has a great potential to facilitate early detection and treatment of hypoglycemia. 
%M 30855231 %R 10.2196/11990 %U http://www.jmir.org/2019/3/e11990/ %U https://doi.org/10.2196/11990 %U http://www.ncbi.nlm.nih.gov/pubmed/30855231 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 7 %N 1 %P e10788 %T Detection of Bleeding Events in Electronic Health Record Notes Using Convolutional Neural Network Models Enhanced With Recurrent Neural Network Autoencoders: Deep Learning Approach %A Li,Rumeng %A Hu,Baotian %A Liu,Feifan %A Liu,Weisong %A Cunningham,Francesca %A McManus,David D %A Yu,Hong %+ Department of Computer Science, University of Massachusetts Lowell, 1 University Avenue, Lowell, MA, 01854, United States, 1 9789343620, hong_yu@uml.edu %K autoencoder %K BiLSTM %K bleeding %K convolutional neural networks %K electronic health record %D 2019 %7 08.02.2019 %9 Original Paper %J JMIR Med Inform %G English %X Background: Bleeding events are common and critical and may cause significant morbidity and mortality. High incidences of bleeding events are associated with cardiovascular disease in patients on anticoagulant therapy. Prompt and accurate detection of bleeding events is essential to prevent serious consequences. As bleeding events are often described in clinical notes, automatic detection of bleeding events from electronic health record (EHR) notes may improve drug-safety surveillance and pharmacovigilance. Objective: We aimed to develop a natural language processing (NLP) system to automatically classify whether an EHR note sentence contains a bleeding event. Methods: We expert annotated 878 EHR notes (76,577 sentences and 562,630 word-tokens) to identify bleeding events at the sentence level. This annotated corpus was used to train and validate our NLP systems. We developed an innovative hybrid convolutional neural network (CNN) and long short-term memory (LSTM) autoencoder (HCLA) model that integrates a CNN architecture with a bidirectional LSTM (BiLSTM) autoencoder model to leverage large unlabeled EHR data. 
Results: HCLA achieved the best area under the receiver operating characteristic curve (0.957) and F1 score (0.938) to identify whether a sentence contains a bleeding event, thereby surpassing the strong baseline support vector machines and other CNN and autoencoder models. Conclusions: By incorporating a supervised CNN model and a pretrained unsupervised BiLSTM autoencoder, the HCLA achieved high performance in detecting bleeding events. %M 30735140 %R 10.2196/10788 %U http://medinform.jmir.org/2019/1/e10788/ %U https://doi.org/10.2196/10788 %U http://www.ncbi.nlm.nih.gov/pubmed/30735140 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 5 %N 4 %P e64 %T Using Psychological Artificial Intelligence (Tess) to Relieve Symptoms of Depression and Anxiety: Randomized Controlled Trial %A Fulmer,Russell %A Joerin,Angela %A Gentile,Breanna %A Lakerink,Lysanne %A Rauws,Michiel %+ Northwestern University, 633 Clark Street, Evanston, IL, United States, 1 312 609 5300 ext 699, russell.fulmer@northwestern.edu %K artificial intelligence %K mental health services %K depression %K anxiety %K students %D 2018 %7 13.12.2018 %9 Original Paper %J JMIR Ment Health %G English %X Background: Students in need of mental health care face many barriers including cost, location, availability, and stigma. Studies show that computer-assisted therapy and 1 conversational chatbot delivering cognitive behavioral therapy (CBT) offer a less-intensive and more cost-effective alternative for treating depression and anxiety. Although CBT is one of the most effective treatment methods, applying an integrative approach has been linked to equally effective posttreatment improvement. Integrative psychological artificial intelligence (AI) offers a scalable solution as the demand for affordable, convenient, lasting, and secure support grows. 
Objective: This study aimed to assess the feasibility and efficacy of using an integrative psychological AI, Tess, to reduce self-identified symptoms of depression and anxiety in college students. Methods: In this randomized controlled trial, 75 participants were recruited from 15 universities across the United States. All participants completed Web-based surveys, including the Patient Health Questionnaire (PHQ-9), Generalized Anxiety Disorder Scale (GAD-7), and Positive and Negative Affect Scale (PANAS) at baseline and 2 to 4 weeks later (T2). The 2 test groups consisted of 50 participants in total and were randomized to receive unlimited access to Tess for either 2 weeks (n=24) or 4 weeks (n=26). The information-only control group participants (n=24) received an electronic link to the National Institute of Mental Health’s (NIMH) eBook on depression among college students and were only granted access to Tess after completion of the study. Results: A sample of 74 participants completed this study with 0% attrition from the test group and less than 1% attrition from the control group (1/24). The average age of participants was 22.9 years, with 70% of participants being female (52/74), mostly Asian (37/74, 51%), and white (32/74, 41%). Group 1 received unlimited access to Tess, with daily check-ins for 2 weeks. Group 2 received unlimited access to Tess with biweekly check-ins for 4 weeks. The information-only control group was provided with an electronic link to the NIMH’s eBook. Multivariate analysis of covariance was conducted. We used an alpha level of .05 for all statistical tests. Results revealed a statistically significant difference between the control group and group 1, such that group 1 reported a significant reduction in symptoms of depression as measured by the PHQ-9 (P=.03), whereas those in the control group did not. 
A statistically significant difference was found between the control group and both test groups 1 and 2 for symptoms of anxiety as measured by the GAD-7. Group 1 (P=.045) and group 2 (P=.02) reported a significant reduction in symptoms of anxiety, whereas the control group did not. A statistically significant difference was found on the PANAS between the control group and group 1 (P=.03) and suggests that Tess did impact scores. Conclusions: This study offers evidence that AI can serve as a cost-effective and accessible therapeutic agent. Although not designed to appropriate the role of a trained therapist, integrative psychological AI emerges as a feasible option for delivering support. Trial Registration: International Standard Randomized Controlled Trial Number: ISRCTN61214172; https://doi.org/10.1186/ISRCTN61214172. %M 30545815 %R 10.2196/mental.9782 %U http://mental.jmir.org/2018/4/e64/ %U https://doi.org/10.2196/mental.9782 %U http://www.ncbi.nlm.nih.gov/pubmed/30545815 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 6 %N 11 %P e198 %T The Perceived Benefits of an Artificial Intelligence–Embedded Mobile App Implementing Evidence-Based Guidelines for the Self-Management of Chronic Neck and Back Pain: Observational Study %A Lo,Wai Leung Ambrose %A Lei,Di %A Li,Le %A Huang,Dong Feng %A Tong,Kin-Fai %+ Guangdong Engineering and Technology Research Center for Rehabilitation Medicine and Translation, Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, 58 Zhongshan Second Road, Guangzhou, 510000, China, 86 87332200 ext 8536, ambroselo0726@outlook.com %K low back pain %K neck pain %K mobile app %K exercise therapy %K mHealth %D 2018 %7 26.11.2018 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Chronic musculoskeletal neck and back pain are disabling conditions among adults. 
Use of technology has been suggested as an alternative way to increase adherence to exercise therapy, which may improve clinical outcomes. Objective: The aim was to investigate the self-perceived benefits of an artificial intelligence (AI)–embedded mobile app to self-manage chronic neck and back pain. Methods: A total of 161 participants responded to the invitation. The evaluation questionnaire included 14 questions that were intended to explore if using the AI rehabilitation system may (1) increase time spent on therapeutic exercise, (2) affect pain level (assessed by the 0-10 Numerical Pain Rating Scale), and (3) reduce the need for other interventions. Results: An increase in time spent on therapeutic exercise per day was observed. The median Numerical Pain Rating Scale scores were 6 (interquartile range [IQR] 5-8) before and 4 (IQR 3-6) after using the AI-embedded mobile app (95% CI 1.18-1.81). A 3-point reduction was reported by the participants who used the AI-embedded mobile app for more than 6 months. Reduction in the usage of other interventions while using the AI-embedded mobile app was also reported. Conclusions: This study demonstrated the positive self-perceived beneficiary effect of using the AI-embedded mobile app to provide a personalized therapeutic exercise program. The positive results suggest that it at least warrants further study to investigate the physiological effect of the AI-embedded mobile app and how it compares with routine clinical care. 
%M 30478019 %R 10.2196/mhealth.8127 %U http://mhealth.jmir.org/2018/11/e198/ %U https://doi.org/10.2196/mhealth.8127 %U http://www.ncbi.nlm.nih.gov/pubmed/30478019 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 9 %P e11510 %T Patient and Consumer Safety Risks When Using Conversational Assistants for Medical Information: An Observational Study of Siri, Alexa, and Google Assistant %A Bickmore,Timothy W %A Trinh,Ha %A Olafsson,Stefan %A O'Leary,Teresa K %A Asadi,Reza %A Rickles,Nathaniel M %A Cruz,Ricardo %+ College of Computer and Information Science, Northeastern University, 910-177, 360 Huntington Avenue, Boston, MA, 02115, United States, 1 6173735477, bickmore@ccs.neu.edu %K conversational assistant %K conversational interface %K dialogue system %K medical error %K patient safety %D 2018 %7 04.09.2018 %9 Original Paper %J J Med Internet Res %G English %X Background: Conversational assistants, such as Siri, Alexa, and Google Assistant, are ubiquitous and are beginning to be used as portals for medical services. However, the potential safety issues of using conversational assistants for medical information by patients and consumers are not understood. Objective: To determine the prevalence and nature of the harm that could result from patients or consumers using conversational assistants for medical information. Methods: Participants were given medical problems to pose to Siri, Alexa, or Google Assistant, and asked to determine an action to take based on information from the system. Assignment of tasks and systems were randomized across participants, and participants queried the conversational assistants in their own words, making as many attempts as needed until they either reported an action to take or gave up. Participant-reported actions for each medical task were rated for patient harm using an Agency for Healthcare Research and Quality harm scale. Results: Fifty-four subjects completed the study with a mean age of 42 years (SD 18). 
Twenty-nine (54%) were female, 31 (57%) Caucasian, and 26 (50%) were college educated. Only 8 (15%) reported using a conversational assistant regularly, while 22 (41%) had never used one, and 24 (44%) had tried one “a few times.” Forty-four (82%) used computers regularly. Subjects were only able to complete 168 (43%) of their 394 tasks. Of these, 49 (29%) reported actions that could have resulted in some degree of patient harm, including 27 (16%) that could have resulted in death. Conclusions: Reliance on conversational assistants for actionable medical information represents a safety risk for patients and consumers. Patients should be cautioned to not use these technologies for answers to medical questions they intend to act on without further consultation from a health care provider. %M 30181110 %R 10.2196/11510 %U http://www.jmir.org/2018/9/e11510/ %U https://doi.org/10.2196/11510 %U http://www.ncbi.nlm.nih.gov/pubmed/30181110 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 5 %N 3 %P e10454 %T An Embodied Conversational Agent for Unguided Internet-Based Cognitive Behavior Therapy in Preventative Mental Health: Feasibility and Acceptability Pilot Trial %A Suganuma,Shinichiro %A Sakamoto,Daisuke %A Shimoyama,Haruhiko %+ Department of Clinical Psychology, Graduate School of Education, The University of Tokyo, Tokyo, Japan, 81 3 5841 8068, sgnm.shin@gmail.com %K embodied conversational agent %K cognitive behavioral therapy %K psychological distress %K mental well‐being %K artificial intelligence technology %D 2018 %7 31.07.2018 %9 Original Paper %J JMIR Ment Health %G English %X Background: Recent years have seen an increase in the use of internet-based cognitive behavioral therapy in the area of mental health. Although lower effectiveness and higher dropout rates of unguided than those of guided internet-based cognitive behavioral therapy remain critical issues, not incurring ongoing human clinical resources makes it highly advantageous. 
Objective: Current research in psychotherapy, which acknowledges the importance of therapeutic alliance, aims to evaluate the feasibility and acceptability, in terms of mental health, of an application that is embodied with a conversational agent. This application was enabled for use as an internet-based cognitive behavioral therapy preventative mental health measure. Methods: Analysis of the data from the 191 participants of the experimental group with a mean age of 38.07 (SD 10.75) years and the 263 participants of the control group with a mean age of 38.05 (SD 13.45) years using a 2-way factorial analysis of variance (group × time) was performed. Results: There was a significant main effect (P=.02) and interaction for time on the variable of positive mental health (P=.02), and for the treatment group, a significant simple main effect was also found (P=.002). In addition, there was a significant main effect (P=.02) and interaction for time on the variable of negative mental health (P=.005), and for the treatment group, a significant simple main effect was also found (P=.001). Conclusions: This research can be seen to represent a certain level of evidence for the mental health application developed herein, indicating empirically that internet-based cognitive behavioral therapy with the embodied conversational agent can be used in mental health care. In the pilot trial, given the issues related to feasibility and acceptability, it is necessary to pursue higher quality evidence while continuing to further improve the application, based on the findings of the current research. 
%M 30064969 %R 10.2196/10454 %U http://mental.jmir.org/2018/3/e10454/ %U https://doi.org/10.2196/10454 %U http://www.ncbi.nlm.nih.gov/pubmed/30064969 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 6 %P e10148 %T Towards an Artificially Empathic Conversational Agent for Mental Health Applications: System Design and User Perceptions %A Morris,Robert R %A Kouddous,Kareem %A Kshirsagar,Rohan %A Schueller,Stephen M %+ Koko, 155 Rivington St, New York, NY, 10002, United States, 1 617 851 4967, rob@koko.ai %K conversational agents %K mental health %K empathy %K crowdsourcing %K peer support %D 2018 %7 26.06.2018 %9 Original Paper %J J Med Internet Res %G English %X Background: Conversational agents cannot yet express empathy in nuanced ways that account for the unique circumstances of the user. Agents that possess this faculty could be used to enhance digital mental health interventions. Objective: We sought to design a conversational agent that could express empathic support in ways that might approach, or even match, human capabilities. Another aim was to assess how users might appraise such a system. Methods: Our system used a corpus-based approach to simulate expressed empathy. Responses from an existing pool of online peer support data were repurposed by the agent and presented to the user. Information retrieval techniques and word embeddings were used to select historical responses that best matched a user’s concerns. We collected ratings from 37,169 users to evaluate the system. Additionally, we conducted a controlled experiment (N=1284) to test whether the alleged source of a response (human or machine) might change user perceptions. Results: The majority of responses created by the agent (2986/3770, 79.20%) were deemed acceptable by users. However, users significantly preferred the efforts of their peers (P<.001). 
This effect was maintained in a controlled study (P=.02), even when the only difference in responses was whether they were framed as coming from a human or a machine. Conclusions: Our system illustrates a novel way for machines to construct nuanced and personalized empathic utterances. However, the design had significant limitations and further research is needed to make this approach viable. Our controlled study suggests that even in ideal conditions, nonhuman agents may struggle to express empathy as well as humans. The ethical implications of empathic agents, as well as their potential iatrogenic effects, are also discussed. %M 29945856 %R 10.2196/10148 %U http://www.jmir.org/2018/6/e10148/ %U https://doi.org/10.2196/10148 %U http://www.ncbi.nlm.nih.gov/pubmed/29945856 %0 Journal Article %@ 2368-7959 %I JMIR Publications %V 5 %N 2 %P e32 %T Ethical Issues for Direct-to-Consumer Digital Psychotherapy Apps: Addressing Accountability, Data Protection, and Consent %A Martinez-Martin,Nicole %A Kreitmair,Karola %+ Stanford Center for Biomedical Ethics, 1215 Welch Road, Stanford, CA, 94305, United States, 1 650 723 5760, nicolemz@stanford.edu %K ethics %K ethical issues %K mental health %K technology %K telemedicine %K mHealth %K psychotherapy %D 2018 %7 23.04.2018 %9 Viewpoint %J JMIR Ment Health %G English %X This paper focuses on the ethical challenges presented by direct-to-consumer (DTC) digital psychotherapy services that do not involve oversight by a professional mental health provider. DTC digital psychotherapy services can potentially assist in improving access to mental health care for the many people who would otherwise not have the resources or ability to connect with a therapist. However, the lack of adequate regulation in this area exacerbates concerns over how safety, privacy, accountability, and other ethical obligations to protect an individual in therapy are addressed within these services. 
In the traditional therapeutic relationship, there are ethical obligations that serve to protect the interests of the client and provide warnings. In contrast, in a DTC therapy app, there are no clear lines of accountability or associated ethical obligations to protect the user seeking mental health services. The types of DTC services that present ethical challenges include apps that use a digital platform to connect users to minimally trained nonprofessional counselors, as well as services that provide counseling steered by artificial intelligence and conversational agents. There is a need for adequate oversight of DTC nonprofessional psychotherapy services and additional empirical research to inform policy that will provide protection to the consumer. %M 29685865 %R 10.2196/mental.9423 %U http://mental.jmir.org/2018/2/e32/ %U https://doi.org/10.2196/mental.9423 %U http://www.ncbi.nlm.nih.gov/pubmed/29685865 %0 Journal Article %@ 2369-6893 %I JMIR Publications %V 3 %N 1 %P e37 %T Feasibility of an Automated System Counselor for Survivors of Sexual Assault %A Howe,Esther %A Pedrelli,Paola %A Morris,Robert %A Nyer,Maren %A Mischoulon,David %A Picard,Rosalind %+ Department of Psychiatry, Massachusetts General Hospital, 6th Floor, 1 Bowdoin Square, Boston, MA, United States, 1 6176437690, ehowe3@mgh.harvard.edu %K CBT %K web chat %D 2017 %7 22.09.2017 %9 Abstract %J iproc %G English %X Background: Sexual assault (SA) is common and costly to individuals and society, and increases risk of mental health disorders. Stigma and cost of care discourage survivors from seeking help. Norms profiling survivors as heterosexual, cisgendered women dissuade LGBTQIA+ individuals and men from accessing care. Because individuals prefer disclosing sensitive information online rather than in person, online systems, like instant messaging and chatbots, for counseling may bypass concerns about stigma.
These systems’ anonymity may increase disclosure and decrease impression management, the process by which individuals attempt to influence others’ perceptions. Their low cost may expand reach of care. There are no known evidence-based chat platforms for SA survivors. Objective: To examine feasibility of a chat platform with peer and automated system (chatbot) counseling interfaces to provide cognitive reappraisals (a cognitive behavioral therapy technique) to survivors. Methods: Participants are English-speaking, US-based survivors, 18+ years old. Participants are told they will be randomized to chat with a peer or automated system counselor 5 times over 2 weeks. In reality, all participants chat with a peer counselor. Chats employ a modified-for-context evidence-based cognitive reappraisal script developed by Koko, a company offering support services for emotional distress via social networks. At baseline, participants indicate counselor type preference and complete a basic demographic form, the Brief Fear of Negative Evaluation Scale, and self-disclosure items from the International Personality Item Pool. After 5 chats, participants complete questions from the Client Satisfaction Questionnaire (CSQ), Self-Reported Attitudes Toward Agent, and the Working Alliance Inventory. Hypotheses: 1) Online chatting and automated systems will be acceptable and feasible means of delivering cognitive reappraisals to survivors. 2) High impression management (IM≥25) and low self-disclosure (SD≤45) will be associated with preference for an automated system. 3) IM and SD will separately moderate the relationship between counselor assignment and participant satisfaction. Results: Ten participants have completed the study. Recruitment is ongoing. We will enroll 50+ participants by 10/2017 and outline findings at the Connected Health Conference. 
To date, 70% of participants completed all chats within 24 hours of enrollment, and 60% indicated a pre-chat preference for an automated system, suggesting acceptability of the concept. The post-chat CSQ mean total score of 3.98 on a 5-point Likert scale (1=Poor; 5=Excellent) suggests platform acceptability. Of the 50% reporting high IM, 60% indicated preference for an automated system. Of the 30% reporting low SD, 33% reported preference for an automated system. At recruitment completion, ANOVA analyses will elucidate relationships between IM, SD, and counselor assignment. Correlation and linear regression analyses will show any moderating effect of IM and SD on the relationship between counselor assignment and participant satisfaction. Conclusions: Preliminary results suggest acceptability and feasibility of cognitive reappraisals via chat for survivors, and of the automated system counselor concept. Final results will explore relationships between SD, IM, counselor type, and participant satisfaction to inform the development of new platforms for survivors. %R 10.2196/iproc.8585 %U http://www.iproc.org/2017/1/e37/ %U https://doi.org/10.2196/iproc.8585