TY - JOUR AU - Ikemura, Kenji AU - Bellin, Eran AU - Yagi, Yukako AU - Billett, Henny AU - Saada, Mahmoud AU - Simone, Katelyn AU - Stahl, Lindsay AU - Szymanski, James AU - Goldstein, D Y AU - Reyes Gil, Morayma PY - 2021 DA - 2021/2/26 TI - Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study JO - J Med Internet Res SP - e23458 VL - 23 IS - 2 KW - automated machine learning KW - COVID-19 KW - biomarker KW - ranking KW - decision support tool KW - machine learning KW - decision support KW - Shapley additive explanation KW - partial dependence plot KW - dimensionality reduction AB - Background: During a pandemic, it is important for clinicians to stratify patients and decide who receives limited medical resources. Machine learning models have been proposed to accurately predict COVID-19 disease severity. Previous studies have typically tested only one machine learning algorithm and limited performance evaluation to area under the curve analysis. To obtain the best results possible, it may be important to test different machine learning algorithms to find the best prediction model. Objective: In this study, we aimed to use automated machine learning (autoML) to train various machine learning algorithms. We selected the model that best predicted patients’ chances of surviving a SARS-CoV-2 infection. In addition, we identified which variables (ie, vital signs, biomarkers, comorbidities, etc) were the most influential in generating an accurate model. Methods: Data were retrospectively collected from all patients who tested positive for COVID-19 at our institution between March 1 and July 3, 2020. We collected 48 variables from each patient within 36 hours before or after the index time (ie, real-time polymerase chain reaction positivity). Patients were followed for 30 days or until death. Patients’ data were used to build 20 machine learning models with various algorithms via autoML. 
The performance of machine learning models was measured by analyzing the area under the precision-recall curve (AUPRC). Subsequently, we established model interpretability via Shapley additive explanation and partial dependence plots to identify and rank variables that drove model predictions. Afterward, we conducted dimensionality reduction to extract the 10 most influential variables. AutoML models were retrained by only using these 10 variables, and the output models were evaluated against the model that used 48 variables. Results: Data from 4313 patients were used to develop the models. The best model that was generated by using autoML and 48 variables was the stacked ensemble model (AUPRC=0.807). The two best independent models were the gradient boost machine and extreme gradient boost models, which had an AUPRC of 0.803 and 0.793, respectively. The deep learning model (AUPRC=0.73) was substantially inferior to the other models. The 10 most influential variables for generating high-performing models were systolic and diastolic blood pressure, age, pulse oximetry level, blood urea nitrogen level, lactate dehydrogenase level, D-dimer level, troponin level, respiratory rate, and Charlson comorbidity score. After the autoML models were retrained with these 10 variables, the stacked ensemble model still had the best performance (AUPRC=0.791). Conclusions: We used autoML to develop high-performing models that predicted the survival of patients with COVID-19. In addition, we identified important variables that correlated with mortality. This is proof of concept that autoML is an efficient, effective, and informative method for generating machine learning–based clinical decision support tools. 
UR - https://www.jmir.org/2021/2/e23458 UR - https://doi.org/10.2196/23458 UR - http://www.ncbi.nlm.nih.gov/pubmed/33539308 DO - 10.2196/23458 ID - info:doi/10.2196/23458 ER - TY - JOUR AU - Rosado, Eduardo AU - Garcia-Remesal, Miguel AU - Paraiso-Medina, Sergio AU - Pazos, Alejandro AU - Maojo, Victor PY - 2021 DA - 2021/2/25 TI - Using Machine Learning to Collect and Facilitate Remote Access to Biomedical Databases: Development of the Biomedical Database Inventory JO - JMIR Med Inform SP - e22976 VL - 9 IS - 2 KW - biomedical databases KW - natural language processing KW - deep learning KW - internet KW - biomedical knowledge AB - Background: Currently, existing biomedical literature repositories do not commonly provide users with specific means to locate and remotely access biomedical databases. Objective: To address this issue, we developed the Biomedical Database Inventory (BiDI), a repository linking to biomedical databases automatically extracted from the scientific literature. BiDI provides an index of data resources and a path to access them seamlessly. Methods: We designed an ensemble of deep learning methods to extract database mentions. To train the system, we annotated a set of 1242 articles that included mentions of database publications. Such a data set was used along with transfer learning techniques to train an ensemble of deep learning natural language processing models targeted at database publication detection. Results: The system obtained an F1 score of 0.929 on database detection, showing high precision and recall values. When applying this model to the PubMed and PubMed Central databases, we identified over 10,000 unique databases. The ensemble model also extracted the weblinks to the reported databases and discarded irrelevant links. For the extraction of weblinks, the model achieved a cross-validated F1 score of 0.908. We show two use cases: one related to “omics” and the other related to the COVID-19 pandemic. 
Conclusions: BiDI enables access to biomedical resources over the internet and facilitates data-driven research and other scientific initiatives. The repository is openly available online and will be regularly updated with an automatic text processing pipeline. The approach can be reused to create repositories of different types (ie, biomedical and others). UR - https://medinform.jmir.org/2021/2/e22976 UR - https://doi.org/10.2196/22976 UR - http://www.ncbi.nlm.nih.gov/pubmed/33629960 DO - 10.2196/22976 ID - info:doi/10.2196/22976 ER - TY - JOUR AU - Hu, Mingyue AU - Shu, Xinhui AU - Yu, Gang AU - Wu, Xinyin AU - Välimäki, Maritta AU - Feng, Hui PY - 2021 DA - 2021/2/24 TI - A Risk Prediction Model Based on Machine Learning for Cognitive Impairment Among Chinese Community-Dwelling Elderly People With Normal Cognition: Development and Validation Study JO - J Med Internet Res SP - e20298 VL - 23 IS - 2 KW - prediction model KW - cognitive impairment KW - machine learning KW - nomogram AB - Background: Identifying cognitive impairment early enough could support timely intervention that may hinder or delay the trajectory of cognitive impairment, thus increasing the chances for successful cognitive aging. Objective: We aimed to build a prediction model based on machine learning for cognitive impairment among Chinese community-dwelling elderly people with normal cognition. Methods: A prospective cohort of 6718 older people from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) register, followed between 2008 and 2011, was used to develop and validate the prediction model. Participants were included if they were aged 60 years or above, were community-dwelling elderly people, and had a cognitive Mini-Mental State Examination (MMSE) score ≥18. They were excluded if they were diagnosed with a severe disease (eg, cancer and dementia) or were living in institutions. Cognitive impairment was identified using the Chinese version of the MMSE. 
Several machine learning algorithms (random forest, XGBoost, naïve Bayes, and logistic regression) were used to assess the 3-year risk of developing cognitive impairment. Optimal cutoffs and adjusted parameters were explored in validation data, and the model was further evaluated in test data. A nomogram was established to vividly present the prediction model. Results: The mean age of the participants was 80.4 years (SD 10.3 years), and 50.85% (3416/6718) were female. During a 3-year follow-up, 991 (14.8%) participants were identified with cognitive impairment. Among 45 features, the following four features were finally selected to develop the model: age, instrumental activities of daily living, marital status, and baseline cognitive function. The concordance index of the model constructed by logistic regression was 0.814 (95% CI 0.781-0.846). Older people with normal cognitive functioning having a nomogram score of less than 170 were considered to have a low 3-year risk of cognitive impairment, and those with a score of 170 or greater were considered to have a high 3-year risk of cognitive impairment. Conclusions: This simple and feasible cognitive impairment prediction model could identify community-dwelling elderly people at the greatest 3-year risk for cognitive impairment, which could help community nurses in the early identification of dementia. 
UR - https://www.jmir.org/2021/2/e20298 UR - https://doi.org/10.2196/20298 UR - http://www.ncbi.nlm.nih.gov/pubmed/33625369 DO - 10.2196/20298 ID - info:doi/10.2196/20298 ER - TY - JOUR AU - Liu, Taoran AU - Tsang, Winghei AU - Huang, Fengqiu AU - Lau, Oi Ying AU - Chen, Yanhui AU - Sheng, Jie AU - Guo, Yiwei AU - Akinwunmi, Babatunde AU - Zhang, Casper JP AU - Ming, Wai-Kit PY - 2021 DA - 2021/2/23 TI - Patients’ Preferences for Artificial Intelligence Applications Versus Clinicians in Disease Diagnosis During the SARS-CoV-2 Pandemic in China: Discrete Choice Experiment JO - J Med Internet Res SP - e22841 VL - 23 IS - 2 KW - discrete choice experiment KW - artificial intelligence KW - patient preference KW - multinomial logit analysis KW - questionnaire KW - latent-class conditional logit KW - app KW - human clinicians KW - diagnosis KW - COVID-19 KW - China AB - Background: Misdiagnosis, arbitrary charges, annoying queues, and clinic waiting times among others are long-standing phenomena in the medical industry across the world. These factors can contribute to patient anxiety about misdiagnosis by clinicians. However, with the increasing growth in use of big data in biomedical and health care communities, the performance of artificial intelligence (AI) techniques of diagnosis is improving and can help avoid medical practice errors, including under the current circumstance of COVID-19. Objective: This study aims to visualize and measure patients’ heterogeneous preferences from various angles of AI diagnosis versus clinicians in the context of the COVID-19 epidemic in China. We also aim to illustrate the different decision-making factors of the latent class of a discrete choice experiment (DCE) and prospects for the application of AI techniques in judgment and management during the pandemic of SARS-CoV-2 and in the future. Methods: A DCE approach was the main analysis method applied in this paper. 
Attributes from different dimensions were hypothesized: diagnostic method, outpatient waiting time, diagnosis time, accuracy, follow-up after diagnosis, and diagnostic expense. After that, a questionnaire was formed. With data collected from the DCE questionnaire, we applied Sawtooth software to construct a generalized multinomial logit (GMNL) model, mixed logit model, and latent class model. Moreover, we calculated the variables’ coefficients, standard error, P value, and odds ratio (OR) and formed a utility report to present the importance and weighted percentage of attributes. Results: A total of 55.8% of the respondents (428 out of 767) opted for AI diagnosis regardless of the description of the clinicians. In the GMNL model, we found that people prefer the 100% accuracy level the most (OR 4.548, 95% CI 4.048-5.110, P<.001). For the latent class model, the most acceptable model consists of 3 latent classes of respondents. The attributes with the most substantial effects and highest percentage weights are the accuracy (39.29% in general) and expense of diagnosis (21.69% in general), especially the preferences for the diagnosis “accuracy” attribute, which is constant across classes. For class 1 and class 3, people prefer the AI + clinicians method (class 1: OR 1.247, 95% CI 1.036-1.463, P<.001; class 3: OR 1.958, 95% CI 1.769-2.167, P<.001). For class 2, people prefer the AI method (OR 1.546, 95% CI 0.883-2.707, P=.37). The OR of levels of attributes increases with the increase of accuracy across all classes. Conclusions: Latent class analysis was prominent and useful in quantifying preferences for attributes of diagnosis choice. People’s preferences for the “accuracy” and “diagnostic expenses” attributes are palpable. AI will have a potential market. However, accuracy and diagnosis expenses need to be taken into consideration. 
UR - https://www.jmir.org/2021/2/e22841 UR - https://doi.org/10.2196/22841 UR - http://www.ncbi.nlm.nih.gov/pubmed/33493130 DO - 10.2196/22841 ID - info:doi/10.2196/22841 ER - TY - JOUR AU - Sang, Shengtian AU - Sun, Ran AU - Coquet, Jean AU - Carmichael, Harris AU - Seto, Tina AU - Hernandez-Boussard, Tina PY - 2021 DA - 2021/2/22 TI - Learning From Past Respiratory Infections to Predict COVID-19 Outcomes: Retrospective Study JO - J Med Internet Res SP - e23026 VL - 23 IS - 2 KW - COVID-19 KW - invasive mechanical ventilation KW - all-cause mortality KW - machine learning KW - artificial intelligence KW - respiratory KW - infection KW - outcome KW - data KW - feasibility KW - framework AB - Background: For the clinical care of patients with well-established diseases, randomized trials, literature, and research are supplemented with clinical judgment to understand disease prognosis and inform treatment choices. In the void created by a lack of clinical experience with COVID-19, artificial intelligence (AI) may be an important tool to bolster clinical judgment and decision making. However, a lack of clinical data restricts the design and development of such AI tools, particularly in preparation for an impending crisis or pandemic. Objective: This study aimed to develop and test the feasibility of a “patients-like-me” framework to predict the deterioration of patients with COVID-19 using a retrospective cohort of patients with similar respiratory diseases. Methods: Our framework used COVID-19–like cohorts to design and train AI models that were then validated on the COVID-19 population. The COVID-19–like cohorts included patients diagnosed with bacterial pneumonia, viral pneumonia, unspecified pneumonia, influenza, and acute respiratory distress syndrome (ARDS) at an academic medical center from 2008 to 2019. In total, 15 training cohorts were created using different combinations of the COVID-19–like cohorts with the ARDS cohort for exploratory purposes. 
In this study, two machine learning models were developed: one to predict invasive mechanical ventilation (IMV) within 48 hours for each hospitalized day, and one to predict all-cause mortality at the time of admission. Model performance was assessed using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value, and negative predictive value. We established model interpretability by calculating SHapley Additive exPlanations (SHAP) scores to identify important features. Results: Compared to the COVID-19–like cohorts (n=16,509), the patients hospitalized with COVID-19 (n=159) were significantly younger, with a higher proportion of patients of Hispanic ethnicity, a lower proportion of patients with smoking history, and fewer patients with comorbidities (P<.001). Patients with COVID-19 had a lower IMV rate (15.1 versus 23.2, P=.02) and shorter time to IMV (2.9 versus 4.1 days, P<.001) compared to the COVID-19–like patients. In the COVID-19–like training data, the top models achieved excellent performance (AUROC>0.90). Validating in the COVID-19 cohort, the top-performing model for predicting IMV was the XGBoost model (AUROC=0.826) trained on the viral pneumonia cohort. Similarly, the XGBoost model trained on all 4 COVID-19–like cohorts without ARDS achieved the best performance (AUROC=0.928) in predicting mortality. Important predictors included demographic information (age), vital signs (oxygen saturation), and laboratory values (white blood cell count, cardiac troponin, albumin, etc). Our models had class imbalance, which resulted in high negative predictive values and low positive predictive values. Conclusions: We provided a feasible framework for modeling patient deterioration using existing data and AI technology to address data limitations during the onset of a novel, rapidly changing pandemic. 
UR - https://www.jmir.org/2021/2/e23026 UR - https://doi.org/10.2196/23026 UR - http://www.ncbi.nlm.nih.gov/pubmed/33534724 DO - 10.2196/23026 ID - info:doi/10.2196/23026 ER - TY - JOUR AU - Abrami, Avner AU - Gunzler, Steven AU - Kilbane, Camilla AU - Ostrand, Rachel AU - Ho, Bryan AU - Cecchi, Guillermo PY - 2021 DA - 2021/2/22 TI - Automated Computer Vision Assessment of Hypomimia in Parkinson Disease: Proof-of-Principle Pilot Study JO - J Med Internet Res SP - e21037 VL - 23 IS - 2 KW - Parkinson disease KW - hypomimia KW - computer vision KW - telemedicine AB - Background: Facial expressions require the complex coordination of 43 different facial muscles. Parkinson disease (PD) affects facial musculature leading to “hypomimia” or “masked facies.” Objective: We aimed to determine whether modern computer vision techniques can be applied to detect masked facies and quantify drug states in PD. Methods: We trained a convolutional neural network on images extracted from videos of 107 self-identified people with PD, along with 1595 videos of controls, in order to detect PD hypomimia cues. This trained model was applied to clinical interviews of 35 PD patients in their on and off drug motor states, and seven journalist interviews of the actor Alan Alda obtained before and after he was diagnosed with PD. Results: The algorithm achieved a test set area under the receiver operating characteristic curve of 0.71 on 54 subjects to detect PD hypomimia, compared to a value of 0.75 for trained neurologists using the Unified Parkinson Disease Rating Scale-III Facial Expression score. Additionally, the model accuracy to classify the on and off drug states in the clinical samples was 63% (22/35), in contrast to an accuracy of 46% (16/35) when using clinical rater scores. Finally, each of Alan Alda’s seven interviews was successfully classified as occurring before (versus after) his diagnosis, with 100% accuracy (7/7). 
Conclusions: This proof-of-principle pilot study demonstrated that computer vision holds promise as a valuable tool for PD hypomimia and for monitoring a patient’s motor state in an objective and noninvasive way, particularly given the increasing importance of telemedicine. UR - https://www.jmir.org/2021/2/e21037 UR - https://doi.org/10.2196/21037 UR - http://www.ncbi.nlm.nih.gov/pubmed/33616535 DO - 10.2196/21037 ID - info:doi/10.2196/21037 ER - TY - JOUR AU - Lam, Kyle AU - Iqbal, Fahad M AU - Purkayastha, Sanjay AU - Kinross, James M PY - 2021 DA - 2021/2/22 TI - Investigating the Ethical and Data Governance Issues of Artificial Intelligence in Surgery: Protocol for a Delphi Study JO - JMIR Res Protoc SP - e26552 VL - 10 IS - 2 KW - artificial intelligence KW - digital surgery KW - Delphi KW - ethics KW - data governance KW - digital technology KW - operating room KW - surgery AB - Background: The rapid uptake of digital technology into the operating room has the potential to improve patient outcomes, increase efficiency of the use of operating rooms, and allow surgeons to progress quickly up learning curves. These technologies are, however, dependent on huge amounts of data, and the consequences of their mismanagement are significant. While the field of artificial intelligence ethics is able to provide a broad framework for those designing and implementing these technologies into the operating room, there is a need to determine and address the ethical and data governance challenges of using digital technology in this unique environment. Objective: The objectives of this study are to define the term digital surgery and gain expert consensus on the key ethical and data governance issues, barriers, and future research goals of the use of artificial intelligence in surgery. Methods: Experts from the fields of surgery, ethics and law, policy, artificial intelligence, and industry will be invited to participate in a 4-round consensus Delphi exercise. 
In the first round, participants will supply free-text responses across 4 key domains: ethics, data governance, barriers, and future research goals. They will also be asked to provide their understanding of the term digital surgery. In subsequent rounds, statements will be grouped, and participants will be asked to rate the importance of each issue on a 9-point Likert scale ranging from 1 (not at all important) to 9 (critically important). Consensus is defined a priori as a score of 7 to 9 by 70% of respondents and 1 to 3 by less than 30% of respondents. A final online meeting round will be held to discuss inclusion of statements and draft a consensus document. Results: Full ethical approval has been obtained for the study by the local research ethics committee at Imperial College, London (20IC6136). We anticipate round 1 to commence in January 2021. Conclusions: The results of this study will define the term digital surgery, identify the key issues and barriers, and shape future research in this area. International Registered Report Identifier (IRRID): PRR1-10.2196/26552 UR - https://www.researchprotocols.org/2021/2/e26552 UR - https://doi.org/10.2196/26552 UR - http://www.ncbi.nlm.nih.gov/pubmed/33616543 DO - 10.2196/26552 ID - info:doi/10.2196/26552 ER - TY - JOUR AU - Lennartz, Simon AU - Dratsch, Thomas AU - Zopfs, David AU - Persigehl, Thorsten AU - Maintz, David AU - Große Hokamp, Nils AU - Pinto dos Santos, Daniel PY - 2021 DA - 2021/2/17 TI - Use and Control of Artificial Intelligence in Patients Across the Medical Workflow: Single-Center Questionnaire Study of Patient Perspectives JO - J Med Internet Res SP - e24221 VL - 23 IS - 2 KW - artificial intelligence KW - clinical implementation KW - questionnaire KW - survey AB - Background: Artificial intelligence (AI) is gaining increasing importance in many medical specialties, yet data on patients’ opinions on the use of AI in medicine are scarce. 
Objective: This study aimed to investigate patients’ opinions on the use of AI in different aspects of the medical workflow and the level of control and supervision under which they would deem the application of AI in medicine acceptable. Methods: Patients scheduled for computed tomography or magnetic resonance imaging voluntarily participated in an anonymized questionnaire between February 10, 2020, and May 24, 2020. Patient information, confidence in physicians vs AI in different clinical tasks, opinions on the control of AI, preference in cases of disagreement between AI and physicians, and acceptance of the use of AI for diagnosing and treating diseases of different severity were recorded. Results: In total, 229 patients participated. Patients favored physicians over AI for all clinical tasks except for treatment planning based on current scientific evidence. In case of disagreement between physicians and AI regarding diagnosis and treatment planning, most patients preferred the physician’s opinion to AI (96.2% [153/159] vs 3.8% [6/159] and 94.8% [146/154] vs 5.2% [8/154], respectively; P=.001). AI supervised by a physician was considered more acceptable than AI without physician supervision at diagnosis (confidence rating 3.90 [SD 1.20] vs 1.64 [SD 1.03], respectively; P=.001) and therapy (3.77 [SD 1.18] vs 1.57 [SD 0.96], respectively; P=.001). Conclusions: Patients favored physicians over AI in most clinical tasks and strongly preferred an application of AI with physician supervision. However, patients acknowledged that AI could help physicians integrate the most recent scientific evidence into medical care. Application of AI in medicine should be disclosed and controlled to protect patient interests and meet ethical standards. 
UR - http://www.jmir.org/2021/2/e24221/ UR - https://doi.org/10.2196/24221 UR - http://www.ncbi.nlm.nih.gov/pubmed/33595451 DO - 10.2196/24221 ID - info:doi/10.2196/24221 ER - TY - JOUR AU - Quiroz, Juan Carlos AU - Feng, You-Zhen AU - Cheng, Zhong-Yuan AU - Rezazadegan, Dana AU - Chen, Ping-Kang AU - Lin, Qi-Ting AU - Qian, Long AU - Liu, Xiao-Fang AU - Berkovsky, Shlomo AU - Coiera, Enrico AU - Song, Lei AU - Qiu, Xiaoming AU - Liu, Sidong AU - Cai, Xiang-Ran PY - 2021 DA - 2021/2/11 TI - Development and Validation of a Machine Learning Approach for Automated Severity Assessment of COVID-19 Based on Clinical and Imaging Data: Retrospective Study JO - JMIR Med Inform SP - e24572 VL - 9 IS - 2 KW - algorithm KW - clinical data KW - clinical features KW - COVID-19 KW - CT scans KW - development KW - imaging KW - imbalanced data KW - machine learning KW - oversampling KW - severity assessment KW - validation AB - Background: COVID-19 has overwhelmed health systems worldwide. It is important to identify severe cases as early as possible, such that resources can be mobilized and treatment can be escalated. Objective: This study aims to develop a machine learning approach for automated severity assessment of COVID-19 based on clinical and imaging data. Methods: Clinical data—including demographics, signs, symptoms, comorbidities, and blood test results—and chest computed tomography scans of 346 patients from 2 hospitals in the Hubei Province, China, were used to develop machine learning models for automated severity assessment in diagnosed COVID-19 cases. We compared the predictive power of the clinical and imaging data from multiple machine learning models and further explored the use of four oversampling methods to address the imbalanced classification issue. Features with the highest predictive power were identified using the Shapley Additive Explanations framework. 
Results: Imaging features had the strongest impact on the model output, while a combination of clinical and imaging features yielded the best performance overall. The identified predictive features were consistent with those reported previously. Although oversampling yielded mixed results, it achieved the best model performance in our study. Logistic regression models differentiating between mild and severe cases achieved the best performance for clinical features (area under the curve [AUC] 0.848; sensitivity 0.455; specificity 0.906), imaging features (AUC 0.926; sensitivity 0.818; specificity 0.901), and a combination of clinical and imaging features (AUC 0.950; sensitivity 0.764; specificity 0.919). The synthetic minority oversampling method further improved the performance of the model using combined features (AUC 0.960; sensitivity 0.845; specificity 0.929). Conclusions: Clinical and imaging features can be used for automated severity assessment of COVID-19 and can potentially help triage patients with COVID-19 and prioritize care delivery to those at a higher risk of severe disease. 
UR - http://medinform.jmir.org/2021/2/e24572/ UR - https://doi.org/10.2196/24572 UR - http://www.ncbi.nlm.nih.gov/pubmed/33534723 DO - 10.2196/24572 ID - info:doi/10.2196/24572 ER - TY - JOUR AU - Bolourani, Siavash AU - Brenner, Max AU - Wang, Ping AU - McGinn, Thomas AU - Hirsch, Jamie S AU - Barnaby, Douglas AU - Zanos, Theodoros P PY - 2021 DA - 2021/2/10 TI - A Machine Learning Prediction Model of Respiratory Failure Within 48 Hours of Patient Admission for COVID-19: Model Development and Validation JO - J Med Internet Res SP - e24246 VL - 23 IS - 2 KW - artificial intelligence KW - prognostic KW - model KW - pandemic KW - severe acute respiratory syndrome coronavirus 2 KW - modeling KW - development KW - validation KW - COVID-19 KW - machine learning AB - Background: Predicting early respiratory failure due to COVID-19 can help triage patients to higher levels of care, allocate scarce resources, and reduce morbidity and mortality by appropriately monitoring and treating the patients at greatest risk for deterioration. Given the complexity of COVID-19, machine learning approaches may support clinical decision making for patients with this disease. Objective: Our objective is to derive a machine learning model that predicts respiratory failure within 48 hours of admission based on data from the emergency department. Methods: Data were collected from patients with COVID-19 who were admitted to Northwell Health acute care hospitals and were discharged, died, or spent a minimum of 48 hours in the hospital between March 1 and May 11, 2020. Of 11,525 patients, 933 (8.1%) were placed on invasive mechanical ventilation within 48 hours of admission. Variables used by the models included clinical and laboratory data commonly collected in the emergency department. We trained and validated three predictive models (two based on XGBoost and one that used logistic regression) using cross-hospital validation. 
We compared model performance among all three models as well as an established early warning score (Modified Early Warning Score) using receiver operating characteristic curves, precision-recall curves, and other metrics. Results: The XGBoost model had the highest mean accuracy (0.919; area under the curve=0.77), outperforming the other two models as well as the Modified Early Warning Score. Important predictor variables included the type of oxygen delivery used in the emergency department, patient age, Emergency Severity Index level, respiratory rate, serum lactate, and demographic characteristics. Conclusions: The XGBoost model had high predictive accuracy, outperforming other early warning scores. The clinical plausibility and predictive ability of XGBoost suggest that the model could be used to predict 48-hour respiratory failure in admitted patients with COVID-19. UR - http://www.jmir.org/2021/2/e24246/ UR - https://doi.org/10.2196/24246 UR - http://www.ncbi.nlm.nih.gov/pubmed/33476281 DO - 10.2196/24246 ID - info:doi/10.2196/24246 ER - TY - JOUR AU - Albahli, Saleh AU - Yar, Ghulam Nabi Ahmad Hassan PY - 2021 DA - 2021/2/10 TI - Fast and Accurate Detection of COVID-19 Along With 14 Other Chest Pathologies Using a Multi-Level Classification: Algorithm Development and Validation Study JO - J Med Internet Res SP - e23693 VL - 23 IS - 2 KW - COVID-19 KW - chest x-ray KW - convolutional neural network KW - data augmentation KW - biomedical imaging KW - automatic detection AB - Background: COVID-19 has spread very rapidly, and it is important to build a system that can detect it in order to help an overwhelmed health care system. Many research studies on chest diseases rely on the strengths of deep learning techniques. Although some of these studies used state-of-the-art techniques and were able to deliver promising results, these techniques are not very useful if they can detect only one type of disease without detecting the others. 
Objective: The main objective of this study was to achieve a fast and more accurate diagnosis of COVID-19. This study proposes a diagnostic technique that classifies COVID-19 x-ray images from normal x-ray images and those specific to 14 other chest diseases. Methods: In this paper, we propose a novel, multilevel pipeline, based on deep learning models, to detect COVID-19 along with other chest diseases based on x-ray images. This pipeline reduces the burden of a single network to classify a large number of classes. The deep learning models used in this study were pretrained on the ImageNet dataset, and transfer learning was used for fast training. The lungs and heart were segmented from the whole x-ray images and passed onto the first classifier that checks whether the x-ray is normal, COVID-19 affected, or characteristic of another chest disease. If it is neither a COVID-19 x-ray image nor a normal one, then the second classifier comes into action and classifies the image as one of the other 14 diseases. Results: We show how our model uses state-of-the-art deep neural networks to achieve classification accuracy for COVID-19 along with 14 other chest diseases and normal cases based on x-ray images, which is competitive with currently used state-of-the-art models. Due to the lack of data in some classes such as COVID-19, we applied 10-fold cross-validation through the ResNet50 model. Our classification technique thus achieved an average training accuracy of 96.04% and test accuracy of 92.52% for the first level of classification (ie, 3 classes). For the second level of classification (ie, 14 classes), our technique achieved a maximum training accuracy of 88.52% and test accuracy of 66.634% by using ResNet50. We also found that when all the 16 classes were classified at once, the overall accuracy for COVID-19 detection decreased, which in the case of ResNet50 was 88.92% for training data and 71.905% for test data. 
Conclusions: Our proposed pipeline can detect COVID-19 with a higher accuracy along with detecting 14 other chest diseases based on x-ray images. This is achieved by dividing the classification task into multiple steps rather than classifying them collectively. UR - http://www.jmir.org/2021/2/e23693/ UR - https://doi.org/10.2196/23693 UR - http://www.ncbi.nlm.nih.gov/pubmed/33529154 DO - 10.2196/23693 ID - info:doi/10.2196/23693 ER - TY - JOUR AU - Pham, Quynh AU - Gamble, Anissa AU - Hearn, Jason AU - Cafazzo, Joseph A PY - 2021 DA - 2021/2/10 TI - The Need for Ethnoracial Equity in Artificial Intelligence for Diabetes Management: Review and Recommendations JO - J Med Internet Res SP - e22320 VL - 23 IS - 2 KW - diabetes KW - artificial intelligence KW - digital health KW - ethnoracial equity KW - ethnicity KW - race UR - http://www.jmir.org/2021/2/e22320/ UR - https://doi.org/10.2196/22320 UR - http://www.ncbi.nlm.nih.gov/pubmed/33565982 DO - 10.2196/22320 ID - info:doi/10.2196/22320 ER - TY - JOUR AU - Bhalodiya, Jayendra Maganbhai AU - Palit, Arnab AU - Giblin, Gerard AU - Tiwari, Manoj Kumar AU - Prasad, Sanjay K AU - Bhudia, Sunil K AU - Arvanitis, Theodoros N AU - Williams, Mark A PY - 2021 DA - 2021/2/10 TI - Identifying Myocardial Infarction Using Hierarchical Template Matching–Based Myocardial Strain: Algorithm Development and Usability Study JO - JMIR Med Inform SP - e22164 VL - 9 IS - 2 KW - left ventricle KW - myocardial infarction KW - myocardium KW - strain AB - Background: Myocardial infarction (MI; location and extent of infarction) can be determined by late enhancement cardiac magnetic resonance (CMR) imaging, which requires the injection of a potentially harmful gadolinium-based contrast agent (GBCA). Alternatively, emerging research in the area of myocardial strain has shown potential to identify MI using strain values. 
Objective: This study aims to identify the location of MI by developing an applied algorithmic method of circumferential strain (CS) values, which are derived through a novel hierarchical template matching (HTM) method. Methods: HTM-based CS H-spread from end-diastole to end-systole was used to develop an applied method. Grid-tagging magnetic resonance imaging was used to calculate strain values in the left ventricular (LV) myocardium, followed by the 16-segment American Heart Association model. The data set was used with k-fold cross-validation to estimate the percentage reduction of H-spread among infarcted and noninfarcted LV segments. A total of 43 participants (38 MI and 5 healthy) who underwent CMR imaging were retrospectively selected. Infarcted segments detected by using this method were validated by comparison with late enhancement CMR, and the diagnostic performance of the applied algorithmic method was evaluated with a receiver operating characteristic curve test. Results: The H-spread of the CS was reduced in infarcted segments compared with noninfarcted segments of the LV. The reductions were 30% in basal segments, 30% in midventricular segments, and 20% in apical LV segments. The diagnostic accuracy of detection, using the reported method, was represented by area under the curve values, which were 0.85, 0.82, and 0.87 for basal, midventricular, and apical slices, respectively, demonstrating good agreement with the late-gadolinium enhancement–based detections. Conclusions: The proposed applied algorithmic method has the potential to accurately identify the location of infarcted LV segments without the administration of late-gadolinium enhancement. Such an approach adds the potential to safely identify MI, potentially reduce patient scanning time, and extend the utility of CMR in patients who are contraindicated for the use of GBCA. 
UR - https://medinform.jmir.org/2021/2/e22164 UR - https://doi.org/10.2196/22164 UR - http://www.ncbi.nlm.nih.gov/pubmed/33565992 DO - 10.2196/22164 ID - info:doi/10.2196/22164 ER - TY - JOUR AU - Bernardo, Theresa AU - Sobkowich, Kurtis Edward AU - Forrest, Russell Othmer AU - Stewart, Luke Silva AU - D'Agostino, Marcelo AU - Perez Gutierrez, Enrique AU - Gillis, Daniel PY - 2021 DA - 2021/2/9 TI - Collaborating in the Time of COVID-19: The Scope and Scale of Innovative Responses to a Global Pandemic JO - JMIR Public Health Surveill SP - e25935 VL - 7 IS - 2 KW - crowdsourcing KW - artificial intelligence KW - collaboration KW - personal protective equipment KW - big data KW - AI KW - COVID-19 KW - innovation KW - information sharing KW - communication KW - teamwork KW - knowledge KW - dissemination UR - http://publichealth.jmir.org/2021/2/e25935/ UR - https://doi.org/10.2196/25935 UR - http://www.ncbi.nlm.nih.gov/pubmed/33503001 DO - 10.2196/25935 ID - info:doi/10.2196/25935 ER - TY - JOUR AU - Sato, Ann AU - Haneda, Eri AU - Suganuma, Nobuyasu AU - Narimatsu, Hiroto PY - 2021 DA - 2021/2/5 TI - Preliminary Screening for Hereditary Breast and Ovarian Cancer Using a Chatbot Augmented Intelligence Genetic Counselor: Development and Feasibility Study JO - JMIR Form Res SP - e25184 VL - 5 IS - 2 KW - artificial intelligence KW - augmented intelligence KW - hereditary cancer KW - familial cancer KW - IBM Watson KW - preliminary screening KW - cancer KW - genetics KW - chatbot KW - screening KW - feasibility AB - Background: Breast cancer is the most common form of cancer in Japan; genetic background and hereditary breast and ovarian cancer (HBOC) are implicated. The key to HBOC diagnosis involves screening to identify high-risk individuals. However, genetic medicine is still developing; thus, many patients who may potentially benefit from genetic medicine have not yet been identified. 
Objective: This study aimed to develop a chatbot system that uses augmented intelligence for HBOC screening to determine whether patients meet the National Comprehensive Cancer Network (NCCN) BRCA1/2 testing criteria. Methods: The system was evaluated by a doctor specializing in genetic medicine and certified genetic counselors. We prepared 3 scenarios and created a conversation with the chatbot to reflect each one. Then we evaluated chatbot feasibility, the required time, the medical accuracy of conversations and family history, and the final result. Results: The times required for the conversation were 7 minutes for scenario 1, 15 minutes for scenario 2, and 16 minutes for scenario 3. Scenarios 1 and 2 met the BRCA1/2 testing criteria, but scenario 3 did not, and this result was consistent with the findings of 3 experts who retrospectively reviewed conversations with the chatbot according to the 3 scenarios. A comparison of the family history ascertained by the chatbot with the actual scenarios revealed that each result was consistent with its scenario. From a genetic medicine perspective, no errors were noted by the 3 experts. Conclusions: This study demonstrated that chatbot systems could be applied to preliminary genetic medicine screening for HBOC. 
UR - https://formative.jmir.org/2021/2/e25184 UR - https://doi.org/10.2196/25184 UR - http://www.ncbi.nlm.nih.gov/pubmed/33544084 DO - 10.2196/25184 ID - info:doi/10.2196/25184 ER - TY - JOUR AU - Muralitharan, Sankavi AU - Nelson, Walter AU - Di, Shuang AU - McGillion, Michael AU - Devereaux, PJ AU - Barr, Neil Grant AU - Petch, Jeremy PY - 2021 DA - 2021/2/4 TI - Machine Learning–Based Early Warning Systems for Clinical Deterioration: Systematic Scoping Review JO - J Med Internet Res SP - e25187 VL - 23 IS - 2 KW - machine learning KW - early warning systems KW - clinical deterioration KW - ambulatory care KW - acute care KW - remote patient monitoring KW - vital signs KW - sepsis KW - cardiorespiratory instability KW - risk prediction AB - Background: Timely identification of patients at a high risk of clinical deterioration is key to prioritizing care, allocating resources effectively, and preventing adverse outcomes. Vital signs–based, aggregate-weighted early warning systems are commonly used to predict the risk of outcomes related to cardiorespiratory instability and sepsis, which are strong predictors of poor outcomes and mortality. Machine learning models, which can incorporate trends and capture relationships among parameters that aggregate-weighted models cannot, have recently been showing promising results. Objective: This study aimed to identify, summarize, and evaluate the available research, current state of utility, and challenges with machine learning–based early warning systems using vital signs to predict the risk of physiological deterioration in acutely ill patients, across acute and ambulatory care settings. 
Methods: PubMed, CINAHL, Cochrane Library, Web of Science, Embase, and Google Scholar were searched for peer-reviewed, original studies with keywords related to “vital signs,” “clinical deterioration,” and “machine learning.” Included studies used patient vital signs along with demographics and described a machine learning model for predicting an outcome in acute and ambulatory care settings. Data were extracted following PRISMA, TRIPOD, and Cochrane Collaboration guidelines. Results: We identified 24 peer-reviewed studies from 417 articles for inclusion; 23 studies were retrospective, while 1 was prospective in nature. Care settings included general wards, intensive care units, emergency departments, step-down units, medical assessment units, postanesthetic wards, and home care. Machine learning models including logistic regression, tree-based methods, kernel-based methods, and neural networks were most commonly used to predict the risk of deterioration. The area under the curve for models ranged from 0.57 to 0.97. Conclusions: In studies that compared performance, reported results suggest that machine learning–based early warning systems can achieve greater accuracy than aggregate-weighted early warning systems but several areas for further research were identified. While these models have the potential to provide clinical decision support, there is a need for standardized outcome measures to allow for rigorous evaluation of performance across models. Further research needs to address the interpretability of model outputs by clinicians, clinical efficacy of these systems through prospective study design, and their potential impact in different clinical settings. 
UR - https://www.jmir.org/2021/2/e25187 UR - https://doi.org/10.2196/25187 UR - http://www.ncbi.nlm.nih.gov/pubmed/33538696 DO - 10.2196/25187 ID - info:doi/10.2196/25187 ER - TY - JOUR AU - Schmitt, Max AU - Maron, Roman Christoph AU - Hekler, Achim AU - Stenzinger, Albrecht AU - Hauschild, Axel AU - Weichenthal, Michael AU - Tiemann, Markus AU - Krahl, Dieter AU - Kutzner, Heinz AU - Utikal, Jochen Sven AU - Haferkamp, Sebastian AU - Kather, Jakob Nikolas AU - Klauschen, Frederick AU - Krieghoff-Henning, Eva AU - Fröhling, Stefan AU - von Kalle, Christof AU - Brinker, Titus Josef PY - 2021 DA - 2021/2/2 TI - Hidden Variables in Deep Learning Digital Pathology and Their Potential to Cause Batch Effects: Prediction Model Study JO - J Med Internet Res SP - e23436 VL - 23 IS - 2 KW - artificial intelligence KW - machine learning KW - deep learning KW - neural networks KW - convolutional neural networks KW - pathology KW - clinical pathology KW - digital pathology KW - pitfalls KW - artifacts AB - Background: An increasing number of studies within digital pathology show the potential of artificial intelligence (AI) to diagnose cancer using histological whole slide images, which requires large and diverse data sets. While diversification may result in more generalizable AI-based systems, it can also introduce hidden variables. If neural networks are able to distinguish/learn hidden variables, these variables can introduce batch effects that compromise the accuracy of classification systems. Objective: The objective of the study was to analyze the learnability of an exemplary selection of hidden variables (patient age, slide preparation date, slide origin, and scanner type) that are commonly found in whole slide image data sets in digital pathology and could create batch effects. Methods: We trained four separate convolutional neural networks (CNNs) to learn four variables using a data set of digitized whole slide melanoma images from five different institutes. 
For robustness, each CNN training and evaluation run was repeated multiple times, and a variable was only considered learnable if the lower bound of the 95% confidence interval of its mean balanced accuracy was above 50.0%. Results: A mean balanced accuracy above 50.0% was achieved for all four tasks, even when considering the lower bound of the 95% confidence interval. Performance between tasks showed wide variation, ranging from 56.1% (slide preparation date) to 100% (slide origin). Conclusions: Because all of the analyzed hidden variables are learnable, they have the potential to create batch effects in dermatopathology data sets, which negatively affect AI-based classification systems. Practitioners should be aware of these and similar pitfalls when developing and evaluating such systems and address these and potentially other batch effect variables in their data sets through sufficient data set stratification. UR - https://www.jmir.org/2021/2/e23436 UR - https://doi.org/10.2196/23436 UR - http://www.ncbi.nlm.nih.gov/pubmed/33528370 DO - 10.2196/23436 ID - info:doi/10.2196/23436 ER - TY - JOUR AU - Buchanan, Christine AU - Howitt, M Lyndsay AU - Wilson, Rita AU - Booth, Richard G AU - Risling, Tracie AU - Bamford, Megan PY - 2021 DA - 2021/1/28 TI - Predicted Influences of Artificial Intelligence on Nursing Education: Scoping Review JO - JMIR Nursing SP - e23933 VL - 4 IS - 1 KW - nursing KW - artificial intelligence KW - education KW - review AB - Background: It is predicted that artificial intelligence (AI) will transform nursing across all domains of nursing practice, including administration, clinical care, education, policy, and research. Increasingly, researchers are exploring the potential influences of AI health technologies (AIHTs) on nursing in general and on nursing education more specifically. However, little emphasis has been placed on synthesizing this body of literature. 
Objective: A scoping review was conducted to summarize the current and predicted influences of AIHTs on nursing education over the next 10 years and beyond. Methods: This scoping review followed a previously published protocol from April 2020. Using an established scoping review methodology, the databases of MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Embase, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Centre, Scopus, Web of Science, and Proquest were searched. In addition to the use of these electronic databases, a targeted website search was performed to access relevant grey literature. Abstracts and full-text studies were independently screened by two reviewers using prespecified inclusion and exclusion criteria. Included literature focused on nursing education and digital health technologies that incorporate AI. Data were charted using a structured form and narratively summarized into categories. Results: A total of 27 articles were identified (20 expository papers, six studies with quantitative or prototyping methods, and one qualitative study). The population included nurses, nurse educators, and nursing students at the entry-to-practice, undergraduate, graduate, and doctoral levels. A variety of AIHTs were discussed, including virtual avatar apps, smart homes, predictive analytics, virtual or augmented reality, and robots. The two key categories derived from the literature were (1) influences of AI on nursing education in academic institutions and (2) influences of AI on nursing education in clinical practice. Conclusions: Curricular reform is urgently needed within nursing education programs in academic institutions and clinical practice settings to prepare nurses and nursing students to practice safely and efficiently in the age of AI. 
Additionally, nurse educators need to adopt new and evolving pedagogies that incorporate AI to better support students at all levels of education. Finally, nursing students and practicing nurses must be equipped with the requisite knowledge and skills to effectively assess AIHTs and safely integrate those deemed appropriate to support person-centered compassionate nursing care in practice settings. International Registered Report Identifier (IRRID): RR2-10.2196/17490 UR - https://nursing.jmir.org/2021/1/e23933/ UR - https://doi.org/10.2196/23933 DO - 10.2196/23933 ID - info:doi/10.2196/23933 ER - TY - JOUR AU - Ho, Thao Thi AU - Park, Jongmin AU - Kim, Taewoo AU - Park, Byunggeon AU - Lee, Jaehee AU - Kim, Jin Young AU - Kim, Ki Beom AU - Choi, Sooyoung AU - Kim, Young Hwan AU - Lim, Jae-Kwang AU - Choi, Sanghun PY - 2021 DA - 2021/1/28 TI - Deep Learning Models for Predicting Severe Progression in COVID-19-Infected Patients: Retrospective Study JO - JMIR Med Inform SP - e24973 VL - 9 IS - 1 KW - COVID-19 KW - deep learning KW - artificial neural network KW - convolutional neural network KW - lung CT AB - Background: Many COVID-19 patients rapidly progress to respiratory failure with a broad range of severities. Identification of high-risk cases is critical for early intervention. Objective: The aim of this study is to develop deep learning models that can rapidly identify high-risk COVID-19 patients based on computed tomography (CT) images and clinical data. Methods: We analyzed 297 COVID-19 patients from five hospitals in Daegu, South Korea. A mixed artificial convolutional neural network (ACNN) model, combining an artificial neural network for clinical data and a convolutional neural network for 3D CT imaging data, was developed to classify these cases as either high risk of severe progression (ie, event) or low risk (ie, event-free). 
Results: Using the mixed ACNN model, we were able to obtain high classification performance using novel coronavirus pneumonia lesion images (ie, 93.9% accuracy, 80.8% sensitivity, 96.9% specificity, and 0.916 area under the curve [AUC] score) and lung segmentation images (ie, 94.3% accuracy, 74.7% sensitivity, 95.9% specificity, and 0.928 AUC score) for event versus event-free groups. Conclusions: Our study successfully differentiated high-risk cases among COVID-19 patients using imaging and clinical features. The developed model can be used as a predictive tool for interventions in aggressive therapies. UR - http://medinform.jmir.org/2021/1/e24973/ UR - https://doi.org/10.2196/24973 UR - http://www.ncbi.nlm.nih.gov/pubmed/33455900 DO - 10.2196/24973 ID - info:doi/10.2196/24973 ER - TY - JOUR AU - Wang, Hanxue AU - Cui, Wenjuan AU - Guo, Yunchang AU - Du, Yi AU - Zhou, Yuanchun PY - 2021 DA - 2021/1/26 TI - Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study JO - JMIR Med Inform SP - e24924 VL - 9 IS - 1 KW - foodborne disease KW - pathogens prediction KW - machine learning AB - Background: Foodborne diseases have a high global incidence; thus, they place a heavy burden on public health and the social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne diseases caused by different pathogens lack specificity in their clinical features, and there is a low proportion of actual clinical pathogen detection in real life. Objective: We aimed to analyze foodborne disease case data, select appropriate features based on analysis results, and use machine learning methods to classify foodborne disease pathogens to predict foodborne disease pathogens for cases where the pathogen is not known or tested. 
Methods: We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationships between these features and the foodborne disease pathogens using a variety of machine learning methods to classify foodborne disease pathogens. We compared the results of four models to obtain the pathogen prediction model with the highest accuracy. Results: The gradient boost decision tree model obtained the highest accuracy, with accuracy approaching 69% in identifying 4 pathogens: Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating the importance of features such as time of illness, geographical longitude and latitude, and diarrhea frequency, we found that these features play important roles in classifying foodborne disease pathogens. Conclusions: Data analysis can reflect the distribution of some features of foodborne diseases and the relationships among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases. UR - http://medinform.jmir.org/2021/1/e24924/ UR - https://doi.org/10.2196/24924 UR - http://www.ncbi.nlm.nih.gov/pubmed/33496675 DO - 10.2196/24924 ID - info:doi/10.2196/24924 ER - TY - JOUR AU - Diao, Xiaolin AU - Huo, Yanni AU - Yan, Zhanzheng AU - Wang, Haibin AU - Yuan, Jing AU - Wang, Yuxin AU - Cai, Jun AU - Zhao, Wei PY - 2021 DA - 2021/1/25 TI - An Application of Machine Learning to Etiological Diagnosis of Secondary Hypertension: Retrospective Study Using Electronic Medical Records JO - JMIR Med Inform SP - e19739 VL - 9 IS - 1 KW - secondary hypertension KW - etiological diagnosis KW - machine learning KW - prediction model AB - Background: Secondary hypertension is a kind of hypertension with a definite etiology and may be cured. 
Patients with suspected secondary hypertension can benefit from timely detection and treatment and, conversely, will have a higher risk of morbidity and mortality than those with primary hypertension. Objective: The aim of this study was to develop and validate machine learning (ML) prediction models of common etiologies in patients with suspected secondary hypertension. Methods: The analyzed data set was retrospectively extracted from electronic medical records of patients discharged from Fuwai Hospital between January 1, 2016, and June 30, 2019. A total of 7532 unique patients were included and divided into 2 data sets by time: 6302 patients in 2016-2018 as the training data set for model building and 1230 patients in 2019 as the validation data set for further evaluation. Extreme Gradient Boosting (XGBoost) was adopted to develop 5 models to predict 4 etiologies of secondary hypertension and occurrence of any of them (named as composite outcome), including renovascular hypertension (RVH), primary aldosteronism (PA), thyroid dysfunction, and aortic stenosis. Both univariate logistic analysis and Gini Impurity were used for feature selection. Grid search and 10-fold cross-validation were used to select the optimal hyperparameters for each model. Results: Validation of the composite outcome prediction model showed good performance with an area under the receiver-operating characteristic curve (AUC) of 0.924 in the validation data set, while the 4 prediction models of RVH, PA, thyroid dysfunction, and aortic stenosis achieved AUC of 0.938, 0.965, 0.959, and 0.946, respectively, in the validation data set. A total of 79 clinical indicators were identified in all and finally used in our prediction models. The result of subgroup analysis on the composite outcome prediction model demonstrated high discrimination with AUCs all higher than 0.890 among all age groups of adults. 
Conclusions: The ML prediction models in this study showed good performance in detecting 4 etiologies of patients with suspected secondary hypertension; thus, they may facilitate clinical diagnosis decision making of secondary hypertension in an intelligent way. UR - http://medinform.jmir.org/2021/1/e19739/ UR - https://doi.org/10.2196/19739 UR - http://www.ncbi.nlm.nih.gov/pubmed/33492233 DO - 10.2196/19739 ID - info:doi/10.2196/19739 ER - TY - JOUR AU - Boutilier, Justin J AU - Chan, Timothy C Y AU - Ranjan, Manish AU - Deo, Sarang PY - 2021 DA - 2021/1/21 TI - Risk Stratification for Early Detection of Diabetes and Hypertension in Resource-Limited Settings: Machine Learning Analysis JO - J Med Internet Res SP - e20123 VL - 23 IS - 1 KW - machine learning KW - diabetes KW - hypertension KW - screening KW - global health AB - Background: The impending scale-up of noncommunicable disease screening programs in low- and middle-income countries, coupled with limited health resources, requires that such programs be as accurate as possible at identifying patients at high risk. Objective: The aim of this study was to develop machine learning–based risk stratification algorithms for diabetes and hypertension that are tailored for the at-risk population served by community-based screening programs in low-resource settings. Methods: We trained and tested our models by using data from 2278 patients collected by community health workers through door-to-door and camp-based screenings in the urban slums of Hyderabad, India, between July 14, 2015, and April 21, 2018. We determined the best models for predicting short-term (2-month) risk of diabetes and hypertension (a model for diabetes and a model for hypertension) and compared these models to previously developed risk scores from the United States and the United Kingdom by using prediction accuracy as characterized by the area under the receiver operating characteristic curve (AUC) and the number of false negatives. 
Results: We found that models based on random forest had the highest prediction accuracy for both diseases and were able to outperform the US and UK risk scores in terms of AUC by 35.5% for diabetes (improvement of 0.239 from 0.671 to 0.910) and 13.5% for hypertension (improvement of 0.094 from 0.698 to 0.792). For a fixed screening specificity of 0.9, the random forest model was able to reduce the expected number of false negatives by 620 patients per 1000 screenings for diabetes and 220 patients per 1000 screenings for hypertension. This improvement reduces the cost of incorrect risk stratification by US $1.99 (or 35%) per screening for diabetes and US $1.60 (or 21%) per screening for hypertension. Conclusions: In the next decade, health systems in many countries are planning to spend significant resources on noncommunicable disease screening programs and our study demonstrates that machine learning models can be leveraged by these programs to effectively utilize limited resources by improving risk stratification. UR - http://www.jmir.org/2021/1/e20123/ UR - https://doi.org/10.2196/20123 UR - http://www.ncbi.nlm.nih.gov/pubmed/33475518 DO - 10.2196/20123 ID - info:doi/10.2196/20123 ER - TY - JOUR AU - Lu, Yingjie AU - Luo, Shuwen AU - Liu, Xuan PY - 2021 DA - 2021/1/7 TI - Development of Social Support Networks by Patients With Depression Through Online Health Communities: Social Network Analysis JO - JMIR Med Inform SP - e24618 VL - 9 IS - 1 KW - online depression community KW - social support network KW - exponential random graph model KW - informational support KW - emotional support KW - mental health KW - depression KW - social network AB - Background: In recent years, people with mental health problems are increasingly using online social networks to receive social support. For example, in online depression communities, patients can share their experiences, exchange valuable information, and receive emotional support to help them cope with their disease. 
Therefore, it is critical to understand how patients with depression develop online social support networks to exchange informational and emotional support. Objective: Our aim in this study was to investigate which user attributes have significant effects on the formation of informational and emotional support networks in online depression communities and to further examine whether there is an association between the two social networks. Methods: We used social network theory and constructed exponential random graph models to help understand the informational and emotional support networks in online depression communities. A total of 74,986 original posts were retrieved from 1077 members in an online depression community in China from April 2003 to September 2017 and the available data were extracted. An informational support network of 1077 participant nodes and 6557 arcs and an emotional support network of 1077 participant nodes and 6430 arcs were constructed to examine the endogenous (purely structural) effects and exogenous (actor-relation) effects on each support network separately, as well as the cross-network effects between the two networks. Results: We found significant effects of two important structural features, reciprocity and transitivity, on the formation of both the informational support network (r=3.6247, P<.001, and r=1.6232, P<.001, respectively) and the emotional support network (r=4.4111, P<.001, and r=0.0177, P<.001, respectively). The results also showed significant effects of some individual factors on the formation of the two networks. No significant effects of homophily were found for gender (r=0.0783, P=.20, and r=0.1122, P=.25, respectively) in the informational or emotional support networks. There was no tendency for users who had great influence (r=0.3253, P=.05) or wrote more posts (r=0.3896, P=.07) or newcomers (r=–0.0452, P=.66) to form informational support ties more easily. 
However, users who spent more time online (r=0.6680, P<.001) or provided more replies to other posts (r=0.5026, P<.001) were more likely to form informational support ties. Users who had a big influence (r=0.8325, P<.001), spent more time online (r=0.5839, P<.001), wrote more posts (r=2.4025, P<.001), or provided more replies to other posts (r=0.2259, P<.001) were more likely to form emotional support ties, and newcomers (r=–0.4224, P<.001) were less likely than old-timers to receive emotional support. In addition, we found that there was a significant entrainment effect (r=0.7834, P<.001) and a nonsignificant exchange effect (r=–0.2757, P=.32) between the two networks. Conclusions: This study makes several important theoretical contributions to the research on online depression communities and has important practical implications for the managers of online depression communities and the users involved in these communities. UR - http://medinform.jmir.org/2021/1/e24618/ UR - https://doi.org/10.2196/24618 UR - http://www.ncbi.nlm.nih.gov/pubmed/33279878 DO - 10.2196/24618 ID - info:doi/10.2196/24618 ER - TY - JOUR AU - Leung, Yvonne W AU - Wouterloot, Elise AU - Adikari, Achini AU - Hirst, Graeme AU - de Silva, Daswin AU - Wong, Jiahui AU - Bender, Jacqueline L AU - Gancarz, Mathew AU - Gratzer, David AU - Alahakoon, Damminda AU - Esplen, Mary Jane PY - 2021 DA - 2021/1/7 TI - Natural Language Processing–Based Virtual Cofacilitator for Online Cancer Support Groups: Protocol for an Algorithm Development and Validation Study JO - JMIR Res Protoc SP - e21453 VL - 10 IS - 1 KW - artificial intelligence KW - cancer KW - online support groups KW - emotional distress KW - natural language processing KW - participant engagement AB - Background: Cancer and its treatment can significantly impact the short- and long-term psychological well-being of patients and families. 
Emotional distress and depressive symptomatology are often associated with poor treatment adherence, reduced quality of life, and higher mortality. Cancer support groups, especially those led by health care professionals, provide a safe place for participants to discuss fear, normalize stress reactions, share solidarity, and learn about effective strategies to build resilience and enhance coping. However, in-person support groups may not always be accessible to individuals; geographic distance is one of the barriers for access, and compromised physical condition (eg, fatigue, pain) is another. Emerging evidence supports the effectiveness of online support groups in reducing access barriers. Text-based and professional-led online support groups have been offered by Cancer Chat Canada. Participants join the group discussion using text in real time. However, therapist leaders report some challenges leading text-based online support groups in the absence of visual cues, particularly in tracking participant distress. With multiple participants typing at the same time, the nuances of the text messages or red flags for distress can sometimes be missed. Recent advances in artificial intelligence such as deep learning–based natural language processing offer potential solutions. This technology can be used to analyze online support group text data to track participants’ expressed emotional distress, including fear, sadness, and hopelessness. Artificial intelligence allows session activities to be monitored in real time and alerts the therapist to participant disengagement. Objective: We aim to develop and evaluate an artificial intelligence–based cofacilitator prototype to track and monitor online support group participants’ distress through real-time analysis of text-based messages posted during synchronous sessions. 
Methods: An artificial intelligence–based cofacilitator will be developed to identify participants who are at risk for increased emotional distress and track participant engagement and in-session group cohesion levels, providing real-time alerts for the therapist to follow up; generate postsession participant profiles that contain discussion content keywords and emotion profiles for each session; and automatically suggest tailored resources to participants according to their needs. The study is designed to be conducted in 4 phases consisting of (1) development based on a subset of data and an existing natural language processing framework, (2) performance evaluation using human scoring, (3) beta testing, and (4) user experience evaluation. Results: This study received ethics approval in August 2019. Phase 1, development of an artificial intelligence–based cofacilitator, was completed in January 2020. As of December 2020, phase 2 is underway. The study is expected to be completed by September 2021. Conclusions: An artificial intelligence–based cofacilitator offers a promising new mode of delivery of person-centered online support groups tailored to individual needs. 
International Registered Report Identifier (IRRID): DERR1-10.2196/21453 UR - https://www.researchprotocols.org/2021/1/e21453 UR - https://doi.org/10.2196/21453 UR - http://www.ncbi.nlm.nih.gov/pubmed/33410754 DO - 10.2196/21453 ID - info:doi/10.2196/21453 ER - TY - JOUR AU - Fan, Xiangmin AU - Chao, Daren AU - Zhang, Zhan AU - Wang, Dakuo AU - Li, Xiaohua AU - Tian, Feng PY - 2021 DA - 2021/1/6 TI - Utilization of Self-Diagnosis Health Chatbots in Real-World Settings: Case Study JO - J Med Internet Res SP - e19928 VL - 23 IS - 1 KW - self-diagnosis KW - chatbot KW - conversational agent KW - human–artificial intelligence interaction KW - artificial intelligence KW - diagnosis KW - case study KW - eHealth KW - real world KW - user experience AB - Background: Artificial intelligence (AI)-driven chatbots are increasingly being used in health care, but most chatbots are designed for a specific population and evaluated in controlled settings. There is little research documenting how health consumers (eg, patients and caregivers) use chatbots for self-diagnosis purposes in real-world scenarios. Objective: The aim of this research was to understand how health chatbots are used in a real-world context, what issues and barriers exist in their usage, and how the user experience of this novel technology can be improved. Methods: We employed a data-driven approach to analyze the system log of a widely deployed self-diagnosis chatbot in China. Our data set consisted of 47,684 consultation sessions initiated by 16,519 users over 6 months. The log data included a variety of information, including users’ nonidentifiable demographic information, consultation details, diagnostic reports, and user feedback. We conducted both statistical analysis and content analysis on this heterogeneous data set. Results: The chatbot users spanned all age groups, including middle-aged and older adults. 
Users consulted the chatbot on a wide range of medical conditions, including those that often entail considerable privacy and social stigma issues. Furthermore, we distilled 2 prominent issues in the use of the chatbot: (1) a considerable number of users dropped out in the middle of their consultation sessions, and (2) some users pretended to have health concerns and used the chatbot for nontherapeutic purposes. Finally, we identified a set of user concerns regarding the use of the chatbot, including insufficient actionable information and perceived inaccurate diagnostic suggestions. Conclusions: Although health chatbots are considered to be convenient tools for enhancing patient-centered care, there are issues and barriers impeding the optimal use of this novel technology. Designers and developers should employ user-centered approaches to address the issues and user concerns to achieve the best uptake and utilization. We conclude the paper by discussing several design implications, including making the chatbots more informative, easy-to-use, and trustworthy, as well as improving the onboarding experience to enhance user engagement. UR - https://www.jmir.org/2021/1/e19928 UR - https://doi.org/10.2196/19928 UR - http://www.ncbi.nlm.nih.gov/pubmed/33404508 DO - 10.2196/19928 ID - info:doi/10.2196/19928 ER - TY - JOUR AU - Luo, Gang AU - Johnson, Michael D AU - Nkoy, Flory L AU - He, Shan AU - Stone, Bryan L PY - 2020 DA - 2020/12/31 TI - Automatically Explaining Machine Learning Prediction Results on Asthma Hospital Visits in Patients With Asthma: Secondary Analysis JO - JMIR Med Inform SP - e21965 VL - 8 IS - 12 KW - asthma KW - forecasting KW - machine learning KW - patient care management AB - Background: Asthma is a major chronic disease that poses a heavy burden on health care. 
To facilitate the allocation of care management resources aimed at improving outcomes for high-risk patients with asthma, we recently built a machine learning model to predict asthma hospital visits in the subsequent year in patients with asthma. Our model is more accurate than previous models. However, like most machine learning models, it offers no explanation of its prediction results. This creates a barrier for use in care management, where interpretability is desired. Objective: This study aims to develop a method to automatically explain the prediction results of the model and recommend tailored interventions without lowering the performance measures of the model. Methods: Our data were imbalanced, with only a small portion of data instances linking to future asthma hospital visits. To handle imbalanced data, we extended our previous method of automatically offering rule-formed explanations for the prediction results of any machine learning model on tabular data without lowering the model’s performance measures. In a secondary analysis of the 334,564 data instances from Intermountain Healthcare between 2005 and 2018 used to form our model, we employed the extended method to automatically explain the prediction results of our model and recommend tailored interventions. The patient cohort consisted of all patients with asthma who received care at Intermountain Healthcare between 2005 and 2018, and resided in Utah or Idaho as recorded at the visit. Results: Our method explained the prediction results for 89.7% (391/436) of the patients with asthma who, per our model’s correct prediction, were likely to incur asthma hospital visits in the subsequent year. Conclusions: This study is the first to demonstrate the feasibility of automatically offering rule-formed explanations for the prediction results of any machine learning model on imbalanced tabular data without lowering the performance measures of the model. 
After further improvement, our asthma outcome prediction model coupled with the automatic explanation function could be used by clinicians to guide the allocation of limited asthma care management resources and the identification of appropriate interventions. UR - http://medinform.jmir.org/2020/12/e21965/ UR - https://doi.org/10.2196/21965 UR - http://www.ncbi.nlm.nih.gov/pubmed/33382379 DO - 10.2196/21965 ID - info:doi/10.2196/21965 ER - TY - JOUR AU - Yamada, Tomohide AU - Yoneoka, Daisuke AU - Hiraike, Yuta AU - Hino, Kimihiro AU - Toyoshiba, Hiroyoshi AU - Shishido, Akira AU - Noma, Hisashi AU - Shojima, Nobuhiro AU - Yamauchi, Toshimasa PY - 2020 DA - 2020/12/30 TI - Deep Neural Network for Reducing the Screening Workload in Systematic Reviews for Clinical Guidelines: Algorithm Validation Study JO - J Med Internet Res SP - e22422 VL - 22 IS - 12 KW - machine learning KW - evidence-based medicine KW - systematic review KW - meta-analysis KW - clinical guideline KW - deep learning KW - neural network AB - Background: Performing systematic reviews is a time-consuming and resource-intensive process. Objective: We investigated whether a machine learning system could perform systematic reviews more efficiently. Methods: All systematic reviews and meta-analyses of interventional randomized controlled trials cited in recent clinical guidelines from the American Diabetes Association, American College of Cardiology, American Heart Association (2 guidelines), and American Stroke Association were assessed. After reproducing the primary screening data set according to the published search strategy of each, we extracted correct articles (those actually reviewed) and incorrect articles (those not reviewed) from the data set. These 2 sets of articles were used to train a neural network–based artificial intelligence engine (Concept Encoder, Fronteo Inc). The primary endpoint was work saved over sampling at 95% recall (WSS@95%). 
Results: Among 145 candidate reviews of randomized controlled trials, 8 reviews fulfilled the inclusion criteria. For these 8 reviews, the machine learning system significantly reduced the literature screening workload by at least 6-fold versus that of manual screening based on WSS@95%. When machine learning was initiated using 2 correct articles that were randomly selected by a researcher, a 10-fold reduction in workload was achieved versus that of manual screening based on the WSS@95% value, with high sensitivity for eligible studies. The area under the receiver operating characteristic curve increased dramatically every time the algorithm learned a correct article. Conclusions: Concept Encoder achieved a 10-fold reduction of the screening workload for systematic review after learning from 2 randomly selected studies on the target topic. However, few meta-analyses of randomized controlled trials were included. Concept Encoder could facilitate the acquisition of evidence for clinical guidelines. UR - https://www.jmir.org/2020/12/e22422 UR - https://doi.org/10.2196/22422 UR - http://www.ncbi.nlm.nih.gov/pubmed/33262102 DO - 10.2196/22422 ID - info:doi/10.2196/22422 ER - TY - JOUR AU - Ko, Hoon AU - Chung, Heewon AU - Kang, Wu Seong AU - Park, Chul AU - Kim, Do Wan AU - Kim, Seong Eun AU - Chung, Chi Ryang AU - Ko, Ryoung Eun AU - Lee, Hooseok AU - Seo, Jae Ho AU - Choi, Tae-Young AU - Jaimes, Rafael AU - Kim, Kyung Won AU - Lee, Jinseok PY - 2020 DA - 2020/12/23 TI - An Artificial Intelligence Model to Predict the Mortality of COVID-19 Patients at Hospital Admission Time Using Routine Blood Samples: Development and Validation of an Ensemble Model JO - J Med Internet Res SP - e25442 VL - 22 IS - 12 KW - COVID-19 KW - artificial intelligence KW - blood samples KW - mortality prediction AB - Background: COVID-19, which is accompanied by acute respiratory distress, multiple organ failure, and death, has spread worldwide much faster than previously thought. 
However, at present, it has limited treatments. Objective: To overcome this issue, we developed an artificial intelligence (AI) model of COVID-19, named EDRnet (ensemble learning model based on deep neural network and random forest models), to predict in-hospital mortality using a routine blood sample at the time of hospital admission. Methods: We selected 28 blood biomarkers and used the age and gender information of patients as model inputs. To improve the mortality prediction, we adopted an ensemble approach combining deep neural network and random forest models. We trained our model with a database of blood samples from 361 COVID-19 patients in Wuhan, China, and applied it to 106 COVID-19 patients in three Korean medical institutions. Results: In the testing data sets, EDRnet provided high sensitivity (100%), specificity (91%), and accuracy (92%). To extend the number of patient data points, we developed a web application (BeatCOVID19) where anyone can access the model to predict mortality and can register his or her own blood laboratory results. Conclusions: Our new AI model, EDRnet, accurately predicts the mortality rate for COVID-19. It is publicly available and aims to help health care providers fight COVID-19 and improve patients’ outcomes. 
UR - http://www.jmir.org/2020/12/e25442/ UR - https://doi.org/10.2196/25442 UR - http://www.ncbi.nlm.nih.gov/pubmed/33301414 DO - 10.2196/25442 ID - info:doi/10.2196/25442 ER - TY - JOUR AU - Geng, Wenye AU - Qin, Xuanfeng AU - Yang, Tao AU - Cong, Zhilei AU - Wang, Zhuo AU - Kong, Qing AU - Tang, Zihui AU - Jiang, Lin PY - 2020 DA - 2020/12/21 TI - Model-Based Reasoning of Clinical Diagnosis in Integrative Medicine: Real-World Methodological Study of Electronic Medical Records and Natural Language Processing Methods JO - JMIR Med Inform SP - e23082 VL - 8 IS - 12 KW - model-based reasoning KW - integrative medicine KW - electronic medical records KW - natural language processing AB - Background: Integrative medicine is a form of medicine that combines practices and treatments from alternative medicine with conventional medicine. The diagnosis in integrative medicine involves the clinical diagnosis based on modern medicine and syndrome pattern diagnosis. Electronic medical records (EMRs) are the systematized collection of patients’ health information stored in a digital format that can be shared across different health care settings. Although syndrome and sign information or relative information can be extracted from the EMR and content texts can be mapped to computability vectors using natural language processing techniques, application of artificial intelligence techniques to support physicians in medical practices remains a major challenge. Objective: The purpose of this study was to investigate model-based reasoning (MBR) algorithms for the clinical diagnosis in integrative medicine based on EMRs and natural language processing. We also estimated the associations among the factors of sample size, number of syndrome pattern types, and diagnosis in modern medicine using the MBR algorithms. 
Methods: A total of 14,075 medical records of clinical cases were extracted from the EMRs as the development data set, and an external test data set consisting of 1000 medical records of clinical cases was extracted from independent EMRs. MBR methods based on word embedding, machine learning, and deep learning algorithms were developed for the automatic diagnosis of syndrome pattern in integrative medicine. MBR algorithms combining rule-based reasoning (RBR) were also developed. Standard evaluation metrics consisting of accuracy, precision, recall, and F1 score were used for the performance estimation of the methods. The association analyses were conducted on the sample size, number of syndrome pattern types, and diagnosis of lung diseases with the best algorithms. Results: The Word2Vec convolutional neural network (CNN) MBR algorithms showed high performance (accuracy of 0.9586 in the test data set) in the syndrome pattern diagnosis of lung diseases. The Word2Vec CNN MBR combined with RBR also showed high performance (accuracy of 0.9229 in the test data set). The diagnosis of lung diseases could enhance the performance of the Word2Vec CNN MBR algorithms. Each group’s sample size and syndrome pattern type affected the performance of these algorithms. Conclusions: The MBR methods based on Word2Vec and CNN showed high performance in the syndrome pattern diagnosis of lung diseases in integrative medicine. The parameters of each group’s sample size, syndrome pattern type, and diagnosis of lung diseases were associated with the performance of the methods. 
Trial Registration: ClinicalTrials.gov NCT03274908; https://clinicaltrials.gov/ct2/show/NCT03274908 UR - http://medinform.jmir.org/2020/12/e23082/ UR - https://doi.org/10.2196/23082 UR - http://www.ncbi.nlm.nih.gov/pubmed/33346740 DO - 10.2196/23082 ID - info:doi/10.2196/23082 ER - TY - JOUR AU - Safi, Zeineb AU - Abd-Alrazaq, Alaa AU - Khalifa, Mohamed AU - Househ, Mowafa PY - 2020 DA - 2020/12/18 TI - Technical Aspects of Developing Chatbots for Medical Applications: Scoping Review JO - J Med Internet Res SP - e19127 VL - 22 IS - 12 KW - chatbots KW - conversational agents KW - medical applications KW - scoping review KW - technical aspects AB - Background: Chatbots are applications that can conduct natural language conversations with users. In the medical field, chatbots have been developed and used to serve different purposes. They provide patients with timely information that can be critical in some scenarios, such as access to mental health resources. Since the development of the first chatbot, ELIZA, in the late 1960s, much effort has followed to produce chatbots for various health purposes developed in different ways. Objective: This study aimed to explore the technical aspects and development methodologies associated with chatbots used in the medical field to explain the best methods of development and support chatbot development researchers on their future work. Methods: We searched for relevant articles in 8 literature databases (IEEE, ACM, Springer, ScienceDirect, Embase, MEDLINE, PsycINFO, and Google Scholar). We also performed forward and backward reference checking of the selected articles. Study selection was performed by one reviewer, and 50% of the selected studies were randomly checked by a second reviewer. A narrative approach was used for result synthesis. Chatbots were classified based on the different technical aspects of their development. 
The main chatbot components were identified in addition to the different techniques for implementing each module. Results: The original search returned 2481 publications, of which we identified 45 studies that matched our inclusion and exclusion criteria. The most common language of communication between users and chatbots was English (n=23). We identified 4 main modules: text understanding module, dialogue management module, database layer, and text generation module. The most common technique for developing text understanding and dialogue management was the pattern matching method (n=18 and n=25, respectively). The most common text generation technique was fixed output (n=36). Very few studies relied on generating original output. Most studies kept a medical knowledge base to be used by the chatbot for different purposes throughout the conversations. A few studies kept conversation scripts and collected user data and previous conversations. Conclusions: Many chatbots have been developed for medical use, at an increasing rate. There is a recent, apparent shift toward adopting machine learning–based approaches for developing chatbot systems. Further research can be conducted to link clinical outcomes to different chatbot development techniques and technical characteristics. 
UR - http://www.jmir.org/2020/12/e19127/ UR - https://doi.org/10.2196/19127 UR - http://www.ncbi.nlm.nih.gov/pubmed/33337337 DO - 10.2196/19127 ID - info:doi/10.2196/19127 ER - TY - JOUR AU - Rashidian, Sina AU - Abell-Hart, Kayley AU - Hajagos, Janos AU - Moffitt, Richard AU - Lingam, Veena AU - Garcia, Victor AU - Tsai, Chao-Wei AU - Wang, Fusheng AU - Dong, Xinyu AU - Sun, Siao AU - Deng, Jianyuan AU - Gupta, Rajarsi AU - Miller, Joshua AU - Saltz, Joel AU - Saltz, Mary PY - 2020 DA - 2020/12/17 TI - Detecting Miscoded Diabetes Diagnosis Codes in Electronic Health Records for Quality Improvement: Temporal Deep Learning Approach JO - JMIR Med Inform SP - e22649 VL - 8 IS - 12 KW - electronic health records KW - diabetes KW - deep learning AB - Background: Diabetes affects more than 30 million patients across the United States. With such a large disease burden, even a small error in classification can be significant. Currently billing codes, assigned at the time of a medical encounter, are the “gold standard” reflecting the actual diseases present in an individual, and thus in aggregate reflect disease prevalence in the population. These codes are generated by highly trained coders and by health care providers but are not always accurate. Objective: This work provides a scalable deep learning methodology to more accurately classify individuals with diabetes across multiple health care systems. Methods: We leveraged a long short-term memory-dense neural network (LSTM-DNN) model to identify patients with or without diabetes using data from 5 acute care facilities with 187,187 patients and 275,407 encounters, incorporating data elements including laboratory test results, diagnostic/procedure codes, medications, demographic data, and admission information. Furthermore, a blinded physician panel reviewed discordant cases, providing an estimate of the total impact on the population. 
Results: When predicting the documented diagnosis of diabetes, our model achieved an 84% F1 score, 96% area under the receiver operating characteristic curve, and 91% average precision on a heterogeneous data set from 5 distinct health facilities. However, in 81% of cases where the model disagreed with the documented phenotype, a blinded physician panel agreed with the model. Taken together, this suggests that 4.3% of our studied population have either a missing or improper diabetes diagnosis. Conclusions: This study demonstrates that deep learning methods can improve clinical phenotyping even when patient data are noisy, sparse, and heterogeneous. UR - http://medinform.jmir.org/2020/12/e22649/ UR - https://doi.org/10.2196/22649 UR - http://www.ncbi.nlm.nih.gov/pubmed/33331828 DO - 10.2196/22649 ID - info:doi/10.2196/22649 ER - TY - JOUR AU - Buchanan, Christine AU - Howitt, M Lyndsay AU - Wilson, Rita AU - Booth, Richard G AU - Risling, Tracie AU - Bamford, Megan PY - 2020 DA - 2020/12/17 TI - Predicted Influences of Artificial Intelligence on the Domains of Nursing: Scoping Review JO - JMIR Nursing SP - e23939 VL - 3 IS - 1 KW - nursing KW - artificial intelligence KW - machine learning KW - robotics KW - patient-centered care KW - review AB - Background: Artificial intelligence (AI) is set to transform the health system, yet little research to date has explored its influence on nurses, the largest group of health professionals. Furthermore, there has been little discussion on how AI will influence the experience of person-centered compassionate care for patients, families, and caregivers. Objective: This review aims to summarize the extant literature on the emerging trends in health technologies powered by AI and their implications for the following domains of nursing: administration, clinical practice, policy, and research. 
This review summarizes the findings from 3 research questions, examining how these emerging trends might influence the roles and functions of nurses and compassionate nursing care over the next 10 years and beyond. Methods: Using an established scoping review methodology, MEDLINE, CINAHL, EMBASE, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Center, Scopus, Web of Science, and ProQuest databases were searched. In addition to the electronic database searches, a targeted website search was performed to access relevant gray literature. Abstracts and full-text studies were independently screened by 2 reviewers using prespecified inclusion and exclusion criteria. Included articles focused on nursing and digital health technologies that incorporate AI. Data were charted using structured forms and narratively summarized. Results: A total of 131 articles were retrieved from the scoping review for the 3 research questions that were the focus of this manuscript (118 from database sources and 13 from targeted websites). Emerging AI technologies discussed in the review included predictive analytics, smart homes, virtual health care assistants, and robots. The results indicated that AI has already begun to influence nursing roles, workflows, and the nurse-patient relationship. In general, robots are not viewed as replacements for nurses. There is a consensus that health technologies powered by AI may have the potential to enhance nursing practice. Consequently, nurses must proactively define how person-centered compassionate care will be preserved in the age of AI. Conclusions: Nurses have a shared responsibility to influence decisions related to the integration of AI into the health system and to ensure that this change is introduced in a way that is ethical and aligns with core nursing values such as compassionate care. 
Furthermore, nurses must advocate for patient and nursing involvement in all aspects of the design, implementation, and evaluation of these technologies. International Registered Report Identifier (IRRID): RR2-10.2196/17490 UR - https://nursing.jmir.org/2020/1/e23939/ UR - https://doi.org/10.2196/23939 DO - 10.2196/23939 ID - info:doi/10.2196/23939 ER - TY - JOUR AU - D'Ambrosia, Christopher AU - Christensen, Henrik AU - Aronoff-Spencer, Eliah PY - 2020 DA - 2020/12/16 TI - Computing SARS-CoV-2 Infection Risk From Symptoms, Imaging, and Test Data: Diagnostic Model Development JO - J Med Internet Res SP - e24478 VL - 22 IS - 12 KW - health KW - informatics KW - computation KW - COVID-19 KW - infection KW - risk KW - symptom KW - imaging KW - diagnostic KW - probability KW - machine learning KW - Bayesian KW - model AB - Background: Assigning meaningful probabilities of SARS-CoV-2 infection risk presents a diagnostic challenge across the continuum of care. Objective: The aim of this study was to develop and clinically validate an adaptable, personalized diagnostic model to assist clinicians in ruling in and ruling out COVID-19 in potential patients. We compared the diagnostic performance of probabilistic, graphical, and machine learning models against a previously published benchmark model. Methods: We integrated patient symptoms and test data using machine learning and Bayesian inference to quantify individual patient risk of SARS-CoV-2 infection. We trained models with 100,000 simulated patient profiles based on 13 symptoms and estimated local prevalence, imaging, and molecular diagnostic performance from published reports. We tested these models with consecutive patients who presented with a COVID-19–compatible illness at the University of California San Diego Medical Center over the course of 14 days starting in March 2020. 
Results: We included 55 consecutive patients with fever (n=43, 78%) or cough (n=42, 77%) presenting for ambulatory (n=11, 20%) or hospital care (n=44, 80%). In total, 51% (n=28) were female and 49% (n=27) were aged <60 years. Common comorbidities included diabetes (n=12, 22%), hypertension (n=15, 27%), cancer (n=9, 16%), and cardiovascular disease (n=7, 13%). Of these, 69% (n=38) were confirmed via reverse transcription-polymerase chain reaction (RT-PCR) to be positive for SARS-CoV-2 infection, and 20% (n=11) had repeated negative nucleic acid testing and an alternate diagnosis. Bayesian inference network, distance metric learning, and ensemble models discriminated between patients with SARS-CoV-2 infection and alternate diagnoses with sensitivities of 81.6%-84.2%, specificities of 58.8%-70.6%, and accuracies of 61.4%-71.8%. After integrating imaging and laboratory test statistics with the predictions of the Bayesian inference network, changes in diagnostic uncertainty at each step in the simulated clinical evaluation process were highly sensitive to location, symptom, and diagnostic test choices. Conclusions: Decision support models that incorporate symptoms and available test results can help providers diagnose SARS-CoV-2 infection in real-world settings. 
UR - http://www.jmir.org/2020/12/e24478/ UR - https://doi.org/10.2196/24478 UR - http://www.ncbi.nlm.nih.gov/pubmed/33301417 DO - 10.2196/24478 ID - info:doi/10.2196/24478 ER - TY - JOUR AU - Kim, Junetae AU - Lee, Sangwon AU - Hwang, Eugene AU - Ryu, Kwang Sun AU - Jeong, Hanseok AU - Lee, Jae Wook AU - Hwangbo, Yul AU - Choi, Kui Son AU - Cha, Hyo Soung PY - 2020 DA - 2020/12/16 TI - Limitations of Deep Learning Attention Mechanisms in Clinical Research: Empirical Case Study Based on the Korean Diabetic Disease Setting JO - J Med Internet Res SP - e18418 VL - 22 IS - 12 KW - attention KW - deep learning KW - explainable artificial intelligence KW - uncertainty awareness KW - Bayesian deep learning KW - artificial intelligence KW - health data AB - Background: Despite excellent prediction performance, noninterpretability has undermined the value of applying deep-learning algorithms in clinical practice. To overcome this limitation, attention mechanism has been introduced to clinical research as an explanatory modeling method. However, potential limitations of using this attractive method have not been clarified to clinical researchers. Furthermore, there has been a lack of introductory information explaining attention mechanisms to clinical researchers. Objective: The aim of this study was to introduce the basic concepts and design approaches of attention mechanisms. In addition, we aimed to empirically assess the potential limitations of current attention mechanisms in terms of prediction and interpretability performance. Methods: First, the basic concepts and several key considerations regarding attention mechanisms were identified. Second, four approaches to attention mechanisms were suggested according to a two-dimensional framework based on the degrees of freedom and uncertainty awareness. 
Third, the prediction performance, probability reliability, concentration of variable importance, consistency of attention results, and generalizability of attention results to conventional statistics were assessed in the diabetic classification modeling setting. Fourth, the potential limitations of attention mechanisms were considered. Results: Prediction performance was very high for all models. Probability reliability was high in models with uncertainty awareness. Variable importance was concentrated in several variables when uncertainty awareness was not considered. The consistency of attention results was high when uncertainty awareness was considered. The generalizability of attention results to conventional statistics was poor regardless of the modeling approach. Conclusions: The attention mechanism is an attractive technique with potential to be very promising in the future. However, it may not yet be desirable to rely on this method to assess variable importance in clinical settings. Therefore, along with theoretical studies enhancing attention mechanisms, more empirical studies investigating potential limitations should be encouraged. 
UR - http://www.jmir.org/2020/12/e18418/ UR - https://doi.org/10.2196/18418 UR - http://www.ncbi.nlm.nih.gov/pubmed/33325832 DO - 10.2196/18418 ID - info:doi/10.2196/18418 ER - TY - JOUR AU - Abd-Alrazaq, Alaa AU - Alajlani, Mohannad AU - Alhuwail, Dari AU - Schneider, Jens AU - Al-Kuwari, Saif AU - Shah, Zubair AU - Hamdi, Mounir AU - Househ, Mowafa PY - 2020 DA - 2020/12/15 TI - Artificial Intelligence in the Fight Against COVID-19: Scoping Review JO - J Med Internet Res SP - e20756 VL - 22 IS - 12 KW - artificial intelligence KW - machine learning KW - deep learning KW - natural language processing KW - coronavirus KW - COVID-19 KW - 2019-nCoV KW - SARS-CoV-2 AB - Background: In December 2019, COVID-19 broke out in Wuhan, China, leading to national and international disruptions in health care, business, education, transportation, and nearly every aspect of our daily lives. Artificial intelligence (AI) has been leveraged amid the COVID-19 pandemic; however, little is known about its use for supporting public health efforts. Objective: This scoping review aims to explore how AI technology is being used during the COVID-19 pandemic, as reported in the literature. Thus, it is the first review that describes and summarizes features of the identified AI techniques and data sets used for their development and validation. Methods: A scoping review was conducted following the guidelines of PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews). We searched the most commonly used electronic databases (eg, MEDLINE, EMBASE, and PsycInfo) between April 10 and 12, 2020. The search terms were selected based on the target intervention (ie, AI) and the target disease (ie, COVID-19). Two reviewers independently conducted study selection and data extraction. A narrative approach was used to synthesize the extracted data. Results: We considered 82 studies out of the 435 retrieved studies. 
The most common use of AI was diagnosing COVID-19 cases based on various indicators. AI was also employed in drug and vaccine discovery or repurposing and for assessing their safety. Further, the included studies used AI for forecasting the epidemic development of COVID-19 and predicting its potential hosts and reservoirs. Researchers used AI for patient outcome–related tasks such as assessing the severity of COVID-19, predicting mortality risk, its associated factors, and the length of hospital stay. AI was used for infodemiology to raise awareness to use water, sanitation, and hygiene. The most prominent AI technique used was convolutional neural network, followed by support vector machine. Conclusions: The included studies showed that AI has the potential to fight against COVID-19. However, many of the proposed methods are not yet clinically accepted. Thus, the most rewarding research will be on methods promising value beyond COVID-19. More efforts are needed for developing standardized reporting protocols or guidelines for studies on AI. 
UR - http://www.jmir.org/2020/12/e20756/ UR - https://doi.org/10.2196/20756 UR - http://www.ncbi.nlm.nih.gov/pubmed/33284779 DO - 10.2196/20756 ID - info:doi/10.2196/20756 ER - TY - JOUR AU - Duca Iliescu, Delia Monica PY - 2020 DA - 2020/12/10 TI - The Impact of Artificial Intelligence on the Chess World JO - JMIR Serious Games SP - e24049 VL - 8 IS - 4 KW - artificial intelligence KW - games KW - chess KW - AlphaZero KW - MuZero KW - cheat detection KW - coronavirus UR - http://games.jmir.org/2020/4/e24049/ UR - https://doi.org/10.2196/24049 UR - http://www.ncbi.nlm.nih.gov/pubmed/33300493 DO - 10.2196/24049 ID - info:doi/10.2196/24049 ER - TY - JOUR AU - Ćirković, Aleksandar PY - 2020 DA - 2020/12/4 TI - Evaluation of Four Artificial Intelligence–Assisted Self-Diagnosis Apps on Three Diagnoses: Two-Year Follow-Up Study JO - J Med Internet Res SP - e18097 VL - 22 IS - 12 KW - artificial intelligence KW - machine learning KW - mobile apps KW - medical diagnosis KW - mHealth AB - Background: Consumer-oriented mobile self-diagnosis apps have been developed using undisclosed algorithms, presumably based on machine learning and other artificial intelligence (AI) technologies. The US Food and Drug Administration now discerns apps with learning AI algorithms from those with stable ones and treats the former as medical devices. To the author’s knowledge, no self-diagnosis app testing has been performed in the field of ophthalmology so far. Objective: The objective of this study was to test apps that were previously mentioned in the scientific literature on a set of diagnoses in a deliberate time interval, comparing the results and looking for differences that hint at “nonlocked” learning algorithms. Methods: Four apps from the literature were chosen (Ada, Babylon, Buoy, and Your.MD). 
A set of three ophthalmology diagnoses (glaucoma, retinal tear, dry eye syndrome) representing three levels of urgency was used to simultaneously test the apps’ diagnostic efficiency and treatment recommendations in this specialty. Two years was the chosen time interval between the tests (2018 and 2020). Scores were awarded by one evaluating physician using a defined scheme. Results: Two apps (Ada and Your.MD) received significantly higher scores than the other two. All apps either worsened in their results between 2018 and 2020 or remained unchanged at a low level. The variation in the results over time indicates “nonlocked” learning algorithms using AI technologies. None of the apps provided correct diagnoses and treatment recommendations for all three diagnoses in 2020. Two apps (Babylon and Your.MD) asked significantly fewer questions than the other two (P<.001). Conclusions: “Nonlocked” algorithms are used by self-diagnosis apps. The diagnostic efficiency of the tested apps seems to worsen over time, with some apps being more capable than others. Systematic studies on a wider scale are necessary for health care providers and patients to correctly assess the safety and efficacy of such apps and for correct classification by health care regulating authorities. 
UR - https://www.jmir.org/2020/12/e18097 UR - https://doi.org/10.2196/18097 UR - http://www.ncbi.nlm.nih.gov/pubmed/33275113 DO - 10.2196/18097 ID - info:doi/10.2196/18097 ER - TY - JOUR AU - Kazi, Abdul Momin AU - Qazi, Saad Ahmed AU - Khawaja, Sadori AU - Ahsan, Nazia AU - Ahmed, Rao Moueed AU - Sameen, Fareeha AU - Khan Mughal, Muhammad Ayub AU - Saqib, Muhammad AU - Ali, Sikander AU - Kaleemuddin, Hussain AU - Rauf, Yasir AU - Raza, Mehreen AU - Jamal, Saima AU - Abbasi, Munir AU - Stergioulas, Lampros K PY - 2020 DA - 2020/12/4 TI - An Artificial Intelligence–Based, Personalized Smartphone App to Improve Childhood Immunization Coverage and Timelines Among Children in Pakistan: Protocol for a Randomized Controlled Trial JO - JMIR Res Protoc SP - e22996 VL - 9 IS - 12 KW - artificial intelligence KW - AI KW - routine childhood immunization KW - EPI KW - LMICs KW - mHealth KW - Pakistan KW - personalized messages KW - routine immunization KW - smartphone apps KW - vaccine-preventable illnesses AB - Background: The immunization uptake rates in Pakistan are much lower than desired. Major reasons include lack of awareness, parental forgetfulness regarding schedules, and misinformation regarding vaccines. In light of the COVID-19 pandemic and distancing measures, routine childhood immunization (RCI) coverage has been adversely affected, as caregivers avoid tertiary care hospitals or primary health centers. Innovative and cost-effective measures must be taken to understand and deal with the issue of low immunization rates. However, only a few smartphone-based interventions have been carried out in low- and middle-income countries (LMICs) to improve RCI. Objective: The primary objectives of this study are to evaluate whether a personalized mobile app can improve children’s on-time visits at 10 and 14 weeks of age for RCI as compared with standard care and to determine whether an artificial intelligence model can be incorporated into the app. 
Secondary objectives are to determine the perceptions and attitudes of caregivers regarding childhood vaccinations and to understand the factors that might influence the effect of a mobile phone–based app on vaccination improvement. Methods: A mixed methods randomized controlled trial was designed with intervention and control arms. The study will be conducted at the Aga Khan University Hospital vaccination center. Caregivers of newborns or infants visiting the center for their children’s 6-week vaccination will be recruited. The intervention arm will have access to a smartphone app with text, voice, video, and pictorial messages regarding RCI. This app will be developed based on the findings of the pretrial qualitative component of the study, in addition to no-show study findings, which will explore caregivers’ perceptions about RCI and a mobile phone–based app in improving RCI coverage. Results: Pretrial qualitative in-depth interviews were conducted in February 2020. Enrollment of study participants for the randomized controlled trial is in process. Study exit interviews will be conducted at the 14-week immunization visits, provided the caregivers visit the immunization facility at that time, or over the phone when the children are 18 weeks of age. Conclusions: This study will generate useful insights into the feasibility, acceptability, and usability of an Android-based smartphone app for improving RCI in Pakistan and in LMICs. 
Trial Registration: ClinicalTrials.gov NCT04449107; https://clinicaltrials.gov/ct2/show/NCT04449107 International Registered Report Identifier (IRRID): DERR1-10.2196/22996 UR - https://www.researchprotocols.org/2020/12/e22996 UR - https://doi.org/10.2196/22996 UR - http://www.ncbi.nlm.nih.gov/pubmed/33274726 DO - 10.2196/22996 ID - info:doi/10.2196/22996 ER - TY - JOUR AU - Plante, Timothy B AU - Blau, Aaron M AU - Berg, Adrian N AU - Weinberg, Aaron S AU - Jun, Ik C AU - Tapson, Victor F AU - Kanigan, Tanya S AU - Adib, Artur B PY - 2020 DA - 2020/12/2 TI - Development and External Validation of a Machine Learning Tool to Rule Out COVID-19 Among Adults in the Emergency Department Using Routine Blood Tests: A Large, Multicenter, Real-World Study JO - J Med Internet Res SP - e24048 VL - 22 IS - 12 KW - COVID-19 KW - SARS-CoV-2 KW - machine learning KW - artificial intelligence KW - electronic medical records KW - laboratory results KW - development KW - validation KW - testing KW - model KW - emergency department AB - Background: Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients. Objective: We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments. 
Methods: Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV). Results: Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively. Conclusions: A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing. 
UR - https://www.jmir.org/2020/12/e24048 UR - https://doi.org/10.2196/24048 UR - http://www.ncbi.nlm.nih.gov/pubmed/33226957 DO - 10.2196/24048 ID - info:doi/10.2196/24048 ER - TY - JOUR AU - Maarseveen, Tjardo D AU - Meinderink, Timo AU - Reinders, Marcel J T AU - Knitza, Johannes AU - Huizinga, Tom W J AU - Kleyer, Arnd AU - Simon, David AU - van den Akker, Erik B AU - Knevel, Rachel PY - 2020 DA - 2020/11/30 TI - Machine Learning Electronic Health Record Identification of Patients with Rheumatoid Arthritis: Algorithm Pipeline Development and Validation Study JO - JMIR Med Inform SP - e23930 VL - 8 IS - 11 KW - Supervised machine learning KW - Electronic Health Records KW - Natural Language Processing KW - Support Vector Machine KW - Gradient Boosting KW - Rheumatoid Arthritis AB - Background: Financial codes are often used to extract diagnoses from electronic health records. This approach is prone to false positives. Alternatively, queries are constructed, but these are highly center and language specific. A tantalizing alternative is the automatic identification of patients by employing machine learning on format-free text entries. Objective: The aim of this study was to develop an easily implementable workflow that builds a machine learning algorithm capable of accurately identifying patients with rheumatoid arthritis from format-free text fields in electronic health records. Methods: Two electronic health record data sets were employed: Leiden (n=3000) and Erlangen (n=4771). Using a portion of the Leiden data (n=2000), we compared 6 different machine learning methods and a naïve word-matching algorithm using 10-fold cross-validation. Performances were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision recall curve (AUPRC), and F1 score was used as the primary criterion for selecting the best method to build a classifying algorithm. 
We selected the optimal threshold of positive predictive value for case identification based on the output of the best method in the training data. This validation workflow was subsequently applied to a portion of the Erlangen data (n=4293). For testing, the best performing methods were applied to remaining data (Leiden n=1000; Erlangen n=478) for an unbiased evaluation. Results: For the Leiden data set, the word-matching algorithm demonstrated mixed performance (AUROC 0.90; AUPRC 0.33; F1 score 0.55), and 4 methods significantly outperformed word-matching, with support vector machines performing best (AUROC 0.98; AUPRC 0.88; F1 score 0.83). Applying this support vector machine classifier to the test data resulted in a similarly high performance (F1 score 0.81; positive predictive value [PPV] 0.94), and with this method, we could identify 2873 patients with rheumatoid arthritis in less than 7 seconds out of the complete collection of 23,300 patients in the Leiden electronic health record system. For the Erlangen data set, gradient boosting performed best (AUROC 0.94; AUPRC 0.85; F1 score 0.82) in the training set, and applied to the test data, resulted once again in good results (F1 score 0.67; PPV 0.97). Conclusions: We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, allowing research on very large populations for limited costs. Our approach is language and center independent and could be applied to any type of diagnosis. We have developed our pipeline into a universally applicable and easy-to-implement workflow to equip centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size covering different countries for low cost from already available data in electronic health record systems. 
UR - http://medinform.jmir.org/2020/11/e23930/ UR - https://doi.org/10.2196/23930 UR - http://www.ncbi.nlm.nih.gov/pubmed/33252349 DO - 10.2196/23930 ID - info:doi/10.2196/23930 ER - TY - JOUR AU - Morse, Keith E AU - Ostberg, Nicolai P AU - Jones, Veena G AU - Chan, Albert S PY - 2020 DA - 2020/11/30 TI - Use Characteristics and Triage Acuity of a Digital Symptom Checker in a Large Integrated Health System: Population-Based Descriptive Study JO - J Med Internet Res SP - e20549 VL - 22 IS - 11 KW - symptom checker KW - chatbot KW - computer-assisted diagnosis KW - diagnostic self-evaluation KW - artificial intelligence KW - self-care KW - COVID-19 AB - Background: Pressure on the US health care system has been increasing due to a combination of aging populations, rising health care expenditures, and most recently, the COVID-19 pandemic. Responses to this pressure are hindered in part by reliance on a limited supply of highly trained health care professionals, creating a need for scalable technological solutions. Digital symptom checkers are artificial intelligence–supported software tools that use a conversational “chatbot” format to support rapid diagnosis and consistent triage. The COVID-19 pandemic has brought new attention to these tools due to the need to avoid face-to-face contact and preserve urgent care capacity. However, evidence-based deployment of these chatbots requires an understanding of user demographics and associated triage recommendations generated by a large general population. Objective: In this study, we evaluate the user demographics and levels of triage acuity provided by a symptom checker chatbot deployed in partnership with a large integrated health system in the United States. Methods: This population-based descriptive study included all web-based symptom assessments completed on the website and patient portal of the Sutter Health system (24 hospitals in Northern California) from April 24, 2019, to February 1, 2020. 
User demographics were compared to relevant US Census population data. Results: A total of 26,646 symptom assessments were completed during the study period. Most assessments (17,816/26,646, 66.9%) were completed by female users. The mean user age was 34.3 years (SD 14.4 years), compared to a median age of 37.3 years of the general population. The most common initial symptom was abdominal pain (2060/26,646, 7.7%). A substantial number of assessments (12,357/26,646, 46.4%) were completed outside of typical physician office hours. Most users were advised to seek medical care on the same day (7299/26,646, 27.4%) or within 2-3 days (6301/26,646, 23.6%). Over a quarter of the assessments indicated a high degree of urgency (7723/26,646, 29.0%). Conclusions: Users of the symptom checker chatbot were broadly representative of our patient population, although they skewed toward younger and female users. The triage recommendations were comparable to those of nurse-staffed telephone triage lines. Although the emergence of COVID-19 has increased the interest in remote medical assessment tools, it is important to take an evidence-based approach to their deployment. UR - https://www.jmir.org/2020/11/e20549 UR - https://doi.org/10.2196/20549 UR - http://www.ncbi.nlm.nih.gov/pubmed/33170799 DO - 10.2196/20549 ID - info:doi/10.2196/20549 ER - TY - JOUR AU - Cheng, Chi-Tung AU - Chen, Chih-Chi AU - Cheng, Fu-Jen AU - Chen, Huan-Wu AU - Su, Yi-Siang AU - Yeh, Chun-Nan AU - Chung, I-Fang AU - Liao, Chien-Hung PY - 2020 DA - 2020/11/27 TI - A Human-Algorithm Integration System for Hip Fracture Detection on Plain Radiography: System Development and Validation Study JO - JMIR Med Inform SP - e19416 VL - 8 IS - 11 KW - hip fracture KW - neural network KW - computer KW - artificial intelligence KW - algorithms KW - human augmentation KW - deep learning KW - diagnosis AB - Background: Hip fracture is the most common type of fracture in elderly individuals. 
Numerous deep learning (DL) algorithms for plain pelvic radiographs (PXRs) have been applied to improve the accuracy of hip fracture diagnosis. However, their efficacy is still undetermined. Objective: The objective of this study is to develop and validate a human-algorithm integration (HAI) system to improve the accuracy of hip fracture diagnosis in a real clinical environment. Methods: The HAI system with hip fracture detection ability was developed using a deep learning algorithm trained on trauma registry data and 3605 PXRs from August 2008 to December 2016. To compare their diagnostic performance before and after HAI system assistance using an independent testing dataset, 34 physicians were recruited. We analyzed the physicians’ accuracy, sensitivity, specificity, and agreement with the algorithm; we also performed subgroup analyses according to physician specialty and experience. Furthermore, we applied the HAI system in the emergency departments of different hospitals to validate its value in the real world. Results: With the support of the algorithm, which achieved 91% accuracy, the diagnostic performance of physicians was significantly improved in the independent testing dataset, as was revealed by the sensitivity (physician alone, median 95%; HAI, median 99%; P<.001), specificity (physician alone, median 90%; HAI, median 95%; P<.001), accuracy (physician alone, median 90%; HAI, median 96%; P<.001), and human-algorithm agreement [physician alone κ, median 0.69 (IQR 0.63-0.74); HAI κ, median 0.80 (IQR 0.76-0.82); P<.001]. With the help of the HAI system, the primary physicians showed significant improvement in their diagnostic performance to levels comparable to those of consulting physicians, and both the experienced and less-experienced physicians benefited from the HAI system. After the HAI system had been applied in 3 departments for 5 months, 587 images were examined. 
The sensitivity, specificity, and accuracy of the HAI system for detecting hip fractures were 97%, 95.7%, and 96.08%, respectively. Conclusions: HAI currently impacts health care, and integrating this technology into emergency departments is feasible. The developed HAI system can enhance physicians’ hip fracture diagnostic performance. UR - http://medinform.jmir.org/2020/11/e19416/ UR - https://doi.org/10.2196/19416 UR - http://www.ncbi.nlm.nih.gov/pubmed/33245279 DO - 10.2196/19416 ID - info:doi/10.2196/19416 ER - TY - JOUR AU - Kang, Eugene Yu-Chuan AU - Hsieh, Yi-Ting AU - Li, Chien-Hung AU - Huang, Yi-Jin AU - Kuo, Chang-Fu AU - Kang, Je-Ho AU - Chen, Kuan-Jen AU - Lai, Chi-Chun AU - Wu, Wei-Chi AU - Hwang, Yih-Shiou PY - 2020 DA - 2020/11/26 TI - Deep Learning–Based Detection of Early Renal Function Impairment Using Retinal Fundus Images: Model Development and Validation JO - JMIR Med Inform SP - e23472 VL - 8 IS - 11 KW - deep learning KW - renal function KW - retinal fundus image KW - diabetes KW - renal KW - kidney KW - retinal KW - eye KW - imaging KW - impairment KW - detection KW - development KW - validation KW - model AB - Background: Retinal imaging has been applied for detecting eye diseases and cardiovascular risks using deep learning–based methods. Furthermore, retinal microvascular and structural changes were found in renal function impairments. However, a deep learning–based method using retinal images for detecting early renal function impairment has not yet been well studied. Objective: This study aimed to develop and evaluate a deep learning model for detecting early renal function impairment using retinal fundus images. Methods: This retrospective study enrolled patients who underwent renal function tests with color fundus images captured at any time between January 1, 2001, and August 31, 2019. A deep learning model was constructed to detect impaired renal function from the images. 
Early renal function impairment was defined as estimated glomerular filtration rate <90 mL/min/1.73 m2. Model performance was evaluated with respect to the receiver operating characteristic curve and area under the curve (AUC). Results: In total, 25,706 retinal fundus images were obtained from 6212 patients for the study period. The images were divided at an 8:1:1 ratio. The training, validation, and testing data sets respectively contained 20,787, 2189, and 2730 images from 4970, 621, and 621 patients. There were 10,686 and 15,020 images determined to indicate normal and impaired renal function, respectively. The AUC of the model was 0.81 in the overall population. In subgroups stratified by serum hemoglobin A1c (HbA1c) level, the AUCs were 0.81, 0.84, 0.85, and 0.87 for the HbA1c levels of ≤6.5%, >6.5%, >7.5%, and >10%, respectively. Conclusions: The deep learning model in this study enables the detection of early renal function impairment using retinal fundus images. The model was more accurate for patients with elevated serum HbA1c levels. UR - http://medinform.jmir.org/2020/11/e23472/ UR - https://doi.org/10.2196/23472 UR - http://www.ncbi.nlm.nih.gov/pubmed/33139242 DO - 10.2196/23472 ID - info:doi/10.2196/23472 ER - TY - JOUR AU - Owais, Muhammad AU - Arsalan, Muhammad AU - Mahmood, Tahir AU - Kang, Jin Kyu AU - Park, Kang Ryoung PY - 2020 DA - 2020/11/26 TI - Automated Diagnosis of Various Gastrointestinal Lesions Using a Deep Learning–Based Classification and Retrieval Framework With a Large Endoscopic Database: Model Development and Validation JO - J Med Internet Res SP - e18563 VL - 22 IS - 11 KW - artificial intelligence KW - endoscopic video retrieval KW - content-based medical image retrieval KW - polyp detection KW - deep learning KW - computer-aided diagnosis AB - Background: The early diagnosis of various gastrointestinal diseases can lead to effective treatment and reduce the risk of many life-threatening conditions. 
Unfortunately, various small gastrointestinal lesions are undetectable during early-stage examination by medical experts. In previous studies, various deep learning–based computer-aided diagnosis tools have been used to make a significant contribution to the effective diagnosis and treatment of gastrointestinal diseases. However, most of these methods were designed to detect a limited number of gastrointestinal diseases, such as polyps, tumors, or cancers, in a specific part of the human gastrointestinal tract. Objective: This study aimed to develop a comprehensive computer-aided diagnosis tool to assist medical experts in diagnosing various types of gastrointestinal diseases. Methods: Our proposed framework comprises a deep learning–based classification network followed by a retrieval method. In the first step, the classification network predicts the disease type for the current medical condition. Then, the retrieval part of the framework shows the relevant cases (endoscopic images) from the previous database. These past cases help the medical expert validate the current computer prediction subjectively, which ultimately results in better diagnosis and treatment. Results: All the experiments were performed using 2 endoscopic data sets with a total of 52,471 frames and 37 different classes. The optimal performances obtained by our proposed method in accuracy, F1 score, mean average precision, and mean average recall were 96.19%, 96.99%, 98.18%, and 95.86%, respectively. The overall performance of our proposed diagnostic framework substantially outperformed state-of-the-art methods. Conclusions: This study provides a comprehensive computer-aided diagnosis framework for identifying various types of gastrointestinal diseases. The results show the superiority of our proposed method over various other recent methods and illustrate its potential for clinical diagnosis and treatment. 
Our proposed network can be applied to other classification domains in medical imaging, such as computed tomography scans, magnetic resonance imaging, and ultrasound sequences. UR - http://www.jmir.org/2020/11/e18563/ UR - https://doi.org/10.2196/18563 UR - http://www.ncbi.nlm.nih.gov/pubmed/33242010 DO - 10.2196/18563 ID - info:doi/10.2196/18563 ER - TY - JOUR AU - Tsai, Vincent FS AU - Zhuang, Bin AU - Pong, Yuan-Hung AU - Hsieh, Ju-Ton AU - Chang, Hong-Chiang PY - 2020 DA - 2020/11/19 TI - Web- and Artificial Intelligence–Based Image Recognition For Sperm Motility Analysis: Verification Study JO - JMIR Med Inform SP - e20031 VL - 8 IS - 11 KW - Male infertility KW - semen analysis KW - home sperm test KW - smartphone KW - artificial intelligence KW - cloud computing KW - telemedicine AB - Background: Human sperm quality fluctuates over time. Therefore, it is crucial for couples preparing for natural pregnancy to monitor sperm motility. Objective: This study verified the performance of an artificial intelligence–based image recognition and cloud computing sperm motility testing system (Bemaner, Createcare) composed of microscope and microfluidic modules and designed to adapt to different types of smartphones. Methods: Sperm videos were captured and uploaded to the cloud with an app. Analysis of sperm motility was performed by an artificial intelligence–based image recognition algorithm, and the results were then displayed. According to the number of motile sperm in the vision field, 47 (deidentified) videos of sperm were scored using 6 grades (0-5) by a male-fertility expert with 10 years of experience. Pearson product-moment correlation was calculated between the grades and the results (concentration of total sperm, concentration of motile sperm, and motility percentage) computed by the system. 
Results: Good correlation was demonstrated between the grades and results computed by the system for concentration of total sperm (r=0.65, P<.001), concentration of motile sperm (r=0.84, P<.001), and motility percentage (r=0.90, P<.001). Conclusions: This smartphone-based sperm motility test (Bemaner) accurately measures motility-related parameters and could potentially be applied toward the following fields: male infertility detection, sperm quality test during preparation for pregnancy, and infertility treatment monitoring. With frequent at-home testing, more data can be collected to help make clinical decisions and to conduct epidemiological research. UR - http://medinform.jmir.org/2020/11/e20031/ UR - https://doi.org/10.2196/20031 UR - http://www.ncbi.nlm.nih.gov/pubmed/33211025 DO - 10.2196/20031 ID - info:doi/10.2196/20031 ER - TY - JOUR AU - Islam, Md Mohaimenul AU - Yang, Hsuan-Chia AU - Poly, Tahmina Nasrin AU - Li, Yu-Chuan Jack PY - 2020 DA - 2020/11/18 TI - Development of an Artificial Intelligence–Based Automated Recommendation System for Clinical Laboratory Tests: Retrospective Analysis of the National Health Insurance Database JO - JMIR Med Inform SP - e24163 VL - 8 IS - 11 KW - artificial intelligence KW - deep learning KW - clinical decision-support system KW - laboratory test KW - patient safety AB - Background: Laboratory tests are considered an essential part of patient safety as patients’ screening, diagnosis, and follow-up are solely based on laboratory tests. Diagnosis of patients could be wrong, missed, or delayed if laboratory tests are performed erroneously. However, the value of correct laboratory test ordering remains underestimated by policymakers and clinicians. Nowadays, artificial intelligence methods such as machine learning and deep learning (DL) have been extensively used as powerful tools for pattern recognition in large data sets. 
Therefore, developing an automated laboratory test recommendation tool using available data from electronic health records (EHRs) could support current clinical practice. Objective: The objective of this study was to develop an artificial intelligence–based automated model that can provide laboratory test recommendations based on simple variables available in EHRs. Methods: A retrospective analysis of the National Health Insurance database between January 1, 2013, and December 31, 2013, was performed. We reviewed the record of all patients who visited the cardiology department at least once and were prescribed laboratory tests. The data set was split into training and testing sets (80:20) to develop the DL model. In the internal validation, 25% of data were randomly selected from the training set to evaluate the performance of this model. Results: We used the area under the receiver operating characteristic curve, precision, recall, and hamming loss as comparative measures. A total of 129,938 prescriptions were used in our model. The DL-based automated recommendation system for laboratory tests achieved a significantly higher area under the receiver operating characteristic curve (AUROCmacro and AUROCmicro of 0.76 and 0.87, respectively). Using a low cutoff, the model identified appropriate laboratory tests with 99% sensitivity. Conclusions: The developed artificial intelligence model based on DL exhibited good discriminative capability for predicting laboratory tests using routinely collected EHR data. Utilization of DL approaches can facilitate optimal laboratory test selection for patients, which may in turn improve patient safety. However, future study is recommended to assess the cost-effectiveness for implementing this model in real-world clinical settings. 
UR - https://medinform.jmir.org/2020/11/e24163 UR - https://doi.org/10.2196/24163 UR - http://www.ncbi.nlm.nih.gov/pubmed/33206057 DO - 10.2196/24163 ID - info:doi/10.2196/24163 ER - TY - JOUR AU - von Wedel, Philip AU - Hagist, Christian PY - 2020 DA - 2020/11/18 TI - Economic Value of Data and Analytics for Health Care Providers: Hermeneutic Systematic Literature Review JO - J Med Internet Res SP - e23315 VL - 22 IS - 11 KW - digital health KW - health information technology KW - healthcare provider economics KW - electronic health records KW - data analytics KW - artificial intelligence AB - Background: The benefits of data and analytics for health care systems and single providers are an increasingly investigated field in digital health literature. Electronic health records (EHR), for example, can improve quality of care. Emerging analytics tools based on artificial intelligence show the potential to assist physicians in day-to-day workflows. Yet, single health care providers also need information regarding the economic impact when deciding on potential adoption of these tools. Objective: This paper examines the question of whether data and analytics provide economic advantages or disadvantages for health care providers. The goal is to provide a comprehensive overview including a variety of technologies beyond computer-based patient records. Ultimately, findings are also intended to determine whether economic barriers for adoption by providers could exist. Methods: A systematic literature search of the PubMed and Google Scholar online databases was conducted, following the hermeneutic methodology that encourages iterative search and interpretation cycles. After applying inclusion and exclusion criteria to 165 initially identified studies, 50 were included for qualitative synthesis and topic-based clustering. 
Results: The review identified 5 major technology categories, namely EHRs (n=30), computerized clinical decision support (n=8), advanced analytics (n=5), business analytics (n=5), and telemedicine (n=2). Overall, 62% (31/50) of the reviewed studies indicated a positive economic impact for providers either via direct cost or revenue effects or via indirect efficiency or productivity improvements. When differentiating between categories, however, an ambiguous picture emerged for EHR, whereas analytics technologies like computerized clinical decision support and advanced analytics predominantly showed economic benefits. Conclusions: The research question of whether data and analytics create economic benefits for health care providers cannot be answered uniformly. The results indicate ambiguous effects for EHRs, here representing data, and mainly positive effects for the significantly less studied analytics field. The mixed results regarding EHRs can create an economic barrier for adoption by providers. This barrier can translate into a bottleneck to positive economic effects of analytics technologies relying on EHR data. Ultimately, more research on economic effects of technologies other than EHRs is needed to generate a more reliable evidence base. 
UR - http://www.jmir.org/2020/11/e23315/ UR - https://doi.org/10.2196/23315 UR - http://www.ncbi.nlm.nih.gov/pubmed/33206056 DO - 10.2196/23315 ID - info:doi/10.2196/23315 ER - TY - JOUR AU - Gao, Yang AU - Xiao, Xiong AU - Han, Bangcheng AU - Li, Guilin AU - Ning, Xiaolin AU - Wang, Defeng AU - Cai, Weidong AU - Kikinis, Ron AU - Berkovsky, Shlomo AU - Di Ieva, Antonio AU - Zhang, Liwei AU - Ji, Nan AU - Liu, Sidong PY - 2020 DA - 2020/11/17 TI - Deep Learning Methodology for Differentiating Glioma Recurrence From Radiation Necrosis Using Multimodal Magnetic Resonance Imaging: Algorithm Development and Validation JO - JMIR Med Inform SP - e19805 VL - 8 IS - 11 KW - recurrent tumor KW - radiation necrosis KW - progression KW - pseudoprogression KW - multimodal MRI KW - deep learning AB - Background: The radiological differential diagnosis between tumor recurrence and radiation-induced necrosis (ie, pseudoprogression) is of paramount importance in the management of glioma patients. Objective: This research aims to develop a deep learning methodology for automated differentiation of tumor recurrence from radiation necrosis based on routine magnetic resonance imaging (MRI) scans. Methods: In this retrospective study, 146 patients who underwent radiation therapy after glioma resection and presented with suspected recurrent lesions at the follow-up MRI examination were selected for analysis. Routine MRI scans were acquired from each patient, including T1, T2, and gadolinium-contrast-enhanced T1 sequences. Of those cases, 96 (65.8%) were confirmed as glioma recurrence on postsurgical pathological examination, while 50 (34.2%) were diagnosed as necrosis. A light-weighted deep neural network (DNN) (ie, efficient radionecrosis neural network [ERN-Net]) was proposed to learn radiological features of gliomas and necrosis from MRI scans. 
Sensitivity, specificity, accuracy, and area under the curve (AUC) were used to evaluate performance of the model in both image-wise and subject-wise classifications. Preoperative diagnostic performance of the model was also compared to that of the state-of-the-art DNN models and five experienced neurosurgeons. Results: DNN models based on multimodal MRI outperformed single-modal models. ERN-Net achieved the highest AUC in both image-wise (0.915) and subject-wise (0.958) classification tasks. The evaluated DNN models achieved an average sensitivity of 0.947 (SD 0.033), specificity of 0.817 (SD 0.075), and accuracy of 0.903 (SD 0.026), which were significantly better than the tested neurosurgeons (P=.02 in sensitivity and P<.001 in specificity and accuracy). Conclusions: Deep learning offers a useful computational tool for the differential diagnosis between recurrent gliomas and necrosis. The proposed ERN-Net model, a simple and effective DNN model, achieved excellent performance on routine MRI scans and showed a high clinical applicability. UR - http://medinform.jmir.org/2020/11/e19805/ UR - https://doi.org/10.2196/19805 UR - http://www.ncbi.nlm.nih.gov/pubmed/33200991 DO - 10.2196/19805 ID - info:doi/10.2196/19805 ER - TY - JOUR AU - Koman, Jason AU - Fauvelle, Khristina AU - Schuck, Stéphane AU - Texier, Nathalie AU - Mebarki, Adel PY - 2020 DA - 2020/11/10 TI - Physicians’ Perceptions of the Use of a Chatbot for Information Seeking: Qualitative Study JO - J Med Internet Res SP - e15185 VL - 22 IS - 11 KW - health KW - digital health KW - innovation KW - conversational agent KW - decision support system KW - qualitative research KW - chatbot KW - bot KW - medical drugs KW - prescription KW - risk minimization measures AB - Background: Seeking medical information can be an issue for physicians. 
In the specific context of medical practice, chatbots are hypothesized to present additional value for providing information quickly, particularly as far as drug risk minimization measures are concerned. Objective: This qualitative study aimed to elicit physicians’ perceptions of a pilot version of a chatbot used in the context of drug information and risk minimization measures. Methods: General practitioners and specialists were recruited across France to participate in individual semistructured interviews. Interviews were recorded, transcribed, and analyzed using a horizontal thematic analysis approach. Results: Eight general practitioners and 2 specialists participated. The tone and ergonomics of the pilot version were appreciated by physicians. However, all participants emphasized the importance of getting exhaustive, trustworthy answers when interacting with a chatbot. Conclusions: The chatbot was perceived as a useful and innovative tool that could easily be integrated into routine medical practice and could help health professionals when seeking information on drug and risk minimization measures. UR - http://www.jmir.org/2020/11/e15185/ UR - https://doi.org/10.2196/15185 UR - http://www.ncbi.nlm.nih.gov/pubmed/33170134 DO - 10.2196/15185 ID - info:doi/10.2196/15185 ER - TY - JOUR AU - Roosan, Don AU - Chok, Jay AU - Karim, Mazharul AU - Law, Anandi V AU - Baskys, Andrius AU - Hwang, Angela AU - Roosan, Moom R PY - 2020 DA - 2020/11/9 TI - Artificial Intelligence–Powered Smartphone App to Facilitate Medication Adherence: Protocol for a Human Factors Design Study JO - JMIR Res Protoc SP - e21659 VL - 9 IS - 11 KW - artificial intelligence KW - smartphone app KW - patient cognition KW - complex medication information KW - medication adherence KW - machine learning KW - mobile phone AB - Background: Medication Guides consisting of crucial interactions and side effects are extensive and complex. 
Due to the exhaustive information, patients do not retain the necessary medication information, which can result in hospitalizations and medication nonadherence. A gap exists in understanding patients’ cognition of managing complex medication information. However, advancements in technology and artificial intelligence (AI) allow us to understand patient cognitive processes to design an app to better provide important medication information to patients. Objective: Our objective is to improve the design of an innovative AI- and human factor–based interface that supports patients’ medication information comprehension that could potentially improve medication adherence. Methods: This study has three aims. Aim 1 has three phases: (1) an observational study to understand patient perception of fear and biases regarding medication information, (2) an eye-tracking study to understand the attention locus for medication information, and (3) a psychological refractory period (PRP) paradigm study to understand functionalities. Observational data will be collected, such as audio and video recordings, gaze mapping, and time from PRP. A total of 50 patients, aged 18-65 years, who started at least one new medication, for which we developed visualization information, and who have a cognitive status of 34 during cognitive screening using the TICS-M test and health literacy level will be included in this aim of the study. In Aim 2, we will iteratively design and evaluate an AI-powered medication information visualization interface as a smartphone app with the knowledge gained from each component of Aim 1. The interface will be assessed through two usability surveys. A total of 300 patients, aged 18-65 years, with diabetes, cardiovascular diseases, or mental health disorders, will be recruited for the surveys. Data from the surveys will be analyzed through exploratory factor analysis. In Aim 3, in order to test the prototype, there will be a two-arm study design. 
This aim will include 900 patients, aged 18-65 years, with internet access, without any cognitive impairment, and with at least two medications. Patients will be sequentially randomized. Three surveys will be used to assess the primary outcome of medication information comprehension and the secondary outcome of medication adherence at 12 weeks. Results: Preliminary data collection will be conducted in 2021, and results are expected to be published in 2022. Conclusions: This study will lead the future of AI-based, innovative, digital interface design and aid in improving medication comprehension, which may improve medication adherence. The results from this study will also open up future research opportunities in understanding how patients manage complex medication information and will inform the format and design for innovative, AI-powered digital interfaces for Medication Guides. International Registered Report Identifier (IRRID): PRR1-10.2196/21659 UR - http://www.researchprotocols.org/2020/11/e21659/ UR - https://doi.org/10.2196/21659 UR - http://www.ncbi.nlm.nih.gov/pubmed/33164898 DO - 10.2196/21659 ID - info:doi/10.2196/21659 ER - TY - JOUR AU - Spasic, Irena AU - Button, Kate PY - 2020 DA - 2020/11/6 TI - Patient Triage by Topic Modeling of Referral Letters: Feasibility Study JO - JMIR Med Inform SP - e21252 VL - 8 IS - 11 KW - natural language processing KW - machine learning KW - data science KW - medical informatics KW - computer-assisted decision making AB - Background: Musculoskeletal conditions are managed within primary care, but patients can be referred to secondary care if a specialist opinion is required. The ever-increasing demand for health care resources emphasizes the need to streamline care pathways with the ultimate aim of ensuring that patients receive timely and optimal care. 
Information contained in referral letters underpins the referral decision-making process but is yet to be explored systematically for the purposes of treatment prioritization for musculoskeletal conditions. Objective: This study aims to explore the feasibility of using natural language processing and machine learning to automate the triage of patients with musculoskeletal conditions by analyzing information from referral letters. Specifically, we aim to determine whether referral letters can be automatically assorted into latent topics that are clinically relevant, that is, considered relevant when prescribing treatments. Here, clinical relevance is assessed by posing 2 research questions. Can latent topics be used to automatically predict treatment? Can clinicians interpret latent topics as cohorts of patients who share common characteristics or experiences such as medical history, demographics, and possible treatments? Methods: We used latent Dirichlet allocation to model each referral letter as a finite mixture over an underlying set of topics and model each topic as an infinite mixture over an underlying set of topic probabilities. The topic model was evaluated in the context of automating patient triage. Given a set of treatment outcomes, a binary classifier was trained for each outcome using previously extracted topics as the input features of the machine learning algorithm. In addition, a qualitative evaluation was performed to assess the human interpretability of topics. Results: The prediction accuracy of binary classifiers outperformed the stratified random classifier by a large margin, indicating that topic modeling could be used to predict the treatment, thus effectively supporting patient triage. The qualitative evaluation confirmed the high clinical interpretability of the topic model. 
Conclusions: The results established the feasibility of using natural language processing and machine learning to automate triage of patients with knee or hip pain by analyzing information from their referral letters. UR - https://medinform.jmir.org/2020/11/e21252 UR - https://doi.org/10.2196/21252 UR - http://www.ncbi.nlm.nih.gov/pubmed/33155985 DO - 10.2196/21252 ID - info:doi/10.2196/21252 ER - TY - JOUR AU - Almusharraf, Fahad AU - Rose, Jonathan AU - Selby, Peter PY - 2020 DA - 2020/11/3 TI - Engaging Unmotivated Smokers to Move Toward Quitting: Design of Motivational Interviewing–Based Chatbot Through Iterative Interactions JO - J Med Internet Res SP - e20251 VL - 22 IS - 11 KW - smoking cessation KW - motivational interviewing KW - chatbot KW - natural language processing AB - Background: At any given time, most smokers in a population are ambivalent with no motivation to quit. Motivational interviewing (MI) is an evidence-based technique that aims to elicit change in ambivalent smokers. MI practitioners are scarce and expensive, and smokers are difficult to reach. Smokers are potentially reachable through the web, and if an automated chatbot could emulate an MI conversation, it could form the basis of a low-cost and scalable intervention motivating smokers to quit. Objective: The primary goal of this study is to design, train, and test an automated MI-based chatbot capable of eliciting reflection in a conversation with cigarette smokers. This study describes the process of collecting training data to improve the chatbot’s ability to generate MI-oriented responses, particularly reflections and summary statements. The secondary goal of this study is to observe the effects on participants through voluntary feedback given after completing a conversation with the chatbot. 
Methods: An interdisciplinary collaboration between an MI expert and experts in computer engineering and natural language processing (NLP) co-designed the conversation and algorithms underlying the chatbot. A sample of 121 adult cigarette smokers in 11 successive groups were recruited from a web-based platform for a single-arm prospective iterative design study. The chatbot was designed to stimulate reflections on the pros and cons of smoking using MI’s running head start technique. Participants were also asked to confirm the chatbot’s classification of their free-form responses to measure the classification accuracy of the underlying NLP models. Each group provided responses that were used to train the chatbot for the next group. Results: A total of 6568 responses from 121 participants in 11 successive groups over 14 weeks were received. From these responses, we were able to isolate 21 unique reasons for and against smoking and the relative frequency of each. The gradual collection of responses as inputs and smoking reasons as labels over the 11 iterations improved the F1 score of the classification within the chatbot from 0.63 in the first group to 0.82 in the final group. The mean time spent by each participant interacting with the chatbot was 21.3 (SD 14.0) min (minimum 6.4 and maximum 89.2). We also found that 34.7% (42/121) of participants enjoyed the interaction with the chatbot, and 8.3% (10/121) of participants noted explicit smoking cessation benefits from the conversation in voluntary feedback that did not solicit this explicitly. Conclusions: Recruiting ambivalent smokers through the web is a viable method to train a chatbot to increase accuracy in reflection and summary statements, the building blocks of MI. A new set of 21 smoking reasons (both for and against) has been identified. Initial feedback from smokers on the experience shows promise toward using it in an intervention. 
UR - https://www.jmir.org/2020/11/e20251 UR - https://doi.org/10.2196/20251 UR - http://www.ncbi.nlm.nih.gov/pubmed/33141095 DO - 10.2196/20251 ID - info:doi/10.2196/20251 ER - TY - JOUR AU - Čukić, Milena AU - López, Victoria AU - Pavón, Juan PY - 2020 DA - 2020/11/3 TI - Classification of Depression Through Resting-State Electroencephalogram as a Novel Practice in Psychiatry: Review JO - J Med Internet Res SP - e19548 VL - 22 IS - 11 KW - computational psychiatry KW - physiological complexity KW - machine learning KW - theory-driven approach KW - resting-state EEG KW - personalized medicine KW - computational neuroscience KW - unwarranted optimism AB - Background: Machine learning applications in health care have increased considerably in the recent past, and this review focuses on an important application in psychiatry related to the detection of depression. Since the advent of computational psychiatry, research based on functional magnetic resonance imaging has yielded remarkable results, but these tools tend to be too expensive for everyday clinical use. Objective: This review focuses on an affordable data-driven approach based on electroencephalographic recordings. Web-based applications via public or private cloud-based platforms would be a logical next step. We aim to compare several different approaches to the detection of depression from electroencephalographic recordings using various features and machine learning models. Methods: To detect depression, we reviewed published detection studies based on resting-state electroencephalogram with final machine learning, and to predict therapy outcomes, we reviewed a set of interventional studies using some form of stimulation in their methodology. Results: We reviewed 14 detection studies and 12 interventional studies published between 2008 and 2019. 
As direct comparison was not possible due to the large diversity of theoretical approaches and methods used, we compared them based on the steps in analysis and accuracies yielded. In addition, we compared possible drawbacks in terms of sample size, feature extraction, feature selection, classification, internal and external validation, and possible unwarranted optimism and reproducibility. In addition, we suggested desirable practices to avoid misinterpretation of results and optimism. Conclusions: This review shows the need for larger data sets and more systematic procedures to improve the use of the solution for clinical diagnostics. Therefore, regulation of the pipeline and standard requirements for methodology used should become mandatory to increase the reliability and accuracy of the complete methodology for it to be translated to modern psychiatry. UR - https://www.jmir.org/2020/11/e19548 UR - https://doi.org/10.2196/19548 UR - http://www.ncbi.nlm.nih.gov/pubmed/33141088 DO - 10.2196/19548 ID - info:doi/10.2196/19548 ER - TY - JOUR AU - Zhou, Sicheng AU - Zhao, Yunpeng AU - Bian, Jiang AU - Haynos, Ann F AU - Zhang, Rui PY - 2020 DA - 2020/10/30 TI - Exploring Eating Disorder Topics on Twitter: Machine Learning Approach JO - JMIR Med Inform SP - e18273 VL - 8 IS - 10 KW - eating disorders KW - topic modeling KW - text classification KW - social media KW - public health AB - Background: Eating disorders (EDs) are a group of mental illnesses that have an adverse effect on both mental and physical health. As social media platforms (eg, Twitter) have become an important data source for public health research, some studies have qualitatively explored the ways in which EDs are discussed on these platforms. Initial results suggest that such research offers a promising method for further understanding this group of diseases. Nevertheless, an efficient computational method is needed to further identify and analyze tweets relevant to EDs on a larger scale. 
Objective: This study aims to develop and validate a machine learning–based classifier to identify tweets related to EDs and to explore factors (ie, topics) related to EDs using a topic modeling method. Methods: We collected potential ED-relevant tweets using keywords from previous studies and annotated these tweets into different groups (ie, ED relevant vs irrelevant and then promotional information vs laypeople discussion). Several supervised machine learning methods, such as convolutional neural network (CNN), long short-term memory (LSTM), support vector machine, and naïve Bayes, were developed and evaluated using annotated data. We used the classifier with the best performance to identify ED-relevant tweets and applied a topic modeling method—Correlation Explanation (CorEx)—to analyze the content of the identified tweets. To validate these machine learning results, we also collected a cohort of ED-relevant tweets on the basis of manually curated rules. Results: A total of 123,977 tweets were collected during the set period. We randomly annotated 2219 tweets for developing the machine learning classifiers. We developed a CNN-LSTM classifier to identify ED-relevant tweets published by laypeople in 2 steps: first relevant versus irrelevant (F1 score=0.89) and then promotional versus published by laypeople (F1 score=0.90). A total of 40,790 ED-relevant tweets were identified using the CNN-LSTM classifier. We also identified another set of tweets (ie, 17,632 ED-relevant and 83,557 ED-irrelevant tweets) posted by laypeople using manually specified rules. Using CorEx on all ED-relevant tweets, the topic model identified 162 topics. Overall, the coherence rate for topic modeling was 77.07% (1264/1640), indicating a high quality of the produced topics. The topics were further reviewed and analyzed by a domain expert. 
Conclusions: A developed CNN-LSTM classifier could improve the efficiency of identifying ED-relevant tweets compared with the traditional manual-based method. The CorEx topic model was applied on the tweets identified by the machine learning–based classifier and the traditional manual approach separately. Highly overlapping topics were observed between the 2 cohorts of tweets. The produced topics were further reviewed by a domain expert. Some of the topics identified by the potential ED tweets may provide new avenues for understanding this serious set of disorders. UR - http://medinform.jmir.org/2020/10/e18273/ UR - https://doi.org/10.2196/18273 UR - http://www.ncbi.nlm.nih.gov/pubmed/33124997 DO - 10.2196/18273 ID - info:doi/10.2196/18273 ER - TY - JOUR AU - Chou, Joseph H PY - 2020 DA - 2020/10/29 TI - Predictive Models for Neonatal Follow-Up Serum Bilirubin: Model Development and Validation JO - JMIR Med Inform SP - e21222 VL - 8 IS - 10 KW - infant, newborn KW - neonatology KW - jaundice, neonatal KW - hyperbilirubinemia, neonatal KW - machine learning KW - supervised machine learning KW - data science KW - medical informatics KW - decision support techniques KW - models, statistical KW - predictive models AB - Background: Hyperbilirubinemia affects many newborn infants and, if not treated appropriately, can lead to irreversible brain injury. Objective: This study aims to develop predictive models of follow-up total serum bilirubin measurement and to compare their accuracy with that of clinician predictions. Methods: Subjects were patients born between June 2015 and June 2019 at 4 hospitals in Massachusetts. The prediction target was a follow-up total serum bilirubin measurement obtained <72 hours after a previous measurement. Birth before versus after February 2019 was used to generate a training set (27,428 target measurements) and a held-out test set (3320 measurements), respectively. Multiple supervised learning models were trained. 
To further assess model performance, predictions on the held-out test set were also compared with corresponding predictions from clinicians. Results: The best predictive accuracy on the held-out test set was obtained with the multilayer perceptron (ie, neural network, mean absolute error [MAE] 1.05 mg/dL) and Xgboost (MAE 1.04 mg/dL) models. A limited number of predictors were sufficient for constructing models with the best performance and avoiding overfitting: current bilirubin measurement, last rate of rise, proportion of time under phototherapy, time to next measurement, gestational age at birth, current age, and fractional weight change from birth. Clinicians made a total of 210 prospective predictions. The neural network model accuracy on this subset of predictions had an MAE of 1.06 mg/dL compared with clinician predictions with an MAE of 1.38 mg/dL (P<.0001). In babies born at 35 weeks of gestation or later, this approach was also applied to predict the binary outcome of subsequently exceeding consensus guidelines for phototherapy initiation and achieved an area under the receiver operator characteristic curve of 0.94 (95% CI 0.91 to 0.97). Conclusions: This study developed predictive models for neonatal follow-up total serum bilirubin measurements that outperform clinicians. This may be the first report of models that predict specific bilirubin values, are not limited to near-term patients without risk factors, and take into account the effect of phototherapy. 
UR - http://medinform.jmir.org/2020/10/e21222/ UR - https://doi.org/10.2196/21222 UR - http://www.ncbi.nlm.nih.gov/pubmed/33118947 DO - 10.2196/21222 ID - info:doi/10.2196/21222 ER - TY - JOUR AU - Izquierdo, Jose Luis AU - Ancochea, Julio AU - Soriano, Joan B PY - 2020 DA - 2020/10/28 TI - Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing JO - J Med Internet Res SP - e21801 VL - 22 IS - 10 KW - artificial intelligence KW - big data KW - COVID-19 KW - electronic health records KW - tachypnea KW - SARS-CoV-2 KW - predictive model AB - Background: Many factors involved in the onset and clinical course of the ongoing COVID-19 pandemic are still unknown. Although big data analytics and artificial intelligence are widely used in the realms of health and medicine, researchers are only beginning to use these tools to explore the clinical characteristics and predictive factors of patients with COVID-19. Objective: Our primary objectives are to describe the clinical characteristics and determine the factors that predict intensive care unit (ICU) admission of patients with COVID-19. Determining these factors using a well-defined population can increase our understanding of the real-world epidemiology of the disease. Methods: We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyze the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the Servicio de Salud de Castilla-La Mancha (SESCAM) Health Care Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. We extracted related clinical information regarding diagnosis, progression, and outcome for all COVID-19 cases. 
Results: A total of 10,504 patients with a clinical or polymerase chain reaction–confirmed diagnosis of COVID-19 were identified; 5519 (52.5%) were male, with a mean age of 58.2 years (SD 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnea; however, all three symptoms occurred in fewer than half of the cases. Overall, 6.1% (83/1353) of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnea was the most parsimonious predictor of ICU admission; patients younger than 56 years, without tachypnea, and temperature <39 °C (or >39 °C without respiratory crackles) were not admitted to the ICU. In contrast, patients with COVID-19 aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnea and delayed their visit to the emergency department after being seen in primary care. Conclusions: Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnea with or without respiratory crackles) predicts whether patients with COVID-19 will require ICU admission. UR - http://www.jmir.org/2020/10/e21801/ UR - https://doi.org/10.2196/21801 UR - http://www.ncbi.nlm.nih.gov/pubmed/33090964 DO - 10.2196/21801 ID - info:doi/10.2196/21801 ER - TY - JOUR AU - Lee, Geun Hyeong AU - Shin, Soo-Yong PY - 2020 DA - 2020/10/26 TI - Federated Learning on Clinical Benchmark Data: Performance Assessment JO - J Med Internet Res SP - e20891 VL - 22 IS - 10 KW - federated learning KW - medical data KW - privacy protection KW - machine learning KW - deep learning AB - Background: Federated learning (FL) is a newly proposed machine-learning method that uses a decentralized dataset. Since data transfer is not necessary for the learning process in FL, there is a significant advantage in protecting personal privacy. Therefore, many studies are being actively conducted in the applications of FL for diverse areas. 
Objective: The aim of this study was to evaluate the reliability and performance of FL using three benchmark datasets, including a clinical benchmark dataset. Methods: To evaluate FL in a realistic setting, we implemented FL using a client-server architecture with Python. The implemented client-server version of the FL software was deployed to Amazon Web Services. Modified National Institute of Standards and Technology (MNIST), Medical Information Mart for Intensive Care-III (MIMIC-III), and electrocardiogram (ECG) datasets were used to evaluate the performance of FL. To test FL in a realistic setting, the MNIST dataset was split into 10 different clients, with one digit for each client. In addition, we conducted four different experiments according to basic, imbalanced, skewed, and a combination of imbalanced and skewed data distributions. We also compared the performance of FL to that of the state-of-the-art method with respect to in-hospital mortality using the MIMIC-III dataset. Likewise, we conducted experiments comparing basic and imbalanced data distributions using MIMIC-III and ECG data. Results: FL on the basic MNIST dataset with 10 clients achieved an area under the receiver operating characteristic curve (AUROC) of 0.997 and an F1-score of 0.946. The experiment with the imbalanced MNIST dataset achieved an AUROC of 0.995 and an F1-score of 0.921. The experiment with the skewed MNIST dataset achieved an AUROC of 0.992 and an F1-score of 0.905. Finally, the combined imbalanced and skewed experiment achieved an AUROC of 0.990 and an F1-score of 0.891. The basic FL on in-hospital mortality using MIMIC-III data achieved an AUROC of 0.850 and an F1-score of 0.944, while the experiment with the imbalanced MIMIC-III dataset achieved an AUROC of 0.850 and an F1-score of 0.943. For ECG classification, the basic FL achieved an AUROC of 0.938 and an F1-score of 0.807, and the imbalanced ECG dataset achieved an AUROC of 0.943 and an F1-score of 0.807. 
Conclusions: FL demonstrated comparable performance on different benchmark datasets. In addition, FL demonstrated reliable performance in cases where the distribution was imbalanced, skewed, and extreme, reflecting the real-life scenario in which data distributions from various hospitals are different. FL can achieve high performance while maintaining privacy protection because there is no requirement to centralize the data. UR - http://www.jmir.org/2020/10/e20891/ UR - https://doi.org/10.2196/20891 UR - http://www.ncbi.nlm.nih.gov/pubmed/33104011 DO - 10.2196/20891 ID - info:doi/10.2196/20891 ER - TY - JOUR AU - Milne-Ives, Madison AU - de Cock, Caroline AU - Lim, Ernest AU - Shehadeh, Melissa Harper AU - de Pennington, Nick AU - Mole, Guy AU - Normando, Eduardo AU - Meinert, Edward PY - 2020 DA - 2020/10/22 TI - The Effectiveness of Artificial Intelligence Conversational Agents in Health Care: Systematic Review JO - J Med Internet Res SP - e20346 VL - 22 IS - 10 KW - artificial intelligence KW - avatar KW - chatbot KW - conversational agent KW - digital health KW - intelligent assistant KW - speech recognition software KW - virtual assistant KW - virtual coach KW - virtual health care KW - virtual nursing KW - voice recognition software AB - Background: The high demand for health care services and the growing capability of artificial intelligence have led to the development of conversational agents designed to support a variety of health-related activities, including behavior change, treatment support, health monitoring, training, triage, and screening support. Automation of these tasks could free clinicians to focus on more complex work and increase the accessibility to health care services for the public. An overarching assessment of the acceptability, usability, and effectiveness of these agents in health care is needed to collate the evidence so that future development can target areas for improvement and potential for sustainable adoption. 
Objective: This systematic review aims to assess the effectiveness and usability of conversational agents in health care and identify the elements that users like and dislike to inform future research and development of these agents. Methods: PubMed, Medline (Ovid), EMBASE (Excerpta Medica dataBASE), CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, and the Association for Computing Machinery Digital Library were systematically searched for articles published since 2008 that evaluated unconstrained natural language processing conversational agents used in health care. EndNote (version X9, Clarivate Analytics) reference management software was used for initial screening, and full-text screening was conducted by 1 reviewer. Data were extracted, and the risk of bias was assessed by one reviewer and validated by another. Results: A total of 31 studies were selected and included a variety of conversational agents, including 14 chatbots (2 of which were voice chatbots), 6 embodied conversational agents (3 of which were interactive voice response calls, virtual patients, and speech recognition screening systems), 1 contextual question-answering agent, and 1 voice recognition triage system. Overall, the evidence reported was mostly positive or mixed. Usability and satisfaction performed well (27/30 and 26/31), and positive or mixed effectiveness was found in three-quarters of the studies (23/30). However, there were several limitations of the agents highlighted in specific qualitative feedback. Conclusions: The studies generally reported positive or mixed evidence for the effectiveness, usability, and satisfactoriness of the conversational agents investigated, but qualitative user perceptions were more mixed. The quality of many of the studies was limited, and improved study design and reporting are necessary to more accurately evaluate the usefulness of the agents in health care and identify key areas for improvement. 
Further research should also analyze the cost-effectiveness, privacy, and security of the agents. International Registered Report Identifier (IRRID): RR2-10.2196/16934 UR - http://www.jmir.org/2020/10/e20346/ UR - https://doi.org/10.2196/20346 UR - http://www.ncbi.nlm.nih.gov/pubmed/33090118 DO - 10.2196/20346 ID - info:doi/10.2196/20346 ER - TY - JOUR AU - Almog, Yasmeen Adar AU - Rai, Angshu AU - Zhang, Patrick AU - Moulaison, Amanda AU - Powell, Ross AU - Mishra, Anirban AU - Weinberg, Kerry AU - Hamilton, Celeste AU - Oates, Mary AU - McCloskey, Eugene AU - Cummings, Steven R PY - 2020 DA - 2020/10/16 TI - Deep Learning With Electronic Health Records for Short-Term Fracture Risk Identification: Crystal Bone Algorithm Development and Validation JO - J Med Internet Res SP - e22550 VL - 22 IS - 10 KW - fracture KW - bone KW - osteoporosis KW - low bone mass KW - prediction KW - natural language processing KW - NLP KW - machine learning KW - deep learning KW - artificial intelligence KW - AI KW - electronic health record KW - EHR AB - Background: Fractures as a result of osteoporosis and low bone mass are common and give rise to significant clinical, personal, and economic burden. Even after a fracture occurs, high fracture risk remains widely underdiagnosed and undertreated. Common fracture risk assessment tools utilize a subset of clinical risk factors for prediction, and often require manual data entry. Furthermore, these tools predict risk over the long term and do not explicitly provide short-term risk estimates necessary to identify patients likely to experience a fracture in the next 1-2 years. Objective: The goal of this study was to develop and evaluate an algorithm for the identification of patients at risk of fracture in a subsequent 1- to 2-year period. 
In order to address the aforementioned limitations of current prediction tools, this approach focused on a short-term timeframe, automated data entry, and the use of longitudinal data to inform the predictions. Methods: Using retrospective electronic health record data from over 1,000,000 patients, we developed Crystal Bone, an algorithm that applies machine learning techniques from natural language processing to the temporal nature of patient histories to generate short-term fracture risk predictions. Similar to how language models predict the next word in a given sentence or the topic of a document, Crystal Bone predicts whether a patient’s future trajectory might contain a fracture event, or whether the signature of the patient’s journey is similar to that of a typical future fracture patient. A holdout set with 192,590 patients was used to validate accuracy. Experimental baseline models and human-level performance were used for comparison. Results: The model accurately predicted 1- to 2-year fracture risk for patients aged over 50 years (area under the receiver operating characteristics curve [AUROC] 0.81). These algorithms outperformed the experimental baselines (AUROC 0.67) and showed meaningful improvements when compared to retrospective approximation of human-level performance by correctly identifying 9649 of 13,765 (70%) at-risk patients who did not receive any preventative bone-health-related medical interventions from their physicians. Conclusions: These findings indicate that it is possible to use a patient’s unique medical history as it changes over time to predict the risk of short-term fracture. Validating and applying such a tool within the health care system could enable automated and widespread prediction of this risk and may help with identification of patients at very high risk of fracture. 
UR - http://www.jmir.org/2020/10/e22550/ UR - https://doi.org/10.2196/22550 UR - http://www.ncbi.nlm.nih.gov/pubmed/32956069 DO - 10.2196/22550 ID - info:doi/10.2196/22550 ER - TY - JOUR AU - Liu, Ping-Yen AU - Tsai, Yi-Shan AU - Chen, Po-Lin AU - Tsai, Huey-Pin AU - Hsu, Ling-Wei AU - Wang, Chi-Shiang AU - Lee, Nan-Yao AU - Huang, Mu-Shiang AU - Wu, Yun-Chiao AU - Ko, Wen-Chien AU - Yang, Yi-Ching AU - Chiang, Jung-Hsien AU - Shen, Meng-Ru PY - 2020 DA - 2020/10/14 TI - Application of an Artificial Intelligence Trilogy to Accelerate Processing of Suspected Patients With SARS-CoV-2 at a Smart Quarantine Station: Observational Study JO - J Med Internet Res SP - e19878 VL - 22 IS - 10 KW - SARS-CoV-2 KW - COVID-19 KW - artificial intelligence KW - smart device assisted decision making KW - quarantine station AB - Background: As the COVID-19 epidemic increases in severity, the burden of quarantine stations outside emergency departments (EDs) at hospitals is increasing daily. To address the high screening workload at quarantine stations, all staff members with medical licenses are required to work shifts in these stations. Therefore, it is necessary to simplify the workflow and decision-making process for physicians and surgeons from all subspecialties. Objective: The aim of this paper is to demonstrate how the National Cheng Kung University Hospital artificial intelligence (AI) trilogy of diversion to a smart quarantine station, AI-assisted image interpretation, and a built-in clinical decision-making algorithm improves medical care and reduces quarantine processing times. Methods: This observational study on the emerging COVID-19 pandemic included 643 patients. An “AI trilogy” of diversion to a smart quarantine station, AI-assisted image interpretation, and a built-in clinical decision-making algorithm on a tablet computer was applied to shorten the quarantine survey process and reduce processing time during the COVID-19 pandemic. 
Results: The use of the AI trilogy facilitated the processing of suspected cases of COVID-19 with or without symptoms; also, travel, occupation, contact, and clustering histories were obtained with the tablet computer device. A separate AI-mode function that could quickly recognize pulmonary infiltrates on chest x-rays was merged into the smart clinical assisting system (SCAS), and this model was subsequently trained with COVID-19 pneumonia cases from the GitHub open source data set. The detection rates for posteroanterior and anteroposterior chest x-rays were 55/59 (93%) and 5/11 (45%), respectively. The SCAS algorithm was continuously adjusted based on updates to the Taiwan Centers for Disease Control public safety guidelines for faster clinical decision making. Our ex vivo study demonstrated the efficiency of disinfecting the tablet computer surface by wiping it twice with 75% alcohol sanitizer. To further analyze the impact of the AI application in the quarantine station, we subdivided the station group into groups with or without AI. Compared with the conventional ED (n=281), the survey time at the quarantine station (n=1520) was significantly shortened; the median survey time at the ED was 153 minutes (95% CI 108.5-205.0), vs 35 minutes at the quarantine station (95% CI 24-56; P<.001). Furthermore, the use of the AI application in the quarantine station reduced the survey time in the quarantine station; the median survey time without AI was 101 minutes (95% CI 40-153), vs 34 minutes (95% CI 24-53) with AI in the quarantine station (P<.001). Conclusions: The AI trilogy improved our medical care workflow by shortening the quarantine survey process and reducing the processing time, which is especially important during an emerging infectious disease epidemic. 
UR - http://www.jmir.org/2020/10/e19878/ UR - https://doi.org/10.2196/19878 UR - http://www.ncbi.nlm.nih.gov/pubmed/33001832 DO - 10.2196/19878 ID - info:doi/10.2196/19878 ER - TY - JOUR AU - Xiu, Xiaolei AU - Qian, Qing AU - Wu, Sizhu PY - 2020 DA - 2020/10/7 TI - Construction of a Digestive System Tumor Knowledge Graph Based on Chinese Electronic Medical Records: Development and Usability Study JO - JMIR Med Inform SP - e18287 VL - 8 IS - 10 KW - Chinese electronic medical records KW - knowledge graph KW - digestive system tumor KW - graph evaluation AB - Background: With the increasing incidences and mortality of digestive system tumor diseases in China, ways to use clinical experience data in Chinese electronic medical records (CEMRs) to determine potentially effective relationships between diagnosis and treatment have become a priority. As an important part of artificial intelligence, a knowledge graph is a powerful tool for information processing and knowledge organization that provides an ideal means to solve this problem. Objective: This study aimed to construct a semantic-driven digestive system tumor knowledge graph (DSTKG) to represent the knowledge in CEMRs with fine granularity and semantics. Methods: This paper focuses on the knowledge graph schema and semantic relationships that were the main challenges for constructing a Chinese tumor knowledge graph. The DSTKG was developed through a multistep procedure. As an initial step, a complete DSTKG construction framework based on CEMRs was proposed. Then, this research built a knowledge graph schema containing 7 classes and 16 kinds of semantic relationships and accomplished the DSTKG by knowledge extraction, named entity linking, and drawing the knowledge graph. Finally, the quality of the DSTKG was evaluated from 3 aspects: data layer, schema layer, and application layer. Results: Experts agreed that the DSTKG was good overall (mean score 4.20). 
Especially for the aspects of “rationality of schema structure,” “scalability,” and “readability of results,” the DSTKG performed well, with scores of 4.72, 4.67, and 4.69, respectively, which were much higher than the average. However, the small amount of data in the DSTKG negatively affected its “practicability” score. Compared with other Chinese tumor knowledge graphs, the DSTKG can represent more granular entities, properties, and semantic relationships. In addition, the DSTKG was flexible, allowing personalized customization to meet the designer's focus on specific interests in the digestive system tumor. Conclusions: We constructed a granular semantic DSTKG. It could provide guidance for the construction of a tumor knowledge graph and provide a preliminary step for the intelligent application of knowledge graphs based on CEMRs. Additional data sources and stronger research on assertion classification are needed to gain insight into the DSTKG’s potential. UR - http://medinform.jmir.org/2020/10/e18287/ UR - https://doi.org/10.2196/18287 UR - http://www.ncbi.nlm.nih.gov/pubmed/33026359 DO - 10.2196/18287 ID - info:doi/10.2196/18287 ER - TY - JOUR AU - Zhang, Jingwen AU - Oh, Yoo Jung AU - Lange, Patrick AU - Yu, Zhou AU - Fukuoka, Yoshimi PY - 2020 DA - 2020/9/30 TI - Artificial Intelligence Chatbot Behavior Change Model for Designing Artificial Intelligence Chatbots to Promote Physical Activity and a Healthy Diet: Viewpoint JO - J Med Internet Res SP - e22845 VL - 22 IS - 9 KW - chatbot KW - conversational agent KW - artificial intelligence KW - physical activity KW - diet KW - intervention KW - behavior change KW - natural language processing KW - communication AB - Background: Chatbots empowered by artificial intelligence (AI) can increasingly engage in natural conversations and build relationships with users. 
Applying AI chatbots to lifestyle modification programs is one of the promising areas to develop cost-effective and feasible behavior interventions to promote physical activity and a healthy diet. Objective: The purposes of this perspective paper are to present a brief literature review of chatbot use in promoting physical activity and a healthy diet, describe the AI chatbot behavior change model our research team developed based on extensive interdisciplinary research, and discuss ethical principles and considerations. Methods: We conducted a preliminary search of studies reporting chatbots for improving physical activity and/or diet in four databases in July 2020. We summarized the characteristics of the chatbot studies and reviewed recent developments in human-AI communication research and innovations in natural language processing. Based on the identified gaps and opportunities, as well as our own clinical and research experience and findings, we propose an AI chatbot behavior change model. Results: Our review found a lack of understanding around theoretical guidance and practical recommendations on designing AI chatbots for lifestyle modification programs. The proposed AI chatbot behavior change model consists of the following four components to provide such guidance: (1) designing chatbot characteristics and understanding user background; (2) building relational capacity; (3) building persuasive conversational capacity; and (4) evaluating mechanisms and outcomes. The rationale and evidence supporting the design and evaluation choices for this model are presented in this paper. Conclusions: As AI chatbots become increasingly integrated into various digital communications, our proposed theoretical framework is the first step to conceptualize the scope of utilization in health behavior change domains and to synthesize all possible dimensions of chatbot features to inform intervention design and evaluation. 
There is a need for more interdisciplinary work to continue developing AI techniques to improve a chatbot’s relational and persuasive capacities to change physical activity and diet behaviors with strong ethical principles. UR - https://www.jmir.org/2020/9/e22845 UR - https://doi.org/10.2196/22845 UR - http://www.ncbi.nlm.nih.gov/pubmed/32996892 DO - 10.2196/22845 ID - info:doi/10.2196/22845 ER - TY - JOUR AU - Kühnle, Lara AU - Mücke, Urs AU - Lechner, Werner M AU - Klawonn, Frank AU - Grigull, Lorenz PY - 2020 DA - 2020/9/29 TI - Development of a Social Network for People Without a Diagnosis (RarePairs): Evaluation Study JO - J Med Internet Res SP - e21849 VL - 22 IS - 9 KW - rare disease KW - diagnostic support tool KW - prototype KW - social network KW - machine learning KW - artificial intelligence AB - Background: Diagnostic delay in rare disease (RD) is common, occasionally lasting up to more than 20 years. In attempting to reduce it, diagnostic support tools have been studied extensively. However, social platforms have not yet been used for systematic diagnostic support. This paper illustrates the development and prototypic application of a social network using scientifically developed questions to match individuals without a diagnosis. Objective: The study aimed to outline, create, and evaluate a prototype tool (a social network platform named RarePairs), helping patients with undiagnosed RDs to find individuals with similar symptoms. The prototype includes a matching algorithm, bringing together individuals with similar disease burden in the lead-up to diagnosis. Methods: We divided our project into 4 phases. In phase 1, we used known data and findings in the literature to understand and specify the context of use. In phase 2, we specified the user requirements. In phase 3, we designed a prototype based on the results of phases 1 and 2, as well as incorporating a state-of-the-art questionnaire with 53 items for recognizing an RD. 
Lastly, we evaluated this prototype with a data set of 973 questionnaires from individuals suffering from different RDs using 24 distance calculating methods. Results: Based on a step-by-step construction process, the digital patient platform prototype, RarePairs, was developed. In order to match individuals with similar experiences, it uses answer patterns generated by a specifically designed questionnaire (Q53). A total of 973 questionnaires answered by patients with RDs were used to construct and test an artificial intelligence (AI) algorithm like the k-nearest neighbor search. With this, we found matches for every single one of the 973 records. The cross-validation of those matches showed that the algorithm outperforms random matching significantly. Statistically, for every data set the algorithm found at least one other record (match) with the same diagnosis. Conclusions: Diagnostic delay is torturous for patients without a diagnosis. Shortening the delay is important for both doctors and patients. Diagnostic support using AI can be promoted differently. The prototype of the social media platform RarePairs might be a low-threshold patient platform, and proved suitable to match and connect different individuals with comparable symptoms. This exchange promoted through RarePairs might be used to speed up the diagnostic process. Further studies include its evaluation in a prospective setting and implementation of RarePairs as a mobile phone app. 
UR - http://www.jmir.org/2020/9/e21849/ UR - https://doi.org/10.2196/21849 UR - http://www.ncbi.nlm.nih.gov/pubmed/32990634 DO - 10.2196/21849 ID - info:doi/10.2196/21849 ER - TY - JOUR AU - Li, Rui AU - Yin, Changchang AU - Yang, Samuel AU - Qian, Buyue AU - Zhang, Ping PY - 2020 DA - 2020/9/28 TI - Marrying Medical Domain Knowledge With Deep Learning on Electronic Health Records: A Deep Visual Analytics Approach JO - J Med Internet Res SP - e20645 VL - 22 IS - 9 KW - electronic health records KW - interpretable deep learning KW - knowledge graph KW - visual analytics AB - Background: Deep learning models have attracted significant interest from health care researchers during the last few decades. There have been many studies that apply deep learning to medical applications and achieve promising results. However, there are three limitations to the existing models: (1) most clinicians are unable to interpret the results from the existing models, (2) existing models cannot incorporate complicated medical domain knowledge (eg, a disease causes another disease), and (3) most existing models lack visual exploration and interaction. Both the electronic health record (EHR) data set and the deep model results are complex and abstract, which impedes clinicians from exploring and communicating with the model directly. Objective: The objective of this study is to develop an interpretable and accurate risk prediction model as well as an interactive clinical prediction system to support EHR data exploration, knowledge graph demonstration, and model interpretation. Methods: A domain-knowledge–guided recurrent neural network (DG-RNN) model is proposed to predict clinical risks. The model takes medical event sequences as input and incorporates medical domain knowledge by attending to a subgraph of the whole medical knowledge graph. A global pooling operation and a fully connected layer are used to output the clinical outcomes. 
The middle results and the parameters of the fully connected layer are helpful in identifying which medical events cause clinical risks. DG-Viz is also designed to support EHR data exploration, knowledge graph demonstration, and model interpretation. Results: We conducted both risk prediction experiments and a case study on a real-world data set. A total of 554 patients with heart failure and 1662 control patients without heart failure were selected from the data set. The experimental results show that the proposed DG-RNN outperforms the state-of-the-art approaches by approximately 1.5%. The case study demonstrates how our medical physician collaborator can effectively explore the data and interpret the prediction results using DG-Viz. Conclusions: In this study, we present DG-Viz, an interactive clinical prediction system, which brings together the power of deep learning (ie, a DG-RNN–based model) and visual analytics to predict clinical risks and visually interpret the EHR prediction results. Experimental results and a case study on heart failure risk prediction tasks demonstrate the effectiveness and usefulness of the DG-Viz system. This study will pave the way for interactive, interpretable, and accurate clinical risk predictions. UR - http://www.jmir.org/2020/9/e20645/ UR - https://doi.org/10.2196/20645 UR - http://www.ncbi.nlm.nih.gov/pubmed/32985996 DO - 10.2196/20645 ID - info:doi/10.2196/20645 ER - TY - JOUR AU - Kriventsov, Stan AU - Lindsey, Alexander AU - Hayeri, Amir PY - 2020 DA - 2020/9/22 TI - The Diabits App for Smartphone-Assisted Predictive Monitoring of Glycemia in Patients With Diabetes: Retrospective Observational Study JO - JMIR Diabetes SP - e18660 VL - 5 IS - 3 KW - blood glucose predictions KW - type 1 diabetes KW - artificial intelligence KW - machine learning KW - digital health KW - mobile phone AB - Background: Diabetes mellitus, which causes dysregulation of blood glucose in humans, is a major public health challenge. 
Patients with diabetes must monitor their glycemic levels to keep them in a healthy range. This task is made easier by using continuous glucose monitoring (CGM) devices and relaying their output to smartphone apps, thus providing users with real-time information on their glycemic fluctuations and possibly predicting future trends. Objective: This study aims to discuss various challenges of predictive monitoring of glycemia and examines the accuracy and blood glucose control effects of Diabits, a smartphone app that helps patients with diabetes monitor and manage their blood glucose levels in real time. Methods: Using data from CGM devices and user input, Diabits applies machine learning techniques to create personalized patient models and predict blood glucose fluctuations up to 60 min in advance. These predictions give patients an opportunity to take pre-emptive action to maintain their blood glucose values within the reference range. In this retrospective observational cohort study, the predictive accuracy of Diabits and the correlation between daily use of the app and blood glucose control metrics were examined based on real app users’ data. Moreover, the accuracy of predictions on the 2018 Ohio T1DM (type 1 diabetes mellitus) data set was calculated and compared against other published results. Results: On the basis of more than 6.8 million data points, 30-min Diabits predictions evaluated using Parkes Error Grid were found to be 86.89% (5,963,930/6,864,130) clinically accurate (zone A) and 99.56% (6,833,625/6,864,130) clinically acceptable (zones A and B), whereas 60-min predictions were 70.56% (4,843,605/6,864,130) clinically accurate and 97.49% (6,692,165/6,864,130) clinically acceptable. By analyzing daily use statistics and CGM data for the 280 most long-standing users of Diabits, it was established that under free-living conditions, many common blood glucose control metrics improved with increased frequency of app use. 
For instance, the average blood glucose for the days these users did not interact with the app was 154.0 (SD 47.2) mg/dL, with 67.52% of the time spent in the healthy 70 to 180 mg/dL range. For days with 10 or more Diabits sessions, the average blood glucose decreased to 141.6 (SD 42.0) mg/dL (P<.001), whereas the time in euglycemic range increased to 74.28% (P<.001). On the Ohio T1DM data set of 6 patients with type 1 diabetes, 30-min predictions of the base Diabits model had an average root mean square error of 18.68 (SD 2.19) mg/dL, which is an improvement over the published state-of-the-art results for this data set. Conclusions: Diabits accurately predicts future glycemic fluctuations, potentially making it easier for patients with diabetes to maintain their blood glucose in the reference range. Furthermore, an improvement in glucose control was observed on days with more frequent Diabits use. UR - http://diabetes.jmir.org/2020/3/e18660/ UR - https://doi.org/10.2196/18660 UR - http://www.ncbi.nlm.nih.gov/pubmed/32960180 DO - 10.2196/18660 ID - info:doi/10.2196/18660 ER - TY - JOUR AU - Li, Juan AU - Maharjan, Bikesh AU - Xie, Bo AU - Tao, Cui PY - 2020 DA - 2020/9/21 TI - A Personalized Voice-Based Diet Assistant for Caregivers of Alzheimer Disease and Related Dementias: System Development and Validation JO - J Med Internet Res SP - e19897 VL - 22 IS - 9 KW - Alzheimer disease KW - dementia KW - diet KW - knowledge KW - ontology KW - voice assistant AB - Background: The world’s aging population is increasing, with an expected increase in the prevalence of Alzheimer disease and related dementias (ADRD). Proper nutrition and good eating behavior show promise for preventing and slowing the progression of ADRD and consequently improving patients with ADRD’s health status and quality of life. Most ADRD care is provided by informal caregivers, so assisting caregivers to manage patients with ADRD’s diet is important. 
Objective: This study aims to design, develop, and test an artificial intelligence–powered voice assistant to help informal caregivers manage the daily diet of patients with ADRD and learn food- and nutrition-related knowledge. Methods: The voice assistant is being implemented in several steps: construction of a comprehensive knowledge base, with ontologies that define ADRD diet care and user profiles, extended with external knowledge graphs; management of conversation between users and the voice assistant; personalized ADRD diet services provided through a semantics-based knowledge graph search and reasoning engine; and system evaluation in use cases with additional qualitative evaluations. Results: A prototype voice assistant was evaluated in the lab using various use cases. Preliminary qualitative test results demonstrate reasonable rates of dialogue success and recommendation correctness. Conclusions: The voice assistant provides a natural, interactive interface for users, and it does not require the user to have a technical background, which may facilitate senior caregivers’ use in their daily care tasks. This study suggests the feasibility of using the intelligent voice assistant to help caregivers manage the diet of patients with ADRD. 
UR - http://www.jmir.org/2020/9/e19897/ UR - https://doi.org/10.2196/19897 UR - http://www.ncbi.nlm.nih.gov/pubmed/32955452 DO - 10.2196/19897 ID - info:doi/10.2196/19897 ER - TY - JOUR AU - Bang, Chang Seok AU - Lee, Jae Jun AU - Baik, Gwang Ho PY - 2020 DA - 2020/9/16 TI - Artificial Intelligence for the Prediction of Helicobacter Pylori Infection in Endoscopic Images: Systematic Review and Meta-Analysis Of Diagnostic Test Accuracy JO - J Med Internet Res SP - e21983 VL - 22 IS - 9 KW - artificial intelligence KW - convolutional neural network KW - deep learning KW - machine learning KW - endoscopy KW - Helicobacter pylori AB - Background: Helicobacter pylori plays a central role in the development of gastric cancer, and prediction of H pylori infection by visual inspection of the gastric mucosa is an important function of endoscopy. However, there are currently no established methods of optical diagnosis of H pylori infection using endoscopic images. Definitive diagnosis requires endoscopic biopsy. Artificial intelligence (AI) has been increasingly adopted in clinical practice, especially for image recognition and classification. Objective: This study aimed to evaluate the diagnostic test accuracy of AI for the prediction of H pylori infection using endoscopic images. Methods: Two independent evaluators searched core databases. The inclusion criteria included studies with endoscopic images of H pylori infection and with application of AI for the prediction of H pylori infection presenting diagnostic performance. Systematic review and diagnostic test accuracy meta-analysis were performed. Results: Ultimately, 8 studies were identified. Pooled sensitivity, specificity, diagnostic odds ratio, and area under the curve of AI for the prediction of H pylori infection were 0.87 (95% CI 0.72-0.94), 0.86 (95% CI 0.77-0.92), 40 (95% CI 15-112), and 0.92 (95% CI 0.90-0.94), respectively, in the 1719 patients (385 patients with H pylori infection vs 1334 controls). 
Meta-regression identified methodological quality and the number of patients included in each study as sources of heterogeneity. There was no evidence of publication bias. The accuracy of the AI algorithm reached 82% for discrimination between noninfected images and posteradication images. Conclusions: An AI algorithm is a reliable tool for endoscopic diagnosis of H pylori infection. The limitations of lacking external validation performance and being conducted only in Asia should be overcome. Trial Registration: PROSPERO CRD42020175957; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=175957 UR - http://www.jmir.org/2020/9/e21983/ UR - https://doi.org/10.2196/21983 UR - http://www.ncbi.nlm.nih.gov/pubmed/32936088 DO - 10.2196/21983 ID - info:doi/10.2196/21983 ER - TY - JOUR AU - Zhang, Liang AU - Qu, Yue AU - Jin, Bo AU - Jing, Lu AU - Gao, Zhan AU - Liang, Zhanhua PY - 2020 DA - 2020/9/16 TI - An Intelligent Mobile-Enabled System for Diagnosing Parkinson Disease: Development and Validation of a Speech Impairment Detection System JO - JMIR Med Inform SP - e18689 VL - 8 IS - 9 KW - Parkinson disease KW - speech disorder KW - remote diagnosis KW - artificial intelligence KW - mobile phone app KW - mobile health AB - Background: Parkinson disease (PD) is one of the most common neurological diseases. At present, because the exact cause is still unclear, accurate diagnosis and progression monitoring remain challenging. In recent years, exploring the relationship between PD and speech impairment has attracted widespread attention in the academic world. Most of the studies successfully validated the effectiveness of some vocal features. Moreover, the noninvasive nature of speech signal–based testing has pioneered a new way for telediagnosis and telemonitoring. In particular, there is an increasing demand for artificial intelligence–powered tools in the digital health era. 
Objective: This study aimed to build a real-time speech signal analysis tool for PD diagnosis and severity assessment. Further, the underlying system should be flexible enough to integrate any machine learning or deep learning algorithm. Methods: At its core, the system we built consists of two parts: (1) speech signal processing: both traditional and novel speech signal processing technologies have been employed for feature engineering, which can automatically extract a few linear and nonlinear dysphonia features, and (2) application of machine learning algorithms: some classical regression and classification algorithms from the machine learning field have been tested; we then chose the most efficient algorithms and relevant features. Results: Experimental results showed that our system had an outstanding ability to both diagnose and assess severity of PD. By using both linear and nonlinear dysphonia features, the accuracy reached 88.74% and recall reached 97.03% in the diagnosis task. Meanwhile, mean absolute error was 3.7699 in the assessment task. The system has already been deployed within a mobile app called No Pa. Conclusions: This study performed diagnosis and severity assessment of PD from the perspective of speech order detection. The efficiency and effectiveness of the algorithms indirectly validated the practicality of the system. In particular, the system reflects the necessity of a publicly accessible PD diagnosis and assessment system that can perform telediagnosis and telemonitoring of PD. This system can also optimize doctors’ decision-making processes regarding treatments. 
UR - http://medinform.jmir.org/2020/9/e18689/ UR - https://doi.org/10.2196/18689 UR - http://www.ncbi.nlm.nih.gov/pubmed/32936086 DO - 10.2196/18689 ID - info:doi/10.2196/18689 ER - TY - JOUR AU - Park, Eunjeong AU - Lee, Kijeong AU - Han, Taehwa AU - Nam, Hyo Suk PY - 2020 DA - 2020/9/16 TI - Automatic Grading of Stroke Symptoms for Rapid Assessment Using Optimized Machine Learning and 4-Limb Kinematics: Clinical Validation Study JO - J Med Internet Res SP - e20641 VL - 22 IS - 9 KW - machine learning KW - artificial intelligence KW - sensors KW - kinematics KW - stroke KW - telemedicine AB - Background: Subtle abnormal motor signs are indications of serious neurological diseases. Although neurological deficits require fast initiation of treatment in a restricted time, it is difficult for nonspecialists to detect and objectively assess the symptoms. In the clinical environment, diagnoses and decisions are based on clinical grading methods, including the National Institutes of Health Stroke Scale (NIHSS) score or the Medical Research Council (MRC) score, which have been used to measure motor weakness. Objective grading in various environments is necessitated for consistent agreement among patients, caregivers, paramedics, and medical staff to facilitate rapid diagnoses and dispatches to appropriate medical centers. Objective: In this study, we aimed to develop an autonomous grading system for stroke patients. We investigated the feasibility of our new system to assess motor weakness and grade NIHSS and MRC scores of 4 limbs, similar to the clinical examinations performed by medical staff. Methods: We implemented an automatic grading system composed of a measuring unit with wearable sensors and a grading unit with optimized machine learning. Inertial sensors were attached to measure subtle weaknesses caused by paralysis of upper and lower limbs. 
We collected 60 instances of data with kinematic features of motor disorders from neurological examination and demographic information of stroke patients with NIHSS 0 or 1 and MRC 7, 8, or 9 grades in a stroke unit. Training data with 240 instances were generated using a synthetic minority oversampling technique to complement the imbalanced number of data between classes and low number of training data. We trained 2 representative machine learning algorithms, an ensemble and a support vector machine (SVM), to implement auto-NIHSS and auto-MRC grading. The optimized algorithms performed a 5-fold cross-validation and were searched by Bayes optimization in 30 trials. The trained model was tested with the 60 original hold-out instances for performance evaluation in accuracy, sensitivity, specificity, and area under the receiver operating characteristics curve (AUC). Results: The proposed system can grade NIHSS scores with an accuracy of 83.3% and an AUC of 0.912 using an optimized ensemble algorithm, and it can grade with an accuracy of 80.0% and an AUC of 0.860 using an optimized SVM algorithm. The auto-MRC grading achieved an accuracy of 76.7% and a mean AUC of 0.870 in SVM classification and an accuracy of 78.3% and a mean AUC of 0.877 in ensemble classification. Conclusions: The automatic grading system quantifies proximal weakness in real time and assesses symptoms through automatic grading. The pilot outcomes demonstrated the feasibility of remote monitoring of motor weakness caused by stroke. The system can facilitate consistent grading with instant assessment and expedite dispatches to appropriate hospitals and treatment initiation by sharing auto-MRC and auto-NIHSS scores between prehospital and hospital responses as an objective observation. 
UR - http://www.jmir.org/2020/9/e20641/ UR - https://doi.org/10.2196/20641 UR - http://www.ncbi.nlm.nih.gov/pubmed/32936079 DO - 10.2196/20641 ID - info:doi/10.2196/20641 ER - TY - JOUR AU - Ferrario, Andrea AU - Demiray, Burcu AU - Yordanova, Kristina AU - Luo, Minxia AU - Martin, Mike PY - 2020 DA - 2020/9/15 TI - Social Reminiscence in Older Adults’ Everyday Conversations: Automated Detection Using Natural Language Processing and Machine Learning JO - J Med Internet Res SP - e19133 VL - 22 IS - 9 KW - aging KW - dementia KW - reminiscence KW - real-life conversations KW - electronically activated recorder (EAR) KW - natural language processing KW - machine learning KW - imbalanced learning AB - Background: Reminiscence is the act of thinking or talking about personal experiences that occurred in the past. It is a central task of old age that is essential for healthy aging, and it serves multiple functions, such as decision-making and introspection, transmitting life lessons, and bonding with others. The study of social reminiscence behavior in everyday life can be used to generate data and detect reminiscence from general conversations. Objective: The aims of this original paper are to (1) preprocess coded transcripts of conversations in German of older adults with natural language processing (NLP), and (2) implement and evaluate learning strategies using different NLP features and machine learning algorithms to detect reminiscence in a corpus of transcripts. Methods: The methods in this study comprise (1) collecting and coding of transcripts of older adults’ conversations in German, (2) preprocessing transcripts to generate NLP features (bag-of-words models, part-of-speech tags, pretrained German word embeddings), and (3) training machine learning models to detect reminiscence using random forests, support vector machines, and adaptive and extreme gradient boosting algorithms. The data set comprises 2214 transcripts, including 109 transcripts with reminiscence. 
Due to class imbalance in the data, we introduced three learning strategies: (1) class-weighted learning, (2) a meta-classifier consisting of a voting ensemble, and (3) data augmentation with the Synthetic Minority Oversampling Technique (SMOTE) algorithm. For each learning strategy, we performed cross-validation on a random sample of the training data set of transcripts. We computed the area under the curve (AUC), the average precision (AP), precision, recall, as well as F1 score and specificity measures on the test data, for all combinations of NLP features, algorithms, and learning strategies. Results: Class-weighted support vector machines on bag-of-words features outperformed all other classifiers (AUC=0.91, AP=0.56, precision=0.5, recall=0.45, F1=0.48, specificity=0.98), followed by support vector machines on SMOTE-augmented data and word embeddings features (AUC=0.89, AP=0.54, precision=0.35, recall=0.59, F1=0.44, specificity=0.94). For the meta-classifier strategy, adaptive and extreme gradient boosting algorithms trained on word embeddings and bag-of-words outperformed all other classifiers and NLP features; however, the performance of the meta-classifier learning strategy was lower compared to other strategies, with highly imbalanced precision-recall trade-offs. Conclusions: This study provides evidence of the applicability of NLP and machine learning pipelines for the automated detection of reminiscence in older adults’ everyday conversations in German. The methods and findings of this study could be relevant for designing unobtrusive computer systems for the real-time detection of social reminiscence in the everyday life of older adults and classifying their functions. 
With further improvements, these systems could be deployed in health interventions aimed at improving older adults’ well-being by promoting self-reflection and suggesting coping strategies to be used in the case of dysfunctional reminiscence cases, which can undermine physical and mental health. UR - http://www.jmir.org/2020/9/e19133/ UR - https://doi.org/10.2196/19133 UR - http://www.ncbi.nlm.nih.gov/pubmed/32866108 DO - 10.2196/19133 ID - info:doi/10.2196/19133 ER - TY - JOUR AU - Shen, Jiayi AU - Chen, Jiebin AU - Zheng, Zequan AU - Zheng, Jiabin AU - Liu, Zherui AU - Song, Jian AU - Wong, Sum Yi AU - Wang, Xiaoling AU - Huang, Mengqi AU - Fang, Po-Han AU - Jiang, Bangsheng AU - Tsang, Winghei AU - He, Zonglin AU - Liu, Taoran AU - Akinwunmi, Babatunde AU - Wang, Chi Chiu AU - Zhang, Casper J P AU - Huang, Jian AU - Ming, Wai-Kit PY - 2020 DA - 2020/9/15 TI - An Innovative Artificial Intelligence–Based App for the Diagnosis of Gestational Diabetes Mellitus (GDM-AI): Development Study JO - J Med Internet Res SP - e21573 VL - 22 IS - 9 KW - AI KW - application KW - disease diagnosis KW - maternal health care KW - artificial intelligence KW - app KW - women KW - rural KW - innovation KW - diabetes KW - gestational diabetes KW - diagnosis AB - Background: Gestational diabetes mellitus (GDM) can cause adverse consequences to both mothers and their newborns. However, pregnant women living in low- and middle-income areas or countries often fail to receive early clinical interventions at local medical facilities due to restricted availability of GDM diagnosis. The outstanding performance of artificial intelligence (AI) in disease diagnosis in previous studies demonstrates its promising applications in GDM diagnosis. Objective: This study aims to investigate the implementation of a well-performing AI algorithm in GDM diagnosis in a setting, which requires fewer medical equipment and staff and to establish an app based on the AI algorithm. 
This study also explores possible progress if our app is widely used. Methods: An AI model that included 9 algorithms was trained on 12,304 pregnant outpatients with their consent who received a test for GDM in the obstetrics and gynecology department of the First Affiliated Hospital of Jinan University, a local hospital in South China, between November 2010 and October 2017. GDM was diagnosed according to American Diabetes Association (ADA) 2011 diagnostic criteria. Age and fasting blood glucose were chosen as critical parameters. For validation, we performed k-fold cross-validation (k=5) for the internal dataset and an external validation dataset that included 1655 cases from the Prince of Wales Hospital, the affiliated teaching hospital of the Chinese University of Hong Kong, a non-local hospital. Accuracy, sensitivity, and other criteria were calculated for each algorithm. Results: The areas under the receiver operating characteristic curve (AUROC) of external validation dataset for support vector machine (SVM), random forest, AdaBoost, k-nearest neighbors (kNN), naive Bayes (NB), decision tree, logistic regression (LR), eXtreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT) were 0.780, 0.657, 0.736, 0.669, 0.774, 0.614, 0.769, 0.742, and 0.757, respectively. SVM also retained high performance in other criteria. The specificity for SVM retained 100% in the external validation set with an accuracy of 88.7%. Conclusions: Our prospective and multicenter study is the first clinical study that supports the GDM diagnosis for pregnant women in resource-limited areas, using only fasting blood glucose value, patients’ age, and a smartphone connected to the internet. Our study proved that SVM can achieve accurate diagnosis with less operation cost and higher efficacy. 
Our study (referred to as GDM-AI study, ie, the study of AI-based diagnosis of GDM) also shows our app has a promising future in improving the quality of maternal health for pregnant women, precision medicine, and long-distance medical care. We recommend future work should expand the dataset scope and replicate the process to validate the performance of the AI algorithms. UR - https://www.jmir.org/2020/9/e21573 UR - https://doi.org/10.2196/21573 UR - http://www.ncbi.nlm.nih.gov/pubmed/32930674 DO - 10.2196/21573 ID - info:doi/10.2196/21573 ER - TY - JOUR AU - Schachner, Theresa AU - Keller, Roman AU - v Wangenheim, Florian PY - 2020 DA - 2020/9/14 TI - Artificial Intelligence-Based Conversational Agents for Chronic Conditions: Systematic Literature Review JO - J Med Internet Res SP - e20701 VL - 22 IS - 9 KW - artificial intelligence KW - conversational agents KW - chatbots KW - healthcare KW - chronic diseases KW - systematic literature review AB - Background: A rising number of conversational agents or chatbots are equipped with artificial intelligence (AI) architecture. They are increasingly prevalent in health care applications such as those providing education and support to patients with chronic diseases, one of the leading causes of death in the 21st century. AI-based chatbots enable more effective and frequent interactions with such patients. Objective: The goal of this systematic literature review is to review the characteristics, health care conditions, and AI architectures of AI-based conversational agents designed specifically for chronic diseases. Methods: We conducted a systematic literature review using PubMed MEDLINE, EMBASE, PsycINFO, CINAHL, ACM Digital Library, ScienceDirect, and Web of Science. We applied a predefined search strategy using the terms “conversational agent,” “healthcare,” “artificial intelligence,” and their synonyms. We updated the search results using Google alerts, and screened reference lists for other relevant articles. 
We included primary research studies that involved the prevention, treatment, or rehabilitation of chronic diseases, involved a conversational agent, and included any kind of AI architecture. Two independent reviewers conducted screening and data extraction, and Cohen kappa was used to measure interrater agreement. A narrative approach was applied for data synthesis. Results: The literature search found 2052 articles, out of which 10 papers met the inclusion criteria. The small number of identified studies together with the prevalence of quasi-experimental studies (n=7) and prevailing prototype nature of the chatbots (n=7) revealed the immaturity of the field. The reported chatbots addressed a broad variety of chronic diseases (n=6), showcasing a tendency to develop specialized conversational agents for individual chronic conditions. However, there lacks comparison of these chatbots within and between chronic diseases. In addition, the reported evaluation measures were not standardized, and the addressed health goals showed a large range. Together, these study characteristics complicated comparability and open room for future research. While natural language processing represented the most used AI technique (n=7) and the majority of conversational agents allowed for multimodal interaction (n=6), the identified studies demonstrated broad heterogeneity, lack of depth of reported AI techniques and systems, and inconsistent usage of taxonomy of the underlying AI software, further aggravating comparability and generalizability of study results. Conclusions: The literature on AI-based conversational agents for chronic conditions is scarce and mostly consists of quasi-experimental studies with chatbots in prototype stage that use natural language processing and allow for multimodal user interaction. Future research could profit from evidence-based evaluation of the AI-based conversational agents and comparison thereof within and between different chronic health conditions. 
Besides increased comparability, the quality of chatbots developed for specific chronic conditions and their subsequent impact on the target patients could be enhanced by more structured development and standardized evaluation processes. UR - http://www.jmir.org/2020/9/e20701/ UR - https://doi.org/10.2196/20701 UR - http://www.ncbi.nlm.nih.gov/pubmed/32924957 DO - 10.2196/20701 ID - info:doi/10.2196/20701 ER - TY - JOUR AU - Maron, Roman C AU - Utikal, Jochen S AU - Hekler, Achim AU - Hauschild, Axel AU - Sattler, Elke AU - Sondermann, Wiebke AU - Haferkamp, Sebastian AU - Schilling, Bastian AU - Heppt, Markus V AU - Jansen, Philipp AU - Reinholz, Markus AU - Franklin, Cindy AU - Schmitt, Laurenz AU - Hartmann, Daniela AU - Krieghoff-Henning, Eva AU - Schmitt, Max AU - Weichenthal, Michael AU - von Kalle, Christof AU - Fröhling, Stefan AU - Brinker, Titus J PY - 2020 DA - 2020/9/11 TI - Artificial Intelligence and Its Effect on Dermatologists’ Accuracy in Dermoscopic Melanoma Image Classification: Web-Based Survey Study JO - J Med Internet Res SP - e18091 VL - 22 IS - 9 KW - artificial intelligence KW - machine learning KW - deep learning KW - neural network KW - dermatology KW - diagnosis KW - nevi KW - melanoma KW - skin neoplasm AB - Background: Early detection of melanoma can be lifesaving but this remains a challenge. Recent diagnostic studies have revealed the superiority of artificial intelligence (AI) in classifying dermoscopic images of melanoma and nevi, concluding that these algorithms should assist a dermatologist’s diagnoses. Objective: The aim of this study was to investigate whether AI support improves the accuracy and overall diagnostic performance of dermatologists in the dichotomous image–based discrimination between melanoma and nevus. 
Methods: Twelve board-certified dermatologists were presented disjoint sets of 100 unique dermoscopic images of melanomas and nevi (total of 1200 unique images), and they had to classify the images based on personal experience alone (part I) and with the support of a trained convolutional neural network (CNN, part II). Additionally, dermatologists were asked to rate their confidence in their final decision for each image. Results: While the mean specificity of the dermatologists based on personal experience alone remained almost unchanged (70.6% vs 72.4%; P=.54) with AI support, the mean sensitivity and mean accuracy increased significantly (59.4% vs 74.6%; P=.003 and 65.0% vs 73.6%; P=.002, respectively) with AI support. Out of the 10% (10/94; 95% CI 8.4%-11.8%) of cases where dermatologists were correct and AI was incorrect, dermatologists on average changed to the incorrect answer for 39% (4/10; 95% CI 23.2%-55.6%) of cases. When dermatologists were incorrect and AI was correct (25/94, 27%; 95% CI 24.0%-30.1%), dermatologists changed their answers to the correct answer for 46% (11/25; 95% CI 33.1%-58.4%) of cases. Additionally, the dermatologists’ average confidence in their decisions increased when the CNN confirmed their decision and decreased when the CNN disagreed, even when the dermatologists were correct. Reported values are based on the mean of all participants. Whenever absolute values are shown, the denominator and numerator are approximations as every dermatologist ended up rating a varying number of images due to a quality control step. Conclusions: The findings of our study show that AI support can improve the overall accuracy of the dermatologists in the dichotomous image–based discrimination between melanoma and nevus. 
This supports the argument for AI-based tools to aid clinicians in skin lesion classification and provides a rationale for studies of such classifiers in real-life settings, wherein clinicians can integrate additional information such as patient age and medical history into their decisions. UR - https://www.jmir.org/2020/9/e18091 UR - https://doi.org/10.2196/18091 UR - http://www.ncbi.nlm.nih.gov/pubmed/32915161 DO - 10.2196/18091 ID - info:doi/10.2196/18091 ER - TY - JOUR AU - Wilmink, Gerald AU - Dupey, Katherine AU - Alkire, Schon AU - Grote, Jeffrey AU - Zobel, Gregory AU - Fillit, Howard M AU - Movva, Satish PY - 2020 DA - 2020/9/10 TI - Artificial Intelligence–Powered Digital Health Platform and Wearable Devices Improve Outcomes for Older Adults in Assisted Living Communities: Pilot Intervention Study JO - JMIR Aging SP - e19554 VL - 3 IS - 2 KW - health technology KW - artificial intelligence KW - AI KW - preventive KW - senior technology KW - assisted living KW - long-term services KW - long-term care providers AB - Background: Wearables and artificial intelligence (AI)–powered digital health platforms that utilize machine learning algorithms can autonomously measure a senior’s change in activity and behavior and may be useful tools for proactive interventions that target modifiable risk factors. Objective: The goal of this study was to analyze how a wearable device and AI-powered digital health platform could provide improved health outcomes for older adults in assisted living communities. Methods: Data from 490 residents from six assisted living communities were analyzed retrospectively over 24 months. The intervention group (+CP) consisted of 3 communities that utilized CarePredict (n=256), and the control group (–CP) consisted of 3 communities (n=234) that did not utilize CarePredict. The following outcomes were measured and compared to baseline: hospitalization rate, fall rate, length of stay (LOS), and staff response time. 
Results: The residents of the +CP and –CP communities exhibit no statistical difference in age (P=.64), sex (P=.63), and staff service hours per resident (P=.94). The data show that the +CP communities exhibited a 39% lower hospitalization rate (P=.02), a 69% lower fall rate (P=.01), and a 67% greater length of stay (P=.03) than the –CP communities. The staff alert acknowledgment and reach resident times also improved in the +CP communities by 37% (P=.02) and 40% (P=.02), respectively. Conclusions: The AI-powered digital health platform provides the community staff with actionable information regarding each resident’s activities and behavior, which can be used to identify older adults that are at an increased risk for a health decline. Staff can use this data to intervene much earlier, protecting seniors from conditions that left untreated could result in hospitalization. In summary, the use of wearables and AI-powered digital health platform can contribute to improved health outcomes for seniors in assisted living communities. The accuracy of the system will be further validated in a larger trial. UR - http://aging.jmir.org/2020/2/e19554/ UR - https://doi.org/10.2196/19554 UR - http://www.ncbi.nlm.nih.gov/pubmed/32723711 DO - 10.2196/19554 ID - info:doi/10.2196/19554 ER - TY - JOUR AU - Mohammadi, Ramin AU - Atif, Mursal AU - Centi, Amanda Jayne AU - Agboola, Stephen AU - Jethwani, Kamal AU - Kvedar, Joseph AU - Kamarthi, Sagar PY - 2020 DA - 2020/9/8 TI - Neural Network–Based Algorithm for Adjusting Activity Targets to Sustain Exercise Engagement Among People Using Activity Trackers: Retrospective Observation and Algorithm Development Study JO - JMIR Mhealth Uhealth SP - e18142 VL - 8 IS - 9 KW - activity tracker KW - exercise engagement KW - dynamic activity target KW - neural network KW - activity target prediction KW - machine learning AB - Background: It is well established that lack of physical activity is detrimental to the overall health of an individual. 
Modern-day activity trackers enable individuals to monitor their daily activities to meet and maintain targets. This is expected to promote activity encouraging behavior, but the benefits of activity trackers attenuate over time due to waning adherence. One of the key approaches to improving adherence to goals is to motivate individuals to improve on their historic performance metrics. Objective: The aim of this work was to build a machine learning model to predict an achievable weekly activity target by considering (1) patterns in the user’s activity tracker data in the previous week and (2) behavior and environment characteristics. By setting realistic goals, ones that are neither too easy nor too difficult to achieve, activity tracker users can be encouraged to continue to meet these goals, and at the same time, to find utility in their activity tracker. Methods: We built a neural network model that prescribes a weekly activity target for an individual that can be realistically achieved. The inputs to the model were user-specific personal, social, and environmental factors, daily step count from the previous 7 days, and an entropy measure that characterized the pattern of daily step count. Data for training and evaluating the machine learning model were collected over a duration of 9 weeks. Results: Of 30 individuals who were enrolled, data from 20 participants were used. The model predicted target daily count with a mean absolute error of 1545 (95% CI 1383-1706) steps for an 8-week period. Conclusions: Artificial intelligence applied to physical activity data combined with behavioral data can be used to set personalized goals in accordance with the individual’s level of activity and thereby improve adherence to a fitness tracker; this could be used to increase engagement with activity trackers. A follow-up prospective study is ongoing to determine the performance of the engagement algorithm. 
UR - https://mhealth.jmir.org/2020/9/e18142 UR - https://doi.org/10.2196/18142 UR - http://www.ncbi.nlm.nih.gov/pubmed/32897235 DO - 10.2196/18142 ID - info:doi/10.2196/18142 ER - TY - JOUR AU - Entezarjou, Artin AU - Bonamy, Anna-Karin Edstedt AU - Benjaminsson, Simon AU - Herman, Pawel AU - Midlöv, Patrik PY - 2020 DA - 2020/9/3 TI - Human- Versus Machine Learning–Based Triage Using Digitalized Patient Histories in Primary Care: Comparative Study JO - JMIR Med Inform SP - e18930 VL - 8 IS - 9 KW - machine learning KW - artificial intelligence KW - decision support KW - primary care KW - triage AB - Background: Smartphones have made it possible for patients to digitally report symptoms before physical primary care visits. Using machine learning (ML), these data offer an opportunity to support decisions about the appropriate level of care (triage). Objective: The purpose of this study was to explore the interrater reliability between human physicians and an automated ML-based triage method. Methods: After testing several models, a naïve Bayes triage model was created using data from digital medical histories, capable of classifying digital medical history reports as either in need of urgent physical examination or not in need of urgent physical examination. The model was tested on 300 digital medical history reports and classification was compared with the majority vote of an expert panel of 5 primary care physicians (PCPs). Reliability between raters was measured using both Cohen κ (adjusted for chance agreement) and percentage agreement (not adjusted for chance agreement). Results: Interrater reliability as measured by Cohen κ was 0.17 when comparing the majority vote of the reference group with the model. Agreement was 74% (138/186) for cases judged not in need of urgent physical examination and 42% (38/90) for cases judged to be in need of urgent physical examination. No specific features linked to the model’s triage decision could be identified. 
Between physicians within the panel, Cohen κ was 0.2. Intrarater reliability when 1 physician retriaged 50 reports resulted in Cohen κ of 0.55. Conclusions: Low interrater and intrarater agreement in triage decisions among PCPs limits the possibility to use human decisions as a reference for ML to automate triage in primary care. UR - https://medinform.jmir.org/2020/9/e18930 UR - https://doi.org/10.2196/18930 UR - http://www.ncbi.nlm.nih.gov/pubmed/32880578 DO - 10.2196/18930 ID - info:doi/10.2196/18930 ER - TY - JOUR AU - Birnbaum, Michael Leo AU - Kulkarni, Prathamesh "Param" AU - Van Meter, Anna AU - Chen, Victor AU - Rizvi, Asra F AU - Arenare, Elizabeth AU - De Choudhury, Munmun AU - Kane, John M PY - 2020 DA - 2020/9/1 TI - Utilizing Machine Learning on Internet Search Activity to Support the Diagnostic Process and Relapse Detection in Young Individuals With Early Psychosis: Feasibility Study JO - JMIR Ment Health SP - e19348 VL - 7 IS - 9 KW - schizophrenia spectrum disorders KW - internet search activity KW - Google KW - diagnostic prediction KW - relapse prediction KW - machine learning KW - digital data KW - digital phenotyping KW - digital biomarkers AB - Background: Psychiatry is nearly entirely reliant on patient self-reporting, and there are few objective and reliable tests or sources of collateral information available to help diagnostic and assessment procedures. Technology offers opportunities to collect objective digital data to complement patient experience and facilitate more informed treatment decisions. Objective: We aimed to develop computational algorithms based on internet search activity designed to support diagnostic procedures and relapse identification in individuals with schizophrenia spectrum disorders. 
Methods: We extracted 32,733 time-stamped search queries across 42 participants with schizophrenia spectrum disorders and 74 healthy volunteers between the ages of 15 and 35 (mean 24.4 years, 44.0% male), and built machine-learning diagnostic and relapse classifiers utilizing the timing, frequency, and content of online search activity. Results: Classifiers predicted a diagnosis of schizophrenia spectrum disorders with an area under the curve value of 0.74 and predicted a psychotic relapse in individuals with schizophrenia spectrum disorders with an area under the curve of 0.71. Compared with healthy participants, those with schizophrenia spectrum disorders made fewer searches and their searches consisted of fewer words. Prior to a relapse hospitalization, participants with schizophrenia spectrum disorders were more likely to use words related to hearing, perception, and anger, and were less likely to use words related to health. Conclusions: Online search activity holds promise for gathering objective and easily accessed indicators of psychiatric symptoms. Utilizing search activity as collateral behavioral health information would represent a major advancement in efforts to capitalize on objective digital data to improve mental health monitoring. 
UR - https://mental.jmir.org/2020/9/e19348 UR - https://doi.org/10.2196/19348 UR - http://www.ncbi.nlm.nih.gov/pubmed/32870161 DO - 10.2196/19348 ID - info:doi/10.2196/19348 ER - TY - JOUR AU - Harada, Yukinori AU - Shimizu, Taro PY - 2020 DA - 2020/8/31 TI - Impact of a Commercial Artificial Intelligence–Driven Patient Self-Assessment Solution on Waiting Times at General Internal Medicine Outpatient Departments: Retrospective Study JO - JMIR Med Inform SP - e21056 VL - 8 IS - 8 KW - artificial intelligence KW - automated medical history taking system KW - eHealth KW - interrupted time-series analysis KW - waiting time AB - Background: Patient waiting time at outpatient departments is directly related to patient satisfaction and quality of care, particularly in patients visiting the general internal medicine outpatient departments for the first time. Moreover, reducing wait time from arrival in the clinic to the initiation of an examination is key to reducing patients’ anxiety. The use of automated medical history–taking systems in general internal medicine outpatient departments is a promising strategy to reduce waiting times. Recently, Ubie Inc in Japan developed AI Monshin, an artificial intelligence–based, automated medical history–taking system for general internal medicine outpatient departments. Objective: We hypothesized that replacing the use of handwritten self-administered questionnaires with the use of AI Monshin would reduce waiting times in general internal medicine outpatient departments. Therefore, we conducted this study to examine whether the use of AI Monshin reduced patient waiting times. Methods: We retrospectively analyzed the waiting times of patients visiting the general internal medicine outpatient department at a Japanese community hospital without an appointment from April 2017 to April 2020. AI Monshin was implemented in April 2019. 
We compared the median waiting time before and after implementation by conducting an interrupted time-series analysis of the median waiting time per month. We also conducted supplementary analyses to explain the main results. Results: We analyzed 21,615 visits. The median waiting time after AI Monshin implementation (74.4 minutes, IQR 57.1) was not significantly different from that before AI Monshin implementation (74.3 minutes, IQR 63.7) (P=.12). In the interrupted time-series analysis, the underlying linear time trend (–0.4 minutes per month; P=.06; 95% CI –0.9 to 0.02), level change (40.6 minutes; P=.09; 95% CI –5.8 to 87.0), and slope change (–1.1 minutes per month; P=.16; 95% CI –2.7 to 0.4) were not statistically significant. In a supplemental analysis of data from 9054 of 21,615 visits (41.9%), the median examination time after AI Monshin implementation (6.0 minutes, IQR 5.2) was slightly but significantly longer than that before AI Monshin implementation (5.7 minutes, IQR 5.0) (P=.003). Conclusions: The implementation of an artificial intelligence–based, automated medical history–taking system did not reduce waiting time for patients visiting the general internal medicine outpatient department without an appointment, and there was a slight increase in the examination time after implementation; however, the system may have enhanced the quality of care by supporting the optimization of staff assignments. 
UR - http://medinform.jmir.org/2020/8/e21056/ UR - https://doi.org/10.2196/21056 UR - http://www.ncbi.nlm.nih.gov/pubmed/32865504 DO - 10.2196/21056 ID - info:doi/10.2196/21056 ER - TY - JOUR AU - Adler, Daniel A AU - Ben-Zeev, Dror AU - Tseng, Vincent W-S AU - Kane, John M AU - Brian, Rachel AU - Campbell, Andrew T AU - Hauser, Marta AU - Scherer, Emily A AU - Choudhury, Tanzeem PY - 2020 DA - 2020/8/31 TI - Predicting Early Warning Signs of Psychotic Relapse From Passive Sensing Data: An Approach Using Encoder-Decoder Neural Networks JO - JMIR Mhealth Uhealth SP - e19962 VL - 8 IS - 8 KW - psychotic disorders KW - schizophrenia KW - mHealth KW - mental health KW - mobile health KW - smartphone applications KW - machine learning KW - passive sensing KW - digital biomarkers KW - digital phenotyping KW - artificial intelligence KW - deep learning KW - mobile phone AB - Background: Schizophrenia spectrum disorders (SSDs) are chronic conditions, but the severity of symptomatic experiences and functional impairments vacillate over the course of illness. Developing unobtrusive remote monitoring systems to detect early warning signs of impending symptomatic relapses would allow clinicians to intervene before the patient’s condition worsens. Objective: In this study, we aim to create the first models, exclusively using passive sensing data from a smartphone, to predict behavioral anomalies that could indicate early warning signs of a psychotic relapse. Methods: Data used to train and test the models were collected during the CrossCheck study. Hourly features derived from smartphone passive sensing data were extracted from 60 patients with SSDs (42 nonrelapse and 18 relapse ≥1 time throughout the study) and used to train models and test performance. 
We trained 2 types of encoder-decoder neural network models and a clustering-based local outlier factor model to predict behavioral anomalies that occurred within the 30-day period before a participant's date of relapse (the near relapse period). Models were trained to recreate participant behavior on days of relative health (DRH, outside of the near relapse period), after which a threshold was applied to the recreation error to predict anomalies. The neural network model architecture and the percentage of relapse participant data used to train all models were varied. Results: A total of 20,137 days of collected data were analyzed, with 726 days of data (3.6%) within any 30-day near relapse period. The best performing model used a fully connected neural network autoencoder architecture and achieved a median sensitivity of 0.25 (IQR 0.15-1.00) and specificity of 0.88 (IQR 0.14-0.96; a median 108% increase in behavioral anomalies near relapse). We conducted a post hoc analysis using the best performing model to identify behavioral features that had a medium-to-large effect (Cohen d>0.5) in distinguishing anomalies near relapse from DRH among 4 participants who relapsed multiple times throughout the study. Qualitative validation using clinical notes collected during the original CrossCheck study showed that the identified features from our analysis were presented to clinicians during relapse events. Conclusions: Our proposed method predicted a higher rate of anomalies in patients with SSDs within the 30-day near relapse period and can be used to uncover individual-level behaviors that change before relapse. This approach will enable technologists and clinicians to build unobtrusive digital mental health tools that can predict incipient relapse in SSDs. 
UR - https://mhealth.jmir.org/2020/8/e19962 UR - https://doi.org/10.2196/19962 UR - http://www.ncbi.nlm.nih.gov/pubmed/32865506 DO - 10.2196/19962 ID - info:doi/10.2196/19962 ER - TY - JOUR AU - Shen, Xiao AU - Wang, Guanjin AU - Kwan, Rick Yiu-Cho AU - Choi, Kup-Sze PY - 2020 DA - 2020/8/31 TI - Using Dual Neural Network Architecture to Detect the Risk of Dementia With Community Health Data: Algorithm Development and Validation Study JO - JMIR Med Inform SP - e19870 VL - 8 IS - 8 KW - cognitive screening KW - dementia risk KW - dual neural network KW - predictive models KW - primary care AB - Background: Recent studies have revealed lifestyle behavioral risk factors that can be modified to reduce the risk of dementia. As modification of lifestyle takes time, early identification of people with high dementia risk is important for timely intervention and support. As cognitive impairment is a diagnostic criterion of dementia, cognitive assessment tools are used in primary care to screen for clinically unevaluated cases. Among them, Mini-Mental State Examination (MMSE) is a very common instrument. However, MMSE is a questionnaire that is administered when symptoms of memory decline have occurred. Early administration at the asymptomatic stage and repeated measurements would lead to a practice effect that degrades the effectiveness of MMSE when it is used at later stages. Objective: The aim of this study was to exploit machine learning techniques to assist health care professionals in detecting high-risk individuals by predicting the results of MMSE using elderly health data collected from community-based primary care services. Methods: A health data set of 2299 samples was adopted in the study. The input data were divided into two groups of different characteristics (ie, client profile data and health assessment data). The predictive output was the result of two-class classification of the normal and high-risk cases that were defined based on MMSE. 
A dual neural network (DNN) model was proposed to obtain the latent representations of the two groups of input data separately, which were then concatenated for the two-class classification. Mean imputation and k-nearest neighbor imputation were used separately to handle missing data, whereas a cost-sensitive learning (CSL) algorithm was proposed to deal with class imbalance. The performance of the DNN was evaluated by comparing it with that of conventional machine learning methods. Results: A total of 16 predictive models were built using the elderly health data set. Among them, the proposed DNN with CSL performed best in the detection of high-risk cases. The area under the receiver operating characteristic curve, average precision, sensitivity, and specificity reached 0.84, 0.88, 0.73, and 0.80, respectively. Conclusions: The proposed method has the potential to serve as a tool to screen for elderly people with cognitive impairment and predict high-risk cases of dementia at the asymptomatic stage, providing health care professionals with early signals that can prompt suggestions for a follow-up or a detailed diagnosis. UR - https://medinform.jmir.org/2020/8/e19870 UR - https://doi.org/10.2196/19870 UR - http://www.ncbi.nlm.nih.gov/pubmed/32865498 DO - 10.2196/19870 ID - info:doi/10.2196/19870 ER - TY - JOUR AU - Lee, Joon PY - 2020 DA - 2020/8/26 TI - Is Artificial Intelligence Better Than Human Clinicians in Predicting Patient Outcomes? 
JO - J Med Internet Res SP - e19918 VL - 22 IS - 8 KW - patient outcome prediction KW - artificial intelligence KW - machine learning KW - human-generated predictions KW - human-AI symbiosis UR - http://www.jmir.org/2020/8/e19918/ UR - https://doi.org/10.2196/19918 UR - http://www.ncbi.nlm.nih.gov/pubmed/32845249 DO - 10.2196/19918 ID - info:doi/10.2196/19918 ER - TY - JOUR AU - Mackey, Tim Ken AU - Li, Jiawei AU - Purushothaman, Vidya AU - Nali, Matthew AU - Shah, Neal AU - Bardier, Cortni AU - Cai, Mingxiang AU - Liang, Bryan PY - 2020 DA - 2020/8/25 TI - Big Data, Natural Language Processing, and Deep Learning to Detect and Characterize Illicit COVID-19 Product Sales: Infoveillance Study on Twitter and Instagram JO - JMIR Public Health Surveill SP - e20794 VL - 6 IS - 3 KW - COVID-19 KW - coronavirus KW - infectious disease KW - social media KW - surveillance KW - infoveillance KW - infodemiology KW - infodemic KW - fraud KW - cybercrime AB - Background: The coronavirus disease (COVID-19) pandemic is perhaps the greatest global health challenge of the last century. Accompanying this pandemic is a parallel “infodemic,” including the online marketing and sale of unapproved, illegal, and counterfeit COVID-19 health products including testing kits, treatments, and other questionable “cures.” Enabling the proliferation of this content is the growing ubiquity of internet-based technologies, including popular social media platforms that now have billions of global users. Objective: This study aims to collect, analyze, identify, and enable reporting of suspected fake, counterfeit, and unapproved COVID-19–related health care products from Twitter and Instagram. 
Methods: This study was conducted in two phases, beginning with the collection of COVID-19–related Twitter and Instagram posts using a combination of web scraping on Instagram and filtering the public streaming Twitter application programming interface for keywords associated with suspect marketing and sale of COVID-19 products. The second phase involved data analysis using natural language processing (NLP) and deep learning to identify potential sellers that were then manually annotated for characteristics of interest. We also visualized illegal selling posts on a customized data dashboard to enable public health intelligence. Results: We collected a total of 6,029,323 tweets and 204,597 Instagram posts filtered for terms associated with suspect marketing and sale of COVID-19 health products from March to April for Twitter and February to May for Instagram. After applying our NLP and deep learning approaches, we identified 1271 tweets and 596 Instagram posts associated with questionable sales of COVID-19–related products. Generally, product introduction came in two waves, with the first consisting of questionable immunity-boosting treatments and a second involving suspect testing kits. We also detected a low volume of pharmaceuticals that have not been approved for COVID-19 treatment. Other major themes detected included products offered in different languages, various claims of product credibility, completely unsubstantiated products, unapproved testing modalities, and different payment and seller contact methods. Conclusions: Results from this study provide initial insight into one front of the “infodemic” fight against COVID-19 by characterizing what types of health products, selling claims, and types of sellers were active on two popular social media platforms at earlier stages of the pandemic. This cybercrime challenge is likely to continue as the pandemic progresses and more people seek access to COVID-19 testing and treatment. 
This data intelligence can help public health agencies, regulatory authorities, legitimate manufacturers, and technology platforms better remove and prevent this content from harming the public. UR - http://publichealth.jmir.org/2020/3/e20794/ UR - https://doi.org/10.2196/20794 UR - http://www.ncbi.nlm.nih.gov/pubmed/32750006 DO - 10.2196/20794 ID - info:doi/10.2196/20794 ER - TY - JOUR AU - Xie, Bo AU - Tao, Cui AU - Li, Juan AU - Hilsabeck, Robin C AU - Aguirre, Alyssa PY - 2020 DA - 2020/8/20 TI - Artificial Intelligence for Caregivers of Persons With Alzheimer’s Disease and Related Dementias: Systematic Literature Review JO - JMIR Med Inform SP - e18189 VL - 8 IS - 8 KW - Alzheimer disease KW - dementia KW - caregiving KW - technology KW - artificial intelligence AB - Background: Artificial intelligence (AI) has great potential for improving the care of persons with Alzheimer’s disease and related dementias (ADRD) and the quality of life of their family caregivers. To date, however, systematic review of the literature on the impact of AI on ADRD management has been lacking. Objective: This paper aims to (1) identify and examine literature on AI that provides information to facilitate ADRD management by caregivers of individuals diagnosed with ADRD and (2) identify gaps in the literature that suggest future directions for research. Methods: Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines for conducting systematic literature reviews, during August and September 2019, we performed 3 rounds of selection. First, we searched predetermined keywords in PubMed, Cumulative Index to Nursing and Allied Health Literature Plus with Full Text, PsycINFO, IEEE Xplore Digital Library, and the ACM Digital Library. This step generated 113 nonduplicate results. Next, we screened the titles and abstracts of the 113 papers according to inclusion and exclusion criteria, after which 52 papers were excluded and 61 remained. 
Finally, we screened the full text of the remaining papers to ensure that they met the inclusion or exclusion criteria; 31 papers were excluded, leaving a final sample of 30 papers for analysis. Results: Of the 30 papers, 20 reported studies that focused on using AI to assist in activities of daily living. A limited number of specific daily activities were targeted. The studies’ aims suggested three major purposes: (1) to test the feasibility, usability, or perceptions of prototype AI technology; (2) to generate preliminary data on the technology’s performance (primarily accuracy in detecting target events, such as falls); and (3) to understand user needs and preferences for the design and functionality of to-be-developed technology. The majority of the studies were qualitative, with interviews, focus groups, and observation being their most common methods. Cross-sectional surveys were also common, but with small convenience samples. Sample sizes ranged from 6 to 106, with the vast majority on the low end. The majority of the studies were descriptive, exploratory, and lacking theoretical guidance. Many studies reported positive outcomes in favor of their AI technology’s feasibility and satisfaction; some studies reported mixed results on these measures. Performance of the technology varied widely across tasks. Conclusions: These findings call for more systematic designs and evaluations of the feasibility and efficacy of AI-based interventions for caregivers of people with ADRD. These gaps in the research would be best addressed through interdisciplinary collaboration, incorporating complementary expertise from the health sciences and computer science/engineering–related fields. 
UR - http://medinform.jmir.org/2020/8/e18189/ UR - https://doi.org/10.2196/18189 UR - http://www.ncbi.nlm.nih.gov/pubmed/32663146 DO - 10.2196/18189 ID - info:doi/10.2196/18189 ER - TY - JOUR AU - Hung, Man AU - Lauren, Evelyn AU - Hon, Eric S AU - Birmingham, Wendy C AU - Xu, Julie AU - Su, Sharon AU - Hon, Shirley D AU - Park, Jungweon AU - Dang, Peter AU - Lipsky, Martin S PY - 2020 DA - 2020/8/18 TI - Social Network Analysis of COVID-19 Sentiments: Application of Artificial Intelligence JO - J Med Internet Res SP - e22590 VL - 22 IS - 8 KW - COVID-19 KW - coronavirus KW - sentiment KW - social network KW - Twitter KW - infodemiology KW - infodemic KW - pandemic KW - crisis KW - public health KW - business economy KW - artificial intelligence AB - Background: The coronavirus disease (COVID-19) pandemic led to substantial public discussion. Understanding these discussions can help institutions, governments, and individuals navigate the pandemic. Objective: The aim of this study is to analyze discussions on Twitter related to COVID-19 and to investigate the sentiments toward COVID-19. Methods: This study applied machine learning methods in the field of artificial intelligence to analyze data collected from Twitter. Using tweets originating exclusively in the United States and written in English during the 1-month period from March 20 to April 19, 2020, the study examined COVID-19–related discussions. Social network and sentiment analyses were also conducted to determine the social network of dominant topics and whether the tweets expressed positive, neutral, or negative sentiments. Geographic analysis of the tweets was also conducted. Results: There were a total of 14,180,603 likes, 863,411 replies, 3,087,812 retweets, and 641,381 mentions in tweets during the study timeframe. Out of 902,138 tweets analyzed, sentiment analysis classified 434,254 (48.2%) tweets as having a positive sentiment, 187,042 (20.7%) as neutral, and 280,842 (31.1%) as negative. 
The study identified 5 dominant themes among COVID-19–related tweets: health care environment, emotional support, business economy, social change, and psychological stress. Alaska, Wyoming, New Mexico, Pennsylvania, and Florida were the states expressing the most negative sentiment while Vermont, North Dakota, Utah, Colorado, Tennessee, and North Carolina conveyed the most positive sentiment. Conclusions: This study identified 5 prevalent themes of COVID-19 discussion with sentiments ranging from positive to negative. These themes and sentiments can clarify the public’s response to COVID-19 and help officials navigate the pandemic. UR - http://www.jmir.org/2020/8/e22590/ UR - https://doi.org/10.2196/22590 UR - http://www.ncbi.nlm.nih.gov/pubmed/32750001 DO - 10.2196/22590 ID - info:doi/10.2196/22590 ER - TY - JOUR AU - Michelson, Matthew AU - Chow, Tiffany AU - Martin, Neil A AU - Ross, Mike AU - Tee Qiao Ying, Amelia AU - Minton, Steven PY - 2020 DA - 2020/8/17 TI - Artificial Intelligence for Rapid Meta-Analysis: Case Study on Ocular Toxicity of Hydroxychloroquine JO - J Med Internet Res SP - e20007 VL - 22 IS - 8 KW - meta-analysis KW - rapid meta-analysis KW - artificial intelligence KW - drug KW - analysis KW - hydroxychloroquine KW - toxic KW - COVID-19 KW - treatment KW - side effect KW - ocular KW - eye AB - Background: Rapid access to evidence is crucial in times of an evolving clinical crisis. To that end, we propose a novel approach to answer clinical queries, termed rapid meta-analysis (RMA). Unlike traditional meta-analysis, RMA balances a quick time to production with reasonable data quality assurances, leveraging artificial intelligence (AI) to strike this balance. Objective: We aimed to evaluate whether RMA can generate meaningful clinical insights, but crucially, in a much faster processing time than traditional meta-analysis, using a relevant, real-world example. 
Methods: The development of our RMA approach was motivated by a currently relevant clinical question: are ocular toxicity and vision compromise side effects of hydroxychloroquine therapy? At the time of designing this study, hydroxychloroquine was a leading candidate in the treatment of coronavirus disease (COVID-19). We then leveraged AI to pull and screen articles, automatically extract their results, review the studies, and analyze the data with standard statistical methods. Results: By combining AI with human analysis in our RMA, we generated a meaningful, clinical result in less than 30 minutes. The RMA identified 11 studies considering ocular toxicity as a side effect of hydroxychloroquine and estimated the incidence to be 3.4% (95% CI 1.11%-9.96%). The heterogeneity across individual study findings was high, which should be taken into account in interpretation of the result. Conclusions: We demonstrate that a novel approach to meta-analysis using AI can generate meaningful clinical insights in a much shorter time period than traditional meta-analysis. UR - http://www.jmir.org/2020/8/e20007/ UR - https://doi.org/10.2196/20007 UR - http://www.ncbi.nlm.nih.gov/pubmed/32804086 DO - 10.2196/20007 ID - info:doi/10.2196/20007 ER - TY - JOUR AU - Iqbal, Usman AU - Celi, Leo Anthony AU - Li, Yu-Chuan Jack PY - 2020 DA - 2020/8/11 TI - How Can Artificial Intelligence Make Medicine More Preemptive? 
JO - J Med Internet Res SP - e17211 VL - 22 IS - 8 KW - artificial intelligence KW - digital health KW - eHealth KW - health care technology KW - medical innovations KW - health information technology KW - advanced care systems UR - https://www.jmir.org/2020/8/e17211 UR - https://doi.org/10.2196/17211 UR - http://www.ncbi.nlm.nih.gov/pubmed/32780024 DO - 10.2196/17211 ID - info:doi/10.2196/17211 ER - TY - JOUR AU - Adly, Aya Sedky AU - Adly, Afnan Sedky AU - Adly, Mahmoud Sedky PY - 2020 DA - 2020/8/10 TI - Approaches Based on Artificial Intelligence and the Internet of Intelligent Things to Prevent the Spread of COVID-19: Scoping Review JO - J Med Internet Res SP - e19104 VL - 22 IS - 8 KW - SARS-CoV-2 KW - COVID-19 KW - novel coronavirus KW - artificial intelligence KW - internet of things KW - telemedicine KW - machine learning KW - modeling KW - simulation KW - robotics AB - Background: Artificial intelligence (AI) and the Internet of Intelligent Things (IIoT) are promising technologies to prevent the concerningly rapid spread of coronavirus disease (COVID-19) and to maximize safety during the pandemic. With the exponential increase in the number of COVID-19 patients, it is highly possible that physicians and health care workers will not be able to treat all cases. Thus, computer scientists can contribute to the fight against COVID-19 by introducing more intelligent solutions to achieve rapid control of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes the disease. Objective: The objectives of this review were to analyze the current literature, discuss the applicability of reported ideas for using AI to prevent and control COVID-19, and build a comprehensive view of how current systems may be useful in particular areas. This may be of great help to many health care administrators, computer scientists, and policy makers worldwide. 
Methods: We conducted an electronic search of articles in the MEDLINE, Google Scholar, Embase, and Web of Knowledge databases to formulate a comprehensive review that summarizes different categories of the most recently reported AI-based approaches to prevent and control the spread of COVID-19. Results: Our search identified the 10 most recent AI approaches that were suggested to provide the best solutions for maximizing safety and preventing the spread of COVID-19. These approaches included detection of suspected cases, large-scale screening, monitoring, interactions with experimental therapies, pneumonia screening, use of the IIoT for data and information gathering and integration, resource allocation, predictions, modeling and simulation, and robotics for medical quarantine. Conclusions: We found few or almost no studies regarding the use of AI to examine COVID-19 interactions with experimental therapies, the use of AI for resource allocation to COVID-19 patients, or the use of AI and the IIoT for COVID-19 data and information gathering/integration. Moreover, the adoption of other approaches, including use of AI for COVID-19 prediction, use of AI for COVID-19 modeling and simulation, and use of AI robotics for medical quarantine, should be further emphasized by researchers because these important approaches lack sufficient numbers of studies. Therefore, we recommend that computer scientists focus on these approaches, which are still not being adequately addressed. 
UR - https://www.jmir.org/2020/8/e19104 UR - https://doi.org/10.2196/19104 UR - http://www.ncbi.nlm.nih.gov/pubmed/32584780 DO - 10.2196/19104 ID - info:doi/10.2196/19104 ER - TY - JOUR AU - Tudor Car, Lorainne AU - Dhinagaran, Dhakshenya Ardhithy AU - Kyaw, Bhone Myint AU - Kowatsch, Tobias AU - Joty, Shafiq AU - Theng, Yin-Leng AU - Atun, Rifat PY - 2020 DA - 2020/8/7 TI - Conversational Agents in Health Care: Scoping Review and Conceptual Analysis JO - J Med Internet Res SP - e17158 VL - 22 IS - 8 KW - conversational agents KW - chatbots KW - artificial intelligence KW - machine learning KW - mobile phone KW - health care KW - scoping review AB - Background: Conversational agents, also known as chatbots, are computer programs designed to simulate human text or verbal conversations. They are increasingly used in a range of fields, including health care. By enabling better accessibility, personalization, and efficiency, conversational agents have the potential to improve patient care. Objective: This study aimed to review the current applications, gaps, and challenges in the literature on conversational agents in health care and provide recommendations for their future research, design, and application. Methods: We performed a scoping review. A broad literature search was performed in MEDLINE (Medical Literature Analysis and Retrieval System Online; Ovid), EMBASE (Excerpta Medica database; Ovid), PubMed, Scopus, and Cochrane Central with the search terms “conversational agents,” “conversational AI,” “chatbots,” and associated synonyms. We also searched the gray literature using sources such as the OCLC (Online Computer Library Center) WorldCat database and ResearchGate in April 2019. Reference lists of relevant articles were checked for further articles. Screening and data extraction were performed in parallel by 2 reviewers. The included evidence was analyzed narratively by employing the principles of thematic analysis. 
Results: The literature search yielded 47 study reports (45 articles and 2 ongoing clinical trials) that matched the inclusion criteria. The identified conversational agents were largely delivered via smartphone apps (n=23) and used free text only as the main input (n=19) and output (n=30) modality. Case studies describing chatbot development (n=18) were the most prevalent, and only 11 randomized controlled trials were identified. The 3 most commonly reported conversational agent applications in the literature were treatment and monitoring, health care service support, and patient education. Conclusions: The literature on conversational agents in health care is largely descriptive and aimed at treatment and monitoring and health service support. It mostly reports on text-based, artificial intelligence–driven, and smartphone app–delivered conversational agents. There is an urgent need for a robust evaluation of diverse health care conversational agents’ formats, focusing on their acceptability, safety, and effectiveness. UR - http://www.jmir.org/2020/8/e17158/ UR - https://doi.org/10.2196/17158 UR - http://www.ncbi.nlm.nih.gov/pubmed/32763886 DO - 10.2196/17158 ID - info:doi/10.2196/17158 ER - TY - JOUR AU - Cheng, Hao-Yuan AU - Wu, Yu-Chun AU - Lin, Min-Hau AU - Liu, Yu-Lun AU - Tsai, Yue-Yang AU - Wu, Jo-Hua AU - Pan, Ke-Han AU - Ke, Chih-Jung AU - Chen, Chiu-Mei AU - Liu, Ding-Ping AU - Lin, I-Feng AU - Chuang, Jen-Hsiang PY - 2020 DA - 2020/8/5 TI - Applying Machine Learning Models with An Ensemble Approach for Accurate Real-Time Influenza Forecasting in Taiwan: Development and Validation Study JO - J Med Internet Res SP - e15394 VL - 22 IS - 8 KW - influenza KW - Influenza-like illness KW - forecasting KW - machine learning KW - artificial intelligence KW - epidemic forecasting KW - surveillance AB - Background: Changeful seasonal influenza activity in subtropical areas such as Taiwan causes problems in epidemic preparedness. 
The Taiwan Centers for Disease Control has maintained real-time national influenza surveillance systems since 2004. Beyond timely monitoring, epidemic forecasting using the national influenza surveillance data can provide pivotal information for public health response. Objective: We aimed to develop predictive models using machine learning to provide real-time influenza-like illness forecasts. Methods: Using surveillance data of influenza-like illness visits from emergency departments (from the Real-Time Outbreak and Disease Surveillance System), outpatient departments (from the National Health Insurance database), and the records of patients with severe influenza with complications (from the National Notifiable Disease Surveillance System), we developed 4 machine learning models (autoregressive integrated moving average, random forest, support vector regression, and extreme gradient boosting) to produce weekly influenza-like illness predictions for a given week and 3 subsequent weeks. We established a framework of the machine learning models and used an ensemble approach called stacking to integrate these predictions. We trained the models using historical data from 2008-2014. We evaluated their predictive ability during 2015-2017 for each of the 4-week time periods using Pearson correlation, mean absolute percentage error (MAPE), and hit rate of trend prediction. A dashboard website was built to visualize the forecasts, and the results of real-world implementation of this forecasting framework in 2018 were evaluated using the same metrics. Results: All models could accurately predict the timing and magnitudes of the seasonal peaks in the then-current week (nowcast) (ρ=0.802-0.965; MAPE: 5.2%-9.2%; hit rate: 0.577-0.756), 1-week (ρ=0.803-0.918; MAPE: 8.3%-11.8%; hit rate: 0.643-0.747), 2-week (ρ=0.783-0.867; MAPE: 10.1%-15.3%; hit rate: 0.669-0.734), and 3-week forecasts (ρ=0.676-0.801; MAPE: 12.0%-18.9%; hit rate: 0.643-0.786), especially the ensemble model. 
In real-world implementation in 2018, the forecasting performance was still accurate in nowcasts (ρ=0.875-0.969; MAPE: 5.3%-8.0%; hit rate: 0.582-0.782) and remained satisfactory in 3-week forecasts (ρ=0.721-0.908; MAPE: 7.6%-13.5%; hit rate: 0.596-0.904). Conclusions: This machine learning and ensemble approach can make accurate, real-time influenza-like illness forecasts for a 4-week period, and thus, facilitate decision making. UR - https://www.jmir.org/2020/8/e15394 UR - https://doi.org/10.2196/15394 UR - http://www.ncbi.nlm.nih.gov/pubmed/32755888 DO - 10.2196/15394 ID - info:doi/10.2196/15394 ER - TY - JOUR AU - Guo, Yuqi AU - Hao, Zhichao AU - Zhao, Shichong AU - Gong, Jiaqi AU - Yang, Fan PY - 2020 DA - 2020/7/29 TI - Artificial Intelligence in Health Care: Bibliometric Analysis JO - J Med Internet Res SP - e18228 VL - 22 IS - 7 KW - health care KW - artificial intelligence KW - bibliometric analysis KW - telehealth KW - neural networks KW - machine learning AB - Background: As a critical driving power to promote health care, the health care–related artificial intelligence (AI) literature is growing rapidly. Objective: The purpose of this analysis is to provide a dynamic and longitudinal bibliometric analysis of health care–related AI publications. Methods: The Web of Science (Clarivate PLC) was searched to retrieve all existing and highly cited AI-related health care research papers published in English up to December 2019. Based on bibliometric indicators, a search strategy was developed to screen the title for eligibility, using the abstract and full text where needed. The growth rate of publications, characteristics of research activities, publication patterns, and research hotspot tendencies were computed using the HistCite software. Results: The search identified 5235 hits, of which 1473 publications were included in the analyses. 
Publication output increased an average of 17.02% per year since 1995, but the growth rate of research papers significantly increased to 45.15% from 2014 to 2019. The major health problems studied in AI research are cancer, depression, Alzheimer disease, heart failure, and diabetes. Artificial neural networks, support vector machines, and convolutional neural networks have the highest impact on health care. Nucleosides, convolutional neural networks, and tumor markers have remained research hotspots through 2019. Conclusions: This analysis provides a comprehensive overview of the AI-related research conducted in the field of health care, which helps researchers, policy makers, and practitioners better understand the development of health care–related AI research and possible practice implications. Future AI research should be dedicated to filling in the gaps between AI health care research and clinical applications. UR - http://www.jmir.org/2020/7/e18228/ UR - https://doi.org/10.2196/18228 UR - http://www.ncbi.nlm.nih.gov/pubmed/32723713 DO - 10.2196/18228 ID - info:doi/10.2196/18228 ER - TY - JOUR AU - Barata, Filipe AU - Tinschert, Peter AU - Rassouli, Frank AU - Steurer-Stey, Claudia AU - Fleisch, Elgar AU - Puhan, Milo Alan AU - Brutsche, Martin AU - Kotz, David AU - Kowatsch, Tobias PY - 2020 DA - 2020/7/14 TI - Automatic Recognition, Segmentation, and Sex Assignment of Nocturnal Asthmatic Coughs and Cough Epochs in Smartphone Audio Recordings: Observational Field Study JO - J Med Internet Res SP - e18082 VL - 22 IS - 7 KW - asthma KW - cough recognition KW - cough segmentation KW - sex assignment KW - deep learning KW - smartphone KW - mobile phone AB - Background: Asthma is one of the most prevalent chronic respiratory diseases. Despite increased investment in treatment, little progress has been made in the early recognition and treatment of asthma exacerbations over the last decade. 
Nocturnal cough monitoring may provide an opportunity to identify patients at risk for imminent exacerbations. Recently developed approaches enable smartphone-based cough monitoring. These approaches, however, have not undergone longitudinal overnight testing nor have they been specifically evaluated in the context of asthma. Also, the problem of distinguishing partner coughs from patient coughs when two or more people are sleeping in the same room using contact-free audio recordings remains unsolved. Objective: The objective of this study was to evaluate the automatic recognition and segmentation of nocturnal asthmatic coughs and cough epochs in smartphone-based audio recordings that were collected in the field. We also aimed to distinguish partner coughs from patient coughs in contact-free audio recordings by classifying coughs based on sex. Methods: We used a convolutional neural network model that we had developed in previous work for automated cough recognition. We further used techniques (such as ensemble learning, minibatch balancing, and thresholding) to address the imbalance in the data set. We evaluated the classifier in a classification task and a segmentation task. The cough-recognition classifier served as the basis for the cough-segmentation classifier from continuous audio recordings. We compared automated cough and cough-epoch counts to human-annotated cough and cough-epoch counts. We employed Gaussian mixture models to build a classifier for cough and cough-epoch signals based on sex. Results: We recorded audio data from 94 adults with asthma (overall: mean 43 years; SD 16 years; female: 54/94, 57%; male 40/94, 43%). Audio data were recorded by each participant in their everyday environment using a smartphone placed next to their bed; recordings were made over a period of 28 nights. Out of 704,697 sounds, we identified 30,304 sounds as coughs. A total of 26,166 coughs occurred without a 2-second pause between coughs, yielding 8238 cough epochs. 
The ensemble classifier performed well with a Matthews correlation coefficient of 92% in a pure classification task and achieved comparable cough counts to that of human annotators in the segmentation of coughing. The count difference between automated and human-annotated coughs was a mean –0.1 (95% CI –12.11, 11.91) coughs. The count difference between automated and human-annotated cough epochs was a mean 0.24 (95% CI –3.67, 4.15) cough epochs. The Gaussian mixture model cough epoch–based sex classification performed best yielding an accuracy of 83%. Conclusions: Our study showed longitudinal nocturnal cough and cough-epoch recognition from nightly recorded smartphone-based audio from adults with asthma. The model distinguishes partner cough from patient cough in contact-free recordings by identifying cough and cough-epoch signals that correspond to the sex of the patient. This research represents a step towards enabling passive and scalable cough monitoring for adults with asthma. UR - https://www.jmir.org/2020/7/e18082 UR - https://doi.org/10.2196/18082 UR - http://www.ncbi.nlm.nih.gov/pubmed/32459641 DO - 10.2196/18082 ID - info:doi/10.2196/18082 ER - TY - JOUR AU - Gao, Shuqing AU - He, Lingnan AU - Chen, Yue AU - Li, Dan AU - Lai, Kaisheng PY - 2020 DA - 2020/7/13 TI - Public Perception of Artificial Intelligence in Medical Care: Content Analysis of Social Media JO - J Med Internet Res SP - e16649 VL - 22 IS - 7 KW - artificial intelligence KW - public perception KW - social media KW - content analysis KW - medical care AB - Background: High-quality medical resources are in high demand worldwide, and the application of artificial intelligence (AI) in medical care may help alleviate the crisis related to this shortage. The development of the medical AI industry depends to a certain extent on whether industry experts have a comprehensive understanding of the public’s views on medical AI. 
Currently, the opinions of the general public on this matter remain unclear. Objective: The purpose of this study is to explore the public perception of AI in medical care through a content analysis of social media data, including specific topics that the public is concerned about; public attitudes toward AI in medical care and the reasons for them; and public opinion on whether AI can replace human doctors. Methods: Through an application programming interface, we collected a data set from the Sina Weibo platform comprising more than 16 million users throughout China by crawling all public posts from January to December 2017. Based on this data set, we identified 2315 posts related to AI in medical care and classified them through content analysis. Results: Among the 2315 identified posts, we found three types of AI topics discussed on the platform: (1) technology and application (n=987, 42.63%), (2) industry development (n=706, 30.50%), and (3) impact on society (n=622, 26.87%). Out of 956 posts where public attitudes were expressed, 59.4% (n=568), 34.4% (n=329), and 6.2% (n=59) of the posts expressed positive, neutral, and negative attitudes, respectively. The immaturity of AI technology (27/59, 46%) and a distrust of related companies (n=15, 25%) were the two main reasons for the negative attitudes. Across 200 posts that mentioned public attitudes toward replacing human doctors with AI, 47.5% (n=95) and 32.5% (n=65) of the posts expressed that AI would completely or partially replace human doctors, respectively. In comparison, 20.0% (n=40) of the posts expressed that AI would not replace human doctors. Conclusions: Our findings indicate that people are most concerned about AI technology and applications. Generally, the majority of people held positive attitudes and believed that AI doctors would completely or partially replace human ones. Compared with previous studies on medical doctors, the general public has a more positive attitude toward medical AI. 
Lack of trust in AI and the absence of the humanistic care factor are essential reasons why some people still have a negative attitude toward medical AI. We suggest that practitioners may need to pay more attention to promoting the credibility of technology companies and meeting patients’ emotional needs instead of focusing merely on technical issues. UR - http://www.jmir.org/2020/7/e16649/ UR - https://doi.org/10.2196/16649 UR - http://www.ncbi.nlm.nih.gov/pubmed/32673231 DO - 10.2196/16649 ID - info:doi/10.2196/16649 ER - TY - JOUR AU - Abd-Alrazaq, Alaa Ali AU - Rababeh, Asma AU - Alajlani, Mohannad AU - Bewick, Bridgette M AU - Househ, Mowafa PY - 2020 DA - 2020/7/13 TI - Effectiveness and Safety of Using Chatbots to Improve Mental Health: Systematic Review and Meta-Analysis JO - J Med Internet Res SP - e16021 VL - 22 IS - 7 KW - chatbots KW - conversational agents KW - mental health KW - mental disorders KW - depression KW - anxiety KW - effectiveness KW - safety AB - Background: The global shortage of mental health workers has prompted the utilization of technological advancements, such as chatbots, to meet the needs of people with mental health conditions. Chatbots are systems that are able to converse and interact with human users using spoken, written, and visual language. While numerous studies have assessed the effectiveness and safety of using chatbots in mental health, no reviews have pooled the results of those studies. Objective: This study aimed to assess the effectiveness and safety of using chatbots to improve mental health through summarizing and pooling the results of previous studies. Methods: A systematic review was carried out to achieve this objective. The search sources were 7 bibliographic databases (eg, MEDLINE, EMBASE, PsycINFO), the search engine “Google Scholar,” and backward and forward reference list checking of the included studies and relevant reviews. 
Two reviewers independently selected the studies, extracted data from the included studies, and assessed the risk of bias. Data extracted from studies were synthesized using narrative and statistical methods, as appropriate. Results: Of 1048 citations retrieved, we identified 12 studies examining the effect of using chatbots on 8 outcomes. Weak evidence demonstrated that chatbots were effective in improving depression, distress, stress, and acrophobia. In contrast, according to similar evidence, there was no statistically significant effect of using chatbots on subjective psychological wellbeing. Results were conflicting regarding the effect of chatbots on the severity of anxiety and positive and negative affect. Only two studies assessed the safety of chatbots and concluded that they are safe in mental health, as no adverse events or harms were reported. Conclusions: Chatbots have the potential to improve mental health. However, the evidence in this review was not sufficient to definitively conclude this due to a lack of evidence that their effect is clinically important, a lack of studies assessing each outcome, high risk of bias in those studies, and conflicting results for some outcomes. Further studies are required to draw solid conclusions about the effectiveness and safety of chatbots. 
Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42019141219; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42019141219 UR - http://www.jmir.org/2020/7/e16021/ UR - https://doi.org/10.2196/16021 UR - http://www.ncbi.nlm.nih.gov/pubmed/32673216 DO - 10.2196/16021 ID - info:doi/10.2196/16021 ER - TY - JOUR AU - Jin, Bo AU - Qu, Yue AU - Zhang, Liang AU - Gao, Zhan PY - 2020 DA - 2020/7/10 TI - Diagnosing Parkinson Disease Through Facial Expression Recognition: Video Analysis JO - J Med Internet Res SP - e18697 VL - 22 IS - 7 KW - Parkinson disease KW - face landmarks KW - machine learning KW - artificial intelligence AB - Background: The number of patients with neurological diseases is currently increasing annually, which presents tremendous challenges for both patients and doctors. With the advent of advanced information technology, digital medical care is gradually changing the medical ecology. Numerous people are exploring new ways to receive a consultation, track their diseases, and receive rehabilitation training in more convenient and efficient ways. In this paper, we explore the use of facial expression recognition via artificial intelligence to diagnose a typical neurological system disease, Parkinson disease (PD). Objective: This study proposes methods to diagnose PD through facial expression recognition. Methods: We collected videos of facial expressions of people with PD and matched controls. We used relative coordinates and positional jitter to extract facial expression features (facial expression amplitude and shaking of small facial muscle groups) from the key points returned by Face++. Algorithms from traditional machine learning and advanced deep learning were utilized to diagnose PD. Results: The experimental results showed our models can achieve outstanding facial expression recognition ability for PD diagnosis. 
Applying a long short-term memory neural network to the positions of the key features, precision and F1 values of 86% and 75%, respectively, can be reached. Further, utilizing a support vector machine algorithm for the facial expression amplitude features and shaking of the small facial muscle groups, an F1 value of 99% can be achieved. Conclusions: This study contributes to the digital diagnosis of PD based on facial expression recognition. The disease diagnosis model was validated through our experiment. The results can help doctors understand the real-time dynamics of the disease and even conduct remote diagnosis. UR - https://www.jmir.org/2020/7/e18697 UR - https://doi.org/10.2196/18697 UR - http://www.ncbi.nlm.nih.gov/pubmed/32673247 DO - 10.2196/18697 ID - info:doi/10.2196/18697 ER - TY - JOUR AU - Maher, Carol Ann AU - Davis, Courtney Rose AU - Curtis, Rachel Grace AU - Short, Camille Elizabeth AU - Murphy, Karen Joy PY - 2020 DA - 2020/7/10 TI - A Physical Activity and Diet Program Delivered by Artificially Intelligent Virtual Health Coach: Proof-of-Concept Study JO - JMIR Mhealth Uhealth SP - e17558 VL - 8 IS - 7 KW - virtual assistant KW - chatbot KW - Mediterranean diet KW - physical activity KW - lifestyle AB - Background: Poor diet and physical inactivity are leading modifiable causes of death and disease. Advances in artificial intelligence technology present tantalizing opportunities for creating virtual health coaches capable of providing personalized support at scale. Objective: This proof-of-concept study aimed to test the feasibility (recruitment and retention) and preliminary efficacy of a physical activity and Mediterranean-style dietary intervention (MedLiPal) delivered via an artificially intelligent virtual health coach. Methods: This 12-week single-arm pre-post study took place in Adelaide, Australia, from March to August 2019. 
Participants were inactive community-dwelling adults aged 45 to 75 years, recruited through news stories, social media posts, and flyers. The program included access to an artificially intelligent chatbot, Paola, who guided participants through a computer-based individualized introductory session, weekly check-ins, and goal setting, and was available 24/7 to answer questions. Participants used a Garmin Vivofit4 tracker to monitor daily steps, a website with educational materials and recipes, and a printed diet and activity log sheet. Primary outcomes included feasibility (based on recruitment and retention) and preliminary efficacy for changing physical activity and diet. Secondary outcomes were body composition (based on height, weight, and waist circumference) and blood pressure. Results: Over 4 weeks, 99 potential participants registered expressions of interest, with 81 of those screened meeting eligibility criteria. Participants completed a mean of 109.8 (95% CI 1.9-217.7) more minutes of physical activity at week 12 compared with baseline. Mediterranean diet scores increased from a mean of 3.8 out of 14 at baseline, to 9.6 at 12 weeks (mean improvement 5.7 points, 95% CI 4.2-7.3). After 12 weeks, participants lost an average 1.3 kg (95% CI –0.1 to –2.5 kg) and 2.1 cm from their waist circumference (95% CI –3.5 to –0.7 cm). There were no significant changes in blood pressure. Feasibility was excellent in terms of recruitment, retention (90% at 12 weeks), and safety (no adverse events). Conclusions: An artificially intelligent virtual assistant-led lifestyle-modification intervention was feasible and achieved measurable improvements in physical activity, diet, and body composition at 12 weeks. Future research examining artificially intelligent interventions at scale, and for other health purposes, is warranted. 
UR - https://mhealth.jmir.org/2020/7/e17558 UR - https://doi.org/10.2196/17558 UR - http://www.ncbi.nlm.nih.gov/pubmed/32673246 DO - 10.2196/17558 ID - info:doi/10.2196/17558 ER - TY - JOUR AU - Chae, Sang Hoon AU - Kim, Yushin AU - Lee, Kyoung-Soub AU - Park, Hyung-Soon PY - 2020 DA - 2020/7/9 TI - Development and Clinical Evaluation of a Web-Based Upper Limb Home Rehabilitation System Using a Smartwatch and Machine Learning Model for Chronic Stroke Survivors: Prospective Comparative Study JO - JMIR Mhealth Uhealth SP - e17216 VL - 8 IS - 7 KW - home-based rehabilitation KW - artificial intelligence KW - machine learning KW - wearable device KW - smartwatch KW - chronic stroke AB - Background: Recent advancements in wearable sensor technology have shown the feasibility of remote physical therapy at home. In particular, the current COVID-19 pandemic has revealed the need and opportunity of internet-based wearable technology in future health care systems. Previous research has shown the feasibility of human activity recognition technologies for monitoring rehabilitation activities in home environments; however, few comprehensive studies ranging from development to clinical evaluation exist. Objective: This study aimed to (1) develop a home-based rehabilitation (HBR) system that can recognize and record the type and frequency of rehabilitation exercises conducted by the user using a smartwatch and smartphone app equipped with a machine learning (ML) algorithm and (2) evaluate the efficacy of the home-based rehabilitation system through a prospective comparative study with chronic stroke survivors. Methods: The HBR system involves an off-the-shelf smartwatch, a smartphone, and custom-developed apps. A convolutional neural network was used to train the ML algorithm for detecting home exercises. 
To determine the most accurate way to detect the type of home exercise, we compared accuracy results with the data sets of personal or total data and accelerometer, gyroscope, or accelerometer combined with gyroscope data. From March 2018 to February 2019, we conducted a clinical study with two groups of stroke survivors. In total, 17 and 6 participants were enrolled for statistical analysis in the HBR group and control group, respectively. To measure clinical outcomes, we performed the Wolf Motor Function Test (WMFT), Fugl-Meyer Assessment of Upper Extremity, grip power test, Beck Depression Inventory, and range of motion (ROM) assessment of the shoulder joint at 0, 6, and 12 weeks, and at a follow-up assessment 6 weeks after retrieving the HBR system. Results: The ML model created with personal data involving accelerometer combined with gyroscope data (5590/5601, 99.80%) was the most accurate compared with accelerometer (5496/5601, 98.13%) or gyroscope data (5381/5601, 96.07%). In the comparative study, the drop-out rates in the control and HBR groups were 40% (4/10) and 22% (5/22) at 12 weeks and 100% (10/10) and 45% (10/22) at 18 weeks, respectively. The HBR group (n=17) showed a significant improvement in the mean WMFT score (P=.02) and ROM of flexion (P=.004) and internal rotation (P=.001). The control group (n=6) showed a significant change only in shoulder internal rotation (P=.03). Conclusions: This study found that a home care system using a commercial smartwatch and ML model can facilitate participation in home training and improve the functional score of the WMFT and shoulder ROM of flexion and internal rotation in the treatment of patients with chronic stroke. This strategy can possibly be a cost-effective tool for the home care treatment of stroke survivors in the future. 
Trial Registration: Clinical Research Information Service KCT0004818; https://tinyurl.com/y92w978t UR - http://mhealth.jmir.org/2020/7/e17216/ UR - https://doi.org/10.2196/17216 UR - http://www.ncbi.nlm.nih.gov/pubmed/32480361 DO - 10.2196/17216 ID - info:doi/10.2196/17216 ER - TY - JOUR AU - Kim, Bora AU - Kim, Younghoon AU - Park, C Hyung Keun AU - Rhee, Sang Jin AU - Kim, Young Shin AU - Leventhal, Bennett L AU - Ahn, Yong Min AU - Paik, Hyojung PY - 2020 DA - 2020/7/9 TI - Identifying the Medical Lethality of Suicide Attempts Using Network Analysis and Deep Learning: Nationwide Study JO - JMIR Med Inform SP - e14500 VL - 8 IS - 7 KW - suicide KW - deep learning KW - network KW - antecedent behaviors AB - Background: Suicide is one of the leading causes of death among young and middle-aged people. However, little is understood about the behaviors leading up to actual suicide attempts and whether these behaviors are specific to the nature of suicide attempts. Objective: The goal of this study was to examine the clusters of behaviors antecedent to suicide attempts to determine if they could be used to assess the potential lethality of the attempt. To accomplish this goal, we developed a deep learning model using the relationships among behaviors antecedent to suicide attempts and the attempts themselves. Methods: This study used data from the Korea National Suicide Survey. We identified 1112 individuals who attempted suicide and completed a psychiatric evaluation in the emergency room. The 15-item Beck Suicide Intent Scale (SIS) was used for assessing antecedent behaviors, and the medical outcomes of the suicide attempts were measured by assessing lethality with the Columbia Suicide Severity Rating Scale (C-SSRS; lethal suicide attempt >3 and nonlethal attempt ≤3). Results: Using scores from the SIS, individuals who had lethal and nonlethal attempts comprised two different network nodes with the edges representing the relationships among nodes. 
Among the antecedent behaviors, the conception of a method’s lethality predicted suicidal behaviors with severe medical outcomes. The vectorized relationship values among the elements of antecedent behaviors in our deep learning model (E-GONet) increased performances, such as F1 and area under the precision-recall gain curve (AUPRG), for identifying lethal attempts (up to 3% for F1 and 32% for AUPRG), as compared with other models (mean F1: 0.81 for E-GONet, 0.78 for linear regression, and 0.80 for random forest; mean AUPRG: 0.73 for E-GONet, 0.41 for linear regression, and 0.69 for random forest). Conclusions: The relationships among behaviors antecedent to suicide attempts can be used to understand the suicidal intent of individuals and help identify the lethality of potential suicide attempts. Such a model may be useful in prioritizing cases for preventive intervention. UR - http://medinform.jmir.org/2020/7/e14500/ UR - https://doi.org/10.2196/14500 UR - http://www.ncbi.nlm.nih.gov/pubmed/32673253 DO - 10.2196/14500 ID - info:doi/10.2196/14500 ER - TY - JOUR AU - Alami, Hassane AU - Lehoux, Pascale AU - Auclair, Yannick AU - de Guise, Michèle AU - Gagnon, Marie-Pierre AU - Shaw, James AU - Roy, Denis AU - Fleet, Richard AU - Ag Ahmed, Mohamed Ali AU - Fortin, Jean-Paul PY - 2020 DA - 2020/7/7 TI - Artificial Intelligence and Health Technology Assessment: Anticipating a New Level of Complexity JO - J Med Internet Res SP - e17707 VL - 22 IS - 7 KW - artificial intelligence KW - health technology assessment KW - eHealth KW - health care KW - medical device KW - patient KW - health services UR - https://www.jmir.org/2020/7/e17707 UR - https://doi.org/10.2196/17707 UR - http://www.ncbi.nlm.nih.gov/pubmed/32406850 DO - 10.2196/17707 ID - info:doi/10.2196/17707 ER - TY - JOUR AU - Sapci, A Hasan AU - Sapci, H Aylin PY - 2020 DA - 2020/6/30 TI - Artificial Intelligence Education and Tools for Medical and Health Informatics Students: Systematic Review JO - JMIR Med Educ 
SP - e19285 VL - 6 IS - 1 KW - artificial intelligence KW - education KW - machine learning KW - deep learning KW - medical education KW - health informatics KW - systematic review AB - Background: The use of artificial intelligence (AI) in medicine will generate numerous application possibilities to improve patient care, provide real-time data analytics, and enable continuous patient monitoring. Clinicians and health informaticians should become familiar with machine learning and deep learning. Additionally, they should have a strong background in data analytics and data visualization to use, evaluate, and develop AI applications in clinical practice. Objective: The main objective of this study was to evaluate the current state of AI training and the use of AI tools to enhance the learning experience. Methods: A comprehensive systematic review was conducted to analyze the use of AI in medical and health informatics education, and to evaluate existing AI training practices. PRISMA-P (Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols) guidelines were followed. The studies that focused on the use of AI tools to enhance medical education and the studies that investigated teaching AI as a new competency were categorized separately to evaluate recent developments. Results: This systematic review revealed that recent publications recommend the integration of AI training into medical and health informatics curricula. Conclusions: To the best of our knowledge, this is the first systematic review exploring the current state of AI education in both medicine and health informatics. Since AI curricula have not been standardized and competencies have not been determined, a framework for specialized AI training in medical and health informatics education is proposed. 
UR - http://mededu.jmir.org/2020/1/e19285/ UR - https://doi.org/10.2196/19285 UR - http://www.ncbi.nlm.nih.gov/pubmed/32602844 DO - 10.2196/19285 ID - info:doi/10.2196/19285 ER - TY - JOUR AU - Du, Lin PY - 2020 DA - 2020/6/25 TI - Medical Emergency Resource Allocation Model in Large-Scale Emergencies Based on Artificial Intelligence: Algorithm Development JO - JMIR Med Inform SP - e19202 VL - 8 IS - 6 KW - medical emergency KW - resource allocation model KW - distribution model KW - large-scale emergencies KW - artificial intelligence AB - Background: Before major emergencies occur, the government needs to prepare various emergency supplies in advance. To do this, it should consider the coordinated storage of different types of materials while ensuring that emergency materials are not missed or superfluous. Objective: This paper aims to improve the dispatch and transportation efficiency of emergency materials under a model in which the government makes full use of Internet of Things technology and artificial intelligence technology. Methods: The paper established a model for emergency material preparation and dispatch based on queueing theory and further established a workflow system for emergency material preparation, dispatch, and transportation based on a Petri net, resulting in a highly efficient emergency material preparation and dispatch simulation system framework. Results: A decision support platform was designed to integrate all the algorithms and principles proposed. Conclusions: The resulting framework can effectively coordinate the workflow of emergency material preparation and dispatch, helping to shorten the total time of emergency material preparation, dispatch, and transportation. 
UR - http://medinform.jmir.org/2020/6/e19202/ UR - https://doi.org/10.2196/19202 UR - http://www.ncbi.nlm.nih.gov/pubmed/32584262 DO - 10.2196/19202 ID - info:doi/10.2196/19202 ER - TY - JOUR AU - Linden, Brooke AU - Tam-Seto, Linna AU - Stuart, Heather PY - 2020 DA - 2020/6/17 TI - Adherence of the #Here4U App – Military Version to Criteria for the Development of Rigorous Mental Health Apps JO - JMIR Form Res SP - e18890 VL - 4 IS - 6 KW - mental health services KW - telemedicine KW - mHealth KW - chatbot KW - e-solutions KW - Canadian Armed Forces KW - military health KW - mobile phone AB - Background: Over the past several years, the emergence of mobile mental health apps has increased as a potential solution for populations who may face logistical and social barriers to traditional service delivery, including individuals connected to the military. Objective: The goal of the #Here4U App – Military Version is to provide evidence-informed mental health support to members of Canada’s military community, leveraging artificial intelligence in the form of IBM Canada’s Watson Assistant to carry on unique text-based conversations with users, identify presenting mental health concerns, and refer users to self-help resources or recommend professional health care where appropriate. Methods: As the availability and use of mental health apps has increased, so too has the list of recommendations and guidelines for efficacious development. We describe the development and testing conducted between 2018 and 2020 and assess the quality of the #Here4U App against 16 criteria for rigorous mental health app development, as identified by Bakker and colleagues in 2016. Results: The #Here4U App – Military Version met the majority of Bakker and colleagues’ criteria, with those unmet considered not applicable to this particular product or out of scope for research conducted to date. Notably, a formal evaluation of the efficacy of the app is a major priority moving forward. 
Conclusions: The #Here4U App – Military Version is a promising new mental health e-solution for members of the Canadian Armed Forces community, filling many of the gaps left by traditional service delivery. UR - https://formative.jmir.org/2020/6/e18890 UR - https://doi.org/10.2196/18890 UR - http://www.ncbi.nlm.nih.gov/pubmed/32554374 DO - 10.2196/18890 ID - info:doi/10.2196/18890 ER - TY - JOUR AU - Abd-Alrazaq, Alaa AU - Safi, Zeineb AU - Alajlani, Mohannad AU - Warren, Jim AU - Househ, Mowafa AU - Denecke, Kerstin PY - 2020 DA - 2020/6/5 TI - Technical Metrics Used to Evaluate Health Care Chatbots: Scoping Review JO - J Med Internet Res SP - e18301 VL - 22 IS - 6 KW - chatbots KW - conversational agents KW - health care KW - evaluation KW - metrics AB - Background: Dialog agents (chatbots) have a long history of application in health care, where they have been used for tasks such as supporting patient self-management and providing counseling. Their use is expected to grow with increasing demands on health systems and improving artificial intelligence (AI) capability. Approaches to the evaluation of health care chatbots, however, appear to be diverse and haphazard, resulting in a potential barrier to the advancement of the field. Objective: This study aims to identify the technical (nonclinical) metrics used by previous studies to evaluate health care chatbots. Methods: Studies were identified by searching 7 bibliographic databases (eg, MEDLINE and PsycINFO) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. The studies were independently selected by two reviewers who then extracted data from the included studies. Extracted data were synthesized narratively by grouping the identified metrics into categories based on the aspect of chatbots that the metrics evaluated. Results: Of the 1498 citations retrieved, 65 studies were included in this review. 
Chatbots were evaluated using 27 technical metrics, which were related to chatbots as a whole (eg, usability, classifier performance, speed), response generation (eg, comprehensibility, realism, repetitiveness), response understanding (eg, chatbot understanding as assessed by users, word error rate, concept error rate), and esthetics (eg, appearance of the virtual agent, background color, and content). Conclusions: The technical metrics of health chatbot studies were diverse, with survey designs and global usability metrics dominating. The lack of standardization and paucity of objective measures make it difficult to compare the performance of health chatbots and could inhibit advancement of the field. We suggest that researchers more frequently include metrics computed from conversation logs. In addition, we recommend the development of a framework of technical metrics with recommendations for specific circumstances for their inclusion in chatbot studies. UR - http://www.jmir.org/2020/6/e18301/ UR - https://doi.org/10.2196/18301 UR - http://www.ncbi.nlm.nih.gov/pubmed/32442157 DO - 10.2196/18301 ID - info:doi/10.2196/18301 ER - TY - JOUR AU - Bala, Sandeep AU - Keniston, Angela AU - Burden, Marisha PY - 2020 DA - 2020/6/5 TI - Patient Perception of Plain-Language Medical Notes Generated Using Artificial Intelligence Software: Pilot Mixed-Methods Study JO - JMIR Form Res SP - e16670 VL - 4 IS - 6 KW - artificial intelligence KW - patient education KW - natural language processing KW - OpenNotes KW - Open Notes KW - patient-physician relationship KW - simplified notes KW - plain-language notes AB - Background: Clinicians’ time with patients has become increasingly limited due to regulatory burden, documentation and billing, administrative responsibilities, and market forces. These factors limit clinicians’ time to deliver thorough explanations to patients. 
OpenNotes began as a research initiative exploring the ability of sharing medical notes with patients to help patients understand their health care. Providing patients access to their medical notes has been shown to have many benefits, including improved patient satisfaction and clinical outcomes. OpenNotes has since evolved into a national movement that helps clinicians share notes with patients. However, a significant barrier to the widespread adoption of OpenNotes has been clinicians’ concerns that OpenNotes may cost additional time to correct patient confusion over medical language. Recent advances in artificial intelligence (AI) technology may help resolve this concern by converting medical notes to plain language with minimal time required of clinicians. Objective: This pilot study assesses patient comprehension and perceived benefits, concerns, and insights regarding an AI-simplified note through comprehension questions and guided interview. Methods: Synthea, a synthetic patient generator, was used to generate a standardized medical-language patient note which was then simplified using AI software. A multiple-choice comprehension assessment questionnaire was drafted with physician input. Study participants were recruited from inpatients at the University of Colorado Hospital. Participants were randomly assigned to be tested for their comprehension of the standardized medical-language version or AI-generated plain-language version of the patient note. Following this, participants reviewed the opposite version of the note and participated in a guided interview. A Student t test was performed to assess for differences in comprehension assessment scores between plain-language and medical-language note groups. Multivariate modeling was performed to assess the impact of demographic variables on comprehension. Interview responses were thematically analyzed. Results: Twenty patients agreed to participate. 
The mean number of comprehension assessment questions answered correctly was found to be higher in the plain-language group compared with the medical-language group; however, the Student t test was found to be underpowered to determine if this was significant. Age, ethnicity, and health literacy were found to have a significant impact on comprehension scores by multivariate modeling. Thematic analysis of guided interviews highlighted patients’ perceived benefits, concerns, and suggestions regarding such notes. Major themes of benefits were that simplified plain-language notes may (1) be more useable than unsimplified medical-language notes, (2) improve the patient-clinician relationship, and (3) empower patients through an enhanced understanding of their health care. Conclusions: AI software may translate medical notes into plain-language notes that are perceived as beneficial by patients. Limitations included sample size, inpatient-only setting, and possible confounding factors. Larger studies are needed to assess comprehension. Insight from patient responses to guided interviews can guide the future study and development of this technology. UR - https://formative.jmir.org/2020/6/e16670 UR - https://doi.org/10.2196/16670 UR - http://www.ncbi.nlm.nih.gov/pubmed/32442148 DO - 10.2196/16670 ID - info:doi/10.2196/16670 ER - TY - JOUR AU - Fu, Weifeng PY - 2020 DA - 2020/6/3 TI - Application of an Isolated Word Speech Recognition System in the Field of Mental Health Consultation: Development and Usability Study JO - JMIR Med Inform SP - e18677 VL - 8 IS - 6 KW - speech recognition KW - isolated words KW - mental health KW - small vocabulary KW - HMM KW - hidden Markov model KW - programming AB - Background: Speech recognition is a technology that enables machines to understand human language. Objective: In this study, speech recognition of isolated words from a small vocabulary was applied to the field of mental health counseling. 
Methods: A software platform was used to establish a human-machine chat for psychological counselling. The software uses voice recognition technology to decode the user's voice information. The software system analyzes and processes the user's voice information according to many internal related databases, and then gives the user accurate feedback. For users who need psychological treatment, the system provides them with psychological education. Results: The speech recognition system included features such as speech extraction, endpoint detection, feature value extraction, training data, and speech recognition. Conclusions: The Hidden Markov Model was adopted, based on multithread programming under a VC2005 compilation environment, to realize the parallel operation of the algorithm and improve the efficiency of speech recognition. After the design was completed, simulation debugging was performed in the laboratory. The experimental results showed that the designed program met the basic requirements of a speech recognition system. UR - https://medinform.jmir.org/2020/6/e18677 UR - https://doi.org/10.2196/18677 UR - http://www.ncbi.nlm.nih.gov/pubmed/32384054 DO - 10.2196/18677 ID - info:doi/10.2196/18677 ER - TY - JOUR AU - Bian, Yanyan AU - Xiang, Yongbo AU - Tong, Bingdu AU - Feng, Bin AU - Weng, Xisheng PY - 2020 DA - 2020/5/26 TI - Artificial Intelligence–Assisted System in Postoperative Follow-up of Orthopedic Patients: Exploratory Quantitative and Qualitative Study JO - J Med Internet Res SP - e16896 VL - 22 IS - 5 KW - artificial intelligence KW - conversational agent KW - follow-up KW - cost-effectiveness AB - Background: Patient follow-up is an essential part of hospital ward management. With the development of deep learning algorithms, individual follow-up assignments might be completed by artificial intelligence (AI). 
We developed an AI-assisted follow-up conversational agent that can simulate the human voice and select an appropriate follow-up time for quantitative, automatic, and personalized patient follow-up. Patient feedback and voice information could be collected and converted into text data automatically. Objective: The primary objective of this study was to compare the cost-effectiveness of AI-assisted follow-up to manual follow-up of patients after surgery. The secondary objective was to compare the feedback from AI-assisted follow-up to feedback from manual follow-up. Methods: The AI-assisted follow-up system was adopted in the Orthopedic Department of Peking Union Medical College Hospital in April 2019. A total of 270 patients were followed up through this system. Prior to that, 2656 patients were followed up by phone calls manually. Patient characteristics, telephone connection rate, follow-up rate, feedback collection rate, time spent, and feedback composition were compared between the two groups of patients. Results: There was no statistically significant difference in age, gender, or disease between the two groups. There was no significant difference in telephone connection rate (manual: 2478/2656, 93.3%; AI-assisted: 249/270, 92.2%; P=.50) or successful follow-up rate (manual: 2301/2478, 92.9%; AI-assisted: 231/249, 92.8%; P=.96) between the two groups. The time spent on 100 patients in the manual follow-up group was about 9.3 hours. In contrast, the time spent on the AI-assisted follow-up was close to 0 hours. The feedback rate in the AI-assisted follow-up group was higher than that in the manual follow-up group (manual: 68/2656, 2.5%; AI-assisted: 28/270, 10.3%; P<.001). The composition of feedback was different in the two groups. Feedback from the AI-assisted follow-up group mainly included nursing, health education, and hospital environment content, while feedback from the manual follow-up group mostly included medical consultation content. 
Conclusions: The effectiveness of AI-assisted follow-up was not inferior to that of manual follow-up. Human resource costs are saved by AI. AI can help obtain comprehensive feedback from patients, although its depth and pertinence of communication need to be improved. UR - http://www.jmir.org/2020/5/e16896/ UR - https://doi.org/10.2196/16896 UR - http://www.ncbi.nlm.nih.gov/pubmed/32452807 DO - 10.2196/16896 ID - info:doi/10.2196/16896 ER - TY - JOUR AU - Arem, Hannah AU - Scott, Remle AU - Greenberg, Daniel AU - Kaltman, Rebecca AU - Lieberman, Daniel AU - Lewin, Daniel PY - 2020 DA - 2020/5/26 TI - Assessing Breast Cancer Survivors’ Perceptions of Using Voice-Activated Technology to Address Insomnia: Feasibility Study Featuring Focus Groups and In-Depth Interviews JO - JMIR Cancer SP - e15859 VL - 6 IS - 1 KW - artificial intelligence KW - breast neoplasms KW - survivors KW - insomnia KW - cognitive behavioral therapy KW - mobile phones AB - Background: Breast cancer survivors (BCSs) are a growing population with a higher prevalence of insomnia than women of the same age without a history of cancer. Cognitive behavioral therapy for insomnia (CBT-I) has been shown to be effective in this population, but it is not widely available to those who need it. Objective: This study aimed to better understand BCSs’ experiences with insomnia and to explore the feasibility and acceptability of delivering CBT-I using a virtual assistant (Amazon Alexa). Methods: We first conducted a formative phase with 2 focus groups and 3 in-depth interviews to understand BCSs’ perceptions of insomnia as well as their interest in and comfort with using a virtual assistant to learn about CBT-I. We then developed a prototype incorporating participant preferences and CBT-I components and demonstrated it in group and individual settings to BCSs to evaluate acceptability, interest, perceived feasibility, educational potential, and usability of the prototype. 
We also collected open-ended feedback on the content and used frequencies to describe the quantitative data. Results: We recruited 11 BCSs with insomnia in the formative phase and 14 BCSs in the prototype demonstration. In formative work, anxiety, fear, and hot flashes were identified as causes of insomnia. After prototype demonstration, nearly 79% (11/14) of participants reported an interest in and perceived feasibility of using the virtual assistant to record sleep patterns. Approximately two-thirds of the participants thought lifestyle modification (9/14, 64%) and sleep restriction (9/14, 64%) would be feasible and were interested in this feature of the program (10/14, 71% and 9/14, 64%, respectively). Relaxation exercises were rated as interesting and feasible using the virtual assistant by 71% (10/14) of the participants. Usability was rated as better than average, and all women reported that they would recommend the program to friends and family. Conclusions: This virtual assistant prototype delivering CBT-I components by using a smart speaker was rated as feasible and acceptable, suggesting that this prototype should be fully developed and tested for efficacy in the BCS population. If efficacy is shown in this population, the prototype should also be adapted for other high-risk populations. 
UR - http://cancer.jmir.org/2020/1/e15859/ UR - https://doi.org/10.2196/15859 UR - http://www.ncbi.nlm.nih.gov/pubmed/32348274 DO - 10.2196/15859 ID - info:doi/10.2196/15859 ER - TY - JOUR AU - Park, Soo Jin AU - Lee, Eun Ji AU - Kim, Se Ik AU - Kong, Seong-Ho AU - Jeong, Chang Wook AU - Kim, Hee Seung PY - 2020 DA - 2020/5/15 TI - Clinical Desire for an Artificial Intelligence–Based Surgical Assistant System: Electronic Survey–Based Study JO - JMIR Med Inform SP - e17647 VL - 8 IS - 5 KW - artificial intelligence KW - solo surgery KW - laparoscopic surgery AB - Background: Techniques utilizing artificial intelligence (AI) are rapidly growing in medical research and development, especially in the operating room. However, the application of AI in the operating room has been limited to small tasks or software, such as clinical decision systems. It still largely depends on human resources and technology involving the surgeons’ hands. Therefore, we conceptualized AI-based solo surgery (AISS) defined as laparoscopic surgery conducted by only one surgeon with support from an AI-based surgical assistant system, and we performed an electronic survey on the clinical desire for such a system. Objective: This study aimed to evaluate the experiences of surgeons who have performed laparoscopic surgery, the limitations of conventional laparoscopic surgical systems, and the desire for an AI-based surgical assistant system for AISS. Methods: We performed an online survey for gynecologists, urologists, and general surgeons from June to August 2017. The questionnaire consisted of six items about experience, two about limitations, and five about the clinical desire for an AI-based surgical assistant system for AISS. Results: A total of 508 surgeons who have performed laparoscopic surgery responded to the survey. 
Most of the surgeons needed two or more assistants during laparoscopic surgery, and the rate was higher among gynecologists (251/278, 90.3%) than among general surgeons (123/173, 71.1%) and urologists (35/57, 61.4%). The majority of responders answered that the skillfulness of surgical assistants was “very important” or “important.” The most uncomfortable aspect of laparoscopic surgery was unskilled movement of the camera (431/508, 84.8%) and instruments (303/508, 59.6%). About 40% (199/508, 39.1%) of responders answered that the AI-based surgical assistant system could substitute 41%-60% of the current workforce, and 83.3% (423/508) showed willingness to buy the system. Furthermore, the most reasonable price was US $30,000-50,000. Conclusions: Surgeons who perform laparoscopic surgery may feel discomfort with the conventional laparoscopic surgical system in terms of assistant skillfulness, and they may think that the skillfulness of surgical assistants is essential. They desire to alleviate present inconveniences with the conventional laparoscopic surgical system and to perform a safe and comfortable operation by using an AI-based surgical assistant system for AISS. UR - http://medinform.jmir.org/2020/5/e17647/ UR - https://doi.org/10.2196/17647 UR - http://www.ncbi.nlm.nih.gov/pubmed/32412421 DO - 10.2196/17647 ID - info:doi/10.2196/17647 ER - TY - JOUR AU - Abdullah, Rana AU - Fakieh, Bahjat PY - 2020 DA - 2020/5/14 TI - Health Care Employees’ Perceptions of the Use of Artificial Intelligence Applications: Survey Study JO - J Med Internet Res SP - e17620 VL - 22 IS - 5 KW - artificial intelligence KW - employees KW - healthcare sector KW - perception KW - Saudi Arabia AB - Background: The advancement of health care information technology and the emergence of artificial intelligence has yielded tools to improve the quality of various health care processes. 
Few studies have investigated employee perceptions of artificial intelligence implementation in Saudi Arabia and the Arabian world. In addition, limited studies investigated the effect of employee knowledge and job title on the perception of artificial intelligence implementation in the workplace. Objective: The aim of this study was to explore health care employee perceptions and attitudes toward the implementation of artificial intelligence technologies in health care institutions in Saudi Arabia. Methods: An online questionnaire was published, and responses were collected from 250 employees, including doctors, nurses, and technicians at 4 of the largest hospitals in Riyadh, Saudi Arabia. Results: The results of this study showed that 3.11 of 4 respondents feared artificial intelligence would replace employees and had a general lack of knowledge regarding artificial intelligence. In addition, most respondents were unaware of the advantages and most common challenges to artificial intelligence applications in the health sector, indicating a need for training. The results also showed that technicians were the most frequently impacted by artificial intelligence applications due to the nature of their jobs, which do not require much direct human interaction. Conclusions: The Saudi health care sector presents an advantageous market potential that should be attractive to researchers and developers of artificial intelligence solutions. 
UR - http://www.jmir.org/2020/5/e17620/ UR - https://doi.org/10.2196/17620 UR - http://www.ncbi.nlm.nih.gov/pubmed/32406857 DO - 10.2196/17620 ID - info:doi/10.2196/17620 ER - TY - JOUR AU - Liang, Bin AU - Yang, Na AU - He, Guosheng AU - Huang, Peng AU - Yang, Yong PY - 2020 DA - 2020/4/29 TI - Identification of the Facial Features of Patients With Cancer: A Deep Learning–Based Pilot Study JO - J Med Internet Res SP - e17234 VL - 22 IS - 4 KW - convolutional neural network KW - facial features KW - cancer patient KW - deep learning KW - cancer AB - Background: Cancer has become the second leading cause of death globally. Most cancer cases are due to genetic mutations, which affect metabolism and result in facial changes. Objective: In this study, we aimed to identify the facial features of patients with cancer using the deep learning technique. Methods: Images of faces of patients with cancer were collected to build the cancer face image data set. A face image data set of people without cancer was built by randomly selecting images from the publicly available MegaAge data set according to the sex and age distribution of the cancer face image data set. Each face image was preprocessed to obtain an upright centered face chip, following which the background was filtered out to exclude the effects of nonrelative factors. A residual neural network was constructed to classify cancer and noncancer cases. Transfer learning, minibatches, few epochs, L2 regularization, and random dropout training strategies were used to prevent overfitting. Moreover, guided gradient-weighted class activation mapping was used to reveal the relevant features. Results: A total of 8124 face images of patients with cancer (men: n=3851, 47.4%; women: n=4273, 52.6%) were collected from January 2018 to January 2019. The ages of the patients ranged from 1 year to 70 years (median age 52 years). 
The average faces of both male and female patients with cancer displayed more obvious facial adiposity than the average faces of people without cancer, which was supported by a landmark comparison. When testing the data set, the training process was terminated after 5 epochs. The area under the receiver operating characteristic curve was 0.94, and the accuracy rate was 0.82. The main relative feature of cancer cases was facial skin, while the relative features of noncancer cases were extracted from the complementary face region. Conclusions: In this study, we built a face data set of patients with cancer and constructed a deep learning model to classify the faces of people with and those without cancer. We found that facial skin and adiposity were closely related to the presence of cancer. UR - http://www.jmir.org/2020/4/e17234/ UR - https://doi.org/10.2196/17234 UR - http://www.ncbi.nlm.nih.gov/pubmed/32347802 DO - 10.2196/17234 ID - info:doi/10.2196/17234 ER - TY - JOUR AU - Falissard, Louis AU - Morgand, Claire AU - Roussel, Sylvie AU - Imbaud, Claire AU - Ghosn, Walid AU - Bounebache, Karim AU - Rey, Grégoire PY - 2020 DA - 2020/4/28 TI - A Deep Artificial Neural Network−Based Model for Prediction of Underlying Cause of Death From Death Certificates: Algorithm Development and Validation JO - JMIR Med Inform SP - e17125 VL - 8 IS - 4 KW - machine learning KW - deep learning KW - mortality statistics KW - underlying cause of death AB - Background: Coding of underlying causes of death from death certificates is a process that is nowadays undertaken mostly by humans with potential assistance from expert systems, such as the Iris software. It is, consequently, an expensive process that can, in addition, suffer from geospatial discrepancies, thus severely impairing the comparability of death statistics at the international level. 
The recent advances in artificial intelligence, specifically the rise of deep learning methods, have enabled computers to make efficient decisions on a number of complex problems that were typically considered out of reach without human assistance; they require a considerable amount of data to learn from, which is typically their main limiting factor. However, the CépiDc (Centre d’épidémiologie sur les causes médicales de Décès) stores an exhaustive database of death certificates at the French national scale, amounting to several millions of training examples available for the machine learning practitioner. Objective: This article investigates the application of deep neural network methods to coding underlying causes of death. Methods: The investigated dataset was based on data contained in every French death certificate from 2000 to 2015, containing information such as the subject’s age and gender, as well as the chain of events leading to his or her death, for a total of around 8 million observations. The task of automatically coding the subject’s underlying cause of death was then formulated as a predictive modelling problem. A deep neural network−based model was then designed and fit to the dataset. Its error rate was then assessed on an exterior test dataset and compared to the current state-of-the-art (ie, the Iris software). Statistical significance of the proposed approach’s superiority was assessed via bootstrap. Results: The proposed approach resulted in a test accuracy of 97.8% (95% CI 97.7-97.9), which constitutes a significant improvement over the current state-of-the-art and its accuracy of 74.5% (95% CI 74.0-75.0) assessed on the same test example. Such an improvement opens up a whole field of new applications, from nosologist-level batch-automated coding to international and temporal harmonization of cause of death statistics. A typical example of such an application is demonstrated by recoding French overdose-related deaths from 2000 to 2010. 
Conclusions: This article shows that deep artificial neural networks are perfectly suited to the analysis of electronic health records and can learn a complex set of medical rules directly from voluminous datasets, without any explicit prior knowledge. Although not entirely free from mistakes, the derived algorithm constitutes a powerful decision-making tool that is able to handle structured medical data with an unprecedented performance. We strongly believe that the methods developed in this article are highly reusable in a variety of settings related to epidemiology, biostatistics, and the medical sciences in general. UR - http://medinform.jmir.org/2020/4/e17125/ UR - https://doi.org/10.2196/17125 UR - http://www.ncbi.nlm.nih.gov/pubmed/32343252 DO - 10.2196/17125 ID - info:doi/10.2196/17125 ER - TY - JOUR AU - Buchanan, Christine AU - Howitt, M Lyndsay AU - Wilson, Rita AU - Booth, Richard G AU - Risling, Tracie AU - Bamford, Megan PY - 2020 DA - 2020/4/16 TI - Nursing in the Age of Artificial Intelligence: Protocol for a Scoping Review JO - JMIR Res Protoc SP - e17490 VL - 9 IS - 4 KW - nursing KW - artificial intelligence KW - machine learning KW - robotics KW - compassionate care KW - scoping review AB - Background: It is predicted that digital health technologies that incorporate artificial intelligence will transform health care delivery in the next decade. Little research has explored how emerging trends in artificial intelligence–driven digital health technologies may influence the relationship between nurses and patients. Objective: The purpose of this scoping review is to summarize the findings from 4 research questions regarding emerging trends in artificial intelligence–driven digital health technologies and their influence on nursing practice across the 5 domains outlined by the Canadian Nurses Association framework: administration, clinical care, education, policy, and research. 
Specifically, this scoping review will examine how emerging trends will transform the roles and functions of nurses over the next 10 years and beyond. Methods: Using an established scoping review methodology, MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Embase, PsycINFO, Cochrane Database of Systematic Reviews, Cochrane Central, Education Resources Information Centre, Scopus, Web of Science, and Proquest databases were searched. In addition to the electronic database searches, a targeted website search will be performed to access relevant grey literature. Abstracts and full-text studies will be independently screened by 2 reviewers using prespecified inclusion and exclusion criteria. Included literature will focus on nursing and digital health technologies that incorporate artificial intelligence. Data will be charted using a structured form and narratively summarized. Results: Electronic database searches have retrieved 10,318 results. The scoping review and subsequent briefing paper will be completed by the fall of 2020. Conclusions: A symposium will be held to share insights gained from this scoping review with key thought leaders and a cross section of stakeholders from administration, clinical care, education, policy, and research as well as patient advocates. The symposium will provide a forum to explore opportunities for action to advance the future of nursing in a technological world and, more specifically, nurses’ delivery of compassionate care in the age of artificial intelligence. Results from the symposium will be summarized in the form of a briefing paper and widely disseminated to relevant stakeholders. 
International Registered Report Identifier (IRRID): DERR1-10.2196/17490 UR - http://www.researchprotocols.org/2020/4/e17490/ UR - https://doi.org/10.2196/17490 UR - http://www.ncbi.nlm.nih.gov/pubmed/32297873 DO - 10.2196/17490 ID - info:doi/10.2196/17490 ER - TY - JOUR AU - King, Andrew J AU - Cooper, Gregory F AU - Clermont, Gilles AU - Hochheiser, Harry AU - Hauskrecht, Milos AU - Sittig, Dean F AU - Visweswaran, Shyam PY - 2020 DA - 2020/4/2 TI - Leveraging Eye Tracking to Prioritize Relevant Medical Record Data: Comparative Machine Learning Study JO - J Med Internet Res SP - e15876 VL - 22 IS - 4 KW - electronic medical record system KW - eye tracking KW - machine learning KW - intensive care unit KW - information-seeking behavior AB - Background: Electronic medical record (EMR) systems capture large amounts of data per patient and present that data to physicians with little prioritization. Without prioritization, physicians must mentally identify and collate relevant data, an activity that can lead to cognitive overload. To mitigate cognitive overload, a Learning EMR (LEMR) system prioritizes the display of relevant medical record data. Relevant data are those that are pertinent to a context—defined as the combination of the user, clinical task, and patient case. To determine which data are relevant in a specific context, a LEMR system uses supervised machine learning models of physician information-seeking behavior. Since obtaining information-seeking behavior data via manual annotation is slow and expensive, automatic methods for capturing such data are needed. Objective: The goal of the research was to propose and evaluate eye tracking as a high-throughput method to automatically acquire physician information-seeking behavior useful for training models for a LEMR system. Methods: Critical care medicine physicians reviewed intensive care unit patient cases in an EMR interface developed for the study. 
Participants manually identified patient data that were relevant in the context of a clinical task: preparing a patient summary to present at morning rounds. We used eye tracking to capture each physician’s gaze dwell time on each data item (eg, blood glucose measurements). Manual annotations and gaze dwell times were used to define target variables for developing supervised machine learning models of physician information-seeking behavior. We compared the performance of manual selection and gaze-derived models on an independent set of patient cases. Results: A total of 68 pairs of manual selection and gaze-derived machine learning models were developed from training data and evaluated on an independent evaluation data set. A paired Wilcoxon signed-rank test showed similar performance of manual selection and gaze-derived models on area under the receiver operating characteristic curve (P=.40). Conclusions: We used eye tracking to automatically capture physician information-seeking behavior and used it to train models for a LEMR system. The models that were trained using eye tracking performed like models that were trained using manual annotations. These results support further development of eye tracking as a high-throughput method for training clinical decision support systems that prioritize the display of relevant medical record data. 
UR - https://www.jmir.org/2020/4/e15876 UR - https://doi.org/10.2196/15876 UR - http://www.ncbi.nlm.nih.gov/pubmed/32238342 DO - 10.2196/15876 ID - info:doi/10.2196/15876 ER - TY - JOUR AU - Schoeb, Dominik AU - Suarez-Ibarrola, Rodrigo AU - Hein, Simon AU - Dressler, Franz Friedrich AU - Adams, Fabian AU - Schlager, Daniel AU - Miernik, Arkadiusz PY - 2020 DA - 2020/3/30 TI - Use of Artificial Intelligence for Medical Literature Search: Randomized Controlled Trial Using the Hackathon Format JO - Interact J Med Res SP - e16606 VL - 9 IS - 1 KW - artificial intelligence KW - literature review KW - medical information technology AB - Background: Mapping out the research landscape around a project is often time consuming and difficult. Objective: This study evaluates a commercial artificial intelligence (AI) search engine (IRIS.AI) for its applicability in an automated literature search on a specific medical topic. Methods: To evaluate the AI search engine in a standardized manner, the concept of a science hackathon was applied. Three groups of researchers were tasked with performing a literature search on a clearly defined scientific project. All participants had a high level of expertise for this specific field of research. Two groups were given access to the AI search engine IRIS.AI. All groups were given the same amount of time for their search and were instructed to document their results. Search results were summarized and ranked according to a predetermined scoring system. Results: The final scoring awarded 49 and 39 points out of 60 to AI groups 1 and 2, respectively, and the control group received 46 points. A total of 20 scientific studies with high relevance were identified, and 5 highly relevant studies (“spot on”) were reported by each group. Conclusions: AI technology is a promising approach to facilitate literature searches and the management of medical libraries. 
In this study, however, the application of AI technology led to a more focused literature search without a significant improvement in the number of results. UR - http://www.i-jmr.org/2020/1/e16606/ UR - https://doi.org/10.2196/16606 UR - http://www.ncbi.nlm.nih.gov/pubmed/32224481 DO - 10.2196/16606 ID - info:doi/10.2196/16606 ER - TY - JOUR AU - Ta, Vivian AU - Griffith, Caroline AU - Boatfield, Carolynn AU - Wang, Xinyu AU - Civitello, Maria AU - Bader, Haley AU - DeCero, Esther AU - Loggarakis, Alexia PY - 2020 DA - 2020/3/6 TI - User Experiences of Social Support From Companion Chatbots in Everyday Contexts: Thematic Analysis JO - J Med Internet Res SP - e16235 VL - 22 IS - 3 KW - artificial intelligence KW - social support KW - artificial agents KW - chatbots KW - interpersonal relations AB - Background: Previous research suggests that artificial agents may be a promising source of social support for humans. However, the bulk of this research has been conducted in the context of social support interventions that specifically address stressful situations or health improvements. Little research has examined social support received from artificial agents in everyday contexts. Objective: Considering that social support manifests in not only crises but also everyday situations and that everyday social support forms the basis of support received during more stressful events, we aimed to investigate the types of everyday social support that can be received from artificial agents. Methods: In Study 1, we examined publicly available user reviews (N=1854) of Replika, a popular companion chatbot. In Study 2, a sample (n=66) of Replika users provided detailed open-ended responses regarding their experiences of using Replika. We conducted thematic analysis on both datasets to gain insight into the kind of everyday social support that users receive through interactions with Replika. 
Results: Replika provides some level of companionship that can help curtail loneliness, provide a “safe space” in which users can discuss any topic without the fear of judgment or retaliation, increase positive affect through uplifting and nurturing messages, and provide helpful information/advice when normal sources of informational support are not available. Conclusions: Artificial agents may be a promising source of everyday social support, particularly companionship, emotional, informational, and appraisal support, but not as tangible support. Future studies are needed to determine who might benefit from these types of everyday social support the most and why. These results could potentially be used to help address global health issues or other crises early on in everyday situations before they potentially manifest into larger issues. UR - http://www.jmir.org/2020/2/e16235/ UR - https://doi.org/10.2196/16235 UR - http://www.ncbi.nlm.nih.gov/pubmed/32141837 DO - 10.2196/16235 ID - info:doi/10.2196/16235 ER - TY - JOUR AU - Wolff, Justus AU - Pauling, Josch AU - Keck, Andreas AU - Baumbach, Jan PY - 2020 DA - 2020/2/20 TI - The Economic Impact of Artificial Intelligence in Health Care: Systematic Review JO - J Med Internet Res SP - e16866 VL - 22 IS - 2 KW - telemedicine KW - artificial intelligence KW - machine learning KW - cost-benefit analysis AB - Background: Positive economic impact is a key decision factor in making the case for or against investing in an artificial intelligence (AI) solution in the health care industry. It is most relevant for the care provider and insurer as well as for the pharmaceutical and medical technology sectors. Although the broad economic impact of digital health solutions in general has been assessed many times in literature and the benefit for patients and society has also been analyzed, the specific economic impact of AI in health care has been addressed only sporadically. 
Objective: This study aimed to systematically review and summarize the cost-effectiveness studies dedicated to AI in health care and to assess whether they meet the established quality criteria. Methods: In a first step, the quality criteria for economic impact studies were defined based on the established and adapted criteria schemes for cost impact assessments. In a second step, a systematic literature review based on qualitative and quantitative inclusion and exclusion criteria was conducted to identify relevant publications for an in-depth analysis of the economic impact assessment. In a final step, the quality of the identified economic impact studies was evaluated based on the defined quality criteria for cost-effectiveness studies. Results: Very few publications have thoroughly addressed the economic impact assessment, and the economic assessment quality of the reviewed publications on AI shows severe methodological deficits. Only 6 out of 66 publications could be included in the second step of the analysis based on the inclusion criteria. Out of these 6 studies, none comprised a methodologically complete cost impact analysis. There are two areas for improvement in future studies. First, the initial investment and operational costs for the AI infrastructure and service need to be included. Second, alternatives to achieve similar impact must be evaluated to provide a comprehensive comparison. Conclusions: This systematic literature analysis proved that the existing impact assessments show methodological deficits and that upcoming evaluations require more comprehensive economic analyses to enable economic decisions for or against implementing AI technology in health care. 
UR - http://www.jmir.org/2020/2/e16866/ UR - https://doi.org/10.2196/16866 UR - http://www.ncbi.nlm.nih.gov/pubmed/32130134 DO - 10.2196/16866 ID - info:doi/10.2196/16866 ER - TY - JOUR AU - Li, Xiaojin AU - Tao, Shiqiang AU - Jamal-Omidi, Shirin AU - Huang, Yan AU - Lhatoo, Samden D AU - Zhang, Guo-Qiang AU - Cui, Licong PY - 2020 DA - 2020/2/14 TI - Detection of Postictal Generalized Electroencephalogram Suppression: Random Forest Approach JO - JMIR Med Inform SP - e17061 VL - 8 IS - 2 KW - epilepsy KW - generalized tonic-clonic seizure KW - postictal generalized EEG suppression KW - EEG KW - random forest AB - Background: Sudden unexpected death in epilepsy (SUDEP) is second only to stroke in neurological events resulting in years of potential life lost. Postictal generalized electroencephalogram (EEG) suppression (PGES) is a period of suppressed brain activity often occurring after generalized tonic-clonic seizure, a most significant risk factor for SUDEP. Therefore, PGES has been considered as a potential biomarker for SUDEP risk. Automatic PGES detection tools can address the limitations of labor-intensive, and sometimes inconsistent, visual analysis. A successful approach to automatic PGES detection must overcome computational challenges involved in the detection of subtle amplitude changes in EEG recordings, which may contain physiological and acquisition artifacts. Objective: This study aimed to present a random forest approach for automatic PGES detection using multichannel human EEG recordings acquired in epilepsy monitoring units. Methods: We used a combination of temporal, frequency, wavelet, and interchannel correlation features derived from EEG signals to train a random forest classifier. We also constructed and applied confidence-based correction rules based on PGES state changes. Motivated by practical utility, we introduced a new, time distance–based evaluation method for assessing the performance of PGES detection algorithms. 
Results: The time distance–based evaluation showed that our approach achieved a 5-second tolerance-based positive prediction rate of 0.95 for artifact-free signals. For signals with different artifact levels, our prediction rates varied from 0.68 to 0.81. Conclusions: We introduced a feature-based, random forest approach for automatic PGES detection using multichannel EEG recordings. Our approach achieved increasingly better time distance–based performance with reduced signal artifact levels. Further study is needed for PGES detection algorithms to perform well irrespective of the levels of signal artifacts. UR - https://medinform.jmir.org/2020/2/e17061 UR - https://doi.org/10.2196/17061 UR - http://www.ncbi.nlm.nih.gov/pubmed/32130173 DO - 10.2196/17061 ID - info:doi/10.2196/17061 ER - TY - JOUR AU - Song, Xing AU - Waitman, Lemuel R AU - Yu, Alan SL AU - Robbins, David C AU - Hu, Yong AU - Liu, Mei PY - 2020 DA - 2020/1/31 TI - Longitudinal Risk Prediction of Chronic Kidney Disease in Diabetic Patients Using a Temporal-Enhanced Gradient Boosting Machine: Retrospective Cohort Study JO - JMIR Med Inform SP - e15510 VL - 8 IS - 1 KW - diabetic kidney disease KW - diabetic nephropathy KW - chronic kidney disease KW - machine learning AB - Background: Artificial intelligence–enabled electronic health record (EHR) analysis can revolutionize medical practice from the diagnosis and prediction of complex diseases to making recommendations in patient care, especially for chronic conditions such as chronic kidney disease (CKD), which is one of the most frequent complications in patients with diabetes and is associated with substantial morbidity and mortality. Objective: The longitudinal prediction of health outcomes requires effective representation of temporal data in the EHR. 
In this study, we proposed a novel temporal-enhanced gradient boosting machine (GBM) model that dynamically updates and ensembles learners based on new events in patient timelines to improve the prediction accuracy of CKD among patients with diabetes. Methods: Using a broad spectrum of deidentified EHR data on a retrospective cohort of 14,039 adult patients with type 2 diabetes and GBM as the base learner, we validated our proposed Landmark-Boosting model against three state-of-the-art temporal models for rolling predictions of 1-year CKD risk. Results: The proposed model uniformly outperformed other models, achieving an area under receiver operating curve of 0.83 (95% CI 0.76-0.85), 0.78 (95% CI 0.75-0.82), and 0.82 (95% CI 0.78-0.86) in predicting CKD risk with automatic accumulation of new data in later years (years 2, 3, and 4 since diabetes mellitus onset, respectively). The Landmark-Boosting model also maintained the best calibration across moderate- and high-risk groups and over time. The experimental results demonstrated that the proposed temporal model can not only accurately predict 1-year CKD risk but also improve performance over time with additionally accumulated data, which is essential for clinical use to improve renal management of patients with diabetes. Conclusions: Incorporation of temporal information in EHR data can significantly improve predictive model performance and will particularly benefit patients who follow-up with their physicians as recommended. 
UR - http://medinform.jmir.org/2020/1/e15510/ UR - https://doi.org/10.2196/15510 UR - http://www.ncbi.nlm.nih.gov/pubmed/32012067 DO - 10.2196/15510 ID - info:doi/10.2196/15510 ER - TY - JOUR AU - Meyer, Ashley N D AU - Giardina, Traber D AU - Spitzmueller, Christiane AU - Shahid, Umber AU - Scott, Taylor M T AU - Singh, Hardeep PY - 2020 DA - 2020/1/30 TI - Patient Perspectives on the Usefulness of an Artificial Intelligence–Assisted Symptom Checker: Cross-Sectional Survey Study JO - J Med Internet Res SP - e14679 VL - 22 IS - 1 KW - clinical decision support systems KW - technology KW - diagnosis KW - patient safety KW - symptom checker KW - computer-assisted diagnosis AB - Background: Patients are increasingly seeking Web-based symptom checkers to obtain diagnoses. However, little is known about the characteristics of the patients who use these resources, their rationale for use, and whether they find them accurate and useful. Objective: The study aimed to examine patients’ experiences using an artificial intelligence (AI)–assisted online symptom checker. Methods: An online survey was administered between March 2, 2018, through March 15, 2018, to US users of the Isabel Symptom Checker within 6 months of their use. User characteristics, experiences of symptom checker use, experiences discussing results with physicians, and prior personal history of experiencing a diagnostic error were collected. Results: A total of 329 usable responses was obtained. The mean respondent age was 48.0 (SD 16.7) years; most were women (230/304, 75.7%) and white (271/304, 89.1%). Patients most commonly used the symptom checker to better understand the causes of their symptoms (232/304, 76.3%), followed by for deciding whether to seek care (101/304, 33.2%) or where (eg, primary or urgent care: 63/304, 20.7%), obtaining medical advice without going to a doctor (48/304, 15.8%), and understanding their diagnoses better (39/304, 12.8%). 
Most patients reported receiving useful information for their health problems (274/304, 90.1%), with half reporting positive health effects (154/302, 51.0%). Most patients perceived it to be useful as a diagnostic tool (253/301, 84.1%), as a tool providing insights leading them closer to correct diagnoses (231/303, 76.2%), and reported they would use it again (278/304, 91.4%). Patients who discussed findings with their physicians (103/213, 48.4%) more often felt physicians were interested (42/103, 40.8%) than not interested in learning about the tool’s results (24/103, 23.3%) and more often felt physicians were open (62/103, 60.2%) than not open (21/103, 20.4%) to discussing the results. Compared with patients who had not previously experienced diagnostic errors (missed or delayed diagnoses: 123/304, 40.5%), patients who had previously experienced diagnostic errors (181/304, 59.5%) were more likely to use the symptom checker to determine where they should seek care (15/123, 12.2% vs 48/181, 26.5%; P=.002), but they less often felt that physicians were interested in discussing the tool’s results (20/34, 59% vs 22/69, 32%; P=.04). Conclusions: Despite ongoing concerns about symptom checker accuracy, a large patient-user group perceived an AI-assisted symptom checker as useful for diagnosis. Formal validation studies evaluating symptom checker accuracy and effectiveness in real-world practice could provide additional useful information about their benefit. 
UR - http://www.jmir.org/2020/1/e14679/ UR - https://doi.org/10.2196/14679 UR - http://www.ncbi.nlm.nih.gov/pubmed/32012052 DO - 10.2196/14679 ID - info:doi/10.2196/14679 ER - TY - JOUR AU - Prieto, José Tomás AU - Scott, Kenneth AU - McEwen, Dean AU - Podewils, Laura J AU - Al-Tayyib, Alia AU - Robinson, James AU - Edwards, David AU - Foldy, Seth AU - Shlay, Judith C AU - Davidson, Arthur J PY - 2020 DA - 2020/1/3 TI - The Detection of Opioid Misuse and Heroin Use From Paramedic Response Documentation: Machine Learning for Improved Surveillance JO - J Med Internet Res SP - e15645 VL - 22 IS - 1 KW - naloxone KW - emergency medical services KW - natural language processing KW - heroin KW - substance-related disorders KW - opioid crisis KW - artificial intelligence AB - Background: Timely, precise, and localized surveillance of nonfatal events is needed to improve response and prevention of opioid-related problems in an evolving opioid crisis in the United States. Records of naloxone administration found in prehospital emergency medical services (EMS) data have helped estimate opioid overdose incidence, including nonhospital, field-treated cases. However, as naloxone is often used by EMS personnel in unconsciousness of unknown cause, attributing naloxone administration to opioid misuse and heroin use (OM) may misclassify events. Better methods are needed to identify OM. Objective: This study aimed to develop and test a natural language processing method that would improve identification of potential OM from paramedic documentation. Methods: First, we searched Denver Health paramedic trip reports from August 2017 to April 2018 for keywords naloxone, heroin, and both combined, and we reviewed narratives of identified reports to determine whether they constituted true cases of OM. 
Then, we used this human classification as reference standard and trained 4 machine learning models (random forest, k-nearest neighbors, support vector machines, and L1-regularized logistic regression). We selected the algorithm that produced the highest area under the receiver operating curve (AUC) for model assessment. Finally, we compared positive predictive value (PPV) of the highest performing machine learning algorithm with PPV of searches of keywords naloxone, heroin, and combination of both in the binary classification of OM in unseen September 2018 data. Results: In total, 54,359 trip reports were filed from August 2017 to April 2018. Approximately 1.09% (594/54,359) indicated naloxone administration. Among trip reports with reviewer agreement regarding OM in the narrative, 57.6% (292/516) were considered to include information revealing OM. Approximately 1.63% (884/54,359) of all trip reports mentioned heroin in the narrative. Among trip reports with reviewer agreement, 95.5% (784/821) were considered to include information revealing OM. Combined results accounted for 2.39% (1298/54,359) of trip reports. Among trip reports with reviewer agreement, 77.79% (907/1166) were considered to include information consistent with OM. The reference standard used to train and test machine learning models included details of 1166 trip reports. L1-regularized logistic regression was the highest performing algorithm (AUC=0.94; 95% CI 0.91-0.97) in identifying OM. Tested on 5983 unseen reports from September 2018, the keyword naloxone inaccurately identified and underestimated probable OM trip report cases (63 cases; PPV=0.68). The keyword heroin yielded more cases with improved performance (129 cases; PPV=0.99). Combined keyword and L1-regularized logistic regression classifier further improved performance (146 cases; PPV=0.99). Conclusions: A machine learning application enhanced the effectiveness of finding OM among documented paramedic field responses. 
This approach to refining OM surveillance may lead to improved first-responder and public health responses toward prevention of overdoses and other opioid-related problems in US communities. UR - https://www.jmir.org/2020/1/e15645 UR - https://doi.org/10.2196/15645 UR - http://www.ncbi.nlm.nih.gov/pubmed/31899451 DO - 10.2196/15645 ID - info:doi/10.2196/15645 ER - TY - JOUR AU - Holdener, Marianne AU - Gut, Alain AU - Angerer, Alfred PY - 2020 DA - 2020/1/3 TI - Applicability of the User Engagement Scale to Mobile Health: A Survey-Based Quantitative Study JO - JMIR Mhealth Uhealth SP - e13244 VL - 8 IS - 1 KW - mobile health KW - mhealth KW - mobile apps KW - user engagement KW - measurement KW - user engagement scale KW - chatbot AB - Background: There has recently been exponential growth in the development and use of health apps on mobile phones. As with most mobile apps, however, the majority of users abandon them quickly and after minimal use. One of the most critical factors for the success of a health app is how to support users’ commitment to their health. Despite increased interest from researchers in mobile health, few studies have examined the measurement of user engagement with health apps. Objective: User engagement is a multidimensional, complex phenomenon. The aim of this study was to understand the concept of user engagement and, in particular, to demonstrate the applicability of a user engagement scale (UES) to mobile health apps. Methods: To determine the measurability of user engagement in a mobile health context, a UES was employed, which is a psychometric tool to measure user engagement with a digital system. This was adapted to Ada, developed by Ada Health, an artificial intelligence–powered personalized health guide that helps people understand their health. A principal component analysis (PCA) with varimax rotation was conducted on 30 items. In addition, sum scores as means of each subscale were calculated. 
Results: Survey data from 73 Ada users were analyzed. PCA was determined to be suitable, as verified by the sampling adequacy of Kaiser-Meyer-Olkin=0.858, a significant Bartlett test of sphericity (χ²₃₀₀=1127.1; P<.001), and communalities mostly within the 0.7 range. Although 5 items had to be removed because of low factor loadings, the results of the remaining 25 items revealed 4 attributes: perceived usability, aesthetic appeal, reward, and focused attention. Ada users showed the highest engagement level with perceived usability, with a value of 294, followed by aesthetic appeal, reward, and focused attention. Conclusions: Although the UES was deployed in German and adapted to another digital domain, PCA yielded consistent subscales and a 4-factor structure. This indicates that user engagement with health apps can be assessed with the German version of the UES. These results can benefit related mobile health app engagement research and may be of importance to marketers and app developers. UR - https://mhealth.jmir.org/2020/1/e13244 UR - https://doi.org/10.2196/13244 UR - http://www.ncbi.nlm.nih.gov/pubmed/31899454 DO - 10.2196/13244 ID - info:doi/10.2196/13244 ER - TY - JOUR AU - Martin-Hammond, Aqueasha AU - Vemireddy, Sravani AU - Rao, Kartik PY - 2019 DA - 2019/12/11 TI - Exploring Older Adults’ Beliefs About the Use of Intelligent Assistants for Consumer Health Information Management: A Participatory Design Study JO - JMIR Aging SP - e15381 VL - 2 IS - 2 KW - intelligent assistants KW - artificial intelligence KW - chatbots KW - conversational agents KW - digital health KW - elderly KW - aging in place KW - participatory design KW - co-design KW - health information seeking AB - Background: Intelligent assistants (IAs), also known as intelligent agents, use artificial intelligence to help users achieve a goal or complete a task. 
IAs represent a potential solution for providing older adults with individualized assistance at home, for example, to reduce social isolation, serve as memory aids, or help with disease management. However, to design IAs for health that are beneficial and accepted by older adults, it is important to understand their beliefs about IAs, how they would like to interact with IAs for consumer health, and how they desire to integrate IAs into their homes. Objective: We explore older adults’ mental models and beliefs about IAs, the tasks they want IAs to support, and how they would like to interact with IAs for consumer health. For the purpose of this study, we focus on IAs in the context of consumer health information management and search. Methods: We present findings from an exploratory, qualitative study that investigated older adults’ perspectives of IAs that aid with consumer health information search and management tasks. Eighteen older adults participated in a multiphase, participatory design workshop in which we engaged them in discussion, brainstorming, and design activities that helped us identify their current challenges managing and finding health information at home. We also explored their beliefs and ideas for an IA to assist them with consumer health tasks. We used participatory design activities to identify areas in which they felt IAs might be useful, but also to uncover the reasoning behind the ideas they presented. Discussions were audio-recorded and later transcribed. We compiled design artifacts collected during the study to supplement researcher transcripts and notes. Thematic analysis was used to analyze data. Results: We found that participants saw IAs as potentially useful for providing recommendations, facilitating collaboration between themselves and other caregivers, and for alerts of serious illness. 
However, they also desired familiar and natural interactions with IAs (eg, using voice) that could, if need be, provide fluid and unconstrained interactions, reason about their symptoms, and provide information or advice. Other participants discussed the need for flexible IAs that could be used by those with low technical resources or skills. Conclusions: From our findings, we present a discussion of three key components of participants’ mental models, including the people, behaviors, and interactions they described that were important for IAs for consumer health information management and seeking. We then discuss the role of access, transparency, caregivers, and autonomy in design for addressing participants’ concerns about privacy and trust as well as its role in assisting others that may interact with an IA on the older adults’ behalf. International Registered Report Identifier (IRRID): RR2-10.1145/3240925.3240972 UR - http://aging.jmir.org/2019/2/e15381/ UR - https://doi.org/10.2196/15381 UR - http://www.ncbi.nlm.nih.gov/pubmed/31825322 DO - 10.2196/15381 ID - info:doi/10.2196/15381 ER - TY - JOUR AU - Afzal, Muhammad AU - Hussain, Maqbool AU - Malik, Khalid Mahmood AU - Lee, Sungyoung PY - 2019 DA - 2019/12/9 TI - Impact of Automatic Query Generation and Quality Recognition Using Deep Learning to Curate Evidence From Biomedical Literature: Empirical Study JO - JMIR Med Inform SP - e13430 VL - 7 IS - 4 KW - data curation KW - evidence-based medicine KW - clinical decision support systems KW - precision medicine KW - biomedical research KW - machine learning KW - deep learning AB - Background: The quality of health care is continuously improving and is expected to improve further because of the advancement of machine learning and knowledge-based techniques along with innovation and availability of wearable sensors. 
With these advancements, health care professionals are now becoming more interested and involved in seeking scientific research evidence from external sources for decision making relevant to medical diagnosis, treatments, and prognosis. Not much work has been done to develop methods for unobtrusive and seamless curation of data from the biomedical literature. Objective: This study aimed to design a framework that can enable bringing quality publications intelligently to the users’ desk to assist medical practitioners in answering clinical questions and fulfilling their informational needs. Methods: The proposed framework consists of methods for efficient biomedical literature curation, including the automatic construction of a well-built question, the recognition of evidence quality by proposing extended quality recognition model (E-QRM), and the ranking and summarization of the extracted evidence. Results: Unlike previous works, the proposed framework systematically integrates the echelons of biomedical literature curation by including methods for searching queries, content quality assessments, and ranking and summarization. Using an ensemble approach, our high-impact classifier E-QRM obtained significantly improved accuracy than the existing quality recognition model (1723/1894, 90.97% vs 1462/1894, 77.21%). Conclusions: Our proposed methods and evaluation demonstrate the validity and rigorousness of the results, which can be used in different applications, including evidence-based medicine, precision medicine, and medical education. 
UR - http://medinform.jmir.org/2019/4/e13430/ UR - https://doi.org/10.2196/13430 UR - http://www.ncbi.nlm.nih.gov/pubmed/31815673 DO - 10.2196/13430 ID - info:doi/10.2196/13430 ER - TY - JOUR AU - Paranjape, Ketan AU - Schinkel, Michiel AU - Nannan Panday, Rishi AU - Car, Josip AU - Nanayakkara, Prabath PY - 2019 DA - 2019/12/3 TI - Introducing Artificial Intelligence Training in Medical Education JO - JMIR Med Educ SP - e16048 VL - 5 IS - 2 KW - algorithm KW - artificial intelligence KW - black box KW - deep learning KW - machine learning KW - medical education KW - continuing education KW - data sciences KW - curriculum UR - http://mededu.jmir.org/2019/2/e16048/ UR - https://doi.org/10.2196/16048 UR - http://www.ncbi.nlm.nih.gov/pubmed/31793895 DO - 10.2196/16048 ID - info:doi/10.2196/16048 ER - TY - JOUR AU - Fernandes, Chrystinne Oliveira AU - Miles, Simon AU - Lucena, Carlos José Pereira De AU - Cowan, Donald PY - 2019 DA - 2019/11/26 TI - Artificial Intelligence Technologies for Coping with Alarm Fatigue in Hospital Environments Because of Sensory Overload: Algorithm Development and Validation JO - J Med Internet Res SP - e15406 VL - 21 IS - 11 KW - alert fatigue health personnel KW - health information systems KW - patient monitoring KW - alert systems KW - artificial intelligence AB - Background: Informed estimates claim that 80% to 99% of alarms set off in hospital units are false or clinically insignificant, representing a cacophony of sounds that do not present a real danger to patients. These false alarms can lead to an alert overload that causes a health care provider to miss important events that could be harmful or even life-threatening. As health care units become more dependent on monitoring devices for patient care purposes, the alarm fatigue issue has to be addressed as a major concern for the health care team as well as to enhance patient safety. 
Objective: The main goal of this paper was to propose a feasible solution for the alarm fatigue problem by using an automatic reasoning mechanism to decide how to notify members of the health care team. The aim was to reduce the number of notifications sent by determining whether or not to group a set of alarms that occur over a short period of time to deliver them together, without compromising patient safety. Methods: This paper describes: (1) a model for supporting reasoning algorithms that decide how to notify caregivers to avoid alarm fatigue; (2) an architecture for health systems that support patient monitoring and notification capabilities; and (3) a reasoning algorithm that specifies how to notify caregivers by deciding whether to aggregate a group of alarms to avoid alarm fatigue. Results: Experiments were used to demonstrate that providing a reasoning system can reduce the notifications received by the caregivers by up to 99.3% (582/586) of the total alarms generated. Our experiments were evaluated through the use of a dataset comprising patient monitoring data and vital signs recorded during 32 surgical cases where patients underwent anesthesia at the Royal Adelaide Hospital. We present the results of our algorithm by using graphs we generated using the R language, where we show whether the algorithm decided to deliver an alarm immediately or after a delay. Conclusions: The experimental results strongly suggest that this reasoning algorithm is a useful strategy for avoiding alarm fatigue. Although we evaluated our algorithm in an experimental environment, we tried to reproduce the context of a clinical environment by using real-world patient data. Our future work is to reproduce the evaluation study based on more realistic clinical conditions by increasing the number of patients, monitoring parameters, and types of alarm. 
UR - http://www.jmir.org/2019/11/e15406/ UR - https://doi.org/10.2196/15406 UR - http://www.ncbi.nlm.nih.gov/pubmed/31769762 DO - 10.2196/15406 ID - info:doi/10.2196/15406 ER - TY - JOUR AU - Meskó, Bertalan PY - 2019 DA - 2019/11/18 TI - The Real Era of the Art of Medicine Begins with Artificial Intelligence JO - J Med Internet Res SP - e16295 VL - 21 IS - 11 KW - future KW - artificial intelligence KW - digital health KW - technology KW - art of medicine UR - http://www.jmir.org/2019/11/e16295/ UR - https://doi.org/10.2196/16295 UR - http://www.ncbi.nlm.nih.gov/pubmed/31738169 DO - 10.2196/16295 ID - info:doi/10.2196/16295 ER - TY - JOUR AU - Piau, Antoine AU - Lepage, Benoit AU - Bernon, Carole AU - Gleizes, Marie-Pierre AU - Nourhashemi, Fati PY - 2019 DA - 2019/11/18 TI - Real-Time Detection of Behavioral Anomalies of Older People Using Artificial Intelligence (The 3-PEGASE Study): Protocol for a Real-Life Prospective Trial JO - JMIR Res Protoc SP - e14245 VL - 8 IS - 11 KW - frailty KW - monitoring KW - sensors KW - artificial intelligence KW - older adults KW - participatory design AB - Background: Most frail older persons are living at home, and we face difficulties in achieving seamless monitoring to detect adverse health changes. Even more important, this lack of follow-up could have a negative impact on the living choices made by older individuals and their care partners. People could give up their homes for the more reassuring environment of a medicalized living facility. We have developed a low-cost unobtrusive sensor-based solution to trigger automatic alerts in case of an acute event or subtle changes over time. It could facilitate older adults’ follow-up in their own homes, and thus support independent living. 
Objective: The primary objective of this prospective open-label study is to evaluate the relevance of the automatic alerts generated by our artificial intelligence–driven monitoring solution as judged by the recipients: older adults, caregivers, and professional support workers. The secondary objective is to evaluate its ability to detect subtle functional and cognitive decline and major medical events. Methods: The primary outcome will be evaluated for each successive 2-month follow-up period to estimate the progression of our learning algorithm performance over time. In total, 25 frail or disabled participants, aged 75 years and above and living alone in their own homes, will be enrolled for a 6-month follow-up period. Results: The first phase with 5 participants for a 4-month feasibility period has been completed and the expected completion date for the second phase of the study (20 participants for 6 months) is July 2020. Conclusions: The originality of our real-life project lies in the choice of the primary outcome and in our user-centered evaluation. We will evaluate the relevance of the alerts and the algorithm performance over time according to the end users. The first-line recipients of the information are the older adults and their care partners rather than health care professionals. Despite the fast pace of electronic health devices development, few studies have addressed the specific everyday needs of older adults and their families. 
Trial Registration: ClinicalTrials.gov NCT03484156; https://clinicaltrials.gov/ct2/show/NCT03484156 International Registered Report Identifier (IRRID): PRR1-10.2196/14245 UR - http://www.researchprotocols.org/2019/11/e14245/ UR - https://doi.org/10.2196/14245 UR - http://www.ncbi.nlm.nih.gov/pubmed/31738180 DO - 10.2196/14245 ID - info:doi/10.2196/14245 ER - TY - JOUR AU - Lovis, Christian PY - 2019 DA - 2019/11/8 TI - Unlocking the Power of Artificial Intelligence and Big Data in Medicine JO - J Med Internet Res SP - e16607 VL - 21 IS - 11 KW - medical informatics KW - artificial intelligence KW - big data UR - https://www.jmir.org/2019/11/e16607 UR - https://doi.org/10.2196/16607 UR - http://www.ncbi.nlm.nih.gov/pubmed/31702565 DO - 10.2196/16607 ID - info:doi/10.2196/16607 ER - TY - JOUR AU - Kocaballi, Ahmet Baki AU - Berkovsky, Shlomo AU - Quiroz, Juan C AU - Laranjo, Liliana AU - Tong, Huong Ly AU - Rezazadegan, Dana AU - Briatore, Agustina AU - Coiera, Enrico PY - 2019 DA - 2019/11/7 TI - The Personalization of Conversational Agents in Health Care: Systematic Review JO - J Med Internet Res SP - e15360 VL - 21 IS - 11 KW - conversational interfaces KW - conversational agents KW - dialogue systems KW - personalization KW - customization KW - adaptive systems KW - health care AB - Background: The personalization of conversational agents with natural language user interfaces is seeing increasing use in health care applications, shaping the content, structure, or purpose of the dialogue between humans and conversational agents. Objective: The goal of this systematic review was to understand the ways in which personalization has been used with conversational agents in health care and characterize the methods of its implementation. Methods: We searched on PubMed, Embase, CINAHL, PsycInfo, and ACM Digital Library using a predefined search strategy. 
The studies were included if they: (1) were primary research studies that focused on consumers, caregivers, or health care professionals; (2) involved a conversational agent with an unconstrained natural language interface; (3) tested the system with human subjects; and (4) implemented personalization features. Results: The search found 1958 publications. After abstract and full-text screening, 13 studies were included in the review. Common examples of personalized content included feedback, daily health reports, alerts, warnings, and recommendations. The personalization features were implemented without a theoretical framework of customization and with limited evaluation of its impact. While conversational agents with personalization features were reported to improve user satisfaction, user engagement and dialogue quality, the role of personalization in improving health outcomes was not assessed directly. Conclusions: Most of the studies in our review implemented the personalization features without theoretical or evidence-based support for them and did not leverage the recent developments in other domains of personalization. Future research could incorporate personalization as a distinct design factor with a more careful consideration of its impact on health outcomes and its implications on patient safety, privacy, and decision-making. 
UR - https://www.jmir.org/2019/11/e15360 UR - https://doi.org/10.2196/15360 UR - http://www.ncbi.nlm.nih.gov/pubmed/31697237 DO - 10.2196/15360 ID - info:doi/10.2196/15360 ER - TY - JOUR AU - Tran, Bach Xuan AU - Nghiem, Son AU - Sahin, Oz AU - Vu, Tuan Manh AU - Ha, Giang Hai AU - Vu, Giang Thu AU - Pham, Hai Quang AU - Do, Hoa Thi AU - Latkin, Carl A AU - Tam, Wilson AU - Ho, Cyrus S H AU - Ho, Roger C M PY - 2019 DA - 2019/11/1 TI - Modeling Research Topics for Artificial Intelligence Applications in Medicine: Latent Dirichlet Allocation Application Study JO - J Med Internet Res SP - e15511 VL - 21 IS - 11 KW - artificial intelligence KW - applications KW - medicine KW - scientometric KW - bibliometric KW - latent Dirichlet allocation AB - Background: Artificial intelligence (AI)–based technologies develop rapidly and have myriad applications in medicine and health care. However, there is a lack of comprehensive reporting on the productivity, workflow, topics, and research landscape of AI in this field. Objective: This study aimed to evaluate the global development of scientific publications and constructed interdisciplinary research topics on the theory and practice of AI in medicine from 1977 to 2018. Methods: We obtained bibliographic data and abstract contents of publications published between 1977 and 2018 from the Web of Science database. A total of 27,451 eligible articles were analyzed. Research topics were classified by latent Dirichlet allocation, and principal component analysis was used to identify the construct of the research landscape. Results: The applications of AI have mainly impacted clinical settings (enhanced prognosis and diagnosis, robot-assisted surgery, and rehabilitation), data science and precision medicine (collecting individual data for precision medicine), and policy making (raising ethical and legal issues, especially regarding privacy and confidentiality of data). 
However, AI applications have not been commonly used in resource-poor settings due to the limit in infrastructure and human resources. Conclusions: The application of AI in medicine has grown rapidly and focuses on three leading platforms: clinical practices, clinical material, and policies. AI might be one of the methods to narrow down the inequality in health care and medicine between developing and developed countries. Technology transfer and support from developed countries are essential measures for the advancement of AI application in health care in developing countries. UR - https://www.jmir.org/2019/11/e15511 UR - https://doi.org/10.2196/15511 UR - http://www.ncbi.nlm.nih.gov/pubmed/31682577 DO - 10.2196/15511 ID - info:doi/10.2196/15511 ER - TY - JOUR AU - Faruqui, Syed Hasib Akhter AU - Du, Yan AU - Meka, Rajitha AU - Alaeddini, Adel AU - Li, Chengdong AU - Shirinkam, Sara AU - Wang, Jing PY - 2019 DA - 2019/11/1 TI - Development of a Deep Learning Model for Dynamic Forecasting of Blood Glucose Level for Type 2 Diabetes Mellitus: Secondary Analysis of a Randomized Controlled Trial JO - JMIR Mhealth Uhealth SP - e14452 VL - 7 IS - 11 KW - type 2 diabetes KW - long short-term memory (LSTM)-based recurrent neural networks (RNNs) KW - glucose level prediction KW - mobile health lifestyle data AB - Background: Type 2 diabetes mellitus (T2DM) is a major public health burden. Self-management of diabetes including maintaining a healthy lifestyle is essential for glycemic control and to prevent diabetes complications. Mobile-based health data can play an important role in the forecasting of blood glucose levels for lifestyle management and control of T2DM. Objective: The objective of this work was to dynamically forecast daily glucose levels in patients with T2DM based on their daily mobile health lifestyle data including diet, physical activity, weight, and glucose level from the day before. 
Methods: We used data from 10 T2DM patients who were overweight or obese in a behavioral lifestyle intervention using mobile tools for daily monitoring of diet, physical activity, weight, and blood glucose over 6 months. We developed a deep learning model based on long short-term memory–based recurrent neural networks to forecast the next-day glucose levels in individual patients. The neural network used several layers of computational nodes to model how mobile health data (food intake including consumed calories, fat, and carbohydrates; exercise; and weight) were progressing from one day to another from noisy data. Results: The model was validated based on a data set of 10 patients who had been monitored daily for over 6 months. The proposed deep learning model demonstrated considerable accuracy in predicting the next day glucose level based on Clark Error Grid and ±10% range of the actual values. Conclusions: Using machine learning methodologies may leverage mobile health lifestyle data to develop effective individualized prediction plans for T2DM management. However, predicting future glucose levels is challenging as glucose level is determined by multiple factors. Future study with more rigorous study design is warranted to better predict future glucose levels for T2DM management. 
UR - https://mhealth.jmir.org/2019/11/e14452 UR - https://doi.org/10.2196/14452 UR - http://www.ncbi.nlm.nih.gov/pubmed/31682586 DO - 10.2196/14452 ID - info:doi/10.2196/14452 ER - TY - JOUR AU - Spasic, Irena AU - Krzeminski, Dominik AU - Corcoran, Padraig AU - Balinsky, Alexander PY - 2019 DA - 2019/10/31 TI - Cohort Selection for Clinical Trials From Longitudinal Patient Records: Text Mining Approach JO - JMIR Med Inform SP - e15980 VL - 7 IS - 4 KW - natural language processing KW - machine learning KW - electronic medical records KW - clinical trial KW - eligibility determination AB - Background: Clinical trials are an important step in introducing new interventions into clinical practice by generating data on their safety and efficacy. Clinical trials need to ensure that participants are similar so that the findings can be attributed to the interventions studied and not to some other factors. Therefore, each clinical trial defines eligibility criteria, which describe characteristics that must be shared by the participants. Unfortunately, the complexities of eligibility criteria may not allow them to be translated directly into readily executable database queries. Instead, they may require careful analysis of the narrative sections of medical records. Manual screening of medical records is time consuming, thus negatively affecting the timeliness of the recruitment process. Objective: Track 1 of the 2018 National Natural Language Processing Clinical Challenge focused on the task of cohort selection for clinical trials, aiming to answer the following question: Can natural language processing be applied to narrative medical records to identify patients who meet eligibility criteria for clinical trials? The task required the participating systems to analyze longitudinal patient records to determine if the corresponding patients met the given eligibility criteria. We aimed to describe a system developed to address this task. 
Methods: Our system consisted of 13 classifiers, one for each eligibility criterion. All classifiers used a bag-of-words document representation model. To prevent the loss of relevant contextual information associated with such representation, a pattern-matching approach was used to extract context-sensitive features. They were embedded back into the text as lexically distinguishable tokens, which were consequently featured in the bag-of-words representation. Supervised machine learning was chosen wherever a sufficient number of both positive and negative instances was available to learn from. A rule-based approach focusing on a small set of relevant features was chosen for the remaining criteria. Results: The system was evaluated using microaveraged F measure. Overall, 4 machine learning algorithms, including support vector machine, logistic regression, naïve Bayesian classifier, and gradient tree boosting (GTB), were evaluated on the training data using 10-fold cross-validation. Overall, GTB demonstrated the most consistent performance. Its performance peaked when oversampling was used to balance the training data. The final evaluation was performed on previously unseen test data. On average, the F measure of 89.04% was comparable to 3 of the top ranked performances in the shared task (91.11%, 90.28%, and 90.21%). With an F measure of 88.14%, we significantly outperformed these systems (81.03%, 78.50%, and 70.81%) in identifying patients with advanced coronary artery disease. Conclusions: The holdout evaluation provides evidence that our system was able to identify eligible patients for the given clinical trial with high accuracy. Our approach demonstrates how rule-based knowledge infusion can improve the performance of machine learning algorithms even when trained on a relatively small dataset. 
UR - http://medinform.jmir.org/2019/4/e15980/ UR - https://doi.org/10.2196/15980 UR - http://www.ncbi.nlm.nih.gov/pubmed/31674914 DO - 10.2196/15980 ID - info:doi/10.2196/15980 ER - TY - JOUR AU - Powell, John PY - 2019 DA - 2019/10/28 TI - Trust Me, I’m a Chatbot: How Artificial Intelligence in Health Care Fails the Turing Test JO - J Med Internet Res SP - e16222 VL - 21 IS - 10 KW - artificial intelligence KW - machine learning KW - medical informatics KW - digital health KW - ehealth KW - chatbots KW - conversational agents UR - http://www.jmir.org/2019/10/e16222/ UR - https://doi.org/10.2196/16222 UR - http://www.ncbi.nlm.nih.gov/pubmed/31661083 DO - 10.2196/16222 ID - info:doi/10.2196/16222 ER - TY - JOUR AU - Gaffney, Hannah AU - Mansell, Warren AU - Tai, Sara PY - 2019 DA - 2019/10/18 TI - Conversational Agents in the Treatment of Mental Health Problems: Mixed-Method Systematic Review JO - JMIR Ment Health SP - e14166 VL - 6 IS - 10 KW - artificial intelligence KW - mental health KW - stress, psychological KW - psychiatry KW - therapy, computer-assisted KW - conversational agent KW - chatbot KW - digital health AB - Background: The use of conversational agent interventions (including chatbots and robots) in mental health is growing at a fast pace. Recent existing reviews have focused exclusively on a subset of embodied conversational agent interventions despite other modalities aiming to achieve the common goal of improved mental health. Objective: This study aimed to review the use of conversational agent interventions in the treatment of mental health problems. Methods: We performed a systematic search using relevant databases (MEDLINE, EMBASE, PsycINFO, Web of Science, and Cochrane library). Studies that reported on an autonomous conversational agent that simulated conversation and reported on a mental health outcome were included. Results: A total of 13 studies were included in the review. 
Among them, 4 full-scale randomized controlled trials (RCTs) were included. The rest were feasibility, pilot RCTs and quasi-experimental studies. Interventions were diverse in design and targeted a range of mental health problems using a wide variety of therapeutic orientations. All included studies reported reductions in psychological distress postintervention. Furthermore, 5 controlled studies demonstrated significant reductions in psychological distress compared with inactive control groups. In addition, 3 controlled studies comparing interventions with active control groups failed to demonstrate superior effects. Broader utility in promoting well-being in nonclinical populations was unclear. Conclusions: The efficacy and acceptability of conversational agent interventions for mental health problems are promising. However, a more robust experimental design is required to demonstrate efficacy and efficiency. A focus on streamlining interventions, demonstrating equivalence to other treatment modalities, and elucidating mechanisms of action has the potential to increase acceptance by users and clinicians and maximize reach. UR - https://mental.jmir.org/2019/10/e14166 UR - https://doi.org/10.2196/14166 UR - http://www.ncbi.nlm.nih.gov/pubmed/31628789 DO - 10.2196/14166 ID - info:doi/10.2196/14166 ER - TY - JOUR AU - Ye, Tiantian AU - Xue, Jiaolong AU - He, Mingguang AU - Gu, Jing AU - Lin, Haotian AU - Xu, Bin AU - Cheng, Yu PY - 2019 DA - 2019/10/17 TI - Psychosocial Factors Affecting Artificial Intelligence Adoption in Health Care in China: Cross-Sectional Study JO - J Med Internet Res SP - e14316 VL - 21 IS - 10 KW - artificial intelligence KW - adoption KW - technology acceptance model KW - structural equation model KW - intention KW - subjective norms KW - trust KW - moderation AB - Background: Poor quality primary health care is a major issue in China, particularly in blindness prevention. 
Artificial intelligence (AI) could provide early screening and accurate auxiliary diagnosis to improve primary care services and reduce unnecessary referrals, but the application of AI in medical settings is still an emerging field. Objective: This study aimed to investigate the general public’s acceptance of ophthalmic AI devices, with reference to those already used in China, and the interrelated influencing factors that shape people’s intention to use these devices. Methods: We proposed a model of ophthalmic AI acceptance based on technology acceptance theories and variables from other health care–related studies. The model was verified via a 32-item questionnaire with 7-point Likert scales completed by 474 respondents (nationally random sampled). Structural equation modeling was used to evaluate item and construct reliability and validity via a confirmatory factor analysis, and the model’s path effects, significance, goodness of fit, and mediation and moderation effects were analyzed. Results: Standardized factor loadings of items were between 0.583 and 0.876. Composite reliability of 9 constructs ranged from 0.673 to 0.841. The discriminant validity of all constructs met the Fornell and Larcker criteria. Model fit indicators such as standardized root mean square residual (0.057), comparative fit index (0.915), and root mean squared error of approximation (0.049) demonstrated good fit. Intention to use (R2=0.515) is significantly affected by subjective norms (beta=.408; P<.001), perceived usefulness (beta=.336; P=.03), and resistance bias (beta=–.237; P=.02). Subjective norms and perceived behavior control had an indirect impact on intention to use through perceived usefulness and perceived ease of use. Eye health consciousness had an indirect positive effect on intention to use through perceived usefulness. Trust had a significant moderation effect (beta=–.095; P=.049) on the effect path of perceived usefulness to intention to use. 
Conclusions: The item, construct, and model indicators indicate reliable interpretation power and help explain the levels of public acceptance of ophthalmic AI devices in China. The influence of subjective norms can be linked to Confucian culture, collectivism, authoritarianism, and conformity mentality in China. Overall, the use of AI in diagnostics and clinical laboratory analysis is underdeveloped, and the Chinese public are generally mistrustful of medical staff and the Chinese medical system. Stakeholders such as doctors and AI suppliers should therefore avoid making misleading or over-exaggerated claims in the promotion of AI health care products. UR - http://www.jmir.org/2019/10/e14316/ UR - https://doi.org/10.2196/14316 UR - http://www.ncbi.nlm.nih.gov/pubmed/31625950 DO - 10.2196/14316 ID - info:doi/10.2196/14316 ER - TY - JOUR AU - Peine, Arne AU - Hallawa, Ahmed AU - Schöffski, Oliver AU - Dartmann, Guido AU - Fazlic, Lejla Begic AU - Schmeink, Anke AU - Marx, Gernot AU - Martin, Lukas PY - 2019 DA - 2019/10/10 TI - A Deep Learning Approach for Managing Medical Consumable Materials in Intensive Care Units via Convolutional Neural Networks: Technical Proof-of-Concept Study JO - JMIR Med Inform SP - e14806 VL - 7 IS - 4 KW - convolutional neural networks KW - deep learning, critical care KW - intensive care KW - image recognition KW - medical economics KW - medical consumables KW - artificial intelligence KW - machine learning AB - Background: High numbers of consumable medical materials (eg, sterile needles and swabs) are used during the daily routine of intensive care units (ICUs) worldwide. Although medical consumables largely contribute to total ICU hospital expenditure, many hospitals do not track the individual use of materials. Current tracking solutions meeting the specific requirements of the medical environment, like barcodes or radio frequency identification, require specialized material preparation and high infrastructure investment. 
This impedes the accurate prediction of consumption, leads to high storage maintenance costs caused by large inventories, and hinders scientific work due to inaccurate documentation. Thus, new cost-effective and contactless methods for object detection are urgently needed. Objective: The goal of this work was to develop and evaluate a contactless visual recognition system for tracking medical consumable materials in ICUs using a deep learning approach on a distributed client-server architecture. Methods: We developed Consumabot, a novel client-server optical recognition system for medical consumables, based on the convolutional neural network model MobileNet implemented in Tensorflow. The software was designed to run on single-board computer platforms as a detection unit. The system was trained to recognize 20 different materials in the ICU, while 100 sample images of each consumable material were provided. We assessed the top-1 recognition rates in the context of different real-world ICU settings: materials presented to the system without visual obstruction, 50% covered materials, and scenarios of multiple items. We further performed an analysis of variance with repeated measures to quantify the effect of adverse real-world circumstances. Results: Consumabot reached a >99% reliability of recognition after about 60 steps of training and 150 steps of validation. A desirable low cross entropy of <0.03 was reached for the training set after about 100 iteration steps and after 170 steps for the validation set. The system showed a high top-1 mean recognition accuracy in a real-world scenario of 0.85 (SD 0.11) for objects presented to the system without visual obstruction. Recognition accuracy was lower, but still acceptable, in scenarios where the objects were 50% covered (P<.001; mean recognition accuracy 0.71; SD 0.13) or multiple objects of the target group were present (P=.01; mean recognition accuracy 0.78; SD 0.11), compared to a nonobstructed view. 
The approach met the criteria of absence of explicit labeling (eg, barcodes, radio frequency labeling) while maintaining a high standard for quality and hygiene with minimal consumption of resources (eg, cost, time, training, and computational power). Conclusions: Using a convolutional neural network architecture, Consumabot consistently achieved good results in the classification of consumables and thus is a feasible way to recognize and register medical consumables directly to a hospital’s electronic health record. The system shows limitations when the materials are partially covered, because identifying characteristics of the consumables are then not presented to the system. Further development of the assessment in different medical circumstances is needed. UR - http://medinform.jmir.org/2019/4/e14806/ UR - https://doi.org/10.2196/14806 UR - http://www.ncbi.nlm.nih.gov/pubmed/31603430 DO - 10.2196/14806 ID - info:doi/10.2196/14806 ER - TY - JOUR AU - Tran, Bach Xuan AU - Latkin, Carl A AU - Sharafeldin, Noha AU - Nguyen, Katherina AU - Vu, Giang Thu AU - Tam, Wilson W S AU - Cheung, Ngai-Man AU - Nguyen, Huong Lan Thi AU - Ho, Cyrus S H AU - Ho, Roger C M PY - 2019 DA - 2019/9/15 TI - Characterizing Artificial Intelligence Applications in Cancer Research: A Latent Dirichlet Allocation Analysis JO - JMIR Med Inform SP - e14401 VL - 7 IS - 4 KW - scientometrics KW - cancer KW - artificial intelligence KW - global KW - mapping AB - Background: Artificial intelligence (AI)–based therapeutics, devices, and systems are vital innovations in cancer control; particularly, they allow for diagnosis, screening, precise estimation of survival, informing therapy selection, and scaling up treatment services in a timely manner. Objective: The aim of this study was to analyze the global trends, patterns, and development of interdisciplinary landscapes in AI and cancer research. 
Methods: An exploratory factor analysis was conducted to identify research domains emerging from abstract contents. The Jaccard similarity index was utilized to identify the most frequently co-occurring terms. Latent Dirichlet Allocation was used for classifying papers into corresponding topics. Results: From 1991 to 2018, the number of studies examining the application of AI in cancer care has grown to 3555 papers covering therapeutics, capacities, and factors associated with outcomes. Topics with the highest volume of publications include (1) machine learning, (2) comparative effectiveness evaluation of AI-assisted medical therapies, and (3) AI-based prediction. Noticeably, this classification has revealed topics examining the incremental effectiveness of AI applications, the quality of life, and functioning of patients receiving these innovations. The growing research productivity and expansion of multidisciplinary approaches are largely driven by machine learning, artificial neural networks, and AI in various clinical practices. Conclusions: The research landscapes show that the development of AI in cancer care is focused on not only improving prediction in cancer screening and AI-assisted therapeutics but also on improving other corresponding areas such as precision and personalized medicine and patient-reported outcomes. 
UR - https://medinform.jmir.org/2019/4/e14401 UR - https://doi.org/10.2196/14401 UR - http://www.ncbi.nlm.nih.gov/pubmed/31573929 DO - 10.2196/14401 ID - info:doi/10.2196/14401 ER - TY - JOUR AU - Sena, Gabrielle Ribeiro AU - Lima, Tiago Pessoa Ferreira AU - Mello, Maria Julia Gonçalves AU - Thuler, Luiz Claudio Santos AU - Lima, Jurema Telles Oliveira PY - 2019 DA - 2019/9/26 TI - Developing Machine Learning Algorithms for the Prediction of Early Death in Elderly Cancer Patients: Usability Study JO - JMIR Cancer SP - e12163 VL - 5 IS - 2 KW - geriatric assessment KW - aged KW - machine learning KW - medical oncology KW - death AB - Background: The importance of classifying cancer patients into high- or low-risk groups has led many research teams, from the biomedical and bioinformatics fields, to study the application of machine learning (ML) algorithms. The International Society of Geriatric Oncology recommends the use of the comprehensive geriatric assessment (CGA), a multidisciplinary tool to evaluate health domains, for the follow-up of elderly cancer patients. However, no applications of ML have been proposed using CGA to classify elderly cancer patients. Objective: The aim of this study was to propose and develop predictive models, using ML and CGA, to estimate the risk of early death in elderly cancer patients. Methods: The ability of ML algorithms to predict early mortality in a cohort involving 608 elderly cancer patients was evaluated. The CGA was conducted during admission by a multidisciplinary team and included the following questionnaires: mini-mental state examination (MMSE), geriatric depression scale-short form, international physical activity questionnaire-short form, timed up and go, Katz index of independence in activities of daily living, Charlson comorbidity index, Karnofsky performance scale (KPS), polypharmacy, and mini nutritional assessment-short form (MNA-SF). 
The 10-fold cross-validation algorithm was used to evaluate all possible combinations of these questionnaires to estimate the risk of early death, considered when occurring within 6 months of diagnosis, in a variety of ML classifiers, including Naive Bayes (NB), decision tree algorithm J48 (J48), and multilayer perceptron (MLP). On each fold of evaluation, tiebreaking is handled by choosing the smallest set of questionnaires. Results: It was possible to select CGA questionnaire subsets with high predictive capacity for early death, which were either statistically similar (NB) or higher (J48 and MLP) when compared with the use of all questionnaires investigated. These results show that CGA questionnaire selection can improve accuracy rates and decrease the time spent to evaluate elderly cancer patients. Conclusions: A simplified predictive model aiming to estimate the risk of early death in elderly cancer patients is proposed herein, minimally composed by the MNA-SF and KPS. We strongly recommend that these questionnaires be incorporated into regular geriatric assessment of older patients with cancer. UR - https://cancer.jmir.org/2019/2/e12163 UR - https://doi.org/10.2196/12163 UR - http://www.ncbi.nlm.nih.gov/pubmed/31573896 DO - 10.2196/12163 ID - info:doi/10.2196/12163 ER - TY - JOUR AU - Li, Fei AU - Jin, Yonghao AU - Liu, Weisong AU - Rawat, Bhanu Pratap Singh AU - Cai, Pengshan AU - Yu, Hong PY - 2019 DA - 2019/09/12 TI - Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)–Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study JO - JMIR Med Inform SP - e14830 VL - 7 IS - 3 KW - natural language processing KW - entity normalization KW - deep learning KW - electronic health record note KW - BERT AB - Background: The bidirectional encoder representations from transformers (BERT) model has achieved great success in many natural language processing (NLP) tasks, such as named entity recognition and question answering. 
However, little prior work has explored using this model for an important task in the biomedical and clinical domains, namely entity normalization. Objective: We aim to investigate the effectiveness of BERT-based models for biomedical or clinical entity normalization. In addition, our second objective is to investigate whether the domains of training data influence the performances of BERT-based models as well as the degree of influence. Methods: Our data comprised 1.5 million unlabeled electronic health record (EHR) notes. We first fine-tuned BioBERT on this large collection of unlabeled EHR notes. This generated our BERT-based model trained using 1.5 million electronic health record notes (EhrBERT). We then further fine-tuned EhrBERT, BioBERT, and BERT on three annotated corpora for biomedical and clinical entity normalization: the Medication, Indication, and Adverse Drug Events (MADE) 1.0 corpus, the National Center for Biotechnology Information (NCBI) disease corpus, and the Chemical-Disease Relations (CDR) corpus. We compared our models with two state-of-the-art normalization systems, namely MetaMap and disease name normalization (DNorm). Results: EhrBERT achieved 40.95% F1 in the MADE 1.0 corpus for mapping named entities to the Medical Dictionary for Regulatory Activities and the Systematized Nomenclature of Medicine—Clinical Terms (SNOMED-CT), which have about 380,000 terms. In this corpus, EhrBERT outperformed MetaMap by 2.36% in F1. For the NCBI disease corpus and CDR corpus, EhrBERT also outperformed DNorm by improving the F1 scores from 88.37% and 89.92% to 90.35% and 93.82%, respectively. Compared with BioBERT and BERT, EhrBERT outperformed them on the MADE 1.0 corpus and the CDR corpus. Conclusions: Our work shows that BERT-based models have achieved state-of-the-art performance for biomedical and clinical entity normalization. BERT-based models can be readily fine-tuned to normalize any kind of named entities. 
UR - http://medinform.jmir.org/2019/3/e14830/ UR - https://doi.org/10.2196/14830 UR - http://www.ncbi.nlm.nih.gov/pubmed/31516126 DO - 10.2196/14830 ID - info:doi/10.2196/14830 ER - TY - JOUR AU - Tobore, Igbe AU - Li, Jingzhen AU - Yuhang, Liu AU - Al-Handarish, Yousef AU - Kandwal, Abhishek AU - Nie, Zedong AU - Wang, Lei PY - 2019 DA - 2019/08/02 TI - Deep Learning Intervention for Health Care Challenges: Some Biomedical Domain Considerations JO - JMIR Mhealth Uhealth SP - e11966 VL - 7 IS - 8 KW - machine learning KW - deep learning KW - big data KW - mHealth KW - medical imaging KW - electronic health record KW - biologicals KW - biomedical KW - ECG KW - EEG KW - artificial intelligence UR - https://mhealth.jmir.org/2019/8/e11966/ UR - https://doi.org/10.2196/11966 UR - http://www.ncbi.nlm.nih.gov/pubmed/31376272 DO - 10.2196/11966 ID - info:doi/10.2196/11966 ER - TY - JOUR AU - Lin, Chin AU - Lou, Yu-Sheng AU - Tsai, Dung-Jang AU - Lee, Chia-Cheng AU - Hsu, Chia-Jung AU - Wu, Ding-Chung AU - Wang, Mei-Chuen AU - Fang, Wen-Hui PY - 2019 DA - 2019/7/23 TI - Projection Word Embedding Model With Hybrid Sampling Training for Classifying ICD-10-CM Codes: Longitudinal Observational Study JO - JMIR Med Inform SP - e14499 VL - 7 IS - 3 KW - word embedding KW - convolutional neural network KW - artificial intelligence KW - natural language processing KW - electronic health records AB - Background: Most current state-of-the-art models for searching the International Classification of Diseases, Tenth Revision Clinical Modification (ICD-10-CM) codes use word embedding technology to capture useful semantic properties. However, they are limited by the quality of initial word embeddings. Word embedding trained by electronic health records (EHRs) is considered the best, but the vocabulary diversity is limited by previous medical records. 
Thus, we require a word embedding model that maintains the vocabulary diversity of open internet databases and the medical terminology understanding of EHRs. Moreover, we need to consider the particularity of the disease classification, wherein discharge notes present only positive disease descriptions. Objective: We aimed to propose a projection word2vec model and a hybrid sampling method. In addition, we aimed to conduct a series of experiments to validate the effectiveness of these methods. Methods: We compared the projection word2vec model and traditional word2vec model using two corpora sources: English Wikipedia and PubMed journal abstracts. We used seven published datasets to measure the medical semantic understanding of the word2vec models and used these embeddings to identify the three–character-level ICD-10-CM diagnostic codes in a set of discharge notes. On the basis of embedding technology improvement, we also tried to apply the hybrid sampling method to improve accuracy. The 94,483 labeled discharge notes from the Tri-Service General Hospital of Taipei, Taiwan, from June 1, 2015, to June 30, 2017, were used. To evaluate the model performance, 24,762 discharge notes from July 1, 2017, to December 31, 2017, from the same hospital were used. Moreover, 74,324 additional discharge notes collected from seven other hospitals were tested. The F-measure, which is the major global measure of effectiveness, was adopted. Results: In medical semantic understanding, the original EHR embeddings and PubMed embeddings exhibited superior performance to the original Wikipedia embeddings. After projection training technology was applied, the projection Wikipedia embeddings exhibited an obvious improvement but did not reach the level of original EHR embeddings or PubMed embeddings. 
In the subsequent ICD-10-CM coding experiment, the model that used both projection PubMed and Wikipedia embeddings had the highest testing mean F-measure (0.7362 and 0.6693 in Tri-Service General Hospital and the seven other hospitals, respectively). Moreover, the hybrid sampling method was found to improve the model performance (F-measure=0.7371/0.6698). Conclusions: The word embeddings trained using EHR and PubMed could understand medical semantics better, and the proposed projection word2vec model improved the ability of medical semantics extraction in Wikipedia embeddings. Although the improvement from the projection word2vec model in the real ICD-10-CM coding task was not substantial, the models could effectively handle emerging diseases. The proposed hybrid sampling method enables the model to behave like a human expert. UR - http://medinform.jmir.org/2019/3/e14499/ UR - https://doi.org/10.2196/14499 DO - 10.2196/14499 ID - info:doi/10.2196/14499 ER - TY - JOUR AU - Shaw, James AU - Rudzicz, Frank AU - Jamieson, Trevor AU - Goldfarb, Avi PY - 2019 DA - 2019/07/10 TI - Artificial Intelligence and the Implementation Challenge JO - J Med Internet Res SP - e13659 VL - 21 IS - 7 KW - artificial intelligence KW - machine learning KW - implementation science KW - ethics AB - Background: Applications of artificial intelligence (AI) in health care have garnered much attention in recent years, but the implementation issues posed by AI have not been substantially addressed. Objective: In this paper, we have focused on machine learning (ML) as a form of AI and have provided a framework for thinking about use cases of ML in health care. We have structured our discussion of challenges in the implementation of ML in comparison with other technologies using the framework of Nonadoption, Abandonment, and Challenges to the Scale-Up, Spread, and Sustainability of Health and Care Technologies (NASSS). 
Methods: After providing an overview of AI technology, we describe use cases of ML as falling into the categories of decision support and automation. We suggest these use cases apply to clinical, operational, and epidemiological tasks and that the primary function of ML in health care in the near term will be decision support. We then outline unique implementation issues posed by ML initiatives in the categories addressed by the NASSS framework, specifically including meaningful decision support, explainability, privacy, consent, algorithmic bias, security, scalability, the role of corporations, and the changing nature of health care work. Results: Ultimately, we suggest that the future of ML in health care remains positive but uncertain, as support from patients, the public, and a wide range of health care stakeholders is necessary to enable its meaningful implementation. Conclusions: If the implementation science community is to facilitate the adoption of ML in ways that stand to generate widespread benefits, the issues raised in this paper will require substantial attention in the coming years. 
UR - https://www.jmir.org/2019/7/e13659/ UR - https://doi.org/10.2196/13659 UR - http://www.ncbi.nlm.nih.gov/pubmed/31293245 DO - 10.2196/13659 ID - info:doi/10.2196/13659 ER - TY - JOUR AU - Loveys, Kate AU - Fricchione, Gregory AU - Kolappa, Kavitha AU - Sagar, Mark AU - Broadbent, Elizabeth PY - 2019 DA - 2019/07/08 TI - Reducing Patient Loneliness With Artificial Agents: Design Insights From Evolutionary Neuropsychiatry JO - J Med Internet Res SP - e13664 VL - 21 IS - 7 KW - loneliness KW - neuropsychiatry KW - biological evolution KW - psychological bonding KW - interpersonal relations KW - artificial intelligence KW - social support KW - eHealth UR - https://www.jmir.org/2019/7/e13664/ UR - https://doi.org/10.2196/13664 UR - http://www.ncbi.nlm.nih.gov/pubmed/31287067 DO - 10.2196/13664 ID - info:doi/10.2196/13664 ER - TY - JOUR AU - Chan, Kai Siang AU - Zary, Nabil PY - 2019 DA - 2019/6/15 TI - Applications and Challenges of Implementing Artificial Intelligence in Medical Education: Integrative Review JO - JMIR Med Educ SP - e13930 VL - 5 IS - 1 KW - medical education KW - evaluation of AIED systems KW - real world applications of AIED systems KW - artificial intelligence AB - Background: Since the advent of artificial intelligence (AI) in 1955, the applications of AI have increased over the years within a rapidly changing digital landscape where public expectations are on the rise, fed by social media, industry leaders, and medical practitioners. However, there has been little interest in AI in medical education until the last two decades, with only a recent increase in the number of publications and citations in the field. To our knowledge, thus far, a limited number of articles have discussed or reviewed the current use of AI in medical education. Objective: This study aims to review the current applications of AI in medical education as well as the challenges of implementing AI in medical education. 
Methods: Medline (Ovid), EBSCOhost Education Resources Information Center (ERIC) and Education Source, and Web of Science were searched with explicit inclusion and exclusion criteria. Full text of the selected articles was analyzed using the Extension of Technology Acceptance Model and the Diffusions of Innovations theory. Data were subsequently pooled together and analyzed quantitatively. Results: A total of 37 articles were identified. Three primary uses of AI in medical education were identified: learning support (n=32), assessment of students’ learning (n=4), and curriculum review (n=1). The main reasons for use of AI are its ability to provide feedback and a guided learning pathway and to decrease costs. Subgroup analysis revealed that medical undergraduates are the primary target audience for AI use. In addition, 34 articles described the challenges of AI implementation in medical education; two main reasons were identified: difficulty in assessing the effectiveness of AI in medical education and technical challenges while developing AI applications. Conclusions: The primary use of AI in medical education was for learning support mainly due to its ability to provide individualized feedback. Little emphasis was placed on curriculum review and assessment of students’ learning due to the lack of digitalization and sensitive nature of examinations, respectively. Big data manipulation also warrants the need to ensure data integrity. Methodological improvements are required to increase AI adoption by addressing the technical difficulties of creating an AI application and using novel methods to assess the effectiveness of AI. To better integrate AI into the medical profession, measures should be taken to introduce AI into the medical school curriculum for medical professionals to better understand AI algorithms and maximize its use. 
UR - http://mededu.jmir.org/2019/1/e13930/ UR - https://doi.org/10.2196/13930 UR - http://www.ncbi.nlm.nih.gov/pubmed/31199295 DO - 10.2196/13930 ID - info:doi/10.2196/13930 ER - TY - JOUR AU - Fiske, Amelia AU - Henningsen, Peter AU - Buyx, Alena PY - 2019 DA - 2019/05/09 TI - Your Robot Therapist Will See You Now: Ethical Implications of Embodied Artificial Intelligence in Psychiatry, Psychology, and Psychotherapy JO - J Med Internet Res SP - e13216 VL - 21 IS - 5 KW - artificial intelligence KW - robotics KW - ethics KW - psychiatry KW - psychology KW - psychotherapy KW - medicine AB - Background: Research in embodied artificial intelligence (AI) has increasing clinical relevance for therapeutic applications in mental health services. With innovations ranging from ‘virtual psychotherapists’ to social robots in dementia care and autism disorder, to robots for sexual disorders, artificially intelligent virtual and robotic agents are increasingly taking on high-level therapeutic interventions that used to be offered exclusively by highly trained, skilled health professionals. In order to enable responsible clinical implementation, ethical and social implications of the increasing use of embodied AI in mental health need to be identified and addressed. Objective: This paper assesses the ethical and social implications of translating embodied AI applications into mental health care across the fields of Psychiatry, Psychology and Psychotherapy. Building on this analysis, it develops a set of preliminary recommendations on how to address ethical and social challenges in current and future applications of embodied AI. Methods: Based on a thematic literature search and established principles of medical ethics, an analysis of the ethical and social aspects of currently embodied AI applications was conducted across the fields of Psychiatry, Psychology, and Psychotherapy. 
To enable a comprehensive evaluation, the analysis was structured around the following three steps: assessment of potential benefits; analysis of overarching ethical issues and concerns; discussion of specific ethical and social issues of the interventions. Results: From an ethical perspective, important benefits of embodied AI applications in mental health include new modes of treatment, opportunities to engage hard-to-reach populations, better patient response, and freeing up time for physicians. Overarching ethical issues and concerns include: harm prevention and various questions of data ethics; a lack of guidance on development of AI applications, their clinical integration and training of health professionals; ‘gaps’ in ethical and regulatory frameworks; the potential for misuse including using the technologies to replace established services, thereby potentially exacerbating existing health inequalities. Specific challenges identified and discussed in the application of embodied AI include: matters of risk-assessment, referrals, and supervision; the need to respect and protect patient autonomy; the role of non-human therapy; transparency in the use of algorithms; and specific concerns regarding long-term effects of these applications on understandings of illness and the human condition. Conclusions: We argue that embodied AI is a promising approach across the field of mental health; however, further research is needed to address the broader ethical and societal concerns of these technologies to negotiate best research and medical practices in innovative mental health care. We conclude by indicating areas of future research and developing recommendations for high-priority areas in need of concrete ethical guidance. 
UR - https://www.jmir.org/2019/5/e13216/ UR - https://doi.org/10.2196/13216 UR - http://www.ncbi.nlm.nih.gov/pubmed/31094356 DO - 10.2196/13216 ID - info:doi/10.2196/13216 ER - TY - JOUR AU - Woldaregay, Ashenafi Zebene AU - Årsand, Eirik AU - Botsis, Taxiarchis AU - Albers, David AU - Mamykina, Lena AU - Hartvigsen, Gunnar PY - 2019 DA - 2019/05/01 TI - Data-Driven Blood Glucose Pattern Classification and Anomalies Detection: Machine-Learning Applications in Type 1 Diabetes JO - J Med Internet Res SP - e11030 VL - 21 IS - 5 KW - type 1 diabetes KW - blood glucose dynamics KW - anomalies detection KW - machine learning AB - Background: Diabetes mellitus is a chronic metabolic disorder that results in abnormal blood glucose (BG) regulations. The BG level is preferably maintained close to normality through self-management practices, which involves actively tracking BG levels and taking proper actions including adjusting diet and insulin medications. BG anomalies could be defined as any undesirable reading because of either a precisely known reason (normal cause variation) or an unknown reason (special cause variation) to the patient. Recently, machine-learning applications have been widely introduced within diabetes research in general and BG anomaly detection in particular. However, irrespective of their expanding and increasing popularity, there is a lack of up-to-date reviews that materialize the current trends in modeling options and strategies for BG anomaly classification and detection in people with diabetes. Objective: This review aimed to identify, assess, and analyze the state-of-the-art machine-learning strategies and their hybrid systems focusing on BG anomaly classification and detection including glycemic variability (GV), hyperglycemia, and hypoglycemia in type 1 diabetes within the context of personalized decision support systems and BG alarm events applications, which are important constituents for optimal diabetes self-management. 
Methods: A rigorous literature search was conducted between September 1 and October 1, 2017, and October 15 and November 5, 2018, through various Web-based databases. Peer-reviewed journals and articles were considered. Information from the selected literature was extracted based on predefined categories, which were based on previous research and further elaborated through brainstorming. Results: The initial results were vetted using the title, abstract, and keywords and retrieved 496 papers. After a thorough assessment and screening, 47 articles remained, which were critically analyzed. The interrater agreement was measured using a Cohen kappa test, and disagreements were resolved through discussion. The state-of-the-art classes of machine learning have been developed and tested up to the task and achieved promising performance including artificial neural network, support vector machine, decision tree, genetic algorithm, Gaussian process regression, Bayesian neural network, deep belief network, and others. Conclusions: Despite the complexity of BG dynamics, there are many attempts to capture hypoglycemia and hyperglycemia incidences and the extent of an individual’s GV using different approaches. Recently, the advancement of diabetes technologies and continuous accumulation of self-collected health data have paved the way for popularity of machine learning in these tasks. According to the review, most of the identified studies used a theoretical threshold, which suffers from inter- and intrapatient variation. Therefore, future studies should consider the difference among patients and also track its temporal change over time. Moreover, studies should also give more emphasis on the types of inputs used and their associated time lag. Generally, we foresee that these developments might encourage researchers to further develop and test these systems on a large-scale basis. 
UR - https://www.jmir.org/2019/5/e11030/ UR - https://doi.org/10.2196/11030 UR - http://www.ncbi.nlm.nih.gov/pubmed/31042157 DO - 10.2196/11030 ID - info:doi/10.2196/11030 ER - TY - JOUR AU - Aboueid, Stephanie AU - Liu, Rebecca H AU - Desta, Binyam Negussie AU - Chaurasia, Ashok AU - Ebrahim, Shanil PY - 2019 DA - 2019/05/01 TI - The Use of Artificially Intelligent Self-Diagnosing Digital Platforms by the General Public: Scoping Review JO - JMIR Med Inform SP - e13445 VL - 7 IS - 2 KW - diagnosis KW - artificial intelligence KW - symptom checkers KW - diagnostic self evaluation KW - self-care AB - Background: Self-diagnosis is the process of diagnosing or identifying a medical condition in oneself. Artificially intelligent digital platforms for self-diagnosis are becoming widely available and are used by the general public; however, little is known about the body of knowledge surrounding this technology. Objective: The objectives of this scoping review were to (1) systematically map the extent and nature of the literature and topic areas pertaining to digital platforms that use computerized algorithms to provide users with a list of potential diagnoses and (2) identify key knowledge gaps. Methods: The following databases were searched: PubMed (Medline), Scopus, Association for Computing Machinery Digital Library, Institute of Electrical and Electronics Engineers, Google Scholar, Open Grey, and ProQuest Dissertations and Theses. The search strategy was developed and refined with the assistance of a librarian and consisted of 3 main concepts: (1) self-diagnosis; (2) digital platforms; and (3) public or patients. The search generated 2536 articles from which 217 were duplicates. Following the Tricco et al 2018 checklist, 2 researchers screened the titles and abstracts (n=2316) and full texts (n=104), independently. A total of 19 articles were included for review, and data were retrieved following a data-charting form that was pretested by the research team. 
Results: The included articles were mainly conducted in the United States (n=10) or the United Kingdom (n=4). Among the articles, topic areas included accuracy or correspondence with a doctor’s diagnosis (n=6), commentaries (n=2), regulation (n=3), sociological (n=2), user experience (n=2), theoretical (n=1), privacy and security (n=1), ethical (n=1), and design (n=1). Individuals who do not have access to health care and perceive themselves to have a stigmatizing condition are more likely to use this technology. The accuracy of this technology varied substantially based on the disease examined and platform used. Women and those with higher education were more likely to choose the right diagnosis out of the potential list of diagnoses. Regulation of this technology is lacking in most parts of the world; however, regulations are currently under development. Conclusions: There are prominent research gaps in the literature surrounding the use of artificially intelligent self-diagnosing digital platforms. Given the variety of digital platforms and the wide array of diseases they cover, measuring accuracy is cumbersome. More research is needed to understand the user experience and inform regulations. 
UR - http://medinform.jmir.org/2019/2/e13445/ UR - https://doi.org/10.2196/13445 UR - http://www.ncbi.nlm.nih.gov/pubmed/31042151 DO - 10.2196/13445 ID - info:doi/10.2196/13445 ER - TY - JOUR AU - Tariq, Qandeel AU - Fleming, Scott Lanyon AU - Schwartz, Jessey Nicole AU - Dunlap, Kaitlyn AU - Corbin, Conor AU - Washington, Peter AU - Kalantarian, Haik AU - Khan, Naila Z AU - Darmstadt, Gary L AU - Wall, Dennis Paul PY - 2019 DA - 2019/04/24 TI - Detecting Developmental Delay and Autism Through Machine Learning Models Using Home Videos of Bangladeshi Children: Development and Validation Study JO - J Med Internet Res SP - e13822 VL - 21 IS - 4 KW - autism KW - autism spectrum disorder KW - machine learning KW - developmental delays KW - clinical resources KW - Bangladesh KW - Biomedical Data Science AB - Background: Autism spectrum disorder (ASD) is currently diagnosed using qualitative methods that measure between 20-100 behaviors, can span multiple appointments with trained clinicians, and take several hours to complete. In our previous work, we demonstrated the efficacy of machine learning classifiers to accelerate the process by collecting home videos of US-based children, identifying a reduced subset of behavioral features that are scored by untrained raters using a machine learning classifier to determine children’s “risk scores” for autism. We achieved an accuracy of 92% (95% CI 88%-97%) on US videos using a classifier built on five features. Objective: Using videos of Bangladeshi children collected from Dhaka Shishu Children’s Hospital, we aim to scale our pipeline to another culture and other developmental delays, including speech and language conditions. 
Methods: Although our previously published and validated pipeline and set of classifiers perform reasonably well on Bangladeshi videos (75% accuracy, 95% CI 71%-78%), this work improves on that accuracy through the development and application of a powerful new technique for adaptive aggregation of crowdsourced labels. We enhance both the utility and performance of our model by building two classification layers: The first layer distinguishes between typical and atypical behavior, and the second layer distinguishes between ASD and non-ASD. In each of the layers, we use a unique rater weighting scheme to aggregate classification scores from different raters based on their expertise. We also determine Shapley values for the most important features in the classifier to understand how the classifiers’ process aligns with clinical intuition. Results: Using these techniques, we achieved an accuracy (area under the curve [AUC]) of 76% (SD 3%) and sensitivity of 76% (SD 4%) for identifying atypical children from among developmentally delayed children, and an accuracy (AUC) of 85% (SD 5%) and sensitivity of 76% (SD 6%) for identifying children with ASD from those predicted to have other developmental delays. Conclusions: These results show promise for using a mobile video-based and machine learning–directed approach for early and remote detection of autism in Bangladeshi children. This strategy could provide important resources for developmental health in developing countries with few clinical resources for diagnosis, helping children get access to care at an early age. Future research aimed at extending the application of this approach to identify a range of other conditions and determine the population-level burden of developmental disabilities and impairments will be of high value. 
UR - http://www.jmir.org/2019/4/e13822/ UR - https://doi.org/10.2196/13822 UR - http://www.ncbi.nlm.nih.gov/pubmed/31017583 DO - 10.2196/13822 ID - info:doi/10.2196/13822 ER - TY - JOUR AU - Palanica, Adam AU - Flaschner, Peter AU - Thommandram, Anirudh AU - Li, Michael AU - Fossat, Yan PY - 2019 DA - 2019/04/05 TI - Physicians’ Perceptions of Chatbots in Health Care: Cross-Sectional Web-Based Survey JO - J Med Internet Res SP - e12887 VL - 21 IS - 4 KW - physician satisfaction KW - health care KW - telemedicine KW - mobile health KW - health surveys AB - Background: Many potential benefits for the uses of chatbots within the context of health care have been theorized, such as improved patient education and treatment compliance. However, little is known about the perspectives of practicing medical physicians on the use of chatbots in health care, even though these individuals are the traditional benchmark of proper patient care. Objective: This study aimed to investigate the perceptions of physicians regarding the use of health care chatbots, including their benefits, challenges, and risks to patients. Methods: A total of 100 practicing physicians across the United States completed a Web-based, self-report survey to examine their opinions of chatbot technology in health care. Descriptive statistics and frequencies were used to examine the characteristics of participants. Results: A wide variety of positive and negative perspectives were reported on the use of health care chatbots, including the importance to patients for managing their own health and the benefits on physical, psychological, and behavioral health outcomes. More consistent agreement occurred with regard to administrative benefits associated with chatbots; many physicians believed that chatbots would be most beneficial for scheduling doctor appointments (78%, 78/100), locating health clinics (76%, 76/100), or providing medication information (71%, 71/100). 
Conversely, many physicians believed that chatbots cannot effectively care for all of the patients’ needs (76%, 76/100), cannot display human emotion (72%, 72/100), and cannot provide detailed diagnosis and treatment because of not knowing all of the personal factors associated with the patient (71%, 71/100). Many physicians also stated that health care chatbots could be a risk to patients if they self-diagnose too often (74%, 74/100) and do not accurately understand the diagnoses (74%, 74/100). Conclusions: Physicians believed in both costs and benefits associated with chatbots, depending on the logistics and specific roles of the technology. Chatbots may have a beneficial role to play in health care to support, motivate, and coach patients as well as for streamlining organizational tasks; in essence, chatbots could become a surrogate for nonmedical caregivers. However, concerns remain on the inability of chatbots to comprehend the emotional state of humans as well as in areas where expert medical knowledge and intelligence is required. UR - https://www.jmir.org/2019/4/e12887/ UR - https://doi.org/10.2196/12887 UR - http://www.ncbi.nlm.nih.gov/pubmed/30950796 DO - 10.2196/12887 ID - info:doi/10.2196/12887 ER - TY - JOUR AU - Triantafyllidis, Andreas K AU - Tsanas, Athanasios PY - 2019 DA - 2019/04/05 TI - Applications of Machine Learning in Real-Life Digital Health Interventions: Review of the Literature JO - J Med Internet Res SP - e12286 VL - 21 IS - 4 KW - machine learning KW - data mining KW - artificial intelligence KW - digital health KW - review KW - telemedicine AB - Background: Machine learning has attracted considerable research interest toward developing smart digital health interventions. These interventions have the potential to revolutionize health care and lead to substantial outcomes for patients and medical professionals. 
Objective: Our objective was to review the literature on applications of machine learning in real-life digital health interventions, aiming to improve the understanding of researchers, clinicians, engineers, and policy makers in developing robust and impactful data-driven interventions in the health care domain. Methods: We searched the PubMed and Scopus bibliographic databases with terms related to machine learning, to identify real-life studies of digital health interventions incorporating machine learning algorithms. We grouped those interventions according to their target (ie, target condition), study design, number of enrolled participants, follow-up duration, primary outcome and whether this had been statistically significant, machine learning algorithms used in the intervention, and outcome of the algorithms (eg, prediction). Results: Our literature search identified 8 interventions incorporating machine learning in a real-life research setting, of which 3 (37%) were evaluated in a randomized controlled trial and 5 (63%) in a pilot or experimental single-group study. The interventions targeted depression prediction and management, speech recognition for people with speech disabilities, self-efficacy for weight loss, detection of changes in biopsychosocial condition of patients with multiple morbidity, stress management, treatment of phantom limb pain, smoking cessation, and personalized nutrition based on glycemic response. The average number of enrolled participants in the studies was 71 (range 8-214), and the average follow-up study duration was 69 days (range 3-180). Of the 8 interventions, 6 (75%) showed statistical significance (at the P=.05 level) in health outcomes. Conclusions: This review found that digital health interventions incorporating machine learning algorithms in real-life studies can be useful and effective. 
Given the low number of studies identified in this review and that they did not follow a rigorous machine learning evaluation methodology, we urge the research community to conduct further studies in intervention settings following evaluation principles and demonstrating the potential of machine learning in clinical practice. UR - https://www.jmir.org/2019/4/e12286/ UR - https://doi.org/10.2196/12286 UR - http://www.ncbi.nlm.nih.gov/pubmed/30950797 DO - 10.2196/12286 ID - info:doi/10.2196/12286 ER - TY - JOUR AU - van Hartskamp, Michael AU - Consoli, Sergio AU - Verhaegh, Wim AU - Petkovic, Milan AU - van de Stolpe, Anja PY - 2019 DA - 2019/04/05 TI - Artificial Intelligence in Clinical Health Care Applications: Viewpoint JO - Interact J Med Res SP - e12100 VL - 8 IS - 2 KW - artificial intelligence KW - deep learning KW - clinical data KW - Bayesian modeling KW - medical informatics UR - https://www.i-jmr.org/2019/2/e12100/ UR - https://doi.org/10.2196/12100 UR - http://www.ncbi.nlm.nih.gov/pubmed/30950806 DO - 10.2196/12100 ID - info:doi/10.2196/12100 ER - TY - JOUR AU - Oh, Songhee AU - Kim, Jae Heon AU - Choi, Sung-Woo AU - Lee, Hee Jeong AU - Hong, Jungrak AU - Kwon, Soon Hyo PY - 2019 DA - 2019/03/25 TI - Physician Confidence in Artificial Intelligence: An Online Mobile Survey JO - J Med Internet Res SP - e12422 VL - 21 IS - 3 KW - artificial intelligence KW - AI KW - awareness KW - physicians AB - Background: It is expected that artificial intelligence (AI) will be used extensively in the medical field in the future. Objective: The purpose of this study is to investigate the awareness of AI among Korean doctors and to assess physicians’ attitudes toward the medical application of AI. Methods: We conducted an online survey composed of 11 closed-ended questions using Google Forms. 
The survey consisted of questions regarding the recognition of and attitudes toward AI, the development direction of AI in medicine, and the possible risks of using AI in the medical field. Results: A total of 669 participants completed the survey. Only 40 (5.9%) answered that they had good familiarity with AI. However, most participants considered AI useful in the medical field (558/669, 83.4% agreement). The advantage of using AI was seen as the ability to analyze vast amounts of high-quality, clinically relevant data in real time. Respondents agreed that the area of medicine in which AI would be most useful is disease diagnosis (558/669, 83.4% agreement). One possible problem cited by the participants was that AI would not be able to assist in unexpected situations owing to inadequate information (196/669, 29.3%). Less than half of the participants (294/669, 43.9%) agreed that AI is diagnostically superior to human doctors. Only 237 (35.4%) answered that they agreed that AI could replace them in their jobs. Conclusions: This study suggests that Korean doctors and medical students have favorable attitudes toward AI in the medical field. The majority of physicians surveyed believed that AI will not replace their roles in the future. 
UR - http://www.jmir.org/2019/3/e12422/ UR - https://doi.org/10.2196/12422 UR - http://www.ncbi.nlm.nih.gov/pubmed/30907742 DO - 10.2196/12422 ID - info:doi/10.2196/12422 ER - TY - JOUR AU - Blease, Charlotte AU - Kaptchuk, Ted J AU - Bernstein, Michael H AU - Mandl, Kenneth D AU - Halamka, John D AU - DesRoches, Catherine M PY - 2019 DA - 2019/03/20 TI - Artificial Intelligence and the Future of Primary Care: Exploratory Qualitative Study of UK General Practitioners’ Views JO - J Med Internet Res SP - e12802 VL - 21 IS - 3 KW - artificial intelligence KW - attitudes KW - future KW - general practice KW - machine learning KW - opinions KW - primary care KW - qualitative research KW - technology AB - Background: The potential for machine learning to disrupt the medical profession is the subject of ongoing debate within biomedical informatics and related fields. Objective: This study aimed to explore general practitioners’ (GPs’) opinions about the potential impact of future technology on key tasks in primary care. Methods: In June 2018, we conducted a Web-based survey of 720 UK GPs’ opinions about the likelihood of future technology to fully replace GPs in performing 6 key primary care tasks, and, if respondents considered replacement for a particular task likely, to estimate how soon the technological capacity might emerge. This study involved qualitative descriptive analysis of written responses (“comments”) to an open-ended question in the survey. Results: Comments were classified into 3 major categories in relation to primary care: (1) limitations of future technology, (2) potential benefits of future technology, and (3) social and ethical concerns. Perceived limitations included the beliefs that communication and empathy are exclusively human competencies; many GPs also considered clinical reasoning and the ability to provide value-based care as necessitating physicians’ judgments. 
Perceived benefits of technology included expectations about improved efficiencies, in particular with respect to the reduction of administrative burdens on physicians. Social and ethical concerns encompassed multiple, divergent themes including the need to train more doctors to overcome workforce shortfalls and misgivings about the acceptability of future technology to patients. However, some GPs believed that the failure to adopt technological innovations could incur harms to both patients and physicians. Conclusions: This study presents timely information on physicians’ views about the scope of artificial intelligence (AI) in primary care. Overwhelmingly, GPs considered the potential of AI to be limited. These views differ from the predictions of biomedical informaticians. More extensive, stand-alone qualitative work would provide a more in-depth understanding of GPs’ views. UR - http://www.jmir.org/2019/3/e12802/ UR - https://doi.org/10.2196/12802 UR - http://www.ncbi.nlm.nih.gov/pubmed/30892270 DO - 10.2196/12802 ID - info:doi/10.2196/12802 ER - TY - JOUR AU - Chen, Jinying AU - Lalor, John AU - Liu, Weisong AU - Druhl, Emily AU - Granillo, Edgard AU - Vimalananda, Varsha G AU - Yu, Hong PY - 2019 DA - 2019/03/11 TI - Detecting Hypoglycemia Incidents Reported in Patients’ Secure Messages: Using Cost-Sensitive Learning and Oversampling to Reduce Data Imbalance JO - J Med Internet Res SP - e11990 VL - 21 IS - 3 KW - secure messaging KW - natural language processing KW - hypoglycemia KW - supervised machine learning KW - imbalanced data KW - adverse event detection KW - drug-related side effects and adverse reactions AB - Background: Improper dosing of medications such as insulin can cause hypoglycemic episodes, which may lead to severe morbidity or even death. Although secure messaging was designed for exchanging nonurgent messages, patients sometimes report hypoglycemia events through secure messaging. 
Detecting these patient-reported adverse events may help alert clinical teams and enable early corrective actions to improve patient safety. Objective: We aimed to develop a natural language processing system, called HypoDetect (Hypoglycemia Detector), to automatically identify hypoglycemia incidents reported in patients’ secure messages. Methods: An expert in public health annotated 3000 secure message threads between patients with diabetes and US Department of Veterans Affairs clinical teams as containing patient-reported hypoglycemia incidents or not. A physician independently annotated 100 threads randomly selected from this dataset to determine interannotator agreement. We used this dataset to develop and evaluate HypoDetect. HypoDetect incorporates 3 machine learning algorithms widely used for text classification: linear support vector machines, random forest, and logistic regression. We explored different learning features, including new knowledge-driven features. Because only 114 (3.80%) messages were annotated as positive, we investigated cost-sensitive learning and oversampling methods to mitigate the challenge of imbalanced data. Results: The interannotator agreement was Cohen kappa=.976. Using cross-validation, logistic regression with cost-sensitive learning achieved the best performance (area under the receiver operating characteristic curve=0.954, sensitivity=0.693, specificity=0.974, F1 score=0.590). Cost-sensitive learning and the ensembled synthetic minority oversampling technique improved the sensitivity of the baseline systems substantially (by 0.123 to 0.728 absolute gains). Our results show that a variety of features contributed to the best performance of HypoDetect. Conclusions: Despite the challenge of data imbalance, HypoDetect achieved promising results for the task of detecting hypoglycemia incidents from secure messages. The system has a great potential to facilitate early detection and treatment of hypoglycemia. 
UR - http://www.jmir.org/2019/3/e11990/ UR - https://doi.org/10.2196/11990 UR - http://www.ncbi.nlm.nih.gov/pubmed/30855231 DO - 10.2196/11990 ID - info:doi/10.2196/11990 ER - TY - JOUR AU - Li, Rumeng AU - Hu, Baotian AU - Liu, Feifan AU - Liu, Weisong AU - Cunningham, Francesca AU - McManus, David D AU - Yu, Hong PY - 2019 DA - 2019/02/08 TI - Detection of Bleeding Events in Electronic Health Record Notes Using Convolutional Neural Network Models Enhanced With Recurrent Neural Network Autoencoders: Deep Learning Approach JO - JMIR Med Inform SP - e10788 VL - 7 IS - 1 KW - autoencoder KW - BiLSTM KW - bleeding KW - convolutional neural networks KW - electronic health record AB - Background: Bleeding events are common and critical and may cause significant morbidity and mortality. High incidences of bleeding events are associated with cardiovascular disease in patients on anticoagulant therapy. Prompt and accurate detection of bleeding events is essential to prevent serious consequences. As bleeding events are often described in clinical notes, automatic detection of bleeding events from electronic health record (EHR) notes may improve drug-safety surveillance and pharmacovigilance. Objective: We aimed to develop a natural language processing (NLP) system to automatically classify whether an EHR note sentence contains a bleeding event. Methods: We expert annotated 878 EHR notes (76,577 sentences and 562,630 word-tokens) to identify bleeding events at the sentence level. This annotated corpus was used to train and validate our NLP systems. We developed an innovative hybrid convolutional neural network (CNN) and long short-term memory (LSTM) autoencoder (HCLA) model that integrates a CNN architecture with a bidirectional LSTM (BiLSTM) autoencoder model to leverage large unlabeled EHR data. 
Results: HCLA achieved the best area under the receiver operating characteristic curve (0.957) and F1 score (0.938) to identify whether a sentence contains a bleeding event, thereby surpassing the strong baseline support vector machines and other CNN and autoencoder models. Conclusions: By incorporating a supervised CNN model and a pretrained unsupervised BiLSTM autoencoder, the HCLA achieved high performance in detecting bleeding events. UR - http://medinform.jmir.org/2019/1/e10788/ UR - https://doi.org/10.2196/10788 UR - http://www.ncbi.nlm.nih.gov/pubmed/30735140 DO - 10.2196/10788 ID - info:doi/10.2196/10788 ER - TY - JOUR AU - Fulmer, Russell AU - Joerin, Angela AU - Gentile, Breanna AU - Lakerink, Lysanne AU - Rauws, Michiel PY - 2018 DA - 2018/12/13 TI - Using Psychological Artificial Intelligence (Tess) to Relieve Symptoms of Depression and Anxiety: Randomized Controlled Trial JO - JMIR Ment Health SP - e64 VL - 5 IS - 4 KW - artificial intelligence KW - mental health services KW - depression KW - anxiety KW - students AB - Background: Students in need of mental health care face many barriers including cost, location, availability, and stigma. Studies show that computer-assisted therapy and 1 conversational chatbot delivering cognitive behavioral therapy (CBT) offer a less-intensive and more cost-effective alternative for treating depression and anxiety. Although CBT is one of the most effective treatment methods, applying an integrative approach has been linked to equally effective posttreatment improvement. Integrative psychological artificial intelligence (AI) offers a scalable solution as the demand for affordable, convenient, lasting, and secure support grows. Objective: This study aimed to assess the feasibility and efficacy of using an integrative psychological AI, Tess, to reduce self-identified symptoms of depression and anxiety in college students. 
Methods: In this randomized controlled trial, 75 participants were recruited from 15 universities across the United States. All participants completed Web-based surveys, including the Patient Health Questionnaire (PHQ-9), Generalized Anxiety Disorder Scale (GAD-7), and Positive and Negative Affect Scale (PANAS) at baseline and 2 to 4 weeks later (T2). The 2 test groups consisted of 50 participants in total and were randomized to receive unlimited access to Tess for either 2 weeks (n=24) or 4 weeks (n=26). The information-only control group participants (n=24) received an electronic link to the National Institute of Mental Health’s (NIMH) eBook on depression among college students and were only granted access to Tess after completion of the study. Results: A sample of 74 participants completed this study with 0% attrition from the test group and less than 1% attrition from the control group (1/24). The average age of participants was 22.9 years, with 70% of participants being female (52/74), mostly Asian (37/74, 51%), and white (32/74, 41%). Group 1 received unlimited access to Tess, with daily check-ins for 2 weeks. Group 2 received unlimited access to Tess with biweekly check-ins for 4 weeks. The information-only control group was provided with an electronic link to the NIMH’s eBook. Multivariate analysis of covariance was conducted. We used an alpha level of .05 for all statistical tests. Results revealed a statistically significant difference between the control group and group 1, such that group 1 reported a significant reduction in symptoms of depression as measured by the PHQ-9 (P=.03), whereas those in the control group did not. A statistically significant difference was found between the control group and both test groups 1 and 2 for symptoms of anxiety as measured by the GAD-7. Group 1 (P=.045) and group 2 (P=.02) reported a significant reduction in symptoms of anxiety, whereas the control group did not. 
A statistically significant difference was found on the PANAS between the control group and group 1 (P=.03) and suggests that Tess did impact scores. Conclusions: This study offers evidence that AI can serve as a cost-effective and accessible therapeutic agent. Although not designed to appropriate the role of a trained therapist, integrative psychological AI emerges as a feasible option for delivering support. Trial Registration: International Standard Randomized Controlled Trial Number: ISRCTN61214172; https://doi.org/10.1186/ISRCTN61214172. UR - http://mental.jmir.org/2018/4/e64/ UR - https://doi.org/10.2196/mental.9782 UR - http://www.ncbi.nlm.nih.gov/pubmed/30545815 DO - 10.2196/mental.9782 ID - info:doi/10.2196/mental.9782 ER - TY - JOUR AU - Lo, Wai Leung Ambrose AU - Lei, Di AU - Li, Le AU - Huang, Dong Feng AU - Tong, Kin-Fai PY - 2018 DA - 2018/11/26 TI - The Perceived Benefits of an Artificial Intelligence–Embedded Mobile App Implementing Evidence-Based Guidelines for the Self-Management of Chronic Neck and Back Pain: Observational Study JO - JMIR Mhealth Uhealth SP - e198 VL - 6 IS - 11 KW - low back pain KW - neck pain KW - mobile app KW - exercise therapy KW - mHealth AB - Background: Chronic musculoskeletal neck and back pain are disabling conditions among adults. Use of technology has been suggested as an alternative way to increase adherence to exercise therapy, which may improve clinical outcomes. Objective: The aim was to investigate the self-perceived benefits of an artificial intelligence (AI)–embedded mobile app to self-manage chronic neck and back pain. Methods: A total of 161 participants responded to the invitation. The evaluation questionnaire included 14 questions that were intended to explore if using the AI rehabilitation system may (1) increase time spent on therapeutic exercise, (2) affect pain level (assessed by the 0-10 Numerical Pain Rating Scale), and (3) reduce the need for other interventions. 
Results: An increase in time spent on therapeutic exercise per day was observed. The median Numerical Pain Rating Scale scores were 6 (interquartile range [IQR] 5-8) before and 4 (IQR 3-6) after using the AI-embedded mobile app (95% CI 1.18-1.81). A 3-point reduction was reported by the participants who used the AI-embedded mobile app for more than 6 months. Reduction in the usage of other interventions while using the AI-embedded mobile app was also reported. Conclusions: This study demonstrated the positive self-perceived beneficiary effect of using the AI-embedded mobile app to provide a personalized therapeutic exercise program. The positive results suggest that it at least warrants further study to investigate the physiological effect of the AI-embedded mobile app and how it compares with routine clinical care. UR - http://mhealth.jmir.org/2018/11/e198/ UR - https://doi.org/10.2196/mhealth.8127 UR - http://www.ncbi.nlm.nih.gov/pubmed/30478019 DO - 10.2196/mhealth.8127 ID - info:doi/10.2196/mhealth.8127 ER - TY - JOUR AU - Bickmore, Timothy W AU - Trinh, Ha AU - Olafsson, Stefan AU - O'Leary, Teresa K AU - Asadi, Reza AU - Rickles, Nathaniel M AU - Cruz, Ricardo PY - 2018 DA - 2018/09/04 TI - Patient and Consumer Safety Risks When Using Conversational Assistants for Medical Information: An Observational Study of Siri, Alexa, and Google Assistant JO - J Med Internet Res SP - e11510 VL - 20 IS - 9 KW - conversational assistant KW - conversational interface KW - dialogue system KW - medical error KW - patient safety AB - Background: Conversational assistants, such as Siri, Alexa, and Google Assistant, are ubiquitous and are beginning to be used as portals for medical services. However, the potential safety issues of using conversational assistants for medical information by patients and consumers are not understood. 
Objective: To determine the prevalence and nature of the harm that could result from patients or consumers using conversational assistants for medical information. Methods: Participants were given medical problems to pose to Siri, Alexa, or Google Assistant, and asked to determine an action to take based on information from the system. Assignment of tasks and systems were randomized across participants, and participants queried the conversational assistants in their own words, making as many attempts as needed until they either reported an action to take or gave up. Participant-reported actions for each medical task were rated for patient harm using an Agency for Healthcare Research and Quality harm scale. Results: Fifty-four subjects completed the study with a mean age of 42 years (SD 18). Twenty-nine (54%) were female, 31 (57%) Caucasian, and 26 (50%) were college educated. Only 8 (15%) reported using a conversational assistant regularly, while 22 (41%) had never used one, and 24 (44%) had tried one “a few times.” Forty-four (82%) used computers regularly. Subjects were only able to complete 168 (43%) of their 394 tasks. Of these, 49 (29%) reported actions that could have resulted in some degree of patient harm, including 27 (16%) that could have resulted in death. Conclusions: Reliance on conversational assistants for actionable medical information represents a safety risk for patients and consumers. Patients should be cautioned to not use these technologies for answers to medical questions they intend to act on without further consultation from a health care provider. 
UR - http://www.jmir.org/2018/9/e11510/ UR - https://doi.org/10.2196/11510 UR - http://www.ncbi.nlm.nih.gov/pubmed/30181110 DO - 10.2196/11510 ID - info:doi/10.2196/11510 ER - TY - JOUR AU - Suganuma, Shinichiro AU - Sakamoto, Daisuke AU - Shimoyama, Haruhiko PY - 2018 DA - 2018/07/31 TI - An Embodied Conversational Agent for Unguided Internet-Based Cognitive Behavior Therapy in Preventative Mental Health: Feasibility and Acceptability Pilot Trial JO - JMIR Ment Health SP - e10454 VL - 5 IS - 3 KW - embodied conversational agent KW - cognitive behavioral therapy KW - psychological distress KW - mental well‐being KW - artificial intelligence technology AB - Background: Recent years have seen an increase in the use of internet-based cognitive behavioral therapy in the area of mental health. Although lower effectiveness and higher dropout rates of unguided than those of guided internet-based cognitive behavioral therapy remain critical issues, not incurring ongoing human clinical resources makes it highly advantageous. Objective: Current research in psychotherapy, which acknowledges the importance of therapeutic alliance, aims to evaluate the feasibility and acceptability, in terms of mental health, of an application that is embodied with a conversational agent. This application was enabled for use as an internet-based cognitive behavioral therapy preventative mental health measure. Methods: Analysis of the data from the 191 participants of the experimental group with a mean age of 38.07 (SD 10.75) years and the 263 participants of the control group with a mean age of 38.05 (SD 13.45) years using a 2-way factorial analysis of variance (group × time) was performed. Results: There was a significant main effect (P=.02) and interaction for time on the variable of positive mental health (P=.02), and for the treatment group, a significant simple main effect was also found (P=.002). 
In addition, there was a significant main effect (P=.02) and interaction for time on the variable of negative mental health (P=.005), and for the treatment group, a significant simple main effect was also found (P=.001). Conclusions: This research can be seen to represent a certain level of evidence for the mental health application developed herein, indicating empirically that internet-based cognitive behavioral therapy with the embodied conversational agent can be used in mental health care. In the pilot trial, given the issues related to feasibility and acceptability, it is necessary to pursue higher quality evidence while continuing to further improve the application, based on the findings of the current research. UR - http://mental.jmir.org/2018/3/e10454/ UR - https://doi.org/10.2196/10454 UR - http://www.ncbi.nlm.nih.gov/pubmed/30064969 DO - 10.2196/10454 ID - info:doi/10.2196/10454 ER - TY - JOUR AU - Morris, Robert R AU - Kouddous, Kareem AU - Kshirsagar, Rohan AU - Schueller, Stephen M PY - 2018 DA - 2018/06/26 TI - Towards an Artificially Empathic Conversational Agent for Mental Health Applications: System Design and User Perceptions JO - J Med Internet Res SP - e10148 VL - 20 IS - 6 KW - conversational agents KW - mental health KW - empathy KW - crowdsourcing KW - peer support AB - Background: Conversational agents cannot yet express empathy in nuanced ways that account for the unique circumstances of the user. Agents that possess this faculty could be used to enhance digital mental health interventions. Objective: We sought to design a conversational agent that could express empathic support in ways that might approach, or even match, human capabilities. Another aim was to assess how users might appraise such a system. Methods: Our system used a corpus-based approach to simulate expressed empathy. Responses from an existing pool of online peer support data were repurposed by the agent and presented to the user. 
Information retrieval techniques and word embeddings were used to select historical responses that best matched a user’s concerns. We collected ratings from 37,169 users to evaluate the system. Additionally, we conducted a controlled experiment (N=1284) to test whether the alleged source of a response (human or machine) might change user perceptions. Results: The majority of responses created by the agent (2986/3770, 79.20%) were deemed acceptable by users. However, users significantly preferred the efforts of their peers (P<.001). This effect was maintained in a controlled study (P=.02), even when the only difference in responses was whether they were framed as coming from a human or a machine. Conclusions: Our system illustrates a novel way for machines to construct nuanced and personalized empathic utterances. However, the design had significant limitations and further research is needed to make this approach viable. Our controlled study suggests that even in ideal conditions, nonhuman agents may struggle to express empathy as well as humans. The ethical implications of empathic agents, as well as their potential iatrogenic effects, are also discussed. 
UR - http://www.jmir.org/2018/6/e10148/ UR - https://doi.org/10.2196/10148 UR - http://www.ncbi.nlm.nih.gov/pubmed/29945856 DO - 10.2196/10148 ID - info:doi/10.2196/10148 ER - TY - JOUR AU - Martinez-Martin, Nicole AU - Kreitmair, Karola PY - 2018 DA - 2018/04/23 TI - Ethical Issues for Direct-to-Consumer Digital Psychotherapy Apps: Addressing Accountability, Data Protection, and Consent JO - JMIR Ment Health SP - e32 VL - 5 IS - 2 KW - ethics KW - ethical issues KW - mental health KW - technology KW - telemedicine KW - mHealth KW - psychotherapy UR - http://mental.jmir.org/2018/2/e32/ UR - https://doi.org/10.2196/mental.9423 UR - http://www.ncbi.nlm.nih.gov/pubmed/29685865 DO - 10.2196/mental.9423 ID - info:doi/10.2196/mental.9423 ER - TY - JOUR AU - Howe, Esther AU - Pedrelli, Paola AU - Morris, Robert AU - Nyer, Maren AU - Mischoulon, David AU - Picard, Rosalind PY - 2017 DA - 2017/09/22 TI - Feasibility of an Automated System Counselor for Survivors of Sexual Assault JO - iproc SP - e37 VL - 3 IS - 1 KW - CBT KW - web chat AB - Background: Sexual assault (SA) is common and costly to individuals and society, and increases risk of mental health disorders. Stigma and cost of care discourage survivors from seeking help. Norms profiling survivors as heterosexual, cisgendered women dissuade LGBTQIA+ individuals and men from accessing care. Because individuals prefer disclosing sensitive information online rather than in-person, online systems—like instant messaging and chatbots—for counseling may bypass concerns about stigma. These systems’ anonymity may increase disclosure and decrease impression management, the process by which individuals attempt to influence others’ perceptions. Their low cost may expand reach of care. There are no known evidence-based chat platforms for SA survivors. 
Objective: To examine feasibility of a chat platform with peer and automated system (chatbot) counseling interfaces to provide cognitive reappraisals (a cognitive behavioral therapy technique) to survivors. Methods: Participants are English-speaking, US-based survivors, 18+ years old. Participants are told they will be randomized to chat with a peer or automated system counselor 5 times over 2 weeks. In reality, all participants chat with a peer counselor. Chats employ a modified-for-context evidence-based cognitive reappraisal script developed by Koko, a company offering support services for emotional distress via social networks. At baseline, participants indicate counselor type preference and complete a basic demographic form, the Brief Fear of Negative Evaluation Scale, and self-disclosure items from the International Personality Item Pool. After 5 chats, participants complete questions from the Client Satisfaction Questionnaire (CSQ), Self-Reported Attitudes Toward Agent, and the Working Alliance Inventory. Hypotheses: 1) Online chatting and automated systems will be acceptable and feasible means of delivering cognitive reappraisals to survivors. 2) High impression management (IM≥25) and low self-disclosure (SD≤45) will be associated with preference for an automated system. 3) IM and SD will separately moderate the relationship between counselor assignment and participant satisfaction. Results: Ten participants have completed the study. Recruitment is ongoing. We will enroll 50+ participants by 10/2017 and outline findings at the Connected Health Conference. To date, 70% of participants completed all chats within 24 hours of enrollment, and 60% indicated a pre-chat preference for an automated system, suggesting acceptability of the concept. The post-chat CSQ mean total score of 3.98 on a 5-point Likert scale (1=Poor; 5=Excellent) suggests platform acceptability. Of the 50% reporting high IM, 60% indicated preference for an automated system. 
Of the 30% reporting low SD, 33% reported preference for an automated system. At recruitment completion, ANOVA analyses will elucidate relationships between IM, SD, and counselor assignment. Correlation and linear regression analyses will show any moderating effect of IM and SD on the relationship between counselor assignment and participant satisfaction. Conclusions: Preliminary results suggest acceptability and feasibility of cognitive reappraisals via chat for survivors, and of the automated system counselor concept. Final results will explore relationships between SD, IM, counselor type, and participant satisfaction to inform the development of new platforms for survivors. UR - http://www.iproc.org/2017/1/e37/ UR - https://doi.org/10.2196/iproc.8585 DO - 10.2196/iproc.8585 ID - info:doi/10.2196/iproc.8585 ER -