Journal Description

The Journal of Medical Internet Research (JMIR), now in its 21st year, is the pioneer open access eHealth journal and is the flagship journal of JMIR Publications. It is the leading digital health journal globally in terms of quality/visibility (Impact Factor 2019: 5.03), ranking Q1 in the medical informatics category, and is also the largest journal in the field. The journal focuses on emerging technologies, medical devices, apps, engineering, telehealth and informatics applications for patient education, prevention, population health and clinical care. As a leading high-impact journal in its disciplines (health informatics and health services research), it is selective, but it is now complemented by almost 30 specialty JMIR sister journals, which have a broader scope, and which together receive over 6,000 submissions a year. Peer-review reports are portable across JMIR journals and papers can be transferred, so authors save time: rather than resubmitting a paper to a different journal, they can simply transfer it between journals.

As an open access journal, we are read by clinicians, allied health professionals, informal caregivers, and patients alike, and have (as with all JMIR journals) a focus on readable and applied science reporting the design and evaluation of health innovations and emerging technologies. We publish original research, viewpoints, and reviews (both literature reviews and medical device/technology/app reviews).

We are also a leader in participatory and open science approaches, and offer the option to publish new submissions immediately as preprints, which receive DOIs for immediate citation (eg, in grant proposals), and for open peer-review purposes. We also invite patients to participate (eg, as peer-reviewers) and have patient representatives on editorial boards.

Be a widely cited leader in the digital health revolution and submit your paper today!


Recent Articles:

  • Source: Canva; Copyright: The Authors; License: Licensed by the authors.

    Behavior Change Text Messages for Home Exercise Adherence in Knee Osteoarthritis: Randomized Trial


    Background: Exercise is a core recommended treatment for knee osteoarthritis (OA), yet adherence declines, particularly following cessation of clinician supervision. Objective: This study aims to evaluate whether a 24-week SMS intervention improves adherence to unsupervised home exercise in people with knee OA and obesity compared with no SMS. Methods: A two-group superiority randomized controlled trial was performed in a community setting. Participants were people aged 50 years with knee OA and BMI ≥30 kg/m² who had undertaken a 12-week physiotherapist-supervised exercise program as part of a preceding clinical trial. Both groups were asked to continue their home exercise program unsupervised three times per week for 24 weeks and were randomly allocated to a behavior change theory–informed, automated, semi-interactive SMS intervention addressing exercise barriers and facilitators or to control (no SMS). Primary outcomes were self-reported home exercise adherence at 24 weeks measured by the Exercise Adherence Rating Scale (EARS) Section B (0-24, higher number indicating greater adherence) and the number of days exercised in the past week (0-3). Secondary outcomes included self-rated adherence (numeric rating scale), knee pain, physical function, quality of life, global change, physical activity, self-efficacy, pain catastrophizing, and kinesiophobia. Results: A total of 110 participants (56 SMS group and 54 no SMS) were enrolled and 99 (90.0%) completed both primary outcomes (48/56, 86% SMS group and 51/54, 94% no SMS). At 24 weeks, the SMS group reported higher EARS scores (mean 16.5, SD 6.5 vs mean 13.3, SD 7.0; mean difference 3.1, 95% CI 0.8-5.5; P=.01) and more days exercised in the past week (mean 1.8, SD 1.2 vs mean 1.3, SD 1.2; mean difference 0.6, 95% CI 0.2-1.0; P=.01) than the control group. There was no evidence of between-group differences in secondary outcomes. Conclusions: An SMS program increased self-reported adherence to unsupervised home exercise in people with knee OA and obesity, although this did not translate into improved clinical outcomes. Trial Registration: Australian New Zealand Clinical Trials Registry 12617001243303.

  • Source: The Authors / Placeit; Copyright: The Authors / Placeit; License: Licensed by JMIR.

    Marrying Medical Domain Knowledge With Deep Learning on Electronic Health Records: A Deep Visual Analytics Approach


    Background: Deep learning models have attracted significant interest from health care researchers during the last few decades. There have been many studies that apply deep learning to medical applications and achieve promising results. However, there are three limitations to the existing models: (1) most clinicians are unable to interpret the results from the existing models, (2) existing models cannot incorporate complicated medical domain knowledge (eg, a disease causes another disease), and (3) most existing models lack visual exploration and interaction. Both the electronic health record (EHR) data set and the deep model results are complex and abstract, which impedes clinicians from exploring and communicating with the model directly. Objective: The objective of this study is to develop an interpretable and accurate risk prediction model as well as an interactive clinical prediction system to support EHR data exploration, knowledge graph demonstration, and model interpretation. Methods: A domain-knowledge–guided recurrent neural network (DG-RNN) model is proposed to predict clinical risks. The model takes medical event sequences as input and incorporates medical domain knowledge by attending to a subgraph of the whole medical knowledge graph. A global pooling operation and a fully connected layer are used to output the clinical outcomes. The middle results and the parameters of the fully connected layer are helpful in identifying which medical events cause clinical risks. DG-Viz is also designed to support EHR data exploration, knowledge graph demonstration, and model interpretation. Results: We conducted both risk prediction experiments and a case study on a real-world data set. A total of 554 patients with heart failure and 1662 control patients without heart failure were selected from the data set. The experimental results show that the proposed DG-RNN outperforms the state-of-the-art approaches by approximately 1.5%. 
The case study demonstrates how our medical physician collaborator can effectively explore the data and interpret the prediction results using DG-Viz. Conclusions: In this study, we present DG-Viz, an interactive clinical prediction system, which brings together the power of deep learning (ie, a DG-RNN–based model) and visual analytics to predict clinical risks and visually interpret the EHR prediction results. Experimental results and a case study on heart failure risk prediction tasks demonstrate the effectiveness and usefulness of the DG-Viz system. This study will pave the way for interactive, interpretable, and accurate clinical risk predictions.

  • Copyright: Freepik; License: Licensed by JMIR.

    Patients' and Nurses’ Experiences and Perceptions of Remote Monitoring of Implantable Cardiac Defibrillators in Heart Failure: Cross-Sectional,...


    Background: The new generation of implantable cardioverter-defibrillators (ICDs) supports wireless technology, which enables remote patient monitoring (RPM) of the device. In Sweden, it is mainly registered nurses with advanced education and training in ICD devices who handle the arrhythmias and technical issues of the remote transmissions. Previous studies have largely focused on the perceptions of physicians, and it has not been explored how the patients’ and nurses’ experiences of RPM correspond to each other. Objective: Our objective is to describe, explore, and compare the experiences and perceptions, concerning RPM of ICD, of patients with heart failure (HF) and nurses performing ICD follow-up. Methods: This study has a cross-sectional, descriptive, mixed methods design. All patients with HF and an ICD with RPM from one region in Sweden, who had transitioned from office-based visits to implementing RPM, and ICD nurses from all ICD clinics in Sweden were invited to complete a purpose-designed, 8-item questionnaire to assess experiences of RPM. The questionnaire started with a neutral question: “What are your experiences of RPM in general?” This was followed by one positive subscale with three questions (score range 3-12), with higher scores reflecting more positive experiences, and one negative subscale with three questions (score range 3-12), with lower scores reflecting more negative experiences. One open-ended question was analyzed with qualitative content analysis. Results: The sample consisted of 175 patients (response rate 98.9%) and 30 ICD nurses (response rate 60%). The majority of patients (154/175, 88.0%) and nurses (23/30, 77%) experienced RPM as very good; however, the nurses noted more downsides than did the patients. The mean scores of the negative experiences subscale were 11.5 (SD 1.1) for the patients and 10.7 (SD 0.9) for the nurses (P=.08). 
The mean scores of the positive experiences subscale were 11.1 (SD 1.6) for the patients and 8.5 (SD 1.9) for the nurses (P=.04). A total of 11 out of 175 patients (6.3%) were worried or anxious about what the RPM entailed, while 15 out of 30 nurses (50%) felt distressed by the responsibility that accompanied their work with RPM (P=.04). Patients found that RPM increased their own (173/175, 98.9%) and their relatives’ (169/175, 96.6%) security, and all nurses (30/30, 100%) answered that they found RPM to be necessary from a safety perspective. Most patients considered fewer office-based visits an advantage. Nurses found it difficult to handle different systems with different platforms, especially in smaller clinics with few patients. Another difficulty was setting the correct number of alarms for the individual patient. This caused a high number of transmissions and a risk of missing important information. Conclusions: Both patients and nurses found that RPM increased assurance, reliance, and safety. Few patients were anxious about what the RPM entailed, while about half of the nurses felt distressed by the responsibility that accompanied their work with RPM. To increase nurses’ sense of security, it seems important to adjust organizational routines and reimbursement systems and to balance the workload.

  • Source: Freepik; Copyright: katemangostar; License: Licensed by JMIR.

    Influencing Factors of Continuous Use of Web-Based Diagnosis and Treatment by Patients With Diabetes: Model Development and Data Analysis


    Background: The internet has become a major source of health care information for patients and has enabled them to obtain continuous diagnosis and treatment services. However, the quality of web-based health care information is mixed, which raises concerns about the credibility of physician advice obtained on the internet and markedly affects patients’ choices and decision-making behavior with regard to web-based diagnosis and treatment. Therefore, it is important to identify the influencing factors of continuous use of web-based diagnosis and treatment from the perspective of trust. Objective: The objective of our study was to investigate the influencing factors of patients’ continuous use of web-based diagnosis and treatment based on the elaboration likelihood model and on trust theory in the face of a decline in physiological conditions and the lack of convenient long-term professional guidance. Methods: Data on patients with diabetes in China who used an online health community twice or more from January 2018 to June 2019 were collected by developing a web crawler. A total of 2437 valid data records were obtained and then analyzed using correlation factor analysis and regression analysis to validate our research model and hypotheses. Results: The timely response rate (under the central route), the reference group (under the peripheral route), and the number of thank-you letters and patients’ ratings that measure physicians’ electronic word of mouth are all positively related with the continuous use of web-based diagnosis and treatment by patients with diabetes. Moreover, the physician’s professional title and hospital’s ranking level had weak effects on the continuous use of web-based diagnosis and treatment by patients with diabetes, and the effect size of the physician’s professional title was greater than that of the hospital’s ranking level. 
Conclusions: From the patient's perspective, among all indicators that measure physicians’ service quality, the effect size of a timely response rate is much greater than those of effect satisfaction and attitude satisfaction; thus, the former plays an essential role in influencing the patients’ behavior of continuous use of web-based diagnosis and treatment services. In addition, the effect size of electronic word of mouth was greater than that of the physician’s offline reputation. Physicians who provide web-based services should seek clues to patients’ needs and preferences for receiving health information during web-based physician-patient interactions and make full use of their professionalism and service reliability to communicate effectively with patients. Furthermore, the platform should improve its electronic word of mouth mechanism to realize its full potential in trust transmission and motivation, ultimately promoting the patient’s information-sharing behavior and continuous use of web-based diagnosis and treatment.

  • Human versus computer-based instructions for exercise. Source: The Authors; Copyright: The Authors; License: Creative Commons Attribution (CC-BY).

    Effectiveness of Human Versus Computer-Based Instructions for Exercise on Physical Activity–Related Health Competence in Patients with Hip Osteoarthritis:...


    Background: Hip and knee osteoarthritis is ranked as the 11th highest contributor to global disability. Exercise is a core treatment in osteoarthritis. The model for physical activity–related health competence describes possibilities to empower patients to perform physical exercises in the best possible health-promoting manner while taking into account their own physical condition. Face-to-face supervision is the gold standard for exercise guidance. Objective: The aim of this study was to evaluate whether instruction and guidance via a digital app is not inferior to supervision by a physiotherapist with regard to movement quality, control competence for physical training, and exercise-specific self-efficacy. Methods: Patients with clinically diagnosed hip osteoarthritis were recruited via print advertisements, emails and flyers. The intervention consisted of two identical training sessions with one exercise for mobility, two for strength, and one for balance. One session was guided by a physiotherapist and the other was guided by a fully automated tablet computer-based app. Both interventions took place at a university hospital. Outcomes were assessor-rated movement quality, and self-reported questionnaires on exercise-specific self-efficacy and control competence for physical training. Participants were randomly assigned to one of two treatment sequences. One sequence started with the app in the first session followed by the physiotherapist in the second session after a minimum washout phase of 27 days (AP group) and the other sequence occurred in the reverse order (PA group). Noninferiority was defined as a between-treatment effect (gIG)<0.2 in favor of the physiotherapist-guided training, including the upper confidence interval. Participants, assessors, and the statistician were neither blinded to the treatment nor to the treatment sequence. 
Results: A total of 54 participants started the first training session (32 women, 22 men; mean age 62.4, SD 8.2 years). The treatment sequence groups were similar in size (PA: n=26; AP: n=28). Seven subjects did not attend the second training session (PA: n=3; AP: n=4). The app was found to be inferior to the physiotherapist in all outcomes considered, except for movement quality of the mobility exercise (gIG –0.13, 95% CI –0.41-0.16). In contrast to the two strengthening exercises in different positions (supine gIG 0.76, 95% CI 0.39-1.13; table gIG 1.19, 95% CI 0.84-1.55), movement quality of the balance exercise was close to noninferiority (gIG 0.15, 95% CI –0.17-0.48). Exercise-specific self-efficacy showed a strong effect in favor of the physiotherapist (gIG 0.84, 95% CI 0.46-1.22). In terms of control competence for physical training, the app was only slightly inferior to the physiotherapist (gIG 0.18, 95% CI –0.14-0.50). Conclusions: Despite its inferiority in almost all measures of interest, exercise-specific self-efficacy and control competence for physical training did improve in patients who used the digital app. Movement quality was acceptable for exercises that are easy to conduct and instruct. The digital app opens up possibilities as a supplementary tool to support patients in independent home training for less complex exercises; however, it cannot replace a physiotherapist. Trial Registration: German Clinical Trial Register: DRKS00015759;
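The noninferiority rule described in the Methods above can be checked against the reported intervals. The following is a minimal Python sketch, not the authors' analysis code: the `Effect` class and the function names are hypothetical, and the only inputs are the gIG estimates and 95% CIs quoted in the Results. It assumes, as the abstract states, that the app counts as noninferior only when the upper confidence limit stays below 0.2.

```python
from dataclasses import dataclass

# Noninferiority margin quoted in the abstract (gIG < 0.2, upper CI limit included).
MARGIN = 0.2


@dataclass(frozen=True)
class Effect:
    """One reported between-treatment effect (hypothetical container)."""
    outcome: str
    g: float        # point estimate; positive values favor the physiotherapist
    ci_low: float   # lower 95% confidence limit
    ci_high: float  # upper 95% confidence limit


# Values copied from the Results section of the abstract.
REPORTED = [
    Effect("movement quality, mobility exercise", -0.13, -0.41, 0.16),
    Effect("movement quality, strength exercise (supine)", 0.76, 0.39, 1.13),
    Effect("movement quality, strength exercise (table)", 1.19, 0.84, 1.55),
    Effect("movement quality, balance exercise", 0.15, -0.17, 0.48),
    Effect("exercise-specific self-efficacy", 0.84, 0.46, 1.22),
    Effect("control competence for physical training", 0.18, -0.14, 0.50),
]


def is_noninferior(effect, margin=MARGIN):
    """True when the whole interval, including its upper limit, lies below the margin."""
    return effect.ci_high < margin


def noninferior_outcomes(effects=REPORTED, margin=MARGIN):
    """Names of the outcomes for which the app meets the noninferiority rule."""
    return [e.outcome for e in effects if is_noninferior(e, margin)]


if __name__ == "__main__":
    for e in REPORTED:
        verdict = "noninferior" if is_noninferior(e) else "not shown to be noninferior"
        print(f"{e.outcome}: gIG {e.g} (95% CI {e.ci_low} to {e.ci_high}) -> {verdict}")
```

Under this rule only the mobility exercise passes, which matches the abstract. The balance exercise and control competence have point estimates below 0.2, but their upper limits (0.48 and 0.50) cross the margin.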

  • Source: iStock by Getty Images; Copyright: Prostock-Studio; License: Licensed by the authors.

    Data Quality Issues With Physician-Rating Websites: Systematic Review


    Background: In recent years, online physician-rating websites have become prominent and exert considerable influence on patients’ decisions. However, the quality of these decisions depends on the quality of data that these systems collect. Thus, there is a need to examine the various data quality issues with physician-rating websites. Objective: This study’s objective was to identify and categorize the data quality issues afflicting physician-rating websites by reviewing the literature on online patient-reported physician ratings and reviews. Methods: We performed a systematic literature search in ACM Digital Library, EBSCO, Springer, PubMed, and Google Scholar. The search was limited to quantitative, qualitative, and mixed-method papers published in the English language from 2001 to 2020. Results: A total of 423 articles were screened. From these, 49 papers describing 18 unique data quality issues afflicting physician-rating websites were included. Using a data quality framework, we classified these issues into the following four categories: intrinsic, contextual, representational, and accessible. Among the papers, 53% (26/49) reported intrinsic data quality errors, 61% (30/49) highlighted contextual data quality issues, 8% (4/49) discussed representational data quality issues, and 27% (13/49) emphasized accessibility data quality. More than half the papers discussed multiple categories of data quality issues. Conclusions: The results from this review demonstrate the presence of a range of data quality issues. While intrinsic and contextual factors have been well-researched, accessibility and representational issues warrant more attention from researchers, as well as practitioners. In particular, representational factors, such as the impact of inline advertisements and the positioning of positive reviews on the first few pages, are usually deliberate and result from the business model of physician-rating websites. 
The impact of these factors on data quality has not been addressed adequately and requires further investigation.

  • Source: Freepik; Copyright: wavebreakmedia_micro; License: Licensed by JMIR.

    Clinical Mortality in a Large COVID-19 Cohort: Observational Study


    Background: Northwell Health, an integrated health system in New York, has treated more than 15,000 inpatients with COVID-19 at the US epicenter of the SARS-CoV-2 pandemic. Objective: We describe the demographic characteristics of patients who died of COVID-19, observation of frequent rapid response team/cardiac arrest (RRT/CA) calls for non–intensive care unit (ICU) patients, and factors that contributed to RRT/CA calls. Methods: A team of registered nurses reviewed the medical records of inpatients who tested positive for SARS-CoV-2 via polymerase chain reaction before or on admission and who died between March 13 (first Northwell Health inpatient expiration) and April 30, 2020, at 15 Northwell Health hospitals. The findings for these patients were abstracted into a database and statistically analyzed. Results: Of 2634 patients who died of COVID-19, 1478 (56.1%) had oxygen saturation levels ≥90% on presentation and required no respiratory support. At least one RRT/CA was called on 1112/2634 patients (42.2%) at a non-ICU level of care. Before the RRT/CA call, the most recent oxygen saturation levels for 852/1112 (76.6%) of these non-ICU patients were at least 90%. At the time the RRT/CA was called, 479/1112 patients (43.1%) had an oxygen saturation of <80%. Conclusions: This study represents one of the largest reviewed cohorts of mortality that also captures data in nonstructured fields. Approximately 50% of deaths occurred at a non-ICU level of care despite admission to the appropriate care setting with normal staffing. The data imply a sudden, unexpected deterioration in respiratory status requiring RRT/CA in a large number of non-ICU patients. Patients admitted at a non-ICU level of care suffered rapid clinical deterioration, often with a sudden decrease in oxygen saturation. These patients could benefit from additional monitoring (eg, continuous central oxygenation saturation), although this approach warrants further study.

  • Source: Unsplash; Copyright: Obi Onyeador; License: Licensed by JMIR.

    Intergroup Contact, COVID-19 News Consumption, and the Moderating Role of Digital Media Trust on Prejudice Toward Asians in the United States:...


    Background: The perceived threat of a contagious virus may lead people to be distrustful of immigrants and out-groups. Since the COVID-19 outbreak, the salient politicized discourses of blaming Chinese people for spreading the virus have fueled over 2000 reports of anti-Asian racial incidents and hate crimes in the United States. Objective: The study aims to investigate the relationships between news consumption, trust, intergroup contact, and prejudicial attitudes toward Asians and Asian Americans residing in the United States during the COVID-19 pandemic. We compare how traditional news, social media use, and biased news exposure cultivate racial attitudes, and the moderating role of media use and trust on prejudice against Asians is examined. Methods: A cross-sectional study was completed in May 2020. A total of 430 US adults (mean age 36.75, SD 11.49 years; n=258, 60% male) participated in an online survey through Amazon’s Mechanical Turk platform. Respondents answered questions related to traditional news exposure, social media use, perceived trust, and their top three news channels for staying informed about the novel coronavirus. In addition, intergroup contact and racial attitudes toward Asians were assessed. We performed hierarchical regression analyses to test the associations. Moderation effects were estimated using simple slopes testing with a 95% bootstrap confidence interval approach. Results: Participants who identified as conservatives (β=.08, P=.02), had a personal infection history (β=.10, P=.004), and interacted with Asian people frequently in their daily lives (β=.46, P<.001) reported more negative attitudes toward Asians after controlling for sociodemographic variables. Relying more on traditional news media (β=.08, P=.04) and higher levels of trust in social media (β=.13, P=.007) were positively associated with prejudice against Asians. 
In contrast, consuming news from left-leaning outlets (β=–.15, P=.001) and neutral outlets (β=–.13, P=.003) was linked to less prejudicial attitudes toward Asians. Among those who had high trust in social media, exposure had a negative relationship with prejudice. At high levels of trust in digital websites and apps, frequent use was related to less unfavorable attitudes toward Asians. Conclusions: Experiencing racial prejudice among the Asian population during a challenging pandemic can cause poor psychological outcomes and exacerbate health disparities. The results suggest that conservative ideology, personal infection history, frequency of intergroup contact, traditional news exposure, and trust in social media emerge as positive predictors of prejudice against Asians and Asian Americans, whereas people who get COVID-19 news from left-leaning and balanced outlets show less prejudice. For those who have more trust in social media and digital news, frequent use of these two sources is associated with lower levels of prejudice. Our findings highlight the need to reshape traditional news discourses and use social media and mobile news apps to develop credible messages for combating racial prejudice against Asians.

  • Students wearing masks to signify the COVID-19 situation, and using their phones (in particular, WhatsApp). Source: Image created by the authors; Copyright: Jean CJ Liu; License: Creative Commons Attribution (CC-BY).

    The Relation Between Official WhatsApp-Distributed COVID-19 News Exposure and Psychological Symptoms: Cross-Sectional Survey Study


    Background: In a global pandemic, digital technology offers innovative methods to disseminate public health messages. As an example, the messenger app WhatsApp was adopted by both the World Health Organization and government agencies to provide updates on the coronavirus disease (COVID-19). During a time when rumors and excessive news threaten psychological well-being, these services allow for rapid transmission of information and may boost resilience. Objective: In this study, we sought to accomplish the following: (1) assess well-being during the pandemic; (2) replicate prior findings linking exposure to COVID-19 news with psychological distress; and (3) examine whether subscription to an official WhatsApp channel can mitigate this risk. Methods: Across 8 weeks of the COVID-19 outbreak (March 7 to April 21, 2020), we conducted a survey of 1145 adults in Singapore. As the primary outcome measure, participants completed the Depression, Anxiety, and Stress Scale (DASS-21). As predictor variables, participants also answered questions pertaining to the following: (1) their exposure to COVID-19 news; (2) their use of the Singapore government’s WhatsApp channel; and (3) their demographics. Results: Within the sample, 7.9% of participants had severe or extremely severe symptoms on at least one DASS-21 subscale. Depression scores were associated with increased time spent receiving COVID-19 updates, whereas use of the official WhatsApp channel emerged as a protective factor (b=–0.07, t[863]=–2.04, P=.04). Similarly, increased anxiety scores were associated with increased exposure to both updates and rumors, but this risk was mitigated by trust in the government’s WhatsApp messages (b=–0.05, t[863]=–2.13, P=.03). Finally, although stress symptoms increased with the amount of time spent receiving updates, these symptoms were not significantly related to WhatsApp use. 
Conclusions: Our findings suggest that messenger apps may be an effective medium for disseminating pandemic-related information, allowing official agencies to reach a broad sector of the population rapidly. In turn, this use may promote public well-being amid an “infodemic.” Trial Registration: NCT04305574.

  • Source: Image created by the Authors; Copyright: The Authors; License: Creative Commons Attribution (CC-BY).

    Using Smartphones and Wearable Devices to Monitor Behavioral Changes During COVID-19


    Background: In the absence of a vaccine or effective treatment for COVID-19, countries have adopted nonpharmaceutical interventions (NPIs) such as social distancing and full lockdown. An objective and quantitative means of passively monitoring the impact and response of these interventions at a local level is needed. Objective: We aim to explore the utility of the recently developed open-source mobile health platform Remote Assessment of Disease and Relapse (RADAR)–base as a toolbox to rapidly test the effect and response to NPIs intended to limit the spread of COVID-19. Methods: We analyzed data extracted from smartphone and wearable devices, and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the United Kingdom, and the Netherlands. We derived nine features on a daily basis including time spent at home, maximum distance travelled from home, the maximum number of Bluetooth-enabled nearby devices (as a proxy for physical distancing), step count, average heart rate, sleep duration, bedtime, phone unlock duration, and social app use duration. We performed Kruskal-Wallis tests followed by post hoc Dunn tests to assess differences in these features among baseline, prelockdown, and during lockdown periods. We also studied behavioral differences by age, gender, BMI, and educational background. Results: We were able to quantify expected changes in time spent at home, distance travelled, and the number of nearby Bluetooth-enabled devices between prelockdown and during lockdown periods (P<.001 for all five countries). We saw reduced sociality as measured through mobility features and increased virtual sociality through phone use. People were more active on their phones (P<.001 for Italy, Spain, and the United Kingdom), spending more time using social media apps (P<.001 for Italy, Spain, the United Kingdom, and the Netherlands), particularly around major news events. 
Furthermore, participants had a lower heart rate (P<.001 for Italy and Spain; P=.02 for Denmark), went to bed later (P<.001 for Italy, Spain, the United Kingdom, and the Netherlands), and slept more (P<.001 for Italy, Spain, and the United Kingdom). We also found that young people had longer homestay than older people during the lockdown and fewer daily steps. Although there was no significant difference between the high and low BMI groups in time spent at home, the low BMI group walked more. Conclusions: RADAR-base, a freely deployable data collection platform leveraging data from wearables and mobile technologies, can be used to rapidly quantify and provide a holistic view of behavioral changes in response to public health interventions as a result of infectious outbreaks such as COVID-19. RADAR-base may be a viable approach to implementing an early warning system for passively assessing the local compliance to interventions in epidemics and pandemics, and could help countries ease out of lockdown.
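The period comparison named in the Methods above (Kruskal-Wallis across baseline, prelockdown, and lockdown) can be illustrated with a small standard-library sketch. This is not the RADAR-base analysis: the function names are hypothetical and the hours-at-home values are invented for illustration. The sketch omits the tie correction and the post hoc Dunn tests, and it stops at the H statistic rather than computing a P value (H would be compared against a chi-square distribution with k-1 degrees of freedom for k groups).

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


if __name__ == "__main__":
    # Invented daily hours spent at home, one list per period (illustration only).
    baseline = [13.0, 14.5, 12.0, 15.0]
    prelockdown = [14.0, 16.0, 15.5, 13.5]
    lockdown = [21.0, 22.5, 20.0, 23.0]
    print("H =", round(kruskal_wallis_h([baseline, prelockdown, lockdown]), 3))
```

A rank-based test is a natural fit for features such as time at home or step count, whose distributions are typically skewed; a significant H only says that at least one period differs, which is why the abstract follows it with pairwise Dunn tests.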

  • Artificial Intelligence in Medicine. Source: iStock; License: Licensed by JMIR.

    Using Item Response Theory for Explainable Machine Learning in Predicting Mortality in the Intensive Care Unit: Case-Based Approach


    Background: Supervised machine learning (ML) is being featured in the health care literature with study results frequently reported using metrics such as accuracy, sensitivity, specificity, recall, or F1 score. Although each metric provides a different perspective on performance, all of them remain overall measures for the whole sample, discounting the uniqueness of each case or patient. Intuitively, we know that not all cases are equal, but the present evaluative approaches do not take case difficulty into account. Objective: A more case-based, comprehensive approach is warranted to assess supervised ML outcomes and forms the rationale for this study. This study aims to demonstrate how the item response theory (IRT) can be used to stratify the data based on how difficult each case is to classify, independent of the outcome measure of interest (eg, accuracy). This stratification allows the evaluation of ML classifiers to take the form of a distribution rather than a single scalar value. Methods: Two large, public intensive care unit data sets, Medical Information Mart for Intensive Care III and electronic intensive care unit, were used to showcase this method in predicting mortality. For each data set, a balanced sample (n=8078 and n=21,940, respectively) and an imbalanced sample (n=12,117 and n=32,910, respectively) were drawn. A 2-parameter logistic model was used to provide scores for each case. Several ML algorithms were used in the demonstration to classify cases based on their health-related features: logistic regression, linear discriminant analysis, K-nearest neighbors, decision tree, naive Bayes, and a neural network. Generalized linear mixed model analyses were used to assess the effects of case difficulty strata, ML algorithm, and the interaction between them in predicting accuracy. 
Results: The results showed significant effects (P<.001) for case difficulty strata, ML algorithm, and their interaction in predicting accuracy and illustrated that all classifiers performed better with easier-to-classify cases and that overall the neural network performed best. Significant interactions suggest that cases that fall in the most arduous strata should be handled by logistic regression, linear discriminant analysis, decision tree, or neural network but not by naive Bayes or K-nearest neighbors. Conventional metrics for ML classification have been reported for methodological comparison. Conclusions: This demonstration shows that using the IRT is a viable method for understanding the data that are provided to ML algorithms, independent of outcome measures, and highlights how well classifiers differentiate cases of varying difficulty. This method explains which features are indicative of healthy states and why. It enables end users to tailor the classifier that is appropriate to the difficulty level of the patient for personalized medicine.
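
As a rough illustration of the 2-parameter logistic (2PL) model named in the Methods, the sketch below evaluates the standard 2PL response function, in which the probability of a correct response depends on an ability value, a discrimination parameter, and a difficulty parameter. The function name and all numeric values are hypothetical and are not taken from the study; how the authors mapped cases and classifiers onto IRT items and respondents is not stated in this abstract.

```python
import math


def two_pl_probability(theta: float, discrimination: float, difficulty: float) -> float:
    """Standard 2PL response function: P = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical parameter values, for illustration only (not from the study).
easy = two_pl_probability(theta=0.0, discrimination=1.5, difficulty=-1.0)
hard = two_pl_probability(theta=0.0, discrimination=1.5, difficulty=1.0)
print(f"low difficulty: {easy:.3f}, high difficulty: {hard:.3f}")
```

With the same ability and discrimination, a higher difficulty value yields a lower probability of a correct response, which is the property that makes stratifying by difficulty possible.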

  • MIRROR being used on a smartphone. Source: Placeit/The Authors; Copyright: Placeit/The Authors; License: Licensed by JMIR.

    Mobile Insight in Risk, Resilience, and Online Referral (MIRROR): Psychometric Evaluation of an Online Self-Help Test


    Background: Most people who experience a potentially traumatic event (PTE) recover on their own. A small group of individuals develops psychological complaints, but this is often not detected in time or guidance to care is suboptimal. To identify these individuals and encourage them to seek help, a web-based self-help test called Mobile Insight in Risk, Resilience, and Online Referral (MIRROR) was developed. MIRROR takes an innovative approach since it integrates both negative and positive outcomes of PTEs and time since the event and provides direct feedback to the user. Objective: The goal of this study was to assess MIRROR’s use, examine its psychometric properties (factor structure, internal consistency, and convergent and divergent validity), and evaluate how well it classifies respondents into different outcome categories compared with reference measures. Methods: MIRROR was embedded in the website of Victim Support Netherlands so visitors could use it. We compared MIRROR’s outcomes to reference measures of PTSD symptoms (PTSD Checklist for DSM-5), depression, anxiety, stress (Depression Anxiety Stress Scale–21), psychological resilience (Resilience Evaluation Scale), and positive mental health (Mental Health Continuum Short Form). Results: In 6 months, 1112 respondents completed MIRROR, of whom 663 also completed the reference measures. Results showed good internal consistency (interitem correlations range .24 to .55, corrected item-total correlations range .30 to .54, and Cronbach alpha coefficient range .62 to .68), and convergent and divergent validity (Pearson correlations range –.259 to .665). Exploratory and confirmatory factor analyses (EFA+CFA) yielded a 2-factor model with good model fit (CFA model fit indices: χ²(19)=107.8, P<.001, CFI=.965, TLI=.948, RMSEA=.065), conceptual meaning, and parsimony. MIRROR correctly classified respondents into different outcome categories compared with the reference measures. Conclusions: MIRROR is a valid and reliable self-help test to identify negative (PTSD complaints) and positive outcomes (psychosocial functioning and resilience) of PTEs. MIRROR is an easily accessible online tool that can help people who have experienced a PTE to identify psychological complaints in a timely manner and find appropriate support, a tool that might be highly needed in times like the coronavirus pandemic.

Latest Submissions Open for Peer-Review:

  • Social Stigma and Mental Health Among Overseas Chinese During the COVID-19 Pandemic

    Date Submitted: Sep 25, 2020

    Open Peer Review Period: Sep 24, 2020 - Nov 19, 2020

    Background: The COVID-19 pandemic has led to stigma and discrimination among various groups of people in different populations. However, there are no data quantifying the pattern of COVID-19 social stigma and its impacts on mental health. Objective: To assess COVID-19 social stigma and mental health, and to examine their association among overseas Chinese during the COVID-19 pandemic. Methods: A cross-sectional study was conducted among 519 overseas Chinese from 23 countries who were sampled using the snowball sampling method and invited to participate in an online survey from April 21 to May 7, 2020. Depression was assessed by the Chinese version of the World Health Organization Five Well-Being Index (WHO-5), and anxiety was assessed by the Chinese version of the Generalized Anxiety Disorder Scale (GAD-7). COVID-19 social stigma was assessed by a validated 6-item COVID-19 social stigma scale. Covariates included gender, age (10-year categories), educational level, marital status, occupation, length of stay in the immigrant country, perceived severity of COVID-19 (PSC), and perceived effectiveness of prevention and control measures (PEPC) in the city where they currently lived. Results: A total of 519 participants from 23 countries were involved in the current study. The prevalence of depression was 37.2% (95% CI: 33.0%–41.5%), and that of anxiety was 15.8% (95% CI: 12.8%–19.2%). During the COVID-19 pandemic, 24.2% and 26.0% of participants reported having been refused services and work or study, respectively; more than 85% reported having read “Chinese virus” or “Wuhan virus” in the media, and 54.6% reported having heard these phrases during interpersonal communication. After controlling for covariates, perceived severity, and perceived effectiveness of control, compared to those with the lowest-quartile social stigma, the odds ratios (ORs) of depression among those with the third- and highest-quartile social stigma were 2.41 (95% CI: 1.25–4.65) and 3.33 (95% CI: 1.71–6.48), respectively. The OR of anxiety among those with the highest-quartile social stigma was 5.40 (95% CI: 1.95–14.95). Conclusions: Overseas Chinese face some increased social stigma, including discrimination and labeling, which is associated with a high prevalence of depression and anxiety, during the COVID-19 pandemic.

  • Emergency response to the COVID-19 pandemic using digital health technologies: practical experience of a tertiary hospital in China

    Date Submitted: Sep 22, 2020

    Open Peer Review Period: Sep 22, 2020 - Nov 17, 2020

    Background: The outbreak of the novel coronavirus disease (COVID-19) has caused a continuing global pandemic. Hospitals are integral in the control and prevention of COVID-19 but are met with numerous challenges in the midst of the epidemic. Objective: The objective of our study was to introduce the practical experience of design and implementation, as well as the preliminary results, of an online COVID-19 service platform from a tertiary hospital in China. Methods: The online COVID-19 service platform was deployed within the healthcare system of the Guangdong Second Provincial General Hospital-Internet Hospital, a program function which provides online medical services for both public individuals and lay-healthcare workers. The focal functions of this system include COVID-19 automated screening, related symptom monitoring, online consultation, and psychological support; it also serves as a COVID-19 knowledge hub. The design and process of each functionality were introduced. The platform service usage data were collected and presented for three periods: the pre-epidemic period (December 22, 2019 to January 22, 2020), the outbreak period (January 23, 2020 to March 31, 2020), and the post-epidemic period (April 1, 2020 to June 30, 2020). Results: By the end of June 2020, the COVID-19 automated screening and symptom monitoring systems had been used by 96,642 people for 161,884 and 7,795,194 person-times, respectively. The general online consultation service volume scaled up from 930 visits per month in the pre-epidemic period to over 8406 visits during the outbreak period, and dropped to 2218 visits in the post-epidemic period. The psychological counseling program served 636 clients during the epidemic period. For people who used the COVID-19 automated screening service, overall, 160,916 (99%) of the users were classified under the no-risk category. Fewer than 464 (0.3%) of the people were categorized under the medium- to high-risk class, and 12 people (0.01%) were recommended for COVID-19 treatment. Among the 96,642 individuals who used the COVID-19 related symptom monitoring service, 6,696 (6.9%) were symptomatic at some point during the monitoring period. Fever was the most frequently reported symptom, with 40% of the people having had this symptom. Cough (25%) and sore throat (24%) were also relatively frequently reported among the symptomatic clients. Conclusions: The online COVID-19 service platform from a tertiary hospital in China serves as a model for using digital health technologies to respond to the COVID-19 pandemic. The digital solutions of COVID-19 automated screening, daily symptom monitoring, online care service, and knowledge propagation have plausible acceptability and feasibility for complementing offline hospital services and facilitating disease control and prevention.

  • Experience and attitude of elementary school students and their parents towards online learning in China during the COVID-19 outbreak: Questionnaire Study

    Date Submitted: Sep 22, 2020

    Open Peer Review Period: Sep 22, 2020 - Nov 17, 2020

    Background: Due to the wide spread of COVID-19, the emergency homeschooling plan has been rigorously implemented in China. Objective: This study aimed to investigate the experience and attitude of elementary school students and their parents towards online learning in China during the COVID-19 outbreak. Methods: A 16-item questionnaire was distributed to 867 elementary students and their parents at 10 days, and to 141 elementary students and their parents at 30 days, after the first online learning course. The questionnaire comprised questions regarding the completeness of course and homework, effectiveness, reliability, and abundance of courses, the enthusiasm to take part in online learning, and the satisfaction with online learning. Sociodemographic data, such as students’ grades and equipment for online learning, were recorded. Results: In terms of equipment, lower-grade pupils were more likely to use a TV for their online learning. Most of the students were enthusiastic about taking online learning courses. Most of the students couldn’t do well in the online learning class and homework after class. The majority of elementary school parents thought that the reliability, effectiveness, and abundance of online courses were perfect. In terms of satisfaction, most parents were satisfied with online learning courses, and the score was above 6 points. Consistent with parents, most students were satisfied with online learning courses, and the score was also above 6 points. For future study, most parents or students hoped to return to face-to-face learning. Compared with the first stage, the proportion of students completing courses and homework after class in the second stage decreased significantly. In terms of the validity and reliability of the course, compared with the first stage, the evaluation of parents on online courses in the second stage was lower. Parents’ and students’ satisfaction with online courses decreased in the second stage, but there was no statistically significant difference between the two stages. Conclusions: Online learning can prevent the spread of infectious diseases while allowing elementary school students to gain knowledge during the COVID-19 outbreak. Most enrolled elementary school students were fully enthusiastic about participating in online learning. Both elementary school students and their parents were highly satisfied with online learning. In the initial phase of online learning, students were able to complete all online lessons and homework assignments after school very well. However, as time went on, the percentage of students who completed the courses and homework on time decreased. Compared with the first stage, the satisfaction of students and their parents with online learning decreased in the second stage. Even so, online learning remained an excellent form of education during the epidemic outbreak. To achieve better teaching results, some corrections need to be made to the lessons, such as making them more interactive.

  • Towards a universal definition of disease activity score thresholds: The AS135 score

    Date Submitted: Sep 22, 2020

    Open Peer Review Period: Sep 22, 2020 - Nov 17, 2020

    Background: Many study groups have developed scores reflecting disease activity. The result of this fragmentation is a multitude of disease activity scores, even for a single disease. We recently suggested a standardization for the cut-offs. Our standardized system produces a similarly wide range of values and facilitates the task of interpreting activity scores for various diseases. However, the formulae used in this article were not perfect, as they were limited to a linear transformation, and without a cell-phone application, the possibility of use by clinicians was low. Objective: To identify and standardize disease activity scores in rheumatology. Methods: We conducted a literature review on disease activity criteria using both a manual approach and in-house computer software (BIBOT) that applies natural language processing (a machine learning artificial intelligence technique) to automatically identify and interpret important words in abstracts published in English between January 1, 1975 and December 31, 2018. Within all extracted disease activity scores, we selected those with cut-off values divided into four classes (remission, low, moderate and high disease activity). We used a linear interpolation to map all these disease activity scores to our new score, the AS135, and developed a smartphone application to perform the conversion automatically. Results: A total of 108 activity criteria from various fields (rheumatology, dermatology, gastroenterology, psychiatry, neurology and pneumology) were identified, but it is in rheumatology that we found the most separation into four classes. We built the AS135 score modification for each selected score using a linear interpolation of the existing criteria. It was defined on the interval [0,10], and values 1, 3 and 5 were used as thresholds. These arbitrary thresholds were then associated with the thresholds of the existing criteria, and an interpolation can be calculated, allowing the conversion of the existing criteria into the AS135 criterion. Finally, we created a mobile application that allows each user to obtain both the original value of the activity criteria and the new AS135 value. The use of a linear model to approximate the distribution of each score could be a limitation, but selected scores have been constructed to be interpreted as a linear scale, which makes the approximation performed by AS135 very acceptable. Conclusions: We developed an application for clinicians that enables the use of a single disease activity score for different inflammatory rheumatic diseases using an intuitive scale, the AS135 score. Clinical Trial: Not applicable
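
The conversion described above can be sketched as a piecewise-linear interpolation between paired anchor points. This is only a minimal sketch under stated assumptions: the abstract gives the AS135 interval [0,10] and the thresholds 1, 3 and 5, but it does not give the authors' formula, so the pairing of the original scale's minimum and maximum with 0 and 10, the function name `to_as135`, and the example score (range 0 to 100 with cut-offs 20, 40 and 70) are all hypothetical. Scores on which higher values mean lower disease activity are not handled.

```python
from bisect import bisect_right

# AS135 anchors from the abstract: interval [0, 10] with thresholds 1, 3 and 5.
AS135_ANCHORS = [0.0, 1.0, 3.0, 5.0, 10.0]


def to_as135(value: float, source_anchors: list[float]) -> float:
    """Map a score onto the AS135 scale by piecewise-linear interpolation.

    source_anchors lists, in increasing order, the original score's minimum,
    its three cut-offs (remission/low, low/moderate, moderate/high) and its
    maximum. Pairing minimum and maximum with 0 and 10 is an assumption.
    """
    if len(source_anchors) != len(AS135_ANCHORS):
        raise ValueError("expected five anchor values")
    if any(a >= b for a, b in zip(source_anchors, source_anchors[1:])):
        raise ValueError("anchors must be strictly increasing")
    # Clamp the input to the range of the original scale.
    value = min(max(value, source_anchors[0]), source_anchors[-1])
    # Locate the segment containing the value (last segment at the maximum).
    i = min(bisect_right(source_anchors, value) - 1, len(source_anchors) - 2)
    x0, x1 = source_anchors[i], source_anchors[i + 1]
    y0, y1 = AS135_ANCHORS[i], AS135_ANCHORS[i + 1]
    return y0 + (value - x0) * (y1 - y0) / (x1 - x0)


# Hypothetical score, for illustration only: range 0-100, cut-offs 20, 40, 70.
HYPOTHETICAL_ANCHORS = [0.0, 20.0, 40.0, 70.0, 100.0]
print(to_as135(55.0, HYPOTHETICAL_ANCHORS))
```

Under these assumptions each original cut-off lands exactly on 1, 3 or 5, and values between two cut-offs are spread linearly across the matching AS135 segment.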

  • Utilization of Electronic Medical Records Data for Medical Research in a Hospital in China: A Cross-Sectional Study

    Date Submitted: Sep 17, 2020

    Open Peer Review Period: Sep 17, 2020 - Nov 12, 2020

    Background: With the proliferation of electronic medical records systems (EMRs), there is an increasing interest in utilizing EMRs data for medical research, yet there is no quantitative research on EMRs data utilization for medical research purposes. Objective: To understand the current status of clinical data utilization in clinical research activities, including trends in recent years and differences between populations, to identify the present problems in the use of EMRs data for research, and to provide a reference for promoting the utilization of EMRs data in scientific research. Methods: For this descriptive, cross-sectional study, the utilization of EMRs data by staff at Xuanwu Hospital in Beijing, China between 2016 and 2019 was analyzed. The utilization of EMRs data was described as the number of requests, the proportion of requesters, and the frequency of requests per capita. Comparisons by year, professional title, and age were conducted using a two-sided chi-square test. Results: From 2016 to 2019, EMRs data utilization was poor, as the proportion of requesters was 5.8% and the frequency was 0.1 times per person per year. The frequency per capita gradually slowed, and more older, senior-level staff used EMRs data compared to younger staff. Conclusions: The value of using EMRs data for research purposes does not get enough attention among researchers in Chinese hospitals. Ensuring equal availability of EMRs data and highlighting the benefits of such systems can help promote their use in research settings. Future research should focus on mechanisms that encourage data utilization, ensure fair data availability, and promote data sharing.

  • Implementing remote collaboration in a virtual patient platform – enabling students and physicians to learn collaborative clinical reasoning

    Date Submitted: Sep 14, 2020

    Open Peer Review Period: Sep 14, 2020 - Nov 9, 2020

    Background: Learning with virtual patients is highly popular for fostering clinical reasoning in medical education. However, little learning with virtual patients is done collaboratively, despite the potential learning benefits of collaborative vs. individual learning. Objective: In this article, we describe the rationale behind the implementation of student collaboration in the CASUS virtual patient platform. Methods: The SimpleWebRTC library of andYet was used to implement the collaborative tool. It provided a basis for the conferencing platform and could be adapted to include features such as video communication and screensharing. An additional text chat was created based on the message protocol of the SimpleWebRTC library. We implemented a user interface for educators to set up and configure the collaboration. Educators can configure video, audio, and text-based chat communication, which are known to promote effective learning. Results: We tested the tool in a sample of 137 students working on virtual patients. The study results indicate that students successfully diagnosed 53% (SD=26%) of the patients when working alone and 71% (SD=20%) when collaborating using the tool (P<.05, η²=.12). A usability questionnaire for the study sample shows a usability score of 82.16 (SD=1.31), a B+ grade. Conclusions: The approach provides a technical framework for collaboration that can be used with the CASUS virtual patient system. Additionally, the application programming interface is generic, so that the setup can also be used with other learning management systems. The collaborative tool helps students diagnose virtual patients and results in good overall usability of CASUS. Using learning analytics, we are able to track students’ progress in content knowledge and collaborative knowledge and guide them through a virtual patient curriculum designed to teach both. More broadly, the collaborative tool provides an array of new possibilities for researchers and educators alike to design courses, collaborative homework assignments, and research questions for collaborative learning.