This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
Hemodialysis (HD) therapy is an indispensable tool used in critical care management. Patients undergoing HD are at risk for intradialytic adverse events, ranging from muscle cramps to cardiac arrest. To date, no effective HD device–integrated algorithm exists to help medical staff respond to these adverse events a step earlier during HD.
We aimed to develop machine learning algorithms to predict intradialytic adverse events in an unbiased manner.
Three-month dialysis and physiological time-series data were collected from all patients who underwent maintenance HD therapy at a tertiary care referral center. Dialysis data were collected automatically by HD devices, and physiological data were recorded by medical staff. Intradialytic adverse events were documented by medical staff according to patient complaints. Features extracted from the time series data sets by linear and differential analyses were used for machine learning to predict adverse events during HD.
Time series dialysis data were collected during the 4-hour HD session in 108 patients who underwent maintenance HD therapy. There were a total of 4221 HD sessions, 406 of which involved at least one intradialytic adverse event. Models were built by classification algorithms and evaluated by four-fold cross-validation. The developed algorithm predicted overall intradialytic adverse events, with an area under the curve (AUC) of 0.83, sensitivity of 0.53, and specificity of 0.96. The algorithm also predicted muscle cramps, with an AUC of 0.85, and blood pressure elevation, with an AUC of 0.93. In addition, the model built based on ultrafiltration-unrelated features predicted all types of adverse events, with an AUC of 0.81, indicating that ultrafiltration-unrelated factors also contribute to the onset of adverse events.
Our results demonstrated that algorithms combining linear and differential analyses with two-class classification machine learning can predict intradialytic adverse events in quasi-real time with high AUCs. Such a methodology implemented with local cloud computation and real-time optimization by personalized HD data could warn clinicians to take timely actions in advance.
Hemodialysis (HD) therapy has a substantial role in critical care management [
The Crit-Line (Fresenius Medical Care) monitor is a device developed to assist with fluid removal during ultrafiltration by noninvasively monitoring real-time hematocrit, oxygen saturation, and intradialytic volume status, using an optical transmission method [
Artificial intelligence has been applied to HD patients to assist clinical practice, including prediction of urea clearance [
This was a retrospective observational study in a single institution. We reviewed the records of all patients who underwent maintenance HD therapy at Changhua Christian Hospital, a tertiary-care referral center in central Taiwan, between August 2017 and October 2017. During this period, 129 patients were eligible for enrollment evaluation, and 108 patients completed the 3-month study. HD sessions were excluded for the following three reasons: (1) session interruption due to dialyzer exchange, (2) more than one interruption per session due to patient urination or defecation, and (3) inability of patients to freely express their discomfort during the session. Eventually, a total of 4221 HD sessions from 108 patients were used to build the model. Each patient received either 39 or 40 HD sessions during the 3-month study period.
The Institutional Review Board of our institution approved all protocols in April 2017 before the study began, and the protocols conformed to the ethical guidelines of the Helsinki declaration. The need for informed consent was waived because of the retrospective nature of the study.
Demographic information from medical records, including age, gender, and years under dialysis treatment, was included for model building. Dialysis and physiological data of the enrolled patients during the 4-hour HD session were included in the study. Physiological data were measured and recorded by medical staff approximately every 30 to 60 minutes. Dialysis data were collected automatically from the dialysis machine. Intradialytic adverse events were documented by medical staff according to physiological measurements or patient complaints, as shown in
For each HD session i (i=1-4221), the data set HDi consisted of records {Yj,k, Tk}, where j (range 1-9) is the index for the dialysis and physiological measurements, and k is the index of the time at which a measurement is taken. Yj,k is the value of the measurement j at time Tk. HDi also included additional time-invariant patient-specific information Yj (j=10-13), including age, gender, years under dialysis treatment, and predialytic weight (
List of intradialytic adverse events.
Adverse event | Episodes, n |
Muscle cramps | 138 |
Blood pressure elevation | 108 |
Low blood pressure | 64 |
Miscellaneous | 45 |
Headache | 28 |
Lightheadedness | 26 |
Chest tightness | 23 |
Vascular access thrombosis | 23 |
Cold sweating | 22 |
Nausea/vomiting | 12 |
Fever | 10 |
Tachycardia | 10 |
Dyspnea | 8 |
Hoarseness | 8 |
Chills | 5 |
Leg pain | 5 |
Low back pain | 5 |
Shoulder pain | 5 |
Altered mental status | 4 |
Chest discomfort | 3 |
Numb hands | 3 |
Tinnitus | 3 |
Vascular access occlusion | 3 |
Abdominal pain | 2 |
Hypersomnia | 1 |
Palpitation | 1 |
Pruritus | 1 |
To avoid artifacts at the start of the data caused by differences in how dialysis was set up and started in each HD session, the first data point Yj,1 of each HD session was excluded if the blood flow rate varied between T1 and T2. We also excluded any data point Yj,k recorded when the blood flow rate was equal to or below zero because of a dialysis interruption (dialyzer exchange or patient urination/defecation). An entire HD session was excluded from the analysis if the session was interrupted more than once.
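The exclusion rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the record layout (time, blood flow rate, other readouts) and the function name `clean_session` are hypothetical, and counting an interruption as one continuous run of records with non-positive blood flow is an assumption, since the text does not state how interruptions were detected.

```python
from typing import List, Optional, Tuple

# Hypothetical record layout: (time in minutes, blood flow rate, other readouts).
Record = Tuple[float, float, dict]


def clean_session(records: List[Record]) -> Optional[List[Record]]:
    """Apply the three exclusion rules; return None if the session is dropped."""
    if len(records) < 2:
        return None

    # Session-level rule: drop the session if it was interrupted more than once.
    # Assumption: one interruption = one continuous run of blood flow <= 0.
    interruptions = 0
    in_gap = False
    for _, flow, _ in records:
        if flow <= 0 and not in_gap:
            interruptions += 1
        in_gap = flow <= 0
    if interruptions > 1:
        return None

    # Point-level rule: drop the first point if blood flow changed between T1 and T2.
    start = 1 if records[0][1] != records[1][1] else 0

    # Point-level rule: drop points recorded while blood flow was zero or negative.
    return [r for r in records[start:] if r[1] > 0]
```

Applying the session-level rule first avoids spending effort on point-level cleaning for sessions that are discarded anyway.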
In our main analysis, the whole data set {Yj,k, Tk} of an HD session was included for feature extraction if no adverse event was registered for the session. On the other hand, for HD sessions with adverse events, only data preceding the first adverse event were included for feature extraction, meaning the length of HD was less than 4 hours. Because the time interval between two adjacent records and the length of HD sessions vary, regression analysis is challenging, and we needed to include the temporal features of the measured variables in the analyses for classification. To this end, we derived the mean, standard deviation of the mean, and coefficient of variation, as well as the slope and R-squared of linear regression from the dialysis and physiological measurements {Yj,k, Tk}. We also derived the maximum, minimum, and mean of the change rate (the first-order derivative), as well as the second-order derivative of venous pressure and transmembranous pressure as features for analysis. A total of 84 features {Xh} (h=1-84), including those from the raw measurements {Yj,k, Tk} and those derived from the temporal aspect of the data as described above, were extracted for each HD session (
As mentioned, the dialysis data set {Yj,k, Tk} is recorded whenever the value of venous pressure or transmembranous pressure changes. Therefore, the value of any measurement at a time Tp between two consecutive recorded time stamps, Tk-1 and Tk, can be taken as the most recently recorded value, that is, {Yj,k, Tp} = {Yj,k, Tk-1}. Thus, feature extraction from the data in an HD session can be terminated at an arbitrary time (Tp).
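The feature extraction and the cutoff at an arbitrary time Tp can be illustrated with a short Python sketch for a single measurement series. It is not the authors' MATLAB code. The function names are hypothetical, the use of the sample standard deviation is an assumption, and appending the carried-forward value at Tp as an extra data point is one possible reading of the rule above. The second-order derivative is omitted for brevity; it could be obtained by differencing the change rates once more.

```python
from bisect import bisect_right
from statistics import mean, stdev


def truncate_at(times, values, t_p):
    """Keep records up to t_p and carry the last recorded value forward to t_p."""
    n = bisect_right(times, t_p)  # number of records with time <= t_p
    if n == 0:
        raise ValueError("t_p precedes the first record")
    t, y = list(times[:n]), list(values[:n])
    if t[-1] < t_p:
        t.append(t_p)
        y.append(y[-1])  # value at t_p equals the most recent recorded value
    return t, y


def linear_fit(times, values):
    """Ordinary least-squares slope and R-squared of values against times."""
    tm, ym = mean(times), mean(values)
    sxx = sum((t - tm) ** 2 for t in times)
    sxy = sum((t - tm) * (y - ym) for t, y in zip(times, values))
    syy = sum((y - ym) ** 2 for y in values)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r2


def extract_features(times, values):
    """Summary and temporal features of one series (needs >= 2 distinct times)."""
    m, sd = mean(values), stdev(values)
    # First-order change rate between adjacent records (unevenly spaced in time).
    rates = [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
             for i in range(len(values) - 1)]
    slope, r2 = linear_fit(times, values)
    return {
        "mean": m, "sd": sd, "cv": sd / m if m else float("nan"),
        "slope": slope, "r2": r2,
        "rate_max": max(rates), "rate_min": min(rates), "rate_mean": mean(rates),
    }
```

For example, `extract_features(*truncate_at(times, values, 25))` would summarize only the part of a session observed up to minute 25, which is how features can be computed while a session is still in progress.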
For outcome labeling, HD sessions with one or more adverse events were labeled as 1, and HD sessions with no adverse event were labeled as 0. As a negative control set, we also randomly relabeled the 4221 HD sessions regardless of their true outcome while keeping the same 0 to 1 ratio as in the experimental set. A two-class classification model was built and evaluated by four-fold cross-validation using Azure (Microsoft Inc). At least three repeats were performed by introducing different random numbers for each model building.
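The negative control and the four-fold split described above can be sketched as follows. This is an illustration only: the study used Azure for model building, and whether its folds were stratified by outcome is not stated, so a plain random partition is assumed here. The function names and seeds are hypothetical; the counts of 406 positive and 3815 negative sessions are those reported for the study.

```python
import random


def shuffled_labels(labels, seed):
    """Negative control: permute the labels so the 0-to-1 ratio is preserved."""
    rng = random.Random(seed)
    out = list(labels)
    rng.shuffle(out)
    return out


def k_fold_indices(n, seed, k=4):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


labels = [1] * 406 + [0] * 3815       # sessions with and without adverse events
control = shuffled_labels(labels, seed=0)
folds = k_fold_indices(len(labels), seed=0)
```

Permuting the labels, rather than drawing new ones at random, guarantees that the control set has exactly the same class balance as the experimental set, so any difference in AUC reflects the features rather than the label ratio. Changing `seed` corresponds to the repeated runs with different random numbers.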
To pinpoint which features are more important than others in predicting HD adverse events, we also selected key features for model building and compared the results with those obtained using all 84 features. The selection of key features was performed using MATLAB (MATrixLABoratory, MathWorks Inc) (source code available in the format of .m;
The key feature selection process started with selecting the top feature according to the scores obtained from models built using each of the 84 features one at a time. Next, the top two-feature combinations were selected from the two-feature combination pool, which was established by combining the top feature from the last step with each of the remaining 83 features. The two-feature combinations that resulted in scores higher than that of the top feature from the last step were kept for the next step. Likewise, the top three-feature combinations could be selected from the pool established by combining the top two-feature combinations with each of the remaining 82 features when the three-feature combinations scored higher than the two-feature combinations. We repeated this procedure until the top 20-feature combinations were selected. Features that most frequently appeared in these 20-feature combinations were defined as key features.
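The stepwise procedure above resembles a greedy forward selection with several surviving combinations per step. The Python sketch below illustrates that idea under stated assumptions and is not a port of the authors' MATLAB code. The `score` callable stands in for the model score of a feature combination, and the `beam` parameter (how many top combinations survive each step) is hypothetical, since the text does not say how many were kept.

```python
from collections import Counter
from typing import Callable, FrozenSet, List


def forward_select(n_features: int,
                   score: Callable[[FrozenSet[int]], float],
                   target_size: int,
                   beam: int = 5) -> List[FrozenSet[int]]:
    """Grow feature combinations one feature at a time, keeping the best few."""
    combos = [frozenset()]
    best_prev = float("-inf")
    for _ in range(target_size):
        # Pool: every surviving combination extended by one unused feature.
        candidates = {c | {f} for c in combos
                      for f in range(n_features) if f not in c}
        scored = sorted(((score(c), sorted(c)) for c in candidates), reverse=True)
        # Keep only combinations that beat the best score of the previous step.
        kept = [frozenset(c) for s, c in scored if s > best_prev][:beam]
        if not kept:
            break  # no larger combination improves on the previous step
        combos, best_prev = kept, scored[0][0]
    return combos


def key_features(combos: List[FrozenSet[int]], top: int) -> List[int]:
    """Features that appear most often across the final combinations."""
    counts = Counter(f for c in combos for f in c)
    return [f for f, _ in counts.most_common(top)]
```

In the study's setting, `n_features` would be 84 and `target_size` 20, and `score` would train and evaluate a classifier on the given feature subset, which makes each step far more expensive than in this toy form.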
This study was approved by the Institutional Review Board of National Yang-Ming University (N_105_0132) and the Institutional Review Board of Changhua Christian Hospital (CCH IRB No. 161005).
As of November 2017, we enrolled 108 patients.
The list of intradialytic adverse events and the number of occurrences are shown in
To increase the outcome 1 to 0 ratios, wherein the session with an adverse event is labeled as 1 and the session without an adverse event is labeled as 0, we categorized the 27 adverse events listed in
A two-class averaged perceptron was used for model building with a learning rate of 20 and maximal iterations of 20. For the 84-feature model, the mean area under the curve (AUC) was 0.83 (SD 0.03), with an F1 score of 0.53, sensitivity of 0.53, and specificity of 0.96 (
Prediction of all intradialytic adverse events except blood pressure elevation. (A) Machine learning performance represented by receiver operating characteristic (ROC) curves from 84 features (curve a, red), the top 21 features (curve c, green), all features but excluding ultrafiltration-related features (curve d, blue), and the negative control (curve b, black). Each ROC curve shown here is the average of 12 simulated ROC curves. (B) Machine learning performance represented by ROC curves obtained from ultrafiltration-related features (curves e, f, g, h, i, j, and k) and from 84 features (curve a, black). (C) Area under the curve (AUC) and F1 scores from different feature combinations for predicting all intradialytic adverse events. UF: ultrafiltration.
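For readers unfamiliar with the classifier and the metric used in these results, the following Python sketch shows a textbook averaged perceptron and a pairwise AUC computation. It is a generic illustration and makes no claim to match the Azure module used in the study. The defaults (`epochs=20`, `lr=1.0`, `seed=0`) are illustrative choices, and the function names are hypothetical.

```python
import random


def train_averaged_perceptron(X, y, epochs=20, lr=1.0, seed=0):
    """Return averaged weights and bias; y holds labels 0 or 1."""
    rng = random.Random(seed)
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    w_sum, b_sum, steps = [0.0] * d, 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            target = 1 if y[i] == 1 else -1
            margin = target * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin <= 0:  # misclassified: shift the boundary toward this point
                w = [wj + lr * target * xj for wj, xj in zip(w, X[i])]
                b += lr * target
            # Accumulate the current weights after every example; the average
            # is less sensitive to late updates than the final weights alone.
            w_sum = [s + wj for s, wj in zip(w_sum, w)]
            b_sum += b
            steps += 1
    return [s / steps for s in w_sum], b_sum / steps


def decision(w, b, x):
    """Signed score of one feature vector; larger means more likely positive."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because the AUC depends only on how sessions are ranked, it is unaffected by the probability threshold, whereas sensitivity, specificity, and the F1 score all change with the threshold chosen.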
Ultrafiltration rate and ultrafiltration volume are important parameters for HD. However, our models indicated that employing a single feature, such as the maximal value of ultrafiltration volume (feature 78) or the mean value of ultrafiltration rate changes (feature 77), cannot predict adverse events properly (
Next, the 21 features that most frequently appeared in the 20-feature combinations were selected for the evaluation. The two-class averaged perceptron model based on these top 21 performance features but skipping ultrafiltration-related features showed a mean AUC of 0.82 (SD 0.02) and F1 score of 0.45 (
The 21 features were age, maximum transmembranous pressure, minimum systolic blood pressure (SBP), minimum diastolic blood pressure (DBP), minimum pulse pressure, minimum blood flow rate, mean SBP, mean venous pressure, mean transmembranous pressure, slope of linear regression of SBP, slope of linear regression of DBP, slope of linear regression of pulse pressure, slope of linear regression of pulse rate, slope of linear regression of transmembranous pressure, standard deviation of the mean of blood flow rate, R-squared of linear regression of pulse pressure, and related parameters to the second-order derivative of venous pressure (features 2, 5, 6, 8, 11, 14, 17, 20, 21, 26, 29, 31, 36, 47-52, 57, and 59) (
The model, which was based on 14 ultrafiltration-related features, had a mean AUC of 0.85 (SD 0.04) and F1 score of 0.45 (
Prediction of a specific intradialytic adverse event: muscle cramps. (A) Machine learning performance is represented by receiver operating characteristic (ROC) curves from 84 features (curve a, red), all features but excluding ultrafiltration-related features (curve c, orange dot), ultrafiltration-related features (curves d, e, f, i, k), and the negative control (curve b, black). (B) Area under the curve (AUC) and F1 scores from different feature combinations for predicting muscle cramps. UF: ultrafiltration.
The model, which was based on a total of 84 features, had a mean AUC of 0.93 (SD 0.02) and F1 score of 0.41 for predicting the occurrence of hypertension (
Prediction of a specific intradialytic adverse event: blood pressure elevation. (A) Machine learning performance represented by receiver operating characteristic (ROC) curves from 84 features (curve a, red), blood pressure-related features (curve d, green), ultrafiltration-related features (curve c, blue), and the negative control (curve b, black). (B) Area under the curve (AUC) and F1 scores from different feature combinations for predicting blood pressure elevation. UF: ultrafiltration.
As shown in
To further understand the cutoff ending time point dependence of prediction accuracy, 500 HD sessions were randomly selected to compare the prediction probabilities of adverse events obtained from 84 features with cutoff ending time points of 0, 5, 10, 15, and 20 minutes before the index adverse event. As shown in
Even though none of the 84 features contained explicit time series information, the linear and differential analyses employed for feature extraction may be affected by the length of HD sessions. Therefore, we truncated the HD sessions with no adverse events (negative ones) and compared the prediction results with those from the untruncated ones. Since the average length of HD sessions with adverse events (positive ones) was 3.3 hours, negative HD sessions were truncated and randomly assigned endpoints (Tend) between 3 and 3.5 hours, whereas the endpoints of positive ones remained unchanged. The data set {Yj,k, Tk} at endpoint Tend was defined according to the same method used for {Yj,k, Tk} at arbitrary time Tp. With this truncation, the mean AUC was 0.89 (SD 0.019), F1 score was 0.55, sensitivity was 0.52, and specificity was 0.97. Alternatively, the AUC was 0.86 with an F1 score of 0.55 when the endpoints were assigned exactly at 3.3 hours. Compared to the original results obtained from the untruncated negative HD sessions with a duration of about 4 hours (AUC=0.83, F1=0.53, sensitivity=0.53, and specificity=0.96), the prediction results were better when the endpoints were set earlier. Indeed, the AUC was 0.92, with an F1 score of 0.62, sensitivity of 0.61, and specificity of 0.98, when the endpoints were randomly assigned between 2.5 and 3.5 hours for negative HD sessions.
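The endpoint assignment used in this check can be sketched as below. It is a minimal illustration that assumes endpoints are drawn uniformly within the stated window; the text does not specify the distribution. The function name, parameter names, and seed are hypothetical.

```python
import random


def assign_endpoint(has_event, event_time_h, rng, low=3.0, high=3.5):
    """Positive sessions end at the first adverse event; negative sessions
    are cut at a random time (in hours) drawn from [low, high]."""
    if has_event:
        return event_time_h
    return rng.uniform(low, high)  # assumption: uniform draw within the window


rng = random.Random(0)
negative_ends = [assign_endpoint(False, None, rng) for _ in range(1000)]
positive_end = assign_endpoint(True, 2.1, rng)
```

Matching the length of negative sessions to that of positive ones removes session duration as a shortcut the classifier could otherwise exploit, so the comparison tests whether the features carry information beyond how long a session lasted.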
Prediction performance for group 1 intradialytic adverse events using the features of different cutoff ending time points. (A) Machine learning performance represented by receiver operating characteristic (ROC) curves from 84 features extracted from [Yj, Tk]HD terminated at different cutoff ending time points as follows: one time point before an adverse event (noted as 0 minutes), and 5, 10, 15, 20, and 60 minutes before an adverse event or before the end of the hemodialysis (HD) session if no adverse event. (B) Area under the curve (AUC) and F1 scores. (C) Probability of adverse event occurrence in 500 randomly selected HD sessions. The red circle indicates the HD session with adverse events, and the predicted probabilities of adverse events are all higher than 0.8 independent of the cutoff ending time point.
Our findings indicate that algorithms combining linear and differential analyses with two-class classification machine learning predict intradialytic adverse events with high AUCs. We attempted to identify the features that contribute the most to predicting all adverse events except hypertension (group 1) from a total of 84 features extracted from [Yj, Tk]HD. Among the top 23 features, only feature 76 and feature 82 were related to ultrafiltration (the number of times that the ultrafiltration rate changes and the linear regression slope of ultrafiltration volume). After excluding these two ultrafiltration-related features, we found that the remaining 21 features were sufficient for accurate prediction with good discriminating power, with a slight reduction in the AUC from 0.83 (84 features) to 0.82 (21 features). The model built with 14 ultrafiltration-related features also had a good AUC of 0.83. Therefore, instead of including all 84 features for model building, selecting the top 21 ultrafiltration-unrelated features or integrating a total of 14 ultrafiltration-related features can reduce computing load. Our results also suggest that these two clusters of features (
In our study, muscle cramps were the most frequent adverse event during HD treatment. Muscle cramps are a common adverse event during HD therapy, with a prevalence of 28% among all HD sessions [
In general, symptomatic hypotension occurs in 20% to 30% of dialysis sessions [
As shown in
Among several two-class classification modules, including Bayes point machine, boosted decision tree, and support vector machine (SVM), the models built with the two-class averaged perceptron had the best AUC and F1 score. We also built models using deep learning (data not shown), but these did not achieve a good AUC or F1 score, possibly because of the limited size of our HD data set. As clinicians are now facing the new era of artificial intelligence [
Several questions may be answered if the size of the HD data set is expanded in future studies. First, how early can we predict adverse events? The consistency in the predicted probabilities of adverse events using features based on different cutoff ending time points could detect about one-tenth of HD sessions with adverse events (
In this study, a model of two-class classification was established to predict intradialytic adverse events in quasi-real time, with AUCs higher than 0.8. The consistency in the predicted probabilities of adverse events obtained from the features extracted in the ongoing HD process in real time could have the HD session tagged for forthcoming adverse events. Such a methodology implemented with local cloud computation could warn clinicians to take necessary actions and adjust the HD machine settings in advance.
List of hemodialysis machine readouts.
An example of physiological and dialysis data collected in one hemodialysis (HD) session.
List of 84 features.
Source code: Feature extraction.
Source code: Top performance feature selection.
Demographic characteristics of the study participants (n=108).
Simple decision tree of hemodialysis (HD) patients and HD sessions used for the study. One hundred twenty-nine patients were eligible after the enrollment evaluation, and 108 patients completed the 3-month study. There were 4221 HD sessions used for model building, and of these, 3815 did not have adverse events and 406 had one or multiple adverse events.
Top 16 features for predicting intradialytic muscle cramps.
AUC: area under the curve
DBP: diastolic blood pressure
HD: hemodialysis
SBP: systolic blood pressure
SVM: support vector machines
The authors acknowledge financial support from Advantech Foundation (YM105C041). This study was also supported for research purposes by the Ministry of Science and Technology (MOST), Taiwan (MOST 108-2923-B-010-002-MY3, MOST 109-2926-I-010-502, MOST 109-2823-8-010-003-CV, and MOST 110-2923-B-A49A-501-MY3 [OKL]; 109-2314-B-010-053-MY3, 109-2321-B-009-007, MOST 109-2811-B-010-532, MOST 110-2811-B-010-510, MOST 110-2813-C-A49A-551-B, and MOST 110-2321-B-A49-003 [CY]; MOST 108-2633-B-009-001 [CY and OKL]; and MOST 109-2314-B-182-010 [YL]), grants from Taipei Veterans General Hospital, Taipei, Taiwan (V106D25-003-MY3, VGHUST107-G5-3-3, VGHUST109-V5-1-2, and V110C-194) (CY), the “Yin Yen-Liang Foundation Development and Construction Plan” of the School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan (107F-M01-0504) (CY and OKL), and the Chang Gung Medical Research Foundation (CMRPD1K0601) (YL). Moreover, this work was in part supported for research purposes by the “Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B)” from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan (CY). The funders had no role in study design, data collection, analysis, interpretation, or writing of the manuscript.
CY and OKL contributed equally as Corresponding Authors.
None declared.