Original Paper
Abstract
Background: Postoperative acute kidney injury (AKI) is a significant risk associated with surgeries under general anesthesia, often leading to increased mortality and morbidity. Existing predictive models for postoperative AKI are usually limited to specific surgical areas or require external validation.
Objective: We aimed to build a prediction model for postoperative AKI using several machine learning methods.
Methods: We conducted a retrospective cohort analysis of noncardiac surgeries from 2009 to 2019 at seven university hospitals in South Korea. We evaluated six machine learning models: deep neural network, logistic regression, decision tree, random forest, light gradient boosting machine, and naïve Bayes for predicting postoperative AKI, defined as a significant increase in serum creatinine or the initiation of renal replacement therapy within 30 days after surgery. The performance of the models was analyzed using the area under the curve (AUC) of the receiver operating characteristic curve, accuracy, precision, sensitivity (recall), specificity, and F1-score.
Results: Among the 239,267 surgeries analyzed, 7935 cases of postoperative AKI were identified. The models, using 38 preoperative predictors, showed that deep neural network (AUC=0.832), light gradient boosting machine (AUC=0.836), and logistic regression (AUC=0.825) demonstrated superior performance in predicting AKI risk. The deep neural network model was then developed into a user-friendly website for clinical use.
Conclusions: Our study introduces a robust, high-performance AKI risk prediction system that is applicable in clinical settings using preoperative data. This model’s integration into a user-friendly website enhances its clinical utility, offering a significant step forward in personalized patient care and risk management.
doi:10.2196/62853
Keywords
Introduction
Acute kidney injury (AKI) represents a critical challenge in postoperative care, significantly affecting patient outcomes and health care systems. It is a common complication that affects up to 5% to 7.5% of all hospitalized patients, with a markedly higher prevalence of 20% in intensive care units [
]. Among all AKI cases in hospitalized patients, 40% occur in postoperative patients [ ]. This condition not only escalates morbidity but also substantially increases in-hospital mortality by approximately 3- to 9-fold [ ]. The severity of this risk is further underscored in patients who developed postoperative AKI after intraabdominal surgery, as a large-scale study reported a 15-fold higher risk of mortality in patients with AKI compared to those without AKI [ ]. Moreover, even patients whose renal function completely recovered after postoperative AKI still faced a higher risk of death compared to those without AKI [ , ], highlighting the profound and lasting consequences of this condition. These statistics underscore the need for accurate prediction and preemptive management of AKI in the postoperative setting.
There are many factors associated with postoperative AKI: age; sex; obesity; type of surgery; medications including renin-angiotensin-aldosterone system inhibitors (RASi) and nonsteroidal anti-inflammatory drugs (NSAIDs); and comorbidities such as chronic kidney disease (CKD), diabetes, hypertension, cardiovascular disease, liver disease, and chronic obstructive pulmonary disease [
- ]. These factors need to be integrated to assess the risk of postoperative AKI before surgery, and accurate risk prediction enables recognition of patients who need preoperative, intraoperative, and postoperative management to alleviate the risk. Several risk-scoring tools for postoperative AKI have been described [ - ]. However, they are limited by homogeneous study populations, the inclusion of a single center or a small number of centers, and the lack of external validation. To make a risk-scoring system generalizable, validation in a larger cohort using a multicenter database is needed [ ]. Machine learning can capture interactions between variables and search for informative feature relationships, including those within subgroups; this can reveal new variables involved in the event and is particularly useful in large datasets [ ]. Therefore, the aim of this study was to build a risk prediction model for postoperative AKI using machine learning methods from a multicenter cohort.
Methods
Study Population
Patients who underwent surgery under general anesthesia from March 1, 2009, to December 31, 2019, at seven academic hospitals of the Catholic University of Korea (Seoul St. Mary’s, Yeouido St. Mary’s, Uijeongbu St. Mary’s, Eunpyeong St. Mary’s, Bucheon St. Mary’s, St. Vincent, and Incheon St. Mary’s Hospitals) were included. Operation-related exclusion criteria were an operation duration under 1 hour or not available, cardiac surgery, operations on brain death donors, nephrectomy, and kidney transplantation. Renal function-related exclusion criteria were a history of renal replacement therapy, preoperative serum creatinine (sCr) ≥4.0 mg/dL or estimated glomerular filtration rate (eGFR) <15 mL/min per 1.73 m², an elevation of preoperative sCr of more than 0.3 mg/dL or 1.5 times within 2 weeks before surgery, and missing preoperative or postoperative sCr values (
).
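The exclusion criteria above can be read as a sequence of checks applied to each surgical case. The following is a minimal sketch of that reading, not the code used in the study; the field names (`duration_hours`, `surgery_type`, `prior_rrt`, `preop_scr`, `postop_scr`, `preop_egfr`, `scr_two_weeks_before`) and the category labels are hypothetical, and the handling of the 2-week sCr elevation is one plausible interpretation of the criterion.

```python
from typing import Optional

# Hypothetical labels for the excluded operation types.
EXCLUDED_SURGERY_TYPES = {"cardiac", "brain_death_donor", "nephrectomy", "kidney_transplant"}


def exclusion_reason(case: dict) -> Optional[str]:
    """Return the first matching exclusion reason, or None if the case is eligible."""
    duration = case.get("duration_hours")
    if duration is None or duration < 1.0:
        return "duration under 1 hour or unavailable"
    if case.get("surgery_type") in EXCLUDED_SURGERY_TYPES:
        return "excluded operation type"
    if case.get("prior_rrt", False):
        return "history of renal replacement therapy"

    pre, post = case.get("preop_scr"), case.get("postop_scr")
    if pre is None or post is None:
        return "missing preoperative or postoperative sCr"

    egfr = case.get("preop_egfr")
    if pre >= 4.0 or (egfr is not None and egfr < 15.0):
        return "preoperative sCr >= 4.0 mg/dL or eGFR < 15"

    # Assumed reading: compare the latest preoperative sCr with an earlier
    # value measured within 2 weeks before surgery.
    earlier = case.get("scr_two_weeks_before")
    if earlier is not None and (pre - earlier > 0.3 or pre > 1.5 * earlier):
        return "unstable preoperative sCr"
    return None


# Illustrative case with made-up values.
eligible = {
    "duration_hours": 2.5,
    "surgery_type": "orthopedic",
    "prior_rrt": False,
    "preop_scr": 0.9,
    "postop_scr": 1.0,
    "preop_egfr": 85.0,
    "scr_two_weeks_before": 0.8,
}
print(exclusion_reason(eligible))
print(exclusion_reason({**eligible, "surgery_type": "cardiac"}))
```

Returning the reason rather than a Boolean makes it straightforward to tabulate how many cases each criterion removes, as is typically done in a cohort flow diagram.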
Ethical Considerations
The study was approved by the institutional review board of the Catholic University of Korea, College of Medicine (XC20WIDI0080) with waiver of consent due to the retrospective study methods. This study was not registered as it is a retrospective observational study. This report has been written according to the recently updated TRIPOD+AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis+Artificial Intelligence) statement [
].
Definition of Postoperative AKI
Postoperative AKI was defined as AKI that developed within 30 days after surgery, using the Kidney Disease: Improving Global Outcomes (KDIGO) criteria [
]. Stage-1 AKI was defined as sCr 1.5 to 1.9 times baseline or an increase in sCr ≥0.3 mg/dL; stage-2 AKI was defined as sCr 2.0 to 2.9 times baseline; and stage-3 AKI was defined as sCr 3.0 times baseline or higher, sCr ≥4 mg/dL, or the initiation of renal replacement therapy (hemodialysis, peritoneal dialysis, or continuous renal replacement therapy). We did not use the urine output criteria of KDIGO, because previous studies suggested that the threshold of oliguria for postoperative AKI may differ from that of other AKIs [ , ] and because urine output data were lacking. This definition of postoperative AKI was used to label the supervised learning dataset as with or without postoperative AKI.
Data Collection and Cleansing
We collected data on demographic characteristics; underlying clinical diseases; preoperative laboratory data; preoperative medication; and surgical characteristics such as expected operation time, the day of operation (weekday or weekend), and the department of surgery. The underlying diseases of subjects were determined using the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) codes of principal and secondary diagnosis. Comorbid diseases and ICD-10 codes are shown in
. Preoperative medications included RASi (angiotensin-converting enzyme inhibitor [ACEi] or angiotensin II type 1 receptor blocker [ARB]) and NSAIDs. Preoperative eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration equation [ ]. BMI was calculated as the patient’s weight in kilograms divided by height in meters squared (kg/m²).
Anonymized data were extracted from the Catholic Medical Center (CMC) Clinical Data Warehouse, which is separately generated and managed redundantly from the electronic medical record systems of eight affiliated hospitals of the College of Medicine, the Catholic University of Korea [
] and processed using R software (version 3.6.3; R Foundation for Statistical Computing). The 38 variables included in the final analysis are shown in . In cases where laboratory tests were conducted multiple times before surgery, we selected the most recent preoperative values, taken closest to the time of surgery, to ensure the data accurately reflected the patient’s latest clinical status. Data artifacts and extreme values were clipped to the 1st and 99th percentiles, and missing values were filled using multiple imputation by chained equations (MICE) [ ]. MICE was used because it estimates missing values from their correlation with other observed variables [ ]. We excluded variables with more than 40% missing data, following common practice [ - ]. The rates of missing data for the variables are shown in . Nonbinary categorical data were one-hot encoded, a method for converting categorical data into binary variables, and numerical data were normalized using min-max scaling, which maps all numeric values to the range from 0 to 1. Min-max scaling is given by:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum observed values of the variable.
One-hot encoding, min-max scaling, and dataset splitting were accomplished using the Scikit-Learn library (version 0.24.2) [
]. These steps are required to improve the performance and training stability of machine learning models. Because AKI events were rare (3.3%), the dataset had an extreme class imbalance. Such imbalances can cause a falsely elevated accuracy and adversely affect machine learning training [ ]. To help overcome this issue, the AKI training dataset was augmented by oversampling with the synthetic minority over-sampling technique (SMOTE), which has been shown to improve classification of imbalanced classes (using imblearn library version 0.8.0) [ , ].
Parameters | Variables |
Patient parameters |
|
Surgical parameters |
|
Laboratory parameters |
|
aBP: blood pressure.
bCOPD: chronic obstructive pulmonary disease.
cACEi: angiotensin-converting enzyme inhibitor.
dARB: angiotensin II type 1 receptor blocker.
eNSAID: nonsteroidal anti-inflammatory drug.
feGFR: estimated glomerular filtration rate.
gAST: aspartate aminotransferase.
hALT: alanine aminotransferase.
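The cleansing steps described in this section (clipping extreme values to the 1st and 99th percentiles, min-max scaling, and one-hot encoding) can be illustrated with a short standard-library sketch. This is not the study's Scikit-Learn pipeline; the function names and the example values are assumptions made for illustration, and the percentile interpolation rule is one common choice among several.

```python
def clip_to_percentiles(values, low=1.0, high=99.0):
    """Set values outside the [low, high] percentile range to the boundary values."""
    ordered = sorted(values)

    def percentile(p):
        # Linear interpolation between the two closest ranks.
        pos = (len(ordered) - 1) * p / 100.0
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    lower, upper = percentile(low), percentile(high)
    return [min(max(v, lower), upper) for v in values]


def min_max_scale(values):
    """Map values linearly so that the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted category list and one binary indicator row per label."""
    categories = sorted(set(labels))
    return categories, [[1 if lab == c else 0 for c in categories] for lab in labels]


# Made-up creatinine values (mg/dL) with one implausible entry.
creatinine = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 12.0]
scaled = min_max_scale(clip_to_percentiles(creatinine))

# Made-up department codes.
categories, encoded = one_hot(["GS", "NS", "OS", "GS"])
print(scaled)
print(categories, encoded)
```

In a train/test workflow, the percentile bounds and the minimum and maximum would normally be estimated on the training set only and then reused for the test set, so that no information from the test set leaks into preprocessing.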
Machine Learning
Various machine learning methods were used to create the model, which was trained and evaluated using Python (version 3.8.5; Python Software Foundation). Machine learning methods commonly used in health care were applied [
, ]. The models applied were logistic regression, decision tree, random forest, and naïve Bayes (using Scikit-Learn library version 0.24.2) [ ]; light gradient boosting machine (GBM; using lightgbm version 3.2.1) [ ]; and deep neural network (DNN; using Keras library version 2.5.0) [ ]. The strengths and weaknesses of each model are summarized in .
Method | How it works | Advantages | Disadvantages |
DNNa [ ] | Multiple layers of interconnected nodes (neurons) with at least 3 hidden layers. Each neuron computes a weighted sum of its inputs and produces an output through an activation function. Learns by backpropagation. | | |
Logistic regression [ ] | Linear classification algorithm that relates independent variables to a binary outcome using probabilities from the logistic function. | | |
Decision tree [ ] | A series of nodes that split the data according to feature values, with the splitting continuing at each node to form a tree. | | |
Random forest [ ] | An ensemble (group) of decision trees, each trained on randomly selected features and data, with decisions made by aggregating the outputs of the individual trees. | | |
Light GBMb [ ] | An ensemble (group) of “weak models” (usually decision trees), which are sequentially added to one another to improve performance over a number of iterations. | | |
Naïve Bayes [ ] | Uses conditional probability to represent the likelihood of a class given a certain set of features, assuming that the features are independent of one another. | | |
aDNN: deep neural network.
bGBM: gradient boosting machine.
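The DNN row of the table above describes each neuron as a weighted sum of inputs passed through an activation function. The sketch below illustrates that forward computation and counts trainable parameters for the layer widths reported for the deep learning model in this section (input width 50; hidden layers of 64, 32, and 32; one output node). It is a standard-library illustration, not the Keras model used in the study. The ReLU hidden activations, the sigmoid output, the random weights, and the reading of 50 as the input dimension are all assumptions.

```python
import math
import random

# Assumed reading of the reported architecture: input dimension 50,
# three hidden layers (64, 32, 32), and a single output node.
LAYER_WIDTHS = [50, 64, 32, 32, 1]


def init_layers(widths, seed=0):
    """Create (weights, biases) for each fully connected layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(widths, widths[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate one input vector through the network and return a value in (0, 1)."""
    for i, (weights, biases) in enumerate(layers):
        # Each neuron: weighted sum of inputs plus bias.
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i == len(layers) - 1:
            x = [1.0 / (1.0 + math.exp(-s)) for s in z]  # sigmoid output (assumed)
        else:
            x = [max(0.0, s) for s in z]  # ReLU hidden activation (assumed)
    return x[0]


def count_parameters(widths):
    """Weights plus biases summed over all fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


layers = init_layers(LAYER_WIDTHS)
risk = forward(layers, [0.5] * LAYER_WIDTHS[0])  # untrained network, arbitrary input
n_params = count_parameters(LAYER_WIDTHS)
print(risk, n_params)
```

Under these assumptions the network is small, a few thousand parameters, which is consistent with training on tabular preoperative data rather than images or free text.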
For the deep learning model, we chose an architecture with an input layer of width 50 (to account for the 40 inputs and to include the one-hot encodings); 3 hidden layers with widths of 64, 32, and 32; and a single output node. The configurations of the different models are shown in
. The models were trained on a machine with an Intel Xeon Gold 6240R (8 cores) at 2.40 GHz, with 64 GB of RAM, running Windows 10 Enterprise Build 17763. To analyze the statistical performance of the models for postoperative AKI prediction, we assessed the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy, precision, sensitivity (recall), specificity, and F1-score. To determine the optimal thresholds for ROC-AUC analysis and the calculation of sensitivity and specificity, we used the Youden index (Youden J statistic). This index is defined as:
$$J = \max_{c}\,\{\text{sensitivity}(c) + \text{specificity}(c) - 1\}$$
where $c$ ranges over the candidate thresholds. Alternatively, it can be expressed as the maximum value of true positive rate minus false positive rate, which was the criterion applied in our models to identify the threshold value. This method provides a balanced trade-off between sensitivity and specificity, as described in Schisterman et al [
].
Statistical Analysis
Statistical analysis was performed using SAS software (version 9.4; SAS Institute). Continuous variables were presented as means and SDs for data with normal distribution and presented as medians and IQRs for data with nonparametric distribution. After the distribution of data between the two groups was determined, they were compared using an independent t test (2-tailed) or Wilcoxon rank sum test. Categorical data were presented as percentages, and a comparison between the two groups was performed using the chi-square test or Fisher exact test. To determine the risk factors for AKI, we used the logistic regression model. Multivariable analysis using logistic regression was performed on variables with a P value <.20 on univariable analysis [
]. The results are presented as odds ratio with 95% CI. P values <.05 were considered significant.Results
Patient Baseline Characteristics
A total of 439,072 surgery cases from seven academic hospitals of the Catholic University of Korea were included in the study (
). After the exclusion of patients according to the exclusion criteria mentioned above, a total of 239,267 cases were included in the final analysis. Among these, 7935 (3.3%) AKI events occurred. Baseline demographics of patients with and without AKI are shown in . Significant differences were observed in all baseline characteristics between the two groups. Patients with AKI were older and more often female, and had a lower BMI, a higher percentage of smokers, higher baseline systolic blood pressure, and lower baseline diastolic blood pressure. The AKI group also showed a higher percentage of all preexisting comorbidities (CKD, diabetes, hypertension, coronary artery disease, cerebrovascular disease, chronic obstructive pulmonary disease, or liver cirrhosis); more frequent usage of RASi (ACEi or ARB) and NSAIDs; and a higher percentage of patients undergoing general surgery, neurosurgery, and thoracic surgery. Laboratory results of the AKI group showed lower levels of hemoglobin, serum albumin, and eGFR, and higher baseline sCr and C-reactive protein levels. Variable selection was performed based on these clinical characteristics using logistic regression (clinical parameters are shown in and laboratory parameters are shown in ). During feature selection, the hematocrit variable was removed because it had >0.9 correlation with preoperative hemoglobin levels.
Variables | No AKIa (n=231,332) | AKI (n=7935) | P value | ||
Age (years), mean (SD) | 54.56 (5.16) | 61.36 (4.19) | <.001 | ||
Male sex, n (%) | 126,217 (54.6) | 2997 (37.8) | <.001 | ||
BMI (kg/m2), mean (SD) | 24.29 (4.06) | 24.07 (4.22) | <.001 | ||
Smoker, n (%) | 34,228 (14.8) | 1424 (18.0) | <.001 | ||
Preexisting comorbidities | |||||
Chronic kidney disease, n (%) | 298 (0.1) | 114 (1.4) | <.001 | ||
Diabetes, n (%) | 7295 (3.2) | 663 (8.4) | <.001 | ||
Hypertension, n (%) | 5802 (2.5) | 504 (6.4) | <.001 | ||
Coronary artery disease, n (%) | 3927 (1.7) | 308 (3.9) | <.001 | ||
Cerebrovascular disease, n (%) | 11,960 (5.2) | 734 (9.3) | <.001 | ||
COPDb, n (%) | 952 (0.4) | 85 (1.1) | <.001 | ||
Liver cirrhosis, n (%) | 1077 (0.5) | 151 (1.9) | <.001 | ||
Systolic BPc (mm Hg), mean (SD) | 125.84 (15.74) | 133.81 (20.7) | <.001 | ||
Diastolic BP (mm Hg), mean (SD) | 69.71 (9.99) | 67.01 (12.69) | <.001 | ||
Medication, n (%) | |||||
ACEid or ARBe | 7844 (3.4) | 768 (9.7) | <.001 | ||
NSAIDsf | 35,006 (15.1) | 1650 (20.8) | <.001 | ||
Department, n (%) | <.001 | ||||
General surgery | 61,948 (26.8) | 2409 (30.4) | |||
Neurosurgery | 31,169 (13.5) | 1407 (17.7) | |||
Orthopedics | 60,442 (26.1) | 1176 (14.8) | |||
Obstetrics and Gynecology | 22,607 (9.8) | 299 (3.8) | |||
Otorhinolaryngology | 11,551 (5.0) | 148 (1.9) | |||
Thoracic surgery | 10,774 (4.7) | 427 (5.4) | |||
Others | 32,831 (14.2) | 2069 (26.1) | |||
Preoperative laboratory results, mean (SD) | |||||
Hemoglobin (g/dL) | 13.14 (1.83) | 12.1 (2.15) | <.001 | ||
Urea nitrogen (mg/dL) | 14.69 (5.47) | 16.86 (8.75) | <.001 | ||
Creatinine (mg/dL) | 0.81 (0.24) | 0.91 (0.43) | <.001 | ||
eGFRg (mL/min/1.73 m2) | 95.41 (23.22) | 82.75 (7.12) | <.001 | ||
Albumin (g/dL) | 4.16 (0.51) | 3.67 (0.69) | <.001 | ||
C-reactive protein (g/dL) | 4.04 (18.35) | 8.4 (26.4) | <.001 |
aAKI: acute kidney injury.
bCOPD: chronic obstructive pulmonary disease.
cBP: blood pressure.
dACEi: angiotensin-converting enzyme inhibitor.
eARB: angiotensin II type 1 receptor blocker.
fNSAID: nonsteroidal anti-inflammatory drug.
geGFR: estimated glomerular filtration rate.
Clinical parameters | Univariable analysis | Multivariable analysis | |||
Odds ratio (95% CI) | P valuea | Odds ratio (95% CI) | P value | ||
Age (years) | 1.034 (1.032-1.036) | <.001 | 1.021 (1.019-1.023) | <.001 | |
Sex: male (reference: female) | 0.505 (0.483-0.529) | <.001 | 0.690 (0.652-0.73) | <.001 | |
Systolic BPb (mm Hg) | 1.023 (1.022-1.024) | <.001 | 1.013 (1.011-1.014) | <.001 | |
Diastolic BP (mm Hg) | 0.974 (0.972-0.977) | <.001 | 0.983 (0.981-0.985) | <.001 | |
BMI (kg/m2) | 0.986 (0.981-0.992) | <.001 | 1.011 (1.006-1.017) | <.001 | |
Chronic kidney disease: yes (reference: no) | 11.303 (9.098-14.043) | <.001 | 2.248 (1.728-2.925) | <.001 | |
Diabetes mellitus: yes (reference: no) | 2.800 (2.577-3.042) | <.001 | 1.161 (1.05-1.284) | .01 | |
Hypertension: yes (reference: no) | 2.636 (2.4-2.896) | <.001 | 1.210 (1.08-1.357) | .01 | |
Cerebrovascular disease: yes (reference: no) | 1.870 (1.729-2.022) | <.001 | 1.217 (1.118-1.326) | <.001 | |
Coronary artery disease: yes (reference: no) | 2.338 (2.078-2.632) | <.001 | 1.049 (0.917-1.199) | .49 | |
COPDc: yes (reference: no) | 2.62 (2.097-3.275) | <.001 | 1.136 (0.896-1.441) | .29 | |
Liver cirrhosis: yes (reference: no) | 4.147 (3.493-4.925) | <.001 | 1.316 (1.086-1.595) | .01 | |
Smoking: active (reference: never) | 1.329 (1.253-1.411) | <.001 | 1.191 (1.112-1.275) | <.001 | |
Operation duration | 1.223 (1.21-1.237) | <.001 | 1.164 (1.150-1.178) | <.001 | |
Preoperative ACEid or ARBe usage: yes (reference: no) | 3.053 (2.825-3.3) | <.001 | 1.326 (1.216-1.447) | <.001 | |
Preoperative NSAIDsf usage: yes (reference: no) | 1.472 (1.393-1.556) | <.001 | 1 (0.941-1.062) | .99 |
aAll variables with P value <.05 in the univariate analysis were included in the multivariate analysis.
bBP: blood pressure.
cCOPD: chronic obstructive pulmonary disease.
dACEi: angiotensin-converting enzyme inhibitor.
eARB: angiotensin II type 1 receptor blocker.
fNSAID: nonsteroidal anti-inflammatory drug.
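The univariable odds ratios in the table above can be cross-checked against the counts in the baseline characteristics table. As a worked example, the sketch below recomputes the odds ratio for chronic kidney disease (114 of 7935 patients with AKI vs 298 of 231,332 patients without AKI) with a Wald-type 95% CI on the log scale. The function name is hypothetical, and the CI method is an assumption, since the text does not state how the intervals were derived; the result, approximately 11.3 (95% CI approximately 9.1-14.0), agrees closely with the reported 11.303 (9.098-14.043).

```python
import math


def odds_ratio_with_ci(exposed_cases, total_cases, exposed_controls, total_controls, z=1.96):
    """Odds ratio from a 2x2 table with a Wald CI computed on the log scale."""
    a = exposed_cases                       # AKI, with the comorbidity
    b = total_cases - exposed_cases         # AKI, without the comorbidity
    c = exposed_controls                    # no AKI, with the comorbidity
    d = total_controls - exposed_controls   # no AKI, without the comorbidity

    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


# Chronic kidney disease counts taken from the baseline characteristics table.
or_ckd, lower_ckd, upper_ckd = odds_ratio_with_ci(114, 7935, 298, 231332)
print(round(or_ckd, 2), round(lower_ckd, 2), round(upper_ckd, 2))
```

The same check applies to any binary variable in the baseline table, which makes it a quick way to detect transcription errors in counts or percentages.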
Laboratory parameters | Univariable analysis | Multivariable analysis | ||||
Odds ratio (95% CI) | P valuea | Odds ratio (95% CI) | P value | |||
Preoperative serum variables | ||||||
Albumin | 0.269 (0.26-0.278) | <.001 | 0.524 (0.489-0.561) | <.001 | ||
Total protein | 0.452 (0.44-0.464) | <.001 | 1.042 (0.998-1.089) | .06 | ||
White blood cell count | 1.015 (1.013-1.017) | <.001 | 1.005 (1.003-1.008) | <.001 | ||
ALTb | 1.002 (1.002-1.003) | <.001 | 1 (0.999-1.001) | .62 | ||
ASTc | 1.003 (1.002-1.003) | <.001 | 1 (1-1.001) | .52 | ||
Urea nitrogen | 1.05 (1.047-1.053) | <.001 | 1.001 (0.997-1.005) | .63 | ||
Calcium | 0.433 (0.419-0.448) | <.001 | 1.003 (0.959-1.049) | .88 | ||
Chloride | 0.975 (0.968-0.982) | <.001 | 1.005 (0.997-1.013) | .20 | ||
Creatine phosphokinase | 1 (1-1) | <.001 | 1 (1-1) | .12 | ||
Creatinine | 3.034 (2.846-3.234) | <.001 | 3.218 (2.871-3.607) | <.001 | ||
eGFRd | 1.004 (1.004-1.005) | <.001 | 1.012 (1.011-1.013) | <.001 | ||
C-reactive protein | 1.007 (1.006-1.007) | <.001 | 0.998 (0.997-0.999) | .01 | ||
Glucose | 1.007 (1.006-1.007) | <.001 | 1.002 (1.002-1.002) | <.001 | ||
Hemoglobin | 0.747 (0.738-0.756) | <.001 | 1 (0.958-1.043) | .99 | ||
Potassium | 0.539 (0.51-0.57) | <.001 | 0.755 (0.715-0.798) | <.001 | ||
Lactic dehydrogenase | 1.001 (1.001-1.001) | <.001 | 1 (1-1) | <.001 | ||
Uric acid | 1.099 (1.084-1.114) | <.001 | 1.078 (1.062-1.095) | <.001 | ||
Sodium | 0.892 (0.886-0.898) | <.001 | 0.98 (0.971-0.99) | <.001 | ||
Preoperative urine variables | ||||||
Dipstick protein | 2.002 (1.948-2.057) | <.001 | 1.369 (1.325-1.416) | <.001 | ||
Specific gravity | 0.118 (0.023-0.617) | .01 | 0.042 (0.009-0.196) | <.001 |
aAll variables with P value <.05 in the univariate analysis were included in the multivariate analysis.
bALT: alanine aminotransferase.
cAST: aspartate aminotransferase.
deGFR: estimated glomerular filtration rate.
Model Performance
The dataset was divided into the training set (80%) and the test set (20%). The training set (n=191,413) and test set (n=47,854) were balanced for outcomes and randomly assigned. To prevent outcome prediction bias, the testing subset was only evaluated after the model had been finalized. Predictive features were blinded during the outcome assessment phase. The loss function graph and AUC graphs of the training and validation sets for the DNN model are shown in
. Performance of the training and test sets of the DNN model is also presented in . The performances of the different models are shown in .
We hypothesized that a simpler system using fewer variables (for example, fewer than 20) would be more practical to use in a clinical setting. Therefore, we evaluated model 2 and model 3 using multiple machine learning methods. Model 2 included 11 variables that were used in the classification system developed by Park et al [
], including age, sex, emergency operation, operation duration, diabetes, ACEi or ARB usage, blood levels of albumin, hemoglobin, and sodium, eGFR, and urine dipstick protein. In this model, light GBM (AUC=0.81) showed the highest performance, followed by random forest (AUC=0.806) and DNN (AUC=0.8). Model 3 included variables that were found significant on multivariable analysis, including age, sex, systolic blood pressure, diastolic blood pressure, operation duration, eGFR, blood levels of creatinine, albumin, sodium, potassium, chloride, glucose, and lactic dehydrogenase, and urine dipstick protein. In this model as well, light GBM (AUC=0.825) showed the highest performance, followed by random forest (AUC=0.812) and DNN (AUC=0.811). Model 1 included all 38 preoperative variables and surgical characteristics, on which light GBM (AUC=0.836) and DNN (AUC=0.832) demonstrated the best prediction performance. The ROC-AUC for model 1 of the different AKI prediction models is shown in .
To enhance clinical applicability, a nomogram was created based on a simplified logistic regression model, focusing on 8 key predictors: age, sex, albumin, hemoglobin, sodium, operation duration, eGFR, and urine protein. The nomogram was created by making modifications to a Python nomogram library called simpleNomo [
]. It provides a graphical tool for clinicians to estimate the risk of AKI in individual patients by integrating these factors. This approach allows quick risk stratification, clinical decision-making, and targeted interventions [ ]. The nomogram is shown in .
Analysis and model | AUCa | Accuracy | NPVb | Precision or PPVc | Specificity | Recall or sensitivity | F1-score | |||||||
DNNd | ||||||||||||||
Model 1e | 0.832 | 0.711 | 0.99 | 0.086 | 0.708 | 0.802 | 0.156 | |||||||
Model 2f | 0.8 | 0.712 | 0.988 | 0.082 | 0.711 | 0.75 | 0.147 | |||||||
Model 3g | 0.811 | 0.691 | 0.989 | 0.079 | 0.688 | 0.785 | 0.144 | |||||||
Logistic regression | ||||||||||||||
Model 1 | 0.825 | 0.7 | 0.99 | 0.083 | 0.696 | 0.805 | 0.151 | |||||||
Model 2 | 0.79 | 0.73 | 0.987 | 0.083 | 0.731 | 0.709 | 0.148 | |||||||
Model 3 | 0.806 | 0.719 | 0.988 | 0.083 | 0.718 | 0.741 | 0.149 | |||||||
Logistic regression with LASSOh penalty | ||||||||||||||
Model 1 | 0.821 | 0.727 | 0.989 | 0.088 | 0.725 | 0.771 | 0.158 | |||||||
Model 2 | 0.788 | 0.713 | 0.987 | 0.08 | 0.712 | 0.728 | 0.144 | |||||||
Model 3 | 0.803 | 0.703 | 0.988 | 0.079 | 0.702 | 0.749 | 0.143 | |||||||
Decision tree | ||||||||||||||
Model 1 | 0.679 | 0.606 | 0.983 | 0.056 | 0.603 | 0.691 | 0.104 | |||||||
Model 2 | 0.711 | 0.635 | 0.984 | 0.062 | 0.633 | 0.708 | 0.114 | |||||||
Model 3 | 0.626 | 0.844 | 0.976 | 0.085 | 0.86 | 0.379 | 0.139 | |||||||
Random forest | ||||||||||||||
Model 1 | 0.813 | 0.751 | 0.988 | 0.092 | 0.752 | 0.732 | 0.163 | |||||||
Model 2 | 0.806 | 0.708 | 0.988 | 0.081 | 0.706 | 0.751 | 0.146 | |||||||
Model 3 | 0.812 | 0.756 | 0.987 | 0.091 | 0.758 | 0.708 | 0.161 | |||||||
Light GBMi | ||||||||||||||
Model 1 | 0.836 | 0.711 | 0.991 | 0.087 | 0.708 | 0.813 | 0.157 | |||||||
Model 2 | 0.81 | 0.73 | 0.988 | 0.086 | 0.73 | 0.74 | 0.154 | |||||||
Model 3 | 0.825 | 0.728 | 0.988 | 0.087 | 0.727 | 0.756 | 0.156 | |||||||
Naïve Bayes | ||||||||||||||
Model 1 | 0.785 | 0.68 | 0.988 | 0.075 | 0.677 | 0.767 | 0.137 | |||||||
Model 2 | 0.773 | 0.662 | 0.988 | 0.072 | 0.658 | 0.77 | 0.131 | |||||||
Model 3 | 0.792 | 0.742 | 0.986 | 0.086 | 0.744 | 0.701 | 0.153 |
aAUC: area under the curve.
bNPV: negative predictive value.
cPPV: positive predictive value.
dDNN: deep neural network.
eModel 1: age, sex, systolic blood pressure, diastolic blood pressure, BMI, chronic kidney disease, diabetes mellitus, hypertension, cerebrovascular disease, coronary artery disease, chronic obstructive pulmonary disease, liver cirrhosis, emergency operation, operation duration, angiotensin-converting enzyme inhibitor (ACEi) or angiotensin II type 1 receptor blocker (ARB) usage, nonsteroidal anti-inflammatory drug (NSAID) usage, estimated glomerular filtration rate (eGFR), blood levels of creatinine, total protein, albumin, aspartate aminotransferase (AST), alanine aminotransferase (ALT), urea nitrogen, sodium, potassium, chloride, calcium, creatine phosphokinase, lactic dehydrogenase, C-reactive protein, glucose, hemoglobin, and white blood cell count, urine specific gravity, and urine protein.
fModel 2: age, sex, emergency operation, operation duration, diabetes mellitus, ACEi or ARB usage, blood levels of albumin, hemoglobin, and sodium, eGFR, and urine protein.
gModel 3: age, sex, systolic blood pressure, diastolic blood pressure, operation duration, eGFR, blood levels of creatinine, albumin, sodium, potassium, chloride, glucose, and lactic dehydrogenase, and urine protein.
hLASSO: Least Absolute Shrinkage and Selection Operator.
iGBM: gradient boosting machine.
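The threshold-dependent metrics in the table above follow from two ingredients: a cutoff chosen by the Youden index, and the low prevalence of AKI. The sketch below illustrates both with the standard library. `youden_threshold` scans candidate cutoffs on toy data, and `predictive_values` derives PPV, NPV, and F1-score from sensitivity, specificity, and prevalence. The function names and toy scores are assumptions, and the calculation assumes that the test set has the same AKI prevalence as the full cohort (7935 of 239,267). With the DNN model 1 sensitivity (0.802) and specificity (0.708), it yields a PPV of about 0.086 and an NPV of about 0.99, in line with the table.

```python
def youden_threshold(y_true, scores):
    """Return (cutoff, J) where the cutoff maximizes J = TPR - FPR."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        # Predict positive when the score is at or above the cutoff.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cut)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j


def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV, NPV, and F1-score from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return ppv, npv, f1


# Toy labels and scores, for illustration only.
cutoff, j_stat = youden_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# DNN model 1 sensitivity and specificity from the table; cohort-level prevalence.
ppv, npv, f1 = predictive_values(0.802, 0.708, 7935 / 239267)
print(cutoff, j_stat)
print(round(ppv, 3), round(npv, 3), round(f1, 3))
```

This makes explicit why precision and F1-score are low for every model despite AUC values above 0.8: at a prevalence of about 3.3%, even a specificity of roughly 0.7 produces many more false positives than true positives.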


Finally, our postoperative AKI prediction tool, the CMC-AKIX, was developed using all 38 variables. Specifically, DNN model 1 was developed into a user-friendly website, which can be accessed on the web [
] (shown in ). This was created using Flask and hosted on a Google Cloud Virtual Machine.
Discussion
Using a multicenter database of 239,267 noncardiac surgeries, we have developed a high-performance risk prediction system for postoperative AKI that can be easily applied. The model uses preoperative patient characteristics and laboratory data along with simple information about the surgery. DNN and light GBM showed a good performance in predicting postoperative AKI, with the best performance when all 38 variables were included.
AKI has a global presence and a high disease burden and mortality [
]. The incidence of AKI varies widely according to geographic location and depends on the setting: community acquired versus hospital acquired. It was reported that 1 in 5 adults and 1 in 3 children worldwide experience hospital-acquired AKI using the KDIGO definition [ ]. Causes of hospital-acquired AKI include sepsis, critical illness, surgery, and use of nephrotoxic medications [ ]. Postoperative AKI accounts for 30% to 40% of hospital-acquired AKI [ ] and increases the risk of morbidity and in-hospital mortality [ ]. Since treatment options are limited, the prevention of postoperative AKI is the cornerstone of improving patient outcomes after surgery [ ]. Previous studies have found risk factors that increase the risk of postoperative AKI [ , , ]. However, the definition of AKI using increased sCr levels as a marker of kidney damage has a limitation, because sCr levels begin to increase only after the pathological changes of kidney injury are already in progress. Therefore, early and timely prevention and detection of postoperative AKI can be difficult [ ]. This has led to continuous efforts to develop a risk stratification system for postoperative AKI. Recently, Park et al [ ] developed an index to classify postoperative AKI within 90 days after noncardiac surgery from 90,805 patients (SPARK index), which included 11 variables: age, sex, expected surgery duration, emergency operation, diabetes, use of RASi, baseline eGFR, dipstick albuminuria, hypoalbuminemia, anemia, and hyponatremia. The SPARK index showed a discrimination power of AUC of 0.80 for postoperative AKI in the discovery cohort and an AUC of 0.72 in the validation cohort.
Machine learning approaches are more flexible than conventional statistical methods because they do not rely on statistical assumptions such as noncollinearity or normally distributed residuals. They allow for interactions between variables following multidimensional nonlinear patterns and aggressively search for as many informative and interesting features as possible [
]. Lei et al [ ] used machine learning techniques to stratify the risk of postoperative AKI within 7 days after noncardiac surgery in a single-center cohort of 42,615 patients. In that study, GBM, which included 339 preoperative and intraoperative variables, showed the highest performance with an AUC of 0.817 (95% CI 0.802-0.832). Bihorac et al [ ] developed a machine learning–based risk prediction tool (MySurgeryRisk) for 8 major postoperative complications within 24 months after any kind of surgery from a single-center cohort of 51,457 patients. Using this platform, the authors validated the model’s performance for predicting postoperative AKI, with an AUC of 0.82 (95% CI 0.82-0.83), including 135 variables from a cohort of 22,300 surgeries [ ].
The strength of our study is that we used a multicenter dataset of a larger scale than previous ones. Data were extracted from the CMC Clinical Data Warehouse, which included data from seven academic hospitals located in five cities in South Korea. The prediction model was developed using the combination of 38 clinical and laboratory parameters that exhibited the best prediction performance. These variables are used in clinical practice and can be extracted from electronic medical records. In addition, by including the department of surgery as a variable, the CMC-AKIX can be applied to various kinds of noncardiac surgery. We are looking to simplify the model and improve usability by allowing incomplete data or missing values to be filled in with best estimate values using imputation methods such as MICE.
Our study holds a distinct advantage in that it compared several of the most widely used machine learning methods in clinical data modeling. By doing so, we systematically observed and elucidated the strengths and limitations of each model, using a large, well-curated dataset. We observed that certain methods were more affected by the imbalanced dataset, including the decision tree classifier, random forest, and naïve Bayes. We aim to offer insights into the selection of different algorithms for applications in clinical studies.
This study has several limitations. First, the results of this study have not been externally validated in independent cohorts from different countries, races, and ethnicities. As such, further external validation is needed to assess the generalizability of the CMC-AKIX model across diverse populations. Second, our definition of postoperative AKI as AKI developing within 30 days after surgery may be controversial as most studies observing postoperative AKI apply the time period of 7 days in conjunction with the KDIGO criteria [
]. A 30-day period was selected for this study because postoperative complications or morbidity in most studies are defined as events occurring within 30 days after surgery [ , ]. Patients with AKI that persists for more than 7 days, beyond the 30-day period of the study, have been organized into a second cohort study observing the postoperative risk of acute kidney disease and CKD [ ]. Third, the urine output definition of the KDIGO criteria was not used because of a lack of urine output data. This could have led to incomplete identification of postoperative AKI. Fourth, intraoperative and postoperative factors, which also affect postoperative renal outcomes, were not included in the risk prediction system. However, as the purpose of our model is mainly to identify patients at high risk for postoperative AKI while they are still in the preoperative setting, intraoperative and postoperative variables should not be included.
In the future, we plan to collaborate with other institutions that have different demographic data to validate the model and determine whether it performs well in different populations. In addition, the model will be fine-tuned as diverse datasets are included, and we expect to improve its performance by creating an appropriate ensemble of machine learning models that combines the strengths of the different model structures [
]. Finally, the practical use of the model may be substantially increased by incorporating it into an electronic alert system that automatically identifies patients at high risk for postoperative AKI and provides timely risk alerts, thereby allowing for proactive management such as cessation of causative medications or prescription of fluids and ultimately improving patient care [ ]. Such systems could also allow for continuous updates and refinements of the model as new data become available, ensuring its relevance and adaptability. By supporting evidence-based decision-making and improving perioperative risk management, this approach has the potential to significantly enhance patient outcomes and optimize resource allocation in diverse health care settings. In conclusion, we propose a machine learning–based risk prediction tool, the CMC-AKIX, using individual patients’ preoperative characteristics and surgical information. This model was adapted to a user-friendly web-based program, which can be used even when not all variables are available. This tool may guide preoperative counseling, decision-making, and perioperative care.
Acknowledgments
This research was supported by the Clinical Trials Center of Incheon St. Mary’s Hospital; the Catholic University of Korea; and the Institute of Clinical Medicine Research of Bucheon St. Mary’s Hospital, Research Fund, 2020. The authors also wish to acknowledge the financial support of the Catholic Medical Center Research Foundation made in the program year of 2022 and a cooperative research fund from the Korean Society of Nephrology (2022). In addition, this research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant RS-2023-KH135368).
Data Availability
Access to the datasets generated or analyzed in this study is available from the corresponding author upon reasonable request.
Authors' Contributions
In Young Choi and Hye Eun Yoon are joint corresponding authors of this paper. The contact information for In Young Choi is as follows:
In Young Choi, PhD
Department of Medical Informatics, Graduate School of Healthcare Management & Policy, College of Medicine, The Catholic University of Korea
Tel: 82-2-2258-7870
E-mail: iychoi@catholic.ac.kr
Conflicts of Interest
None declared.
ICD-10 (International Statistical Classification of Diseases and Related Health Problems, 10th Revision) codes for comorbid conditions.
DOCX File, 21 KB
Rates of missing data for the variables before imputation.
DOCX File, 34 KB
Code and configurations of the deep neural network.
DOCX File, 22 KB
Performance of the training and test sets of the deep neural network model.
DOCX File, 21 KB
References
- Thakar CV. Perioperative acute kidney injury. Adv Chronic Kidney Dis. Jan 2013;20(1):67-75.
- Biteker M, Dayan A, Tekkeşin AI, Can MM, Taycı İ, İlhan E, et al. Incidence, risk factors, and outcomes of perioperative acute kidney injury in noncardiac and nonvascular surgery. Am J Surg. Jan 2014;207(1):53-59.
- Kim M, Brady J, Li G. Variations in the risk of acute kidney injury across intraabdominal surgery procedures. Anesth Analg. Nov 2014;119(5):1121-1132.
- Bihorac A, Yavas S, Subbiah S, Hobson C, Schold J, Gabrielli A, et al. Long-term risk of mortality and acute kidney injury during hospitalization after major surgery. Ann Surg. May 2009;249(5):851-858.
- Choi BY, Choi W, Min J, Chung BH, Koh ES, Hong SY, et al. Predicting long-term mortality of patients with postoperative acute kidney injury following noncardiac general anesthesia surgery using machine learning. Kidney Res Clin Pract. Sep 26, 2024.
- Park JT. Postoperative acute kidney injury. Korean J Anesthesiol. Jun 2017;70(3):258-266.
- Grams ME, Sang Y, Coresh J, Ballew S, Matsushita K, Molnar MZ, et al. Acute kidney injury after major surgery: a retrospective analysis of Veterans Health Administration data. Am J Kidney Dis. Jun 2016;67(6):872-880.
- Boyer N, Eldridge J, Prowle JR, Forni LG. Postoperative acute kidney injury. Clin J Am Soc Nephrol. Oct 2022;17(10):1535-1545.
- Park S, Cho H, Park S, Lee S, Kim K, Yoon HJ, et al. Simple postoperative AKI risk (SPARK) classification before noncardiac surgery: a prediction index development study with external validation. J Am Soc Nephrol. Jan 2019;30(1):170-181.
- Bihorac A, Ozrazgat-Baslanti T, Ebadi A, Motaei A, Madkour M, Pardalos P, et al. MySurgeryRisk: development and validation of a machine-learning risk algorithm for major complications and death after surgery. Ann Surg. Apr 2019;269(4):652-662.
- Lei VJ, Luong T, Shan E, Chen X, Neuman MD, Eneanya ND, et al. Risk stratification for postoperative acute kidney injury in major noncardiac surgery using preoperative and intraoperative data. JAMA Netw Open. Dec 02, 2019;2(12):e1916921.
- Vanmassenhove J, Kielstein J, Jörres A, Biesen WV. Management of patients at risk of acute kidney injury. Lancet. May 27, 2017;389(10084):2139-2151.
- Kim KJ, Tagkopoulos I. Application of machine learning in rheumatic disease research. Korean J Intern Med. Jul 2019;34(4):708-722.
- Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. Apr 18, 2024;385:q902.
- Kidney Disease: Improving Global Outcomes (KDIGO) Acute Kidney Injury Work Group. KDIGO clinical practice guideline for acute kidney injury. Kidney Int. Mar 2012;2(1):1-138.
- Hori D, Katz NM, Fine DM, Ono M, Barodka VM, Lester LC, et al. Defining oliguria during cardiopulmonary bypass and its relationship with cardiac surgery-associated acute kidney injury. Br J Anaesth. Dec 2016;117(6):733-740.
- Mizota T, Yamamoto Y, Hamada M, Matsukawa S, Shimizu S, Kai S. Intraoperative oliguria predicts acute kidney injury after major abdominal surgery. Br J Anaesth. Dec 01, 2017;119(6):1127-1134.
- Inker LA, Eneanya ND, Coresh J, Tighiouart H, Wang D, Sang Y, et al. New creatinine- and cystatin C-based equations to estimate GFR without race. N Engl J Med. Nov 04, 2021;385(19):1737-1749.
- Buuren SV, Groothuis-Oudshoorn K. MICE: multivariate imputation by chained equations in R. J Stat Soft. 2011;45(3):1-67.
- Li P, Stuart EA, Allison DB. Multiple imputation: a flexible tool for handling missing data. JAMA. Nov 10, 2015;314(18):1966-1967.
- Azur MJ, Stuart EA, Frangakis C, Leaf PJ. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res. Mar 2011;20(1):40-49.
- Dong Y, Peng CYJ. Principled missing data methods for researchers. Springerplus. Dec 2013;2(1):222.
- White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. Feb 20, 2011;30(4):377-399.
- Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. Nov 01, 2011;12:2825-2830.
- Wang L, Han M, Li X, Zhang N, Cheng H. Review of classification methods on unbalanced data sets. IEEE Access. 2021;9:64606-64628.
- Lemaître G, Nogueira F, Aridas CK. Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. arXiv. Preprint posted online on September 21, 2016.
- Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321-357.
- Lei L, Wang Y, Xue Q, Tong J, Zhou C, Yang J. A comparative study of machine learning algorithms for predicting acute kidney injury after liver cancer resection. PeerJ. 2020;8:e8583.
- Habehh H, Gohel S. Machine learning in healthcare. Curr Genomics. Dec 16, 2021;22(4):291-300.
- Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: a highly efficient gradient boosting decision tree. Presented at: NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems; December 4-9, 2017:3149-3157; Long Beach, CA. URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf
- Keras. GitHub. 2015. URL: https://github.com/fchollet/keras [accessed 2024-05-27]
- LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. May 28, 2015;521(7553):436-444.
- Zabor EC, Reddy CA, Tendulkar RD, Patil S. Logistic regression in clinical studies. Int J Radiat Oncol Biol Phys. Feb 01, 2022;112(2):271-277.
- Song YY, Lu Y. Decision tree methods: applications for classification and prediction. Shanghai Arch Psychiatry. Apr 25, 2015;27(2):130-135.
- Breiman L. Random forests. Mach Learn. 2001;45(1):5-32.
- Rish I. An empirical study of the naïve Bayes classifier. IJCAI 2001 Work Empir Methods Artif Intell. 2001;3.
- Schisterman EF, Faraggi D, Reiser B, Hu J. Youden Index and the optimal threshold for markers with mass at zero. Stat Med. Jan 30, 2008;27(2):297-315.
- Vittinghoff E, McCulloch CE. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. Mar 15, 2007;165(6):710-718.
- Hong H, Hong S. simpleNomo: a Python package of making nomograms for visualizable calculation of logistic regression models. Health Data Sci. 2023;3:0023.
- Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. Cham, Switzerland. Springer; 2019.
- CMC-AKIX. URL: http://www.cmc-akix.com [accessed 2025-03-14]
- Lewington AJ, Cerdá J, Mehta RL. Raising awareness of acute kidney injury: a global perspective of a silent killer. Kidney Int. Sep 2013;84(3):457-467.
- Susantitaphong P, Cruz DN, Cerda J, Abulfaraj M, Alqahtani F, Koulouridis I, et al. World incidence of AKI: a meta-analysis. Clin J Am Soc Nephrol. 2013;8(9):1482-1493.
- Uchino S, Kellum JA, Bellomo R, Doig GS, Morimatsu H, Morgera S, et al. Acute renal failure in critically ill patients: a multinational, multicenter study. JAMA. Aug 17, 2005;294(7):813-818.
- Kellum JA, Romagnani P, Ashuntantang G, Ronco C, Zarbock A, Anders H. Acute kidney injury. Nat Rev Dis Primers. Jul 15, 2021;7(1):52.
- Ren Y, Loftus TJ, Datta S, Ruppert MM, Guan Z, Miao S, et al. Performance of a machine learning algorithm using electronic health record data to predict postoperative complications and report on a mobile platform. JAMA Netw Open. May 02, 2022;5(5):e2211973.
- Sheetz KH, Woodside KJ, Shahinian VB, Dimick JB, Montgomery JR, Waits SA. Trends in bariatric surgery procedures among patients with ESKD in the United States. Clin J Am Soc Nephrol. Aug 07, 2019;14(8):1193-1199.
- Dencker EE, Bonde A, Troelsen A, Varadarajan KM, Sillesen M. Postoperative complications: an observational study of trends in the United States from 2012 to 2018. BMC Surg. Nov 06, 2021;21(1):393.
- Kung CW, Chou Y. Acute kidney disease: an overview of the epidemiology, pathophysiology, and management. Kidney Res Clin Pract. Nov 2023;42(6):686-699.
- Polikar R. Ensemble based systems in decision making. IEEE Circuits Syst Mag. 2006;6(3):21-45.
- Park S, Yi J, Lee YJ, Kwon EJ, Yun G, Jeong JC, et al. Electronic alert outpatient protocol improves the quality of care for the risk of postcontrast acute kidney injury following computed tomography. Kidney Res Clin Pract. Sep 2023;42(5):606-616.
Abbreviations
AKI: acute kidney injury
ACEi: angiotensin-converting enzyme inhibitor
ARB: angiotensin II type 1 receptor blocker
AUC: area under the curve
CKD: chronic kidney disease
DNN: deep neural network
GBM: gradient boosting machine
ICD-10: International Statistical Classification of Diseases and Related Health Problems, 10th Revision
eGFR: estimated glomerular filtration rate
KDIGO: Kidney Disease: Improving Global Outcomes
MICE: multiple imputation by chained equations
NSAID: nonsteroidal anti-inflammatory drug
RASi: renin-angiotensin-aldosterone system inhibitors
ROC: receiver operating characteristic
RRT: renal replacement therapy
sCr: serum creatinine
SMOTE: synthetic minority over-sampling technique
TRIPOD+AI: Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis+Artificial Intelligence
Edited by B Puladi; submitted 10.06.24; peer-reviewed by M-P Li, Q Yan; comments to author 07.08.24; revised version received 16.10.24; accepted 02.01.25; published 09.04.25.
Copyright © Ji Won Min, Jae-Hong Min, Se-Hyun Chang, Byung Ha Chung, Eun Sil Koh, Young Soo Kim, Hyung Wook Kim, Tae Hyun Ban, Seok Joon Shin, In Young Choi, Hye Eun Yoon. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 09.04.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.