Machine learning for predicting device-associated infection and 30-day survival outcomes after invasive device procedure in intensive care unit patients


Study design and participants

Electronic inpatient records were retrieved from the MIMIC-IV database (version 2.2), which includes data on 180,733 inpatients and 50,920 ICU patients admitted to Beth Israel Deaconess Medical Center between 2008 and 2019. This database integrates information from hospital and ICU systems, along with external sources, capturing vital signs, laboratory and microbiological tests, admissions and discharges, medications, length of stay, survival data, and discharge or death records16. Access to this database was granted upon completion of the Collaborative Institutional Training Initiative (CITI Program) training course (certification ID: 12037493). Since the database is publicly available and anonymized, patient informed consent was not required. This study adhered to relevant guidelines to ensure patient privacy, and it was approved by the Ethics Committee of Tengzhou Central People’s Hospital in China (ethical review approval number: 2023-ethical reviews-45). All methodologies and protocols were rigorously conducted in full compliance with the principles set forth in the Declaration of Helsinki.

The study included patients older than 18 years at their first ICU admission who underwent at least one invasive procedure: invasive mechanical ventilation (IMV), central venous catheter (CVC) placement, or indwelling urinary catheter (IUC) placement. Exclusion criteria were: (1) death within 48 h of ICU admission; (2) invasive procedures conducted in non-ICU departments; (3) failure to meet the Centers for Disease Control and Prevention (CDC) National Healthcare Safety Network (NHSN) guidelines17 for diagnosing ventilator-associated pneumonia (VAP), central line-associated bloodstream infection (CLABSI), or catheter-associated urinary tract infection (CAUTI).

Outcomes and definitions

This study targeted three types of device-associated infection: VAP, CLABSI, and CAUTI. The outcomes were 30-day mortality and device-associated infection occurring in the ICU more than 48 h after the corresponding invasive procedure. Invasive device procedures and diagnoses of device-associated infections were evaluated based on the CDC’s NHSN guidelines.
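The timing rules in these definitions can be sketched as a small labeling function. This is a simplified illustration only: the function name, argument names, and returned keys are hypothetical, the timestamps in the usage note are toy values, and the full NHSN diagnostic criteria involve clinical and microbiological conditions that are not modeled here. The 30-day window is assumed to start at the first invasive procedure, as described under Data collection.

```python
from datetime import datetime, timedelta


def label_outcomes(procedure_time, infection_time=None, death_time=None):
    """Label the two study outcomes from event timestamps (hypothetical helper).

    Assumption: an infection counts as device-associated only if it occurs
    more than 48 h after the procedure; death counts toward 30-day mortality
    if it occurs within 30 days after the procedure.
    """
    infection = (
        infection_time is not None
        and infection_time - procedure_time > timedelta(hours=48)
    )
    death_30d = (
        death_time is not None
        and timedelta(0) <= death_time - procedure_time <= timedelta(days=30)
    )
    return {
        "device_associated_infection": infection,
        "death_within_30_days": death_30d,
    }
```

For example, with a toy procedure time `t0`, an infection recorded at `t0 + 72 h` would be labeled device-associated, whereas one recorded at `t0 + 24 h` would not.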

Data collection

Variables documented upon ICU admission were collected, encompassing general information, laboratory examinations, and vital signs. Basic data included age, gender, ethnicity, comorbidities, admission care unit, invasive interventions, device-associated infections, 30-day survival outcomes post-first invasive procedure, and hospitalization outcomes. Laboratory parameters and vital signs were collected within the first 24 h of ICU admission. These included white blood cell (WBC) count, platelet count, anion gap, bicarbonate levels, creatinine levels, chloride levels, glucose levels, hemoglobin levels, potassium levels, sodium levels, blood urea nitrogen (BUN) levels, calcium levels, activated partial thromboplastin time (APTT), prothrombin time (PT), international normalized ratio (INR), systolic blood pressure (SBP), diastolic blood pressure (DBP), mean arterial pressure (MAP), temperature, peripheral blood oxygen saturation (SpO2), heart rate, respiratory rate, Glasgow Coma Scale (GCS), Sequential Organ Failure Assessment (SOFA), Acute Physiology Score III (APS III), and Simplified Acute Physiology Score II (SAPS II). The code for the study is available on GitHub (https://github.com/susu223344/Device-Associated-Infections-Hyperparametric-Search/blob/main/sql).

Development and visualization of ML models

Patients were randomly assigned to training and validation datasets in a 7:3 ratio. Missing data were imputed using the k-nearest neighbor (KNN) strategy, and continuous variables were normalized using a min-max scaler. To mitigate the risk of data leakage, the KNN imputation method and min-max scaling were separately applied to the training and validation datasets following data partitioning18. Two separate ML models were developed: one for device-associated infections and one for 30-day survival outcomes. Seven ML algorithms were applied for predicting device-associated infections: random forest (RF), logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), Gaussian naive Bayes (GNB), decision tree (DT), and recurrent neural networks with long short-term memory (LSTM). For 30-day survival outcomes, five algorithms were used: Cox regression, extra survival trees (EST), survival tree (ST), gradient boosting survival tree (GBST), and deep learning survival neural network (DeepSurv)19,20. A grid search approach was employed to optimize each model, identifying the optimal hyperparameters through 10-fold cross-validation (Supplementary Table 1). The code for this grid search is accessible on GitHub. All models were trained and evaluated using 10-fold cross-validation with the identified optimal hyperparameters.
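The partition-then-preprocess order described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code: all function names are hypothetical, the data are toy values, the choice of k = 3 and the distance definition are assumptions, and the sketch takes one common leakage-avoiding reading in which imputation donors and scaling ranges are learned from the training partition only and then reused on the validation partition. The source does not specify this detail.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle row indices reproducibly and split them 7:3."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = round(0.7 * len(idx))
    return [rows[i] for i in idx[:cut]], [rows[i] for i in idx[cut:]]


def knn_impute(rows, reference, k=3):
    """Fill None entries with the column mean of the k nearest reference rows.

    Distance is the mean squared difference over columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return sum(shared) / len(shared) if shared else float("inf")

    filled = []
    for row in rows:
        new_row = list(row)
        for c, value in enumerate(row):
            if value is not None:
                continue
            # Donors: reference rows (other than this row) with column c observed.
            donors = [r for r in reference if r is not row and r[c] is not None]
            donors.sort(key=lambda r: distance(row, r))
            nearest = donors[:k]
            if nearest:  # leave the gap if no donor exists
                new_row[c] = sum(r[c] for r in nearest) / len(nearest)
        filled.append(new_row)
    return filled


def fit_min_max(train):
    """Learn per-column (min, max) from training rows, ignoring None."""
    params = []
    for col in zip(*train):
        seen = [v for v in col if v is not None]
        params.append((min(seen), max(seen)))
    return params


def apply_min_max(rows, params):
    """Scale each column with previously learned (min, max) pairs."""
    out = []
    for row in rows:
        scaled = []
        for v, (lo, hi) in zip(row, params):
            if v is None:
                scaled.append(None)
            elif hi == lo:
                scaled.append(0.0)
            else:
                scaled.append((v - lo) / (hi - lo))
        out.append(scaled)
    return out


# Toy data: 10 patients, 2 continuous variables, two missing entries.
rows = [[float(i), float(2 * i)] for i in range(10)]
rows[3][1] = None
rows[6][0] = None

train, valid = split_70_30(rows)
train_f = knn_impute(train, reference=train)
valid_f = knn_impute(valid, reference=train)  # donors come from training only
params = fit_min_max(train_f)                 # ranges come from training only
train_s = apply_min_max(train_f, params)
valid_s = apply_min_max(valid_f, params)
```

Because the scaling ranges come from the training partition, validation values can fall slightly outside [0, 1]; that is expected under this reading and is not an error.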

The device-associated infection models were assessed using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), and the Brier score (BS). The 30-day survival models were evaluated using the concordance index (C-index), integrated Brier score (IBS), and time-dependent AUC. Both AUC and C-index measure discrimination; they are equivalent for binary classification, but the C-index also accounts for censored survival times, which makes it well suited to survival data. C-index and AUC values below 0.60 indicate poor discrimination, values between 0.60 and 0.75 indicate potentially helpful discrimination, and values above 0.75 indicate clearly useful discrimination21. AUPRC complements ROC-AUC and is more informative in class-imbalanced scenarios because it focuses on how well the positive class is identified. The time-dependent ROC curve plots ROC curves at successive time points and so shows how discrimination changes over time; a time-dependent AUC closer to 1 indicates better predictive performance. BS and IBS assess prediction accuracy: BS is the mean squared difference between predicted probabilities and observed outcomes, and IBS integrates BS over follow-up time; for both, values closer to 0 indicate more accurate predictions.
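To make these metrics concrete, the following sketch computes the Brier score, the ROC AUC, and Harrell's C-index from first principles. It is an illustration with hypothetical function names, not the evaluation code used in the study. The C-index here is a simplified version that skips pairs with tied times, and time-dependent AUC and IBS are omitted.

```python
from itertools import combinations


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_score):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def harrell_c_index(times, events, risks):
    """Fraction of comparable pairs ordered correctly by predicted risk.

    A pair is comparable when the patient with the shorter observed time had
    an event (was not censored). Pairs with tied times are skipped here.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored: ordering is unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / comparable
```

The censoring check in `harrell_c_index` is what distinguishes it from the plain AUC: a patient censored early contributes no information about who failed first, so such pairs are excluded rather than counted as errors.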

To select the optimal ML algorithm for the device-associated infection model, all seven ML models were trained with all variables and compared on AUC, AUPRC, and BS, as well as on decision curve analysis (DCA) and calibration curves. A similar process was followed for the 30-day survival model: five ML models were trained with all relevant variables, and the best algorithm was selected based on C-index, IBS, and time-dependent AUC. The Shapley additive explanation (SHAP)22 method was used to evaluate variable importance for the optimal ML algorithm of each model. The performance of the optimal ML algorithms was then tested on the full set of variables and on the top 30, top 20, top 15, and top 10 variables; the final model was the one that used the fewest variables without compromising performance.
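The variable-ranking step can be illustrated as follows. The sketch assumes that global importance is summarized as the mean absolute SHAP value per variable, which is a common convention but is not stated in the source. The function names are hypothetical, and the SHAP values in the usage note are toy numbers, not results from the study.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, most important first.

    shap_values: list of rows (one per patient), one SHAP value per feature.
    """
    n = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance, key=importance.get, reverse=True)


def top_k(ranked_features, k):
    """Keep the k highest-ranked features (e.g. k = 30, 20, 15, 10)."""
    return ranked_features[:k]
```

For example, with toy SHAP values `[[0.2, -0.5, 0.1], [-0.4, 0.3, 0.0]]` for the variables `["SOFA", "GCS", "WBC"]`, the mean absolute values are 0.3, 0.4, and 0.05, so the ranking is GCS, SOFA, WBC. The model would then be refitted on each reduced variable set and its performance compared with that of the full model.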

The optimal ML models were converted and implemented into a web-based application. This application displays the predicted probability of device-associated infections and provides insights into the decision-making process using the SHAP method. Additionally, it includes the patients’ 30-day Kaplan-Meier survival curve.

Statistical analysis

Data were presented as medians with interquartile ranges for continuous variables and as counts with percentages for categorical variables. Differences between the two datasets were assessed using the Wilcoxon rank sum test for continuous data and Fisher’s exact test for categorical data. All statistical analyses were performed using R version 4.3.0 and Python version 3.7.0 (version 3.8.17 for the web-based application). A P-value of less than 0.05 was considered statistically significant.
