2022 journal article
A Machine Learning Pipeline for Mortality Prediction in the ICU
International Journal of Digital Health, 2(1).
Predicting mortality risk for patients admitted to the intensive care unit (ICU) is a crucial and challenging task: accurate predictions allow clinicians to respond with timely and appropriate clinical intervention. The need has become more urgent during the global COVID-19 pandemic. In recent years, electronic health records (EHR) have been widely adopted and have the potential to greatly improve clinical services and diagnostics. However, the large proportion of missing data in EHR poses challenges that may reduce the accuracy of prediction methods. We propose a cohort study that builds a pipeline to extract ICD-9 codes and laboratory tests from publicly available electronic ICU databases, and that improves in-hospital mortality prediction accuracy by combining a neural network approach to missing data imputation with a decision tree based outcome prediction algorithm. We show that the proposed approach achieves a higher area under the ROC curve, ranging from 0.88 to 0.98, than other well-known machine learning methods applied to similar target populations. It also offers clinical interpretation through variable selection. Our analysis further shows that mortality prediction is more challenging for neonates than for adults, and that prediction accuracy decreases the longer patients stay in the ICU.
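For readers unfamiliar with the reported metric, the area under the ROC curve can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that pairwise definition, not the article's evaluation code. The function name `roc_auc`, the 0/1 label encoding, and the toy data are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via its pairwise (Mann-Whitney) definition.

    labels: 0/1 outcomes (1 = in-hospital death; an assumed encoding)
    scores: predicted risk, where a higher value means higher risk
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (not taken from the study).
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In this toy example, eight of the nine death/survivor pairs are ranked correctly, so `toy_auc` is 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.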