2019 journal article

The value of missing information in severity of illness score development

JOURNAL OF BIOMEDICAL INFORMATICS, 97.

By: J. Agor*, O. Ozaltin, J. Ivy, M. Capan*, R. Arnold* & S. Romero*

author keywords: Severity of illness scores; Sepsis; Missing data; Prediction models; Electronic health records
MeSH headings: Adolescent; Adult; Aged; Aged, 80 and over; Area Under Curve; Computational Biology / methods; Data Interpretation, Statistical; Electronic Health Records / statistics & numerical data; Female; Hospital Mortality; Humans; Intensive Care Units; Logistic Models; Male; Middle Aged; Models, Statistical; Outcome Assessment, Health Care / statistics & numerical data; Sepsis / mortality; Severity of Illness Index; Support Vector Machine; Young Adult
TL;DR: When developing prediction models using longitudinal EHR data, researchers should explore the incorporation of indicators for missing variables along with appropriate imputation to improve the performance of severity of illness scoring systems. (via Semantic Scholar)
UN Sustainable Development Goal Categories
3. Good Health and Well-being (Web of Science; OpenAlex)
Source: Web Of Science
Added: April 27, 2020

We aim to investigate the hypothesis that using information about which variables are missing along with appropriate imputation improves the performance of severity of illness scoring systems used to predict critical patient outcomes.

We quantify the impact of missing and imputed variables on the performance of prediction models used in the development of a sepsis-related severity of illness scoring system. Electronic health records (EHR) data were compiled from Christiana Care Health System (CCHS) on 119,968 adult patients hospitalized between July 2013 and December 2015. Two outcomes of interest were considered for prediction: (1) first transfer to intensive care unit (ICU) and (2) in-hospital mortality. Five different prediction models were employed. Indicators were utilized in these prediction models to identify when variables were missing and imputed.

We observed statistically significant gains in prediction performance when moving from models that did not indicate missing information to those that did. Moreover, this increase was higher in models that use summary variables as predictors compared to those that use all variables.

When developing prediction models using longitudinal EHR data, researchers should explore the incorporation of indicators for missing variables along with appropriate imputation.
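The idea of pairing imputation with missingness indicators can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which imputation method was used, so median imputation is assumed here, and the function name, the variable names (`lactate`, `heart_rate`), and the toy records are all hypothetical.

```python
from statistics import median


def impute_with_indicators(records, variables):
    """Return copies of the records with imputed values and 0/1 missing flags.

    Assumption: missing values are represented as None and are filled with
    the median of the observed values (the abstract does not specify this).
    """
    # Median of the observed (non-None) values for each variable.
    fill = {}
    for var in variables:
        observed = [r[var] for r in records if r.get(var) is not None]
        fill[var] = median(observed) if observed else None

    augmented = []
    for r in records:
        row = dict(r)  # copy so the input records are left unchanged
        for var in variables:
            missing = r.get(var) is None
            # The indicator preserves the fact that the value was not recorded.
            row[f"{var}_missing"] = 1 if missing else 0
            if missing:
                row[var] = fill[var]
        augmented.append(row)
    return augmented


# Hypothetical toy records; None marks a measurement that was not taken.
patients = [
    {"lactate": 1.0, "heart_rate": 88},
    {"lactate": None, "heart_rate": 112},
    {"lactate": 3.0, "heart_rate": None},
]
augmented = impute_with_indicators(patients, ["lactate", "heart_rate"])
```

Each augmented record carries both the imputed value and a flag such as `lactate_missing`, so a downstream prediction model can use the fact that a measurement was absent as a predictor in its own right.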