AAI_2025_Capstone_Chronicles_Combined


Our goal is to build an interpretable, clinically relevant AI system aligned with responsible

practices and suitable for integration with EHR platforms like Epic or Cerner.

Dataset Summary

The dataset is sourced from the PhysioNet 2012 Challenge (Silva et al., 2012) and contains

multivariate time-series physiological and clinical records collected during the first 48 hours of ICU

admission. It comprises three primary categories: static features (patient-level attributes), dynamic

features (time-dependent physiological measurements and laboratory values collected at irregular

intervals), and outcome features (severity scores and the binary in-hospital mortality target).

Several data quality challenges were identified. Dynamic variables are measured at irregular

intervals, reflecting clinical necessity rather than fixed sampling schedules. Some laboratory values

exhibit extreme sparsity, with over 95% missingness (see Figure 1), likely because they are ordered only under

specific clinical circumstances. Other measurements, such as non-invasive blood pressure readings and

urine output, are more consistently recorded due to their role in routine monitoring. The dataset also

contains outliers in certain laboratory results, which may arise from measurement errors, transcription

mistakes, or extreme clinical states.
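
As a rough illustration of how per-variable sparsity could be quantified, the sketch below computes the share of patients with no recorded value for a given variable. The record layout (a list of time, variable, value tuples per patient), the patient IDs, and the patient-level definition of missingness are assumptions made for this example; the figures reported above may rest on a different definition, such as missing patient-hour cells.

```python
from collections import defaultdict


def missingness_by_variable(records, variables):
    """Return, for each variable, the fraction of patients with no observed value.

    `records` maps a patient ID to a list of (minutes, variable, value) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    n_patients = len(records)
    observed = defaultdict(int)
    for measurements in records.values():
        # Count each variable at most once per patient.
        seen = {name for _, name, value in measurements if value is not None}
        for name in seen:
            observed[name] += 1
    return {name: 1 - observed[name] / n_patients for name in variables}


# Toy data: heart rate is charted for most patients, troponin for only one.
records = {
    "p1": [(0, "HR", 80.0), (60, "HR", 82.0), (30, "TroponinI", 0.4)],
    "p2": [(0, "HR", 90.0)],
    "p3": [(15, "HR", 70.0)],
    "p4": [(15, "NIDiasABP", 60.0)],
}
rates = missingness_by_variable(records, ["HR", "TroponinI"])
print(rates)
```

Routinely monitored signals end up with low rates under this definition, while tests ordered only in specific circumstances approach 1.0.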

To address these issues, missing dynamic values were imputed using forward-fill and

backward-fill within patient records to maintain temporal continuity. Static numerical variables were imputed with

median values to reduce the influence of outliers while preserving central tendencies. Variables with

extreme sparsity (>80% missingness) were excluded from certain aggregation-based preprocessing steps.

Scaling was performed using statistics from the training set to prevent data leakage.
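
The preprocessing steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual pipeline: the function names and toy values are hypothetical, and it assumes that the imputation median, like the scaling statistics, is computed on the training split only, and that scaling means standardization. In a pandas workflow the same per-patient fill would typically be a grouped `ffill` followed by `bfill`.

```python
from statistics import mean, median, pstdev


def ffill_bfill(series):
    """Forward-fill gaps in one patient's series, then back-fill leading gaps."""
    filled = list(series)
    last = None
    for i, value in enumerate(filled):  # forward pass
        if value is None:
            filled[i] = last
        else:
            last = value
    nxt = None
    for i in range(len(filled) - 1, -1, -1):  # backward pass for leading gaps
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled  # a series with no observations at all stays all None


def median_impute(values, fill_value):
    """Replace missing static values with a precomputed median."""
    return [fill_value if v is None else v for v in values]


def standardize(values, mu, sigma):
    """Scale values with statistics that were fitted elsewhere (training set)."""
    return [(v - mu) / sigma for v in values]


# One patient's hourly heart rate with gaps (toy values).
hr_filled = ffill_bfill([None, 80.0, None, None, 86.0, None])

# Static feature (age): all statistics come from the training split only.
train_age = [60.0, None, 70.0, 80.0]
test_age = [None, 90.0]
age_median = median(v for v in train_age if v is not None)
train_age = median_impute(train_age, age_median)
test_age = median_impute(test_age, age_median)

mu, sigma = mean(train_age), pstdev(train_age) or 1.0  # guard against zero spread
test_scaled = standardize(test_age, mu, sigma)
print(hr_filled, test_scaled)
```

Filling is done one patient at a time so that values never carry over between patients, and the test split is transformed with the training median, mean, and standard deviation so that no information from held-out patients reaches the model.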

The project’s objective is to predict in-hospital mortality, making both static and dynamic

features potentially relevant. Static attributes such as age are well-established predictors of mortality risk,
