AAI_2025_Capstone_Chronicles_Combined


stays. This dataset provides a valuable opportunity to explore how advanced machine learning models can
leverage complex patient data to generate accurate and timely risk assessments.

Historically, this problem has been addressed using clinical scoring systems such as the Simplified
Acute Physiology Score (SAPS-I) (Le Gall et al., 1984) and the Sequential Organ Failure Assessment
(SOFA) score (Vincent et al., 1996), both of which are present in our dataset. While useful, these scores
typically rely on a limited set of variables and on traditional statistical models such as logistic regression.
The advent of large-scale electronic health record (EHR) databases has spurred a shift toward more
sophisticated machine learning approaches. Researchers have applied a range of methods to this problem,
from tree-based ensembles to deep learning models that analyze raw time-series data, demonstrating the
potential for data-driven models to improve upon traditional scoring systems. Our project will explore a
multi-modal approach, evaluating three machine learning architectures: XGBoost for aggregated tabular
data, and Convolutional Neural Networks (CNNs) and Transformers for direct time-series analysis.

XGBoost

XGBoost is a fast, regularized implementation of gradient boosting that builds an ensemble of
decision trees, each trained to correct the errors of the previous ones, resulting in high predictive accuracy
(Chen & Guestrin, 2016). It is well-suited for our tabular patient dataset, as it efficiently handles missing
values and provides built-in feature importance rankings. These capabilities allow us to both achieve
strong predictive performance and identify the clinical factors most associated with mortality risk.
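
The two ideas above, trees that each correct the errors of the previous ones and missing values routed along a learned default direction, can be illustrated with a minimal sketch. This is not XGBoost itself and not the project's pipeline: it is a simplified, standard-library-only gradient boosting loop over one-split trees with squared-error residuals, and it omits the regularization and second-order gradients that XGBoost uses. The feature names (`lactate`, `age`), the toy values, and the labels are all hypothetical.

```python
from collections import Counter

# Hypothetical toy cohort: columns are [lactate, age]; None marks a missing value.
FEATURES = ["lactate", "age"]
X = [[1.0, 40], [1.2, 55], [None, 60], [4.5, 70],
     [5.0, None], [0.9, 35], [4.8, 80], [None, 75]]
y = [0, 0, 0, 1, 1, 0, 1, 0]  # 1 = died in hospital (made-up labels)


def goes_left(value, threshold, missing_left):
    """Route a value; missing values follow the learned default direction."""
    return missing_left if value is None else value <= threshold


def fit_stump(X, residuals):
    """Pick the one-split tree that minimizes squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X if row[j] is not None}):
            for missing_left in (True, False):  # try both default directions
                left, right = [], []
                for row, r in zip(X, residuals):
                    side = left if goes_left(row[j], t, missing_left) else right
                    side.append(r)
                if not left or not right:
                    continue
                lm, rm = sum(left) / len(left), sum(right) / len(right)
                sse = (sum((r - lm) ** 2 for r in left)
                       + sum((r - rm) ** 2 for r in right))
                if best is None or sse < best[0]:
                    best = (sse, j, t, missing_left, lm, rm)
    return best[1:]  # (feature index, threshold, missing_left, left value, right value)


def stump_predict(stump, row):
    j, t, missing_left, lm, rm = stump
    return lm if goes_left(row[j], t, missing_left) else rm


def fit_boosted(X, y, n_rounds=20, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a damped correction."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


model = fit_boosted(X, y)
# Split-count importance: how often each feature was chosen for a split.
importance = Counter(FEATURES[s[0]] for s in model[1])
print(importance)
print(predict(model, [None, 50]), predict(model, [4.9, 65]))
```

In this toy data the rows with missing lactate all survive, so the sketch learns to send missing lactate values to the low-risk side, and the split counts play the role of a feature importance ranking. A real analysis would use the `xgboost` library, which also reports gain-based importance.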

XGBoost is well-suited for our project because it excels with structured, tabular data and can
natively handle missing values, simplifying preprocessing. It also provides feature importance rankings,
