AAI_2025_Capstone_Chronicles_Combined

Transformer-Based Modeling

As part of our broader effort to predict ICU mortality from multivariate time-series data, we also explored Transformer-based architectures for their ability to capture long-range dependencies and complex temporal interactions, both key to modeling physiological signals in clinical settings. We evaluated two models: PatchTST (Nie et al., 2023), a lightweight Transformer designed for time-series classification, and TimesFM (Das et al., 2023), a large-scale pretrained Transformer optimized for temporal reasoning and scalable deployment.

PatchTST Implementation and Optimization

PatchTST was implemented using the Hugging Face Transformers library (PatchTST, 2025). We explored three configurations: a baseline model with default parameters, a custom-configured model with increased capacity, and a hyperparameter-tuned model optimized via Optuna. The custom model used four hidden layers, eight attention heads, a hidden size of 64, a feed-forward dimension of 128, layer normalization, and multiple dropout mechanisms to reduce overfitting. The tuned model explored a broader search space, including hidden size, number of layers, attention heads, dropout rates, patch length, stride, and normalization type. All models dynamically calculated the number of input channels and the context length from the number of features (21), the sequence length (48), and the patch configuration. A classification token was used, and the model head was configured for binary classification.
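As a rough sketch, the configuration described above could be expressed with the Transformers PatchTST classes along the following lines. The patch length and stride values here are illustrative placeholders rather than the tuned values from our experiments, and the argument names assume a recent transformers release:

```python
# Sketch (not the exact experimental code): deriving the patch count from the
# sequence/patch configuration, then building a custom PatchTST classifier.

def num_patches(context_length: int, patch_length: int, stride: int) -> int:
    """Number of patches the Transformer sees per input channel."""
    return (context_length - patch_length) // stride + 1

SEQ_LEN, N_FEATURES = 48, 21        # sequence length and feature count from the report
PATCH_LEN, STRIDE = 8, 4            # illustrative values; the tuned model searches over these

print(num_patches(SEQ_LEN, PATCH_LEN, STRIDE))  # patches per channel

try:
    from transformers import PatchTSTConfig, PatchTSTForClassification

    config = PatchTSTConfig(
        num_input_channels=N_FEATURES,  # one channel per clinical feature
        context_length=SEQ_LEN,
        patch_length=PATCH_LEN,
        patch_stride=STRIDE,
        d_model=64,                     # hidden size 64, as in the custom model
        num_hidden_layers=4,            # four hidden layers
        num_attention_heads=8,          # eight attention heads
        ffn_dim=128,                    # feed-forward dimension 128
        norm_type="layernorm",
        dropout=0.1,                    # illustrative dropout rate
        use_cls_token=True,             # classification token, as described above
        num_targets=2,                  # binary classification
    )
    model = PatchTSTForClassification(config)
except ImportError:
    pass  # transformers not installed; the patch arithmetic above still applies
```

The same `num_patches` arithmetic is what makes the context length and channel count "dynamic": changing the patch length or stride changes the effective token count seen by the encoder without touching the raw input shape.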

Training followed a stratified 60/20/20 split (training/validation/testing) to preserve class balance across subsets. Models were trained with binary cross-entropy loss and the AdamW optimizer, with learning rates and weight decay adjusted per configuration. Batch sizes ranged from 32 to 128. Training ran for up to 50 epochs for the baseline and custom models and up to 150 epochs for the tuned model, with early stopping triggered when validation AUROC failed to improve for 10 consecutive epochs. Class imbalance was addressed using
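The early-stopping rule can be sketched as a simple patience counter over per-epoch validation AUROC. The AUROC values below are placeholders for illustration, not results from our experiments:

```python
# Minimal sketch of early stopping on validation AUROC: training halts once the
# metric has failed to improve for `patience` consecutive epochs.

def train_with_early_stopping(val_aurocs, patience=10):
    """Return (best_auroc, best_epoch) given a sequence of per-epoch AUROCs."""
    best, best_epoch, stale = float("-inf"), -1, 0
    for epoch, auroc in enumerate(val_aurocs):
        if auroc > best:
            best, best_epoch, stale = auroc, epoch, 0  # improvement resets patience
        else:
            stale += 1
            if stale >= patience:
                break  # stop: no improvement for `patience` epochs
    return best, best_epoch

# Placeholder curve: AUROC peaks at epoch 2, then plateaus.
scores = [0.70, 0.74, 0.76, 0.76, 0.75] + [0.75] * 12
best, at = train_with_early_stopping(scores, patience=10)
print(best, at)  # -> 0.76 2
```

In the real training loop the same counter wraps each epoch's validation pass, and the model checkpoint from `best_epoch` is the one carried forward to testing.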
