AAI_2025_Capstone_Chronicles_Combined
EfficientNet Model Training Methodology
The EfficientNet-based pipeline used the EfficientNetB0 architecture with pre-trained ImageNet weights as the feature extractor. To adapt it for multi-label classification, we removed the original classification head and added a global average pooling layer, a 128-unit dense layer with ReLU activation, a dropout layer (rate = 0.3), and a final sigmoid-activated output layer with seven neurons, one for each consolidated pathology category. This design allowed the model to predict the presence of multiple conditions independently within the same chest radiograph. (Fig A1 shows the full EfficientNet architecture with the added layers.)

All input images were resized to 1024×1024 to balance resolution fidelity with memory constraints. Although the original dataset contained images of variable size (up to 3000×3000 pixels), downsizing reduced the memory footprint and accelerated training while preserving key clinical features. Images were normalized and augmented with horizontal flips, contrast shifts, and brightness jitter to improve generalizability. The data was split into training (72%), validation (18%), and testing (10%) subsets using stratified sampling, and labels were encoded as binary vectors.

The model was trained using a phased approach over five total stages:

● Phase 1 (1 epoch): With the EfficientNet base frozen, we trained the classification head using binary cross-entropy loss with label smoothing (0.05) to keep the model from settling into a local minimum where it predicted all-zero outputs. Without this warm-up phase, the model initially predicted all zeros, so it was essential to our training procedure.
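The head described above (global average pooling, 128-unit ReLU dense layer, 0.3 dropout, seven sigmoid outputs) can be sketched in Keras as follows. This is a minimal illustration, not the authors' exact code; the optimizer choice and the AUC metric are assumptions, and `weights=None` is used here only to avoid a weight download (the actual pipeline loads `weights="imagenet"`).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=7, input_size=1024, weights=None):
    # The real pipeline uses weights="imagenet"; None skips the download here.
    base = tf.keras.applications.EfficientNetB0(
        include_top=False,
        weights=weights,
        input_shape=(input_size, input_size, 3),
    )
    base.trainable = False  # Phase 1: backbone frozen, only the head trains

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        # One independent sigmoid per consolidated pathology category
        layers.Dense(num_classes, activation="sigmoid"),
    ])

    model.compile(
        optimizer="adam",  # assumed; the report does not name the optimizer
        loss=tf.keras.losses.BinaryCrossentropy(label_smoothing=0.05),
        metrics=[tf.keras.metrics.AUC(multi_label=True, num_labels=num_classes)],
    )
    return model
```

Because each output neuron is an independent sigmoid rather than a shared softmax, the model can flag several pathologies in one radiograph, which is what the multi-label framing requires.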
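The effect of label smoothing on the all-zero collapse can be illustrated framework-independently. With smoothing of 0.05, hard targets of 0 and 1 become 0.025 and 0.975 (the standard Keras-style transform `y*(1-s) + s/2`), so saturated near-zero predictions are penalized relative to mildly uncertain ones. This is an illustrative sketch, not the report's code:

```python
import numpy as np

def smoothed_bce(y_true, y_pred, smoothing=0.05, eps=1e-7):
    """Binary cross-entropy with label smoothing over a label vector."""
    # Soften hard 0/1 targets: 0 -> smoothing/2, 1 -> 1 - smoothing/2
    y = y_true * (1.0 - smoothing) + 0.5 * smoothing
    p = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))

# For an all-negative 7-label target, a saturated all-zero prediction
# now costs more than predicting the softened target itself:
collapse = smoothed_bce(np.zeros(7), np.full(7, 1e-7))
softened = smoothed_bce(np.zeros(7), np.full(7, 0.025))
```

Here `collapse > softened`: the smoothed targets keep a small gradient pushing probabilities away from exact zero, which matches the observation that the warm-up phase stopped the head from predicting all zeros.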