Model training followed nnU-Net's standard procedure for the 3d_fullres configuration, but was truncated by the seven-week duration of the capstone module and practical compute constraints. The planner schedules up to 1000 epochs per fold; in this project, only fold 0 was used, and training was stopped after 263 epochs. Training was executed on cloud-hosted NVIDIA GPUs using a managed orchestration platform (dstack). Initial runs on A40-class GPUs validated the pipeline; the final training run analyzed in this report was carried out on an A100-class GPU to reduce epoch times and make further progress within the available time window.

The network was optimized using stochastic gradient descent with Nesterov momentum, an initial learning rate of 0.01 with polynomial decay over the planned 1000 epochs, and nnU-Net's default weight decay. The loss function was the standard nnU-Net compound loss combining Dice and Cross-Entropy, which is well suited to class-imbalanced lesion segmentation (Isensee et al., 2021). During training, nnU-Net applied its default online augmentations, including random rotations and flips, elastic deformations, intensity and gamma augmentations, and spatial cropping, to improve robustness to variation in tumor morphology and acquisition parameters. Validation was performed on the held-out portion of fold 0 defined by nnU-Net's data split, with no overlap between training and validation patches.

To maintain the reproducibility objectives of this project, no manual hyperparameter tuning or architectural modifications were introduced beyond the automatically generated 3d_fullres plan. All key configuration choices, including target spacing, patch size, batch size, network depth, optimizer, learning-rate schedule, and augmentation strategy, were taken directly from nnU-Net's planner output for Lung1 (Isensee et al., 2021). This yields a clean, standardized baseline that can serve as a reference implementation for future segmentation model comparisons on the same dataset. Subsequent work can build on this foundation by extending training to all five folds, exploring ensembling and test-time augmentation, or substituting alternative architectures, while still relying on the same DICOM-native volumetry pipeline and evaluation protocol introduced here.

5 Results

All results presented here should be interpreted in the context of substantial time and compute constraints. The capstone module spans seven weeks, and a significant portion of that period was spent iterating on environment configuration, data handling, and stable training runs. Based on the observed epoch duration for the 3D full-resolution configuration and the standard nnU-Net recommendation of training up to 1000 epochs per fold across five folds (Isensee et al., 2021), a fully configured baseline experiment using at most two GPUs, as supported by the framework in this setup, would likely require on the order of several months of wall-clock time. As a result, the experiment reported here represents a single-fold, partially trained nnU-Net model rather than a fully converged, five-fold ensemble.

Within these constraints, the trained nnU-Net v2 model (Isensee et al., 2021) was evaluated on the Lung1 dataset (Aerts et al., 2019) using fold 0 for training and validation and a separate held-out test split. Model behavior was
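The practical effect of stopping at epoch 263 of a planned 1000 can be illustrated with the polynomial learning-rate decay described above. The sketch below is an assumption-laden illustration, not code from this project: it assumes nnU-Net's usual "poly" schedule with exponent 0.9, and the function name `poly_lr` is invented here for clarity.

```python
def poly_lr(epoch: int, initial_lr: float = 0.01,
            max_epochs: int = 1000, exponent: float = 0.9) -> float:
    """Polynomial learning-rate decay, as commonly used by nnU-Net.

    Assumption: exponent 0.9 over the planned epoch budget; verify
    against the installed nnU-Net version's scheduler before relying
    on exact values.
    """
    return initial_lr * (1 - epoch / max_epochs) ** exponent

# Learning rate at the truncation point (epoch 263 of 1000).
lr_at_stop = poly_lr(263)
```

Under these assumptions, the learning rate at epoch 263 is still roughly three quarters of its initial value, which underlines that the reported model was stopped well before the end of its decay schedule and should not be read as a converged baseline.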