AAI_2025_Capstone_Chronicles_Combined


smooth decision boundaries, RandomResizedCrop to introduce realistic scale and aspect-ratio variability, horizontal flips, color jitter, and rotation to mimic natural appearance changes, and RandomErasing (p = 0.3, scale = 0.02–0.2) to simulate the partial occlusions and missing visual features commonly encountered in wildlife imagery. The OneCycleLR learning rate policy (cosine annealing, pct_start = 0.3, 100 total epochs) was employed to speed convergence while helping the optimizer avoid poor local minima. Early stopping (patience = 10, min_delta = 0.001) guarded against overfitting, and per-epoch checkpointing supported reproducibility and recovery in case of interruptions. These measures, combined with carefully tuned regularization, provided a robust training loop capable of extracting meaningful features from scratch.

In addition, the architecture and pipeline were intentionally designed to be modular, allowing future experiments with attention modules, feature pyramid integrations, or specialized classification heads without rewriting the entire training loop.

For the other models, training used a batch size of 32, a learning rate of 0.0001, and the CrossEntropyLoss function. Custom loss functions that incorporated MixUp for regularization were also tested but did not significantly outperform standard loss functions in the case of ScratchResNet.

The final ScratchResNet model achieved approximately 74% validation accuracy, a notable result for a network trained entirely from scratch without reliance on pre-trained feature extractors. This outcome demonstrates the model’s ability to learn rich, discriminative features directly from the wildlife dataset, despite the considerable challenges posed by high intra-class variability (such as differences in pose, lighting, and occlusion) and strong inter-class similarities across the 20 categories.
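To make the one-cycle learning rate policy mentioned earlier more concrete, the following is a minimal plain-Python sketch of a two-phase cosine schedule. Only pct_start = 0.3 and the 100-epoch horizon come from the text; the peak rate `max_lr`, the divisors `div_factor` and `final_div_factor`, and the choice to step once per epoch are assumptions made for illustration, since the text does not report them. It is not the library implementation itself.

```python
import math

def cosine_interp(start, end, frac):
    """Cosine interpolation from start (frac=0) to end (frac=1)."""
    return end + (start - end) * (1.0 + math.cos(math.pi * frac)) / 2.0

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at a 0-based step for a two-phase one-cycle policy."""
    initial_lr = max_lr / div_factor          # assumed starting rate
    min_lr = initial_lr / final_div_factor    # assumed final rate
    warmup_end = pct_start * total_steps - 1  # last step of the warm-up phase
    last_step = total_steps - 1
    if step <= warmup_end:
        # Phase 1: rise from initial_lr to max_lr
        return cosine_interp(initial_lr, max_lr, step / warmup_end)
    # Phase 2: anneal from max_lr down to min_lr
    frac = (step - warmup_end) / (last_step - warmup_end)
    return cosine_interp(max_lr, min_lr, frac)

# max_lr=1e-3 is a hypothetical value, not one reported in the text.
schedule = [one_cycle_lr(s, total_steps=100, max_lr=1e-3) for s in range(100)]
```

Under these assumptions the rate climbs for the first 30% of training, peaks once, and then decays smoothly for the remaining 70%.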
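The early-stopping rule (patience = 10, min_delta = 0.001) can be sketched as a small helper. The class name `EarlyStopping`, the choice of validation loss as the monitored quantity, and the loss curve below are all hypothetical; the text states only the two thresholds.

```python
class EarlyStopping:
    """Signal a stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            # Improvement larger than min_delta: reset the counter
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical loss curve: five improving epochs, then a plateau.
losses = [1.0, 0.8, 0.7, 0.65, 0.64] + [0.6395] * 20
stopper = EarlyStopping(patience=10, min_delta=0.001)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this sketch the plateau improves on the best loss by less than min_delta, so it counts as no improvement, and the loop stops after ten such epochs.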
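The MixUp-based loss mentioned above is not specified in detail, so the following is only a sketch of the standard MixUp formulation in plain Python: inputs are blended with a weight `lam`, and the loss is the same weighted blend of the cross-entropy against each original label. The Beta parameter `alpha = 0.2`, the toy inputs, and the logits are assumptions for illustration, not values from the text.

```python
import math
import random

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def mixup_inputs(x_a, x_b, lam):
    """Convex combination of two flattened inputs."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]

def mixup_loss(logits, y_a, y_b, lam):
    """Blend the losses against both labels with the input mixing weight."""
    return (lam * cross_entropy(logits, y_a)
            + (1.0 - lam) * cross_entropy(logits, y_b))

rng = random.Random(0)
alpha = 0.2                          # assumed Beta(alpha, alpha) parameter
lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
mixed = mixup_inputs([0.0, 1.0], [1.0, 0.0], lam)
logits = [2.0, 0.5, -1.0]            # hypothetical model output for `mixed`
loss = mixup_loss(logits, y_a=0, y_b=1, lam=lam)
```

With `lam = 1` the expression reduces to ordinary cross-entropy against the first label, which is one way to check that a custom loss of this kind stays comparable to the standard one.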
In the context of a complex, real-world dataset, the roughly 74% validation accuracy establishes a competitive baseline and reinforces the viability of a fully custom architecture in scenarios where pre-training is not possible or desired.

Results
