AAI_2025_Capstone_Chronicles_Combined


5.2 Faster R-CNN

Our Faster R-CNN model underwent an initial training phase of 10 epochs, followed by additional fine-tuning to improve localization performance. Early stopping with a patience of 5 epochs was used during fine-tuning, which allowed the model to continue training beyond the initial plateau and refine its bounding-box predictions. This approach aligns with established practices for optimizing two-stage detectors on small, subtle medical targets (Paik et al., 2024; Zhao et al., 2022).

Following this extended training procedure, the model achieved a precision of 0.370 and a recall of 0.383 at IoU ≥ 0.5. These values indicate that the detector identified a meaningful subset of fracture instances: roughly 38% of annotated fractures were recovered, and roughly 37% of predicted boxes matched an annotated fracture. This level of sensitivity is consistent with the early performance of Faster R-CNN in medical imaging applications where fracture regions are small and exhibit limited contrast relative to surrounding anatomical structures (Yao et al., 2021; Le et al., 2020).

Visual examination of the predicted bounding boxes further suggested that the model is learning clinically relevant spatial patterns. The detector frequently placed bounding boxes near vertebral margins and cortical irregularities associated with fractures, which indicates that the Region Proposal Network and detection head are extracting meaningful features from the CT slices. Although the bounding boxes remain imperfect, the overall behavior is consistent with the expected early-stage performance of two-stage detectors applied to fine-grained medical abnormalities. Continued optimization, particularly through anchor-scale tuning and longer training schedules, has been shown in prior work to substantially improve localization accuracy on similar tasks (Paik et al., 2024).

5.3 DETR

The DETR model was able to learn the fracture detection task despite the relatively small dataset. During initial training, the model learned slowly and tended toward degenerate solutions, predicting either no fractures at all or a fracture for every input, the latter of which maximizes recall. Through hyperparameter tuning, the model converged to a reasonably stable state in which it made sound predictions. The model also showed signs of overfitting. This was mitigated by applying data augmentation (random brightness, contrast, and blur) and by optimizing for F1-score instead of recall. Combined with early stopping, these changes helped improve the model's generalization to new images.
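The precision and recall at IoU ≥ 0.5 reported for Faster R-CNN, and the F1-score and early stopping used for DETR model selection, can be illustrated with a minimal sketch. This is not the evaluation code used in this project: the greedy one-to-one matching rule, the function names `iou` and `detection_prf`, and the `EarlyStopper` class are assumptions made for illustration, and the default patience of 5 is borrowed from the Faster R-CNN fine-tuning setup described above.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_prf(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one image (hypothetical helper).

    Predictions are visited in descending score order; each is matched to
    the unmatched ground-truth box with the highest IoU, and counts as a
    true positive only if that IoU reaches the threshold.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching fracture
    fn = len(truths) - tp  # fractures the detector missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


class EarlyStopper:
    """Stop when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score: float) -> bool:
        """Record one epoch's validation score; return True to stop training."""
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy example: two of three predictions overlap a ground-truth box.
preds = [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.8), ((100, 100, 110, 110), 0.7)]
truths = [(0, 0, 10, 10), (21, 21, 31, 31), (50, 50, 60, 60)]
precision, recall, f1 = detection_prf(preds, truths)
```

For a dataset-level figure, the true-positive, false-positive, and false-negative counts would be accumulated across all images before computing the ratios. Monitoring F1 rather than recall in the stopper penalizes the "predict a fracture everywhere" behavior, since such a model keeps recall high while precision collapses.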

Figure 5. DETR model training and validation loss until epoch 34.

However, as seen in Figure 5, the flat trend in the validation loss still indicates overfitting. The training loss decreased consistently, indicating that the model continued to fit the training data. Although the validation loss fluctuated, validation metrics such as fracture recall, precision, and detection F1-score showed a clear upward trend, as seen in Figure 6. We only
