fewer false alarms, but at the cost of missing fractures? On the other hand, would you want a model that misses virtually no fractures, but at the cost of generating many false alarms? For medical professionals, the safety net that the DETR model provides may be the better option, even if it requires more double-checking.

Table 1
Results for all three models on classification metrics for fracture precision, recall, and F1-score, as well as bounding box recall or box accuracy. Highlighted cells show the best performance for each metric.

Model    Recall   Precision   F1-Score   Accuracy
CNN      7.9%     61.1%       14.0%      N/A
R-CNN    58.0%    69.0%       63.0%      38.3%
DETR     99.0%    31.0%       47.0%      61.0%

6 Conclusion

This project explored the feasibility of using deep learning-based computer vision models to detect cervical spine fractures in CT images, a task that remains clinically challenging due to subtle fracture features, class imbalance, and variability in imaging conditions. By constructing a consistent preprocessing pipeline and evaluating three model architectures (baseline CNN, Faster R-CNN, and DETR), we demonstrated how different approaches vary in their ability to recognize fractures and localize clinically meaningful regions of interest.

The baseline CNN showed that simple classification models can learn broad anatomical patterns but lack the sensitivity required for fracture detection, missing nearly all fractured cases. Faster R-CNN provided more balanced performance, achieving moderate precision and recall along with interpretable bounding boxes, though its sensitivity remained insufficient for deployment in a clinical workflow. DETR achieved near-perfect recall and the strongest localization performance, but at the cost of a high false-positive rate and signs of overfitting. Together, these outcomes highlight the trade-offs inherent in model selection: prioritizing precision reduces false alarms but risks missed fractures, while prioritizing recall reduces clinical risk but increases diagnostic burden.

Overall, the results suggest that deep learning models hold significant promise as supportive tools for cervical fracture screening, particularly those capable of robust localization. Additional steps, such as larger and more diverse datasets, improved augmentation strategies, anchor tuning, and longer or more stable training procedures, are needed before such systems can be considered reliable for real-world medical use. Future work should emphasize clinical validation and model interpretability so that automated systems enhance, rather than replace, radiologist expertise.

ACKNOWLEDGMENTS

We thank our instructors at the University of San Diego’s Shiley-Marcos School of Engineering for their support and guidance. Additional thanks to Professor Anna Marbut, M.S., for direction throughout this capstone project.

Works Cited

Bhattacharya, P., & Nowak, P. (2025). Drone detection and tracking with YOLO and a rule-based method. arXiv. https://arxiv.org/abs/2502.05292

Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. In European Conference on Computer Vision (pp. 213–229). Springer.