First, we will add residual stages for high-confusion classes. Second, we will strengthen the data pipeline by combining automated MegaDetector labeling with manual verification and by applying class-specific augmentations to better separate challenging species. Third, we will expand deployment capabilities to support bulk image uploads, real-time video stream analysis, and a high-throughput batch inference pipeline. Finally, we will advance our training strategies, incorporating label smoothing (sketched below), hard-negative mining, semi-supervised learning, and model ensembles, to further improve accuracy and robustness. Together, these enhancements aim to push classification accuracy higher, reduce inter-class confusion, and expand the model’s utility in real-world conservation and research contexts. ScratchResNet’s strong initial performance provides a solid foundation for these next steps, positioning the system to evolve into a more powerful, scalable, and field-ready wildlife monitoring tool.

Our results also show that careful spatiotemporal dataset partitioning, sound data preprocessing, and regularization can yield models that generalize across locations and over time. Most significantly, our active learning framework demonstrates substantial potential for reducing the human annotation burden while maintaining system performance. By flagging predictions whose top-class confidence falls below 0.70 as uncertain (see the selection sketch below), WildScan systematically targets the most informative data points for human review, consistent with findings that active learning can reduce annotation effort by over 50% while preserving accuracy. Annual model update cycles strike an effective balance between stability and adaptability, avoiding the performance volatility seen with overly frequent retraining.

To ensure reliable confidence estimates in production, we implemented Top-versus-All (TvA) histogram binning calibration on the model’s softmax outputs (see the calibration sketch below). Calibration reduced the Expected Calibration Error (ECE) on out-of-distribution test sets from roughly 10% to under 7%, a substantial improvement in confidence reliability. By folding newly annotated uncertain samples into successive calibration sets, we demonstrated a practical workflow for ongoing deployment monitoring, with confidence estimates continuously refined as new data arrive.
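As a concrete illustration of one of the planned training strategies, the snippet below shows label smoothing via PyTorch’s built-in cross-entropy option. This is a generic sketch, not WildScan’s training code; the smoothing factor of 0.1 and the 20-class setup are illustrative assumptions.

```python
# Hypothetical sketch of label smoothing using PyTorch's built-in support.
# The smoothing factor (0.1) and the class count (20) are illustrative,
# not values from the WildScan training configuration.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(label_smoothing=0.1)  # softens one-hot targets

logits = torch.randn(8, 20)           # batch of 8 images, 20 species classes
targets = torch.randint(0, 20, (8,))  # ground-truth class indices
loss = criterion(logits, targets)
print(loss.item())
```

With smoothing factor ε over C classes, each one-hot target becomes (1 − ε) on the true class plus ε/C spread uniformly across all classes, which discourages the model from producing overconfident logits.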
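The uncertain-sample selection rule referenced above reduces to a simple filter over softmax outputs. The sketch below is a minimal, assumed implementation: the 0.70 threshold comes from our results, but the function name and surrounding code are illustrative.

```python
# Minimal sketch of confidence-threshold sample selection for active learning.
# `probs` is assumed to be an (N, C) array of softmax outputs; predictions whose
# top-class confidence falls below the threshold are routed to human annotators.
import numpy as np

def select_uncertain(probs: np.ndarray, threshold: float = 0.70) -> np.ndarray:
    """Return indices of samples whose top-class confidence is below `threshold`."""
    top_confidence = probs.max(axis=1)  # highest softmax score per sample
    return np.flatnonzero(top_confidence < threshold)

# Example: three predictions over four classes; only the middle one is confident.
probs = np.array([
    [0.40, 0.30, 0.20, 0.10],   # max 0.40 -> uncertain, send for review
    [0.92, 0.04, 0.02, 0.02],   # max 0.92 -> confident, auto-accept
    [0.55, 0.25, 0.15, 0.05],   # max 0.55 -> uncertain, send for review
])
print(select_uncertain(probs))  # [0 2]
```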
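Finally, a minimal sketch of the calibration step, assuming a TvA-style reduction in which only the top-label confidence is calibrated against a binary correct/incorrect signal via histogram binning, with ECE computed over equal-width bins. The function names, the 15-bin choice, and the empty-bin fallback are assumptions, not details taken from our pipeline.

```python
# Assumed sketch of top-label histogram binning and ECE, in the spirit of the
# Top-versus-All (TvA) calibration described above.
import numpy as np

def fit_histogram_binning(conf_cal, correct_cal, n_bins=15):
    """Fit binning on a calibration split: each bin maps to the empirical
    accuracy of top-label predictions whose confidence fell in that bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf_cal, edges) - 1, 0, n_bins - 1)
    bin_acc = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            bin_acc[b] = correct_cal[in_bin].mean()
    return edges, bin_acc

def apply_histogram_binning(conf, edges, bin_acc):
    """Replace each raw confidence with its bin's empirical accuracy."""
    bin_ids = np.clip(np.digitize(conf, edges) - 1, 0, len(bin_acc) - 1)
    calibrated = bin_acc[bin_ids]
    # Fall back to the raw confidence where the calibration bin was empty.
    return np.where(np.isnan(calibrated), conf, calibrated)

def expected_calibration_error(conf, correct, n_bins=15):
    """Standard ECE: bin-weighted mean gap between accuracy and confidence."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf, edges) - 1, 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return ece
```

In use, the bins would be fit on a held-out calibration split (top-label confidences paired with correctness indicators) and then applied to test-time confidences before recomputing ECE, which is how the uncalibrated and post-calibration figures reported above would be compared.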