AAI_2025_Capstone_Chronicles_Combined
For this project, we are using a subset of the Caltech Camera Traps (CCT20) dataset, which contains over 250,000 images of wildlife spanning more than 140 species (Tabak et al., 2019). This subset focuses on ~20 species with bounding box annotations that enable object detection. In a live deployment, new image data would be captured by remote trail cameras, uploaded to a cloud-based storage solution (e.g., AWS S3), and processed automatically through the WildScan pipeline (Le Coz et al., 2024).

The figure below illustrates the proposed deployment architecture for the WildScan system. It represents a semi-automated pipeline designed to handle incoming, unlabeled wildlife image data through a continuous loop of detection, classification, monitoring, and periodic model updates. Because newly collected datasets in deployment lack ground-truth labels, direct accuracy measurement is not immediately possible. Instead, the system continuously tracks unsupervised performance metrics, such as prediction confidence and distribution shifts, as new data become available. If these metrics indicate potential data drift or a significant risk of degraded performance, the system triggers a targeted annotation task to obtain labels for a representative subset of the new data. These annotated samples are then used to retrain and update the model, which is evaluated and registered before re-deployment, completing the feedback loop. This approach is intended to keep WildScan adaptable, accurate, and scalable, even under dynamic and evolving field conditions.
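The drift-monitoring trigger described above could be sketched as follows. This is a minimal illustration, not the actual WildScan implementation: the function names (`population_stability_index`, `should_trigger_annotation`) and the specific thresholds are hypothetical, and the population stability index (PSI) is one common choice for detecting a shift in the model's confidence distribution between a labeled baseline and newly collected data.

```python
import numpy as np

def population_stability_index(baseline, current, bins=10):
    """PSI between a baseline and a current distribution of confidence scores.

    Bin edges are derived from the baseline so both windows are compared
    on the same grid; values of ~0.2+ are conventionally treated as drift.
    """
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    curr_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Clip to avoid log(0) when a bin is empty in one window.
    base_pct = np.clip(base_pct, 1e-6, None)
    curr_pct = np.clip(curr_pct, 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

def should_trigger_annotation(baseline_conf, new_conf,
                              psi_threshold=0.2, max_conf_drop=0.10):
    """Flag the new batch for targeted annotation when either the
    confidence distribution shifts (PSI) or mean confidence drops."""
    psi = population_stability_index(baseline_conf, new_conf)
    mean_drop = float(np.mean(baseline_conf) - np.mean(new_conf))
    return psi > psi_threshold or mean_drop > max_conf_drop
```

In the full pipeline, a positive trigger would enqueue a representative sample of the new images for human labeling, after which retraining, evaluation, and model registration proceed as described.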