AAI_2025_Capstone_Chronicles_Combined
Draw, Detect, Navigate
The Faster R-CNN ResNet50 FPN v2 training process followed a structured approach,
using DataLoaders with a batch size of 6 for object detection and classification on doodle
images. For training optimization, stochastic gradient descent was used with a learning rate of
0.005, a momentum of 0.9, and a weight decay of 0.0005. Additionally, a step learning rate
scheduler was applied, which reduced the learning rate by 0.1 every 3 epochs. The model was
trained for a total of 10 epochs. During each iteration, loss values were calculated and the
optimizer updated the model parameters accordingly. The training loss was recorded for every
epoch, then logged and saved to a CSV file for future reference. This approach allowed us to
train our model efficiently and maintain a record of its performance over time.
While the model demonstrated high accuracy, attempts to integrate it into Unity failed:
the latest edition of the model support package was unable to correctly parse the model's
layers under any supported opset export version. After verifying that a YOLO model imported
successfully with the same software and package versions, efforts pivoted to YOLO.
After transitioning to YOLO, the model was trained for fifty epochs with early stopping, a
learning rate of 0.01, a batch size of sixteen, a weight decay of 0.0005, 3 warmup epochs, a
warmup bias learning rate of 0.1, a class loss weight of 0.5, and the Adam optimizer. Initial
results showed high success: a mean Average Precision at an intersection-over-union threshold
of 0.5 (mAP50) of 0.985, a mean Average Precision averaged over thresholds from 0.5 to 0.95
(mAP50-95) of 0.937, recall of 0.961, and precision of 0.966.
Performance was highest for classes with the greatest representation, in this case the start and
end pictograms of helicopter and hospital, but this was a deliberate choice for the purposes of
the application. Recall and precision remained high across all classes despite this imbalance,
and despite the presence of extremely abstract samples in the data that were difficult even for
human interpreters to identify correctly. In pursuit of higher accuracy and in exploration of
alternative configurations, YOLO small was tested, with varied hyperparameter values
for model training. Both the Adam and standard Stochastic Gradient Descent (SGD) optimizers
were implemented, as well as batch sizes ranging from 16 to 64, learning rates from 0.01 to 0.98,