AAI 2025 Capstone Chronicles


Draw, Detect, Navigate

The Faster R-CNN ResNet50 FPN v2 training process followed a structured approach, using DataLoaders with a batch size of 6 for object detection and classification on doodle images. For training optimization, stochastic gradient descent was used with a learning rate of 0.005, a momentum of 0.9, and a weight decay of 0.0005. Additionally, a step learning rate scheduler was applied, which reduced the learning rate by a factor of 0.1 every 3 epochs. The model was trained for a total of 10 epochs. During each iteration, loss values were calculated and the optimizer updated model parameters accordingly; the training loss was recorded for every epoch and saved to a CSV file for future reference. This approach allowed us to train our model efficiently and maintain a record of its performance over time.

While the model demonstrated high accuracy, when integration into Unity was attempted, the latest version of the model support package was unable to correctly parse the model layers under any of the supported ONNX opset export versions. Efforts therefore pivoted to YOLO after verifying that a YOLO model imported successfully with the same software and package versions.
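The export attempts above might look like the following sketch, which loops over candidate ONNX opset versions and records which exports succeed. The tiny convolutional network stands in for the Faster R-CNN model, the opset list is illustrative rather than exhaustive, and the Unity-side import step is not exercised here.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Placeholder model standing in for the trained detector."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


model = TinyNet().eval()
dummy = torch.rand(1, 3, 64, 64)  # example input used to trace the model

results = {}
for opset in (11, 13, 17):  # candidate opset versions, not an exhaustive list
    try:
        torch.onnx.export(model, dummy, f"tinynet_opset{opset}.onnx",
                          opset_version=opset)
        results[opset] = "exported"
    except Exception as exc:  # an opset the exporter rejects lands here
        results[opset] = f"failed: {exc}"
```

Each exported file would then be imported into Unity's inference package for verification; in our case, no opset produced a Faster R-CNN graph the package could parse.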

After transitioning to YOLO, the model was trained for fifty epochs with early stopping, a learning rate of 0.01, a batch size of sixteen, a weight decay of 0.0005, 3 warmup epochs, a warmup bias learning rate of 0.1, a class loss weight of 0.5, and the Adam optimizer. Initial results showed high success: a mean Average Precision of 0.985 at an intersection-over-union (IoU) threshold of 0.5 (mAP50), 0.937 averaged across IoU thresholds from 0.5 to 0.95 (mAP50-95), a recall of 0.961, and a precision of 0.966.
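Such a run can be configured roughly as follows. The Ultralytics API is assumed here, since the report does not name the YOLO implementation; the dataset config path, the model scale, and the early-stopping patience value are hypothetical, while the remaining hyperparameters mirror those reported.

```python
# Hyperparameters as reported, keyed by their Ultralytics argument names.
REPORTED_HYPERPARAMS = {
    "epochs": 50,
    "lr0": 0.01,            # initial learning rate
    "batch": 16,
    "weight_decay": 0.0005,
    "warmup_epochs": 3,
    "warmup_bias_lr": 0.1,
    "cls": 0.5,             # classification loss weight
    "optimizer": "Adam",
}


def train_doodle_detector(data_cfg: str = "doodles.yaml"):
    """Launch training with the reported settings.

    The dataset YAML and model scale are placeholders; `patience` enables
    early stopping, and its exact value is an assumption (not reported).
    """
    from ultralytics import YOLO
    model = YOLO("yolov8n.yaml")
    return model.train(data=data_cfg, patience=10, **REPORTED_HYPERPARAMS)
```

Ultralytics reports mAP50, mAP50-95, precision, and recall per class after validation, which is where the figures quoted above would come from.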

Performance was highest for the classes with the greatest representation, in this case the start and end pictograms of helicopter and hospital, but this was a deliberate choice for the purposes of the application. Recall and precision remained high across all classes despite this imbalance, and despite the presence of extremely abstract samples in the data that even human reviewers found challenging to identify correctly. In pursuit of higher accuracy and in exploration of alternative configurations, the YOLO small variant was tested, along with varied hyperparameter values for model training. Both the Adam and standard stochastic gradient descent (SGD) optimizers were implemented, as well as batch sizes ranging from 16 to 64, learning rates from 0.01 to 0.98,
