AAI_2025_Capstone_Chronicles_Combined

Draw, Detect, Navigate

momentum values from 0.8 to 0.937, weight decay from 0.0004 to 0.007, and classification loss weight from 0.243 to 0.873 in a randomized search of the hyperparameter space. A target of 50 epochs was set, with early stopping at a patience of 7 epochs.
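A minimal sketch of a randomized search with early stopping, under stated assumptions, may help make this procedure concrete. Only the three ranges quoted in this passage are included; uniform sampling, the names `sample_config` and `run_trial`, and the `evaluate_epoch` callback are hypothetical and are not taken from the project's code.

```python
import random

# Ranges quoted in the text. Other searched hyperparameters are left out
# because their ranges are not given in this part of the report.
SEARCH_SPACE = {
    "momentum": (0.8, 0.937),
    "weight_decay": (0.0004, 0.007),
    "cls_loss_weight": (0.243, 0.873),
}


def sample_config(rng):
    """Draw one configuration; uniform sampling is an assumption."""
    return {name: rng.uniform(low, high) for name, (low, high) in SEARCH_SPACE.items()}


def run_trial(config, evaluate_epoch, max_epochs=50, patience=7):
    """Train for up to max_epochs, stopping after `patience` epochs without improvement.

    evaluate_epoch(config, epoch) is a hypothetical callback that returns the
    validation score for that epoch (higher is better).
    """
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(1, max_epochs + 1):
        score = evaluate_epoch(config, epoch)
        if score > best_score:
            # A real training loop would save a checkpoint here.
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_score, best_epoch, epoch
```

With a score that peaks at epoch 10, this loop would stop at epoch 17, after seven epochs without improvement, and the epoch-10 result would be the one kept.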

Results

Best results were saved automatically before training stopped, unless there was still an indication of learning after the maximum number of epochs. The best results came from training on the set of 10,000 synthetically generated images with a batch size of 64, learning rate of 0.0759, weight decay of 0.0071, and classification loss weight of 0.8733, with the best results obtained after ten epochs. This final version of the model achieved an inference speed of 0.06 milliseconds, recall of 0.963, precision of 0.967, mAP50-95 of 0.947, and mAP50 of 0.987. Perfect recall was seen on helicopters and skulls, and the lowest recall was 0.84 on vans. It is worth noting that while the dataset was generally of high quality, neither the drawing providers nor the project researchers had sufficient resources to screen all drawings. A few nearly blank drawings, as well as drawings created in bad faith for the context, were observed and likely had at least a minute impact on the end results.
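The precision and recall reported for the final model can be combined into a single F1 score, the harmonic mean of the two. The value below is derived here from the reported figures for illustration and is not a number stated by the authors; the function name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the final model.
f1 = f1_score(precision=0.967, recall=0.963)
print(round(f1, 3))  # 0.965
```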

Our project's nano model was trained first, as it would be the fastest to train, and it also provided a very good benchmark. The initial test dataset was made specifically for this model and consisted of synthetic images that were hand labeled by our group. Considering that only 400 images were labeled by hand, the model achieved results that encouraged further improvement of our synthetically trained models. We argue that this was a pivotal moment in our project, as it showed that the models could continue to improve as long as the data and images were correctly correlated. Our methodology for this model used 400 images, each rotated by 90 degrees three times, giving a total of 4 images per synthetically developed image. The only reconsideration our group had for this model was rotating the images more than 3 times, which could be an area for further exploration.
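The rotation scheme described above can be sketched as follows. This is an illustrative example rather than the project's actual pipeline: it assumes images are NumPy arrays and that labels are bounding boxes in normalized (center x, center y, width, height) form, which the text does not specify. The function names `rotate_sample` and `augment_with_rotations` are hypothetical.

```python
import numpy as np


def rotate_sample(image, boxes, k):
    """Rotate an image counterclockwise by k * 90 degrees, with its boxes.

    boxes holds rows of normalized (cx, cy, w, h); this label format is an
    assumption, not something stated in the report.
    """
    rotated = np.rot90(image, k)
    out = np.asarray(boxes, dtype=float).reshape(-1, 4)
    for _ in range(k % 4):
        cx, cy, w, h = out.T
        # One counterclockwise quarter turn maps (x, y) to (y, 1 - x)
        # and swaps width with height.
        out = np.stack([cy, 1.0 - cx, h, w], axis=1)
    return rotated, out


def augment_with_rotations(image, boxes):
    """Return the original sample plus its three 90-degree rotations."""
    return [rotate_sample(image, boxes, k) for k in range(4)]


image = np.zeros((3, 4), dtype=np.uint8)
samples = augment_with_rotations(image, [[0.25, 0.5, 0.2, 0.4]])
print(len(samples))        # 4 images per source image
print(400 * len(samples))  # 1600 images from 400 hand-labeled ones
```

Because a fourth quarter turn returns the original image, adding more variants would require angles other than multiples of 90 degrees, which would also mean recomputing the boxes rather than only swapping coordinates.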
