AAI_2025_Capstone_Chronicles_Combined
Draw, Detect, Navigate
Data Summary
To train models for prediction on human-drawn symbolic representations, the Google
Quick, Draw! Doodle dataset was used (Jongejan et al.). The dataset consisted of human-drawn
representations of a word prompt, each drawn within a 20-second time limit to force
simplicity, with 3,000 samples per class. Only samples that were correctly identified
were included, and each was closely cropped to the area of the drawing. Each labeled grayscale image
.png file was accompanied by the country code of the user who generated the image, a
vectorized version of the image preserving stroke data, and a unique identifier. The dataset was
created through self-selected user participation in the Quick, Draw! game, where users
volunteered their drawings as training data (Jongejan et al.). The dataset contained no missing
values except for country codes. The dataset source does not disclose why some country codes
are missing, but because the country is detected automatically from the user’s internet
protocol (IP) address, several technical causes are plausible, such as virtual private
network (VPN) usage or IP detection failures. In the chosen modeling approach, the country
code was discarded as irrelevant data.
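The selection steps described above (keeping only correctly identified samples, capping the count per class, and discarding the country code) can be sketched as follows. This is a minimal illustration and not the project's actual pipeline: the helper name `select_samples` is hypothetical, and the field names `word`, `recognized`, and `countrycode` are assumed to follow the publicly released newline-delimited JSON form of the dataset, which may differ from the files used in this project.

```python
import json
from collections import defaultdict


def select_samples(ndjson_lines, classes, per_class=3000):
    """Keep recognized drawings for the chosen classes, without country codes.

    ndjson_lines: iterable of JSON strings, one drawing record per line
                  (field names are assumptions, see the text above).
    classes:      set of class names to keep.
    per_class:    maximum number of samples retained per class.
    """
    kept = defaultdict(list)
    for line in ndjson_lines:
        record = json.loads(line)
        word = record["word"]
        # Skip classes outside the chosen subset and unrecognized drawings.
        if word not in classes or not record.get("recognized", False):
            continue
        # Enforce the per-class cap.
        if len(kept[word]) >= per_class:
            continue
        # The country code is not used for modeling, so drop it if present.
        record.pop("countrycode", None)
        kept[word].append(record)
    return dict(kept)
```

Because the country code is removed regardless of whether it is populated, the missing values in that field need no separate handling.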
The 345 image classes in the dataset cover a wide assortment of subjects familiar
to a general audience, ranging from “bat” to “The Eiffel Tower” to “diving board”. Images that
were not already square were stretched to a square, with a uniform 36 pixels of white
padding on each side. To manage computational resources and scope,
this project used twelve of the 345 categories in the Quick, Draw! dataset. The number
of classes needed in a business context would likely depend heavily on the
domain and usage. Data augmentation was performed to pad images further for
bounding box prediction, providing the models with bounding boxes that shifted from image to
image, but this proved insufficient to help every model learn accurate box
predictions.
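One way to picture the padding augmentation is to place each image at a random position on a larger white canvas and record where the drawing ends up. The sketch below is only an illustration under stated assumptions: the function names, the canvas size, the white background value of 255, and the (x_min, y_min, x_max, y_max) box convention are all hypothetical, and images are represented as plain nested lists to keep the example dependency-free.

```python
import random


def pad_with_random_offset(image, canvas_size, rng, background=255):
    """Place `image` at a random position on a square white canvas.

    image: list of rows of grayscale values (assumed representation).
    Returns the padded canvas and the (top, left) offset that was used.
    """
    h, w = len(image), len(image[0])
    if h > canvas_size or w > canvas_size:
        raise ValueError("canvas must be at least as large as the image")
    # random.Random.randint is inclusive on both ends.
    top = rng.randint(0, canvas_size - h)
    left = rng.randint(0, canvas_size - w)
    canvas = [[background] * canvas_size for _ in range(canvas_size)]
    for r, row in enumerate(image):
        canvas[top + r][left:left + w] = row
    return canvas, (top, left)


def drawing_bbox(canvas, background=255):
    """Return (x_min, y_min, x_max, y_max) of non-background pixels, or None."""
    coords = [
        (x, y)
        for y, row in enumerate(canvas)
        for x, value in enumerate(row)
        if value != background
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)
```

Calling `pad_with_random_offset` with a seeded `random.Random` instance makes the offsets reproducible, and `drawing_bbox` then yields a box that shifts from image to image along with the drawing.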