M.S. AAI Capstone Chronicles 2024

An alternative pretrained model for object detection and image segmentation is the You Only

Look Once (YOLO) model, pretrained using the Common Objects in Context (COCO) dataset (Jocher et

al., 2023). The dataset consists of 80 object categories, including people, cars, animals, and sports

equipment (Jocher & Laughing, 2023). The latest version of this model is YOLOv9, which was released in

February 2024 with the intention of outperforming convolutional-based and transformer-based models

in accuracy and speed. Similar to the ViT model, transfer learning allows the YOLO model to be applied

to the SAA task with fewer computational resources and higher performance than a

traditional CNN model.
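
One common way transfer learning reduces training cost is to freeze the pretrained feature extractor and train only a small task-specific head. The sketch below illustrates that idea in PyTorch under stated assumptions: the `backbone` is a tiny hypothetical stand-in rather than an actual YOLO or ViT network, `NUM_CLASSES` is an assumed value, and the input batch is random data. It is not the configuration used in this study.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained feature extractor. In practice this
# would be a backbone loaded with pretrained weights.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone so its weights receive no gradient updates.
for param in backbone.parameters():
    param.requires_grad = False

NUM_CLASSES = 2  # assumed number of target classes, for illustration only
head = nn.Linear(8, NUM_CLASSES)  # new task-specific layer, trained from scratch
model = nn.Sequential(backbone, head)

# Only the head's parameters remain trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} of {total}")

# One training step on a random batch; the optimizer sees only the head.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)
images = torch.randn(4, 3, 32, 32)
labels = torch.tensor([0, 1, 0, 1])
loss = nn.functional.cross_entropy(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because gradients are computed and stored only for the head, each training step touches far fewer parameters than training the full network would.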

The performance of the YOLOv5 model on a dataset generated by a UAV is studied in the paper

Innovation in Livestock Surveillance: Applying the YOLO Algorithm to UAV Imagery and Videography

(Kurniadi et al., 2023). The dataset used to train the model consisted of 3131 images containing cows

and 836 not containing cows. For this classification task, the model achieved the best results when the

UAV was five meters away from the object and not in motion, with an accuracy of 75%. The poorest

performance occurred when the UAV was ten meters away from the object and in motion, with an

accuracy of 0. For the purposes of this study, the restrictions on the UAV distance and motion were not

limiting factors; however, for other applications these restrictions should be taken into consideration.
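
The image counts reported above imply an imbalanced training set, which is worth keeping in mind when reading accuracy figures. The short calculation below uses only the two counts quoted from Kurniadi et al. (2023); the variable names are illustrative.

```python
# Image counts reported for the training set in Kurniadi et al. (2023).
with_cows = 3131
without_cows = 836

total = with_cows + without_cows
positive_share = with_cows / total      # fraction of images containing cows
class_ratio = with_cows / without_cows  # positive images per negative image

print(f"total images: {total}")
print(f"share containing cows: {positive_share:.1%}")
print(f"class ratio (cows : no cows): {class_ratio:.2f} : 1")
```

Roughly four of every five training images contain cows, so metrics that account for class balance (for example, per-class precision and recall) may be more informative than accuracy alone.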

Experimental Methods

For this analysis a traditional CNN model and a ViT pretrained model are developed to address the SAA

task presented.

CNN

The architecture of the generated CNN model consists of three types of layers: the convolution layer,

the pooling layer, and the fully connected layer. The purpose of the convolution layer is to obtain

significant features from the input by using filter parameters which are learned throughout the training
