M.S. AAI Capstone Chronicles 2024


A practical challenge with this dataset is the storage and compute needed to load it in full. The original dataset is up to 13 TB; for this analysis, a 500 GB subset (about 167,000 images across 859 flights) was loaded and stored to fit computing and time constraints. The dataset was assembled and released for computer vision applications and is therefore already cleaned and standardized. As a result, there are no duplicated data points and few missing values. Because of how the data were collected, both planned and unplanned objects crossed paths with the UAV. Only planned objects have an associated variable storing the distance between the UAV and the detected object; for all unplanned objects this feature is missing. To avoid discarding these records, the missing value is filled with the mean of the observed distances. The distribution of the distance between the UAV and the detected object is shown in Figure 4. The figure shows a skewed distribution, with the majority of detected objects between 200 m and 300 m from the UAV. This suggests that model performance may depend on the distance between the UAV and the object being identified.
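The mean-fill step described above can be illustrated with a minimal sketch. It assumes the distances are available as a plain list in which `None` marks an unplanned object; the function names (`impute_mean`, `share_in_range`) and the four sample values are hypothetical and are not taken from the dataset or from the authors' code.

```python
from statistics import fmean
from typing import Optional


def impute_mean(distances: list[Optional[float]]) -> list[float]:
    """Replace missing distances (None) with the mean of the observed ones."""
    observed = [d for d in distances if d is not None]
    if not observed:
        raise ValueError("no observed distances to average")
    mean_distance = fmean(observed)
    return [mean_distance if d is None else d for d in distances]


def share_in_range(values: list[float], low: float, high: float) -> float:
    """Fraction of values that fall in the closed interval [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical distances in metres; None stands for an unplanned object.
distances_m = [210.0, 290.0, None, 400.0]

filled_m = impute_mean(distances_m)
share_200_300 = share_in_range(filled_m, 200.0, 300.0)
```

Every filled entry takes the same value, the mean of the observed distances, so it may be worth keeping a separate flag for which records were imputed when relating model performance to distance.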

Figure 4

Distribution of the distance between the UAV and the detected object

