For hyperparameter tuning, we used grid search across our LSTM models. Similar grid search approaches for our CNN and ViT models focused on batch sizes, learning rates, and dense layer configurations.
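The exact search grids were specific to each model, but the procedure can be illustrated with a minimal sketch. The example below assumes a TensorFlow/Keras workflow and uses a small stand-in network with synthetic data; the grid values, the build_model helper, and the data shapes are illustrative assumptions rather than the configurations actually used.

```python
import itertools
import numpy as np
import tensorflow as tf

# Synthetic stand-in data; the project used the real Tidepool and food-image datasets.
rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(256, 10)), rng.normal(size=(256, 1))
x_val, y_val = rng.normal(size=(64, 10)), rng.normal(size=(64, 1))

# Hypothetical search space mirroring the hyperparameters described above.
param_grid = {
    "batch_size": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dense_units": [64, 128, 256],
}

def build_model(learning_rate, dense_units):
    """Small stand-in regressor; the actual LSTM/CNN/ViT architectures differ."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10,)),
        tf.keras.layers.Dense(dense_units, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse",
    )
    return model

# Exhaustively train and score every combination, keeping the best by validation loss.
best_loss, best_params = float("inf"), None
for batch_size, lr, units in itertools.product(*param_grid.values()):
    model = build_model(lr, units)
    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        batch_size=batch_size,
        epochs=20,
        verbose=0,
    )
    val_loss = min(history.history["val_loss"])
    if val_loss < best_loss:
        best_loss, best_params = val_loss, (batch_size, lr, units)

print("Best validation loss:", best_loss, "with (batch_size, lr, units) =", best_params)
```

Because grid search is exhaustive, its cost grows multiplicatively with each added hyperparameter, so keeping the per-model grids small keeps the sweep tractable.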
To prevent overfitting, we implemented several regularization techniques across our models. Dropout layers were incorporated into all models, and L2 regularization was additionally applied to the dense layers in the ViT and YOLOv8 models. Although we explored batch normalization for the LSTM models, we ultimately excluded it due to minimal performance improvements. The trainable dense layers of MobileNetV3Large used a single dropout layer combined with L2 kernel regularizers. Our learning rate optimization incorporated step decay schedules, reducing rates by a factor of 0.1 when validation loss plateaued for three epochs, enabling more precise fine-tuning in the later stages of training. Feature engineering also played a crucial role: the LSTM models benefited from wavelet coefficients and cyclic temporal encodings, while our CNN and ViT models leveraged data augmentation to improve generalization across diverse food images.
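As a rough illustration of this recipe, the sketch below shows a Keras-style dense head with a single dropout layer and L2 kernel regularizers, paired with a ReduceLROnPlateau callback that cuts the learning rate by a factor of 0.1 once validation loss has plateaued for three epochs. The layer widths, dropout rate, regularization strength, and class count are assumptions for illustration, not the project's exact settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_head(feature_dim=1280, num_classes=101, dropout_rate=0.3):
    """Dense head with one dropout layer and L2 kernel regularizers,
    in the spirit of the MobileNetV3Large / ViT heads described above.
    All sizes here are illustrative assumptions."""
    return tf.keras.Sequential([
        layers.Input(shape=(feature_dim,)),
        layers.Dense(256, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax",
                     kernel_regularizer=regularizers.l2(1e-4)),
    ])

# Step-style learning-rate reduction: multiply the rate by 0.1 after the
# validation loss has failed to improve for three consecutive epochs.
lr_schedule = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.1,
    patience=3,
)

# Typical usage (training data omitted here):
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=50, callbacks=[lr_schedule])
```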
Results and Discussion
Tidepool Dataset
In capturing results for this dataset, we started with a simple linear regression model as a baseline and compared it to our trained LSTM model. The LSTM model demonstrated significant improvements in predicting glucose levels over the baseline linear regression model: the Mean Squared Error (MSE) was reduced from 0.7131 with the linear regression model to 0.0642 with the LSTM model, illustrating the latter's superior ability to minimize