M.S. AAI Capstone Chronicles 2024
Building on these results, future iterations of the system may explore more advanced models, such as the Informer with its ProbSparse self-attention mechanism (Zhou et al., 2021), or ensemble methods to further improve predictive accuracy and robustness.

Nutrition 5k Dataset

The Nutrition 5k dataset was analyzed using two different approaches. The first was an end-to-end approach that inferred carbohydrate content directly from food images using YOLOv8 and Vision Transformer (ViT) models (Figure 3.2). YOLOv8 demonstrated strong performance, achieving a validation MAE of 0.07 after optimization and a 50% improvement in validation loss through hyperparameter tuning. Its carbohydrate predictions were highly accurate, with an average deviation of just ±5 g. The ViT model performed slightly less effectively than YOLOv8, with higher validation loss and slower convergence. Figure 3 provides a detailed analysis of the model's prediction performance.

Optimization of YOLOv8 focused on tuning batch size, learning rate, and dense layer configuration. The best configuration used a batch size of 16, a learning rate of 0.0001, and dense layer units of (512, 256, 128). These adjustments brought predicted carbohydrate values into closer alignment with actual values (Figure 3.3).

The analysis encountered several challenges that affected model performance. One significant limitation of the Nutrition 5k dataset was its lack of diversity, as it was based on items common in a California diet. Expanding the dataset to include items from other cuisines would improve generalizability.
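To make the tuning procedure and the MAE metric above concrete, the following is a minimal sketch, not the code used in this project. It computes a mean absolute error and enumerates a hyperparameter grid over batch size, learning rate, and dense layer units. Only the reported best configuration (batch size 16, learning rate 0.0001, dense units (512, 256, 128)) comes from the text; the other candidate values and the names `hyperparameter_grid` and `select_best` are assumptions introduced for illustration.

```python
from itertools import product


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical search space: only the reported best values are from the text;
# the remaining candidates are placeholders chosen for illustration.
BATCH_SIZES = [16, 32, 64]
LEARNING_RATES = [1e-3, 1e-4, 1e-5]
DENSE_UNITS = [(256, 128), (512, 256, 128)]

# Best configuration as reported in the text.
REPORTED_BEST = {
    "batch_size": 16,
    "learning_rate": 1e-4,
    "dense_units": (512, 256, 128),
}


def hyperparameter_grid():
    """Return every combination of the candidate values as a config dict."""
    return [
        {"batch_size": b, "learning_rate": lr, "dense_units": units}
        for b, lr, units in product(BATCH_SIZES, LEARNING_RATES, DENSE_UNITS)
    ]


def select_best(results):
    """Pick the config with the lowest validation MAE.

    `results` is a list of (config, validation_mae) pairs produced by
    whatever training loop is used; that loop is not shown here.
    """
    return min(results, key=lambda pair: pair[1])[0]
```

In use, each configuration from `hyperparameter_grid()` would be trained and scored on the validation set, and `select_best` would return the configuration with the lowest validation MAE.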