M.S. AAI Capstone Chronicles 2024
outliers in glucose levels were handled by capping the values at the 1st and 99th percentiles to reduce noise without discarding valuable data. The Tidepool dataset contained formatting inconsistencies in the recorded time values, requiring standardization to a common datetime format. These data issues stem from device limitations, user errors, and the variability of day-to-day living conditions in real-world data collection. For instance, the intermittent nature of Tidepool's physical activity labels reflects the practical challenges of consistently tracking exercise data.

Certain inter-variable relationships were evident, such as the correlation in the DiaTrend dataset between glucose levels and insulin bolus doses, which is consistent with the expected impact of pharmacologic interventions. There were also correlations between demographic variables and glucose trends, which, although weak, suggest that personalized management may offer additional benefit. In the Tidepool dataset, there were also strong correlations between basal insulin rates and bolus doses, indicative of active glucose level management. A weaker-than-anticipated correlation between glucose levels and physical activity labels might indicate issues with data acquisition quality. Lastly, the Nutrition 5k dataset showed a high correlation between carbohydrate content and total calories (r = 0.87), reinforcing the central role of carbohydrates in glycemic load predictions.
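The percentile capping and correlation checks described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names `cap_at_percentiles` and `pearson_r` and the sample glucose readings are hypothetical and are not taken from the project's code or data, and the percentile interpolation method is an assumption.

```python
import math
import statistics


def cap_at_percentiles(values, lower_pct=1, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    # With n=100, statistics.quantiles returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    # Values outside the range are pulled to the boundary, not dropped.
    return [min(max(v, low), high) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical glucose readings (mg/dL) with one implausible spike.
glucose = [float(g) for g in range(80, 181)] + [900.0]
capped = cap_at_percentiles(glucose)
```

Capping keeps every row, so downstream time alignment is unaffected; only the extreme values are replaced by the percentile boundaries. The same `pearson_r` helper could be applied to any pair of columns, such as carbohydrate content and total calories.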
Background Information