M.S. AAI Capstone Chronicles 2024
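The paragraph that follows describes two preprocessing steps: forward-fill imputation and the removal of patients with more than 30% of their vital sign data missing. The sketch below illustrates those two steps in plain Python under several assumptions that are not stated in the source: the column names are hypothetical, the missing fraction is pooled across all vital sign columns rather than computed per variable, and the filter is applied before imputation so that it measures the raw missingness. It is an illustration of the technique, not the authors' implementation.

```python
from typing import Dict, List, Optional, Sequence

# One timestep for one patient: variable name -> reading, or None if missing.
Row = Dict[str, Optional[float]]

# Hypothetical column names; the source does not list the actual variables.
VITAL_SIGNS = ("heart_rate", "resp_rate")


def missing_fraction(rows: Sequence[Row], columns: Sequence[str]) -> float:
    """Share of (timestep, column) cells that are missing, pooled over columns."""
    total = len(rows) * len(columns)
    if total == 0:
        return 1.0  # Assumption: a patient with no readings counts as fully missing.
    missing = sum(1 for row in rows for col in columns if row.get(col) is None)
    return missing / total


def forward_fill(rows: Sequence[Row]) -> List[Row]:
    """Carry each variable's last observed reading forward in time.

    Cells before a variable's first observation have nothing to carry
    forward and therefore stay None.
    """
    last_seen: Row = {}
    filled: List[Row] = []
    for row in rows:
        new_row = dict(row)
        for col, value in row.items():
            if value is None:
                new_row[col] = last_seen.get(col)
            else:
                last_seen[col] = value
        filled.append(new_row)
    return filled


def preprocess(
    patients: Dict[str, List[Row]],
    vital_signs: Sequence[str] = VITAL_SIGNS,
    max_missing: float = 0.30,
) -> Dict[str, List[Row]]:
    """Drop patients above the missingness threshold, then impute the rest."""
    kept: Dict[str, List[Row]] = {}
    for patient_id, rows in patients.items():
        # Assumption: missingness is measured on the raw data, before filling.
        if missing_fraction(rows, vital_signs) > max_missing:
            continue
        kept[patient_id] = forward_fill(rows)
    return kept


# Toy data: "lactate" stands in for a sparsely sampled laboratory variable.
patients = {
    "p1": [
        {"heart_rate": 80.0, "resp_rate": 16.0, "lactate": None},
        {"heart_rate": None, "resp_rate": 18.0, "lactate": 1.2},
        {"heart_rate": 84.0, "resp_rate": 17.0, "lactate": None},
        {"heart_rate": 86.0, "resp_rate": 17.0, "lactate": None},
    ],
    "p2": [
        {"heart_rate": None, "resp_rate": None, "lactate": None},
        {"heart_rate": 90.0, "resp_rate": None, "lactate": None},
    ],
}
cleaned = preprocess(patients)
```

In this toy example, patient `p2` is missing three of four vital sign cells and is removed, while `p1` is kept and has its gaps filled from the most recent earlier reading. With tabular data in pandas, the same carry-forward step is commonly written as a per-patient `groupby(...).ffill()`.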
Initial inspection showed that the dataset had a high prevalence of missing values at individual timesteps, which raised concern about its overall quality. Further inspection revealed, however, that the sparse readings were largely confined to the 26 variables corresponding to laboratory samples. This sparsity is expected: in an ICU setting, laboratory readings are generally taken far less frequently than vital signs, which stream in automatically from monitoring devices. Although less prevalent, missing values also occurred in the vital sign readings, so all continuous variables required some level of imputation to be useful for downstream modeling tasks. Imputation used a forward fill technique, which carries the last recorded reading of each variable forward. In addition, patients with more than 30% of their vital sign data missing were removed.

Further analysis examined the correlation between the input variables and the target in order to support feature selection. This produced evidence of moderate correlation between some variables but no definitive results, aside from cross-correlation between variables that may be redundant, such as blood pressure readings (redundant variables were removed from the dataset). As a result, additional techniques were employed to determine which variables were most relevant to predicting sepsis outcome. The chosen approach was a measurement of feature importance, obtained by training a Decision Tree Classifier (DTC) on a summarized version of the dataset. The dataset was rolled up to the patient level, and the DTC was trained to classify each patient's sepsis outcome. This enabled the generation of a feature importance report, displayed in Figure 2, which ranked the features the model deemed important for determining the sepsis