ADS Capstone Chronicles Revised
correlation matrix; no significant correlations were found (Figure 9), signaling that relationships in the data are non-linear. All numerical variable distributions are skewed. Table 2 shows the variables' median, mean, standard deviation, interquartile range, and skew value.

Table 2 Numerical Variable Descriptives
Xn       x̃       x̅       s        IQR          Skew
Age      61.0    55.8    21.8     43-72        -0.78
Price    0.17    18.53   172.33   0.05-0.64    12.08
Weight   70      73.8    22.07    62-89        0.28
Manu     1       7.53    40.28    1-50         10.31
Figure 9 Correlation Matrix
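Descriptives of the kind reported in Table 2 could be computed roughly as in the sketch below. It assumes the data sit in a pandas DataFrame; the helper name describe_numeric, the column names, and the toy values are hypothetical and are not taken from FAERS. The skew estimator used here is pandas' default (bias-adjusted) one, which may differ from the one behind Table 2.

```python
import pandas as pd

# Toy values for illustration only; these are NOT the FAERS data.
df = pd.DataFrame({
    "age": [23, 45, 61, 67, 72, 80],
    "weight": [55.0, 62.0, 70.0, 74.0, 89.0, 110.0],
})


def describe_numeric(frame: pd.DataFrame) -> pd.DataFrame:
    """Return median, mean, sample SD, quartiles, and skew per numeric column."""
    return pd.DataFrame({
        "median": frame.median(),
        "mean": frame.mean(),
        "sd": frame.std(),           # sample standard deviation (ddof=1)
        "q1": frame.quantile(0.25),  # lower bound of the interquartile range
        "q3": frame.quantile(0.75),  # upper bound of the interquartile range
        "skew": frame.skew(),        # pandas' bias-adjusted sample skewness
    })


summary = describe_numeric(df)
print(summary.round(2))
```

In this toy frame the mean of age falls below its median and the skew is negative, while weight shows the opposite pattern, which mirrors the signs reported for Age and Weight in Table 2.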
4.5.3 Categorical Input features. Nominal distributions were examined with respect to the outcome variable. Nominal features include expedited, report_source (doctor, pharmacist, other healthcare professional), company, country, and sex. Color-coded, proportional bar charts were created to assess differences in category frequencies. Frequency counts and proportions for each variable level were calculated with df['variable'].value_counts() (Table 3). Figure 10 shows the relationship between the outcome and report source: doctors and pharmacists are less likely to report deaths than other healthcare professionals, even though together they make up the majority of the sample. The sample contains roughly 10 percentage points more females than males, and most reports come from the US. Most reports in FAERS are expedited, meaning that they contain serious, unexpected ADRs that must be reported within 15 days.

Figure 8 Age, Weight, Sex, and Outcome (panels A and B)

Table 3 Categorical Variables
Xn       k            Mode     p̂
Sex      2 (F/M)      Female   55.1%
Source   3 (D,P,O)    Doctor   48.7%, 41.6%
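The frequency counts and proportions described in Section 4.5.3 can be sketched as follows. This is a minimal illustration, assuming a pandas DataFrame with a report_source column as named in the text; the binary death column and all values are made up for the example and do not reflect the FAERS sample.

```python
import pandas as pd

# Toy records for illustration only; "death" is a hypothetical outcome column.
df = pd.DataFrame({
    "report_source": ["doctor", "doctor", "pharmacist", "pharmacist",
                      "other", "other", "doctor", "other"],
    "death": [0, 0, 0, 1, 1, 1, 0, 0],
})

# Frequency count per category level
counts = df["report_source"].value_counts()

# Proportion per category level (the p-hat column in a table like Table 3)
props = df["report_source"].value_counts(normalize=True)

# Share of each outcome within each report source; these row-wise
# proportions are what a proportional bar chart would display.
outcome_by_source = pd.crosstab(
    df["report_source"], df["death"], normalize="index"
)

print(counts)
print(props.round(3))
print(outcome_by_source.round(3))
```

If matplotlib is installed, calling outcome_by_source.plot(kind="bar", stacked=True) on the row-normalized table is one way to draw a proportional bar chart comparable to the one described for Figure 10.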