ADS Capstone Chronicles Revised


4.5 Exploratory Data Analysis

The dataset used for machine learning objectives was queried from a local connection to pharma_db:

"""
SELECT d.med_product, d.manu_num, d.ndc9,
       a.serious_outcome, a.expedited, a.age, a.sex, a.year, a.weight,
       p.price
FROM adverse_events a
INNER JOIN patient_reactions r ON a.event_id = r.event_id
INNER JOIN patient_drugs d ON a.event_id = d.event_id
LEFT JOIN prices p ON p.ndc9 = d.ndc9
"""

The query resulted in a sample size of 83,307 at the time of this project. The data were split 80/10/10 (66,645/8,331/8,331) for training/validation/testing.

4.5.1 Outcome Variable. There are a few possible outcome variables in the FAERS data. One is ADR-specific outcome severity, recorded at five levels: recovered, recovering, recovered with sequelae, not recovered, and fatal. These values have been mapped to an ordinal scale of 1-5 (Yue et al., 2024). However, the levels are ambiguous; for example, recovering and not recovered might be interpreted as the same thing by different people. Additionally, each person can have multiple ADR terms nested in a single report, each with a different outcome. For example, the same patient could have “chest pains” with an outcome of recovered, along with “fatigue” with an outcome of not recovered, which makes that outcome variable confusing to use. Due to the ambiguity in the 5-level labeling, this analysis uses a 3-level outcome of death, serious, and nonserious. This outcome is more robust because it refers to the primary outcome from all possible ADR conditions and there is only one label per patient. A serious outcome comprises hospitalizations, disabling conditions, life-threatening conditions, and birth defects. The three-level categorical outcome will be assessed with accuracy, precision, recall, specificity, and F1. The class imbalance of the three levels is shown in Figure 7 (i.e., baseline classification rates for each level: serious 69.4%, death 27.1%, and nonserious 3.6%). Reducing the number of levels makes interpretation and model training more feasible. Class balancing was conducted prior to training for assessments of categorical outcomes (see Modeling).

Figure 7
Outcome Variable Distribution
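
The evaluation metrics named for the three-level outcome (accuracy, precision, recall, specificity, and F1) can be computed per class in a one-vs-rest fashion. The following is a minimal sketch in plain Python, not the project's actual evaluation code; the function names `per_class_metrics` and `accuracy` and the toy label lists are hypothetical and serve only to illustrate the arithmetic.

```python
# Illustrative sketch only: helper names and toy data are assumptions,
# not taken from the project's codebase.
LABELS = ["death", "serious", "nonserious"]


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """One-vs-rest precision, recall, specificity, and F1 for each label."""
    results = {}
    for label in labels:
        tp = fp = fn = tn = 0
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # correctly predicted this class
            elif p == label:
                fp += 1  # predicted this class, truth was another
            elif t == label:
                fn += 1  # missed a true member of this class
            else:
                tn += 1  # correctly predicted "not this class"
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        }
    return results


# Hypothetical toy labels, not drawn from FAERS.
y_true = ["serious", "serious", "death", "nonserious", "serious", "death"]
y_pred = ["serious", "death", "death", "serious", "serious", "death"]

metrics = per_class_metrics(y_true, y_pred)
overall_accuracy = accuracy(y_true, y_pred)
```

Per-class reporting matters here because of the imbalance noted above: a classifier that always predicts serious would reach 69.4% accuracy on the reported distribution while having zero recall for death and nonserious.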

4.5.2 Numerical Input Features. The numerical input features are age (yr), weight (kg), drug prices (per unit), and number of manufacturers (manu). The distributions of these were examined with respect to the outcome variable levels (Figure 8a,b). Descriptive statistics of numerical variables were calculated with df.describe and multicollinearity examined with a
