ADS Capstone Chronicles Revised

‭26‬

‭this‬ ‭subsample‬ ‭of‬ ‭the‬ ‭broader‬ ‭population,‬ ‭and‬ ‭may‬‭infer‬‭that‬‭these‬‭people‬‭have‬‭access‬ ‭to‬ ‭better‬ ‭healthcare‬ ‭resources‬ ‭compared‬ ‭to‬ ‭people‬ ‭taking‬ ‭the‬ ‭same‬ ‭drugs‬ ‭who‬ ‭might‬ ‭have‬ ‭had‬ ‭an‬ ‭ADR,‬ ‭but‬ ‭who‬ ‭did‬ ‭not‬ ‭seek‬ ‭medical‬ ‭attention,‬ ‭or‬ ‭who’s‬ ‭healthcare‬ ‭providers‬‭were‬‭not‬‭thorough‬‭enough‬‭to‬‭make‬ ‭a‬‭report‬‭in‬‭FAERS.‬‭Additionally,‬‭the‬‭system‬ ‭can‬ ‭only‬ ‭make‬ ‭predictions‬ ‭on‬ ‭the‬ ‭list‬ ‭of‬ ‭drugs‬ ‭reported‬ ‭in‬ ‭FAERS.‬ ‭If‬ ‭an‬ ‭end-user‬ ‭inputs‬‭a‬‭drug‬‭into‬‭the‬‭predication‬‭application‬ ‭that‬‭was‬‭not‬‭retained‬‭as‬‭a‬‭feature‬‭in‬‭the‬‭final‬ ‭model,‬ ‭the‬ ‭prediction‬ ‭will‬ ‭be‬ ‭slightly‬ ‭less‬ ‭accurate‬ ‭and‬ ‭only‬ ‭be‬ ‭based‬ ‭on‬ ‭age,‬‭weight,‬ ‭sex,‬ ‭drug‬ ‭price,‬ ‭and‬ ‭provider‬ ‭qualification.‬ ‭However,‬ ‭since‬ ‭all‬ ‭of‬ ‭these‬ ‭features‬ ‭carry‬ ‭the‬ ‭majority‬ ‭of‬ ‭feature‬ ‭importance,‬ ‭the‬ ‭potential‬ ‭outcome‬ ‭will‬ ‭be‬ ‭classified‬ ‭meaningfully.‬ ‭FAERS‬ ‭data‬ ‭also‬ ‭lacks‬ ‭context‬ ‭for‬ ‭underlying‬ ‭circumstances.‬ ‭For‬ ‭example,‬ ‭acetaminophen,‬‭also‬‭known‬‭as‬‭Tylenol,‬‭is‬‭in‬ ‭the‬ ‭top‬ ‭five‬ ‭reported‬ ‭drugs‬‭related‬‭to‬‭death.‬ ‭At‬ ‭first‬ ‭glance,‬ ‭this‬ ‭may‬ ‭seem‬ ‭like‬ ‭a‬ ‭safe,‬ ‭over-the-counter‬ ‭drug,‬ ‭but‬ ‭it‬ ‭has‬ ‭the‬ ‭potential‬ ‭for‬ ‭liver‬ ‭damage‬‭from‬‭overdosing,‬ ‭and‬ ‭this‬ ‭can‬ ‭lead‬ ‭to‬ ‭death‬ ‭when‬ ‭combined‬ ‭with‬ ‭alcohol‬ ‭use‬ ‭(Ghosh‬ ‭et‬ ‭al.,‬ ‭2021).‬ ‭Another‬ ‭example‬ ‭is‬ ‭aromatase‬ ‭inhibitors‬ ‭which‬ ‭are‬ ‭used‬ ‭for‬ ‭cancer‬ ‭treatment‬ ‭in‬ ‭females‬‭and‬‭hormone‬‭balancing‬‭and‬‭fertility‬ ‭in‬ ‭males.‬ ‭Currently,‬ ‭FAERS‬ ‭reports‬ ‭do‬ ‭not‬ ‭capture‬ ‭the‬ ‭situational‬ ‭context‬ ‭of‬ ‭pharmaceutical‬‭drug‬‭use,‬‭so‬‭any‬‭conclusions‬ ‭made‬ ‭about‬ ‭drugs‬ ‭are‬ ‭based‬ ‭on‬ ‭speculation‬ ‭and‬ ‭the‬ ‭limited‬ ‭demographic‬ ‭information‬ ‭available.‬ ‭The‬ ‭amount‬ ‭of‬ ‭data‬ ‭in‬‭the‬‭FAERS‬‭database‬ ‭is‬ ‭into‬ ‭the‬ ‭millions‬ ‭and‬ ‭requires‬ ‭high-load‬ ‭computational‬ ‭resources.‬ ‭This‬ ‭analysis‬ ‭utilized‬ ‭a‬ ‭robust‬ ‭subset‬ ‭of‬ ‭the‬ ‭data‬ ‭to‬ ‭build‬ ‭and‬ ‭test‬ ‭the‬ ‭pipeline,‬ ‭which‬ ‭may‬ ‭have‬ ‭contained‬‭different‬‭statistical‬‭properties‬‭than‬ ‭the‬ ‭entire‬ ‭dataset.‬ ‭Additionally,‬ ‭the‬ ‭FAERS‬

‭database‬ ‭has‬ ‭a‬ ‭lack‬ ‭of‬ ‭required‬ ‭fields,‬ ‭so‬ ‭each‬ ‭record‬ ‭has‬ ‭its‬ ‭own‬ ‭pattern‬ ‭of‬ ‭missing‬ ‭data.‬‭Thus,‬‭the‬‭full‬‭28‬‭million‬‭sample‬‭size‬‭is‬ ‭not‬ ‭usable.‬ ‭When‬ ‭filtering‬ ‭for‬ ‭quality‬ ‭data‬ ‭with‬ ‭complete‬ ‭records‬ ‭of‬ ‭individual‬ ‭difference‬ ‭data‬ ‭(age,‬‭sex,‬‭weight,‬‭outcome),‬ ‭the‬ ‭available‬ ‭sample‬ ‭drops‬ ‭to‬ ‭a‬ ‭million‬ ‭at‬ ‭most.‬ ‭It‬ ‭was‬ ‭further‬ ‭filtered‬ ‭to‬‭reports‬‭from‬ ‭healthcare‬ ‭professionals‬ ‭only,‬ ‭and‬‭restricted‬ ‭to‬‭the‬‭most‬‭current‬‭quarter-year‬‭of‬‭data.‬‭The‬ ‭pipeline‬‭can‬‭be‬‭adapted‬‭to‬‭retrieve‬‭more‬‭data‬ ‭from‬ ‭FAERS‬ ‭by‬ ‭updating‬ ‭the‬ ‭URL‬ ‭date‬ ‭limits‬ ‭and‬ ‭other‬ ‭parameters‬ ‭in‬ ‭the‬ ‭API‬ ‭request.‬ ‭Training‬ ‭the‬ ‭classification‬ ‭model‬ ‭required‬ ‭undersampling‬ ‭the‬ ‭lowest‬ ‭frequency‬ ‭class,‬ ‭nonserious‬ ‭,‬ ‭which‬ ‭means‬ ‭that‬ ‭about‬ ‭a‬ ‭third‬ ‭of‬‭the‬ ‭death‬ ‭outcomes‬‭and‬‭a‬‭tenth‬‭of‬ ‭serious‬ ‭outcomes‬ ‭were‬ ‭not‬ ‭used‬‭for‬‭model‬‭training.‬ ‭Thus,‬ ‭the‬ ‭models‬ ‭might‬ ‭not‬ ‭have‬ ‭incorporated‬ ‭as‬ ‭much‬ ‭useful‬ ‭information‬ ‭as‬ ‭was‬ ‭available.‬‭However,‬‭this‬‭did‬‭not‬‭impact‬ ‭the‬ ‭ensemble‬ ‭models‬ ‭and‬ ‭mainly‬ ‭affected‬ ‭the other models’ performance.‬ ‭The‬‭historical‬‭documents‬‭API‬‭endpoint‬‭only‬ ‭contains‬ ‭data‬ ‭up‬ ‭until‬ ‭the‬ ‭year‬ ‭2014.‬ ‭The‬ ‭historical‬ ‭records‬ ‭could‬ ‭be‬ ‭improved‬ ‭by‬ ‭using‬ ‭other‬ ‭sources‬ ‭of‬ ‭data‬ ‭on‬ ‭pharmaceutical‬ ‭drug‬ ‭news‬ ‭headlines‬ ‭like‬ ‭ProQuest‬ ‭database‬ ‭API‬ ‭(Clarivate,‬ ‭n.d.).‬ ‭This‬ ‭would‬ ‭make‬ ‭for‬ ‭an‬ ‭interesting‬ ‭dashboard feature.‬ ‭The‬‭Medicaid‬‭drug‬‭prices‬‭API‬‭only‬‭contains‬ ‭information‬‭on‬‭drugs‬‭that‬‭are‬‭prescribed‬‭and‬ ‭can‬ ‭be‬ ‭reimbursed‬ ‭through‬ ‭their‬ ‭services,‬ ‭and‬‭only‬‭includes‬‭the‬‭price‬‭per‬‭unit.‬‭Thus,‬‭it‬ ‭does‬ ‭not‬ ‭contain‬ ‭prices‬ ‭for‬ ‭all‬ ‭marketed‬ ‭drugs.‬ ‭Only‬ ‭700‬ ‭matching‬ ‭drug‬‭prices‬‭were‬ ‭found‬‭in‬‭the‬‭FAERS‬‭dataset‬‭(less‬‭than‬‭10%).‬ ‭Missing‬ ‭price‬ ‭values‬ ‭were‬ ‭imputed‬ ‭using‬ ‭median‬‭price.‬‭Thus,‬‭the‬‭price‬‭feature‬‭did‬‭not‬ ‭add‬ ‭as‬ ‭much‬ ‭variance‬ ‭to‬ ‭the‬ ‭model‬ ‭as‬ ‭expected.‬‭Adding‬‭additional‬‭sources‬‭of‬‭drug‬

176

Made with FlippingBook - Online Brochure Maker