ADS Capstone Chronicles Revised



While training and assessing models, slight decreases in performance will be traded for increased interpretability, ensuring that the output of the model can be easily used and accessed by the general public. To increase accessibility, black-box models, meaning models whose mechanisms for producing predictions and results are unclear, will not be used. Many of the models in the aforementioned literature (deep neural networks, recommender-based methods) are black-box models. These models limit patient and provider understanding of and trust in results, both of which are important in the medical field (Xu & Shuttleworth, 2024). This project prioritizes transparency to ultimately deliver a useful product to the general public, unlike much of the literature, which seeks to generate the best-performing model.

The models will also include up-to-date data from FAERS, which gives novel insights in real time, in contrast to much of the literature, which assessed model performance against datasets over a decade old. An Apache Airflow trigger will update the model on a quarterly basis, creating a living pipeline that stays current with the latest data releases from government APIs. This ensures that information pertaining to new drugs is continuously added and that the relevance of the tools (database, dashboard, application) is maintained for users (Figure 4).

Figure 4
Surveillance System Architecture
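The quarterly refresh described above can be illustrated with a minimal sketch of the date logic such a trigger might rely on. This is not the project's pipeline code: the cron expression, the function names, and the choice of calendar quarters (runs on the first day of January, April, July, and October) are all assumptions made for illustration. The source states only that an Apache Airflow trigger updates the model quarterly.

```python
from datetime import date

# Assumed cron expression (not from the source): midnight on the first day
# of January, April, July, and October. A scheduler that accepts standard
# cron syntax could use this string to fire once per calendar quarter.
QUARTERLY_CRON = "0 0 1 1,4,7,10 *"


def quarter_of(day: date) -> tuple[int, int]:
    """Return (year, quarter) for a calendar date, with quarter in 1..4."""
    return day.year, (day.month - 1) // 3 + 1


def previous_quarter(day: date) -> tuple[int, int]:
    """Return the (year, quarter) that ended most recently before `day`.

    A quarterly job would typically request the data release for this
    completed quarter rather than the one still in progress.
    """
    year, quarter = quarter_of(day)
    if quarter == 1:
        return year - 1, 4
    return year, quarter - 1


def next_run_date(day: date) -> date:
    """Return the first quarter-start date strictly after `day`."""
    year, quarter = quarter_of(day)
    if quarter == 4:
        return date(year + 1, 1, 1)
    # Quarter 1 -> April, quarter 2 -> July, quarter 3 -> October.
    return date(year, quarter * 3 + 1, 1)
```

Under these assumptions, a run on the first day of a quarter would call `previous_quarter` to decide which release to pull, and `next_run_date` gives the date on which the schedule would fire again.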

4 Methodology

All code for this project is stored in a GitHub repository (Staggs & van der Wagt, n.d.).

4.1 Data Preparation

The data used in this project is public and de-identified and therefore does not require informed consent or privacy protections. The data preparation phase leveraged the following open-source tools and public APIs: MySQL (Oracle Corporation, 2022), Jupyter Notebook (Project Jupyter, 2023) with Python version 3.9.18, the openFDA API (FDA, n.d.a), and the Data.Medicaid API (Centers for Medicare and Medicaid Services, 2024). All data preparation code is in the “DataProcessing.ipynb” notebook file.

4.1.1 Static Data Source. Two static files (version 3.3) were downloaded from ADReCS (Cai et al., 2015) and stored in the ADReCS folder of our GitHub repository (Staggs & van der Wagt, n.d.). The first file contains the adverse drug reaction ontology (ADR_ontology_3.3.xlsx) and the second contains standardized information on drug compounds (Drug_information_v3.3.xlsx). These files were used to interpret drug names and terms when the FAERS data was ambiguous.

4.1.2 API Data Requests. API requests were developed based on each API’s requirements. Execution was paused at random intervals to avoid overwhelming host servers and to stay within API-specified rate limits. During testing and development, a small sample of data was pulled from each API to reduce computational load. Debugging statements are included in the code for each API request.

Primary data was sourced from the FDA’s API endpoints (adverse events, labels, manufacturers, documents; FDA, n.d.a). Data was requested with API keys which

