ADS Capstone Chronicles Revised
drug interactions (DrugBank, 2024), standardized drug information from RxNorm (National Institutes of Health [NIH], n.d.), and the Adverse Drug Reaction Classification System (ADReCS; Cai et al., 2015). Other research groups have compiled drug information from these sources to apply machine learning and reinforcement learning methods to data on drug interactions, side effects, and adverse events.

3.1 ADReCS

An early tool that combined FAERS adverse drug reaction data with drug interaction data and chemical structures is the ADReCS dashboard and database. It contains hierarchical, standardized adverse drug reaction terms and drug compounds (http://bioinf.xmu.edu.cn/ADReCS/; Cai et al., 2015). The system was developed to combat the inconsistencies, synonyms, and abbreviations used in reporting. The database dictionary can be used to clean FAERS datasets by linking terms to a standard spelling. The system also links drugs with associated adverse reactions, chemical structures, and Simplified Molecular Input Line Entry Specification (SMILES) structures (Weininger, 1988).

3.2 Classification Models

In addition to the existing databases and dashboards linked above, many researchers have used machine learning to predict drug side effects and their frequencies. Lee et al. (2017) built three classification models (a Bayesian classifier, k-nearest neighbor, and random forest), using 10-fold cross-validation, to predict drug side effects. Data sources included DrugBank for biological aspects of drug data (Wishart et al., 2018), PubChem Compound ID for drug mapping (NCBI, n.d.), UniProt Consortium Knowledgebase for protein information (2021), SMILES for drug structure data (Weininger, 1988), the National Drug
File-Reference Terminology for therapeutic indications of drugs (U.S. Department of Veterans Affairs, n.d.), and the Side Effect Resource (SIDER) database for side effect data (Kuhn et al., 2016). The drug structures, side effects, and therapeutic indications were represented as binary features (present or absent) for classification. The initial total feature count was 7,257; because of the high number of features and class imbalance, they also explored dimensionality reduction techniques to reduce the number of features. They assessed the models with three available datasets: Pauwels et al.'s (2011), Mizutani et al.'s (2012), and Liu et al.'s (2012). The models attained high accuracy but performed poorly on precision, recall, and F-score. Overall, the random forest model performed best, with an accuracy of 0.951, precision of 0.710, recall of 0.304, and F-score of 0.426. In the study's conclusion, they noted that some drugs were harder to predict than others; despite this, the study confirmed that machine learning models are feasible and effective for predicting side effects. They also acknowledged that model interpretability from the perspective of physicians could be improved, specifically by adding measures of statistical significance and clinical meaningfulness.

3.3 Recommender Methods

Zhang et al. (2016) used two recommender methods, an integrated neighborhood-based method and a restricted Boltzmann machine-based method, along with their combined ensemble, to train models on the relationships among approved drugs, side effects, and drug-side effect associations. The models were used to predict new side effects for drugs. Data were sourced from SIDER (Kuhn et al., 2016), PubChem (NCBI, n.d.), DrugBank (Wishart et al., 2018), and Kyoto Encyclopedia of Genes and Genomes