ADS Capstone Chronicles Revised

aforementioned drops, the resulting feature set has 267,444 rows and 317 columns.

4.4.3.2 Feature Scaling

Following feature selection, feature scaling is done in the same fashion that it was done for Modeling Approach 1. A 75% train and 25% test split is done on the data, then a new instance of the StandardScaler transformer is fit on only the training data. Following that, both the training and test datasets for Modeling Approach 2 are transformed using that StandardScaler transformer.

4.4.3.3 Imputation of Remaining Nulls

After the aforementioned steps, there are still a few SDOH columns with null values, but the maximum number of null values in any one column is 950, which is only 0.36% of the total observations. Given how low that number is, we opt not to drop those rows (as we would lose a lot of valuable data), but rather to use the scikit-learn package KNNImputer to impute those nulls based on closely-related observations within the SDOH dataset. Because many SDOH values are closely-related, using a k-nearest neighbors (KNN) based modeling approach to impute values is logical. Within the KNNImputer, the hyperparameter for number of neighbors considered during imputation is set to five to avoid our work becoming too computationally expensive, especially considering the number of features we are working with. Similarly to feature scaling, the KNNImputer is fit only on the training data and then used to transform both the training and test datasets to avoid any data leakage.
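The split, scaling, and imputation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the project's actual code: the data are synthetic, and the array shapes, the share of nulls, the random seeds, and all variable names are hypothetical. Only the 75%/25% split, the use of StandardScaler and KNNImputer, the five neighbors, and the fit-on-training-only pattern come from the text.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature set (hypothetical, not the SDOH data).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(400, 6))
y = rng.integers(0, 2, size=400)

# Blank out a small share of cells to mimic the few remaining nulls.
X[rng.random(X.shape) < 0.004] = np.nan
X[0, 0] = np.nan  # guarantee at least one null for the illustration

# 75% train / 25% test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the scaler on the training data only, then transform both sets.
# StandardScaler ignores NaNs when fitting and keeps them when transforming.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit the imputer on the training data only (five neighbors),
# then fill nulls in both sets with that same fitted imputer.
imputer = KNNImputer(n_neighbors=5).fit(X_train_scaled)
X_train_imputed = imputer.transform(X_train_scaled)
X_test_imputed = imputer.transform(X_test_scaled)
```

Because both transformers learn their statistics from the training rows alone, nothing about the test rows influences the scaling parameters or the neighbor search, which is the leakage safeguard the text describes.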

4.4.3.4 Conducting PCA

Following the handling of all null values in the dataset, we aim to keep all the age, gender, race, and 36 CDC Health Measures features, but reduce the dimensionality of the SDOH features using Principal Components Analysis (PCA). The original SDOH dataset had 309 features, but that number fell to 274 after we removed certain columns to handle null values in the preceding section. Starting with those 274 columns in the training dataset, we perform PCA with the goal of reducing down to only the number of components necessary to explain 95% of the variance in the 274 SDOH columns. In order to determine the components necessary for that, we fit the PCA transformer on just the 274 training dataset SDOH columns and plot the cumulative explained variance ratio against the number of components. The resulting graph (Figure 4.6 below) and numerical calculations inform us that 90 components are required to explain 95% of the variance amongst the 274 SDOH columns.

Figure 4.6 Cumulative explained variance by number of principal components when conducting PCA on 274 SDOH columns
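The component-count calculation described in this section can be sketched as below. This is an illustrative example on synthetic data, not the project's code: the 40 correlated columns, the five latent factors, and all variable names are hypothetical stand-ins for the 274 scaled and imputed SDOH training columns. Applying the fitted transformer to the test rows is also our assumption, mirroring the scaler and imputer steps.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical correlated data: 40 columns driven by 5 latent factors plus noise.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(5, 40))
X_train = rng.normal(size=(500, 5)) @ loadings + 0.1 * rng.normal(size=(500, 40))
X_test = rng.normal(size=(100, 5)) @ loadings + 0.1 * rng.normal(size=(100, 40))

# Fit PCA with all components on the training columns only.
pca_full = PCA().fit(X_train)

# Cumulative share of variance explained by the first k components.
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative share reaches 95%.
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)

# Refit with that many components and project both datasets.
pca = PCA(n_components=n_components_95).fit(X_train)
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)
```

Plotting `cumulative` against `range(1, len(cumulative) + 1)` gives a curve of the kind shown in Figure 4.6. scikit-learn's PCA also accepts a fractional `n_components` such as 0.95, which selects the component count from the explained variance directly; the explicit version here keeps the full curve available for inspection.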
