ADS Capstone Chronicles Revised


Medicare and Medicaid resources, necessitating further investigation.

3 Literature Review

Fraud detection within healthcare has been a significant focus for researchers and practitioners due to the substantial financial implications and complexity of the healthcare system. According to Kose et al. (2015), healthcare fraud and abuse present significant challenges for industry professionals. The infrequent occurrence of fraud, estimated to be between 3% and 10%, complicates the acquisition of the large labeled datasets that machine learning applications require. In response to these challenges, unsupervised machine learning techniques have garnered considerable attention for their capacity to detect anomalies without labeled training data. This attribute makes them particularly suitable for fraud detection scenarios, where labeled data is often limited.

Kose et al. (2015) introduced an approach known as Interactive Machine Learning (IML), in which subject matter experts participate directly in the training and modeling phases of the machine learning process in an unsupervised setting. This method integrates various human-computer interaction tools, enabling end users to customize the algorithm to better meet their specific objectives. The result was a flexible framework for detecting fraud in electronic claims, known as the Electronic Fraud and Abuse Detection (eFAD) framework, which allows suspicious cases to be classified effectively. The final model uses the binary pairwise comparison method to initialize the weights, after which the subject matter experts adjust the weights and fine-tune parameters as necessary. The resulting tool includes a dashboard featuring a confusion matrix, accuracy, and Area Under the Curve (AUC) metrics, offering end users a detailed view of the model's performance.

Bauder et al. (2018) evaluated five unsupervised machine learning methods to identify Medicare provider fraud: Isolation Forest (IF), Unsupervised Random Forest (URF), Local Outlier Factor (LOF), Autoencoder (AE), and k-nearest neighbors (KNN). Their study used the Area Under the Receiver Operating Characteristic Curve (AUROC) as the primary metric to gauge the effectiveness of these methods, and it highlighted the inherent trade-off between sensitivity and specificity in fraud detection. The study also underscored the challenges of detecting fraud in Medicare Part B data, particularly the limited availability of known fraud labels, which results in imbalanced datasets. This imbalance makes it difficult to train and validate unsupervised models. The authors proposed incorporating additional fraud labels to enhance both sensitivity and specificity, improving the robustness of these models.

Given the continuous advancements in technology, the healthcare industry increasingly requires refined detection techniques to address growing concerns surrounding Medicare and Medicaid FWA. Johnson and Khoshgoftaar (2019) focused
