M.S. Applied Data Science - Capstone Chronicles 2025
and enable proactive recall management. By developing methods to accurately identify high-risk recalls, the study aims to reduce the circulation time of hazardous products. In doing so, it supports regulatory agencies and manufacturers in prioritizing limited resources to expedite critical responses. The proposed approach has the potential to save lives by accelerating the identification of Class I recalls and shifting the industry focus from reactive problem-solving to proactive safety assurance.

2.2 Definition of Objectives

The multiclass classification and risk prediction system for recalls will require several key actions to address challenges in recall management. These include developing a machine learning model for recall severity classification, creating an early warning system for high-risk products, utilizing NLP techniques to analyze recall reasons, implementing a time-based pattern analysis system, and building a risk assessment tool for manufacturers. The project will analyze text patterns in product descriptions and recall reasons and combine this analysis with categorical features to determine their effectiveness across the different recall classes.

Through these actions, the project aims to provide more accurate and consistent recall severity classification, efficient identification of high-risk recalls, early detection of potential high-risk products, and standardized analysis of recall reasons across product types. It will also identify emerging trends and seasonal variations in product safety issues, offer customized risk scores and preventive measures for manufacturers, and determine which classification method performs better under different class-balance scenarios. The expected outcomes include improved accuracy and
timeliness in recall classification, a reduction in Class I recalls, more efficient resource allocation, enhanced communication, and potential economic benefits. Even if not all objectives are fully met, the project will still contribute valuable insights into the complexities of recall classification and help identify current limitations in the field.

3 Literature Review

The purpose of this literature review is to position this research project in relation to existing studies on recall systems and data-driven approaches for improving recall efficiency and accuracy. By reviewing recent studies on FDA recall processes, predictive modeling, and emerging technologies such as NLP, this study identifies gaps in the literature and establishes the rationale for its analysis. The review covers several key areas: the evolution of recall research, challenges within current recall systems, data-driven approaches for prediction, industry-specific recall findings, and the potential of NLP to enhance recall prediction across various sectors.

3.1 Recall Classification and Regulatory Oversight

Y. Zhou (2023) and Dubin et al. (2021) examined the factors influencing FDA recall classifications and associated regulatory risks. Y. Zhou (2023) identified lobbying as a potential influence on the classification process, suggesting that external pressures may compromise objectivity. Dubin et al. (2021) found that medical devices approved through the more rigorous premarket approval pathway had a higher risk of recall than those cleared through the 510(k) pathway, challenging the assumption that stricter approval leads to safer products.