ADS Capstone Chronicles Revised
advancements remain necessary to optimize these approaches. In this context, the planned dataset for our study comprises approximately 1,000,000 entries spanning from 2019 to the present. Drawing on insights from prior research, our approach adapts established methodologies to novel datasets, thereby contributing to the evolution of current practices.

4 Methodology

The dataset used for analysis was obtained from the CMS Market Saturation and Utilization Data via the CMS API. This dataset includes detailed Medicare Fee-for-Service (FFS) claims, which support the analysis of market saturation and healthcare service utilization across counties and states. Because the data were sourced from a public government website and do not contain personally identifiable information, no Health Insurance Portability and Accountability Act (HIPAA) or ethical violations occurred.

4.1 Data Acquisition and Aggregation

The data were acquired from the CMS API using Python and loaded into a pandas DataFrame for manipulation and exploration. Initial exploration involved reading the first few rows to understand the contents and structure of the dataset. Mixed data types were identified in some columns; to resolve this, special characters were removed so that the format of the data is consistent.
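The cleaning step described above can be illustrated with a minimal sketch. The text does not specify which special characters were removed or which columns were affected, so the regular expression, the helper name `clean_mixed_columns`, and the sample column names below are all assumptions made for illustration; they are not the CMS schema or the exact procedure used in the study.

```python
import pandas as pd


def clean_mixed_columns(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given columns stripped of
    non-numeric characters and converted to a numeric dtype."""
    cleaned = df.copy()
    for col in columns:
        # Cast everything to text first so strings and numbers are handled alike.
        as_text = cleaned[col].astype(str)
        # Assumed rule: keep only digits, the decimal point, and the minus sign.
        as_text = as_text.str.replace(r"[^0-9.\-]", "", regex=True)
        # Values that cannot be parsed (e.g. empty after stripping) become NaN.
        cleaned[col] = pd.to_numeric(as_text, errors="coerce")
    return cleaned


# Hypothetical sample rows; the column names are illustrative only.
sample = pd.DataFrame({
    "county": ["San Diego", "Orange", "Kern"],
    "number_of_users": ["1,785", 2310, "*"],
    "total_payment": ["$4,500.50", "12,000", 980.0],
})

cleaned = clean_mixed_columns(sample, ["number_of_users", "total_payment"])
print(cleaned.head())    # first few rows, as in the initial exploration
print(cleaned.dtypes)    # the cleaned columns are now numeric
```

Converting unparseable entries to NaN rather than dropping the rows keeps the row count intact, so missing values can be handled explicitly in a later step.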
4.1.1 Exploratory Data Analysis

Exploratory data analysis was conducted using both graphical and nongraphical methods. Univariate analysis involved visualizing key numeric variables through histograms to assess dispersion and central tendency. Multivariate analysis used correlation matrices to identify relationships between numeric variables.

The graphical representation (see Figure 1) illustrates the distribution of healthcare services across the United States, highlighting regions with higher market saturation and healthcare service utilization. Visualizing these data points helps identify patterns and areas with potential inefficiencies, driving the need to reallocate healthcare resources. Figure 1 serves as a tool for understanding geographical disparities and guiding decision-making in healthcare service provision. The map displays the number of users and the running sum of total payments for each county; the darker the blue shading, the higher the total payment for that county. This helps identify regions in the United States that may be susceptible to FWA due to their over-saturation or under-utilization of healthcare services. Such visualizations and insights are essential in supporting the project’s objectives of optimizing resource allocation, enhancing service delivery, and mitigating instances of FWA. For instance, the map focuses on San Diego County, which had 1,785 users and a running sum of $4,727,743,791,300 in total payments.
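The univariate and multivariate steps described in this section can be sketched as follows. The data here are synthetic and the column names are hypothetical stand-ins for the numeric CMS variables; the sketch computes histogram bin counts and a Pearson correlation matrix rather than reproducing the study's figures.

```python
import numpy as np
import pandas as pd

# Synthetic data standing in for the numeric CMS columns (names are hypothetical).
rng = np.random.default_rng(seed=0)
users = rng.integers(100, 5000, size=200)
eda_df = pd.DataFrame({
    "number_of_users": users,
    # Payments are constructed to scale with users, plus random noise.
    "total_payment": users * 250.0 + rng.normal(0, 5000, size=200),
    "number_of_providers": rng.integers(1, 50, size=200),
})

# Univariate: summary statistics (central tendency and dispersion)
# and the bin counts that a histogram of one variable would display.
summary = eda_df.describe()
counts, bin_edges = np.histogram(eda_df["number_of_users"], bins=10)

# Multivariate: pairwise Pearson correlations between numeric variables.
corr = eda_df.corr(method="pearson")

print(summary.loc[["mean", "std"]])
print(counts)
print(corr.round(2))
```

With matplotlib installed, `eda_df.hist(bins=10)` would draw the corresponding histograms directly; the bin counts are computed with NumPy here only to keep the sketch free of plotting dependencies.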