M.S. Applied Data Science - Capstone Chronicles 2025


appropriate for the modeling pipeline used in this study.

4.2.2 Data Quality Issues and Limitations

Two major data quality issues emerged during validation of the ACS tract-level tables and the Illinois PUMS sample. First, rural southern Illinois tracts showed substantially higher sampling uncertainty. For example, several tracts in Alexander, Pulaski, and Johnson counties had disability estimates with margins of error exceeding 30 percent of the estimate, compared with less than 10 percent in Chicago-area tracts. Income measures displayed the same pattern: some rural tracts had WAGP MOE values above $8,000, while urban tracts averaged closer to $2,000. These high-variance tracts introduce noise into regression coefficients, especially for disability effects. Second, ACS five-year estimates smooth away year-specific fluctuations, so sharp changes, such as post-COVID employment shocks, are not visible. As a result, the findings represent structural, multi-year patterns rather than annual trends. These limitations were logged in the analysis workflow and flagged in the modeling stage to ensure cautious interpretation and additional robustness checks.
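The reliability comparison in this subsection (margins of error above 30 percent of the estimate in some rural tracts versus under 10 percent in Chicago-area tracts) amounts to a relative margin-of-error check. The following is a minimal sketch of such a check, not the study's actual validation code: the `TractEstimate` record, the function names, and the sample values are hypothetical and chosen only for illustration, and the 0.30 default simply mirrors the 30 percent figure quoted in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TractEstimate:
    # Hypothetical record layout; field names are assumptions, not ACS column names.
    geoid: str       # tract identifier
    estimate: float  # ACS point estimate (e.g., count of residents with a disability)
    moe: float       # published margin of error for that estimate


def relative_moe(row: TractEstimate) -> Optional[float]:
    """Return the MOE as a share of the estimate, or None if the estimate is zero."""
    if row.estimate == 0:
        return None
    return row.moe / abs(row.estimate)


def flag_high_uncertainty(rows: List[TractEstimate], threshold: float = 0.30) -> List[str]:
    """List tract identifiers whose relative MOE exceeds the threshold.

    Tracts with a zero estimate are flagged too, since their ratio is undefined.
    """
    flagged = []
    for row in rows:
        ratio = relative_moe(row)
        if ratio is None or ratio > threshold:
            flagged.append(row.geoid)
    return flagged


# Invented values for illustration only; these are not figures from the study.
sample = [
    TractEstimate("tract_a", estimate=200, moe=70),
    TractEstimate("tract_b", estimate=1000, moe=80),
    TractEstimate("tract_c", estimate=0, moe=15),
]
print(flag_high_uncertainty(sample))
```

A list of flagged tracts like this could feed a robustness check, for example refitting the regression with and without the high-uncertainty tracts and comparing the coefficients.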

Figure 1 Total income by educational attainment and disability status.

This boxplot illustrates clear income disparities between disabled and non-disabled adults at every education level. Even as educational attainment increases, income for disabled individuals remains consistently lower, reflecting structural barriers that limit the economic returns of education for this group.
