M.S. Applied Data Science - Capstone Chronicles 2025


Figure 8 Sample Bias by State
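A state-level comparison of this kind can be sketched as follows. This is a minimal illustration, assuming bias is defined as a state's share of the sample minus its share of the reference population, in percentage points; the study's exact definition is not restated here. The function name `sampling_bias`, the state labels, and all counts are hypothetical placeholders, not SafeGraph or census figures.

```python
# Illustrative sketch only: the definition of bias and every number below
# are assumptions made for demonstration, not values from the study.

def sampling_bias(sample_counts, population_counts):
    """Return each state's sample share minus its population share,
    expressed in percentage points."""
    sample_total = sum(sample_counts.values())
    population_total = sum(population_counts.values())
    bias = {}
    for state, population in population_counts.items():
        # States missing from the sample are treated as having zero records.
        sample_share = sample_counts.get(state, 0) / sample_total
        population_share = population / population_total
        bias[state] = 100 * (sample_share - population_share)
    return bias


# Hypothetical placeholder counts for three made-up states.
sample_counts = {"A": 300, "B": 500, "C": 200}
population_counts = {"A": 4000, "B": 4000, "C": 2000}

bias_by_state = sampling_bias(sample_counts, population_counts)
for state, value in sorted(bias_by_state.items()):
    print(f"{state}: {value:+.1f} percentage points")
```

Under this definition, a positive value indicates a state that is over-represented in the sample and a negative value indicates under-representation. The same calculation would apply at the city or zip code level if reliable reference counts were available at that scale.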

The dataset offered substantial benefits, particularly its scale and level of preprocessing. However, several limitations were identified for future improvement. First, a location had to be recognized by SafeGraph as a branded location to be included, so small or locally owned businesses may have been omitted. In graph analysis, this may lower the measured centrality of communities served mostly by small businesses. This gap could be addressed through supplemental manual data collection in subsequent studies. Second, although geographic bias could be evaluated at the state level, the dataset's coverage relative to the true population remained difficult to assess at finer geographic scales, leaving a potentially unexamined gap between expected and actual representation at the city or zip code level. Lastly, while historical data extended back to 2019, this study focused on a single month due to technical resource constraints and practical
