M.S. Applied Data Science - Capstone Chronicles 2025
considerations. Incorporating additional months in future analyses would further strengthen the robustness and generalizability of the findings.

Bias, Ethics, and Privacy

The data used in this study were produced by SafeGraph and made available through Dewey. Before any internal data preparation took place, SafeGraph had ensured privacy compliance through multiple differential privacy techniques. The dataset did not contain individual-level information; instead, it included only aggregated demographic attributes associated with specific points of interest. SafeGraph applied these privacy protections to the bucket_customer_income and customer_home_city fields to prevent individuals from being reidentified through reverse engineering. First, Laplacian noise was added to these columns. In addition, SafeGraph withheld data for any demographic attribute (e.g., city or income) unless at least two customers were observed in that group. Finally, SafeGraph enforced a minimum threshold of four panelists for any city to appear in customer_home_city, ensuring that individuals from rare or sparsely represented locations could not be uniquely identified (SafeGraph, 2022). Although these privacy-preserving features were not directly manipulated in this research, they ensured that any downstream applications of the study remained compliant with privacy standards and did not pose a risk to citizen confidentiality.

Modeling

In the modeling phase of our study, we constructed a directed graph of businesses and its edges using Python’s networkx module. We then applied the Louvain and Leiden community detection algorithms to this business network. We also applied geographic filtering and custom scoring algorithms in order to identify potential strategic business partnerships.
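As a rough illustration of the modeling steps described above, the following minimal sketch builds a small weighted directed graph with networkx and runs Louvain community detection on it. The business names, edge weights, and the helper functions build_business_graph and detect_communities are hypothetical and are not taken from the study's data or code. The Leiden algorithm is omitted here because it typically relies on a separate package; the geographic filtering and custom scoring steps are likewise not shown.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical (source, target, weight) records. Each tuple stands in for
# an aggregated customer flow from one business to another; the values are
# invented for illustration only.
edge_records = [
    ("cafe", "bakery", 5.0),
    ("bakery", "cafe", 4.0),
    ("bakery", "deli", 6.0),
    ("deli", "cafe", 5.0),
    ("gym", "juice_bar", 7.0),
    ("juice_bar", "spa", 5.0),
    ("spa", "gym", 6.0),
    ("deli", "gym", 1.0),  # weak link between the two clusters
]


def build_business_graph(records):
    """Build a weighted directed graph from (source, target, weight) tuples."""
    graph = nx.DiGraph()
    graph.add_weighted_edges_from(records)
    return graph


def detect_communities(graph, seed=42):
    """Run Louvain community detection; returns a list of node sets."""
    return louvain_communities(graph, weight="weight", seed=seed)


graph = build_business_graph(edge_records)
communities = detect_communities(graph)

# Map each business to the index of the community it was assigned to.
membership = {
    node: index
    for index, community in enumerate(communities)
    for node in community
}

for index, community in enumerate(communities):
    print(index, sorted(community))
```

Fixing the seed makes the partition reproducible across runs, which can help when comparing Louvain output against another algorithm such as Leiden on the same graph.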