M.S. Applied Data Science - Capstone Chronicles 2025


Community Detection

Network analysis began by computing average local clustering coefficients (estimated via random sampling) and identifying isolated nodes. We then analyzed the degree distribution (i.e., in-degree vs. out-degree scatter plots and histograms) to find the most influential nodes. Next, we performed community detection using both the Louvain and Leiden algorithms (via igraph/leidenalg) and evaluated the quality of their partitions using modularity scores. The node of interest (i.e., the target location) is then passed to a custom scoring algorithm, in which a geospatial component filters nodes within a radius of the target location and returns the top N recommended nodes (i.e., businesses) ranked by total score, as shown in the following formula. This scoring function weighs features such as cross-shopping edge weight, bidirectionality between the node of interest and the candidate node, customer volume similarity, physical proximity, and community membership.

$$\text{Score} = 0.4\,\text{CSW} + 0.2\,\text{B} + 0.15\,\text{Sim} + 0.1\,\text{Prox} + w_{\text{Comm}}\,\text{Comm} \qquad (2)$$

● CSW = cross_shop_weight
● B = bidirectional
● Sim = customer_similarity
● Prox = proximity_score
● Comm = same_community (its weight, w_Comm, is not stated in the formula)
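The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the `Candidate` fields, the haversine distance, the linear proximity decay, and the default radius are all hypothetical choices. The formula lists four coefficients for five features, so the weight on `same_community` used here (0.15, chosen only so the weights sum to 1) is a placeholder assumption.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

# Coefficients from Equation (2). The "comm" weight is NOT given in the text;
# 0.15 is a placeholder assumption so that the weights sum to 1.0.
WEIGHTS = {"csw": 0.4, "bidir": 0.2, "sim": 0.15, "prox": 0.1, "comm": 0.15}


@dataclass
class Candidate:
    """Hypothetical record for one candidate business (node)."""
    name: str
    lat: float
    lon: float
    cross_shop_weight: float    # assumed to be normalized to [0, 1]
    bidirectional: bool         # edges exist in both directions with the target
    customer_similarity: float  # assumed to be normalized to [0, 1]
    community: int              # community label from Louvain or Leiden


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def recommend(target_lat, target_lon, target_community, candidates,
              radius_km=5.0, top_n=3, weights=WEIGHTS):
    """Filter candidates by radius, score them, and return the top N."""
    scored = []
    for c in candidates:
        dist = haversine_km(target_lat, target_lon, c.lat, c.lon)
        if dist > radius_km:
            continue  # geospatial filter: drop nodes outside the radius
        prox = 1.0 - dist / radius_km  # assumed linear decay: 1 at target, 0 at edge
        score = (
            weights["csw"] * c.cross_shop_weight
            + weights["bidir"] * float(c.bidirectional)
            + weights["sim"] * c.customer_similarity
            + weights["prox"] * prox
            + weights["comm"] * float(c.community == target_community)
        )
        scored.append((round(score, 4), c.name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Made-up example data: two nearby businesses and one outside the radius.
candidates = [
    Candidate("Cafe A", 32.7160, -117.1610, 0.9, True, 0.8, 1),
    Candidate("Gym B", 32.7200, -117.1700, 0.5, False, 0.6, 2),
    Candidate("Far Store C", 33.5000, -117.1610, 1.0, True, 1.0, 1),
]
top = recommend(32.7157, -117.1611, target_community=1, candidates=candidates)
for score, name in top:
    print(f"{name}: {score}")
```

Passing the weights as a parameter keeps the coefficients easy to adjust once the actual value for the community term is known. The radius filter runs before scoring, so a distant business is never recommended even if its network features are strong.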

These recommended nodes are then visualized on an interactive Folium map, demonstrating how a business can combine network structure with geographic proximity to identify potential brand partnerships.

