M.S. Applied Data Science - Capstone Chronicles 2025
Community Detection
Network analysis includes computing average local clustering coefficients with random sampling and identifying isolated nodes, followed by degree distribution analysis (i.e., in-degree vs. out-degree scatter plots and histograms) to find the most influential nodes. We then perform community detection using both the Louvain and Leiden algorithms (via igraph/leidenalg) and evaluate the quality of their output using modularity scores. The node of interest (i.e., the target location) is then passed to a custom scoring algorithm, in which a geospatial component filters nodes within a radius of the target location and returns the top N recommended nodes (i.e., businesses) ranked by total score, shown in the following formula. This scoring function weighs features such as edge cross-shopping weight, bidirectionality between the node of interest and the candidate node, customer volume similarity, physical proximity score, and community membership.

$$\text{Score} = 0.4\,\text{CSW} + 0.2\,\text{B} + 0.15\,\text{Sim} + 0.1\,\text{Prox} + \dots \tag{2}$$

● CSW = cross_shop_weight
● B = bidirectional
● Sim = customer_similarity
● Prox = proximity_score
● Comm = same_community
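The radius filter and weighted score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation. The `Candidate` record, the haversine distance, and the linear proximity decay are hypothetical choices. The four coefficients are assumed to apply to the first four features in the order they are listed. The community weight is left as a required `comm_weight` argument rather than assumed, and the sample businesses and coordinates are invented.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Assumed mapping: the four coefficients of Equation (2), applied to the
# features in the order they are listed. The same_community weight is
# deliberately not assumed here; the caller supplies it as comm_weight.
WEIGHTS = {"csw": 0.4, "bidirectional": 0.2, "similarity": 0.15, "proximity": 0.1}


@dataclass
class Candidate:
    """Hypothetical record for one business near the target location."""
    name: str
    lat: float
    lon: float
    cross_shop_weight: float    # assumed normalized to [0, 1]
    bidirectional: bool         # edges exist in both directions
    customer_similarity: float  # assumed normalized to [0, 1]
    same_community: bool        # shares the target's detected community


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def recommend(target_lat, target_lon, candidates, radius_km, comm_weight, top_n=5):
    """Return the top_n (name, score) pairs within radius_km of the target."""
    scored = []
    for c in candidates:
        dist = haversine_km(target_lat, target_lon, c.lat, c.lon)
        if dist > radius_km:
            continue  # geospatial filter: drop nodes outside the radius
        # Assumed linear decay: 1.0 at the target, 0.0 at the radius edge.
        proximity = 1.0 - dist / radius_km
        score = (
            WEIGHTS["csw"] * c.cross_shop_weight
            + WEIGHTS["bidirectional"] * float(c.bidirectional)
            + WEIGHTS["similarity"] * c.customer_similarity
            + WEIGHTS["proximity"] * proximity
            + comm_weight * float(c.same_community)
        )
        scored.append((c.name, round(score, 4)))
    # Highest total score first, truncated to the requested top N.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Invented sample data; comm_weight=0.15 is an arbitrary illustrative value.
    sample = [
        Candidate("Cafe A", 32.7160, -117.1610, 0.8, True, 0.6, True),
        Candidate("Gym B", 32.7200, -117.1700, 0.3, False, 0.9, False),
        Candidate("Shop C", 33.5000, -117.1600, 0.9, True, 0.9, True),
    ]
    print(recommend(32.7157, -117.1611, sample, radius_km=5.0, comm_weight=0.15))
```

Filtering by distance before scoring means a business outside the radius is never ranked, however strong its cross-shopping link. The returned (name, score) pairs, joined with each business's coordinates, could then be drawn as map markers.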
These recommended nodes are then visualized on an interactive Folium map, demonstrating how a business can combine network structure with geographic proximity to identify potential brand partnerships.