M.S. Applied Data Science - Capstone Chronicles 2025
hypothesis of a few central hubs that connect many other businesses. The out-degree distribution appeared closer to normal, indicating a more consistent dispersal of spend across other businesses. This more normal shape may also be partly a result of the cross-shopping threshold explained in Section 4.4. See Figure 6 for visualizations of these distributions.

Businesses by Influence. The PageRank algorithm, originally used by Google Search to rank website importance, was repurposed in this study to identify the most influential business hubs in the San Diego commercial ecosystem. Using the weighted, directed edges of the physical cross-shopping graph, along with a damping factor of α = 0.85, a maximum of 50 iterations, and an error tolerance of 1×10⁻⁵, we derived scores representing each business’s approximate market influence. In this context, PageRank interprets the weights of directed edges as votes of confidence, wherein a vote originating from a highly ranked source node contributes more heavily to the target node’s score. Higher scores therefore indicate greater overall influence in the network (GeeksforGeeks, 2025).

To quantify the overall market influence of a single brand (e.g., McDonald’s), PageRank scores for individual store locations were averaged by brand. Figure 7 displays the ten highest-ranked brands and their corresponding PageRank values, which designate them as central hubs of consumer activity. These top-ranked brands represent strong partnership candidates because their high influence scores suggest that they serve as major conduits of customer traffic throughout the network. We found that, among the top ten influential brands, major retailers such as Walmart, Target, Costco, CVS, and Vons play a dominant role as hubs of consumer traffic. However, we