M.S. Applied Data Science - Capstone Chronicles 2025
demonstrated severe multicollinearity, with VIF values of 10.49 and 11.08, respectively, and a correlation of r = .913 (p < .001). This correlation is theoretically expected, as npxG represents a subset of xG. Additionally, take-ons success and take-ons attempted showed moderate multicollinearity (r = .851; VIF = 5.97 and 6.23, respectively), reflecting the inherent relationship between attempt frequency and success rate. Following established practice (Wooldridge, 2015), variables with VIF > 10 were flagged for removal; of the flagged pair, npxG was removed and expected xG was retained to ensure model stability.
Table 5
Forward - VIF Analysis and Multicollinearity Assessment by Position
4.1.1.4 Position-Specific Performance Comparisons
Spider chart visualizations (see Figures 7-10) provide multidimensional performance comparisons between elite players within each tactical position, using the same metrics analyzed in the correlation matrices. Performance scores are normalized to a 0-100 scale for cross-metric comparability.
Forward position analysis revealed Kylian Mbappé's dominance (Score: 21.5) over Vinícius Júnior across offensive metrics, particularly in goal-scoring efficiency and xG generation. The spider chart illustrates Mbappé's superior finishing capabilities while highlighting comparable dribbling statistics between both players.
Figure 7
Forward Position Performance Comparison: Mbappé vs. Vinícius Júnior
Variable              VIF      Pearson r*   Status
Goals                 2.6                   Acceptable
Assists               1.65                  Acceptable
Shots                 6.01                  Moderate
Shots on target       4.81                  Acceptable
Expected xG           10.49    .913         Severe
Expected npxG         11.08    .913         Severe
Expected xAG          2.04                  Acceptable
Take-ons success      5.97     .851         Moderate
Take-ons attempted    6.23     .851         Moderate
Note. VIF > 10 indicates severe multicollinearity; 5 < VIF < 10 indicates moderate multicollinearity. Sample sizes: forwards (n = 1,695), midfielders (n = 2,029), defenders (n = 1,900), goalkeepers (n = 396).
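The VIF screening summarized in the table and note can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names (`pearson_matrix`, `invert`, `vif`, `status`) and the toy columns are hypothetical, and the sketch relies on the standard identity that VIF_j = 1 / (1 - R_j^2) equals the j-th diagonal entry of the inverse correlation matrix of the predictors. The note does not say how a VIF of exactly 5 or 10 is classified, so the boundary handling below is an assumption.

```python
import math


def pearson_matrix(columns):
    """Return the Pearson correlation matrix for equal-length columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [
        list(row) + [float(i == j) for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF per predictor, read off the diagonal of the inverse correlation matrix."""
    inverse = invert(pearson_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def status(value):
    """Cutoffs from the table note; treatment of exactly 5 or 10 is assumed."""
    if value > 10:
        return "Severe"
    if value > 5:
        return "Moderate"
    return "Acceptable"


# Hypothetical toy columns, not the study's data.
xg = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9]
npxg = [0.1, 0.3, 0.3, 0.7, 0.5, 0.8]
assists = [2, 0, 1, 0, 3, 1]
for name, value in zip(["xG", "npxG", "assists"], vif([xg, npxg, assists])):
    print(f"{name}: VIF = {value:.2f} ({status(value)})")
```

With only two predictors the VIF reduces to 1 / (1 - r^2), so r = .913 alone would give a VIF of roughly 6. The larger values reported for xG and npxG are consistent with both variables also being partly explained by the other predictors in the model.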
Midfielder comparisons demonstrated marginal differences between Jude Bellingham (Score: 31.8) and Luka Modrić (Score: 30.7), with Bellingham exhibiting higher defensive contributions (Tkl: 18.2 vs. 9.6) while maintaining similar creative output. Both players show exceptional passing (P-Cmp%) accuracy