M.S. Applied Data Science - Capstone Chronicles 2025
Figure 6 Scatterplot: FRPM vs Graduation Rate
A correlation heatmap was created to examine the relationships among the numerical features, as shown in Figure 7. The highest correlation was between avg_safety_score and school_climate_index (r = 1.00), indicating that the two indicators carry essentially the same information. pct_inexperienced and pct_experience were perfectly negatively correlated (r = -1.00); both describe teacher experience, and the result reflects how CDE reports staff categories as mutually exclusive proportions. Enrollment measures such as cohortstudents and eligible_cumulative_enrollment also showed a strong positive correlation (r = 0.96). Graduation and dropout rates had a strong negative correlation (r = -0.91), confirming internal consistency in the dataset. Safety-related indicators, including pct_safe_gr11, were strongly positively correlated with climate scores (r = 0.81), while pct_unsafe_gr11 was strongly negatively correlated with pct_safe_gr11 (r = -0.83). The heatmap helped identify redundancies, clarify overlapping constructs, and guide the selection of variables for modeling.
Figure 7 Correlation Heatmap