AAI_2025_Capstone_Chronicles_Combined
Evaluating Deep Learning Model Convergence in Chess via Nash Equilibria
Figure 7: Loss, accuracy, and F1-score weighted by importance on the test set of 64,000 unseen chess positions. The ResNet, despite showing steady progress during training, ends up where it started in terms of test-set performance. The second phase of training, starting at epoch 23, shows a brief divergence in all metrics before returning to the average baseline performance.

A basic expected score was computed for each child position from its win probability and draw probability, and the position with the best score for the hero of the root position was selected. In the round-robin tournament, each model played a 10-game match against every other model. Each game began from one of 5 starting positions derived from the most commonly played openings, so that the deterministic policy from depth-1 search would encounter distinct starting positions in each match. Each model snapshot played both sides of the