AAI_2025_Capstone_Chronicles_Combined
Evaluating Deep Learning Model Convergence in Chess via Nash Equilibria
Results
Figure 6: The model reached upwards of 70% weighted accuracy on the training data after training on approximately 1.5 million games. The first image, on the left, shows the first phase of training at a learning rate of 5e-4 with Nadam. The second phase of training (learning rate 1e-5) did not appear to increase the accuracy of the model, despite the reduced learning rate and an additional 2.6 million gradient steps.

The model's performance on the test set indicates that little real generalization is taking place. In fact, all metrics at gradient step 100k (the first epoch) and at gradient step 4.8M (the last epoch) are the same or very similar (Figure 7).

However, the test set does not tell the full story. Chess is a game with an extremely large state space: the number of legal positions is estimated at roughly 10^44, and the number of possible games is commonly estimated to exceed the number of atoms in the observable universe. As such, the tiny fraction of positions found in the training set, which come from games between strong human players, does not cover the breadth of positions that a chess engine will encounter. The training data lacks “easy” positions that should be elementary to classify.

To further understand these results and uncover deeper performance dynamics, we conducted a round-robin tournament between model snapshots. Because the ResNet classifies each position as a win for the hero, a draw, or a loss for the hero, a basic depth-1 search was used to construct a policy over the legal action space.
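The depth-1 search can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that the classifier returns (win, draw, loss) probabilities for the side to move, that a move is scored as the opponent's loss probability plus half the draw probability in the resulting position, and that scores are turned into a policy with a softmax. The names `depth1_policy`, `legal_moves`, `apply_move`, `classify`, and `temperature` are all hypothetical.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, Tuple, TypeVar

P = TypeVar("P")                  # position type (hypothetical placeholder)
M = TypeVar("M", bound=Hashable)  # move type (hypothetical placeholder)
WDL = Tuple[float, float, float]  # (p_win, p_draw, p_loss) for the side to move


def depth1_policy(
    position: P,
    legal_moves: Callable[[P], Iterable[M]],
    apply_move: Callable[[P, M], P],
    classify: Callable[[P], WDL],
    temperature: float = 0.1,
) -> Dict[M, float]:
    """Build a move distribution by classifying every position one ply ahead."""
    scores: Dict[M, float] = {}
    for move in legal_moves(position):
        child = apply_move(position, move)
        # Assumption: the child is classified from the opponent's point of
        # view, so the opponent's loss counts as a win for the mover.
        _p_win, p_draw, p_loss = classify(child)
        scores[move] = p_loss + 0.5 * p_draw

    if not scores:  # no legal moves (checkmate or stalemate)
        return {}

    best = max(scores.values())
    if temperature <= 0:
        # Greedy policy: split probability evenly among the top-scoring moves.
        winners = [m for m, s in scores.items() if s == best]
        return {m: (1.0 / len(winners) if m in winners else 0.0) for m in scores}

    # Softmax over scores, shifted by the maximum for numerical stability.
    weights = {m: math.exp((s - best) / temperature) for m, s in scores.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Toy usage: two moves whose resulting "positions" are just labels, with
# made-up classifier outputs given from the opponent's point of view.
toy_outputs = {"a": (0.7, 0.2, 0.1), "b": (0.1, 0.2, 0.7)}
policy = depth1_policy(
    "root",
    legal_moves=lambda pos: ["a", "b"],
    apply_move=lambda pos, move: move,
    classify=lambda child: toy_outputs[child],
)
print(policy)
```

In this toy case move "b" receives most of the probability mass, because it leads to a position the classifier rates as likely lost for the opponent. Passing the game logic in as callables keeps the sketch independent of any particular chess library or network interface.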