AAI_2025_Capstone_Chronicles_Combined
Evaluating Deep Learning Model Convergence in Chess via Nash Equilibria
(Balduzzi, et al. 2020). This distinction matters because Deep Learning architectures are known to undergo catastrophic forgetting (Kemker, et al. 2018) and are susceptible to adversarial attacks (Szegedy, et al. 2014). To monitor the convergence of Deep Learning architectures more thoroughly as they progress through training, this paper proposes testing models against prior, less-trained iterations via self-play and computing the Nash Equilibrium strategy over the resulting win-rate matrix. By recomputing the Nash Equilibrium over the set of model snapshots produced during training, we can monitor the stability, convergence, and possible points of divergence of model playing strength throughout training. Beyond its advantages over Elo as an evaluation metric, the Nash Equilibrium strategy is convenient to compute: its only input is a win-rate matrix over the model population. This win-rate matrix can be updated in parallel with model training, so the Nash Equilibrium can serve as a real-time monitoring method. DeepMind's AlphaStar applies a similar idea, sampling from several "leagues" of agents to create a robust StarCraft agent (Vinyals, et al. 2019).

Exploratory Data Analysis

The dataset for this project is a very large, compressed PGN database of expert-level chess games played in 2024. In its raw form, the dataset consists of .pgn files (plain text) that use standard algebraic chess notation to record the moves of each game, the result, and metadata about the game itself, including the rating of each player. The database contains roughly 10 million games played in 2024.
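To make the proposed evaluation concrete, the Nash Equilibrium strategy over a win-rate matrix can be computed by solving the maximin linear program of the corresponding zero-sum game. The sketch below is illustrative, not the paper's implementation: the `nash_mixture` helper, the toy 3-snapshot win-rate matrix, and the choice of SciPy's `linprog` as the solver are all assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def nash_mixture(win_rates):
    """Maximin mixed strategy over model snapshots.

    win_rates[i][j] = empirical probability that snapshot i beats snapshot j.
    Subtracting 0.5 turns win rates into a zero-sum payoff matrix.
    """
    A = np.asarray(win_rates, dtype=float) - 0.5
    n = A.shape[0]
    # Variables: [p_1 .. p_n, v]; maximize the game value v => minimize -v.
    c = np.zeros(n + 1)
    c[-1] = -1.0
    # For every opposing pure strategy j: p @ A[:, j] >= v, i.e. -A.T p + v <= 0.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to 1 (v is excluded from the equality).
    A_eq = np.ones((1, n + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0, 1)] * n + [(None, None)]  # v is a free variable
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:n]

# Toy example: the latest snapshot beats both earlier ones, so the
# Nash strategy should place all of its mass on index 2.
W = np.array([[0.5, 0.4, 0.3],
              [0.6, 0.5, 0.2],
              [0.7, 0.8, 0.5]])
p = nash_mixture(W)
```

Because the win-rate matrix here is symmetric (W + W.T = 1), the game value is always zero; the informative output is the support of the mixture, i.e. which snapshots the equilibrium still assigns weight to as training progresses.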
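The PGN records described above can be loaded with an off-the-shelf parser; a minimal sketch using the python-chess library is shown below. The game text is a fabricated example in the same format as the database entries, not an actual record from the dataset.

```python
import io

import chess.pgn

# A minimal PGN record of the kind found in the database (hypothetical game).
pgn_text = """[Event "Rated Blitz game"]
[Result "1-0"]
[WhiteElo "2450"]
[BlackElo "2380"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 1-0
"""

# read_game consumes one game from a file-like object, so a large .pgn file
# containing many games can be streamed by calling it in a loop.
game = chess.pgn.read_game(io.StringIO(pgn_text))

result = game.headers["Result"]            # game outcome, e.g. "1-0"
white_elo = int(game.headers["WhiteElo"])  # player rating from the metadata
moves = [move.uci() for move in game.mainline_moves()]
```

Streaming games one at a time this way avoids loading the full multi-gigabyte database into memory.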