AAI_2025_Capstone_Chronicles_Combined


Evaluating Deep Learning Model Convergence in Chess via Nash Equilibria

Deep Blue and former World Champion Garry Kasparov in 1997 that computer chess programs could be considered to have superhuman playing strength. Modern chess programs far exceed human capabilities even on relatively small computational budgets. Chess programs that run natively on cell phones would easily defeat the strongest human players of today. But that is not to say that chess is a solved game. In fact, recent progress in Machine Learning and Deep Learning has ushered in a new era of chess engines that play stronger and more “human-like” chess. AlphaZero by DeepMind demonstrated that a Convolutional Neural Network can achieve superhuman playing strength exceeding contemporary engines with the aid of Monte-Carlo Tree Search and reinforcement learning (Silver et al., 2018). More recently, transformer architectures have been shown to achieve grandmaster strength in the absence of search (Monroe & Chalmers, 2024).

The Elo system was invented for the game of chess by the chess master Arpad Elo. Since its implementation in the 1960s and 1970s, the Elo system has been the normative metric for evaluating human player strength (Elo, 1978). The utility of the Elo rating model is realized when comparing two players: the difference in their Elo ratings implies an expected win rate for the higher-rated player. Moreover, Elo ratings are updated quickly after each game result via an adjustment proportional to the overperformance or underperformance of the participating players. Due to the widespread adoption of Elo in human chess, chess algorithms and papers measure the playing strength of their models via Elo. This Elo rating, which is sometimes computed via self-play, as with AlphaZero, or by actual testing against humans, is treated as a key indicator of the convergence of model training and improvements. However, the underlying assumptions of Elo do not account for the cyclical relationships between players in a metagame
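The two Elo mechanics described above, an expected result implied by the rating difference and an update proportional to over- or underperformance, can be sketched as follows. This is a minimal illustration that assumes the commonly used logistic form with a 400-point scale. The text does not state the formula, and the function names and the K-factor of 20 are illustrative assumptions rather than values taken from the paper. The expected score counts a draw as half a point, so it only approximates a pure win rate.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (win = 1, draw = 0.5, loss = 0).

    Assumes the standard logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is A's actual result: 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor; 20.0 is an illustrative assumption, not a value from the paper.
    """
    # The adjustment is proportional to how far A's result was from expectation.
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating in the pool is conserved.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # A 400-point favorite is expected to score about 0.91 per game.
    print(round(expected_score(1900, 1500), 3))
    # An upset win by the lower-rated player moves both ratings sharply.
    print(update_ratings(1500, 1900, score_a=1.0))
```

Because each player is summarized by a single number, any ratings assigned this way imply a transitive ordering: if A is favored over B and B over C, then A is necessarily favored over C.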
