AAI_2025_Capstone_Chronicles_Combined

Beyond timing experiments, TurbaNet was also tested in real-world machine learning applications. The first involved training a fully connected feedforward neural network on the MNIST dataset. A single PyTorch model was trained and compared against a swarm of TurbaNet models, measuring both training time and predictive performance. Each model trained for ten epochs with a batch size of eight on 33,600 images. The TurbaNet models were trained with a bootstrapping approach: each model learned from a resampled subset of the dataset, and the models' predictions were then averaged to form a final ensemble prediction.

Building on the "memory split" advantage discussed by Lobacheva et al. (2020), the experiments explored whether an ensemble of medium-sized networks could outperform a single large network with an equivalent total number of parameters. This aligns with findings suggesting that such ensembles can achieve better predictive performance and uncertainty estimation. The hypothesis was that while individual TurbaNet models would underperform the PyTorch model, the ensemble would achieve superior accuracy through the combined predictive power of multiple networks. To evaluate this, both the individual TurbaNet models and the ensemble were compared against the PyTorch model on training time, and confusion matrices were constructed to assess classification accuracy.

The TurbaNet implementation consisted of 50 networks, each with 784 input nodes, 64 hidden nodes in the first layer, 32 hidden nodes in the second layer, and 10 output nodes. ReLU activations were applied to the hidden layers, while no activation was used on the output layer. The models were optimized with the Adam optimizer at a learning rate of 5e-5, and classification loss was computed using cross-entropy with log-softmax applied.

The second real-world application involved stock price prediction using an LSTM-based model.
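For concreteness, the single-network MNIST baseline described above can be sketched in PyTorch as follows. This is an illustrative reconstruction from the reported hyperparameters, not the capstone's actual code, and the TurbaNet swarm API is not reproduced here; the `train_step` helper and its variable names are assumptions.

```python
import torch
import torch.nn as nn

# Architecture matching the reported configuration:
# 784 -> 64 -> 32 -> 10, ReLU on hidden layers, no output activation.
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 10),
)

# Adam with the reported learning rate of 5e-5.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)

# CrossEntropyLoss applies log-softmax internally, matching the reported
# "cross-entropy loss with log-softmax applied".
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One optimization step on a batch of flattened 28x28 images.
    (Illustrative helper; not from the original report.)"""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under the bootstrapping scheme, each of the 50 swarm members would see its own resampled subset of the 33,600 images, and the ensemble prediction would be obtained by averaging the members' outputs before taking the argmax.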
A PyTorch LSTM model was trained and compared against a swarm of TurbaNet LSTM networks, with each swarm model assigned to predict the price movement of a single stock. The dataset covered 100 stocks, and each model trained for 500 epochs with a batch size of 32. Training sequences were constructed from each stock's closing prices over a 20-day window, forming the time-series inputs for prediction.
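The 20-day time-series inputs can be built with a simple sliding window over each stock's closing prices. The sketch below assumes a next-day target, a common convention that the report does not explicitly state; the function name and example data are illustrative.

```python
import numpy as np

def make_sequences(prices, window=20):
    """Turn a 1-D series of closing prices into (input, target) pairs.

    Each input is `window` consecutive closing prices; the target is
    taken here as the following day's close (an assumed convention).
    """
    inputs = np.stack([prices[i:i + window]
                       for i in range(len(prices) - window)])
    targets = prices[window:]
    return inputs, targets

# Example: 25 days of prices yields 5 sequences of length 20.
prices = np.linspace(100.0, 124.0, num=25)
X, y = make_sequences(prices)
```

Batched 32 sequences at a time, windows like these supply the time dimension an LSTM consumes during the 500 training epochs described above.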
