AAI_2025_Capstone_Chronicles_Combined
The results become particularly interesting when the runtimes of the two training sessions are compared: PyTorch takes almost 24x longer than TurbaNet to train the swarm. Because this experiment did not appear to stress the limits of the GPU, the approach could likely be extended to train on even more stocks at once.
Table 5.2: Model Performance Metrics for Time-Series Stock Data

Model      Training Time (s)   Time per Stock (s)
PyTorch    557.57              5.5757
Turba      23.39               0.2339
6.) Conclusion
The precise threshold at which TurbaNet encounters performance limitations, and what constitutes a "large" network, is left intentionally undefined, as it varies with the available hardware. Future experiments could explore the impact of GPUs with more VRAM or higher memory bandwidth to determine which factors matter most. The parametric sweep clearly demonstrated that TurbaNet can outperform state-of-the-art libraries like PyTorch when training large numbers of relatively small models. Applying TurbaNet to real-world problems yielded two key observations: it can improve model accuracy through common bootstrapping/ensemble approaches, perhaps at the cost of increased training time, and it can drastically reduce runtime when a problem requires many independent models trained on different datasets. TurbaNet enables full utilization of accelerator hardware even for problems involving many small models, where traditional frameworks may underutilize GPU resources. The effectiveness of this methodology should scale with hardware advances in memory and compute, so there is good reason to believe it will only become more effective as GPUs and other accelerators advance technologically.
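The core mechanism behind this kind of swarm training can be sketched with jax.vmap, which fuses many independent small-model updates into single batched GPU operations instead of looping over models in Python. This is an illustrative sketch under that assumption, not TurbaNet's actual API; all function and parameter names here are hypothetical.

```python
# Illustrative sketch of vmap-based swarm training (names are hypothetical,
# not TurbaNet's API): each model has its own parameters and its own
# dataset (e.g. one stock's time series), and jax.vmap batches the
# per-model SGD step across the whole swarm in one fused launch.
import jax
import jax.numpy as jnp

def init_params(key, in_dim=8, hidden=16, out_dim=1):
    # One small two-layer network per swarm member.
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (in_dim, hidden)) * 0.1,
        "w2": jax.random.normal(k2, (hidden, out_dim)) * 0.1,
    }

def predict(params, x):
    return jnp.tanh(x @ params["w1"]) @ params["w2"]

def loss(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

def sgd_step(params, x, y, lr=1e-2):
    # Plain gradient-descent update for a single model.
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

n_models = 100  # the "swarm": one model per dataset
keys = jax.random.split(jax.random.PRNGKey(0), n_models)
swarm = jax.vmap(init_params)(keys)  # stacked parameters, leading axis = model

# Synthetic stand-in data: one (64, 8) dataset per model.
xk, yk = jax.random.split(jax.random.PRNGKey(1))
xs = jax.random.normal(xk, (n_models, 64, 8))
ys = jax.random.normal(yk, (n_models, 64, 1))

# vmap maps sgd_step over the model axis of params and data; jit compiles
# the whole batched update so all 100 models train in one kernel launch.
batched_step = jax.jit(jax.vmap(sgd_step))
for _ in range(20):
    swarm = batched_step(swarm, xs, ys)

per_model_loss = jax.vmap(loss)(swarm, xs, ys)  # shape (n_models,)
```

The same structure explains the runtime gap in Table 5.2: a per-model Python loop pays framework overhead once per model per step, while the batched form pays it once per step for the entire swarm.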