AAI_2025_Capstone_Chronicles_Combined

linearly, TurbaNet demonstrates non-linear scaling behavior. Around a swarm size of 60 the GPU appears to saturate, after which runtime begins to increase, though still far more slowly than under PyTorch (roughly 50x slower growth).

Figure 5.2 GPU Performance Comparison for Large Networks

5.1.2) CPU Swarm Sweep Results

The same sweep was repeated with the computation forced onto the CPU. The networks had the same architectures as described in the GPU section, and all other training parameters were identical. On the CPU, TurbaNet's benefits are less pronounced, since vectorization on CPUs is inherently limited compared to GPUs: while CPUs support SIMD instructions, they execute them at a far smaller scale than GPU parallelism (Yi, 2024).
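TurbaNet's internals are not reproduced here, but the vectorized-swarm idea behind the sweep can be sketched with `jax.vmap`, which batches one training step across an entire swarm of small networks. The MLP sizes, learning rate, and swarm size below are illustrative placeholders, not the report's actual parameters; the platform override is how the CPU-only run can be forced in JAX:

```python
import jax
import jax.numpy as jnp

# Force computation onto the CPU, as in the CPU sweep described above.
jax.config.update("jax_platform_name", "cpu")

def init_params(key, sizes):
    """Initialize one small MLP's weights and biases."""
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        key, wk = jax.random.split(key)
        params.append((jax.random.normal(wk, (fan_in, fan_out)) * 0.1,
                       jnp.zeros(fan_out)))
    return params

def forward(params, x):
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def loss(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

def step(params, x, y, lr=1e-2):
    """One SGD step for a single network."""
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

swarm_size = 8  # hypothetical; the sweep varies this parameter
keys = jax.random.split(jax.random.PRNGKey(0), swarm_size)
# vmap over the key axis builds a stacked pytree: one set of
# parameters per swarm member, sharing a single memory layout.
swarm = jax.vmap(lambda k: init_params(k, [4, 16, 1]))(keys)

x = jnp.ones((32, 4))   # dummy shared training batch
y = jnp.zeros((32, 1))

# vmap over the swarm axis of the parameters; the data is shared,
# so one jitted call advances every network in the swarm at once.
swarm_step = jax.jit(jax.vmap(step, in_axes=(0, None, None)))
swarm = swarm_step(swarm, x, y)
```

Because the whole swarm update compiles to a single batched kernel, the per-network overhead stays nearly flat until the device saturates, which is the behavior the GPU sweep exhibits and which SIMD-limited CPUs reproduce only weakly.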

5.1.2.1) Small Network
