AAI_2025_Capstone_Chronicles_Combined
Figure 9
Performance across NSynth’s timbral quality labels
Figure 10
Feature reconstruction results further highlight the model’s understanding of timbre, with a mean RMSE of 0.814 across the 30 feature dimensions.
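As a rough illustration of how a figure like the mean RMSE above could be computed, the sketch below takes one RMSE per feature dimension and then averages them. This is an assumption about the aggregation order, since the text does not say whether errors were pooled before or after the square root. The function names and the toy vectors are hypothetical and are not taken from the project code or its data.

```python
import math


def per_dimension_rmse(targets, predictions):
    """Return one RMSE value per feature dimension.

    targets, predictions: equal-length lists of equal-length feature vectors.
    """
    if not targets or len(targets) != len(predictions):
        raise ValueError("need the same non-zero number of target and predicted vectors")
    n_dims = len(targets[0])
    squared_error_sums = [0.0] * n_dims
    for target_vec, predicted_vec in zip(targets, predictions):
        if len(target_vec) != n_dims or len(predicted_vec) != n_dims:
            raise ValueError("all vectors must have the same number of dimensions")
        for d in range(n_dims):
            diff = predicted_vec[d] - target_vec[d]
            squared_error_sums[d] += diff * diff
    # Mean squared error per dimension, then the square root.
    return [math.sqrt(total / len(targets)) for total in squared_error_sums]


def mean_rmse(targets, predictions):
    """Average the per-dimension RMSE values (assumed aggregation)."""
    rmses = per_dimension_rmse(targets, predictions)
    return sum(rmses) / len(rmses)


# Toy example with 2 dimensions instead of 30 (illustrative values only).
toy_targets = [[0.0, 1.0], [2.0, 3.0]]
toy_predictions = [[0.0, 2.0], [2.0, 5.0]]
toy_score = mean_rmse(toy_targets, toy_predictions)
```

If the features are standardized before reconstruction, each per-dimension RMSE is expressed in standard deviations of that feature; whether that applies to the reported 0.814 depends on preprocessing details not stated here.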
Most importantly, the deep model learns these relationships without relying on handcrafted descriptors. Its shared embedding implicitly captures phenomena such as harmonic evolution, transient sharpness, and spectral decay, patterns that PCA and engineered features only approximate. This supports the use of learned representations as a richer timbre space for retrieval and downstream similarity search.
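To make the retrieval use case concrete, here is a minimal sketch of nearest-neighbor search over embedding vectors. It assumes cosine similarity as the distance measure, which the text does not specify, and it uses a hypothetical three-item library with made-up 2-dimensional embeddings in place of the model's actual shared embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, library, k=3):
    """Return the k (name, similarity) pairs most similar to the query embedding."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; real ones would come from the trained model.
toy_library = {
    "bright_guitar": [1.0, 0.0],
    "dark_bass": [0.0, 1.0],
    "bright_keys": [0.9, 0.1],
}
toy_matches = top_k_similar([1.0, 0.0], toy_library, k=2)
```

A linear scan like this is adequate for small collections; a large sample library would more likely use an approximate nearest-neighbor index, which is outside the scope of this sketch.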