AAI_2025_Capstone_Chronicles_Combined


Figure 9

Performance across NSynth’s timbral quality labels

Figure 10

Feature reconstruction results further highlight the model’s understanding of timbre, with a mean RMSE of 0.814 across the 30 feature dimensions.
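As an illustration of how a figure like this can be computed, the sketch below takes the RMSE of each feature dimension separately and then averages the results. This is an assumption about the aggregation: the text does not say whether the reported value is a mean of per-dimension RMSEs or a single RMSE pooled over all values. The function name and array shapes are hypothetical.

```python
import numpy as np


def mean_rmse_per_dimension(y_true, y_pred):
    """Return (per-dimension RMSE, mean of those RMSEs).

    Assumes both inputs have shape (n_samples, n_features), e.g.
    n_features = 30 for the reconstructed feature vector. Averaging
    per-dimension RMSEs is an assumed aggregation, not one confirmed
    by the source.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")

    # Squared error averaged over samples (axis 0), one value per feature.
    per_dim = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return per_dim, float(per_dim.mean())
```

If the features are standardized before training, a per-dimension RMSE is in units of standard deviations, which makes the individual dimensions comparable before averaging.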

Most importantly, the deep model learns these relationships without relying on handcrafted descriptors. Its shared embedding implicitly captures phenomena such as harmonic evolution, transient sharpness, and spectral decay, patterns that PCA and engineered features only approximate. This supports the role of learned representations as a richer timbre space for retrieval and downstream similarity search.
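To make the retrieval use case concrete, the following is a minimal sketch of nearest-neighbor search over an embedding matrix using cosine similarity. The choice of cosine similarity, the function name, and the array shapes are assumptions for illustration; the source does not specify the distance metric or the retrieval implementation.

```python
import numpy as np


def top_k_similar(query, embeddings, k=5, eps=1e-12):
    """Return (indices, scores) of the k embeddings closest to the query.

    Assumes `embeddings` has shape (n_items, dim) and `query` has shape
    (dim,). Similarity is cosine similarity, an assumed metric.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    query = np.asarray(query, dtype=float)

    # L2-normalize rows and the query; eps guards against zero vectors.
    emb_norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    emb_unit = embeddings / np.maximum(emb_norms, eps)
    q_unit = query / max(np.linalg.norm(query), eps)

    # Dot products of unit vectors are cosine similarities.
    scores = emb_unit @ q_unit

    # Sort by descending similarity and keep the first k.
    order = np.argsort(-scores)[:k]
    return order, scores[order]
```

A brute-force search like this scales linearly with the number of items; for a large catalog, an approximate nearest-neighbor index would typically replace the full matrix product.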

