AAI_2025_Capstone_Chronicles_Combined
Figure 8
Training and validation loss curves for the deep timbre model
Performance varies across NSynth’s timbral quality labels (Figure 9). The model performs best on common descriptors such as bright, dark, distorted, and percussive, which are prevalent in the dataset and correlate with distinctive acoustic patterns. In contrast, rarer qualities such as nonlinear envelope or tempo-synced timbre yield lower AUROC values. This variation aligns with exploratory analysis showing class imbalance across timbral attributes.

Feature reconstruction results further highlight the model’s internal understanding of timbre (Figure 10). While some engineered descriptors (e.g., harmonic stability and spectral brightness ratios) exhibit higher error, the mean RMSE remains relatively stable across the 30 reconstructed features. This indicates that the model can approximate our engineered features without directly observing them.
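The two metrics discussed above, per-label AUROC and per-feature RMSE, can be illustrated with a minimal sketch. It is not the project's evaluation code: the function names (`auroc`, `per_label_auroc`, `per_feature_rmse`) are hypothetical, the inputs are assumed to be plain lists with one row per audio clip, and the toy values are invented for illustration and are not results from the paper. AUROC is computed here with the pairwise (Mann-Whitney) formulation, which is simple but quadratic in the number of examples.

```python
import math


def auroc(labels, scores):
    """AUROC for one binary label: the fraction of (positive, negative)
    pairs in which the positive example scores higher; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is absent
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_label_auroc(y_true, y_score, label_names):
    """Compute AUROC separately for each column (one column per label)."""
    return {
        name: auroc([row[j] for row in y_true], [row[j] for row in y_score])
        for j, name in enumerate(label_names)
    }


def per_feature_rmse(targets, preds):
    """Root-mean-square error for each reconstructed feature column."""
    n_rows = len(targets)
    n_features = len(targets[0])
    return [
        math.sqrt(
            sum((preds[i][j] - targets[i][j]) ** 2 for i in range(n_rows)) / n_rows
        )
        for j in range(n_features)
    ]


# Toy multi-label data (assumed values): 4 clips, 2 quality labels.
label_names = ["bright", "tempo-synced"]
y_true = [[1, 0], [0, 0], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.1, 0.6], [0.8, 0.4], [0.3, 0.1]]
label_scores = per_label_auroc(y_true, y_score, label_names)

# Toy reconstruction data (assumed values): 2 clips, 2 engineered features.
feature_targets = [[1.0, 2.0], [3.0, 4.0]]
feature_preds = [[1.0, 3.0], [3.0, 5.0]]
feature_errors = per_feature_rmse(feature_targets, feature_preds)
```

In the toy data the rare label has a single positive example, so its AUROC rests on only three positive-negative pairs. This is one reason scores for infrequent qualities tend to be both lower and noisier under class imbalance.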