AAI_2025_Capstone_Chronicles_Combined


Figure 8

Training and validation loss curves for the deep timbre model

Performance varies across NSynth’s timbral quality labels (Figure 9). The model performs best on common descriptors such as bright, dark, distorted, and percussive, which are prevalent in the dataset and correlate with distinctive acoustic patterns. In contrast, rarer qualities such as nonlinear envelope or tempo-synced timbre yield lower AUROC values. This variation is consistent with the class imbalance across timbral attributes observed in the exploratory analysis.

Feature reconstruction results further highlight the model’s internal understanding of timbre (Figure 10). Although some engineered descriptors (e.g., harmonic stability and spectral brightness ratios) exhibit higher error, the mean RMSE remains relatively stable across the 30 reconstructed features. This indicates that the model can approximate our engineered features without directly observing them.
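The two per-dimension metrics discussed above, AUROC per quality label and RMSE per reconstructed feature, can be sketched as follows. This is a minimal illustration and not the project's evaluation code: the function names, array layouts (rows are examples, columns are labels or features), and toy values are assumptions made for the example. The AUROC is computed from its pairwise-ranking definition, so no external metrics library is needed.

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Returns NaN when a label has no positives or
    no negatives, which can happen for rare qualities in small evaluation sets.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        return float("nan")
    diff = pos[:, None] - neg[None, :]  # all positive/negative score pairs
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def per_label_auroc(Y_true, Y_score, label_names):
    """One AUROC per column of a multi-label prediction matrix."""
    Y_true, Y_score = np.asarray(Y_true), np.asarray(Y_score)
    return {name: auroc(Y_true[:, j], Y_score[:, j])
            for j, name in enumerate(label_names)}

def per_feature_rmse(F_true, F_pred):
    """One RMSE per column (feature), averaged over examples."""
    F_true = np.asarray(F_true, dtype=float)
    F_pred = np.asarray(F_pred, dtype=float)
    return np.sqrt(np.mean((F_pred - F_true) ** 2, axis=0))

# Toy data (hypothetical values): 4 examples, 2 quality labels.
Y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 0]])
Y_score = np.array([[0.9, 0.2], [0.8, 0.6], [0.3, 0.5], [0.1, 0.4]])
label_aurocs = per_label_auroc(Y_true, Y_score, ["bright", "tempo-synced"])

# Toy data (hypothetical values): 2 examples, 2 reconstructed features.
F_true = np.zeros((2, 2))
F_pred = np.array([[3.0, 1.0], [4.0, 1.0]])
feature_rmse = per_feature_rmse(F_true, F_pred)

print(label_aurocs)
print(feature_rmse, feature_rmse.mean())
```

Reporting the metrics per column rather than as a single average is what makes differences between common and rare labels, or between easy and hard features, visible. If the engineered features are on different scales, standardizing them before computing RMSE would be needed for the per-feature values to be comparable; whether that was done here is not stated in the text.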
