AAI_2025_Capstone_Chronicles_Combined
Learned Embedding Space
The deep timbre embeddings reveal structure that is both complex and perceptually meaningful. Sustained, warm, pad-like tones consistently populate one region of the learned space, while short, bright, percussive tones with sharp attacks form distinct clusters elsewhere. Figure 11 visualizes this separation: after projecting 4,000 sounds into two dimensions and filtering them by these perceptual qualities, regions of the space become recognizable as coherent timbral neighborhoods rather than random scatterings. This organization indicates that the deep learning model arranges audio in ways that mirror how listeners perceive similarity.
Figure 11
2-D embedding of 4,000 audio clips with perceptual filters applied
Note: The left view displays smooth, warm pad-like sounds, while the right emphasizes bright, attack-driven percussive tones.
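The projection-and-filter step behind a view like Figure 11 can be sketched in a few lines. The sketch below is illustrative only: it assumes PCA as the projection (the method actually used is not stated here), substitutes synthetic 64-dimensional vectors for the real embeddings, and uses hypothetical per-clip `warmth` and `brightness` scores in place of the perceptual filters.

```python
import numpy as np

def project_to_2d(embeddings):
    """Project embeddings onto their first two principal components (PCA via SVD)."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def perceptual_mask(scores, threshold=0.5):
    """Boolean mask selecting clips whose descriptor score meets the threshold."""
    return scores >= threshold

# Synthetic stand-in data (NOT the report's data): two timbre groups in 64-D.
rng = np.random.default_rng(0)
n_per_group, dim = 2000, 64
pads = rng.normal(np.zeros(dim), 1.0, size=(n_per_group, dim))
percussive = rng.normal(np.full(dim, 3.0), 1.0, size=(n_per_group, dim))
embeddings = np.vstack([pads, percussive])  # shape (4000, 64)

# Hypothetical descriptor scores in [0, 1]: pads score "warm", percussive "bright".
warmth = np.concatenate([np.full(n_per_group, 0.9), np.full(n_per_group, 0.1)])
brightness = 1.0 - warmth

points = project_to_2d(embeddings)
warm_points = points[perceptual_mask(warmth)]
bright_points = points[perceptual_mask(brightness)]

# Compare the distance between group centroids with the within-group spread.
separation = np.linalg.norm(warm_points.mean(axis=0) - bright_points.mean(axis=0))
spread = max(warm_points.std(axis=0).max(), bright_points.std(axis=0).max())
print(f"centroid separation: {separation:.2f}, max within-group std: {spread:.2f}")
```

A centroid separation that is large relative to the within-group spread is one simple numerical counterpart to the visual impression of distinct timbral neighborhoods. Nonlinear projections such as t-SNE or UMAP are common alternatives to PCA for this kind of plot.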