AAI_2025_Capstone_Chronicles_Combined
sounds, or complex layered music. Incorporating longer recordings would also allow the system to learn how timbre evolves across phrases, transitions, and layered textures, which is essential for real-world creative workflows.

Another direction involves improving the modeling of temporal evolution. Although the current deep learning model captures timbral patterns, additional work could improve generalization across time; for example, a more refined transformer architecture may capture long-range relationships in greater detail. Incorporating semantic descriptors of sound, akin to text-to-audio modalities, could also enhance retrieval by allowing users to search for and compare sounds through natural language prompts. Finally, the system could benefit from user-driven refinement: as users explore the timbre space, their interactions could provide feedback loops that fine-tune the embedding, yielding timbre models that adapt to individual preferences.

Conclusion

This project explores how machine learning techniques can support sound designers, composers, and producers in navigating timbre-based similarity. Results indicate that both the PCA-based model and the deep model capture important aspects of timbral structure. The PCA model provides an interpretable baseline that reflects classical MIR descriptors, while the deep embedding space reveals more complex and nonlinear timbral relationships. FAISS indexing then enables efficient retrieval and supports deployment of the system in a creative and intuitive sound exploration application.
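To make the retrieval step concrete, the following is a minimal sketch of exact nearest-neighbor search over a library of timbre embeddings. The embedding dimensionality (64), library size, and the perturbed query are illustrative assumptions, not values from the project; the distance computation itself mirrors what a FAISS IndexFlatL2 performs over an indexed embedding set.

```python
import numpy as np

# Assumed setup: each sound is represented by a fixed-size timbre
# embedding (here 64-D) produced by the PCA or deep model.
rng = np.random.default_rng(0)
library = rng.standard_normal((1000, 64)).astype(np.float32)  # 1000 indexed sounds
# Query: a slightly perturbed copy of sound 42, standing in for a
# new recording that is timbrally close to an indexed one.
query = library[42] + 0.01 * rng.standard_normal(64).astype(np.float32)

def search(index: np.ndarray, q: np.ndarray, k: int = 5):
    """Exact (brute-force) L2 nearest-neighbor search: the same
    computation a FAISS IndexFlatL2 carries out over the library."""
    dists = np.sum((index - q) ** 2, axis=1)   # squared L2 distance to every sound
    top = np.argsort(dists)[:k]                # indices of the k closest embeddings
    return top, dists[top]

ids, dists = search(library, query)
print(ids[0])  # → 42: the perturbed source sound ranks first
```

In the deployed system an approximate FAISS index would replace this brute-force scan for large libraries, trading a small amount of recall for much faster queries.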