Discrete Diffusion Probabilistic Models for Symbolic Music Generation
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have made great strides in generating high-quality samples in both discrete and continuous domains. However, Discrete DDPMs (D3PMs) have yet to be applied to the domain of Symbolic Music. This work presents the direct generation of Polyphonic Symbolic Music using D3PMs. Our model exhibits state-of-the-art sample quality, according to current quantitative evaluation metrics, and allows for flexible infilling at the note level. We further show that our models are accessible to post-hoc classifier guidance, widening the scope of possible applications. However, we also take a critical view of the quantitative evaluation of music sample quality via statistical metrics, and present a simple algorithm that can confound our metrics with completely spurious, non-musical samples.
Cite
Text
Plasser et al. "Discrete Diffusion Probabilistic Models for Symbolic Music Generation." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/648
Markdown
[Plasser et al. "Discrete Diffusion Probabilistic Models for Symbolic Music Generation." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/plasser2023ijcai-discrete/) doi:10.24963/IJCAI.2023/648
BibTeX
@inproceedings{plasser2023ijcai-discrete,
title = {{Discrete Diffusion Probabilistic Models for Symbolic Music Generation}},
author = {Plasser, Matthias and Peter, Silvan and Widmer, Gerhard},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2023},
pages = {5842-5850},
doi = {10.24963/IJCAI.2023/648},
url = {https://mlanthology.org/ijcai/2023/plasser2023ijcai-discrete/}
}