Three-Modal Guidance for Symbolic Music Generation: Melody, Structure, Texture

Abstract

The vision of this work is flexible co-creation of music between a human and a trained model, usable with or without domain knowledge. Building on the transformer-based FIGARO framework, we propose a symbolic music generation approach guided by three separate modalities: a melody, a structural description of the piece termed the expert description, and a musical texture. Our approach aims to let a composer try out combinations of different melodies, expert descriptions, and textures. FIGARO can generate music from two inputs: a structural expert description derived with domain knowledge, and a learned representation of a music piece. The expert description is generated for each bar and provides a range of features, such as mean pitch, chords, and note density. The learned representation is generated for each bar as a whole. The main contribution of this work is a more extensive modularisation of the model input, i.e. an explicit separation of the input into the three above-mentioned modalities, which are commonly used in music composition and in the symbolic description of musical works: melody, a domain-knowledge-driven description of the piece, and texture guiding the feel of the music. We demonstrate preliminary results with a novel model-based implementation of a piece, given a melody, a bar-wise description, and a multi-track accompaniment.
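To make the idea of a bar-wise description concrete, the following is a minimal sketch of how two of the features named in the abstract, mean pitch and note density, could be computed per bar. It is an illustration only, not the FIGARO feature extraction: the function name `describe_bars`, the `(onset_in_beats, midi_pitch)` note format, the fixed `beats_per_bar`, and the definition of density as notes per beat are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def describe_bars(notes, beats_per_bar=4):
    """Group notes by bar and compute simple per-bar features.

    notes: iterable of (onset_in_beats, midi_pitch) tuples (assumed format).
    Returns a dict mapping bar index to mean pitch and note density,
    where density is defined here as notes per beat (an assumption).
    """
    bars = defaultdict(list)
    for onset, pitch in notes:
        # A note belongs to the bar in which its onset falls.
        bars[int(onset // beats_per_bar)].append(pitch)
    return {
        bar: {
            "mean_pitch": mean(pitches),
            "note_density": len(pitches) / beats_per_bar,
        }
        for bar, pitches in sorted(bars.items())
    }


# Hypothetical toy input: three notes in bar 0, two notes in bar 1.
notes = [(0, 60), (1, 64), (2, 67), (4, 62), (6, 65)]
description = describe_bars(notes)
```

In this toy setup each bar gets its own small feature record, which mirrors the bar-by-bar granularity described above; a full expert description would add further features such as chords.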

Cite

Text

Lucht et al. "Three-Modal Guidance for Symbolic Music Generation: Melody, Structure, Texture." NeurIPS 2024 Workshops: Audio_Imagination, 2024.

Markdown

[Lucht et al. "Three-Modal Guidance for Symbolic Music Generation: Melody, Structure, Texture." NeurIPS 2024 Workshops: Audio_Imagination, 2024.](https://mlanthology.org/neuripsw/2024/lucht2024neuripsw-threemodal/)

BibTeX

@inproceedings{lucht2024neuripsw-threemodal,
  title     = {{Three-Modal Guidance for Symbolic Music Generation: Melody, Structure, Texture}},
  author    = {Lucht, Daniel Alexander and Leins, David Philip and von Rütte, Dimitri and Moringen, Alexandra},
  booktitle = {NeurIPS 2024 Workshops: Audio_Imagination},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/lucht2024neuripsw-threemodal/}
}