MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners
Abstract
We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. The key finding is that positional embeddings, which have been seldom used by text-to-music generation models in the conditioner for text conditions, are critical when the condition of interest is a function of time. Using melody control as an example, our experiments show that simply adding rotary positional embeddings to the decoupled cross-attention layers increases control accuracy from 56.6% to 61.1%, while requiring 6.75 times fewer trainable parameters than state-of-the-art fine-tuning mechanisms, using the same pre-trained diffusion Transformer model of Stable Audio Open. We evaluate various forms of musical attribute control, audio inpainting, and audio outpainting, demonstrating improved controllability over MusicGen-Large and Stable Audio Open ControlNet at a significantly lower fine-tuning cost, with only 85M trainable parameters. Source code, model checkpoints, and demo examples are available at: https://MuseControlLite.github.io/web/
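The abstract's central mechanism, rotary positional embeddings inside a decoupled cross-attention branch, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`rope`, `attention`, `decoupled_cross_attention`), the single-head layout, the omission of learned projection matrices, and the choice to rotate only queries and condition keys are all assumptions made for illustration. The sketch assumes the text branch uses no positional information, while the time-varying condition branch rotates queries and keys so that attention scores depend on relative position.

```python
import numpy as np

def rope(x, base=10000.0):
    """Rotate feature pairs of x (seq_len, dim) by position-dependent angles; dim must be even."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)               # one frequency per feature pair
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    """Plain scaled dot-product attention for a single head."""
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def decoupled_cross_attention(q, k_text, v_text, k_cond, v_cond):
    """Hypothetical decoupled cross-attention: two branches share the query and are summed."""
    text_out = attention(q, k_text, v_text)              # text branch: no positional embedding
    cond_out = attention(rope(q), rope(k_cond), v_cond)  # condition branch: RoPE on q and k
    return text_out + cond_out
```

Because each feature pair is rotated by an angle proportional to its position, the dot product between a rotated query at step m and a rotated key at step n depends only on the offset between m and n. That property is what lets a time-aligned condition such as a melody contour be matched to the corresponding audio frames.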
Cite
Text
Tsai et al. "MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Tsai et al. "MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/tsai2025icml-musecontrollite/)
BibTeX
@inproceedings{tsai2025icml-musecontrollite,
title = {{MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners}},
author = {Tsai, Fang-Duo and Wu, Shih-Lun and Lee, Weijaw and Yang, Sheng-Ping and Chen, Bo-Rui and Cheng, Hao-Chung and Yang, Yi-Hsuan},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {60266--60279},
volume = {267},
url = {https://mlanthology.org/icml/2025/tsai2025icml-musecontrollite/}
}