Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization
Abstract
We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. At its core is a segment-level reconstruction objective operating on token-level disentangled content and style, allowing for flexible any-to-any instrumentation transformations at inference time. To support track-wise modeling, we introduce REMI-z, a structured tokenization scheme for multitrack symbolic music that enhances modeling efficiency and effectiveness for both arrangement tasks and unconditional generation. Our method outperforms task-specific state-of-the-art models on representative tasks from different arrangement scenarios (band arrangement, piano reduction, and drum arrangement) in both objective metrics and perceptual evaluations. Taken together, our framework demonstrates strong generality and suggests broader applicability in symbolic music-to-music transformation.
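To make the idea of track-wise tokenization concrete, the following is a minimal toy sketch, not the REMI-z specification. It assumes notes in one bar are given as (track, position, pitch, duration) tuples and groups them into one contiguous token run per track. The function name `tokenize_bar_by_track` and the token strings (`inst-*`, `pos-*`, `pitch-*`, `dur-*`, `bar-end`) are hypothetical and chosen only for illustration; the actual vocabulary and ordering are defined in the paper.

```python
from collections import defaultdict


def tokenize_bar_by_track(notes):
    """Toy track-wise tokenizer for a single bar (illustrative only).

    notes: iterable of (track, position, pitch, duration) integer tuples.
    Returns a flat list of string tokens in which each track's notes
    form one contiguous run introduced by an instrument token.
    """
    # Group the bar's notes by track so each track can be emitted as a unit.
    by_track = defaultdict(list)
    for track, position, pitch, duration in notes:
        by_track[track].append((position, pitch, duration))

    tokens = []
    for track in sorted(by_track):
        # Hypothetical instrument token marking the start of a track run.
        tokens.append(f"inst-{track}")
        # Within a track, order notes by onset position, then pitch.
        for position, pitch, duration in sorted(by_track[track]):
            tokens += [f"pos-{position}", f"pitch-{pitch}", f"dur-{duration}"]

    # Hypothetical end-of-bar marker.
    tokens.append("bar-end")
    return tokens


example_notes = [(1, 0, 60, 4), (0, 0, 36, 2), (1, 4, 64, 4)]
example_tokens = tokenize_bar_by_track(example_notes)
```

Because every track occupies its own contiguous span within a bar, swapping, dropping, or adding a track in this toy layout amounts to editing one span, which is the kind of property a track-aware arrangement model can exploit.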
Cite
Text
Ou et al. "Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization." Advances in Neural Information Processing Systems, 2025.
Markdown
[Ou et al. "Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/ou2025neurips-unifying/)
BibTeX
@inproceedings{ou2025neurips-unifying,
title = {{Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization}},
author = {Ou, Longshen and Zhao, Jingwei and Wang, Ziyu and Xia, Gus and Liang, Qihao and Hopkins, Torin and Wang, Ye},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/ou2025neurips-unifying/}
}