S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation

Abstract

Music-to-lyric generation aims to create lyrics that can be sung in harmony with the music while capturing the music’s intrinsic meaning. Previous work has struggled to handle both the structural and the semantic alignment of music and lyrics, often relying on rigid, manually crafted rules or overlooking the semantic essence of music, which deviates from how humans naturally write lyrics. In this paper, we bridge the structural and semantic gap between music and lyrics by proposing an end-to-end model for music-driven lyric generation. Our model aims to generate well-formatted lyrics based solely on the music while capturing its inherent semantic essence. In the music processing phase, we introduce a hierarchical music information extractor that operates at both the song and sentence levels. The song-level extractor discerns the overall semantic content of the music, such as themes and emotions, while the sentence-level extractor captures local semantic and structural details from note sequences. Additionally, we propose a lyric length predictor that determines the optimal length for the generated lyrics. During the lyric generation phase, the information gathered by these modules is integrated to guide the downstream lyric generation module toward coherent and meaningful lyrics. Experimental results on objective and subjective benchmarks demonstrate the capabilities of our proposed model in capturing semantics and generating well-formatted lyrics.
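
The data flow described in the abstract (song-level extractor, sentence-level extractor, and length predictor feeding a lyric generator) can be pictured with a minimal sketch. This is not the authors' implementation: every name below (`Sentence`, `Guidance`, `build_guidance`, and the three helper functions) is hypothetical, the hand-written statistics stand in for the learned modules of the paper, and the one-syllable-per-note length rule is only an illustrative assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Note = Tuple[int, float]  # (MIDI pitch, duration in beats); assumed layout


@dataclass
class Sentence:
    """One musical phrase, represented as a note sequence (hypothetical)."""
    notes: List[Note]


@dataclass
class Guidance:
    """Conditioning bundle handed to a downstream lyric generator (hypothetical)."""
    song_features: Dict[str, float]      # global information shared by all sentences
    sentence_features: Dict[str, float]  # local information for one sentence
    target_length: int                   # predicted lyric length for this sentence


def song_level_extractor(song: List[Sentence]) -> Dict[str, float]:
    # Placeholder for a learned song-level encoder (themes, emotions).
    # Here: simple statistics over every note in the song.
    pitches = [pitch for s in song for pitch, _ in s.notes]
    return {"mean_pitch": sum(pitches) / len(pitches),
            "num_sentences": float(len(song))}


def sentence_level_extractor(sentence: Sentence) -> Dict[str, float]:
    # Placeholder for a learned sentence-level encoder over the note sequence.
    return {"num_notes": float(len(sentence.notes)),
            "total_duration": sum(d for _, d in sentence.notes)}


def length_predictor(sentence: Sentence) -> int:
    # Assumption for illustration only: one syllable per note.
    # The paper proposes a dedicated predictor; this rule is a stand-in.
    return len(sentence.notes)


def build_guidance(song: List[Sentence]) -> List[Guidance]:
    # Song-level features are computed once and shared across sentences;
    # sentence-level features and target length are computed per sentence.
    song_feats = song_level_extractor(song)
    return [Guidance(song_feats, sentence_level_extractor(s), length_predictor(s))
            for s in song]
```

In this sketch, each `Guidance` object would be what a generator conditions on when writing the lyric line for the corresponding musical sentence.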

Cite

Text

You et al. "S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I21.34375

Markdown

[You et al. "S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/you2025aaai-s/) doi:10.1609/AAAI.V39I21.34375

BibTeX

@inproceedings{you2025aaai-s,
  title     = {{S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation}},
  author    = {You, Mu and Zhang, Fang and Zhang, Shuai and Xu, Linli},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {22208--22217},
  doi       = {10.1609/AAAI.V39I21.34375},
  url       = {https://mlanthology.org/aaai/2025/you2025aaai-s/}
}