X-Former Elucidator: Reviving Efficient Attention for Long Context Language Modeling

Miao, Xupeng; Zhu, Shenhan; Fu, Fangcheng; Guo, Ziyu; Yang, Zhi; Tu, Yaofeng; Jia, Zhihao; Cui, Bin

doi:10.24963/ijcai.2024/904

IJCAI 2024 pp. 8179-8187

Abstract

Despite the rapid development of neural vocoders in recent years, they usually suffer from intrinsic challenges such as opaque modeling and a parameter-performance trade-off. In this study, we propose an innovative time-frequency (T-F) domain-based neural vocoder to resolve these challenges. Specifically, we connect the classical signal range-null decomposition (RND) theory with the vocoder task: the reconstruction of the target spectrogram is decomposed into the superimposition of a range-space component and a null-space component, where the former is obtained by a linear domain shift from the original mel-scale domain to the target linear-scale domain, and the latter is instantiated via a learnable network for further spectral detail generation. Accordingly, we propose a novel dual-path framework, in which the spectrum is hierarchically encoded and decoded, and the cross- and narrow-band modules are devised for efficient sub-band and sequential modeling. Comprehensive experiments are conducted on the LJSpeech and LibriTTS benchmarks. Quantitative and qualitative results show that, while using lightweight network parameters, the proposed approach yields state-of-the-art performance among existing advanced methods. Our code and the pretrained model weights are available at https://github.com/Andong-Li-speech/RNDVoC.

Cite

Text

Miao et al. "X-Former Elucidator: Reviving Efficient Attention for Long Context Language Modeling." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/904

Markdown

[Miao et al. "X-Former Elucidator: Reviving Efficient Attention for Long Context Language Modeling." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/miao2024ijcai-x/) doi:10.24963/ijcai.2024/904

BibTeX

@inproceedings{miao2024ijcai-x,
  title     = {{X-Former Elucidator: Reviving Efficient Attention for Long Context Language Modeling}},
  author    = {Miao, Xupeng and Zhu, Shenhan and Fu, Fangcheng and Guo, Ziyu and Yang, Zhi and Tu, Yaofeng and Jia, Zhihao and Cui, Bin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {8179-8187},
  doi       = {10.24963/ijcai.2024/904},
  url       = {https://mlanthology.org/ijcai/2024/miao2024ijcai-x/}
}