Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

Abstract

Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization, and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
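To make the abstract's ingredients concrete, below is a minimal PyTorch sketch of a global 1D convolution whose kernel is a sum of sub-kernels at increasing resolutions, each damped by a learnable exponential decay, with the per-scale branches merged into a single kernel in the spirit of structural reparameterization. This is an illustration based only on the abstract, not the authors' implementation; all names and hyperparameters (MRConvSketch, n_scales, base_len) are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MRConvSketch(nn.Module):
    """Toy multi-resolution global convolution (hypothetical, abstract-level sketch)."""

    def __init__(self, channels: int, seq_len: int, n_scales: int = 4, base_len: int = 16):
        super().__init__()
        self.seq_len = seq_len
        # One short sub-kernel per resolution; coarser scales are upsampled,
        # so the effective receptive field grows geometrically with scale.
        self.kernels = nn.ParameterList(
            [nn.Parameter(0.02 * torch.randn(channels, base_len)) for _ in range(n_scales)]
        )
        # Learnable per-scale, per-channel decay rates: each sub-kernel's
        # contribution falls off exponentially with distance.
        self.log_decay = nn.Parameter(torch.zeros(n_scales, channels, 1))

    def global_kernel(self) -> torch.Tensor:
        # Summing the per-scale branches into one kernel mirrors the spirit of
        # structural reparameterization: several branches during training, one
        # equivalent global convolution afterwards.
        parts = []
        for s, k in enumerate(self.kernels):
            # Upsample scale s by 2**s so coarse kernels span longer contexts.
            k_up = F.interpolate(
                k.unsqueeze(0), scale_factor=2.0 ** s, mode="linear", align_corners=False
            ).squeeze(0)
            k_up = F.pad(k_up[..., : self.seq_len], (0, max(0, self.seq_len - k_up.shape[-1])))
            t = torch.arange(self.seq_len, device=k_up.device)
            decay = torch.exp(-F.softplus(self.log_decay[s]) * t)  # (channels, seq_len)
            parts.append(k_up * decay)
        return torch.stack(parts).sum(dim=0)  # (channels, seq_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, seq_len). FFT convolution keeps a
        # sequence-length kernel affordable: O(L log L) rather than O(L^2).
        L = x.shape[-1]
        k = self.global_kernel()
        x_f = torch.fft.rfft(x, n=2 * L)
        k_f = torch.fft.rfft(k, n=2 * L)
        return torch.fft.irfft(x_f * k_f, n=2 * L)[..., :L]

# Example usage (shapes only; all sizes are illustrative):
# layer = MRConvSketch(channels=64, seq_len=1024)
# y = layer(torch.randn(8, 64, 1024))  # -> (8, 64, 1024)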

Cite

Text

Cunningham et al. "Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling." Neural Information Processing Systems, 2024. doi:10.52202/079017-0852

Markdown

[Cunningham et al. "Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/cunningham2024neurips-reparameterized/) doi:10.52202/079017-0852

BibTeX

@inproceedings{cunningham2024neurips-reparameterized,
  title     = {{Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling}},
  author    = {Cunningham, Harry Jake and Giannone, Giorgio and Zhang, Mingtian and Deisenroth, Marc Peter},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0852},
  url       = {https://mlanthology.org/neurips/2024/cunningham2024neurips-reparameterized/}
}