ESSAformer: Efficient Transformer for Hyperspectral Image Super-Resolution

Abstract

Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation. However, the prevailing CNN-based approaches have shown limitations in building long-range dependencies and capturing interaction information between spectral features. This results in inadequate utilization of spectral information and artifacts after upsampling. To address this issue, we propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure. Specifically, we first introduce a robust and spectral-friendly similarity metric, i.e., the spectral correlation coefficient of the spectrum (SCC), to replace the original attention matrix and incorporate inductive biases into the model to facilitate training. Building upon it, we further utilize the kernelizable attention technique with theoretical support to form a novel efficient SCC-kernel-based self-attention (ESSA) and reduce attention computation to linear complexity. ESSA enlarges the receptive field for features after upsampling without adding much computation and allows the model to effectively utilize spatial-spectral information from different scales, resulting in the generation of more natural high-resolution images. Without the need for pretraining on large-scale datasets, our experiments demonstrate ESSA's effectiveness in both visual quality and quantitative results. The code will be released.
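
The two ideas named in the abstract, a correlation-based similarity between spectra and a kernelized attention that avoids the full N x N matrix, can be illustrated with a small sketch. This is not the authors' implementation, and the abstract does not specify ESSA's actual kernel. The sketch assumes the SCC behaves like a Pearson correlation between spectral vectors, and it uses a hypothetical stand-in kernel, `1 + scc(q, k)`, chosen only because it is non-negative and factorizes exactly. All function names are illustrative.

```python
import math


def standardize(vec):
    """Center a spectrum and scale it to unit L2 norm."""
    mean = sum(vec) / len(vec)
    centered = [x - mean for x in vec]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        # Assumption: a constant spectrum is treated as uncorrelated with everything.
        return [0.0] * len(vec)
    return [x / norm for x in centered]


def scc(a, b):
    """Pearson correlation between two spectra (assumed form of the SCC)."""
    return sum(x * y for x, y in zip(standardize(a), standardize(b)))


def feature_map(vec):
    # Hypothetical feature map: phi(q) . phi(k) = 1 + scc(q, k), which is >= 0.
    return [1.0] + standardize(vec)


def quadratic_attention(queries, keys, values):
    """Reference version: evaluates every query-key pair, O(N^2) in the token count."""
    d_v = len(values[0])
    out = []
    for q in queries:
        weights = [1.0 + scc(q, k) for k in keys]
        total = sum(weights)  # zero only if every key is perfectly anti-correlated
        out.append([sum(w * v[j] for w, v in zip(weights, values)) / total
                    for j in range(d_v)])
    return out


def linear_attention(queries, keys, values):
    """Same output via phi(Q) (phi(K)^T V): keys are summarized once, O(N) in tokens."""
    phi_k = [feature_map(k) for k in keys]
    d_phi, d_v = len(phi_k[0]), len(values[0])
    # kv[i][j] = sum over tokens n of phi_k[n][i] * values[n][j]
    kv = [[sum(pk[i] * v[j] for pk, v in zip(phi_k, values)) for j in range(d_v)]
          for i in range(d_phi)]
    # k_sum[i] = sum over tokens n of phi_k[n][i], used for normalization
    k_sum = [sum(pk[i] for pk in phi_k) for i in range(d_phi)]
    out = []
    for q in queries:
        pq = feature_map(q)
        denom = sum(a * b for a, b in zip(pq, k_sum))
        out.append([sum(pq[i] * kv[i][j] for i in range(d_phi)) / denom
                    for j in range(d_v)])
    return out
```

Because the correlation is computed on centered, normalized spectra, it is unchanged when a spectrum is scaled by a positive factor or shifted by a constant, which is one plausible reading of "robust and spectral-friendly". The linear version only reorders the matrix products, so it should match the quadratic reference up to floating-point error for this stand-in kernel.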

Cite

Text

Zhang et al. "ESSAformer: Efficient Transformer for Hyperspectral Image Super-Resolution." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.02109

Markdown

[Zhang et al. "ESSAformer: Efficient Transformer for Hyperspectral Image Super-Resolution." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/zhang2023iccv-essaformer/) doi:10.1109/ICCV51070.2023.02109

BibTeX

@inproceedings{zhang2023iccv-essaformer,
  title     = {{ESSAformer: Efficient Transformer for Hyperspectral Image Super-Resolution}},
  author    = {Zhang, Mingjin and Zhang, Chi and Zhang, Qiming and Guo, Jie and Gao, Xinbo and Zhang, Jing},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {23073-23084},
  doi       = {10.1109/ICCV51070.2023.02109},
  url       = {https://mlanthology.org/iccv/2023/zhang2023iccv-essaformer/}
}