Multi-Scale Feature Fusion Using Channel Transformers for Guided Thermal Image Super Resolution
Abstract
Thermal imaging, leveraging the infrared spectrum, offers a compelling alternative to visible spectrum (VIS) imagery in challenging environmental conditions like low-light, occlusions, and adverse weather. However, its widespread adoption in computer vision tasks is hampered by lower spatial resolution. We address this challenge by proposing a novel framework titled Multi-Scale Feature Fusion using Channel Transformers (MSFFCT) for Guided Thermal Image Super-Resolution (GTISR). GTISR tackles the resolution limitations of thermal imagery. It leverages high-resolution RGB information as a guide to reconstruct high-resolution thermal imagery from low-resolution thermal inputs. At the core of MSFFCT lies a novel deep learning architecture that combines the strengths of two powerful approaches: channel-based transformers and multi-scale fusion. MSFFCT overcomes inherent limitations of Convolutional Neural Networks (CNNs) typically used in super-resolution tasks. CNNs often suffer from restricted receptive fields, limiting their ability to capture long-range dependencies within the image. Additionally, computational cost grows significantly with larger inputs. MSFFCT addresses these shortcomings by enabling efficient processing of global information and offering superior scalability. MSFFCT achieved state-of-the-art results on the ×8 and ×16 GTISR tasks of the 2024 Perception Beyond Visual Spectrum (PBVS) challenge, winning 2nd place in both tasks and demonstrating its effectiveness in real-world scenarios.
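The abstract does not specify how the channel-based transformer is built, so the following is only a generic sketch of channel-wise self-attention, not the MSFFCT block itself. It assumes identity query/key/value projections and a hypothetical helper name, `channel_attention`, purely for illustration. What it shows is the scaling property the abstract alludes to: the attention map is C × C (channels by channels), so its size does not depend on the number of pixels, while every output value still mixes information from the entire image.

```python
import math

def channel_attention(x):
    """Generic channel-wise self-attention (illustrative, not the paper's block).

    x: list of C channels, each a flat list of N pixel values.
    Returns (out, attn): attn is C x C, out has the same shape as x.
    """
    def normalise(row):
        # L2-normalise a channel so dot products become cosine similarities.
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        return [v / norm for v in row]

    # Assumption: queries and keys are the normalised input itself
    # (a real model would use learned projections here).
    q = [normalise(row) for row in x]
    k = q
    c = len(x)
    n = len(x[0])

    # Channel-to-channel similarity: a C x C matrix for any pixel count N.
    scores = [[sum(a * b for a, b in zip(q[i], k[j])) for j in range(c)]
              for i in range(c)]

    # Row-wise softmax turns similarities into mixing weights.
    attn = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        attn.append([e / total for e in exps])

    # Each output channel is a weighted combination of all input channels.
    out = [[sum(attn[i][j] * x[j][p] for j in range(c)) for p in range(n)]
           for i in range(c)]
    return out, attn

# Three channels with 4 pixels, then the same channels with 16 pixels.
small = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
large = [row * 4 for row in small]

out_small, attn_small = channel_attention(small)
out_large, attn_large = channel_attention(large)

print(len(attn_small), len(attn_small[0]))  # attention map size for 4 pixels
print(len(attn_large), len(attn_large[0]))  # attention map size for 16 pixels
```

In this sketch the work to build the attention map grows linearly with the number of pixels, whereas attention computed between spatial positions would need an N × N map.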
Cite
Text
Puttagunta et al. "Multi-Scale Feature Fusion Using Channel Transformers for Guided Thermal Image Super Resolution." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00314
Markdown
[Puttagunta et al. "Multi-Scale Feature Fusion Using Channel Transformers for Guided Thermal Image Super Resolution." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/puttagunta2024cvprw-multiscale/) doi:10.1109/CVPRW63382.2024.00314
BibTeX
@inproceedings{puttagunta2024cvprw-multiscale,
title = {{Multi-Scale Feature Fusion Using Channel Transformers for Guided Thermal Image Super Resolution}},
author = {Puttagunta, Raghunath Sai and Kathariya, Birendra and Li, Zhu and York, George},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {3086-3095},
doi = {10.1109/CVPRW63382.2024.00314},
url = {https://mlanthology.org/cvprw/2024/puttagunta2024cvprw-multiscale/}
}