LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation

Abstract

Ordinal regression bridges regression and classification by assigning objects to ordered classes. While human experts rely on discriminative patch-level features for decisions, current approaches are limited by the availability of only image-level ordinal labels and therefore overlook fine-grained patch-level characteristics. In this paper, we propose a Dual-level Fuzzy Learning with Patch Guidance framework, named DFPG, that learns precise feature-based grading boundaries from ambiguous ordinal labels under patch-level supervision. Specifically, we propose patch-labeling and filtering strategies that enable the model to focus on patch-level features exclusively, with only image-level ordinal labels available. We further design a dual-level fuzzy learning module, which leverages fuzzy logic to quantitatively capture and handle label ambiguity from both patch-wise and channel-wise perspectives. Extensive experiments on various image ordinal regression datasets demonstrate the superiority of our proposed method and confirm its ability to distinguish samples from difficult-to-classify categories. The code is available at https://github.com/ZJUMAI/DFPG-ord.

Cite

Text

Jiang et al. "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/103

Markdown

[Jiang et al. "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/jiang2024ijcai-lemevit/) doi:10.24963/ijcai.2024/103

BibTeX

@inproceedings{jiang2024ijcai-lemevit,
  title     = {{LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation}},
  author    = {Jiang, Wentao and Zhang, Jing and Wang, Di and Zhang, Qiming and Wang, Zengmao and Du, Bo},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {929--937},
  doi       = {10.24963/ijcai.2024/103},
  url       = {https://mlanthology.org/ijcai/2024/jiang2024ijcai-lemevit/}
}