Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations

Abstract

Remote sensing image-text retrieval is a fundamental task in remote sensing multimodal analysis, promoting the alignment of visual and language representations. The mainstream approaches commonly focus on capturing shared semantic representations between visual and textual modalities. However, the inherent characteristics of remote sensing image-text pairs lead to a semantic confusion problem, stemming from redundant visual representations and high inter-class similarity. To tackle this problem, we propose a novel Discriminative and Fine-grained Information Mining (DFIM) model, which aims to enhance semantic clarity by reducing visual redundancy and increasing the semantic gap between different classes. Specifically, the Dynamic Visual Enhancement (DVE) module adaptively enhances the visual discriminative features under the guidance of multimodal fusion information. Meanwhile, the Fine-grained Semantic Matching (FSM) module models the matching relationship between image regions and text words as an optimal transport problem, thereby refining intra-instance matching. Extensive experiments on two benchmark datasets demonstrate the superiority of DFIM over the leading methods in terms of retrieval accuracy and visual interpretability.

Cite

Text

Dashtbayaz et al. "Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/647

Markdown

[Dashtbayaz et al. "Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/dashtbayaz2024ijcai-physics/) doi:10.24963/ijcai.2024/647

BibTeX

@inproceedings{dashtbayaz2024ijcai-physics,
  title     = {{Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations}},
  author    = {Dashtbayaz, Nima Hosseini and Farhani, Ghazal and Wang, Boyu and Ling, Charles X.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {5853--5861},
  doi       = {10.24963/ijcai.2024/647},
  url       = {https://mlanthology.org/ijcai/2024/dashtbayaz2024ijcai-physics/}
}