Multi-View Classification Using Hybrid Fusion and Mutual Distillation

Abstract

Multi-view classification problems are common in medical image analysis, forensics, and other domains where each query consists of multiple images. Existing multi-view classification methods are often tailored to a specific task. In this paper, we repurpose off-the-shelf hybrid CNN-Transformer networks for multi-view classification with either structured or unstructured views. Our approach incorporates a novel fusion scheme and mutual distillation, and introduces minimal additional parameters. We demonstrate the effectiveness and generalization capability of our approach, MV-HFMD, on multiple multi-view classification tasks and show that it outperforms other multi-view approaches, including task-specific methods. Code is available at https://github.com/vidarlab/multi-view-hybrid.

Cite

Text

Black and Souvenir. "Multi-View Classification Using Hybrid Fusion and Mutual Distillation." Winter Conference on Applications of Computer Vision, 2024.

Markdown

[Black and Souvenir. "Multi-View Classification Using Hybrid Fusion and Mutual Distillation." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/black2024wacv-multiview/)

BibTeX

@inproceedings{black2024wacv-multiview,
  title     = {{Multi-View Classification Using Hybrid Fusion and Mutual Distillation}},
  author    = {Black, Samuel and Souvenir, Richard},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2024},
  pages     = {270--280},
  url       = {https://mlanthology.org/wacv/2024/black2024wacv-multiview/}
}