Are Capsule Networks Texture or Shape Biased?

Abstract

Capsule networks (CapsNets) have been proposed as an alternative to traditional convolutional neural networks (CNNs), with the promise of better capturing part-whole relationships and spatial hierarchies. CNNs are known to exhibit a strong bias toward texture in visual recognition tasks, whereas human perception is more shape-biased. In this paper, we investigate whether CapsNets, by design, exhibit a stronger shape bias than CNNs. We conducted a series of experiments across multiple capsule architectures on images with a texture-shape cue conflict. Contrary to theoretical expectations, our results show that CapsNets do not consistently exhibit a stronger shape bias than CNNs. Although certain capsule models demonstrate promising shape recognition, they still rely significantly on texture, and their overall behavior remains closer to that of CNNs than to human perception. These findings highlight the need for further research and architectural improvements to fully realize the potential of CapsNets in shape-based recognition.
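
The abstract does not state how shape bias is scored. As an illustration only, the sketch below uses a definition that is common in the cue-conflict literature: among predictions that match either the shape label or the texture label of a conflicting image, the fraction that match the shape label. The function name `shape_bias`, the record layout, and the example labels are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def shape_bias(records: Iterable[Tuple[str, str, str]]) -> float:
    """Fraction of shape decisions among shape-or-texture decisions.

    Each record is (predicted_label, shape_label, texture_label).
    This layout is an assumption made for the sketch, not the paper's format.
    """
    shape_hits = 0
    texture_hits = 0
    for predicted, shape_label, texture_label in records:
        if shape_label == texture_label:
            # No cue conflict: the image cannot separate the two biases.
            continue
        if predicted == shape_label:
            shape_hits += 1
        elif predicted == texture_label:
            texture_hits += 1
        # Predictions matching neither cue are ignored by this metric.
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched a shape or texture label")
    return shape_hits / decided


# Made-up predictions on four cue-conflict images (illustrative only).
example = [
    ("cat", "cat", "elephant"),       # shape decision
    ("elephant", "cat", "elephant"),  # texture decision
    ("elephant", "dog", "elephant"),  # texture decision
    ("bottle", "car", "clock"),       # neither cue, ignored
]
bias = shape_bias(example)
print(f"shape bias: {bias:.3f}")
```

Under this definition a value near 1 indicates mostly shape-driven decisions and a value near 0 indicates mostly texture-driven ones; the same function could be applied to the predictions of a CapsNet and a CNN to compare them on the same images.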

Cite

Text

Renzulli et al. "Are Capsule Networks Texture or Shape Biased?" NeurIPS 2024 Workshops: SciForDL, 2024.

Markdown

[Renzulli et al. "Are Capsule Networks Texture or Shape Biased?" NeurIPS 2024 Workshops: SciForDL, 2024.](https://mlanthology.org/neuripsw/2024/renzulli2024neuripsw-capsule/)

BibTeX

@inproceedings{renzulli2024neuripsw-capsule,
  title     = {{Are Capsule Networks Texture or Shape Biased?}},
  author    = {Renzulli, Riccardo and Vranay, Dominik and Grangetto, Marco},
  booktitle = {NeurIPS 2024 Workshops: SciForDL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/renzulli2024neuripsw-capsule/}
}