Category Query Learning for Human-Object Interaction Classification

Abstract

Unlike most previous human-object interaction (HOI) methods, which focus on learning better human-object features, we propose a novel and complementary approach called category query learning. The queries are explicitly associated with interaction categories, converted to image-specific category representations via a transformer decoder, and learned via an auxiliary image-level classification task. The idea is motivated by an earlier multi-label image classification method, but is applied here for the first time to the challenging human-object interaction classification task. Our method is simple, general, and effective. It is validated on three representative HOI baselines and achieves new state-of-the-art results on two benchmarks.
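
The query pipeline described in the abstract (one query per interaction category, attention over image features, image-level multi-label loss) can be illustrated with a minimal sketch. This is not the authors' implementation: a single projection-free cross-attention step stands in for the transformer decoder, the queries and classifier weights are random rather than learned, and every name (`cross_attend`, `image_level_loss`, `NUM_CATEGORIES`, and so on) is hypothetical. It covers only the path from category queries to the auxiliary loss, not how the resulting representations are used for HOI classification.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Single-head cross-attention without learned projections (assumption:
    a stand-in for the transformer decoder named in the abstract)."""
    dim = len(features[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, f) / math.sqrt(dim) for f in features])
        # Weighted sum of image features: an image-specific category vector.
        out.append([sum(w * f[d] for w, f in zip(weights, features))
                    for d in range(dim)])
    return out


def image_level_loss(category_reprs, classifiers, labels):
    """Multi-label binary cross-entropy with one logit per category."""
    loss = 0.0
    for rep, w, y in zip(category_reprs, classifiers, labels):
        p = 1.0 / (1.0 + math.exp(-dot(rep, w)))
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # avoid log(0)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss / len(labels)


random.seed(0)
NUM_CATEGORIES, NUM_TOKENS, DIM = 4, 6, 8  # toy sizes, chosen arbitrarily

# One query vector per interaction category (learned in practice, random here).
category_queries = [[random.gauss(0, 1) for _ in range(DIM)]
                    for _ in range(NUM_CATEGORIES)]
# Stand-in for the feature tokens of one image.
image_features = [[random.gauss(0, 1) for _ in range(DIM)]
                  for _ in range(NUM_TOKENS)]
classifiers = [[random.gauss(0, 0.1) for _ in range(DIM)]
               for _ in range(NUM_CATEGORIES)]
labels = [1, 0, 0, 1]  # which interaction categories occur in the image

category_reprs = cross_attend(category_queries, image_features)
loss = image_level_loss(category_reprs, classifiers, labels)
```

In a trainable version, the queries and classifier weights would be parameters updated by this loss alongside the main HOI objective; the sketch only shows the forward computation.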

Cite

Text

Xie et al. "Category Query Learning for Human-Object Interaction Classification." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01466

Markdown

[Xie et al. "Category Query Learning for Human-Object Interaction Classification." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/xie2023cvpr-category/) doi:10.1109/CVPR52729.2023.01466

BibTeX

@inproceedings{xie2023cvpr-category,
  title     = {{Category Query Learning for Human-Object Interaction Classification}},
  author    = {Xie, Chi and Zeng, Fangao and Hu, Yue and Liang, Shuang and Wei, Yichen},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {15275--15284},
  doi       = {10.1109/CVPR52729.2023.01466},
  url       = {https://mlanthology.org/cvpr/2023/xie2023cvpr-category/}
}