Pose Augmentation: Class-Agnostic Object Pose Transformation for Object Recognition

Abstract

Object pose increases intraclass object variance, which makes object recognition from 2D images harder. To render a classifier robust to pose variations, most deep neural networks try to eliminate the influence of pose by using large datasets with many poses for each class. Here, we propose a different approach: a class-agnostic object pose transformation network (OPT-Net) that can transform an image along 3D yaw and pitch axes to synthesize additional poses continuously. The synthesized images lead to better training of an object classifier. We design a novel eliminate-add structure to explicitly disentangle pose from object identity: first ‘eliminate’ the pose information of the input image, then ‘add’ target pose information (regularized as continuous variables) to synthesize any target pose. We trained OPT-Net on images of toy vehicles shot on a turntable from the iLab-20M dataset. After training on unbalanced discrete poses (5 classes with 6 poses per object instance, plus 5 classes with only 2 poses), we show that OPT-Net can synthesize balanced, continuous new poses along the yaw and pitch axes with high quality. Training a ResNet-18 classifier with original plus synthesized poses improves mAP accuracy by 9% over training on original poses only. Further, the pre-trained OPT-Net can generalize to new object classes, which we demonstrate on both iLab-20M and RGB-D. We also show that the learned features can generalize to ImageNet. The code has been publicly released.
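The balancing step described above (some classes observed in 6 poses, others in only 2) can be illustrated with a small bookkeeping sketch. This is not the authors' code: the function name `plan_pose_augmentation`, the tuple layout, and the integer pose ids are all assumptions made for illustration, and the actual image synthesis that OPT-Net would perform is left out entirely. The sketch only lists which (source pose, target pose) pairs would need to be synthesized so that every object instance ends up with the full pose set.

```python
from collections import defaultdict


def plan_pose_augmentation(samples, all_poses):
    """Return the synthesis jobs needed to give every instance all poses.

    samples   : iterable of (class_name, instance_id, pose_id) tuples
                (hypothetical layout, not the iLab-20M file format).
    all_poses : list of every pose id the balanced dataset should contain.

    Each job is (class_name, instance_id, source_pose, target_pose); a pose
    transformation model would turn the source view into the target view.
    """
    # Collect the set of poses actually observed for each object instance.
    seen = defaultdict(set)
    for cls, inst, pose in samples:
        seen[(cls, inst)].add(pose)

    jobs = []
    for (cls, inst), poses in sorted(seen.items()):
        source = min(poses)  # assumption: any observed view may act as source
        for target in all_poses:
            if target not in poses:
                jobs.append((cls, inst, source, target))
    return jobs


# Toy data: one instance with all 6 poses, one instance with only 2.
all_poses = list(range(6))
samples = [("car", "car_01", p) for p in all_poses]
samples += [("bus", "bus_01", p) for p in (0, 1)]

jobs = plan_pose_augmentation(samples, all_poses)
for job in jobs:
    print(job)
```

In this toy setup only the under-sampled `bus_01` instance produces jobs, one for each of its four missing poses. Continuous target poses, which the abstract also mentions, would need real-valued pose codes rather than the discrete ids assumed here.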

Cite

Text

Ge et al. "Pose Augmentation: Class-Agnostic Object Pose Transformation for Object Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58604-1_9

Markdown

[Ge et al. "Pose Augmentation: Class-Agnostic Object Pose Transformation for Object Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/ge2020eccv-pose/) doi:10.1007/978-3-030-58604-1_9

BibTeX

@inproceedings{ge2020eccv-pose,
  title     = {{Pose Augmentation: Class-Agnostic Object Pose Transformation for Object Recognition}},
  author    = {Ge, Yunhao and Zhao, Jiaping and Itti, Laurent},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58604-1_9},
  url       = {https://mlanthology.org/eccv/2020/ge2020eccv-pose/}
}