The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement

Wang, Jingyan; Russakovsky, Olga; Ramanan, Deva

doi:10.1109/WACV.2018.00199

WACV 2018 pp. 1794-1803

Abstract

Comprehensive object understanding is a central challenge in visual recognition, yet most advances with deep neural networks reason about each aspect in isolation. In this work, we present a unified framework to tackle this broader object understanding problem. We formalize a refinement module that recursively develops understanding across space and semantics - "the more it looks, the more it sees." More concretely, we cluster the objects within each semantic category into fine-grained subcategories; our recursive model extracts features for each region of interest, recursively predicts the location and the content of the region, and selectively chooses a small subset of the regions to process in the next step. Our model can quickly determine if an object is present, followed by its class ("Is this a person?"), and finally report fine-grained predictions ("Is this person standing?"). Our experiments demonstrate the advantages of joint reasoning about spatial layout and fine-grained semantics. On the PASCAL VOC dataset, our proposed model simultaneously achieves strong performance on instance segmentation, part segmentation and keypoint detection in a single efficient pipeline that does not require explicit training for each task. One of the reasons for our strong performance is the ability to naturally leverage highly-engineered architectures, such as Faster-RCNN, within our pipeline. Source code is available at https://github.com/jingyanw/recursive-refinement.
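The coarse-to-fine loop described in the abstract (predict, keep a small subset of regions, then predict something finer) can be illustrated with a toy sketch. This is not the authors' implementation: the label tree `CHILDREN`, the `Region` type, the `toy_scores` table and the function `recursive_refine` are all hypothetical names invented for illustration, a score lookup stands in for the learned predictor, and the spatial (box) refinement is omitted entirely.

```python
from dataclasses import dataclass

# Hypothetical coarse-to-fine label tree (an assumption, not from the paper).
CHILDREN = {
    "object": ["person", "dog"],
    "person": ["person/standing", "person/sitting"],
}

@dataclass(frozen=True)
class Region:
    name: str     # identifier of the region of interest
    label: str    # current (possibly coarse) label
    score: float  # confidence in the current label

def recursive_refine(regions, toy_scores, keep=2, max_steps=3):
    """Refine labels step by step, keeping only the `keep` best regions each step."""
    for _ in range(max_steps):
        # Selectively keep a small subset of regions for the next step.
        regions = sorted(regions, key=lambda r: r.score, reverse=True)[:keep]
        refined = []
        for r in regions:
            children = CHILDREN.get(r.label, [])
            if not children:
                refined.append(r)  # leaf label: nothing finer to predict
                continue
            # Stand-in for a learned predictor: look up a toy score per child label.
            best = max(children, key=lambda c: toy_scores[r.name].get(c, 0.0))
            refined.append(Region(r.name, best, toy_scores[r.name].get(best, 0.0)))
        if refined == regions:
            break  # every surviving region has reached a leaf label
        regions = refined
    return regions

# Made-up scores for three candidate regions.
toy_scores = {
    "r1": {"object": 0.9, "person": 0.8, "dog": 0.1,
           "person/standing": 0.7, "person/sitting": 0.2},
    "r2": {"object": 0.6, "person": 0.2, "dog": 0.5},
    "r3": {"object": 0.1, "person": 0.05, "dog": 0.02},
}
initial = [Region(n, "object", s["object"]) for n, s in toy_scores.items()]
final = recursive_refine(initial, toy_scores)
print([(r.name, r.label) for r in final])
```

In this toy run the low-scoring region `r3` is dropped after the first "is an object present?" step, so the finer questions are only asked of the regions that survive, which is the source of the efficiency the abstract alludes to.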

Cite

Text

Wang et al. "The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018. doi:10.1109/WACV.2018.00199

Markdown

[Wang et al. "The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018.](https://mlanthology.org/wacv/2018/wang2018wacv-more/) doi:10.1109/WACV.2018.00199

BibTeX

@inproceedings{wang2018wacv-more,
  title     = {{The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement}},
  author    = {Wang, Jingyan and Russakovsky, Olga and Ramanan, Deva},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2018},
  pages     = {1794--1803},
  doi       = {10.1109/WACV.2018.00199},
  url       = {https://mlanthology.org/wacv/2018/wang2018wacv-more/}
}