Chained Predictions Using Convolutional Neural Networks

Georgia Gkioxari, Alexander Toshev, Navdeep Jaitly

ECCV 2016 pp. 728-743

doi:10.1007/978-3-319-46493-0_44

Abstract

In this work, we present an adaptation of the sequence-to-sequence model for structured vision tasks. In this model, the output variables for a given input are predicted sequentially using neural networks. The prediction for each output variable depends not only on the input but also on the previously predicted output variables. The model is applied to spatial localization tasks and uses convolutional neural networks (CNNs) for processing input images and a multi-scale deconvolutional architecture for making spatial predictions at each step. We explore the impact of weight sharing with a recurrent connection matrix between consecutive predictions, and compare it to a formulation where these weights are not tied. Untied weights are particularly suited for problems with a fixed-size structure, where different classes of output are predicted at different steps. We show that chain models achieve top-performing results on human pose estimation from images and videos.
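
The sequential prediction scheme described in the abstract can be illustrated with a small sketch. This is not the authors' architecture: the CNN encoder and deconvolutional decoder are replaced by plain matrices, each output is a single class index picked greedily, and every name (`chained_predict`, `W_x`, `W_out`, `W_rec`, `W_y`) is hypothetical. It only shows how each step conditions on the input and on earlier predictions, and how tied and untied recurrent weights differ.

```python
import numpy as np

def one_hot(index, size):
    """Encode a predicted class index as a one-hot vector."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

def chained_predict(x, W_x, W_out, W_rec, W_y):
    """Greedily predict one output variable per step (toy stand-in, not the paper's model).

    x     : input features, shape (d,)
    W_x   : (n_hidden, d) matrix standing in for the image encoder
    W_out : list of T matrices (k, n_hidden), one output layer per step
    W_rec : list of T matrices (n_hidden, n_hidden); repeat one matrix for tied weights
    W_y   : list of T matrices (n_hidden, k) embedding the previous prediction
    """
    h = np.tanh(W_x @ x)  # the hidden state starts from the input alone
    outputs = []
    for W_o, W_r, W_e in zip(W_out, W_rec, W_y):
        y = int(np.argmax(W_o @ h))  # prediction at this step
        outputs.append(y)
        # Fold the new prediction into the state used by all later steps.
        h = np.tanh(W_r @ h + W_e @ one_hot(y, W_o.shape[0]))
    return outputs

# Assumed toy sizes: 8 input features, 16 hidden units, 5 classes, 3 steps.
rng = np.random.default_rng(0)
d, n_hidden, k, T = 8, 16, 5, 3
x = rng.normal(size=d)
W_x = rng.normal(size=(n_hidden, d))
W_out = [rng.normal(size=(k, n_hidden)) for _ in range(T)]
W_y = [rng.normal(size=(n_hidden, k)) for _ in range(T)]

untied = [rng.normal(size=(n_hidden, n_hidden)) for _ in range(T)]  # one matrix per step
tied = [untied[0]] * T                                              # one shared matrix

preds_untied = chained_predict(x, W_x, W_out, untied, W_y)
preds_tied = chained_predict(x, W_x, W_out, tied, W_y)
```

In this sketch the first prediction is identical under both settings, because the recurrent matrix only enters from the second step onward. Later predictions may differ.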

Cite

Text

Gkioxari et al. "Chained Predictions Using Convolutional Neural Networks." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46493-0_44

Markdown

[Gkioxari et al. "Chained Predictions Using Convolutional Neural Networks." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/gkioxari2016eccv-chained/) doi:10.1007/978-3-319-46493-0_44

BibTeX

@inproceedings{gkioxari2016eccv-chained,
  title     = {{Chained Predictions Using Convolutional Neural Networks}},
  author    = {Gkioxari, Georgia and Toshev, Alexander and Jaitly, Navdeep},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {728--743},
  doi       = {10.1007/978-3-319-46493-0_44},
  url       = {https://mlanthology.org/eccv/2016/gkioxari2016eccv-chained/}
}