Weakly Supervised Learning of Heterogeneous Concepts in Videos

Shah, Sohil; Kulkarni, Kuldeep; Biswas, Arijit; Gandhi, Ankit; Deshmukh, Om; Davis, Larry S.

doi:10.1007/978-3-319-46466-4_17

ECCV 2016 pp. 275-293

Abstract

Typical textual descriptions that accompany online videos are ‘weak’: i.e., they mention the important heterogeneous concepts in the video but not their corresponding spatio-temporal locations. However, certain location constraints on these concepts can be inferred from the description. The goal of this paper is to present a generalization of the Indian Buffet Process (IBP) that can (a) systematically incorporate heterogeneous concepts in an integrated framework, and (b) enforce location constraints, for efficient classification and localization of the concepts in the videos. Finally, we develop posterior inference for the proposed formulation using mean-field variational approximation. Comparative evaluations on the Casablanca and the A2D datasets show that the proposed approach significantly outperforms other state-of-the-art techniques: 24% relative improvement for pairwise concept classification in the Casablanca dataset and 9% relative improvement for localization in the A2D dataset as compared to the most competitive baseline.
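
For readers unfamiliar with the Indian Buffet Process named in the abstract, the following is a minimal sketch of sampling from the standard (ungeneralized) IBP prior, which yields a binary matrix assigning a potentially unbounded set of latent features to data points. It is not the paper's generalized model and does not include its location constraints or variational inference. The names `sample_poisson`, `sample_ibp`, `num_customers`, and `alpha` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_customers, alpha, rng):
    """Sample a binary feature matrix from the standard IBP prior.

    Rows are customers (data points) and columns are dishes (latent features).
    This is the textbook culinary construction, not the paper's generalization.
    """
    dish_counts = []  # dish_counts[k] = how many earlier customers took dish k
    rows = []
    for i in range(1, num_customers + 1):
        # Customer i takes each existing dish k with probability m_k / i.
        row = [int(rng.random() < m / i) for m in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += taken
        # Customer i then tries Poisson(alpha / i) brand-new dishes.
        new_dishes = sample_poisson(alpha / i, rng)
        dish_counts.extend([1] * new_dishes)
        row.extend([1] * new_dishes)
        rows.append(row)
    # Pad earlier rows with zeros so every row has one entry per dish.
    width = len(dish_counts)
    return [row + [0] * (width - len(row)) for row in rows]


if __name__ == "__main__":
    Z = sample_ibp(num_customers=8, alpha=2.0, rng=random.Random(0))
    for row in Z:
        print("".join(str(v) for v in row))
```

Larger values of `alpha` tend to produce more columns, since each customer introduces more new dishes on average.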

Cite

Text

Shah et al. "Weakly Supervised Learning of Heterogeneous Concepts in Videos." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46466-4_17

Markdown

[Shah et al. "Weakly Supervised Learning of Heterogeneous Concepts in Videos." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/shah2016eccv-weakly/) doi:10.1007/978-3-319-46466-4_17

BibTeX

@inproceedings{shah2016eccv-weakly,
  title     = {{Weakly Supervised Learning of Heterogeneous Concepts in Videos}},
  author    = {Shah, Sohil and Kulkarni, Kuldeep and Biswas, Arijit and Gandhi, Ankit and Deshmukh, Om and Davis, Larry S.},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {275-293},
  doi       = {10.1007/978-3-319-46466-4_17},
  url       = {https://mlanthology.org/eccv/2016/shah2016eccv-weakly/}
}