Combining Multiple Sources of Knowledge in Deep CNNs for Action Recognition

Park, Eunbyung; Han, Xufeng; Berg, Tamara L.; Berg, Alexander C.

doi:10.1109/WACV.2016.7477589

Eunbyung Park, Xufeng Han, Tamara L. Berg, Alexander C. Berg

WACV 2016 pp. 1-8

Abstract

Although deep convolutional neural networks (CNNs) have shown remarkable results for feature learning and prediction tasks, many recent studies have demonstrated improved performance by incorporating additional hand-crafted features or by fusing predictions from multiple CNNs. Usually, these combinations are implemented via feature concatenation or by averaging output prediction scores from several CNNs. In this paper, we present new approaches for combining different sources of knowledge in deep learning. First, we propose feature amplification, where we use an auxiliary hand-crafted feature (e.g., optical flow) to perform spatially varying soft-gating on intermediate CNN feature maps. Second, we present a spatially varying multiplicative fusion method for combining multiple CNNs trained on different sources, which yields robust predictions by amplifying or suppressing feature activations based on their agreement. We test these methods in the context of action recognition, where information from spatial and temporal cues is useful, obtaining results that are comparable to state-of-the-art methods and that outperform methods using only CNNs and optical flow features.
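
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the max-normalised gate, the plain element-wise product, and the function names `amplify` and `multiplicative_fusion` are all assumptions chosen for illustration. The flow magnitude map is also assumed to be already resized to the spatial resolution of the feature map.

```python
import numpy as np


def amplify(features, flow_magnitude, eps=1e-8):
    """Soft-gate a (C, H, W) feature map with an (H, W) flow magnitude map.

    Assumption: the gate is the flow magnitude scaled into [0, 1] by its
    maximum; the paper may use a different normalisation.
    """
    gate = flow_magnitude / (flow_magnitude.max() + eps)
    # Broadcasting applies the same spatial gate to every channel.
    return features * gate[np.newaxis, :, :]


def multiplicative_fusion(spatial, temporal):
    """Fuse two (C, H, W) feature maps by element-wise product.

    Assumption: a plain product stands in for the fusion described in the
    abstract. Activations that are strong in both maps are amplified, and
    activations that are weak in either map are suppressed.
    """
    if spatial.shape != temporal.shape:
        raise ValueError("feature maps must have the same shape")
    return spatial * temporal


# Hypothetical toy inputs: 4 channels on a 6x6 spatial grid.
rng = np.random.default_rng(0)
features = rng.random((4, 6, 6))
flow_magnitude = rng.random((6, 6))

amplified = amplify(features, flow_magnitude)
fused = multiplicative_fusion(features, rng.random((4, 6, 6)))
```

Under these assumptions, locations with little motion are attenuated in `amplified`, while `fused` keeps only activations on which both inputs agree.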

Cite

Text

Park et al. "Combining Multiple Sources of Knowledge in Deep CNNs for Action Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477589

Markdown

[Park et al. "Combining Multiple Sources of Knowledge in Deep CNNs for Action Recognition." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/park2016wacv-combining/) doi:10.1109/WACV.2016.7477589

BibTeX

@inproceedings{park2016wacv-combining,
  title     = {{Combining Multiple Sources of Knowledge in Deep CNNs for Action Recognition}},
  author    = {Park, Eunbyung and Han, Xufeng and Berg, Tamara L. and Berg, Alexander C.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2016},
  pages     = {1--8},
  doi       = {10.1109/WACV.2016.7477589},
  url       = {https://mlanthology.org/wacv/2016/park2016wacv-combining/}
}