Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning
Abstract
This paper presents an approach for recognizing human activities from extreme low resolution (e.g., 16x12) videos. Extreme low resolution recognition is not only necessary for analyzing actions at a distance but is also crucial for enabling privacy-preserving recognition of human activities. We design a new two-stream multi-Siamese convolutional neural network. The idea is to explicitly capture the inherent property of low resolution (LR) videos that two images originating from the exact same scene often have totally different pixel values depending on their LR transformations. Our approach learns a shared embedding space that maps LR videos with the same content to the same location regardless of their transformations. We experimentally confirm that our approach of jointly learning such a transform-robust LR video representation and the classifier outperforms the previous state-of-the-art low resolution recognition approaches on two public standard datasets by a meaningful margin.
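The core idea of the embedding learning described above can be illustrated with a generic Siamese contrastive loss: embeddings of two different LR transforms of the same clip are pulled together, while embeddings of different clips are pushed at least a margin apart. This is a minimal NumPy sketch of that general principle, not the paper's exact multi-Siamese loss; all names and values here are illustrative assumptions.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_content, margin=1.0):
    """Generic Siamese contrastive loss (illustrative, not the paper's exact formulation):
    pull embeddings of same-content LR clips together, and push embeddings of
    different-content clips at least `margin` apart."""
    d = np.linalg.norm(emb_a - emb_b)
    if same_content:
        return d ** 2                     # penalize any distance between positives
    return max(0.0, margin - d) ** 2      # penalize negatives closer than the margin

# Toy 8-dim "embeddings": two LR transforms of one clip, plus an unrelated clip.
rng = np.random.default_rng(0)
clip = rng.standard_normal(8)
pos_a = clip + 0.01 * rng.standard_normal(8)   # LR transform 1 of the clip
pos_b = clip + 0.01 * rng.standard_normal(8)   # LR transform 2 of the same clip
neg = rng.standard_normal(8)                   # a different clip

loss_pos = contrastive_loss(pos_a, pos_b, same_content=True)   # should be near zero
loss_neg = contrastive_loss(pos_a, neg, same_content=False)    # zero once far enough apart
```

Minimizing such a loss over many LR transformations of each training video is what yields a representation invariant to the LR transform, which is then trained jointly with the activity classifier.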
Cite
Text
Ryoo et al. "Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12299
Markdown
[Ryoo et al. "Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/ryoo2018aaai-extreme/) doi:10.1609/AAAI.V32I1.12299
BibTeX
@inproceedings{ryoo2018aaai-extreme,
title = {{Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning}},
author = {Ryoo, Michael S. and Kim, Kiyoon and Yang, Hyun Jong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2018},
pages = {7315--7322},
doi = {10.1609/AAAI.V32I1.12299},
url = {https://mlanthology.org/aaai/2018/ryoo2018aaai-extreme/}
}