Monte Carlo Tree Search for Scheduling Activity Recognition
Abstract
This paper presents an efficient approach to video parsing. Our videos show a number of co-occurring individual and group activities. To address challenges of the domain, we use an expressive spatiotemporal AND-OR graph (ST-AOG) that jointly models activity parts, their spatiotemporal relations, and context, and also enables multitarget tracking. Standard ST-AOG inference is prohibitively expensive in our setting, since it would require running a multitude of detectors and tracking their detections over long video footage. We address this problem by formulating cost-sensitive inference of the ST-AOG as Monte Carlo Tree Search (MCTS). For querying an activity in the video, MCTS optimally schedules which detectors and trackers to run, and where to apply them in the space-time volume. Evaluation on benchmark datasets demonstrates that MCTS enables two-orders-of-magnitude speedups without compromising accuracy relative to standard cost-insensitive inference.
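To make the scheduling idea concrete, the following is a minimal sketch of a generic UCT-style MCTS loop applied to a toy detector-selection problem: given detectors with run-time costs and expected evidence gains, pick a schedule that maximizes gain under a compute budget. The detector names, costs, gains, and budget here are invented for illustration and are not taken from the paper, which searches over a far richer space (which detector or tracker to run, and where in the space-time volume).

```python
import math
import random

# Hypothetical toy problem (NOT from the paper): each detector has a
# (run-time cost, expected evidence gain) pair, and a total compute budget.
DETECTORS = {"person": (1.0, 0.4), "car": (2.0, 0.3),
             "group": (3.0, 0.7), "tracker": (2.5, 0.6)}
BUDGET = 6.0

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state          # frozenset of detectors already run
        self.parent = parent
        self.action = action        # detector whose run led to this node
        self.children = []
        self.visits = 0
        self.value = 0.0

def cost(state):
    return sum(DETECTORS[d][0] for d in state)

def gain(state):
    return sum(DETECTORS[d][1] for d in state)

def legal_actions(state):
    # Detectors not yet run that still fit in the remaining budget.
    return [d for d in DETECTORS
            if d not in state and cost(state) + DETECTORS[d][0] <= BUDGET]

def rollout(state):
    # Default policy: keep running randomly chosen affordable detectors.
    state = set(state)
    actions = legal_actions(state)
    while actions:
        state.add(random.choice(actions))
        actions = legal_actions(state)
    return gain(state)

def uct_select(node, c=1.4):
    # Standard UCB1 tree policy balancing exploitation and exploration.
    return max(node.children, key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(iterations=2000, seed=0):
    random.seed(seed)
    root = Node(frozenset())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while node.children and len(node.children) == len(legal_actions(node.state)):
            node = uct_select(node)
        # Expansion: add one untried action, if any remain.
        tried = [ch.action for ch in node.children]
        untried = [a for a in legal_actions(node.state) if a not in tried]
        if untried:
            a = random.choice(untried)
            child = Node(node.state | {a}, parent=node, action=a)
            node.children.append(child)
            node = child
        # Simulation and backpropagation.
        reward = rollout(node.state)
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read off the schedule greedily by visit counts.
    schedule, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        schedule.append(node.action)
    return schedule
```

In the paper's setting, the reward would come from the ST-AOG posterior given the evidence gathered so far, rather than from the fixed per-detector gains used in this sketch.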
Cite
Text
Amer et al. "Monte Carlo Tree Search for Scheduling Activity Recognition." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.171
Markdown
[Amer et al. "Monte Carlo Tree Search for Scheduling Activity Recognition." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/amer2013iccv-monte/) doi:10.1109/ICCV.2013.171
BibTeX
@inproceedings{amer2013iccv-monte,
title = {{Monte Carlo Tree Search for Scheduling Activity Recognition}},
author = {Amer, Mohamed R. and Todorovic, Sinisa and Fern, Alan and Zhu, Song-Chun},
booktitle = {International Conference on Computer Vision},
year = {2013},
doi = {10.1109/ICCV.2013.171},
url = {https://mlanthology.org/iccv/2013/amer2013iccv-monte/}
}