MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking
Abstract
We introduce MART, a Motion-Aware Recurrent neural network (MA-RNN) for Tracking, which models robust long-term spatial-temporal representations. In particular, we propose a simple yet effective context-aware displacement attention (CADA) module to capture target motion in videos. By seamlessly integrating CADA into an RNN, the proposed MA-RNN can spatially align and aggregate temporal information guided by motion from frame to frame, leading to a more effective representation that lets the tracker benefit from motion when handling occlusion, deformation, viewpoint change, etc. Moreover, to deal with scale change, we present a monotonic bounding box regression (mBBR) approach that iteratively predicts regression offsets for the target object under the guidance of the intersection-over-union (IoU) score, guaranteeing non-decreasing accuracy. In extensive experiments on five benchmarks, including GOT-10k, LaSOT, TC-128, OTB-15 and VOT-19, our tracker MART consistently achieves state-of-the-art results and runs in real-time.
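The monotonic refinement idea in mBBR can be illustrated with a toy sketch: repeatedly apply predicted box offsets, but accept a step only when an IoU-style score does not decrease, so accuracy is non-decreasing by construction. This is not the paper's actual mBBR implementation; `predict_offset` and `score` below are hypothetical stand-ins for the learned offset regressor and IoU predictor.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def monotonic_refine(box, predict_offset, score, steps=5):
    """Iteratively apply predicted offsets, keeping a candidate box only
    if the score does not decrease (toy stand-in for mBBR)."""
    best = score(box)
    for _ in range(steps):
        dx1, dy1, dx2, dy2 = predict_offset(box)
        cand = (box[0] + dx1, box[1] + dy1, box[2] + dx2, box[3] + dy2)
        s = score(cand)
        if s >= best:  # monotonic: never accept a worse box
            box, best = cand, s
    return box, best
```

With a perfect-knowledge offset predictor that nudges the box halfway toward a ground-truth box, each accepted step raises the IoU, so the returned score is at least the starting score.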
Cite
Text
Fan and Ling. "MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking." Winter Conference on Applications of Computer Vision, 2021.
Markdown
[Fan and Ling. "MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking." Winter Conference on Applications of Computer Vision, 2021.](https://mlanthology.org/wacv/2021/fan2021wacv-mart/)
BibTeX
@inproceedings{fan2021wacv-mart,
title = {{MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking}},
author = {Fan, Heng and Ling, Haibin},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2021},
pages = {566-575},
url = {https://mlanthology.org/wacv/2021/fan2021wacv-mart/}
}