Identity Preserve Transform: Understand What Activity Classification Models Have Learnt
Abstract
Activity classification has seen great success recently. Performance on small datasets is almost saturated, and the community is moving towards larger datasets. But what drives the performance gains, and what have the models actually learnt? In this paper we propose the identity preserve transform (IPT) to study this question. IPT manipulates the nuisance factors of the data (background, viewpoint, etc.) while keeping the factors relevant to the task (human motion) unchanged. To our surprise, we find that popular models achieve high classification accuracy by relying on highly correlated information (background, objects) rather than on the essential information (human motion). This can explain why an activity classification model usually fails to generalize to datasets it was not trained on. We implement IPT in two forms, i.e. an image-space transform and a 3D transform, using synthetic images. The tool will be made open-source to help study model and dataset design.
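The image-space form of IPT can be illustrated with a short sketch: replace a nuisance factor (here, the background) while leaving the task-relevant pixels (the person) untouched, so a motion-based model should predict the same label before and after. The file names and the mask-based compositing below are illustrative assumptions for a single frame, not the paper's released tool.

```python
# Minimal sketch of an image-space identity preserve transform (IPT):
# swap the background while keeping the person (the task-relevant
# "identity", i.e. the human motion cue) pixel-for-pixel unchanged.
# File names and the mask source are hypothetical.
import numpy as np
from PIL import Image

def image_space_ipt(frame: np.ndarray,
                    person_mask: np.ndarray,
                    new_background: np.ndarray) -> np.ndarray:
    """Composite the masked person onto a new background.

    frame, new_background: HxWx3 uint8 arrays of the same size.
    person_mask: HxW boolean array, True where the person is.
    """
    out = new_background.copy()
    out[person_mask] = frame[person_mask]  # keep task-relevant pixels intact
    return out

if __name__ == "__main__":
    frame = np.asarray(Image.open("frame.jpg").convert("RGB"))
    mask = np.asarray(Image.open("person_mask.png").convert("L")) > 127
    bg = np.asarray(Image.open("new_background.jpg").convert("RGB")
                    .resize((frame.shape[1], frame.shape[0])))
    Image.fromarray(image_space_ipt(frame, mask, bg)).save("ipt_frame.jpg")
```

Applying this transform over every frame of a clip and comparing the model's predictions on the original and transformed videos reveals how much the classifier depends on the background rather than on the motion itself.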
Cite
Text
Lyu et al. "Identity Preserve Transform: Understand What Activity Classification Models Have Learnt." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00012
Markdown
[Lyu et al. "Identity Preserve Transform: Understand What Activity Classification Models Have Learnt." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/lyu2020cvprw-identity/) doi:10.1109/CVPRW50498.2020.00012
BibTeX
@inproceedings{lyu2020cvprw-identity,
title = {{Identity Preserve Transform: Understand What Activity Classification Models Have Learnt}},
author = {Lyu, Jialing and Qiu, Weichao and Yuille, Alan L.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2020},
pages = {39--47},
doi = {10.1109/CVPRW50498.2020.00012},
url = {https://mlanthology.org/cvprw/2020/lyu2020cvprw-identity/}
}