Learning Appearance Manifolds from Video
Abstract
The appearance of dynamic scenes is often largely governed by a latent low-dimensional dynamic process. We show how to learn a mapping from video frames to this low-dimensional representation by exploiting the temporal coherence between frames and supervision from a user. This function maps the frames of the video to a low-dimensional sequence that evolves according to Markovian dynamics. This ensures that the recovered low-dimensional sequence represents a physically meaningful process. We relate our algorithm to manifold learning, semi-supervised learning, and system identification, and demonstrate it on the tasks of tracking 3D rigid objects, deformable bodies, and articulated bodies. We also show how to use the inverse of this mapping to manipulate video.
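The idea of combining a few user-supplied labels with temporal coherence can be illustrated with a much-simplified sketch. This is not the authors' algorithm: it assumes a random-walk smoothness penalty in place of the learned Markovian dynamics, and it fits a plain ridge regression from frames to latents as a separate second step. The function names `smooth_latents` and `fit_ridge` and the parameters `lam` and `alpha` are hypothetical, chosen only for illustration.

```python
import numpy as np

def smooth_latents(T, labeled_idx, labels, lam=1.0):
    """Estimate a latent sequence of length T from a few labeled frames.

    Minimizes  sum_{i in labeled} ||y_i - z_i||^2 + lam * sum_t ||y_{t+1} - y_t||^2,
    i.e. agreement with user labels plus a temporal-coherence penalty
    (an assumed random-walk prior, not the paper's dynamics model).
    """
    labels = np.asarray(labels, dtype=float).reshape(len(labeled_idx), -1)
    # First-difference operator of shape (T-1, T): (D y)_t = y_{t+1} - y_t
    D = np.diff(np.eye(T), axis=0)
    # Diagonal selector with ones at the labeled frames
    S = np.zeros((T, T))
    S[labeled_idx, labeled_idx] = 1.0
    # Right-hand side holds the labels at labeled frames, zeros elsewhere
    rhs = np.zeros((T, labels.shape[1]))
    rhs[labeled_idx] = labels
    # Normal equations of the quadratic objective above
    return np.linalg.solve(S + lam * D.T @ D, rhs)

def fit_ridge(X, Y, alpha=1e-3):
    """Fit a linear map (with bias) from frame features X to latents Y."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W = np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ Y)
    # Return a function that maps new frames to latent coordinates
    return lambda frames: np.hstack([frames, np.ones((frames.shape[0], 1))]) @ W
```

With labels only on the first and last frame, the unlabeled frames are filled in by the smoothness term alone, which is the role temporal coherence plays in the abstract; the regression step then gives a function that can be applied to frames outside the training sequence.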
Cite
Text
Rahimi et al. "Learning Appearance Manifolds from Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2005. doi:10.1109/CVPR.2005.204
Markdown
[Rahimi et al. "Learning Appearance Manifolds from Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2005.](https://mlanthology.org/cvpr/2005/rahimi2005cvpr-learning/) doi:10.1109/CVPR.2005.204
BibTeX
@inproceedings{rahimi2005cvpr-learning,
title = {{Learning Appearance Manifolds from Video}},
author = {Rahimi, Ali and Recht, Ben and Darrell, Trevor},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2005},
pages = {868--875},
doi = {10.1109/CVPR.2005.204},
url = {https://mlanthology.org/cvpr/2005/rahimi2005cvpr-learning/}
}