Learning Appearance Manifolds from Video

Abstract

The appearance of dynamic scenes is often largely governed by a latent low-dimensional dynamic process. We show how to learn a mapping from video frames to this low-dimensional representation by exploiting the temporal coherence between frames and supervision from a user. This function maps the frames of the video to a low-dimensional sequence that evolves according to Markovian dynamics. This ensures that the recovered low-dimensional sequence represents a physically meaningful process. We relate our algorithm to manifold learning, semi-supervised learning, and system identification, and demonstrate it on the tasks of tracking 3D rigid objects, deformable bodies, and articulated bodies. We also show how to use the inverse of this mapping to manipulate video.
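
As a rough illustration of the idea in the abstract, the sketch below fits a map from frames to a one-dimensional latent sequence using a handful of user-labeled frames plus a temporal-coherence penalty. It is a deliberately simplified stand-in and not the paper's algorithm: the map is assumed to be linear, the Markovian dynamics are approximated by a second-difference smoothness penalty, and all names (`second_difference_matrix`, `fit_linear_embedding`, `lam_dyn`, `lam_ridge`) and the synthetic data are hypothetical.

```python
import numpy as np


def second_difference_matrix(num_frames):
    """Return the (T-2) x T matrix D with (D z)[t] = z[t] - 2 z[t+1] + z[t+2]."""
    D = np.zeros((num_frames - 2, num_frames))
    rows = np.arange(num_frames - 2)
    D[rows, rows] = 1.0
    D[rows, rows + 1] = -2.0
    D[rows, rows + 2] = 1.0
    return D


def fit_linear_embedding(frames, labeled_idx, labels, lam_dyn=1.0, lam_ridge=1e-3):
    """Fit w so that frames @ w matches the labels and varies smoothly in time.

    Minimizes (assumed objective, not taken from the paper):
        ||X_l w - y_l||^2 + lam_dyn * ||D X w||^2 + lam_ridge * ||w||^2
    where X_l are the labeled frames and D is the second-difference operator.
    """
    X = np.asarray(frames, dtype=float)
    X_l = X[labeled_idx]
    y_l = np.asarray(labels, dtype=float)
    DX = second_difference_matrix(X.shape[0]) @ X
    # Normal equations of the regularized least-squares problem above.
    A = X_l.T @ X_l + lam_dyn * DX.T @ DX + lam_ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X_l.T @ y_l)


# Synthetic "video": each frame is a 20-dimensional vector driven by one
# smoothly varying latent variable, plus a little noise.
rng = np.random.default_rng(0)
T, dim = 200, 20
z_true = np.sin(np.linspace(0.0, 4.0 * np.pi, T))
direction = rng.normal(size=dim)
frames = np.outer(z_true, direction) + 0.01 * rng.normal(size=(T, dim))

# The "user" labels every 25th frame; the remaining frames are unlabeled.
labeled_idx = np.arange(0, T, 25)
w = fit_linear_embedding(frames, labeled_idx, z_true[labeled_idx])
z_hat = frames @ w  # recovered low-dimensional sequence for all frames
```

The labeled term ties the output to the user's annotations, while the smoothness term uses every frame, labeled or not, which is what makes the fit semi-supervised. A nonlinear map or an explicit dynamics model would replace the linear map and the second-difference penalty in a more faithful implementation.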

Cite

Text

Rahimi et al. "Learning Appearance Manifolds from Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2005. doi:10.1109/CVPR.2005.204

Markdown

[Rahimi et al. "Learning Appearance Manifolds from Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2005.](https://mlanthology.org/cvpr/2005/rahimi2005cvpr-learning/) doi:10.1109/CVPR.2005.204

BibTeX

@inproceedings{rahimi2005cvpr-learning,
  title     = {{Learning Appearance Manifolds from Video}},
  author    = {Rahimi, Ali and Recht, Ben and Darrell, Trevor},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2005},
  pages     = {868--875},
  doi       = {10.1109/CVPR.2005.204},
  url       = {https://mlanthology.org/cvpr/2005/rahimi2005cvpr-learning/}
}