LatentMan : Generating Consistent Animated Characters Using Image Diffusion Models
Abstract
We propose a zero-shot approach for generating consistent videos of animated characters based on Text-to-Image (T2I) diffusion models. Existing Text-to-Video (T2V) methods are expensive to train and require large-scale video datasets to produce diverse characters and motions, while their zero-shot alternatives fail to produce temporally consistent videos with continuous motion. To bridge this gap, we introduce LatentMan, which leverages existing text-based motion diffusion models to generate diverse continuous motions that guide the T2I model. To boost temporal consistency, we introduce the Spatial Latent Alignment module, which computes cross-frame dense correspondences and uses them to align the latents of the video frames. Furthermore, we propose Pixel-Wise Guidance to steer the diffusion process in a direction that minimizes visual discrepancies between frames. Our approach outperforms existing zero-shot T2V approaches at generating videos of animated characters in terms of pixel-wise consistency and user preference.
Cite
Text
Eldesokey and Wonka. "LatentMan : Generating Consistent Animated Characters Using Image Diffusion Models." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00746
Markdown
[Eldesokey and Wonka. "LatentMan : Generating Consistent Animated Characters Using Image Diffusion Models." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/eldesokey2024cvprw-latentman/) doi:10.1109/CVPRW63382.2024.00746
BibTeX
@inproceedings{eldesokey2024cvprw-latentman,
title = {{LatentMan : Generating Consistent Animated Characters Using Image Diffusion Models}},
author = {Eldesokey, Abdelrahman and Wonka, Peter},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {7510-7519},
doi = {10.1109/CVPRW63382.2024.00746},
url = {https://mlanthology.org/cvprw/2024/eldesokey2024cvprw-latentman/}
}