xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations

Cite

Text

Qin et al. "xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-92808-6_16

Markdown

[Qin et al. "xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/qin2024eccvw-xgenvideosyn1/) doi:10.1007/978-3-031-92808-6_16

BibTeX

@inproceedings{qin2024eccvw-xgenvideosyn1,
  title     = {{xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations}},
  author    = {Qin, Can and Xia, Congying and Ramakrishnan, Krithika and Ryoo, Michael S. and Tu, Lifu and Feng, Yihao and Shu, Manli and Zhou, Honglu and Awadalla, Anas and Wang, Jun and Purushwalkam, Senthil and Xue, Le and Zhou, Yingbo and Wang, Huan and Savarese, Silvio and Niebles, Juan Carlos and Chen, Zeyuan and Xu, Ran and Xiong, Caiming},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {249--265},
  doi       = {10.1007/978-3-031-92808-6_16},
  url       = {https://mlanthology.org/eccvw/2024/qin2024eccvw-xgenvideosyn1/}
}