MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training

Uchida, Kengo; Shibuya, Takashi; Takida, Yuhta; Murata, Naoki; Tanke, Julian; Takahashi, Shusuke; Mitsufuji, Yuki

CVPRW 2025 pp. 2910-2919

Abstract

In text-to-motion generation, controllability, as well as generation quality and speed, has become increasingly critical. The controllability challenges include generating a motion whose length matches the given textual description and editing the generated motions according to control signals, such as the start-end positions and the pelvis trajectory. In this paper, we propose MoLA, which provides fast, high-quality, variable-length motion generation and also handles multiple editing tasks in a single framework. Our approach revisits the motion representation used as the model's inputs and outputs, incorporating an activation variable to enable variable-length motion generation. Additionally, we integrate a variational autoencoder and a latent diffusion model, further enhanced through adversarial training, to achieve high-quality and fast generation. Moreover, we apply a training-free guided generation framework to achieve various editing tasks with motion control inputs. We quantitatively show the effectiveness of adversarial learning in text-to-motion generation, and demonstrate the applicability of our editing framework to multiple editing tasks in the motion domain.
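
The abstract does not say how the activation variable is encoded, so the following is only an illustrative sketch of one plausible reading: each frame carries an extra 0/1 flag marking whether it belongs to the motion, fixed-length outputs are padded with inactive frames, and a generated sequence is cut at the first inactive frame. The function names, the per-frame list layout, and the 0.5 threshold are assumptions made for this example and are not taken from the paper.

```python
from typing import List

Frame = List[float]  # one pose feature vector per frame (hypothetical layout)


def pad_with_activation(motion: List[Frame], max_len: int) -> List[Frame]:
    """Pad a motion to max_len frames and append a 0/1 activation flag to each frame."""
    if not motion:
        raise ValueError("motion must contain at least one frame")
    if len(motion) > max_len:
        raise ValueError("motion is longer than max_len")
    dim = len(motion[0])
    # Real frames get activation 1.0 appended as the last feature.
    active = [frame + [1.0] for frame in motion]
    # Padding frames are all zeros, including an activation of 0.0.
    padding = [[0.0] * (dim + 1) for _ in range(max_len - len(motion))]
    return active + padding


def trim_by_activation(padded: List[Frame], threshold: float = 0.5) -> List[Frame]:
    """Keep frames up to the first one whose activation is below threshold; drop the flag."""
    trimmed: List[Frame] = []
    for frame in padded:
        if frame[-1] < threshold:  # assumed cutoff; the paper may use another rule
            break
        trimmed.append(frame[:-1])
    return trimmed
```

Under these assumptions, a model that always emits `max_len` frames can still produce motions of different lengths, because the length is read back from the activation channel rather than supplied separately.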

Cite

Text

Uchida et al. "MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown

[Uchida et al. "MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/uchida2025cvprw-mola/)

BibTeX

@inproceedings{uchida2025cvprw-mola,
  title     = {{MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training}},
  author    = {Uchida, Kengo and Shibuya, Takashi and Takida, Yuhta and Murata, Naoki and Tanke, Julian and Takahashi, Shusuke and Mitsufuji, Yuki},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2025},
  pages     = {2910-2919},
  url       = {https://mlanthology.org/cvprw/2025/uchida2025cvprw-mola/}
}