Jin, Yang

13 publications

NeurIPS 2025. Enhancing Consistency of Flow-Based Image Editing Through Kalman Control. Haozhe Chi, Zhicheng Sun, Yang Jin, Yi Ma, Jing Wang, Yadong Mu
AAAI 2025. Granularity-Adaptive Spatial Evidence Tokenization for Video Question Answering. Hao Jiang, Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Yang Song, Kun Gai, Yadong Mu
ICLR 2025. Pyramidal Flow Matching for Efficient Video Generative Modeling. Yang Jin, Zhicheng Sun, Ningyuan Li, Kun Xu, Kun Xu, Hao Jiang, Nan Zhuang, Quzhe Huang, Yang Song, Yadong Mu, Zhouchen Lin
ECCV 2024. Boosting Gaze Object Prediction via Pixel-Level Supervision from Vision Foundation Model. Yang Jin, Lei Zhang, Shi Yan, Bin Fan, Binglu Wang
NeurIPS 2024. RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance. Zhicheng Sun, Zhenhao Yang, Yang Jin, Haozhe Chi, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Yang Song, Kun Gai, Yadong Mu
AAAI 2024. TransGOP: Transformer-Based Gaze Object Prediction. Binglu Wang, Chenxi Guo, Yang Jin, Haisheng Xia, Nian Liu
ICLR 2024. Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization. Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chengru Song, Dai Meng, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu
ICML 2024. Video-LaVIT: Unified Video-Language Pre-Training with Decoupled Visual-Motional Tokenization. Yang Jin, Zhicheng Sun, Kun Xu, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang Song, Kun Gai, Yadong Mu
ECCV 2024. Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment. Yang Jin, Yadong Mu
CVPR 2023. Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce. Yang Jin, Yongzhi Li, Zehuan Yuan, Yadong Mu
ICCV 2023. Video Action Segmentation via Contextually Refined Temporal Keypoints. Borui Jiang, Yang Jin, Zhentao Tan, Yadong Mu
CVPR 2022. Complex Video Action Reasoning via Learnable Markov Logic Network. Yang Jin, Linchao Zhu, Yadong Mu
NeurIPS 2022. Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding. Yang Jin, Yongzhi Li, Zehuan Yuan, Yadong Mu