Yu, Licheng

26 publications

CVPR 2025 Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu, Xide Xia, Miao Liu, Xiaofang Wang, Mingfu Liang, Ning Zhang, Dimitris N. Metaxas, Licheng Yu
CVPR 2025 Apollo: An Exploration of Video Understanding in Large Multimodal Models Orr Zohar, Xiaohan Wang, Yann Dubois, Nikhil Mehta, Tong Xiao, Philippe Hansen-Estruch, Licheng Yu, Xiaofang Wang, Felix Juefei-Xu, Ning Zhang, Serena Yeung-Levy, Xide Xia
CVPR 2025 Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs Zeyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee, Miao Liu
CVPR 2025 ROICtrl: Boosting Instance Control for Visual Generation Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin, Mike Zheng Shou
CVPR 2024 AVID: Any-Length Video Inpainting with Diffusion Model Zhixing Zhang, Bichen Wu, Xiaoyan Wang, Yaqiao Luo, Luxin Zhang, Yinan Zhao, Peter Vajda, Dimitris Metaxas, Licheng Yu
CVPR 2024 Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang, Yichen Jia, Kapil Krishnakumar, Tong Xiao, Feng Liang, Licheng Yu, Peter Vajda
CVPR 2024 FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu
CVPR 2024 Layout-Agnostic Scene Text Image Synthesis with Diffusion Models Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan
ECCV 2024 Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy L Bearman, Dhruv Mahajan
CVPR 2024 VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang
ICCV 2023 CiT: Curation in Training for Effective Vision-Language Data Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, Gargi Ghosh, Luke Zettlemoyer, Christoph Feichtenhofer
CVPR 2023 FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks Xiao Han, Xiatian Zhu, Licheng Yu, Li Zhang, Yi-Zhe Song, Tao Xiang
CVPR 2023 Learning Procedure-Aware Video Representation from Instructional Videos and Their Narrations Yiwu Zhong, Licheng Yu, Yang Bai, Shangwen Li, Xueting Yan, Yin Li
ICLR 2023 RoPAWS: Robust Semi-Supervised Representation Learning from Uncurated Data Sangwoo Mo, Jong-Chyi Su, Chih-Yao Ma, Mido Assran, Ishan Misra, Licheng Yu, Sean Bell
CVPR 2023 Tell Me What Happened: Unifying Text-Guided Video Completion via Multimodal Masked Video Generation Tsu-Jui Fu, Licheng Yu, Ning Zhang, Cheng-Yang Fu, Jong-Chyi Su, William Yang Wang, Sean Bell
ECCV 2022 FashionViL: Fashion-Focused Vision-and-Language Representation Learning Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang
ECCV 2022 GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou
CVPR 2022 Unsupervised Vision-and-Language Pre-Training via Retrieval-Based Multi-Granular Alignment Mingyang Zhou, Licheng Yu, Amanpreet Singh, Mengjiao Wang, Zhou Yu, Ning Zhang
CVPR 2021 Connecting What to Say with Where to Look by Modeling Human Attention Traces Zihang Meng, Licheng Yu, Ning Zhang, Tamara L. Berg, Babak Damavandi, Vikas Singh, Amy Bearman
ECCV 2020 Behind the Scene: Revealing the Secrets of Pre-Trained Vision-and-Language Models Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
ECCV 2020 TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
ECCV 2020 UNITER: UNiversal Image-TExt Representation Learning Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu
CVPR 2017 A Joint Speaker-Listener-Reinforcer Model for Referring Expressions Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
ECCV 2016 Modeling Context in Referring Expressions Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg
AAAI 2015 Dictionary Learning with Mutually Reinforcing Group-Graph Structures Hongteng Xu, Licheng Yu, Dixin Luo, Hongyuan Zha, Yi Xu
ICCV 2015 Visual Madlibs: Fill in the Blank Description Generation and Question Answering Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg