Chen, Shaoxiang

15 publications

CVPRW 2025. UniToken: Harmonizing Multimodal Understanding and Generation Through Unified Visual Encoding. Yang Jiao, Haibo Qiu, Zequn Jie, Shaoxiang Chen, Jingjing Chen, Lin Ma, Yu-Gang Jiang
NeurIPS 2024. ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model. Yiming Sun, Fan Yu, Shaoxiang Chen, Yu Zhang, Junwei Huang, Yang Li, Chenhui Li, Changbo Wang
AAAI 2024. Instance-Aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning. Yang Jiao, Zequn Jie, Shaoxiang Chen, Lechao Cheng, Jingjing Chen, Lin Ma, Yu-Gang Jiang
NeurIPS 2024. Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models. Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
ECCV 2024. Making Large Language Models Better Planners with Reasoning-Decision Alignment. Zhijian Huang, Tao Tang, Shaoxiang Chen, Sihao Lin, Zequn Jie, Lin Ma, Guangrun Wang, Xiaodan Liang
CVPR 2023. MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection. Yang Jiao, Zequn Jie, Shaoxiang Chen, Jingjing Chen, Lin Ma, Yu-Gang Jiang
ECCV 2022. MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes. Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang
ICCV 2021. Motion Guided Region Message Passing for Video Captioning. Shaoxiang Chen, Yu-Gang Jiang
CVPR 2021. Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning. Shaoxiang Chen, Yu-Gang Jiang
ECCV 2020. Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language. Shaoxiang Chen, Yu-Gang Jiang
ECCV 2020. Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos. Shaoxiang Chen, Wenhao Jiang, Wei Liu, Yu-Gang Jiang
IJCAI 2019. Deep Learning for Video Captioning: A Review. Shaoxiang Chen, Ting Yao, Yu-Gang Jiang
AAAI 2019. Motion Guided Spatial Attention for Video Captioning. Shaoxiang Chen, Yu-Gang Jiang
AAAI 2019. Semantic Proposal for Activity Localization in Videos via Sentence Query. Shaoxiang Chen, Yu-Gang Jiang
ECCVW 2018. Non-Local NetVLAD Encoding for Video Classification. Yongyi Tang, Xing Zhang, Jingwen Wang, Shaoxiang Chen, Lin Ma, Yu-Gang Jiang