He, Tong

75 publications

ICCV 2025 Aether: Geometric-Aware Unified World Modeling Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He
ICLR 2025 Bridging Information Asymmetry in Text-Video Retrieval: A Data-Centric Approach Zechen Bai, Tianjun Xiao, Tong He, Pichao Wang, Zheng Zhang, Thomas Brox, Mike Zheng Shou
AISTATS 2025 Common Learning Constraints Alter Interpretations of Direct Preference Optimization Lemin Kong, Xiangkun Hu, Tong He, David Wipf
NeurIPS 2025 DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Canyu Zhao, Yanlong Sun, Mingyu Liu, Huanyi Zheng, Muzhi Zhu, Zhiyue Zhao, Hao Chen, Tong He, Chunhua Shen
ICLR 2025 Depth Any Video with Scalable Synthetic Data Honghui Yang, Di Huang, Wei Yin, Chunhua Shen, Haifeng Liu, Xiaofei He, Binbin Lin, Wanli Ouyang, Tong He
TMLR 2025 EMMA: End-to-End Multimodal Model for Autonomous Driving Jyh-Jing Hwang, Runsheng Xu, Hubert Lin, Wei-Chih Hung, Jingwei Ji, Kristy Choi, Di Huang, Tong He, Paul Covington, Benjamin Sapp, Yin Zhou, James Guo, Dragomir Anguelov, Mingxing Tan
ICCV 2025 EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds Lu Chen, Yizhou Wang, Shixiang Tang, Qianhong Ma, Tong He, Wanli Ouyang, Xiaowei Zhou, Hujun Bao, Sida Peng
ICML 2025 Explicit Preference Optimization: No Need for an Implicit Reward Model Xiangkun Hu, Lemin Kong, Tong He, David Wipf
AAAI 2025 GigaGS: 3D Gaussian Based Planar Representation for Large-Scene Surface Reconstruction Junyi Chen, Weicai Ye, Yifan Wang, Danpeng Chen, Di Huang, Wanli Ouyang, Guofeng Zhang, Yu Qiao, Tong He
CVPR 2025 GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving Zebin Xing, Xingyu Zhang, Yang Hu, Bo Jiang, Tong He, Qian Zhang, Xiaoxiao Long, Wei Yin
ICLR 2025 Lumina-T2X: Scalable Flow-Based Large Diffusion Transformer for Flexible Resolution Generation Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xie, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, Tong He, Jingwen He, Junjun He, Yu Qiao, Hongsheng Li
ICLR 2025 MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang
ICLR 2025 ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction Ziyu Tang, Weicai Ye, Yifan Wang, Di Huang, Hujun Bao, Tong He, Guofeng Zhang
ICLR 2025 ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs Hao Di, Tong He, Haishan Ye, Yinghui Huang, Xiangyu Chang, Guang Dai, Ivor Tsang
CVPR 2025 S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation Yichen Xie, Runsheng Xu, Tong He, Jyh-Jing Hwang, Katie Luo, Jingwei Ji, Hubert Lin, Letian Chen, Yiren Lu, Zhaoqi Leng, Dragomir Anguelov, Mingxing Tan
ICLR 2025 SPA: 3D Spatial-Awareness Enables Effective Embodied Representation Haoyi Zhu, Honghui Yang, Yating Wang, Jiange Yang, Limin Wang, Tong He
NeurIPS 2025 Sekai: A Video Dataset Towards World Exploration Zhen Li, Chuanhao Li, Xiaofeng Mao, Shaoheng Lin, Ming Li, Shitian Zhao, Zhaopan Xu, Xinyue Li, Yukang Feng, Jianwen Sun, Zizhen Li, Fanrui Zhang, Jiaxin Ai, Zhixiang Wang, Yuwei Wu, Tong He, Yunde Jia, Kaipeng Zhang
ICML 2025 Sparse Autoencoders, Again? Yin Lu, Xuening Zhu, Tong He, David Wipf
CVPR 2025 Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning Jiange Yang, Haoyi Zhu, Yating Wang, Gangshan Wu, Tong He, Limin Wang
ICCV 2025 VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers Yating Wang, Haoyi Zhu, Mingyu Liu, Jiange Yang, Hao-Shu Fang, Tong He
ICLR 2025 Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction Junyi Chen, Di Huang, Weicai Ye, Wanli Ouyang, Tong He
CVPR 2024 Adaptive Slot Attention: Object Discovery with Dynamic Slot Number Ke Fan, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, Zheng Zhang
ECCV 2024 Agent3D-Zero: An Agent for Zero-Shot 3D Understanding Sha Zhang, Di Huang, Jiajun Deng, Shixiang Tang, Wanli Ouyang, Tong He, Yanyong Zhang
AAAI 2024 Boosting Residual Networks with Group Knowledge Shengji Tang, Peng Ye, Baopu Li, Weihao Lin, Tao Chen, Tong He, Chong Yu, Wanli Ouyang
ICLR 2024 Consistent Video-to-Video Transfer Using Synthetic Dataset Jiaxin Cheng, Tianjun Xiao, Tong He
ICLR 2024 Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model Zihan Zhong, Zhiqiang Tang, Tong He, Haoyang Fang, Chun Yuan
ECCV 2024 DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM Yixuan Wu, Yizhou Wang, Shixiang Tang, Wenhao Wu, Tong He, Wanli Ouyang, Philip Torr, Jian Wu
NeurIPS 2024 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Weicai Ye, Chenhao Ji, Zheng Chen, Junyao Gao, Xiaoshui Huang, Song-Hai Zhang, Wanli Ouyang, Tong He, Cairong Zhao, Guofeng Zhang
CVPR 2024 DreamComposer: Controllable 3D Object Generation via Multi-View Conditions Yunhan Yang, Yukun Huang, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
NeurIPS 2024 EMR-Merging: Tuning-Free High-Performance Model Merging Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang
AAAI 2024 Frozen CLIP Transformer Is an Efficient Point Cloud Encoder Xiaoshui Huang, Zhou Huang, Sheng Li, Wentao Qu, Tong He, Yuenan Hou, Yifan Zuo, Wanli Ouyang
ECCV 2024 GVGEN: Text-to-3D Generation with Volumetric Representation Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He
AISTATS 2024 Graph Machine Learning Through the Lens of Bilevel Optimization Amber Yijia Zheng, Tong He, Yixuan Qiu, Minjie Wang, David Wipf
CVPR 2024 Learning for Transductive Threshold Calibration in Open-World Recognition Qin Zhang, Dongsheng An, Tianjun Xiao, Tong He, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing
NeurIPS 2024 NeuRodin: A Two-Stage Framework for High-Fidelity Neural Surface Reconstruction Yifan Wang, Di Huang, Weicai Ye, Guofeng Zhang, Wanli Ouyang, Tong He
ICMLW 2024 New Desiderata for Direct Preference Optimization Xiangkun Hu, Tong He, David Wipf
NeurIPS 2024 One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos Zechen Bai, Tong He, Haiyang Mei, Pichao Wang, Ziteng Gao, Joya Chen, Lei Liu, Zheng Zhang, Mike Zheng Shou
ECCV 2024 Pixel-GS: Density Control with Pixel-Aware Gradient for 3D Gaussian Splatting Zheng Zhang, Wenbo Hu, Yixing Lao, Tong He, Hengshuang Zhao
NeurIPS 2024 Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning Haoyi Zhu, Yating Wang, Di Huang, Weicai Ye, Wanli Ouyang, Tong He
CVPR 2024 Point Transformer V3: Simpler, Faster, Stronger Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
ECCV 2024 PredBench: Benchmarking Spatio-Temporal Prediction Across Diverse Disciplines ZiDong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai
NeurIPS 2024 RAGChecker: A Fine-Grained Framework for Diagnosing Retrieval-Augmented Generation Dongyu Ru, Lin Qiu, Xiangkun Hu, Tianhang Zhang, Peng Shi, Shuaichen Chang, Cheng Jiayang, Cunxiang Wang, Shichao Sun, Huanyu Li, Zizhao Zhang, Binjie Wang, Jiarong Jiang, Tong He, Zhiguo Wang, Pengfei Liu, Yue Zhang, Zheng Zhang
NeurIPS 2024 Rethinking the Training and Evaluation of Rich-Context Layout-to-Image Generation Jiaxin Cheng, Zixu Zhao, Tong He, Tianjun Xiao, Zheng Zhang, Yicong Zhou
CVPR 2024 TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation Xiaopei Wu, Yuenan Hou, Xiaoshui Huang, Binbin Lin, Tong He, Xinge Zhu, Yuexin Ma, Boxi Wu, Haifeng Liu, Deng Cai, Wanli Ouyang
CVPR 2024 UniPAD: A Universal Pre-Training Paradigm for Autonomous Driving Honghui Yang, Sha Zhang, Di Huang, Xiaoyang Wu, Haoyi Zhu, Tong He, Shixiang Tang, Hengshuang Zhao, Qibo Qiu, Binbin Lin, Xiaofei He, Wanli Ouyang
NeurIPS 2024 Unified Lexical Representation for Interpretable Visual-Language Alignment Yifan Li, Yikai Wang, Yanwei Fu, Dongyu Ru, Zheng Zhang, Tong He
ICLR 2023 Bridging the Gap to Real-World Object-Centric Learning Maximilian Seitzer, Max Horn, Andrii Zadaianchuk, Dominik Zietlow, Tianjun Xiao, Carl-Johann Simon-Gabriel, Tong He, Zheng Zhang, Bernhard Schölkopf, Thomas Brox, Francesco Locatello
ICCV 2023 Coarse-to-Fine Amodal Segmentation with Shape Prior Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu
CVPR 2023 Crossing the Gap: Domain Generalization for Image Captioning Yuchen Ren, Zhendong Mao, Shancheng Fang, Yan Lu, Tong He, Hao Du, Yongdong Zhang, Wanli Ouyang
CVPR 2023 GD-MAE: Generative Decoder for MAE Pre-Training on LiDAR Point Clouds Honghui Yang, Tong He, Jiaheng Liu, Hua Chen, Boxi Wu, Binbin Lin, Xiaofei He, Wanli Ouyang
CVPR 2023 MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency Mingye Xu, Mutian Xu, Tong He, Wanli Ouyang, Yali Wang, Xiaoguang Han, Yu Qiao
ICCV 2023 Object-Centric Multiple Object Tracking Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao
CVPR 2023 PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer Honghui Yang, Wenxiao Wang, Minghao Chen, Binbin Lin, Tong He, Hua Chen, Xiaofei He, Wanli Ouyang
ICCV 2023 Ponder: Point Cloud Pre-Training via Neural Rendering Di Huang, Sida Peng, Tong He, Honghui Yang, Xiaowei Zhou, Wanli Ouyang
ICCV 2023 Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-Centric Representation Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu
ICCV 2023 Unsupervised Open-Vocabulary Object Localization in Videos Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He
NeurIPS 2022 Learning Manifold Dimensions with Conditional Variational Autoencoders Yijia Zheng, Tong He, Yixuan Qiu, David P. Wipf
ECCV 2022 PSS: Progressive Sample Selection for Open-World Visual Representation Learning Tianyue Cao, Yongxin Wang, Yifan Xing, Tianjun Xiao, Tong He, Zheng Zhang, Hao Zhou, Joseph Tighe
ECCV 2022 PointInst3D: Segmenting 3D Instances by Points Tong He, Wei Yin, Chunhua Shen, Anton van den Hengel
CVPRW 2022 ResNeSt: Split-Attention Networks Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander J. Smola
NeurIPS 2022 Self-Supervised Amodal Video Object Segmentation Jian Yao, Yuxin Hong, Chiyu Wang, Tianjun Xiao, Tong He, Francesco Locatello, David P. Wipf, Yanwei Fu, Zheng Zhang
ICCV 2021 ARCH++: Animation-Ready Clothed Human Reconstruction Revisited Tong He, Yuanlu Xu, Shunsuke Saito, Stefano Soatto, Tony Tung
CVPR 2021 DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution Tong He, Chunhua Shen, Anton van den Hengel
NeurIPS 2021 GRIN: Generative Relation and Intention Network for Multi-Agent Trajectory Prediction Longyuan Li, Jian Yao, Li Wenliang, Tong He, Tianjun Xiao, Junchi Yan, David P. Wipf, Zheng Zhang
CVPR 2021 HCRF-Flow: Scene Flow from Point Clouds with Continuous High-Order CRFs and Position-Aware Flow Embedding Ruibo Li, Guosheng Lin, Tong He, Fayao Liu, Chunhua Shen
ICCV 2021 Learning Hierarchical Graph Neural Networks for Image Clustering Yifan Xing, Tong He, Tianjun Xiao, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Wipf, Zheng Zhang, Stefano Soatto
NeurIPS 2021 Progressive Coordinate Transforms for Monocular 3D Object Detection Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, Xiangyang Xue
NeurIPS 2020 Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-View Human Reconstruction Tong He, John Collomosse, Hailin Jin, Stefano Soatto
JMLR 2020 GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu
ECCV 2020 Instance-Aware Embedding for Point Cloud Instance Segmentation Tong He, Yifan Liu, Chunhua Shen, Xinlong Wang, Changming Sun
ECCV 2020 Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation Tong He, Dong Gong, Zhi Tian, Chunhua Shen
CoRL 2020 SAM: Squeeze-and-Mimic Networks for Conditional Visual Driving Policy Learning Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Guy Van den Broeck, Stefano Soatto
AAAI 2019 Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors Tong He, Stefano Soatto
ICCV 2017 Single Shot Text Detector with Regional Attention Pan He, Weilin Huang, Tong He, Qile Zhu, Yu Qiao, Xiaolin Li
ECCV 2016 Detecting Text in Natural Image with Connectionist Text Proposal Network Zhi Tian, Weilin Huang, Tong He, Pan He, Yu Qiao