Chen, Shizhe

19 publications

ICCV 2025 HORT: Monocular Hand-Held Objects Reconstruction with Transformers Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen, Cordelia Schmid
ICLR 2025 NextBestPath: Efficient 3D Mapping of Unseen Environments Shiyao Li, Antoine Guédon, Clémentin Boittiaux, Shizhe Chen, Vincent Lepetit
CVPR 2024 SUGAR: Pre-Training 3D Visual Representations for Robotics Shizhe Chen, Ricardo Garcia, Ivan Laptev, Cordelia Schmid
ICCV 2023 Explore and Tell: Embodied Visual Captioning in 3D Environments Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin
CoRL 2023 PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation Shizhe Chen, Ricardo Garcia Pinel, Cordelia Schmid, Ivan Laptev
CVPR 2023 gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev
ECCV 2022 Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning Sipeng Zheng, Shizhe Chen, Qin Jin
CoRL 2022 Instruction-Driven History-Aware Policies for Robotic Manipulations Pierre-Louis Guhur, Shizhe Chen, Ricardo Garcia Pinel, Makarand Tapaswi, Ivan Laptev, Cordelia Schmid
NeurIPS 2022 Language Conditioned Spatial Relation Reasoning for 3D Object Grounding Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
ECCV 2022 Learning from Unlabeled 3D Environments for Vision-and-Language Navigation Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
CVPR 2022 Think Global, Act Local: Dual-Scale Graph Transformer for Vision-and-Language Navigation Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
CVPR 2022 VRDFormer: End-to-End Video Visual Relation Detection with Transformers Sipeng Zheng, Shizhe Chen, Qin Jin
ICCV 2021 Airbert: In-Domain Pretraining for Vision-and-Language Navigation Pierre-Louis Guhur, Makarand Tapaswi, Shizhe Chen, Ivan Laptev, Cordelia Schmid
ICCV 2021 Elaborative Rehearsal for Zero-Shot Action Recognition Shizhe Chen, Dong Huang
NeurIPS 2021 History Aware Multimodal Transformer for Vision-and-Language Navigation Shizhe Chen, Pierre-Louis Guhur, Cordelia Schmid, Ivan Laptev
CVPR 2021 Sketch, Ground, and Refine: Top-Down Dense Video Captioning Chaorui Deng, Shizhe Chen, Da Chen, Yuan He, Qi Wu
CVPR 2021 Towards Diverse Paragraph Captioning for Untrimmed Videos Yuqing Song, Shizhe Chen, Qin Jin
IJCAI 2019 From Words to Sentences: A Progressive Learning Approach for Zero-Resource Machine Translation with Visual Pivots Shizhe Chen, Qin Jin, Jianlong Fu
AAAI 2019 Unsupervised Bilingual Lexicon Induction from Mono-Lingual Multimodal Data Shizhe Chen, Qin Jin, Alexander G. Hauptmann