Liang, Xiaodan
180 publications
AAAI
2025
Affordances-Oriented Planning Using Foundation Models for Continuous Vision-Language Navigation
CVPR
2025
FireEdit: Fine-Grained Instruction-Based Image Editing via Region-Aware Vision Language Model
CVPR
2025
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
NeurIPS
2025
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
NeurIPS
2024
FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving
ECCV
2024
GarmentAligner: Text-to-Garment Generation via Retrieval-Augmented Multi-Level Corrections
CVPR
2024
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models
AAAI
2024
PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping
AAAI
2024
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
NeurIPS
2024
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
NeurIPS
2024
Web2Code: A Large-Scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
CVPR
2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment
ICCV
2023
FULLER: Unified Multi-Modality Multi-Task 3D Perception via Multi-Level Gradient Calibration
CVPR
2023
GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning
ICCV
2023
GrowCLIP: Data-Aware Automatic Model Growing for Large-Scale Contrastive Language-Image Pre-Training
ICLR
2023
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
CVPR
2022
Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search
NeurIPS
2022
CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
NeurIPS
2022
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-Training for Open-World Detection
ICCV
2021
Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift
ICCV
2021
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-Modal Pretraining
ICCV
2021
UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-Body Decoupling 3D Model
NeurIPS
2020
Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation
NeurIPS
2020
Towards Interpretable Natural Language Understanding with Explanations as Latent Variables
CVPR
2017
Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection