Tang, Hao
130 publications
CoRL
2025
3DS-VLA: A 3D Spatial-Aware Vision Language Action Model for Robust Multi-Task Manipulation
AAAI
2025
ARNet: Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling
IJCAI
2025
Connecting Giants: Synergistic Knowledge Transfer of Large Multimodal Models for Few-Shot Learning
ICCV
2025
DynImg: Key Frames with Visual Prompts Are Good Representation for Multi-Modal Video Understanding
NeurIPS
2025
Enhancing Diffusion-Based Unrestricted Adversarial Attacks via Adversary Preferences Alignment
ICCV
2025
MaskSAM: Auto-Prompt SAM with Mask Classification for Volumetric Medical Image Segmentation
WACV
2025
Q-TempFusion: Quantization-Aware Temporal Multi-Sensor Fusion on Bird's-Eye View Representation
NeurIPS
2025
Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-Ray Analysis
NeurIPS
2025
UFO: A Unified Approach to Fine-Grained Visual Perception via Open-Ended Language Interface
ICLR
2025
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning
CVPR
2024
Learning with Unreliability: Fast Few-Shot Voxel Radiance Fields with Relative Geometric Consistency
NeurIPS
2024
Revisiting Adversarial Patches for Designing Camera-Agnostic Attacks Against Person Detection
ECCV
2024
StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
CVPR
2024
Token Transformation Matters: Towards Faithful Post-Hoc Explanation for Vision Transformer
NeurIPS
2024
WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
NeurIPS
2023
LART: Neural Correspondence Learning with Latent Regularization Transformer for 3D Motion Transfer
CVPR
2023
Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer
NeurIPS
2023
Object Reprojection Error (ORE): Camera Pose Benchmarks from Lightweight Tracking Annotations
NeurIPS
2023
PackQViT: Faster Sub-8-Bit Vision Transformers via Full and Packed Quantization on the Mobile
CVPR
2023
Pruning Parameterization with Bi-Level Optimization for Efficient Semantic Segmentation on the Edge