He, Ran
91 publications
CVPR
2025
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-Modal Large Language Models?
CVPR
2025
R-TPT: Improving Adversarial Robustness of Vision-Language Models Through Test-Time Prompt Tuning
ICCV
2025
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
NeurIPS
2025
The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
CVPR
2025
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs in Video Analysis
NeurIPS
2024
Hallo3D: Multi-Modal Hallucination Detection and Mitigation for Consistent 3D Content Generation
NeurIPSW
2024
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning
ICLR
2024
Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models
CVPR
2024
Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer
NeurIPS
2023
Learning-to-Rank Meets Language: Boosting Language-Driven Ordering Alignment for Ordinal Classification
NeurIPS
2022
Orthogonal Transformer: An Efficient Vision Transformer Backbone with Token Orthogonalization
ICCV
2021
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification
IJCAI
2019
Neurons Merging Layer: Towards Progressive Redundancy Reduction for Deep Supervised Hashing
IJCAI
2019
Pedestrian Attribute Recognition by Joint Visual-Semantic Reasoning and Knowledge Distillation