ML Anthology
Hu, Di
41 publications
CVPR
2025
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
Chengxiang Huang
,
Yake Wei
,
Zequn Yang
,
Di Hu
ICLR
2025
AnyTouch: Learning Unified Static-Dynamic Representation Across Multiple Visuo-Tactile Sensors
Ruoxuan Feng
,
Jiangyu Hu
,
Wenke Xia
,
Tianci Gao
,
Ao Shen
,
Yuhao Sun
,
Bin Fang
,
Di Hu
CVPR
2025
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
Henghui Du
,
Guangyao Li
,
Chang Zhou
,
Chunjie Zhang
,
Alan Zhao
,
Di Hu
ICML
2025
Efficient Quantification of Multimodal Interaction at Sample Level
Zequn Yang
,
Hongfa Wang
,
Di Hu
NeurIPS
2025
Human-Assisted Robotic Policy Refinement via Action Preference Optimization
Wenke Xia
,
Yichu Yang
,
Hongtao Wu
,
Xiao Ma
,
Tao Kong
,
Di Hu
NeurIPS
2025
MokA: Multimodal Low-Rank Adaptation for MLLMs
Yake Wei
,
Yu Miao
,
Dongzhan Zhou
,
Di Hu
CVPR
2025
Patch Matters: Training-Free Fine-Grained Image Caption Enhancement via Local Perception
Ruotian Peng
,
Haiying He
,
Yake Wei
,
Yandong Wen
,
Di Hu
CVPR
2025
Phoenix: A Motion-Based Self-Reflection Framework for Fine-Grained Robotic Action Correction
Wenke Xia
,
Ruoxuan Feng
,
Dong Wang
,
Di Hu
ICML
2025
RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer
Haotian Ni
,
Yake Wei
,
Hang Liu
,
Gong Chen
,
Chong Peng
,
Hao Lin
,
Di Hu
ECCV
2024
Can Textual Semantics Mitigate Sounding Object Segmentation Preference?
Yaoting Wang
,
Peiwen Sun
,
Yuanchao Li
,
Honggang Zhang
,
Di Hu
ECCV
2024
Diagnosing and Re-Learning for Balanced Multimodal Learning
Yake Wei
,
Siwei Li
,
Ruoxuan Feng
,
Di Hu
CVPR
2024
Enhancing Multimodal Cooperation via Sample-Level Modality Valuation
Yake Wei
,
Ruoxuan Feng
,
Zihe Wang
,
Di Hu
CoRL
2024
KOI: Accelerating Online Imitation Learning via Hybrid Key-State Guidance
Jingxian Lu
,
Wenke Xia
,
Dong Wang
,
Zhigang Wang
,
Bin Zhao
,
Di Hu
,
Xuelong Li
ICML
2024
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Yake Wei
,
Di Hu
CoRL
2024
Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation
Ruoxuan Feng
,
Di Hu
,
Wenke Ma
,
Xuelong Li
AAAI
2024
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
Yaoting Wang
,
Weisong Liu
,
Guangyao Li
,
Jian Ding
,
Di Hu
,
Xi Li
ICLR
2024
Quantifying and Enhancing Multi-Modal Robustness with Modality Preference
Zequn Yang
,
Yake Wei
,
Ce Liang
,
Di Hu
ECCV
2024
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang
,
Peiwen Sun
,
Dongzhan Zhou
,
Guangyao Li
,
Honggang Zhang
,
Di Hu
AAAI
2024
SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model
Tao Wu
,
Xuewei Li
,
Zhongang Qi
,
Di Hu
,
Xintao Wang
,
Ying Shan
,
Xi Li
ECCV
2024
Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation
Juncheng Ma
,
Peiwen Sun
,
Yaoting Wang
,
Di Hu
MLJ
2024
Towards Accurate Knowledge Transfer via Target-Awareness Representation Disentanglement
Xingjian Li
,
Di Hu
,
Xuhong Li
,
Haoyi Xiong
,
Cheng-Zhong Xu
,
Dejing Dou
WACV
2023
Exploiting Visual Context Semantics for Sound Source Localization
Xinchi Zhou
,
Dongzhan Zhou
,
Di Hu
,
Hang Zhou
,
Wanli Ouyang
WACV
2023
SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance
Xinchi Zhou
,
Dongzhan Zhou
,
Wanli Ouyang
,
Hang Zhou
,
Di Hu
TMLR
2023
Supervised Knowledge May Hurt Novel Class Discovery Performance
Ziyun Li
,
Jona Otholt
,
Ben Dai
,
Di Hu
,
Christoph Meinel
,
Haojin Yang
ICCV
2023
Towards Inadequately Pre-Trained Models in Transfer Learning
Andong Deng
,
Xingjian Li
,
Di Hu
,
Tianyang Wang
,
Haoyi Xiong
,
Cheng-Zhong Xu
NeurIPSW
2022
A Closer Look at Novel Class Discovery from the Labeled Set
Ziyun Li
,
Jona Otholt
,
Ben Dai
,
Di Hu
,
Christoph Meinel
,
Haojin Yang
CVPR
2022
Balanced Multimodal Learning via On-the-Fly Gradient Modulation
Xiaokang Peng
,
Yake Wei
,
Andong Deng
,
Dong Wang
,
Di Hu
CVPR
2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
,
Yake Wei
,
Yapeng Tian
,
Chenliang Xu
,
Ji-Rong Wen
,
Di Hu
NeurIPSW
2022
Not All Knowledge Is Created Equal: Mutual Distillation of Confident Knowledge
Ziyun Li
,
Xinshao Wang
,
Di Hu
,
Neil M. Robertson
,
David A. Clifton
,
Christoph Meinel
,
Haojin Yang
AAAI
2022
SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation
Dongzhan Zhou
,
Xinchi Zhou
,
Di Hu
,
Hang Zhou
,
Lei Bai
,
Ziwei Liu
,
Wanli Ouyang
AAAI
2022
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
Xian Liu
,
Rui Qian
,
Hang Zhou
,
Di Hu
,
Weiyao Lin
,
Ziwei Liu
,
Bolei Zhou
,
Xiaowei Zhou
CVPR
2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian
,
Di Hu
,
Chenliang Xu
AAAI
2021
Temporal Relational Modeling with Self-Supervision for Action Segmentation
Dong Wang
,
Di Hu
,
Xingjian Li
,
Dejing Dou
CVPR
2021
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification
Zechen Bai
,
Zhigang Wang
,
Jian Wang
,
Di Hu
,
Errui Ding
ECCV
2020
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Di Hu
,
Xuhong Li
,
Lichao Mou
,
Pu Jin
,
Dong Chen
,
Liping Jing
,
Xiaoxiang Zhu
,
Dejing Dou
NeurIPS
2020
Discriminative Sounding Objects Localization via Self-Supervised Audiovisual Matching
Di Hu
,
Rui Qian
,
Minyue Jiang
,
Xiao Tan
,
Shilei Wen
,
Errui Ding
,
Weiyao Lin
,
Dejing Dou
ECCV
2020
Multiple Sound Sources Localization from Coarse to Fine
Rui Qian
,
Di Hu
,
Heinrich Dinkel
,
Mengyue Wu
,
Ning Xu
,
Weiyao Lin
ACML
2019
Multivariate Time Series Prediction Based on Optimized Temporal Convolutional Networks with Stacked Auto-Encoders
Yunxiao Wang
,
Zheng Liu
,
Di Hu
,
Mian Zhang
ICCV
2017
Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Xuelong Li
,
Di Hu
,
Xiaoqiang Lu
AAAI
2017
Large Graph Hashing with Spectral Rotation
Xuelong Li
,
Di Hu
,
Feiping Nie
CVPR
2016
Temporal Multimodal Learning in Audiovisual Speech Recognition
Di Hu
,
Xuelong Li
,
Xiaoqiang Lu