Li, Haoyuan

14 publications

ICCV 2025 Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application Ruiyun Yu, Bingyang Guo, Haoyuan Li
IJCAI 2025 CorrDetail: Visual Detail Enhanced Self-Correction for Face Forgery Detection Binjia Zhou, Hengrui Lou, Lizhe Chen, Haoyuan Li, Dawei Luo, Shuai Chen, Jie Lei, Zunlei Feng, Yijun Bei
AAAI 2025 Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Fangxun Shu, Hao Jiang, Linchao Zhu
CVPR 2025 EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Kai Chen, Yunhao Gou, Runhui Huang, Zhili Liu, Daxin Tan, Jing Xu, Chunwei Wang, Yi Zhu, Yihan Zeng, Kuo Yang, Dingdong Wang, Kun Xiang, Haoyuan Li, Haoli Bai, Jianhua Han, Xiaohui Li, Weike Jin, Nian Xie, Yu Zhang, James T. Kwok, Hengshuang Zhao, Xiaodan Liang, Dit-Yan Yeung, Xiao Chen, Zhenguo Li, Wei Zhang, Qun Liu, Lanqing Hong, Lu Hou, Hang Xu
ICML 2025 HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation Tianwei Lin, Wenqiao Zhang, Sijing Li, Yuqian Yuan, Binhe Yu, Haoyuan Li, Wanggui He, Hao Jiang, Mengze Li, Song Xiaohui, Siliang Tang, Jun Xiao, Hui Lin, Yueting Zhuang, Beng Chin Ooi
ICLR 2025 LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation Fangxun Shu, Yue Liao, Lei Zhang, Le Zhuo, Chenning Xu, Guanghao Zhang, Haonan Shi, Long Chan, Tao Zhong, Zhelun Yu, Wanggui He, Siming Fu, Haoyuan Li, Si Liu, Hongsheng Li, Hao Jiang
AAAI 2025 MARS: Mixture of Auto-Regressive Models for Fine-Grained Text-to-Image Synthesis Wanggui He, Siming Fu, Mushui Liu, Xierui Wang, Wenyi Xiao, Fangxun Shu, Yi Wang, Lei Zhang, Zhelun Yu, Haoyuan Li, Ziwei Huang, Leilei Gan, Hao Jiang
ICLR 2025 Streaming Video Question-Answering with In-Context Video KV-Cache Retrieval Shangzhe Di, Zhelun Yu, Guanghao Zhang, Haoyuan Li, Tao Zhong, Hao Cheng, Bolin Li, Wanggui He, Fangxun Shu, Hao Jiang
ICLRW 2025 TDRI: Two-Phase Dialogue Refinement and Co-Adaptation for Interactive Image Generation Yangfan He, Yuheng Feng, Jianhui Wang, Kun Li, Yijin Wang, Haoyuan Li, Sida Li, Yinghui Xia, Tianyu Shi, Miao Zhang
ICLR 2025 UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting Haoyuan Li, Zhou Yanpeng, Tao Tang, Jifei Song, Yihan Zeng, Michael Kampffmeyer, Hang Xu, Xiaodan Liang
ICCV 2023 Coordinate Transformer: Achieving Single-Stage Multi-Person Mesh Recovery from Videos Haoyuan Li, Haoye Dong, Hanchao Jia, Dong Huang, Michael C. Kampffmeyer, Liang Lin, Xiaodan Liang
CVPR 2023 DATE: Domain Adaptive Product Seeker for E-Commerce Haoyuan Li, Hao Jiang, Tao Jin, Mengyan Li, Yan Chen, Zhijie Lin, Yang Zhao, Zhou Zhao
NeurIPS 2022 Towards Effective Multi-Modal Interchanges in Zero-Resource Sounding Object Localization Yang Zhao, Chen Zhang, Haifeng Huang, Haoyuan Li, Zhou Zhao
AAAI 2020 Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding Zhecheng Wang, Haoyuan Li, Ram Rajagopal