Gani, Hanan

5 publications

ICLR 2026 Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks Tajamul Ashraf, Amal Saqib, Hanan Gani, Muhra AlMahri, Yuhao Li, Noor Ahsan, Umair Nawaz, Jean Lahoud, Hisham Cholakkal, Mubarak Shah, Philip Torr, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan
ICCV 2025 AURELIA: Test-Time Reasoning Distillation in Audio-Visual LLMs Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed Elhoseiny, Salman Khan, Dinesh Manocha
WACV 2025 Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models Raza Imam, Hanan Gani, Muhammad Huzaifa, Karthik Nandakumar
CVPR 2025 VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos Shehan Munasinghe, Hanan Gani, Wenqi Zhu, Jiale Cao, Eric Xing, Fahad Shahbaz Khan, Salman Khan
ICLR 2024 LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts Hanan Gani, Shariq Farooq Bhat, Muzammal Naseer, Salman Khan, Peter Wonka