Nag, Sayan

14 publications

ICCV 2025 AURELIA: Test-Time Reasoning Distillation in Audio-Visual LLMs Sanjoy Chowdhury, Hanan Gani, Nishit Anand, Sayan Nag, Ruohan Gao, Mohamed Elhoseiny, Salman Khan, Dinesh Manocha
ICCV 2025 AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Yaoting Wang, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
ICCV 2025 EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception Sanjoy Chowdhury, Subrata Biswas, Sayan Nag, Tushar Nagarajan, Calvin Murdock, Ishwarya Ananthabhotla, Yijun Qian, Vamsi Krishna Ithapu, Dinesh Manocha, Ruohan Gao
NeurIPS 2025 Localizing Knowledge in Diffusion Transformers Arman Zarei, Samyadeep Basu, Keivan Rezaei, Zihao Lin, Sayan Nag, Soheil Feizi
NeurIPS 2025 MAGNET: A Multi-Agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe, Junjie Fei, Sayan Nag, Salman Khan, Mohamed Elhoseiny, Dinesh Manocha
CVPR 2024 Jack of All Tasks, Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi
CVPR 2024 MeLFusion: Synthesizing Music from Image and Language Cues Using Diffusion Models Sanjoy Chowdhury, Sayan Nag, K J Joseph, Balaji Vasan Srinivasan, Dinesh Manocha
ECCV 2024 Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, Jun Chen, Mohamed Elhoseiny, Ruohan Gao, Dinesh Manocha
ECCV 2024 SAFARI: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation Sayan Nag, Koustava Goswami, Srikrishna Karanam
CVPRW 2023 DeCAtt: Efficient Vision Transformers with Decorrelated Attention Heads Mayukh Bhattacharyya, Soumitri Chattopadhyay, Sayan Nag
ICCV 2023 EgoVLPv2: Egocentric Video-Language Pre-Training with Fusion in the Backbone Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang
WACV 2023 SERF: Towards Better Training of Deep Neural Networks Using Log-Softplus ERror Activation Function Sayan Nag, Mayukh Bhattacharyya, Anuraag Mukherjee, Rohit Kundu
TMLR 2023 VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment Shraman Pramanick, Li Jing, Sayan Nag, Jiachen Zhu, Hardik J Shah, Yann LeCun, Rama Chellappa
IJCAI 2022 Deciphering Environmental Air Pollution with Large Scale City Data Mayukh Bhattacharyya, Sayan Nag, Udita Ghosh