Behnam, Payman

3 publications

ICML 2025. RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression. Payman Behnam, Yaosheng Fu, Ritchie Zhao, Po-An Tsai, Zhiding Yu, Alexey Tumanov
TMLR 2025. ∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS. Payman Behnam, Uday Kamal, Sanjana Vijay Ganesh, Zhaoyi Li, Michael Andrew Jurado, Alind Khare, Igor Fedorov, Gaowen Liu, Alexey Tumanov
ECCV 2024. SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference. Alind Khare, Animesh Agrawal, Aditya Annavajjala, Payman Behnam, Myungjin Lee, Hugo M Latapie, Alexey Tumanov