Tumanov, Alexey

9 publications

ICLRW 2025 Initialization Using Update Approximation Is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horváth, Praneeth Vepakomma
ICML 2025 RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression Payman Behnam, Yaosheng Fu, Ritchie Zhao, Po-An Tsai, Zhiding Yu, Alexey Tumanov
TMLR 2025 ∇QDARTS: Quantization as an Elastic Dimension to Differentiable NAS Payman Behnam, Uday Kamal, Sanjana Vijay Ganesh, Zhaoyi Li, Michael Andrew Jurado, Alind Khare, Igor Fedorov, Gaowen Liu, Alexey Tumanov
ECCV 2024 DεpS: Delayed ε-Shrinking for Faster Once-for-All Training Aditya Annavajjala, Alind Khare, Animesh Agrawal, Igor Fedorov, Hugo M Latapie, Myungjin Lee, Alexey Tumanov
TMLR 2024 PLUM: Improving Inference Efficiency by Leveraging Repetition-Sparsity Trade-Off Sachit Kuhar, Yash Jain, Alexey Tumanov
ECCV 2024 SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference Alind Khare, Animesh Agrawal, Aditya Annavajjala, Payman Behnam, Myungjin Lee, Hugo M Latapie, Alexey Tumanov
NeurIPS 2022 UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification Yanbo Xu, Alind Khare, Glenn Matlin, Monish Ramadoss, Rishikesan Kamaleswaran, Chao Zhang, Alexey Tumanov
ICLR 2021 CompOFA – Compound Once-for-All Networks for Faster Multi-Platform Deployment Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov
UAI 2018 IDK Cascades: Fast Deep Learning by Learning Not to Overthink Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez