Srivastava, Varun

2 publications

NeurIPS 2024. Compressing Large Language Models Using Low Rank and Low Precision Decomposition. Rajarshi Saha, Naomi Sagan, Varun Srivastava, Andrea J. Goldsmith, Mert Pilanci
NeurIPS 2023. Matrix Compression via Randomized Low Rank and Low Precision Factorization. Rajarshi Saha, Varun Srivastava, Mert Pilanci