Baherwani, Vatsal

3 publications

NeurIPS 2025 Dense Backpropagation Improves Training for Sparse Mixture-of-Experts Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Sambit Sahu, Tom Goldstein, Supriyo Chakraborty
NeurIPSW 2024 Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein
NeurIPSW 2024 Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein