Saha, Swarnadeep

9 publications

ICLR 2026. Hybrid Reinforcement: When Reward Is Sparse, Better to Be Dense. Leitian Tao, Ilia Kulikov, Swarnadeep Saha, Tianlu Wang, Jing Xu, Sharon Li, Jason E Weston, Ping Yu
ICLR 2026. J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning. Chenxi Whitehouse, Tianlu Wang, Ping Yu, Xian Li, Jason E Weston, Ilia Kulikov, Swarnadeep Saha
ICLR 2026. OptimalThinkingBench: Evaluating Over and Underthinking in LLMs. Pranjal Aggarwal, Seungone Kim, Jack Lanchantin, Sean Welleck, Jason E Weston, Ilia Kulikov, Swarnadeep Saha
ICML 2025. Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge. Swarnadeep Saha, Xian Li, Marjan Ghazvininejad, Jason E Weston, Tianlu Wang
ICLR 2025. System 1.x: Learning to Balance Fast and Slow Planning with Language Models. Swarnadeep Saha, Archiki Prasad, Justin Chen, Peter Hase, Elias Stengel-Eskin, Mohit Bansal
ICML 2024. MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models. Justin Chen, Swarnadeep Saha, Elias Stengel-Eskin, Mohit Bansal
NeurIPS 2023. Can Language Models Teach? Teacher Explanations Improve Student Performance via Personalization. Swarnadeep Saha, Peter Hase, Mohit Bansal
ICLR 2023. Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees. Swarnadeep Saha, Shiyue Zhang, Peter Hase, Mohit Bansal
IJCAI 2019. Aligning Learning Outcomes to Learning Resources: A Lexico-Semantic Spatial Approach. Swarnadeep Saha, Malolan Chetlur, Tejas Indulal Dhamecha, K. Gayathri Wijayarathna, Red Mendoza, Paul Gagnon, Nabil Zary, Shantanu Godbole