Misra, Dipendra

27 publications

NeurIPS 2025 Principled Fine-Tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward Dipendra Misra, Aldo Pacchiano, Ta-Chung Chi, Ge Gao
NeurIPS 2024 Aligning LLM Agents by Learning Latent Preference from User Edits Ge Gao, Alexey Taymanov, Eduardo Salinas, Paul Mineiro, Dipendra Misra
NeurIPS 2024 Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning Dylan J. Foster, Adam Block, Dipendra Misra
ICLRW 2024 LLF-Bench: Benchmark for Interactive Learning from Language Feedback Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan
NeurIPS 2024 Policy Improvement Using Language Feedback Models Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté
ICML 2024 Provable Interactive Learning with Hindsight Instruction Feedback Dipendra Misra, Aldo Pacchiano, Robert E. Schapire
ICLR 2024 The Truth Is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Pratyusha Sharma, Jordan T. Ash, Dipendra Misra
ICLR 2024 Towards Principled Representation Learning from Videos for Reinforcement Learning Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford
NeurIPSW 2024 Towards Principled Representation Learning from Videos for Reinforcement Learning Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford
TMLR 2023 Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Rajiv Didolkar, Dipendra Misra, Dylan J Foster, Lekan P Molu, Rajan Chari, Akshay Krishnamurthy, John Langford
NeurIPSW 2023 Learning to Generate Better than Your LLM Jonathan Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun
ICML 2023 Principled Offline RL in the Presence of Rich Exogenous Information Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Rajiv Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford
AISTATS 2023 Provable Safe Reinforcement Learning with Binary Feedback Andrew Bennett, Dipendra Misra, Nathan Kallus
NeurIPS 2023 Survival Instinct in Offline Reinforcement Learning Anqi Li, Dipendra Misra, Andrey Kolobov, Ching-An Cheng
ICMLW 2023 Survival Instinct in Offline Reinforcement Learning and Implicit Human Bias in Data Anqi Li, Dipendra Misra, Andrey Kolobov, Ching-An Cheng
AISTATS 2022 Investigating the Role of Negatives in Contrastive Representation Learning Jordan Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra
NeurIPSW 2022 Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information Riashat Islam, Manan Tomar, Alex Lamb, Hongyu Zang, Yonathan Efroni, Dipendra Misra, Aniket Rajiv Didolkar, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford
ICLR 2022 Provably Filtering Exogenous Distractors Using Multistep Inverse Dynamics Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford
NeurIPS 2022 Provably Sample-Efficient RL with Side Information About Latent Dynamics Yao Liu, Dipendra Misra, Miro Dudik, Robert E. Schapire
COLT 2022 Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information Yonathan Efroni, Dylan J Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford
NeurIPSW 2022 Towards Data-Driven Offline Simulations for Online Reinforcement Learning Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman
ICML 2022 Understanding Contrastive Learning Requires Incorporating Inductive Biases Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
ICML 2021 Interactive Learning from Activity Description Khanh X Nguyen, Dipendra Misra, Robert Schapire, Miroslav Dudik, Patrick Shafto
ICLR 2021 Provable Rich Observation Reinforcement Learning with Combinatorial Latent States Dipendra Misra, Qinghua Liu, Chi Jin, John Langford
ICML 2020 Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford
NeurIPS 2020 Learning the Linear Quadratic Regulator from Nonlinear Observations Zakaria Mhammedi, Dylan J Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford
ICML 2018 Lipschitz Continuity in Model-Based Reinforcement Learning Kavosh Asadi, Dipendra Misra, Michael Littman