Sukhbaatar, Sainbayar

30 publications

ICLR 2025 Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng
ICML 2025 R.I.P.: Better Models by Survival of the Fittest Prompts Ping Yu, Weizhe Yuan, Olga Golovneva, Tianhao Wu, Sainbayar Sukhbaatar, Jason E Weston, Jing Xu
NeurIPS 2025 Self-Challenging Language Model Agents Yifei Zhou, Sergey Levine, Jason E Weston, Xian Li, Sainbayar Sukhbaatar
ICML 2025 Self-Consistency Preference Optimization Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang, Jing Xu, Maryam Fazel-Zarandi, Mohit Bansal, Sainbayar Sukhbaatar, Jason E Weston, Jane Yu
ICML 2025 Thinking LLMs: General Instruction Following with Thought Generation Tianhao Wu, Janice Lan, Weizhe Yuan, Jiantao Jiao, Jason E Weston, Sainbayar Sukhbaatar
ICLRW 2025 Training Large Language Models to Reason in a Continuous Latent Space Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason E Weston, Yuandong Tian
ICLRW 2024 Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Lucas Lehnert, Sainbayar Sukhbaatar, Paul McVay, Michael Rabbat, Yuandong Tian
CoLLAs 2024 Compositional Interfaces for Compositional Generalization Jelena Luketina, Jack Lanchantin, Sainbayar Sukhbaatar, Arthur Szlam
NeurIPS 2024 Iterative Reasoning Preference Optimization Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
ICML 2024 Self-Rewarding Language Models Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason E Weston
ICMLW 2024 Teaching Large Language Models to Reason with Reinforcement Learning Alexander Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu
AAAI 2023 A Data Source for Reasoning Embodied Agents Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam
NeurIPSW 2023 A Study on Improving Reasoning in Language Models Yuqing Du, Alexander Havrilla, Sainbayar Sukhbaatar, Pieter Abbeel, Roberta Raileanu
ICMLW 2023 Compositional Interfaces for Compositional Generalization Jelena Luketina, Jack Lanchantin, Sainbayar Sukhbaatar, Arthur Szlam
NeurIPS 2023 Learning to Reason and Memorize with Self-Notes Jack Lanchantin, Shubham Toshniwal, Jason Weston, Arthur Szlam, Sainbayar Sukhbaatar
ICLRW 2023 Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions Lina Mezghani, Piotr Bojanowski, Karteek Alahari, Sainbayar Sukhbaatar
CoRL 2022 Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari
NeurIPS 2022 Staircase Attention for Recurrent Processing of Sequences Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason E Weston
UAI 2022 Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio
ICLRW 2022 Walk the Random Walk: Learning to Discover and Reach Goals Without Supervision Lina Mezghani, Piotr Bojanowski, Karteek Alahari, Sainbayar Sukhbaatar
ICMLW 2021 Exploration-Driven Representation Learning in Reinforcement Learning Akram Erraqabi, Harry Zhao, Marlos C. Machado, Yoshua Bengio, Sainbayar Sukhbaatar, Ludovic Denoyer, Alessandro Lazaric
NeurIPS 2021 Hash Layers for Large Sparse Models Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston
ICML 2021 Not All Memories Are Created Equal: Learning to Forget by Expiring Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan
ICLR 2019 Learning When to Communicate at Scale in Multiagent Cooperative and Competitive Tasks Amanpreet Singh, Tushar Jain, Sainbayar Sukhbaatar
ICML 2018 Composable Planning with Attributes Amy Zhang, Sainbayar Sukhbaatar, Adam Lerer, Arthur Szlam, Rob Fergus
ICLR 2018 Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus
NeurIPS 2016 Learning Multiagent Communication with Backpropagation Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
NeurIPS 2015 End-to-End Memory Networks Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
ICLR 2015 Learning from Noisy Labels with Deep Neural Networks Sainbayar Sukhbaatar, Rob Fergus
ICLR 2013 Auto-Pooling: Learning to Improve Invariance of Image Features from Image Sequences Sainbayar Sukhbaatar, Takaki Makino, Kazuyuki Aihara