ML Anthology
Sukhbaatar, Sainbayar
30 publications
ICLR
2025
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, Yuandong Tian, Qinqing Zheng
ICML
2025
R.I.P.: Better Models by Survival of the Fittest Prompts
Ping Yu, Weizhe Yuan, Olga Golovneva, Tianhao Wu, Sainbayar Sukhbaatar, Jason E Weston, Jing Xu
NeurIPS
2025
Self-Challenging Language Model Agents
Yifei Zhou, Sergey Levine, Jason E Weston, Xian Li, Sainbayar Sukhbaatar
ICML
2025
Self-Consistency Preference Optimization
Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang, Jing Xu, Maryam Fazel-Zarandi, Mohit Bansal, Sainbayar Sukhbaatar, Jason E Weston, Jane Yu
ICML
2025
Thinking LLMs: General Instruction Following with Thought Generation
Tianhao Wu, Janice Lan, Weizhe Yuan, Jiantao Jiao, Jason E Weston, Sainbayar Sukhbaatar
ICLRW
2025
Training Large Language Models to Reason in a Continuous Latent Space
Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason E Weston, Yuandong Tian
ICLRW
2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
Lucas Lehnert, Sainbayar Sukhbaatar, Paul McVay, Michael Rabbat, Yuandong Tian
CoLLAs
2024
Compositional Interfaces for Compositional Generalization
Jelena Luketina, Jack Lanchantin, Sainbayar Sukhbaatar, Arthur Szlam
NeurIPS
2024
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
ICML
2024
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason E Weston
ICMLW
2024
Teaching Large Language Models to Reason with Reinforcement Learning
Alexander Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu
AAAI
2023
A Data Source for Reasoning Embodied Agents
Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam
NeurIPSW
2023
A Study on Improving Reasoning in Language Models
Yuqing Du, Alexander Havrilla, Sainbayar Sukhbaatar, Pieter Abbeel, Roberta Raileanu
ICMLW
2023
Compositional Interfaces for Compositional Generalization
Jelena Luketina, Jack Lanchantin, Sainbayar Sukhbaatar, Arthur Szlam
NeurIPS
2023
Learning to Reason and Memorize with Self-Notes
Jack Lanchantin, Shubham Toshniwal, Jason Weston, Arthur Szlam, Sainbayar Sukhbaatar
ICLRW
2023
Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions
Lina Mezghani, Piotr Bojanowski, Karteek Alahari, Sainbayar Sukhbaatar
CoRL
2022
Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping
Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari
NeurIPS
2022
Staircase Attention for Recurrent Processing of Sequences
Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason E Weston
UAI
2022
Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL
Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio
ICLRW
2022
Walk the Random Walk: Learning to Discover and Reach Goals Without Supervision
Lina Mezghani, Piotr Bojanowski, Karteek Alahari, Sainbayar Sukhbaatar
ICMLW
2021
Exploration-Driven Representation Learning in Reinforcement Learning
Akram Erraqabi, Harry Zhao, Marlos C. Machado, Yoshua Bengio, Sainbayar Sukhbaatar, Ludovic Denoyer, Alessandro Lazaric
NeurIPS
2021
Hash Layers for Large Sparse Models
Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston
ICML
2021
Not All Memories Are Created Equal: Learning to Forget by Expiring
Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan
ICLR
2019
Learning When to Communicate at Scale in Multiagent Cooperative and Competitive Tasks
Amanpreet Singh, Tushar Jain, Sainbayar Sukhbaatar
ICML
2018
Composable Planning with Attributes
Amy Zhang, Sainbayar Sukhbaatar, Adam Lerer, Arthur Szlam, Rob Fergus
ICLR
2018
Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play
Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus
NeurIPS
2016
Learning Multiagent Communication with Backpropagation
Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus
NeurIPS
2015
End-to-End Memory Networks
Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
ICLR
2015
Learning from Noisy Labels with Deep Neural Networks
Sainbayar Sukhbaatar, Rob Fergus
ICLR
2013
Auto-Pooling: Learning to Improve Invariance of Image Features from Image Sequences
Sainbayar Sukhbaatar, Takaki Makino, Kazuyuki Aihara