Weston, Jason E

24 publications

ICLR 2026 Hybrid Reinforcement: When Reward Is Sparse, Better to Be Dense Leitian Tao, Ilia Kulikov, Swarnadeep Saha, Tianlu Wang, Jing Xu, Sharon Li, Jason E Weston, Ping Yu
ICLR 2026 J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning Chenxi Whitehouse, Tianlu Wang, Ping Yu, Xian Li, Jason E Weston, Ilia Kulikov, Swarnadeep Saha
ICLR 2026 LLM Pretraining with Continuous Concepts Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason E Weston, Xian Li
ICLR 2026 OptimalThinkingBench: Evaluating Over and Underthinking in LLMs Pranjal Aggarwal, Seungone Kim, Jack Lanchantin, Sean Welleck, Jason E Weston, Ilia Kulikov, Swarnadeep Saha
ICLR 2026 RESTRAIN: From Spurious Votes to Signals — Self-Training RL with Self-Penalization Zhaoning Yu, Zhaolun Su, Leitian Tao, Haozhu Wang, Aashu Singh, Hanchao Yu, Jianyu Wang, Hongyang Gao, Weizhe Yuan, Jason E Weston, Ping Yu, Jing Xu
ICLR 2026 Scaling Agent Learning via Experience Synthesis Zhaorun Chen, Zhuokai Zhao, Kai Zhang, Bo Liu, Qi Qi, Yifan Wu, Tarun Kalluri, Xuefei Cao, Yuanhao Xiong, Haibo Tong, Huaxiu Yao, Hengduo Li, Jiacheng Zhu, Xian Li, Dawn Song, Bo Li, Jason E Weston, Dat Huynh
ICLR 2026 The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Jingyu Zhang, Haozhu Wang, Eric Michael Smith, Sid Wang, Amr Sharaf, Mahesh Pasupuleti, Benjamin Van Durme, Daniel Khashabi, Jason E Weston, Hongyuan Zhan
ICLR 2025 Backtracking Improves Generation Safety Yiming Zhang, Jianfeng Chi, Hailey Nguyen, Kartikeya Upasani, Daniel M. Bikel, Jason E Weston, Eric Michael Smith
ICML 2025 Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge Swarnadeep Saha, Xian Li, Marjan Ghazvininejad, Jason E Weston, Tianlu Wang
NeurIPS 2025 Meta CLIP 2: A Worldwide Scaling Recipe Yung-Sung Chuang, Yang Li, Dong Wang, Ching-Feng Yeh, Kehan Lyu, Ramya Raghavendra, James R. Glass, Lifei Huang, Jason E Weston, Luke Zettlemoyer, Xinlei Chen, Zhuang Liu, Saining Xie, Wen-tau Yih, Shang-Wen Li, Hu Xu
NeurIPS 2025 NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions Weizhe Yuan, Jane Yu, Song Jiang, Karthik Padthe, Yang Li, Dong Wang, Ilia Kulikov, Kyunghyun Cho, Yuandong Tian, Jason E Weston, Xian Li
ICML 2025 R.I.P.: Better Models by Survival of the Fittest Prompts Ping Yu, Weizhe Yuan, Olga Golovneva, Tianhao Wu, Sainbayar Sukhbaatar, Jason E Weston, Jing Xu
NeurIPS 2025 Self-Challenging Language Model Agents Yifei Zhou, Sergey Levine, Jason E Weston, Xian Li, Sainbayar Sukhbaatar
ICML 2025 Self-Consistency Preference Optimization Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang, Jing Xu, Maryam Fazel-Zarandi, Mohit Bansal, Sainbayar Sukhbaatar, Jason E Weston, Jane Yu
ICLRW 2025 Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Alisia Maria Lupidi, Carlos Gemmell, Nicola Cancedda, Jane Yu, Jason E Weston, Jakob Nicolaus Foerster, Roberta Raileanu, Maria Lomeli
ICML 2025 Thinking LLMs: General Instruction Following with Thought Generation Tianhao Wu, Janice Lan, Weizhe Yuan, Jiantao Jiao, Jason E Weston, Sainbayar Sukhbaatar
ICLRW 2025 Training Large Language Models to Reason in a Continuous Latent Space Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason E Weston, Yuandong Tian
ICLRW 2024 Chain-of-Verification Reduces Hallucination in Large Language Models Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason E Weston
NeurIPSW 2024 Distilling System 2 into System 1 Ping Yu, Jing Xu, Jason E Weston, Ilia Kulikov
ICLR 2024 Self-Alignment with Instruction Backtranslation Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Omer Levy, Luke Zettlemoyer, Jason E Weston, Mike Lewis
ICML 2024 Self-Rewarding Language Models Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason E Weston
ICLRW 2024 The ART of LLM Refinement: Ask, Refine, Trust Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ramakanth Pasunuru, Mrinmaya Sachan, Jason E Weston, Asli Celikyilmaz
NeurIPS 2022 Staircase Attention for Recurrent Processing of Sequences Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason E Weston
NeurIPS 2016 Dialog-Based Language Learning Jason E Weston