McAleer, Stephen Marcus

19 publications

TMLR 2025 Tree Search for Language Model Agents Jing Yu Koh, Stephen Marcus McAleer, Daniel Fried, Ruslan Salakhutdinov
ICML 2024 AlphaZero-like Tree-Search Can Guide Large Language Model Decoding and Training Ziyu Wan, Xidong Feng, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang
ICLR 2024 Confronting Reward Model Overoptimization with Constrained RLHF Ted Moskovitz, Aaditya K Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca Dragan, Stephen Marcus McAleer
ICLR 2024 Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Benjamin Eysenbach, Tuomas Sandholm, Furong Huang, Stephen Marcus McAleer
ICLR 2024 Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks Tim Franzmeyer, Stephen Marcus McAleer, Joao F. Henriques, Jakob Nicolaus Foerster, Philip Torr, Adel Bibi, Christian Schroeder de Witt
ICLR 2024 Llemma: An Open Language Model for Mathematics Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen Marcus McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck
ICLR 2024 Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games Stephen Marcus McAleer, JB Lanier, Kevin A. Wang, Pierre Baldi, Tuomas Sandholm, Roy Fox
ICML 2023 A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems Oliver Slumbers, David Henry Mguni, Stefano B Blumberg, Stephen Marcus McAleer, Yaodong Yang, Jun Wang
ICLR 2023 ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret Stephen Marcus McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm
ICMLW 2023 Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations Yongyuan Liang, Yanchao Sun, Ruijie Zheng, Xiangyu Liu, Tuomas Sandholm, Furong Huang, Stephen Marcus McAleer
ICMLW 2023 Illusory Attacks: Detectability Matters in Adversarial Attacks on Sequential Decision-Makers Tim Franzmeyer, Stephen Marcus McAleer, Joao F. Henriques, Jakob Nicolaus Foerster, Philip Torr, Adel Bibi, Christian Schroeder de Witt
TMLR 2023 JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen Marcus McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang
ICML 2023 MANSA: Learning Fast and Slow in Multi-Agent Systems David Henry Mguni, Haojun Chen, Taher Jafferjee, Jianhong Wang, Longfei Yue, Xidong Feng, Stephen Marcus McAleer, Feifei Tong, Jun Wang, Yaodong Yang
ICML 2023 Regret-Minimizing Double Oracle for Extensive-Form Games Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang
NeurIPSW 2022 ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret Stephen Marcus McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm
NeurIPSW 2022 Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments John Banister Lanier, Stephen Marcus McAleer, Pierre Baldi, Roy Fox
TMLR 2022 Online Double Oracle Le Cong Dinh, Stephen Marcus McAleer, Zheng Tian, Nicolas Perez-Nieves, Oliver Slumbers, David Henry Mguni, Jun Wang, Haitham Bou Ammar, Yaodong Yang
NeurIPSW 2021 Target Entropy Annealing for Discrete Soft Actor-Critic Yaosheng Xu, Dailin Hu, Litian Liang, Stephen Marcus McAleer, Pieter Abbeel, Roy Fox
NeurIPSW 2021 Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates Litian Liang, Yaosheng Xu, Stephen Marcus McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox