Peng, Bei

12 publications

NeurIPS 2025 | MACS: Multi-Agent Reinforcement Learning for Optimization of Crystal Structures | Elena Zamaraeva, Christopher Collins, George R Darling, Matthew Stephen Dyer, Bei Peng, Rahul Savani, Dmytro Antypov, Vladimir Gusev, Judith Clymo, Paul G. Spirakis, Matthew Rosseinsky
CVPR 2025 | SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model | Zhenglin Huang, Jinwei Hu, Xiangtai Li, Yiwei He, Xingyu Zhao, Bei Peng, Baoyuan Wu, Xiaowei Huang, Guangliang Cheng
NeurIPS 2021 | FACMAC: Factored Multi-Agent Centralised Policy Gradients | Bei Peng, Tabish Rashid, Christian Schroeder de Witt, Pierre-Alexandre Kamienny, Philip Torr, Wendelin Boehmer, Shimon Whiteson
ICLR 2021 | RODE: Learning Roles to Decompose Multi-Agent Tasks | Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang
ICML 2021 | Randomized Entity-Wise Factorization for Multi-Agent Reinforcement Learning | Shariq Iqbal, Christian A Schroeder De Witt, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha
NeurIPS 2021 | Regularized SoftMax Deep Multi-Agent Q-Learning | Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson
ICML 2021 | UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning | Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Boehmer, Shimon Whiteson
JMLR 2020 | Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey | Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone
ICLR 2020 | Optimistic Exploration Even with a Pessimistic Initialisation | Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson
NeurIPS 2020 | Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson
ICML 2017 | Interactive Learning from Policy-Dependent Human Feedback | James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David L. Roberts, Matthew E. Taylor, Michael L. Littman
AAAI 2014 | A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback | Robert Tyler Loftin, James MacGlashan, Bei Peng, Matthew E. Taylor, Michael L. Littman, Jeff Huang, David L. Roberts