Qi, Penghui

7 publications

ICLR 2026. Revisiting Parameter Server in LLM Post-Training. Xinyi Wan, Penghui Qi, Guangxing Huang, Chaoyi Ruan, Min Lin, Jialin Li
ICLR 2026. SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning. Bo Liu, Simon Yu, Zichen Liu, Leon Guertler, Penghui Qi, Daniel Balcells, Mickel Liu, Cheston Tan, Weiyan Shi, Min Lin, Wee Sun Lee, Natasha Jaques
NeurIPS 2025. Optimizing Anytime Reasoning via Budget Relative Policy Optimization. Penghui Qi, Zichen Liu, Tianyu Pang, Chao Du, Wee Sun Lee, Min Lin
ICML 2025. PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization. Xinyi Wan, Penghui Qi, Guangxing Huang, Min Lin, Jialin Li
NeurIPS 2024. Pipeline Parallelism with Controllable Memory. Penghui Qi, Xinyi Wan, Nyamdavaa Amar, Min Lin
ICLR 2024. Zero Bubble (Almost) Pipeline Parallelism. Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin
ICML 2021. SCC: An Efficient Deep Reinforcement Learning Agent Mastering the Game of StarCraft II. Xiangjun Wang, Junxiao Song, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan