ML Anthology
Mao, Weichao
10 publications
ICML
2025
Teaching Language Models to Critique via Reinforcement Learning
Zhihui Xie, Jie Chen, Liyu Chen, Weichao Mao, Jingjing Xu, Lingpeng Kong
ICLRW
2025
Teaching Language Models to Critique via Reinforcement Learning
Zhihui Xie, Jie Chen, Liyu Chen, Weichao Mao, Jingjing Xu, Lingpeng Kong
L4DC
2024
$\widetilde{O}(T^{-1})$ Convergence to (coarse) Correlated Equilibria in Full-Information General-Sum Markov Games
Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Tamer Başar
L4DC
2024
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms
Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
NeurIPS
2023
Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity
Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Ravishankar Iyer, Tamer Basar
NeurIPS
2022
A Mean-Field Game Approach to Cloud Resource Management with Function Approximation
Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Ravishankar Iyer, Tamer Basar
ICML
2022
On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Weichao Mao, Lin Yang, Kaiqing Zhang, Tamer Basar
ICML
2021
Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
Weichao Mao, Kaiqing Zhang, Ruihao Zhu, David Simchi-Levi, Tamer Basar
NeurIPS
2020
POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis
Weichao Mao, Kaiqing Zhang, Qiaomin Xie, Tamer Basar
IJCAI
2018
Online Pricing for Revenue Maximization with Unknown Time Discounting Valuations
Weichao Mao, Zhenzhe Zheng, Fan Wu, Guihai Chen