Tao, Nigel

2 publications

ICML 2001. A Multi-Agent Policy-Gradient Approach to Network Routing. Nigel Tao, Jonathan Baxter, Lex Weaver.
UAI 2001. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning. Lex Weaver, Nigel Tao.