Thomas, Philip S.

41 publications

NeurIPS 2025 Beyond Prediction: Managing the Repercussions of Machine Learning Applications Aline Weber, Blossom Metevier, Yuriy Brun, Philip S. Thomas, Bruno Castro da Silva
NeurIPS 2025 Fair Continuous Resource Allocation with Equality of Impact Blossom Metevier, Dennis Wei, Karthikeyan Natesan Ramamurthy, Philip S. Thomas
NeurIPS 2025 Fair Representation Learning with Controllable High Confidence Guarantees via Adversarial Inference Yuhong Luo, Austin Hoag, Xintong Wang, Philip S. Thomas, Przemyslaw A. Grabowicz
NeurIPS 2024 Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation Shreyas Chaudhari, Ameet Deshpande, Bruno Castro da Silva, Philip S. Thomas
JMLR 2024 Data-Efficient Policy Evaluation Through Behavior Policy Search Josiah P. Hanna, Yash Chandak, Philip S. Thomas, Martha White, Peter Stone, Scott Niekum
AAAI 2024 From Past to Future: Rethinking Eligibility Traces Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva
ICML 2024 Position: Benchmarking Is Limited in Reinforcement Learning Research Scott M. Jordan, Adam White, Bruno Castro da Silva, Martha White, Philip S. Thomas
NeurIPS 2023 Behavior Alignment via Reward Function Optimization Dhawal Gupta, Yash Chandak, Scott M. Jordan, Philip S. Thomas, Bruno Castro da Silva
NeurIPSW 2023 Learning Models and Evaluating Policies with Offline Off-Policy Data Under Partial Observability Shreyas Chaudhari, Philip S. Thomas, Bruno Castro da Silva
ICLR 2022 Fairness Guarantees Under Demographic Shift Stephen Giguere, Blossom Metevier, Bruno Castro da Silva, Yuriy Brun, Philip S. Thomas, Scott Niekum
NeurIPS 2022 Off-Policy Evaluation for Action-Dependent Non-Stationary Environments Yash Chandak, Shiv Shankar, Nathaniel Bastian, Bruno Castro da Silva, Emma Brunskill, Philip S. Thomas
NeurIPSW 2022 Optimization Using Parallel Gradient Evaluations on Multiple Parameters Yash Chandak, Shiv Shankar, Venkata Gandikota, Philip S. Thomas, Arya Mazumdar
AAAI 2021 High-Confidence Off-Policy (or Counterfactual) Variance Estimation Yash Chandak, Shiv Shankar, Philip S. Thomas
NeurIPS 2021 Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche
NeurIPS 2021 SOPE: Spectrum of Off-Policy Estimators Christina Yuan, Yash Chandak, Stephen Giguere, Philip S. Thomas, Scott Niekum
NeurIPS 2021 Structural Credit Assignment in Neural Networks Using Reinforcement Learning Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James Kostas, Philip S. Thomas, Martha White
NeurIPS 2021 Universal Off-Policy Evaluation Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas
AAAI 2020 Lifelong Learning with a Changing Action Set Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas
ICML 2020 Optimizing for the Future in Non-Stationary MDPs Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas
AAAI 2020 Reinforcement Learning When All Actions Are Not Always Available Yash Chandak, Georgios Theocharous, Blossom Metevier, Philip S. Thomas
NeurIPS 2020 Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms Pinar Ozisik, Philip S. Thomas
NeurIPS 2020 Towards Safe Policy Improvement for Non-Stationary MDPs Yash Chandak, Scott M. Jordan, Georgios Theocharous, Martha White, Philip S. Thomas
NeurIPS 2019 A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning Francisco Garcia, Philip S. Thomas
AAAI 2019 Natural Option Critic Saket Tiwari, Philip S. Thomas
NeurIPS 2019 Offline Contextual Bandits with High Probability Fairness Guarantees Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas
IJCAI 2018 Importance Sampling for Fair Policy Selection Shayan Doroudi, Philip S. Thomas, Emma Brunskill
ICML 2017 Data-Efficient Policy Evaluation Through Behavior Policy Search Josiah P. Hanna, Philip S. Thomas, Peter Stone, Scott Niekum
UAI 2017 Importance Sampling for Fair Policy Selection Shayan Doroudi, Philip S. Thomas, Emma Brunskill
AAAI 2017 Importance Sampling with Unequal Support Philip S. Thomas, Emma Brunskill
AAAI 2017 Predictive Off-Policy Policy Evaluation for Nonstationary Decision Problems, with Applications to Digital Marketing Philip S. Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill
NeurIPS 2017 Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation Zhaohan Guo, Philip S. Thomas, Emma Brunskill
AAAI 2016 Increasing the Action Gap: New Operators for Reinforcement Learning Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos
AAAI 2015 High-Confidence Off-Policy Evaluation Philip S. Thomas, Georgios Theocharous, Mohammad Ghavamzadeh
IJCAI 2015 Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees Georgios Theocharous, Philip S. Thomas, Mohammad Ghavamzadeh
NeurIPS 2015 Policy Evaluation Using the Ω-Return Philip S. Thomas, Scott Niekum, Georgios Theocharous, George Konidaris
AAAI 2014 Natural Temporal Difference Learning William Dabney, Philip S. Thomas
NeurIPS 2013 Projected Natural Actor-Critic Philip S. Thomas, William Dabney, Stephen Giguere, Sridhar Mahadevan
ICML 2011 Conjugate Markov Decision Processes Philip S. Thomas, Andrew G. Barto
NeurIPS 2011 Policy Gradient Coagent Networks Philip S. Thomas
NeurIPS 2011 TD_γ: Re-Evaluating Complex Backups in Temporal Difference Learning George Konidaris, Scott Niekum, Philip S. Thomas
AAAI 2011 Value Function Approximation in Reinforcement Learning Using the Fourier Basis George Konidaris, Sarah Osentoski, Philip S. Thomas