Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection
Abstract
Unstructured learning problems without well-defined rewards are unsuitable for current reinforcement learning (RL) approaches. Action-derived rewards can allow RL agents to fully explore state and action trade-offs in scenarios that require specific outcomes yet are unstructured by external reward. Clinical trial dosing choice is an example of such a problem. We report the successful formulation of clinical trial dosing choice as an RL problem using action-based rewards and learning of dosing regimens to reduce mean tumor diameters (MTD) in patients undergoing simulated temozolomide (TMZ) and procarbazine, 1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea, and vincristine (PCV) chemo- and radiotherapy clinical trials. The use of action-derived rewards as partial proxies for outcomes is described for the first time. Novel dosing regimens learned by an RL agent in the presence of action-derived rewards achieve significant reduction in MTD for cohorts and individual patients in simulated TMZ and PCV clinical trials while reducing treatment cycle administrations and dosage concentrations compared to human-expert dosing regimens. Our approach can be easily adapted for other learning tasks where outcome-based learning is not practical.
Cite
Text
Yauney and Shah. "Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection." Proceedings of the 3rd Machine Learning for Healthcare Conference, 2018.
Markdown
[Yauney and Shah. "Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection." Proceedings of the 3rd Machine Learning for Healthcare Conference, 2018.](https://mlanthology.org/mlhc/2018/yauney2018mlhc-reinforcement/)
BibTeX
@inproceedings{yauney2018mlhc-reinforcement,
title = {{Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection}},
author = {Yauney, Gregory and Shah, Pratik},
booktitle = {Proceedings of the 3rd Machine Learning for Healthcare Conference},
year = {2018},
pages = {161--226},
volume = {85},
url = {https://mlanthology.org/mlhc/2018/yauney2018mlhc-reinforcement/}
}