Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection

Abstract

Unstructured learning problems without well-defined rewards are unsuitable for current reinforcement learning (RL) approaches. Action-derived rewards can allow RL agents to fully explore state and action trade-offs in scenarios that require specific outcomes yet are unstructured by external reward. Clinical trial dosing choice is an example of such a problem. We report the successful formulation of clinical trial dosing choice as an RL problem using action-based rewards and learning of dosing regimens to reduce mean tumor diameters (MTD) in patients undergoing simulated temozolomide (TMZ) and procarbazine, 1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea, and vincristine (PCV) chemo- and radiotherapy clinical trials. The use of action-derived rewards as partial proxies for outcomes is described for the first time. Novel dosing regimens learned by an RL agent in the presence of action-derived rewards achieve significant reduction in MTD for cohorts and individual patients in simulated TMZ and PCV clinical trials while reducing treatment cycle administrations and dosage concentrations compared to human-expert dosing regimens. Our approach can be easily adapted for other learning tasks where outcome-based learning is not practical.

Cite

Text

Yauney and Shah. "Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection." Proceedings of the 3rd Machine Learning for Healthcare Conference, 2018.

Markdown

[Yauney and Shah. "Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection." Proceedings of the 3rd Machine Learning for Healthcare Conference, 2018.](https://mlanthology.org/mlhc/2018/yauney2018mlhc-reinforcement/)

BibTeX

@inproceedings{yauney2018mlhc-reinforcement,
  title     = {{Reinforcement Learning with Action-Derived Rewards for Chemotherapy and Clinical Trial Dosing Regimen Selection}},
  author    = {Yauney, Gregory and Shah, Pratik},
  booktitle = {Proceedings of the 3rd Machine Learning for Healthcare Conference},
  year      = {2018},
  pages     = {161--226},
  volume    = {85},
  url       = {https://mlanthology.org/mlhc/2018/yauney2018mlhc-reinforcement/}
}