Parmas, Paavo

7 publications

TMLR 2025. Double Horizon Model-Based Policy Optimization. Akihiro Kubo, Paavo Parmas, Shin Ishii.
ICLR 2025. Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form. Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai, Kenta Hoshino, Yohei Hosoe, Kazumi Kasaura, Masashi Hamaya, Paavo Parmas, Yutaka Matsuo.
ICML 2023. Model-Based Reinforcement Learning with Scalable Composite Policy Gradient Estimators. Paavo Parmas, Takuma Seno, Yuma Aoki.
NeurIPS 2022. Proppo: A Message Passing Framework for Customizable and Composable Learning Algorithms. Paavo Parmas, Takuma Seno.
AISTATS 2021. A Unified View of Likelihood Ratio and Reparameterization Gradients. Paavo Parmas, Masashi Sugiyama.
ICML 2018. PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos. Paavo Parmas, Carl Edward Rasmussen, Jan Peters, Kenji Doya.
NeurIPS 2018. Total Stochastic Gradient Algorithms and Applications in Reinforcement Learning. Paavo Parmas.