Morimura, Tetsuro

19 publications

TMLR 2025. Evaluation of Best-of-N Sampling Strategies for Language Model Alignment. Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Eiji Uchibe
TMLR 2025. Return-Aligned Decision Transformer. Tsunehiko Tanaka, Kenshi Abe, Kaito Ariu, Tetsuro Morimura, Edgar Simo-Serra
ICMLW 2024. Filtered Direct Preference Optimization. Tetsuro Morimura, Mitsuki Sakamoto, Yuu Jinnai, Kenshi Abe, Kaito Ariu
ICML 2024. Model-Based Minimum Bayes Risk Decoding for Text Generation. Yuu Jinnai, Tetsuro Morimura, Ukyo Honda, Kaito Ariu, Kenshi Abe
TMLR 2024. Policy Gradient with Kernel Quadrature. Satoshi Hayakawa, Tetsuro Morimura
ICMLW 2024. Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment. Yuu Jinnai, Tetsuro Morimura, Kaito Ariu, Kenshi Abe
ICLR 2024. Safe Collaborative Filtering. Riku Togashi, Tatsushi Oka, Naoto Ohsaka, Tetsuro Morimura
NeurIPSW 2021. Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization. Masahiro Kato, Kei Nakagawa, Kenshi Abe, Tetsuro Morimura
IJCAI 2016. Weight Features for Predicting Future Model Performance of Deep Neural Networks. Yasunori Yamada, Tetsuro Morimura
AISTATS 2015. A Consistent Method for Graph Based Anomaly Localization. Satoshi Hara, Tetsuro Morimura, Toshihiro Takahashi, Hiroki Yanagisawa, Taiji Suzuki
AISTATS 2015. Predicting Preference Reversals via Gaussian Process Uncertainty Aversion. Rikiya Takahashi, Tetsuro Morimura
AAAI 2014. Mixing-Time Regularized Policy Gradient. Tetsuro Morimura, Takayuki Osogami, Tomoyuki Shirai
NeurIPS 2013. Solving Inverse Problem of Markov Chain with Partial Observations. Tetsuro Morimura, Takayuki Osogami, Tsuyoshi Ide
AAAI 2012. Time-Consistency of Optimization Problems. Takayuki Osogami, Tetsuro Morimura
ACML 2010. Adaptive Step-Size Policy Gradients with Average Reward Metric. Takamitsu Matsubara, Tetsuro Morimura, Jun Morimoto
ICML 2010. Nonparametric Return Distribution Approximation for Reinforcement Learning. Tetsuro Morimura, Masashi Sugiyama, Hisashi Kashima, Hirotaka Hachiya, Toshiyuki Tanaka
UAI 2010. Parametric Return Density Estimation for Reinforcement Learning. Tetsuro Morimura, Masashi Sugiyama, Hisashi Kashima, Hirotaka Hachiya, Toshiyuki Tanaka
NeurIPS 2009. A Generalized Natural Actor-Critic Algorithm. Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya
ECML-PKDD 2008. A New Natural Policy Gradient by Stationary Distribution Metric. Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya