ML Anthology
Jinnai, Yuu
15 publications
TMLR
2025
Evaluation of Best-of-N Sampling Strategies for Language Model Alignment
Yuki Ichihara, Yuu Jinnai, Tetsuro Morimura, Kenshi Abe, Kaito Ariu, Mitsuki Sakamoto, Eiji Uchibe
ICMLW
2024
Filtered Direct Preference Optimization
Tetsuro Morimura, Mitsuki Sakamoto, Yuu Jinnai, Kenshi Abe, Kaito Ariu
ICML
2024
Model-Based Minimum Bayes Risk Decoding for Text Generation
Yuu Jinnai, Tetsuro Morimura, Ukyo Honda, Kaito Ariu, Kenshi Abe
ICMLW
2024
Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment
Yuu Jinnai, Tetsuro Morimura, Kaito Ariu, Kenshi Abe
AAAI
2021
Lipschitz Lifelong Reinforcement Learning
Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman
ICLR
2020
Exploration in Reinforcement Learning with Deep Covering Options
Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Konidaris
AAAI
2020
Neural Architecture Search Using Deep Neural Networks and Monte Carlo Tree Search
Linnan Wang, Yiyang Zhao, Yuu Jinnai, Yuandong Tian, Rodrigo Fonseca
ICML
2019
Discovering Options for Exploration by Minimizing Cover Time
Yuu Jinnai, Jee Won Park, David Abel, George Konidaris
ICML
2019
Finding Options That Minimize Planning Time
Yuu Jinnai, David Abel, David Hershkowitz, Michael Littman, George Konidaris
AAAI
2019
State Abstraction as Compression in Apprenticeship Learning
David Abel, Dilip Arumugam, Kavosh Asadi, Yuu Jinnai, Michael L. Littman, Lawson L. S. Wong
ICML
2018
Policy and Value Transfer in Lifelong Reinforcement Learning
David Abel, Yuu Jinnai, Sophie Yue Guo, George Konidaris, Michael Littman
AAAI
2017
Learning to Avoid Dominated Action Sequences in Planning for Black-Box Domains
Yuu Jinnai, Alex Fukunaga
AAAI
2017
Learning to Prune Dominated Action Sequences in Online Black-Box Planning
Yuu Jinnai, Alex S. Fukunaga
JAIR
2017
On Hash-Based Work Distribution Methods for Parallel Best-First Search
Yuu Jinnai, Alex Fukunaga
AAAI
2016
Abstract Zobrist Hashing: An Efficient Work Distribution Method for Parallel Best-First Search
Yuu Jinnai, Alex Fukunaga