ML Anthology
Chow, Yinlam
36 publications
ICLR
2025
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Yinlam Chow, Guy Tennenholtz, Izzeddin Gur, Vincent Zhuang, Bo Dai, Aviral Kumar, Rishabh Agarwal, Sridhar Thiagarajan, Craig Boutilier, Aleksandra Faust
ICML
2025
Preference Adaptive and Sequential Text-to-Image Generation
Ofir Nabati, Guy Tennenholtz, Chihwei Hsu, Moonkyung Ryu, Deepak Ramachandran, Yinlam Chow, Xiang Li, Craig Boutilier
ICLR
2024
Demystifying Embedding Spaces Using Large Language Models
Guy Tennenholtz, Yinlam Chow, ChihWei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier
NeurIPS
2024
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier
ICMLW
2024
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
Anthony Liang, Guy Tennenholtz, ChihWei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier
NeurIPS
2024
Embedding-Aligned Language Models
Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Lior Shani, Ethan Liang, Craig Boutilier
ICLR
2023
A Mixture-of-Expert Approach to RL-Based Dialogue Management
Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS
2023
Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management
Dhawal Gupta, Yinlam Chow, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPSW
2022
A Mixture-of-Expert Approach to RL-Based Dialogue Management
Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS
2022
Efficient Risk-Averse Reinforcement Learning
Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor
ICMLW
2022
SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition
Dylan Z Slack, Yinlam Chow, Bo Dai, Nevan Wichers
AISTATS
2021
Non-Stationary Off-Policy Optimization
Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed
ICLR
2021
Control-Aware Representations for Model-Based Reinforcement Learning
Brandon Cui, Yinlam Chow, Mohammad Ghavamzadeh
NeurIPS
2021
Safe Reinforcement Learning with Natural Language Constraints
Tsung-Yen Yang, Michael Y Hu, Yinlam Chow, Peter J Ramadge, Karthik Narasimhan
IJCAI
2021
Variational Model-Based Policy Optimization
Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh
IJCAI
2020
BRPO: Batch Residual Policy Optimization
Sungryull Sohn, Yinlam Chow, Jayden Ooi, Ofir Nachum, Honglak Lee, Ed H. Chi, Craig Boutilier
ICLR
2020
CAQL: Continuous Action Q-Learning
Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier
NeurIPS
2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvari, Dale Schuurmans
NeurIPS
2020
Latent Bandits Revisited
Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier
ICLR
2020
Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control
Nir Levine, Yinlam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, Hung Bui
ICML
2020
Predictive Coding for Locally-Linear Control
Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung Bui
CoRL
2020
Safe Policy Learning for Continuous Control
Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh
NeurIPS
2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li
ICMLW
2019
DualDICE: Efficient Estimation of Off-Policy Stationary Distribution Corrections
Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li
ICMLW
2019
Lyapunov-Based Safe Policy Optimization for Continuous Control
Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
AISTATS
2019
Risk-Sensitive Generative Adversarial Imitation Learning
Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, Marco Pavone
NeurIPS
2018
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization
Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon
NeurIPS
2018
A Lyapunov-Based Approach to Safe Reinforcement Learning
Yinlam Chow, Ofir Nachum, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
ICLR
2018
Imitation Learning from Visual Data with Multiple Intentions
Aviv Tamar, Khashayar Rohanimanesh, Yinlam Chow, Chris Vigorito, Ben Goodrich, Michael Kahane, Derik Pridmore
ICML
2018
More Robust Doubly Robust Off-Policy Evaluation
Mehrdad Farajtabar, Yinlam Chow, Mohammad Ghavamzadeh
ICML
2018
Path Consistency Learning in Tsallis Entropy Regularized MDPs
Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh
AISTATS
2017
Sequential Multiple Hypothesis Testing with Type I Error Control
Alan Malek, Sumeet Katariya, Yinlam Chow, Mohammad Ghavamzadeh
NeurIPS
2016
Safe Policy Improvement by Minimizing Robust Baseline Regret
Mohammad Ghavamzadeh, Marek Petrik, Yinlam Chow
NeurIPS
2015
Policy Gradient for Coherent Risk Measures
Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor
NeurIPS
2015
Risk-Sensitive and Robust Decision-Making: A CVaR Optimization Approach
Yinlam Chow, Aviv Tamar, Shie Mannor, Marco Pavone
NeurIPS
2014
Algorithms for CVaR Optimization in MDPs
Yinlam Chow, Mohammad Ghavamzadeh