Chow, Yinlam

36 publications

ICLR 2025 Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models Yinlam Chow, Guy Tennenholtz, Izzeddin Gur, Vincent Zhuang, Bo Dai, Aviral Kumar, Rishabh Agarwal, Sridhar Thiagarajan, Craig Boutilier, Aleksandra Faust
ICML 2025 Preference Adaptive and Sequential Text-to-Image Generation Ofir Nabati, Guy Tennenholtz, Chih-Wei Hsu, Moonkyung Ryu, Deepak Ramachandran, Yinlam Chow, Xiang Li, Craig Boutilier
ICLR 2024 Demystifying Embedding Spaces Using Large Language Models Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier
NeurIPS 2024 DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier
ICMLW 2024 DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier
NeurIPS 2024 Embedding-Aligned Language Models Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Lior Shani, Ethan Liang, Craig Boutilier
ICLR 2023 A Mixture-of-Expert Approach to RL-Based Dialogue Management Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS 2023 Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management Dhawal Gupta, Yinlam Chow, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPSW 2022 A Mixture-of-Expert Approach to RL-Based Dialogue Management Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS 2022 Efficient Risk-Averse Reinforcement Learning Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor
ICMLW 2022 SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition Dylan Z Slack, Yinlam Chow, Bo Dai, Nevan Wichers
AISTATS 2021 Non-Stationary Off-Policy Optimization Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed
ICLR 2021 Control-Aware Representations for Model-Based Reinforcement Learning Brandon Cui, Yinlam Chow, Mohammad Ghavamzadeh
NeurIPS 2021 Safe Reinforcement Learning with Natural Language Constraints Tsung-Yen Yang, Michael Y Hu, Yinlam Chow, Peter J Ramadge, Karthik Narasimhan
IJCAI 2021 Variational Model-Based Policy Optimization Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh
IJCAI 2020 BRPO: Batch Residual Policy Optimization Sungryull Sohn, Yinlam Chow, Jayden Ooi, Ofir Nachum, Honglak Lee, Ed H. Chi, Craig Boutilier
ICLR 2020 CAQL: Continuous Action Q-Learning Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier
NeurIPS 2020 CoinDICE: Off-Policy Confidence Interval Estimation Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2020 Latent Bandits Revisited Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier
ICLR 2020 Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control Nir Levine, Yinlam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, Hung Bui
ICML 2020 Predictive Coding for Locally-Linear Control Rui Shu, Tung Nguyen, Yinlam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung Bui
CoRL 2020 Safe Policy Learning for Continuous Control Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh
NeurIPS 2019 DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li
ICMLW 2019 DualDICE: Efficient Estimation of Off-Policy Stationary Distribution Corrections Ofir Nachum, Yinlam Chow, Bo Dai, Lihong Li
ICMLW 2019 Lyapunov-Based Safe Policy Optimization for Continuous Control Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
AISTATS 2019 Risk-Sensitive Generative Adversarial Imitation Learning Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, Marco Pavone
NeurIPS 2018 A Block Coordinate Ascent Algorithm for Mean-Variance Optimization Tengyang Xie, Bo Liu, Yangyang Xu, Mohammad Ghavamzadeh, Yinlam Chow, Daoming Lyu, Daesub Yoon
NeurIPS 2018 A Lyapunov-Based Approach to Safe Reinforcement Learning Yinlam Chow, Ofir Nachum, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
ICLR 2018 Imitation Learning from Visual Data with Multiple Intentions Aviv Tamar, Khashayar Rohanimanesh, Yinlam Chow, Chris Vigorito, Ben Goodrich, Michael Kahane, Derik Pridmore
ICML 2018 More Robust Doubly Robust Off-Policy Evaluation Mehrdad Farajtabar, Yinlam Chow, Mohammad Ghavamzadeh
ICML 2018 Path Consistency Learning in Tsallis Entropy Regularized MDPs Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh
AISTATS 2017 Sequential Multiple Hypothesis Testing with Type I Error Control Alan Malek, Sumeet Katariya, Yinlam Chow, Mohammad Ghavamzadeh
NeurIPS 2016 Safe Policy Improvement by Minimizing Robust Baseline Regret Mohammad Ghavamzadeh, Marek Petrik, Yinlam Chow
NeurIPS 2015 Policy Gradient for Coherent Risk Measures Aviv Tamar, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor
NeurIPS 2015 Risk-Sensitive and Robust Decision-Making: A CVaR Optimization Approach Yinlam Chow, Aviv Tamar, Shie Mannor, Marco Pavone
NeurIPS 2014 Algorithms for CVaR Optimization in MDPs Yinlam Chow, Mohammad Ghavamzadeh