Weng, Paul

33 publications

TMLR 2025 A Survey of Reinforcement Learning from Human Feedback Timo Kaufmann, Paul Weng, Viktor Bengs, Eyke Hüllermeier
ICML 2025 Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries Xuening Feng, Zhaohui Jiang, Timo Kaufmann, Eyke Hüllermeier, Paul Weng, Yifei Zhu
AAAI 2025 DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback Xuening Feng, Zhaohui Jiang, Timo Kaufmann, Puchen Xu, Eyke Hüllermeier, Paul Weng, Yifei Zhu
AAAI 2025 Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data Shilong Deng, Zetao Zheng, Hongcai He, Paul Weng, Jie Shao
ICLR 2025 Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards Zhaohui Jiang, Xuening Feng, Paul Weng, Yifei Zhu, Yan Song, Tianze Zhou, Yujing Hu, Tangjie Lv, Changjie Fan
MLJ 2025 State-Novelty Guided Action Persistence in Deep Reinforcement Learning Jianshu Hu, Paul Weng, Yutong Ban
NeurIPS 2025 Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning Yunpeng Jiang, Jianshu Hu, Paul Weng, Yutong Ban
TMLR 2025 Understanding and Reducing the Class-Dependent Effects of Data Augmentation with a Two-Player Game Approach Yunpeng Jiang, Yutong Ban, Paul Weng
MLJ 2024 A Survey on Interpretable Reinforcement Learning Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu
ICMLW 2024 Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries Xuening Feng, Zhaohui Jiang, Timo Kaufmann, Eyke Hüllermeier, Paul Weng, Yifei Zhu
ICML 2024 INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer Han Fang, Zhihao Song, Paul Weng, Yutong Ban
ICLR 2024 Revisiting Data Augmentation in Deep Reinforcement Learning Jianshu Hu, Yunpeng Jiang, Paul Weng
TMLR 2023 Differentiable Logic Machines Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu
ECML-PKDD 2023 Unsupervised Salient Patch Selection for Data-Efficient Reinforcement Learning Zhaohui Jiang, Paul Weng
ACML 2022 CVaR-Regret Bounds for Multi-Armed Bandits Chenmien Tan, Paul Weng
ICML 2022 Neuro-Symbolic Hierarchical Rule Induction Claire Glanois, Zhaohui Jiang, Xuening Feng, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu, Jianye Hao
CoRL 2022 Solving Complex Manipulation Tasks with Model-Assisted Model-Free Reinforcement Learning Jianshu Hu, Paul Weng
ICML 2021 Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng
ICML 2020 Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards Umer Siddique, Paul Weng, Matthieu Zimmer
IJCAI 2019 Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains Matthieu Zimmer, Paul Weng
ICML 2017 Multi-Objective Bandits: Optimizing the Generalized Gini Index Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Shie Mannor
AAAI 2017 Optimizing Quantiles in Preference-Based Markov Decision Processes Hugo Gilbert, Paul Weng, Yan Xu
UAI 2016 Model-Free Reinforcement Learning with Skew-Symmetric Bilinear Utilities Hugo Gilbert, Bruno Zanuttini, Paul Weng, Paolo Viappiani, Esther Nicart
IJCAI 2015 Optimization of Probabilistic Argumentation with Markov Decision Models Emmanuel Hadoux, Aurélie Beynier, Nicolas Maudet, Paul Weng, Anthony Hunter
ICML 2015 Qualitative Multi-Armed Bandits: A Quantile-Based Approach Balázs Szörényi, Róbert Busa-Fekete, Paul Weng, Eyke Hüllermeier
IJCAI 2015 Solving MDPs with Skew Symmetric Bilinear Utility Functions Hugo Gilbert, Olivier Spanjaard, Paolo Viappiani, Paul Weng
MLJ 2014 Preference-Based Reinforcement Learning: Evolutionary Direct Policy Search Using a Preference-Based Racing Algorithm Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Weiwei Cheng, Eyke Hüllermeier
UAI 2013 Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes Patrice Perny, Paul Weng, Judy Goldsmith, Josiah Hanna
IJCAI 2013 Interactive Value Iteration for Markov Decision Processes with Unknown Rewards Paul Weng, Bruno Zanuttini
ICML 2013 Top-K Selection Based on Adaptive Sampling of Noisy Preferences Róbert Busa-Fekete, Balázs Szörényi, Weiwei Cheng, Paul Weng, Eyke Hüllermeier
UAI 2006 Axiomatic Foundations for a Class of Generalized Expected Utility: Algebraic Expected Utility Paul Weng
IJCAI 2005 Algebraic Markov Decision Processes Patrice Perny, Olivier Spanjaard, Paul Weng
UAI 2005 Qualitative Decision Making Under Possibilistic Uncertainty: Toward More Discriminating Criteria Paul Weng