Kveton, Branislav

94 publications

ICML 2025 Comparing Few to Rank Many: Active Human Preference Learning Using Randomized Frank-Wolfe Method Kiran Koshy Thekumparampil, Gaurush Hiranandani, Kousha Kalantari, Shoham Sabach, Branislav Kveton
AAAI 2025 Cross-Validated Off-Policy Evaluation Matej Cief, Branislav Kveton, Michal Kompan
ICLRW 2025 Data-Efficient Supervised Fine-Tuning of Language Models Using Optimal Design Rohan Deb, Kiran Koshy Thekumparampil, Kousha Kalantari, Gaurush Hiranandani, Shoham Sabach, Branislav Kveton
ICML 2025 FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain Rohan Deb, Kiran Koshy Thekumparampil, Kousha Kalantari, Gaurush Hiranandani, Shoham Sabach, Branislav Kveton
ICCV 2025 Multimodal LLMs as Customized Reward Models for Text-to-Image Generation Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu, Branislav Kveton, Yufan Zhou, Jiuxiang Gu, Jian Chen, Changyou Chen
ICLR 2025 OCEAN: Offline Chain-of-Thought Evaluation and Alignment in Large Language Models Junda Wu, Xintong Li, Ruoyu Wang, Yu Xia, Yuxin Xiong, Jianing Wang, Tong Yu, Xiang Chen, Branislav Kveton, Lina Yao, Jingbo Shang, Julian McAuley
NeurIPS 2025 Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization Subhojyoti Mukherjee, Viet Dac Lai, Raghavendra Addanki, Ryan A. Rossi, Seunghyun Yoon, Trung Bui, Anup Rao, Jayakumar Subramanian, Branislav Kveton
TMLR 2025 Personalization of Large Language Models: A Survey Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen K. Ahmed, Yu Wang
AAAI 2025 Selective Uncertainty Propagation in Offline RL Sanath Kumar Krishnamurthy, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Shrey Modi, Anshuka Rangi
ICML 2024 MADA: Meta-Adaptive Optimizers Through Hyper-Gradient Descent Kaan Ozkara, Can Karakus, Parameswaran Raman, Mingyi Hong, Shoham Sabach, Branislav Kveton, Volkan Cevher
ICMLW 2024 Off-Policy Evaluation from Logged Human Feedback Aniruddha Bhargava, Lalit K Jain, Branislav Kveton, Ge Liu, Subhojyoti Mukherjee
NeurIPS 2024 Online Posterior Sampling with a Diffusion Prior Branislav Kveton, Boris N. Oreshkin, Youngsuk Park, Aniket Deshmukh, Rui Song
ICLR 2024 Only Pay for What Is Uncertain: Variance-Adaptive Thompson Sampling Aadirupa Saha, Branislav Kveton
ICMLW 2024 Optimal Design for Human Feedback Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Anand Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton
NeurIPS 2024 Optimal Design for Human Preference Elicitation Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton
AISTATS 2024 Pessimistic Off-Policy Multi-Objective Optimization Shima Alizadeh, Aniruddha Bhargava, Karthick Gopalswamy, Lalit Jain, Branislav Kveton, Ge Liu
ICMLW 2023 Active Learning with Crowd Sourcing Improves Information Retrieval Zhuotong Chen, Yifei Ma, Branislav Kveton, Anoop Deoras
NeurIPS 2023 Finite-Time Logarithmic Bayes Regret Upper Bounds Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, Sujay Sanghavi
UAI 2023 Fixed-Budget Best-Arm Identification with Heterogeneous Reward Variances Anusha Lalitha, Kousha Kalantari, Yifei Ma, Anoop Deoras, Branislav Kveton
AAAI 2023 Meta-Learning for Simple Regret Minimization Mohammad Javad Azizi, Branislav Kveton, Mohammad Ghavamzadeh, Sumeet Katariya
AISTATS 2023 Mixed-Effect Thompson Sampling Imad Aouali, Branislav Kveton, Sumeet Katariya
ICML 2023 Multi-Task Off-Policy Learning from Bandit Feedback Joey Hong, Branislav Kveton, Manzil Zaheer, Sumeet Katariya, Mohammad Ghavamzadeh
ICML 2023 Multiplier Bootstrap-Based Exploration Runzhe Wan, Haoyu Wei, Branislav Kveton, Rui Song
ICML 2023 Thompson Sampling with Diffusion Generative Prior Yu-Guan Hsieh, Shiva Kasiviswanathan, Branislav Kveton, Patrick Blöbaum
AISTATS 2022 Hierarchical Bayesian Bandits Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh
AISTATS 2022 On the Value of Prior in Online Learning to Rank Branislav Kveton, Ofer Meshi, Masrour Zoghi, Zhen Qin
AISTATS 2022 Random Effect Bandits Rong Zhu, Branislav Kveton
AISTATS 2022 Safe Optimal Design with Applications in Off-Policy Learning Ruihao Zhu, Branislav Kveton
AISTATS 2022 Thompson Sampling with a Mixture Prior Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier
ICML 2022 Deep Hierarchy in Bandits Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
NeurIPSW 2022 Diffusion Prior for Online Decision Making: A Case Study of Thompson Sampling Yu-Guan Hsieh, Shiva Kasiviswanathan, Branislav Kveton, Patrick Blöbaum
IJCAI 2022 Fixed-Budget Best-Arm Identification in Structured Bandits Mohammad Javad Azizi, Branislav Kveton, Mohammad Ghavamzadeh
IJCAI 2022 IMO3: Interactive Multi-Objective Off-Policy Optimization Nan Wang, Hongning Wang, Maryam Karimzadehgan, Branislav Kveton, Craig Boutilier
ICML 2022 Safe Exploration for Efficient Policy Evaluation and Comparison Runzhe Wan, Branislav Kveton, Rui Song
NeurIPS 2022 Uplifting Bandits Yu-Guan Hsieh, Shiva P. Kasiviswanathan, Branislav Kveton
AISTATS 2021 Non-Stationary Off-Policy Optimization Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed
UAI 2021 CORe: Capitalizing on Rewards in Bandit Exploration Nan Wang, Branislav Kveton, Maryam Karimzadehgan
ICML 2021 Meta-Thompson Sampling Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari
NeurIPS 2021 No Regrets for Learning the Prior in Bandits Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvari
NeurIPS 2020 Differentiable Meta-Learning of Bandit Policies Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer
ICML 2020 Graphical Models Meet Bandits: A Variational Thompson Sampling Approach Tong Yu, Branislav Kveton, Zheng Wen, Ruiyi Zhang, Ole J. Mengshoel
NeurIPS 2020 Latent Bandits Revisited Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier
AISTATS 2020 Old Dog Learns New Tricks: Randomized UCB for Bandit Problems Sharan Vaswani, Abbas Mehrabian, Audrey Durand, Branislav Kveton
AISTATS 2020 Randomized Exploration in Generalized Linear Bandits Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier
JMLR 2020 Spectral Bandits Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko
UAI 2019 BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvári, Masrour Zoghi
UAI 2019 Cascading Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank Gaurush Hiranandani, Harvineet Singh, Prakhar Gupta, Iftikhar Ahamath Burhanuddin, Zheng Wen, Branislav Kveton
AISTATS 2019 Conservative Exploration Using Interleaving Sumeet Katariya, Branislav Kveton, Zheng Wen, Vamsi K. Potluru
ICML 2019 Garbage in, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Tor Lattimore, Mohammad Ghavamzadeh
AISTATS 2019 Nearly Optimal Adaptive Procedure with Change Detection for Piecewise-Stationary Bandit Yang Cao, Zheng Wen, Branislav Kveton, Yao Xie
UAI 2019 Perturbed-History Exploration in Stochastic Linear Bandits Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier
IJCAI 2019 Perturbed-History Exploration in Stochastic Multi-Armed Bandits Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier
AISTATS 2019 Sample Efficient Graph-Based Optimization with Noisy Observations Thanh Tan Nguyen, Ali Shameli, Yasin Abbasi-Yadkori, Anup Rao, Branislav Kveton
ECML-PKDD 2018 SpectralLeader: Online Spectral Learning for Single Topic Models Tong Yu, Branislav Kveton, Zheng Wen, Hung Bui, Ole J. Mengshoel
NeurIPS 2018 TopRank: A Practical Algorithm for Online Stochastic Ranking Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvari
IJCAI 2017 Bernoulli Rank-1 Bandits for Click Feedback Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
ICML 2017 Model-Independent Online Learning for Influence Maximization Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks V. S. Lakshmanan, Mark Schmidt
NeurIPS 2017 Online Influence Maximization Under Independent Cascade Model with Semi-Bandit Feedback Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani
ICML 2017 Online Learning to Rank in Stochastic Click Models Masrour Zoghi, Tomas Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvari, Zheng Wen
AISTATS 2017 Stochastic Rank-1 Bandits Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
ECML-PKDD 2017 Thompson Sampling for Optimizing Stochastic Local Search Tong Yu, Branislav Kveton, Ole J. Mengshoel
UAI 2016 Cascading Bandits for Large-Scale Recommendation Problems Shi Zong, Hao Ni, Kenny Sung, Nan Rosemary Ke, Zheng Wen, Branislav Kveton
ICML 2016 DCM Bandits: Learning to Rank with Multiple Clicks Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, Zheng Wen
ECML-PKDD 2016 Graphical Model Sketch Branislav Kveton, Hung Bui, Mohammad Ghavamzadeh, Georgios Theocharous, S. Muthukrishnan, Siqi Sun
IJCAI 2016 Practical Linear Models for Large-Scale One-Class Collaborative Filtering Suvash Sedhain, Hung Bui, Jaya Kawale, Nikos Vlassis, Branislav Kveton, Aditya Krishna Menon, Trung Bui, Scott Sanner
ICML 2015 Cascading Bandits: Learning to Rank in the Cascade Model Branislav Kveton, Csaba Szepesvari, Zheng Wen, Azin Ashkan
NeurIPS 2015 Combinatorial Cascading Bandits Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvari
ICML 2015 Efficient Learning in Large-Scale Combinatorial Semi-Bandits Zheng Wen, Branislav Kveton, Azin Ashkan
NeurIPS 2015 Efficient Thompson Sampling for Online Matrix-Factorization Recommendation Jaya Kawale, Hung H Bui, Branislav Kveton, Long Tran-Thanh, Sanjay Chawla
IJCAI 2015 Optimal Greedy Diversity for Recommendation Azin Ashkan, Branislav Kveton, Shlomo Berkovsky, Zheng Wen
AISTATS 2015 Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári
AAAI 2014 Large-Scale Optimistic Adaptive Submodularity Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan
UAI 2014 Matroid Bandits: Fast Combinatorial Optimization with Learning Branislav Kveton, Zheng Wen, Azin Ashkan, Hoda Eydgahi, Brian Eriksson
UAI 2014 SPPM: Sparse Privacy Preserving Mappings Salman Salamatian, Nadia Fawaz, Branislav Kveton, Nina Taft
ICML 2014 Spectral Bandits for Smooth Graph Functions Michal Valko, Remi Munos, Branislav Kveton, Tomáš Kocák
NeurIPS 2013 Adaptive Submodular Maximization in Bandit Setting Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan
ICML 2013 Sequential Bayesian Search Zheng Wen, Branislav Kveton, Brian Eriksson, Sandilya Bhamidipati
AAAI 2013 Structured Kernel-Based Reinforcement Learning Branislav Kveton, Georgios Theocharous
UAI 2012 Incorporating Metadata into Dynamic Topic Analysis Tianxi Li, Branislav Kveton, Yu Wu, Ashwin Kashyap
AAAI 2012 Kernel-Based Reinforcement Learning on Representative States Branislav Kveton, Georgios Theocharous
UAI 2012 Leveraging Side Observations in Stochastic Bandits Stéphane Caron, Branislav Kveton, Marc Lelarge, Smriti Bhagat
UAI 2010 Automatic Tuning of Interactive Perception Applications Qian Zhu, Branislav Kveton, Lily B. Mummert, Padmanabhan Pillai
UAI 2010 Online Semi-Supervised Learning on Quantized Graphs Michal Valko, Branislav Kveton, Ling Huang, Daniel Ting
CVPRW 2010 Online Semi-Supervised Perception: Real-Time Learning Without Explicit Feedback Branislav Kveton, Matthai Philipose, Michal Valko, Ling Huang
AISTATS 2010 Semi-Supervised Learning with Max-Margin Graph Cuts Branislav Kveton, Michal Valko, Ali Rahimi, Ling Huang
AAAI 2008 Online Learning with Expert Advice and Finite-Horizon Constraints Branislav Kveton, Jia Yuan Yu, Georgios Theocharous, Shie Mannor
UAI 2008 Partitioned Linear Programming Approximations for MDPs Branislav Kveton, Milos Hauskrecht
AAAI 2007 Adaptive Timeout Policies for Fast Fine-Grained Power Management Branislav Kveton, Prashant Gandhi, Georgios Theocharous, Shie Mannor, Barbara Rosario, Nilesh Shah
AAAI 2006 Learning Basis Functions in Hybrid Domains Branislav Kveton, Milos Hauskrecht
JAIR 2006 Solving Factored MDPs with Hybrid State and Action Variables Branislav Kveton, Milos Hauskrecht, Carlos Guestrin
AAAI 2006 When Gossip Is Good: Distributed Probabilistic Inference for Detection of Slow Network Intrusions Denver Dash, Branislav Kveton, John Mark Agosta, Eve M. Schooler, Jaideep Chandrashekar, Abraham Bachrach, Alex Newman
IJCAI 2005 An MCMC Approach to Solving Hybrid Factored MDPs Branislav Kveton, Milos Hauskrecht
UAI 2004 Solving Factored MDPs with Continuous and Discrete Variables Carlos Guestrin, Milos Hauskrecht, Branislav Kveton
NeurIPS 2003 Linear Program Approximations for Factored Continuous-State Markov Decision Processes Milos Hauskrecht, Branislav Kveton