Szepesvári, Csaba

218 publications

NeurIPS 2025 Beyond Least Squares: Uniform Approximation and the Hidden Cost of Misspecification Davide Maran, Csaba Szepesvari
NeurIPS 2025 Eluder Dimension: Localise It! Alireza Bakhtiari, Alex Ayoub, Samuel McLaughlin Robertson, David Janz, Csaba Szepesvari
NeurIPS 2025 REINFORCE Converges to Optimal Policies with Any Learning Rate Samuel McLaughlin Robertson, Thang D. Chu, Bo Dai, Dale Schuurmans, Csaba Szepesvari, Jincheng Mei
COLT 2025 Thompson Sampling for Bandit Convex Optimisation Alireza Bakhtiari, Tor Lattimore, Csaba Szepesvári
NeurIPS 2024 Almost Free: Self-Concordance in Natural Exponential Families and an Application to Bandits Shuai Liu, Alex Ayoub, Flore Sentenac, Xiaoqi Tan, Csaba Szepesvári
NeurIPS 2024 Confident Natural Policy Gradient for Local Planning in $q_\pi$-Realizable Constrained MDPs Tian Tian, Lin F. Yang, Csaba Szepesvári
NeurIPS 2024 Ensemble Sampling for Linear Bandits: Small Ensembles Suffice David Janz, Alexander E. Litvak, Csaba Szepesvári
AISTATS 2024 Exploration via Linearly Perturbed Loss Minimisation David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári
NeurIPS 2024 Small Steps No More: Global Convergence of Stochastic Gradient Bandits for Arbitrary Learning Rates Jincheng Mei, Bo Dai, Alekh Agarwal, Sharan Vaswani, Anant Raj, Csaba Szepesvári, Dale Schuurmans
ICLR 2024 Stochastic Gradient Descent for Gaussian Processes Done Right Jihao Andreas Lin, Shreyas Padhy, Javier Antoran, Austin Tripp, Alexander Terenin, Csaba Szepesvari, José Miguel Hernández-Lobato, David Janz
ICML 2024 Switching the Loss Reduces the Cost in Batch Reinforcement Learning Alex Ayoub, Kaiwen Wang, Vincent Liu, Samuel Robertson, James McInerney, Dawen Liang, Nathan Kallus, Csaba Szepesvari
NeurIPS 2024 To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty Yasin Abbasi Yadkori, Ilja Kuzborskij, András György, Csaba Szepesvári
NeurIPS 2024 Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability Volodymyr Tkachuk, Gellért Weisz, Csaba Szepesvári
NeurIPS 2023 Context-Lumpable Stochastic Bandits Chung-Wei Lee, Qinghua Liu, Yasin Abbasi Yadkori, Chi Jin, Tor Lattimore, Csaba Szepesvari
AISTATS 2023 Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning Volodymyr Tkachuk, Seyed Alireza Bakhtiari, Johannes Kirschner, Matej Jusup, Ilija Bogunovic, Csaba Szepesvári
COLT 2023 Exponential Hardness of Reinforcement Learning with Linear Function Approximation Sihan Liu, Gaurav Mahajan, Daniel Kane, Shachar Lovett, Gellért Weisz, Csaba Szepesvári
NeurIPS 2023 Online RL in Linearly $q^\pi$-Realizable MDPs Is as Easy as in Linear MDPs if You Learn What to Ignore Gellert Weisz, András György, Csaba Szepesvari
ICLR 2023 Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvari, Zhaoran Wang
NeurIPS 2023 Optimistic Natural Policy Gradient: A Simple Efficient Policy Optimization Framework for Online RL Qinghua Liu, Gellert Weisz, András György, Chi Jin, Csaba Szepesvari
NeurIPS 2023 Ordering-Based Conditions for Global Convergence of Policy Gradient Methods Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2023 Regret Minimization via Saddle Point Optimization Johannes Kirschner, Alireza Bakhtiari, Kushagra Chandak, Volodymyr Tkachuk, Csaba Szepesvari
ICML 2023 Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Menard, Mohammad Gheshlaghi Azar, Remi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvari, Wataru Kumagai, Yutaka Matsuo
ICML 2023 Revisiting Simple Regret: Fast Rates for Returning a Good Arm Yao Zhao, Connor Stephens, Csaba Szepesvari, Kwang-Sung Jun
ICML 2023 Stochastic Gradient Succeeds for Bandits Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans
ICML 2023 The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation Philip Amortila, Nan Jiang, Csaba Szepesvari
AISTATS 2022 Confident Least Square Value Iteration with Local Access to a Simulator Botao Hao, Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvari
AISTATS 2022 Faster Rates, Adaptive Algorithms, and Finite-Time Bounds for Linear Composition Optimization and Gradient TD Learning Anant Raj, Pooria Joulani, Andras Gyorgy, Csaba Szepesvari
AISTATS 2022 The Curse of Passive Data Collection in Batch Reinforcement Learning Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvari
UAI 2022 A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai
NeurIPS 2022 Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization Hui Yuan, Chengzhuo Ni, Huazheng Wang, Xuezhou Zhang, Le Cong, Csaba Szepesvari, Mengdi Wang
NeurIPS 2022 Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-Realizable MDPs Gellért Weisz, András György, Tadashi Kozuno, Csaba Szepesvari
ALT 2022 Efficient Local Planning with Linear Function Approximation Dong Yin, Botao Hao, Yasin Abbasi-Yadkori, Nevena Lazić, Csaba Szepesvári
NeurIPS 2022 Near-Optimal Sample Complexity Bounds for Constrained MDPs Sharan Vaswani, Lin Yang, Csaba Szepesvari
NeurIPS 2022 Sample-Efficient Reinforcement Learning of Partially Observable Markov Games Qinghua Liu, Csaba Szepesvari, Chi Jin
ALT 2022 TensorPlan and the Few Actions Lower Bound for Planning in MDPs Under Linear Realizability of Optimal Value Functions Gellért Weisz, Csaba Szepesvári, András György
NeurIPS 2022 The Role of Baselines in Policy Gradient Optimization Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvari, Dale Schuurmans
UAI 2022 Towards Painless Policy Optimization for Constrained MDPs Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvári, Doina Precup
COLT 2022 When Is Partially Observable Reinforcement Learning Not Scary? Qinghua Liu, Alan Chung, Csaba Szepesvari, Chi Jin
AISTATS 2021 Adaptive Approximate Policy Iteration Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvari
AISTATS 2021 Confident Off-Policy Evaluation and Selection Through Self-Normalized Importance Weighting Ilja Kuzborskij, Claire Vernade, Andras Gyorgy, Csaba Szepesvari
AISTATS 2021 Online Sparse Reinforcement Learning Botao Hao, Tor Lattimore, Csaba Szepesvari, Mengdi Wang
COLT 2021 **Paper Retracted by Author Request (see PDF for Retraction Notice from the Authors)** Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping Ilja Kuzborskij, Csaba Szepesvari
ICML 2021 A Distribution-Dependent Analysis of Meta Learning Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari
COLT 2021 Asymptotically Optimal Information-Directed Sampling Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari
ICML 2021 Bootstrapping Fitted Q-Evaluation for Off-Policy Inference Botao Hao, Xiang Ji, Yaqi Duan, Hao Lu, Csaba Szepesvari, Mengdi Wang
ALT 2021 Exponential Lower Bounds for Planning in MDPs with Linearly-Realizable Optimal Action-Value Functions Gellért Weisz, Philip Amortila, Csaba Szepesvári
MLJ 2021 Guest Editorial: Special Issue on Reinforcement Learning for Real Life Yuxi Li, Alborz Geramifard, Lihong Li, Csaba Szepesvári, Tao Wang
ICML 2021 Improved Regret Bound and Experience Replay in Regularized Policy Iteration Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvari
ICML 2021 Leveraging Non-Uniformity in First-Order Non-Convex Optimization Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvari, Dale Schuurmans
ICML 2021 Meta-Thompson Sampling Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-Wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari
COLT 2021 Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes Dongruo Zhou, Quanquan Gu, Csaba Szepesvari
NeurIPS 2021 No Regrets for Learning the Prior in Bandits Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvari
COLT 2021 On Query-Efficient Planning in MDPs Under Linear Realizability of the Optimal State-Value Function Gellert Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvari
NeurIPS 2021 On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvari, Mengdi Wang
ICML 2021 On the Optimality of Batch Policy Optimization Algorithms Chenjun Xiao, Yifan Wu, Jincheng Mei, Bo Dai, Tor Lattimore, Lihong Li, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2021 On the Role of Optimization in Double Descent: A Least Squares Study Ilja Kuzborskij, Csaba Szepesvari, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu
ICML 2021 Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient Botao Hao, Yaqi Duan, Tor Lattimore, Csaba Szepesvari, Mengdi Wang
JMLR 2021 Tighter Risk Certificates for Neural Networks María Pérez-Ortiz, Omar Rivasplata, John Shawe-Taylor, Csaba Szepesvári
NeurIPS 2021 Understanding the Effect of Stochasticity in Policy Optimization Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans
ICML 2020 A Simpler Approach to Accelerated Optimization: Iterative Averaging Meets Optimism Pooria Joulani, Anant Raj, Andras Gyorgy, Csaba Szepesvari
AISTATS 2020 Adaptive Exploration in Linear Contextual Bandit Botao Hao, Tor Lattimore, Csaba Szepesvari
ICLR 2020 Behaviour Suite for Reinforcement Learning Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt
NeurIPS 2020 CoinDICE: Off-Policy Confidence Interval Estimation Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2020 Differentiable Meta-Learning of Bandit Policies Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer
NeurIPS 2020 Efficient Planning in Large MDPs with Weak Linear Function Approximation Roshan Shariff, Csaba Szepesvari
NeurIPS 2020 Escaping the Gravitational Pull of SoftMax Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvari, Dale Schuurmans
COLT 2020 Exploration by Optimisation in Partial Monitoring Tor Lattimore, Csaba Szepesvári
JMLR 2020 Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers Yao Ma, Alex Olshevsky, Csaba Szepesvari, Venkatesh Saligrama
NeurIPS 2020 ImpatientCapsAndRuns: Approximately Optimal Algorithm Configuration from an Infinite Pool Gellert Weisz, András György, Wei-I Lin, Devon Graham, Kevin Leyton-Brown, Csaba Szepesvari, Brendan Lucier
ICML 2020 Learning with Good Feature Representations in Bandits and in RL with a Generative Model Tor Lattimore, Csaba Szepesvari, Gellert Weisz
NeurIPS 2020 Model Selection in Contextual Stochastic Bandit Problems Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari
ICML 2020 Model-Based Reinforcement Learning with Value-Targeted Regression Alex Ayoub, Zeyu Jia, Csaba Szepesvari, Mengdi Wang, Lin Yang
L4DC 2020 Model-Based Reinforcement Learning with Value-Targeted Regression Zeyu Jia, Lin Yang, Csaba Szepesvari, Mengdi Wang
ICML 2020 On the Global Convergence Rates of SoftMax Policy Gradient Methods Jincheng Mei, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans
NeurIPS 2020 Online Algorithm for Unsupervised Sequential Selection with Contextual Information Arun Verma, Manjesh Kumar Hanawal, Csaba Szepesvari, Venkatesh Saligrama
NeurIPS 2020 PAC-Bayes Analysis Beyond the Usual Bounds Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvari, John Shawe-Taylor
AISTATS 2020 Randomized Exploration in Generalized Linear Bandits Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS 2020 Variational Policy Gradient Method for Reinforcement Learning with General Utilities Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvari, Mengdi Wang
ALT 2019 An Exponential Efron-Stein Inequality for $L_q$ Stable Learning Rules Karim Abou-Moustafa, Csaba Szepesvári
AAAI 2019 An Exponential Tail Bound for the Deleted Estimate Karim T. Abou-Moustafa, Csaba Szepesvári
COLT 2019 An Information-Theoretic Approach to Minimax Regret in Partial Monitoring Tor Lattimore, Csaba Szepesvári
UAI 2019 BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvári, Masrour Zoghi
ICML 2019 CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration Gellert Weisz, Andras Gyorgy, Csaba Szepesvari
ALT 2019 Cleaning up the Neighborhood: A Full Classification for Adversarial Partial Monitoring Tor Lattimore, Csaba Szepesvári
NeurIPS 2019 Detecting Overfitting via Adversarial Examples Roman Werpachowski, András György, Csaba Szepesvari
COLT 2019 Distribution-Dependent Analysis of Gibbs-ERM Principle Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári
ICML 2019 Garbage in, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Tor Lattimore, Mohammad Ghavamzadeh
AISTATS 2019 Model-Free Linear Quadratic Control via Reduction to Expert Prediction Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari
AISTATS 2019 Online Algorithm for Unsupervised Sensor Selection Arun Verma, Manjesh Hanawal, Csaba Szepesvari, Venkatesh Saligrama
ICML 2019 Online Learning to Rank with Features Shuai Li, Tor Lattimore, Csaba Szepesvari
ICML 2019 POLITEX: Regret Bounds for Policy Iteration Using Expert Prediction Yasin Abbasi-Yadkori, Peter Bartlett, Kush Bhatia, Nevena Lazic, Csaba Szepesvari, Gellert Weisz
UAI 2019 Perturbed-History Exploration in Stochastic Linear Bandits Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier
IJCAI 2019 Perturbed-History Exploration in Stochastic Multi-Armed Bandits Branislav Kveton, Csaba Szepesvári, Mohammad Ghavamzadeh, Craig Boutilier
ICLR 2019 Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli
NeurIPS 2019 Think Out of the "Box": Generically-Constrained Asynchronous Composite Optimization and Hedging Pooria Joulani, András György, Csaba Szepesvari
ICML 2018 Bandits with Delayed, Aggregated Anonymous Feedback Ciara Pike-Burke, Shipra Agrawal, Csaba Szepesvari, Steffen Grunewalder
ICML 2018 Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers Yao Ma, Alexander Olshevsky, Csaba Szepesvari, Venkatesh Saligrama
ICML 2018 LeapsAndBounds: A Method for Approximately Optimal Algorithm Configuration Gellert Weisz, Andras Gyorgy, Csaba Szepesvari
AISTATS 2018 Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go? Chandrashekar Lakshminarayanan, Csaba Szepesvári
NeurIPS 2018 PAC-Bayes Bounds for Stable Algorithms with Instance-Dependent Priors Omar Rivasplata, Emilio Parrado-Hernandez, John S Shawe-Taylor, Shiliang Sun, Csaba Szepesvari
NeurIPS 2018 TopRank: A Practical Algorithm for Online Stochastic Ranking Tor Lattimore, Branislav Kveton, Shuai Li, Csaba Szepesvari
ALT 2017 A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds Pooria Joulani, András György, Csaba Szepesvári
IJCAI 2017 Bernoulli Rank-1 Bandits for Click Feedback Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
JMLR 2017 Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvári
NeurIPS 2017 Multi-View Matrix Factorization for Linear Dynamical System Estimation Mahdi Karami, Martha White, Dale Schuurmans, Csaba Szepesvari
ICML 2017 Online Learning to Rank in Stochastic Click Models Masrour Zoghi, Tomas Tunys, Mohammad Ghavamzadeh, Branislav Kveton, Csaba Szepesvari, Zheng Wen
AISTATS 2017 Stochastic Rank-1 Bandits Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
ALT 2017 Structured Best Arm Identification with Fixed Confidence Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller
AISTATS 2017 The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits Tor Lattimore, Csaba Szepesvári
AISTATS 2017 Unsupervised Sequential Sensor Acquisition Manjesh Kumar Hanawal, Csaba Szepesvári, Venkatesh Saligrama
AISTATS 2016 (Bandit) Convex Optimization with Biased Noisy Gradient Oracles Xiaowei Hu, Prashanth L. A., András György, Csaba Szepesvári
AAAI 2016 Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning Guy Lever, John Shawe-Taylor, Ronnie Stafford, Csaba Szepesvári
ICML 2016 Conservative Bandits Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvari
ICML 2016 Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control Prashanth L.A., Cheng Jie, Michael Fu, Steve Marcus, Csaba Szepesvari
ICML 2016 DCM Bandits: Learning to Rank with Multiple Clicks Sumeet Katariya, Branislav Kveton, Csaba Szepesvari, Zheng Wen
AAAI 2016 Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms Pooria Joulani, András György, Csaba Szepesvári
NeurIPS 2016 Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvari
JMLR 2016 Regularized Policy Iteration with Nonparametric Function Spaces Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor
NeurIPS 2016 SDP Relaxation with Randomized Rounding for Energy Disaggregation Kiarash Shaloudegi, András György, Csaba Szepesvari, Wilsun Xu
ICML 2016 Shifting Regret, Mirror Descent, and Matrices Andras Gyorgy, Csaba Szepesvari
UAI 2015 Bayesian Optimal Control of Smoothly Parameterized Systems Yasin Abbasi-Yadkori, Csaba Szepesvári
ICML 2015 Cascading Bandits: Learning to Rank in the Cascade Model Branislav Kveton, Csaba Szepesvari, Zheng Wen, Azin Ashkan
NeurIPS 2015 Combinatorial Cascading Bandits Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvari
ICML 2015 Deterministic Independent Component Analysis Ruitong Huang, Andras Gyorgy, Csaba Szepesvári
AISTATS 2015 Exploiting Symmetries to Construct Efficient MCMC Algorithms with an Application to SLAM Roshan Shariff, András György, Csaba Szepesvári
IJCAI 2015 Fast Cross-Validation for Incremental Learning Pooria Joulani, András György, Csaba Szepesvári
NeurIPS 2015 Linear Multi-Resource Allocation with Semi-Bandit Feedback Tor Lattimore, Koby Crammer, Csaba Szepesvari
NeurIPS 2015 Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path Daniel J. Hsu, Aryeh Kontorovich, Csaba Szepesvari
AISTATS 2015 Near-Optimal Max-Affine Estimators for Convex Regression Gábor Balázs, András György, Csaba Szepesvári
ICML 2015 On Identifying Good Options Under Combinatorially Structured Feedback in Finite Noisy Environments Yifan Wu, Andras Gyorgy, Csaba Szepesvari
NeurIPS 2015 Online Learning with Gaussian Payoffs and Side Observations Yifan Wu, András György, Csaba Szepesvari
AISTATS 2015 Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits Branislav Kveton, Zheng Wen, Azin Ashkan, Csaba Szepesvári
AISTATS 2015 Toward Minimax Off-Policy Value Estimation Lihong Li, Rémi Munos, Csaba Szepesvári
AISTATS 2014 A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models Ruitong Huang, Csaba Szepesvári
ICML 2014 Adaptive Monte Carlo via Bandit Allocation James Neufeld, Andras Gyorgy, Csaba Szepesvari, Dale Schuurmans
ALT 2014 On Learning the Optimal Waiting Time Tor Lattimore, András György, Csaba Szepesvári
ICML 2014 Online Learning in Markov Decision Processes with Changing Cost Sequences Travis Dick, Andras Gyorgy, Csaba Szepesvari
UAI 2014 Optimal Resource Allocation with Semi-Bandit Feedback Tor Lattimore, Koby Crammer, Csaba Szepesvári
COLT 2014 Proceedings of the 27th Conference on Learning Theory, COLT 2014, Barcelona, Spain, June 13-15, 2014 Maria-Florina Balcan, Vitaly Feldman, Csaba Szepesvári
NeurIPS 2014 Universal Option Models Hengshuai Yao, Csaba Szepesvari, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar
ICML 2013 A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning Arash Afkanpour, András György, Csaba Szepesvari, Michael Bowling
MLJ 2013 Alignment Based Kernel Learning with a Continuous Set of Base Kernels Arash Afkanpour, Csaba Szepesvári, Michael Bowling
ICML 2013 Characterizing the Representer Theorem Yaoliang Yu, Hao Cheng, Dale Schuurmans, Csaba Szepesvari
ICML 2013 Cost-Sensitive Multiclass Classification Risk Bounds Bernardo Ávila Pires, Csaba Szepesvari, Mohammad Ghavamzadeh
ICML 2013 Online Learning Under Delayed Feedback Pooria Joulani, Andras Gyorgy, Csaba Szepesvari
NeurIPS 2013 Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions Yasin Abbasi Yadkori, Peter L Bartlett, Varun Kanade, Yevgeny Seldin, Csaba Szepesvari
NeurIPS 2013 Online Learning with Costly Features and Labels Navid Zolghadr, Gabor Bartok, Russell Greiner, András György, Csaba Szepesvari
ICML 2012 An Adaptive Algorithm for Finite Stochastic Partial Monitoring Gábor Bartók, Navid Zolghadr, Csaba Szepesvári
ICML 2012 Analysis of Kernel Mean Matching Under Covariate Shift Yaoliang Yu, Csaba Szepesvári
AAAI 2012 Approximate Policy Iteration with Linear Action Models Hengshuai Yao, Csaba Szepesvári
NeurIPS 2012 Deep Representations and Codes for Image Auto-Annotation Ryan Kiros, Csaba Szepesvári
AISTATS 2012 Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits Yasin Abbasi-Yadkori, David Pal, Csaba Szepesvari
ALT 2012 Partial Monitoring with Side Information Gábor Bartók, Csaba Szepesvári
ICML 2012 Statistical Linear Estimation with Penalized Estimators: An Application to Reinforcement Learning Bernardo Ávila Pires, Csaba Szepesvári
AISTATS 2012 The Adversarial Stochastic Shortest Path Problem with Unknown Transition Probabilities Gergely Neu, Andras Gyorgy, Csaba Szepesvari
COLT 2011 Agnostic KWIK Learning and Efficient Approximate Reinforcement Learning István Szita, Csaba Szepesvári
ALT 2011 Algorithmic Learning Theory - 22nd International Conference, ALT 2011, Espoo, Finland, October 5-7, 2011. Proceedings Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann
ALT 2011 Editors' Introduction Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann
NeurIPS 2011 Improved Algorithms for Linear Stochastic Bandits Yasin Abbasi-Yadkori, Dávid Pál, Csaba Szepesvári
COLT 2011 Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments Gábor Bartók, Dávid Pál, Csaba Szepesvári
MLJ 2011 Model Selection in Reinforcement Learning Amir Massoud Farahmand, Csaba Szepesvári
UAI 2011 PAC-Bayesian Policy Evaluation for Reinforcement Learning Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvári
COLT 2011 Regret Bounds for the Adaptive Control of Linear Quadratic Systems Yasin Abbasi-Yadkori, Csaba Szepesvári
JMLR 2011 X-Armed Bandits Sébastien Bubeck, Rémi Munos, Gilles Stoltz, Csaba Szepesvári
AISTATS 2010 A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping Peter Torma, András György, Csaba Szepesvári
ICML 2010 Budgeted Distribution Learning of Belief Net Parameters Liuyang Li, Barnabás Póczos, Csaba Szepesvári, Russell Greiner
NeurIPS 2010 Error Propagation for Approximate Policy and Value Iteration Amir-massoud Farahmand, Csaba Szepesvári, Rémi Munos
NeurIPS 2010 Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs Dávid Pál, Barnabás Póczos, Csaba Szepesvári
ICML 2010 Model-Based Reinforcement Learning with Nearly Tight Exploration Complexity Bounds Istvan Szita, Csaba Szepesvári
NeurIPS 2010 Online Markov Decision Processes Under Bandit Feedback Gergely Neu, Andras Antos, András György, Csaba Szepesvári
NeurIPS 2010 Parametric Bandits: The Generalized Linear Case Sarah Filippi, Olivier Cappe, Aurélien Garivier, Csaba Szepesvári
AISTATS 2010 REGO: Rank-Based Estimation of Renyi Information Using Euclidean Graph Optimization Barnabas Poczos, Sergey Kirshner, Csaba Szepesvári
COLT 2010 The Online Loop-Free Stochastic Shortest-Path Problem Gergely Neu, András György, Csaba Szepesvári
ICML 2010 Toward Off-Policy Learning Control with Function Approximation Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton
ALT 2010 Toward a Classification of Finite Partial-Monitoring Games Gábor Bartók, Dávid Pál, Csaba Szepesvári
NeurIPS 2009 A General Projection Property for Distribution Families Yao-liang Yu, Yuxi Li, Dale Schuurmans, Csaba Szepesvári
NeurIPS 2009 Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation Hamid R. Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton
ICML 2009 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora
AISTATS 2009 Learning Exercise Policies for American Options Yuxi Li, Csaba Szepesvari, Dale Schuurmans
ICML 2009 Learning When to Stop Thinking and Do Something! Barnabás Póczos, Yasin Abbasi-Yadkori, Csaba Szepesvári, Russell Greiner, Nathan R. Sturtevant
ICML 2009 Learning to Segment from a Few Well-Selected Training Images Alireza Farhangfar, Russell Greiner, Csaba Szepesvári
NeurIPS 2009 Multi-Step Dyna Planning for Policy Evaluation and Control Hengshuai Yao, Shalabh Bhatnagar, Dongcui Diao, Richard S. Sutton, Csaba Szepesvári
MLJ 2009 Training Parsers by Inverse Reinforcement Learning Gergely Neu, Csaba Szepesvári
ICML 2009 Workshop Summary: On-Line Learning with Limited Feedback Jean-Yves Audibert, Peter Auer, Alessandro Lazaric, Rémi Munos, Daniil Ryabko, Csaba Szepesvári
NeurIPS 2008 A Convergent $O(n)$ Temporal-Difference Algorithm for Off-Policy Learning with Linear Function Approximation Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári
ALT 2008 Active Learning in Multi-Armed Bandits András Antos, Varun Grover, Csaba Szepesvári
ALT 2008 Active Learning of Group-Structured Environments Gábor Bartók, Csaba Szepesvári, Sandra Zilles
UAI 2008 Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling
ICML 2008 Empirical Bernstein Stopping Volodymyr Mnih, Csaba Szepesvári, Jean-Yves Audibert
JMLR 2008 Finite-Time Bounds for Fitted Value Iteration Rémi Munos, Csaba Szepesvári
MLJ 2008 Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path András Antos, Csaba Szepesvári, Rémi Munos
NeurIPS 2008 Online Optimization in X-Armed Bandits Sébastien Bubeck, Gilles Stoltz, Csaba Szepesvári, Rémi Munos
NeurIPS 2008 Regularized Policy Iteration Amir M. Farahmand, Mohammad Ghavamzadeh, Shie Mannor, Csaba Szepesvári
UAI 2008 Speeding up Planning in Markov Decision Processes via Automatically Constructed Abstraction Alejandro Isaza, Csaba Szepesvári, Vadim Bulitko, Russell Greiner
UAI 2007 Apprenticeship Learning Using Inverse Reinforcement Learning and Gradient Methods Gergely Neu, Csaba Szepesvári
IJCAI 2007 Continuous Time Associative Bandit Problems András György, Levente Kocsis, Ivett Szabó, Csaba Szepesvári
NeurIPS 2007 Fitted Q-Iteration in Continuous Action-Space MDPs András Antos, Csaba Szepesvári, Rémi Munos
COLT 2007 Improved Rates for the Stochastic Continuum-Armed Bandit Problem Peter Auer, Ronald Ortner, Csaba Szepesvári
ICML 2007 Manifold-Adaptive Dimension Estimation Amir Massoud Farahmand, Csaba Szepesvári, Jean-Yves Audibert
IJCAI 2007 Sequence Prediction Exploiting Similarity Information István Bíró, Zoltán Szamonek, Csaba Szepesvári
ALT 2007 Tuning Bandit Algorithms in Stochastic Environments Jean-Yves Audibert, Rémi Munos, Csaba Szepesvári
ECML-PKDD 2006 Bandit Based Monte-Carlo Planning Levente Kocsis, Csaba Szepesvári
COLT 2006 Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path András Antos, Csaba Szepesvári, Rémi Munos
MLJ 2006 Universal Parameter Optimisation in Games Based on SPSA Levente Kocsis, Csaba Szepesvári
ICML 2005 Finite Time Bounds for Sampling Based Fitted Value Iteration Csaba Szepesvári, Rémi Munos
ECCV 2004 Enhancing Particle Filters Using Local Likelihood Sampling Péter Torma, Csaba Szepesvári
ICML 2004 Interpolation-Based Q-Learning Csaba Szepesvári, William D. Smart
ECML-PKDD 2004 Margin Maximizing Discriminant Analysis András Kocsor, Kornél Kovács, Csaba Szepesvári
AAAI 2004 Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results Csaba Szepesvári
AISTATS 2003 Sequential Importance Sampling for Visual Tracking Reconsidered Péter Torma, Csaba Szepesvári
MLJ 2000 Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms Satinder Singh, Tommi S. Jaakkola, Michael L. Littman, Csaba Szepesvári
NeCo 1999 A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms Csaba Szepesvári, Michael L. Littman
MLJ 1998 Module-Based Reinforcement Learning: Experiments with a Real Robot Zsolt Kalmár, Csaba Szepesvári, András Lörincz
ICML 1998 Multi-Criteria Reinforcement Learning Zoltán Gábor, Zsolt Kalmár, Csaba Szepesvári
ECML-PKDD 1997 Learning and Exploitation Do Not Conflict Under Minimax Optimality Csaba Szepesvári
NeurIPS 1997 The Asymptotic Convergence-Rate of Q-Learning Csaba Szepesvári
ICML 1996 A Generalized Reinforcement-Learning Model: Convergence and Applications Michael L. Littman, Csaba Szepesvári
NeCo 1994 Topology Learning Solved by Extended Objects: A Neural Network Model Csaba Szepesvári, László Balázs, András Lörincz