Csaba Szepesvári
#115,923
Most Influential Person Now
Csaba Szepesvári's AcademicInfluence.com Rankings
Csaba Szepesvári's Computer Science Degrees
Computer Science
#4522
World Rank
#4771
Historical Rank
Machine Learning
#912
World Rank
#925
Historical Rank
Artificial Intelligence
#1129
World Rank
#1149
Historical Rank
Database
#1737
World Rank
#1822
Historical Rank

Csaba Szepesvári's Degrees
- PhD, Computer Science, University of Szeged
Why Is Csaba Szepesvári Influential?
Csaba Szepesvári's Published Works
[Citation charts: (1) citations per year to any of this author's works; (2) total citations to the works the author published in a given year, which highlights when the author's most important works appeared.]
Published Works
- Bandit Based Monte-Carlo Planning (2006) (2872)
- Improved Algorithms for Linear Stochastic Bandits (2011) (1291)
- Algorithms for Reinforcement Learning (2010) (1198)
- Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms (2000) (641)
- Fast gradient-descent methods for temporal-difference learning with linear function approximation (2009) (584)
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (2009) (569)
- Finite-Time Bounds for Fitted Value Iteration (2008) (452)
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (2006) (396)
- Parametric Bandits: The Generalized Linear Case (2010) (367)
- X-Armed Bandits (2010) (360)
- Learning with a Strong Adversary (2015) (319)
- Regret Bounds for the Adaptive Control of Linear Quadratic Systems (2011) (316)
- A Generalized Reinforcement-Learning Model: Convergence and Applications (1996) (267)
- Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation (2009) (258)
- Cascading Bandits: Learning to Rank in the Cascade Model (2015) (239)
- Toward Off-Policy Learning Control with Function Approximation (2010) (238)
- Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods (2007) (238)
- A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation (2008) (235)
- Fitted Q-iteration in continuous action-space MDPs (2007) (225)
- Error Propagation for Approximate Policy and Value Iteration (2010) (210)
- Empirical Bernstein stopping (2008) (208)
- Online Optimization in X-Armed Bandits (2008) (208)
- Improved Rates for the Stochastic Continuum-Armed Bandit Problem (2007) (208)
- Online Learning under Delayed Feedback (2013) (207)
- Tuning Bandit Algorithms in Stochastic Environments (2007) (189)
- Model-Based Reinforcement Learning with Value-Targeted Regression (2020) (188)
- Online Markov Decision Processes Under Bandit Feedback (2010) (184)
- The grand challenge of computer Go (2012) (179)
- Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping (2008) (179)
- A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation (2008) (173)
- A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms (1999) (171)
- On the Global Convergence Rates of Softmax Policy Gradient Methods (2020) (162)
- Partial Monitoring - Classification, Regret Bounds, and Algorithms (2014) (155)
- Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs (2010) (153)
- Improved Monte-Carlo Search (2006) (146)
- Regularized Policy Iteration (2008) (141)
- The Asymptotic Convergence-Rate of Q-learning (1997) (137)
- Model-based reinforcement learning with nearly tight exploration complexity bounds (2010) (136)
- Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits (2012) (130)
- Learning with Good Feature Representations in Bandits and in RL with a Generative Model (2019) (124)
- Behaviour Suite for Reinforcement Learning (2019) (123)
- Finite time bounds for sampling based fitted value iteration (2005) (122)
- Toward Minimax Off-policy Value Estimation (2015) (119)
- Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes (2020) (115)
- Manifold-Adaptive Dimension Estimation (2007) (99)
- Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go? (2018) (95)
- POLITEX: Regret Bounds for Policy Iteration using Expert Prediction (2019) (94)
- Toward a classification of finite partial-monitoring games (2010) (93)
- Multi-criteria Reinforcement Learning (1998) (92)
- Regularized Policy Iteration with Nonparametric Function Spaces (2016) (90)
- Bandits with Delayed, Aggregated Anonymous Feedback (2017) (90)
- Online Learning to Rank in Stochastic Click Models (2017) (86)
- Combinatorial Cascading Bandits (2015) (84)
- Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems (2009) (83)
- Interpolation-based Q-learning (2004) (83)
- The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits (2016) (83)
- Analysis of Kernel Mean Matching under Covariate Shift (2012) (80)
- Bernoulli Rank-1 Bandits for Click Feedback (2017) (80)
- DCM Bandits: Learning to Rank with Multiple Clicks (2016) (80)
- Training parsers by inverse reinforcement learning (2009) (80)
- Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions (2013) (80)
- Model-Free Linear Quadratic Control via Reduction to Expert Prediction (2018) (79)
- Partial Monitoring with Side Information (2012) (76)
- Conservative Bandits (2016) (73)
- The adversarial stochastic shortest path problem with unknown transition probabilities (2012) (72)
- Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control (2015) (71)
- Variational Policy Gradient Method for Reinforcement Learning with General Utilities (2020) (70)
- Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities (2017) (70)
- Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions (2020) (66)
- Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path (2015) (66)
- Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures (2018) (64)
- Model Selection in Contextual Stochastic Bandit Problems (2020) (63)
- Online learning for linearly parametrized control problems (2012) (62)
- Online Learning in Markov Decision Processes with Changing Cost Sequences (2014) (62)
- Active learning in heteroscedastic noise (2010) (61)
- Stochastic Rank-1 Bandits (2016) (61)
- Tighter risk certificates for neural networks (2020) (58)
- Active Learning in Multi-armed Bandits (2008) (57)
- CoinDICE: Off-Policy Confidence Interval Estimation (2020) (55)
- (Bandit) Convex Optimization with Biased Noisy Gradient Oracles (2015) (55)
- Randomized Exploration in Generalized Linear Bandits (2019) (55)
- The Online Loop-free Stochastic Shortest-Path Problem (2010) (54)
- Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits (2018) (53)
- PAC-Bayes Analysis Beyond the Usual Bounds (2020) (53)
- An Information-Theoretic Approach to Minimax Regret in Partial Monitoring (2019) (52)
- Model Selection in Reinforcement Learning (2011) (52)
- Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems (2011) (51)
- TopRank: A practical algorithm for online stochastic ranking (2018) (51)
- Adaptive Exploration in Linear Contextual Bandit (2019) (51)
- Module-Based Reinforcement Learning: Experiments with a Real Robot (1998) (50)
- Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits (2014) (50)
- Sequential Learning for Multi-Channel Wireless Network Monitoring With Channel Switching Costs (2014) (49)
- Prediction of protein functional domains from sequences using artificial neural networks. (2001) (48)
- Sequential learning for optimal monitoring of multi-channel wireless networks (2011) (48)
- Minimax Regret of Finite Partial-Monitoring Games in Stochastic Environments (2011) (45)
- PAC-Bayes bounds for stable algorithms with instance-dependent priors (2018) (43)
- Linear Multi-Resource Allocation with Semi-Bandit Feedback (2015) (43)
- Cost-sensitive Multiclass Classification Risk Bounds (2013) (43)
- Uncertainty, performance, and model dependency in approximate adaptive nonlinear control (1997) (42)
- Stochastic Optimization in a Cumulative Prospect Theory Framework (2018) (42)
- Near-optimal max-affine estimators for convex regression (2015) (42)
- Regularization in reinforcement learning (2011) (41)
- Deep Representations and Codes for Image Auto-Annotation (2012) (40)
- Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms (2016) (40)
- Online Learning with Gaussian Payoffs and Side Observations (2015) (39)
- An adaptive algorithm for finite stochastic partial monitoring (2012) (38)
- Reinforcement Learning Algorithms for MDPs (2011) (38)
- A General Projection Property for Distribution Families (2009) (36)
- Learning Exercise Policies for American Options (2009) (36)
- Statistical linear estimation with penalized estimators: an application to reinforcement learning (2012) (36)
- Bayesian Optimal Control of Smoothly Parameterized Systems (2015) (36)
- Continuous Time Associative Bandit Problems (2007) (35)
- Topology Learning Solved by Extended Objects: A Neural Network Model (1993) (35)
- Margin Maximizing Discriminant Analysis (2004) (34)
- On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method (2021) (34)
- Perturbed-History Exploration in Stochastic Linear Bandits (2019) (33)
- Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting (2020) (32)
- Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments (2013) (32)
- Use of variance estimation in the multi-armed bandit problem (2006) (32)
- Model-based and Model-free Reinforcement Learning for Visual Servoing (2009) (31)
- An automatic method for the identification and interpretation of clustered microcalcifications in mammograms. (1999) (31)
- Detecting Overfitting via Adversarial Examples (2019) (30)
- LEAPSANDBOUNDS: A Method for Approximately Optimal Algorithm Configuration (2018) (30)
- Exploration by Optimisation in Partial Monitoring (2019) (29)
- LS-N-IPS: An Improvement of Particle Filters by Means of Local Search (2001) (29)
- Online Learning with Costly Features and Labels (2013) (29)
- Universal Option Models (2014) (29)
- Variance estimates and exploration function in multi-armed bandit (2008) (28)
- A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds (2017) (27)
- Performance of Nonlinear Approximate Adaptive Controllers (2003) (27)
- Stochastic Low-Rank Bandits (2017) (26)
- Regret Bounds for Model-Free Linear Quadratic Control (2018) (26)
- Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities (2016) (25)
- Online Sparse Reinforcement Learning (2020) (25)
- Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient (2020) (25)
- PAC-Bayes with Backprop (2019) (24)
- Regularized Fitted Q-iteration: Application to Planning (2008) (24)
- SDP Relaxation with Randomized Rounding for Energy Disaggregation (2016) (24)
- Neurocontroller using dynamic state feedback for compensatory control (1997) (24)
- Universal parameter optimisation in games based on SPSA (2006) (24)
- Structured Best Arm Identification with Fixed Confidence (2017) (24)
- REGO: Rank-based Estimation of Renyi Information using Euclidean Graph Optimization (2010) (23)
- Efficient Planning in Large MDPs with Weak Linear Function Approximation (2020) (23)
- Online Learning to Rank with Features (2018) (23)
- BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback (2018) (22)
- Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers (2018) (22)
- Learning when to stop thinking and do something! (2009) (22)
- Exploration-Enhanced POLITEX (2019) (22)
- Multi-Step Dyna Planning for Policy Evaluation and Control (2009) (22)
- Multiclass Classification Calibration Functions (2016) (21)
- When Is Partially Observable Reinforcement Learning Not Scary? (2022) (21)
- Escaping the Gravitational Pull of Softmax (2020) (21)
- Perturbed-History Exploration in Stochastic Multi-Armed Bandits (2019) (21)
- Asymptotically Optimal Information-Directed Sampling (2020) (21)
- Approximate geometry representations and sensory fusion (1996) (21)
- A Linearly Relaxed Approximate Linear Program for Markov Decision Processes (2017) (21)
- Optimal Resource Allocation with Semi-Bandit Feedback (2014) (21)
- Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning (2016) (20)
- On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function (2021) (20)
- Bounds and dynamics for empirical game theoretic analysis (2019) (20)
- Adaptive Monte Carlo via Bandit Allocation (2014) (20)
- Pseudo-MDPs and factored linear action models (2014) (20)
- Extending rapidly-exploring random trees for asymptotically optimal anytime motion planning (2010) (20)
- CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration (2019) (19)
- Self-organizing neurocontrol (1994) (19)
- Cleaning up the neighborhood: A full classification for adversarial partial monitoring (2018) (19)
- No Regrets for Learning the Prior in Bandits (2021) (19)
- Efficient approximate planning in continuous space Markovian Decision Problems (2001) (19)
- Agnostic KWIK learning and efficient approximate reinforcement learning (2011) (18)
- Sequential Importance Sampling for Visual Tracking Reconsidered (2003) (18)
- Characterizing the Representer Theorem (2013) (18)
- Shifting Regret, Mirror Descent, and Matrices (2016) (18)
- Efficient Local Planning with Linear Function Approximation (2021) (17)
- Approximate Policy Iteration with Linear Action Models (2012) (17)
- RSPSA: Enhanced Parameter Optimization in Games (2006) (17)
- Meta-Thompson Sampling (2021) (17)
- Self-Organizing Multi-Resolution Grid for Motion Planning and Control (1996) (16)
- Learning With Adversary (2015) (16)
- On Minimax Optimal Offline Policy Evaluation (2014) (15)
- Enhancing Particle Filters Using Local Likelihood Sampling (2004) (15)
- Regularized least-squares regression: Learning from a β-mixing sequence (2012) (15)
- Static and Dynamic Aspects of Optimal Sequential Decision Making (1998) (15)
- Behavior of an Adaptive Self-organizing Autonomous Agent Working with Cues and Competing Concepts (1993) (14)
- Bootstrapping Statistical Inference for Off-Policy Evaluation (2021) (14)
- Approximate Inverse-Dynamics Based Robust Control Using Static And Dynamic Feedback (1998) (14)
- Performance of Nonlinear Approximate Adaptive Controllers: French/Adaptive Controllers (2005) (13)
- Module Based Reinforcement Learning for a Real Robot (1997) (13)
- Provably Efficient Adaptive Approximate Policy Iteration (2020) (13)
- Differentiable Meta-Learning of Bandit Policies (2020) (13)
- A modular analysis of adaptive (non-)convex optimization: Optimism, composite objectives, variance reduction, and variational bounds (2020) (13)
- Distribution-Dependent Analysis of Gibbs-ERM Principle (2019) (13)
- Empirical Bayes Regret Minimization (2019) (12)
- Shortest Path Discovery Problems: A Framework, Algorithms and Experimental Results (2004) (12)
- Bayesian Optimal Control of Smoothly Parameterized Systems: The Lazy Posterior Sampling Algorithm (2014) (12)
- A simpler approach to accelerated optimization: iterative averaging meets optimism (2020) (12)
- On using likelihood-adjusted proposals in particle filtering: local importance sampling (2005) (12)
- Learning to segment from a few well-selected training images (2009) (12)
- A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping (2010) (11)
- Improved Regret Bound and Experience Replay in Regularized Policy Iteration (2021) (11)
- BubbleRank: Safe Online Learning to Rerank (2018) (11)
- Inverse Dynamics Controllers for Robust Control: Consequences for Neurocontrollers (1996) (11)
- Finite Time Bounds for Temporal Difference Learning with Function Approximation: Problems with some “state-of-the-art” results (2017) (11)
- Unsupervised Sequential Sensor Acquisition (2017) (11)
- On Multi-objective Policy Optimization as a Tool for Reinforcement Learning (2021) (11)
- Learning and Exploitation Do Not Conflict Under Minimax Optimality (1997) (10)
- The SBASE protein domain library, release 6.0: a collection of annotated protein sequence segments (1999) (10)
- An Evaluation Criterion for Macro-Learning and Some Results (1999) (10)
- On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments (2015) (10)
- A Randomized Mirror Descent Algorithm for Large Scale Multiple Kernel Learning (2012) (10)
- A Simpler Approach to Accelerated Stochastic Optimization: Iterative Averaging Meets Optimism (2020) (10)
- Decision-Theoretic Clustering of Strategies (2015) (10)
- PAC-Bayesian Policy Evaluation for Reinforcement Learning (2011) (10)
- Generalization in an autonomous agent (1994) (9)
- Reduced-Variance Payoff Estimation in Adversarial Bandit Problems (2005) (9)
- Efron-Stein PAC-Bayesian Inequalities (2019) (9)
- Local Importance Sampling: A Novel Technique to Enhance Particle Filtering (2006) (9)
- Bandits with Delayed Anonymous Feedback (2017) (9)
- Optimization Issues in KL-Constrained Approximate Policy Iteration (2021) (9)
- Alignment based kernel learning with a continuous set of base kernels (2011) (9)
- Differentiable Meta-Learning in Contextual Bandits (2020) (8)
- Online Algorithm for Unsupervised Sensor Selection (2019) (8)
- Learning near-optimal policies with fitted policy iteration and a single sample path (2005) (8)
- Non-Markovian Policies in Sequential Decision Problems (1998) (8)
- An integrated architecture for motion-control and path-planning (1998) (8)
- Understanding the Effect of Stochasticity in Policy Optimization (2021) (8)
- A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning (2021) (8)
- Efficient Stopping Rules (2008) (8)
- Computer aided diagnosis of clustered microcalcifications using artificial neural nets (2000) (8)
- Parallel and robust skeletonization built on self-organizing elements (1999) (8)
- Module Based Reinforcement Learning: An Application to a Real Robot (1997) (7)
- Near-Optimal Sample Complexity Bounds for Constrained MDPs (2022) (7)
- Differentiable Bandit Exploration (2020) (7)
- Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging (2017) (7)
- An Exponential Efron-Stein Inequality for Lq Stable Learning Rules (2019) (7)
- Fast Cross-Validation for Incremental Learning (2015) (7)
- Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms (2010) (7)
- Multi-view Matrix Factorization for Linear Dynamical System Estimation (2017) (6)
- A Finite-Sample Generalization Bound for Semiparametric Regression: Partially Linear Models (2014) (6)
- On the Sample Complexity of Batch Reinforcement Learning with Policy-Induced Data (2021) (6)
- Log-optimal currency portfolios and control Lyapunov exponents (2005) (6)
- Exploiting Symmetries to Construct Efficient MCMC Algorithms With an Application to SLAM (2015) (6)
- An Exponential Tail Bound for Lq Stable Learning Rules. Application to k-Folds Cross-Validation (2019) (6)
- Autonomous exploration for navigating in non-stationary CMPs (2019) (6)
- Value-Aware Loss Function for Model Learning in Reinforcement Learning (2016) (6)
- General Framework for Reinforcement Learning (1995) (6)
- Proceedings of the 22nd International Conference on Algorithmic Learning Theory (2011) (6)
- Markov Decision Processes under Bandit Feedback (2015) (5)
- Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping (2021) (5)
- Module-Based Reinforcement Learning: Experiments with a Real Robot (1998) (5)
- On Optimality of Meta-Learning in Fixed-Design Regression with Weighted Biased Regularization (2020) (5)
- The Curse of Passive Data Collection in Batch Reinforcement Learning (2021) (5)
- Robust control using inverse dynamics neurocontrollers (1997) (5)
- On Learning the Optimal Waiting Time (2014) (5)
- Chaining Bounds for Empirical Risk Minimization (2016) (4)
- Cascading Bandits (2015) (4)
- An Asymptotic Scaling Analysis of LQ Performance for an Approximate Adaptive Control Design (2002) (4)
- Maximum Margin Discriminant Analysis based Face Recognition (2005) (4)
- On the Role of Optimization in Double Descent: A Least Squares Study (2021) (4)
- Optimistic MLE - A Generic Model-based Algorithm for Partially Observable Sequential Decision Making (2022) (4)
- Dynamic concept model learns optimal policies (1994) (4)
- Budgeted Distribution Learning of Belief Net Parameters (2010) (4)
- Think out of the "Box": Generically-Constrained Asynchronous Composite Optimization and Hedging (2019) (4)
- A Distribution-dependent Analysis of Meta Learning (2020) (3)
- LQ performance bounds for adaptive output feedback controllers for functionally uncertain nonlinear systems (2002) (3)
- Online Algorithm for Unsupervised Sequential Selection with Contextual Information (2020) (3)
- An a Priori Exponential Tail Bound for k-Folds Cross-Validation (2017) (3)
- Modular Reinforcement Learning: A Case Study in a Robot Domain (2000) (3)
- Self-organized learning of 3 dimensions (1994) (3)
- LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS (2009) (3)
- Sample-Efficient Reinforcement Learning of Partially Observable Markov Games (2022) (3)
- Models of active learning in group-structured state spaces (2010) (3)
- On Asymptotic and Finite-Time Optimality of Bayesian Predictors (2019) (3)
- Proceedings of the 10th European Workshop on Reinforcement Learning (2013) (3)
- Sequence Prediction Exploiting Similarity Information (2007) (3)
- Confident Least Square Value Iteration with Local Access to a Simulator (2022) (3)
- ImpatientCapsAndRuns: Approximately Optimal Algorithm Configuration from an Infinite Pool (2020) (3)
- Deterministic Independent Component Analysis (2015) (3)
- Adaptive Approximate Policy Iteration (2021) (3)
- Crowdsourcing with Sparsely Interacting Workers (2017) (3)
- Algorithmic Learning Theory (2011) (2)
- Towards Painless Policy Optimization for Constrained MDPs (2022) (2)
- Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction (2008) (2)
- European Workshop on Reinforcement Learning (2008) (2)
- Integration of Artificial Neural Networks and Dynamic Concepts to an Adaptive and Self-Organizing Agent (1993) (2)
- Ockham's Razor Modeling of the Matrisome Channels of the Basal Ganglia Thalamocortical Loops (2001) (2)
- Reinforcement Learning: Theory and Practice (2010) (2)
- Convergent Reinforcement Learning with Value Function Interpolation (2001) (2)
- KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal (2022) (2)
- Confident Approximate Policy Iteration for Efficient Local Planning in qπ-realizable MDPs (2022) (2)
- Kernel Machine Based Feature Extraction Algorithms for Regression Problems (2004) (2)
- Complexity of learning: the case of everyday neural networks (1994) (2)
- X-mHMM: an efficient algorithm for training mixtures of HMMs when the number of mixtures is unknown (2005) (2)
- Pathological Effects of Variance on Classification-Based Policy Iteration (2015) (1)
- An Exponential Tail Bound for the Deleted Estimate (2019) (1)
- Parametric Bandits: The Generalized Linear Case (extended version) (2010) (1)
- Non-trivial two-armed partial-monitoring games are bandits (2011) (1)
- Sample Efficient Deep Reinforcement Learning via Local Planning (2023) (1)
- Least Squares Temporal Difference Learning and Galerkin's Method (2011) (1)
- Learning with a Strong Adversary (2016) (1)
- Revisiting Simple Regret Minimization in Multi-Armed Bandits (2022) (1)
- Max-affine estimators for convex stochastic programming (2016) (1)
- Active Learning of Group-Structured Environments (2008) (1)
- Efficient object tracking in video sequences by means of LS-N-IPS (2001) (1)
- Robust Nonparametric Copula Based Dependence Estimators (2011) (1)
- Toward Manifold-Adaptive Learning (2007) (1)
- A unified modular analysis of online and stochastic optimization: adaptivity, optimism, non-convexity (2016) (1)
- Regularized Fitted Q-iteration : Application to Bounded Resource Planning (2009) (1)
- A Randomized Strategy for Learning to Combine Many Features (2012) (1)
- Sequential Learning without Feedback (2016) (1)
- Towards Facial Pose Tracking (2002) (0)
- Scaling of LQ performance in approximate adaptive designs (2000) (0)
- Reinforcement Learning Algorithms in Markov Decision Processes AAAI-10 Tutorial Part II: Learning to predict values (2010) (0)
- Reinforcement Learning Algorithms in Markov Decision Processes AAAI-10 Tutorial Part IV: Take home message (2010) (0)
- Comparison to Alternative Designs (2005) (0)
- Classification with Margin Constraints: A Unification with Applications to Optimization (2015) (0)
- Stochastic Processes and Markov Chains (2020) (0)
- Sztochasztikus Rendszerek és Pénzügyi Piacok Modellezése = Stochastic Systems and Modelling of Financial Markets (2007) (0)
- Function Approximator Designs for the Integrator Chain (2005) (0)
- Guest Editors' introduction (2014) (0)
- Hoeffding Bounds vs. Empirical (2018) (0)
- Automated Detection and Classification of Micro-Calcifications in Mammograms Using Artifical Neural Nets (1998) (0)
- Exponential Hardness of Reinforcement Learning with Linear Function Approximation (2023) (0)
- Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks (2022) (0)
- The Explore-Then-Commit Algorithm (2020) (0)
- Uncertainty Modelling, Control Design and System Performance (2005) (0)
- Workshop summary: On-line learning with limited feedback (2009) (0)
- FlexVoice: A Parametric Approach to High-Quality Speech Synthesis (2000) (0)
- Generalization Bounds for Partially Linear Models (2014) (0)
- Strict Feedback Systems (2005) (0)
- Learning with a Strong Adversary (2015) (0)
- Revisiting Simple Regret: Fast Rates for Returning a Good Arm (2022) (0)
- Partial Monitoring (2020) (0)
- The Exp3 Algorithm (2020) (0)
- Appendix A: Lyapunov's Direct Method (2005) (0)
- Invited Talk: Towards Robust Reinforcement Learning Algorithms (2011) (0)
- Computer Aided Diagnosis of Clustered Microcalcifications Using Artificial Neural Nets (2004) (0)
- Output Feedback Control (2005) (0)
- The return of $\epsilon$-greedy: sublinear regret for model-free linear quadratic control (2018) (0)
- Uncertainty and performance of adaptive controllers for functionally uncertain output feedback systems (1998) (0)
- Prediction of Protein Domain-Types by Backpropagation (2010) (0)
- Editors' Introduction (2011) (0)
- Bounds and dynamics for empirical game theoretic analysis (2019) (0)
- Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization (2022) (0)
- Prediction of Protein Functional Domains from Sequences Using Artificial Neural Networks (2001) (0)
- Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures (2019) (0)
- Faster Rates, Adaptive Algorithms, and Finite-Time Bounds for Linear Composition Optimization and Gradient TD Learning (2022) (0)
- Performance-Evaluation for Automated Detection of Microcalcifications in Mammograms Using Three Different Film-Digitizers (1998) (0)
- The Role of Baselines in Policy Gradient Optimization (2023) (0)
- The Chain of Integrators (2005) (0)
- Convergent Temporal-Difference Learning with Arbitrary Differentiable Function Approximator (2010) (0)
- Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning (2023) (0)
- Appendix B: Functional Bounds from System Identification (2005) (0)
- Towards Painless Policy Optimization for Constrained MDPs: Supplementary material (2022) (0)
What Schools Are Affiliated With Csaba Szepesvári?
Csaba Szepesvári is affiliated with the following schools: