Why Is Richard S. Sutton Influential?
According to Wikipedia, Richard S. Sutton is a Canadian computer scientist. He is currently a distinguished research scientist at DeepMind and a professor of computing science at the University of Alberta. Sutton is considered one of the founders of modern computational reinforcement learning, and he has made several significant contributions to the field, including temporal difference learning and policy gradient methods.
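As a rough illustration of the first of those ideas, the sketch below implements the standard tabular TD(0) prediction update, V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)]. It is only a minimal sketch: the function name, hyperparameters, and toy environment are hypothetical and chosen to show the shape of the update, not taken from any specific paper.

```python
import numpy as np

# Illustrative sketch of tabular TD(0) prediction (the function name,
# hyperparameters, and toy data below are hypothetical examples).
def td0_value_estimates(episodes, n_states, alpha=0.1, gamma=0.9):
    """Estimate state values from (state, reward, next_state, done) transitions."""
    v = np.zeros(n_states)
    for episode in episodes:
        for state, reward, next_state, done in episode:
            # Bootstrapped target: r + gamma * V(s'), with V(terminal) treated as 0.
            target = reward if done else reward + gamma * v[next_state]
            # Move the current estimate a small step toward the target.
            v[state] += alpha * (target - v[state])
    return v

# Toy usage: a two-state chain where state 0 yields reward 1 and terminates in state 1.
episodes = [[(0, 1.0, 1, True)] for _ in range(200)]
print(td0_value_estimates(episodes, n_states=2))  # V(0) approaches 1.0
```

The defining design choice, bootstrapping the target from the current estimate of the next state's value rather than waiting for the complete return, is what distinguishes temporal-difference methods from Monte Carlo prediction.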
Richard S. Sutton's Published Works
[Citation chart, 1990-2020: shows the number of citations in a given year to any of Sutton's works, and the total citations to the works he published in a given year, which highlights his most important publications.]

His most cited works, with citation counts, include:

- Reinforcement Learning: An Introduction (31531)
- Introduction to Reinforcement Learning (5446)
- Policy Gradient Methods for Reinforcement Learning with Function Approximation (4172)
- Learning to Predict by the Methods of Temporal Differences (3464)
- Neuronlike adaptive elements that can solve difficult learning control problems (3238)
- Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning (2669)
- Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming (1528)
- Toward a modern theory of adaptive networks: expectation and prediction (1443)
- Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding (1244)
- Reinforcement Learning (1085)
- Neural networks for control (1033)
- Temporal credit assignment in reinforcement learning (842)
- Dyna, an integrated architecture for learning, planning, and reacting (648)
- Time-Derivative Models of Pavlovian Reinforcement (612)
- Dimensions of Reinforcement Learning (557)
- Eligibility Traces for Off-Policy Policy Evaluation (549)
- A Menu of Designs for Reinforcement Learning Over Time (541)
- Fast gradient-descent methods for temporal-difference learning with linear function approximation (525)
- Predictive Representations of State (498)
- Reinforcement Learning for RoboCup Soccer Keepaway (444)
- Natural actor-critic algorithms (433)
- Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction (391)

These are followed by more than two hundred further papers, technical reports, and book chapters with lower citation counts, covering temporal-difference and off-policy learning, options and temporal abstraction, actor-critic and policy-gradient methods, Dyna-style planning, computational models of classical conditioning, and applications ranging from RoboCup soccer to myoelectric prosthesis control.
Other Resources About Richard S. Sutton
What Schools Are Affiliated With Richard S. Sutton?
Richard S. Sutton's school affiliations include the University of Alberta.
Richard S. Sutton's AcademicInfluence.com Rankings