Richard S. Sutton
#4,039
Most Influential Person Now
Canadian computer scientist
Richard S. Sutton's AcademicInfluence.com Rankings
Richard S. Sutton Computer Science Degrees
- Computer Science: World Rank #190, Historical Rank #197
- Machine Learning: World Rank #21, Historical Rank #21
- Artificial Intelligence: World Rank #25, Historical Rank #27
- Database: World Rank #161, Historical Rank #166
Computer Science
Richard S. Sutton's Degrees
- PhD Computer Science University of Massachusetts Amherst
Why Is Richard S. Sutton Influential?
According to Wikipedia, Richard S. Sutton is a Canadian computer scientist. He is a professor of computing science at the University of Alberta and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.
Richard S. Sutton's Published Works
- Reinforcement Learning: An Introduction (2005) (37772)
- Policy Gradient Methods for Reinforcement Learning with Function Approximation (1999) (5252)
- Introduction to Reinforcement Learning (1998) (5160)
- Learning to Predict by the Methods of Temporal Differences (1988) (3799)
- Neuronlike adaptive elements that can solve difficult learning control problems (1983) (3372)
- Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning (1999) (3115)
- Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming (1990) (1628)
- Toward a modern theory of adaptive networks: expectation and prediction. (1981) (1495)
- Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding (1995) (1335)
- Neural networks for control (1990) (987)
- Temporal credit assignment in reinforcement learning (1984) (908)
- Dyna, an integrated architecture for learning, planning, and reacting (1990) (770)
- Eligibility Traces for Off-Policy Policy Evaluation (2000) (658)
- Time-Derivative Models of Pavlovian Reinforcement (1990) (625)
- Fast gradient-descent methods for temporal-difference learning with linear function approximation (2009) (584)
- A Menu of Designs for Reinforcement Learning Over Time (1995) (578)
- Predictive Representations of State (2001) (536)
- Natural actor-critic algorithms (2009) (513)
- Reinforcement Learning for RoboCup Soccer Keepaway (2005) (461)
- Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction (2011) (459)
- Reinforcement Learning (1992) (400)
- Dimensions of Reinforcement Learning (1998) (369)
- Off-Policy Temporal Difference Learning with Function Approximation (2001) (365)
- Off-Policy Actor-Critic (2012) (349)
- Reinforcement learning with replacing eligibility traces (2004) (339)
- Reinforcement Learning with Replacing Eligibility Traces (2005) (339)
- Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces (1997) (332)
- Temporal abstraction in reinforcement learning (2000) (318)
- Reinforcement Learning is Direct Adaptive Optimal Control (1992) (274)
- Reinforcement learning is direct adaptive optimal control (1991) (263)
- Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation (2009) (258)
- Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta (1992) (248)
- Learning and Sequential Decision Making (1989) (246)
- Toward Off-Policy Learning Control with Function Approximation (2010) (238)
- A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation (2008) (235)
- Scaling Reinforcement Learning toward RoboCup Soccer (2001) (221)
- An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning (2015) (220)
- Reinforcement learning architectures for animats (1991) (212)
- Learning to predict by the methods of temporal differences (2004) (199)
- Model-Free reinforcement learning with continuous action in practice (2012) (198)
- Reward is enough (2021) (191)
- A Deeper Look at Experience Replay (2017) (184)
- TD Models: Modeling the World at a Mixture of Time Scales (1995) (180)
- Incremental Natural Actor-Critic Algorithms (2007) (180)
- Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping (2008) (179)
- A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation (2008) (173)
- Planning by Incremental Dynamic Programming (1991) (173)
- Intra-Option Learning about Temporally Abstract Actions (1998) (164)
- Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning (2011) (160)
- Gradient temporal-difference learning algorithms (2011) (158)
- Training and Tracking in Robotics (1985) (151)
- Associative search network: A reinforcement learning associative memory (1981) (148)
- Reinforcement Learning of Local Shape in the Game of Go (2007) (146)
- Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System (2008) (143)
- Multi-time Models for Temporally Abstract Planning (1997) (143)
- Online Learning with Random Representations (1993) (143)
- Theoretical Results on Reinforcement Learning with Temporally Abstract Options (1998) (135)
- Sample-based learning and search with permanent and transient memories (2008) (125)
- Introduction: The challenge of reinforcement learning (1992) (124)
- Behaviour Suite for Reinforcement Learning (2019) (123)
- Computational Schemes and Neural Network Models for Formation and Control of Multijoint Arm Trajectory (1995) (122)
- Multi-timescale nexting in a reinforcement learning robot (2011) (121)
- Roles of Macro-Actions in Accelerating Reinforcement Learning (1998) (119)
- Weighted importance sampling for off-policy learning with linear function approximation (2014) (115)
- GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces (2010) (114)
- Gain Adaptation Beats Least Squares (2006) (113)
- Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element (1982) (103)
- GQ(lambda): A general gradient algorithm for temporal-difference prediction learning with eligibility traces (2010) (102)
- Temporal-difference search in computer Go (2012) (98)
- Evaluating the TD model of classical conditioning (2012) (96)
- Between MDPs and Semi-MDPs: Learning, Planning, and Representing Knowledge at Multiple Temporal Scales (1998) (96)
- Temporal-Difference Networks (2004) (95)
- Simulation of the classically conditioned nictitating membrane response by a neuron-like adaptive element: Response topography, neuronal firing, and interstimulus intervals (1986) (93)
- On the role of tracking in stationary environments (2007) (88)
- Multi-step Reinforcement Learning: A Unifying Algorithm (2017) (86)
- Between MDPs and Semi-MDPs: Learning, Planning & Representing Knowledge at Multiple Temporal Scales (1998) (78)
- Model-Based Reinforcement Learning with an Approximate, Learned Model (1996) (74)
- Sequential Decision Problems and Neural Networks (1989) (73)
- Natural actor-critic algorithms. (2009) (72)
- True Online Temporal-Difference Learning (2015) (71)
- Adaptive artificial limbs: a real-time approach to prediction and anticipation (2013) (68)
- Reinforcement learning in board games (2004) (67)
- Connectionist Learning for Control (1995) (66)
- Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming (1990) (66)
- Macro-Actions in Reinforcement Learning: An Empirical Analysis (1998) (66)
- Application of real-time machine learning to myoelectric prosthesis control: A case series in adaptive switching (2016) (65)
- Tuning-free step-size adaptation (2012) (62)
- Incremental Least-Squares Temporal Difference Learning (2006) (62)
- Connectionist Learning for Control: An Overview (1989) (59)
- Dynamic switching and real-time machine learning for improved human control of assistive biomedical robots (2012) (59)
- A Unified View (1998) (57)
- A Deeper Look at Planning as Learning from Replay (2015) (54)
- Using Predictive Representations to Improve Generalization in Reinforcement Learning (2005) (53)
- Open Theoretical Questions in Reinforcement Learning (1999) (53)
- True online TD(λ) (2014) (52)
- Temporal Abstraction in Temporal-difference Networks (2005) (51)
- Challenging Control Problems (1995) (49)
- Keepaway Soccer: A Machine Learning Testbed (2001) (47)
- Improved Switching among Temporally Abstract Actions (1998) (47)
- iLSTD: Eligibility Traces and Convergence Analysis (2006) (46)
- Landmark learning: An illustration of associative search (1981) (45)
- Goal Seeking Components for Adaptive Intelligence: An Initial Assessment. (1981) (44)
- Multi-step Off-policy Learning Without Importance Sampling Ratios (2017) (44)
- Reinforcement Learning for 3 vs. 2 Keepaway (2000) (41)
- Real-time prediction learning for the simultaneous actuation of multiple prosthetic joints (2013) (40)
- True Online TD(lambda) (2014) (39)
- An Adaptive Sensorimotor Network Inspired by the Anatomy and Physiology of the Cerebellum (1995) (39)
- Linear Off-Policy Actor-Critic (2012) (39)
- On the Significance of Markov Decision Processes (1997) (39)
- Reinforcement Learning in Artificial Intelligence (1997) (38)
- Off-policy TD(λ) with a true online equivalence (2014) (37)
- Reinforcement Learning: Past, Present and Future (1998) (37)
- Reinforcement Learning Architectures (1992) (35)
- Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target (2019) (35)
- Learning Polynomial Functions by Feature Construction (1991) (34)
- The Truck Backer-Upper: An Example of Self-Learning in Neural Networks (1995) (32)
- A computational model of hippocampal function in trace conditioning (2008) (32)
- A new Q(lambda) with interim forward view and Monte Carlo equivalence (2014) (31)
- Learning Instance-Independent Value Functions to Enhance Local Search (1998) (30)
- Representation Search through Generate and Test (2013) (30)
- Planning by Prioritized Sweeping with Small Backups (2013) (30)
- Universal Option Models (2014) (29)
- A new Q(λ) with interim forward view and Monte Carlo equivalence (2014) (29)
- Discounted Reinforcement Learning is Not an Optimization Problem (2019) (29)
- Learning and Planning in Average-Reward Markov Decision Processes (2020) (28)
- Magnitude and timing of nictitating membrane movements during classical conditioning of the rabbit (Oryctolagus cuniculus). (2008) (27)
- Emphatic Temporal-Difference Learning (2015) (26)
- Iterative Construction of Sparse Polynomial Approximations (1991) (26)
- TD(λ) networks: temporal-difference networks with eligibility traces (2005) (26)
- Temporal-Difference Networks with History (2005) (25)
- Surprise and Curiosity for Big Data Robotics (2014) (25)
- A Bioreactor Benchmark for Adaptive Network-based Process Control (1995) (25)
- On Generalized Bellman Equations and Temporal-Difference Learning (2017) (25)
- Scaling life-long off-policy learning (2012) (24)
- Policy iterations for reinforcement learning problems in continuous time and space - Fundamental theory and methods (2020) (23)
- Off-policy learning based on weighted importance sampling with linear computational complexity (2015) (23)
- A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward (2016) (23)
- Multi-Step Dyna Planning for Policy Evaluation and Control (2009) (22)
- Special Issue “On Defining Artificial Intelligence”—Commentaries and Author’s Response (2020) (22)
- Some Recent Applications of Reinforcement Learning (2017) (22)
- Associative Learning from Replayed Experience (2017) (21)
- Comparing Policy-Gradient Algorithms (2001) (21)
- Face valuing: Training user interfaces with facial expressions and reinforcement learning (2016) (21)
- Exponentiated Gradient Methods for Reinforcement Learning (1997) (20)
- Online Off-policy Prediction (2018) (19)
- Planning with Expectation Models (2019) (19)
- Reactive Reinforcement Learning in Asynchronous Environments (2018) (19)
- Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning (2019) (19)
- Learning to Predict Independent of Span (2015) (18)
- The Grand Challenge of Predictive Empirical Abstract Knowledge (2009) (18)
- Prediction Driven Behavior: Learning Predictions that Drive Fixed Responses (2014) (17)
- Continual Backprop: Stochastic Gradient Descent with Persistent Randomness (2021) (17)
- Efficient planning in MDPs by small backups (2013) (17)
- Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return (2018) (17)
- Advances in reinforcement learning and their implications for intelligent control (1990) (16)
- Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors (1998) (16)
- Planning and Learning (1998) (16)
- Pavlovian control of intraspinal microstimulation to produce over-ground walking (2019) (15)
- Between Instruction and Reward: Human-Prompted Switching (2012) (15)
- Magnitude and timing of conditioned responses in delay and trace classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). (2009) (15)
- TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent (2018) (14)
- Scalar timing varies with response magnitude in classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). (2009) (14)
- Adaptive State Representation and Estimation Using Recurrent Connectionist Networks (1995) (13)
- Planning with Closed Loop Macro Actions (2008) (13)
- Acquiring a broad range of empirical knowledge in real time by temporal-difference learning (2012) (13)
- Learning to Maximize Rewards: A Review of "Reinforcement Learning: An Introduction" (2000) (13)
- Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning (2019) (13)
- Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods (2018) (12)
- Off-policy Learning with Options and Recognizers (2005) (12)
- Application of connectionist learning methods to manufacturing process monitoring (1988) (12)
- A Summary Comparison of CMAC Neural Network and Traditional Adaptive Control Systems (1995) (12)
- Investigating Experience: Temporal Coherence and Empirical Knowledge Representation (2007) (12)
- Learning a nonlinear model of a manufacturing process using multilayer connectionist networks (1990) (11)
- Adaptive Switching in Practice: Improving Myoelectric Prosthesis Performance through Reinforcement Learning (2014) (11)
- Average-Reward Off-Policy Policy Evaluation with Function Approximation (2021) (11)
- On The Virtues of Linear Learning and Trajectory Distributions (2007) (10)
- Off-policy Learning with Recognizers (2000) (10)
- Synthesis of nonlinear control surfaces by a layered associative search network (2004) (10)
- Two geometric input transformation methods for fast online reinforcement learning with neural nets (2018) (10)
- Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games (2008) (10)
- Book Reviews (1999) (10)
- Natural-Gradient Actor-Critic Algorithms (2007) (10)
- A First Empirical Study of Emphatic Temporal Difference Learning (2017) (10)
- Temporal-Difference Learning to Assist Human Decision Making during the Control of an Artificial Limb (2013) (9)
- Introduction: The Challenge of Reinforcement Learning (2004) (9)
- Crossprop: Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks (2016) (9)
- Communicative Capital for Prosthetic Agents (2017) (9)
- Integrating Episodic Memory into a Reinforcement Learning Agent using Reservoir Sampling (2018) (9)
- The Reinforcement Learning Problem (1998) (9)
- Incremental natural-gradient actor-critic algorithms (2007) (9)
- From eye-blinks to state construction: Diagnostic benchmarks for online representation learning (2020) (9)
- Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods (2018) (8)
- Looking Back on the Actor–Critic Architecture (2021) (8)
- Forward Actor-Critic for Nonlinear Function Approximation in Reinforcement Learning (2017) (8)
- Reward-Respecting Subtasks for Model-Based Reinforcement Learning (2022) (7)
- Timing in trace conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus): scalar, nonscalar, and adaptive features. (2010) (7)
- Multi-Time Models for Reinforcement Learning (2007) (7)
- Per-decision Multi-step Temporal Difference Learning with Control Variates (2018) (7)
- Artificial Intelligence as a Control Problem: Comments on the Relationship between Machine Learning and Intelligent Control (7)
- The Quest for a Common Model of the Intelligent Decision Maker (2022) (7)
- Reinforcement Learning and Artificial Intelligence (2003) (6)
- Beyond Reward: The Problem of Knowledge and Data (2011) (6)
- Excerpt from Reinforcement Learning: Introduction, 1.2 Examples, 1.3 Elements of Reinforcement Learning (5)
- The Alberta Plan for AI Research (2022) (5)
- Model-based Reinforcement Learning with Non-linear Expectation Models and Stochastic Environments (2018) (5)
- Acquiring Diverse Predictive Knowledge in Real Time by Temporal-difference Learning (2012) (5)
- Using Associative Content-Addressable Memories to Control Robots (1995) (4)
- Model-based Reinforcement Learning (2007) (4)
- Adaptive Control Using Neural Networks (1995) (4)
- Learning Sparse Representations Incrementally in Deep Reinforcement Learning (2019) (4)
- Some New Directions for Adaptive Control Theory in Robotics (1995) (4)
- Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning (2020) (4)
- Off-Policy Knowledge Maintenance for Robots (2010) (3)
- Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI (2020) (3)
- Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots (2019) (3)
- Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition (2014) (3)
- Reinforcement Learning in Environments with Independent Delayed-sense Dynamics (2008) (2)
- SELECTED BIBLIOGRAPHY ON CONNECTIONISM (1989) (2)
- On the Significance of Markov Decision Processes (1997) (2)
- Timing and cue competition in conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). (2013) (2)
- An Empirical Evaluation of True Online TD(λ) (2015) (2)
- An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task (2021) (2)
- A History of Meta-gradient: Gradient Methods for Meta-learning (2022) (2)
- Integrated Modeling and Control Based on Reinforcement Learning (1990) (2)
- NADALINE: A Normalized Adaptive Linear Element that Learns Efficiently (2001) (2)
- A Neural Network Baseline Problem for Control of Aircraft Flare and Touchdown (1995) (2)
- Extending Sliding-Step Importance Weighting from Supervised Learning to Reinforcement Learning (2019) (2)
- The PEAK Project (2007) (2)
- Average-Reward Learning and Planning with Options (2021) (2)
- An Empirical Comparison of Off-policy Prediction Learning Algorithms in the Four Rooms Environment (2021) (2)
- Scalable Online Recurrent Learning Using Columnar Neural Networks (2021) (2)
- Should All Temporal Difference Learning Use Emphasis? (2019) (2)
- Communicative capital: a key resource for human–machine shared agency and collaborative capacity (2022) (1)
- Book Review Reinforcement Learning: an Introduction (1)
- Efficient Planning in MDPs by Small Backups (2013) (1)
- An Empirical Evaluation of True Online TD(λ) (2015) (1)
- True Online Emphatic TD(λ): Quick Reference and Implementation Guide (2015) (1)
- Improved Switching among Temporally Abstract Actions (1999) (1)
- Policy Iteration for Discounted Reinforcement Learning Problems in Continuous Time and Space (2017) (1)
- On the Significance of Markov Decision Processes (1997) (1)
- Inverse Policy Evaluation for Value-based Sequential Decision-making (2020) (1)
- Prediction problems inspired by animal learning (1)
- True Online Emphatic TD(λ): Quick Reference and Implementation Guide (2015) (1)
- Temporal Abstraction in TD Networks (2005) (1)
- Prediction and Anticipation for Adaptive Artificial Limbs (2012) (1)
- Integral Policy Iterations for Reinforcement Learning Problems in Continuous Time and Space (2017) (1)
- Actor-critic Algorithms 1. Policy Gradient Methods for Reinforcement Learning with Function Average Reward TD Actor-critic Algorithm Using Function Approximation (1)
- Generalization and Function Approximation (1998) (1)
- Reinforcement and Local Search: A Case Study (1997) (1)
- Applications of Neural Networks in Robotics and Automation for Manufacturing (1995) (1)
- Natural Actor-Critic Algorithms (2009) (1)
- Intelligent Control for Multiple Autonomous Undersea Vehicles (1995) (1)
- Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in Supervised and Reinforcement Learning (1996) (1)
- Temporal-difference search in computer Go (2012) (1)
- Solutions to Selected Problems in: Reinforcement Learning: An Introduction (2008) (1)
- Learning Agent State Online with Recurrent Generate-and-Test (2021) (1)
- Vision-Based Robot Motion Planning (1995) (1)
- Predicting Periodicity with Temporal Difference Learning (2018) (1)
- Elementary Solution Methods (1998) (0)
- Reinforcement Learning Algorithms in Markov Decision Processes AAAI-10 Tutorial Part IV: Take home message (2010) (0)
- Planning with Expectation Models for Control (2021) (0)
- Research Grant Renewal Proposal Reinforcement Learning and Artificial Intelligence chair : (2007) (0)
- Position Paper: Representation Search through Generate and Test (2013) (0)
- Gain Adaptation Beats Least Squares, in Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems (2004) (0)
- Timing and cue competition in conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus) (2013) (0)
- GQ(λ) Quick Reference and Implementation Guide (2017) (0)
- Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition (2014) (0)
- Improvement through Adding New Learning Methods (2007) (0)
- Incremental Policy Gradients for Online Reinforcement Learning Control (2020) (0)
- Efficient Planning in MDPs by Small Backups (2013) (0)
- Iterations for Reinforcement Learning Problems in Continuous Time and Space ? (2018) (0)
- Journal of Cognitive Neuroscience 11:1 (1999) (0)
- Sequential Decision Problems and Neural Networks (1989) (0)
- On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly-Communicating MDPs (2022) (0)
- Does the Adam Optimizer Exacerbate Catastrophic Forgetting? (2021) (0)
- Online Real-Time Recurrent Learning Using Sparse Connections and Selective Learning (2023) (0)
- Toward Efficient Gradient-Based Value Estimation (2023) (0)
- Recent Advances in Numerical Techniques for Large-Scale Optimization (1995) (0)
- True Online TD(λ) (2014) (0)
- Convergent Temporal-Difference Learning with Arbitrary Differentiable Function Approximator (2010) (0)
- Evaluating the TD model of classical conditioning (2012) (0)
- Experience-Oriented Artificial Intelligence (2005) (0)
- Scalable Real-Time Recurrent Learning Using Sparse Connections and Selective Learning (2023) (0)
- Learning representations through stochastic gradient descent in cross-validation error (2016) (0)
- Solutions to Exercises in Reinforcement Learning (2017) (0)
- A unified framework for credit assignment (1990) (0)
- Auxiliary task discovery through generate-and-test (2022) (0)
- Doubly-Asynchronous Value Iteration: Making Value Iteration Asynchronous in Actions (2022) (0)
- Active Sequential Estimation of Object Dynamics with Tactile Sensory Feedback (2010) (0)
- iCORE Research Grant Proposal Reinforcement Learning and Artificial Intelligence (2003) (0)
- New Results - Life-Long Robot Learning and Development of Motor and Social Skills (2013) (0)
- Naval Research Laboratory, Code 5514 Navy Center for Applied Research in Artificial Intelligence 4555 Overlook Ave., S.W., Washington, D.C. 20375-5320 (1992) (0)
- responsesites of extinction for a single learned (2015) (0)
- Toward Discovering Options that Achieve Faster Planning (2022) (0)
- Bridging the Implementation Gap: From Sensorimotor Experience to Abstract Conceptual Knowledge (2010) (0)
- Summary of Notation (1998) (0)
- Multi-Time Models for Reinforcement Learning (1997) (0)
- Monte Carlo Methods (1998) (0)
- GQ(λ) Quick Reference and Implementation Guide (2017) (0)
- New Results - Learning Algorithms for Autonomous Robots: Concepts and Algorithms (2011) (0)
- Online Representation Search and Its Interactions with Unsupervised Learning (2012) (0)
- Summary of Proposal for Public Release (2004) (0)
- Reinforcement Learning Algorithms in Markov Decision Processes AAAI-10 Tutorial Part II: Learning to predict values (2010) (0)
- Connectionist Learning Control at GTE Laboratories (1990) (0)
- Does Standard Backpropagation Forget Less Catastrophically Than Adam? (2021) (0)
- Chapter 12 Time-Derivative Models of Pavlovian Reinforcement (1990) (0)
What Schools Are Affiliated With Richard S. Sutton?
Richard S. Sutton is affiliated with the following schools: