Shimon Whiteson
#148,721 Most Influential Person Now
Researcher
Shimon Whiteson's AcademicInfluence.com Rankings
- Computer Science: World Rank #7662, Historical Rank #8066
- Machine Learning: World Rank #2903, Historical Rank #2939
- Artificial Intelligence: World Rank #3197, Historical Rank #3244
- Database: World Rank #4714, Historical Rank #4897

Computer Science
Shimon Whiteson's Degrees
- PhD, Computer Science, University of Oxford
Shimon Whiteson's Published Works
Each entry lists the title, the year of publication, and the number of citations.
- Counterfactual Multi-Agent Policy Gradients (2017) (1243)
- Learning to Communicate with Deep Multi-Agent Reinforcement Learning (2016) (1144)
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2018) (1050)
- Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning (2017) (475)
- A Survey of Multi-Objective Sequential Decision-Making (2013) (456)
- The StarCraft Multi-Agent Challenge (2019) (425)
- Learning with Opponent-Learning Awareness (2017) (385)
- LipNet: End-to-End Sentence-level Lipreading (2016) (290)
- Fast Context Adaptation via Meta-Learning (2018) (264)
- Evolutionary Function Approximation for Reinforcement Learning (2006) (238)
- Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs (2008) (215)
- MAVEN: Multi-Agent Variational Exploration (2019) (197)
- Deep Variational Reinforcement Learning for POMDPs (2018) (195)
- A Survey of Reinforcement Learning Informed by Natural Language (2019) (194)
- A theoretical and empirical analysis of Expected Sarsa (2009) (187)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2020) (161)
- VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning (2019) (153)
- Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval (2013) (142)
- Transfer via inter-task mappings in policy search reinforcement learning (2007) (137)
- A probabilistic method for inferring preferences from clicks (2011) (135)
- LipNet: Sentence-level Lipreading (2016) (134)
- Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks (2016) (133)
- Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2020) (128)
- Exploiting locality of interaction in factored Dec-POMDPs (2008) (123)
- Automatic feature selection in neuroevolution (2005) (120)
- Evolving Keepaway Soccer Players through Task Decomposition (2003) (113)
- Reusing historical interaction data for faster online learning to rank for IR (2013) (112)
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning (2018) (112)
- Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem (2013) (107)
- Comparing evolutionary and temporal difference methods in a reinforcement learning domain (2006) (103)
- Deep Coordination Graphs (2019) (98)
- Adaptive Tile Coding for Value Function Approximation (2007) (94)
- Protecting against evaluation overfitting in empirical reinforcement learning (2011) (91)
- Evolving Soccer Keepaway Players Through Task Decomposition (2005) (91)
- Multileave Gradient Descent for Fast Online Learning to Rank (2016) (88)
- RODE: Learning Roles to Decompose Multi-Agent Tasks (2020) (85)
- Multi-Objective Deep Reinforcement Learning (2016) (85)
- Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? (2020) (84)
- Multi-Objective Decision Making (2017) (83)
- Learning From Demonstration in the Wild (2018) (80)
- Traffic Light Control by Multiagent Reinforcement Learning Systems (2010) (79)
- Multileaved Comparisons for Fast Online Evaluation (2014) (77)
- Inverse Reinforcement Learning from Failure (2016) (74)
- Towards Personalised Gaming via Facial Expression Recognition (2014) (73)
- Stable Opponent Shaping in Differentiable Games (2018) (73)
- Copeland Dueling Bandits (2015) (72)
- Balancing Exploration and Exploitation in Learning to Rank Online (2011) (71)
- Multi-Agent Common Knowledge Reinforcement Learning (2018) (70)
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator (2018) (70)
- FACMAC: Factored Multi-Agent Centralised Policy Gradients (2020) (65)
- TACO: Learning Task Decomposition via Temporal Alignment for Control (2018) (64)
- Approximate solutions for factored Dec-POMDPs with many agents (2013) (63)
- Empirical Studies in Action Selection with Reinforcement Learning (2007) (62)
- GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values (2020) (61)
- TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning (2017) (61)
- Concurrent layered learning (2003) (54)
- Machine learning for event selection in high energy physics (2009) (53)
- Evolutionary Computation for Reinforcement Learning (2012) (52)
- TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning (2017) (52)
- Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods (2013) (52)
- Expected Policy Gradients (2017) (51)
- Lerot: an online learning to rank framework (2013) (50)
- Neuroevolutionary reinforcement learning for generalized control of simulated helicopters (2011) (49)
- MergeRUCB: A Method for Large-Scale Online Ranker Evaluation (2015) (48)
- The Reinforcement Learning Competitions (2010) (47)
- Adaptive job routing and scheduling (2004) (46)
- Lossless clustering of histories in decentralized POMDPs (2009) (46)
- The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning (2019) (45)
- Relative confidence sampling for efficient on-line ranker evaluation (2014) (44)
- Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs (2013) (44)
- Critical factors in the performance of novelty search (2011) (44)
- OFFER: Off-Environment Reinforcement Learning (2017) (42)
- Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs (2009) (42)
- Incremental clustering and expansion for faster optimal planning in decentralized POMDPs (2013) (42)
- Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control (2020) (41)
- Neuroevolutionary reinforcement learning for generalized helicopter control (2009) (41)
- Learning potential functions and their representations for multi-task reinforcement learning (2014) (40)
- Computing Convex Coverage Sets for Faster Multi-objective Coordination (2015) (39)
- CAML: Fast Context Adaptation via Meta-Learning (2018) (38)
- Point-Based Planning for Multi-Objective POMDPs (2015) (37)
- VIREL: A Variational Inference Framework for Reinforcement Learning (2018) (35)
- DAC: The Double Actor-Critic Architecture for Learning Options (2019) (35)
- Estimating interleaved comparison outcomes from historical click data (2012) (35)
- Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning (2010) (35)
- Expected Policy Gradients for Reinforcement Learning (2018) (34)
- Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty (2014) (34)
- Using informative behavior to increase engagement in the tamer framework (2013) (33)
- My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control (2020) (33)
- Linear support for multi-objective coordination graphs (2014) (32)
- TERESA: a socially intelligent semi-autonomous telepresence system (2015) (32)
- Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation (2019) (31)
- Optimistic Exploration even with a Pessimistic Initialisation (2020) (31)
- Weighted QMIX: Expanding Monotonic Value Function Factorisation (2020) (30)
- Transfer Learning for Policy Search Methods (2006) (30)
- Generalized Off-Policy Actor-Critic (2019) (30)
- Rapidly exploring learning trees (2017) (29)
- Stochastic Optimization for Collision Selection in High Energy Physics (2006) (29)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning (2020) (27)
- On-line evolutionary computation for reinforcement learning in stochastic domains (2006) (27)
- Transient Non-stationarity and Generalisation in Deep Reinforcement Learning (2020) (27)
- Challenge balancing for personalised game spaces (2014) (26)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning (2020) (26)
- Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection (2015) (25)
- Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? (2020) (25)
- Computing Convex Coverage Sets for Multi-objective Coordination Graphs (2013) (25)
- Contextual Bandits for Information Retrieval (2011) (25)
- Queued Pareto Local Search for Multi-Objective Optimization (2014) (25)
- Growing Action Spaces (2019) (25)
- Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison (2007) (24)
- Alternating Optimisation and Quadrature for Robust Control (2016) (23)
- Exploiting submodular value functions for scaling up active perception (2018) (23)
- Fast Efficient Hyperparameter Tuning for Policy Gradient Methods (2019) (23)
- Using informative behavior to increase engagement while learning from human reward (2015) (22)
- Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning (2019) (22)
- Adaptive Representations for Reinforcement Learning (2010) (22)
- Fast Efficient Hyperparameter Tuning for Policy Gradients (2019) (21)
- Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning (2021) (21)
- The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning (2020) (21)
- Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning (2020) (20)
- Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning (2019) (20)
- Breaking the Deadly Triad with a Target Network (2021) (19)
- In Defense of the Unitary Scalarization for Deep Multi-Task Learning (2022) (19)
- Exploiting Best-Match Equations for Efficient Reinforcement Learning (2011) (19)
- Critical factors in the performance of hyperNEAT (2013) (19)
- Deep Interactive Bayesian Reinforcement Learning via Meta-Learning (2021) (18)
- Multi-task evolutionary shaping without pre-specified representations (2010) (18)
- Exploiting Structure in Cooperative Bayesian Games (2012) (17)
- Social interaction for efficient agent learning from human reward (2017) (17)
- Generalized Domains for Empirical Evaluations in Reinforcement Learning (2009) (16)
- Efficient Abstraction Selection in Reinforcement Learning (2014) (16)
- Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning (2020) (16)
- AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning (2020) (16)
- Fingerprint Policy Optimisation for Robust Reinforcement Learning (2018) (16)
- Towards autonomic computing: adaptive network routing and scheduling (2004) (16)
- Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework (2020) (15)
- Deep Residual Reinforcement Learning (2019) (15)
- Multitask Soft Option Learning (2019) (15)
- WordCraft: An Environment for Benchmarking Commonsense Agents (2020) (15)
- Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization (2020) (14)
- Multi-Task Reinforcement Learning: Shaping and Feature Selection (2011) (13)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward (2020) (13)
- A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs (2019) (13)
- Fourier Policy Gradients (2018) (13)
- Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation (2022) (12)
- Improving reinforcement learning function approximators via neuroevolution (2005) (12)
- V-MAX: tempered optimism for better PAC reinforcement learning (2012) (11)
- Average-Reward Off-Policy Policy Evaluation with Function Approximation (2021) (11)
- Automatic feature selection using FS-NEAT (2008) (10)
- Pareto Local Policy Search for MOMDP Planning (2015) (10)
- Optimizing Base Rankers Using Clicks - A Case Study Using BM25 (2014) (10)
- Bayesian Ranker Comparison Based on Historical User Interactions (2015) (9)
- VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning (2021) (9)
- Regularized Softmax Deep Multi-Agent Q-Learning (2021) (9)
- Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning (2019) (9)
- A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms (2020) (9)
- Generalization in Cooperative Multi-Agent Systems (2022) (8)
- Towards Challenge Balancing for Personalised Game Spaces (2014) (8)
- Privileged Information Dropout in Reinforcement Learning (2020) (8)
- Learning from human reward benefits from socio-competitive feedback (2014) (8)
- Acquiring social interaction behaviours for telepresence robots via deep learning from demonstration (2017) (8)
- Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning (2019) (7)
- Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning (2006) (7)
- Switching between Representations in Reinforcement Learning (2010) (7)
- Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving (2022) (7)
- Provably Convergent Off-Policy Actor-Critic with Function Approximation (2019) (7)
- You May Not Need Ratio Clipping in PPO (2022) (7)
- Report on the 2008 Reinforcement Learning Competition (2010) (7)
- Learning Retrospective Knowledge with Reverse Reinforcement Learning (2020) (7)
- SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching (2021) (6)
- Postponed Updates for Temporal-Difference Reinforcement Learning (2009) (6)
- Why Multi-objective Reinforcement Learning? (2015) (6)
- Alternating Optimisation and Quadrature for Robust Reinforcement Learning (2016) (6)
- SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning (2022) (6)
- Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO (2022) (6)
- "Learning to rank for information retrieval from user interactions" by K. Hofmann, S. Whiteson, A. Schuth, and M. de Rijke with Martin Vesely as coordinator (2014) (6)
- Multi-objective variable elimination for collaborative graphical games (2013) (6)
- An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (2022) (6)
- Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing (2021) (5)
- An Analysis of Piecewise-Linear and Convex Value Functions for Active Perception POMDPs (2015) (5)
- Introduction to the special issue on empirical evaluations in reinforcement learning (2011) (5)
- Leveraging social networks to motivate humans to train agents (2014) (4)
- Model based Multi-agent Reinforcement Learning with Tensor Decompositions (2021) (4)
- A Large-Scale Study of Agents Learning from Human Reward (2015) (4)
- Integrating distributed Bayesian inference and reinforcement learning for sensor management (2009) (4)
- Communicating via Markov Decision Processes (2021) (4)
- Evolutionary Function Approximation (2010) (4)
- Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios (2022) (4)
- Evolving robocup keepaway players through task decomposition (2003) (4)
- Dynamic-Depth Context Tree Weighting (2017) (3)
- Contextual Policy Optimisation (2018) (3)
- PAC Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection (2016) (3)
- Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning (2021) (3)
- Towards Learning from Implicit Human Reward: (Extended Abstract) (2016) (3)
- A Survey of Meta-Reinforcement Learning (2023) (3)
- Using Confidence Bounds for Efficient On-Line Ranker Evaluation (2014) (3)
- Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (2009) (3)
- On the Practical Consistency of Meta-Reinforcement Learning Algorithms (2021) (3)
- Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games (2011) (3)
- Truncated Emphatic Temporal Difference Methods for Prediction and Control (2021) (3)
- Variational Multi-Objective Coordination (2015) (3)
- Hypernetworks in Meta-Reinforcement Learning (2022) (2)
- Generalized Beliefs for Cooperative AI (2022) (2)
- Equivariant Networks for Zero-Shot Coordination (2022) (2)
- Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning (2021) (2)
- Bayesian Bellman Operators (2021) (2)
- Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning (2020) (2)
- Reinforcement Learning in Factored Action Spaces using Tensor Decompositions (2021) (2)
- Robust central pattern generators for embodied hierarchical reinforcement learning (2011) (2)
- Towards Autonomic Computing: Adaptive Job Routing and Scheduling (2004) (1)
- A Better Baseline for Second Order Gradient Estimation in Stochastic Computation Graphs (2018) (1)
- Real-Time Resource Allocation for Tracking Systems (2020) (1)
- On-Line Evolutionary Computation (2010) (1)
- Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula (2022) (1)
- Off-Environment RL with Rare Events (2016) (1)
- Adaptive Tile Coding for Reinforcement Learning (2006) (1)
- Probably Approximately Correct Greedy Maximization (2016) (1)
- Efficient Abstraction Selection in Reinforcement Learning (Extended Abstract) (2013) (1)
- Adaptive Tile Coding (2010) (1)
- Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval (2012) (1)
- Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency (2021) (1)
- Using informative behavior to increase engagement while learning from human reward (2015) (1)
- Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients (2021) (1)
- The StarCraft Multi-Agent Challenge Extended Abstract (2019) (1)
- Pareto Local Search for MOMDP Planning (2015) (1)
- How does the sensitivity of multileaving methods compare to that of interleaving methods? (2014) (0)
- Sample-Efficient Evolutionary Function Approximation (2010) (0)
- Supplementary Material: Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning (2021) (0)
- for MOMDP Planning (2015) (0)
- Towards Personalised Gaming via Facial Expression Recognition (2014) (0)
- Implicit Communication as Minimum Entropy Coupling (2021) (0)
- Improving Exploration in Deep Reinforcement Learning (2017) (0)
- Adapting Rankers Online (2011) (0)
- Probably Approximately Correct Greedy Maximization: (Extended Abstract) (2016) (0)
- Why Target Networks Stabilise Temporal Difference Methods (2023) (0)
- AIMS CDT Project Report: Towards One-Shot Learning From Demonstration via Reinforcement Learning (2018) (0)
- Integrating Reinforcement Learning and Distributed Perception Networks for Mobile Sensor Control (2008) (0)
- Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning (2023) (0)
- Exploiting submodular value functions for scaling up active perception (2017) (0)
- Learning Potential Functions and their Representations for Multi-Task Reinforcement Learning (2014) (0)
- A large-scale study of agents learning from human reward (Extended abstract) (2015) (0)
- Universal Morphology Control via Contextual Modulation (2023) (0)
- Towards Autonomic Computing: Adaptive Job Routing (2004) (0)
- TION IN DEEP REINFORCEMENT LEARNING (2021) (0)
- Deep Residual Reinforcement Learning (Extended Abstract) (2021) (0)
- Supplementary: A Baseline for Any Order Gradient Estimation in SCGs (2019) (0)
- Supplementary Material for ‘Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning’ (2019) (0)
- Design criteria for challenge balancing of personalised game spaces (2014) (0)
- Intro to Reinforcement Learning (2017) (0)
- Utile Context Tree Weighting (2017) (0)
- Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework (2020) (0)
- Learning potential functions and their representations for multi-task reinforcement learning (2013) (0)
- Appendix for Regularized Softmax Deep Multi-Agent Q-Learning (2021) (0)
- Trust-Region-Free Policy Optimization for Stochastic Policies (2023) (0)
- Trust Region Bounds for Decentralized PPO Under Non-stationarity (2022) (0)
- Learning Skills Diverse in Value-Relevant Features (2022) (0)
- Probably Approximately Correct Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection (2016) (0)
- Robust Reinforcement Learning with Bayesian Optimisation and Quadrature (2020) (0)
- VIABLE: Fast Adaptation via Backpropagating Learned Loss (2019) (0)
- Automatic Feature Selection for Reinforcement Learning (2010) (0)
- Machine learning in network systems: A top-down approach to autonomic computing using adaptive routing and scheduling (2017) (0)
- Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving (2022) (0)
- Stable Opponent Shaping (2019) (0)
- Challenge Balancing for Personalised Game Spaces (2014) (0)
- DiCE: The Infinitely Differentiable Monte (2018) (0)
- Social interaction for efficient agent learning from human reward (2017) (0)
- Queued Pareto Local Search for Multi-objective Decision Making (2015) (0)
What Schools Are Affiliated With Shimon Whiteson?
Shimon Whiteson is affiliated with the following schools: