Shimon Whiteson

Q: What Schools Are Affiliated With Shimon Whiteson?

Shimon Whiteson is affiliated with the following schools: University of Amsterdam; University of California, Berkeley; University of Texas at Austin; and University of Oxford.

Shimon Whiteson's AcademicInfluence.com Rankings

Computer Science: World Rank #7662, Historical Rank #8066
Machine Learning: World Rank #2903, Historical Rank #2939
Artificial Intelligence: World Rank #3197, Historical Rank #3244
Database: World Rank #4714, Historical Rank #4897

Shimon Whiteson's Degrees

PhD, Computer Science, University of Oxford

Shimon Whiteson's Published Works

Counterfactual Multi-Agent Policy Gradients (2017) (1243)
Learning to Communicate with Deep Multi-Agent Reinforcement Learning (2016) (1144)
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2018) (1050)
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning (2017) (475)
A Survey of Multi-Objective Sequential Decision-Making (2013) (456)
The StarCraft Multi-Agent Challenge (2019) (425)
Learning with Opponent-Learning Awareness (2017) (385)
LipNet: End-to-End Sentence-level Lipreading (2016) (290)
Fast Context Adaptation via Meta-Learning (2018) (264)
Evolutionary Function Approximation for Reinforcement Learning (2006) (238)
Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs (2008) (215)
MAVEN: Multi-Agent Variational Exploration (2019) (197)
Deep Variational Reinforcement Learning for POMDPs (2018) (195)
A Survey of Reinforcement Learning Informed by Natural Language (2019) (194)
A theoretical and empirical analysis of Expected Sarsa (2009) (187)
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2020) (161)
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning (2019) (153)
Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval (2013) (142)
Transfer via inter-task mappings in policy search reinforcement learning (2007) (137)
A probabilistic method for inferring preferences from clicks (2011) (135)
LipNet: Sentence-level Lipreading (2016) (134)
Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks (2016) (133)
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning (2020) (128)
Exploiting locality of interaction in factored Dec-POMDPs (2008) (123)
Automatic feature selection in neuroevolution (2005) (120)
Evolving Keepaway Soccer Players through Task Decomposition (2003) (113)
Reusing historical interaction data for faster online learning to rank for IR (2013) (112)
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning (2018) (112)
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem (2013) (107)
Comparing evolutionary and temporal difference methods in a reinforcement learning domain (2006) (103)
Deep Coordination Graphs (2019) (98)
Adaptive Tile Coding for Value Function Approximation (2007) (94)
Protecting against evaluation overfitting in empirical reinforcement learning (2011) (91)
Evolving Soccer Keepaway Players Through Task Decomposition (2005) (91)
Multileave Gradient Descent for Fast Online Learning to Rank (2016) (88)
RODE: Learning Roles to Decompose Multi-Agent Tasks (2020) (85)
Multi-Objective Deep Reinforcement Learning (2016) (85)
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? (2020) (84)
Multi-Objective Decision Making (2017) (83)
Learning From Demonstration in the Wild (2018) (80)
Traffic Light Control by Multiagent Reinforcement Learning Systems (2010) (79)
Multileaved Comparisons for Fast Online Evaluation (2014) (77)
Inverse Reinforcement Learning from Failure (2016) (74)
Towards Personalised Gaming via Facial Expression Recognition (2014) (73)
Stable Opponent Shaping in Differentiable Games (2018) (73)
Copeland Dueling Bandits (2015) (72)
Balancing Exploration and Exploitation in Learning to Rank Online (2011) (71)
Multi-Agent Common Knowledge Reinforcement Learning (2018) (70)
DiCE: The Infinitely Differentiable Monte-Carlo Estimator (2018) (70)
FACMAC: Factored Multi-Agent Centralised Policy Gradients (2020) (65)
TACO: Learning Task Decomposition via Temporal Alignment for Control (2018) (64)
Approximate solutions for factored Dec-POMDPs with many agents (2013) (63)
Empirical Studies in Action Selection with Reinforcement Learning (2007) (62)
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values (2020) (61)
TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning (2017) (61)
Concurrent layered learning (2003) (54)
Machine learning for event selection in high energy physics (2009) (53)
Evolutionary Computation for Reinforcement Learning (2012) (52)
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning (2017) (52)
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods (2013) (52)
Expected Policy Gradients (2017) (51)
Lerot: an online learning to rank framework (2013) (50)
Neuroevolutionary reinforcement learning for generalized control of simulated helicopters (2011) (49)
MergeRUCB: A Method for Large-Scale Online Ranker Evaluation (2015) (48)
The Reinforcement Learning Competitions (2010) (47)
Adaptive job routing and scheduling (2004) (46)
Lossless clustering of histories in decentralized POMDPs (2009) (46)
The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning (2019) (45)
Relative confidence sampling for efficient on-line ranker evaluation (2014) (44)
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs (2013) (44)
Critical factors in the performance of novelty search (2011) (44)
OFFER: Off-Environment Reinforcement Learning (2017) (42)
Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs (2009) (42)
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs (2013) (42)
Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control (2020) (41)
Neuroevolutionary reinforcement learning for generalized helicopter control (2009) (41)
Learning potential functions and their representations for multi-task reinforcement learning (2014) (40)
Computing Convex Coverage Sets for Faster Multi-objective Coordination (2015) (39)
CAML: Fast Context Adaptation via Meta-Learning (2018) (38)
Point-Based Planning for Multi-Objective POMDPs (2015) (37)
VIREL: A Variational Inference Framework for Reinforcement Learning (2018) (35)
DAC: The Double Actor-Critic Architecture for Learning Options (2019) (35)
Estimating interleaved comparison outcomes from historical click data (2012) (35)
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning (2010) (35)
Expected Policy Gradients for Reinforcement Learning (2018) (34)
Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty (2014) (34)
Using informative behavior to increase engagement in the tamer framework (2013) (33)
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control (2020) (33)
Linear support for multi-objective coordination graphs (2014) (32)
TERESA: a socially intelligent semi-autonomous telepresence system (2015) (32)
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation (2019) (31)
Optimistic Exploration even with a Pessimistic Initialisation (2020) (31)
Weighted QMIX: Expanding Monotonic Value Function Factorisation (2020) (30)
Transfer Learning for Policy Search Methods (2006) (30)
Generalized Off-Policy Actor-Critic (2019) (30)
Rapidly exploring learning trees (2017) (29)
Stochastic Optimization for Collision Selection in High Energy Physics (2006) (29)
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning (2020) (27)
On-line evolutionary computation for reinforcement learning in stochastic domains (2006) (27)
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning (2020) (27)
Challenge balancing for personalised game spaces (2014) (26)
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning (2020) (26)
Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection (2015) (25)
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? (2020) (25)
Computing Convex Coverage Sets for Multi-objective Coordination Graphs (2013) (25)
Contextual Bandits for Information Retrieval (2011) (25)
Queued Pareto Local Search for Multi-Objective Optimization (2014) (25)
Growing Action Spaces (2019) (25)
Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison (2007) (24)
Alternating Optimisation and Quadrature for Robust Control (2016) (23)
Exploiting submodular value functions for scaling up active perception (2018) (23)
Fast Efficient Hyperparameter Tuning for Policy Gradient Methods (2019) (23)
Using informative behavior to increase engagement while learning from human reward (2015) (22)
Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning (2019) (22)
Adaptive Representations for Reinforcement Learning (2010) (22)
Fast Efficient Hyperparameter Tuning for Policy Gradients (2019) (21)
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning (2021) (21)
The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning (2020) (21)
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning (2020) (20)
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning (2019) (20)
Breaking the Deadly Triad with a Target Network (2021) (19)
In Defense of the Unitary Scalarization for Deep Multi-Task Learning (2022) (19)
Exploiting Best-Match Equations for Efficient Reinforcement Learning (2011) (19)
Critical factors in the performance of hyperNEAT (2013) (19)
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning (2021) (18)
Multi-task evolutionary shaping without pre-specified representations (2010) (18)
Exploiting Structure in Cooperative Bayesian Games (2012) (17)
Social interaction for efficient agent learning from human reward (2017) (17)
Generalized Domains for Empirical Evaluations in Reinforcement Learning (2009) (16)
Efficient Abstraction Selection in Reinforcement Learning (2014) (16)
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning (2020) (16)
AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning (2020) (16)
Fingerprint Policy Optimisation for Robust Reinforcement Learning (2018) (16)
Towards autonomic computing: adaptive network routing and scheduling (2004) (16)
Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework (2020) (15)
Deep Residual Reinforcement Learning (2019) (15)
Multitask Soft Option Learning (2019) (15)
WordCraft: An Environment for Benchmarking Commonsense Agents (2020) (15)
Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization (2020) (14)
Multi-Task Reinforcement Learning: Shaping and Feature Selection (2011) (13)
Maximizing Information Gain in Partially Observable Environments via Prediction Reward (2020) (13)
A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs (2019) (13)
Fourier Policy Gradients (2018) (13)
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation (2022) (12)
Improving reinforcement learning function approximators via neuroevolution (2005) (12)
V-MAX: tempered optimism for better PAC reinforcement learning (2012) (11)
Average-Reward Off-Policy Policy Evaluation with Function Approximation (2021) (11)
Automatic feature selection using FS-NEAT (2008) (10)
Pareto Local Policy Search for MOMDP Planning (2015) (10)
Optimizing Base Rankers Using Clicks - A Case Study Using BM25 (2014) (10)
Bayesian Ranker Comparison Based on Historical User Interactions (2015) (9)
VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning (2021) (9)
Regularized Softmax Deep Multi-Agent Q-Learning (2021) (9)
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning (2019) (9)
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms (2020) (9)
Generalization in Cooperative Multi-Agent Systems (2022) (8)
Towards Challenge Balancing for Personalised Game Spaces (2014) (8)
Privileged Information Dropout in Reinforcement Learning (2020) (8)
Learning from human reward benefits from socio-competitive feedback (2014) (8)
Acquiring social interaction behaviours for telepresence robots via deep learning from demonstration (2017) (8)
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning (2019) (7)
Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning (2006) (7)
Switching between Representations in Reinforcement Learning (2010) (7)
Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving (2022) (7)
Provably Convergent Off-Policy Actor-Critic with Function Approximation (2019) (7)
You May Not Need Ratio Clipping in PPO (2022) (7)
Report on the 2008 Reinforcement Learning Competition (2010) (7)
Learning Retrospective Knowledge with Reverse Reinforcement Learning (2020) (7)
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching (2021) (6)
Postponed Updates for Temporal-Difference Reinforcement Learning (2009) (6)
Why Multi-objective Reinforcement Learning? (2015) (6)
Alternating Optimisation and Quadrature for Robust Reinforcement Learning (2016) (6)
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning (2022) (6)
Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO (2022) (6)
"Learning to rank for information retrieval from user interactions" by K. Hofmann, S. Whiteson, A. Schuth, and M. de Rijke with Martin Vesely as coordinator (2014) (6)
Multi-objective variable elimination for collaborative graphical games (2013) (6)
An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (2022) (6)
Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing (2021) (5)
An Analysis of Piecewise-Linear and Convex Value Functions for Active Perception POMDPs (2015) (5)
Introduction to the special issue on empirical evaluations in reinforcement learning (2011) (5)
Leveraging social networks to motivate humans to train agents (2014) (4)
Model based Multi-agent Reinforcement Learning with Tensor Decompositions (2021) (4)
A Large-Scale Study of Agents Learning from Human Reward (2015) (4)
Integrating distributed Bayesian inference and reinforcement learning for sensor management (2009) (4)
Communicating via Markov Decision Processes (2021) (4)
Evolutionary Function Approximation (2010) (4)
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios (2022) (4)
Evolving robocup keepaway players through task decomposition (2003) (4)
Dynamic-Depth Context Tree Weighting (2017) (3)
Contextual Policy Optimisation (2018) (3)
PAC Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection (2016) (3)
Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning (2021) (3)
Towards Learning from Implicit Human Reward: (Extended Abstract) (2016) (3)
A Survey of Meta-Reinforcement Learning (2023) (3)
Using Confidence Bounds for Efficient On-Line Ranker Evaluation (2014) (3)
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (2009) (3)
On the Practical Consistency of Meta-Reinforcement Learning Algorithms (2021) (3)
Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games (2011) (3)
Truncated Emphatic Temporal Difference Methods for Prediction and Control (2021) (3)
Variational Multi-Objective Coordination (2015) (3)
Hypernetworks in Meta-Reinforcement Learning (2022) (2)
Generalized Beliefs for Cooperative AI (2022) (2)
Equivariant Networks for Zero-Shot Coordination (2022) (2)
Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning (2021) (2)
Bayesian Bellman Operators (2021) (2)
Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning (2020) (2)
Reinforcement Learning in Factored Action Spaces using Tensor Decompositions (2021) (2)
Robust central pattern generators for embodied hierarchical reinforcement learning (2011) (2)
Towards Autonomic Computing: Adaptive Job Routing and Scheduling (2004) (1)
A Better Baseline for Second Order Gradient Estimation in Stochastic Computation Graphs (2018) (1)
Real-Time Resource Allocation for Tracking Systems (2020) (1)
On-Line Evolutionary Computation (2010) (1)
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula (2022) (1)
Off-Environment RL with Rare Events (2016) (1)
Adaptive Tile Coding for Reinforcement Learning (2006) (1)
Probably Approximately Correct Greedy Maximization (2016) (1)
Efficient Abstraction Selection in Reinforcement Learning (Extended Abstract) (2013) (1)
Adaptive Tile Coding (2010) (1)
Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval (2012) (1)
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency (2021) (1)
Using informative behavior to increase engagement while learning from human reward (2015) (1)
Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients (2021) (1)
The StarCraft Multi-Agent Challenge Extended Abstract (2019) (1)
Pareto Local Search for MOMDP Planning (2015) (1)
How does the sensitivity of multileaving methods compare to that of interleaving methods? (2014) (0)
Sample-Efficient Evolutionary Function Approximation (2010) (0)
Supplementary Material: Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning (2021) (0)
for MOMDP Planning (2015) (0)
Towards Personalised Gaming via Facial Expression Recognition (2014) (0)
Implicit Communication as Minimum Entropy Coupling (2021) (0)
Improving Exploration in Deep Reinforcement Learning (2017) (0)
Adapting Rankers Online (2011) (0)
Probably Approximately Correct Greedy Maximization: (Extended Abstract) (2016) (0)
Why Target Networks Stabilise Temporal Difference Methods (2023) (0)
AIMS CDT Project Report: Towards One-Shot Learning From Demonstration via Reinforcement Learning (2018) (0)
Integrating Reinforcement Learning and Distributed Perception Networks for Mobile Sensor Control (2008) (0)
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning (2023) (0)
Exploiting submodular value functions for scaling up active perception (2017) (0)
Learning Potential Functions and their Representations for Multi-Task Reinforcement Learning (2014) (0)
A large-scale study of agents learning from human reward (Extended abstract) (2015) (0)
Universal Morphology Control via Contextual Modulation (2023) (0)
Towards Autonomic Computing: Adaptive Job Routing (2004) (0)
TION IN DEEP REINFORCEMENT LEARNING (2021) (0)
Deep Residual Reinforcement Learning (Extended Abstract) (2021) (0)
Supplementary: A Baseline for Any Order Gradient Estimation in SCGs (2019) (0)
Supplementary Material for ‘Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning’ (2019) (0)
Design criteria for challenge balancing of personalised game spaces (2014) (0)
Intro to Reinforcement Learning (2017) (0)
Utile Context Tree Weighting (2017) (0)
Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework (2020) (0)
Learning potential functions and their representations for multi-task reinforcement learning (2013) (0)
Appendix for Regularized Softmax Deep Multi-Agent Q-Learning (2021) (0)
Trust-Region-Free Policy Optimization for Stochastic Policies (2023) (0)
Trust Region Bounds for Decentralized PPO Under Non-stationarity (2022) (0)
Learning Skills Diverse in Value-Relevant Features (2022) (0)
Probably Approximately Correct Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection (2016) (0)
Robust Reinforcement Learning with Bayesian Optimisation and Quadrature (2020) (0)
VIABLE: Fast Adaptation via Backpropagating Learned Loss (2019) (0)
Automatic Feature Selection for Reinforcement Learning (2010) (0)
Machine learning in network systems A top-down approach to autonomic computing using adaptive routing and scheduling (2017) (0)
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving (2022) (0)
Stable Opponent Shaping (2019) (0)
Challenge Balancing for Personalised Game Spaces (2014) (0)
DiCE: The Infinitely Differentiable Monte (2018) (0)
Social interaction for efficient agent learning from human reward (2017) (0)
Queued Pareto Local Search for Multi-objective Decision Making (2015) (0)

Other Resources About Shimon Whiteson

www.cs.ox.ac.uk
