A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning

Centralized Training for Decentralized Execution, where training is done in a centralized offline fashion, has become a popular solution paradigm in Multi-Agent Reinforcement Learning. Many such methods take the form of actor-critic with state-based …

Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Policy gradient methods have become popular in multi-agent reinforcement learning, but they suffer from high variance due to the presence of environmental stochasticity and exploring agents (i.e., non-stationarity), which is potentially worsened by …

Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning

We prove, and verify empirically, that centralized and decentralized critics in multi-agent learning are asymptotically equivalent. We also provide a bias-variance trade-off analysis and empirical advice.

Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning

When multiple agents learn in a decentralized manner, the environment appears non-stationary from the perspective of an individual agent due to the exploration and learning of the other agents. Recently proposed deep multi-agent reinforcement …