2024 journal article

Distributed Multiagent Reinforcement Learning Based on Graph-Induced Local Value Functions

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 69(10), 6636–6651.

IEEE keywords: Couplings; Heuristic algorithms; Convergence; Approximation algorithms; Scalability; Reinforcement learning; Indexes
author keywords: Distributed learning; Markov decision process; multiagent systems; optimal control; reinforcement learning
Source: Web of Science
Added: October 14, 2024

Achieving distributed reinforcement learning (RL) for large-scale cooperative multiagent systems (MASs) is challenging because (i) each agent has access to only limited information, and (ii) the curse of dimensionality raises issues of scalability and sample efficiency. In this paper, we propose a general distributed framework for sample-efficient cooperative multiagent reinforcement learning (MARL) that exploits the graph structures inherent in the problem. We introduce three coupling graphs describing three types of inter-agent couplings in MARL: the state graph, the observation graph, and the reward graph. By further considering a communication graph, we propose two distributed RL approaches based on local value functions derived from the coupling graphs. The first approach reduces sample complexity significantly when the four graphs above satisfy certain conditions. The second approach provides an approximate solution and remains efficient even for problems with dense coupling graphs; it entails a trade-off between minimizing the approximation error and reducing the computational complexity. Simulations show that our RL algorithms scale to large-scale MASs significantly better than centralized and consensus-based distributed RL algorithms.
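The record does not include the paper's implementation details. As a minimal sketch of the graph-induced locality idea, the Python snippet below builds hypothetical state, observation, and reward coupling graphs as adjacency sets, takes their union, and extracts each agent's k-hop neighborhood, i.e., the set of agents a local value function would depend on. All names and graph contents here (state_graph, union_graph, k_hop_neighborhood, etc.) are illustrative assumptions, not the authors' code or API.

```python
# Illustrative sketch (not the paper's code): each agent's local value
# function depends only on the neighborhood induced by the union of the
# state, observation, and reward coupling graphs, within k hops.
from itertools import chain

# Hypothetical coupling graphs over 5 agents, given as adjacency sets.
state_graph       = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
observation_graph = {0: {1}, 1: {0}, 2: {3}, 3: {2}, 4: set()}
reward_graph      = {0: set(), 1: {2}, 2: {1}, 3: {4}, 4: {3}}

def union_graph(*graphs):
    """Edge-wise union of adjacency-set graphs over the same agent set."""
    agents = set(chain.from_iterable(graphs))
    return {i: set().union(*(g.get(i, set()) for g in graphs)) for i in agents}

def k_hop_neighborhood(graph, agent, k):
    """Agents reachable from `agent` within k hops (including itself)."""
    frontier, seen = {agent}, {agent}
    for _ in range(k):
        frontier = set().union(*(graph[i] for i in frontier)) - seen
        seen |= frontier
    return seen

coupling = union_graph(state_graph, observation_graph, reward_graph)
for i in sorted(coupling):
    nbhd = sorted(k_hop_neighborhood(coupling, i, k=1))
    print(f"agent {i}: local value function estimated over agents {nbhd}")
```

Under this reading, sparse coupling graphs yield small neighborhoods and hence low sample complexity, while dense graphs force a choice between larger neighborhoods (smaller approximation error) and truncated ones (lower computational cost), matching the trade-off described in the abstract.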