2024 journal article
Distributed Multiagent Reinforcement Learning Based on Graph-Induced Local Value Functions
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 69(10), 6636–6651.
Achieving distributed reinforcement learning (RL) for large-scale cooperative multi-agent systems (MASs) is challenging because: (i) each agent has access to only limited information; and (ii) issues of scalability and sample efficiency arise due to the curse of dimensionality. In this paper, we propose a general distributed framework for sample-efficient cooperative multi-agent reinforcement learning (MARL) that exploits the graph structures inherent in the problem. We introduce three coupling graphs describing three types of inter-agent couplings in MARL, namely the state graph, the observation graph, and the reward graph. By further considering a communication graph, we propose two distributed RL approaches based on local value functions derived from the coupling graphs. The first approach can significantly reduce sample complexity under specific conditions on the aforementioned four graphs. The second approach provides an approximate solution and can be efficient even for problems with dense coupling graphs, at the cost of a trade-off between minimizing the approximation error and reducing the computational complexity. Simulations show that our RL algorithms scale to large-scale MASs significantly better than centralized and consensus-based distributed RL algorithms.
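The central idea summarized in the abstract, deriving each agent's local value function from coupling graphs rather than from the global state, can be illustrated with a minimal sketch. The graph names and the "union of coupling neighborhoods" rule below are illustrative assumptions for exposition only, not the paper's exact construction.

```python
from typing import Dict, Set

# A coupling graph maps each agent to the set of agents it is coupled with.
Graph = Dict[int, Set[int]]

def local_neighborhood(agent: int,
                       state_graph: Graph,
                       observation_graph: Graph,
                       reward_graph: Graph) -> Set[int]:
    """Agents whose information enters this agent's local value function:
    the agent itself plus its neighbors in the three coupling graphs
    (an assumed aggregation rule, for illustration)."""
    hood = {agent}
    for g in (state_graph, observation_graph, reward_graph):
        hood |= g.get(agent, set())
    return hood

def local_value_estimate(agent: int,
                         rewards: Dict[int, float],
                         state_graph: Graph,
                         observation_graph: Graph,
                         reward_graph: Graph) -> float:
    """One-step illustration: a local value built only from the rewards of
    coupled agents, instead of the global sum over all agents that a
    centralized critic would use."""
    hood = local_neighborhood(agent, state_graph, observation_graph, reward_graph)
    return sum(rewards[j] for j in hood)

if __name__ == "__main__":
    # Three agents on a line (0 - 1 - 2); sparse couplings keep each local value small.
    state_g = {0: {1}, 1: {0, 2}, 2: {1}}
    obs_g = {0: set(), 1: {0}, 2: {1}}
    rew_g = {0: {1}, 1: {2}, 2: set()}
    r = {0: 1.0, 1: 0.5, 2: -0.2}
    print(local_value_estimate(0, r, state_g, obs_g, rew_g))  # depends on agents {0, 1} only
```

When the coupling graphs are sparse, each local neighborhood stays small regardless of the total number of agents, which is the intuition behind the claimed reduction in sample complexity; dense coupling graphs would instead call for the paper's approximate second approach.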