2020 article

Eco-Vehicular Edge Networks for Connected Transportation: A Distributed Multi-Agent Reinforcement Learning Approach

2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall).

By: M. Pervej & S. Lin

author keywords: Connected transportation; energy efficiency; reinforcement learning; resource scheduling; software-defined networking; vehicle-to-infrastructure (V2I) communication
TL;DR: A distributed multi-agent reinforcement learning (D-MARL) algorithm is proposed for eco-vehicular edges, where multiple agents cooperatively learn to receive the best reward, and results validate that the learning solution achieves near-optimal performances within a small number of training episodes as compared with existing baselines. (via Semantic Scholar)
Source: Web Of Science
Added: July 19, 2021

This paper introduces an energy-efficient, software-defined vehicular edge network for the growing intelligent connected transportation system. A joint user-centric virtual cell formation and resource allocation problem is investigated to bring eco-friendly solutions to the edge. This joint problem aims to combat the power-hungry edge nodes while maintaining assured reliability and data rate. More specifically, by prioritizing the downlink communication of dynamic eco-routing, highly mobile autonomous vehicles are served by multiple low-powered access points (APs) simultaneously for ubiquitous connectivity and guaranteed reliability of the network. The formulated optimization is hard to solve in polynomial time due to its complicated combinatorial structure. Hence, a distributed multi-agent reinforcement learning (D-MARL) algorithm is proposed for eco-vehicular edges, where multiple agents cooperatively learn to receive the best reward. First, the algorithm segments the centralized action space into multiple smaller groups. Based on the model-free distributed Q learner, each edge agent takes its actions from the respective group. Also, in each learning state, a software-defined controller chooses the global best action from the individual bests of the distributed agents. Numerical results validate that the learning solution achieves near-optimal performance within a small number of training episodes compared with existing baselines.
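
As a rough illustration of the D-MARL idea summarized above, the following minimal Python sketch shows how each edge agent could run a model-free Q learner over its own slice of the action space while a central controller picks the global best among the agents' individual bests. Names such as EdgeAgentQ, controller_step, and env_step are illustrative assumptions for this sketch, not the paper's actual implementation, and the reward/state handling is deliberately simplified.

```python
import random
from collections import defaultdict

class EdgeAgentQ:
    """One distributed Q learner responsible for a segment of the joint action space."""
    def __init__(self, action_group, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.action_group = list(action_group)    # this agent's share of the actions
        self.alpha, self.gamma, self.eps = alpha, gamma, epsilon
        self.q = defaultdict(float)               # Q[(state, action)] -> value

    def propose(self, state):
        """Epsilon-greedy proposal: this agent's candidate action and its Q-value."""
        if random.random() < self.eps:
            action = random.choice(self.action_group)
        else:
            action = max(self.action_group, key=lambda a: self.q[(state, a)])
        return action, self.q[(state, action)]

    def update(self, state, action, reward, next_state):
        """Standard model-free Q-learning update, restricted to this agent's actions."""
        best_next = max(self.q[(next_state, a)] for a in self.action_group)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def controller_step(agents, state, env_step):
    """SDN-style controller: pick the global best among the agents' proposals,
    apply it via a user-supplied env_step(state, action) -> (reward, next_state),
    and let the agent owning that action learn from the outcome."""
    proposals = [agent.propose(state) for agent in agents]
    best_idx = max(range(len(agents)), key=lambda i: proposals[i][1])
    action = proposals[best_idx][0]
    reward, next_state = env_step(state, action)
    agents[best_idx].update(state, action, reward, next_state)
    return next_state, reward
```

In this reading, segmenting the action space keeps each agent's Q-table small, and the controller only has to compare one candidate per agent per state rather than searching the full combinatorial action set.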