2023 journal article

Reinforcement Learning-Based Joint User Scheduling and Link Configuration in Millimeter-Wave Networks

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 22(5), 3038–3054.

By: Y. Zhang * & R. Heath Jr

co-author countries: United States of America
author keywords: Millimeter wave communication; Relays; Training; Delays; Optimization; Dynamic scheduling; Wireless communication; Millimeter wave; mobility; user scheduling; relay selection; codebook selection; beam tracking; deep reinforcement learning; proximal policy optimization; multi-armed bandit; Thompson sampling
Source: Web Of Science
Added: July 3, 2023

In this paper, we develop algorithms for joint user scheduling and three types of link configuration in millimeter wave (mmWave) networks: relay selection, codebook optimization, and beam tracking. Our goal is to design an online controller that dynamically schedules users and configures their links to minimize system delay. To solve this complex scheduling problem, we model it as a dynamic decision-making process and develop two reinforcement learning-based solutions. The first solution is based on deep reinforcement learning (DRL) and uses proximal policy optimization to train a neural network-based policy. Because DRL can have high sample complexity, we also propose an empirical multi-armed bandit (MAB)-based solution, which decomposes the decision-making process into a sequence of sub-actions and uses classic max-weight scheduling and Thompson sampling to decide those sub-actions. Our evaluation of the proposed solutions confirms their effectiveness in providing acceptable system delay. It also shows that the DRL-based solution achieves better delay performance, while the MAB-based solution trains faster.
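The decomposition described in the abstract (first pick a user, then pick that user's link configuration) can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not specify. It assumes a toy model in which each transmission either succeeds or fails (a Bernoulli reward), each user's candidate link configurations are treated as bandit arms with Beta(1, 1) priors, and the max-weight rule uses queue length times the best estimated success probability. All names (`max_weight_user`, `ThompsonSampler`, `simulate`) and all numeric values are hypothetical.

```python
import random


def max_weight_user(queue_lengths, est_rates):
    """Return the index of the user with the largest queue x rate product."""
    weights = [q * r for q, r in zip(queue_lengths, est_rates)]
    return max(range(len(weights)), key=weights.__getitem__)


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms."""

    def __init__(self, n_arms, rng=None):
        # Assumption: uniform Beta(1, 1) prior on each arm's success probability.
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms
        self.rng = rng or random.Random()

    def select(self):
        # Draw one sample per arm from its posterior and play the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, success):
        if success:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

    def mean(self, arm):
        # Posterior mean of the arm's success probability.
        return self.successes[arm] / (self.successes[arm] + self.failures[arm])


def simulate(n_slots=2000, seed=0):
    """Toy slotted system: 2 users, 3 hypothetical link configurations each."""
    rng = random.Random(seed)
    true_p = [[0.2, 0.8, 0.5], [0.6, 0.3, 0.9]]  # made-up success probabilities
    arrival_p = [0.3, 0.3]                       # made-up packet arrival rates
    queues = [0] * len(true_p)
    samplers = [ThompsonSampler(len(p), rng) for p in true_p]

    for _ in range(n_slots):
        # Packet arrivals.
        for u, p in enumerate(arrival_p):
            queues[u] += int(rng.random() < p)
        # Sub-action 1: max-weight user selection with estimated rates.
        est = [max(s.mean(a) for a in range(len(true_p[u])))
               for u, s in enumerate(samplers)]
        user = max_weight_user(queues, est)
        if queues[user] == 0:
            continue  # nothing to send this slot
        # Sub-action 2: Thompson sampling over that user's link configurations.
        arm = samplers[user].select()
        ok = rng.random() < true_p[user][arm]
        samplers[user].update(arm, ok)
        if ok:
            queues[user] -= 1
    return queues, samplers
```

Calling `simulate()` returns the final queue lengths and the per-user samplers, whose posterior means can be inspected to see which configuration each user's bandit came to favor. A DRL-based controller would instead learn a single policy over the joint action, which is the trade-off between delay performance and training speed that the abstract reports.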