Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration

Thesis event information

Date and time of the thesis defence

Place of the thesis defence

IT116

Topic of the dissertation

Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration

Doctoral candidate

Master of Engineering Fahri Wisnu Murti

Faculty and unit

University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Centre for Wireless Communications

Subject of study

Communications Engineering

Opponent

Docent Tao Chen, VTT Technical Research Centre of Finland

Custos

Professor Matti Latva-aho, University of Oulu

Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration

The main objective of this thesis is to devise novel learning-based frameworks that orchestrate cost-aware virtualized Radio Access Networks (vRANs).

In vRANs, the base station (BS) functions can be fully configurable, disaggregated, and implemented at low cost on commodity platforms. This paradigm shift brings flexibility to RAN operations and potentially reduces operational expenses. However, large-scale vRAN deployment is challenged by highly coupled configuration options and non-trivial underlying systems. This thesis therefore investigates vRAN orchestration problems and develops deep reinforcement learning (RL)-based frameworks that solve them with minimal assumptions about the system.

First, the functional split problem is investigated, where the BS functions can be deployed flexibly at the centralized unit (CU) and distributed units (DUs) to minimize the total vRAN cost. This problem is combinatorial and provably NP-hard, so finding the optimal solution is computationally expensive. A chain rule-based stochastic RL policy with a sequence-to-sequence model is proposed to solve this problem heuristically. The results show that it can learn to make split decisions that are close to optimal.
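The chain-rule idea behind such a policy can be illustrated with a minimal sketch. It assumes the joint probability of all split decisions factorizes as p(a_1, ..., a_N | s) = p(a_1 | s) p(a_2 | a_1, s) ... p(a_N | a_1, ..., a_{N-1}, s), so one split is sampled per BS, conditioned on the splits already chosen. The names `sample_splits`, `sequence_log_prob`, and `toy_probs` are hypothetical, and `toy_probs` is only a stand-in for the sequence-to-sequence model, whose details are not given here.

```python
import math
import random


def toy_probs(index, previous, num_splits=4):
    """Hypothetical stand-in for the learned model: returns p(a_i | a_1..a_{i-1}).

    It slightly favours repeating the previous BS's split, purely for illustration.
    """
    weights = [1.0] * num_splits
    if previous:
        weights[previous[-1]] += 2.0
    total = sum(weights)
    return [w / total for w in weights]


def sample_splits(probs_fn, num_bs, rng):
    """Sample one split per BS in sequence and accumulate the joint log-probability."""
    actions, log_prob = [], 0.0
    for i in range(num_bs):
        probs = probs_fn(i, actions)
        choice = rng.choices(range(len(probs)), weights=probs)[0]
        log_prob += math.log(probs[choice])
        actions.append(choice)
    return actions, log_prob


def sequence_log_prob(probs_fn, actions):
    """Log-probability of a given split sequence under the chain-rule factorization."""
    log_prob = 0.0
    for i, choice in enumerate(actions):
        log_prob += math.log(probs_fn(i, actions[:i])[choice])
    return log_prob


rng = random.Random(0)
splits, logp = sample_splits(toy_probs, num_bs=5, rng=rng)
```

In a policy-gradient setting, the accumulated log-probability is the quantity that would be weighted by the observed cost to update the model; that training loop is omitted from this sketch.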

Next, a vRAN reconfiguration problem is proposed to jointly reconfigure the functional splits of the BSs, the locations of the CUs and DUs, their resources, and the routing for each BS data flow. The goal is to minimize the long-term total network operation cost while adapting to the traffic demands and resource availability. This problem has a multi-dimensional discrete action space, which yields a combinatorial number of possible actions. The proposed solution framework combines action branching, an action decomposition method that uses a separate neural network branch for each action dimension, with a dueling double deep Q-network algorithm. The results show that the framework successfully learns the optimal policy and offers substantial cost savings compared to the baselines.
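To see why action branching helps with a multi-dimensional discrete action space, consider the following minimal sketch. It assumes the usual dueling aggregation per branch, Q_d(s, a_d) = V(s) + A_d(s, a_d) - mean(A_d), with a shared state value V and one advantage vector per action dimension. The three dimensions and all numbers are hypothetical, the neural network is replaced by fixed values, and the double Q-learning target is not shown.

```python
def branch_q_values(state_value, advantages):
    """Dueling aggregation per branch: Q_d(a) = V + A_d(a) - mean(A_d)."""
    q_branches = []
    for adv in advantages:
        mean_adv = sum(adv) / len(adv)
        q_branches.append([state_value + a - mean_adv for a in adv])
    return q_branches


def greedy_action(q_branches):
    """Pick the best sub-action independently in every branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_branches]


def output_sizes(branch_sizes):
    """Compare outputs needed by a flat Q-network with a branched one."""
    joint = 1
    for n in branch_sizes:
        joint *= n
    return joint, sum(branch_sizes)


# Hypothetical action dimensions: split option (4), CU location (3), routing path (5).
advantages = [
    [0.1, 0.4, -0.2, 0.0],
    [0.3, -0.1, 0.2],
    [0.0, 0.0, 0.5, -0.5, 0.1],
]
q = branch_q_values(state_value=1.0, advantages=advantages)
action = greedy_action(q)
joint, branched = output_sizes([4, 3, 5])
```

With these toy sizes, a flat Q-network would need one output per joint action (4 x 3 x 5 = 60), whereas the branched form needs only 4 + 3 + 5 = 12, and the gap widens quickly as more BSs and decisions are added.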

Finally, a joint vRAN and multi-access edge computing (MEC) orchestration is proposed to jointly control the vRAN splits, the resources and hosting locations of the vRAN/MEC services, and the routing for each data flow. The goal is to minimize the long-term network operation cost and maximize the MEC performance criterion while adapting to vRAN/MEC demands and resource availability. A Bayesian deep RL framework is proposed to solve this problem; numerical evaluations show that it is data-efficient and improves learning performance over its non-Bayesian version.
Last updated: 8.10.2024