Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration
Thesis event information
Date and time of the thesis defence
Place of the thesis defence
IT116
Topic of the dissertation
Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration
Doctoral candidate
Master of Engineering Fahri Wisnu Murti
Faculty and unit
University of Oulu Graduate School, Faculty of Information Technology and Electrical Engineering, Centre for Wireless Communications
Subject of study
Communications Engineering
Opponent
Docent Tao Chen, VTT Technical Research Centre of Finland
Custos
Professor Matti Latva-aho, University of Oulu
Deep Reinforcement Learning for Virtualized Radio Access Networks Orchestration
The main objective of this thesis is to devise novel learning-based frameworks that orchestrate cost-aware virtualized Radio Access Networks (vRANs).
In vRANs, the base station (BS) functions can be fully configurable, disaggregated, and implemented at low cost over commodity platforms. This paradigm shift brings flexibility to RAN operations and potentially reduces operational expenses. However, widespread deployment is hindered by highly coupled configuration options and complex underlying systems. To address this, the thesis investigates vRAN orchestration problems and develops deep reinforcement learning (RL)-based frameworks that solve them with minimal assumptions about the system.
First, the functional split problem is investigated, where the BS functions can be deployed flexibly at the centralized unit (CU) and distributed units (DUs) to minimize the total vRAN cost. This problem is combinatorial and provably NP-hard, so finding the optimal solution is computationally expensive. A chain rule-based stochastic RL policy with a sequence-to-sequence model is proposed to solve the problem heuristically. The results show that it learns to make near-optimal split decisions.
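The chain-rule idea can be illustrated with a toy sketch: the joint probability of a sequence of split decisions factorizes as p(a1,…,aN) = ∏i p(ai | a1,…,ai−1), so the policy samples one BS's split at a time, conditioned on earlier decisions. Everything below (the number of BSs, the four candidate splits, and the placeholder logit function standing in for the sequence-to-sequence model) is an illustrative assumption, not the thesis implementation.

```python
import math
import random

SPLITS = [0, 1, 2, 3]   # illustrative candidate functional splits per BS
NUM_BS = 3              # illustrative number of base stations

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def policy_logits(prev_decisions, bs_index, theta):
    # Placeholder for a sequence-to-sequence model: logits for the next
    # BS's split, conditioned on the decisions made so far.
    return [theta[bs_index][s] + 0.1 * sum(prev_decisions) for s in SPLITS]

def sample_splits(theta, rng):
    # Chain rule: log p(a1..aN) = sum_i log p(a_i | a_1..a_{i-1})
    decisions, log_prob = [], 0.0
    for i in range(NUM_BS):
        probs = softmax(policy_logits(decisions, i, theta))
        a = rng.choices(SPLITS, weights=probs)[0]
        log_prob += math.log(probs[a])
        decisions.append(a)
    return decisions, log_prob

rng = random.Random(0)
theta = [[0.0] * len(SPLITS) for _ in range(NUM_BS)]
decisions, log_prob = sample_splits(theta, rng)
```

In a REINFORCE-style training loop, the accumulated `log_prob` would be weighted by the negative vRAN cost of the sampled split configuration to update the policy parameters.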
Next, a vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the BSs, the locations of the CUs and DUs, their resources, and the routing for each BS data flow. The goal is to minimize the long-term total network operation cost while adapting to traffic demands and resource availability. This problem has a multi-dimensional discrete action space, which yields a combinatorial number of possible actions. The proposed solution framework combines action branching, an action decomposition method realized with per-dimension neural network branches, with a dueling double deep Q-network algorithm. The results show that the framework successfully learns the optimal policy and offers substantial cost savings over the baselines.
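A minimal sketch of the action-branching idea with a dueling aggregation follows. Instead of one output per joint action (a product of the per-dimension action counts), each branch scores only its own dimension, so the output size is the sum of those counts and the greedy joint action is an independent argmax per branch. The branch sizes and the linear "heads" are illustrative placeholders; the thesis uses neural networks.

```python
import random

BRANCH_SIZES = [4, 3, 5]  # illustrative action counts per decision dimension
                          # (e.g. split, CU/DU placement, routing)

def dueling_branch_q(state_features, weights):
    # Shared state value, plus one advantage head per branch.
    value = sum(state_features) * weights["value"]
    q_per_branch = []
    for d, n in enumerate(BRANCH_SIZES):
        adv = [weights["adv"][d][a] * sum(state_features) for a in range(n)]
        mean_adv = sum(adv) / n
        # Dueling aggregation per branch: Q_d(s, a_d) = V(s) + A_d - mean(A_d)
        q_per_branch.append([value + a - mean_adv for a in adv])
    return q_per_branch

def greedy_action(q_per_branch):
    # Independent argmax per branch yields the joint action in O(sum |A_d|)
    # evaluations rather than O(prod |A_d|).
    return [max(range(len(q)), key=q.__getitem__) for q in q_per_branch]

rng = random.Random(1)
weights = {"value": 0.5,
           "adv": [[rng.uniform(-1, 1) for _ in range(n)] for n in BRANCH_SIZES]}
state = [0.2, 0.7]
qs = dueling_branch_q(state, weights)
action = greedy_action(qs)
```

Even in this toy setting the saving is visible: a flat Q-network would need 4 × 3 × 5 = 60 outputs, while the branched one needs only 4 + 3 + 5 = 12.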
Finally, a joint vRAN and multi-access edge computing (MEC) orchestration is proposed to jointly control the vRAN splits, the resources and hosting locations of the vRAN/MEC services, and the routing for each data flow. The goal is to minimize the long-term network operation cost and maximize the MEC performance criterion while adapting to vRAN/MEC demands and resource availability. A Bayesian framework of deep RL is proposed to solve this problem; numerical evaluations show that it is data-efficient and improves the learning performance of its non-Bayesian counterpart.
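To convey why a Bayesian treatment can be data-efficient, here is a generic Thompson-sampling sketch, not the thesis framework: each candidate orchestration action keeps a Gaussian posterior over its long-term cost, an action is chosen by sampling from the posteriors and taking the lowest sampled cost, and the observed cost updates the posterior. All numbers (costs, noise, step count) are illustrative.

```python
import random

rng = random.Random(0)
NUM_ACTIONS = 3
TRUE_COST = [1.0, 0.4, 0.8]   # hidden mean cost per action (illustrative)

# Gaussian posterior per action: [mean, variance], observation noise var = 1.
post = [[0.0, 100.0] for _ in range(NUM_ACTIONS)]

def select_action():
    # Thompson sampling: draw one cost sample per posterior, pick the lowest.
    samples = [rng.gauss(m, v ** 0.5) for m, v in post]
    return min(range(NUM_ACTIONS), key=samples.__getitem__)

def update(action, cost):
    # Conjugate Gaussian update with unit observation noise.
    m, v = post[action]
    new_v = 1.0 / (1.0 / v + 1.0)
    new_m = new_v * (m / v + cost)
    post[action] = [new_m, new_v]

for _ in range(500):
    a = select_action()
    update(a, TRUE_COST[a] + rng.gauss(0, 0.1))

best = min(range(NUM_ACTIONS), key=lambda i: post[i][0])
```

Because exploration is driven by posterior uncertainty rather than blanket randomness, the sampler concentrates trials on promising actions quickly, which is the intuition behind the data-efficiency the thesis reports for its Bayesian deep RL framework.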
Last updated: 8.10.2024