Session 10: PHY-3: ML/AI

Thursday, 9 June 2022, 10:30-12:00

(room tbd)

Session Chair: TBD

Reinforcement Learning for Delay Sensitive Uplink Outer-Loop Link Adaptation

Petteri Kela and Thomas Höhne (Nokia, Finland); Teemu Veijalainen (Nokia Bell Labs, Finland); Hussein Abdulrahman (Nokia, Finland)
Link adaptation (LA), that is, selecting the modulation and coding scheme (MCS), is the process in which the transmission format is adjusted to prevailing channel conditions to balance spectral efficiency and reliability. New 5G use cases with increased reliability and low-latency demands, together with limited user equipment (UE) transmission power budgets, make this task ever more challenging, particularly in adverse propagation environments and interference scenarios. Machine learning-based algorithms for link adaptation usually aim at maximizing spectral efficiency. In this paper we study extended reality (XR) uplink traffic and show how online reinforcement learning (RL) can be applied in LA to meet XR’s stringent demands for uplink traffic. In particular, we propose practical Q-learning-based and deep Q-network-based outer-loop link adaptation (OLLA) algorithms that aim to use a minimal amount of radio resources for delivering packets within the packet delay budget (PDB). By striving for low radio resource usage, greediness problems related to multi-agent RL can be mitigated. Realistic system-level simulations confirm that the proposed algorithms outperform traditional OLLA when high reliability and low delays are required. It is also shown that the challenging requirements set by XR uplink traffic can be met.
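
For orientation, the "traditional OLLA" baseline mentioned in the abstract is commonly realized as a step-up/step-down correction of the measured SINR driven by HARQ feedback. The sketch below illustrates that textbook rule only; it is not the authors' RL algorithm. The function names, the default step size, the BLER target, and the SINR thresholds are all assumptions chosen for illustration.

```python
def olla_step(offset_db, ack, bler_target=0.1, step_down_db=1.0):
    """Return the updated SINR offset (dB) after one HARQ feedback.

    Assumed convention: the offset is added to the measured SINR, so an
    ACK makes MCS selection slightly more aggressive and a NACK makes it
    more conservative.
    """
    # The step ratio is chosen so that the offset is stationary when the
    # long-run NACK rate equals bler_target.
    step_up_db = step_down_db * bler_target / (1.0 - bler_target)
    if ack:
        return offset_db + step_up_db
    return offset_db - step_down_db


def select_mcs(measured_sinr_db, offset_db, thresholds_db):
    """Pick the highest MCS index whose (hypothetical) SINR threshold is met.

    thresholds_db must be sorted in ascending order; index 0 is the most
    robust MCS and is used as a fallback when no threshold is met.
    """
    effective_sinr_db = measured_sinr_db + offset_db
    mcs = 0
    for index, threshold in enumerate(thresholds_db):
        if effective_sinr_db >= threshold:
            mcs = index
    return mcs
```

With a 10% BLER target and a 1 dB down-step, each ACK raises the offset by about 0.11 dB, so nine ACKs are needed to undo one NACK.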

Learning-Based Remote Radio Head Selection and Localization in Distributed Antenna System

Artan Salihu (Institute of Telecommunications, Technische Universität (TU) Wien & Christian Doppler Laboratory for Dependable Wireless Connectivity for the Society in Motion, Austria); Stefan Schwarz (TU Wien & CD-Lab Society in Motion, Austria); Markus Rupp (TU Wien, Austria)
In this work, we consider estimating user positions in a spatially distributed antenna system (DAS) from the uplink channel state information (CSI). However, as the number of remote radio heads (RRHs) grows, collecting CSI at a central unit (CU) can significantly increase the fronthaul overhead and the computational complexity of the CU. This problem can be mitigated by selecting a subset of RRHs. We therefore present a deep learning-based approach to select a subset of RRHs for wireless localization. We employ an RRH selection layer that is trained jointly with the rest of the network, so that the model parameters and the set of selected RRHs are learned together. We show that the selection strategy comes at a relatively small cost in localization performance. Moreover, compared with a trivial approach based on maximizing the channel gain, we show that the proposed method leads to significant performance gains in a propagation environment dominated by non-line-of-sight.
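
As a rough illustration of the gain-based baseline the abstract compares against, the sketch below keeps the RRHs with the largest channel gain. It assumes the gain of an RRH is the summed squared magnitude of its CSI coefficients; the function names and the data layout (one list of complex coefficients per RRH) are hypothetical and not taken from the paper.

```python
def channel_gain(csi_coefficients):
    """Sum of squared magnitudes of one RRH's complex CSI coefficients."""
    return sum(abs(h) ** 2 for h in csi_coefficients)


def select_rrhs_by_gain(csi_per_rrh, num_selected):
    """Return the indices of the num_selected RRHs with the largest gain.

    csi_per_rrh: list with one entry per RRH, each a list of complex
    channel coefficients (assumed layout). The result is sorted by index.
    """
    gains = [channel_gain(csi) for csi in csi_per_rrh]
    # Rank RRH indices from strongest to weakest gain.
    ranked = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    return sorted(ranked[:num_selected])
```

For example, with three RRHs whose gains are 1, 0.5 and 5, `select_rrhs_by_gain(csi, 2)` keeps RRHs 0 and 2.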

Authentication at the Physical Layer with Cooperative Communications and Machine Learning

Linda Senigagliesi, Marco Baldi and Ennio Gambi (Università Politecnica delle Marche, Italy)
Physical layer authentication allows users to be authenticated solely on the basis of the peculiar characteristics of their transmission channels. With the aim of enhancing the authentication accuracy, cooperative communications are studied, considering a set of nodes located between the supplicant and the authenticator. Relay nodes in the system can behave in different ways, passively or actively cooperating to establish whether or not to forward messages to the final recipient. The presence of cooperating nodes provides the authenticator with a large number of features, which can be exploited and combined to discriminate the source of packets using both statistical and machine learning-based techniques. We also consider two basic power allocation strategies, and provide performance assessments based on both theoretical arguments and numerical simulations.

An Efficient Actor Critic DRL Framework for Resource Allocation in Multi-Cell Downlink NOMA

Abdullah S. Alajmi (Queen Mary University of London & Prince Sattam Bin Abdulaziz University, United Kingdom (Great Britain)); Waleed Ahsan (Queen Mary University of London, United Kingdom (Great Britain))
In this paper, a tractable framework for downlink non-orthogonal multiple access (NOMA) is proposed based on a model-free reinforcement learning (RL) approach for dynamic resource allocation in a multi-cell network structure. With the aid of actor critic deep reinforcement learning (ACDRL), we optimize the active power allocation for multi-cell NOMA systems in an online environment to maximize the long-term sum rate. To exploit the dynamic nature of NOMA, this work utilizes the instantaneous data rate for designing the dynamic reward. The state space in ACDRL contains all possible resource allocation realizations depending on a three-dimensional association among users, base stations, and sub-channels. We propose an ACDRL algorithm with this transformed state space, which scales to different network loads by utilizing multiple deep neural networks. Lastly, the simulation results validate that the proposed solution for multi-cell NOMA outperforms conventional RL and DRL algorithms as well as orthogonal multiple access (OMA) schemes in terms of the evaluated long-term sum rate.

[RAS] Learning-Based Orchestration for Dynamic Functional Split and Resource Allocation in vRANs

Fahri Wisnu Murti and Samad Ali (University of Oulu, Finland); George Iosifidis (Delft University of Technology, The Netherlands); Matti Latva-aho (University of Oulu, Finland)
One of the key benefits of virtualized radio access networks (vRANs) is network management flexibility. However, this versatility raises previously unseen network management challenges. In this paper, a learning-based zero-touch vRAN orchestration framework (LOFV) is proposed to jointly select the functional splits and allocate the virtualized resources to minimize the long-term management cost. First, testbed measurements of the relationship between the users’ demand and the virtualized resource utilization are collected using a centralized RAN system. The collected data reveals that there are non-linear and non-monotonic relationships between demand and resource utilization. Then, a comprehensive cost model is proposed that takes resource overprovisioning, declined demand, instantiation and reconfiguration into account. Moreover, the proposed cost model also captures different routing and computing costs for each split. Motivated by our measurement insights and cost model, LOFV is developed using a model-free reinforcement learning paradigm. The proposed solution is constructed from a combination of deep Q-learning and a regression-based neural network that maps the network state and users’ demand into split and resource control decisions. Our numerical evaluations show that LOFV can offer cost savings of up to 69% relative to the optimal static policy and up to 45% relative to the optimal fully dynamic policy.
