AI4C4 – AI/ML Solutions for Communications
Thursday, 4 June 2026, 11:00-12:30, room Sala 4 (1st floor)
Session Chair: German Corrales Madueno (Keysight Technologies, DK)
Multi-Agent Deep Reinforcement Learning-Based Handover Decision Method Incorporating a Dual Active Protocol Stack
Nam-I Kim (Electronics and Telecommunications Research Institute, Korea (South)); Hyung-sub Kim, Jeehyeon Na and Dongseok Roh (ETRI, Korea (South))
In this paper, we propose a multi-agent deep reinforcement learning-based dual active protocol stack handover decision method. With the recent arrival of the 5G era, heterogeneous network approaches have emerged. Consequently, decision-making to support the mobility of user equipment (UE), which must be handled by network cores and base stations, inevitably becomes highly complex. Moreover, the dual active protocol stack handover, newly released by 3GPP, requires more complex calculations to provide a better user experience for UE. To address the complexity of handover decision problems, we applied the recently popular deep reinforcement learning (DRL) method. DRL enables network operators to benefit from an artificial intelligence system that learns and decides autonomously, eliminating the need to devise complex handover policies or algorithms. To evaluate the method in a more realistic environment, we also implemented the Madrid grid, a highly realistic model based on actual cities. Learning and simulation results obtained in the Madrid grid environment are provided to validate the proposed approach.
Deep Reinforcement Learning for Throughput Optimization in NCJT Mobile Multi-Panel Devices
Robin Flemming and Philipp Schulz (Technische Universität Dresden, Germany); Rakash SivaSiva Ganesan and Samer Bazzi (Nokia, Germany); Luis Suárez (Nokia Technologies, Germany); Gerhard P. Fettweis (Technische Universität Dresden, Germany)
Ensuring high throughput and stable connectivity for mobile user equipment (UEs) with multiple receiver antenna panels is a key challenge in modern mobile networks. It is well-established that current handover mechanisms, based on procedures from the Third Generation Partnership Project (3GPP), exhibit performance limitations, especially in scenarios with high mobility at cell edges or at large distances from the base station (BS). In these areas, the strong influence of shadowing is known to cause an excessive number of unnecessary handovers (HOs), which significantly reduce throughput and reliability. To overcome this limitation, we introduce a novel framework based on Deep Q-Networks (DQN) to optimize handover decisions in Non-Coherent Joint Transmission (NCJT) mobile scenarios. Our approach enables the UE to control the selection of the transmission reception points (TRPs) autonomously and intelligently. We design and train the DQN for two different operating modes: one designed for maximum throughput and one for maximum reliability. The simulation results confirm that the DQN-based approach increases the overall performance compared to the 3GPP legacy procedures in critical scenarios by up to 60%. These results establish deep reinforcement learning as a powerful tool for adaptive mobility management of multi-panel UEs and open up promising application possibilities for future wireless systems that require intelligent and context-sensitive TRP selection.
DNN-Based Nulling Control Beam Focusing for Near-Field Multi-User Interference Mitigation
Mohammadhossein Karimi, Yuanzhe Gong and Tho Le-Ngoc (McGill University, Canada)
This paper proposes a deep learning-based framework for near-field nulling control beam focusing (NCBF) in extra-large MIMO (XL-MIMO) systems to mitigate multi-user interference (MUI). A dual-estimator architecture comprising two fully connected deep neural networks (FCDNNs) is developed to separately predict the phase and magnitude components of NCBF weights, using locations of both desired and interfering users. The models are trained on a large dataset generated via a Linearly Constrained Minimum Variance (LCMV) beamforming algorithm to accommodate diverse user configurations, including both collinear and non-collinear scenarios. Illustrative results demonstrate that the proposed DNN models achieve high prediction accuracy, with test errors of only 0.067 radians for phase estimation and 0.206 dB for magnitude estimation. Full-wave simulations incorporating realistic element radiation patterns and inter-element coupling confirm the close agreement between the beam patterns produced by the DNN-predicted and LCMV-based NCBF schemes under practical deployment conditions. An average MUI suppression of 36.7 dB is achieved, with interference mitigation exceeding 17.5 dB across all tested cases. The proposed approach enables scalable and real-time beam focusing with effective interference suppression, offering a promising solution for future near-field multi-user wireless communications.
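As a rough illustration of the LCMV step mentioned in the abstract above, the following minimal sketch computes beam-focusing weights that keep unit gain at a desired user while placing a null at an interfering user. It is not the authors' dataset generator: the uniform linear array, the wavelength, the user positions, the identity covariance matrix, and the function names `near_field_steering` and `lcmv_weights` are all assumptions chosen for illustration. The weights follow the standard LCMV closed form w = R^{-1} C (C^H R^{-1} C)^{-1} f.

```python
import numpy as np


def near_field_steering(positions, user_xy, wavelength):
    """Spherical-wave (near-field) steering vector.

    positions: (N, 2) array of element coordinates in metres (assumed geometry).
    user_xy:   (x, y) location of the user in metres.
    """
    distances = np.linalg.norm(positions - np.asarray(user_xy), axis=1)
    return np.exp(-2j * np.pi * distances / wavelength)


def lcmv_weights(R, C, f):
    """Standard LCMV solution w = R^{-1} C (C^H R^{-1} C)^{-1} f.

    R: (N, N) covariance matrix, C: (N, K) constraint matrix,
    f: (K,) desired responses, so that C^H w = f.
    """
    Rinv_C = np.linalg.solve(R, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)


# Hypothetical setup: 64-element half-wavelength ULA along the x-axis.
wavelength = 0.01  # metres (assumed)
n_elements = 64
x = (np.arange(n_elements) - (n_elements - 1) / 2) * wavelength / 2
positions = np.stack([x, np.zeros(n_elements)], axis=1)

# Collinear users at different ranges, both within the array's near field.
a_desired = near_field_steering(positions, (0.0, 2.0), wavelength)
a_interf = near_field_steering(positions, (0.0, 4.0), wavelength)

C = np.stack([a_desired, a_interf], axis=1)
f = np.array([1.0, 0.0], dtype=complex)   # unit gain, then a null
R = np.eye(n_elements, dtype=complex)     # assumed white-noise covariance

w = lcmv_weights(R, C, f)
gain_desired = abs(w.conj() @ a_desired)
gain_interf = abs(w.conj() @ a_interf)
print(f"gain at desired user: {gain_desired:.6f}")
print(f"gain at interferer:   {gain_interf:.2e}")
```

In a setup like the one the abstract describes, weights of this kind would serve as training targets, with their phase and magnitude predicted separately from the user locations; the sketch covers only the LCMV computation under the idealised assumptions listed above.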
Optimisation of Resource Allocation in Heterogeneous Wireless Networks Using Deep Reinforcement Learning
Oluwaseyi E Giwa (African Institute for Mathematical Sciences, South Africa & University of Cape Town, South Africa); Jonathan Shock (University of Cape Town, South Africa); Jaco Du Toit (Stellenbosch University, South Africa & Vodacom Group Limited, South Africa); Tobi Ebenezer Awodumila (University of Cape Town, South Africa)
Dynamic resource allocation in Open RAN (O-RAN) HetNets presents a complex optimisation challenge under varying user loads. We propose a Near-Real-Time RAN Intelligent Controller (Near-RT RIC) xApp utilising Deep Reinforcement Learning (DRL) to jointly optimise transmit power, bandwidth slicing, and user scheduling. Leveraging real-world network topologies, we benchmark Proximal Policy Optimisation (PPO) and Twin Delayed Deep Deterministic Policy Gradient (TD3) against standard heuristics. Our results demonstrate that the PPO-based xApp achieves a superior trade-off, reducing network energy consumption by up to 70% in dense scenarios while improving user fairness by over 30% compared to throughput-greedy baselines. These findings validate the feasibility of centralised, energy-aware AI orchestration in future 6G architectures.
A Deep Reinforcement Learning Approach to Active Channel Sounding in 6G Networks
George J. Stamatakis (Foundation for Research and Technology-Hellas – FORTH, Greece); Grigorios Tsagkatakis and Nikolaos Petroulakis (FORTH, Greece)
Timely knowledge of channel state information at the receiver’s side will have a profound impact on the performance of 6G networks. Acquiring this information through active channel sounding introduces overhead to the network and can lead to excessive energy and bandwidth consumption. In this work, we formulate the problem of active channel sounding as a constrained sequential decision problem in which the base station periodically decides whether or not to transmit a Channel State Information Reference Signal to a user equipment associated with it. The objective of the base station is to maximize its throughput under hard resource constraints. We express the problem as a Partially Observed Markov Decision Process and utilize Deep Reinforcement Learning techniques to derive an approximately optimal channel sounding policy. Numerical results indicate that the derived policy can improve the base station’s throughput compared to baseline policies, especially in resource-constrained scenarios.























