Next-gen resource optimization in NB-IoT networks: Harnessing soft actor-critic reinforcement learning

Cited by: 1

Authors:

Anbazhagan, S. [1]

Mugelan, R. K. [1]

Affiliations:

[1] Vellore Inst Technol, Sch Elect Engn, Dept Commun Engn, Vellore 632014, Tamil Nadu, India

Source:

COMPUTER NETWORKS | 2024 / Volume 252

Keywords:

Narrowband Internet of Things (NB-IoT); Resource allocation; Reinforcement learning; Soft actor-critic (SAC); D2D COMMUNICATION; ALLOCATION; UPLINK; PERFORMANCE; ADAPTATION; DOWNLINK

DOI:

10.1016/j.comnet.2024.110670

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Resource allocation in Narrowband Internet of Things (NB-IoT) networks is a complex challenge due to dynamic user demands, variable channel conditions, and distance considerations. Traditional approaches often struggle to adapt to the dynamic nature of these environments. In this study, we leverage reinforcement learning (RL) to address the intricate nature of NB-IoT resource allocation. Specifically, we employ the Soft Actor-Critic (SAC) algorithm, comparing its performance against conventional RL algorithms such as Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). The SAC algorithm is employed to train an agent for adaptive resource allocation, considering energy efficiency, throughput, latency, fairness, and interference constraints. The agent balances these objectives through a reward structure with penalty mechanisms. Through comprehensive analysis, we present performance metrics, including total reward, energy efficiency, throughput, fairness, and latency, showcasing the efficacy of SAC when compared to DQN and PPO. Our findings underscore the efficiency of SAC in optimizing resource allocation in NB-IoT networks, offering a promising solution to the complexities inherent in such dynamic environments.

Resource allocation in Narrowband Internet of Things (NB-IoT) networks presents a complex challenge due to dynamic user demands, variable channel conditions, and distance considerations. Traditional approaches often struggle to adapt to these dynamic environments. This study leverages reinforcement learning (RL), specifically the Soft Actor-Critic (SAC) algorithm, to address the intricacies of NB-IoT resource allocation. We compare SAC's performance against conventional RL algorithms, including Deep Q-Network (DQN) and Proximal Policy Optimization (PPO).
The SAC algorithm is utilized to train an agent for adaptive resource allocation, focusing on energy efficiency, throughput, latency, fairness, interference constraints, recovery time, and long-term performance stability. To demonstrate the scalability and effectiveness of SAC, we conducted experiments on NB-IoT networks with varying deployment types and configurations, including standard urban and suburban, high-density urban, industrial IoT, rural and low-density, and IoT service providers. To assess generalization capability, we tested SAC across applications like smart metering, smart cities, smart agriculture, and asset tracking & management. Our comprehensive analysis demonstrates that SAC significantly outperforms DQN and PPO across multiple performance metrics. Specifically, SAC improves energy efficiency by 5.60% over PPO and 10.25% over DQN. In terms of latency, SAC achieves a marginal reduction of approximately 0.0124% compared to PPO and 0.0126% compared to DQN. SAC enhances throughput by 214.98% over PPO and 15.72% over DQN. Additionally, SAC shows a substantial increase in fairness (Jain's index), improving by 358.31% over PPO and 614.46% over DQN. SAC also demonstrates superior recovery time, improving by 18.99% over PPO and 25.07% over DQN. In both deployment scenarios and diverse IoT applications, SAC consistently achieves high total rewards, minimal fluctuations, and stable performance. Energy efficiency remains constant at 7.2 bits per Joule, and latency is approximately 0.080 s. Throughput is robust across different deployments, while fairness remains high, ensuring equitable resource allocation. Recovery times are stable, enhancing operational reliability. These results underscore SAC's efficiency and robustness in optimizing resource allocation in NB-IoT networks, presenting a promising solution to the complexities of dynamic environments.
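
The fairness and percentage-improvement figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of Jain's index, (sum x)^2 / (n * sum x^2), and a plain relative-change formula; the function names and the sample allocations are hypothetical and are not taken from the paper, whose exact computation is not given in this record.

```python
from typing import Sequence


def jain_index(allocations: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Ranges from 1/n (one user gets everything) to 1.0 (equal shares).
    """
    n = len(allocations)
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero allocation")
    return total * total / (n * squares)


def relative_improvement(new: float, baseline: float) -> float:
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical per-device throughputs (illustrative values only).
equal_shares = jain_index([2.0, 2.0, 2.0, 2.0])   # perfectly fair
one_device = jain_index([8.0, 0.0, 0.0, 0.0])     # maximally unfair for n = 4

print(equal_shares, one_device)
print(relative_improvement(equal_shares, one_device))
```

Because the index is bounded below by 1/n, improvements of several hundred percent, as reported for fairness, are only possible when the baseline policy concentrates resources on a small share of devices.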

Pages: 42
