Spectrum Prediction With Deep 3D Pyramid Vision Transformer Learning

Cited by: 1

Authors:

Pan, Guangliang [1]

Wu, Qihui [1]

Zhou, Bo [1]

Li, Jie [1]

Wang, Wei [1]

Ding, Guoru [2]

Yau, David K. Y. [3]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing 211106, Peoples R China

[2] Army Engn Univ, Coll Commun Engn, Nanjing 210007, Peoples R China

[3] Singapore Univ Technol & Design, Pillar Informat Syst Technol & Design, Singapore 487372, Singapore

Source:

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS | 2025 / Vol. 24 / No. 01

Funding:

National Research Foundation, Singapore; National Natural Science Foundation of China;

Keywords:

Three-dimensional displays; Hidden Markov models; Feature extraction; Transformers; Spectrogram; Convolution; Wireless communication; Predictive models; Monitoring; Autoregressive processes; Spectrum prediction; 3D vision transformer; pyramid; 3D convolutional layer; transfer learning; NETWORKS;

DOI:

10.1109/TWC.2024.3495812

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In this paper, we propose a deep learning (DL)-based task-driven spectrum prediction framework, named DeepSPred. DeepSPred comprises a feature encoder and a task predictor: the encoder extracts spectrum usage pattern features, and the predictor configures different networks according to the task requirements to predict the future spectrum. Based on DeepSPred, we first propose a novel 3D spectrum prediction method, named 3D-SwinSTB, which combines a flow processing strategy with a 3D vision Transformer (ViT, i.e., Swin) and a pyramid to serve possible applications such as spectrum monitoring tasks. 3D-SwinSTB's unique 3D Patch Merging ViT-to-3D ViT Patch Expanding and pyramid designs help the model accurately learn the potential correlation of the evolution of the spectrogram over time. Then, we propose a novel spectrum occupancy rate (SOR) method, named 3D-SwinLinear, by redesigning a predictor consisting exclusively of 3D convolutional and linear layers to serve possible applications such as dynamic spectrum access (DSA) tasks. Unlike 3D-SwinSTB, which outputs a spectrogram, 3D-SwinLinear projects the spectrogram directly as the SOR. Finally, we employ transfer learning (TL) to ensure the applicability of our two methods to diverse spectrum services. The results show that our 3D-SwinSTB outperforms recent benchmarks by more than 5%, while our 3D-SwinLinear achieves a 90% accuracy, with a performance improvement exceeding 10%.


Pages: 509-525

Page count: 17

