Spectrum Prediction With Deep 3D Pyramid Vision Transformer Learning

Cited by: 1
Authors
Pan, Guangliang [1]
Wu, Qihui [1]
Zhou, Bo [1]
Li, Jie [1]
Wang, Wei [1]
Ding, Guoru [2]
Yau, David K. Y. [3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing 211106, Peoples R China
[2] Army Engn Univ, Coll Commun Engn, Nanjing 210007, Peoples R China
[3] Singapore Univ Technol & Design, Pillar Informat Syst Technol & Design, Singapore 487372, Singapore
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Hidden Markov models; Feature extraction; Transformers; Spectrogram; Convolution; Wireless communication; Predictive models; Monitoring; Autoregressive processes; Spectrum prediction; 3D vision transformer; pyramid; 3D convolutional layer; transfer learning; NETWORKS;
DOI
10.1109/TWC.2024.3495812
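Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code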
0808 ; 0809 ;
Abstract
In this paper, we propose a deep learning (DL)-based, task-driven spectrum prediction framework named DeepSPred. DeepSPred comprises a feature encoder and a task predictor: the encoder extracts spectrum usage pattern features, and the predictor configures different networks according to the task requirements to predict the future spectrum. Based on DeepSPred, we first propose a novel 3D spectrum prediction method, named 3D-SwinSTB, that combines a flow processing strategy with a 3D vision Transformer (ViT, i.e., Swin) and a pyramid to serve possible applications such as spectrum monitoring. The unique 3D Patch Merging ViT-to-3D ViT Patch Expanding and pyramid designs of 3D-SwinSTB help the model accurately learn the potential correlations in how the spectrogram evolves over time. Then, we propose a novel spectrum occupancy rate (SOR) prediction method, named 3D-SwinLinear, by redesigning the predictor to consist exclusively of 3D convolutional and linear layers, to serve possible applications such as dynamic spectrum access (DSA). Unlike 3D-SwinSTB, which outputs a spectrogram, 3D-SwinLinear projects the spectrogram directly to the SOR. Finally, we employ transfer learning (TL) to ensure the applicability of our two methods to diverse spectrum services. The results show that 3D-SwinSTB outperforms recent benchmarks by more than 5%, while 3D-SwinLinear achieves 90% accuracy, a performance improvement exceeding 10%.
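The encoder-plus-predictor split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the averaging "encoder", the function names, the dBm threshold, and the definition of SOR as the fraction of time-frequency bins above a power threshold are all assumptions made for illustration. It only shows how one shared encoder might feed either a spectrogram-output head or an SOR-output head.

```python
from typing import Callable, List

# Assumed layout: rows are time slots, columns are frequency channels (power in dBm).
Spectrogram = List[List[float]]


def occupancy_rate(spectrogram: Spectrogram, threshold_dbm: float) -> float:
    """Fraction of bins whose power exceeds the threshold (assumed SOR definition)."""
    bins = [p for row in spectrogram for p in row]
    if not bins:
        raise ValueError("empty spectrogram")
    return sum(p > threshold_dbm for p in bins) / len(bins)


def toy_encoder(history: List[Spectrogram]) -> Spectrogram:
    """Hypothetical stand-in for the feature encoder: bin-wise mean of past frames."""
    n = len(history)
    rows, cols = len(history[0]), len(history[0][0])
    return [[sum(f[r][c] for f in history) / n for c in range(cols)]
            for r in range(rows)]


def spectrogram_head(features: Spectrogram) -> Spectrogram:
    """Monitoring-style head: returns a predicted spectrogram."""
    return features


def make_sor_head(threshold_dbm: float) -> Callable[[Spectrogram], float]:
    """DSA-style head: collapses the features to a single occupancy rate."""
    return lambda features: occupancy_rate(features, threshold_dbm)


def predict(history: List[Spectrogram], encoder: Callable, head: Callable):
    """Task-driven prediction: the same encoder, with a head chosen per task."""
    return head(encoder(history))


# Two past 2x2 frames (made-up values, in dBm).
history = [
    [[-90.0, -60.0], [-95.0, -50.0]],
    [[-100.0, -70.0], [-85.0, -50.0]],
]
next_frame = predict(history, toy_encoder, spectrogram_head)
sor = predict(history, toy_encoder, make_sor_head(-80.0))
print(next_frame)  # [[-95.0, -65.0], [-90.0, -50.0]]
print(sor)         # 0.5
```

Swapping only the head while reusing the encoder mirrors the task-driven idea in the abstract; a learned model would replace both toy functions.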
Pages: 509-525
Number of pages: 17