Cross-Domain Object Detection Algorithm for Complex End-to-End Scene Understanding

Authors
Chen, Aoran [1]
Huang, Hai [1]
Zhu, Yueyan [1]
Xue, Junsheng [1]
Affiliations
[1] School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China
Keywords
Computer vision - Convolutional neural networks - Image reconstruction - Multilayer neural networks - Object detection - Object recognition
DOI
10.13190/j.jbupt.2023-285
Abstract
Conventional deep learning training approaches often assume that the deployment scenario shares visual domain features with the training data. However, this assumption may not hold in complex end-to-end scenarios, making it difficult to meet the demands of intelligent detection services in open environments. In response, a cross-domain object detection algorithm based on artificial intelligence closed-loop ensemble theory is proposed. Within the detection framework, a backbone network and a bottleneck-layer network are constructed with multiscale convolutional layers. A visual domain discriminator featuring long-range dependency attention serves as a secondary detection head to refine the results. Moreover, a background focusing module based on spatial reconstruction attention units strengthens learning of pseudo-background representations, thereby improving the accuracy of cross-domain object detection. Experimental results show that, compared to two-stage algorithms, the proposed algorithm yields an average precision increase of 6.9%, and it surpasses single-stage algorithms by 9.0% in complex end-to-end scenarios. © 2024 Beijing University of Posts and Telecommunications. All rights reserved.
Pages: 57-62