Cross-Domain Object Detection Algorithm for Complex End-to-End Scene Understanding

Cited by: 0

Authors:

Chen, Aoran [1]

Huang, Hai [1]

Zhu, Yueyan [1]

Xue, Junsheng [1]

Affiliation:

[1] School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

Source:

Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications | 2024, Vol. 47, No. 04

Keywords:

Computer vision - Convolutional neural networks - Image reconstruction - Multilayer neural networks - Object detection - Object recognition

DOI:

10.13190/j.jbupt.2023-285

Abstract:

Conventional deep learning training approaches often assume that the deployment scenario shares the visual domain features of the training data. This assumption may not hold in complex end-to-end scenarios, which makes it difficult to meet the demands of intelligent detection services in open environments. In response, a cross-domain object detection algorithm based on artificial intelligence closed-loop ensemble theory is proposed. Within the detection framework, a backbone network and a bottleneck-layer network are constructed with multiscale convolutional layers. A visual domain discriminator with long-range dependency attention serves as a secondary detection head to refine the results. In addition, a background focusing module, built on spatial reconstruction attention units, strengthens learning of pseudo-background representations and thereby improves the accuracy of cross-domain object detection. Experimental results show that, in complex end-to-end scenarios, the proposed algorithm improves average precision by 6.9% over two-stage algorithms and by 9.0% over single-stage algorithms. © 2024 Beijing University of Posts and Telecommunications. All rights reserved.
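
The abstract reports its gains as average precision (AP) but does not say which AP variant was used. As a rough illustration of the metric only, the sketch below computes all-point interpolated AP for a single class, in the style of the Pascal VOC evaluation. The function name `average_precision` and its input format are hypothetical, and matching detections to ground-truth boxes (for example by an IoU threshold) is assumed to have been done beforehand. It is not taken from the paper.

```python
from typing import List, Tuple


def average_precision(
    detections: List[Tuple[float, bool]], num_ground_truth: int
) -> float:
    """All-point interpolated AP for one class (illustrative sketch).

    detections: (confidence, is_true_positive) pairs. Assumption: each
    detection was already matched against ground truth upstream.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Precision envelope: each point takes the best precision at any
    # equal-or-higher recall, which removes the zigzag in the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the envelope, summed over the steps where recall grows.
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Three ranked detections (TP, FP, TP) against two ground-truth objects.
    example = [(0.9, True), (0.8, False), (0.7, True)]
    print(round(average_precision(example, num_ground_truth=2), 4))
```

A detection benchmark would typically average this per-class value over all classes (and sometimes over several IoU thresholds) before comparing algorithms.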

Pages: 57-62
