UWV-Yolox: A Deep Learning Model for Underwater Video Object Detection

Cited by: 6

Authors:

Pan, Haixia [1]

Lan, Jiahua [1]

Wang, Hongqiang [1]

Li, Yanan [1]

Zhang, Meng [1]

Ma, Mojie [1]

Zhang, Dongdong [1]

Zhao, Xiaoran [1]

Affiliations:

[1] Beihang Univ, Sch Software, Beijing 100191, Peoples R China

Source:

SENSORS | 2023 / Volume 23 / Issue 10

Keywords:

underwater video; object detection; coordinate attention; loss function; frame-level optimization;

DOI:

10.3390/s23104859

Chinese Library Classification (CLC) number:

O65 [Analytical Chemistry];

Subject classification codes:

070302 ; 081704 ;

Abstract:

Underwater video object detection is a challenging task due to the poor quality of underwater videos, including blurriness and low contrast. In recent years, Yolo series models have been widely applied to underwater video object detection. However, these models perform poorly on blurry and low-contrast underwater videos. Additionally, they fail to account for the contextual relationships between frame-level results. To address these challenges, we propose a video object detection model named UWV-Yolox. First, the Contrast Limited Adaptive Histogram Equalization method is used to augment the underwater videos. Then, a new CSP_CA module is proposed by adding Coordinate Attention to the backbone of the model to augment the representations of objects of interest. Next, a new loss function is proposed, including regression and jitter loss. Finally, a frame-level optimization module is proposed to optimize the detection results by utilizing the relationship between neighboring frames in videos, improving the video detection performance. To evaluate the performance of our model, we conduct experiments on the UVODD dataset built in the paper and select mAP@0.5 as the evaluation metric. The mAP@0.5 of the UWV-Yolox model reaches 89.0%, which is 3.2% higher than that of the original Yolox model. Furthermore, compared with other object detection models, the UWV-Yolox model has more stable predictions for objects, and our improvements can be flexibly applied to other models.
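
Note on the metric: the abstract reports mAP@0.5, in which a predicted box is conventionally counted as a correct detection only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below illustrates that matching criterion only; it is not taken from the paper, and the names `Box`, `iou`, and `is_true_positive` are hypothetical. Boxes are assumed to be given as `(x1, y1, x2, y2)` corner coordinates.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Guard against degenerate zero-area inputs.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: the IoU >= threshold test behind mAP@0.5."""
    return iou(pred, truth) >= threshold
```

For example, a prediction shifted by half its width against a same-sized ground-truth box has an IoU of 1/3 and would not count as a match at the 0.5 threshold. A full mAP computation additionally ranks detections by confidence and averages precision over recall levels and classes, which this sketch leaves out.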


Pages: 19

