Shuffled Grouping Cross-Channel Attention-based Bilateral-Filter-Interpolation Deformable ConvNet with Applications to Benthonic Organism Detection

Cited by: 1
Authors
Chen, T. [1]
Wang, N. [2]
Affiliations
[1] School of Marine Electrical Engineering, Dalian Maritime University, Dalian
[2] School of Marine Engineering and the Dalian Key Laboratory of Green Power Control and Test for Intelligent Ships, Dalian Maritime University, Dalian
Source
IEEE Transactions on Artificial Intelligence
Funding
National Natural Science Foundation of China
Keywords
Artificial intelligence; benthonic organism detection; bilateral-filter-interpolation deformable ConvNet; color; convolution; deep learning; feature extraction; interpolation; kernel; organisms; shuffled grouping cross-channel attention
DOI
10.1109/TAI.2024.3385387
Abstract
In this paper, to holistically tackle underwater detection degradation caused by unknown geometric variations arising from scale, pose, viewpoint, and occlusion under low-contrast and color-distortion conditions, a shuffled grouping cross-channel attention-based bilateral-filter-interpolation deformable ConvNet (SGCA-BDC) framework is established for benthonic organism detection. The main contributions are as follows: 1) By comprehensively considering spatial and feature similarities between offset and integer coordinate positions, a bilateral-filter-interpolation deformable ConvNet (BDC) with a modulation-weight mechanism is created, so that the sampling ability of the convolutional kernel for benthonic organisms with unknown geometric variations is adaptively augmented from the spatial perspective. 2) By utilizing 1-D convolution to recalibrate channel weights for grouped subfeatures via an information-entropy statistic, a shuffled grouping cross-channel attention (SGCA) module is devised, so that seabed background noise is suppressed from the channel aspect. 3) The proposed SGCA-BDC scheme is eventually built in an organic manner by incorporating the BDC and SGCA modules. Comprehensive experiments and comparisons demonstrate that the SGCA-BDC scheme remarkably outperforms typical detection approaches, including Faster R-CNN, SSD, YOLOv6, YOLOv7, YOLOv8, RetinaNet, and CenterNet, in terms of mean average precision by 8.54%, 4.4%, 5.18%, 3.1%, 3.01%, 12.53%, and 7.09%, respectively.
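As a rough illustration of the bilateral-filter-interpolation idea in contribution 1), the sketch below samples a feature map at a fractional (offset) position by weighting its four integer-grid neighbours with both a spatial term and a feature-similarity term, in place of the plain bilinear interpolation used in standard deformable convolution. The function name, the Gaussian weighting form, and the parameters sigma_s and sigma_r are illustrative assumptions, not the paper's exact formulation.

    import torch

    def bilateral_interp(feat, px, py, sigma_s=1.0, sigma_r=1.0):
        # Sample feat (C, H, W) at the fractional position (px, py).
        # Each of the four integer-grid neighbours is weighted by a spatial
        # Gaussian (distance to the sampling point) and a range Gaussian
        # (feature similarity), in the spirit of a bilateral filter.
        H, W = feat.shape[-2:]
        x0, y0 = torch.floor(px), torch.floor(py)
        neighbours = [(x0, y0), (x0 + 1, y0), (x0, y0 + 1), (x0 + 1, y0 + 1)]
        # crude feature estimate at the fractional point: nearest neighbour
        xc = px.round().clamp(0, W - 1).long()
        yc = py.round().clamp(0, H - 1).long()
        f_c = feat[:, yc, xc]                                   # (C,)
        weights, values = [], []
        for xn, yn in neighbours:
            xi, yi = xn.clamp(0, W - 1).long(), yn.clamp(0, H - 1).long()
            f_n = feat[:, yi, xi]                               # (C,)
            d_s = (px - xn) ** 2 + (py - yn) ** 2               # spatial distance
            d_r = ((f_c - f_n) ** 2).mean()                     # feature distance
            w = torch.exp(-d_s / (2 * sigma_s ** 2)) * torch.exp(-d_r / (2 * sigma_r ** 2))
            weights.append(w)
            values.append(f_n)
        w = torch.stack(weights)                                # (4,)
        v = torch.stack(values)                                 # (4, C)
        return (w[:, None] * v).sum(0) / (w.sum() + 1e-8)

    # example: bilateral_interp(torch.randn(16, 32, 32), torch.tensor(5.3), torch.tensor(7.8))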
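Likewise, a minimal sketch of the shuffled grouping cross-channel attention pattern in contribution 2), assuming a typical layout: channels are split into groups, each sub-feature is summarised by a per-channel spatial-entropy statistic, a lightweight 1-D convolution produces the channel weights, and a channel shuffle mixes information across groups. The class name, group count, entropy definition, and layer sizes are assumptions for illustration; the paper's exact statistic and structure may differ.

    import torch
    import torch.nn as nn

    class SGCA(nn.Module):
        # Sketch of a shuffled grouping cross-channel attention block:
        # group the channels, compute a per-channel entropy statistic for
        # each sub-feature, recalibrate the channels with a 1-D convolution,
        # then channel-shuffle so information crosses the groups.
        def __init__(self, channels, groups=4, k=3):
            super().__init__()
            assert channels % groups == 0
            self.groups = groups
            self.conv1d = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

        def _entropy(self, x):
            # x: (N, C, H, W) -> entropy of the spatial softmax per channel, (N, C, 1)
            n, c, _, _ = x.shape
            p = torch.softmax(x.reshape(n, c, -1), dim=-1)
            return -(p * torch.log(p + 1e-8)).sum(-1, keepdim=True)

        def forward(self, x):
            b, c, h, w = x.shape
            g = self.groups
            sub = x.reshape(b * g, c // g, h, w)            # grouped sub-features
            stat = self._entropy(sub)                       # (b*g, c//g, 1)
            attn = torch.sigmoid(
                self.conv1d(stat.transpose(1, 2)).transpose(1, 2))
            sub = sub * attn.unsqueeze(-1)                  # channel recalibration
            out = sub.reshape(b, g, c // g, h, w)
            return out.transpose(1, 2).reshape(b, c, h, w)  # channel shuffle

    # example: SGCA(64)(torch.randn(2, 64, 32, 32)).shape -> torch.Size([2, 64, 32, 32])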
Pages: 1-13
Number of pages: 12