Semantic segmentation of underwater images based on the improved SegFormer

Cited by: 0
Authors
Chen, Bowei [1, 2]
Zhao, Wei [1, 2]
Zhang, Qiusheng [3]
Li, Mingliang [3]
Qi, Mingyang [3]
Tang, You [3, 4, 5]
Affiliations
[1] Qingdao Innovat & Dev Base, Harbin, Peoples R China
[2] Harbin Engn Univ, Lab Underwater Intelligence, Qingdao, Peoples R China
[3] Jilin Agr Sci & Technol Univ, Elect & Informat Engn Coll, Jilin, Peoples R China
[4] Jilin Agr Univ, Coll Informat Technol, Changchun, Peoples R China
[5] Yanbian Univ, Coll Agr, Yanji, Peoples R China
Keywords
underwater images; semantic segmentation; attention mechanism; feature fusion; SegFormer
DOI
10.3389/fmars.2025.1522160
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Underwater image segmentation is essential for tasks such as underwater exploration, marine environmental monitoring, and resource development. However, because the underwater environment is complex and variable, improving model accuracy remains a key challenge. To address this challenge, this study presents a high-performance semantic segmentation approach for underwater images based on the standard SegFormer model. First, the Mix Transformer backbone in SegFormer is replaced with a Swin Transformer to enhance feature extraction and to capture global context information efficiently. Next, the Efficient Multi-scale Attention (EMA) mechanism is introduced in the backbone's downsampling stages and in the decoder to better capture multi-scale features, further improving segmentation accuracy. Finally, a Feature Pyramid Network (FPN) structure is incorporated into the decoder to combine feature maps at multiple resolutions, allowing the model to integrate contextual information effectively and making it more robust in complex underwater environments. Testing on the SUIM underwater image dataset shows that the proposed model achieves a mean Intersection over Union (MIoU) of 77.00%, a mean Recall (mRecall) of 85.04%, a mean Precision (mPrecision) of 89.03%, and a mean F1-score (mF1score) of 86.63%. Compared to the standard SegFormer, it improves MIoU by 3.73%, mRecall by 1.98%, mPrecision by 3.38%, and mF1score by 2.44%, with an increase of 9.89M parameters. The results demonstrate that the proposed method achieves superior segmentation accuracy with minimal additional computation.
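The four metrics reported in the abstract can all be derived from a per-class confusion matrix. The sketch below is a minimal illustration in plain Python, not the authors' evaluation code. The function names `confusion_matrix` and `mean_metrics` are hypothetical. It assumes each metric is computed per class and then averaged, and that a class with no pixels scores 0; the abstract does not state either convention.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_metrics(cm):
    """Return class-averaged IoU, recall, precision and F1 (assumed convention)."""
    n = len(cm)
    ious, recalls, precisions, f1s = [], [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class-c pixels predicted as something else
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        # Assumption: a class with an empty denominator contributes 0.
        iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        ious.append(iou)
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)
    return {
        "mIoU": sum(ious) / n,
        "mRecall": sum(recalls) / n,
        "mPrecision": sum(precisions) / n,
        "mF1": sum(f1s) / n,
    }


# Toy example: six flattened "pixels" and three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = mean_metrics(confusion_matrix(y_true, y_pred, num_classes=3))
print(scores)
```

In practice the label maps would be flattened segmentation masks accumulated over the whole test set before the metrics are averaged; the toy lists here only stand in for that data.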
Pages: 13