Dual Stream Encoder-Decoder Architecture with Feature Fusion Model for Underwater Object Detection

Authors
Nissar, Mehvish [1]
Mishra, Amit Kumar [2]
Subudhi, Badri Narayan [1]
Affiliations
[1] Indian Inst Technol Jammu, Dept Elect Engn, Jammu 181221, India
[2] Aberystwyth Univ, Fac Business & Phys Sci, Aberystwyth SY23 3FL, Wales
Keywords
underwater surveillance; object detection; deep learning; CNN; background subtraction; video surveillance; foreground segmentation; convolutional neural network
DOI
10.3390/math12203227
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Underwater surveillance is an emerging and fascinating area of research, particularly for monitoring aquatic ecosystems. It offers valuable insights into underwater behavior and activities, with applications across many domains. Specifically, underwater surveillance involves detecting and tracking moving objects in aquatic environments. However, the complex optical properties of water make object detection challenging. Background subtraction is a commonly used technique for detecting local changes in video scenes: it segments each image into background and foreground to isolate the object of interest. In this context, we propose a dual-stream encoder-decoder framework based on the VGG-16 and ResNet-50 models for detecting moving objects in underwater frames. The network includes a feature fusion module that extracts multi-level features. Trained end-to-end on a limited set of images, the proposed framework yields accurate results without post-processing. Its efficacy is confirmed through visual and quantitative comparisons with eight state-of-the-art methods on two standard databases. The first is the Underwater Change Detection Dataset, which comprises five challenges of approximately 1000 frames each, recorded under various underwater conditions. The second is the Fish4Knowledge dataset, from which we considered five challenges. Each category, recorded in a different aquatic setting, contains a varying number of frames, typically more than 1000. The proposed method outperforms all compared methods, attaining an average F-measure of 0.98 on the Underwater Change Detection Dataset and 0.89 on the Fish4Knowledge dataset.
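The abstract reports results as an average F-measure over foreground masks. As a rough illustration of how such a score is typically computed for a single frame, the following sketch compares a predicted binary mask with a ground-truth mask. It is not taken from the paper: the function name `f_measure`, the nested-list mask representation, and the choice to return 0.0 when there are no true positives are all assumptions made for this example.

```python
def f_measure(pred, truth):
    """F-measure between two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; the paper's exact evaluation
    protocol is not described in this record.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # foreground pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel marked as foreground
            elif t and not p:
                fn += 1          # foreground pixel missed
    if tp == 0:
        # Assumed convention: score 0.0 when nothing is correctly detected.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 1],
         [0, 0, 1]]
score = f_measure(pred, truth)
```

A dataset-level figure such as those quoted above would presumably average per-frame or per-category scores; which averaging the authors used is not stated here.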
Pages: 22
Related papers
  • [1] Road Semantic Segmentation and Traffic Object Detection Model Based on Encoder-Decoder CNN Architecture
    Wang, Yih-Chen
    Yu, Chao-Wei
    Lu, Xiu-Ying
    Chen, Yen-Lin
    2022 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN, IEEE ICCE-TW 2022, 2022, : 421 - 422
  • [2] Two-Stream Deep Encoder-Decoder Architecture for Fully Automatic Video Object Segmentation
    Xu, Jingwei
    Song, Li
    Xie, Rong
    2017 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2017,
  • [3] Semantic segmentation method of underwater images based on encoder-decoder architecture
    Wang, Jinkang
    He, Xiaohui
    Shao, Faming
    Lu, Guanlin
    Hu, Ruizhe
    Jiang, Qunyan
    PLOS ONE, 2022, 17 (08):
  • [4] A Dual Attention Encoder-Decoder Text Summarization Model
    Hakami, Nada Ali
    Mahmoud, Hanan Ahmed Hosni
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (02): : 3697 - 3710
  • [5] Atrous spatial pyramid convolution for object detection with encoder-decoder
    Jie, Feiran
    Nie, Qingfeng
    Li, Mingsuo
    Yin, Ming
    Jin, Taisong
    NEUROCOMPUTING, 2021, 464 : 107 - 118
  • [6] Object Contour Detection with a Fully Convolutional Encoder-Decoder Network
    Yang, Jimei
    Price, Brian
    Cohen, Scott
    Lee, Honglak
    Yang, Ming-Hsuan
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 193 - 202
  • [7] A dual-stream encoder-decoder network with attention mechanism for saliency detection in video(s)
    Kumain, Sandeep Chand
    Singh, Maheep
    Awasthi, Lalit Kumar
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (03) : 2037 - 2046
  • [8] An automated detection system for colonoscopy images using a dual encoder-decoder model
    Hwang, Maxwell
    Wang, Da
    Kong, Xiang-Xing
    Wang, Zhanhuai
    Li, Jun
    Jiang, Wei-Cheng
    Hwang, Kao-Shing
    Ding, Kefeng
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2020, 84
  • [9] ASLNet: An Encoder-Decoder Architecture for Audio Splicing Detection and Localization
    Zhang, Zhenyu
    Zhao, Xianfeng
    Yi, Xiaowei
    SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [10] Cloud Detection Method Based on Spatial-Spectral Features and Encoder-Decoder Feature Fusion
    Zhang, Jing
    Shi, Xinlong
    Wu, Jun
    Song, Liangnong
    Li, Yunsong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 15