Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition

Cited by: 0
Authors
Li, Zeyu [1, 2]
Xiang, Suncheng [1, 3]
Yu, Tong [1, 2]
Gao, Jingsheng [1, 2]
Ruan, Jiacheng [1, 2]
Hu, Yanping [1, 2]
Liu, Ting [1, 2]
Fu, Yuzhuo [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
[2] Sch Elect Informat & Elect Engn, Shanghai, Peoples R China
[3] Sch Biomed Engn, Shanghai, Peoples R China
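Funding
National Natural Science Foundation of China
Keywords
Underwater Acoustic Target Recognition; Audio Retrieval; Zero-Shot Classification
DOI
10.1007/978-981-97-5591-2_40
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract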
Recognizing underwater audio plays a significant role in identifying a vessel while it is in motion. Underwater acoustic target recognition (UATR) has a wide range of applications in areas such as marine environmental protection, detection of ship-radiated noise, underwater noise control, and coastal vessel dispatch. The traditional UATR task involves training a network to extract features from audio data and predict the vessel type. Existing UATR datasets fall short in both duration and sample quantity. In this paper, we propose Oceanship, a large-scale and diverse underwater audio dataset. The dataset comprises 15 categories, spans a total duration of 121 hours, and includes comprehensive annotations such as coordinates, velocity, vessel types, and timestamps. We compiled the dataset by crawling and organizing original communication data from the Ocean Communication Network (ONC) database between 2021 and 2022. While audio retrieval tasks are well established in general audio classification, they have not been explored in the context of underwater audio recognition. Leveraging the Oceanship dataset, we introduce a baseline model named Oceannet for underwater audio retrieval. This model achieves a recall at 1 (R@1) of 67.11% and a recall at 5 (R@5) of 99.13% on the Deepship dataset.
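
For readers unfamiliar with the R@1 and R@5 figures quoted in the abstract, the following is a minimal sketch of how recall at k is commonly computed for a retrieval task. It is not the authors' evaluation code: the function name `recall_at_k`, the similarity matrix, and the toy vessel labels are all hypothetical, and the sketch assumes a query counts as a hit when at least one of its top-k retrieved items shares its label.

```python
import numpy as np

def recall_at_k(similarity, query_labels, gallery_labels, k):
    """Fraction of queries with at least one correct match in the top-k results.

    similarity[i, j] is an assumed similarity score between query i and
    gallery item j (higher means more similar).
    """
    similarity = np.asarray(similarity, dtype=float)
    query_labels = np.asarray(query_labels)
    gallery_labels = np.asarray(gallery_labels)
    # Indices of the k most similar gallery items per query, best first.
    top_k = np.argsort(-similarity, axis=1)[:, :k]
    # hits[i, j] is True when the j-th ranked item shares query i's label.
    hits = gallery_labels[top_k] == query_labels[:, None]
    return float(hits.any(axis=1).mean())

# Toy data (hypothetical): 3 queries scored against 3 gallery items.
sim = np.array([
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],
    [0.6, 0.5, 0.1],
])
query_labels = ["cargo", "tug", "tanker"]
gallery_labels = ["cargo", "tanker", "tug"]

r1 = recall_at_k(sim, query_labels, gallery_labels, k=1)
r2 = recall_at_k(sim, query_labels, gallery_labels, k=2)
print(f"R@1 = {r1:.2%}, R@2 = {r2:.2%}")
```

In this toy case the third query ranks a wrong item first and the correct one second, so it counts as a miss for R@1 but a hit for R@2. The same effect explains why R@5 is typically much higher than R@1.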
Pages: 475-486
Page count: 12
Related papers
50 in total
  • [41] MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition
    Guo, Yandong
    Zhang, Lei
    Hu, Yuxiao
    He, Xiaodong
    Gao, Jianfeng
    COMPUTER VISION - ECCV 2016, PT III, 2016, 9907 : 87 - 102
  • [42] Advancing music emotion recognition: large-scale dataset construction and evaluator impact analysis
    Hu, Qiong
    Murad, Masrah Azrifah Azmi
    Li, Qi
    MULTIMEDIA SYSTEMS, 2025, 31 (02)
  • [43] POLIMI-ITW-S: A large-scale dataset for human activity recognition in the wild
    Quan, Hao
    Hu, Yu
    Bonarini, Andrea
    DATA IN BRIEF, 2022, 43
  • [44] Large-Scale Historical Watermark Recognition: dataset and a new consistency-based approach
    Shen, Xi
    Pastrolin, Ilaria
    Bounou, Oumayma
    Gidaris, Spyros
    Smith, Marc
    Poncet, Olivier
    Aubry, Mathieu
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 6810 - 6817
  • [45] A large-scale dataset of solar event reports from automated feature recognition modules
    Schuh, Michael A.
    Angryk, Rafal A.
    Martens, Petrus C.
    JOURNAL OF SPACE WEATHER AND SPACE CLIMATE, 2016, 6
  • [46] CNN ARCHITECTURES FOR LARGE-SCALE AUDIO CLASSIFICATION
    Hershey, Shawn
    Chaudhuri, Sourish
    Ellis, Daniel P. W.
    Gemmeke, Jort F.
    Jansen, Aren
    Moore, R. Channing
    Plakal, Manoj
    Platt, Devin
    Saurous, Rif A.
    Seybold, Bryan
    Slaney, Malcolm
    Weiss, Ron J.
    Wilson, Kevin
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 131 - 135
  • [47] Efficient large-scale multichannel audio coding
    Sandnes, FE
    PROCEEDINGS OF THE 27TH EUROMICRO CONFERENCE - 2001: A NET ODYSSEY, 2001, : 392 - 399
  • [48] MIND: A Large-scale Dataset for News Recommendation
    Wu, Fangzhao
    Qiao, Ying
    Chen, Jiun-Hung
    Wu, Chuhan
    Qi, Tao
    Lian, Jianxun
    Liu, Danyang
    Xie, Xing
    Gao, Jianfeng
    Wu, Winnie
    Zhou, Ming
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3597 - 3606
  • [49] DANEWSROOM: A Large-scale Danish Summarisation Dataset
    Varab, Daniel
    Schluter, Natalie
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6731 - 6739
  • [50] Pchatbot: A Large-Scale Dataset for Personalized Chatbot
    Qian, Hongjin
    Li, Xiaohe
    Zhong, Hanxun
    Guo, Yu
    Ma, Yueyuan
    Zhu, Yutao
    Liu, Zhanliang
    Dou, Zhicheng
    Wen, Ji-Rong
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 2470 - 2477