Acoustic-based LEGO recognition using attention-based convolutional neural networks

Cited by: 0
Authors
Van-Thuan Tran
Chia-Yang Wu
Wei-Ho Tsai
Affiliations
[1] National Taipei University of Technology, Department of Electronic Engineering
Source
Keywords
LEGO recognition; Acoustic-based object detection; Attention mechanism; Audio classification; Audio features; Convolutional neural networks; Time-distributed layers
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
This work investigates the classification of LEGO types using deep learning-based audio classification. The investigation is motivated by the following assumption: if objects of the same shape fall freely from a certain height and hit a fixed plane, their impact sounds will be very similar, so objects of one type can be distinguished from objects of other types by sound. Applying this idea to LEGO recognition, we collect the impact sounds of 200 LEGO objects dropped from about 30 cm above a designated plane, and design a CNN-based recognition system that processes an impact sound to determine the type of LEGO that produced it. Because a falling LEGO produces a main impact sound (i.e., the sound at the moment of first impact) followed by several subsequent sounds, we examine whether using only the first impact sound or the whole sequence of sounds yields better classification accuracy. We propose a compact two-dimensional CNN model, LegoNet, which applies a frame-level attention module to the input spectrogram and uses time-distributed fully-connected layers. Our experiments show that free-fall impact sounds can be used efficiently for accurate object recognition, and that LegoNet, despite its much smaller size, achieves better accuracy and robustness than the baseline models. In addition, the whole sequence of impact sounds is more informative for LEGO classification than the first impact sound alone. Moreover, using data from specific object postures helps improve the classifier's performance when training data are scarce. The proposed approach can serve as an extra module in intelligent agents or object classification systems that require a rich understanding of the surrounding physical world.
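To make the two architectural ideas named in the abstract more concrete, the following is a minimal pure-Python sketch of frame-level attention over a spectrogram followed by a time-distributed dense layer. It is not the authors' LegoNet: the abstract does not say how the attention scores are computed, so the linear scoring vector, the softmax over frames, the ReLU activation, and all names (`frame_attention`, `time_distributed_dense`, `score_weights`) are assumptions chosen for illustration, and the toy numbers are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def frame_attention(spectrogram, score_weights):
    """Reweight each time frame by a softmax attention weight.

    spectrogram: list of frames, each a list of frequency-bin magnitudes.
    score_weights: hypothetical learned vector, one weight per frequency bin.
    """
    # One scalar score per frame (assumed linear scoring, not from the paper).
    scores = [sum(w * x for w, x in zip(score_weights, frame))
              for frame in spectrogram]
    weights = softmax(scores)
    attended = [[a * x for x in frame]
                for a, frame in zip(weights, spectrogram)]
    return attended, weights

def time_distributed_dense(frames, kernel, bias):
    """Apply the same dense layer with ReLU to every frame independently."""
    out = []
    for frame in frames:
        row = []
        for j in range(len(bias)):
            z = bias[j] + sum(frame[i] * kernel[i][j] for i in range(len(frame)))
            row.append(max(0.0, z))
        out.append(row)
    return out

# Toy input: 4 frames x 3 frequency bins; frame 1 mimics a loud impact.
spec = [[0.1, 0.0, 0.1], [2.0, 1.5, 1.0], [0.3, 0.2, 0.1], [0.0, 0.1, 0.0]]
attended, weights = frame_attention(spec, score_weights=[1.0, 1.0, 1.0])
features = time_distributed_dense(
    attended,
    kernel=[[0.5, -0.5], [0.5, 0.5], [0.5, 0.5]],  # 3 inputs -> 2 outputs
    bias=[0.0, 0.0],
)
```

In this toy setup the attention weights sum to one and the loud frame receives the largest weight, while the dense layer shares one kernel across all frames, which is what "time-distributed" usually means. A real model would learn these parameters and would typically be written in a deep learning framework.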
Related papers
50 in total
  • [1] Acoustic-based LEGO recognition using attention-based convolutional neural networks
    Tran, Van-Thuan
    Wu, Chia-Yang
    Tsai, Wei-Ho
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (01)
  • [2] Acoustic-Based Train Arrival Detection Using Convolutional Neural Networks With Attention
    Tran, Van-Thuan
    Tsai, Wei-Ho
    IEEE ACCESS, 2022, 10 : 72120 - 72131
  • [3] Acoustic-Based Emergency Vehicle Detection Using Convolutional Neural Networks
    Tran, Van-Thuan
    Tsai, Wei-Ho
    IEEE ACCESS, 2020, 8 : 75702 - 75713
  • [4] An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition
    Alshawi, Adil Abdullah Abdulhussein
    Tanha, Jafar
    Balafar, Mohammad Ali
    IEEE ACCESS, 2024, 12 : 8123 - 8134
  • [5] Attention-based Convolutional Neural Networks for Sentence Classification
    Zhao, Zhiwei
    Wu, Youzheng
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 705 - 709
  • [6] Hyperspectral Band Selection Using Attention-Based Convolutional Neural Networks
    Lorenzo, Pablo Ribalta
    Tulczyjew, Lukasz
    Marcinkiewicz, Michal
    Nalepa, Jakub
    IEEE ACCESS, 2020, 8 : 42384 - 42403
  • [7] Causal Discovery with Attention-Based Convolutional Neural Networks
    Nauta, Meike
    Bucur, Doina
    Seifert, Christin
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2019, 1 (01):
  • [8] ATTENTION-BASED ATROUS CONVOLUTIONAL NEURAL NETWORKS: VISUALISATION AND UNDERSTANDING PERSPECTIVES OF ACOUSTIC SCENES
    Ren, Zhao
    Kong, Qiuqiang
    Han, Jing
    Plumbley, Mark D.
    Schuller, Bjoern W.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 56 - 60
  • [9] Attention-Based Convolutional and Recurrent Neural Networks for Driving Behavior Recognition Using Smartphone Sensor Data
    Zhang, Jun
    Wu, Zhongcheng
    Li, Fang
    Luo, Jianfei
    Ren, Tingting
    Hu, Song
    Li, Wenjing
    Li, Wei
    IEEE ACCESS, 2019, 7 : 148031 - 148046
  • [10] EEG emotion recognition using attention-based convolutional transformer neural network
    Gong, Linlin
    Li, Mingyang
    Zhang, Tao
    Chen, Wanzhong
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 84