Acoustic-based LEGO recognition using attention-based convolutional neural networks

Authors
Van-Thuan Tran
Chia-Yang Wu
Wei-Ho Tsai
Affiliation
[1] National Taipei University of Technology, Department of Electronic Engineering
Keywords
LEGO recognition; Acoustic-based object detection; Attention mechanism; Audio classification; Audio features; Convolutional neural networks; Time-distributed layers
Abstract
This work investigates the classification of LEGO types using deep learning-based audio classification. The investigation rests on the following assumption: if objects of the same shape fall freely from a fixed height and hit a fixed plane, their impact sounds will be very similar, so objects of one type can be distinguished from objects of other types by sound. Applying this idea to LEGO recognition, we collect the impact sounds of 200 LEGO objects dropped from about 30 cm above a designated plane, and design a CNN-based recognition system that processes the impact sounds to determine the type of each LEGO object. Because a falling LEGO object produces a main impact sound (i.e., only the sound at the moment of first impact) followed by several subsequent sounds, we examine whether using only the first impact sound or the whole sequence of sounds yields better classification accuracy. We propose a compact two-dimensional CNN model, LegoNet, which applies a frame-level attention module to the input spectrogram and uses time-distributed fully-connected layers. Our experiments show that free-fall impact sounds can be used efficiently for accurate object recognition, and that the proposed LegoNet, despite being much smaller, achieves better accuracy and robustness than the baseline models. In addition, the whole sequence of impact sounds is more informative for LEGO classification than the first impact sound alone. Moreover, using data from specific object postures can improve the classifier's performance when training data are scarce. The proposed approach can serve as an extra module in intelligent agents or object classification systems that require a rich understanding of the surrounding physical world.
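The abstract names two building blocks, frame-level attention on the input spectrogram and time-distributed fully-connected layers, without giving their details. The sketch below is only a generic illustration of those two ideas in NumPy, not the LegoNet architecture: the softmax scoring rule, the layer sizes, the random weights, and the helper names `frame_attention` and `time_distributed_dense` are all assumptions made for the example.

```python
import numpy as np


def frame_attention(spec, w, b=0.0):
    """Reweight spectrogram frames by softmax attention over time.

    spec: (n_frames, n_bins) spectrogram (assumed layout).
    w:    (n_bins,) scoring vector; a hypothetical learned parameter.
    """
    scores = spec @ w + b            # one scalar score per frame
    scores = scores - scores.max()   # stabilise the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum()         # attention weights sum to 1 over frames
    return spec * weights[:, None], weights


def time_distributed_dense(x, W, b):
    """Apply the same dense layer with ReLU to every frame independently."""
    return np.maximum(x @ W + b, 0.0)


# Toy input: 50 frames of 64 frequency bins (random stand-in for real audio).
rng = np.random.default_rng(0)
spec = rng.normal(size=(50, 64))
w = rng.normal(size=64) * 0.1

attended, weights = frame_attention(spec, w)

# Shared weights across frames: 64 bins in, 16 features out (arbitrary sizes).
W = rng.normal(size=(64, 16)) * 0.1
hidden = time_distributed_dense(attended, W, np.zeros(16))
```

Here `weights` assigns each frame a share of the total attention, which is one way a model could emphasise the first impact versus the later bounces, and `hidden` keeps one feature vector per frame because the dense layer is shared across time.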