Acoustic-based LEGO recognition using attention-based convolutional neural networks

Authors
Van-Thuan Tran
Chia-Yang Wu
Wei-Ho Tsai
Affiliation
[1] National Taipei University of Technology, Department of Electronic Engineering
Keywords
LEGO recognition; Acoustic-based object detection; Attention mechanism; Audio classification; Audio features; Convolutional neural networks; Time-distributed layers
DOI
Not available
Abstract
This work investigates the classification of LEGO types using deep learning-based audio classification approaches. The motivation is the following assumption: if objects of the same shape fall freely from a certain height and hit a fixed plane, their impact sounds will be very similar, so objects of one type can be distinguished from objects of other types by sound. Applying this idea to LEGO recognition, we collect the impact sounds of 200 LEGO objects that fall from a height of about 30 cm above a designated plane, and design a CNN-based recognition system that processes the impact sounds to determine the type of each LEGO object. Because the fall of a LEGO object produces a main impact sound (i.e., only the sound at the moment of impact) followed by several subsequent sounds, we examine whether using only the first impact sound or the whole sequence of sounds yields better classification accuracy. We propose a compact two-dimensional CNN model, LegoNet, which is designed with a frame-level attention module applied to the input spectrogram and with time-distributed fully-connected layers. Our experiments show that free-fall impact sounds can be used efficiently for accurate object recognition, and that the proposed LegoNet, despite its much smaller size, achieves better accuracy and robustness than the baseline models. Using the whole sequence of impact sounds is also more informative for LEGO classification than considering only the first impact sound. Moreover, utilizing data from specific object postures can help improve the classifier’s performance when training data are limited. The proposed approach can be employed as an extra module in intelligent agents or object classification systems that require a rich understanding of the surrounding physical world.
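The abstract names two building blocks, a frame-level attention module on the input spectrogram and time-distributed fully-connected layers, without giving their details. The following is only a minimal NumPy sketch of what such blocks could look like in general; it is not the LegoNet architecture. The scoring vector `w`, the layer sizes, the softmax weighting, and the random stand-in spectrogram are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def frame_attention(spec, w, b=0.0):
    """Reweight spectrogram frames by per-frame attention weights.

    spec: (n_freq, n_frames) spectrogram.
    w:    (n_freq,) hypothetical scoring vector (would be learned in practice).
    """
    scores = w @ spec + b          # one scalar score per time frame
    weights = softmax(scores)      # weights sum to 1 across frames
    return spec * weights[np.newaxis, :], weights

def time_distributed_dense(frames, W, bias):
    """Apply the same dense layer (with ReLU) to every frame independently.

    frames: (n_freq, n_frames); W: (n_hidden, n_freq); bias: (n_hidden,).
    """
    return np.maximum(W @ frames + bias[:, np.newaxis], 0.0)

# Random stand-in for a 64-band, 100-frame spectrogram (assumed sizes).
rng = np.random.default_rng(0)
spec = rng.random((64, 100))
w = rng.standard_normal(64)

attended, weights = frame_attention(spec, w)
hidden = time_distributed_dense(attended, rng.standard_normal((16, 64)), np.zeros(16))
```

In this sketch, frames with higher scores contribute more to later layers, which is one plausible way for a model to emphasize the impact moments within a sequence of sounds; the weight sharing across frames in the dense layer keeps the parameter count independent of the clip length.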