Multi-Modal Deep Learning-Based Violin Bowing Action Recognition

Cited by: 2
Authors
Liu, Bao-Yun [1]
Jen, Yi-Hsin [2, 3]
Sun, Shih-Wei [4]
Su, Li [2]
Chang, Pao-Chi [1]
Affiliations
[1] Natl Cent Univ, Dept Commun Engn, Taoyuan, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
[3] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu, Taiwan
[4] Taipei Natl Univ Arts, Dept New Media Art, Taipei, Taiwan
DOI
10.1109/icce-taiwan49838.2020.9257995
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a deep learning-based system for recognizing violin bowing actions. The system fuses sensing signals from a depth camera modality and inertial sensor modalities: a violinist's actions are captured by a depth camera and recorded by wearable sensors on the violinist's forearm. A 3D convolutional neural network (3D-CNN) and long short-term memory (LSTM) networks are adopted to generate action models from the depth camera modality and the inertial sensor modalities. The features and models obtained from the multiple modalities are used to classify different violin bowing actions, and fusing the modalities achieves satisfactory recognition accuracy. A violin bowing action dataset is also generated for the preliminary study and for evaluating system performance.
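As an illustration of what fusing per-modality classifiers can look like, the following is a minimal sketch of decision-level (late) fusion. The abstract does not state which fusion rule the authors use, so the weighted average of softmax probabilities, the function names `softmax` and `fuse_late`, the `depth_weight` parameter, and the three-class scores are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_late(depth_scores, inertial_scores, depth_weight=0.5):
    """Hypothetical late fusion: weighted average of per-modality probabilities.

    depth_scores and inertial_scores are raw class scores from two separately
    trained models, one per modality. Returns the index of the winning class
    and the fused probability vector.
    """
    if len(depth_scores) != len(inertial_scores):
        raise ValueError("both modalities must score the same set of classes")
    if not 0.0 <= depth_weight <= 1.0:
        raise ValueError("depth_weight must lie in [0, 1]")

    p_depth = softmax(depth_scores)
    p_inertial = softmax(inertial_scores)
    fused = [
        depth_weight * d + (1.0 - depth_weight) * i
        for d, i in zip(p_depth, p_inertial)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Made-up scores for three unnamed action classes.
    best, fused = fuse_late([2.0, 0.0, 0.0], [0.0, 0.0, 3.0])
    print(best, [round(p, 3) for p in fused])
```

With equal weights the more confident modality decides the class; moving `depth_weight` toward 1.0 or 0.0 makes the result follow a single modality, which is one simple way to compare fused and single-modality predictions.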
Pages: 2
Related papers
50 in total
  • [31] Vision-Based Multi-Modal Framework for Action Recognition
    Romaissa, Beddiar Djamila
    Mourad, Oussalah
    Brahim, Nini
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 5859 - 5866
  • [32] Learning discriminative motion feature for enhancing multi-modal action recognition
    Yang, Jianyu
    Huang, Yao
    Shao, Zhanpeng
    Liu, Chunping
    Journal of Visual Communication and Image Representation, 2021, 79
  • [33] Multi-Modal Multi-Action Video Recognition
    Shi, Zhensheng
    Liang, Ju
    Li, Qianqian
    Zheng, Haiyong
    Gu, Zhaorui
    Dong, Junyu
    Zheng, Bing
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13658 - 13667
  • [34] SKELETON-INDEXED DEEP MULTI-MODAL FEATURE LEARNING FOR HIGH PERFORMANCE HUMAN ACTION RECOGNITION
    Song, Sijie
    Lan, Cuiling
    Xing, Junliang
    Zeng, Wenjun
    Liu, Jiaying
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018,
  • [35] Learning-Based Confidence Estimation for Multi-modal Classifier Fusion
    Nadeem, Uzair
    Bennamoun, Mohammed
    Sohel, Ferdous
    Togneri, Roberto
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT II, 2019, 11954 : 299 - 312
  • [36] Multi-Modal Transformer and Reinforcement Learning-Based Beam Management
    Ghassemi, Mohammad
    Zhang, Han
    Afana, Ali
    Sediq, Akram Bin
    Erol-Kantarci, Melike
    IEEE Networking Letters, 2024, 6 (04): 222 - 226
  • [37] Modality Mixer for Multi-modal Action Recognition
    Lee, Sumin
    Woo, Sangmin
    Park, Yeonju
    Nugroho, Muhammad Adi
    Kim, Changick
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 3297 - 3306
  • [38] Deep learning-based multi-modal approach for predicting brain radionecrosis after proton therapy
    Seetha, Sithin Thulasi
    Fontana, Giulia
    Bazani, Alessia
    Riva, Giulia
    Molinelli, Silvia
    Goodyear, Christina Amanda
    Ciccone, Lucia Pia
    Iannalfi, Alberto
    Orlandi, Ester
    RADIOTHERAPY AND ONCOLOGY, 2024, 194 : S5027 - S5030
  • [39] RETRACTION: An Efficient Deep Learning-based Video Captioning Framework Using Multi-modal Features
    Varma, S.
    James, D. P.
    EXPERT SYSTEMS, 2025, 42 (02)
  • [40] Deep Learning-Based Multi-Modal Ensemble Classification Approach for Human Breast Cancer Prognosis
    Jadoon, Ehtisham Khan
    Khan, Fiaz Gul
    Shah, Sajid
    Khan, Ahmad
    ElAffendi, Muhammed
    IEEE ACCESS, 2023, 11 : 85760 - 85769