Speaker Recognition using Convolutional Neural Network with Minimal Training Data for Smart Home Solutions

Cited by: 0
Authors
Wang, Mingshan [1]
Sirlapu, Tejaswini [1]
Kwasniewska, Alicja [2]
Szankin, Maciej [1]
Bartscherer, Marko [1]
Nicolas, Rey [1]
Affiliations
[1] Intel Corp, San Diego, CA 92131 USA
[2] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Gdansk, Poland
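Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract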
With technological advances in the smart home sector, voice control and automation have become key components that can make a real difference in people's lives. The voice recognition technology market continues to evolve rapidly, as almost all smart home devices now provide speaker recognition capability. However, most of them rely on cloud-based solutions or use very deep neural networks for the speaker recognition task, and such models are not suitable for running on smart home devices. In this paper, we compare relatively small Convolutional Neural Networks (CNNs) and evaluate the effectiveness of speaker recognition using these models on edge devices. In addition, we apply transfer learning to deal with the problem of limited training data. By developing a solution suitable for running inference locally on edge devices, we eliminate well-known cloud computing issues such as data privacy and network latency. Preliminary results show that the chosen model carries over the benefits of computer vision techniques, using a CNN and spectrograms to perform speaker classification with precision and recall close to 84% in less than 60 ms on a mobile device with an Atom Cherry Trail processor.
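The abstract describes feeding spectrograms to a CNN but gives no preprocessing details. As a rough illustration only, the sketch below turns a waveform into a log-magnitude spectrogram using NumPy. The function name `log_spectrogram`, the 16 kHz sample rate, the 400-sample frame, and the 160-sample hop are assumptions chosen for the example, not values taken from the paper.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160, eps=1e-10):
    """Return a log-magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults (25 ms / 10 ms at 16 kHz),
    not parameters reported by the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Pad very short inputs so that at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice overlapping frames and apply the Hann window to each one.
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT, then log compression.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)


# Synthetic stand-in for a voice recording: a 1 s, 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
spec = log_spectrogram(tone)
```

The resulting 2-D array can be treated like a single-channel image, which is what allows image-style CNNs, and models pretrained on images, to be reused for speaker classification.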
Pages: 139-145
Page count: 7
Related papers
50 in total
  • [41] Speaker recognition using MLP: An artificial neural network model
    Shah, SAH
    Farooq, FM
    Ahmed, A
    Hasan, KM
    Farooq, FM
    Naz, A
    Akbar, S
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL V, PROCEEDINGS: COMPUTER SCIENCE AND ENGINEERING: I, 2003, : 331 - 334
  • [42] Characterization Vector Extraction Using Neural Network for Speaker Recognition
    Wang, Wenchao
    Yuan, Qingsheng
    Zhou, Ruohua
    Yan, Yonghong
    2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 1, 2016, : 355 - 358
  • [43] CNN: A speaker recognition system using a cascaded neural network
    Zaki, M
    Ghalwash, A
    Elkouny, AA
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 1996, 7 (02) : 203 - 212
  • [44] Multi Lingual Speaker Recognition Using Artificial Neural Network
    Agrawal, Prateek
    Shukla, Anupam
    Tiwari, Ritu
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, 2009, 61 : 1 - 9
  • [45] CNN: A speaker recognition system using a cascaded neural network
    Zaki, M
    Ghalwash, A
    Elkouny, AA
    MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 1996, 7 (01) : 87 - 99
  • [46] Facial Expression Recognition Using Convolutional Neural Network
    Agrawal, Ved
    Bamb, Chirag
    Mata, Harsh
    Dhunde, Harshal
    Hablani, Ramchand
    SMART TRENDS IN COMPUTING AND COMMUNICATIONS, VOL 4, SMARTCOM 2024, 2024, 948 : 267 - 278
  • [47] Recognition of Chinese food using convolutional neural network
    Teng, Jianing
    Zhang, Dong
    Lee, Dah-Jye
    Chou, Yao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (09) : 11155 - 11172
  • [48] Recognition of Chinese food using convolutional neural network
    Jianing Teng
    Dong Zhang
    Dah-Jye Lee
    Yao Chou
    Multimedia Tools and Applications, 2019, 78 : 11155 - 11172
  • [49] Genre Recognition of Artworks using Convolutional Neural Network
    Hosain, Md Kamran
    Harun-Ur-Rashid
    Taher, Tasnova Bintee
    Rahman, Mohammad Masudur
    2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [50] Bird Sound Recognition Using a Convolutional Neural Network
    Incze, Agnes
    Jancso, Henrietta-Bernadett
    Szilagyi, Zoltan
    Farkas, Attila
    Sulyok, Csaba
    2018 IEEE 16TH INTERNATIONAL SYMPOSIUM ON INTELLIGENT SYSTEMS AND INFORMATICS (SISY 2018), 2018, : 295 - 300