Virtual home assistant for voice based controlling and scheduling with short speech speaker identification

Cited by: 0
Authors
Varun Tiwari
Mohammad Farukh Hashmi
Avinash Keskar
N. C. Shivaprakash
Affiliations
[1] Visvesvaraya National Institute of Technology, Department of Electronics and Communication Engineering
[2] National Institute of Technology Campus, Department of Electronics and Communication Engineering
[3] Indian Institute of Science, Department of Instrumentation and Applied Physics
Keywords
Cloud services; Gaussian mixture models; Internet of things; Principal component analysis; Speaker identification; Vector quantization
DOI
Not available
Abstract
With the advancement of interface technologies in smart devices, voice-controlled assistants have quickly gained popularity. These assistants use voice commands to achieve a more human-friendly interaction. Along these lines, we propose a cloud-connected, voice-based home assistant in this paper. It accepts voice commands to control or monitor devices in a home, and it can understand and schedule device operations based on time or sensor data through a simple voice-based approach. To enhance its capability, it is designed to identify the speakers. Mel-Frequency Cepstrum Coefficients (MFCC), in combination with other speech features, are used as the feature vector. We use Vector Quantization (VQ) and Principal Component Analysis (PCA) for dimensionality reduction of the feature vector, followed by a Gaussian Mixture Model (GMM) for classification. The short speech speaker identification is validated on a set of Indian speakers in an uncontrolled indoor environment. An accuracy greater than 92% is achieved for speech samples as short as 1 second. A database of more than 50 different commands per speaker is also created for validation of the proposed virtual assistant. IBM’s Bluemix and Google’s cloud services are used for speech-to-text conversion.
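The VQ-then-GMM idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes synthetic 4-dimensional vectors in place of real MFCC features, skips the PCA step, and builds each speaker's diagonal-covariance GMM directly from k-means (VQ) clusters without EM refinement. All function names, speaker labels, and parameter values are hypothetical.

```python
import math
import random

def kmeans_labels(frames, k, iters=20, seed=0):
    """Vector quantization: cluster frames and return each frame's cluster index."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assign each frame to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((x - c) ** 2 for x, c in zip(f, centroids[j])))
            for f in frames
        ]
        for j in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_gmm(frames, k=2, var_floor=1e-3):
    """Build a diagonal GMM as (weight, mean, variance) triples from VQ clusters.

    Assumption: hard assignments only, no EM iterations.
    """
    labels = kmeans_labels(frames, k)
    gmm = []
    for j in range(k):
        members = [f for f, lab in zip(frames, labels) if lab == j]
        if not members:
            continue
        cols = list(zip(*members))
        mean = [sum(col) / len(members) for col in cols]
        var = [
            max(sum((x - m) ** 2 for x in col) / len(members), var_floor)
            for col, m in zip(cols, mean)
        ]
        gmm.append((len(members) / len(frames), mean, var))
    return gmm

def avg_log_likelihood(gmm, frames):
    """Average per-frame log-likelihood under the mixture (log-sum-exp for stability)."""
    total = 0.0
    for f in frames:
        logs = []
        for w, mean, var in gmm:
            ll = math.log(w)
            for x, m, v in zip(f, mean, var):
                ll -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            logs.append(ll)
        mx = max(logs)
        total += mx + math.log(sum(math.exp(l - mx) for l in logs))
    return total / len(frames)

def identify(models, frames):
    """Return the speaker whose model gives the highest average log-likelihood."""
    return max(models, key=lambda name: avg_log_likelihood(models[name], frames))

def synth_frames(center, n, rng, dim=4):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around a speaker offset."""
    return [[rng.gauss(center + 0.5 * d, 1.0) for d in range(dim)] for _ in range(n)]

rng = random.Random(42)
train = {
    "speaker_a": synth_frames(0.0, 200, rng),
    "speaker_b": synth_frames(3.0, 200, rng),
}
models = {name: fit_gmm(frames) for name, frames in train.items()}
test_a = synth_frames(0.0, 50, rng)
test_b = synth_frames(3.0, 50, rng)
print(identify(models, test_a), identify(models, test_b))
```

In a real system the synthetic frames would be replaced by MFCC vectors extracted from short utterances, and the GMM would normally be refined with EM after the VQ initialization.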
Pages: 5243-5268
Page count: 25