Virtual home assistant for voice based controlling and scheduling with short speech speaker identification

Cited by: 0

Authors:

Varun Tiwari

Mohammad Farukh Hashmi

Avinash Keskar

N. C. Shivaprakash

Affiliations:

[1] Visvesvaraya National Institute of Technology, Department of Electronics and Communication Engineering

[2] National Institute of Technology Campus, Department of Electronics and Communication Engineering

[3] Indian Institute of Science, Department of Instrumentation and Applied Physics

Source:

Multimedia Tools and Applications | 2020 / Vol. 79

Keywords:

Cloud services; Gaussian mixture models; Internet of things; Principal component analysis; Speaker identification; Vector quantization;

DOI:

Not available

Chinese Library Classification (CLC) number:

Subject classification number:

Abstract:

With the advancement of interface technologies in smart devices, voice-controlled assistants have quickly gained popularity. These assistants are designed to use voice commands to achieve a more human-friendly interaction. Along these lines, we propose a cloud-connected, voice-based home assistant in this paper. It accepts voice commands to control or monitor devices in a home. It can understand and schedule device operations based on time or sensor data through a simple voice-based approach. To enhance its capability, it is designed to identify the speakers. Mel-Frequency Cepstral Coefficients (MFCC), in combination with other speech features, are used as the feature vector. We use Vector Quantization (VQ) and Principal Component Analysis (PCA) for dimensionality reduction of the feature vector, followed by a Gaussian Mixture Model (GMM) for classification. The short speech speaker identification is validated on a set of Indian speakers in an uncontrolled indoor environment. An accuracy greater than 92% is achieved for speech samples as short as 1 second. A database of more than 50 different commands per speaker is also created for validation of the proposed virtual assistant. IBM’s Bluemix and Google’s cloud services are used for speech-to-text conversion.
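The GMM classification step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it skips MFCC extraction, VQ, and PCA, and uses synthetic 3-dimensional vectors as stand-ins for reduced speech features. All function names, speaker labels, and parameter values below are hypothetical. Each speaker gets a small diagonal-covariance GMM fitted with EM, and a test utterance is assigned to the speaker whose model gives the highest average log-likelihood.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v)))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def fit_gmm(frames, k=2, iters=20, seed=0, floor=1e-3):
    """Fit a k-component diagonal GMM to feature vectors with plain EM."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, k)]  # init from data points
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each frame
        resp = []
        for x in frames:
            logs = [math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                    for j in range(k)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            weights[j] = nj / len(frames)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                        for d in range(dim)]
            variances[j] = [max(floor, sum(r[j] * (x[d] - means[j][d]) ** 2
                                           for r, x in zip(resp, frames)) / nj)
                            for d in range(dim)]
    return weights, means, variances

def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    weights, means, variances = model
    total = 0.0
    for x in frames:
        total += logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                            for w, m, v in zip(weights, means, variances)])
    return total / len(frames)

def identify(frames, models):
    """Return the speaker label whose GMM scores the utterance highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

def synth_frames(center, n, rng, spread=0.5):
    """Synthetic stand-in for reduced feature vectors of one speaker."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]

rng = random.Random(42)
# Hypothetical speakers with well-separated synthetic feature centers
centers = {"speaker_a": [0.0, 0.0, 0.0], "speaker_b": [3.0, -2.0, 1.0]}
models = {name: fit_gmm(synth_frames(c, 200, rng)) for name, c in centers.items()}
test_utterance = synth_frames(centers["speaker_b"], 30, rng)
predicted = identify(test_utterance, models)
print(predicted)
```

In a real system the synthetic frames would be replaced by per-frame features computed from audio, and the number of mixture components and EM iterations would need tuning against held-out data.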


Pages: 5243 - 5268

Page count: 25
