An automatic multimodal speech recognition system with audio and video information

被引：12

作者：

Karpov, A. A. ^{[1
,2
]}

机构：

[1] Russian Acad Sci, St Petersburg Inst Informat & Automat, St Petersburg 196140, Russia

[2] ITMO Univ, St Petersburg, Russia

来源：

AUTOMATION AND REMOTE CONTROL | 2014年 / 75卷 / 12期

基金：

俄罗斯基础研究基金会;

关键词：

Speech Recognition; Automatic Speech Recognition; Speech Recognition System; Speech Corpus; Automatic Speech Recognition System;

D O I：

10.1134/S000511791412008X

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

The mathematical model and software implementation of an automatic Russian speech recognition system that employs techniques of digital processing and analysis of audiovisual signals from a microphone and a video camera are presented. The description of probabilistic modeling of audiovisual speech based on coupled hidden Markov models, information fusion methods with weight coefficients for audio and video speech modalities, and parametric representation of signals is provided. Quantitative results in multimodal recognition of continuous Russian speech indicate high accuracy and reliability of the automatic system.

引用

页码：2190 / 2200

页数：11

共 50 条

[1] An automatic multimodal speech recognition system with audio and video information
A. A. Karpov
Automation and Remote Control, 2014, 75 : 2190 - 2200
[2] An audio-visual corpus for multimodal automatic speech recognition
Andrzej Czyzewski
Bozena Kostek
Piotr Bratoszewski
Jozef Kotus
Marcin Szykulski
Journal of Intelligent Information Systems, 2017, 49 : 167 - 192
[3] An audio-visual corpus for multimodal automatic speech recognition
Czyzewski, Andrzej
Kostek, Bozena
Bratoszewski, Piotr
Kotus, Jozef
Szykulski, Marcin
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2017, 49 (02) : 167 - 192
[4] Indonesian Audio-Visual Speech Corpus for Multimodal Automatic Speech Recognition
Maulana, Muhammad Rizki Aulia Rahman
Fanany, Mohamad Ivan
2017 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2017, : 381 - 385
[5] AUTOMATIC RECOGNITION OF SPEECH WITHOUT ANY AUDIO INFORMATION
Heracleous, Panikos
Hagita, Norihiro
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2392 - 2395
[6] Automatic Speech Recognition System for Lithuanian Broadcast Audio
Alumae, Tanel
Tilk, Ottokar
HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, 2016, 289 : 39 - 45
[7] MULTIMODAL SPEECH EMOTION RECOGNITION USING AUDIO AND TEXT
Yoon, Seunghyun
Byun, Seokhyun
Jung, Kyomin
2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 112 - 118
[8] Multimodal English corpus for automatic speech recognition
Kunka, Bartosz
Kupryjanow, Adam
Dalka, Piotr
Bratoszewski, Piotr
Szczodrak, Maciej
Spaleniak, Pawel
Szykulski, Marcin
Czyzewski, Andrzej
2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013, : 106 - 111
[9] Enhancing Quality and Accuracy of Speech Recognition System by Using Multimodal Audio-Visual Speech signal
El Maghraby, Eslam E.
Gody, Amr M.
Farouk, M. Hesham
ICENCO 2016 - 2016 12TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO) - BOUNDLESS SMART SOCIETIES, 2016, : 219 - 229
[10] Audio-Visual (Multimodal) Speech Recognition System Using Deep Neural Network
Paulin, Hebsibah
Milton, R. S.
JanakiRaman, S.
Chandraprabha, K.
JOURNAL OF TESTING AND EVALUATION, 2019, 47 (06) : 3963 - 3974

← 1 2 3 4 5 →