Full-duplex Speech-to-text System for Estonian

Cited by: 7
Author
Alumaee, Tanel [1]
Affiliation
[1] Tallinn Univ Technol, Inst Cybernet, EE-19086 Tallinn, Estonia
Keywords
Speech recognition; Estonian; radiology; client-server; open source
DOI
10.3233/978-1-61499-442-8-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper describes a distributed online speech-to-text system. Its main features are real-time speech recognition and a full-duplex user experience: the partially recognized utterance is displayed progressively while the user is still speaking. Other benefits include a simple client-server communication protocol and scalability to many concurrent user sessions. The paper also describes two Estonian speech-to-text applications built on the framework: a general-domain dictation application with an estimated word error rate of 26.4% and a radiology report dictation system with a word error rate of 13.7%. The system is open source and based on free software.
Pages: 3-10
Number of pages: 8