Neurally Optimized Decoder for Low Bitrate Speech Codec

Cited by: 1
Authors
Kim, Hyung Yong [1, 2]
Yoon, Ji Won [1, 2]
Cho, Won Ik [1, 2]
Kim, Nam Soo [1, 2]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, Inst New Media & Commun, Seoul 08826, South Korea
Keywords
Decoding; Speech coding; Speech codecs; Bit rate; Encoding; Convolution; Knowledge engineering; generative adversarial network; generative model; attention mechanism; NETWORKS;
DOI
10.1109/LSP.2021.3132557
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
Recently, a conventional neural decoder for speech codecs has shown promising performance. However, it typically requires some prior knowledge of decoding, such as the bit allocation or dequantization scheme, so it is not a universal solution for the many different kinds of speech codecs. To address this limitation, we propose a neurally optimized decoder based on a generative model that can reconstruct the speech directly from the bitstream without prior knowledge. The proposed decoder mainly consists of two components: 1) a dequantization model that groups and dequantizes related bits from the bitstream and 2) a generative model that restores the speech conditioned on the output of the dequantization model. In experiments with mixed excitation linear prediction (MELP), advanced multi-band excitation (AMBE), and Speex at around 2.4 kb/s, the proposed model performed better than the conventional speech codecs in most of the objective and subjective evaluations.
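As a rough illustration of what decoding "directly from the bitstream" implies for the decoder input, the sketch below slices a raw bitstream into per-frame bit vectors without using any bit allocation table. It is not the authors' implementation: the 22.5 ms frame duration (the value used by 2.4 kb/s MELP) is an assumption, the function names are hypothetical, and the learned dequantization and generative models themselves are not shown.

```python
from typing import List


def bits_per_frame(bitrate_bps: int, frame_ms: float) -> int:
    """Number of bits carried by one frame at the given bitrate."""
    return round(bitrate_bps * frame_ms / 1000.0)


def unpack_bits(data: bytes) -> List[int]:
    """Expand bytes into a flat list of 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def split_frames(bits: List[int], frame_len: int) -> List[List[int]]:
    """Cut the bit sequence into whole frames; trailing leftover bits are dropped."""
    n_frames = len(bits) // frame_len
    return [bits[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Assumed setting: 2.4 kb/s with 22.5 ms frames.
frame_len = bits_per_frame(2400, 22.5)

# 27 bytes = 216 bits, i.e. four whole frames of raw, ungrouped bits.
frames = split_frames(unpack_bits(bytes(27)), frame_len)
```

Each frame here is just an unstructured bit vector; in the approach described in the abstract, grouping related bits and mapping them to continuous values would be left to the learned dequantization model rather than to a codec-specific table.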
Pages: 244-248
Page count: 5