Speech Signal Segmentation into Vocalized and Unvocalized Segments on the Basis of Simultaneous Masking

Cited by: 1
Authors
Konev, A. A. [1]
Meshcheryakov, R. V. [1]
Kostyuchenko, E. Yu. [1]
Affiliations
[1] Tomsk State Univ Control Syst & Radioelect, Pr Lenina 40, Tomsk 634050, Russia
Keywords
speech signal; simultaneous masking; speech signal segmentation; vocalized and unvocalized segments
DOI
10.3103/S8756699018040076
Chinese Library Classification (CLC) number
O4 [Physics]
Discipline classification code
0702
Abstract
This paper considers a model of simultaneous acoustic masking that identifies the speech signal components perceived by the human auditory system. A simultaneous masking algorithm based on this model is proposed. It is shown that, after simultaneous masking, the signal is reduced to a binary structure that reflects the harmonic structure of a vocalized sequence. Experiments show that this structure can be used to detect the speech segments that are key from the standpoint of auditory perception. The structure serves as the basis for a high-quality algorithm that segments a speech signal into vocalized and unvocalized segments and requires no training before use. The joint use of the simultaneous masking and segmentation algorithms is tested, and their performance is evaluated.
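The abstract does not give the masking model or the decision rule, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a simple triangular spreading function in which each spectral component masks its neighbors at a level that drops linearly (in dB) with distance in frequency bins, and it assumes that a frame is vocalized when the resulting binary structure is sparse, since harmonic peaks mask the valleys between them while noise-like frames leave most components unmasked. All function names, parameter names, and default values (`slope_db_per_bin`, `offset_db`, `range_db`, `max_audible_fraction`) are hypothetical.

```python
import numpy as np

def simultaneous_mask(level_db, slope_db_per_bin=3.0, offset_db=10.0, range_db=60.0):
    """Binary structure of one frame: 1 = component assumed audible, 0 = masked.

    Assumption: bin j masks bin i at level_db[j] - offset_db - slope * |i - j|.
    """
    level_db = np.asarray(level_db, dtype=float)
    bins = np.arange(level_db.size)
    distance = np.abs(bins[:, None] - bins[None, :])
    masking = level_db[None, :] - offset_db - slope_db_per_bin * distance
    np.fill_diagonal(masking, -np.inf)      # a component does not mask itself
    threshold = masking.max(axis=1)         # strongest masker wins at each bin
    floor = level_db.max() - range_db       # crude stand-in for an absolute threshold
    return ((level_db > threshold) & (level_db > floor)).astype(np.uint8)

def is_voiced(mask, max_audible_fraction=0.6):
    """Assumed rule: a sparse, comb-like binary structure indicates a vocalized frame."""
    return bool(mask.mean() < max_audible_fraction)

def segment(signal, frame_len=512, hop=256):
    """Label each frame as vocalized (True) or unvocalized (False); no training step."""
    n = np.arange(frame_len)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len)   # periodic Hann window
    labels = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        level_db = 20.0 * np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)
        labels.append(is_voiced(simultaneous_mask(level_db)))
    return np.array(labels, dtype=bool)

# Synthetic check: a harmonic tone complex versus white noise.
t = np.arange(4096)
harmonic = sum(np.sin(2.0 * np.pi * 8 * k * t / 512) for k in range(1, 32))
noise = np.random.default_rng(0).standard_normal(4096)
voiced_labels = segment(harmonic)
noise_labels = segment(noise)
```

On real speech the thresholds would need tuning, and a perceptual frequency scale (for example Bark bands) would be a more realistic basis for the spreading function than linear FFT bins.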
Pages: 361-366
Number of pages: 6