Speech Signal Segmentation into Vocalized and Unvocalized Segments on the Basis of Simultaneous Masking

Cited by: 1

Authors:

Konev, A. A. [1]

Meshcheryakov, R. V. [1]

Kostyuchenko, E. Yu. [1]

Affiliation:

[1] Tomsk State Univ Control Syst & Radioelect, Pr Lenina 40, Tomsk 634050, Russia

Source:

OPTOELECTRONICS INSTRUMENTATION AND DATA PROCESSING | 2018 / Vol. 54 / No. 4

Keywords:

speech signal; simultaneous masking; speech signal segmentation; vocalized and unvocalized segments

DOI:

10.3103/S8756699018040076

Chinese Library Classification (CLC) number:

O4 [Physics];

Subject classification code:

0702 ;

Abstract:

This paper considers a model of simultaneous acoustic masking that identifies the speech signal components perceived by the human auditory system. A simultaneous masking algorithm based on this model is proposed. It is shown that, after simultaneous masking, a signal becomes a binary structure that reflects the harmonic structure of a vocalized sequence. Experiments show that this structure can be used to detect the speech segments that are key from the standpoint of auditory perception. The structure serves as the basis for an algorithm that segments a speech signal into vocalized and unvocalized segments with high quality and requires no training before use. The joint use of the simultaneous masking and speech signal segmentation algorithms is tested, and their performance is evaluated.
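The abstract gives no formulas, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a toy masking rule in which each spectral component masks its neighbours with a threshold that falls off linearly with frequency distance, and a toy voicing test that checks whether the surviving components are evenly spaced. The function names, the slope, the offset, and the spacing tolerance are all hypothetical.

```python
from statistics import mean, pstdev


def simultaneous_mask(levels_db, freqs_hz, slope_db_per_hz=0.1, offset_db=10.0):
    """Return a binary structure: True where a component survives masking.

    Toy rule (an assumption, not taken from the paper): a component at
    level L masks another component that lies d Hz away if that component
    is below L - slope_db_per_hz * d - offset_db.
    """
    audible = []
    for i, (level, freq) in enumerate(zip(levels_db, freqs_hz)):
        threshold = float("-inf")
        for j, (other_level, other_freq) in enumerate(zip(levels_db, freqs_hz)):
            if i == j:
                continue
            masked_below = (
                other_level - slope_db_per_hz * abs(freq - other_freq) - offset_db
            )
            threshold = max(threshold, masked_below)
        audible.append(level >= threshold)
    return audible


def looks_vocalized(audible, freqs_hz, min_components=3, max_spacing_cv=0.1):
    """Toy voicing test: surviving components are roughly evenly spaced.

    freqs_hz is assumed to be sorted in ascending order.
    """
    kept = [f for f, keep in zip(freqs_hz, audible) if keep]
    if len(kept) < min_components:
        return False
    spacings = [b - a for a, b in zip(kept, kept[1:])]
    # Coefficient of variation of the spacing between surviving components.
    return pstdev(spacings) / mean(spacings) <= max_spacing_cv


freqs = [100, 150, 200, 250, 300, 350, 400]

# Strong harmonics at multiples of 100 Hz with weak components in between.
harmonic_frame = [60, 30, 60, 30, 60, 30, 60]
harmonic_mask = simultaneous_mask(harmonic_frame, freqs)

# Strong components at irregular positions.
irregular_frame = [60, 58, 20, 20, 20, 20, 59]
irregular_mask = simultaneous_mask(irregular_frame, freqs)

print(harmonic_mask, looks_vocalized(harmonic_mask, freqs))
print(irregular_mask, looks_vocalized(irregular_mask, freqs))
```

In this sketch no training step is involved: the decision depends only on the binary structure left after masking, which mirrors the property the abstract highlights. A real system would work on short-time spectra of recorded speech and use a psychoacoustically motivated masking threshold.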

Pages: 361-366

Page count: 6
