Discriminative Training and Unsupervised Adaptation for Labeling Prosodic Events with Limited Training Data

Cited by: 0

Authors:

Fernandez, Raul [1]

Ramabhadran, Bhuvana [1]

Affiliation:

[1] IBM Corp, TJ Watson Res Lab, Yorktown Hts, NY 10598 USA

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010

Keywords:

prosody labeling; conditional random fields;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many applications of spoken-language systems can benefit from having access to annotations of prosodic events. Unfortunately, obtaining human annotations of these events, even in amounts just sufficient to train a supervised system, can be laborious and costly. In this paper we explore applying conditional random fields to automatically label major and minor break indices and pitch accents in a corpus of recorded and transcribed speech, using a large set of fully automatically extracted acoustic and linguistic features. We demonstrate the robustness of these features in a discriminative training framework as the amount of training data is reduced. We also explore adapting the baseline system in an unsupervised fashion to a target dataset for which no prosodic labels are available, and show that, when operating at a point where only limited amounts of data are available, an unsupervised approach can offer up to an additional 3% improvement.
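The abstract does not say how the unsupervised adaptation is carried out. One common way to adapt a classifier to unlabeled target data is self-training, and the sketch below illustrates only that general idea. It is not the authors' method: the conditional random field is replaced by a toy nearest-centroid classifier over a single made-up feature so that the example stays self-contained, and every name in it (`fit_centroids`, `predict`, `self_train`, `min_margin`, the labels `"accent"` and `"none"`, and all data values) is hypothetical.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Return {label: mean feature value} for 1-D features."""
    return {lab: mean(x for x, y in zip(xs, ys) if y == lab) for lab in set(ys)}


def predict(centroids, x):
    """Return (label, margin) for x; assumes at least two labels.

    The margin is the distance to the second-nearest centroid minus the
    distance to the nearest one, used here as a crude confidence score.
    """
    ranked = sorted(centroids, key=lambda lab: abs(x - centroids[lab]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(x - centroids[runner_up]) - abs(x - centroids[best])
    return best, margin


def self_train(xs, ys, unlabeled, min_margin=1.0, rounds=3):
    """Self-training: pseudo-label confident target points, then refit."""
    xs, ys, pool = list(xs), list(ys), list(unlabeled)
    centroids = fit_centroids(xs, ys)
    for _ in range(rounds):
        remaining, added = [], False
        for x in pool:
            label, margin = predict(centroids, x)
            if margin >= min_margin:
                # Confident prediction: treat it as if it were a true label.
                xs.append(x)
                ys.append(label)
                added = True
            else:
                remaining.append(x)
        pool = remaining
        if not added:
            break
        centroids = fit_centroids(xs, ys)
    return centroids


# Hypothetical data: a small labeled set and an unlabeled target set.
labeled_x = [0.0, 1.0, 4.0, 5.0]
labeled_y = ["none", "none", "accent", "accent"]
target_x = [0.5, 1.5, 2.4, 5.5, 6.5]

baseline = fit_centroids(labeled_x, labeled_y)
adapted = self_train(labeled_x, labeled_y, target_x)
```

In this toy run the point `2.4` is too close to the decision boundary to be pseudo-labeled in the first round, and it is only absorbed after the centroids have been refit on the confident points. A real system would need a confidence measure suited to its model, for example posterior marginals from a sequence model, and would have to guard against reinforcing its own errors.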

Pages: 1429-1432

Page count: 4
