Optimized Active Learning Strategy for Audiovisual Speaker Recognition

Cited by: 4

Authors:

Karlos, Stamatis [1]

Kaleris, Konstantinos [2]

Fazakis, Nikos [2]

Kanas, Vasileios G. [2]

Kotsiantis, Sotiris [1]

Affiliations:

[1] Univ Patras, Dept Math, Rion 26504, Achaia, Greece

[2] Univ Patras, Dept Elect & Engn, Rion 26504, Achaia, Greece

Source:

SPEECH AND COMPUTER (SPECOM 2018) | 2018 / Vol. 11096

Keywords:

Active Learning; Optimized learner; Speaker Recognition; Audiovisual features; Support Vector Machines; Hyperopt package tool; SPEECH; EXTRACTION;

DOI:

10.1007/978-3-319-99579-3_30

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work investigates the gain in recognition accuracy obtained by adding optimization stages that tune the parameters of an Active Learning (AL) classifier. Since large amounts of data are often available in Speaker Recognition (SR) tasks, the AL concept, which incorporates a human annotator into its learning loop to uncover information hidden in unlabeled data, appears well suited and demands little expertise from the annotator. Six datasets containing utterances from 8 and 16 speakers under different recording setups are described by audiovisual features and evaluated with the time-efficient Uncertainty Sampling (UncS) query strategy. Both Support Vector Machines (SVMs) and Random Forest (RF) algorithms were tuned over a small subset of the initial training data and then applied iteratively to mine the most suitable instances from a corresponding pool of unlabeled instances. Useful conclusions are drawn about the values of the selected parameters, allowing future optimization attempts to be confined to more restricted regions, while remarkable improvement rates were obtained using an ideal annotator.
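The selection step of Uncertainty Sampling can be illustrated with a minimal sketch. It assumes the least-confidence criterion, which the abstract does not confirm as the measure actually used, and it takes class probabilities as given rather than producing them with a tuned SVM or RF. The function name `least_confident_indices` and the sample probabilities are hypothetical and serve only as illustration.

```python
from typing import List, Sequence


def least_confident_indices(
    probabilities: Sequence[Sequence[float]], batch_size: int = 1
) -> List[int]:
    """Return indices of the pool instances the classifier is least sure about."""
    # Least-confidence score: 1 minus the highest class probability.
    # A higher score means the classifier is less certain about the instance.
    scores = [1.0 - max(p) for p in probabilities]
    # Rank pool indices from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical class probabilities for four unlabeled utterances
# over three speakers (illustrative values, not data from the paper).
pool_probabilities = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
]

# Indices of the two utterances that would be sent to the annotator.
queried = least_confident_indices(pool_probabilities, batch_size=2)
print(queried)
```

In a full AL loop, the queried instances would be labeled by the annotator, moved from the pool to the training set, and the classifier retrained before the next query round.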


Pages: 281-290

Page count: 10
