Variational Bayesian methods for audio indexing

Cited by: 0

Authors:

Valente, F [1]

Wellekens, C [1]

Affiliation:

[1] Inst Eurecom, Sophia Antipolis, France

Source:

MACHINE LEARNING FOR MULTIMODAL INTERACTION | 2005, Vol. 3869

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we investigate the use of Variational Bayesian methods for audio indexing. Variational Bayesian (VB) techniques are approximate techniques for fully Bayesian learning. In contrast to non-Bayesian methods (e.g. Maximum Likelihood) or partially Bayesian criteria (e.g. Maximum a Posteriori), VB benefits from important model selection properties. VB learning is based on optimization of the Free Energy, which can serve at the same time as an objective function and as a model selection criterion, allowing simultaneous model learning and model selection. Here we explore the use of VB learning and VB model selection in a speaker clustering task, comparing results with classical learning techniques (ML and MAP) and classical model selection criteria (BIC). Experiments are run on the NIST 1996 HUB-4 evaluation data set, and results show that VB can outperform classical methods.

Pages: 307-319

Page count: 13
