Fast multimedia contents retrieval by partially spoken query

Cited by: 0

Authors:

Jeong, So-Young ^{[1]}

Han, Icksang ^{[1]}

Kwak, Byung-Kwan ^{[1]}

Cho, Jeongmi ^{[1]}

Kim, Jeongsu ^{[1]}

Affiliation:

[1] Samsung Elect Co Ltd, Samsung Adv Inst Technol, Seoul, South Korea

Source:

IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE 2011) | 2011

Keywords:

DOI:

Not available

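Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808; 0809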

Abstract:

We present novel, fast multi-pass decoding strategies for recognizing large named entities on a low-resource embedded device, and thereby retrieving MP3 music from a spoken query that contains partial segments of full music titles and artist names. After acoustic-phonetic decoding in the first stage, we incorporate word boundary information and a phonetic confusion matrix into the next stage, partial word matching. We then rescore the candidate phone lists using a more complex context-dependent acoustic model, whose outputs are the retrieved songs. We tested our retrieval system on the task of retrieving 1000 songs on a commercial MP3 player and achieved about a 15.5% relative improvement in response time over a conventional frame-based multi-pass decoding method without sacrificing recognition rate.
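The partial word matching stage can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a plain weighted edit distance as the matcher, and the confusion costs, phone symbols, toy catalog, and function names (`sub_cost`, `partial_match_cost`, `shortlist`) are all hypothetical. It assumes the first pass has already produced a phone sequence, and it uses word boundaries only to restrict candidate spans to runs of whole title words. The rescoring pass with a context-dependent acoustic model is not shown.

```python
from math import inf
from typing import Dict, List, Sequence, Tuple

# Hypothetical substitution costs for confusable phone pairs
# (0 = identical, 1 = unrelated). Real values would be estimated from data.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("p", "b"): 0.3,
    ("t", "d"): 0.3,
    ("m", "n"): 0.4,
}

# Toy catalog: each title is a list of words, each word a list of phones.
CATALOG: Dict[str, List[List[str]]] = {
    "yellow submarine": [list("yelo"), list("submarin")],
    "let it be": [list("let"), list("it"), list("bi")],
    "hey jude": [list("he"), list("jud")],
}


def sub_cost(a: str, b: str) -> float:
    """Cost of decoding phone b as phone a, looked up symmetrically."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), CONFUSION_COST.get((b, a), 1.0))


def edit_cost(query: Sequence[str], ref: Sequence[str], indel: float = 1.0) -> float:
    """Weighted edit distance between two phone sequences."""
    prev = [j * indel for j in range(len(ref) + 1)]
    for i, q in enumerate(query, start=1):
        cur = [i * indel]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + indel,              # extra phone in query
                           cur[j - 1] + indel,           # phone missing from query
                           prev[j - 1] + sub_cost(q, r)))  # match or confusion
        prev = cur
    return prev[-1]


def partial_match_cost(query: Sequence[str], words: List[List[str]]) -> float:
    """Best cost over every run of consecutive whole words in a title."""
    best = inf
    for start in range(len(words)):
        span: List[str] = []
        for end in range(start, len(words)):
            span = span + words[end]  # extend the span by one whole word
            best = min(best, edit_cost(query, span))
    return best


def shortlist(query: Sequence[str], catalog: Dict[str, List[List[str]]], n: int = 2) -> List[str]:
    """Return the n titles whose best partial match is cheapest."""
    scored = sorted((partial_match_cost(query, words), title)
                    for title, words in catalog.items())
    return [title for _, title in scored[:n]]
```

For example, a query that speaks only "submarine" and is decoded with two phone confusions, `list("supnarin")`, still ranks "yellow submarine" first, because the confusions are cheap and the match is scored against the single word rather than the full title. In a multi-pass design such a shortlist would then be passed to the more expensive rescoring stage.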


Pages: 839-840

Page count: 2

