Fast multimedia contents retrieval by partially spoken query

Authors
Jeong, So-Young [1]
Han, Icksang [1]
Kwak, Byung-Kwan [1]
Cho, Jeongmi [1]
Kim, Jeongsu [1]
Affiliation
[1] Samsung Electronics Co Ltd, Samsung Advanced Institute of Technology, Seoul, South Korea
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
We present novel fast multi-pass decoding strategies for recognizing large named entities on a low-resource embedded device, and thus for retrieving MP3 music with a spoken query that contains only partial segments of full music titles and artist names. After acoustic-phonetic decoding in the first stage, we incorporate word boundary information together with a phonetic confusion matrix into the next stage, partial word matching. We then rescore the candidate phone lists using a more complex context-dependent acoustic model, whose outputs are the retrieved songs. We tested the retrieval system on the task of retrieving 1000 songs on a commercial MP3 player and achieved a relative improvement of about 15.5% in response time over a conventional frame-based multi-pass decoding method without sacrificing recognition rate.
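The partial word matching stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching algorithm, so this example assumes a confusion-weighted edit distance between the decoded phone string and every word-boundary-aligned span of each title. The names `CONFUSION_COST`, `best_partial_match`, and `retrieve`, the cost values, and the toy catalog are all hypothetical. The first-stage phone decoder and the final context-dependent rescoring are outside the scope of the sketch.

```python
from typing import Dict, List, Tuple

Phones = Tuple[str, ...]

# Hypothetical substitution costs standing in for a phonetic confusion matrix:
# easily confused phone pairs are cheap to substitute, all other pairs cost 1.0.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("b", "p"): 0.3,
    ("d", "t"): 0.3,
    ("m", "n"): 0.4,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of substituting phone a with phone b (symmetric lookup)."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), CONFUSION_COST.get((b, a), 1.0))


def weighted_edit_distance(x: Phones, y: Phones, indel: float = 1.0) -> float:
    """Edit distance between two phone strings with confusion-weighted substitutions."""
    prev = [j * indel for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        cur = [i * indel]
        for j in range(1, len(y) + 1):
            cur.append(min(
                prev[j] + indel,                              # delete x[i-1]
                cur[j - 1] + indel,                           # insert y[j-1]
                prev[j - 1] + sub_cost(x[i - 1], y[j - 1]),   # substitute
            ))
        prev = cur
    return prev[-1]


def best_partial_match(query: Phones, title_words: List[Phones]) -> float:
    """Lowest distance between the query and any run of consecutive title words.

    Candidate spans start and end only at word boundaries, which is one way
    to use word boundary information in partial matching.
    """
    best = float("inf")
    for start in range(len(title_words)):
        span: Phones = ()
        for end in range(start, len(title_words)):
            span += title_words[end]
            best = min(best, weighted_edit_distance(query, span))
    return best


def retrieve(query: Phones, catalog: Dict[str, List[Phones]],
             top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank catalog entries by their best partial-match cost (lower is better)."""
    scored = [(name, best_partial_match(query, words))
              for name, words in catalog.items()]
    return sorted(scored, key=lambda item: item[1])[:top_n]


# Toy catalog with made-up phone transcriptions, one tuple per word.
CATALOG: Dict[str, List[Phones]] = {
    "yellow submarine": [("y", "eh", "l", "ow"),
                         ("s", "ah", "b", "m", "ah", "r", "iy", "n")],
    "let it be": [("l", "eh", "t"), ("ih", "t"), ("b", "iy")],
}

# Decoded phones for the partial query "submarine", with "b" misrecognized as "p".
QUERY: Phones = ("s", "ah", "p", "m", "ah", "r", "iy", "n")
RESULTS = retrieve(QUERY, CATALOG)
```

Here `RESULTS` ranks "yellow submarine" first, because the query matches the second word of that title with a single low-cost b/p substitution. In a full system, the top candidates from this stage would then be passed to the more expensive rescoring pass.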
Pages: 839-840
Number of pages: 2
Related papers
50 in total
  • [31] The MPEG Query Format: Unifying Access to Multimedia Retrieval Systems
    Doeller, Mario
    Tous, Ruben
    Gruhne, Matthias
    Yoon, Kyoungro
    Sano, Masanori
    Burnett, Ian S.
    IEEE MULTIMEDIA, 2008, 15 (04) : 82 - 95
  • [32] Semantic-Driven Multimedia Retrieval with the MPEG Query Format
    Tous, Ruben
    Delgado, Jaime
    SEMANTIC MULTIMEDIA, PROCEEDINGS, 2008, 5392 : 149 - 163
  • [33] Semantic-driven multimedia retrieval with the MPEG Query Format
    Ruben Tous
    Jaime Delgado
    Multimedia Tools and Applications, 2010, 49 : 213 - 233
  • [34] Semantic-driven multimedia retrieval with the MPEG Query Format
    Tous, Ruben
    Delgado, Jaime
    MULTIMEDIA TOOLS AND APPLICATIONS, 2010, 49 (01) : 213 - 233
  • [35] Fast and Effective Retrieval for Large Multimedia Collections
    Wagenpfeil, Stefan
    Vu, Binh
    Mc Kevitt, Paul
    Hemmje, Matthias
    BIG DATA AND COGNITIVE COMPUTING, 2021, 5 (03)
  • [36] A Proposal of Semantic Multimedia Contents Retrieval Framework for Smart TV
    Kim, Myung-Eun
    Cho, Joon-Myun
    Yoo, Jeong-Ju
    Kim, Sang-Ha
    2012 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2012
  • [37] SPOKEN DOCUMENT RETRIEVAL LEVERAGING BERT-BASED MODELING AND QUERY REFORMULATION
    Fan-Jiang, Shao-Wei
    Lo, Tien-Hong
    Chen, Berlin
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8144 - 8148
  • [38] Adaptation to Pronunciation Variations in Indonesian Spoken Query-Based Information Retrieval
    Lestari, Dessi Puji
    Furui, Sadaoki
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09) : 2388 - 2396
  • [39] Visual Concept-based Selection of Query Expansions for Spoken Content Retrieval
    Rudinac, Stevan
    Larson, Martha
    Hanjalic, Alan
    SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL, 2010, : 891 - 892
  • [40] Multimedia Contents Retrieval based on 12-Mood Vector
    Moon, Chang Bae
    Lee, Jong Yeol
    Kim, Dong-Seong
    Kim, Byeong Man
    35TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2021), 2021, : 842 - 844