Fast multimedia contents retrieval by partially spoken query

Cited by: 0
Authors
Jeong, So-Young [1]
Han, Icksang [1]
Kwak, Byung-Kwan [1]
Cho, Jeongmi [1]
Kim, Jeongsu [1]
Affiliation
[1] Samsung Elect Co Ltd, Samsung Adv Inst Technol, Seoul, South Korea
Keywords
DOI
Not available
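Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;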
Abstract
We present novel fast multi-pass decoding strategies for recognizing a large set of named entities on a low-resource embedded device, and thus for retrieving MP3 music from a spoken query that contains partial segments of full music titles and artist names. After acoustic-phonetic decoding in the first stage, we incorporate word boundary information and a phonetic confusion matrix into the next-stage partial word matching. We then rescore the candidate phone lists using a more complex context-dependent acoustic model, whose outputs are the retrieved songs. We tested our retrieval system on the task of retrieving 1000 songs on a commercial MP3 player and achieved about a 15.5% relative improvement in response time over the conventional frame-based multi-pass decoding method, without sacrificing recognition rate.
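The second-stage idea in the abstract (matching a decoded phone sequence against partial segments of titles, using word boundaries and phonetic confusions) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes a confusion-weighted edit distance for whatever matching procedure the paper uses, and the phone symbols, the `CONFUSION_COST` values, the `CATALOG` entries, and all function names are hypothetical choices made for illustration. The third-stage rescoring with a context-dependent acoustic model is not shown.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical substitution costs standing in for a phonetic confusion
# matrix: easily confused phone pairs are cheaper to substitute.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("p", "b"): 0.3, ("b", "p"): 0.3,
    ("t", "d"): 0.3, ("d", "t"): 0.3,
    ("m", "n"): 0.4, ("n", "m"): 0.4,
}

# Hypothetical catalog: each title is a list of words, each word a phone list.
CATALOG: Dict[str, List[List[str]]] = {
    "yellow submarine": [["y", "eh", "l", "ow"],
                         ["s", "ah", "b", "m", "ah", "r", "iy", "n"]],
    "let it be": [["l", "eh", "t"], ["ih", "t"], ["b", "iy"]],
    "hey jude": [["hh", "ey"], ["jh", "uw", "d"]],
}


def sub_cost(a: str, b: str) -> float:
    """Cost of substituting phone a with phone b (0 if identical)."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), 1.0)


def weighted_edit_distance(hyp: Sequence[str], ref: Sequence[str],
                           indel: float = 1.0) -> float:
    """Edit distance with confusion-weighted substitutions."""
    prev = [j * indel for j in range(len(ref) + 1)]
    for i, h in enumerate(hyp, 1):
        curr = [i * indel]
        for j, r in enumerate(ref, 1):
            curr.append(min(prev[j] + indel,             # extra phone in hyp
                            curr[j - 1] + indel,         # phone missing from hyp
                            prev[j - 1] + sub_cost(h, r)))
        prev = curr
    return prev[-1]


def best_partial_score(hyp: Sequence[str],
                       title_words: Sequence[Sequence[str]]) -> float:
    """Lowest distance between hyp and any contiguous run of whole words.

    Restricting candidate spans to word boundaries is what allows a
    partial query (e.g. one word of a title) to match cheaply.
    """
    best = float("inf")
    for start in range(len(title_words)):
        span: List[str] = []
        for end in range(start, len(title_words)):
            span = span + list(title_words[end])
            best = min(best, weighted_edit_distance(hyp, span))
    return best


def retrieve(hyp: Sequence[str], catalog: Dict[str, List[List[str]]],
             top_k: int = 3) -> List[str]:
    """Return the top_k titles ranked by best partial-match score."""
    scored = sorted((best_partial_score(hyp, words), title)
                    for title, words in catalog.items())
    return [title for _, title in scored[:top_k]]


# A partial query for "submarine", decoded with /b/ misrecognized as /p/.
query = ["s", "ah", "p", "m", "ah", "r", "iy", "n"]
print(retrieve(query, CATALOG, top_k=1))
```

In this sketch the query matches the single word "submarine" at the cost of one cheap /p/ for /b/ substitution, so "yellow submarine" ranks first even though only part of the title was spoken. A real system would estimate the confusion costs from recognition errors rather than hard-code them.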
Pages: 839-840
Page count: 2