Subword Lexical Chaining for Automatic Story Segmentation in Chinese Broadcast News

Authors
Xie, Lei [1 ]
Yang, Yulian [1 ]
Zeng, Jia [2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp ASLP, Xian 710072, Peoples R China
[2] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Story segmentation; topic segmentation; spoken document retrieval; multimedia; Chinese;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We present a subword lexical chaining approach to automatic story segmentation of Chinese broadcast news (BN). Conventional lexical chains link related words through cohesion (e.g., word repetition), and points where many chains start or end are indicative of story boundaries. However, unavoidable speech recognition errors in BN transcripts may destroy the cohesiveness of words, resulting in word match failures. We show that Chinese subwords (characters and syllables) are robust for lexical matching in errorful ASR transcripts. This motivates us to detect story boundaries on chains formed from character and syllable n-gram units. Experimental results on the TDT2 Mandarin corpus show that chaining by character unigrams gives the best story segmentation performance, with a relative F-measure improvement of 6.06% over conventional word chaining. Integrating multiple scales (words and subwords) yields further improvement. For example, fusion by voting across scales achieves an F-measure gain of 9.04% over words.
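The chaining idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes chains are formed purely by repetition of character n-grams, that a chain is broken after a gap of more than `max_gap` sentences, and that a boundary score is the number of chains ending just before a sentence plus the number starting at it. The function names, the `max_gap` parameter, and the toy sentences are all hypothetical.

```python
from collections import defaultdict


def char_ngrams(text, n=1):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_chains(sentences, n=1, max_gap=2):
    """Link repeated n-grams into chains of (unit, start, end) sentence indices.

    A chain is split when its unit is absent for more than max_gap
    sentences (an assumed heuristic). Units seen in only one sentence
    form no chain.
    """
    positions = defaultdict(list)
    for idx, sent in enumerate(sentences):
        for unit in set(char_ngrams(sent, n)):
            positions[unit].append(idx)

    chains = []
    for unit, idxs in positions.items():
        start = prev = idxs[0]
        for i in idxs[1:]:
            if i - prev > max_gap:
                if prev > start:
                    chains.append((unit, start, prev))
                start = i
            prev = i
        if prev > start:
            chains.append((unit, start, prev))
    return chains


def boundary_scores(sentences, n=1, max_gap=2):
    """Score position i by chains ending at i - 1 plus chains starting at i."""
    scores = [0] * len(sentences)
    for _, start, end in build_chains(sentences, n, max_gap):
        scores[start] += 1
        if end + 1 < len(sentences):
            scores[end + 1] += 1
    return scores


# Toy transcript: two sentences about the stock market, two about a typhoon.
sentences = ["股市上涨", "股市收盘", "台风登陆", "台风减弱"]
scores = boundary_scores(sentences)
print(scores)  # [2, 0, 4, 0]
```

Index 0 is the start of the transcript, so only positions 1 onward are boundary candidates. Here position 2 has the highest score, matching the topic change. Calling `boundary_scores` with `n=2` would chain character bigrams instead, and scores from several values of `n` could be combined by voting, in the spirit of the multi-scale fusion the abstract mentions.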
Pages: 248 / +
Number of pages: 3
Related papers
50 in total
  • [1] A Subword Normalized Cut Approach to Automatic Story Segmentation of Chinese Broadcast News
    Zhang, Jin
    Xie, Lei
    Feng, Wei
    Zhang, Yanning
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2009, 5839 : 136 - +
  • [2] SUBWORD LATENT SEMANTIC ANALYSIS FOR TEXTTILING-BASED AUTOMATIC STORY SEGMENTATION OF CHINESE BROADCAST NEWS
    Yang, Yulian
    Xie, Lei
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 358 - 361
  • [3] Lexical Story Co-Segmentation of Chinese Broadcast News
    Feng, Wei
    Nie, Xuecheng
    Wan, Liang
    Xie, Lei
    Jiang, Jianmin
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2283 - 2286
  • [4] On the effectiveness of subwords for lexical cohesion based story segmentation of Chinese broadcast news
    Xie, L.
    Yang, Y. -L.
    Liu, Z. -Q.
    INFORMATION SCIENCES, 2011, 181 (13) : 2873 - 2891
  • [5] Multi-scale TextTiling for automatic story segmentation in Chinese broadcast news
    Xie, Lei
    Zeng, Jia
    Feng, Wei
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 345 - +
  • [6] Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News
    Xie, Lei
    Zheng, Lilei
    Liu, Zihan
    Zhang, Yanning
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01): : 276 - 289
  • [7] Self-validated Story Segmentation of Chinese Broadcast News
    Feng, Wei
    Xie, Lei
    Zhang, Jin
    Zhang, Yujun
    Zhang, Yanning
    ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, BICS 2018, 2018, 10989 : 568 - 578
  • [8] Story Segmentation in TV News Broadcast
    Kannao, Raghvendra
    Guha, Prithwijit
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 2948 - 2953
  • [9] Broadcast news navigation using story segmentation
    Merlino, A
    Morey, D
    Maybury, M
    ACM MULTIMEDIA 97, PROCEEDINGS, 1997, : 381 - 391
  • [10] Discovering salient prosodic cues and their interactions for automatic story segmentation in Mandarin broadcast news
    Xie, Lei
    MULTIMEDIA SYSTEMS, 2008, 14 (04) : 237 - 253