Mining Summary of Short Text with Centroid Similarity Distance

Cited by: 1
Authors
Franciscus, Nigel [1]
Wang, Junhu [1]
Stantic, Bela [1]
Affiliations
[1] Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia
Source
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2019 | 2019 / Vol. 11888
Keywords
Text summarization; Short text; Word embeddings;
DOI
10.1007/978-3-030-35231-8_32
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization aims at producing a concise summary that preserves key information. Many textual inputs are short and are poorly suited to standard techniques designed for longer text. Most existing short text summarization approaches rely on metadata such as authors or reply networks. However, not all raw textual data can provide such information. In this paper, we present a method for summarizing short text using a centroid-based approach with word embeddings. In particular, we consider the task when there is no metadata other than the text itself. We show that the centroid embeddings approach can be applied to short text to capture semantically similar sentences for summarization. With an additional clustering strategy, we were able to identify relevant sub-topics that further improve the context diversity of the overall summary. The empirical evaluation demonstrates that our approach can outperform other methods on two annotated LREC track datasets.
Pages: 447-461
Page count: 15
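
Illustrative sketch
The centroid-based idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy three-dimensional word vectors, the helper names (`embed`, `cosine`, `summarize`), and the choice of averaging word vectors and ranking by cosine similarity are all assumptions made for the example. The clustering step for sub-topics mentioned in the abstract is omitted.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "flood": [0.9, 0.1, 0.0],
    "rain": [0.8, 0.2, 0.1],
    "storm": [0.85, 0.15, 0.05],
    "city": [0.5, 0.5, 0.1],
    "pizza": [0.0, 0.1, 0.9],
    "tasty": [0.1, 0.0, 0.8],
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def summarize(texts, k=2):
    """Return the k texts whose embeddings are most similar to the centroid."""
    embedded = [(t, embed(t)) for t in texts]
    embedded = [(t, v) for t, v in embedded if v is not None]
    if not embedded:
        return []
    dim = len(embedded[0][1])
    # Centroid: component-wise mean of all text embeddings.
    centroid = [sum(v[i] for _, v in embedded) / len(embedded) for i in range(dim)]
    ranked = sorted(embedded, key=lambda tv: cosine(tv[1], centroid), reverse=True)
    return [t for t, _ in ranked[:k]]


posts = [
    "flood rain city",
    "storm rain flood",
    "pizza tasty",
    "storm city",
]
summary = summarize(posts, k=2)
print(summary)
```

With these toy vectors, the off-topic post "pizza tasty" lies far from the centroid and falls outside the top two. Only the text itself is used, matching the no-metadata setting the abstract describes.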
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2011, 2 (11) : 115 - 121