Automatic Topic Modeling for Single Document Short Texts

Cited by: 1
Authors
Sajid, Anamta [1]
Jan, Sadaqat [1]
Shah, Ibrar A. [1]
Affiliation
[1] Univ Engn & Technol, Dept Comp Software Engn, Mardan Campus, Mardan, Pakistan
Keywords
Data mining; text mining; topic modeling
DOI
10.1109/FIT.2017.00020
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper presents a novel approach to automatically extracting the topic and main title from a single-document short text. The proposed approach uses online text mining and Natural Language Processing techniques. A title offers an efficient way to grasp the contents of a text at a glance, which is quicker than reading a summary. Three different mechanisms are proposed, implemented, and compared to find the best approach for automatically extracting the topic most relevant to the overall event described in the text. The proposed system is evaluated on fifteen news articles from The New York Times. The significance of the paper is twofold. First, these automatic topic extraction techniques can be used for document classification, document relevance and similarity, summarization, gaining a comprehensive grasp of an event, and finding novelty in large, scattered text data by scanning titles. Second, the detailed analysis of various data mining techniques can serve as a roadmap for new researchers. The experimental results show that nouns are the most relevant, reliable, and suitable words for finding the topic of a text.
Pages: 70-75
Page count: 6
Related papers
50 in total
  • [1] Topic Modeling of Short Texts: A Pseudo-Document View
    Zuo, Yuan
    Wu, Junjie
    Zhang, Hui
    Lin, Hao
    Wang, Fei
    Xu, Ke
    Xiong, Hui
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 2105 - 2114
  • [2] Topic Modeling for Short Texts via Word Embedding and Document Correlation
    Yi, Feng
    Jiang, Bo
    Wu, Jianjun
    IEEE ACCESS, 2020, 8 : 30692 - 30705
  • [3] Online Topic Modeling for Short Texts
    Roy, Suman
    Malladi, Vijay Varma
    Sengupta, Ayan
    Das, Souparna
    SERVICE-ORIENTED COMPUTING (ICSOC 2020), 2020, 12571 : 563 - 579
  • [4] Topic Modeling of Short Texts: A Pseudo-Document View With Word Embedding Enhancement
    Zuo, Yuan
    Li, Congrui
    Lin, Hao
    Wu, Junjie
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (01) : 972 - 985
  • [5] BTM: Topic Modeling over Short Texts
    Cheng, Xueqi
    Yan, Xiaohui
    Lan, Yanyan
    Guo, Jiafeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (12) : 2928 - 2941
  • [6] SBTM: Topic Modeling over Short Texts
    Pang, Jianhui
    Li, Xiangsheng
    Xie, Haoran
    Rao, Yanghui
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2016, 2016, 9645 : 43 - 56
  • [7] Topic modeling methods for short texts: A survey
    Fan, Yuwei
    Shi, Lei
    Yuan, Lu
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (02) : 1971 - 1990
  • [8] A generalized topic modeling approach for automatic document annotation
    Tuarob, Suppawong
    Pouchard, Line C.
    Mitra, Prasenjit
    Giles, C. Lee
    INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES, 2015, 16 (02) : 111 - 128
  • [9] Topic Modeling for Short Texts with Large Language Models
    Doi, Tomoki
    Isonuma, Masaru
    Yanaka, Hitomi
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 4: STUDENT RESEARCH WORKSHOP, 2024, : 21 - 33
  • [10] Targeted aspects oriented topic modeling for short texts
    Jin He
    Lei Li
    Yan Wang
    Xindong Wu
    Applied Intelligence, 2020, 50 : 2384 - 2399