Automatic Topic Modeling for Single Document Short Texts

Cited by: 1
Authors
Sajid, Anamta [1]
Jan, Sadaqat [1]
Shah, Ibrar A. [1]
Affiliation
[1] Univ Engn & Technol, Dept Comp Software Engn, Mardan Campus, Mardan, Pakistan
Keywords
Data mining; text mining; topic modeling
DOI
10.1109/FIT.2017.00020
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper presents a novel approach to automatically extracting the topic and main title from a single-document short text. The proposed approach uses online text mining and Natural Language Processing techniques. A title lets the reader grasp a concise overview of a text's contents at a glance, which is quicker than reading a summary. Three different mechanisms are proposed, implemented, and compared to find the best approach for automatically extracting the topic most relevant to the overall event described in the text. The proposed system is evaluated on fifteen news articles from The New York Times. The significance of the paper is twofold. First, these automatic topic extraction techniques can be used for document classification, document relevance and similarity, summarization, gaining a comprehensive grasp of an event, and finding novelty in large, scattered text data by scanning titles. Second, its detailed analysis of various data mining techniques can serve as a roadmap for new researchers. The experimental results show that nouns are more relevant, reliable, and suitable than other kinds of words for finding the topic of a text.
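The abstract's closing finding, that nouns are the most useful words for identifying a topic, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the three mechanisms in detail. The function name rank_noun_candidates, the hand-tagged sample tokens, and the use of Penn Treebank noun tags (those starting with "NN") are all assumptions made for illustration. In practice the (word, tag) pairs would come from a part-of-speech tagger.

```python
from collections import Counter


def rank_noun_candidates(tagged_tokens, k=3):
    """Return the k most frequent nouns from (word, tag) pairs.

    Assumes Penn Treebank style tags, where every noun tag starts
    with "NN" (NN, NNS, NNP, NNPS). This is a hypothetical helper,
    not the method described in the paper.
    """
    # Keep only nouns, lowercased so "Flood" and "flood" count together.
    nouns = [word.lower() for word, tag in tagged_tokens if tag.startswith("NN")]
    # Rank the surviving nouns by how often they occur.
    return [word for word, _count in Counter(nouns).most_common(k)]


# Hand-tagged toy input (an assumption; a real pipeline would run a POS tagger).
tagged = [
    ("Flood", "NN"), ("hits", "VBZ"), ("city", "NN"),
    ("The", "DT"), ("flood", "NN"), ("closed", "VBD"),
    ("roads", "NNS"), ("in", "IN"), ("the", "DT"), ("city", "NN"),
    ("Flood", "NN"), ("warnings", "NNS"), ("remain", "VBP"),
]

top_two = rank_noun_candidates(tagged, k=2)
print(top_two)
```

Frequency is only one plausible ranking signal. It is used here because it keeps the sketch short; the paper may rank or combine noun candidates differently.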
Pages: 70-75
Page count: 6