Semantically-Enhanced Topic Modeling

Cited by: 10
Authors
Viegas, Felipe [1]
Luiz, Washington [1]
Gomes, Christian [2]
Khatibi, Amir [1]
Canuto, Sergio [3]
Mourao, Fernando [4]
Salles, Thiago [1]
Rocha, Leonardo [2]
Goncalves, Marcos Andre [1]
Affiliations
[1] Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil
[2] Univ Fed Sao Joao del Rei, Sao Joao del Rei, Brazil
[3] IFG, Luziania, Brazil
[4] Seek AI Labs, Belo Horizonte, MG, Brazil
Keywords
Topic Modeling; Word Embeddings; Bag of Words
DOI
10.1145/3269206.3271797
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we advance the state of the art in topic modeling by designing and developing a novel (semi-formal) general topic modeling framework. The novel contributions of our solution include: (i) the introduction of new semantically-enhanced data representations for topic modeling based on pooling, and (ii) the proposal of a novel topic extraction strategy, ASToC, that solves the difficulty in representing topics in our semantically-enhanced information space. In our extensive experimental evaluation, covering 12 datasets and 12 state-of-the-art baselines, totaling 108 tests, we exceed the baselines (with a few ties) in almost 100 cases, with gains of more than 50% against the best baselines (achieving up to 80% against some runners-up). We provide qualitative and quantitative statistical analyses of why our solutions work so well. Finally, we show that our method is able to improve document representation in automatic text classification.
Pages: 893-902
Number of pages: 10