An effective short text conceptualization based on new short text similarity

Cited by: 6
Authors
Bekkali, Mohammed [1]
Lachkar, Abdelmonaime [2]
Affiliations
[1] USMBA, ENSA, LISA Lab, Fes, Morocco
[2] AEU, ENSA, Tangier, Morocco
Keywords
Arabic language; Conceptualization; Word sense disambiguation; Short text similarity; Rough set theory;
DOI
10.1007/s13278-018-0544-8
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Short text messages, tweets, comments and similar items have recently become a large portion of online text data. They are limited in length and differ from traditional documents in their shortness and sparseness. As a result, short text tends to be ambiguous, and the degree of ambiguity is not the same for all languages. Because Arabic is a highly inflectional language in which a single word can have multiple meanings, short text representation plays a vital role in any text mining task. To address these issues, we propose an efficient representation for short text based on concepts instead of terms, using BabelNet as an external knowledge source. However, during conceptualization, searching for the concepts that correspond to a polysemous term yields multiple matches. Assigning a term to a concept is therefore a crucial step, and we believe that short text similarity can help overcome the problem of mapping a term to its corresponding concept. In this paper, we reintroduce the Web-based Kernel function for measuring the semantic relatedness between concepts in order to disambiguate an expression against multiple candidate concepts. The proposed method has been evaluated using an Arabic short text categorization system, and the results obtained illustrate the value of our contribution.
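The disambiguation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the kernel resembles the commonly described web-based kernel, in which each short text is expanded into a set of retrieved snippets, the snippet vectors are L2-normalized and summed into a centroid, and the similarity is the dot product of the normalized centroids. The snippets below are hard-coded, hypothetical stand-ins for search results and for BabelNet glosses; plain term frequency replaces TF-IDF, and the names `centroid`, `kernel`, and `disambiguate` are illustrative only.

```python
import math
from collections import Counter


def centroid(snippets):
    """Return the L2-normalized sum of L2-normalized term-frequency vectors."""
    total = Counter()
    for snippet in snippets:
        vec = Counter(snippet.lower().split())
        norm = math.sqrt(sum(v * v for v in vec.values()))
        if norm == 0:
            continue  # skip empty snippets
        for term, count in vec.items():
            total[term] += count / norm
    norm = math.sqrt(sum(v * v for v in total.values()))
    return {t: v / norm for t, v in total.items()} if norm else {}


def kernel(snippets_a, snippets_b):
    """Dot product of the two normalized centroids (a value in [0, 1])."""
    a, b = centroid(snippets_a), centroid(snippets_b)
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def disambiguate(context_snippets, candidates):
    """Pick the candidate concept whose snippets best match the context."""
    return max(candidates, key=lambda c: kernel(context_snippets, candidates[c]))


# Hypothetical expansion of a short text that contains the term "bank".
context = ["deposit money bank account", "bank loan interest money"]

# Hypothetical candidate concepts with made-up descriptive snippets.
candidates = {
    "bank_finance": ["financial institution money deposit loan"],
    "bank_river": ["sloping land beside river water"],
}

best = disambiguate(context, candidates)
print(best)
```

In this toy setting the financial sense is selected because it shares terms with the expanded context, while the river sense shares none. A real system would replace the hard-coded snippets with retrieved text and would need Arabic-specific tokenization and normalization.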
Pages: 11
Related papers
50 in total
  • [31] Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity. Evaluation of similarity methods on a new short text task
    Donner, Paul
    QUANTITATIVE SCIENCE STUDIES, 2021, 2 (03):
  • [32] A New Method for Short Text Compression
    Aslanyurek, Murat
    Mesut, Altan
    IEEE ACCESS, 2023, 11 : 141022 - 141035
  • [33] SyMSS: A syntax-based measure for short-text semantic similarity
    Oliva, Jesus
    Ignacio Serrano, Jose
    Dolores del Castillo, Maria
    Iglesias, Angel
    DATA & KNOWLEDGE ENGINEERING, 2011, 70 (04) : 390 - 405
  • [34] GCNs-Based Context-Aware Short Text Similarity Model
    Sun, Xiaoqi
    Wu, Shaochun
    Liu, Yue
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 1329 - 1335
  • [35] Research on Semantic Similarity of Short Text Based on Bert and Time Warping Distance
    Qiu, Shijie
    Niu, Yan
    Li, Jun
    Li, Xing
    JOURNAL OF WEB ENGINEERING, 2021, 20 (08) : 2521 - 2543
  • [36] Chinese Short Text Entity Linking Based On Semantic Similarity and Entity Correlation
    Zhao, Yan
    Wang, Yun
    Yang, Na
    2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 426 - 431
  • [37] Similarity Calculation Method of Chinese Short Text Based on Semantic Feature Space
    Pan, Liqiang
    Zhang, Pu
    Xiong, Anping
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2015, 6 (02) : 306 - 310
  • [38] Chinese Web Short Text Subject Clustering Based on Similarity Upper Approximation
    Zhu, JiaWei
    Zhang, YunHua
    2017 INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS, ELECTRONICS AND CONTROL (ICCSEC), 2017, : 1307 - 1310
  • [39] Diagnosis of Depression Based on Short Text
    Zheng, Jinghua
    Bian, Jianli
    Jia, Jincheng
    HUMAN CENTERED COMPUTING, 2019, 11956 : 637 - 646
  • [40] Short Text Classification Based on Semantics
    Ma, Chenglong
    Wan, Xin
    Zhang, Zhen
    Li, Taisong
    Zhang, Yan
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, ICIC 2015, PT III, 2015, 9227 : 463 - 470