SIMTEX: An Approach for Detecting and Measuring Textual Similarity based on Discourse and Semantics

Cited by: 3
Authors
da Cunha, Iria [1]
Vivaldi, Jorge [1]
Torres-Moreno, Juan-Manuel [2, 3]
Sierra, Gerardo [2, 4]
Affiliations
[1] Univ Pompeu Fabra, Univ Inst Appl Linguist, Barcelona, Spain
[2] Univ Avignon & Pays Vaucluse, Agorantic, LIA, Avignon, France
[3] Ecole Polytech Montreal, Montreal, PQ, Canada
[4] Univ Nacl Autonoma Mexico, Inst Ingn, Mexico City, DF, Mexico
Source
COMPUTACION Y SISTEMAS | 2014 / Vol. 18 / No. 3
Keywords
Textual similarity; discourse; semantics; paraphrase
DOI
10.13053/CyS-18-3-2033
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automatic systems for detecting and measuring textual similarity are currently being developed for a range of tasks in Natural Language Processing (NLP). Most of these systems rely on surface linguistic features or statistical information; few researchers use deep linguistic information. In this work, we present an algorithm for detecting and measuring textual similarity that takes into account the information offered by the discourse relations of Rhetorical Structure Theory (RST) and the lexical-semantic relations included in EuroWordNet. We apply the algorithm, called SIMTEX, to texts written in Spanish, but the methodology is potentially language-independent.
Pages: 505-516
Page count: 12