Semantic textual similarity between sentences using bilingual word semantics

Cited by: 21
Authors
Shajalal, Md [1]
Aono, Masaki [2]
Affiliations
[1] Bangladesh Agr Univ, Dept Comp Sci & Math, Mymensingh 2202, Bangladesh
[2] Toyohashi Univ Technol, Dept Comp Sci & Engn, Toyohashi, Aichi, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Semantic similarity; Word semantics; Word-embedding; Textual similarity; Bilingual semantics;
DOI
10.1007/s13748-019-00180-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semantic textual similarity between sentences is indispensable for many information retrieval tasks. Traditional lexical similarity measures cannot compute similarity beyond a trivial level. Moreover, they capture only textual similarity, not semantic similarity. In this paper, we propose a method for semantic textual similarity that leverages bilingual word-level semantics to compute the semantic similarity between sentences. To capture word-level semantics, we employ distributed representations of words in two different languages. A similarity function based on the concept-to-concept relationships corresponding to the words is also used for the same purpose. Multiple new semantic similarity measures are introduced based on word-embedding models trained on two different corpora in two different languages. In addition, another new semantic similarity measure is introduced using word sense comparison. The similarity score between the sentences is then computed by applying a linear ranking approach to all proposed measures, with their importance scores estimated by a supervised feature selection technique. We conducted experiments on the SemEval Semantic Textual Similarity (STS-2017) test collections. The experimental results demonstrate that our method is effective for measuring semantic textual similarity and outperforms some known related methods.
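The abstract outlines a pipeline in which several embedding-based similarity measures are combined linearly using importance scores. The following is a minimal sketch of that general idea, not the authors' implementation: the averaging of word vectors, the toy embeddings, the measure names (`lang_a`, `lang_b`), and the weight values are all assumptions made for illustration. The abstract does not specify how sentences are represented in the second language or how the importance scores are computed.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is empty or zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens found in the embedding table.
    Averaging is an assumption; the abstract does not name the aggregation."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return []
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def combined_similarity(scores, weights):
    """Weighted linear combination of per-measure scores, normalized by total weight."""
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total

# Hypothetical toy embedding tables for two languages (values are made up).
emb_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
emb_b = {"neko": [1.0, 0.1], "inu": [0.8, 0.2], "kuruma": [0.1, 1.0]}

# Each sentence is assumed to be available as tokens in both languages already.
scores = {
    "lang_a": cosine(sentence_vector(["cat"], emb_a), sentence_vector(["dog"], emb_a)),
    "lang_b": cosine(sentence_vector(["neko"], emb_b), sentence_vector(["inu"], emb_b)),
}
weights = {"lang_a": 0.6, "lang_b": 0.4}  # hypothetical importance scores
print(combined_similarity(scores, weights))
```

In a real setting the weights would come from a supervised feature selection step on labeled sentence pairs, as the abstract describes, rather than being set by hand.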
Pages: 263-272
Page count: 10
Related papers
50 in total
  • [21] Bilingual Textual Similarity in Scientific Documents
    Kawamura, Takahiro
    Egami, Shusaku
    IEEE TRANSACTIONS ON ENGINEERING MANAGEMENT, 2021, 68 (05) : 1299 - 1308
  • [22] Semantic Similarity between Turkish and European Languages Using Word Embeddings
    Senel, Lutfi Kerem
    Yucesoy, Veysel
    Koc, Aykut
    Cukur, Tolga
    2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
  • [23] Supervised Learning to Measure the Semantic Similarity Between Arabic Sentences
    Wali, Wafa
    Gargouri, Bilel
    Ben Hamadou, Abdelmajid
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2015), PT I, 2015, 9329 : 158 - 167
  • [24] Semantic similarity between sentences through approximate tree matching
    Ribadas, FJ
    Vilares, M
    Vilares, J
    PATTERN RECOGNITION AND IMAGE ANALYSIS, PT 2, PROCEEDINGS, 2005, 3523 : 638 - 646
  • [25] Question Similarity Detection in Turkish Using Semantic Textual Similarity Methods
    Yildiz, Eray
    Findik, Yasin
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [26] Calculation of Textual Similarity Using Semantic Relatedness Functions
    Kairaldeen, Ammar Riadh
    Ercan, Gonenc
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 516 - 524
  • [27] Textual Similarity for Word Sequences
    Konaka, Fumito
    Miura, Takao
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2015, 2015, 9371 : 244 - 249
  • [28] Semantic similarity measures for Malay sentences
    Noah, Shahrul Azman
    Amruddin, Amru Yusrin
    Omar, Nazlia
    ASIAN DIGITAL LIBRARIES: LOOKING BACK 10 YEARS AND FORGING NEW FRONTIERS, PROCEEDINGS, 2007, 4822 : 117 - 126
  • [29] Measuring semantic similarity within sentences
    Liu, Xiao-Ying
    Zhou, Yi-Ming
    Zheng, Ruo-Shi
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 2558 - +
  • [30] An Integrated Approach for Measuring Semantic Similarity between Words and Sentences using Web Search Engine
    Adhikesavan, Kavitha
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2015, 12 (06) : 589 - 596