A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

Cited by: 33
Authors
Lee, Ming Che [1]
Chang, Jia Wei [2]
Hsieh, Tung Cheng [3]
Affiliations
[1] Ming Chuan Univ, Dept Comp & Commun Engn, Taoyuan 333, Taiwan
[2] Natl Cheng Kung Univ, Dept Engn Sci, Tainan 701, Taiwan
[3] Hsuan Chuang Univ, Dept Visual Commun Design, Hsinchu 300, Taiwan
Source
SCIENTIFIC WORLD JOURNAL | 2014
Keywords
INFORMATION; PRINCIPLES; EXTRACTION; RETRIEVAL; WORDNET;
DOI
10.1155/2014/437162
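Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification code
07; 0710; 09;
Abstract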
This paper presents a similarity algorithm for natural language sentences based on grammar and a semantic corpus. Natural language, as opposed to "artificial language" such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even ontology-based approaches that compare concept similarity rather than co-occurring terms or words, may not always find the best match when there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of a corpus-based ontology and grammatical rules to overcome these problems. Experiments on two well-known benchmarks demonstrate that the proposed algorithm significantly improves performance on sentences and short texts with arbitrary syntax and structure.
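The limitation of plain vector models mentioned in the abstract can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it is only a bag-of-words cosine baseline, and the function name `bow_cosine` and the two example sentences are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity of two sentences under a simple bag-of-words model.

    Hypothetical helper for illustration only; whitespace tokenization and
    lowercasing are simplifying assumptions.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only words present in both sentences contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Two sentences with related meaning but no shared words.
s1 = "the physician examined the child"
s2 = "a doctor checked a kid"
print(bow_cosine(s1, s2))  # 0.0, because the word sets do not overlap
```

Because the score depends only on shared surface words, the two sentences above receive a similarity of zero even though a reader would judge them close in meaning. This is the kind of case that concept-based and grammar-based measures aim to handle.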
Pages: 17