Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity

Cited by: 3
Authors
Mahmoud, Adnen [1]
Zrigui, Mounir [2]
Affiliations
[1] Higher Inst Comp Sci & Commun Tech, Monastir, Tunisia
[2] Fac Sci Monastir, Monastir, Tunisia
Keywords
Arabic Language; Context Based Approach; Global Vectors Representation; Natural Language Processing; Paraphrase Detection; Semantic Similarity; Word Embedding; Word2vec;
DOI
10.4018/IJCINI.2020010103
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The problem addressed is to develop a model that can reliably identify whether a previously unseen pair of documents is a paraphrase or not. Paraphrase detection in Arabic documents is challenging because of the variability in features and the lack of publicly available corpora. To address these problems, the authors propose a semantic approach. At the feature extraction level, the authors use the global vectors representation (GloVe), which combines global co-occurrence counts with a contextual skip-gram model. At the paraphrase identification level, the authors apply a convolutional neural network to learn more contextual and semantic information between documents. For the experiments, the authors use the Open Source Arabic Corpora as the source corpus and collect different datasets to build a vocabulary model. To construct the paraphrased corpus, the authors replace each word in the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising results in terms of precision and recall compared with existing approaches in the literature.
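The corpus-construction step described in the abstract (replace each word with its most similar word of the same grammatical class) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors, the part-of-speech table, the English tokens, and the function names `cosine`, `most_similar_same_pos`, and `paraphrase` are all hypothetical stand-ins for trained word2vec embeddings and a real Arabic part-of-speech tagger.

```python
import math

# Hypothetical toy embeddings standing in for trained word2vec vectors.
VECTORS = {
    "big":   [0.90, 0.10, 0.00],
    "large": [0.85, 0.15, 0.05],
    "grow":  [0.80, 0.20, 0.10],
    "house": [0.10, 0.90, 0.20],
    "home":  [0.12, 0.88, 0.25],
}

# Hypothetical part-of-speech tags standing in for a tagger's output.
POS = {"big": "ADJ", "large": "ADJ", "grow": "VERB",
       "house": "NOUN", "home": "NOUN"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_same_pos(word):
    """Return the nearest vocabulary word sharing the word's POS tag.

    Words without a vector, or without any same-class candidate,
    are returned unchanged.
    """
    if word not in VECTORS:
        return word
    candidates = [w for w in VECTORS
                  if w != word and POS.get(w) == POS.get(word)]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(VECTORS[word], VECTORS[w]))


def paraphrase(tokens):
    """Build a paraphrased token sequence by word-level substitution."""
    return [most_similar_same_pos(t) for t in tokens]
```

In this toy setup, "grow" lies close to "big" in vector space but is never chosen as its substitute, because the grammatical-class filter excludes candidates with a different tag. A real pipeline would presumably draw neighbours from a trained embedding model rather than a fixed dictionary.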
Pages: 35-50
Number of pages: 16