Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity

Cited by: 3
Authors
Mahmoud, Adnen [1]
Zrigui, Mounir [2]
Affiliations
[1] Higher Inst Comp Sci & Commun Tech, Monastir, Tunisia
[2] Fac Sci Monastir, Monastir, Tunisia
Keywords
Arabic Language; Context Based Approach; Global Vectors Representation; Natural Language Processing; Paraphrase Detection; Semantic Similarity; Word Embedding; Word2vec
DOI
10.4018/IJCINI.2020010103
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The problem addressed is to develop a model that can reliably identify whether a previously unseen pair of documents is a paraphrase. Paraphrase detection in Arabic documents is challenging because of the variability in features and the lack of publicly available corpora. To address these problems, the authors propose a semantic approach. At the feature extraction level, they use the global vectors (GloVe) representation, which combines global co-occurrence counts with a contextual skip-gram model. At the paraphrase identification level, they apply a convolutional neural network to learn more contextual and semantic information between documents. For the experiments, the authors use the Open Source Arabic Corpora as the source corpus and collect several datasets to build a vocabulary model. To construct the paraphrased corpus, they replace each word in the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising results in terms of precision and recall compared with existing approaches in the literature.
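The corpus construction step described in the abstract (replacing each word with its most similar word of the same grammatical class) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English vocabulary, the three-dimensional vectors, the POS tags, and the function names `cosine`, `most_similar_same_pos`, and `paraphrase` are all hypothetical, chosen only to show the idea. A real setup would load trained word2vec vectors and an Arabic part-of-speech tagger.

```python
import math

# Hypothetical toy embeddings and POS tags (assumptions for illustration only).
# A real pipeline would use trained word2vec vectors and an Arabic POS tagger.
VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "large": [0.85, 0.15, 0.05],
    "house": [0.1, 0.9, 0.2],
    "home": [0.12, 0.88, 0.25],
    "run": [0.0, 0.2, 0.95],
}
POS = {"big": "ADJ", "large": "ADJ", "house": "NOUN", "home": "NOUN", "run": "VERB"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_same_pos(word):
    """Return the nearest neighbour of `word` that shares its POS tag.

    Words outside the vocabulary, or with no same-class neighbour,
    are returned unchanged.
    """
    if word not in VECTORS:
        return word
    candidates = [w for w in VECTORS if w != word and POS[w] == POS[word]]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(VECTORS[word], VECTORS[w]))


def paraphrase(tokens):
    """Build a paraphrased token sequence by word-level substitution."""
    return [most_similar_same_pos(t) for t in tokens]
```

With these toy values, `paraphrase(["big", "house"])` substitutes each word with its same-class neighbour, while a word with no same-class candidate or no vector is left as it is.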
Pages: 35-50
Page count: 16
Related papers
50 in total
  • [31] Stemming for Arabic Words Similarity Measures based on Latent Semantic Analysis Model
    Froud, Hanane
    Lachkar, Abdelmonaime
    Alaoui Ouatik, Said
    2012 INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 2012 : 780 - 784
  • [32] Artwork Retrieval Based on Similarity of Touch Using Convolutional Neural Network
    Fujita, Takayuki
    Osana, Yuko
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2018, PT I, 2018, 11139 : 235 - 243
  • [33] Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA
    Adouane, Wafia
    Bernardy, Jean-Philippe
    Dobnik, Simon
    FOURTH ARABIC NATURAL LANGUAGE PROCESSING WORKSHOP (WANLP 2019), 2019 : 78 - 87
  • [34] Semantic similarity measurement of process table based on graph neural network
    Hua B.
    Zhou B.
    Gu X.
    Bao J.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2022, 28 (12) : 3805 - 3821
  • [35] Deep Convolutional Neural Network for Arabic Speech Recognition
    Amari, Rafik
    Noubigh, Zouhaira
    Zrigui, Salah
    Berchech, Dhaou
    Nicolas, Henri
    Zrigui, Mounir
    COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501 : 120 - 134
  • [36] Semantic Representation Learning of Convolutional Neural Network Based on Tensor Computation
    Yang L.-J.
    Wang J.-Q.
    Jing L.-P.
    Yu J.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (03) : 568 - 578
  • [37] Semantic Template-based Convolutional Neural Network for Text Classification
    Chang, Yung-Chun
    Ng, Siu Hin
    Chen, Jung-Peng
    Liang, Yu-Chi
    Hsu, Wen-Lian
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (11)
  • [38] Semantic concept based video retrieval using convolutional neural network
    Janwe, Nitin
    Bhoyar, Kishor
    SN APPLIED SCIENCES, 2020, 2 (01)
  • [39] Application of semantic segmentation based on convolutional neural network in medical images
    Wu Y.
    Lin L.
    Wang J.
    Wu S.
    Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2020, 37 (03) : 533 - 540
  • [40] Semantic concept based video retrieval using convolutional neural network
    Nitin Janwe
    Kishor Bhoyar
    SN Applied Sciences, 2020, 2