Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity

Cited by: 3
Authors
Mahmoud, Adnen [1]
Zrigui, Mounir [2]
Affiliations
[1] Higher Inst Comp Sci & Commun Tech, Monastir, Tunisia
[2] Fac Sci Monastir, Monastir, Tunisia
Keywords
Arabic Language; Context Based Approach; Global Vectors Representation; Natural Language Processing; Paraphrase Detection; Semantic Similarity; Word Embedding; Word2vec;
DOI
10.4018/IJCINI.2020010103
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The problem addressed is to develop a model that can reliably identify whether a previously unseen pair of documents is a paraphrase. Paraphrase detection in Arabic documents is challenging because of the variability of the language's features and the lack of publicly available corpora. To address these problems, the authors propose a semantic approach. At the feature extraction level, the authors use Global Vectors (GloVe) representations, which combine global co-occurrence counts with a contextual skip-gram model. At the paraphrase identification level, the authors apply a convolutional neural network to learn more contextual and semantic information between documents. For the experiments, the authors use the Open Source Arabic Corpora as the source corpus and collect several additional datasets to build a vocabulary model. To construct the paraphrased corpus, the authors replace each word of the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising precision and recall compared with existing approaches in the literature.
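The corpus construction step described in the abstract (replace each word with its most similar word of the same grammatical class) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy vectors, the part-of-speech tags, the English words, and the helper names `most_similar_same_pos` and `paraphrase` are all assumptions made for the example, standing in for a trained word2vec model and a real Arabic part-of-speech tagger.

```python
import math

# Hypothetical toy embeddings and POS tags, for illustration only.
# A real pipeline would load trained word2vec vectors and run a POS tagger.
EMBEDDINGS = {
    "car": [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "drive": [0.70, 0.30, 0.10],
    "fast": [0.10, 0.90, 0.20],
    "quick": [0.12, 0.88, 0.25],
    "quickly": [0.10, 0.85, 0.40],
}
POS_TAGS = {
    "car": "NOUN",
    "automobile": "NOUN",
    "drive": "VERB",
    "fast": "ADJ",
    "quick": "ADJ",
    "quickly": "ADV",
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_same_pos(word):
    """Return the nearest vocabulary word sharing the POS tag of `word`.

    Words that are out of vocabulary, or that have no candidate with the
    same tag, are returned unchanged.
    """
    if word not in EMBEDDINGS:
        return word
    candidates = [
        w for w in EMBEDDINGS
        if w != word and POS_TAGS[w] == POS_TAGS[word]
    ]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[word], EMBEDDINGS[w]))


def paraphrase(tokens):
    """Replace every token with its most similar same-POS word."""
    return [most_similar_same_pos(t) for t in tokens]
```

The part-of-speech filter is what keeps a substitution grammatical: without it, the nearest neighbour of an adjective could be an adverb or a verb. In this sketch, unknown words and words with no same-class neighbour pass through unchanged, which is one possible design choice; the abstract does not say how such cases are handled.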
Pages: 35-50
Number of pages: 16