Sense disambiguation for Punjabi language using supervised machine learning techniques

Cited by: 2
Authors
Singh, Varinder Pal [1]
Kumar, Parteek [1]
Affiliations
[1] Thapar Inst Engn & Technol, Comp Sci & Engn Dept, Patiala 147004, Punjab, India
Keywords
Lexical features; syntactic features; word embedding; supervised learning techniques; word sense disambiguation
DOI
10.1007/s12046-019-1206-x
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Word Sense Disambiguation (WSD) is the automatic identification of the meaning of a word in its context. It is a vital and hard artificial intelligence problem that arises in several natural language processing applications, such as machine translation, question answering and information retrieval. In this paper, an explicit WSD system for the Punjabi language using supervised techniques is analysed. A sense-tagged corpus of 150 ambiguous Punjabi nouns has been manually prepared. Six supervised machine learning techniques are investigated: Decision List, Decision Tree, Naive Bayes, K-Nearest Neighbour (K-NN), Random Forest and Support Vector Machines (SVM). Every classifier uses the same feature space, comprising count-based lexical features (unigram, bigram, collocations and co-occurrence) and syntactic features (part of speech). Semantic features of the Punjabi language are derived from unlabelled Punjabi Wikipedia text using the word2vec continuous bag-of-words and skip-gram shallow neural network models. Two deep learning neural network classifiers, multilayer perceptron and long short-term memory (LSTM), have also been applied to WSD of Punjabi words. The word embedding features were also tested with the six classifiers on the Punjabi WSD task. The performance of the supervised classifiers on the Punjabi WSD task improved when word embedding features were used. In this work, the LSTM classifier achieved an accuracy of 84% using word embedding features.
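As a rough illustration of the supervised setup the abstract describes, the sketch below trains a Naive Bayes classifier (one of the six techniques named) on unigram co-occurrence features taken from a window around the ambiguous word. It is not the authors' implementation: the class `NaiveBayesWSD`, the helper `context_features`, the window size, the add-one smoothing and the English toy sentences are all assumptions made for illustration, and no Punjabi data from the paper is reproduced.

```python
import math
from collections import Counter, defaultdict


def context_features(tokens, target, window=2):
    """Return the words within `window` positions of the first `target` token."""
    i = tokens.index(target)
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


class NaiveBayesWSD:
    """Multinomial Naive Bayes over context words (hypothetical helper class)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # additive smoothing constant (assumed)
        self.sense_counts = Counter()           # training examples per sense
        self.word_counts = defaultdict(Counter) # context-word counts per sense
        self.vocab = set()

    def fit(self, examples):
        for features, sense in examples:
            self.sense_counts[sense] += 1
            for word in features:
                self.word_counts[sense][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, features):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log prior plus smoothed log likelihood of each context word
            score = math.log(count / total)
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            for word in features:
                score += math.log((self.word_counts[sense][word] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Made-up English toy data standing in for a sense-tagged corpus.
train = [
    ("he deposited money in the bank yesterday".split(), "finance"),
    ("the bank approved the loan".split(), "finance"),
    ("they sat on the bank of the river".split(), "river"),
    ("the river bank was muddy".split(), "river"),
]
model = NaiveBayesWSD().fit(
    [(context_features(tokens, "bank", window=3), sense) for tokens, sense in train]
)

test_tokens = "she walked along the river bank".split()
prediction = model.predict(context_features(test_tokens, "bank", window=3))
print(prediction)
```

The same feature extraction step could feed any of the other classifiers listed in the abstract; swapping the count-based features for word embedding vectors would change only what `context_features` returns and how the classifier consumes it.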
Pages: 15
Related papers
50 in total
  • [41] A New Approach to the Supervised Word Sense Disambiguation
    Agre, Gennady
    Petrov, Daniel
    Keskinova, Simona
    ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, AIMSA 2018, 2018, 11089 : 3 - 15
  • [42] Ethnicity-based name partitioning for author name disambiguation using supervised machine learning
    Kim, Jinseok
    Kim, Jenna
    Owen-Smith, Jason
    JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY, 2021, 72 (08) : 979 - 994
  • [43] Word sense disambiguation using a second language monolingual corpus
    Dagan, Ido
    Itai, Alon
    Computational Linguistics, 1994, 20 (04)
  • [44] MSC+: Language pattern learning for word sense induction and disambiguation
    Goularte, Fabio Bif
    Sorato, Danielly
    Nassar, Silvia Modesto
    Fileto, Renato
    Saggion, Horacio
    KNOWLEDGE-BASED SYSTEMS, 2020, 188 (188)
  • [45] WORD SENSE DISAMBIGUATION USING MACHINE-READABLE DICTIONARIES
    KROVETZ, R
    CROFT, WB
    PROCEEDINGS OF THE TWELFTH ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 1989, 23 : 127 - 136
  • [46] An Approach to Word Sense Disambiguation based on Semantic Classes and Machine Learning
    Izquierdo, Ruben
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2011, (46) : 123 - 124
  • [47] Automatic Language Identification using Machine learning Techniques
    Venkatesan, Hariraj
    Venkatasubramanian, T. Varun
    Sangeetha, J.
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND ELECTRONICS SYSTEMS (ICCES 2018), 2018, : 583 - 588
  • [48] Word Sense Disambiguation in Nepali Language
    Dhungana, Udaya Raj
    Shakya, Subarna
    2014 FOURTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION AND COMMUNICATION TECHNOLOGY AND IT'S APPLICATIONS (DICTAP), 2014, : 46 - 50
  • [49] A Novel Word Sense Disambiguation Algorithm Based on Semi-Supervised Statistical Learning
    Huang, Zhehuang
    Chen, Yidong
    Shi, Xiaodong
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS & STATISTICS, 2013, 43 (13) : 452 - 458
  • [50] Supervised word sense disambiguation using new features based on word embeddings
    Sadi, Majid Fahandezi
    Ansari, Ebrahim
    Afsharchi, Mohsen
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (01) : 1467 - 1476