Named Entity Recognition and Classification for Punjabi Shahmukhi

Cited by: 14
Authors
Ahmad, Muhammad Tayyab [1, 2]
Malik, Muhammad Kamran [1, 2]
Shahzad, Khurram [1, 2]
Aslam, Faisal [1, 2]
Iqbal, Asif [1, 2]
Nawaz, Zubair [1, 2]
Bukhari, Faisal [1, 2]
Affiliations
[1] Punjab Univ Coll Informat Technol, Lahore, Pakistan
[2] Univ Punjab, Punjab Univ Coll Informat Technol, New Campus, Lahore, Pakistan
Keywords
Low-resource languages; Asian languages; Punjabi; Shahmukhi; named entity recognition
DOI
10.1145/3383306
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Named entity recognition (NER) refers to identifying proper nouns in natural language text and classifying them into named entity types, such as person, location, and organization. Due to the widespread applications of NER, numerous NER techniques and benchmark datasets have been developed for both Western and Asian languages. Even though the Shahmukhi script of the Punjabi language is used by nearly three-fourths of Punjabi speakers worldwide, Gurmukhi has been the main focus of research activities. Specifically, a benchmark NER corpus for Shahmukhi is non-existent, which has thwarted the commencement of NER research for the Shahmukhi script. To this end, this article presents the development and specifications of the first-ever NER corpus for Shahmukhi. The newly developed corpus is composed of 318,275 tokens and 16,300 named entities, including 11,147 persons, 3,140 locations, and 2,013 organizations. To establish the strength of our corpus, we have compared its specifications with those of its Gurmukhi counterparts. Furthermore, we have demonstrated the usability of our corpus using five supervised learning techniques, including two state-of-the-art deep learning techniques. The results are compared, and valuable insights about the behavior of the most effective technique are discussed.
Pages: 13