Text-based Language Identifier using Multinomial Naive Bayes Algorithm

Authors
Rawat, Sunita [1]
Werulkar, Lakshita [1]
Jaywant, Sagarika [1]
Affiliation
[1] Shri Ramdeobaba Coll Engn & Management, Dept Comp Sci & Engn, Nagpur, India
Keywords
Language Identification; Natural Language Processing (NLP); Multinomial Naïve Bayes (MNB); N-Gram algorithm; Term Frequency-Inverse Document Frequency (TF-IDF)
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Language identification is a crucial step in any NLP-based application. Text-based documents and webpages are multiplying rapidly on the modern Internet, and documents written in languages from all across the world are available with a single click. A language identifier is therefore necessary to help the user interpret the content. Language identification research has so far concentrated on European languages and remains rather limited for traditional Indian languages. Researchers have also become increasingly interested in identifying languages that closely resemble one another. In this paper, the Multinomial Naive Bayes algorithm is used to detect three languages written in Devanagari (Marathi, Sanskrit and Hindi) and three European languages (French, Italian and English). Experiments on datasets for each language produced satisfactorily accurate results after training and testing the model.
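The record gives no implementation details, so the following is only a minimal sketch of the general technique the abstract names: a Multinomial Naive Bayes classifier over character n-grams. Everything in it is an assumption made for illustration, not the authors' method: the class name NgramMultinomialNB, the bigram size, Laplace smoothing with alpha = 1, the use of raw n-gram counts instead of TF-IDF weights, and the tiny toy training set, which stands in for the datasets the paper refers to.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramMultinomialNB:
    """Hypothetical Multinomial Naive Bayes over character n-gram counts."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n          # n-gram length (assumed, not from the paper)
        self.alpha = alpha  # Laplace smoothing constant (assumed)

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter(labels)   # label -> number of training texts
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text, self.n))
        self.vocab = set().union(*self.counts.values())
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def log_scores(self, text):
        """Return log P(label) + sum of log P(n-gram | label) for each label."""
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label in self.counts:
            score = math.log(self.doc_counts[label] / n_docs)
            denom = self.totals[label] + self.alpha * vocab_size
            for gram in char_ngrams(text, self.n):
                if gram in self.vocab:  # skip n-grams never seen in training
                    num = self.counts[label][gram] + self.alpha
                    score += math.log(num / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
TRAIN = [
    ("the weather is nice today", "English"),
    ("where is the railway station", "English"),
    ("le temps est beau aujourd'hui", "French"),
    ("où est la gare s'il vous plaît", "French"),
    ("oggi il tempo è bello", "Italian"),
    ("dove si trova la stazione", "Italian"),
    ("आज मौसम अच्छा है", "Hindi"),
    ("स्टेशन कहाँ है", "Hindi"),
]
TEXTS = [text for text, _ in TRAIN]
LABELS = [label for _, label in TRAIN]

model = NgramMultinomialNB(n=2).fit(TEXTS, LABELS)
```

Calling model.predict(text) returns the label with the highest log score, and model.log_scores(text) exposes the per-language scores for inspection. Character n-grams are used here because they need no tokenizer and apply equally to Latin and Devanagari text; a TF-IDF weighting step, as listed in the keywords, would replace the raw counts.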
Pages: 96-102
Number of pages: 7
Related papers
50 in total
  • [21] A novel text classification algorithm based on Naive Bayes and KL-divergence
    Wang, BY
    Zhang, SM
    PDCAT 2005: SIXTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS AND TECHNOLOGIES, PROCEEDINGS, 2005, : 913 - 915
  • [22] The naive Bayes text classification algorithm based on rough set in the cloud platform
    Dai, Yugang
    Sun, Haosheng
    Journal of Chemical and Pharmaceutical Research, 2014, 6 (07) : 1636 - 1643
  • [23] Text and Image Based Spam Email Classification using KNN, Naive Bayes and Reverse DBSCAN Algorithm
    Harisinghaney, Anirudh
    Dixit, Aman
    Gupta, Saurabh
    Arora, Anuja
    PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON RELIABILTY, OPTIMIZATION, & INFORMATION TECHNOLOGY (ICROIT 2014), 2014, : 153 - 155
  • [24] Comparison Of Multinomial Naive Bayes Algorithm And Logistic Regression For Intent Classification In Chatbot
    Setyawan, Muhammad Yusril Helmi
    Awangga, Rolly Maulana
    Efendi, Safif Rafi
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON APPLIED ENGINEERING (ICAE), 2018,
  • [25] Sentiment Classification by a Hybrid Method of Greedy Search and Multinomial Naive Bayes Algorithm
    Chirawichitchai, Nivet
    2013 ELEVENTH INTERNATIONAL CONFERENCE ON ICT AND KNOWLEDGE ENGINEERING (ICT&KE), 2013,
  • [26] Classifying User Experience (UX) Of The M-Commerce Application using Multinomial Naive Bayes Algorithm
    Habal, Beau Gray M.
    Mangaba, Joel
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2023, 2023, : 135 - 142
  • [27] Stacking algorithm based on naive Bayes
    Huang, Chen
    Zhou, Yuting
    Yang, Xuemei
    Liu, Shiqi
    Yin, Junping
    2024 2ND ASIA CONFERENCE ON COMPUTER VISION, IMAGE PROCESSING AND PATTERN RECOGNITION, CVIPPR 2024, 2024,
  • [28] Improving Naive Bayes text classifier with modified EM algorithm
    Kim, HJ
    Chang, JY
    FOUNDATIONS OF INTELLIGENT SYSTEMS, 2003, 2871 : 326 - 333
  • [29] Sentiment analysis on hotel reviews using Multinomial Naive Bayes classifier
    Farisi, Arif Abdurrahman
    Sibaroni, Yuliant
    Al Faraby, Said
    2ND INTERNATIONAL CONFERENCE ON DATA AND INFORMATION SCIENCE, 2019, 1192
  • [30] A message classifier based on multinomial Naive Bayes for online social contexts
    de Souza Viana, Tharsis Salathiel
    de Oliveira, Marcos
    Coelho da Silva, Ticiana Linhares
    Rodrigues Falcão, Mario Sergio
    Tavares Goncalves, Enyo Jose
    JOURNAL OF MANAGEMENT ANALYTICS, 2018, 5 (03) : 213 - 229