Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited by: 17
Authors
Kowsher, M. [1 ]
Sami, Abdullah A. S. [2 ]
Prottasha, Nusrat Jahan [3 ]
Arefin, Mohammad Shamsul [3 ,4 ]
Dhar, Pranab Kumar [4 ]
Koshiba, Takeshi [5 ]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022, Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer;
DOI
10.1109/ACCESS.2022.3197662
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling the creation of powerful language models. Among them, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, these models are trained predominantly on resource-rich languages, leaving other languages to rely on multilingual models (mBERT). mBERT has two fundamental shortcomings that become especially acute for a low-resource language like Bangla: it was trained on a limited, curated dataset, and its weights are shared across all of its languages. Moreover, research on other languages suggests that a language-specific BERT model outperforms multilingual ones. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on BanglaLM, the largest Bangla language-modeling dataset, which we constructed from 40 GB of text data. Bangla-BERT achieves the best results on all datasets and substantially improves the state of the art in binary linguistic classification, multilabel extraction, and named entity recognition, outperforming multilingual BERT and prior work. The pre-trained model is also compared against non-contextual models such as Bangla FastText and Word2Vec on downstream tasks. In addition, the model is evaluated through transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF for NER, where Bangla-BERT again outperforms state-of-the-art methods. The proposed Bangla-BERT model is assessed on benchmark datasets, including BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, surpassing the prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
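As a minimal sketch of the downstream-task evaluation described in the abstract, the Python snippet below fine-tunes a pretrained BERT checkpoint for binary Bangla text classification (e.g., fake-news detection) with the Hugging Face Transformers library. The checkpoint name, toy sentences, and labels are placeholders and assumptions, not details taken from the paper; the released Bangla-BERT weights would be substituted for the multilingual checkpoint used here.

# Minimal sketch (not from the paper): fine-tuning a pretrained BERT checkpoint
# for binary Bangla text classification with Hugging Face Transformers.
# The checkpoint name, sentences, and labels below are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # substitute the released Bangla-BERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

texts = ["এটি একটি সত্য সংবাদ।", "এটি একটি ভুয়া সংবাদ।"]  # toy Bangla sentences
labels = torch.tensor([0, 1])                              # toy binary labels (0 = real, 1 = fake)

# Tokenize and run one training step; in practice this loops over a DataLoader.
batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # returns cross-entropy loss and logits
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()

The same pattern extends to the NER setting mentioned above by swapping in a token-classification head (e.g., AutoModelForTokenClassification) and token-level labels.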
Pages: 91855-91870
Number of pages: 16
Related Papers
(50 in total)
  • [1] Reading comprehension based question answering system in Bangla language with transformer-based learning
    Aurpa, Tanjim Taharat
    Rifat, Richita Khandakar
    Ahmed, Md Shoaib
    Anwar, Md Musfique
    Ali, A. B. M. Shawkat
    HELIYON, 2022, 8 (10)
  • [2] LVBERT: Transformer-Based Model for Latvian Language Understanding
    Znotins, Arturs
    Barzdins, Guntis
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE (HLT 2020), 2020, 328 : 111 - 115
  • [3] ParsBERT: Transformer-based Model for Persian Language Understanding
    Mehrdad Farahani
    Mohammad Gharachorloo
    Marzieh Farahani
    Mohammad Manthouri
    Neural Processing Letters, 2021, 53 : 3831 - 3847
  • [4] ParsBERT: Transformer-based Model for Persian Language Understanding
    Farahani, Mehrdad
    Gharachorloo, Mohammad
    Farahani, Marzieh
    Manthouri, Mohammad
    NEURAL PROCESSING LETTERS, 2021, 53 (06) : 3831 - 3847
  • [5] Transformer-based Natural Language Understanding and Generation
    Zhang, Feng
    An, Gaoyun
    Ruan, Qiuqi
    2022 16TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP2022), VOL 1, 2022, : 281 - 284
  • [6] InFi-BERT 1.0: Transformer-Based Language Model for Indian Financial Volatility Prediction
    Sasubilli, Sravani
    Verma, Mridula
    MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT II, 2023, 1753 : 128 - 138
  • [7] Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny
    Rovida, Lorenzo
    Leporati, Alberto
    PROCEEDINGS OF THE 10TH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS, IWSPA 2024, 2024, : 3 - 13
  • [8] TMD-BERT: A Transformer-Based Model for Transportation Mode Detection
    Drosouli, Ifigenia
    Voulodimos, Athanasios
    Mastorocostas, Paris
    Miaoulis, Georgios
    Ghazanfarpour, Djamchid
    ELECTRONICS, 2023, 12 (03)
  • [9] Abusive Bangla comments detection on Facebook using transformer-based deep learning models
    Aurpa, Tanjim Taharat
    Sadik, Rifat
    Ahmed, Md Shoaib
    SOCIAL NETWORK ANALYSIS AND MINING, 2022, 12 (01)
  • [10] Abusive Bangla comments detection on Facebook using transformer-based deep learning models
    Tanjim Taharat Aurpa
    Rifat Sadik
    Md Shoaib Ahmed
    Social Network Analysis and Mining, 2022, 12