Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited by: 17
Authors
Kowsher, M. [1]
Sami, Abdullah A. S. [2]
Prottasha, Nusrat Jahan [3]
Arefin, Mohammad Shamsul [3,4]
Dhar, Pranab Kumar [4]
Koshiba, Takeshi [5]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022, Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer
DOI
10.1109/ACCESS.2022.3197662
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling us to create powerful language models. Among these, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, such models are concentrated in high-resource languages, forcing other languages to rely on the multilingual model mBERT. Two fundamental problems with mBERT become significantly more acute in a low-resource language like Bangla: it was trained on a limited, curated dataset, and its weights are shared with all of its other languages. Moreover, research on other languages suggests that a language-specific BERT model will outperform multilingual ones. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on BanglaLM, the largest Bangla language-modeling dataset, which we constructed from 40 GB of text data. Bangla-BERT achieves the highest results on all datasets and substantially improves the state of the art in binary linguistic classification, multilabel extraction, and named entity recognition, outperforming multilingual BERT and previous work. The pre-trained model is also assessed against non-contextual models such as Bangla FastText and Word2Vec on the downstream tasks. The model is further evaluated via transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF for NER, where Bangla-BERT again outperforms state-of-the-art methods. Finally, the proposed Bangla-BERT model is assessed on benchmark datasets, including BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, surpassing all prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
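
A minimal sketch of the transfer-learning recipe summarized above: fine-tuning a pre-trained monolingual BERT checkpoint for a binary Bangla classification task (e.g., BanFakeNews-style fake-news detection) with PyTorch and Hugging Face Transformers. The checkpoint identifier and the toy batch are illustrative assumptions, not the authors' released artifacts.

    # Hedged sketch: fine-tune a monolingual Bangla BERT checkpoint for
    # binary classification. "path/to/bangla-bert" is a hypothetical
    # identifier standing in for whatever checkpoint is actually released.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "path/to/bangla-bert"  # hypothetical checkpoint id
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2
    )

    texts = ["<Bangla sentence 1>", "<Bangla sentence 2>"]  # placeholder inputs
    labels = torch.tensor([0, 1])  # e.g., 0 = real news, 1 = fake news

    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    for _ in range(3):  # a few gradient steps over the toy batch
        loss = model(**batch, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

The same encoder can be dropped into the hybrid NER pipelines the abstract mentions (e.g., BERT embeddings feeding an LSTM-CRF head); only the task head and labels change.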
Pages: 91855-91870
Page count: 16
Related Papers
50 items in total (entries [41]-[50] shown)
  • [41] Transformer-based Reinforcement Learning Model for Optimized Quantitative Trading
    Kumar, Aniket
    Rizk, Rodrigue
    Santosh, K. C.
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 1454 - 1455
  • [42] No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
    Kaddour, Jean
    Key, Oscar
    Nawrot, Piotr
    Minervini, Pasquale
    Kusner, Matt J.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [43] SentimentFormer: A Transformer-Based Multimodal Fusion Framework for Enhanced Sentiment Analysis of Memes in Under-Resourced Bangla Language
    Faria, Fatema Tuj Johora
    Baniata, Laith H.
    Baniata, Mohammad H.
    Khair, Mohannad A.
    Bani Ata, Ahmed Ibrahim
    Bunterngchit, Chayut
    Kang, Sangwoo
    ELECTRONICS, 2025, 14 (04)
  • [44] A Framework for Accelerating Transformer-Based Language Model on ReRAM-Based Architecture
    Kang, Myeonggu
    Shin, Hyein
    Kim, Lee-Sup
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (09) : 3026 - 3039
  • [45] A Transformer-Based Math Language Model for Handwritten Math Expression Recognition
    Huy Quang Ung
    Cuong Tuan Nguyen
    Hung Tuan Nguyen
    Thanh-Nghia Truong
    Nakagawa, Masaki
    DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT II, 2021, 12917 : 403 - 415
  • [46] Not all quantifiers are equal: Probing transformer-based language models' understanding of generalised quantifiers
    Madusanka, Tharindu
    Zahid, Iqra
    Li, Hao
    Pratt-Hartmann, Ian
    Batista-Navarro, Riza
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 8680 - 8692
  • [47] TRANSFORMER-BASED APPROACH FOR DOCUMENT LAYOUT UNDERSTANDING
    Yang, Huichen
    Hsu, William
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 4043 - 4047
  • [48] Cyberbullying Text Identification: A Deep Learning and Transformer-based Language Modeling Approach
    Saifullah, K.
    Khan, M. I.
    Jamal, S.
    Sarker, I. H.
    EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 2024, 11 (01) : 1 - 12
  • [49] Transformer-Based Music Language Modelling and Transcription
    Zonios, Christos
    Pavlopoulos, John
    Likas, Aristidis
    PROCEEDINGS OF THE 12TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE, SETN 2022, 2022
  • [50] Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak
    Lehecka, Jan
    Psutka, Josef V.
    Psutka, Josef
    TEXT, SPEECH, AND DIALOGUE, TSD 2023, 2023, 14102 : 328 - 338