Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited by: 17
Authors
Kowsher, M. [1]
Sami, Abdullah A. S. [2]
Prottasha, Nusrat Jahan [3]
Arefin, Mohammad Shamsul [3,4]
Dhar, Pranab Kumar [4]
Koshiba, Takeshi [5]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022, Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer;
DOI
10.1109/ACCESS.2022.3197662
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling the creation of powerful language models. Among these, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, these models heavily favor resource-rich languages, forcing other languages to rely on multilingual models (mBERT). Two fundamental shortcomings of mBERT become significantly more pronounced in a resource-constrained language like Bangla: it was trained on a limited and organized dataset, and it carries weights for all other languages. Moreover, research on other languages suggests that a language-specific BERT model will outperform multilingual ones. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on the largest Bangla language-model dataset, BanglaLM, which we constructed from 40 GB of text data. Bangla-BERT achieves the highest results on all datasets and substantially improves the state-of-the-art performance in binary linguistic classification, multilabel extraction, and named entity recognition, outperforming multilingual BERT and other previous research. The pre-trained model is assessed against several non-contextual models, such as Bangla fastText and word2vec, on the downstream tasks. The model is further evaluated via transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF for NER, where Bangla-BERT again outperforms state-of-the-art methods. The proposed Bangla-BERT model is assessed on benchmark datasets, including BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, and it surpasses all prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
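As a rough illustration of the downstream evaluation the abstract describes, the Python sketch below fine-tunes a pre-trained monolingual BERT checkpoint for binary sentiment classification with the Hugging Face transformers library. The checkpoint identifier, example sentences, and hyperparameters are placeholder assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: fine-tuning a monolingual Bangla BERT checkpoint on a
# binary classification task, assuming a Hugging Face-compatible release.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bangla-bert-base"  # placeholder checkpoint id, not the paper's release path

# Load the tokenizer and attach a fresh 2-way classification head.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy batch: two Bangla sentences with binary sentiment labels (placeholders).
texts = ["চলচ্চিত্রটি চমৎকার ছিল", "সেবাটি খুব খারাপ ছিল"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)  # typical BERT fine-tuning rate

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
```

In a full experiment this step would run over many batches and epochs, and the same encoder could feed the LSTM-, CNN-, or CRF-based heads the paper uses for NER.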
Pages: 91855-91870
Page count: 16