Bangla-BERT: Transformer-Based Efficient Model for Transfer Learning and Language Understanding

Cited by: 17
Authors
Kowsher, M. [1 ]
Sami, Abdullah A. S. [2 ]
Prottasha, Nusrat Jahan [3 ]
Arefin, Mohammad Shamsul [3 ,4 ]
Dhar, Pranab Kumar [4 ]
Koshiba, Takeshi [5 ]
Affiliations
[1] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[2] Chittagong Univ Engn & Technol, Dept Comp Sci & Engn, Chattogram 4349, Bangladesh
[3] Daffodil Int Univ, Dept Comp Sci & Engn, Dhaka 1207, Bangladesh
[4] Chittagong Univ Engn & Technol, Chattogram 4349, Bangladesh
[5] Waseda Univ, Shinjuku Ku, Tokyo 1698050, Japan
Source
IEEE ACCESS | 2022, Vol. 10
Keywords
Bit error rate; Learning systems; Transformers; Data models; Computational modeling; Internet; Transfer learning; Bangla NLP; BERT-base; large corpus; transformer
DOI
10.1109/ACCESS.2022.3197662
CLC Number
TP [Automation technology, computer technology]
Subject Classification Code
0812
Abstract
The advent of pre-trained language models has ushered in a new era of Natural Language Processing (NLP), enabling us to create powerful language models. Among these, Transformer-based models such as BERT have grown in popularity due to their state-of-the-art effectiveness. However, these models are mostly trained on resource-rich languages, forcing other languages to rely on multilingual models (mBERT). Two fundamental shortcomings of mBERT become significantly more severe for a resource-constrained language like Bangla: it was trained on a limited, curated dataset, and its weights are shared with all other languages. Moreover, research on other languages suggests that a language-specific BERT model outperforms multilingual ones. This paper introduces Bangla-BERT, a monolingual BERT model for the Bangla language. Despite the limited data available for NLP tasks in Bangla, we perform pre-training on the largest Bangla language-model dataset, BanglaLM, which we constructed from 40 GB of text data. Bangla-BERT achieves the best results on all datasets and substantially improves the state of the art in binary linguistic classification, multilabel extraction, and named entity recognition (NER), outperforming multilingual BERT and previous work. The pre-trained model is also assessed against several non-contextual models, such as Bangla FastText and Word2Vec, on the downstream tasks. In addition, the model is evaluated through transfer learning with hybrid deep learning models such as LSTM, CNN, and CRF for NER, where Bangla-BERT again outperforms state-of-the-art methods. The proposed Bangla-BERT model is assessed on benchmark datasets, including BanFakeNews, Sentiment Analysis on Bengali News Comments, and Cross-lingual Sentiment Analysis in Bengali, where it surpasses all prior state-of-the-art results by 3.52%, 2.2%, and 5.3%, respectively.
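For readers who want to reproduce the downstream-task setup the abstract describes, the sketch below shows one plausible way to fine-tune a pre-trained Bangla BERT checkpoint for a binary classification task (e.g., fake-news detection) with the Hugging Face transformers library. This is a minimal illustration, not the authors' released training code: the model identifier "Kowsher/bangla-bert" and the two toy Bangla sentences are assumptions for demonstration purposes.

```python
# Minimal fine-tuning sketch for a Bangla binary classifier.
# Assumption: "Kowsher/bangla-bert" is an illustrative Hugging Face model id;
# substitute the identifier of the actual released Bangla-BERT weights.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "Kowsher/bangla-bert"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy labeled batch (1 = fake, 0 = real); real training would stream a
# benchmark corpus such as BanFakeNews instead.
texts = ["এটি একটি উদাহরণ বাক্য।", "আরেকটি উদাহরণ বাক্য।"]
labels = torch.tensor([0, 1])

batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt")

optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
print(f"training loss: {outputs.loss.item():.4f}")
```

In practice this single optimizer step would be wrapped in an epoch loop over a DataLoader, with evaluation on a held-out split; the snippet only demonstrates the API flow for attaching a classification head to the pre-trained encoder.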
Pages: 91855-91870
Number of pages: 16
Related papers
50 items in total
  • [31] Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
    Aspillaga, Carlos
    Carvallo, Andres
    Araujo, Vladimir
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 1882 - 1894
  • [33] BiCalBERT: An Efficient Transformer-based Model for Chinese Question Answering
    Han, Yanbo
    Zhan, Buchao
    Zhang, Bin
    Zhao, Chao
    Yan, Shankai
    2024 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, METAHEURISTICS & SWARM INTELLIGENCE, ISMSI 2024, 2024, : 100 - 104
  • [34] Transformers-sklearn: a toolkit for medical language understanding with transformer-based models
    Yang, Feihong
    Wang, Xuwen
    Ma, Hetong
    Li, Jiao
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2021, 21 (SUPPL 2)
  • [35] Exploiting Data-Efficient Image Transformer-Based Transfer Learning for Valvular Heart Diseases Detection
    Jumphoo, Talit
    Phapatanaburi, Khomdet
    Pathonsuwan, Wongsathon
    Anchuen, Patikorn
    Uthansakul, Monthippa
    Uthansakul, Peerapong
    IEEE ACCESS, 2024, 12 : 15845 - 15855
  • [36] Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach
    Francani, Andre O.
    Maximo, Marcos R. O. A.
    IEEE ACCESS, 2025, 13 : 13959 - 13971
  • [37] Transformer-based deep learning model for forced oscillation localization
    Matar, Mustafa
    Estevez, Pablo Gill
    Marchi, Pablo
    Messina, Francisco
    Elmoudi, Ramadan
    Wshah, Safwan
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2023, 146
  • [38] Characterization of groundwater contamination: A transformer-based deep learning model
    Bai, Tao
    Tahmasebi, Pejman
    ADVANCES IN WATER RESOURCES, 2022, 164
  • [39] GIT: A Transformer-Based Deep Learning Model for Geoacoustic Inversion
    Feng, Sheng
    Zhu, Xiaoqian
    Ma, Shuqing
    Lan, Qiang
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (06)
  • [40] A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages
    Bijoy, Mehedi Hasan
    Hossain, Nahid
    Islam, Salekul
    Shatabda, Swakkhar
    COMPUTER SPEECH AND LANGUAGE, 2025, 89