JASMINE: Arabic GPT Models for Few-Shot Learning

Cited by: 0
Authors
Nagoudi, El Moatez Billah [1 ,2 ]
Abdul-Mageed, Muhammad [1 ,2 ,3 ,4 ]
Elmadany, AbdelRahim [1 ,2 ]
Inciarte, Alcides Alcoba [1 ,2 ]
Khondaker, Md Tawkat Islam [1 ,2 ]
Affiliations
[1] Univ British Columbia, Deep Learning, Vancouver, BC, Canada
[2] Univ British Columbia, Nat Language Proc Grp, Vancouver, BC, Canada
[3] MBZUAI, Dept Nat Language Proc, Abu Dhabi, U Arab Emirates
[4] MBZUAI, Dept Machine Learning, Abu Dhabi, U Arab Emirates
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models. For example, we have little knowledge about the potential of these models and their societal impacts in diverse linguistic and cultural settings. We alleviate this issue for Arabic, a wide collection of languages and dialectal varieties spoken by ~450 million people, by introducing JASMINE. JASMINE is a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 6.7 billion parameters, pretrained on a large and diverse dataset (~235 GB of text). We also carefully design and release a comprehensive benchmark for both automated and human evaluation of Arabic autoregressive models, with coverage of potential social biases, harms, and toxicity. Using our novel benchmark, we evaluate JASMINE extensively, showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark to interested researchers, along with code for experimenting with them.
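The few-shot setting evaluated in the abstract relies on in-context learning: a handful of labeled demonstrations are concatenated into a prompt, and the frozen autoregressive model completes the pattern with no gradient updates. Below is a minimal Python sketch of this procedure using the Hugging Face transformers library; the checkpoint name "UBC-NLP/Jasmine-350M" and the sentiment-classification prompt are illustrative assumptions, not the paper's released artifacts or benchmark tasks.

# Minimal sketch of few-shot (in-context) evaluation with a causal LM.
# The checkpoint id below is a hypothetical placeholder; substitute whichever
# JASMINE checkpoint the authors release. Requires: pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UBC-NLP/Jasmine-350M"  # hypothetical checkpoint name (assumption)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A few labeled demonstrations followed by an unlabeled query; the frozen
# model is expected to continue the pattern (no fine-tuning involved).
demos = [
    ("النص: أحببت هذا الفيلم كثيرا", "التصنيف: إيجابي"),  # "I loved this movie" -> positive
    ("النص: كانت الخدمة سيئة للغاية", "التصنيف: سلبي"),   # "The service was terrible" -> negative
]
query = "النص: تجربة رائعة أنصح بها"  # "A wonderful experience, I recommend it"

prompt = "\n\n".join(f"{text}\n{label}" for text, label in demos)
prompt += f"\n\n{query}\nالتصنيف:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens, i.e. the predicted label.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())

Greedy decoding (do_sample=False) keeps the predicted label deterministic; scoring the log-likelihood of each candidate label instead of free generation is the other common way such few-shot evaluations are run.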
Pages: 16721-16744
Number of pages: 24
Related Papers
50 records in total
  • [1] JASMINE: Arabic GPT Models for Few-Shot Learning
    Nagoudi, El Moatez Billah
    Abdul-Mageed, Muhammad
    Elmadany, AbdelRahim
    Inciarte, Alcides Alcoba
    Khondaker, Md Tawkat Islam
    EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings, 2023, : 16721 - 16744
  • [2] True Few-Shot Learning with Language Models
    Perez, Ethan
    Kiela, Douwe
    Cho, Kyunghyun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [3] Few-Shot Few-Shot Learning and the Role of Spatial Attention
    Lifchitz, Yann
    Avrithis, Yannis
    Picard, Sylvaine
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2693 - 2700
  • [4] Multimodal Few-Shot Learning with Frozen Language Models
    Tsimpoukelli, Maria
    Menick, Jacob
    Cabi, Serkan
    Eslami, S. M. Ali
    Vinyals, Oriol
    Hill, Felix
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] Defensive Few-Shot Learning
    Li, Wenbin
    Wang, Lei
    Zhang, Xingxing
    Qi, Lei
    Huo, Jing
    Gao, Yang
    Luo, Jiebo
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) : 5649 - 5667
  • [6] Federated Few-shot Learning
    Wang, Song
    Fu, Xingbo
    Ding, Kaize
    Chen, Chen
    Chen, Huiyuan
    Li, Jundong
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 2374 - 2385
  • [7] Fractal Few-Shot Learning
    Zhou, Fobao
    Huang, Wenkai
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 16353 - 16367
  • [8] Survey on Few-shot Learning
    Zhao K.-L.
    Jin X.-L.
    Wang Y.-Z.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (02) : 349 - 369
  • [9] Variational Few-Shot Learning
    Zhang, Jian
    Zhao, Chenglong
    Ni, Bingbing
    Xu, Minghao
    Yang, Xiaokang
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1685 - 1694