JASMINE: Arabic GPT Models for Few-Shot Learning

Cited: 0
|
Authors
Nagoudi, El Moatez Billah [1 ,2 ]
Abdul-Mageed, Muhammad [1 ,2 ,3 ,4 ]
Elmadany, AbdelRahim [1 ,2 ]
Inciarte, Alcides Alcoba [1 ,2 ]
Khondaker, Md Tawkat Islam [1 ,2 ]
Affiliations
[1] Univ British Columbia, Deep Learning, Vancouver, BC, Canada
[2] Univ British Columbia, Nat Language Proc Grp, Vancouver, BC, Canada
[3] MBZUAI, Dept Nat Language Proc, Abu Dhabi, U Arab Emirates
[4] MBZUAI, Dept Machine Learning, Abu Dhabi, U Arab Emirates
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
D O I
None
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models. For example, we have little knowledge about the potential of these models and their societal impacts in diverse linguistic and cultural settings. We alleviate this issue for Arabic, a wide collection of languages and dialectal varieties spoken by ~450 million people, by introducing JASMINE. JASMINE is a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 6.7 billion parameters, pretrained on a large and diverse dataset (~235GB of text). We also carefully design and release a comprehensive benchmark for both automated and human evaluation of Arabic autoregressive models, with coverage of potential social biases, harms, and toxicity. Using our novel benchmark, we evaluate JASMINE extensively, showing powerful performance both intrinsically and in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark to interested researchers, along with code for experimenting with them.
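The few-shot evaluation the abstract describes relies on in-context learning: an autoregressive model is shown k labeled demonstrations followed by an unlabeled query, and its completion is read off as the prediction. The sketch below shows how such a k-shot prompt is typically assembled; the sentiment task, labels, and template are illustrative assumptions for readability, not JASMINE's actual evaluation protocol.

```python
# Minimal sketch of k-shot prompt construction for an autoregressive LM.
# Task (sentiment), labels, and template are illustrative assumptions,
# not the actual JASMINE benchmark format.

def build_few_shot_prompt(demonstrations, query, instruction=""):
    """Concatenate k labeled demonstrations followed by an unlabeled query.

    demonstrations: list of (text, label) pairs shown in-context.
    query: the input whose label the model should complete.
    """
    parts = [instruction] if instruction else []
    for text, label in demonstrations:
        parts.append(f"Text: {text}\nLabel: {label}")
    # The final item leaves the label slot empty for the model to fill in.
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)

# Hypothetical 2-shot example (English shown for readability; an Arabic
# evaluation would use Arabic text and labels).
demos = [
    ("The food was wonderful.", "positive"),
    ("The service was terribly slow.", "negative"),
]
prompt = build_few_shot_prompt(demos, "I loved the atmosphere.")
print(prompt)
```

The prompt string would then be fed to the model's generation API, with the first generated token(s) mapped back to a label.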
Pages: 16721 - 16744 (24 pages)