JASMINE: Arabic GPT Models for Few-Shot Learning

Cited by: 0
Authors
Nagoudi, El Moatez Billah [1 ,2 ]
Abdul-Mageed, Muhammad [1 ,2 ,3 ,4 ]
Elmadany, AbdelRahim [1 ,2 ]
Inciarte, Alcides Alcoba [1 ,2 ]
Khondaker, Md Tawkat Islam [1 ,2 ]
Affiliations
[1] Univ British Columbia, Deep Learning, Vancouver, BC, Canada
[2] Univ British Columbia, Nat Language Proc Grp, Vancouver, BC, Canada
[3] MBZUAI, Dept Nat Language Proc, Abu Dhabi, U Arab Emirates
[4] MBZUAI, Dept Machine Learning, Abu Dhabi, U Arab Emirates
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
DOI
N/A
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models. For example, we have little knowledge about the potential of these models and their societal impacts in diverse linguistic and cultural settings. We alleviate this issue for Arabic, a wide collection of languages and dialectal varieties with a population of ~450 million, by introducing JASMINE. JASMINE is a suite of powerful Arabic autoregressive Transformer language models ranging in size from 300 million to 6.7 billion parameters, pretrained on a large and diverse dataset (~235 GB of text). We also carefully design and release a comprehensive benchmark for both automated and human evaluation of Arabic autoregressive models, with coverage of potential social biases, harms, and toxicity. Using our novel benchmark, we evaluate JASMINE extensively, showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark to interested researchers, along with code for experimenting with them.
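The few-shot evaluation the abstract describes follows the standard in-context pattern for autoregressive models: k labeled demonstrations are concatenated ahead of a test input, and the model scores candidate labels as continuations. Below is a minimal sketch of that pattern, not the authors' released code; the Hugging Face model identifier and the Arabic prompt template are placeholder assumptions, since neither is specified in this record.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder assumption: the record does not give a public checkpoint path.
MODEL_NAME = "UBC-NLP/jasmine-350m"  # hypothetical hub identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def label_logprob(prompt: str, label: str) -> float:
    """Sum of token log-probabilities the model assigns to `label` after `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    label_ids = tokenizer(label, add_special_tokens=False, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, label_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict token t+1, so score only the label span.
    log_probs = logits[:, :-1].log_softmax(dim=-1)
    label_span = log_probs[0, prompt_ids.shape[1] - 1:]
    return label_span.gather(1, label_ids[0].unsqueeze(1)).sum().item()

def few_shot_classify(demos, text, labels):
    """demos: (input, label) pairs used as in-context demonstrations."""
    # Illustrative Arabic template ("text:" / "label:"); real prompts vary by task.
    prompt = "".join(f"نص: {x}\nالتصنيف: {y}\n\n" for x, y in demos)
    prompt += f"نص: {text}\nالتصنيف: "
    return max(labels, key=lambda lab: label_logprob(prompt, lab))
```

Scoring label log-likelihoods, rather than sampling free-form generations, keeps the comparison between candidate labels deterministic, which is the usual choice for few-shot classification benchmarks.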
Pages: 16721 - 16744
Page count: 24
Related Papers (50 in total)
  • [21] RankDNN: Learning to Rank for Few-Shot Learning
    Guo, Qianyu
Gong, Haotong
    Wei, Xujun
    Fu, Yanwei
    Yu, Yizhou
    Zhang, Wenqiang
    Ge, Weifeng
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 728 - 736
  • [22] Learning about few-shot concept learning
Rastogi, Ananya
    Nature Computational Science, 2022, 2 : 698 - 698
  • [23] Co-Learning for Few-Shot Learning
    Xu, Rui
    Xing, Lei
    Shao, Shuai
    Liu, Baodi
    Zhang, Kai
    Liu, Weifeng
    NEURAL PROCESSING LETTERS, 2022, 54 (04) : 3339 - 3356
  • [24] Federated Few-Shot Learning with Adversarial Learning
    Fan, Chenyou
    Huang, Jianwei
    2021 19TH INTERNATIONAL SYMPOSIUM ON MODELING AND OPTIMIZATION IN MOBILE, AD HOC, AND WIRELESS NETWORKS (WIOPT), 2021,
  • [25] Personalized Federated Few-Shot Learning
    Zhao, Yunfeng
    Yu, Guoxian
    Wang, Jun
    Domeniconi, Carlotta
    Guo, Maozu
    Zhang, Xiangliang
    Cui, Lizhen
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (02) : 2534 - 2544
  • [26] Few-Shot Classification with Contrastive Learning
    Yang, Zhanyuan
    Wang, Jinghua
    Zhu, Yingying
    COMPUTER VISION, ECCV 2022, PT XX, 2022, 13680 : 293 - 309
  • [27] A Feature Generator for Few-Shot Learning
    Kanagalingam, Heethanjan
    Pathmanathan, Thenukan
    Ketheeswaran, Navaneethan
    Vathanakumar, Mokeeshan
    Afham, Mohamed
    Rodrigo, Ranga
    arXiv,
  • [28] Few-shot learning for ear recognition
    Zhang, Jie
    Yu, Wen
    Yang, Xudong
    Deng, Fang
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO AND SIGNAL PROCESSING (IVSP 2019), 2019, : 50 - 54
  • [29] Few-Shot Learning with Novelty Detection
    Bjerge, Kim
    Bodesheim, Paul
    Karstoft, Henrik
    DEEP LEARNING THEORY AND APPLICATIONS, PT I, DELTA 2024, 2024, 2171 : 340 - 363
  • [30] Prototype Completion for Few-Shot Learning
    Zhang, Baoquan
    Li, Xutao
    Ye, Yunming
    Feng, Shanshan
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12250 - 12268