Spatial Mixture-of-Experts

Cited by: 0
Authors
Dryden, Nikoli [1 ]
Hoefler, Torsten [1 ]
Affiliation
[1] Swiss Fed Inst Technol, Zurich, Switzerland
Funding
EU Horizon 2020
Keywords
(not listed)
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Many data have an underlying dependence on spatial location; it may be weather on the Earth, a simulation on a mesh, or a registered image. Yet this feature is rarely taken advantage of, and violates common assumptions made by many neural network layers, such as translation equivariance. Further, many works that do incorporate locality fail to capture fine-grained structure. To address this, we introduce the Spatial Mixture-of-Experts (SMOE) layer, a sparsely-gated layer that learns spatial structure in the input domain and routes experts at a fine-grained level to utilize it. We also develop new techniques to train SMOEs, including a self-supervised routing loss and damping expert errors. Finally, we show strong results for SMOEs on numerous tasks, and set new state-of-the-art results for medium-range weather prediction and post-processing ensemble weather forecasts.
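The abstract only describes the SMOE layer at a high level. As a rough illustration of the general idea of fine-grained, spatially aware expert routing, the following is a minimal sketch of a position-conditioned mixture of 1x1-convolution experts in PyTorch. All names (SpatialMoE, gate_logits) and design choices (dense dispatch, straight-through top-1 gating) are assumptions made here for readability; the paper's actual SMOE layer, its self-supervised routing loss, and its expert-error damping are not reproduced.
```python
# Illustrative sketch only -- NOT the authors' implementation of SMOE.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialMoE(nn.Module):
    """Routes each spatial location (e.g. a grid cell) to one of several 1x1-conv experts.

    The gate depends only on spatial position, so the layer can learn a fixed
    spatial decomposition of the domain (e.g. different experts for different
    regions of a weather grid).
    """

    def __init__(self, in_channels: int, out_channels: int, num_experts: int,
                 height: int, width: int):
        super().__init__()
        # One learned gating logit per expert per spatial location.
        self.gate_logits = nn.Parameter(torch.zeros(num_experts, height, width))
        # Experts: lightweight 1x1 convolutions applied at every location.
        self.experts = nn.ModuleList(
            nn.Conv2d(in_channels, out_channels, kernel_size=1)
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, height, width)
        # Hard top-1 routing per location; the straight-through trick keeps it differentiable.
        soft = F.softmax(self.gate_logits, dim=0)                 # (E, H, W)
        hard = F.one_hot(soft.argmax(dim=0), soft.shape[0])       # (H, W, E)
        hard = hard.permute(2, 0, 1).to(soft.dtype)               # (E, H, W)
        gate = hard + soft - soft.detach()                        # straight-through estimator
        # Dense dispatch for clarity: every expert runs everywhere, then is masked.
        # A real sparse implementation would gather only the routed locations.
        out = torch.stack([expert(x) for expert in self.experts], dim=1)  # (B, E, C_out, H, W)
        return (gate.unsqueeze(0).unsqueeze(2) * out).sum(dim=1)          # (B, C_out, H, W)


# Example usage on a toy 32x32 grid.
layer = SpatialMoE(in_channels=8, out_channels=16, num_experts=4, height=32, width=32)
y = layer(torch.randn(2, 8, 32, 32))   # -> shape (2, 16, 32, 32)
```
Note the contrast with a standard sparsely-gated MoE, where routing is conditioned on the content of each token: here the gate is a function of spatial position, which is one way to exploit the location dependence the abstract refers to.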
Pages: 17
Related Papers
50 records in total
  • [21] Efficient Reflectance Capture With a Deep Gated Mixture-of-Experts
    Ma, Xiaohe
    Yu, Yaxin
    Wu, Hongzhi
    Zhou, Kun
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (07) : 4246 - 4256
  • [22] Self-Supervised Mixture-of-Experts by Uncertainty Estimation
    Zheng, Zhuobin
    Yuan, Chun
    Zhu, Xinrui
    Lin, Zhihui
    Cheng, Yangyang
    Shi, Cheng
    Ye, Jiahui
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 5933 - 5940
  • [23] Steered Mixture-of-Experts for Light Field Video Coding
    Avramelos, Vasileios
    Saenen, Ignace
    Verhack, Ruben
    Van Wallendael, Glenn
    Lambert, Peter
    Sikora, Thomas
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XLI, 2018, 10752
  • [24] Steered Mixture-of-Experts Approximation of Spherical Image Data
    Verhack, Ruben
    Madhu, Nilesh
    Van Wallendael, Glenn
    Lambert, Peter
    Sikora, Thomas
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 256 - 260
  • [25] A mixture-of-experts approach for gene regulatory network inference
    Shao, Borong
    Lavesson, Niklas
    Boeva, Veselka
    Shahzad, Raja Khurram
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2016, 14 (03) : 258 - 275
  • [26] Practical and theoretical aspects of mixture-of-experts modeling: An overview
    Nguyen, Hien D.
    Chamroukhi, Faicel
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2018, 8 (04)
  • [27] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
    Du, Nan
    Huang, Yanping
    Dai, Andrew M.
    Tong, Simon
    Lepikhin, Dmitry
    Xu, Yuanzhong
    Krikun, Maxim
    Zhou, Yanqi
    Yu, Adams Wei
    Firat, Orhan
    Zoph, Barret
    Fedus, Liam
    Bosma, Maarten
    Zhou, Zongwei
    Wang, Tao
    Wang, Yu Emma
    Webster, Kellie
    Pellat, Marie
    Robinson, Kevin
    Meier-Hellstern, Kathleen
    Duke, Toju
    Dixon, Lucas
    Zhang, Kun
    Le, Quoc V.
    Wu, Yonghui
    Chen, Zhifeng
    Cui, Claire
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [28] On-line learning of a mixture-of-experts neural network
    Huh, NJ
    Oh, JH
    Kang, K
    JOURNAL OF PHYSICS A-MATHEMATICAL AND GENERAL, 2000, 33 (48): : 8663 - 8672
  • [29] MoE-SPNet: A mixture-of-experts scene parsing network
    Fu, Huan
    Gong, Mingming
    Wang, Chaohui
    Tao, Dacheng
    PATTERN RECOGNITION, 2018, 84 : 226 - 236
  • [30] SPEECHMOE2: MIXTURE-OF-EXPERTS MODEL WITH IMPROVED ROUTING
    You, Zhao
    Feng, Shulin
    Su, Dan
    Yu, Dong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7217 - 7221