Parallel-mentoring for Offline Model-based Optimization

Cited by: 0
Authors
Chen, Can [1, 2]
Beckham, Christopher [2, 3]
Liu, Zixuan [4]
Liu, Xue [1, 2]
Pal, Christopher [2, 3]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
[2] MILA Quebec AI Inst, Montreal, PQ, Canada
[3] Polytech Montreal, Montreal, PQ, Canada
[4] Univ Washington, Seattle, WA 98195 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study offline model-based optimization to maximize a black-box objective function with a static dataset of designs and scores. These designs encompass a variety of domains, including materials, robots and DNA sequences. A common approach trains a proxy on the static dataset to approximate the black-box objective function and performs gradient ascent to obtain new designs. However, this often results in poor designs due to the proxy inaccuracies for out-of-distribution designs. Recent studies indicate that: (a) gradient ascent with a mean ensemble of proxies generally outperforms simple gradient ascent, and (b) a trained proxy provides weak ranking supervision signals for design selection. Motivated by (a) and (b), we propose parallel-mentoring as an effective and novel method that facilitates mentoring among parallel proxies, creating a more robust ensemble to mitigate the out-of-distribution issue. We focus on the three-proxy case and our method consists of two modules. The first module, voting-based pairwise supervision, operates on three parallel proxies and captures their ranking supervision signals as pairwise comparison labels. These labels are combined through majority voting to generate consensus labels, which incorporate ranking supervision signals from all proxies and enable mutual mentoring. However, label noise arises due to possible incorrect consensus. To alleviate this, we introduce an adaptive soft-labeling module with soft-labels initialized as consensus labels. Based on bi-level optimization, this module fine-tunes proxies in the inner level and learns more accurate labels in the outer level to adaptively mentor proxies, resulting in a more robust ensemble. Experiments validate the effectiveness of our method. Our code is available here.
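The voting step described in the abstract can be illustrated with a small sketch. It assumes each proxy has already scored the same set of candidate designs, and it covers only the majority vote over pairwise comparison labels, not the adaptive soft-labeling module or the bi-level optimization. The function names `pairwise_labels` and `consensus_labels` and the scores are hypothetical, chosen for illustration; they are not taken from the authors' code.

```python
import numpy as np

def pairwise_labels(scores):
    """Return an (n, n) 0/1 matrix; entry (i, j) is 1 when scores[i] > scores[j]."""
    scores = np.asarray(scores, dtype=float)
    return (scores[:, None] > scores[None, :]).astype(int)

def consensus_labels(proxy_scores):
    """Majority vote over the pairwise labels of an odd number of proxies."""
    # Stack one (n, n) label matrix per proxy into shape (k, n, n).
    votes = np.stack([pairwise_labels(s) for s in proxy_scores])
    # A pair gets label 1 when more than half of the proxies voted 1.
    return (votes.sum(axis=0) * 2 > len(proxy_scores)).astype(int)

# Hypothetical scores from three proxies on the same four designs.
proxy_scores = [
    [0.1, 0.5, 0.3, 0.9],
    [0.2, 0.4, 0.6, 0.8],
    [0.3, 0.1, 0.2, 0.7],
]
labels = consensus_labels(proxy_scores)
print(labels)
```

In this toy case the three proxies disagree on how designs 0, 1 and 2 rank against each other, yet every pair still receives a single consensus label. Per the abstract, such labels can be wrong, which is what the soft-labeling module is meant to address.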
Pages: 18