Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem

Cited by: 0
Authors
Cao, Jincheng [1]
Jiang, Ruichen [1]
Abolfazli, Nazanin [2]
Hamedani, Erfan Yazdandoost [2]
Mokhtari, Aryan [1]
Affiliations
[1] UT Austin, ECE Dept, Austin, TX 78712 USA
[2] Univ Arizona, SIE Dept, Tucson, AZ USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study a class of stochastic bilevel optimization problems, also known as stochastic simple bilevel optimization, in which we minimize a smooth stochastic objective function over the optimal solution set of another stochastic convex optimization problem. We introduce novel stochastic bilevel optimization methods that locally approximate the solution set of the lower-level problem via a stochastic cutting plane and then run a conditional gradient update with variance reduction techniques to control the error induced by using stochastic gradients. When the upper-level function is convex, our method requires $\tilde{\mathcal{O}}(\max\{1/\epsilon_f^2, 1/\epsilon_g^2\})$ stochastic oracle queries to obtain a solution that is $\epsilon_f$-optimal for the upper level and $\epsilon_g$-optimal for the lower level. This guarantee improves on the previous best-known complexity of $\tilde{\mathcal{O}}(\max\{1/\epsilon_f^4, 1/\epsilon_g^4\})$. Moreover, when the upper-level function is non-convex, our method requires at most $\tilde{\mathcal{O}}(\max\{1/\epsilon_f^3, 1/\epsilon_g^3\})$ stochastic oracle queries to find an $(\epsilon_f, \epsilon_g)$-stationary point. In the finite-sum setting, we show that the number of stochastic oracle calls required by our method is $\tilde{\mathcal{O}}(\sqrt{n}/\epsilon)$ and $\tilde{\mathcal{O}}(\sqrt{n}/\epsilon^2)$ for the convex and non-convex settings, respectively, where $\epsilon = \min\{\epsilon_f, \epsilon_g\}$.
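The setting described in the abstract can be written out as a short LaTeX sketch. All notation here (the feasible set $\mathcal{Z}$, the random variables $\theta$ and $\xi$, the estimators with hats, the step size $\gamma_t$, and the tolerance $K_t$) is assumed for illustration and does not appear in the abstract. The cutting-plane step shows one plausible form of the idea, not necessarily the paper's exact construction.

```latex
% Stochastic simple bilevel problem (notation assumed for illustration)
\begin{align*}
  \min_{x \in \mathbb{R}^d} \quad & f(x) = \mathbb{E}_{\theta}\bigl[\tilde{f}(x, \theta)\bigr] \\
  \text{s.t.} \quad & x \in \operatorname*{argmin}_{z \in \mathcal{Z}} \; g(z) = \mathbb{E}_{\xi}\bigl[\tilde{g}(z, \xi)\bigr]
\end{align*}

% Hypothetical iteration: stochastic cutting plane, then a conditional gradient step.
% \hat{g}_t, \widehat{\nabla g}_t, \widehat{\nabla f}_t are variance-reduced estimates at x_t;
% K_t \ge 0 is an assumed tolerance that absorbs the estimation error.
\begin{align*}
  \mathcal{X}_t &= \bigl\{ s \in \mathcal{Z} :
      \langle \widehat{\nabla g}_t,\, s - x_t \rangle \le \hat{g}_0 - \hat{g}_t + K_t \bigr\}, \\
  s_t &\in \operatorname*{argmin}_{s \in \mathcal{X}_t} \; \langle \widehat{\nabla f}_t,\, s \rangle, \\
  x_{t+1} &= x_t + \gamma_t \,(s_t - x_t).
\end{align*}
```

With exact quantities and $K_t = 0$, convexity of $g$ gives $\langle \nabla g(x_t), x^* - x_t \rangle \le g(x^*) - g(x_t) \le g(x_0) - g(x_t)$ for every lower-level minimizer $x^*$, so the halfspace never cuts off the lower-level solution set. The update needs only a linear minimization over $\mathcal{X}_t$, which is what makes the scheme projection-free.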
Pages: 27