Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization

Cited by: 0
|
Authors
Fang, Cong
Lin, Zhouchen [1 ]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab Machine Percept MOE, Beijing, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Nowadays, asynchronous parallel algorithms have received much attention in the optimization field due to the crucial demands of modern large-scale optimization problems. However, most asynchronous algorithms focus on convex problems; analysis of nonconvex problems is lacking. For the Asynchronous Stochastic Gradient Descent (ASGD) algorithm, the best result from (Lian et al. 2015) achieves only an asymptotic O(1/ε²) rate of convergence to stationary points (namely, ‖∇f(x)‖² ≤ ε) on nonconvex problems. In this paper, we study Stochastic Variance Reduced Gradient (SVRG) in the asynchronous setting. We propose the Asynchronous Stochastic Variance Reduced Gradient (ASVRG) algorithm for nonconvex finite-sum problems and develop two schemes for it, depending on whether the parameters are updated atomically or not. We prove that both schemes achieve linear speedup (a non-asymptotic O(n^(2/3)/ε) rate of convergence to stationary points) on nonconvex problems when the delay parameter satisfies τ < n^(1/3), where n is the number of training samples. We also establish a non-asymptotic O(n^(2/3) τ^(1/3)/ε) rate of convergence to stationary points for our algorithm without any assumption on τ. This further demonstrates that, even with asynchronous updating, SVRG requires fewer Incremental First-order Oracle (IFO) calls than Stochastic Gradient Descent and Gradient Descent. We also conduct experiments on a shared-memory multi-core system to demonstrate the efficiency of our algorithm.
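The variance-reduced gradient estimator that the abstract builds on can be illustrated with a minimal serial SVRG sketch. This is an illustration of the generic SVRG technique that ASVRG parallelizes asynchronously, not the authors' implementation; the function `svrg_nonconvex`, the oracle `grad_i`, and all parameter names here are hypothetical.

```python
import numpy as np

def svrg_nonconvex(grad_i, x0, n, epochs=20, inner_steps=None, lr=0.05, seed=0):
    """Serial SVRG sketch (the sequential core that ASVRG runs asynchronously).

    grad_i(x, i) must return the gradient of the i-th component f_i at x,
    where the objective is f(x) = (1/n) * sum_i f_i(x).
    """
    rng = np.random.default_rng(seed)
    m = inner_steps or n          # inner-loop length; n is a common choice
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # Full gradient at the snapshot: one pass over all n components (n IFO calls).
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased for the full gradient, with
            # variance shrinking as x approaches the snapshot point.
            v = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x -= lr * v
    return x
```

In ASVRG, multiple workers would execute the inner loop concurrently on a shared iterate, so each update may use a stale `x` delayed by up to τ steps; the abstract's rates bound the cost of that staleness.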
Pages: 794-800
Page count: 7
Related Papers
50 records in total
  • [11] Finding Global Optima in Nonconvex Stochastic Semidefinite Optimization with Variance Reduction
    Zeng, Jinshan
    Ma, Ke
    Yao, Yuan
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [12] General inertial proximal stochastic variance reduction gradient for nonconvex nonsmooth optimization
    Sun, Shuya
    He, Lulu
    JOURNAL OF INEQUALITIES AND APPLICATIONS, 2023, 2023 (01)
  • [13] General inertial proximal stochastic variance reduction gradient for nonconvex nonsmooth optimization
    Shuya Sun
    Lulu He
    Journal of Inequalities and Applications, 2023
  • [14] Decentralized Asynchronous Nonconvex Stochastic Optimization on Directed Graphs
    Kungurtsev, Vyacheslav
    Morafah, Mahdi
    Javidi, Tara
    Scutari, Gesualdo
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (04): : 1796 - 1804
  • [15] ASYNCHRONOUS PARALLEL NONCONVEX LARGE-SCALE OPTIMIZATION
    Cannelli, L.
    Facchinei, F.
    Kungurtsev, V.
    Scutari, G.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4706 - 4710
  • [16] Nonconvex Stochastic Optimization for Model Reduction
    Han-Fu Chen
    Hai-Tao Fang
    Journal of Global Optimization, 2002, 23 : 359 - 372
  • [17] Nonconvex stochastic optimization for model reduction
    Chen, HF
    Fang, HT
    JOURNAL OF GLOBAL OPTIMIZATION, 2002, 23 (3-4) : 359 - 372
  • [18] Stochastic Variance Reduced Optimization for Nonconvex Sparse Learning
    Li, Xingguo
    Zhao, Tuo
    Arora, Raman
    Liu, Han
    Haupt, Jarvis
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [19] Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
    Ge, Rong
    Li, Zhize
    Wang, Weiyao
    Wang, Xiang
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [20] Asynchronous Schemes for Stochastic and Misspecified Potential Games and Nonconvex Optimization
    Lei, Jinlong
    Shanbhag, Uday V.
    OPERATIONS RESEARCH, 2020, 68 (06) : 1742 - 1766