Accelerated Gradient Method for Multi-Task Sparse Learning Problem

Cited by: 126
Authors
Chen, Xi [1 ]
Pan, Weike [2 ]
Kwok, James T. [2 ]
Carbonell, Jaime G. [1 ]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
[2] Hong Kong Univ Sci & Technol, Dept Comp Sci & Technol, Hong Kong, Peoples R China
Keywords
multi-task learning; L-1-infinity regularization; optimal method; gradient descent; SHRINKAGE;
DOI
10.1109/ICDM.2009.128
CLC number
TP [automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Many real-world learning problems can be recast as multi-task learning problems that exploit correlations among different tasks to obtain better generalization performance than learning each task individually. The feature selection problem in the multi-task setting has many applications in computer vision, text classification, and bioinformatics. Generally, it can be realized by solving an L-1-infinity-regularized optimization problem, whose solution automatically yields joint sparsity across the different tasks. However, due to the nonsmooth nature of the L-1-infinity norm, no efficient training algorithm exists for solving such problems with general convex loss functions. In this paper, we propose an accelerated gradient method based on an "optimal" first-order black-box method due to Nesterov and provide the convergence rate for smooth convex loss functions. For nonsmooth convex loss functions, such as the hinge loss, our method still converges quickly in practice. Moreover, by exploiting the structure of the L-1-infinity ball, we solve the black-box oracle in Nesterov's method by a simple sorting scheme. Our method is suitable for large-scale multi-task learning problems since it only uses first-order information and is very easy to implement. Experimental results show that our method significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy.
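To illustrate the kind of procedure the abstract describes, here is a minimal sketch of a Nesterov-style accelerated projected gradient loop with a sorting-based Euclidean projection. As a simplification, the projection below is onto the plain L-1 ball (the standard single-vector sorting scheme) rather than the paper's L-1-infinity ball, and all function and parameter names (`project_l1_ball`, `accelerated_projected_gradient`, `radius`, `L`) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def project_l1_ball(v, z=1.0):
    """Euclidean projection of v onto the l1 ball of radius z via sorting (O(n log n))."""
    if np.abs(v).sum() <= z:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                       # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - z) / np.arange(1, len(u) + 1) > 0)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)               # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def accelerated_projected_gradient(grad, x0, L, radius, iters=100):
    """Nesterov-style accelerated projected gradient for a smooth convex loss.

    grad: gradient oracle of the loss; L: Lipschitz constant of the gradient;
    radius: radius of the constraint ball (the 'black-box oracle' is the projection).
    """
    x = x_prev = x0.copy()
    for k in range(1, iters + 1):
        beta = (k - 1) / (k + 2)                       # momentum coefficient
        y = x + beta * (x - x_prev)                    # extrapolation step
        x_prev = x
        x = project_l1_ball(y - grad(y) / L, radius)   # projected gradient step
    return x
```

The design point the abstract makes is that the only per-iteration cost beyond a gradient evaluation is this sorting-based projection, which is what makes the method first-order and scalable; the paper's contribution is an analogous sorting scheme for the structured L-1-infinity ball in the multi-task (matrix) case.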
Pages: 746+
Number of pages: 2
Related papers
50 records in total
  • [21] Decentralized multi-task reinforcement learning policy gradient method with momentum over networks
    Shi Junru
    Wang Qiong
    Liu Muhua
    Ji Zhihang
    Zheng Ruijuan
    Wu Qingtao
    APPLIED INTELLIGENCE, 2023, 53 (09) : 10365 - 10379
  • [22] AN ACCELERATED GRADIENT METHOD FOR NONCONVEX SPARSE SUBSPACE CLUSTERING PROBLEM
    Li, Hongwu
    Zhang, Haibin
    Xiao, Yunhai
    PACIFIC JOURNAL OF OPTIMIZATION, 2022, 18 (02): : 265 - 280
  • [23] Multi-task Sparse Gaussian Processes with Improved Multi-task Sparsity Regularization
    Zhu, Jiang
    Sun, Shiliang
    PATTERN RECOGNITION (CCPR 2014), PT I, 2014, 483 : 54 - 62
  • [24] Conflict-Averse Gradient Descent for Multi-task Learning
    Liu, Bo
    Liu, Xingchao
    Jin, Xiaojie
    Stone, Peter
    Liu, Qiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [25] Online Multi-Task Gradient Temporal-Difference Learning
    Sreenivasan, Vishnu Purushothaman
    Ammar, Haitham Bou
    Eaton, Eric
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 3136 - 3137
  • [26] Multi-task Learning via Non-sparse Multiple Kernel Learning
    Samek, Wojciech
    Binder, Alexander
    Kawanabe, Motoaki
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS: 14TH INTERNATIONAL CONFERENCE, CAIP 2011, PT I, 2011, 6854 : 335 - 342
  • [27] Representation learning with deep sparse auto-encoder for multi-task learning
    Zhu, Yi
    Wu, Xindong
    Qiang, Jipeng
    Hu, Xuegang
    Zhang, Yuhong
    Li, Peipei
    PATTERN RECOGNITION, 2022, 129
  • [28] Adaptive Group Sparse Multi-task Learning via Trace Lasso
    Liu, Sulin
    Pan, Sinno Jialin
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2358 - 2364
  • [29] Robust Visual Tracking via Structured Multi-Task Sparse Learning
    Zhang, Tianzhu
    Ghanem, Bernard
    Liu, Si
    Ahuja, Narendra
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2013, 101 (02) : 367 - 383
  • [30] PARTS-BASED MULTI-TASK SPARSE LEARNING FOR VISUAL TRACKING
    Kang, Zhengjian
    Wong, Edward K.
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 4022 - 4026