Compression-Based Regularization With an Application to Multitask Learning

Cited by: 6
Authors
Vera, Matias [1]
Vega, Leonardo Rey [1, 2]
Piantanida, Pablo [3]
Affiliations
[1] Univ Buenos Aires, Fac Ingn, RA-1053 Buenos Aires, DF, Argentina
[2] Univ Buenos Aires, CSC CONICET, RA-1053 Buenos Aires, DF, Argentina
[3] Univ Paris Sud, CNRS, Cent Supelec, F-91400 Orsay, France
Keywords
Multitask learning; information bottleneck; regularization; Arimoto-Blahut algorithm; side information; CLASSIFICATION; INFORMATION; FRAMEWORK; CAPACITY
DOI
10.1109/JSTSP.2018.2846218
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract
This paper investigates, on information-theoretic grounds, a learning problem based on the principle that any regularity in a given dataset can be exploited to extract compact features from data, i.e., using fewer bits than needed to fully describe the data itself, in order to build meaningful representations of relevant content (multiple labels). We begin by studying a multitask learning (MTL) problem from the point of view of the misclassification probability averaged over the tasks, and link it with the popular cross-entropy criterion. Our approach allows an information-theoretic formulation of an MTL problem as a supervised learning framework, in which the prediction models for several related tasks are learned jointly from common representations to achieve better generalization performance. More precisely, our formulation of the MTL problem can be interpreted as an information bottleneck problem with side information at the decoder. Based on that, we present an iterative algorithm for computing the optimal tradeoffs and study some of its convergence properties. An important feature of this algorithm is that it provides a natural safeguard against overfitting, because it minimizes the average risk while taking into account a penalization induced by the model complexity. Remarkably, empirical results illustrate that there exists an optimal information rate minimizing the excess risk, which depends on the nature and the amount of available training data. Applications to hierarchical text categorization and distributional word clusters are also investigated, extending previous works.
Pages: 1063-1076
Number of pages: 14
Related papers
50 in total
  • [31] Compression-Based Arabic Text Classification
    Ta'amneh, Haneen
    Abu Keshek, Ehsan
    Issa, Manar Bani
    Al-Ayyoub, Mahmoud
    Jararweh, Yaser
    2014 IEEE/ACS 11TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2014, : 594 - 600
  • [32] Compression-based similarity in EEG signals
    Prilepok, Michal
    Platos, Jan
    Snasel, Vaclav
    Jahan, Ibrahim Salem
    2013 13TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2013, : 247 - 252
  • [33] Compression-based pruning of decision lists
    Pfahringer, B
    MACHINE LEARNING : ECML-97, 1997, 1224 : 199 - 212
  • [34] A compression-based backward approach for the forward sparse modeling with application to speech coding
    Omara, A. N.
    Hefnawy, A. A.
    Zekry, Abdelhalim
    COMPUTERS & ELECTRICAL ENGINEERING, 2017, 62 : 612 - 629
  • [35] Application of compression-based distance measures to protein sequence classification: a methodological study
    Kocsor, A
    Kertész-Farkas, A
    Kaján, L
    Pongor, S
    BIOINFORMATICS, 2006, 22 (04) : 407 - 412
  • [36] Enhancing metagenomic classification with compression-based features
    Silva, Jorge Miguel
    Almeida, Joao Rafael
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 156
  • [37] Compression-based data mining of sequential data
    Eamonn Keogh
    Stefano Lonardi
    Chotirat Ann Ratanamahatana
    Li Wei
    Sang-Hee Lee
    John Handley
    Data Mining and Knowledge Discovery, 2007, 14 : 99 - 129
  • [38] OCTEN: ONLINE COMPRESSION-BASED TENSOR DECOMPOSITION
    Gujral, Ekta
    Pasricha, Ravdeep
    Yang, Tianxiong
    Papalexakis, Evangelos E.
    2019 IEEE 8TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP 2019), 2019, : 455 - 459
  • [39] Compression-based Modelling of Musical Similarity Perception
    Pearce, Marcus
    Mullensiefen, Daniel
    JOURNAL OF NEW MUSIC RESEARCH, 2017, 46 (02) : 135 - 155
  • [40] Compression-based inference of network motif sets
    Benichou, Alexis
    Masson, Jean-Baptiste
    Vestergaard, Christian L.
    PLOS COMPUTATIONAL BIOLOGY, 2024, 20 (10)