Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks

Cited by: 1995
Authors
Oquab, Maxime [1]
Bottou, Leon [2]
Laptev, Ivan [1]
Sivic, Josef [1]
Affiliations
[1] INRIA, Paris, France
[2] MSR, New York, NY USA
DOI
10.1109/CVPR.2014.222
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks (CNNs) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations, as opposed to the hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents the application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with a limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representations for images in the PASCAL VOC dataset. We show that, despite differences in image statistics and tasks between the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on the PASCAL VOC 2007 and 2012 datasets. We also show promising results for object and action localization.
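The transfer idea summarized in the abstract (keep the layers learned on a large source dataset fixed and train only new layers on the small target dataset) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the randomly initialized "frozen" weights merely stand in for layers pre-trained on ImageNet, the single logistic output layer stands in for the newly trained layers, and every function name here is hypothetical.

```python
import math
import random


def make_frozen_layers(n_in, n_hidden, seed=0):
    """Stand-in for layers pre-trained on a large source dataset (assumption)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]


def extract_features(frozen_w, x):
    """Mid-level representation: frozen linear map followed by a ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_w]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_new_layer(features, labels, epochs=200, lr=0.1):
    """Train only the new output layer; the frozen layers are never updated."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the logistic loss w.r.t. the pre-activation
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(frozen_w, w, b, x):
    """Classify x using frozen features plus the newly trained layer."""
    f = extract_features(frozen_w, x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


# Tiny made-up "target" dataset with only four labeled samples.
frozen = make_frozen_layers(n_in=2, n_hidden=4)
X = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
feats = [extract_features(frozen, x) for x in X]
w, b = train_new_layer(feats, y)
```

Only `w` and `b` are fitted on the target data, so the number of parameters estimated from the small dataset stays small regardless of how large the frozen part is.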
Pages: 1717-1724
Page count: 8
Related papers
50 items in total
  • [1] LEARNING AND TRANSFERRING REPRESENTATIONS FOR IMAGE STEGANALYSIS USING CONVOLUTIONAL NEURAL NETWORK
    Qian, Yinlong
    Dong, Jing
    Wang, Wei
    Tan, Tieniu
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 2752 - 2756
  • [2] A Novel Method for Scene Classification Feeding Mid-Level Image Patch to Convolutional Neural Networks
    Yang, Fei
    Yang, Jinfu
    Wang, Ying
    Zhang, Gaoming
    INFORMATION TECHNOLOGY AND INTELLIGENT TRANSPORTATION SYSTEMS, VOL 2, 2017, 455 : 347 - 357
  • [3] Image Super-resolution Using Mid-level Representations
    Yang, Li
    Wang, Yaxing
    Mu, Xiaomin
    Wang, Yaping
    2016 INTERNATIONAL CONFERENCE ON INFORMATION ENGINEERING AND COMMUNICATIONS TECHNOLOGY (IECT 2016), 2016, : 291 - 296
  • [4] Unsupervised learning of mid-level visual representations
    Matteucci, Giulio
    Piasini, Eugenio
    Zoccolan, Davide
    CURRENT OPINION IN NEUROBIOLOGY, 2024, 84
  • [5] Local, Mid-Level and Convolutional Features Fusion Using Multiple Kernel Learning for Image Classification
    Lu, Yao
    Zhang, Hui
    Xie, Bojun
    2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019, : 390 - 394
  • [6] Transferring and Compressing Convolutional Neural Networks for Face Representations
    Grundstrom, Jakob
    Chen, Jiandan
    Ljungqvist, Martin Georg
    Astrom, Kalle
    IMAGE ANALYSIS AND RECOGNITION (ICIAR 2016), 2016, 9730 : 20 - 29
  • [7] Transferring Ensemble Representations Using Deep Convolutional Neural Networks for Small-Scale Image Classification
    Xia, Shuyin
    Xia, Yulong
    Yu, Hong
    Liu, Qun
    Luo, Yueguo
    Wang, Guoyin
    Chen, Zizhong
    IEEE ACCESS, 2019, 7 : 168175 - 168186
  • [8] A machine vision based pistachio sorting using transferred mid-level image representation of Convolutional Neural Network
    Farazi, Mohammad
    Abbas-Zadeh, Mohammad Javad
    Moradi, Hadi
    2017 10TH IRANIAN CONFERENCE ON MACHINE VISION AND IMAGE PROCESSING (MVIP), 2017, : 145 - 148
  • [9] Reuse of Mid-Level Feature in Deep Convolutional Neural Network
    Cai, ChaoQuan
    Wang, YiLei
    Wu, YingJie
    Chen, JingLin
    2017 INTERNATIONAL CONFERENCE ON GREEN INFORMATICS (ICGI), 2017, : 60 - 66
  • [10] DEEP NEURAL NETWORK BASED LEARNING AND TRANSFERRING MID-LEVEL AUDIO FEATURES FOR ACOUSTIC SCENE CLASSIFICATION
    Mun, Seongkyu
    Shon, Suwon
    Kim, Wooil
    Han, David K.
    Ko, Hanseok
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 796 - 800