Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks

Cited by: 1995
Authors
Oquab, Maxime [1]
Bottou, Leon [2]
Laptev, Ivan [1]
Sivic, Josef [1]
Affiliations
[1] INRIA, Paris, France
[2] MSR, New York, NY USA
DOI
10.1109/CVPR.2014.222
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks (CNNs) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations, as opposed to the hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with a limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representations for images in the PASCAL VOC dataset. We show that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on the PASCAL VOC 2007 and 2012 datasets. We also show promising results for object and action localization.
Pages: 1717-1724
Page count: 8
Related papers
50 in total
  • [31] EGOCENTRIC ACTIVITY RECOGNITION BY LEVERAGING MULTIPLE MID-LEVEL REPRESENTATIONS
    Hsieh, Peng-Ju
    Tin, Yen-Hang
    Chen, Yu-Hsiu
    Hsu, Winston
    2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), 2016
  • [32] Learning Ordinal Relationships for Mid-Level Vision
    Zoran, Daniel
    Isola, Phillip
    Krishnan, Dilip
    Freeman, William T.
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 388 - 396
  • [33] DeepCAMP: Deep Convolutional Action & Attribute Mid-Level Patterns
    Diba, Ali
    Pazandeh, Ali Mohammad
    Pirsiavash, Hamed
    Gool, Luc Van
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3557 - 3565
  • [34] Image Classification Using Convolutional Neural Networks
    Filippov, S. A.
    AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS, 2024, 58 (SUPPL3) : S143 - S149
  • [35] String representations and distances in deep Convolutional Neural Networks for image classification
    Barat, Cecile
    Ducottet, Christophe
    PATTERN RECOGNITION, 2016, 54 : 104 - 115
  • [36] Learning Image Representation Based on Convolutional Neural Networks
    Yang, Zhanbo
    Hu, Fei
    Wang, Jingyuan
    Zhang, Jinjing
    Li, Li
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT II, 2017, 10635 : 642 - 652
  • [37] A VLSI architecture suitable for mid-level image processing
    Dessbesell, Gustavo F.
    Pacheco, Marcio A.
    Martins, Joao B. dos S.
    Molz, Rolf Fredi
    2008 4TH SOUTHERN CONFERENCE ON PROGRAMMABLE LOGIC, PROCEEDINGS, 2008, : 87 - +
  • [38] Celiac Disease Deep Learning Image Classification Using Convolutional Neural Networks
    Carreras, Joaquim
    JOURNAL OF IMAGING, 2024, 10 (08)
  • [39] Time Series Classification Using Federated Convolutional Neural Networks and Image-Based Representations
    Silva, Felipe A. R.
    Orang, Omid
    Javier Erazo-Costa, Fabricio
    Silva, Petronio C. L.
    Barros, Pedro H.
    Ferreira, Ricardo P. M.
    Gadelha Guimaraes, Frederico
    IEEE ACCESS, 2025, 13 : 56180 - 56194
  • [40] Image interpolation using convolutional neural networks with deep recursive residual learning
    Kwok-Wai Hung
    Kun Wang
    Jianmin Jiang
    Multimedia Tools and Applications, 2019, 78 : 22813 - 22831