Weakly Supervised Visual Dictionary Learning by Harnessing Image Attributes

Cited by: 35
Authors
Gao, Yue [2]
Ji, Rongrong [1]
Liu, Wei [3]
Dai, Qionghai [2]
Hua, Gang [4]
Affiliations
[1] Xiamen Univ, Dept Cognit Sci, Sch Informat Sci & Engn, Xiamen 361005, Peoples R China
[2] Tsinghua Univ, Dept Automat, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
[4] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
Funding
National Natural Science Foundation of China; US National Science Foundation;
Keywords
Bag-of-features; visual dictionary; image attribute; weakly supervised learning; hidden Markov random field; image classification; image search; VOCABULARIES; FEATURES;
DOI
10.1109/TIP.2014.2364536
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The bag-of-features (BoF) representation has been applied extensively across computer vision applications. To extract a discriminative and descriptive BoF, one important step is to learn a good dictionary that minimizes the quantization loss between local features and codewords. While most existing visual dictionary learning approaches rely on unsupervised feature quantization, the latest trend has turned to supervised learning that harnesses the semantic labels of images or regions. However, such labels are typically too expensive to acquire, which restricts the scalability of supervised dictionary learning approaches. In this paper, we propose to leverage image attributes to weakly supervise the dictionary learning procedure without requiring any actual labels. As a key contribution, our approach establishes a generative hidden Markov random field (HMRF), which models the quantized codewords as the observed states and the image attributes as the hidden states. Dictionary learning is then performed by supervised grouping of the observed states, where the supervision stems from the hidden states of the HMRF. In this way, the proposed dictionary learning approach incorporates image attributes to learn a semantic-preserving BoF representation without any genuine supervision. Experiments on large-scale image retrieval and classification tasks corroborate that our approach significantly outperforms the state-of-the-art unsupervised dictionary learning approaches.
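For orientation, the unsupervised baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of plain k-means dictionary learning and BoF quantization, not the attribute-driven HMRF method of the paper. The function names (`learn_dictionary`, `quantize`, `quantization_loss`, `bof_histogram`) and all parameters are assumptions introduced here for illustration only.

```python
import numpy as np


def learn_dictionary(features, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm): the unsupervised baseline, not the HMRF method."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct local features.
    codewords = features[rng.choice(len(features), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = quantize(features, codewords)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # keep the old codeword if its cluster is empty
                codewords[j] = members.mean(axis=0)
    return codewords


def quantize(features, codewords):
    """Index of the nearest codeword (squared Euclidean distance) for each feature."""
    d2 = ((features[:, None, :] - codewords[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def quantization_loss(features, codewords):
    """Mean squared distance between each feature and its assigned codeword."""
    assign = quantize(features, codewords)
    return float(((features - codewords[assign]) ** 2).sum(axis=1).mean())


def bof_histogram(features, codewords):
    """L1-normalised codeword histogram: the BoF vector of one image."""
    counts = np.bincount(quantize(features, codewords), minlength=len(codewords))
    return counts / counts.sum()
```

Here `features` is assumed to be an array of local descriptors with one row per descriptor. A weakly supervised variant such as the one the abstract describes would replace the purely distance-based grouping with one guided by image attributes.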
Pages: 5400-5411
Page count: 12
Related papers
  • [1] Weakly Supervised Dictionary Learning
    You, Zeyu
    Raich, Raviv
    Fern, Xiaoli Z.
    Kim, Jinsub
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (10) : 2527 - 2541
  • [2] Saliency Guided Dictionary Learning for Weakly-Supervised Image Parsing
    Lai, Baisheng
    Gong, Xiaojin
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3630 - 3639
  • [3] Weakly-Supervised Cross-Domain Dictionary Learning for Visual Recognition
    Zhu, Fan
    Shao, Ling
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2014, 109 (1-2) : 42 - 59
  • [4] Weakly-Supervised Cross-Domain Dictionary Learning for Visual Recognition
    Fan Zhu
    Ling Shao
    International Journal of Computer Vision, 2014, 109 : 42 - 59
  • [5] Weakly Supervised Learning of Objects, Attributes and Their Associations
    Shi, Zhiyuan
    Yang, Yongxin
    Hospedales, Timothy M.
    Xiang, Tao
    COMPUTER VISION - ECCV 2014, PT II, 2014, 8690 : 472 - 487
  • [6] Weakly-Supervised Image Annotation and Segmentation with Objects and Attributes
    Shi, Zhiyuan
    Yang, Yongxin
    Hospedales, Timothy M.
    Xiang, Tao
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) : 2525 - 2538
  • [7] Weakly-supervised learning of visual relations
    Peyre, Julia
    Laptev, Ivan
    Schmid, Cordelia
    Sivic, Josef
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5189 - 5198
  • [8] WEAKLY-SUPERVISED ANALYSIS DICTIONARY LEARNING WITH CARDINALITY CONSTRAINTS
    You, Zeyu
    Raich, Raviv
    Fern, Xiaoli Z.
    Kim, Jinsub
    2016 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2016,
  • [9] Semantic Transform: Weakly Supervised Semantic Inference for Relating Visual Attributes
    Shankar, Sukrit
    Lasenby, Joan
    Cipolla, Roberto
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 361 - 368
  • [10] Multimodal Visual Concept Learning with Weakly Supervised Techniques
    Bouritsas, Giorgos
    Koutras, Petros
    Zlatintsi, Athanasia
    Maragos, Petros
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 4914 - 4923