Scale-space multi-view bag of words for scene categorization

Cited by: 24
Authors
Giveki, Davar [1]
Affiliations
[1] Malayer Univ, Dept Comp Engn, POB 65719-95863, Malayer, Iran
Keywords
Scene categorization; Bag of words; Scale-space features; Feature fusion; TF-IDF weighting; OF-VISUAL-WORDS; NEURAL-NETWORK; SPARSE REPRESENTATION; IMAGE CLASSIFICATION; FEATURES; MODEL; RETRIEVAL
DOI
10.1007/s11042-020-09759-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Although widely used in image categorization tasks, the Bag-of-Words (BoW) method still suffers from limitations such as overlooking spatial information. In this paper, we propose four improvements to the BoW method that take into account spatial and semantic information as well as information from multiple views. In particular, our contributions are: (a) encoding spatial information through a combination of wavelet-transform image scaling and a new image partitioning scheme, (b) proposing a visual word dictionary generation approach that is aware of both spatial information and content, (c) developing a content-aware feature weighting approach that considers the significance of the features for different semantics, and (d) proposing a novel weighting strategy to fuse color information when discriminative shape features are lacking. We call our method Scale-Space Multi-View Bag of Words (SSMV-BoW). We conduct extensive experiments to evaluate SSMV-BoW and compare it with state-of-the-art scene categorization methods, using four publicly available and widely used scene categorization benchmark datasets. The results demonstrate that SSMV-BoW outperforms methods based on both hand-crafted and deep learning features. In addition, ablation studies show that all four improvements contribute to the performance of SSMV-BoW.
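For orientation only, the plain BoW baseline that the abstract sets out to improve, combined with the TF-IDF weighting named in the keywords, can be sketched as below. This is not the SSMV-BoW method: the record gives no implementation details, so the toy dictionary, the 2-D descriptors, and the helper names `nearest_word`, `bow_histogram`, and `tfidf` are all assumptions made for illustration.

```python
import math

def nearest_word(descriptor, dictionary):
    """Index of the dictionary entry closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(dictionary)),
        key=lambda k: sum((d - w) ** 2 for d, w in zip(descriptor, dictionary[k])),
    )

def bow_histogram(descriptors, dictionary):
    """Count how many local descriptors are assigned to each visual word."""
    hist = [0] * len(dictionary)
    for desc in descriptors:
        hist[nearest_word(desc, dictionary)] += 1
    return hist

def tfidf(histograms):
    """Weight each count by term frequency times inverse document frequency."""
    n_images = len(histograms)
    n_words = len(histograms[0])
    # Document frequency: number of images in which each visual word occurs.
    df = [sum(1 for h in histograms if h[k] > 0) for k in range(n_words)]
    idf = [math.log(n_images / df[k]) if df[k] else 0.0 for k in range(n_words)]
    weighted = []
    for h in histograms:
        total = sum(h) or 1  # guard against images with no descriptors
        weighted.append([(h[k] / total) * idf[k] for k in range(n_words)])
    return weighted

# Hypothetical toy data: a 3-word dictionary and two "images" of 2-D descriptors.
dictionary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
images = [
    [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9]],
    [[0.0, 0.2], [4.8, 5.1]],
]
histograms = [bow_histogram(descs, dictionary) for descs in images]
weights = tfidf(histograms)
```

In this sketch a visual word that occurs in every image receives zero weight, which is the usual effect of TF-IDF. Note also that the histogram discards where each descriptor was found in the image, which is the spatial-information limitation the abstract refers to.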
Pages: 1223-1245
Page count: 23