Scale-space multi-view bag of words for scene categorization

Cited by: 24

Authors:

Giveki, Davar [1]

Affiliations:

[1] Malayer Univ, Dept Comp Engn, POB 65719-95863, Malayer, Iran

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2021 / Vol. 80 / Issue 01

Keywords:

Scene categorization; Bag of words; Scale-space features; Feature fusion; TF-IDF weighting; OF-VISUAL-WORDS; NEURAL-NETWORK; SPARSE REPRESENTATION; IMAGE CLASSIFICATION; FEATURES; MODEL; RETRIEVAL;

DOI:

10.1007/s11042-020-09759-9

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Although widely used in image categorization tasks, the Bag-of-Words (BoW) method still suffers from limitations such as overlooking spatial information. In this paper, we propose four improvements to the BoW method that take into account spatial and semantic information as well as information from multiple views. In particular, our contributions are: (a) encoding spatial information based on a combination of wavelet transform image scaling and a new image partitioning scheme, (b) proposing a spatial-information- and content-aware visual word dictionary generation approach, (c) developing a content-aware feature weighting approach that considers the significance of the features for different semantics, and (d) proposing a novel weighting strategy to fuse color information when discriminative shape features are lacking. We call our method Scale-Space Multi-View Bag of Words (SSMV-BoW). We conducted extensive experiments to evaluate SSMV-BoW and compare it to state-of-the-art scene categorization methods. For our experiments, we use four publicly available and widely used scene categorization benchmark datasets. Results demonstrate that SSMV-BoW outperforms methods using either hand-crafted or deep learning features. In addition, ablation studies show that all four improvements contribute to the performance of SSMV-BoW.
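For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of a classic BoW histogram with plain TF-IDF weighting, not the SSMV-BoW method itself: the wavelet-based scaling, the partitioning scheme, the content-aware dictionary and weighting, and the color fusion are not reproduced. The function names, the toy vocabulary, and the toy descriptors are all assumptions made for this example.

```python
import numpy as np

def assign_words(descriptors, vocabulary):
    """Map each local descriptor to the index of its nearest visual word."""
    # Squared Euclidean distance between every descriptor and every word.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bow_histograms(images, vocabulary):
    """Count visual-word occurrences per image (one row per image)."""
    k = vocabulary.shape[0]
    return np.stack([
        np.bincount(assign_words(desc, vocabulary), minlength=k)
        for desc in images
    ]).astype(float)

def tfidf(counts):
    """Apply plain TF-IDF weighting to a (num_images, num_words) count matrix."""
    # Term frequency: word count divided by the number of descriptors in the image.
    tf = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
    # Document frequency: number of images in which each word occurs.
    df = (counts > 0).sum(axis=0)
    idf = np.log(counts.shape[0] / np.maximum(df, 1))
    return tf * idf

# Toy data (hypothetical): three 2-D visual words and two "images",
# each given as an array of 2-D local descriptors.
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
images = [
    np.array([[0.1, 0.2], [9.8, 10.1], [0.0, 0.3]]),
    np.array([[0.2, -0.1], [0.1, 9.7]]),
]

counts = bow_histograms(images, vocabulary)
weights = tfidf(counts)
print(counts)
print(weights)
```

In this toy run, the word that occurs in both images receives zero weight, which shows the main weakness of plain TF-IDF that content-aware weighting schemes aim to address: a frequent word is suppressed regardless of how informative it is for a particular scene class.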


Pages: 1223 - 1245

Number of pages: 23

50 items in total

[31] Multi-view stereo for large-scale scene reconstruction with MRF-based depth inference
Sun, Shang
Xu, Dan
Wu, Hao
Ying, Haocong
Mou, Yurui
COMPUTERS & GRAPHICS-UK, 2022, 106 : 248 - 258
[32] Large Scale Multi-view Stereopsis Evaluation
Jensen, Rasmus
Dahl, Anders
Vogiatzis, George
Tola, Engin
Aanaes, Henrik
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, : 406 - 413
[33] A Hierarchical Approach for Joint Multi-view Object Pose Estimation and Categorization
Ozay, Mete
Walas, Krzysztof
Leonardis, Ales
2014 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2014, : 5480 - 5487
[34] A Scale-Space Theory and Bag-of-Features Based Time Series Classification Method
Altay, Tayip
Baydogan, Mustafa Gokce
2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
[35] Multi-view and Multi-scale Recognition of Symmetric Patterns
Teferi, Dereje
Bigun, Josef
IMAGE ANALYSIS, PROCEEDINGS, 2009, 5575 : 657 - 666
[36] Multi-view stereo-regulated NeRF for urban scene novel view synthesis
Bian, Feihu
Xiong, Suya
Yi, Ran
Ma, Lizhuang
VISUAL COMPUTER, 2025, 41 (01) : 243 - 255
[37] Modelling human visual navigation using multi-view scene reconstruction
Pickup, Lyndsey C.
Fitzgibbon, Andrew W.
Glennerster, Andrew
BIOLOGICAL CYBERNETICS, 2013, 107 (04) : 449 - 464
[38] Modelling human visual navigation using multi-view scene reconstruction
Lyndsey C. Pickup
Andrew W. Fitzgibbon
Andrew Glennerster
Biological Cybernetics, 2013, 107 : 449 - 464
[39] Non-redundant rendering for efficient multi-view scene discretization
Xie, Naiwen
Wang, Lili
Popescu, Voicu
VISUAL COMPUTER, 2017, 33 (12) : 1555 - 1569
[40] A modeling method for virtual scene based on multi-view image sequence
王佳生
唐好选
杨铁冬
Journal of Harbin Institute of Technology (New Series), 2009, (02) : 217 - 222
