Scale-space multi-view bag of words for scene categorization

Cited by: 24

Authors:

Giveki, Davar [1]

Affiliation:

[1] Malayer Univ, Dept Comp Engn, POB 65719-95863, Malayer, Iran

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2021 / Vol. 80 / Issue 1

Keywords:

Scene categorization; Bag of words; Scale-space features; Feature fusion; TF-IDF weighting; OF-VISUAL-WORDS; NEURAL-NETWORK; SPARSE REPRESENTATION; IMAGE CLASSIFICATION; FEATURES; MODEL; RETRIEVAL;

DOI:

10.1007/s11042-020-09759-9

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Although widely used in image categorization tasks, the Bag-of-Words (BoW) method still suffers from many limitations, such as overlooking spatial information. In this paper, we propose four improvements to the BoW method to consider spatial and semantic information as well as information from multiple views. In particular, our contributions are: (a) encoding spatial information based on a combination of wavelet transform image scaling and a new image partitioning scheme, (b) proposing a spatial-information- and content-aware visual word dictionary generation approach, (c) developing a content-aware feature weighting approach that considers the significance of the features for different semantics, and (d) proposing a novel weighting strategy to fuse color information when discriminative shape features are lacking. We call our method Scale-Space Multi-View Bag of Words (SSMV-BoW). We conducted extensive experiments to evaluate SSMV-BoW and compare it with state-of-the-art scene categorization methods, using four publicly available and widely used scene categorization benchmark datasets. Results demonstrate that SSMV-BoW outperforms methods using both hand-crafted and deep learning features. In addition, ablation studies show that all four improvements contribute to the performance of SSMV-BoW.
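For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of a plain BoW histogram with TF-IDF weighting (one of the record's keywords). It is not the SSMV-BoW method: the function names, the toy two-dimensional descriptors, and the three-word vocabulary are all illustrative assumptions, and the spatial encoding, content-aware dictionary, and color fusion described in the abstract are not modeled.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor
    (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, vocabulary):
    """Count how many local descriptors fall on each visual word."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    return counts


def tfidf_weight(histograms):
    """Weight each histogram bin by term frequency times inverse document
    frequency, so words that occur in every image contribute nothing."""
    n_images = len(histograms)
    n_words = len(histograms[0])
    doc_freq = [sum(1 for h in histograms if h[w] > 0) for w in range(n_words)]
    idf = [math.log(n_images / df) if df > 0 else 0.0 for df in doc_freq]
    weighted = []
    for h in histograms:
        total = sum(h)
        weighted.append(
            [(h[w] / total) * idf[w] if total else 0.0 for w in range(n_words)]
        )
    return weighted


# Hypothetical toy data: a 3-word vocabulary and two "images",
# each given as a list of 2-D local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
images = [
    [(0.1, 0.0), (0.9, 1.1), (0.2, -0.1)],
    [(0.0, 0.1), (5.1, 4.9)],
]

histograms = [bow_histogram(descs, vocabulary) for descs in images]
weighted = tfidf_weight(histograms)
```

In this toy case the first visual word appears in both images, so its inverse document frequency is zero and it drops out of the weighted representation. In practice the vocabulary would be learned by clustering real local descriptors rather than written by hand.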

Pages: 1223-1245

Page count: 23
