Photo Semantic Understanding and Retargeting by a Noise-Robust Regularized Topic Model

Cited by: 0
Authors
Wang, Guifeng [1]
Zhang, Luming [1]
Li, Yongbin [1]
Sheng, Yichuan [1]
Affiliations
[1] Jinhua Polytech, Key Lab Crop Harvesting Equipment Technol Zhejiang, Jinhua 321007, Peoples R China
Keywords
Aerial photo; deep feature; matrix factorization; probabilistic model; retargeting; COMMUNITIES; ALGORITHM;
DOI
10.1109/JSTARS.2023.3247745
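Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract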
Retargeting aims at displaying a photo with an arbitrary aspect ratio, wherein the visually/semantically prominent objects are appropriately preserved and visual distortions are well alleviated. Conventional retargeting models are built upon the visual perception of photos from a family of prespecified communities (e.g., "portrait"), wherein the underlying community-specific features are not learned explicitly. Thus, they cannot appropriately retarget aerial photos, which contain a rich variety of objects with different scales. In this article, a novel aerial photo retargeting framework is designed by encoding the deep features from automatically detected Google Maps (https://www.google.com/maps) communities into a regularized probabilistic model. Specifically, we first propose an enhanced matrix factorization (MF) algorithm to calculate communities based on million-scale Google Maps pictures, for each of which a deep feature is learned simultaneously. The enhanced MF incorporates label denoising, between-communities correlation, and deep feature encoding collaboratively. Subsequently, a probabilistic model called the latent topic model (LTM) is designed to quantify the spatial layouts of multiple Google Maps communities in the underlying hidden space. To alleviate the overfitting caused by Google Maps communities with imbalanced numbers of aerial photos, a regularizer is added to the LTM. Finally, by leveraging the regularized LTM, we shrink the test photo horizontally/vertically to maximize the posterior probability of the retargeted photo. Comprehensive subjective evaluations and visualizations have demonstrated the advantages of our method. Besides, our calculated Google Maps communities are competitively consistent with the ground truth, according to the quantitative comparisons on the 2M Google Maps photos.
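The final step of the abstract (shrinking a photo horizontally so that the result scores as highly as possible) can be illustrated with a minimal sketch. This is not the authors' method: the per-column `scores` are a hypothetical stand-in for the posterior probability under the regularized LTM, which the abstract does not specify, and the function name `shrink_columns` is an assumption made for illustration only. The sketch simply keeps the highest-scoring columns and preserves their left-to-right order.

```python
from typing import List


def shrink_columns(
    image: List[List[float]], scores: List[float], target_width: int
) -> List[List[float]]:
    """Keep the `target_width` highest-scoring columns, in original order.

    `scores` is a hypothetical per-column importance value; in the paper the
    criterion is a posterior probability under a regularized topic model,
    which is not reproduced here.
    """
    width = len(scores)
    if not 0 < target_width <= width:
        raise ValueError("target_width must be between 1 and len(scores)")
    # Rank column indices from most to least important.
    ranked = sorted(range(width), key=lambda c: scores[c], reverse=True)
    # Keep the top columns and restore their left-to-right order.
    keep = sorted(ranked[:target_width])
    return [[row[c] for c in keep] for row in image]


# A 2x4 toy "photo" whose columns 0 and 2 are assumed to be the most important.
image = [[1, 2, 3, 4], [5, 6, 7, 8]]
scores = [0.9, 0.1, 0.8, 0.2]
result = shrink_columns(image, scores, target_width=2)
```

Vertical shrinking would follow the same pattern with per-row scores. A real retargeting method would also account for the distortion introduced by removing adjacent content, which this toy selection ignores.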
Pages: 3495-3505
Page count: 11