Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

Cited by: 0
Authors
Javed, Syed Ashar [1 ]
Saxena, Shreyas
Gandhi, Vineet [2 ]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
[2] IIIT Hyderabad, CVIT, Kohli Ctr Intelligent Syst KCIS, Hyderabad, India
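Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract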
Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities. In the unsupervised setting, the lack of supervisory signals exacerbates this difficulty. In this paper, we propose a novel framework for unsupervised visual grounding which uses concept learning as a proxy task to obtain self-supervision. The intuition behind this idea is to encourage the model to localize to regions which can explain some semantic property in the data; in our case, the property is the presence of a concept in a set of images. We present thorough quantitative and qualitative experiments to demonstrate the efficacy of our approach and show a 5.6% improvement over the current state of the art on the Visual Genome dataset, a 5.8% improvement on the ReferItGame dataset, and performance comparable to the state of the art on the Flickr30k dataset.
Pages: 796 - 802
Page count: 7
Related papers
50 in total
  • [31] Improving Spatiotemporal Self-supervision by Deep Reinforcement Learning
    Buechler, Uta
    Brattoli, Biagio
    Ommer, Bjoern
    COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 : 797 - 814
  • [32] Multi-modal NeRF Self-Supervision for LiDAR Semantic Segmentation
    Timoneda, Xavier
    Herb, Markus
    Duerr, Fabian
    Goehring, Daniel
    Yu, Fisher
    2024 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2024), 2024, : 12939 - 12946
  • [33] Co-learning: Learning from Noisy Labels with Self-supervision
    Tan, Cheng
    Xia, Jun
    Wu, Lirong
    Li, Stan Z.
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1405 - 1413
  • [34] Soft prompt-tuning for unsupervised domain adaptation via self-supervision
    Zhu, Yi
    Wang, Shuqin
    Li, Yun
    Yuan, Yunhao
    Qiang, Jipeng
    NEUROCOMPUTING, 2025, 617
  • [35] Better Self-training for Image Classification Through Self-supervision
    Sahito, Attaullah
    Frank, Eibe
    Pfahringer, Bernhard
    AI 2021: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, 13151 : 645 - 657
  • [36] Self-supervision, surveillance and transgression
    Simon, Gail
    JOURNAL OF FAMILY THERAPY, 2010, 32 (03) : 308 - 325
  • [37] Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision
    Yuan, Liangzhe
    Qian, Rui
    Cui, Yin
    Gong, Boqing
    Schroff, Florian
    Yang, Ming-Hsuan
    Adam, Hartwig
    Liu, Ting
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13957 - 13966
  • [38] Task-specific image summaries using semantic information and self-supervision
    Deepak Kumar Sharma
    Anurag Singh
    Sudhir Kumar Sharma
    Gautam Srivastava
    Jerry Chun-Wei Lin
    Soft Computing, 2022, 26 : 7581 - 7594
  • [39] Anomalies, representations, and self-supervision
    Dillon, Barry M.
    Favaro, Luigi
    Feiden, Friedrich
    Modak, Tanmoy
    Plehn, Tilman
    SCIPOST PHYSICS CORE, 2024, 7 (03):
  • [40] MetaDetector: Detecting Outliers by Learning to Learn from Self-supervision
    Tan, Jeremy
    Kart, Turkay
    Hou, Benjamin
    Batten, James
    Kainz, Bernhard
    BIOMEDICAL IMAGE REGISTRATION, DOMAIN GENERALISATION AND OUT-OF-DISTRIBUTION ANALYSIS, 2022, 13166 : 119 - 126