Semantic R-CNN for Natural Language Object Detection

Authors
Ye, Shuxiong [1]
Qin, Zheng [1]
Xu, Kaiping [1]
Huang, Kai [1]
Wang, Guolong [1]
Affiliations
[1] Tsinghua University, School of Software, Beijing, People's Republic of China
Keywords
Object detection; Natural language; RPN
DOI
10.1007/978-3-319-77383-4_10
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In this paper, we present a simple and effective framework for natural language object detection, which localizes a target within an image based on a description of the target. The method, called semantic R-CNN, extends RPN (Region Proposal Network) [1] by adding an LSTM [20] module for processing the natural language query text. The LSTM module takes the encoded query text and image descriptors as input and outputs the probability of the query text conditioned on the visual features of a candidate box and of the whole image. The candidate boxes are generated by the RPN, and their local features are extracted by ROI pooling. The RPN can be initialized from a pre-trained Faster R-CNN model [1], which transfers visual knowledge of objects from the traditional object detection domain to our task. Experimental results demonstrate that our method significantly outperforms the previous baseline, the SCRC (Spatial Context Recurrent ConvNet) model [7], on the Referit dataset [8]. Moreover, like Faster R-CNN, our model is simple to train.
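The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `query_log_prob` and `localize` are hypothetical, the per-box token probabilities are made-up toy values standing in for the LSTM output, and the sketch assumes that the predicted box is the candidate with the highest query probability. Unlike an LSTM, the toy scorer treats tokens as independent.

```python
import math
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def query_log_prob(tokens: Sequence[str],
                   token_probs: Dict[str, float],
                   floor: float = 1e-6) -> float:
    """Log-probability of the query for one candidate box.

    Assumption: `token_probs` is a toy stand-in for the LSTM output
    conditioned on the box and whole-image features. Tokens are scored
    independently here, which a real LSTM would not do.
    """
    return sum(math.log(max(token_probs.get(t, floor), floor)) for t in tokens)


def localize(query: str,
             candidates: List[Tuple[Box, Dict[str, float]]]) -> Tuple[Box, float]:
    """Return the candidate box with the highest query log-probability."""
    if not candidates:
        raise ValueError("at least one candidate box is required")
    tokens = query.lower().split()
    scored = [(box, query_log_prob(tokens, probs)) for box, probs in candidates]
    return max(scored, key=lambda item: item[1])


# Two hypothetical candidate boxes, as if proposed by an RPN.
candidates = [
    ((10.0, 20.0, 110.0, 220.0), {"red": 0.1, "car": 0.7}),
    ((200.0, 40.0, 320.0, 160.0), {"red": 0.6, "car": 0.5}),
]
best_box, best_score = localize("red car", candidates)
```

In this toy setup the second box is selected, because its token probabilities for "red" and "car" multiply to a larger value than those of the first box.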
Pages: 98-107
Number of pages: 10
Related papers
50 in total
  • [1] Oriented R-CNN for Object Detection
    Xie, Xingxing
    Cheng, Gong
    Wang, Jiabao
    Yao, Xiwen
    Han, Junwei
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3500 - 3509
  • [2] R-CNN for Small Object Detection
    Chen, Chenyi
    Liu, Ming-Yu
    Tuzel, Oncel
    Xiao, Jianxiong
    COMPUTER VISION - ACCV 2016, PT V, 2017, 10115 : 214 - 230
  • [3] ME R-CNN: Multi-Expert R-CNN for Object Detection
    Lee, Hyungtae
    Eum, Sungmin
    Kwon, Heesung
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 (29) : 1030 - 1044
  • [4] Joint Semantic Segmentation and Object Detection Based on Relational Mask R-CNN
    Zhang, Yanni
    Xu, Hui
    Fan, Jingxuan
    Qi, Miao
    Liu, Tao
    Wang, Jianzhong
    INTELLIGENT COMPUTING THEORIES AND APPLICATION (ICIC 2022), PT I, 2022, 13393 : 506 - 521
  • [5] DDL R-CNN: Dynamic Direction Learning R-CNN for Rotated Object Detection
    Su, Weixian
    Jing, Donglin
    ALGORITHMS, 2025, 18 (01)
  • [6] An Improved Faster R-CNN for Object Detection
    Liu, Yu
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2018, : 119 - 123
  • [7] Street Object Detection Based on Faster R-CNN
    Cai, Wendi
    Li, Jiadie
    Xie, Zhongzhao
    Zhao, Tao
    Lu, Kang
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 9500 - 9503
  • [8] Study Of Object Detection Based On Faster R-CNN
    Liu, Bin
    Zhao, Wencang
    Sun, Qiaoqiao
    2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 6233 - 6236
  • [9] Object detection based on RGC mask R-CNN
    Wu, Minghu
    Yue, Hanhui
    Wang, Juan
    Huang, Yongxi
    Liu, Min
    Jiang, Yuhan
    Ke, Cong
    Zeng, Cheng
    IET IMAGE PROCESSING, 2020, 14 (08) : 1502 - 1508
  • [10] Comparison of faster R-CNN models for object detection
    Lee, Chungkeun
    Kim, H. Jin
    Oh, Kyeong Won
    2016 16TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS), 2016, : 107 - 110