Segmentation from Natural Language Expressions

Cited by: 226
Authors
Hu, Ronghang [1]
Rohrbach, Marcus [1, 2]
Darrell, Trevor [1]
Affiliations
[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA
[2] ICSI, Berkeley, CA USA
Source
Keywords
Natural language; Segmentation; Recurrent neural network; Fully convolutional network;
DOI
10.1007/978-3-319-46448-0_7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper we approach the novel problem of segmenting an image based on a natural language expression. This is different from traditional semantic segmentation over a predefined set of semantic classes, as, e.g., the phrase "two men sitting on the right bench" requires segmenting only the two people on the right bench and no one standing or sitting on another bench. Previous approaches suitable for this task were limited to a fixed set of categories and/or rectangular regions. To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information. In our model, a recurrent neural network is used to encode the referential expression into a vector representation, and a fully convolutional network is used to extract a spatial feature map from the image and output a spatial response map for the target object. We demonstrate on a benchmark dataset that our model can produce quality segmentation output from the natural language expression, and outperforms baseline methods by a large margin.
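The data flow described in the abstract (a recurrent encoder for the expression, a spatial feature map for the image, and a per-pixel response map) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the plain tanh RNN, the random stand-in for the fully convolutional feature map, the fusion by tiling and concatenation, the linear per-pixel classifier, and all names and sizes below are assumptions chosen only for illustration, and the weights are random rather than trained.

```python
import numpy as np

def encode_expression(token_ids, embed, W_xh, W_hh):
    """Run a plain tanh RNN over the tokens; return the final hidden state.
    (Assumption: the abstract only says 'recurrent neural network'.)"""
    h = np.zeros(W_hh.shape[0])
    for t in token_ids:
        h = np.tanh(embed[t] @ W_xh + h @ W_hh)
    return h

def response_map(feature_map, text_vec, w, b):
    """Tile the text vector over every spatial location, concatenate it with
    the visual features, and apply a per-pixel linear classifier (equivalent
    to a 1x1 convolution). The fusion scheme is an assumption."""
    H, W, _ = feature_map.shape
    tiled = np.broadcast_to(text_vec, (H, W, text_vec.shape[0]))
    fused = np.concatenate([feature_map, tiled], axis=-1)
    return fused @ w + b  # shape (H, W): one score per spatial location

rng = np.random.default_rng(0)
vocab, d_embed, d_hidden, d_vis = 20, 8, 16, 32  # hypothetical sizes

# Random, untrained parameters (illustration only).
embed = rng.normal(size=(vocab, d_embed))
W_xh = rng.normal(size=(d_embed, d_hidden)) * 0.1
W_hh = rng.normal(size=(d_hidden, d_hidden)) * 0.1
w = rng.normal(size=(d_vis + d_hidden,))
b = 0.0

# Stand-in for the spatial feature map a fully convolutional network would produce.
feature_map = rng.normal(size=(6, 8, d_vis))
token_ids = [3, 7, 1, 12]  # hypothetical token indices for an expression

text_vec = encode_expression(token_ids, embed, W_xh, W_hh)
scores = response_map(feature_map, text_vec, w, b)
mask = scores > 0  # threshold the response map into a binary segmentation
```

In a trained model the response map would be computed at feature-map resolution and then upsampled to the image size; that step is omitted here.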
Pages: 108-124
Page count: 17
Related papers
50 in total
  • [11] Natural Language Watermarking by Morpheme Segmentation
    Kim, Mi-Young
    2009 FIRST ASIAN CONFERENCE ON INTELLIGENT INFORMATION AND DATABASE SYSTEMS, 2009, : 144 - 149
  • [12] A Method for Representing Mathematical Expressions as Words in Natural Language
    Watabe, Takayuki
    Miyazaki, Yoshinori
    SMART DIGITAL FUTURES 2014, 2014, 262 : 335 - 344
  • [13] Knowledge acquisition from parsing natural language expressions for humanoid robot action commands
    Recupero, Diego Reforgiato
    Spiga, Federico
    INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (06)
  • [14] A method of extracting and evaluating popularity and unpopularity for natural language expressions
    Morita, K
    Kadoya, Y
    Atlam, ES
    Fuketa, M
    Kashiji, S
    Aoe, J
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS, 2004, 3213 : 567 - 574
  • [15] REMIT: A NATURAL LANGUAGE PARAPHRASER FOR RELATIONAL QUERY EXPRESSIONS.
    Lowden, B.G.T.
    De Roeck, A.N.
    ICL technical journal, 1986, 5 (01): : 32 - 45
  • [16] Mapping nurses' natural language to oncology patients' symptom expressions
    Rotegard, Ann Kristin
    Slaughter, Laura
    Ruland, Cornelia M.
    CONSUMER-CENTERED COMPUTER-SUPPORTED CARE FOR HEALTHY PEOPLE, 2006, 122 : 987 - +
  • [17] Programming Bots by Synthesizing Natural Language Expressions into API Invocations
    Zamanirad, Shayan
    Benatallah, Boualem
    Barukh, Moshe Chai
    Casati, Fabio
    Rodriguez, Carlos
    PROCEEDINGS OF THE 2017 32ND IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE'17), 2017, : 832 - 837
  • [18] SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications
    Zhong, Zexuan
    Guo, Jiaqi
    Yang, Wei
    Peng, Jian
    Xie, Tao
    Lou, Jian-Guang
    Liu, Ting
    Zhang, Dongmei
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1608 - 1618
  • [19] Natural language processing: Word recognition without segmentation
    Saeed, K
    Dardzinska, A
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2001, 52 (14): : 1275 - 1279
  • [20] Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
    Xu, Jilan
    Hou, Junlin
    Zhang, Yuejie
    Feng, Rui
    Wang, Yi
    Qiao, Yu
    Xie, Weidi
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2935 - 2944