Segmentation from Natural Language Expressions

Cited by: 226
Authors
Hu, Ronghang [1]
Rohrbach, Marcus [1, 2]
Darrell, Trevor [1]
Affiliations
[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA
[2] ICSI, Berkeley, CA USA
Source
Keywords
Natural language; Segmentation; Recurrent neural network; Fully convolutional network;
DOI
10.1007/978-3-319-46448-0_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we approach the novel problem of segmenting an image based on a natural language expression. This is different from traditional semantic segmentation over a predefined set of semantic classes: for example, the phrase "two men sitting on the right bench" requires segmenting only the two people on the right bench and no one standing or sitting on another bench. Previous approaches suitable for this task were limited to a fixed set of categories and/or rectangular regions. To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information. In our model, a recurrent neural network is used to encode the referential expression into a vector representation, and a fully convolutional network is used to extract a spatial feature map from the image and output a spatial response map for the target object. We demonstrate on a benchmark dataset that our model can produce quality segmentation output from the natural language expression, and outperforms baseline methods by a large margin.
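The data flow described in the abstract (a recurrent network turns the expression into one vector, and a convolutional network turns the image into a spatial feature map that is scored per location) can be illustrated with a toy NumPy sketch. This is not the authors' implementation. The plain tanh RNN, the random weights, the tiling of the sentence vector over every location, and the single linear scoring layer are all assumptions made for illustration, and the names `encode_expression` and `response_map` are hypothetical.

```python
import numpy as np

def encode_expression(token_ids, embed, W_h, W_x):
    """Toy recurrent encoder (plain tanh RNN, an assumption): tokens -> one vector."""
    h = np.zeros(W_h.shape[0])
    for t in token_ids:
        h = np.tanh(W_h @ h + W_x @ embed[t])
    return h

def response_map(feature_map, text_vec, w, b):
    """Score each spatial location from its visual feature plus the sentence vector."""
    H, W, _ = feature_map.shape
    # Repeat the sentence vector at every location (assumed fusion scheme).
    tiled = np.broadcast_to(text_vec, (H, W, text_vec.shape[0]))
    fused = np.concatenate([feature_map, tiled], axis=-1)
    # A per-location linear classifier, equivalent to a 1x1 convolution.
    return fused @ w + b

rng = np.random.default_rng(0)
vocab, d_embed, d_hidden, d_vis = 20, 8, 16, 32

# Random, untrained parameters: the output is structurally valid but meaningless.
embed = rng.normal(size=(vocab, d_embed))
W_h = rng.normal(size=(d_hidden, d_hidden)) * 0.1
W_x = rng.normal(size=(d_hidden, d_embed)) * 0.1
w = rng.normal(size=(d_vis + d_hidden,))

text_vec = encode_expression([3, 7, 1, 12], embed, W_h, W_x)
feature_map = rng.normal(size=(5, 6, d_vis))  # stand-in for a convolutional feature map
scores = response_map(feature_map, text_vec, w, 0.0)
mask = scores > 0  # thresholding the response map gives a coarse binary mask
```

In a trained model the response map would be upsampled to the image resolution and the parameters learned end to end; the sketch only shows the shapes involved.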
Pages: 108-124
Page count: 17
Related papers
50 in total
  • [1] Generating Predicate Logic Expressions From Natural Language
    Levkovskyi, Oleksii
    Li, Wei
    SOUTHEASTCON 2021, 2021, : 465 - 472
  • [2] NALDO: From natural language definitions to OWL expressions
    Emani, Cheikh Kacfah
    Da Silva, Catarina Ferreira
    Fies, Bruno
    Ghodous, Parisa
    DATA & KNOWLEDGE ENGINEERING, 2019, 122 : 130 - 141
  • [3] Inferring Business Rules from Natural Language Expressions
    Aiello, Giovanni
    Di Bernardo, Roberto
    Maggio, Martino
    Di Bona, Daniele
    Lo Re, Giuseppe
    2014 IEEE 7TH INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED COMPUTING AND APPLICATIONS (SOCA), 2014, : 131 - 136
  • [4] Logical Expressions in Natural Language
    Svoboda, Vladimir
    FILOSOFICKY CASOPIS, 2017, 65 (01): : 35 - 57
  • [5] Generating Natural Language From Logic Expressions With Structural Representation
    Wu, Xin
    Cai, Yi
    Lian, Zetao
    Leung, Ho-fung
    Wang, Tao
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1499 - 1510
  • [6] Analysing frequent natural language expressions from design conversations
    Ungureanu, Lucian-Constantin
    Hartmann, Timo
    DESIGN STUDIES, 2021, 72
  • [7] Video Object Segmentation with Language Referring Expressions
    Khoreva, Anna
    Rohrbach, Anna
    Schiele, Bernt
    COMPUTER VISION - ACCV 2018, PT IV, 2019, 11364 : 123 - 141
  • [8] QUANTITY EXPRESSIONS IN NATURAL-LANGUAGE
    MOXEY, LM
    BULLETIN OF THE BRITISH PSYCHOLOGICAL SOCIETY, 1986, 39 : A8 - A8
  • [9] Conceptual Framework for Modeling Dynamic Paths from Natural Language Expressions
    Hornsby, Kathleen Stewart
    Li, Naicong
    TRANSACTIONS IN GIS, 2009, 13 : 27 - 45
  • [10] Mining information from time series in the form of natural language expressions
    Novak, Vilem
    PROCEEDINGS OF THE 2015 CONFERENCE OF THE INTERNATIONAL FUZZY SYSTEMS ASSOCIATION AND THE EUROPEAN SOCIETY FOR FUZZY LOGIC AND TECHNOLOGY, 2015, 89 : 1119 - 1125