Segmentation from Natural Language Expressions

Cited by: 226
Authors
Hu, Ronghang [1]
Rohrbach, Marcus [1, 2]
Darrell, Trevor [1]
Affiliations
[1] Univ Calif Berkeley, EECS, Berkeley, CA 94720 USA
[2] ICSI, Berkeley, CA USA
Keywords
Natural language; Segmentation; Recurrent neural network; Fully convolutional network;
DOI
10.1007/978-3-319-46448-0_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we approach the novel problem of segmenting an image based on a natural language expression. This differs from traditional semantic segmentation over a predefined set of semantic classes: for example, the phrase "two men sitting on the right bench" requires segmenting only the two people on the right bench and no one standing or sitting on another bench. Previous approaches suitable for this task were limited to a fixed set of categories and/or rectangular regions. To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information. In our model, a recurrent neural network is used to encode the referential expression into a vector representation, and a fully convolutional network is used to extract a spatial feature map from the image and output a spatial response map for the target object. We demonstrate on a benchmark dataset that our model can produce quality segmentation output from the natural language expression, and outperforms baseline methods by a large margin.
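The data flow described in the abstract (a recurrent encoder turns the expression into a vector, a spatial feature map is scored per location, and the result is a response map) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `encode_expression`, `response_map`, and `to_mask`, the random untrained weights, the hand-written feature map standing in for a fully convolutional network's output, and the choice to fuse by concatenating the text vector at every location are all assumptions made for illustration.

```python
import math
import random


def encode_expression(tokens, dim=4, seed=0):
    """Toy recurrent encoder: fold a token sequence into one fixed-size vector.

    Hypothetical stand-in for the recurrent network in the abstract;
    the weights are random, not learned.
    """
    rng = random.Random(seed)
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    h = [0.0] * dim
    for tok in tokens:
        # Deterministic pseudo-embedding derived from the token string.
        tok_rng = random.Random(tok)
        x = [tok_rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Simple recurrent update: new state depends on input and old state.
        h = [
            math.tanh(x[i] + sum(w_h[i][j] * h[j] for j in range(dim)))
            for i in range(dim)
        ]
    return h


def response_map(feature_map, text_vec, weights, bias=0.0):
    """Score every spatial location against the expression vector.

    feature_map is a nested list of shape H x W x C. The text vector is
    concatenated to the visual feature at each location (an assumed fusion
    scheme), then passed through a linear classifier and a sigmoid.
    """
    out = []
    for row in feature_map:
        out_row = []
        for visual in row:
            fused = visual + text_vec  # list concatenation, length C + dim
            score = sum(w * f for w, f in zip(weights, fused)) + bias
            out_row.append(1.0 / (1.0 + math.exp(-score)))
        out.append(out_row)
    return out


def to_mask(responses, threshold=0.5):
    """Threshold the per-location responses into a binary mask."""
    return [[1 if p > threshold else 0 for p in row] for row in responses]


# Usage: a 2 x 3 feature map with 2 channels (made-up values).
tokens = "two men sitting on the right bench".split()
text_vec = encode_expression(tokens)
feature_map = [
    [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]],
    [[0.0, 1.0], [0.3, 0.7], [1.0, 0.0]],
]
clf_rng = random.Random(1)
weights = [clf_rng.uniform(-1.0, 1.0) for _ in range(2 + len(text_vec))]
mask = to_mask(response_map(feature_map, text_vec, weights))
print(mask)
```

Because the weights here are random, the printed mask is meaningless; in a trained model the encoder, feature extractor, and classifier would be learned jointly, which is what the abstract means by end-to-end trainable.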
Pages: 108-124
Number of pages: 17
Related papers
50 in total
  • [31] Natural Language Processing Using Neighbour Entropy-based Segmentation
    Qiao J.
    Yan X.
    Lv S.
    Journal of Computing and Information Technology, 2021, 29 (02) : 113 - 131
  • [32] Automated Discovery of Valid Test Strings from the Web using Dynamic Regular Expressions Collation and Natural Language Processing
    Shahbaz, Muzammil
    McMinn, Phil
    Stevenson, Mark
    2012 12TH INTERNATIONAL CONFERENCE ON QUALITY SOFTWARE (QSIC), 2012, : 79 - 88
  • [33] From natural language to accounting entries using a natural language processing method
    Chen, Yasheng
    Huang, Xian
    Wu, Zhuojun
    ACCOUNTING AND FINANCE, 2023, 63 (04): : 3781 - 3795
  • [34] Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
    Zhou, Lu
    Liu, Shuangqiao
    Li, Caiyan
    Sun, Yuemeng
    Zhang, Yizhuo
    Li, Yuda
    Yuan, Huimin
    Sun, Yan
    Xu, Fengqin
    Li, Yuhang
    EVIDENCE-BASED COMPLEMENTARY AND ALTERNATIVE MEDICINE, 2021, 2021
  • [35] A lexicon of multiword expressions for linguistically precise, wide-coverage natural language processing
    Tanabe, Toshifumi
    Takahashi, Masahito
    Shudo, Kosho
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (06): : 1317 - 1339
  • [36] IDL-expressions: A formalism for representing and parsing finite languages in natural language processing
    Nederhof, MJ
    Satta, G
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2004, 21 : 287 - 317
  • [38] Multi-Word Expressions in Serbian - Properties, Typology and Classification for Natural Language Processing
    Krstev, Cvetana
    Vitas, Dusko
    PROCEEDINGS OF THE INTERNATIONAL JUBILEE CONFERENCE OF THE INSTITUTE FOR BULGARIAN LANGUAGE, VOL 1, 2017, : 298 - 310
  • [39] Automated Text Structuring: Natural Language Processing and Regular Expressions in XML Tag Filling
    Malashin, Ivan P.
    Tynchenko, Vadim S.
    Gantimurov, Andrei P.
    Nelyub, Vladimir A.
    Borodulin, Aleksei S.
    IEEE ACCESS, 2024, 12 : 190582 - 190597
  • [40] Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query
    Wang, Hao
    Deng, Cheng
    Yan, Junchi
    Tao, Dacheng
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3938 - 3947