Cross-Lingual Image Caption Generation

Authors
Miyazaki, Takashi [1]
Shimizu, Nobuyuki [1]
Affiliations
[1] Yahoo Japan Corp, Tokyo, Japan
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatically generating a natural language description of an image is a fundamental problem in artificial intelligence. This task involves both computer vision and natural language processing and is called "image caption generation." Research on image caption generation has typically focused on taking in an image and generating a caption in English as existing image caption corpora are mostly in English. The lack of corpora in languages other than English is an issue, especially for morphologically rich languages such as Japanese. There is thus a need for corpora sufficiently large for image captioning in other languages. We have developed a Japanese version of the MS COCO caption dataset and a generative model based on a deep recurrent architecture that takes in an image and uses this Japanese version of the dataset to generate a caption in Japanese. As the Japanese portion of the corpus is small, our model was designed to transfer the knowledge representation obtained from the English portion into the Japanese portion. Experiments showed that the resulting bilingual comparable corpus has better performance than a monolingual corpus, indicating that image understanding using a resource-rich language benefits a resource-poor language.
Pages: 1780-1790
Page count: 11
Related papers
50 in total
  • [1] Cross-Lingual Image Caption Generation Based on Visual Attention Model
    Wang, Bin
    Wang, Cungang
    Zhang, Qian
    Su, Ying
    Wang, Yang
    Xu, Yanyan
    IEEE ACCESS, 2020, 8: 104543-104554
  • [2] Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards
    Song, Yuqing
    Chen, Shizhe
    Zhao, Yida
    Jin, Qin
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019: 784-792
  • [3] A Cross-Lingual Summarization method based on cross-lingual Fact-relationship Graph Generation
    Zhang, Yongbing
    Gao, Shengxiang
    Huang, Yuxin
    Tan, Kaiwen
    Yu, Zhengtao
    PATTERN RECOGNITION, 2024, 146
  • [4] Cross-Lingual Training for Automatic Question Generation
    Kumar, Vishwajeet
    Joshi, Nitish
    Mukherjee, Arijit
    Ramakrishnan, Ganesh
    Jyothi, Preethi
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019: 4863-4872
  • [5] UNISON: Unpaired Cross-Lingual Image Captioning
    Gao, Jiahui
    Zhou, Yi
    Yu, Philip L. H.
    Joty, Shafiq
    Gu, Jiuxiang
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 10654-10662
  • [6] A Workbench for Rapid Generation of Cross-Lingual Summaries
    Jhaveri, Nisarg
    Gupta, Manish
    Varma, Vasudeva
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018: 3209-3215
  • [7] Cross-Lingual and Cross-Cultural Variation in Image Descriptions
    Berger, Uri
    Ponti, Edoardo M.
    arXiv
  • [8] Limitations of cross-lingual learning from image search
    Hartmann, Mareike
    Sogaard, Anders
    REPRESENTATION LEARNING FOR NLP, 2018: 159-163
  • [9] Harvesting Deep Models for Cross-Lingual Image Annotation
    Wei, Qijie
    Wang, Xiaoxu
    Li, Xirong
    PROCEEDINGS OF THE 15TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2017
  • [10] Fluency-Guided Cross-Lingual Image Captioning
    Lan, Weiyu
    Li, Xirong
    Dong, Jianfeng
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017: 1549-1557