STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset

Cited by: 47

Authors:

Yoshikawa, Yuya [1]

Shigeto, Yutaro [1]

Takeuchi, Akikazu [1]

Affiliation:

[1] Chiba Inst Technol, STAIR Lab, 2-17-1 Tsudanuma, Narashino, Chiba, Japan

Source:

PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2 | 2017

DOI:

10.18653/v1/P17-2066

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

In recent years, the automatic generation of image descriptions (captions), known as image captioning, has attracted a great deal of attention. In this paper, we specifically consider generating Japanese captions for images. Because most available caption datasets have been constructed for the English language, few datasets exist for Japanese. To address this problem, we construct a large-scale Japanese image caption dataset, called STAIR Captions, based on images from MS-COCO. STAIR Captions consists of 820,310 Japanese captions for 164,062 images. In our experiments, we show that a neural network trained on STAIR Captions generates more natural and better Japanese captions than those obtained by first generating English captions and then applying English-Japanese machine translation.

Pages: 417-421

Number of pages: 5
