Image Captioning for Nantong Blue Calico Through Stacked Local-Global Channel Attention Network

Cited by: 0
Authors
Guo, Chenyi [1 ]
Zhang, Li [1 ]
Yu, Xiang [2 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[2] Nantong Vocat Coll Sci & Technol, Dept Comp Sci & Technol, Nantong 226007, Peoples R China
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II | 2023, Vol. 14255
Keywords
Intangible cultural heritage; Nantong blue calico; Image captioning; Channel attention; Transformer;
DOI
10.1007/978-3-031-44210-0_29
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Nantong blue calico, a Chinese folk hand-made printing and dyeing craft, has become one of the intangible cultural heritages (ICHs) of China. To inherit and promote the ICH of Nantong blue calico, this study applies image captioning technology to the description of blue-calico images. For this purpose, a novel image captioning method, called the stacked local-global channel attention network (SLGCAN), is proposed. This network focuses on extracting important features from blue-calico images so that it can generate more accurate captions for them. SLGCAN consists of three parts: a residual network (ResNet), a stacked local-global channel attention module (SLGCAM), and a Transformer. First, a pre-trained ResNet-101 model is used to extract coarse features from blue-calico images; SLGCAM then obtains fine-grained information from these coarse features. Finally, SLGCAN adopts a Transformer to encode and decode the fine-grained information of blue-calico images and predict word information for generating accurate image captions. Experiments are conducted on a collected blue-calico image dataset, where we compare SLGCAN with baseline models and show that the proposed model is feasible and effective.
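The abstract gives only the high-level pipeline (ResNet-101 features, stacked local-global channel attention, Transformer captioner) and not the internal design of SLGCAM, so the following is a minimal PyTorch sketch of that pipeline under stated assumptions, not the authors' implementation. The class names LocalGlobalChannelAttention and SLGCANBackbone, the attention design (average pooling as a global descriptor plus max pooling as a local descriptor feeding a shared bottleneck MLP), and the hyperparameters num_blocks, reduction, and d_model are all illustrative assumptions.

```python
# Hypothetical sketch of the SLGCAN feature pipeline described in the abstract.
# The exact SLGCAM design is not specified there; this squeeze-and-excitation
# style local-global channel attention is an assumption for illustration.
import torch
import torch.nn as nn
import torchvision


class LocalGlobalChannelAttention(nn.Module):
    """Assumed channel attention: a global descriptor (average pooling) and a
    local peak descriptor (max pooling) pass through a shared bottleneck MLP,
    producing per-channel weights that re-scale the feature map."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                             # x: (B, C, H, W)
        g = x.mean(dim=(2, 3))                        # global descriptor (B, C)
        l = x.amax(dim=(2, 3))                        # local/peak descriptor (B, C)
        w = self.sigmoid(self.mlp(g) + self.mlp(l))   # channel weights (B, C)
        return x * w[:, :, None, None]                # re-weighted features


class SLGCANBackbone(nn.Module):
    """ResNet-101 coarse features -> stacked channel attention -> token sequence
    that a standard Transformer encoder-decoder captioner could consume."""

    def __init__(self, num_blocks: int = 2, d_model: int = 512):
        super().__init__()
        resnet = torchvision.models.resnet101(weights="IMAGENET1K_V1")
        self.cnn = nn.Sequential(*list(resnet.children())[:-2])   # (B, 2048, H', W')
        self.attn = nn.Sequential(
            *[LocalGlobalChannelAttention(2048) for _ in range(num_blocks)]
        )                                                          # "stacked" module
        self.proj = nn.Linear(2048, d_model)                       # match Transformer width

    def forward(self, images):                         # images: (B, 3, H, W)
        f = self.attn(self.cnn(images))                # refined feature map
        tokens = f.flatten(2).transpose(1, 2)          # (B, H'*W', 2048)
        return self.proj(tokens)                       # input tokens for the captioner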
Pages: 357 - 372
Number of pages: 16