AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment

Cited by: 6

Authors:

Sheng, Xiangfei [1,2]

Li, Leida [1]

Chen, Pengfei [1]

Wu, Jinjian [1]

Dong, Weisheng [1]

Yang, Yuzhe [2]

Xu, Liwu [2]

Li, Yaqian [2]

Shi, Guangming [1]

Affiliations:

[1] Xidian Univ, Sch Artificial Intelligence, Xian, Peoples R China

[2] OPPO Res Inst, Chengdu, Peoples R China

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Image aesthetics assessment; CLIP; Aesthetics attributes; Contrastive Learning;

DOI:

10.1145/3581783.3611969

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image aesthetics assessment (IAA) aims to predict the aesthetic quality of images. Recently, large pre-trained vision-language models such as CLIP have shown impressive performance on various visual tasks. For IAA, a straightforward approach is to fine-tune the CLIP image encoder on aesthetic images. However, this achieves only limited success because it does not consider the uniqueness of multimodal data in the aesthetics domain. People usually assess image aesthetics according to fine-grained visual attributes, e.g., color, light, and composition, yet how to learn aesthetics-aware attributes from the CLIP-based semantic space has not been addressed before. Motivated by this, the paper presents a CLIP-based multi-attribute contrastive learning framework for IAA, dubbed AesCLIP. Specifically, AesCLIP consists of two major components: aesthetic attribute-based comment classification and attribute-aware learning. The former classifies aesthetic comments into different attribute categories. The latter then learns an aesthetic attribute-aware representation by contrastive learning, aiming to mitigate the domain shift from the general visual domain to the aesthetics domain. Extensive experiments with the pre-trained AesCLIP on four popular IAA databases demonstrate its advantage over state-of-the-art methods. The source code will be made public at https://github.com/OPPOMKLab/AesCLIP.


Pages: 1117-1126

Page count: 10
