AesCLIP: Multi-Attribute Contrastive Learning for Image Aesthetics Assessment

Cited by: 6
Authors
Sheng, Xiangfei [1, 2]
Li, Leida [1]
Chen, Pengfei [1]
Wu, Jinjian [1]
Dong, Weisheng [1]
Yang, Yuzhe [2]
Xu, Liwu [2]
Li, Yaqian [2]
Shi, Guangming [1]
Affiliations
[1] Xidian University, School of Artificial Intelligence, Xi'an, People's Republic of China
[2] OPPO Research Institute, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Image aesthetics assessment; CLIP; Aesthetics attributes; Contrastive Learning;
DOI
10.1145/3581783.3611969
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Image aesthetics assessment (IAA) aims to predict the aesthetic quality of images. Recently, large pre-trained vision-language models such as CLIP have shown impressive performance on various visual tasks. For IAA, a straightforward approach is to fine-tune the CLIP image encoder on aesthetic images. However, this achieves only limited success because it does not consider the uniqueness of multimodal data in the aesthetics domain. People usually assess image aesthetics according to fine-grained visual attributes, e.g., color, light, and composition. However, how to learn aesthetics-aware attributes from the CLIP-based semantic space has not been addressed before. With this motivation, this paper presents a CLIP-based multi-attribute contrastive learning framework for IAA, dubbed AesCLIP. Specifically, AesCLIP consists of two major components: aesthetic attribute-based comment classification and attribute-aware learning. The former classifies aesthetic comments into different attribute categories. The latter then learns an aesthetic attribute-aware representation by contrastive learning, aiming to mitigate the domain shift from the general visual domain to the aesthetics domain. Extensive experiments using the pre-trained AesCLIP on four popular IAA databases demonstrate the advantage of AesCLIP over the state of the art. The source code will be made public at https://github.com/OPPOMKLab/AesCLIP.
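The abstract does not spell out the contrastive objective, so the following is only an illustrative sketch of a generic CLIP-style symmetric contrastive (InfoNCE) loss between image embeddings and attribute-comment embeddings. It is not the authors' implementation: the function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for illustration, and the real AesCLIP loss may differ.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _diag_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(
    image_embs: Sequence[Sequence[float]],
    text_embs: Sequence[Sequence[float]],
    temperature: float = 0.07,  # assumed value, not taken from the paper
) -> float:
    """Symmetric InfoNCE loss: the i-th image is paired with the i-th text."""
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    # Cosine similarity of every image with every text, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_diag_cross_entropy(logits) + _diag_cross_entropy(logits_t))


# Toy example: hypothetical 2-D embeddings for two image/comment pairs.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is small when each image embedding points in the same direction as its paired comment embedding and large when the pairs are swapped, which is the behaviour a contrastive objective of this kind is meant to reward.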
Pages: 1117 - 1126
Page count: 10
Related papers
50 in total
  • [41] Efficient Multi-Attribute Similarity Learning Towards Attribute-based Fashion Search
    Ak, Kenan E.
    Lim, Joo Hwee
    Tham, Jo Yew
    Kassim, Ashraf A.
    2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 1671 - 1679
  • [42] Service demand analysis using multi-attribute learning mechanisms
    Inoue, A
    Takahashi, S
    Nishimatsu, K
    Kawano, H
    INTERNATIONAL CONFERENCE ON INTEGRATION OF KNOWLEDGE INTENSIVE MULTI-AGENT SYSTEMS: KIMAS'03: MODELING, EXPLORATION, AND ENGINEERING, 2003, : 634 - 639
  • [43] Learning user preferences for multi-attribute negotiation: An evolutionary approach
    Guo, YT
    Müller, JP
    Weinhardt, C
    MULTI-AGENT SYSTEMS AND APPLICATIONS III, PROCEEDINGS, 2003, 2691 : 303 - 313
  • [44] Affective image recognition with multi-attribute knowledge in deep neural networks
    Hao Zhang
    Gaifang Luo
    Yingying Yue
    Kangjian He
    Dan Xu
    Multimedia Tools and Applications, 2024, 83 : 18353 - 18379
  • [45] Image Retrieval and Ranking via Consistently Reconstructing Multi-attribute Queries
    Cao, Xiaochun
    Zhang, Hua
    Guo, Xiaojie
    Liu, Si
    Chen, Xiaowu
    COMPUTER VISION - ECCV 2014, PT I, 2014, 8689 : 569 - 583
  • [46] Multi-attribute Preference Logic
    Hindriks, Koen V.
    Visser, Wietske
    Jonker, Catholijn M.
    PRINCIPLES AND PRACTICE OF MULTI-AGENT SYSTEMS, 2012, 7057 : 181 - 195
  • [47] Multi-attribute procurement contracts
    Li, Zhaolin
    Ryan, Jennifer K.
    Sun, Daewon
    INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2015, 159 : 137 - 146
  • [48] Affective image recognition with multi-attribute knowledge in deep neural networks
    Zhang, Hao
    Luo, Gaifang
    Yue, Yingying
    He, Kangjian
    Xu, Dan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (06) : 18353 - 18379
  • [49] Multi-attribute sequential search
    Bearden, J. Neil
    Connolly, Terry
    ORGANIZATIONAL BEHAVIOR AND HUMAN DECISION PROCESSES, 2007, 103 (01) : 147 - 158
  • [50] Multi-attribute proportional representation
    Lang, Jerome
    Skowron, Piotr
    ARTIFICIAL INTELLIGENCE, 2018, 263 : 74 - 106