A unified active learning framework for annotating graph data for regression task

Cited by: 1
Authors
Samoaa, Peter [1]
Aronsson, Linus [1]
Longa, Antonio [2]
Leitner, Philipp [3]
Chehreghani, Morteza Haghir [1]
Affiliations
[1] Chalmers Univ Technol, Data Sci & AI, Gothenburg, Sweden
[2] Univ Trento, Trento, Italy
[3] Chalmers Univ Technol, Interact Design & Software Engn, Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
Graph neural networks (GNNs); Active learning; Graphs-level regression; NETWORKS;
DOI
10.1016/j.engappai.2024.109383
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In many domains, effectively applying machine learning models requires a large amount of annotated, labelled data, which might not be available in advance. Acquiring annotations often requires significant time, effort, and computational resources. Active learning strategies are pivotal in addressing these challenges, particularly for diverse data types such as graphs. Although active learning has been extensively explored for node-level classification, its application to graph-level learning, especially for regression tasks, is not well explored. We develop a unified active learning framework specializing in graph annotation and graph-level learning for regression tasks on both standard and expanded graphs, the latter being more detailed representations. We begin with graph collection and construction. Then, we embed the graphs into a latent space using various graph embeddings (unsupervised and supervised). Given such an embedding, the framework becomes task agnostic, and active learning can be performed using any regression method and any query strategy suited for regression. Within this framework, we investigate the impact of using different levels of information for active and passive learning, e.g., partially available labels and unlabelled test data. Although our framework is domain agnostic, we validate it on a real-world application of software performance prediction, where the execution time of source code is predicted. The graph is therefore constructed as an intermediate representation of the source code. We support our methodology with a real-world dataset to underscore the applicability of our approach. Our real-world experiments reveal that satisfactory performance can be achieved by querying labels for only a small subset of the data. A key finding is that Graph2Vec (an unsupervised embedding approach for graph data) performs the best, but only when all training and test features are used. However, Graph Neural Networks (GNNs) are the most flexible embedding technique across different levels of information, with and without label access. In addition, we find that the benefit of active learning increases for larger datasets (more graphs) and for more complex graphs, which is arguably when active learning is the most important.
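The abstract notes that, once graphs are embedded in a latent space, active learning can run with any regressor and any regression-suited query strategy. The following is a minimal sketch of such a pool-based loop, not the authors' implementation. It assumes the graph embeddings are already available as plain numeric vectors, and it swaps in a k-nearest-neighbour regressor and a farthest-point (diversity) query rule purely for illustration. The names `knn_predict`, `farthest_point_query`, `active_learning_loop`, and `oracle` are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, x, k=3):
    """Average the targets of the k labelled embeddings nearest to x."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def farthest_point_query(pool_x, labelled_idx, unlabelled_idx):
    """Pick the unlabelled embedding farthest from every labelled one.

    This is a label-free diversity heuristic, chosen here as an assumption;
    any query strategy suited for regression could be substituted.
    """
    return max(
        unlabelled_idx,
        key=lambda j: min(math.dist(pool_x[j], pool_x[i]) for i in labelled_idx),
    )


def active_learning_loop(pool_x, oracle, n_initial=2, budget=10, seed=0):
    """Query labels one at a time until `budget` graphs are annotated.

    pool_x : list of embedding vectors (one per graph), assumed precomputed.
    oracle : callable mapping a pool index to its target (hypothetical
             stand-in for the annotation step, e.g. measuring execution time).
    """
    rng = random.Random(seed)
    labelled = rng.sample(range(len(pool_x)), n_initial)
    labels = {i: oracle(i) for i in labelled}
    unlabelled = [i for i in range(len(pool_x)) if i not in labels]

    while len(labels) < budget and unlabelled:
        q = farthest_point_query(pool_x, labelled, unlabelled)
        labels[q] = oracle(q)  # annotate the selected graph
        labelled.append(q)
        unlabelled.remove(q)

    return labelled, labels


if __name__ == "__main__":
    # Synthetic 2-D "embeddings" and a synthetic target, for illustration only.
    rng = random.Random(42)
    pool = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(50)]
    target = lambda i: 3.0 * pool[i][0] + pool[i][1]

    idx, labels = active_learning_loop(pool, target, n_initial=2, budget=10)
    train_x = [pool[i] for i in idx]
    train_y = [labels[i] for i in idx]
    print("queried", len(idx), "of", len(pool), "graphs")
    print("prediction for pool[0]:", knn_predict(train_x, train_y, pool[0]))
```

Because the query rule only looks at distances in the embedding space, it works unchanged whether the embeddings come from an unsupervised method or a supervised one; an uncertainty-based rule would instead need access to the regressor.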
Pages: 25