A unified active learning framework for annotating graph data for regression task

Cited by: 1
Authors
Samoaa, Peter [1]
Aronsson, Linus [1]
Longa, Antonio [2]
Leitner, Philipp [3]
Chehreghani, Morteza Haghir [1]
Affiliations
[1] Chalmers Univ Technol, Data Sci & AI, Gothenburg, Sweden
[2] Univ Trento, Trento, Italy
[3] Chalmers Univ Technol, Interact Design & Software Engn, Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
Graph neural networks (GNNs); Active learning; Graphs-level regression; NETWORKS;
DOI
10.1016/j.engappai.2024.109383
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many domains, effectively applying machine learning models requires a large amount of annotated, labelled data, which might not be available in advance. Acquiring annotations often requires significant time, effort, and computational resources. Active learning strategies are pivotal in addressing these challenges, particularly for diverse data types such as graphs. Although active learning has been extensively explored for node-level classification, its application to graph-level learning, especially for regression tasks, remains underexplored. We develop a unified active learning framework specializing in graph annotation and graph-level learning for regression tasks on both standard graphs and expanded graphs, which are more detailed representations. We begin with graph collection and construction. Then, we map the graphs into a latent space using various graph embeddings (unsupervised and supervised). Given such an embedding, the framework becomes task agnostic, and active learning can be performed using any regression method and any query strategy suited for regression. Within this framework, we investigate the impact of using different levels of information for active and passive learning, e.g., partially available labels and unlabelled test data. Although our framework is domain agnostic, we validate it on a real-world application of software performance prediction, where the execution time of the source code is predicted. Thus, the graph is constructed as an intermediate source code representation. We support our methodology with a real-world dataset to underscore the applicability of our approach. Our real-world experiments reveal that satisfactory performance can be achieved by querying labels for only a small subset of all the data. A key finding is that Graph2Vec (an unsupervised embedding approach for graph data) performs the best, but only when all train and test features are used. However, Graph Neural Networks (GNNs) are the most flexible embedding techniques when used for different levels of information with and without label access. In addition, we find that the benefit of active learning increases for larger datasets (more graphs) and when the graphs are more complex, which is arguably when active learning is the most important.
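The pool-based loop that the abstract describes (embed the graphs, then query labels with any regressor and any regression-suited query strategy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the embeddings are random 2-D vectors standing in for graph embeddings, `oracle` is a hypothetical stand-in for measuring execution time, the regressor is a plain k-nearest-neighbour average, and the query strategy is greedy farthest-point sampling in embedding space.

```python
import math
import random


def knn_predict(X_lab, y_lab, x, k=3):
    """Predict by averaging the targets of the k nearest labelled embeddings."""
    nearest = sorted(zip(X_lab, y_lab), key=lambda p: math.dist(p[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def farthest_point_query(X_lab, X_pool):
    """Return the pool index whose nearest labelled embedding is farthest away."""
    return max(
        range(len(X_pool)),
        key=lambda i: min(math.dist(X_pool[i], x) for x in X_lab),
    )


def active_learning(X_pool, oracle, n_init=2, budget=10, seed=0):
    """Label `budget` embeddings: a random start, then farthest-point queries."""
    rng = random.Random(seed)
    pool = list(X_pool)
    rng.shuffle(pool)
    X_lab, pool = pool[:n_init], pool[n_init:]
    y_lab = [oracle(x) for x in X_lab]  # initial annotations
    while len(X_lab) < budget and pool:
        x = pool.pop(farthest_point_query(X_lab, pool))
        X_lab.append(x)
        y_lab.append(oracle(x))  # ask the annotator for one more label
    return X_lab, y_lab


# Synthetic stand-ins (assumptions): 2-D "graph embeddings" and a toy target.
rng = random.Random(42)
embeddings = [(rng.random(), rng.random()) for _ in range(200)]
oracle = lambda x: 3.0 * x[0] + x[1] ** 2  # hypothetical execution-time label

X_lab, y_lab = active_learning(embeddings, oracle, n_init=2, budget=20)
pred = knn_predict(X_lab, y_lab, (0.5, 0.5))
```

Because the loop only sees embedding vectors and scalar targets, the regressor and the query function can each be swapped independently, which is the sense in which a fixed embedding makes the procedure task agnostic.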
Pages: 25