Efficient and robust active learning methods for interactive database exploration

Cited by: 0
Authors
Huang, Enhui [1]
Diao, Yanlei [1, 2]
Liu, Anna [2 ]
Peng, Liping [2 ]
Palma, Luciano Di [1]
Affiliations
[1] Ecole Polytech, Palaiseau, France
[2] Univ Massachusetts Amherst, Amherst, MA USA
Source
VLDB JOURNAL | 2024 / Vol. 33 / Issue 04
Funding
European Research Council;
Keywords
Interactive data exploration; Active learning; Label noise; IMBALANCED DATA; QUERY; EXAMPLE; CLASSIFICATION; SEARCH; NOISE;
DOI
10.1007/s00778-023-00816-x
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
There is an increasing gap between the fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content from data more effectively. In this work, we propose an interactive data exploration system as a new database service, using an approach called "explore-by-example." Our new system is designed to assist the user in performing highly effective data exploration while reducing the human effort in the process. We cast the explore-by-example problem in a principled "active learning" framework. However, traditional active learning suffers from two fundamental limitations: slow convergence and lack of robustness under label noise. To overcome these two problems, we bring the properties of important classes of database queries to bear on the design of new algorithms and optimizations for active learning-based database exploration. Evaluation results using real-world datasets and user interest patterns show that our new system, both in the noise-free case and in the label noise case, significantly outperforms state-of-the-art active learning techniques and data exploration systems in accuracy while achieving the desired efficiency for interactive data exploration.
Pages: 931 - 956
Number of pages: 26
Related papers
50 in total
  • [21] Efficient Exploration of Microstructure-Property Spaces via Active Learning
    Morand, Lukas
    Link, Norbert
    Iraki, Tarek
    Dornheim, Johannes
    Helm, Dirk
    FRONTIERS IN MATERIALS, 2022, 8
  • [22] Using Active Learning Techniques for Improving Database Schema Matching Methods
    Rodrigues, Diego
    da Silva, Altigran
    Rodrigues, Rosiane
    dos Santos, Eulanda
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015
  • [23] Safe Exploration for Interactive Machine Learning
    Turchetta, Matteo
    Berkenkamp, Felix
    Krause, Andreas
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [24] ACTIVE LEARNING WITH INTERACTIVE WHITEBOARDS
    Schroeder, Robert
    COMMUNICATIONS IN INFORMATION LITERACY, 2007, 1 (02) : 64 - 73
  • [25] Active Learning Methods for Efficient Hybrid Biophysical Variable Retrieval
    Verrelst, Jochem
    Dethier, Sara
    Rivera, Juan Pablo
    Munoz-Mari, Jordi
    Camps-Valls, Gustau
    Moreno, Jose
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2016, 13 (07) : 1012 - 1016
  • [26] A General Framework for Robust Interactive Learning
    Emamjomeh-Zadeh, Ehsan
    Kempe, David
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [27] METHODS AND APPROACHES IN INTERACTIVE LEARNING
    Yegenissova, A. K.
    Tulenova, U.
    Aidnaliyeva, N. A.
    Balgabayeva, G. Z.
    Baizhanova, S. A.
    Togaibayeva, A.
    Ramazanova, D.
    Ichshanova, G. E.
    AD ALTA-JOURNAL OF INTERDISCIPLINARY RESEARCH, 2020, 10 (02) : 35 - 40
  • [28] ACTIVE AND INTERACTIVE METHODS IN THE CONTEMPORARY EDUCATION
    Kovalenko, Kseniya E.
    Kovalenko, Nataliya E.
    Gubareva, Anna V.
    QUID-INVESTIGACION CIENCIA Y TECNOLOGIA, 2018, (02) : 17 - 20
  • [29] Learning dialogue strategies for interactive database search
    Rieser, Verena
    Lemon, Oliver
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2041 - +
  • [30] An interactive tool for teaching and learning database normalization
    Stefanidis, Christos
    Koloniari, Georgia
    20TH PAN-HELLENIC CONFERENCE ON INFORMATICS (PCI 2016), 2016