Efficient and robust active learning methods for interactive database exploration

Cited by: 0
Authors
Huang, Enhui [1]
Diao, Yanlei [1, 2]
Liu, Anna [2]
Peng, Liping [2]
Di Palma, Luciano [1]
Affiliations
[1] Ecole Polytech, Palaiseau, France
[2] Univ Massachusetts Amherst, Amherst, MA USA
Source
VLDB JOURNAL | 2024 / Vol. 33 / Issue 04
Funding
European Research Council;
Keywords
Interactive data exploration; Active learning; Label noise; IMBALANCED DATA; QUERY; EXAMPLE; CLASSIFICATION; SEARCH; NOISE;
DOI
10.1007/s00778-023-00816-x
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
There is an increasing gap between the fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content from data more effectively. In this work, we propose an interactive data exploration system as a new database service, using an approach called "explore-by-example." Our new system is designed to assist the user in performing highly effective data exploration while reducing the human effort in the process. We cast the explore-by-example problem in a principled "active learning" framework. However, traditional active learning suffers from two fundamental limitations: slow convergence and lack of robustness under label noise. To overcome the slow convergence and label noise problems, we bring the properties of important classes of database queries to bear on the design of new algorithms and optimizations for active learning-based database exploration. Evaluation results using real-world datasets and user interest patterns show that our new system, both in the noise-free case and in the label noise case, significantly outperforms state-of-the-art active learning techniques and data exploration systems in accuracy while achieving the desired efficiency for interactive data exploration.
Pages: 931 - 956
Page count: 26