An adjustable description quality measure for pattern discovery using the AQ methodology

Cited by: 12
Authors
Kaufman, KA [1]
Michalski, RS
Affiliations
[1] George Mason Univ, Machine Learning & Inference Lab, Fairfax, VA 22030 USA
[2] Polish Acad Sci, Inst Comp Sci, PL-00901 Warsaw, Poland
Keywords
machine learning; data mining; learning from noisy data; natural induction; AQ learning; decision rules; separate and conquer
DOI
10.1023/A:1008787919756
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses or patterns characterizing the input data. If one can assume that training data contain no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and the insistence on the full completeness and consistency of the hypothesis is no longer valid. In such situations, the problem is to determine a hypothesis that represents the best trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a rule quality criterion that combines the rule coverage (a measure of completeness) and training accuracy (a measure of consistency). These factors are combined into a single rule quality measure through a lexicographical evaluation functional (LEF). The method has been implemented in the AQ18 learning system for natural induction and pattern discovery, and compared with several other methods. Experiments have shown that the proposed method can be easily tailored to different problems and can simulate different rule learners by modifying the parameter of the rule quality criterion.
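The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not give the form of the quality criterion, so the weighted geometric combination below, the function name `rule_quality`, and the parameter `w` are assumptions chosen for illustration only, not the AQ18 implementation. Here `p` and `n` are the positive and negative training examples a rule covers, and `P` and `N` are the totals of each class.

```python
def rule_quality(p: int, n: int, P: int, N: int, w: float = 0.5) -> float:
    """Hypothetical adjustable quality score for a single rule.

    Assumed form (not stated in the abstract):
        quality = completeness**w * consistency**(1 - w)
    where completeness = p / P (rule coverage) and
    consistency = p / (p + n) (training accuracy).
    N is accepted for symmetry but unused in this simplified form.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if P <= 0 or p + n == 0:
        return 0.0  # a rule covering nothing has no measurable quality
    completeness = p / P
    consistency = p / (p + n)
    return completeness ** w * consistency ** (1.0 - w)


def best_rule(candidates: dict, P: int, N: int, w: float) -> str:
    """Return the name of the candidate rule with the highest score."""
    return max(candidates, key=lambda name: rule_quality(*candidates[name], P, N, w))


# Two illustrative candidate rules as (p, n) pairs; the numbers are made up.
candidates = {
    "broad": (90, 20),   # covers most positives but also some negatives
    "strict": (40, 0),   # covers fewer positives and no negatives
}

for w in (0.0, 0.5, 1.0):
    print(w, best_rule(candidates, P=100, N=100, w=w))
```

With `w` near 1 the score rewards coverage and favors the broad rule; with `w` near 0 it rewards training accuracy and favors the strict rule. This is one way a single parameter could make a learner behave like different rule learners.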
Pages: 199-216
Page count: 18
Related papers
50 in total
  • [1] An Adjustable Description Quality Measure for Pattern Discovery Using the AQ Methodology
    Kenneth A. Kaufman
    Ryszard S. Michalski
    Journal of Intelligent Information Systems, 2000, 14 : 199 - 216
  • [2] Parallel Frequent Pattern Discovery: Challenges and Methodology
    Zhang, Yuzhou
    Wang, Jianyong
    Zhou, Lizhu
    Tsinghua Science and Technology, 2007, 12 (06) : 719 - 728
  • [3] Parallel Frequent Pattern Discovery: Challenges and Methodology
    张宇宙
    王建勇
    周立柱
    Tsinghua Science and Technology, 2007, (06) : 719 - 728
  • [4] Pattern discovery and detection: A unified statistical methodology
    Hand, DJ
    Bolton, RJ
    JOURNAL OF APPLIED STATISTICS, 2004, 31 (08) : 885 - 924
  • [5] Case Study of Spatial Pattern Description, Identification and Application Methodology
    Germanaite, Indraja Elzbieta
    Zaleckis, Kestutis
    Butleris, Rimantas
    Jarmalaviciene, Kristina
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2020, 26 (06) : 649 - 670
  • [6] Using the vignette methodology to measure access
    Van Ginneken, E.
    EUROPEAN JOURNAL OF PUBLIC HEALTH, 2021, 31
  • [7] "Best Hospitals": A description of the methodology for the Index of Hospital Quality
    Hill, CA
    Winfrey, KL
    Rudolph, BA
    INQUIRY-THE JOURNAL OF HEALTH CARE ORGANIZATION PROVISION AND FINANCING, 1997, 34 (01) : 80 - 90
  • [8] The AQ21 natural induction program for pattern discovery: Initial version and its novel features
    Wojtusiak, J.
    Michalski, R. S.
    Kaufman, K. A.
    Pietrzykowski, J.
    ICTAI-2006: EIGHTEENTH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, : 523 - +
  • [9] Pattern Discovery Using Association Rules
    Kiruthika, M.
    Jadhav, Rahul
    Dixit, Dipa
    Rashmi, J.
    Nehete, Anjali
    Khodkar, Trupti
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2011, 2 (12) : 69 - 74
  • [10] Fuzzy classification using pattern discovery
    Hamilton-Wright, Andrew
    Stashuk, Daniel W.
    Tizhoosh, Hamid R.
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2007, 15 (05) : 772 - 783