Efficient attribute-oriented generalization for knowledge discovery from large databases

Cited by: 38
Authors
Carter, CL [1]
Hamilton, HJ [1]
Affiliation
[1] Univ Regina, Dept Comp Sci, Networks Ctr Excellence Program, Ctr Excellence Lab,IRIS, Regina, SK S4S 0A2, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
knowledge discovery from databases; data mining; attribute-oriented induction;
DOI
10.1109/69.683752
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present GDBR (Generalize DataBase Relation) and FIGR (Fast, Incremental Generalization and Regeneralization), two enhancements of Attribute-Oriented Generalization, a well-known technique for knowledge discovery from databases. GDBR and FIGR both run in O(n) time and are therefore optimal. GDBR is an on-line algorithm and requires only a small, constant amount of space. FIGR also requires a constant amount of space, which is generally reasonable but may grow large under certain circumstances. FIGR is incremental, allowing changes to the database to be reflected in the generalization results without rereading input data. FIGR also allows fast regeneralization to both higher and lower levels of generality without rereading input. We compare GDBR and FIGR to two previous algorithms, LCHR and AOI, which are O(n log n) and O(np), respectively, where n is the number of input tuples and p is the number of tuples in the generalized relation. Both earlier algorithms require O(n) space, which causes memory problems for large inputs. We implemented all four algorithms and ran empirical tests, which showed that GDBR and FIGR are faster. In addition, their runtimes increase only linearly with input size, while the runtimes of LCHR and AOI increase greatly when input size exceeds memory limitations.
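To make the idea of attribute-oriented generalization concrete, the following is a minimal Python sketch, not the authors' GDBR or FIGR. It assumes each attribute has a concept hierarchy given as a child-to-parent map, and it merges identical generalized tuples with a hash table in a single pass over the input. The hierarchies, attribute names, and the functions `climb`, `generalize`, and `regeneralize` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical concept hierarchies (assumption): each maps a value to its parent concept.
HIERARCHIES = {
    "city": {
        "Regina": "Saskatchewan", "Saskatoon": "Saskatchewan", "Calgary": "Alberta",
        "Saskatchewan": "Canada", "Alberta": "Canada",
    },
    "program": {"CS": "Science", "Math": "Science", "History": "Arts"},
}


def climb(attribute, value, levels):
    """Replace a value by its ancestor `levels` steps up the attribute's hierarchy.

    A value with no parent is already at the top and is left unchanged.
    """
    parents = HIERARCHIES[attribute]
    for _ in range(levels):
        value = parents.get(value, value)
    return value


def generalize(tuples, attributes, levels):
    """Single pass over the input: generalize each tuple and count duplicates.

    Only the generalized tuples and their counts are stored, so memory depends
    on the number of distinct generalized tuples, not on the input size.
    """
    counts = Counter()
    for row in tuples:
        key = tuple(climb(a, v, levels[a]) for a, v in zip(attributes, row))
        counts[key] += 1
    return counts


def regeneralize(counts, attributes, extra_levels):
    """Climb further from an already generalized relation without rereading input."""
    merged = Counter()
    for key, n in counts.items():
        new_key = tuple(climb(a, v, extra_levels[a]) for a, v in zip(attributes, key))
        merged[new_key] += n
    return merged


attributes = ("city", "program")
rows = [("Regina", "CS"), ("Saskatoon", "Math"), ("Calgary", "History"), ("Regina", "Math")]

relation = generalize(rows, attributes, {"city": 1, "program": 1})
higher = regeneralize(relation, attributes, {"city": 1, "program": 0})
```

In this sketch, newly arriving tuples can be folded in by generalizing them and adding the resulting counts to `relation`, which loosely mirrors the incremental behaviour described in the abstract. Moving to a lower level of generality would require keeping counts at the most specific level of interest, which the sketch does not attempt.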
Pages: 193-208
Number of pages: 16