Selected data mining concepts

Authors
Abello, James [1]
Cormode, Graham [1]
Fradkin, Dmitriy [1]
Madigan, David [1]
Melnik, Ofer [1]
Muchnik, Ilya [1]
Affiliations
[1] Rutgers State Univ, DIMACS Res Inst, Piscataway, NJ 08854 USA
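DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract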
In this multi-authored chapter, we introduce six key techniques from data mining that have been successfully applied to epidemiological data analysis. Cluster Analysis is an unsupervised learning technique that takes large collections of data points and attempts to identify clusters of similar points. More formally, it tries to create clusters that optimize various mathematical properties, such as minimizing the maximum spread of any cluster or minimizing the sum of the spreads. A variety of algorithms have been proposed to create clusters from a data set, including k-means, hierarchical clustering, and expectation maximization. Association Rules have become one of the central tools in transactional data mining. They can also be applied as a way of looking for correlations and associations within epidemiological data. Support Vector Machines are a popular classification method that produces linear classifiers in the form of cutting hyperplanes, which divide the space into positive and negative examples. Through the so-called "kernel trick", the method can also produce non-linear rules by projecting the data into a different space. Statistical Techniques are a bedrock of epidemiological study, through sampling, hypothesis testing, experiment design, and inference techniques. An important suite of methods draws on Bayesian inference, giving more interpretable confidence bounds and simpler model fitting. Boosting is a very influential technique for "boosting" the quality of classifier-based methods by varying the emphasis put on examples to focus the classification method on the "harder" examples. The outputs of a number of rules are then combined by taking a weighted majority vote. The method has been extended to a wide variety of settings and applied to a large number of different scenarios.
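As a rough illustration of the clustering objective described above, the following is a minimal k-means sketch in plain Python. It is not taken from the chapter: the function names (`k_means`, `total_spread`), the use of squared Euclidean distance as the "spread", and the random initialization are all assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=20, seed=0):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: start from k random data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean_point(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return centers, clusters


def total_spread(centers, clusters):
    """Sum over clusters of squared distances to the cluster center."""
    return sum(squared_distance(p, c)
               for c, cluster in zip(centers, clusters)
               for p in cluster)
```

Calling `k_means(points, k)` on a list of coordinate tuples returns the final centers and the point lists assigned to them. Each iteration never increases `total_spread`, but the procedure can stop at a local optimum that depends on the initial centers.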
External Memory Algorithms are needed when the volume of data being processed exceeds the internal (fast) memory of the machine, so that some data must reside on external (slow) memory, i.e., disks. Such methods give deep insights into the structure and properties of the underlying algorithms.
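To make the external-memory idea concrete, here is a minimal sketch of an external merge sort, a standard example of this class of algorithm. It is not drawn from the chapter. It assumes the input is a text file with one integer per line and no blank lines, and the names `external_sort`, `write_sorted_run`, and `chunk_size` are hypothetical.

```python
import heapq
import itertools
import os
import tempfile


def write_sorted_run(values):
    """Sort one in-memory chunk and write it to a temporary file (a 'run')."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as run:
        run.writelines(f"{v}\n" for v in sorted(values))
    return path


def external_sort(input_path, output_path, chunk_size=1000):
    """Sort a file of integers, one per line, reading at most
    chunk_size values into memory at a time."""
    run_paths = []
    # Phase 1: split the input into sorted runs stored on disk.
    with open(input_path) as src:
        while True:
            chunk = [int(line) for line in itertools.islice(src, chunk_size)]
            if not chunk:
                break
            run_paths.append(write_sorted_run(chunk))
    # Phase 2: stream all runs through a k-way merge; only one pending
    # value per run is held in memory (plus file buffers).
    run_files = [open(path) for path in run_paths]
    try:
        streams = [(int(line) for line in f) for f in run_files]
        with open(output_path, "w") as out:
            out.writelines(f"{v}\n" for v in heapq.merge(*streams))
    finally:
        for f in run_files:
            f.close()
        for path in run_paths:
            os.remove(path)
```

The sketch merges all runs in a single pass. A fuller implementation would cap the number of runs merged at once and repeat the merge phase when there are too many.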
Pages: 1-40
Page count: 40