Modeling uncertain data using Monte Carlo integration method for clustering

Cited by: 37
Authors
Sharma, Krishna Kumar [1]
Seal, Ayan [1]
Affiliations
[1] PDPM Indian Inst Informat Technol Design & Mfg, Jabalpur 482005, Madhya Pradesh, India
Keywords
Uncertain data modeling; Monte Carlo integration; Kullback-Leibler divergence; Jeffreys divergence; Clustering analysis; DIVERGENCE ESTIMATION; SELECTION;
DOI
10.1016/j.eswa.2019.06.050
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data clustering is an important task for the data mining research community, since the availability of uncertain data is increasing rapidly in many applications such as weather forecasting and business information management systems. In this work, a proposed Monte Carlo integration based technique for modeling uncertain objects is compared with three state-of-the-art methods, namely kernel density estimation, Dempster-Shafer, and Monte Carlo simulation. Kullback-Leibler and Jeffreys divergences are then used to measure the similarity between uncertain objects and to group them with modified DBSCAN and k-medoids clustering algorithms. A heuristic algorithm is proposed to find the optimum radius, which is one of the inputs of DBSCAN. All experiments are performed on one synthesized dataset and three real datasets, namely weather data, Japanese vowels, and activity of daily living data. Five performance measures, namely accuracy, precision, recall, F-score, and Jaccard index, are considered for comparing the proposed method with state-of-the-art methods. Two non-parametric tests, namely the Wilcoxon rank sum test and the sign test, are also conducted. The results indicate the effectiveness and efficiency of the proposed method over state-of-the-art methods. (C) 2019 Elsevier Ltd. All rights reserved.
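The abstract does not spell out how the divergences are computed, so the following is only a minimal sketch of the general idea of estimating Kullback-Leibler and Jeffreys divergences by Monte Carlo integration. It assumes, purely for illustration, that each uncertain object is a one-dimensional Gaussian given as a `(mean, std)` pair; the function names `mc_kl`, `mc_jeffreys`, and `exact_kl` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)


def mc_kl(p, q, n_samples=100_000, rng=None):
    """Monte Carlo estimate of KL(p || q) = E_p[log p(x) - log q(x)].

    p and q are (mean, std) pairs (an assumption of this sketch).
    Samples are drawn from p and the log-density ratio is averaged.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = rng.normal(p[0], p[1], size=n_samples)
    return float(np.mean(gaussian_logpdf(x, *p) - gaussian_logpdf(x, *q)))


def mc_jeffreys(p, q, n_samples=100_000, rng=None):
    """Jeffreys divergence as the symmetrised sum KL(p || q) + KL(q || p)."""
    rng = np.random.default_rng(0) if rng is None else rng
    return mc_kl(p, q, n_samples, rng) + mc_kl(q, p, n_samples, rng)


def exact_kl(p, q):
    """Closed-form KL between two 1-D Gaussians, used only as a sanity check."""
    (m1, s1), (m2, s2) = p, q
    return float(np.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2.0 * s2**2) - 0.5)


if __name__ == "__main__":
    obj_a, obj_b = (0.0, 1.0), (1.0, 2.0)  # two illustrative uncertain objects
    print("MC KL(a||b):   ", mc_kl(obj_a, obj_b))
    print("exact KL(a||b):", exact_kl(obj_a, obj_b))
    print("MC Jeffreys:   ", mc_jeffreys(obj_a, obj_b))
```

Because the Jeffreys divergence is symmetric, a matrix of pairwise values could in principle be passed to a clustering routine that accepts precomputed dissimilarities; whether the paper's modified DBSCAN and k-medoids work this way is not stated in the abstract.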
Pages: 100-116
Number of pages: 17
Related papers
50 in total
  • [1] Modeling of recrystallization using Monte Carlo method based on EBSD data
    Tarasiuk, J
    Gerber, P
    Bacroix, B
    Piekos, K
    TEXTURES OF MATERIALS, PTS 1 AND 2, 2002, 408-4 : 395 - 400
  • [2] Bayesian Reliability Modeling Using Monte Carlo Integration
    Camara, Vincent A. R.
    Tsokos, Chris P.
    JOURNAL OF MODERN APPLIED STATISTICAL METHODS, 2005, 4 (01) : 172 - 186
  • [3] Monte Carlo method for multiple parameter estimation in the presence of uncertain data
    Siu, N.
    1600, (28):
  • [4] Clustering Longitudinal Data Using R: A Monte Carlo Study
    Verboon, Peter
    Pat-El, Ron
    METHODOLOGY-EUROPEAN JOURNAL OF RESEARCH METHODS FOR THE BEHAVIORAL AND SOCIAL SCIENCES, 2022, 18 (02) : 144 - 163
  • [6] Modified Markov chain Monte Carlo method for dynamic data integration using streamline approach
    Efendiev, Yalchin
    Datta-Gupta, Akhil
    Ma, Xianlin
    Mallick, Bani
    MATHEMATICAL GEOSCIENCES, 2008, 40 (02) : 213 - 232
  • [7] Modified Markov Chain Monte Carlo Method for Dynamic Data Integration Using Streamline Approach
    Yalchin Efendiev
    Akhil Datta-Gupta
    Xianlin Ma
    Bani Mallick
    Mathematical Geosciences, 2008, 40 : 213 - 232
  • [8] Monte Carlo modeling with uncertain probability density functions
    Brattin, WJ
    Barry, TM
    Chiu, N
    HUMAN AND ECOLOGICAL RISK ASSESSMENT, 1996, 2 (04) : 820 - 840
  • [9] Using Monte Carlo method for ranking interval data
    Jahanshahloo, G. R.
    Lotfi, F. Hosseinzadeh
    Balf, F. Rezai
    Rezai, H. Zhiani
    APPLIED MATHEMATICS AND COMPUTATION, 2008, 201 (1-2) : 613 - 620
  • [10] Modeling the hierarchical protein folding using clustering Monte-Carlo algorithm
    Yesylevskyy, SO
    Demchenko, AP
    PROTEIN AND PEPTIDE LETTERS, 2001, 8 (06) : 437 - 442