Neighborhood Information-Based Method for Multivariate Association Mining

Cited by: 2
Authors
Cheng, Honghong [1 ,2 ]
Qian, Yuhua [3 ]
Guo, Yingjie [3 ]
Zheng, Keyin [3 ]
Zhang, Qingfu [4 ,5 ]
Affiliations
[1] Shanxi Univ Finance & Econ, Sch Informat, Taiyuan 030012, Shanxi, Peoples R China
[2] Shanxi Univ, Inst Big Data Sci & Ind, Taiyuan 030006, Shanxi, Peoples R China
[3] Shanxi Univ, Inst Big Data Sci & Ind, Sch Comp & Informat Technol, Key Lab Comp Intelligence & China Informat Proc,Mi, Taiyuan 030006, Shanxi, Peoples R China
[4] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
[5] City Univ Hong Kong, Shenzhen Res Inst, Shenzhen 518057, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Entropy; Spirals; Noise measurement; Mutual information; Knowledge engineering; Data mining; Data engineering; Association mining; multivariate association measure; distribution-free; nonparametric; neighborhood information; ATTRIBUTE REDUCTION;
DOI
10.1109/TKDE.2022.3178090
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most current data are multivariate, and exploring and identifying valuable information in such datasets has far-reaching impact. In particular, discovering meaningful hidden association patterns in multivariate data plays an important role. Many measures of multivariate association have been proposed, yet effectively capturing association patterns among three or more variables remains an open research challenge, especially when no prior knowledge about those relationships is available. To address this, we seek a distribution-free, association-type-independent, and nonparametric measure. For practical applications, such a measure should also be comparable, interpretable, scalable, intuitive, reliable, and robust. However, no existing measure fulfills all of these desiderata. In this paper, taking advantage of the neighborhood information of a sample, we propose MNA, a maximal neighborhood multivariate association measure that satisfies all of the above criteria. Extensive experiments on synthetic and real data show that it outperforms state-of-the-art multivariate association measures.
Pages: 6126-6135
Page count: 10