Possibilistic Similarity Measures for Data Science and Machine Learning Applications

Cited by: 4
Authors
Charfi, Amal [1 ]
Bouhamed, Sonda Ammar [1 ,2 ]
Bosse, Eloi [2 ,3 ]
Kallel, Imene Khanfir [1 ,2 ]
Bouchaala, Wassim [4 ]
Solaiman, Basel [2 ]
Derbel, Nabil [1 ]
Affiliations
[1] Univ Sfax, Natl Sch Engineers Sfax, Control & Energy Managment CEM Lab, Sfax 3038, Tunisia
[2] IMT Atlantique, Image & Informat Proc Dept iTi, F-838182923 Brest, France
[3] Expertises Parafuse Inc, Quebec City, PQ G1W 4N1, Canada
[4] Tunisian Profess Training Agcy, Sfax 3000, Tunisia
Keywords
Uncertainty; Possibility theory; Measurement uncertainty; Machine learning; Atmospheric measurements; Particle measurements; Indexes; Classification; distance; entropy; learning; measures of specificity; possibility distributions; similarity; uncertainty; INFORMATION; UNCERTAINTY; NOTION;
DOI
10.1109/ACCESS.2020.2979553
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Measuring similarity is of great interest in many research areas, including data science, machine learning, pattern recognition, text analysis, and information retrieval. The literature has shown that possibility is an attractive notion in the context of distinguishability assessment and can lead to very efficient and computationally inexpensive learning schemes. This paper focuses on determining the similarity between two possibility distributions. A review of existing similarity measures within the possibilistic framework is presented first. These measures are then analyzed with respect to their capacity to satisfy a set of properties that a similarity measure should have. Most of the existing possibilistic similarity measures produce undesirable outcomes since they generally depend on the application context. A new similarity measure, called InfoSpecificity, is introduced, and the similarity measures are categorized into three main categories of methods: morphic-based, amorphic-based, and hybrid. Two experiments are conducted using four benchmark databases. The aim of the experiments is to compare the efficiency of the possibilistic similarity measures when applied to real data. The experiments show good results for the hybrid methods, particularly with the InfoSpecificity measure. In general, the hybrid methods outperform the other two categories when evaluated on small samples, i.e., in a poor-data context (or poorly informed environment), where possibility theory offers the greatest benefit.
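As a rough illustration of what "similarity between two possibility distributions" can mean, the sketch below computes a generic distance-based similarity on a finite universe. It is not the InfoSpecificity measure and is not taken from the paper; the function names `is_normalized` and `manhattan_similarity` are hypothetical, and the choice of a mean absolute difference is an assumption made only for this example.

```python
from typing import Sequence


def is_normalized(pi: Sequence[float], tol: float = 1e-9) -> bool:
    """Check that every degree lies in [0, 1] and at least one element is fully possible."""
    return all(0.0 <= p <= 1.0 for p in pi) and abs(max(pi) - 1.0) <= tol


def manhattan_similarity(pi1: Sequence[float], pi2: Sequence[float]) -> float:
    """Illustrative similarity: 1 minus the mean absolute difference of the degrees.

    Both distributions must be defined over the same finite universe,
    listed in the same order. The result lies in [0, 1].
    """
    if len(pi1) != len(pi2) or len(pi1) == 0:
        raise ValueError("distributions must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(pi1, pi2))
    return 1.0 - total / len(pi1)


if __name__ == "__main__":
    # Two hypothetical distributions over a three-element universe.
    pi_a = [1.0, 0.5, 0.0]
    pi_b = [1.0, 0.0, 0.0]
    print(is_normalized(pi_a), is_normalized(pi_b))
    print(manhattan_similarity(pi_a, pi_b))
```

A measure of this kind compares the two distributions element by element only; it does not account for how specific (informative) each distribution is.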
Pages: 49198 - 49211
Number of pages: 14