An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

Cited by: 15
Authors
Ducange, Pietro [1]
Fazzolari, Michela [2]
Marcelloni, Francesco [1]
Affiliations
[1] Dipartimento Ingn Informaz, Largo Lucio Lazzarino 1, I-56122 Pisa, Italy
[2] CNR, Ist Informat & Telemat, Via Giuseppe Moruzzi 1, I-56124 Pisa, Italy
Keywords
Big Data; Fuzzy models; Data mining; Classification algorithms; Distributed computing; MULTIOBJECTIVE EVOLUTIONARY APPROACH; ASSOCIATIVE CLASSIFICATION; CLUSTERING-ALGORITHM; SYSTEMS; MAPREDUCE; ANALYTICS; DESIGN; GRANULARITY; CLASSIFIERS; SELECTION;
DOI
10.1186/s40537-020-00298-6
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Nowadays, a huge amount of data is generated, often in very short time intervals and in various formats, by many heterogeneous sources such as social networks and media, mobile devices, internet transactions, networked devices and sensors. These data, identified as Big Data in the literature, are characterized by the popular Vs features, such as Value, Veracity, Variety, Velocity and Volume. In particular, Value focuses on the useful knowledge that may be mined from data. Thus, in recent years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have generally been implemented using ad hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models currently play a significant role, thanks to their capability of handling vague and imprecise data and their inherent interpretability. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. In particular, we first show some design and implementation details of these learning algorithms. Thereafter, we compare them in terms of accuracy and interpretability. Finally, we discuss their scalability.
Pages: 29
Related papers
50 items in total
  • [31] Big data algorithms beyond machine learning
    Mnich M.
    KI - Künstliche Intelligenz, 2018, 32 (01) : 9 - 17
  • [32] Overview of deep learning algorithms for PolSAR image classification
    Bi, Haixia
    Kuang, Zuzheng
    Li, Fan
    Gao, Jinghuai
    Xu, Chen
    Chinese Science Bulletin, 2024, 69 (35) : 5108 - 5128
  • [33] Imbalanced Big Data Classification: A Distributed Implementation of SMOTE
    Rastogi, Avnish Kumar
    Narang, Nitin
    Siddiqui, Zamir Ahmad
    PROCEEDINGS OF THE WORKSHOP PROGRAM OF THE 19TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING AND NETWORKING (ICDCN'18), 2018,
  • [34] Overview of Data Mining Classification Techniques: Traditional vs. Parallel/Distributed Programming Models
    Besimi, Nuhi
    Cico, Betim
    Besimi, Adrian
    2017 6TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2017, : 433 - 436
  • [35] An Overview of Fashion Business Models in Big Data Environment
    Wang, Yun-Yun
    Li, Yi
    Perry, Patsy
    Liu, Zhang-Chi
    TEXTILE BIOENGINEERING AND INFORMATICS SYMPOSIUM (TBIS) PROCEEDINGS, 2018, 2018, : 690 - 700
  • [36] Fuzzy Neighbors and Deep Learning-Assisted Spark Model for Imbalanced Classification of Big Data
    Nalinipriya, G.
    Geetha, M.
    Sudha, D.
    Daniya, T.
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2023, 31 (01) : 141 - 162
  • [38] Distributed Fuzzy Rough Prototype Selection for Big Data Regression
    Vluymans, Sarah
    Asfoor, Hasan
    Saeys, Yvan
    Cornelis, Chris
    Tolentino, Matthew
    Teredesai, Ankur
    De Cock, Martine
    2015 ANNUAL MEETING OF THE NORTH AMERICAN FUZZY INFORMATION PROCESSING SOCIETY DIGIPEN NAFIPS 2015, 2015,
  • [39] Recent Advancements in Learning Algorithms for Point Clouds: An Updated Overview
    Camuffo, Elena
    Mari, Daniele
    Milani, Simone
    SENSORS, 2022, 22 (04)
  • [40] Runtime prediction of big data jobs: performance comparison of machine learning algorithms and analytical models
    Nasim Ahmed
    Andre L. C. Barczak
    Mohammad A. Rashid
    Teo Susnjak
    Journal of Big Data, 9