An overview of recent distributed algorithms for learning fuzzy models in Big Data classification

Cited by: 15
Authors
Ducange, Pietro [1]
Fazzolari, Michela [2]
Marcelloni, Francesco [1]
Affiliations
[1] Dipartimento Ingn Informaz, Largo Lucio Lazzarino 1, I-56122 Pisa, Italy
[2] CNR, Ist Informat & Telemat, Via Giuseppe Moruzzi 1, I-56124 Pisa, Italy
Keywords
Big Data; Fuzzy models; Data mining; Classification algorithms; Distributed computing; Multiobjective evolutionary approach; Associative classification; Clustering algorithm; Systems; MapReduce; Analytics; Design; Granularity; Classifiers; Selection
DOI
10.1186/s40537-020-00298-6
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Huge amounts of data are now generated, often in very short time intervals and in various formats, by heterogeneous sources such as social networks and media, mobile devices, internet transactions, and networked devices and sensors. These data, referred to as Big Data in the literature, are characterized by the popular "Vs" features, such as Value, Veracity, Variety, Velocity and Volume. In particular, Value concerns the useful knowledge that may be mined from data. Thus, in recent years, a number of data mining and machine learning algorithms have been proposed to extract knowledge from Big Data. These algorithms have generally been implemented using ad-hoc programming paradigms, such as MapReduce, on specific distributed computing frameworks, such as Apache Hadoop and Apache Spark. In the context of Big Data, fuzzy models currently play a significant role, thanks to their capability of handling vague and imprecise data and their inherent interpretability. In this work, we give an overview of the most recent distributed learning algorithms for generating fuzzy classification models for Big Data. In particular, we first present some design and implementation details of these learning algorithms. We then compare them in terms of accuracy and interpretability. Finally, we discuss their scalability.
Pages: 29
Related papers
50 in total
  • [21] Big Data Image Classification Based on Distributed Deep Representation Learning Model
    Zhu, Minjun
    Chen, Qinghua
    IEEE Access, 2020, 8 : 133890 - 133904
  • [22] A Distributed Fuzzy Associative Classifier for Big Data
    Segatori, Armando
    Bechini, Alessio
    Ducange, Pietro
    Marcelloni, Francesco
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (09) : 2656 - 2669
  • [23] Fuzzy Models for Big Data Mining
    Ducange, Pietro
    FUZZY LOGIC AND APPLICATIONS, WILF 2018, 2019, 11291 : 257 - 260
  • [24] On Distributed Fuzzy Decision Trees for Big Data
    Segatori, Armando
    Marcelloni, Francesco
    Pedrycz, Witold
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2018, 26 (01) : 174 - 192
  • [25] Distributed fuzzy c-means algorithms for big sensor data based on cloud computing
    Zhang, Qingchen
    Chen, Zhikui
    Leng, Yonglin
    INTERNATIONAL JOURNAL OF SENSOR NETWORKS, 2015, 18 (1-2) : 32 - 39
  • [26] A Performance Evaluation of Classification Algorithms for Big Data
    Hai, Mo
    Zhang, You
    Zhang, Youjin
    5TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND QUANTITATIVE MANAGEMENT, ITQM 2017, 2017, 122 : 1100 - 1107
  • [27] A Mapreduce Fuzzy Techniques of Big Data Classification
    El Bakry, Malak
    Safwat, Soha
    Hegazy, Osman
    PROCEEDINGS OF THE 2016 SAI COMPUTING CONFERENCE (SAI), 2016 : 118 - 128
  • [28] An overview of recent advances on distributed and agile sensing algorithms and implementation
    Banavar, Mahesh K.
    Zhang, Jun J.
    Chakraborty, Bhavana
    Kwon, Homin
    Li, Ying
    Jiang, Huaiguang
    Spanias, Andreas
    Tepedelenlioglu, Cihan
    Chakrabarti, Chaitali
    Papandreou-Suppappola, Antonia
    DIGITAL SIGNAL PROCESSING, 2015, 39 : 1 - 14
  • [29] Handling Imbalance Classification Virtual Screening Big Data Using Machine Learning Algorithms
    Hussin, Sahar K.
    Abdelmageid, Salah M.
    Alkhalil, Adel
    Omar, Yasser M.
    Marie, Mahmoud I.
    Ramadan, Rabie A.
    COMPLEXITY, 2021, 2021
  • [30] Learning fuzzy linguistic models from low quality data by genetic algorithms
    Sanchez, Luciano
    Otero, Jose
    2007 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-4, 2007 : 1926 - 1931