Big data analytics for identifying electricity theft using machine learning approaches in microgrids for smart communities

Cited by: 20
Authors
Arif, Arooj [1]
Javaid, Nadeem [1]
Aldegheishem, Abdulaziz [2]
Alrajeh, Nabil [3]
Affiliations
[1] COMSATS Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[2] King Saud Univ KSU, Coll Architecture & Planning, Urban Planning Dept, Riyadh, Saudi Arabia
[3] King Saud Univ KSU, Biomed Technol Dept, Coll Appl Med Sci, Riyadh, Saudi Arabia
Keywords
big data; electricity theft detection; hyperactive optimization toolkit; machine learning; smart grids; urban planning; imbalanced data; optimization; systems
DOI
10.1002/cpe.6316
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Electricity theft (ET) causes major revenue loss for power utilities. It degrades the quality of supply, raises production costs, forces legitimate consumers to pay higher prices, and harms the economy as a whole. In this article, we use the State Grid Corporation of China (SGCC) dataset, which contains 1035 days of electricity consumption data for two classes: normal and fraudulent. We propose an ET detection model that consists of four steps: interpolation, data balancing, feature extraction, and classification. First, missing values in the dataset are recovered using interpolation. Second, the data are resampled. ET consumers make up 9% of the SGCC dataset, and this imbalance prevents a model from classifying both classes (normal and theft) correctly. To address it, a hybrid resampling technique is proposed that combines the synthetic minority oversampling technique with near miss. Third, a residual network extracts latent features from the SGCC dataset. Fourth, three tree-based classifiers, namely decision tree (DT), random forest (RF), and adaptive boosting (AdaBoost), are trained on the encoded feature vectors for classification. Searching for good hyperparameters is a challenging task that is usually done manually and takes considerable time. To resolve this problem, a Bayesian optimizer is used to simplify the tuning of DT, RF, and AdaBoost. The results indicate that RF outperforms DT and AdaBoost.
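The first two steps of the abstract (interpolation and hybrid resampling) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation. The function names, the linear interpolation rule, the NearMiss-1 style selection, and the way the two resamplers are combined (both classes brought to a common `target` size) are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = List[float]

def interpolate_missing(series: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps linearly between the nearest observed neighbours.
    Gaps at either edge copy the nearest observed value (assumed rule)."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled

def smote_oversample(minority: List[Point], n_new: int, k: int,
                     rng: random.Random) -> List[Point]:
    """SMOTE-style synthesis: pick a minority point, pick one of its k
    nearest minority neighbours, and sample a point on the segment between."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic

def near_miss_undersample(majority: List[Point], minority: List[Point],
                          n_keep: int, k: int) -> List[Point]:
    """NearMiss-1 style selection: keep the majority points whose mean
    distance to their k nearest minority points is smallest."""
    def score(p: Point) -> float:
        nearest = sorted(math.dist(p, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=score)[:n_keep]

def hybrid_resample(majority: List[Point], minority: List[Point],
                    target: int, k: int = 3,
                    seed: int = 0) -> Tuple[List[Point], List[Point]]:
    """Assumed combination: grow the minority class and shrink the
    majority class until both hold `target` samples."""
    rng = random.Random(seed)
    n_new = max(0, target - len(minority))
    new_minority = minority + smote_oversample(minority, n_new, k, rng)
    kept_majority = near_miss_undersample(
        majority, minority, min(target, len(majority)), k)
    return kept_majority, new_minority
```

In practice, a library such as imbalanced-learn would likely replace the two resampling helpers, and the balanced data would then feed the feature extractor and the tree-based classifiers described in the abstract.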
Pages: 21
Related papers
50 in total
  • [41] Data Analytics and Machine Learning for Smart Process Manufacturing: Recent Advances and Perspectives in the Big Data Era
    Shang, Chao
    You, Fengqi
    ENGINEERING, 2019, 5 (06) : 1010 - 1016
  • [42] Evaluation of Online Machine Learning Algorithms for Electricity Theft Detection in Smart Grids
    Alkhresheh, Ashraf
    Al-Tarawneh, Mutaz A. B.
    Alnawayseh, Mohammad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (10) : 805 - 813
  • [43] Data Analytics and Machine Learning: Navigating the Big Data Landscape
    Sloboda, Brian W.
    INTERNATIONAL STATISTICAL REVIEW, 2024
  • [44] Big data analytics and machine learning: 2015 and beyond
    Passos, Ives Cavalcante
    Mwangi, Benson
    Kapczinski, Flavio
    LANCET PSYCHIATRY, 2016, 3 (01) : 13 - 15
  • [45] Machine learning with big data analytics for cloud security
    Mohammad, Abdul Salam
    Pradhan, Manas Ranjan
    COMPUTERS & ELECTRICAL ENGINEERING, 2021, 96
  • [46] A SURVEY OF MACHINE LEARNING ALGORITHMS FOR BIG DATA ANALYTICS
    Athmaja, S.
    Hanumanthappa, M.
    Kavitha, Vasantha
    2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017
  • [47] Advanced Machine Learning Applications in Big Data Analytics
    Li, Taiyong
    Deng, Wu
    Wu, Jiang
    ELECTRONICS, 2023, 12 (13)
  • [48] Machine learning and big data analytics in mood disorders
    Yang, Lu
    Chen, Jun
    FRONTIERS IN PSYCHIATRY, 2024, 15
  • [49] A Machine Learning and Multi-Agent Model to Automate Big Data Analytics in Smart Cities
    Sassite, Fouad
    Addou, Malika
    Barramou, Fatimazahra
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (07) : 441 - 451