A Hybrid Machine Learning-Based Framework for Data Injection Attack Detection in Smart Grids Using PCA and Stacked Autoencoders

Cited by: 0

Authors:

Tufail, Shahid [1]

Iqbal, Hasan [1]

Tariq, Mohd [1]

Sarwat, Arif I. [1]

Affiliations:

[1] Florida Int Univ, Dept Elect & Comp Engn, Miami, FL 33174 USA

Source:

IEEE Access | 2025 / Vol. 13

Keywords:

Smart grids; Principal component analysis; Accuracy; Autoencoders; Random forests; Data models; Machine learning algorithms; Dimensionality reduction; Computer security; Support vector machines; Photovoltaic (PV) systems; grid-connected PV systems; random forest; multi-layer perceptron (MLP); principal component analysis (PCA); INTRUSION DETECTION; CYBER-SECURITY

DOI:

10.1109/ACCESS.2025.3543751

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Cyberattacks, especially data injection attacks, are becoming more common as smart grids become increasingly interconnected. Detecting them requires accurate, unbiased, high-quality data for model training, yet most data collected from the real world is sparse, incomplete, inconsistent, and skewed. To address these issues, this study proposes a framework for detecting such attacks. A stacked autoencoder architecture was used to generate synthetic instances of the minority class. The generated instances address the class imbalance in the data, which improves the generalizability of the model and covers diverse attack scenarios. Various machine learning algorithms were evaluated, and the Random Forest (RF) model consistently achieved superior accuracy, ranging from 95.89% to 99.32%. In particular, traditional algorithms such as Logistic Regression (LR) were sensitive to dimensionality reduction, with a 16.96% accuracy drop when the number of principal components was reduced from all to 10. In contrast, RF was resilient, with only a 1.67% mean accuracy drop under similar conditions. Both RF and XGBoost (XGB) emerged as standout models, with high accuracy and robust performance even under dimensionality reduction via principal component analysis (PCA). However, reducing the number of PCA components from 10 to 5 decreased performance in all models. The Support Vector Machine (SVM) classifier showed the largest accuracy drop, 14.21%. This study shows the importance of understanding algorithmic behavior and data features and how they affect the performance of ML models. The analysis can help strengthen cybersecurity in smart grids and highlights the critical need for careful feature selection and tuning, particularly for models sensitive to dimensionality reduction.
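The comparison described in the abstract (the same classifiers evaluated with all principal components, then 10, then 5) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data is a synthetic imbalanced set from `make_classification`, the hyperparameters are library defaults or arbitrary choices, and only LR and RF are included. The stacked-autoencoder oversampling step, SVM, and XGBoost are omitted, so the accuracies printed will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in data (assumption: NOT the paper's dataset).
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=15, n_redundant=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def evaluate(n_components):
    """Return test accuracy per model for a given number of principal components.

    n_components=None keeps every component (no reduction).
    """
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        # Scale, project onto the leading principal components, then classify.
        pipe = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components, svd_solver="full"),
            model,
        )
        pipe.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, pipe.predict(X_test))
    return scores


# Evaluate with all components, then 10, then 5.
results = {k: evaluate(k) for k in (None, 10, 5)}
for k, scores in results.items():
    label = "all" if k is None else k
    print(label, {name: round(acc, 4) for name, acc in scores.items()})
```

Fitting the scaler and PCA inside the pipeline means both are estimated on the training split only, so the test split does not leak into the projection. On imbalanced data, accuracy alone can look high for a model that mostly predicts the majority class, so a fuller comparison would also report per-class recall or F1.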


Pages: 33783-33798

Page count: 16
