Adaptive Stochastic Gradient Descent (SGD) for erratic datasets

Cited by: 2
Authors
Dagal, Idriss [1 ]
Tanrioven, Kursat [1 ]
Nayir, Ahmet [1 ]
Akin, Burak [2 ]
Affiliations
[1] Istanbul Beykent Univ, Elect Engn, Hadim Koruyolu Caddesi 19, TR-34450 Istanbul, Turkiye
[2] Yildiz Tech Univ, Elect Engn, Davutpasa Caddesi, TR-34220 Istanbul, Turkiye
Keywords
Gradient descent; Stochastic Gradient Descent; Accuracy; Principal Component Analysis; Quasi-Newton method; Neural networks; Algorithm; MLP
DOI
10.1016/j.future.2024.107682
CLC number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Stochastic Gradient Descent (SGD) is a highly efficient optimization algorithm, particularly well suited to large datasets because it updates parameters incrementally. In this study, we apply SGD to a simple linear classifier trained with logistic regression, a widely used method for binary classification. Unlike traditional batch Gradient Descent (GD), which processes the entire dataset at once, SGD offers better scalability and performance on streaming and large-scale data. Our experiments show that SGD outperforms GD across multiple performance metrics, achieving 45.83% accuracy versus GD's 41.67%, and higher precision (60% vs. 45.45%), recall (100% vs. 60%), and F1-score (100% vs. 62%). Additionally, SGD achieves a Principal Component Analysis (PCA) accuracy of 99.99%, slightly surpassing GD's 99.92%. These results highlight SGD's efficiency and flexibility in large-scale data environments, driven by its ability to balance precision and recall. To further improve robustness, the proposed method combines adaptive learning rates, momentum, and logistic regression, addressing drawbacks of traditional GD. These modifications improve stability, convergence behavior, and applicability to complex, large-scale optimization tasks where standard GD often struggles, making SGD a highly effective option for challenging data-driven scenarios.
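The abstract outlines SGD with momentum and an adaptive (decaying) learning rate applied to a logistic-regression classifier. The paper itself specifies the actual algorithm and hyperparameters; the Python sketch below only illustrates that general recipe on synthetic data, and the function name and hyperparameter values (lr0, decay, beta) are placeholders chosen for illustration, not taken from the paper.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_momentum_sgd(X, y, epochs=20, lr0=0.1, decay=0.01, beta=0.9, seed=0):
    """Illustrative SGD for logistic regression with momentum and a
    decaying learning rate. Hyperparameters are placeholders, not the
    settings reported in the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    vw, vb = np.zeros(d), 0.0              # momentum buffers
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):       # incremental, one sample per update
            t += 1
            lr = lr0 / (1.0 + decay * t)   # simple decaying learning-rate schedule
            p = sigmoid(X[i] @ w + b)      # predicted probability for sample i
            gw = (p - y[i]) * X[i]         # log-loss gradient w.r.t. weights
            gb = p - y[i]                  # log-loss gradient w.r.t. bias
            vw = beta * vw + (1 - beta) * gw
            vb = beta * vb + (1 - beta) * gb
            w -= lr * vw
            b -= lr * vb
    return w, b

# Example usage on synthetic binary data
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w, b = adaptive_momentum_sgd(X, y)
print("training accuracy:", np.mean((sigmoid(X @ w + b) > 0.5) == y))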
Pages: 13
Related papers
50 records in total
  • [1] Stochastic Runge-Kutta methods and adaptive SGD-G2 stochastic gradient descent
    Ayadi, Imen
    Turinici, Gabriel
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8220 - 8227
  • [2] Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)
    Qi Qian
    Rong Jin
    Jinfeng Yi
    Lijun Zhang
    Shenghuo Zhu
    Machine Learning, 2015, 99 : 353 - 372
  • [3] Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)
    Qian, Qi
    Jin, Rong
    Yi, Jinfeng
    Zhang, Lijun
    Zhu, Shenghuo
    MACHINE LEARNING, 2015, 99 (03) : 353 - 372
  • [4] SW-SGD: The Sliding Window Stochastic Gradient Descent Algorithm
    Chakroun, Imen
    Haber, Tom
    Ashby, Thomas J.
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017), 2017, 108 : 2318 - 2322
  • [5] AG-SGD: Angle-Based Stochastic Gradient Descent
    Song, Chongya
    Pons, Alexander
    Yen, Kang
    IEEE ACCESS, 2021, 9 : 23007 - 23024
  • [6] Guided Stochastic Gradient Descent Algorithm for inconsistent datasets
    Sharma, Anuraganand
    APPLIED SOFT COMPUTING, 2018, 73 : 1068 - 1080
  • [7] Asynchronous Stochastic Gradient Descent Over Decentralized Datasets
    Du, Yubo
    You, Keyou
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2021, 8 (03): : 1212 - 1224
  • [8] Asynchronous Stochastic Gradient Descent over Decentralized Datasets
    Du, Yubo
    You, Keyou
    Mo, Yilin
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA), 2020, : 216 - 221
  • [9] CP-SGD: Distributed stochastic gradient descent with compression and periodic compensation
    Yu, Enda
    Dong, Dezun
    Xu, Yemao
    Ouyang, Shuo
    Liao, Xiangke
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2022, 169 : 42 - 57
  • [10] CuMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs
    Xie, Xiaolong
    Tan, Wei
    Fong, Liana L.
    Liang, Yun
    HPDC'17: PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING, 2017, : 79 - 92