Enhancing Software Defect Prediction accuracy using Modified Entropy Calculation in Random Forest Algorithm

Cited by: 0
Authors
Suryawanshi, Ranjeetsingh [1]
Kadam, Amol [1]
Affiliations
[1] Bharati Vidyapeeth Deemed Be Univ, Coll Engn, Pune, India
Keywords
Random forest; decision tree; classification; prediction; entropy; Taylor series; NETWORKS;
DOI
10.52783/jes.754
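Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809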
Abstract
Imagine you are trying to classify software defects in a large dataset. How do you choose the best algorithm for the task? Candidates include Random Forest, Support Vector Machine, Neural Networks, Naive Bayes, K-Nearest Neighbours, Decision Tree, and Logistic Regression. One of the most widely used is the Random Forest algorithm, which combines the predictions of multiple Decision Trees. However, this algorithm relies on an entropy calculation, which measures the uncertainty in the data. The entropy function uses the natural logarithm, which can be time-consuming to compute. Is there a better way to calculate entropy? In this research, we explored a different way to calculate the natural logarithm: the Taylor series expansion, an infinite sum of terms that approximates a function using its derivatives. We modified the Random Forest algorithm by replacing the natural logarithm in the entropy formula with its Taylor series expression. We tested the modified algorithm on a dataset and compared its performance with that of the original entropy formula. We found that the modification improved the accuracy of the algorithm on software defect prediction.
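The idea of swapping the natural logarithm in the entropy formula for a truncated Taylor series can be sketched as follows. The abstract does not say which expansion point or how many terms the authors use, so this sketch assumes the standard expansion about 1, ln(p) = -sum over k >= 1 of (1 - p)^k / k, valid for 0 < p <= 1, with a hypothetical `terms` parameter controlling the truncation. The function names `taylor_ln`, `entropy`, and `class_probabilities` are illustrative and not taken from the paper.

```python
import math


def taylor_ln(p, terms=10):
    """Approximate ln(p) for 0 < p <= 1 with a truncated Taylor series about 1.

    Assumed expansion (not confirmed by the abstract):
        ln(p) = -sum_{k=1}^{terms} (1 - p)**k / k
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    x = 1.0 - p
    power = 1.0
    total = 0.0
    for k in range(1, terms + 1):
        power *= x          # power now holds (1 - p)**k
        total += power / k
    return -total


def entropy(probs, log_fn=math.log):
    """Entropy in nats; classes with zero probability contribute nothing."""
    return -sum(p * log_fn(p) for p in probs if p > 0.0)


def class_probabilities(labels):
    """Relative frequency of each distinct label in a node."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n = len(labels)
    return [count / n for count in counts.values()]


if __name__ == "__main__":
    # Toy node: 1 = defective module, 0 = clean module (made-up labels).
    node_labels = [1, 0, 0, 1, 0, 0, 0, 1]
    probs = class_probabilities(node_labels)
    print("exact entropy: ", entropy(probs))
    print("taylor entropy:", entropy(probs, log_fn=taylor_ln))
```

Because every term of the series is non-negative on (0, 1], truncating it underestimates -ln(p), so the approximated entropy never exceeds the exact value. The series converges slowly for probabilities close to 0, so the number of terms trades accuracy against cost. Whether this changes split selection, speed, or prediction accuracy in a full Random Forest is an empirical question that this sketch does not address.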
Pages: 84-91
Page count: 8