HDL-ODPRs: A Hybrid Deep Learning Technique Based Optimal Duplication Detection for Pull-Requests in Open-Source Repositories

Authors
Alotaibi, Saud S. [1]
Affiliations
[1] Umm Al Qura Univ, Coll Comp & Informat Syst, Dept Informat Syst, Mecca 24382, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 24
Keywords
duplicate pull requests; deep learning; textual extraction; similarity computation; duplicate detection; PROJECT
DOI
10.3390/app122412594
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Recently, open-source repositories have grown rapidly due to volunteer contributions worldwide. Collaboration software platforms have gained popularity as thousands of external contributors have contributed to open-source repositories. Although data de-duplication decreases the size of backup workloads, it causes poor data locality (fragmentation) and redundant review time and effort. Deep learning and machine learning techniques have recently been applied to identify complex bugs and duplicate issue reports. It is difficult to use, but it increases the risk of developers submitting duplicate pull requests, resulting in additional maintenance costs. In this work, we propose a hybrid deep learning technique for optimal duplication detection of pull requests (HDL-ODPRs) in open-source repositories. Textual data are extracted from pull requests with the hybrid leader-based optimization (HLBO) algorithm, which increases the accuracy of duplicate detection. Following that, we compute the similarities between pull requests with the multiobjective alpine skiing optimization (MASO) algorithm, which provides textual, file-change, and code-change similarities. For pull request duplicate detection, a hybrid deep learning technique (named GAN-GS) is introduced, in which the global search (GS) algorithm is used to optimize the design metrics of the generative adversarial network (GAN). The proposed HDL-ODPRs model is validated against public standard benchmark datasets, such as DupPR-basic and DupPR-complementary. According to the simulation results, the proposed HDL-ODPRs model achieves promising results in comparison with existing state-of-the-art models.
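The three similarity views named in the abstract (textual, file-change, and code-change) can be illustrated with a minimal sketch. This is not the MASO algorithm and not the author's method: it swaps in plain Jaccard overlap as a stand-in, and the dictionary keys (`title`, `description`, `files`, `diff`) and function names are hypothetical, chosen only for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pr_similarity(pr_a, pr_b):
    """Compare two pull requests along three views.

    Assumption: each pull request is a dict with the hypothetical keys
    'title', 'description', 'files' (list of paths) and 'diff' (patch text).
    Jaccard overlap is an illustrative stand-in, not the paper's MASO step.
    """
    text_a = tokens(pr_a["title"] + " " + pr_a["description"])
    text_b = tokens(pr_b["title"] + " " + pr_b["description"])
    return {
        "textual": jaccard(text_a, text_b),
        "file_change": jaccard(pr_a["files"], pr_b["files"]),
        "code_change": jaccard(tokens(pr_a["diff"]), tokens(pr_b["diff"])),
    }
```

A downstream classifier (GAN-GS in the paper) would take such a feature vector as input; a pair scoring high on all three views is a likely duplicate candidate.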
Pages: 17