A Generalize Hardware Debugging Approach for Large Language Models Semi-Synthetic, Datasets

Cited by: 0
Authors
Fu, Weimin [1]
Li, Shijie [2]
Zhao, Yifang [2]
Yang, Kaichen [3]
Zhang, Xuan [4]
Jin, Yier [2]
Guo, Xiaolong [1]
Affiliations
[1] Kansas State Univ, Mike Wiegers Dept Elect & Comp Engn, Manhattan, KS 66506 USA
[2] Univ Sci & Technol China, Sch Cyber Sci & Technol, Hefei 230026, Anhui, Peoples R China
[3] Michigan Technol Univ, Dept Elect & Comp Engn, Houghton, MI 49931 USA
[4] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
Funding
U.S. National Science Foundation
Keywords
Hardware; Codes; Training; Software; Large language models; Chatbots; Debugging; Synthetic data; Open source hardware; Computer bugs; Large language model; artificial intelligence; hardware debug; version control; electronic design automation; ENERGY;
DOI
10.1109/TCSI.2024.3487486
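Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract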
Large Language Models (LLMs) have precipitated emerging trends towards intelligent automation. However, integrating LLMs into the hardware debug domain encounters challenges: hardware datasets for LLMs are often plagued by a dual dilemma of scarcity and subpar quality. Traditional hardware debug approaches that rely on experienced labor to generate detailed prompts are not cheaply scalable. Similarly, strategies that depend on existing LLMs and randomly generated prompts fail to achieve sufficient reliability. We propose a directed, semi-synthetic data synthesis method that leverages version control information and journalistic event descriptions. To produce high-quality data, this approach combines version control data from hardware projects with the 5W1H (Who, What, When, Where, Why, How) journalistic principles. It allows dataset volume to scale linearly without depending on skilled labor. We have implemented this method on a collected dataset of open-source hardware designs and fine-tuned fifteen general-purpose LLMs for hardware debugging tasks, thereby validating the efficacy of our approach.
Pages: 623-636
Page count: 14