Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

Authors
Flynn, Patrick [1, 2]
Vanderbruggen, Tristan [1]
Liao, Chunhua [1]
Lin, Pei-Hung [1]
Emani, Murali [3]
Shen, Xipeng [4]
Affiliations
[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
[2] Univ North Carolina Charlotte, Charlotte, NC 28223 USA
[3] Argonne Natl Lab, Lemont, IL 60439 USA
[4] North Carolina State Univ, Raleigh, NC 27695 USA
Keywords
reusable datasets; reusable machine learning; programming language processing; interoperable pipelines;
DOI
10.1007/978-3-031-36889-9_27
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Programming Language Processing (PLP) using machine learning has improved greatly in the past few years, and a growing number of people are interested in exploring this promising field. However, it is challenging for new researchers and developers to find the right components to construct their own machine learning pipelines, given the diverse PLP tasks to be solved, the large number of datasets and models being released, and the complex compilers and tools involved. To improve the findability, accessibility, interoperability, and reusability (FAIRness) of machine learning components, we collect and analyze a set of representative papers in the domain of machine learning-based PLP. We then identify and characterize key concepts, including PLP tasks, model architectures, and supportive tools. Finally, we show example use cases that leverage the reusable components to construct machine learning pipelines for a set of PLP tasks.
Pages: 402-417
Page count: 16