Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines

Authors
Flynn, Patrick [1, 2]
Vanderbruggen, Tristan [1 ]
Liao, Chunhua [1 ]
Lin, Pei-Hung [1 ]
Emani, Murali [3 ]
Shen, Xipeng [4 ]
Affiliations
[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
[2] Univ North Carolina Charlotte, Charlotte, NC 28223 USA
[3] Argonne Natl Lab, Lemont, IL 60439 USA
[4] North Carolina State Univ, Raleigh, NC 27695 USA
Keywords
reusable datasets; reusable machine learning; programming language processing; interoperable pipelines;
DOI
10.1007/978-3-031-36889-9_27
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Programming Language Processing (PLP) using machine learning has made vast improvements in the past few years, and a growing number of people are interested in exploring this promising field. However, it is challenging for new researchers and developers to find the right components to construct their own machine learning pipelines, given the diverse PLP tasks to be solved, the large number of datasets and models being released, and the complex compilers and tools involved. To improve the findability, accessibility, interoperability and reusability (FAIRness) of machine learning components, we collect and analyze a set of representative papers in the domain of machine learning-based PLP. We then identify and characterize key concepts, including PLP tasks, model architectures and supportive tools. Finally, we show example use cases that leverage the reusable components to construct machine learning pipelines for a set of PLP tasks.
Pages: 402-417
Number of pages: 16