Towards a Large-Scale Empirical Study of Python Static Type Annotations

Cited by: 0
Authors
Lin, Xinrong [1 ]
Hua, Baojian [1 ]
Wang, Yang [1 ]
Pan, Zhizhong [1 ]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Empirical Study; Python; Static Type Annotations; BUG-DENSITY; IMPACT;
DOI
10.1109/SANER56733.2023.00046
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Python, as one of the most popular and important programming languages in the era of data science, has recently introduced a syntax for static type annotations with PEP 484, to improve code maintainability, quality, and readability. However, it is still unknown whether and how static type annotations are used in practical Python projects. This paper presents, to the best of our knowledge, the first and most comprehensive empirical study on the defects, evolution and rectification of static type annotations in Python projects. We first designed and implemented a software prototype dubbed PYSCAN, then used it to scan notable Python projects with diverse domains and sizes and type annotation manners, which add up to 19,478,428 lines of Python code. The empirical results provide interesting findings and insights, such as: 1) we proposed a taxonomy of Python type annotation-related defects, by classifying defects into four categories; 2) we investigated the evolution of type annotation-related defects; and 3) we proposed automatic defect rectification strategies, generating rectification suggestions for 82 out of 110 (74.55%) defects successfully. We suggest that: 1) Python language designers should clarify the type annotation specification; 2) checking tool builders should improve their tools to suppress false positives; and 3) Python developers should integrate such checking tools into their development workflow to catch type annotation-related defects at an early development stage. We have reported our findings and suggestions to Python language designers, checking tool builders, and Python developers. They have acknowledged us and taken actions based on our suggestions. We believe these guidelines would improve static type annotation practices and benefit the Python ecosystem in general.
Pages: 414-425
Page count: 12
Related papers
  • [31] An Empirical Study for Common Language Features Used in Python Projects
    Peng, Yun
    Zhang, Yu
    Hu, Mingzhe
    2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021), 2021, : 24 - 35
  • [32] An Empirical Study on the Impact of Python Dynamic Typing on the Project Maintenance
    Xia, Xinmeng
    Yan, Yanyan
    He, Xincheng
    Wu, Di
    Xu, Lei
    Xu, Baowen
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2022, 32 (05) : 745 - 768
  • [33] An Empirical Study on Dynamic Typing Related Practices in Python Systems
    Chen, Zhifei
    Li, Yanhui
    Chen, Bihuan
    Ma, Wanwangying
    Chen, Lin
    Xu, Baowen
    2020 IEEE/ACM 28TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC, 2020, : 83 - 93
  • [34] An empirical study of the Python/C API on evolution and bug patterns
    Hu, Mingzhe
    Zhang, Yu
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2023, 35 (02)
  • [35] Compressed Python likelihood for large scale temperature and polarization from Planck
    Prince, Heather
    Dunkley, Jo
    PHYSICAL REVIEW D, 2022, 105 (02)
  • [36] A Large-Scale Comparison of Python Code in Jupyter Notebooks and Scripts
    Grotov, Konstantin
    Titov, Sergey
    Sotnikov, Vladimir
    Golubev, Yaroslav
    Bryksin, Timofey
    2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022), 2022, : 353 - 364
  • [37] Structural Observability Analysis of Large Scale Systems Using Modelica and Python
    Anushka, M.
    Perera, S.
    Lie, Bernt
    Pfeiffer, Carlos Fernando
    MODELING IDENTIFICATION AND CONTROL, 2015, 36 (01) : 53 - 65
  • [38] GenomeDiagram: a python package for the visualization of large-scale genomic data
    Pritchard, L
    White, JA
    Birch, PRJ
    Toth, IK
    BIOINFORMATICS, 2006, 22 (05) : 616 - 617
  • [39] Efficient Graph Analytics in Python for Large-Scale Data Science
    Zhou, Xiantian
    Ordonez, Carlos
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY (DAWAK 2021), 2021, 12925 : 158 - 164
  • [40] BioNet: A Python interface to NEURON for modeling large-scale networks
    Gratiy, Sergey L.
    Billeh, Yazan N.
    Dai, Kael
    Mitelut, Catalin
    Feng, David
    Gouwens, Nathan W.
    Cain, Nicholas
    Koch, Christof
    Anastassiou, Costas A.
    Arkhipov, Anton
    PLOS ONE, 2018, 13 (08):