SYNTHETIC DATA AND THE FUTURE OF AI

Author
Lee, Peter [1, 2]
Affiliations
[1] Univ Calif Davis, Sch Law, Ctr Innovat Law & Soc, Law, Davis, CA 95616 USA
[2] Univ Calif Davis, Sch Law, Ctr Innovat Law & Soc, Davis, CA 95616 USA
Keywords
INTELLECTUAL PROPERTY; TRADE SECRETS; INNOVATION; COPYRIGHT; INDUSTRY; HEALTH; RIGHTS; BIAS; FIRM; LAW;
DOI
Not available
Chinese Library Classification number
D9 [Law]; DF [Law]
Discipline classification code
0301
Abstract
The future of artificial intelligence (AI) is synthetic. Several of the most prominent technical and legal challenges of AI derive from the need to amass huge amounts of real-world data to train machine learning (ML) models. Collecting such real-world data can be highly difficult and can threaten privacy, introduce bias in automated decision making, and infringe copyrights on a massive scale. This Article explores the emergence of a seemingly paradoxical technical creation that can mitigate, though not completely eliminate, these concerns: synthetic data. Increasingly, data scientists are using simulated driving environments, fabricated medical records, fake images, and other forms of synthetic data to train ML models. Artificial data, in other words, is training artificial intelligence. Synthetic data offers a host of technical and legal benefits; it promises to radically decrease the cost of obtaining data, sidestep privacy issues, reduce automated discrimination, and avoid copyright infringement. Alongside such promises, however, synthetic data offers perils as well. Deficiencies in the development and deployment of synthetic data can exacerbate the dangers of AI and cause significant social harm. In light of the enormous value and importance of synthetic data, this Article sketches the contours of an innovation ecosystem to promote its robust and responsible development. It identifies three objectives that should guide legal and policy measures shaping the creation of synthetic data: provisioning, disclosure, and democratization. Ideally, such an ecosystem should incentivize the generation of high-quality synthetic data, encourage disclosure of both synthetic data and processes for generating it, and promote multiple sources of innovation. This Article then examines a suite of "innovation mechanisms" that can advance these objectives, ranging from open source production to proprietary approaches based on patents, trade secrets, and copyrights. Throughout, it suggests policy and doctrinal reforms to enhance innovation, transparency, and democratic access to synthetic data. Just as AI will have enormous implications for law, legal regimes can play a central role in shaping the future of AI.
Pages: 1 / 74
Page count: 74