Negative Data in Data Sets for Machine Learning Training

Cited by: 16
Authors
Maloney, Michael P.
Coley, Connor W.
Genheden, Samuel
Carson, Nessa
Helquist, Paul
Norrby, Per-Ola
Wiest, Olaf
Affiliations
[1] Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN, 46556, United States
[2] Department of Chemical Engineering and Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, United States
[3] Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Pepparedsleden 1, Mölndal, SE-431 83, Sweden
[4] Early Chemical Development, Pharmaceutical Sciences, R&D, AstraZeneca, Macclesfield, SK10 2NA, United Kingdom
[5] Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca, Gothenburg, Pepparedsleden 1, Mölndal, SE-431 83, Sweden
Source
JOURNAL OF ORGANIC CHEMISTRY | 2023 / Vol. 88 / Issue 09
DOI
10.1021/acs.joc.3c00844
Chinese Library Classification
O62 [Organic Chemistry]
Discipline classification code
070303; 081704
Pages: 5239 - 5241
Page count: 3
Related papers
50 in total
  • [31] Perceptions of Data Set Experts on Important Characteristics of Health Data Sets Ready for Machine Learning
    Ng, Madelena Y.
    Youssef, Alaa
    Miner, Adam S.
    Sarellano, Daniela
    Long, Jin
    Larson, David B.
    Hernandez-Boussard, Tina
    Langlotz, Curtis P.
    JAMA NETWORK OPEN, 2023, 6 (12) : E2345892
  • [32] Machine Learning with Digital Generators for Training Sets Including Proteins Modeling in the Context of Big Data and Blockchain Technologies
    Abramov, V.
    Fokicheva, A.
    Istomin, E.
    Sokolov, A.
    Goloskvskaya, E.
    Levina, A.
    EDUCATION EXCELLENCE AND INNOVATION MANAGEMENT THROUGH VISION 2020, 2019, : 8638 - 8642
  • [33] Improved learning from data competitions through strategic design of training and test data sets
    Anderson-Cook, Christine M.
    Lu, Lu
    Myers, Kary L.
    Quinlan, Kevin R.
    Pawley, Norma
    QUALITY ENGINEERING, 2019, 31 (04) : 564 - 580
  • [34] Data Driven Prognostics with Lack of Training Data Sets
    Xi, Zhimin
    Zhao, Xiangxue
    INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, 2015, VOL 2A, 2016
  • [35] Molecular quantum chemical data sets and databases for machine learning potentials
    Ullah, Arif
    Chen, Yuxinxin
    Dral, Pavlo O.
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2024, 5 (04):
  • [36] Advantages of Synthetic Noise and Machine Learning for Analyzing Radioecological Data Sets
    Shuryak, Igor
    PLOS ONE, 2017, 12 (01):
  • [37] Entropy-based matrix learning machine for imbalanced data sets
    Zhu, Changming
    Wang, Zhe
    PATTERN RECOGNITION LETTERS, 2017, 88 : 72 - 80
  • [38] Probabilistic Random Forest: A Machine Learning Algorithm for Noisy Data Sets
    Reis, Itamar
    Baron, Dalya
    Shahaf, Sahar
    ASTRONOMICAL JOURNAL, 2019, 157 (01):
  • [39] Fuzzy Sets in Data Analysis: From Statistical Foundations to Machine Learning
    Couso, Ina
    Borgelt, Christian
    Huellermeier, Eyke
    Kruse, Rudolf
    IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2019, 14 (01) : 31 - 44
  • [40] Machine Learning in Genomic Medicine: A Review of Computational Problems and Data Sets
    Leung, Michael K. K.
    Delong, Andrew
    Alipanahi, Babak
    Frey, Brendan J.
    PROCEEDINGS OF THE IEEE, 2016, 104 (01) : 176 - 197