The prediction of crystal densities of a big data set using 1D and 2D structure features

Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; NITRATE ESTERS; IONIC LIQUIDS; QSPR; ENTHALPIES; VAPORIZATION; EXPLOSIVES; NITRAMINES; SURFACE; HEAT;
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
A large data set of more than 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program with a function that ensures each atom (or bond) of a molecule is represented by Benson's groups exactly once for atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models for predicting the density. Variable selection of the descriptors was then performed using the RF variable importance. For PLS, combining the models built from atom-based and bond-based descriptors gave the best results in this paper: for cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, mean absolute error (MAE) = 0.0270 g·cm⁻³, and root mean squared error (RMSE) = 0.0426 g·cm⁻³; for prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
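The three statistics quoted in the abstract (R, MAE, RMSE) follow standard definitions. The sketch below shows one way to compute them for paired observed and predicted densities. It is only an illustration: the function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper or its in-house program.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R, MAE, RMSE) for paired observed and predicted values.

    R is the Pearson correlation coefficient, MAE the mean absolute
    error, and RMSE the root mean squared error.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Sums of products of deviations from the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("Pearson R is undefined when either sequence is constant")

    r = cov / math.sqrt(var_obs * var_pred)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return r, mae, rmse


# Made-up densities in g·cm^-3, for illustration only (not data from the paper).
observed = [1.20, 1.35, 1.50, 1.65]
predicted = [1.22, 1.33, 1.53, 1.61]
r, mae, rmse = regression_metrics(observed, predicted)
print(f"R = {r:.4f}, MAE = {mae:.4f}, RMSE = {rmse:.4f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same data, which matches the pattern in the reported results (for example, 0.0375 versus 0.0263 g·cm⁻³ on the test set).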
Pages: 1375-1385
Page count: 11