The prediction of crystal densities of a big data set using 1D and 2D structure features

Times cited: 0
Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; NITRATE ESTERS; IONIC LIQUIDS; QSPR; ENTHALPIES; VAPORIZATION; EXPLOSIVES; NITRAMINES; SURFACE; HEAT;
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification
O6 [Chemistry]
Subject classification code
0703
Abstract
A large data set of more than 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program that includes a function to ensure that each atom (or bond) of a molecule is represented by Benson's groups once for atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models to predict the density. Further, variable selection of the descriptors was performed using the RF variable importance. For PLS, the combination of the models constructed from the atom-based and bond-based descriptors achieved the best results in this paper: for cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, mean absolute error (MAE) = 0.0270 g·cm⁻³, and root mean squared error (RMSE) = 0.0426 g·cm⁻³; for prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
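The three statistics reported in the abstract (R, MAE, RMSE) can be computed as in the minimal sketch below. It uses the standard textbook definitions and is not the authors' code. The function names and the four density values are hypothetical, made up only to show the calculation.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical densities in g·cm^-3 (illustrative only, not data from the paper).
observed = [1.20, 1.35, 1.50, 1.65]
predicted = [1.22, 1.33, 1.53, 1.62]

print(f"R    = {pearson_r(observed, predicted):.4f}")
print(f"MAE  = {mae(observed, predicted):.4f} g·cm^-3")
print(f"RMSE = {rmse(observed, predicted):.4f} g·cm^-3")
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few compounds have large errors. This is consistent with the RMSE values in the abstract being larger than the MAE values.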
Pages: 1375-1385
Page count: 11
Source: Struct Chem, 5 (1375-1385)