The prediction of crystal densities of a big data set using 1D and 2D structure features

Authors
Li, Xianlan [1]
Kong, Dingling [1]
Luan, Yue [1]
Guo, Lili [1]
Lu, Yanhua [2]
Li, Wei [2]
Tang, Meng [3]
Zhang, Qingyou [1]
Pang, Aimin [2]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Ind Circulating Water Treatment, Henan Joint Int Res Lab Environm Pollut Control Ma, Kaifeng 475004, Peoples R China
[2] Hubei Inst Aerosp Chemotechnol, Sci & Technol Aerosp Chem Power Lab, Xiangyang 441003, Hubei, Peoples R China
[3] Harbin Inst Technol, Sch Phys, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density; Quantitative structure-property relationships; Big data set; Partial least squares; Random forest; NITRATE ESTERS; IONIC LIQUIDS; QSPR; ENTHALPIES; VAPORIZATION; EXPLOSIVES; NITRAMINES; SURFACE; HEAT;
DOI
10.1007/s11224-024-02279-4
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
A large data set of more than 30,000 organic compounds containing carbon, nitrogen, oxygen, fluorine, and hydrogen was collected, and the density of each compound was predicted from 1D descriptors derived from its molecular formula and 2D descriptors derived from its constitutional structural features. The 2D structural features comprise Benson's groups, corrected groups, and 2D structural features of the whole molecular structure. All descriptors were extracted by an in-house Java program, which includes a function to ensure that each atom (or bond) of a molecule is represented by Benson's groups exactly once for atom-based (or bond-based) descriptors. Partial least squares (PLS) and random forest (RF) methods were used separately to build models to predict the density. Variable selection of the descriptors was then performed using the RF variable importance. For PLS, the combination of the models built from atom-based and bond-based descriptors achieved the best results in this paper: for cross-validation of the training set, the Pearson correlation coefficient (R) = 0.9270, mean absolute error (MAE) = 0.0270 g·cm⁻³, and root mean squared error (RMSE) = 0.0426 g·cm⁻³; for prediction of the test set, R = 0.9454, MAE = 0.0263 g·cm⁻³, and RMSE = 0.0375 g·cm⁻³.
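The three statistics reported in the abstract (Pearson R, MAE, RMSE) follow standard definitions, which can be sketched as below. This is a minimal illustration and not the authors' code: the function name `regression_metrics` and the sample density values are hypothetical, and the paper's own implementation (written in Java for descriptor extraction) is not described in this record.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (Pearson R, MAE, RMSE) for two equal-length sequences.

    Hypothetical helper for illustration; uses the textbook definitions.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Pearson R: covariance divided by the product of standard deviations
    # (the 1/n factors cancel, so raw sums are used).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("Pearson R is undefined for constant input")
    r = cov / math.sqrt(var_t * var_p)

    # MAE: mean of absolute residuals; RMSE: root of mean squared residuals.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return r, mae, rmse


if __name__ == "__main__":
    # Made-up densities in g/cm^3, for demonstration only.
    observed = [1.21, 1.45, 1.80, 1.33]
    predicted = [1.25, 1.40, 1.76, 1.36]
    r, mae, rmse = regression_metrics(observed, predicted)
    print(f"R={r:.4f}  MAE={mae:.4f}  RMSE={rmse:.4f}")
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE on the same data, which is consistent with the values reported for both the training and test sets.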
Pages: 1375-1385
Number of pages: 11