Using Autonomous Outlier Detection Methods for Thermophysical Property Data

Authors
Schnorr, Andrea [1]
Kaldi, Daniel Johannes [1]
Staubach, Jens [2]
Garth, Christoph [1]
Stephan, Simon [2]
Affiliations
[1] RPTU Kaiserslautern, Sci Visualizat Lab, D-67663 Kaiserslautern, Germany
[2] RPTU Kaiserslautern, Lab Engn Thermodynam LTD, D-67663 Kaiserslautern, Germany
Source
Keywords
EQUATION-OF-STATE; VAPOR-LIQUID-EQUILIBRIA; LENNARD-JONES MIXTURES; MONTE-CARLO SIMULATIONS; XML-BASED APPROACH; THERMODYNAMIC PROPERTIES; PHASE-EQUILIBRIA; MOLECULAR SIMULATION; TRANSPORT-PROPERTIES; QUALITY ASSESSMENT;
DOI
10.1021/acs.jced.3c00588
Chinese Library Classification number
O414.1 [Thermodynamics];
Subject classification code
Abstract
The reliability and accuracy of thermophysical property data are of central importance for the development of models that describe these properties. In this work, we compare different autonomous algorithms for identifying outliers in an existing database. For this purpose, the comprehensive database of thermophysical property data for the Lennard-Jones fluid [J. Chem. Inf. Model. 2019, 59, 4248-4265] is used. We focus on homogeneous state property data at given temperature and density for the pressure p, thermal expansion coefficient α, isothermal compressibility β, thermal pressure coefficient γ, internal energy u, isochoric heat capacity c_v, isobaric heat capacity c_p, Grüneisen coefficient Γ, Joule-Thomson coefficient μ_JT, speed of sound w, chemical potential μ, (reduced) Helmholtz energy ã = a/T, and its derivatives ã_nm. A comprehensive comparison of 19 outlier detection methods is carried out, which provides insights into the applicability of generic outlier detection algorithms to thermophysical property data. Different classes of outlier detection algorithms are included in the study, namely, machine learning, distance-based, density-based, statistical, ensemble, and model-informed methods. Two approaches are used for the method evaluation: in approach (a), the original database (comprising real outliers) is used; in approach (b), synthetic outliers are introduced. The findings from both approaches are consistent. In some cases, machine learning methods perform better than the distance-based, density-based, ensemble, and statistical methods. The best performance is obtained from the model-informed method (called MoDOD). The results also provide insights into the nature of the outliers in the Lennard-Jones database.
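The evaluation idea behind approach (b) can be illustrated with a minimal sketch: known synthetic outliers are injected into clean data, a detector is run, and the flagged points are compared with the injected ones. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the smooth `reference_model` (a stand-in, not a Lennard-Jones equation of state), the noise level, the outlier magnitude, and the use of a median/MAD-based modified z-score as a simple statistical detector applied to model residuals. It is not an implementation of MoDOD or of any of the 19 methods compared.

```python
import random
import statistics


def inject_synthetic_outliers(values, fraction, scale, rng):
    """Return a perturbed copy of `values` and the set of perturbed indices."""
    n_out = max(1, round(fraction * len(values)))
    idx = set(rng.sample(range(len(values)), n_out))
    perturbed = [
        v + rng.choice((-1, 1)) * scale if i in idx else v
        for i, v in enumerate(values)
    ]
    return perturbed, idx


def robust_z_outliers(residuals, threshold=3.5):
    """Flag indices whose modified z-score (median/MAD based) exceeds threshold."""
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    if mad == 0:
        return set()
    return {
        i for i, r in enumerate(residuals)
        if 0.6745 * abs(r - med) / mad > threshold
    }


def precision_recall(flagged, truth):
    """Compare flagged indices with the known synthetic outlier indices."""
    true_positives = len(flagged & truth)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def reference_model(x):
    # Hypothetical smooth stand-in for a property model (assumption).
    return 2.0 * x + 0.5 * x * x


rng = random.Random(0)
xs = [i / 100 for i in range(200)]
# "Clean" data: model values plus small Gaussian noise (assumed noise level).
clean = [reference_model(x) + rng.gauss(0.0, 0.01) for x in xs]
data, truth = inject_synthetic_outliers(clean, fraction=0.05, scale=1.0, rng=rng)

# Score each point by its deviation from the reference model.
residuals = [y - reference_model(x) for x, y in zip(xs, data)]
flagged = robust_z_outliers(residuals)
precision, recall = precision_recall(flagged, truth)
print(f"flagged={len(flagged)} precision={precision:.2f} recall={recall:.2f}")
```

Because the injected indices are known, precision and recall can be computed directly, which is not possible in the same way for approach (a), where the real outliers in the original database are not labeled in advance.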
Pages: 864-880
Number of pages: 17