Random Forests lithology prediction method for imbalanced data sets

Cited by: 0
Authors
Wang G. [1]
Song J. [1]
Xu F. [1]
Zhang W. [2]
Liu J. [3]
Chen F. [4]
Affiliations
[1] School of Geosciences, China University of Petroleum (East China), Qingdao
[2] School of Earth and Space Sciences, University of Science and Technology of China, Hefei
[3] SINOPEC Petroleum Exploration and Production Research Institute, Beijing
[4] Research Institute of Petroleum Exploration and Development, PetroChina Tarim Oilfield Company, Korla
Keywords
Class balancing techniques; Imbalanced data sets; Lithology prediction; Machine learning; Random Forests classification
DOI
10.13810/j.cnki.issn.1000-7210.2021.04.001
Abstract
When lithology is predicted with a supervised machine learning classifier, a data set with too few samples of the target lithology and too many samples of non-target lithology biases the trained classifier toward the non-target lithology, so the target lithology is predicted poorly. To address this problem, a Random Forests lithology prediction method for imbalanced data sets is proposed. First, a lithology data set is constructed, with lithological logging data as sample labels and the seismic attributes and rock elastic parameters at the well-side seismic traces as sample features. Second, the NM-SMOTE algorithm, which integrates near miss (NM) under-sampling with the synthetic minority over-sampling technique (SMOTE), is used to balance the lithology data set. Third, a Random Forests classifier is trained on the balanced data set to learn the nonlinear relationship between lithology and the seismic attributes and elastic parameters. Finally, the seismic attributes and elastic parameters of the target exploratory area are input into the trained Random Forests classifier, which predicts lithology according to that learned relationship. Tests on actual data show that an excess of non-target lithology samples degrades the Random Forests classifier: the lithology prediction accuracy is only 38%. After the training data set is balanced with the NM-SMOTE algorithm, the lithology prediction accuracy rises to 83%, and the resulting lithology data volume is more consistent with the seismic data. © 2021, Editorial Department OIL GEOPHYSICAL PROSPECTING. All rights reserved.
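The balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how NM and SMOTE are combined, so the code below assumes one plausible arrangement: the non-target (majority) class is under-sampled with a NearMiss-1 style rule, and the target (minority) class is over-sampled with SMOTE style interpolation until both classes reach the same size. The function names, the `target` size, `k`, and the two-dimensional toy features are all hypothetical and are not taken from the paper.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def near_miss_undersample(majority, minority, n_keep, k=3):
    """NearMiss-1 style: keep the n_keep majority samples whose mean
    distance to their k nearest minority samples is smallest."""
    def score(sample):
        nearest = sorted(euclidean(sample, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)
    return sorted(majority, key=score)[:n_keep]


def smote_oversample(minority, n_new, k=3, rng=None):
    """SMOTE style: build n_new synthetic samples, each interpolated between
    a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not base),
                            key=lambda m: euclidean(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balance(majority, minority, target, k=3, seed=0):
    """Assumed combination: shrink the majority class and grow the
    minority class so that both contain `target` samples."""
    rng = random.Random(seed)
    kept = near_miss_undersample(majority, minority, target, k)
    extra = smote_oversample(minority, max(0, target - len(minority)), k, rng)
    return kept, list(minority) + extra


# Toy data: 200 non-target samples and 10 target samples, two features each.
data_rng = random.Random(42)
non_target = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
target_lith = [(data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)) for _ in range(10)]

kept_non_target, balanced_target = balance(non_target, target_lith, target=50)
print(len(kept_non_target), len(balanced_target))  # 50 50
```

In a full workflow the two balanced lists would be concatenated with their labels and passed to a Random Forests classifier for training; that step is omitted here to keep the sketch dependency-free.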
Pages: 679-687
Number of pages: 8