Sediment grain segmentation in thin-section images using dual-modal Vision Transformer

Cited by: 3
Authors
Zheng, Dongyu [1, 2, 3]
Hou, Li [4]
Hu, Xiumian [5]
Hou, Mingcai [1, 2, 3]
Dong, Kai [1]
Hu, Sihai [1]
Teng, Runlin [1]
Ma, Chao [1, 2, 3]
Affiliations
[1] Chengdu Univ Technol, State Key Lab Oil & Gas Reservoir Geol & Exploitat, Chengdu 610059, Peoples R China
[2] Chengdu Univ Technol, MNR, Key Lab Deep time Geog & Environm Reconstruct & Ap, Chengdu, Peoples R China
[3] Chengdu Univ Technol, Inst Sedimentary Geol, Chengdu, Peoples R China
[4] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Chengdu 610059, Peoples R China
[5] Nanjing Univ, Sch Earth Sci & Engn, Nanjing 210023, Peoples R China
Keywords
Thin-section images; Deep learning; Vision Transformer; Dual-modal; Semantic segmentation; Petrography; RECOGNITION; ROCKS
DOI
10.1016/j.cageo.2024.105664
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Accurately identifying grain types in thin sections of sandy sediments or sandstones is crucial for understanding their provenance, depositional environments, and potential as natural resources. Although traditional computer vision methods and machine learning algorithms have been used for automatic grain identification, recent advances in deep learning have opened up new possibilities for achieving more reliable results with less manual labor. In this study, we present Trans-SedNet, a state-of-the-art dual-modal Vision Transformer (ViT) model that uses both cross-polarized (XPL) and plane-polarized light (PPL) images to achieve semantic segmentation of thin-section images. Our model classifies a total of ten grain types, including subtypes of quartz, feldspar, and lithic fragments, to emulate the manual identification process in sedimentary petrology. To optimize performance, we use SegFormer as the model backbone and add window- and mix-attention to the encoder to identify local information in the images and to make the best use of the XPL and PPL images. We also use a combination of focal and dice loss to address class imbalance, and a smoothing procedure to reduce over-segmentation. Our comparative analysis of several deep convolutional neural networks and ViT models, including FCN, U-Net, DeepLabV3Plus, SegNeXt, and CMX, shows that Trans-SedNet outperforms the other models with a significant increase in the mIoU and mPA evaluation metrics. We also conduct an experiment to test the models' ability to handle dual-modal information, which reveals that the dual-modal models, including Trans-SedNet, achieve better results than single-modal models thanks to the extra input of PPL images. Our study demonstrates the potential of ViT models in semantic segmentation of thin-section images and highlights the importance of dual-modal models for handling complex input in various geoscience disciplines. By improving data quality and quantity, our model has the potential to enhance the efficiency and reliability of grain identification in sedimentary petrology and related fields.
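
The abstract mentions combining focal and dice loss to handle class imbalance among grain types. The sketch below illustrates how such a combination is commonly written; it is not the authors' implementation. The function names, the channel-last layout, the equal weighting (`weight=0.5`), and the focusing parameter (`gamma=2.0`) are all assumptions chosen for illustration, and NumPy stands in for whatever deep learning framework the paper actually uses.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def focal_loss(probs, onehot, gamma=2.0, eps=1e-7):
    """Mean focal loss; probs and onehot have shape (num_pixels, num_classes)."""
    # Probability assigned to the true class of each pixel.
    p_true = np.clip((probs * onehot).sum(axis=1), eps, 1.0)
    # Well-classified pixels (p_true near 1) are down-weighted by (1 - p_true)^gamma.
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true)))


def dice_loss(probs, onehot, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    intersection = (probs * onehot).sum(axis=0)
    denominator = probs.sum(axis=0) + onehot.sum(axis=0)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return float(1.0 - dice.mean())


def combined_loss(logits, labels, num_classes, weight=0.5, gamma=2.0):
    """Weighted sum of focal and dice loss (weight and gamma are assumed values).

    logits: float array of shape (..., num_classes), channel-last.
    labels: integer array of shape (...), one class index per pixel.
    """
    probs = softmax(logits.reshape(-1, num_classes))
    onehot = np.eye(num_classes)[labels.reshape(-1)]
    return (weight * focal_loss(probs, onehot, gamma)
            + (1.0 - weight) * dice_loss(probs, onehot))


if __name__ == "__main__":
    # Toy 2x2 "image" with three hypothetical grain classes.
    rng = np.random.default_rng(0)
    toy_labels = np.array([[0, 1], [2, 1]])
    toy_logits = rng.normal(size=(2, 2, 3))
    print(combined_loss(toy_logits, toy_labels, num_classes=3))
```

In this formulation the focal term concentrates the gradient on hard, misclassified pixels, while the dice term scores each class by overlap regardless of how many pixels it covers, which is why the two are often paired when some classes are rare.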
Pages: 9