Sediment grain segmentation in thin-section images using dual-modal Vision Transformer

Cited by: 3
Authors
Zheng, Dongyu [1, 2, 3]
Hou, Li [4]
Hu, Xiumian [5]
Hou, Mingcai [1, 2, 3]
Dong, Kai [1]
Hu, Sihai [1]
Teng, Runlin [1]
Ma, Chao [1, 2, 3]
Affiliations
[1] Chengdu Univ Technol, State Key Lab Oil & Gas Reservoir Geol & Exploitat, Chengdu 610059, Peoples R China
[2] Chengdu Univ Technol, MNR, Key Lab Deep time Geog & Environm Reconstruct & Ap, Chengdu, Peoples R China
[3] Chengdu Univ Technol, Inst Sedimentary Geol, Chengdu, Peoples R China
[4] Chengdu Univ Technol, Coll Comp Sci & Cyber Secur, Chengdu 610059, Peoples R China
[5] Nanjing Univ, Sch Earth Sci & Engn, Nanjing 210023, Peoples R China
Keywords
Thin-section images; Deep learning; Vision Transformer; Dual-modal; Semantic segmentation; Petrography; RECOGNITION; ROCKS
DOI
10.1016/j.cageo.2024.105664
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Accurately identifying grain types in thin sections of sandy sediments or sandstones is crucial for understanding their provenance, depositional environments, and potential as natural resources. Although traditional computer vision methods and machine learning algorithms have been used for automatic grain identification, recent advances in deep learning have opened up new possibilities for achieving more reliable results with less manual labor. In this study, we present Trans-SedNet, a state-of-the-art dual-modal Vision Transformer (ViT) model that uses both cross-polarized light (XPL) and plane-polarized light (PPL) images to achieve semantic segmentation of thin-section images. Our model classifies a total of ten grain types, including subtypes of quartz, feldspar, and lithic fragments, to emulate the manual identification process in sedimentary petrology. To optimize performance, we use SegFormer as the model backbone and add window attention and mix attention to the encoder, both to capture local information in the images and to make the best use of the XPL and PPL images. We also use a combination of focal and dice loss, together with a smoothing procedure, to address imbalances and reduce over-segmentation. Our comparative analysis of several deep convolutional neural networks and ViT models, including FCN, U-Net, DeepLabV3Plus, SegNeXt, and CMX, shows that Trans-SedNet outperforms the other models, with a significant increase in the evaluation metrics mIoU (mean intersection over union) and mPA (mean pixel accuracy). We also conduct an experiment to test the models' ability to handle dual-modal information, which reveals that the dual-modal models, including Trans-SedNet, achieve better results than single-modal models when given the extra input of PPL images. Our study demonstrates the potential of ViT models in semantic segmentation of thin-section images and highlights the importance of dual-modal models for handling complex input in various geoscience disciplines. By improving data quality and quantity, our model has the potential to enhance the efficiency and reliability of grain identification in sedimentary petrology and related subjects.
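The abstract mentions a combination of focal and dice loss but does not give its form. The following is a minimal sketch of one common way such a combination is written, not the authors' implementation. The function names, the focusing parameter `gamma`, the Dice smoothing constant `smooth`, and the weights `w_focal` and `w_dice` are all assumptions chosen for illustration; pixels are represented as flat lists so that the example needs only the Python standard library.

```python
import math


def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean focal loss over pixels.

    probs[i] is a list of per-class probabilities for pixel i,
    labels[i] is the true class index of pixel i.
    gamma (assumed value) down-weights pixels that are already well classified.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        pt = min(max(p[y], eps), 1.0)  # probability assigned to the true class
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(labels)


def dice_loss(probs, labels, num_classes, smooth=1.0):
    """One minus the soft Dice coefficient, averaged over classes.

    smooth (assumed value) keeps the ratio defined for classes
    that do not occur in this batch of pixels.
    """
    dice_sum = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for y in labels if y == c)
        dice_sum += (2.0 * intersection + smooth) / (pred_sum + true_sum + smooth)
    return 1.0 - dice_sum / num_classes


def combined_loss(probs, labels, num_classes, w_focal=1.0, w_dice=1.0):
    """Weighted sum of focal and Dice loss (the weights are hypothetical)."""
    return (w_focal * focal_loss(probs, labels)
            + w_dice * dice_loss(probs, labels, num_classes))


if __name__ == "__main__":
    # Four pixels, three classes (toy data, not from the paper).
    labels = [0, 1, 2, 0]
    confident = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1],
                 [0.1, 0.1, 0.8], [0.7, 0.2, 0.1]]
    uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in labels]
    print("confident:", combined_loss(confident, labels, 3))
    print("uniform:  ", combined_loss(uniform, labels, 3))
```

In this formulation the focal term works per pixel and concentrates the loss on hard pixels, while the Dice term works per class and is less sensitive to how many pixels each class occupies, which is why the two are often paired for imbalanced segmentation tasks.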
Pages: 9