Transformer with convolution and graph-node co-embedding: An accurate and interpretable vision backbone for predicting gene expressions from local histopathological image

Cited by: 4
Authors
Xiao, Xiao [1 ,2 ,4 ]
Kong, Yan [1 ,2 ]
Li, Ronghan [2 ,3 ]
Wang, Zuoheng [4 ]
Lu, Hui [1 ,2 ,5 ,6 ]
Affiliations
[1] Sch Life Sci & Biotechnol, State Key Lab Microbial Metab, Joint Int Res Lab Metab & Dev Sci, Shanghai Jiao Tong Univ,Dept Bioinformat & Biostat, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, AI Inst, SJTU Yale Joint Ctr Biostat & Data Sci, Natl Ctr Translat Med,MoE Key Lab Artificial Intel, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Zhiyuan Coll, Shanghai, Peoples R China
[4] Yale Univ, Yale Sch Publ Hlth, Dept Biostat, New Haven, CT USA
[5] Shanghai Engn Res Ctr Big Data Pediat Precis Med, NHC Key Lab Med Embryogenesis & Dev Mol Biol, Shanghai, Peoples R China
[6] Shanghai Engn Res Ctr Big Data Pediat Precis Med, Shanghai Key Lab Embryo & Reprod Engn, Shanghai, Peoples R China
Keywords
Deep learning; Breast cancer; Convolutional neural network; Graph neural network; Transformer; Spatial transcriptomics
DOI
10.1016/j.media.2023.103040
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Inferring gene expression from histopathological images has long been a fascinating yet challenging task, primarily due to the substantial disparities between the two modalities. Existing strategies using local or global features of histological images suffer from high model complexity, heavy GPU memory consumption, low interpretability, insufficient encoding of local features, and over-smoothed prediction of gene expression among neighboring sites. In this paper, we develop TCGN (Transformer with Convolution and Graph-Node co-embedding method) for gene expression estimation from H&E-stained pathological slide images. TCGN combines convolutional layers, transformer encoders, and graph neural networks, and is the first to integrate these blocks into a general and interpretable computer vision backbone. Notably, TCGN operates on just a single spot image as input for histopathological image analysis, simplifying the process while maintaining interpretability. We validate TCGN on three publicly available spatial transcriptomic datasets, on which it consistently achieved the best performance (median PCC 0.232). TCGN offers superior accuracy while keeping its parameter count modest (86.241 million) and its memory footprint small, allowing it to run smoothly even on personal computers. Moreover, TCGN can be extended to handle bulk RNA-seq data while retaining interpretability. Enhancing the accuracy of omics prediction from pathological images not only establishes a connection between genotype and phenotype, enabling costly-to-measure biomarkers to be predicted from affordable histopathological images, but also lays the groundwork for future multi-modal data modeling. Our results confirm that TCGN is a powerful tool for inferring gene expression from histopathological images in precision health applications.
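The abstract describes a backbone that co-embeds a spot image through three kinds of blocks: convolution for local texture, self-attention for long-range token interactions, and graph propagation over neighboring tokens. The paper's actual layer sizes and wiring are not given here, so the following is only a minimal NumPy sketch of that general pattern (all dimensions, weights, and the grid-graph adjacency are illustrative assumptions, not the published TCGN architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    # Valid convolution + ReLU. x: (H, W, Cin), w: (k, k, Cin, Cout).
    k = w.shape[0]
    H, W, _ = x.shape
    out = np.zeros((H - k + 1, W - k + 1, w.shape[3]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.tensordot(x[i:i + k, j:j + k], w, axes=3)
    return np.maximum(out, 0.0)

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over the token sequence.
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    return A @ V

def grid_adjacency(h, w):
    # 4-neighbour adjacency (with self-loops) for an h x w token grid.
    n = h * w
    adj = np.eye(n)
    for i in range(h):
        for j in range(w):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    adj[i * w + j, ni * w + nj] = 1.0
    return adj

def graph_propagate(tokens, adj):
    # One round of degree-normalized mean aggregation (GNN-style smoothing).
    return (adj @ tokens) / adj.sum(axis=1, keepdims=True)

# Tiny forward pass: an 8x8 RGB "spot image" -> 16 predicted gene values.
img = rng.normal(size=(8, 8, 3))
feat = conv2d(img, rng.normal(size=(3, 3, 3, 4)) * 0.1)   # (6, 6, 4)
tokens = feat.reshape(-1, 4)                               # 36 tokens, dim 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) * 0.5 for _ in range(3))
tokens = tokens + self_attention(tokens, Wq, Wk, Wv)       # transformer block
tokens = graph_propagate(tokens, grid_adjacency(6, 6))     # graph block
genes = tokens.mean(axis=0) @ (rng.normal(size=(4, 16)) * 0.1)
print(genes.shape)                                         # (16,)
```

The attention row of `A` in `self_attention` is interpretable as a spatial importance map over image tokens, which is the kind of interpretability argument the abstract makes for a single-spot input.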
Pages: 18