Plastic waste identification based on multimodal feature selection and cross-modal Swin Transformer

Times cited: 0
Authors
Ji, Tianchen [1 ]
Fang, Huaiying [1 ]
Zhang, Rencheng [1 ]
Yang, Jianhong [1 ]
Wang, Zhifeng [2 ]
Wang, Xin [1 ]
Affiliations
[1] Huaqiao Univ, Coll Mech Engn & Automat, Xiamen, Fujian, Peoples R China
[2] Xiamen Luhai Proenvironm Inc, Xiamen, Fujian, Peoples R China
Keywords
Multimodal; Swin Transformer; Cross-modal fusion; Feature selection; Waste identification; CLASSIFICATION;
DOI
10.1016/j.wasman.2024.11.027
Chinese Library Classification (CLC)
X [Environmental Science, Safety Science];
Subject classification code
08 ; 0830 ;
Abstract
The classification and recycling of municipal solid waste (MSW) are key strategies for resource conservation and pollution prevention, and plastic waste identification is an essential component of waste sorting. Multimodal detection of solid waste has increasingly replaced single-modal methods, which are constrained by limited informational capacity. However, existing hyperspectral feature selection algorithms and multimodal identification methods have yet to fully exploit cross-modal information. Therefore, two RGB-hyperspectral image (RGB-HSI) multimodal instance segmentation datasets were constructed to support research in plastic waste sorting. A feature band selection algorithm based on the Activation Weight function was proposed to automatically select influential hyperspectral bands from multimodal data, thereby reducing the burden of data acquisition, transmission, and inference. Furthermore, the multimodal Selective Feature Network (SFNet) was introduced to balance information across modalities and stages. Moreover, the Correlation Swin Transformer Block was proposed, specifically designed to fuse cross-modal mutual information, and it can be combined with SFNet to further enhance multimodal recognition. Experimental results show that the Activation Weight band selection function selects the most effective feature bands. Meanwhile, the Correlation SFSwin Transformer achieved the highest F1-scores of 97.85% and 97.37% in the two plastic waste object detection experiments, respectively. The source code and final models are available at https://github.com/Bazenr/Correlation-SFSwin, and the dataset can be accessed at https://www.kaggle.com/datasets/bazenr/rgb-hsi-rgb-nirmunicipal-solid-waste.
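To make the band-selection idea concrete, the following is a minimal sketch of top-k hyperspectral band selection from per-band importance scores. It is illustrative only: the paper's actual Activation Weight function is defined in the linked repository, and the `select_bands` helper, the toy cube, and the hand-made weights here are all assumptions, not the authors' implementation.

```python
import numpy as np

def select_bands(hsi_cube, weights, k=10):
    """Keep the k bands with the largest importance weights.

    hsi_cube: (H, W, B) array holding B spectral bands.
    weights:  (B,) per-band scores (e.g. learned activation weights).
    Returns the kept band indices (ascending) and the reduced cube.
    """
    top = np.argsort(weights)[::-1][:k]   # indices of the k highest-weight bands
    keep = np.sort(top)                   # restore spectral order
    return keep, hsi_cube[:, :, keep]

# toy example: a 4x4 image with 6 bands and hand-made weights
cube = np.random.rand(4, 4, 6)
w = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.05])
idx, reduced = select_bands(cube, w, k=3)
print(idx)            # → [1 2 3]
print(reduced.shape)  # → (4, 4, 3)
```

Reducing B bands to k in this way shrinks acquisition, transmission, and inference cost roughly in proportion to k/B, which is the motivation the abstract gives for band selection.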
Pages: 58-68 (11 pages)