DSFormer: Leveraging Transformer with Cross-Modal Attention for Temporal Consistency in Low-Light Video Enhancement

Cited by: 0
Authors
Xu, JiaHao [1, 2]
Mei, ShuHao [2]
Chen, ZiZheng [2]
Zhang, DanNi [2]
Shi, Fan [1, 2]
Zhao, Meng [1, 2]
Affiliations
[1] Tianjin Univ Technol, Minist Educ, Engn Res Ctr Learning Based Intelligent Syst, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Sch Comp Sci & Engn, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Low-Light Video Enhancement; Transformer; Optical flow
DOI
10.1007/978-981-97-5612-4_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent advancements in deep learning have significantly impacted low-light video enhancement, sparking great interest in the field. However, while these techniques have proven effective for enhancing individual static images, they struggle with temporal instability when applied to videos, leading to artifacts and flickering. This challenge is further compounded by the difficulty of obtaining dynamic low-light/high-light video pairs in real-world scenarios. Our proposed solution tackles these issues by integrating a cross-attention mechanism with optical flow. This approach helps mitigate temporal inconsistencies, often found when training with static images, by using optical flow to infer motion in individual frames. We have also developed a Transformer model (DSFormer) that leverages spatial and channel features to enhance visual quality and temporal stability in videos. Additionally, we have created a novel dual path feed-forward network (DPFN) that improves our method's ability to capture and maintain local contextual information, which is crucial for low-light enhancement. Through extensive comparative and ablation studies, we demonstrate that our approach delivers high luminance and temporal consistency in enhancement sequences.
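As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below computes generic scaled dot-product cross-attention in plain Python. It is not the DSFormer implementation: the abstract does not say which modality supplies the queries, so using image tokens as queries and optical-flow tokens as keys and values is an assumption, as are the names `cross_attention`, `image_tokens`, and `flow_tokens`. The learned projection matrices of a real Transformer layer are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned projections).

    queries: tokens from one modality (assumed here: image features)
    keys, values: tokens from the other modality (assumed here: optical-flow features)
    Returns one output vector per query, each a weighted average of the values.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Hypothetical toy tokens: 2 image tokens attend over 3 flow tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
flow_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(image_tokens, flow_tokens, flow_tokens)
```

Each row of `fused` is a convex combination of the flow tokens, so motion cues are mixed into every image token in proportion to feature similarity.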
Pages: 27-38
Page count: 12