Authors:
Cheng, Zhiyuan [1]
Lu, Andong [1]
Zhang, Zhang [4,5]
Li, Chenglong [2,3]
Wang, Liang [4,5]
Affiliations:
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei, Peoples R China
[2] Anhui Prov Key Lab Multimodal Cognit Computat, Hefei, Peoples R China
[3] Anhui Univ, Sch Artificial Intelligence, Hefei, Peoples R China
[4] Ctr Res Intelligent Percept & Comp, NLPR, CASIA, Beijing, Peoples R China
[5] Univ Chinese Acad Sci, Beijing, Peoples R China
Abstract: RGBT tracking is often affected by complex scenes (i.e., occlusions, scale changes, noisy backgrounds, etc.). Existing works usually adopt a single-strategy fusion scheme to handle modality fusion in all scenarios. However, due to the limited capacity of the fusion model, it is difficult to fully integrate the discriminative features of different modalities. To tackle this problem, we propose a Fusion Tree Network (FTNet), which provides a high-capacity multi-strategy fusion model to efficiently fuse different modalities. Specifically, we combine three kinds of attention modules (i.e., channel attention, spatial attention, and location attention) in a tree structure to achieve multi-path hybrid attention in the deeper convolutional stages of the object tracking network. Extensive experiments are performed on three RGBT tracking datasets, and the results show that our method achieves superior performance among state-of-the-art RGBT trackers.
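The abstract does not spell out FTNet's internals, so the following is only a minimal NumPy sketch of the general idea it describes: two attention branches (channel and spatial; the paper's third, location attention, is omitted here) re-weight a fused RGB+thermal feature map, and their outputs are merged at a tree root. All function names and the averaging merge are illustrative assumptions, not the paper's method.

```python
import numpy as np

def channel_attention(feat):
    # Global-average-pool over spatial dims, then sigmoid-gate each channel.
    w = feat.mean(axis=(1, 2))                  # (C,)
    w = 1.0 / (1.0 + np.exp(-w))                # sigmoid in (0, 1)
    return feat * w[:, None, None]

def spatial_attention(feat):
    # Average over channels, then sigmoid-gate each spatial location.
    w = feat.mean(axis=0)                       # (H, W)
    w = 1.0 / (1.0 + np.exp(-w))
    return feat * w[None, :, :]

def tree_fusion(rgb_feat, tir_feat):
    """Toy multi-path fusion: sum the two modality feature maps, pass the
    result through each attention branch, and average the branch outputs
    (the 'root' of this two-branch tree)."""
    fused = rgb_feat + tir_feat                 # (C, H, W)
    branches = [channel_attention(fused), spatial_attention(fused)]
    return sum(branches) / len(branches)

rng = np.random.default_rng(0)
rgb = rng.standard_normal((8, 4, 4))   # hypothetical RGB feature map
tir = rng.standard_normal((8, 4, 4))   # hypothetical thermal feature map
out = tree_fusion(rgb, tir)
print(out.shape)  # (8, 4, 4)
```

In a real tracker these branches would be learned convolutional modules inside the deeper stages of the backbone, not fixed pooling gates; the sketch only shows how multiple attention paths over a shared fused feature can be combined in a tree.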