Development of Adaptive Tracking Methods with Enhanced Performance Based on Deep Learning

Date

2025-04-10

Authors

Zhang, Shuo

Abstract

Adaptive object tracking aims to locate a target continuously in every frame given its initial location, an essential yet demanding task in computer vision. Recent adaptive approaches fuse global information from the template and the search region to achieve promising tracking performance. However, fusing global information degrades some local details, and local information is essential for distinguishing the target from background regions. To address this problem, we present a novel tracker, TGLC, which integrates a channel-aware convolution block with Transformer attention to aggregate global and local representations and to model channel information. Experimental results demonstrate the superior tracking performance of TGLC, and ablation experiments further verify that aggregating multiple types of information improves tracking performance.

Long-term tracking is a vital component of real-world tracking scenarios. Recent one-stage long-term trackers achieve state-of-the-art results because they integrate search and template representations more thoroughly. These methods usually adopt an encoder that generates features and performs feature interaction simultaneously. Despite their high performance, they tend to feed the encoder full input representations that are highly redundant during training. To tackle this problem, we develop a novel algorithm, MIMTracking, which uses an encoder and a decoder for masked image modeling during training. This design alleviates input redundancy and reduces the computational cost of training. MIMTracking achieves state-of-the-art tracking results on numerous datasets.

Addressing tracking challenges is essential in real-world applications. The constantly varying appearance of targets poses tremendous challenges for object tracking, especially in scenes with background clutter. Current leading trackers introduce dynamic templates to encode changing target information. However, dynamic templates are obtained from intermediate frames that are not manually annotated, so they may contain a large amount of uninformative and irrelevant background noise caused by imprecise tracking. To tackle this problem, we propose a novel tracker, ATPTrack. In particular, ATPTrack develops an alternating token trimming method that progressively prunes the dynamic templates and the search region. Compared with trimming only the search region, alternately pruning the dynamic templates and the search region further reduces multiply-accumulate operations (MACs) by 11.5% with a negligible performance drop of 0.3%.
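
The alternating token trimming idea described for ATPTrack can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the fixed per-token scores, the keep ratio, and the even/odd stage schedule are all assumptions made for illustration. An actual tracker would presumably derive token importance from the network itself (for example from attention) and update it as tokens pass through the layers.

```python
from typing import List, Sequence, Tuple


def keep_top_tokens(
    tokens: Sequence[str], scores: Sequence[float], keep_ratio: float
) -> Tuple[List[str], List[float]]:
    """Keep the highest-scoring fraction of tokens, preserving their order."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Rank indices by descending score, take the best n_keep,
    # then sort them again so the survivors stay in positional order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept], [scores[i] for i in kept]


def alternating_trim(
    template: Sequence[str],
    template_scores: Sequence[float],
    search: Sequence[str],
    search_scores: Sequence[float],
    stages: int,
    keep_ratio: float,
) -> Tuple[List[str], List[str], List[Tuple[int, int]]]:
    """Hypothetical schedule: even stages trim template tokens, odd stages trim search tokens."""
    template, template_scores = list(template), list(template_scores)
    search, search_scores = list(search), list(search_scores)
    history = []
    for stage in range(stages):
        if stage % 2 == 0:
            template, template_scores = keep_top_tokens(
                template, template_scores, keep_ratio
            )
        else:
            search, search_scores = keep_top_tokens(search, search_scores, keep_ratio)
        # Record how many tokens of each kind remain after this stage.
        history.append((len(template), len(search)))
    return template, search, history


# Toy example with made-up importance scores (assumed, not measured).
template_tokens = ["t0", "t1", "t2", "t3"]
template_importance = [0.9, 0.1, 0.8, 0.2]
search_tokens = ["s0", "s1", "s2", "s3", "s4", "s5"]
search_importance = [0.3, 0.7, 0.1, 0.9, 0.5, 0.2]

kept_template, kept_search, token_counts = alternating_trim(
    template_tokens, template_importance,
    search_tokens, search_importance,
    stages=2, keep_ratio=0.5,
)
print(kept_template, kept_search, token_counts)
```

Because the cost of attention grows with the number of tokens, dropping low-importance tokens from both inputs reduces computation. The sketch only shows the bookkeeping of the alternation and makes no claim about the MAC or accuracy figures reported in the abstract.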
