2021 journal article

A Bottom-Up and Top-Down Integration Framework for Online Object Tracking

IEEE TRANSACTIONS ON MULTIMEDIA, 23, 105–119.

By: M. Li n, L. Peng*, T. Wu n  & Z. Peng *

co-author countries: China πŸ‡¨πŸ‡³ United States of America πŸ‡ΊπŸ‡Έ
author keywords: Target tracking; Correlation; Encoding; Object tracking; Benchmark testing; Visualization; Online object tracking; bottom-up and top-down; graph regularized sparse coding; alternating direction method of multipliers
Source: Web Of Science
Added: January 19, 2021

Robust online object tracking entails integrating short-term memory based trackers and long-term memory based trackers in an elegant framework to handle structural and appearance variations of unknown objects in an online manner. The integration and synergy between short-term and long-term memory based trackers have yet studied well in the literature, especially in pre-training free settings. To address this issue, this paper presents a bottom-up and top-down integration framework. The bottom-up component realizes a data-driven approach for particle generation. It exploits a short-term memory based tracker to generate bounding box proposals in a new frame. In the top-down component, this paper presents a graph regularized sparse coding scheme as the long-term memory based tracker. The over-complete bases for sparse coding are composed of part-based representations learned from earlier tracking results and new observations to form a space with rich temporal context information. A particle graph is computed whose nodes are the bottom-up discriminative particles and edges are formed on-the-fly in terms of appearance and spatial-temporal similarities between particles. The particle graph induces a regularization term in optimizing the sparse coding coefficients for bottom-up particles. In experiments, the proposed method is tested on the widely used OTB-100 benchmark and the VOT2016 benchmark with better performance obtained than baselines including deep learning based trackers. In addition, the outputs from the top-down sparse coding are potentially useful for downstream tasks such as action recognition, multiple-object tracking, and object re-identification.