This project presents a
novel patch-based approach for object tracking. Initially, the original
template is divided into rectangular subregions (patches), and each
patch is tracked independently. The displacement of the whole template
is obtained using a weighted vector median filter that combines the
displacement of each patch and also a predicted displacement computed
based on the previous frames. An updating scheme is also applied to
cope with appearance changes of the template. Experimental results
indicate that the proposed scheme is robust to partial and short-time
total occlusions, presenting a good compromise between accuracy and
execution time when compared to other competitive approaches.
Key words: Object tracking, Occlusion, Multiple patches, Weighted
Vector Median Filter, Bhattacharyya Distance
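
The combination step described above can be illustrated with a weighted
vector median, which returns the candidate displacement that minimizes the
weighted sum of distances to all other candidates. The block below is a
minimal sketch under stated assumptions, not the authors' implementation:
the Displacement struct, the function name weightedVectorMedian, the choice
of the Euclidean norm, and the way the weights are supplied are all
hypothetical. The candidate set is assumed to hold one displacement per
patch plus the predicted displacement.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical type: a 2-D displacement between consecutive frames.
struct Displacement {
    double dx;
    double dy;
};

// Weighted vector median (sketch): returns the candidate v_j that minimizes
// sum_i weights[i] * ||v_j - v_i||, using the Euclidean norm (an assumption).
// The result is always one of the input vectors, so a few outliers
// (e.g. occluded patches) cannot drag the estimate away from the majority.
Displacement weightedVectorMedian(const std::vector<Displacement>& candidates,
                                  const std::vector<double>& weights) {
    assert(!candidates.empty());
    assert(candidates.size() == weights.size());

    std::size_t best = 0;
    double bestCost = std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < candidates.size(); ++j) {
        double cost = 0.0;
        for (std::size_t i = 0; i < candidates.size(); ++i) {
            cost += weights[i] * std::hypot(candidates[j].dx - candidates[i].dx,
                                            candidates[j].dy - candidates[i].dy);
        }
        if (cost < bestCost) {  // strict '<' keeps the first minimizer on ties
            bestCost = cost;
            best = j;
        }
    }
    return candidates[best];
}
```

With equal weights, a single patch that drifts far away (for instance,
because it is occluded) contributes a large cost to itself and is not
selected. How the weights are actually chosen is not covered by this sketch.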
Here, we present some experimental results
obtained with the proposed algorithm, called the Coherent Patch
Displacement (CPD) Tracking algorithm. The experimental validation was
performed qualitatively, by visual inspection of tracking results, and
also quantitatively, by comparing the tracking errors produced by the
proposed approach and by two state-of-the-art techniques, namely the
MeanShift algorithm and the FragTrack algorithm.
All the results presented here were computed using C++ implementations
of the algorithms (the code for FragTrack was kindly provided by Amit
Adam), running on a PC with a Pentium Core 2 Duo 2.33 GHz
processor, 1 GB of RAM and the Windows XP operating system. To compare the
techniques we used five different video sequences: Woman, Caviar,
Face1, Face2, and Person. The Woman sequence is a video of a woman
walking on a sidewalk, sometimes partially occluded by different cars
(available for download at
The Caviar sequence is one of the videos of the CAVIAR project
and it consists of some
people walking through a mall. Face1 and Face2 are two facial video
sequences, containing head tilts, turns and occlusions.
Finally, the Person video sequence shows the full body of a person
moving behind several trees in an outdoor environment with illumination
changes (from cloudy to partly sunny), shot with a moving camera. The
Woman and Caviar video sequences are available with ground truth data,
while the ground truth for the remaining sequences was manually annotated
by our group.
For those five video sequences we computed the execution time and the
tracking error (Euclidean distance between the actual position and the
tracked position) for each frame. In all experiments, we used the same
search region (30 × 30) for CPD Tracking and FragTrack
(MeanShift does not require a search region). For FragTrack, we used
16 histogram bins (only luminance information), and the Earth Mover's
Distance (EMD) as the histogram matching measure. For CPD Tracking, we used the same
set of 5 features for all sequences: the 3 color channels and the 2
components of the gradient computed from the luminance component. It is
important to emphasize that, although CPD presents other tunable
parameters (sp, w, Tp, and c), the same default values
described in the previous Section were used in all experiments.
Even better results could likely be achieved by fine-tuning those
parameters for each individual video sequence.
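
The per-frame tracking error defined above (Euclidean distance between the
actual and the tracked position) can be computed as in the following sketch.
The Point2D struct and the function names are hypothetical, and which point
of the template counts as its "position" (for example, its center) is an
assumption that this sketch leaves open.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type: an image position in pixel coordinates.
struct Point2D {
    double x;
    double y;
};

// Euclidean distance between the ground-truth and the tracked position.
double trackingError(const Point2D& groundTruth, const Point2D& tracked) {
    return std::hypot(tracked.x - groundTruth.x, tracked.y - groundTruth.y);
}

// One error value per frame; both sequences must cover the same frames.
std::vector<double> perFrameErrors(const std::vector<Point2D>& groundTruth,
                                   const std::vector<Point2D>& tracked) {
    assert(groundTruth.size() == tracked.size());
    std::vector<double> errors;
    errors.reserve(groundTruth.size());
    for (std::size_t k = 0; k < groundTruth.size(); ++k) {
        errors.push_back(trackingError(groundTruth[k], tracked[k]));
    }
    return errors;
}
```

Keeping the errors per frame, rather than only an average, makes it possible
to see where in a sequence a tracker drifts, for instance during an occlusion.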
We grant permission to use and publish all videos and numerical
results generated by our group (namely, videos Face1, Face2 and Person), as long as reference 
 D. Comaniciu, V. Ramesh, P. Meer, Kernel-based object tracking,
IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (5)
 A. Adam, E. Rivlin, I. Shimshoni, Robust fragments-based tracking
using the integral histogram, in: Conference on Computer Vision and
Pattern Recognition, IEEE Computer Society, Washington, DC, USA, 2006,
 L. L. Dihl, C. R. Jung, J. C. Bins, Adaptive patch-based object tracking
using weighted vector median filters, in: Conference on Graphics, Patterns
and Images (SIBGRAPI), Maceió, 2011, pp. 149-156.