Abstract
When tracking and segmenting semantic video objects, different forms of representational model can be used to find the object region in each frame. We propose a novel hierarchical technique that uses parametric models to describe the appearance and location of an object and then uses non-parametric methods to model the sub-object regions for accurate pixel-wise segmentation. Our motivation is to use the parametric models to locate the object, thereby reducing the sensitivity of the non-parametric sub-object region models to background clutter. The results indicate that this is a promising approach to extracting video objects.