Abstract
This paper describes a fully pipelined parallel architecture for the New Three Step Search (NTSS) algorithm, a hierarchical block-matching search algorithm suited to estimating small motions in video coding. Techniques for reducing external memory accesses are also developed. The proposed architecture provides an efficient solution for the real-time motion estimation required in video applications, with a low memory bandwidth requirement. It offers the best tradeoff between hardware overhead and speed among existing Three Step Search (TSS) architectures, and it can be used for video applications ranging from low bit-rate video to HDTV systems.