Abstract
We propose a distributed, real-time computing platform for tracking multiple interacting persons in motion. To overcome occlusion and articulated motion, we use a multi-view implementation in which 2-D semantic features are tracked independently in each view and then integrated using a Bayesian belief network whose topology varies as a function of scene content and feature confidence. The network fuses observations from multiple cameras by resolving independence relationships and confidence levels within the graph, thereby producing the most likely vector of 3-D state estimates given the available data. We demonstrate the efficacy of the proposed system using a multi-view sequence of several people in motion. Our experiments suggest that, compared with data fusion based on averaging, the proposed technique yields a noticeable improvement in tracking accuracy.
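
To illustrate why confidence-aware fusion can differ from plain averaging, the following minimal sketch contrasts the two on per-view 3-D position estimates. It is not the Bayesian belief network described above. It is a much simpler stand-in that assumes each view yields an independent Gaussian estimate with a known scalar variance, and fuses them by inverse-variance weighting. The function names, the variance inputs, and the example values are all hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def average_fusion(estimates: Sequence[Vec3]) -> Vec3:
    """Baseline: unweighted mean of the per-view 3-D estimates."""
    n = len(estimates)
    return tuple(sum(e[i] for e in estimates) / n for i in range(3))


def confidence_weighted_fusion(
    estimates: Sequence[Vec3], variances: Sequence[float]
) -> Vec3:
    """Inverse-variance weighted mean (assumed stand-in, not the paper's network).

    Under the assumption of independent Gaussian estimates with the given
    scalar variances, this is the maximum-likelihood fused position.
    """
    if len(estimates) != len(variances):
        raise ValueError("need one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]  # low variance -> high weight
    total = sum(weights)
    return tuple(
        sum(w * e[i] for w, e in zip(weights, estimates)) / total
        for i in range(3)
    )


# Hypothetical example: view 0 sees the feature clearly (low variance),
# view 1 is partially occluded (high variance).
views = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
view_variances = [1.0, 3.0]

averaged = average_fusion(views)
weighted = confidence_weighted_fusion(views, view_variances)
```

In this toy case the averaged estimate sits midway between the two views, while the weighted estimate is pulled toward the more confident view. A full belief network would additionally model dependencies between features and views, which this sketch ignores.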