International Conference on Parallel Processing, 2004. ICPP 2004.

Abstract

Early parallelizing compilers use the owner-computes rule to partition computation. Partial replication was then introduced to eliminate near-neighbor communication at the cost of some replicated computation, which improves performance and scalability. Existing work on partial replicate computation partitioning is limited to a single loop nest. In this paper, we present a formal description of the global partial replicate computation partitioning problem, a simplified cost model, and a heuristic solution. Experimental results show that the solution is superior to local approaches.
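
The trade-off between the owner-computes rule and partial replication can be illustrated with a small sketch. It is not taken from the paper: the two-statement stencil (`B[i] = f(A[i])` followed by `C[i] = B[i-1] + B[i+1]`), the block distribution, and all function names are hypothetical, and the input array `A` is assumed to be readable on every processor. The sketch only counts the remote reads of `B` under owner-computes and the extra `B` elements each processor would compute itself under partial replication.

```python
def block_bounds(n, p, rank):
    """Half-open index range [lo, hi) owned by `rank` under a block distribution."""
    size = (n + p - 1) // p
    lo = min(rank * size, n)
    hi = min(lo + size, n)
    return lo, hi


def halo_elements(n, p, rank):
    """Indices of B that `rank` reads for C[i] = B[i-1] + B[i+1] but does not own."""
    lo, hi = block_bounds(n, p, rank)
    owned = set(range(lo, hi))
    needed = set()
    # C[i] is only defined for interior points 1 <= i <= n-2 (assumption).
    for i in range(max(lo, 1), min(hi, n - 1)):
        needed.update((i - 1, i + 1))
    return needed - owned


def owner_computes_remote_reads(n, p):
    """Owner-computes: every non-owned B element must be fetched from a neighbor."""
    return sum(len(halo_elements(n, p, r)) for r in range(p))


def partial_replication_extra_work(n, p):
    """Partial replication: every non-owned B element is recomputed locally instead."""
    return sum(len(halo_elements(n, p, r)) for r in range(p))


if __name__ == "__main__":
    n, p = 16, 4
    print("remote reads under owner-computes:", owner_computes_remote_reads(n, p))
    print("replicated computations instead:  ", partial_replication_extra_work(n, p))
```

In this toy setting each remote read is traded for one replicated computation of `B`. Whether that trade pays off depends on the relative cost of communication and computation, which is the kind of question a cost model has to answer.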