Extending Collective Operations with Application Semantics for Improving Multi-Cluster Performance

Lars Ailo Bongo; Otto Anshus; John Markus Bjorndalen; Tore Larsen

doi:10.1109/ISPDC.2004.24

Parallel and Distributed Computing, International Symposium on

Year: 2004, Pages: 320-327

Authors

Lars Ailo Bongo, University of Tromsø
Otto Anshus, University of Tromsø
John Markus Bjorndalen, University of Tromsø
Tore Larsen, University of Tromsø

Abstract

We identify two ways of increasing the performance of allreduce-style collective operations in a multi-cluster with large WAN latencies: (i) hiding latency in system noise, and (ii) conditional-allreduce, where knowledge about the application is used to reduce the number of WAN messages. In our multi-cluster, system noise was not large enough to hide the WAN latency. However, the latency could be hidden using conditional-allreduce, since on many iterations only cluster-local values were needed, and many of the values needed from other clusters were prefetched. A speedup of 2.4 was achieved for a microbenchmark. Prefetching introduced a small overhead in the cluster with the slowest hosts.
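
The idea behind conditional-allreduce can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: the function names, the `need_global` predicate, and the use of a sum as the reduction are all assumptions made for illustration, and prefetching is not modeled. It only shows how skipping the cross-cluster step on iterations where a cluster-local value suffices lowers the number of WAN exchanges.

```python
from typing import Callable, Dict, List, Tuple

Clusters = Dict[str, List[float]]


def conditional_allreduce(clusters: Clusters,
                          need_global: bool) -> Tuple[Dict[str, float], int]:
    """Return each cluster's result and the number of WAN exchanges used.

    Hypothetical helper: the reduction is a plain sum, and one cross-cluster
    combine is counted as a single WAN exchange.
    """
    # Step 1: reduce inside each cluster (LAN only, assumed cheap).
    local = {name: sum(values) for name, values in clusters.items()}
    if not need_global:
        # The application says the cluster-local value suffices: no WAN traffic.
        return local, 0
    # Step 2: combine the per-cluster results across the WAN.
    total = sum(local.values())
    return {name: total for name in clusters}, 1


def count_wan_exchanges(clusters: Clusters, iterations: int,
                        need_global: Callable[[int], bool]) -> int:
    """Sum the WAN exchanges over a number of iterations."""
    return sum(conditional_allreduce(clusters, need_global(it))[1]
               for it in range(iterations))


# Two simulated clusters; the values are arbitrary illustration data.
clusters = {"A": [1.0, 2.0], "B": [3.0, 4.0]}

# Plain allreduce: every iteration crosses the WAN.
always = count_wan_exchanges(clusters, 10, lambda it: True)
# Conditional variant: assume only every fifth iteration needs remote values.
sometimes = count_wan_exchanges(clusters, 10, lambda it: it % 5 == 0)
print(always, sometimes)
```

In this toy setting the predicate is a fixed schedule; in a real application it would have to come from knowledge of when remote values actually influence the computation, which is the application semantics the abstract refers to.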
