Abstract

We report on a performance evaluation of a Fair Share
system on the ASCI Blue Mountain supercomputer cluster.
We study the impact of share allocation under Fair Share
on wait times and expansion factor. We also measure the
Service Ratio, a typical figure of merit for Fair Share
systems, with respect to a number of job parameters. We
conclude that Fair Share does little to alter important
performance metrics such as expansion factor. This leads
to the question of what Fair Share means on cluster
machines. The essential difference between Fair Share on
a uni-processor and a cluster is that the workload on a
cluster is not fungible in space or time. We find that cluster
machines must be highly utilized and must support
checkpointing for Fair Share to function in a way that is
closer to the spirit in which it was originally developed.

Additional Information

Citation:
Stephen D. Kleban, Scott H. Clearwater,
"Fair Share on High Performance Computing Systems: What Does Fair Really Mean?,"
Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03),
p. 146, 2003