18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14
p. 255a
Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe
Gorden Griem, Lawrence Berkeley National Laboratory
Leonid Oliker, Lawrence Berkeley National Laboratory
John Shalf, Lawrence Berkeley National Laboratory
Katherine Yelick, Lawrence Berkeley National Laboratory and University of California at Berkeley
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IPDPS.2004.1303320
Abstract
The gap between peak and delivered performance for scientific applications running on microprocessor-based systems has grown considerably in recent years. The inability to achieve the desired performance even on a single processor is often attributed to an inadequate memory system, but without identification or quantification of a specific bottleneck. In this work, we use an adaptable synthetic benchmark to isolate application characteristics that cause a significant drop in performance, giving application programmers and architects information about possible optimizations. Our adaptable probe, called sqmat, uses only four parameters to capture key characteristics of scientific workloads: working-set size, computational intensity, indirection, and irregularity. This paper describes the implementation of sqmat and uses its tunable parameters to evaluate four leading 64-bit microprocessors that are popular building blocks for current high performance systems: Intel Itanium2, AMD Opteron, IBM Power3, and IBM Power4.
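The four parameters named in the abstract can be illustrated with a small sketch. This is not the sqmat implementation, which the abstract does not describe; it is a hypothetical toy probe that assumes the workload is repeated squaring of small matrices, and every name in it (`make_order`, `square`, `run_probe`, `run_length`) is invented for illustration. Working-set size is modeled by the number and size of matrices, computational intensity by the number of squarings per matrix, indirection by visiting matrices through an index list, and irregularity by how many consecutive matrices are visited before jumping elsewhere.

```python
import random
import time


def make_order(num_matrices, run_length, seed=0):
    """Build a visiting order (hypothetical helper).

    Matrices are visited in runs of `run_length` consecutive indices;
    the runs themselves are shuffled. Shorter runs mean more irregular access.
    """
    rng = random.Random(seed)
    runs = [
        list(range(start, min(start + run_length, num_matrices)))
        for start in range(0, num_matrices, run_length)
    ]
    rng.shuffle(runs)
    return [i for run in runs for i in run]


def square(m, n):
    """Return m * m for an n-by-n matrix stored as nested lists."""
    return [
        [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def run_probe(matrices, intensity, order=None):
    """Square every matrix `intensity` times and return elapsed seconds.

    matrices  : list of n-by-n matrices (their count and size set the working set)
    intensity : squarings per matrix loaded (computational intensity)
    order     : optional index list; if given, matrices are reached indirectly
    """
    n = len(matrices[0])
    visit = order if order is not None else range(len(matrices))
    start = time.perf_counter()
    for idx in visit:
        m = matrices[idx]
        for _ in range(intensity):
            m = square(m, n)
        matrices[idx] = m
    return time.perf_counter() - start


if __name__ == "__main__":
    mats = [[[1.0, 1.0], [0.0, 1.0]] for _ in range(1000)]
    elapsed = run_probe(mats, intensity=3, order=make_order(1000, run_length=4))
    print(f"elapsed: {elapsed:.4f} s")
```

Interpreted Python hides most memory-hierarchy effects, so the timings from this sketch say little about hardware; it only shows how the four knobs could be exposed. A real probe would be written in a compiled language with contiguous arrays.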
Citation:
Gorden Griem, Leonid Oliker, John Shalf, Katherine Yelick, "Identifying Performance Bottlenecks on Modern Microarchitectures Using an Adaptable Probe," 18th International Parallel and Distributed Processing Symposium (IPDPS'04) - Workshop 14, p. 255a, 2004.