2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02)
p. 84
Implementing Data Cube Construction using a Cluster Middleware: Algorithms, Implementation Experience, and Performance Evaluation
Ge Yang
Ruoming Jin
Gagan Agrawal
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CCGRID.2002.1017115
Abstract
With increases in the amount of data available for analysis in
commercial settings, On-Line Analytical Processing (OLAP) and
decision support have become important applications for high
performance computing. Implementing such applications on clusters
requires considerable expertise and effort, particularly because of the sizes
of the input and output datasets.
In this paper, we describe our experiences in developing one
such application using a cluster middleware, called ADR. We focus
on the problem of data cube construction, which commonly arises in
multi-dimensional OLAP. We show how ADR, originally developed
for scientific data-intensive applications, can be used to implement
data cube construction efficiently and scalably. A
particular issue with the use of ADR is tiling of output datasets. We
present new algorithms that combine inter-processor communication
and tiling within each processor. These algorithms preserve the
important properties that are desirable in any parallel data cube
construction algorithm.
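For readers unfamiliar with the problem, the following is a minimal sequential sketch of what data cube construction computes: the aggregate (here, a sum) for every subset of the dimensions. It is not the ADR-based algorithm described in the paper and does not model tiling or inter-processor communication. The function name `build_data_cube` and the input layout are assumptions made for illustration only.

```python
from itertools import combinations


def build_data_cube(facts, num_dims):
    """Return {kept_dims: {coords: sum}} for every subset of dimensions.

    `facts` is an iterable of (coords, value) pairs, where coords is a
    tuple of length num_dims. This layout is a hypothetical choice.
    """
    full = tuple(range(num_dims))

    # Base cuboid: aggregate duplicate coordinates in the raw input.
    base = {}
    for coords, value in facts:
        base[coords] = base.get(coords, 0) + value
    cube = {full: base}

    # Proceed from larger group-bys to smaller ones, so each group-by is
    # aggregated from an already computed parent with one more dimension
    # instead of rescanning the raw input.
    for size in range(num_dims - 1, -1, -1):
        for dims in combinations(full, size):
            parents = [p for p in cube
                       if len(p) == size + 1 and set(dims) <= set(p)]
            # Prefer the parent with the fewest cells to reduce work.
            parent = min(parents, key=lambda p: len(cube[p]))
            keep = [parent.index(d) for d in dims]
            child = {}
            for coords, value in cube[parent].items():
                key = tuple(coords[i] for i in keep)
                child[key] = child.get(key, 0) + value
            cube[dims] = child
    return cube


facts = [((0, 0, 0), 1), ((0, 1, 0), 2), ((1, 1, 1), 3), ((0, 0, 0), 4)]
cube = build_data_cube(facts, 3)
```

For three dimensions the result holds 2^3 = 8 group-bys, keyed by the tuple of dimension indices that are kept; `cube[()]` holds the grand total.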
We have carried out a detailed evaluation of our implementation.
The main results from our experiments are as follows: 1) high speedups
are achieved on both dense and sparse datasets, even though we
have used simple algorithms that sequentialize a part of the computation;
2) the execution time depends only upon the amount of computation,
and does not increase in a super-linear fashion as the dataset size or the
number of tiles increases; and 3) as the datasets become more sparse,
sequential performance degrades, but the parallel speedups are still
quite good.
Citation:
Ge Yang, Ruoming Jin, Gagan Agrawal, "Implementing Data Cube Construction using a Cluster Middleware: Algorithms, Implementation Experience, and Performance Evaluation," 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02), p. 84, 2002.