Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03)
p. 172
Improving Access to Multi-dimensional Self-describing Scientific Datasets
Beomseok Nam, University of Maryland
Alan Sussman, University of Maryland
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CCGRID.2003.1199366
Abstract
Applications that query into very large multi-dimensional
datasets are becoming more common.
Many self-describing scientific data file formats have also
emerged, which have structural metadata to help navigate
the multi-dimensional arrays that are stored in the files.
The files may also contain application-specific semantic
metadata. In this paper, we discuss efficient methods
for performing searches for subsets of multi-dimensional
data objects, using semantic information to build
multi-dimensional indexes, and group data items into properly
sized chunks to maximize disk I/O bandwidth. This work is
the first step in the design and implementation of a generic
indexing library that will work with various high-dimension
scientific data file formats containing semantic information
about the stored data. To validate the approach, we have
implemented indexing structures for NASA remote sensing
data stored in the HDF format with a specific schema
(HDF-EOS), and show the performance improvements that
are gained from indexing the datasets, compared to using
the existing HDF library for accessing the data.
Additional Information
Citation:
Beomseok Nam, Alan Sussman,
"Improving Access to Multi-dimensional Self-describing Scientific Datasets,"
Third IEEE International Symposium on Cluster Computing and the Grid (CCGrid'03),
p. 172, 2003.