|
Published Articles >> Table of Contents >> Abstract
21st International Conference on Data Engineering (ICDE'05)
pp. 32-43
Range-Efficient Computation of F₀ over Massive Data Streams
A. Pavan, Iowa State University
Srikanta Tirthapura, Iowa State University
Full Article Text:
 
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2005.118
Send link to a friend
| Abstract |
|
Efficient one-pass computation of F₀, the number of distinct
elements in a data stream, is a fundamental problem
arising in various contexts in databases and networking. We
consider the problem of efficiently estimating F₀ of a data
stream where each element of the stream is an interval of integers.
We present a randomized algorithm which gives an (\varepsilon ,\delta )
approximation of F₀, with the following time complexity (n
is the size of the universe of the items): (1) The amortized
processing time per interval is 0(\log \frac{1}{\delta }Log\frac{n}{\varepsilon }). (2) The time
to answer a query for F₀ is 0(\log {1 \mathord{\left/ {\vphantom {1 {\delta )}}} \right. \kern-\nulldelimiterspace} {\delta )}}. The workspace used
is 0(\frac{1}{{\varepsilon ^2 }}\log \frac{1}{\delta }\log n) bits.
Our algorithm improves upon a previous algorithm
by Bar-Yossef, Kumar and Sivakumar [5], which requires
0(\frac{1}{{\varepsilon ^5 }}\log \frac{1}{\delta }\log ^5 n) processing time per item. Our algorithm can be used to compute the max-dominance norm
of a stream of multiple signals, and significantly improves
upon the current best bounds due to Cormode and
Muthukrishnan [11]. This also provides efficient and novel
solutions for data aggregation problems in sensor networks
studied by Nath and Gibbons [22] and Considine et.
al. [8].
|
Additional Information
|
Citation:
A. Pavan, Srikanta Tirthapura,
"Range-Efficient Computation of F₀ over Massive Data Streams,"
icde,
pp. 32-43,
21st International Conference on Data Engineering (ICDE'05),
2005
|
|