Language Compression and Pseudorandom Generators

Harry Buhrman; Troy Lee; Dieter van Melkebeek

doi:10.1109/CCC.2004.1313772

Abstract

The language compression problem asks for succinct descriptions of the strings in a language $A$ such that the strings can be efficiently recovered from their description when given a membership oracle for $A$. We study randomized and nondeterministic decompression schemes and investigate how close we can get to the information theoretic lower bound of $\log \|A^{=n}\|$ for the description length of strings of length $n$. Using nondeterminism alone, we can achieve the information theoretic lower bound up to an additive term of $O((\sqrt{\log \|A^{=n}\|} + \log n)\log n)$; using both nondeterminism and randomness, we can make do with an excess term of $O(\log^3 n)$. With randomness alone, we show a lower bound of $n - \log \|A^{=n}\| - O(\log n)$ on the description length of strings in $A$ of length $n$, and a lower bound of $2 \cdot \log \|A^{=n}\| - O(1)$ on the length of any program that distinguishes a given string of length $n$ in $A$ from any other string. The latter lower bound is tight up to an additive term of $O(\log n)$. The key ingredient for our upper bounds is the relativizable hardness versus randomness trade-offs based on the Nisan-Wigderson pseudorandom generator construction.
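To make the information theoretic bound of $\log \|A^{=n}\|$ concrete, the following minimal sketch describes a string by its rank among the members of $A^{=n}$ and recovers it using only membership queries. It is an illustration and not a construction from the paper: the decompressor enumerates all $2^n$ strings, so it meets the length bound but is not efficient, which is exactly the gap the abstract addresses. The names `strings_in_A`, `compress`, `decompress`, and the toy language `one_hot` are hypothetical.

```python
from itertools import product
from math import ceil, log2

def strings_in_A(n, oracle):
    """List A^{=n} in lexicographic order using only membership queries."""
    candidates = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in candidates if oracle(s)]

def compress(x, oracle):
    """Describe x by its rank in A^{=n}, written with ceil(log2 |A^{=n}|) bits."""
    members = strings_in_A(len(x), oracle)
    width = max(1, ceil(log2(len(members))))
    return format(members.index(x), "0{}b".format(width))

def decompress(description, n, oracle):
    """Recover x from its rank; takes time exponential in n (brute force)."""
    return strings_in_A(n, oracle)[int(description, 2)]

# Toy language (an assumption for illustration): strings with exactly one '1'.
def one_hot(s):
    return s.count("1") == 1

n = 8
x = "00010000"
d = compress(x, one_hot)
print(d, decompress(d, n, one_hot))
```

Here $\|A^{=8}\| = 8$, so the description uses 3 bits instead of 8. The schemes studied in the paper aim for descriptions close to this length while keeping decompression efficient.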
