Small-bias sample space
In theoretical computer science, a small-bias sample space (also known as [math]\displaystyle{ \epsilon }[/math]-biased sample space, [math]\displaystyle{ \epsilon }[/math]-biased generator, or small-bias probability space) is a probability distribution that fools parity functions. In other words, no parity function can distinguish between a small-bias sample space and the uniform distribution with high probability, and hence, small-bias sample spaces naturally give rise to pseudorandom generators for parity functions.
The main useful property of small-bias sample spaces is that they need far fewer truly random bits than the uniform distribution to fool parities. Efficient constructions of small-bias sample spaces have found many applications in computer science, some of which are derandomization, error-correcting codes, and probabilistically checkable proofs. The connection with error-correcting codes is in fact very strong since [math]\displaystyle{ \epsilon }[/math]-biased sample spaces are equivalent to [math]\displaystyle{ \epsilon }[/math]-balanced error-correcting codes.
Definition
Bias
Let [math]\displaystyle{ X }[/math] be a probability distribution over [math]\displaystyle{ \{0,1\}^n }[/math]. The bias of [math]\displaystyle{ X }[/math] with respect to a set of indices [math]\displaystyle{ I \subseteq \{1,\dots,n\} }[/math] is defined as[1]
- [math]\displaystyle{ \text{bias}_I(X) = \left| \Pr_{x\sim X} \left(\sum_{i\in I} x_i = 0\right) - \Pr_{x\sim X} \left(\sum_{i\in I} x_i = 1\right) \right| = \left| 2 \cdot \Pr_{x\sim X} \left(\sum_{i\in I} x_i = 0\right) -1 \right| \,, }[/math]
where the sum is taken over [math]\displaystyle{ \mathbb F_2 }[/math], the finite field with two elements. In other words, the sum [math]\displaystyle{ \sum_{i\in I} x_i }[/math] equals [math]\displaystyle{ 0 }[/math] if the number of ones in the sample [math]\displaystyle{ x\in\{0,1\}^n }[/math] at the positions defined by [math]\displaystyle{ I }[/math] is even, and otherwise, the sum equals [math]\displaystyle{ 1 }[/math]. For [math]\displaystyle{ I=\emptyset }[/math], the empty sum is defined to be zero, and hence [math]\displaystyle{ \text{bias}_{\emptyset} (X) = 1 }[/math].
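For intuition, the bias can be computed directly from this definition. The following Python sketch is purely illustrative (the dictionary representation and the function name `bias` are ours, not from the literature): a distribution is a map from bit-strings to probabilities, and the bias with respect to an index set is read off from the probability of even parity.

```python
from itertools import product

def bias(dist, I):
    """Bias of a distribution over {0,1}^n with respect to an index set I.

    `dist` maps length-n tuples of bits to their probabilities.
    """
    p_even = sum(p for x, p in dist.items() if sum(x[i] for i in I) % 2 == 0)
    return abs(2 * p_even - 1)

# The uniform distribution over {0,1}^3 has zero bias on every non-empty I,
# and bias 1 on the empty index set (the empty sum is always 0, i.e. even).
n = 3
uniform = {x: 1 / 2**n for x in product((0, 1), repeat=n)}
print(bias(uniform, [0, 2]))  # 0.0
print(bias(uniform, []))      # 1.0
```

A point mass such as `{(1, 1, 1): 1.0}` has bias 1 with respect to every singleton index set, matching the intuition that a constant distribution is maximally distinguishable from uniform by a parity.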
ϵ-biased sample space
A probability distribution [math]\displaystyle{ X }[/math] over [math]\displaystyle{ \{0,1\}^n }[/math] is called an [math]\displaystyle{ \epsilon }[/math]-biased sample space if [math]\displaystyle{ \text{bias}_I(X) \leq \epsilon }[/math] holds for all non-empty subsets [math]\displaystyle{ I \subseteq \{1,2,\ldots,n\} }[/math].
ϵ-biased set
An [math]\displaystyle{ \epsilon }[/math]-biased sample space [math]\displaystyle{ X }[/math] that is generated by picking a uniform element from a multiset [math]\displaystyle{ X\subseteq \{0,1\}^n }[/math] is called an [math]\displaystyle{ \epsilon }[/math]-biased set. The size [math]\displaystyle{ s }[/math] of an [math]\displaystyle{ \epsilon }[/math]-biased set [math]\displaystyle{ X }[/math] is the size of the multiset that generates the sample space.
ϵ-biased generator
An [math]\displaystyle{ \epsilon }[/math]-biased generator [math]\displaystyle{ G:\{0,1\}^\ell \to \{0,1\}^n }[/math] is a function that maps strings of length [math]\displaystyle{ \ell }[/math] to strings of length [math]\displaystyle{ n }[/math] such that the multiset [math]\displaystyle{ X_G=\{G(y) \;\vert\; y\in\{0,1\}^\ell \} }[/math] is an [math]\displaystyle{ \epsilon }[/math]-biased set. The seed length of the generator is the number [math]\displaystyle{ \ell }[/math] and is related to the size of the [math]\displaystyle{ \epsilon }[/math]-biased set [math]\displaystyle{ X_G }[/math] via the equation [math]\displaystyle{ s=2^\ell }[/math].
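Concretely, any [math]\displaystyle{ \epsilon }[/math]-biased set whose size is a power of two can be wrapped as an [math]\displaystyle{ \epsilon }[/math]-biased generator by reading the seed as an index into the multiset. The sketch below is illustrative only (the function names are ours; the particular size-4 set used here can be verified by brute force to be 1/2-biased in {0,1}^3):

```python
import math

def generator_from_set(X):
    """Wrap a multiset X of size s = 2^l as a generator G: {0,1}^l -> {0,1}^n
    that maps each seed, read as a binary integer, to an element of X."""
    l = int(math.log2(len(X)))
    assert 2**l == len(X), "multiset size must be a power of two"
    def G(seed_bits):
        idx = int("".join(map(str, seed_bits)), 2)
        return X[idx]
    return G, l

# A 1/2-biased set of size s = 4 in {0,1}^3, so the seed length is l = 2.
X = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
G, l = generator_from_set(X)
print(l, G((0, 1)))  # 2 (0, 0, 1)
```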
Connection with epsilon-balanced error-correcting codes
There is a close connection between [math]\displaystyle{ \epsilon }[/math]-biased sets and [math]\displaystyle{ \epsilon }[/math]-balanced linear error-correcting codes. A linear code [math]\displaystyle{ C:\{0,1\}^n\to\{0,1\}^s }[/math] of message length [math]\displaystyle{ n }[/math] and block length [math]\displaystyle{ s }[/math] is [math]\displaystyle{ \epsilon }[/math]-balanced if the Hamming weight of every nonzero codeword [math]\displaystyle{ C(x) }[/math] is between [math]\displaystyle{ (\frac{1}{2}-\epsilon)s }[/math] and [math]\displaystyle{ (\frac{1}{2}+\epsilon)s }[/math]. Since [math]\displaystyle{ C }[/math] is a linear code, its generator matrix is an [math]\displaystyle{ (n\times s) }[/math]-matrix [math]\displaystyle{ A }[/math] over [math]\displaystyle{ \mathbb F_2 }[/math] with [math]\displaystyle{ C(x)=x \cdot A }[/math].
Then it holds that a multiset [math]\displaystyle{ X\subset\{0,1\}^{n} }[/math] is [math]\displaystyle{ \epsilon }[/math]-biased if and only if the linear code [math]\displaystyle{ C_X }[/math], whose generator matrix has the elements of [math]\displaystyle{ X }[/math] as its columns, is [math]\displaystyle{ \epsilon }[/math]-balanced.[2]
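This equivalence can be checked by brute force for small parameters. The sketch below is illustrative (names are ours, and the check is exponential in n, so it only suits tiny examples): it tests the ε-balanced condition for the code whose generator matrix has the elements of a multiset as columns.

```python
from itertools import product

def is_balanced(X, n, eps):
    """Check that every nonzero message x yields a codeword x·A of Hamming
    weight between (1/2 - eps)·s and (1/2 + eps)·s, where the columns of
    the generator matrix A are the elements of the multiset X."""
    s = len(X)
    for msg in product((0, 1), repeat=n):
        if not any(msg):
            continue  # the balance condition only concerns nonzero messages
        # One codeword bit per column: the inner product <msg, col> over F_2.
        w = sum(sum(m * c for m, c in zip(msg, col)) % 2 for col in X)
        if not (0.5 - eps) * s <= w <= (0.5 + eps) * s:
            return False
    return True

# The full cube {0,1}^3 is a 0-biased set, and indeed the matching code
# is 0-balanced: every nonzero codeword has weight exactly s/2.
cube = list(product((0, 1), repeat=3))
print(is_balanced(cube, 3, 0.0))  # True
```

The weight condition mirrors the bias condition because, for the index set given by a nonzero message, the bias equals |1 - 2w/s| where w is the codeword's Hamming weight.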
Constructions of small epsilon-biased sets
Usually the goal is to find [math]\displaystyle{ \epsilon }[/math]-biased sets that have a small size [math]\displaystyle{ s }[/math] relative to the parameters [math]\displaystyle{ n }[/math] and [math]\displaystyle{ \epsilon }[/math]. This is because a smaller size [math]\displaystyle{ s }[/math] means that the amount of randomness needed to pick a random element from the set is smaller, and so the set can be used to fool parities using few random bits.
Theoretical bounds
The probabilistic method gives a non-explicit construction that achieves size [math]\displaystyle{ s=O(n/\epsilon^2) }[/math].[2] The construction is non-explicit in the sense that finding the [math]\displaystyle{ \epsilon }[/math]-biased set requires a lot of true randomness, which does not help towards the goal of reducing the overall randomness. However, this non-explicit construction is useful because it shows that [math]\displaystyle{ \epsilon }[/math]-biased sets of this size exist. On the other hand, the best known lower bound for the size of [math]\displaystyle{ \epsilon }[/math]-biased sets is [math]\displaystyle{ s=\Omega(n/ (\epsilon^2 \log (1/\epsilon))) }[/math]; that is, any [math]\displaystyle{ \epsilon }[/math]-biased set must be at least this large.[2]
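The existence claim behind the probabilistic method can be observed empirically for small parameters: sampling s = Θ(n/ε²) uniformly random strings almost always yields an ε-biased set. The sketch below is illustrative (the constant 4 and the fixed seed are arbitrary choices, and the brute-force bias check is exponential in n):

```python
import random
from itertools import product

def max_bias(X, n):
    """Largest bias of the multiset X over all non-empty index sets."""
    worst = 0.0
    for I in product((0, 1), repeat=n):
        if not any(I):
            continue
        even = sum(1 for x in X if sum(a * b for a, b in zip(x, I)) % 2 == 0)
        worst = max(worst, abs(2 * even / len(X) - 1))
    return worst

random.seed(0)
n, eps = 8, 0.5
s = int(4 * n / eps**2)  # s = O(n / eps^2) random strings
X = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(s)]
# By a Chernoff bound and a union bound over the 2^n - 1 index sets,
# the maximum bias stays below eps except with tiny probability.
print(s, max_bias(X, n))
```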
Explicit constructions
There are many explicit, i.e., deterministic constructions of [math]\displaystyle{ \epsilon }[/math]-biased sets with various parameter settings:
- (Naor Naor) achieve [math]\displaystyle{ \displaystyle s= \frac{n}{\text{poly}(\epsilon)} }[/math]. The construction makes use of Justesen codes (a concatenation of Reed–Solomon codes with the Wozencraft ensemble) as well as expander walk sampling.
- (Alon Goldreich) achieve [math]\displaystyle{ \displaystyle s= O\left(\frac{n}{\epsilon \log (n/\epsilon)}\right)^2 }[/math]. One of their constructions is the concatenation of Reed–Solomon codes with the Hadamard code; this concatenation turns out to be an [math]\displaystyle{ \epsilon }[/math]-balanced code, which gives rise to an [math]\displaystyle{ \epsilon }[/math]-biased sample space via the connection mentioned above.
- Concatenating Algebraic geometric codes with the Hadamard code gives an [math]\displaystyle{ \epsilon }[/math]-balanced code with [math]\displaystyle{ \displaystyle s= O\left(\frac{n}{\epsilon^3 \log (1/\epsilon)}\right) }[/math].[2]
- (Ben-Aroya Ta-Shma) achieve [math]\displaystyle{ \displaystyle s= O\left(\frac{n}{\epsilon^2 \log (1/\epsilon)}\right)^{5/4} }[/math].
- (Ta-Shma 2017) achieves [math]\displaystyle{ \displaystyle s= O\left(\frac{n}{\epsilon^{2+o(1)}}\right) }[/math] which is almost optimal because of the lower bound.
These bounds are mutually incomparable. In particular, none of these constructions yields the smallest [math]\displaystyle{ \epsilon }[/math]-biased sets for all settings of [math]\displaystyle{ \epsilon }[/math] and [math]\displaystyle{ n }[/math].
Application: almost k-wise independence
An important application of small-bias sets lies in the construction of almost k-wise independent sample spaces.
k-wise independent spaces
A random variable [math]\displaystyle{ Y }[/math] over [math]\displaystyle{ \{0,1\}^n }[/math] is a k-wise independent space if, for all index sets [math]\displaystyle{ I\subseteq\{1,\dots,n\} }[/math] of size [math]\displaystyle{ k }[/math], the marginal distribution [math]\displaystyle{ Y|_I }[/math] is exactly equal to the uniform distribution over [math]\displaystyle{ \{0,1\}^k }[/math]. That is, for all such [math]\displaystyle{ I }[/math] and all strings [math]\displaystyle{ z\in\{0,1\}^k }[/math], the distribution [math]\displaystyle{ Y }[/math] satisfies [math]\displaystyle{ \Pr_Y (Y|_I = z) = 2^{-k} }[/math].
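This definition can be tested exhaustively for small spaces. The sketch below is illustrative (the names are ours): it checks exact k-wise independence of the uniform distribution over a multiset, using the classical XOR example whose three bits are pairwise independent but not mutually independent.

```python
from itertools import combinations, product

def is_k_wise_independent(X, n, k):
    """True if a uniform sample from the multiset X ⊆ {0,1}^n is a k-wise
    independent space: every k-coordinate marginal is uniform on {0,1}^k."""
    for I in combinations(range(n), k):
        counts = {}
        for x in X:
            key = tuple(x[i] for i in I)
            counts[key] = counts.get(key, 0) + 1
        if any(counts.get(z, 0) * 2**k != len(X)
               for z in product((0, 1), repeat=k)):
            return False
    return True

# (a, b, a XOR b) is 2-wise independent but not 3-wise independent:
# any two bits are uniform, yet the third is determined by the other two.
X = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]
print(is_k_wise_independent(X, 3, 2))  # True
print(is_k_wise_independent(X, 3, 3))  # False
```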
Constructions and bounds
k-wise independent spaces are fairly well understood.
- A simple construction by (Joffe 1974) achieves size [math]\displaystyle{ n^k }[/math].
- (Alon Babai) construct a k-wise independent space whose size is [math]\displaystyle{ n^{k/2} }[/math].
- (Chor Goldreich) prove that no k-wise independent space can be significantly smaller than [math]\displaystyle{ n^{k/2} }[/math].
Joffe's construction
(Joffe 1974) constructs a [math]\displaystyle{ k }[/math]-wise independent space [math]\displaystyle{ Y }[/math] over [math]\displaystyle{ \mathbb F_n }[/math], the finite field with a prime number [math]\displaystyle{ n\gt k }[/math] of elements, i.e., [math]\displaystyle{ Y }[/math] is a distribution over [math]\displaystyle{ \mathbb F_n^n }[/math]. The first [math]\displaystyle{ k }[/math] coordinates of the distribution are drawn independently and uniformly at random:
- [math]\displaystyle{ (Y_0,\dots,Y_{k-1}) \sim\mathbb F_n^k }[/math].
For each [math]\displaystyle{ i }[/math] with [math]\displaystyle{ k \leq i \lt n }[/math], the coordinate [math]\displaystyle{ Y_i }[/math] is then defined deterministically in terms of the first [math]\displaystyle{ k }[/math] coordinates as
- [math]\displaystyle{ Y_i=Y_0 + Y_1 \cdot i + Y_2 \cdot i^2 + \dots + Y_{k-1} \cdot i^{k-1}\,, }[/math]
where the calculation is done in [math]\displaystyle{ \mathbb F_n }[/math]. (Joffe 1974) proves that the distribution [math]\displaystyle{ Y }[/math] constructed in this way is [math]\displaystyle{ k }[/math]-wise independent as a distribution over [math]\displaystyle{ \mathbb F_n^n }[/math]. The distribution [math]\displaystyle{ Y }[/math] is uniform on its support, and hence, the support of [math]\displaystyle{ Y }[/math] forms a [math]\displaystyle{ k }[/math]-wise independent set. The support consists of the [math]\displaystyle{ n^k }[/math] strings in [math]\displaystyle{ \mathbb F_n^k }[/math], each extended to a string of length [math]\displaystyle{ n }[/math] using the deterministic rule above.
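The construction above can be sketched in a few lines. The code below is illustrative (the function names are ours): it enumerates the support of Joffe's space for a small prime and verifies k-wise independence by brute force.

```python
from itertools import combinations, product

def joffe_support(p, k):
    """Support of Joffe's k-wise independent space over F_p^p: choose
    (Y_0, ..., Y_{k-1}) freely, then extend deterministically by
    Y_i = Y_0 + Y_1*i + ... + Y_{k-1}*i^(k-1)  (mod p) for k <= i < p."""
    support = []
    for coeffs in product(range(p), repeat=k):
        tail = [sum(c * pow(i, j, p) for j, c in enumerate(coeffs)) % p
                for i in range(k, p)]
        support.append(tuple(coeffs) + tuple(tail))
    return support

def k_wise_independent(X, n, k, q):
    """Check that every k-coordinate marginal of a uniform sample from X
    is uniform over the alphabet {0, ..., q-1}."""
    for I in combinations(range(n), k):
        counts = {}
        for x in X:
            key = tuple(x[i] for i in I)
            counts[key] = counts.get(key, 0) + 1
        if any(counts.get(z, 0) * q**k != len(X)
               for z in product(range(q), repeat=k)):
            return False
    return True

Y = joffe_support(p=3, k=2)  # 3^2 = 9 strings of length 3 over F_3
print(k_wise_independent(Y, n=3, k=2, q=3))  # True
```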
Almost k-wise independent spaces
A random variable [math]\displaystyle{ Y }[/math] over [math]\displaystyle{ \{0,1\}^n }[/math] is a [math]\displaystyle{ \delta }[/math]-almost k-wise independent space if, for all index sets [math]\displaystyle{ I\subseteq\{1,\dots,n\} }[/math] of size [math]\displaystyle{ k }[/math], the restricted distribution [math]\displaystyle{ Y|_I }[/math] and the uniform distribution [math]\displaystyle{ U_k }[/math] on [math]\displaystyle{ \{0,1\}^k }[/math] are [math]\displaystyle{ \delta }[/math]-close in 1-norm, i.e., [math]\displaystyle{ \Big\|Y|_I - U_k\Big\|_1 \leq \delta }[/math].
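The 1-norm condition can likewise be checked exhaustively for small spaces. The following illustrative sketch (names are ours) computes the worst-case L1 distance between a k-coordinate marginal and the uniform distribution; the XOR space used here is exactly 2-wise independent (δ = 0 for k = 2) but maximally far from 3-wise independence.

```python
from itertools import combinations, product

def max_marginal_l1(X, n, k):
    """Largest L1 distance between a k-coordinate marginal of the uniform
    distribution over the multiset X and the uniform distribution on {0,1}^k."""
    worst = 0.0
    for I in combinations(range(n), k):
        counts = {}
        for x in X:
            key = tuple(x[i] for i in I)
            counts[key] = counts.get(key, 0) + 1
        dist = sum(abs(counts.get(z, 0) / len(X) - 2**-k)
                   for z in product((0, 1), repeat=k))
        worst = max(worst, dist)
    return worst

# (a, b, a XOR b): every pair of bits is exactly uniform, but the parity
# of all three bits is constant, so the full marginal is far from uniform.
X = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]
print(max_marginal_l1(X, 3, 2))  # 0.0
print(max_marginal_l1(X, 3, 3))  # 1.0
```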
Constructions
(Naor Naor) give a general framework for combining small k-wise independent spaces with small [math]\displaystyle{ \epsilon }[/math]-biased spaces to obtain [math]\displaystyle{ \delta }[/math]-almost k-wise independent spaces of even smaller size. In particular, let [math]\displaystyle{ G_1:\{0,1\}^h\to\{0,1\}^n }[/math] be a linear mapping that generates a k-wise independent space and let [math]\displaystyle{ G_2:\{0,1\}^\ell \to \{0,1\}^h }[/math] be a generator of an [math]\displaystyle{ \epsilon }[/math]-biased set over [math]\displaystyle{ \{0,1\}^h }[/math]. That is, when given a uniformly random input, the output of [math]\displaystyle{ G_1 }[/math] is a k-wise independent space, and the output of [math]\displaystyle{ G_2 }[/math] is [math]\displaystyle{ \epsilon }[/math]-biased. Then [math]\displaystyle{ G : \{0,1\}^\ell \to \{0,1\}^n }[/math] with [math]\displaystyle{ G(x) = G_1(G_2(x)) }[/math] is a generator of a [math]\displaystyle{ \delta }[/math]-almost [math]\displaystyle{ k }[/math]-wise independent space, where [math]\displaystyle{ \delta=2^{k/2} \epsilon }[/math].[3]
As mentioned above, (Alon Babai) construct a generator [math]\displaystyle{ G_1 }[/math] with [math]\displaystyle{ h=\tfrac{k}{2} \log n }[/math], and (Naor Naor) construct a generator [math]\displaystyle{ G_2 }[/math] with [math]\displaystyle{ \ell=\log s=\log h + O(\log(\epsilon^{-1})) }[/math]. Hence, the concatenation [math]\displaystyle{ G }[/math] of [math]\displaystyle{ G_1 }[/math] and [math]\displaystyle{ G_2 }[/math] has seed length [math]\displaystyle{ \ell = \log k + \log \log n + O(\log(\epsilon^{-1})) }[/math]. In order for [math]\displaystyle{ G }[/math] to yield a [math]\displaystyle{ \delta }[/math]-almost k-wise independent space, we need to set [math]\displaystyle{ \epsilon = \delta 2^{-k/2} }[/math], which leads to a seed length of [math]\displaystyle{ \ell = \log \log n + O(k+\log(\delta^{-1})) }[/math] and a sample space of total size [math]\displaystyle{ 2^\ell \leq \log n \cdot \text{poly}(2^k \cdot\delta^{-1}) }[/math].
References
- Alon, Noga; Babai, László; Itai, Alon (1986), "A fast and simple randomized parallel algorithm for the maximal independent set problem", Journal of Algorithms 7 (4): 567–583, doi:10.1016/0196-6774(86)90019-2, http://www.tau.ac.il/~nogaa/PDFS/Publications2/A%20fast%20and%20simple%20randomized%20parallel%20algorithm%20for%20the%20maximal%20independent%20set%20problem.pdf
- Alon, Noga; Goldreich, Oded; Håstad, Johan; Peralta, René (1992), "Simple Constructions of Almost k-wise Independent Random Variables", Random Structures & Algorithms 3 (3): 289–304, doi:10.1002/rsa.3240030308, http://tau.ac.il/~nogaa/PDFS/aghp4.pdf
- Ben-Aroya, Avraham; Ta-Shma, Amnon (2009). "Constructing Small-Bias Sets from Algebraic-Geometric Codes". 2009 50th Annual IEEE Symposium on Foundations of Computer Science. pp. 191–197. doi:10.1109/FOCS.2009.44. ISBN 978-1-4244-5116-6. http://www.wisdom.weizmann.ac.il/~benaroya/SmallBiasNew.pdf.
- Chor, Benny; Goldreich, Oded; Håstad, Johan; Freidmann, Joel; Rudich, Steven; Smolensky, Roman (1985). "The bit extraction problem or t-resilient functions". 26th Annual Symposium on Foundations of Computer Science (SFCS 1985). pp. 396–407. doi:10.1109/SFCS.1985.55. ISBN 978-0-8186-0644-1.
- Goldreich, Oded (2001), Lecture 7: Small bias sample spaces, http://www.wisdom.weizmann.ac.il/~oded/PS/RND/l07.ps
- Joffe, Anatole (1974), "On a Set of Almost Deterministic k-Independent Random Variables", Annals of Probability 2 (1): 161–162, doi:10.1214/aop/1176996762
- Naor, Joseph; Naor, Moni (1990), "Small-bias probability spaces: Efficient constructions and applications", Proceedings of the twenty-second annual ACM symposium on Theory of computing - STOC '90, pp. 213–223, doi:10.1145/100216.100244, ISBN 978-0897913614, http://www.wisdom.weizmann.ac.il/~naor/PAPERS/bias_abs.html
- Ta-Shma, Amnon (2017), "Explicit, almost optimal, epsilon-balanced codes", Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pp. 238–251, doi:10.1145/3055399.3055408, ISBN 9781450345286
Original source: https://en.wikipedia.org/wiki/Small-bias_sample_space