Shannon–Fano–Elias coding
In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.[1]
Algorithm description
Given a discrete random variable X of ordered values to be encoded, let [math]\displaystyle{ p(x) }[/math] be the probability of any x in X. Define the modified cumulative distribution function
- [math]\displaystyle{ \bar F(x) = \sum_{x_i \lt x}p(x_i) + \frac 12 p(x) }[/math]
Algorithm:
- For each x in X,
- Let Z be the binary expansion of [math]\displaystyle{ \bar F(x) }[/math].
- Choose the length of the encoding of x, [math]\displaystyle{ L(x) }[/math], to be the integer [math]\displaystyle{ \left\lceil \log_2 \frac {1}{p(x)} \right\rceil + 1 }[/math]
- Choose the encoding of x, [math]\displaystyle{ code(x) }[/math], to be the first [math]\displaystyle{ L(x) }[/math] most significant bits after the binary point of Z.
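The steps above translate directly into a short program. The following Python sketch is not part of the original article; the function name sfe_code and its interface are illustrative. It computes [math]\displaystyle{ \bar F(x) }[/math] in exact rational arithmetic and truncates its binary expansion to [math]\displaystyle{ L(x) }[/math] bits:

```python
from fractions import Fraction
from math import ceil, log2

def sfe_code(probs):
    """Shannon-Fano-Elias encoding: map each symbol to its codeword.

    probs: dict of symbol -> probability, iterated in the symbols'
    fixed order; the probabilities should sum to 1.
    """
    codes = {}
    cum = Fraction(0)                       # sum of p(x_i) for all x_i < x
    for x, p in probs.items():
        p = Fraction(p)
        f_bar = cum + p / 2                 # modified CDF, F-bar(x)
        length = ceil(log2(1 / p)) + 1      # L(x) = ceil(log2 1/p(x)) + 1
        bits, frac = [], f_bar
        for _ in range(length):             # take the first L(x) bits of the
            frac *= 2                       # binary expansion of F-bar(x)
            bits.append("1" if frac >= 1 else "0")
            if frac >= 1:
                frac -= 1
        codes[x] = "".join(bits)
        cum += p
    return codes
```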
Example
Let X = {A, B, C, D}, with probabilities p = {1/3, 1/4, 1/6, 1/4}.
- For A
- [math]\displaystyle{ \bar F(A) = \frac 12 p(A) = \frac 12 \cdot \frac 13 = 0.1666\ldots }[/math]
- In binary, Z(A) = 0.0010101010...
- [math]\displaystyle{ L(A) = \left\lceil \log_2 \frac{1}{1/3} \right\rceil + 1 = \mathbf 3 }[/math]
- code(A) is 001
- For B
- [math]\displaystyle{ \bar F(B) = p(A) + \frac 12 p(B) = \frac 13 + \frac 12 \cdot \frac 14 = 0.4583333\ldots }[/math]
- In binary, Z(B) = 0.01110101010101...
- [math]\displaystyle{ L(B) = \left\lceil \log_2 \frac{1}{1/4} \right\rceil + 1 = \mathbf 3 }[/math]
- code(B) is 011
- For C
- [math]\displaystyle{ \bar F(C) = p(A) + p(B) + \frac 12 p(C) = \frac 13 + \frac 14 + \frac 12 \cdot \frac 16 = 0.66666\ldots }[/math]
- In binary, Z(C) = 0.101010101010...
- [math]\displaystyle{ L(C) = \left\lceil \log_2 \frac{1}{1/6} \right\rceil + 1 = \mathbf 4 }[/math]
- code(C) is 1010
- For D
- [math]\displaystyle{ \bar F(D) = p(A) + p(B) + p(C) + \frac 12 p(D) = \frac 13 + \frac 14 + \frac 16 + \frac 12 \cdot \frac 14 = 0.875 }[/math]
- In binary, Z(D) = 0.111
- [math]\displaystyle{ L(D) = \left\lceil \log_2 \frac{1}{1/4} \right\rceil + 1 = \mathbf 3 }[/math]
- code(D) is 111
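As a check, running the sfe_code sketch from the algorithm section on this alphabet (with the symbol order A, B, C, D used above) reproduces the codewords just derived:

```python
from fractions import Fraction

codes = sfe_code({"A": Fraction(1, 3), "B": Fraction(1, 4),
                  "C": Fraction(1, 6), "D": Fraction(1, 4)})
print(codes)  # {'A': '001', 'B': '011', 'C': '1010', 'D': '111'}
```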
Algorithm analysis
Prefix code
Shannon–Fano–Elias coding produces a binary prefix code, allowing for direct decoding.
Let bcode(x) be the rational number formed by placing a binary point before a binary code. For example, if code(C) = 1010 then bcode(C) = 0.1010. For all x, if no y ≠ x exists such that
- [math]\displaystyle{ \operatorname{bcode}(x) \le \operatorname{bcode}(y) \lt \operatorname{bcode}(x) + 2^{-L(x)} }[/math]
then all the codes form a prefix code.
This property can be demonstrated by comparing [math]\displaystyle{ \bar F }[/math] to the CDF of X, as follows.
By definition of L, [math]\displaystyle{ L(x) \ge \log_2 \frac{1}{p(x)} + 1 }[/math], so it follows that
- [math]\displaystyle{ 2^{-L(x)} \le \frac 12 p(x) }[/math]
And because the bits after position L(y) are truncated from [math]\displaystyle{ \bar F(y) }[/math] to form bcode(y), it follows that
- [math]\displaystyle{ \bar F(y) - \operatorname{bcode}(y) \le 2^{-L(y)} }[/math]
thus, for any y that follows x in the ordering, the two inequalities above give [math]\displaystyle{ \operatorname{bcode}(y) \ge \bar F(y) - \frac 12 p(y) \ge \bar F(x) + \frac 12 p(x) }[/math]; that is, bcode(y) is no less than CDF(x).
Since truncation also gives [math]\displaystyle{ \operatorname{bcode}(x) \le \bar F(x) }[/math], combining these bounds yields [math]\displaystyle{ \operatorname{bcode}(y) - \operatorname{bcode}(x) \ge \frac 12 p(x) \ge 2^{-L(x)} }[/math], therefore the prefix property holds.
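The prefix property can also be checked mechanically for any finite code: no codeword may begin with a different codeword. A minimal sketch using the codewords from the example:

```python
codes = ["001", "011", "1010", "111"]
# A code is prefix-free iff no codeword starts with a different codeword.
assert not any(a != b and b.startswith(a) for a in codes for b in codes)
```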
Code length
The average code length is
[math]\displaystyle{ LC(X) = \sum_{x \in X}p(x)L(x) = \sum_{x \in X}p(x) \left(\left\lceil \log_2 \frac {1}{p(x)} \right\rceil + 1\right) }[/math].
Thus, since [math]\displaystyle{ \log_2 \frac{1}{p(x)} \le \left\lceil \log_2 \frac{1}{p(x)} \right\rceil \lt \log_2 \frac{1}{p(x)} + 1 }[/math], for H(X), the entropy of the random variable X,
- [math]\displaystyle{ H(X) + 1 \le LC(X) \lt H(X) + 2 }[/math]
Shannon–Fano–Elias coding thus uses between 1 and 2 extra bits per symbol from X beyond the entropy, so the code is not used in practice.
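For the running example these bounds are easy to verify numerically; the sketch below computes H(X) and LC(X) directly from their definitions:

```python
from math import ceil, log2

p = [1/3, 1/4, 1/6, 1/4]
H = sum(px * log2(1 / px) for px in p)                # entropy, ~1.96 bits
LC = sum(px * (ceil(log2(1 / px)) + 1) for px in p)   # average length, ~3.17
assert H + 1 <= LC < H + 2                            # the bound above holds
```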
References
- 1. T. M. Cover and Joy A. Thomas (2006). Elements of Information Theory (2nd ed.). John Wiley and Sons. pp. 127–128. ISBN 978-0-471-24195-9. https://books.google.com/books?id=0QuawYmc2pIC&pg=PA127.
Original source: https://en.wikipedia.org/wiki/Shannon–Fano–Elias_coding