Fibonacci coding

In mathematics and computing, Fibonacci coding is a universal code which encodes positive integers into binary code words. It is one example of a representation of integers based on Fibonacci numbers. Each code word ends with "11" and contains no other instances of "11" before the end.

The Fibonacci code is closely related to the Zeckendorf representation, a positional numeral system that uses Zeckendorf's theorem and has the property that no number has a representation with consecutive 1s. The Fibonacci code word for a particular integer is exactly the integer's Zeckendorf representation with the order of its digits reversed and an additional "1" appended to the end.

Definition

For a number [math]\displaystyle{ N\! }[/math], if [math]\displaystyle{ d(0),d(1),\ldots,d(k-1),d(k)\! }[/math] represent the digits of the code word representing [math]\displaystyle{ N\! }[/math] then we have:

[math]\displaystyle{ N = \sum_{i=0}^{k-1} d(i) F(i+2),\text{ and }d(k-1)=d(k)=1.\! }[/math]

where F(i) is the ith Fibonacci number, and so F(i+2) is the ith distinct Fibonacci number starting with [math]\displaystyle{ 1,2,3,5,8,13,\ldots }[/math]. The last bit [math]\displaystyle{ d(k) }[/math] is always an appended bit of 1 and does not carry place value.

It can be shown that such a coding is unique, and that the only occurrence of "11" in any code word is at the end, i.e. in d(k−1) and d(k). The penultimate bit is the most significant bit and the first bit is the least significant bit. Because the least significant bit comes first, leading zeros cannot be omitted as they can in, e.g., decimal numbers.

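The first few Fibonacci codes are shown below, together with their so-called implied probability: the probability [math]\displaystyle{ 2^{-\ell} }[/math] associated with a code word of [math]\displaystyle{ \ell }[/math] bits, i.e. the probability each number would need to have for its Fibonacci code word to be of minimum size.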

Symbol | Fibonacci representation | Fibonacci code word | Implied probability
1 | [math]\displaystyle{ F(2) }[/math] | 11 | 1/4
2 | [math]\displaystyle{ F(3) }[/math] | 011 | 1/8
3 | [math]\displaystyle{ F(4) }[/math] | 0011 | 1/16
4 | [math]\displaystyle{ F(2) + F(4) }[/math] | 1011 | 1/16
5 | [math]\displaystyle{ F(5) }[/math] | 00011 | 1/32
6 | [math]\displaystyle{ F(2) + F(5) }[/math] | 10011 | 1/32
7 | [math]\displaystyle{ F(3) + F(5) }[/math] | 01011 | 1/32
8 | [math]\displaystyle{ F(6) }[/math] | 000011 | 1/64
9 | [math]\displaystyle{ F(2) + F(6) }[/math] | 100011 | 1/64
10 | [math]\displaystyle{ F(3) + F(6) }[/math] | 010011 | 1/64
11 | [math]\displaystyle{ F(4) + F(6) }[/math] | 001011 | 1/64
12 | [math]\displaystyle{ F(2)+F(4)+F(6) }[/math] | 101011 | 1/64
13 | [math]\displaystyle{ F(7) }[/math] | 0000011 | 1/128
14 | [math]\displaystyle{ F(2) + F(7) }[/math] | 1000011 | 1/128
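The implied probabilities in the table all equal 2 raised to minus the code word length. The following minimal sketch reproduces them without building the code words, on the assumption that the length is one more than the count of Fibonacci numbers 1, 2, 3, 5, … not exceeding the symbol. The function names `code_length` and `implied_probability` are illustrative, not part of any standard library.

```python
from fractions import Fraction


def code_length(n: int) -> int:
    """Length in bits of the Fibonacci code word for a positive integer n."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers only")
    places = 0
    a, b = 1, 2  # F(2), F(3)
    while a <= n:
        places += 1  # one digit per Fibonacci number not exceeding n
        a, b = b, a + b
    return places + 1  # plus the appended terminating 1


def implied_probability(n: int) -> Fraction:
    """Probability 2**(-length) associated with the code word for n."""
    return Fraction(1, 2 ** code_length(n))


for symbol in (1, 2, 4, 7, 12, 14):
    print(symbol, code_length(symbol), implied_probability(symbol))
```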

To encode an integer N:

  1. Find the largest Fibonacci number equal to or less than N; subtract this number from N, keeping track of the remainder.
  2. If the number subtracted was the ith Fibonacci number F(i), put a 1 in place i−2 in the code word (counting the leftmost digit as place 0).
  3. Repeat the previous steps, substituting the remainder for N, until a remainder of 0 is reached. Every place that was not set to 1 holds a 0.
  4. Place an additional 1 after the rightmost digit in the code word.
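The encoding steps above can be sketched in Python as follows. This is one possible implementation, not a reference one; the name `fib_encode` is chosen here for illustration.

```python
def fib_encode(n: int) -> str:
    """Return the Fibonacci code word for a positive integer n."""
    if n < 1:
        raise ValueError("Fibonacci coding is defined for positive integers only")

    # Place values F(2), F(3), ... = 1, 2, 3, 5, ... that do not exceed n.
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs = [f for f in fibs if f <= n]

    # Greedy step: repeatedly subtract the largest Fibonacci number that fits.
    bits = ["0"] * len(fibs)
    remainder = n
    for place in range(len(fibs) - 1, -1, -1):
        if fibs[place] <= remainder:
            bits[place] = "1"
            remainder -= fibs[place]

    # The most significant digit is always 1, so appending 1 ends the word in "11".
    return "".join(bits) + "1"


print(fib_encode(4))
print(fib_encode(65))
```

Because the greedy step always takes the largest Fibonacci number that fits, two adjacent places are never both set, which is why "11" can only appear at the end.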

To decode a code word, remove the final "1", assign the values 1, 2, 3, 5, 8, 13, … (the Fibonacci numbers) to the remaining bits from left to right, and sum the values of the "1" bits.
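The decoding rule can be sketched in the same way. The validation at the top is an addition made here for safety and is not part of the rule as stated; `fib_decode` is an illustrative name.

```python
def fib_decode(code: str) -> int:
    """Return the positive integer represented by one Fibonacci code word."""
    # A valid code word is binary, ends with "11", and has no earlier "11".
    if (
        len(code) < 2
        or set(code) - {"0", "1"}
        or not code.endswith("11")
        or "11" in code[:-1]
    ):
        raise ValueError(f"not a valid Fibonacci code word: {code!r}")

    total = 0
    a, b = 1, 2  # place values F(2), F(3), ...
    for bit in code[:-1]:  # drop the appended terminating 1
        if bit == "1":
            total += a
        a, b = b, a + b
    return total


print(fib_decode("1011"))
print(fib_decode("0100100011"))
```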

Comparison with other universal codes

Fibonacci coding has a useful property that sometimes makes it attractive in comparison to other universal codes: it is an example of a self-synchronizing code, making it easier to recover data from a damaged stream. With most other universal codes, if a single bit is altered, none of the data that comes after it will be correctly read. With Fibonacci coding, on the other hand, a changed bit may cause one token to be read as two, or cause two tokens to be read incorrectly as one, but reading a "0" from the stream will stop the errors from propagating further. Since the only stream that has no "0" in it is a stream of "11" tokens, the total edit distance between a stream damaged by a single bit error and the original stream is at most three.
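A small experiment illustrates the resynchronization. The sketch below concatenates the code words for 3, 5, 7 and 2, flips one bit, and decodes both streams by cutting after every "11". The helper names are hypothetical, and the decoder deliberately performs no validation so that a damaged stream can still be read.

```python
def split_code_words(stream: str) -> list[str]:
    """Cut a bit stream into tokens, ending a token after each "11"."""
    tokens = []
    start = 0
    i = 0
    while i + 1 < len(stream):
        if stream[i] == "1" and stream[i + 1] == "1":
            tokens.append(stream[start:i + 2])
            start = i + 2
            i = start
        else:
            i += 1
    return tokens  # trailing bits without a terminator are dropped


def decode_word(word: str) -> int:
    """Sum the place values 1, 2, 3, 5, ... of the 1 bits, ignoring the last bit."""
    total, a, b = 0, 1, 2
    for bit in word[:-1]:
        if bit == "1":
            total += a
        a, b = b, a + b
    return total


def decode_stream(stream: str) -> list[int]:
    return [decode_word(word) for word in split_code_words(stream)]


# Code words for 3, 5, 7 and 2.
original = "0011" + "00011" + "01011" + "011"
# Simulated damage: flip the 0 at index 6, inside the second code word.
damaged = original[:6] + "1" + original[7:]

print(decode_stream(original))
print(decode_stream(damaged))
```

In this run the damaged second token is read as two tokens and the third token is misread, but the final token still decodes to 2, since the intervening "0" bits let the decoder find the next "11" boundary.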

This approach, encoding using a sequence of symbols in which some patterns (like "11") are forbidden, can be freely generalized.[1]

Example

The following table shows that the number 65 is represented in Fibonacci coding as 0100100011, since 65 = 2 + 8 + 55. The first two Fibonacci numbers (0 and 1) are not used, and an additional 1 is always appended.

[math]\displaystyle{ \begin{array}{ccccccccccc|c} \hline 0 & 1 & 1 & 2 & 3 & 5 & 8 & 13 & 21 & 34 & 55 & - \\ \hline F(0) & F(1) & F(2) & F(3) & F(4) & F(5) & F(6) & F(7) & F(8) & F(9) & F(10) & \scriptstyle\text{additional} \\ \hline - & - & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ \hline \end{array} }[/math]
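As a check against the definition, with k = 9 the only nonzero place-value digits are d(1), d(4) and d(8), and substituting them into the defining sum gives:

[math]\displaystyle{ 65 = d(1)F(3) + d(4)F(6) + d(8)F(10) = 2 + 8 + 55, \qquad d(8) = d(9) = 1. }[/math]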

Generalizations

The Fibonacci encodings for the positive integers are binary strings that end with "11" and contain no other instances of "11". This can be generalized to binary strings that end with N consecutive 1's and contain no other instances of N consecutive 1's. For instance, for N = 3 the positive integers are encoded as 111, 0111, 00111, 10111, 000111, 100111, 010111, 110111, 0000111, 1000111, 0100111, …. In this case, the number of encodings as a function of string length is given by the sequence of Tribonacci numbers.
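The N = 3 list above can be reproduced by brute force. The sketch below enumerates all binary strings of a given length and keeps those that end with N consecutive 1s and contain no earlier run of N consecutive 1s. The function name `order_n_code_words` is illustrative, and the enumeration is exponential in the length, so it is only meant for checking small cases.

```python
from itertools import product


def order_n_code_words(length: int, n: int = 3) -> list[str]:
    """All code words of the given length that end with n 1s and have no earlier run of n 1s."""
    run = "1" * n
    words = []
    for bits in product("01", repeat=length):
        # Reverse each tuple so that the leftmost bit varies fastest,
        # matching the order in which the code words are listed above.
        word = "".join(reversed(bits))
        if word.endswith(run) and run not in word[:-1]:
            words.append(word)
    return words


for length in range(3, 8):
    print(length, order_n_code_words(length))

# Number of code words per length, for lengths 3 to 8.
print([len(order_n_code_words(length)) for length in range(3, 9)])
```

With n = 2 the same function yields the ordinary Fibonacci code words of each length.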

For general constraints defining which symbols are allowed after a given symbol, the maximal information rate can be obtained by first finding the optimal transition probabilities using a maximal entropy random walk, and then using an entropy coder (with the roles of encoder and decoder switched) to encode a message as a sequence of symbols that follows those optimal transition probabilities.

References

  1. Duda, Jarek (2007). "Optimal encoding on discrete lattice with translational invariant constrains using statistical algorithms". arXiv:0710.3861 [cs.IT].

Further reading

  • Stakhov, A. P. (2009). The Mathematics of Harmony: From Euclid to Contemporary Mathematics and Computer Science. Singapore: World Scientific Publishing.