Quote notation

From HandWiki

Quote notation is a representation of the rational numbers based on Kurt Hensel's p-adic numbers. In quote notation, arithmetic operations take particularly simple, consistent forms, producing exact answers with no roundoff error. Quote notation’s arithmetic algorithms work in a right-to-left direction; addition, subtraction, and multiplication algorithms are the same as for natural numbers, and division is easier than the usual division algorithm. The notation was invented by Eric Hehner of the University of Toronto and Nigel Horspool, then at McGill University, and published in the SIAM Journal on Computing, v.8, n.2, May 1979, pp. 124–134.

Representation

Introduction

A standard representation of rational numbers begins with a sign (+ or – ; if no sign is written the implied sign is +) and continues with a (finite or infinite) sequence of digits with a radix point (called a decimal point in base ten) somewhere in the sequence. For examples,

–12.345
0.33333...

To make the representation finite, an overscore may be used over the repeating digits. Some examples are:

[math]\displaystyle{ -0.02\overline{34} }[/math]

It is also standard practice to leave negation and division operators in the number representation, without performing the negation or division. For example, –1/3 (minus one-third).

In quote notation, each rational number has a unique (up to normalization) finite representation as a sequence of digits with a radix point and a quote somewhere in the sequence. The quote means that the digits to its left are repeated infinitely to the left. For examples,

12'34.56 = ...12121234.56
12.34'56 = ...1234123412.3456
123!45 = ...123123123.45

An exclamation mark is used when the quote and point are in the same place. If the repeated sequence is all 0s, both the zeros and the quote can be omitted. The radix point has its usual function; moving it left divides by the base, and moving it right multiplies by the base. When the radix point is at the right end, the multiplicative factor is 1, and the point can be omitted. This gives the natural numbers their familiar form. Scientific notation may be used as an alternative to the radix point.

The interpretation of the leading repeating sequence is an extension of the sum of the geometric series:

[math]\displaystyle{ 1+10^k+10^{2k}+10^{3k}+... = \tfrac{1}{1-10^k} }[/math].

For instance:

[math]\displaystyle{ 1+10+100+1000+... = \tfrac{1}{1-10}=-\tfrac{1}{9} }[/math]

and

[math]\displaystyle{ 1+1000+1000^2+1000^3+... = \tfrac{1}{1-1000}=-\tfrac{1}{999} }[/math].

With this convention, numbers in quote notation are interpreted as:

3' = ...333 = 3(... + 100 + 10 + 1) = –3/9 = –1/3
123' =...123123123 = 123(...1000000 + 1000 + 1) = –123/999
123'45.6 = 45.6 + 123'00 = 45.6 + 100 × 123' = 45.6 – 12300/999

This leads to the rule:

abc...z' = – abc...z/999...9,

with the same number of 9's in the denominator as there are digits in the repeating part of the sequence. The general form in mathematical notation: the string

[math]\displaystyle{ d_{n+m}d_{n+m-1} \dotsb d_{n+1}^{\;\;\;\;\,\shortmid} d_nd_{n-1} \dotsb d_2d_1d_0 }[/math]

represents the number

[math]\displaystyle{ \sum_{i=0}^n d_i b^i - \sum_{i=n+1}^{n+m} d_i b^i / (b^m-1) }[/math]

where [math]\displaystyle{ b\in\Z\setminus \{\pm1,0\} }[/math] is the base of the representation. The [math]\displaystyle{ d_i \in \{0,1, \dotsc, b-1\} }[/math] are the digits.

Natural numbers

The natural numbers are generally written in the way one usually expects to see them, but they can also be written using an explicit quote, an explicit radix point, or redundant zeros on either end. For example, the integer two can be written as 2 or 2. or 0'2 or 0'2. or even 000'02.000, and the integer zero can be written as 0 or 0' or 0. or 0!.

Negative integers

Negative integers begin with the digit one less than the base. For example, in decimal, minus three is written as 9'7.

9'7 = 7 – 90/9 = –3

As

9' = – 9/9 = –1,

it is easily understood that for instance:

–189 = –1 × 189 = 9' × 189 = 1701 + 17010 + 170100 + ... = ...999811 = 9'811 = 811 – 1000

or alternatively, as:

9'000 = –1000,
–189 = 811 – 1000 = 811 + 9'000

Numbers beginning with any other repeating sequence are not integers. For example:

6'7 = 7 – 60/9 = 1/3

and

7'6 = 6 – 70/9 = – 16/9

Interpreting quote notation

Conversion algorithm

To convert quote notation into standard notation, the following algorithm can be used.

Let x and y be sequences of digits, as in [math]\displaystyle{ x'y }[/math].
Let z be the digit 1 followed by a sequence of zeros of the same length as y.
Let a be the largest valued digit (one less than the base). In decimal, we have a = 9.
Let w be a sequence of as of the same length as x.

Then the number represented by [math]\displaystyle{ x'y }[/math] is given by [math]\displaystyle{ y-xz/w }[/math].

As an example, we will take 12'345 and convert it to a standard notation.

x = 12
y = 345
z = 1000
a = 9
w = 99

Then our standard notation follows,

[math]\displaystyle{ 345 - \dfrac{12\times 1000}{99} = \dfrac{7385}{33}. }[/math]

Sign determination

If the leading digit is less than the first digit after the quote, the number is positive. For example, 123'45 is positive because 1 is less than 4. If the leading digit is more than the first digit after the quote, the number is negative. For example, 54'3 is negative because 5 is more than 3.

If the quote comes at the end, just append a zero after the radix point. For example, 592' = 592!0, which is negative because 5 is more than 0. And 59.2' = 59.2'0 which is also negative.

If the leading digit equals the first digit after the quote, then either the number is 0!0 = 0, or the representation can be shortened by rolling the repetition to the right. For example, 23'25 = 32'5 which is positive because 3 is less than 5.

In binary, if it starts with 1 it is negative, and if it starts with 0 it is nonnegative, assuming the repetition has been rolled to the right as far as possible.

Arithmetic

Addition

In our usual sign-and-magnitude notation, to add the two integers 25 and −37, one first compares signs, and determines that the addition will be performed by subtracting the magnitudes. Then one compares the magnitudes to determine which will be subtracted from which, and to determine the sign of the result. In our usual fraction notation, to add 2/3 + 4/5 requires finding a common denominator, multiplying each numerator by the new factors in this common denominator, then adding the numerators, then dividing the numerator and denominator by any factors they have in common.

In quote notation, to add, just add. There are no sign or magnitude comparisons, and no common denominators. Addition is the same as for natural numbers. Here are some examples.

  9'7 minus three              9'4 minus six
+ 0'6 add plus six          +  9'2 add minus eight
—————                        —————
  0'3 makes plus three       9'8 6 makes minus fourteen
  6'7 one-third
+ 7'6 add minus one and seven-ninths
—————
  4'3 makes minus one and four-ninths

Subtraction

In our usual sign-and-magnitude notation, subtraction involves sign comparison and magnitude comparison, and may require adding or subtracting the magnitudes, just like addition. In our usual fraction notation, subtraction requires finding a common denominator, multiplying, subtracting, and reducing to lowest terms, just like addition.

In quote notation, to subtract, just subtract. There are no sign or magnitude comparisons, and no common denominators. When a minuend digit is less than the corresponding subtrahend digit, do not borrow from the minuend digit to its left; instead, carry (add one) to the subtrahend digit to its left. Here are some examples.

  9'7 minus three              9'4 minus six
- 0'6 subtract plus six     -  9'2 subtract minus eight
—————                        —————
  9'1 makes minus nine         0'2 makes plus two
  6'7 one-third
- 7'6 subtract minus one and seven-ninths
—————
8'9 1 makes plus two and one-ninth

Multiplication

Multiplication is the same as for natural numbers. To recognize the repetition in the answer, it helps to add the partial results pairwise. Here are some examples.

6'7 x 0'3 = 0'1 one-third times three makes one
6'7 x 7'6 one-third times minus one and seven-ninths:
multiply 6'7 by 6:        0'2 answer digit 2
multiply 6'7 by 7:      6'9
              add:     ————
                        6'9 answer digit 9
multiply 6'7 by 7:    6'9
              add:   ————
                      3'5 answer digit 5
multiply 6'7 by 7:  6'9
              add: ————
                    0'2 repetition of original
makes 592' minus sixteen twenty-sevenths

To someone who is unfamiliar with quote notation, 592' is unfamiliar, and translation to −16/27 is helpful. To someone who normally uses quote notation, −16/27 is a formula with a negation and a division operation; performing those operations yields the answer 592'.

Division

The commonly used division algorithm produces digits from left to right, which is opposite to addition, subtraction, and multiplication. This makes further arithmetic difficult. For example, how do we add 1.234234234234... + 5.67676767...? Usually we use a finite number of digits and accept an approximate answer with roundoff error. The commonly used division algorithm also produces duplicate representations; for example, 0.499999... and 0.5 represent the same number. In decimal, there is a kind of guess for each digit, which is seen to be right or wrong as the calculation progresses.

In quote notation, division produces digits from right to left, the same as all other arithmetic algorithms; therefore further arithmetic is easy. Quote arithmetic is exact, with no error. Each rational number has a unique representation (if the repetition is expressed as simply as possible, and we have no meaningless 0s at the right end after a radix point). Each digit is determined by a "division table", which is the inverse of part of the multiplication table; there is no "guessing". Here is an example.

9'84 / 0'27 minus sixteen divided by twenty-seven
since 0'27 ends in 7 and 9'84 ends in 4, ask:
                          9'8 4 What times 7 ends in 4? It's 2
multiply 0'27 by 2:       0'5 4
          subtract:       —————
                          9'3   What times 7 ends in 3? It's 9.
multiply 0'27 by 9:   0'2 4 3
          subtract:   ———————
                      9'7 5   What times 7 ends in 5? It's 5.
multiply 0'27 by 5: 0'1 3 5
          subtract: ———————
                    9'8 4 repetition of original
makes 592' minus sixteen twenty-sevenths

Division works when the divisor and the base have no factors in common except 1. In the previous example, 27 has factors 1, 3, and 27. The base is 10, which has factors 1, 2, 5, and 10. So division worked. When there are factors in common, they must be removed. For example, to divide 4 by 15, first multiply both 4 and 15 by 2:

4/15 = 8/30

Any 0s at the end of the divisor just tell where the radix point goes in the result. So now divide 8 by 3.

                      0'8 What times 3 ends in 8? It's 6.
multiply 0'3 by 6:  0'1 8
         subtract:   ————
                      9' What times 3 ends in 9? It's 3.
multiply 0'3 by 3:  0'9
         subtract: ————
                    9' repetition of earlier difference
makes 3'6 two and two-thirds
Now move the decimal point one place left, to get
3!6 four-fifteenths

Removing common factors is annoying, and it is unnecessary if the base is a prime number. Computers use base 2, which is a prime number, so division always works. And the division tables are trivial. The only questions are: what times 1 ends in 0? and: what times 1 ends in 1? Thus the rightmost bits in the differences are the bits in the answer. For example, one divided by three, which is 1/11, proceeds as follows.

             0'1 rightmost bit is 1
  subtract 0'1 1
           —————
             1'  rightmost bit is 1
subtract 0'1 1
         —————
         1'0   rightmost bit is 0
subtract   0'
         ————
         1'   repetition of earlier difference
makes 01'1 one-third

Negation

To negate, complement each digit, and then add 1. For example, in decimal, to negate 12'345, complement and get 87'654, and then add 1 to get 87'655. In binary, flip the bits, then add 1 (same as 2's complement). For example, to negate 01'1, which is one-third, flip the bits to get 10'0, then add 1 to get 10'1, and roll right to shorten it to 01' which is minus one-third.

Comparison with other representations

There are two representations of the rational numbers in common use. One uses a sign (+ or  –), followed by a nonnegative integer (numerator), followed by a division symbol, followed by a positive integer (denominator). For example, –58/2975. (If no sign is written, the sign is +.) The other is a sign followed by a sequence of digits, with a radix point (called a decimal point in base ten) somewhere in the sequence, and an overscore over one or more of the rightmost digits. For example, [math]\displaystyle{ -0.02\overline{34} }[/math] . (There are alternative notations to the overscore; see Repeating decimal.) The overscore can be thought of as saying that the digits beneath it are repeated forever to the right. In the example, that's –0.023434343434... . Quote notation does not need a sign; it has a sequence of digits with a radix point somewhere in the sequence, and a quote somewhere in the sequence. For example, 4.3'2 . The quote can be thought of as saying that the digits to its left are repeated forever to the left. In the example, that's ...43434343434.32 . All three examples in this paragraph represent the same rational number.

[math]\displaystyle{ -58/2975 = -0.02\overline{34} = 4.3\text{'}2 }[/math]

The three representations can be compared in two ways: space required for storage, and time required for arithmetic operations.

Space

Quote notation and overscore notation require essentially the same space. Hehner and Horspool admit on p. 12: “But quote notation and numerator-denominator notation can differ greatly.”[Rem. 1] The very worst case occurs for some prime denominators (see Fermat's little theorem). For example, +1/7 = 285714'3 (in binary it is 011'1). To represent +1/947 in binary as a sign and numerator and denominator requires 12 bits, and as quote notation requires 947 bits. (Extra bits are required to delimit two variable-length numbers, but these are the same for all three representations, so ignoring them does not affect the comparison.) The number of places required to represent a repetend of the rational number [math]\displaystyle{ n/d }[/math] with [math]\displaystyle{ \gcd(d,b)=1 }[/math] in a base b quote notation is [math]\displaystyle{ \operatorname{ord}_d(b) = \min\{k \; \colon \, b^k \equiv 1 \text{ mod } d \} }[/math] whose maximum across all bases b is the exponent of the multiplicative group of integers modulo d which is given by the Carmichael function [math]\displaystyle{ \lambda(d) }[/math].

The performance of computer algorithms is measured in terms of the length of the input. The length of a rational number [math]\displaystyle{ n/d }[/math] in numerator-denominator notation is essentially the sum of the logarithms of the two numbers, so it is in [math]\displaystyle{ \mathcal{O}(\log(n) + \log(d)) . }[/math] The length of a rational number [math]\displaystyle{ n/d }[/math] in quote notation is the sum of the logarithm of the numerator [math]\displaystyle{ \mathcal{O}(\log(n)) }[/math] and the length [math]\displaystyle{ \mathcal{O}(d) }[/math] of the repetend, so it is in [math]\displaystyle{ \mathcal{O}(\log(n) + d) }[/math] and thus exponential in the length of the input.

Hehner and Horspool on p. 8:

“The 180,000 shortest numerator-denominator representations require 15.65 bits on average, and those same numbers in quote notation require 39.48 bits on average. Taking the shortest numerator-denominator numbers, and then translating those numbers to quote notation, results in a biased comparison in favor of numerator-denominator. If we take all binary quote representations up to and including 14 bits (all quote positions and all radix point positions), then discard those that are not normalized, we have 1,551,567 numbers requiring 13.26 bits on average. If we translate them to numerator-denominator notation,[Rem. 2] then normalize the result by removing common factors, they require 26.48 bits on average. This comparison is biased in favor of quote notation. An unbiased comparison is difficult to design.”

... and even more difficult to prove. Indeed, extrapolating a finite sample to an infinite set has only a limited mathematical significance.

On the other hand, Hehner and Horspool describe quote notation as attractive for use in computer hardware (p. 1). The hardware instructions of many computers operate on relatively small chunks of memory of fixed length, such as word (32 bits), double word (64 bits), 128-bit word, 256-bit word. There are only a few processors that do operate on 512-bit data.[Rem. 3]

The following table shows denominators, where the quote notation of a corresponding fraction [math]\displaystyle{ 1/d }[/math] does not fit into a binary integer of size 32, 64, 128, 256, and 512 bits respectively. Given are the smallest 20 denominators d for each chunk size, their prime factors, the length [math]\displaystyle{ \mathrm{ord}_d(2) }[/math] of the repetend of the fraction [math]\displaystyle{ 1/d }[/math], and the Carmichael value [math]\displaystyle{ \lambda(d) . }[/math]

[math]\displaystyle{ \mathrm{ord}_d(2) \gt 32 }[/math]
d factors ord λ
37 37 36 36
53 53 52 52
59 59 58 58
61 61 60 60
67 67 66 66
71 71 35 70
74 2·37 36 36
79 79 39 78
81 34 54 54
83 83 82 82
95 5·19 36 36
97 97 48 96
101 101 100 100
103 103 51 102
106 2·53 52 52
107 107 106 106
109 109 36 108
111 3·37 36 36
115 5·23 44 44
118 2·59 58 58
[math]\displaystyle{ \mathrm{ord}_d(2) \gt 64 }[/math]
d factors ord λ
67 67 66 66
83 83 82 82
101 101 100 100
107 107 106 106
121 112 110 110
125 53 100 100
131 131 130 130
134 2·67 66 66
137 137 68 136
139 139 138 138
149 149 148 148
163 163 162 162
166 2·83 82 82
167 167 83 166
169 132 156 156
173 173 172 172
179 179 178 178
181 181 180 180
191 191 95 190
193 193 96 192
[math]\displaystyle{ \mathrm{ord}_d(2) \gt 128 }[/math]
d factors ord λ
131 131 130 130
139 139 138 138
149 149 148 148
163 163 162 162
169 132 156 156
173 173 172 172
179 179 178 178
181 181 180 180
197 197 196 196
211 211 210 210
227 227 226 226
243 35 162 162
262 2·131 130 130
263 263 131 262
269 269 268 268
271 271 135 270
278 2·139 138 138
289 172 136 272
293 293 292 292
298 2·149 148 148
[math]\displaystyle{ \mathrm{ord}_d(2) \gt 256 }[/math]
d factors ord λ
269 269 268 268
293 293 292 292
317 317 316 316
347 347 346 346
349 349 348 348
361 192 342 342
373 373 372 372
379 379 378 378
389 389 388 388
419 419 418 418
421 421 420 420
443 443 442 442
461 461 460 460
467 467 466 466
491 491 490 490
509 509 508 508
521 521 260 520
523 523 522 522
538 2·269 268 268
541 541 540 540
[math]\displaystyle{ \mathrm{ord}_d(2) \gt 512 }[/math]
d factors ord λ
523 523 522 522
541 541 540 540
547 547 546 546
557 557 556 556
563 563 562 562
587 587 586 586
613 613 612 612
619 619 618 618
653 653 652 652
659 659 658 658
661 661 660 660
677 677 676 676
701 701 700 700
709 709 708 708
757 757 756 756
773 773 772 772
787 787 786 786
797 797 796 796
821 821 820 820
827 827 826 826

The table shows that quote notation is capable to handle only quite small denominators, even with the largest chunk sizes available up to now.

In addition, Hehner and Horspool try to play down worst-case analysis, but already these small tables above show that cases which are unfavorable for their thesis are quite frequent: the 20 smallest numbers blasting the chunks make up around 10% in a range of about 200.

This frequency correlates well with the theorems of Paul Erdős and others which show the asymptotically exponential behaviour of λ (see the sections Carmichael function, Carmichael function, Carmichael function, and Carmichael function).

Time

To add two numbers in numerator-denominator notation, for example (+a/b) + (–c/d), requires the following steps:

• sign comparison to determine if we will be adding or subtracting; in our example, the signs differ so we will be subtracting

• then 3 multiplications; in our example, a×d, b×c, b×d

• then, if we are subtracting, a comparison of a×d to b×c to determine which is subtrahend and which is minuend, and what is the sign of the result; let's say a×d < b×c so the sign will be –

• then the addition or subtraction; b×ca×d and we have –(b×ca×d)/(b×d)

• finding the greatest common divisor of the new numerator and denominator

• dividing numerator and denominator by their greatest common divisor to obtain a normalized result

Normalizing the result is not necessary for correctness, but without it, the space requirements quickly grow during a sequence of operations. Subtraction is almost identical to addition.

Adding two numbers in overscore notation is problematic because there is no right end to start at. The easiest way to do the addition is to translate the numbers to quote notation, then add, then translate back. Likewise for subtraction.

To add two numbers in quote notation, just add them the same way you add two positive integers. The repetition is recognized when the repeating parts of the two operands return to their starting digits. Then the result may be roll-normalized by checking whether the first digit equals the first digit after the quote. Likewise for subtraction. For both addition and subtraction, quote notation is superior to the other two notations.

Multiplication in numerator-denominator notation is two integer multiplications, finding a greatest common divisor, and then two divisions. Multiplication in overscore notation is problematic for the same reason that addition is. Multiplication in quote notation proceeds exactly like positive integer multiplication, comparing each new sum to previous sums in order to detect the repetition. For multiplication, quote notation is superior to overscore notation, and may be slightly better than numerator-denominator notation.

Division in numerator-denominator notation has the same complexity as multiplication in numerator-denominator notation. Division in overscore notation is problematic because it requires a sequence of subtractions, which are problematic in overscore notation. Division in quote notation proceeds just like multiplication in quote notation, producing the answer digits from right to left, each one determined by the rightmost digit of the current difference and divisor (trivial in binary). For division, quote notation is superior to both overscore and numerator-denominator notations.

Drawbacks

Cost

It should, however, not be suppressed that the worst case cost in space (and for some operations also the cost in time) of the described quote notation is [math]\displaystyle{ \mathcal{O}(d) }[/math][Rem. 4] for a rational number with denominator [math]\displaystyle{ d\in\N }[/math] — compared to [math]\displaystyle{ \mathcal{O}(\log d) }[/math] of the numerator-denominator representation, a fact which makes quote notation unsuited as a means for the exact handling of rational numbers of arbitrary size, e.g. in a computer algebra package.

Examples
[math]\displaystyle{ d = 19: }[/math]
−1/19 =  052631578947368421!
−2/19 =  105263157894736842!
[−1/10011]2 = [000011010111100101!]2
[−10/10011]2 = [000110101111001010!]2
This means: [math]\displaystyle{ d-1=18 }[/math] decimals/duals in quote notation correspond to 3 resp. 7 decimals/duals in numerator-denominator notation.
[math]\displaystyle{ d = 59: }[/math]
−1/59 =  0169491525423728813559322033898305084745762711864406779661!
−2/59 =  0338983050847457627118644067796610169491525423728813559322!
[−1/111011]2 = [0000010001010110110001111001011111011101010010011100001101!]2
[−10/111011]2 = [0000100010101101100011110010111110111010100100111000011010!]2
This means: [math]\displaystyle{ d-1=58 }[/math] decimals/duals in quote notation correspond to 3 resp. 8 decimals/duals in numerator-denominator notation.
Remark: The sequence of decimals/duals of a representation of numerator 2 is a rotated shift of the representation of numerator 1.

Rounding by truncation

Truncation on the left cannot be used for rounding purposes in the quote notation system. The authors do not provide approximate versions of the addition, subtraction, multiplication, and division operators, instead they propose conversion to the overscore notation and then truncation on the right.

This means that the operations have to be expanded to the full repeating group and then be converted, which in the light of section #Cost does not appear to be a practicable proposal.

Zero-divisors

If the base [math]\displaystyle{ d=e \cdot f }[/math] is composite, the ring [math]\displaystyle{ \Z_d }[/math] contains zero-divisors. Let's assume [math]\displaystyle{ d=10=2\cdot 5 }[/math]. Because [math]\displaystyle{ \Q_2 \cap \Q_5 = \Q }[/math], no rational in [math]\displaystyle{ \Z_{10} }[/math] is a zero-divisor. But there exist (non-rational) numbers [math]\displaystyle{ 1_2, 1_5 \in \Z_{10} }[/math] which are [math]\displaystyle{ 1_2 \equiv_{\Z_2} 1 }[/math] and [math]\displaystyle{ 1_5 \equiv_{\Z_5} 1 }[/math], but the product is [math]\displaystyle{ 1_2 \cdot 1_5 =_{\Z_{10}} 0 }[/math].

Remarks

  1. The same is true for every place-value notation, independent of the base chosen.
  2. So, by translating forth and back they try to give the impression of presenting an objective assessment.
  3. Up to now, every support for variable length objects is done in software – and not in hardware. This is most probably because
    1. the involved degree of complication has not been managed yet,
    2. at least for the proposed objects,
    3. the gain of a hardware solution is too small compared to software,
    4. or the selling point is too low.
  4. The source does not really address the [math]\displaystyle{ \mathcal{O}(d) }[/math]-problem: “But quote notation and numerator-denominator notation can differ greatly.” and mention [math]\displaystyle{ d = 947 }[/math] which requires 946 bits in the repeating group. But there are infinitely many such denominators, all of them have a relatively big totient function, e. g. [math]\displaystyle{ d = 802787 }[/math] with [math]\displaystyle{ \varphi(d) = 802786 }[/math].
    In their 3rd "Appendix added later" they add some considerations to [math]\displaystyle{ d = 59 }[/math].

References