CORDIC

Short description: Algorithm for computing trigonometric, hyperbolic, logarithmic and exponential functions


CORDIC (for "coordinate rotation digital computer"), also known as Volder's algorithm or the digit-by-digit method, and in its variants as Circular CORDIC (Jack E. Volder),[1][2] Linear CORDIC, Hyperbolic CORDIC (John Stephen Walther),[3][4] and Generalized Hyperbolic CORDIC (GH CORDIC) (Yuanyong Luo et al.),[5][6] is a simple and efficient algorithm to calculate trigonometric functions, hyperbolic functions, square roots, multiplications, divisions, and exponentials and logarithms with arbitrary base, typically converging with one digit (or bit) per iteration. CORDIC is therefore also an example of a digit-by-digit algorithm. CORDIC and the closely related methods known as pseudo-multiplication and pseudo-division or factor combining are commonly used when no hardware multiplier is available (e.g. in simple microcontrollers and FPGAs), as the only operations they require are additions, subtractions, bit shifts and table lookups. As such, they all belong to the class of shift-and-add algorithms. In computer science, CORDIC is often used to implement floating-point arithmetic when the target platform lacks a hardware multiplier for cost or space reasons.

History

Similar mathematical techniques were published by Henry Briggs as early as 1624[7][8] and Robert Flower in 1771,[9] but CORDIC is better optimized for low-complexity finite-state CPUs.

CORDIC was conceived in 1956[10][11] by Jack E. Volder at the aeroelectronics department of Convair out of necessity to replace the analog resolver in the B-58 bomber's navigation computer with a more accurate and faster real-time digital solution.[11] Therefore, CORDIC is sometimes referred to as a digital resolver.[12][13]

In his research Volder was inspired by a formula in the 1946 edition of the CRC Handbook of Chemistry and Physics:[11]

[math]\displaystyle{ \begin{align} K_n R \sin(\theta \pm \varphi) &= R \sin(\theta) \pm 2^{-n} R \cos(\theta), \\ K_n R \cos(\theta \pm \varphi) &= R \cos(\theta) \mp 2^{-n} R \sin(\theta), \\ \end{align} }[/math]

where [math]\displaystyle{ \varphi }[/math] is such that [math]\displaystyle{ \tan(\varphi) = 2^{-n} }[/math], and [math]\displaystyle{ K_n := \sqrt{1 + 2^{-2n}} }[/math].
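The identity can be checked numerically. The following is a minimal sketch for the "+" branch only; the function name and the sample values of R, θ and n are assumptions made for illustration and do not come from Volder's report.

```python
from math import atan, cos, sin, sqrt

def volder_identity_residuals(R, theta, n):
    """Return (sine residual, cosine residual) for the '+' branch of the identity."""
    phi = atan(2.0 ** -n)                  # chosen so that tan(phi) = 2**-n
    K_n = sqrt(1.0 + 2.0 ** (-2 * n))      # K_n = sqrt(1 + 2**(-2n))
    lhs_sin = K_n * R * sin(theta + phi)
    rhs_sin = R * sin(theta) + 2.0 ** -n * R * cos(theta)
    lhs_cos = K_n * R * cos(theta + phi)
    rhs_cos = R * cos(theta) - 2.0 ** -n * R * sin(theta)
    return lhs_sin - rhs_sin, lhs_cos - rhs_cos

# Both residuals should be at the level of floating-point rounding.
print(volder_identity_residuals(R=2.5, theta=0.3, n=3))
```

The right-hand sides need only a shift by n bits and one addition or subtraction, which is what makes the formula attractive for hardware.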

His research led to an internal technical report proposing the CORDIC algorithm to solve sine and cosine functions and a prototypical computer implementing it.[10][11] The report also discussed the possibility to compute hyperbolic coordinate rotation, logarithms and exponential functions with modified CORDIC algorithms.[10][11] Utilizing CORDIC for multiplication and division was also conceived at this time.[11] Based on the CORDIC principle, Dan H. Daggett, a colleague of Volder at Convair, developed conversion algorithms between binary and binary-coded decimal (BCD).[11][14]

In 1958, Convair finally started to build a demonstration system to solve radar fix-taking problems named CORDIC I, completed in 1960 without Volder, who had left the company already.[1][11] More universal CORDIC II models A (stationary) and B (airborne) were built and tested by Daggett and Harry Schuss in 1962.[11][15]

Volder's CORDIC algorithm was first described in public in 1959,[1][2][11][13][16] which caused it to be incorporated into navigation computers by companies including Martin-Orlando, Computer Control, Litton, Kearfott, Lear-Siegler, Sperry, Raytheon, and Collins Radio.[11]

Volder teamed up with Malcolm McMillan to build Athena, a fixed-point desktop calculator utilizing his binary CORDIC algorithm.[17] The design was introduced to Hewlett-Packard in June 1965, but not accepted.[17] Still, McMillan introduced David S. Cochran (HP) to Volder's algorithm and when Cochran later met Volder he referred him to a similar approach John E. Meggitt (IBM[18]) had proposed as pseudo-multiplication and pseudo-division in 1961.[18][19] Meggitt's method also suggested the use of base 10[18] rather than base 2, as used by Volder's CORDIC so far. These efforts led to the ROMable logic implementation of a decimal CORDIC prototype machine inside of Hewlett-Packard in 1966,[20][19] built by and conceptually derived from Thomas E. Osborne's prototypical Green Machine, a four-function, floating-point desktop calculator he had completed in DTL logic[17] in December 1964.[21] This project resulted in the public demonstration of Hewlett-Packard's first desktop calculator with scientific functions, the HP 9100A in March 1968, with series production starting later that year.[17][21][22][23]

When Wang Laboratories found that the HP 9100A used an approach similar to the factor combining method in their earlier LOCI-1[24] (September 1964) and LOCI-2 (January 1965)[25][26] Logarithmic Computing Instrument desktop calculators,[27] they unsuccessfully accused Hewlett-Packard of infringement of one of An Wang's patents in 1968.[19][28][29][30]

John Stephen Walther at Hewlett-Packard generalized the algorithm into the Unified CORDIC algorithm in 1971, allowing it to calculate hyperbolic functions, natural exponentials, natural logarithms, multiplications, divisions, and square roots.[31][3][4][32] The CORDIC subroutines for trigonometric and hyperbolic functions could share most of their code.[28] This development resulted in the first scientific handheld calculator, the HP-35 in 1972.[28][33][34][35][36][37] Based on hyperbolic CORDIC, Yuanyong Luo et al. further proposed a Generalized Hyperbolic CORDIC (GH CORDIC) to directly compute logarithms and exponentials with an arbitrary fixed base in 2019.[5][6][38][39][40] Theoretically, Hyperbolic CORDIC is a special case of GH CORDIC.[5]

Originally, CORDIC was implemented only using the binary numeral system and despite Meggitt suggesting the use of the decimal system for his pseudo-multiplication approach, decimal CORDIC continued to remain mostly unheard of for several more years, so that Hermann Schmid and Anthony Bogacki still suggested it as a novelty as late as 1973[16][13][41][42][43] and it was found only later that Hewlett-Packard had implemented it in 1966 already.[11][13][20][28]

Decimal CORDIC became widely used in pocket calculators,[13] most of which operate in binary-coded decimal (BCD) rather than binary. This change in the input and output format did not alter CORDIC's core calculation algorithms. CORDIC is particularly well-suited for handheld calculators, in which low cost – and thus low chip gate count – is much more important than speed.

CORDIC has been implemented in the ARM-based STM32G4, Intel 8087,[43][44][45][46][47] 80287,[47][48] 80387[47][48] up to the 80486[43] coprocessor series as well as in the Motorola 68881[43][44] and 68882 for some kinds of floating-point instructions, mainly as a way to reduce the gate counts (and complexity) of the FPU sub-system.

Applications

CORDIC uses simple shift-add operations for several computing tasks such as the calculation of trigonometric, hyperbolic and logarithmic functions, real and complex multiplications, division, square-root calculation, solution of linear systems, eigenvalue estimation, singular value decomposition, QR factorization and many others. As a consequence, CORDIC has been used for applications in diverse areas such as signal and image processing, communication systems, robotics and 3D graphics apart from general scientific and technical computation.[49][50]

Hardware

The algorithm was used in the navigational system of the Apollo program's Lunar Roving Vehicle to compute bearing and range, or distance from the Lunar module.[51][52] CORDIC was used to implement the Intel 8087 math coprocessor in 1980, avoiding the need to implement hardware multiplication.[53]

CORDIC is generally faster than other approaches when a hardware multiplier is not available (e.g., a microcontroller), or when the number of gates required to implement the functions it supports should be minimized (e.g., in an FPGA or ASIC). In fact, CORDIC is a standard drop-in IP in FPGA development applications such as Vivado for Xilinx, while a power series implementation is not, because such an IP would be too specific: CORDIC can compute many different functions (general purpose), while a hardware multiplier configured to execute a power series implementation can only compute the function it was designed for.

On the other hand, when a hardware multiplier is available (e.g., in a DSP microprocessor), table-lookup methods and power series are generally faster than CORDIC. In recent years, the CORDIC algorithm has been used extensively for various biomedical applications, especially in FPGA implementations[citation needed].

The STM32G4 series and certain STM32H7 series of MCUs implement a CORDIC module to accelerate computations in various mixed signal applications such as graphics for human-machine interface and field oriented control of motors. While not as fast as a power series approximation, CORDIC is indeed faster than interpolating table based implementations such as the ones provided by the ARM CMSIS and C standard libraries,[54] though the results may be slightly less accurate, as the CORDIC modules provided only achieve 20 bits of precision in the result. For example, most of the performance difference compared to the ARM implementation is due to the overhead of the interpolation algorithm, which achieves full floating point precision (24 bits) and can likely achieve relative error to that precision.[55] Another benefit is that the CORDIC module is a coprocessor and can be run in parallel with other CPU tasks.

The issue with using Taylor series is that while they do provide small absolute error, they do not exhibit well behaved relative error.[56] Other means of polynomial approximation, such as minimax optimization, may be used to control both kinds of error.

Software

Many older systems with integer-only CPUs have implemented CORDIC to varying extents as part of their IEEE floating-point libraries. As most modern general-purpose CPUs have floating-point registers with common operations such as add, subtract, multiply, divide, sine, cosine, square root, log10, natural log, the need to implement CORDIC in them with software is nearly non-existent. Only microcontroller or special safety and time-constrained software applications would need to consider using CORDIC.

Modes of operation

Rotation mode

CORDIC can be used to calculate a number of different functions. This explanation shows how to use CORDIC in rotation mode to calculate the sine and cosine of an angle, assuming that the desired angle is given in radians and represented in a fixed-point format. To determine the sine or cosine for an angle [math]\displaystyle{ \beta }[/math], the y or x coordinate of a point on the unit circle corresponding to the desired angle must be found. Using CORDIC, one would start with the vector [math]\displaystyle{ v_0 }[/math]:

[math]\displaystyle{ v_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. }[/math]

Figure: An illustration of the CORDIC algorithm in progress.

In the first iteration, this vector is rotated 45° counterclockwise to get the vector [math]\displaystyle{ v_1 }[/math]. Successive iterations rotate the vector in one or the other direction by size-decreasing steps, until the desired angle has been achieved. Each step angle is [math]\displaystyle{ \gamma_i = \arctan{(2^{-i})} }[/math] for [math]\displaystyle{ i = 0, 1, 2, \dots }[/math].

More formally, every iteration calculates a rotation, which is performed by multiplying the vector [math]\displaystyle{ v_i }[/math] with the rotation matrix [math]\displaystyle{ R_{i} }[/math]:

[math]\displaystyle{ v_{i+1} = R_i v_i. }[/math]

The rotation matrix is given by

[math]\displaystyle{ R_i = \begin{bmatrix} \cos(\gamma_i) & -\sin(\gamma_i) \\ \sin(\gamma_i) & \cos(\gamma_i) \end{bmatrix}. }[/math]

Using the following two trigonometric identities:

[math]\displaystyle{ \begin{align} \cos(\gamma_i) &= \frac{1}{\sqrt{1 + \tan^2(\gamma_i)}}, \\ \sin(\gamma_i) &= \frac{\tan(\gamma_i)}{\sqrt{1 + \tan^2(\gamma_i)}}, \end{align} }[/math]

the rotation matrix becomes

[math]\displaystyle{ R_i = \frac{1}{\sqrt{1 + \tan^2(\gamma_i)}} \begin{bmatrix} 1 & -\tan(\gamma_i) \\ \tan(\gamma_i) & 1 \end{bmatrix}. }[/math]

The expression for the rotated vector [math]\displaystyle{ v_{i+1} = R_i v_i }[/math] then becomes

[math]\displaystyle{ \begin{bmatrix} x_{i+1} \\ y_{i+1} \end{bmatrix} = \frac{1}{\sqrt{1 + \tan^2(\gamma_i)}} \begin{bmatrix} 1 & -\tan(\gamma_i) \\ \tan(\gamma_i) & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix}, }[/math]

where [math]\displaystyle{ x_i }[/math] and [math]\displaystyle{ y_i }[/math] are the components of [math]\displaystyle{ v_i }[/math]. Restricting the angles [math]\displaystyle{ \gamma_i }[/math] such that [math]\displaystyle{ \tan(\gamma_i) = \pm 2^{-i} }[/math], the multiplication with the tangent can be replaced by a division by a power of two, which is efficiently done in digital computer hardware using a bit shift. The expression then becomes

[math]\displaystyle{ \begin{bmatrix} x_{i+1} \\ y_{i+1} \end{bmatrix} = K_i \begin{bmatrix} 1 & -\sigma_i 2^{-i} \\ \sigma_i 2^{-i} & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix}, }[/math]

where

[math]\displaystyle{ K_i = \frac{1}{\sqrt{1 + 2^{-2i}}}, }[/math]

and [math]\displaystyle{ \sigma_i }[/math] is used to determine the direction of the rotation: if the angle [math]\displaystyle{ \gamma_i }[/math] is positive, then [math]\displaystyle{ \sigma_i }[/math] is +1, otherwise it is −1.

All [math]\displaystyle{ K_i }[/math] factors can be ignored in the iterative process and then applied all at once afterwards with a scaling factor [math]\displaystyle{ K(n) }[/math]

[math]\displaystyle{ K(n) = \prod_{i=0}^{n-1} K_i = \prod_{i=0}^{n-1} \frac{1}{\sqrt{1 + 2^{-2i}}}, }[/math]

which is calculated in advance and stored in a table or as a single constant, if the number of iterations is fixed. This correction could also be made in advance, by scaling [math]\displaystyle{ v_0 }[/math] and hence saving a multiplication. Additionally, it can be noted that[43]

[math]\displaystyle{ K = \lim_{n \to \infty} K(n) \approx 0.6072529350088812561694 }[/math]

to allow further reduction of the algorithm's complexity. Some applications may avoid correcting for [math]\displaystyle{ K }[/math] altogether, resulting in a processing gain [math]\displaystyle{ A }[/math]:[57]

[math]\displaystyle{ A = \frac{1}{K} = \lim_{n \to \infty} \prod_{i=0}^{n-1} \sqrt{1 + 2^{-2i}} \approx 1.64676025812107. }[/math]
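As a rough numerical illustration of how quickly K(n) approaches its limit, the sketch below evaluates the product for a few iteration counts. The function name gain_constants and the chosen values of n are assumptions of this example.

```python
from math import sqrt

def gain_constants(n):
    """Return (K(n), A(n)) where K(n) = prod 1/sqrt(1 + 2**(-2i)) and A(n) = 1/K(n)."""
    k = 1.0
    for i in range(n):
        k /= sqrt(1.0 + 2.0 ** (-2 * i))
    return k, 1.0 / k

# K(n) settles after a modest number of iterations, which is why a single
# stored constant is usually enough when the iteration count is fixed.
for n in (1, 2, 4, 8, 16, 40):
    k, a = gain_constants(n)
    print(f"n={n:2d}  K(n)={k:.16f}  A(n)={a:.16f}")
```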

After a sufficient number of iterations, the vector's angle will be close to the wanted angle [math]\displaystyle{ \beta }[/math]. For most ordinary purposes, 40 iterations (n = 40) are sufficient to obtain the correct result to the 10th decimal place.

The only task left is to determine whether the rotation should be clockwise or counterclockwise at each iteration (choosing the value of [math]\displaystyle{ \sigma_i }[/math]). This is done by keeping track of how much the angle was rotated at each iteration and subtracting that from the wanted angle [math]\displaystyle{ \beta }[/math], which leaves a residual angle [math]\displaystyle{ \beta_i }[/math]. To get closer to the wanted angle, if [math]\displaystyle{ \beta_i }[/math] is non-negative, the rotation is counterclockwise ([math]\displaystyle{ \sigma_i = +1 }[/math]); otherwise it is clockwise ([math]\displaystyle{ \sigma_i = -1 }[/math]):

[math]\displaystyle{ \beta_0 = \beta }[/math]
[math]\displaystyle{ \beta_{i+1} = \beta_i - \sigma_i \gamma_i, \quad \gamma_i = \arctan(2^{-i}). }[/math]

The values of [math]\displaystyle{ \gamma_i }[/math] must also be precomputed and stored. For small angles, however, [math]\displaystyle{ \arctan(2^{-i}) \approx 2^{-i} }[/math], and the two become equal in fixed-point representation once their difference falls below the resolution, which reduces the table size.
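The small-angle behaviour of the table can be checked numerically. The following is a minimal sketch assuming a fixed-point format with 16 fractional bits; FRAC_BITS and both function names are choices made for this example only.

```python
from math import atan

FRAC_BITS = 16  # assumed fixed-point resolution, chosen only for illustration

def table_entries(frac_bits=FRAC_BITS):
    """Return rows (i, rounded arctan(2**-i), 2**-i), both in fixed-point units."""
    rows = []
    for i in range(frac_bits + 1):
        stored = round(atan(2.0 ** -i) * (1 << frac_bits))  # value a table would hold
        shifted = 1 << (frac_bits - i)                      # plain power of two
        rows.append((i, stored, shifted))
    return rows

def first_linear_index(frac_bits=FRAC_BITS):
    """Smallest i from which every table entry equals the plain power of two."""
    rows = table_entries(frac_bits)
    for i, _, _ in rows:
        if all(stored == shifted for _, stored, shifted in rows[i:]):
            return i
    return None

for i, stored, shifted in table_entries():
    print(f"i={i:2d}  arctan entry={stored:6d}  2**-i={shifted:6d}")
print("entries can be replaced by shifts from i =", first_linear_index())
```

From the reported index onward no table entry is needed, because the angle step can be produced by shifting a constant.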

As can be seen in the illustration above, the sine of the angle [math]\displaystyle{ \beta }[/math] is the y coordinate of the final vector [math]\displaystyle{ v_n, }[/math] while the x coordinate is the cosine value.
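The rotation-mode iteration can also be sketched with integers only, so that every step uses just shifts, additions and a table lookup. This is an illustrative sketch, not a reference implementation: the 30-bit fixed-point format, the iteration count and the names below are assumptions, and the start vector is pre-scaled by K so that no final multiplication is needed.

```python
from math import atan, cos, sin, sqrt

FRAC_BITS = 30                      # assumed fixed-point format (illustrative)
ITERS = 30                          # assumed iteration count
ONE = 1 << FRAC_BITS
ANGLES = [round(atan(2.0 ** -i) * ONE) for i in range(ITERS)]  # arctan(2**-i) table

K = 1.0
for i in range(ITERS):
    K /= sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(K * ONE)            # scaling factor K(n) in fixed point

def cordic_fixed(beta):
    """Return (cos(beta), sin(beta)) for beta in radians, roughly |beta| <= 1.74."""
    z = round(beta * ONE)           # residual angle beta_i
    x, y = K_FIXED, 0               # start vector (1, 0), pre-scaled by K
    for i in range(ITERS):
        if z >= 0:                  # sigma = +1: rotate counterclockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:                       # sigma = -1: rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return x / ONE, y / ONE         # convert back to floats for display only

c, s = cordic_fixed(0.5)
print(c - cos(0.5), s - sin(0.5))   # both differences should be very small
```

The conversions to and from floating point appear only at the boundaries; the loop itself works on Python integers, where `>>` is an arithmetic shift.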

Vectoring mode

The rotation-mode algorithm described above can rotate any vector (not only a unit vector aligned along the x axis) by an angle between −90° and +90°. Decisions on the direction of the rotation depend on [math]\displaystyle{ \beta_i }[/math] being positive or negative.

The vectoring-mode of operation requires a slight modification of the algorithm. It starts with a vector whose x coordinate is positive whereas the y coordinate is arbitrary. Successive rotations have the goal of rotating the vector to the x axis (and therefore reducing the y coordinate to zero). At each step, the value of y determines the direction of the rotation. The final value of [math]\displaystyle{ \beta_i }[/math] contains the total angle of rotation. If the [math]\displaystyle{ K_i }[/math] factors are left out of the iterations, the final value of x is the magnitude of the original vector multiplied by the gain [math]\displaystyle{ A = 1/K }[/math], so multiplying it by K recovers the magnitude. An obvious use of the vectoring mode is therefore the transformation from rectangular to polar coordinates.
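A rectangular-to-polar conversion in vectoring mode might look like the sketch below. It uses floating point for readability (hardware would use shifts instead of the multiplications by 2**-i), and the function name and default iteration count are assumptions of this example.

```python
from math import atan, atan2, hypot, sqrt

def cordic_vectoring(x, y, n=32):
    """Return (magnitude, angle in radians) of the vector (x, y), assuming x > 0."""
    angle = 0.0
    gain = 1.0                       # accumulated gain A(n) of the unscaled rotations
    p2i = 1.0                        # 2**-i
    for _ in range(n):
        sigma = -1 if y > 0 else +1  # rotate toward the x axis
        x, y = x - sigma * y * p2i, y + sigma * x * p2i
        angle -= sigma * atan(p2i)   # undo the applied rotation to recover the angle
        gain *= sqrt(1.0 + p2i * p2i)
        p2i /= 2
    return x / gain, angle           # dividing by the gain equals multiplying by K(n)

mag, ang = cordic_vectoring(3.0, 4.0)
print(mag - hypot(3.0, 4.0), ang - atan2(4.0, 3.0))  # both should be close to zero
```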

Implementation

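Multiplying or dividing a floating-point value by a power of two, as the CORDIC iterations require, can be done without a general multiplication: in Java the Math class has a scalb(double x, int scale) method for this,[58] C has the ldexp function,[59] and the x86 class of processors has the fscale floating-point operation.[60]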
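Python's standard library exposes the same operation as math.ldexp. A short sketch, with shift_right being a hypothetical helper name used only here:

```python
from math import ldexp

def shift_right(value, i):
    """Return value * 2**-i by adjusting the floating-point exponent."""
    return ldexp(value, -i)

# Scaling by a power of two is exact in binary floating point (barring underflow).
print(shift_right(3.0, 2))
```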

Software Example (Python)

from math import atan2, sqrt, sin, cos, radians

ITERS = 16
theta_table = [atan2(1, 2**i) for i in range(ITERS)]

def compute_K(n):
    """
    Compute K(n) for n = ITERS. This could also be
    stored as an explicit constant if ITERS above is fixed.
    """
    k = 1.0
    for i in range(n):
        k *= 1 / sqrt(1 + 2 ** (-2 * i))
    return k

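def CORDIC(alpha, n):
    """
    Rotation-mode CORDIC: return (cos(alpha), sin(alpha)) for alpha in radians.
    Uses the first n entries of theta_table, so n should not exceed ITERS.
    """
    K_n = compute_K(n)
    theta = 0.0
    x = 1.0
    y = 0.0
    P2i = 1  # This will be 2**(-i) in the loop below
    for arc_tangent in theta_table[:n]:
        # Rotate toward the target angle alpha
        sigma = +1 if theta < alpha else -1
        theta += sigma * arc_tangent
        x, y = x - sigma * y * P2i, sigma * P2i * x + y
        P2i /= 2
    return x * K_n, y * K_n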

if __name__ == "__main__":
    # Print a table of computed sines and cosines, from -90° to +90°, in steps of 15°,
    # comparing against the available math routines.
    print("  x       sin(x)     diff. sine     cos(x)    diff. cosine ")
    for x in range(-90, 91, 15):
        cos_x, sin_x = CORDIC(radians(x), ITERS)
        print(
            f"{x:+05.1f}°  {sin_x:+.8f} ({sin_x-sin(radians(x)):+.8f}) {cos_x:+.8f} ({cos_x-cos(radians(x)):+.8f})"
        )

Output

$ python cordic.py
  x       sin(x)     diff. sine     cos(x)    diff. cosine
-90.0°  -1.00000000 (+0.00000000) -0.00001759 (-0.00001759)
-75.0°  -0.96592181 (+0.00000402) +0.25883404 (+0.00001499)
-60.0°  -0.86601812 (+0.00000729) +0.50001262 (+0.00001262)
-45.0°  -0.70711776 (-0.00001098) +0.70709580 (-0.00001098)
-30.0°  -0.50001262 (-0.00001262) +0.86601812 (-0.00000729)
-15.0°  -0.25883404 (-0.00001499) +0.96592181 (-0.00000402)
+00.0°  +0.00001759 (+0.00001759) +1.00000000 (-0.00000000)
+15.0°  +0.25883404 (+0.00001499) +0.96592181 (-0.00000402)
+30.0°  +0.50001262 (+0.00001262) +0.86601812 (-0.00000729)
+45.0°  +0.70709580 (-0.00001098) +0.70711776 (+0.00001098)
+60.0°  +0.86601812 (-0.00000729) +0.50001262 (+0.00001262)
+75.0°  +0.96592181 (-0.00000402) +0.25883404 (+0.00001499)
+90.0°  +1.00000000 (-0.00000000) -0.00001759 (-0.00001759)

Hardware example

The number of logic gates for the implementation of a CORDIC is roughly comparable to the number required for a multiplier as both require combinations of shifts and additions. The choice for a multiplier-based or CORDIC-based implementation will depend on the context. The multiplication of two complex numbers represented by their real and imaginary components (rectangular coordinates), for example, requires 4 multiplications, but could be realized by a single CORDIC operating on complex numbers represented by their polar coordinates, especially if the magnitude of the numbers is not relevant (multiplying a complex vector with a vector on the unit circle actually amounts to a rotation). CORDICs are often used in circuits for telecommunications such as digital down converters.
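The remark that multiplying by a number on the unit circle amounts to a rotation can be illustrated in software. The sketch below rotates a vector with rotation-mode CORDIC and compares the result with an ordinary complex multiplication; cordic_rotate is a hypothetical helper, and floating-point multiplications by 2**-i stand in for the shifts a circuit would use.

```python
import cmath
from math import atan, sqrt

def cordic_rotate(x, y, phi, n=32):
    """Rotate (x, y) by phi radians (roughly |phi| <= 1.74) using CORDIC steps."""
    gain = 1.0
    p2i = 1.0                              # 2**-i
    for _ in range(n):
        sigma = 1 if phi >= 0 else -1      # direction given by the residual angle
        x, y = x - sigma * y * p2i, y + sigma * x * p2i
        phi -= sigma * atan(p2i)
        gain *= sqrt(1.0 + p2i * p2i)
        p2i /= 2
    return x / gain, y / gain              # remove the CORDIC gain

z = complex(3.0, 4.0)
reference = z * cmath.exp(0.7j)            # four real multiplications internally
xr, yr = cordic_rotate(z.real, z.imag, 0.7)
print(abs(complex(xr, yr) - reference))    # should be close to zero
```

If only the direction of the result matters, the final division by the gain can be dropped, which matches the case mentioned above where the magnitude is not relevant.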

Double iterations CORDIC

In two publications by Vladimir Baykov,[61][62] the double iterations method was proposed for implementing the functions arcsine, arccosine, natural logarithm and exponential function, as well as for calculating the hyperbolic functions. Unlike the classical CORDIC method, where the iteration step value changes on every iteration, in the double iteration method each step value is used twice and changes only on every second iteration. Hence the exponent index for double iterations runs as [math]\displaystyle{ i = 0, 0, 1, 1, 2, 2\dots }[/math], whereas for ordinary iterations it runs as [math]\displaystyle{ i = 0, 1, 2\dots }[/math]. The double iteration method guarantees convergence throughout the valid range of argument values.
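As an illustration of the idea, the sketch below computes an arcsine with doubled iterations. It is not taken from Baykov's publications: the decision rule, the function name and the iteration count are assumptions of this example. Each step value 2**-i is applied twice, so every double step scales the vector by exactly 1 + 2**(-2i), and the target value t can be rescaled by the same factor without any square root.

```python
from math import asin, atan

def arcsin_double_iteration(t, n=30):
    """Approximate asin(t) for -1 <= t <= 1 using doubled CORDIC steps."""
    x, y, theta = 1.0, 0.0, 0.0
    p2i = 1.0                                  # 2**-i
    for _ in range(n):
        if x < 0.0:
            # Past +/-90 degrees: turn back toward the right half-plane.
            sigma = -1 if y >= 0.0 else +1
        else:
            # Compare the unscaled y with the equally scaled target.
            sigma = +1 if y <= t else -1
        for _ in range(2):                     # same step value used twice
            x, y = x - sigma * y * p2i, y + sigma * x * p2i
        theta += 2 * sigma * atan(p2i)
        t *= 1.0 + p2i * p2i                   # keep the target on the vector's scale
        p2i /= 2
    return theta

print(arcsin_double_iteration(0.5) - asin(0.5))  # should be close to zero
```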

The generalization of the CORDIC convergence problems to an arbitrary positional number system with radix [math]\displaystyle{ R }[/math] showed[63] that for the functions sine, cosine and arctangent, it is enough to perform [math]\displaystyle{ R - 1 }[/math] iterations for each value of i (i = 0 or 1 to n, where n is the number of digits), i.e. for each digit of the result. For the natural logarithm, exponential, hyperbolic sine, cosine and arctangent, [math]\displaystyle{ R }[/math] iterations should be performed for each value of [math]\displaystyle{ i }[/math]. For the functions arcsine and arccosine, [math]\displaystyle{ 2(R - 1) }[/math] iterations should be performed for each digit, i.e. for each value of [math]\displaystyle{ i }[/math].[63]

For the inverse hyperbolic sine and inverse hyperbolic cosine functions, the number of iterations will be [math]\displaystyle{ 2R }[/math] for each [math]\displaystyle{ i }[/math], that is, for each result digit.

Related algorithms

CORDIC is part of the class of "shift-and-add" algorithms, as are the logarithm and exponential algorithms derived from Henry Briggs' work. Another shift-and-add algorithm which can be used for computing many elementary functions is the BKM algorithm, which is a generalization of the logarithm and exponential algorithms to the complex plane. For instance, BKM can be used to compute the sine and cosine of a real angle [math]\displaystyle{ x }[/math] (in radians) by computing the exponential of [math]\displaystyle{ 0+ix }[/math], which is [math]\displaystyle{ \operatorname{cis}(x) = \cos(x) + i \sin(x) }[/math]. The BKM algorithm is slightly more complex than CORDIC, but has the advantage that it does not need a scaling factor (K).

References

  1. "The CORDIC Computing Technique". Proceedings of the Western Joint Computer Conference (WJCC) (San Francisco, California, USA: National Joint Computer Committee (NJCC)): 257–261. 1959-03-03. http://www.computer.org/csdl/proceedings/afips/1959/5054/00/50540257.pdf. Retrieved 2016-01-02.
  2. "The CORDIC Trigonometric Computing Technique". IRE Transactions on Electronic Computers (The Institute of Radio Engineers, Inc. (IRE)) 8 (3): 330–334 (reprint: 226–230). 1959-05-25. September 1959. EC-8(3):330–334. http://home.citycable.ch/pierrefleur/Jacques-Laporte/Volder_CORDIC.pdf. Retrieved 2016-01-01.
  3. "A unified algorithm for elementary functions". Proceedings of the Spring Joint Computer Conference (Atlantic City, New Jersey, USA: Hewlett-Packard Company) 38: 379–385. May 1971. http://home.citycable.ch/pierrefleur/Jacques-Laporte/Welther-Unified%20Algorithm.pdf. Retrieved 2016-01-01.
  4. "The Story of Unified CORDIC". The Journal of VLSI Signal Processing (Hingham, MA, USA: Kluwer Academic Publishers) 25 (2 (Special issue on CORDIC)): 107–112. June 2000. doi:10.1023/A:1008162721424. ISSN 0922-5773. https://dl.acm.org/citation.cfm?id=2812970.
  5. "Generalized Hyperbolic CORDIC and Its Logarithmic and Exponential Computation With Arbitrary Fixed Base". IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27 (9): 2156–2169. September 2019. doi:10.1109/TVLSI.2019.2919557.
  6. "Corrections to "Generalized Hyperbolic CORDIC and Its Logarithmic and Exponential Computation With Arbitrary Fixed Base"". IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27 (9): 2222. September 2019. doi:10.1109/TVLSI.2019.2932174.
  7. Arithmetica Logarithmica. London. 1624.
  8. "Henry Briggs and the HP 35". Paris, France. 2014. http://www.jacques-laporte.org/Briggs%20and%20the%20HP35.htm.
  9. The Radix. A new way of making logarithms.. London: J. Beecroft. 1771. https://books.google.com/books?id=mYpaAAAAcAAJ. Retrieved 2016-01-02. 
  10. Binary Computation Algorithms for Coordinate Rotation and Function Generation, Convair, Aeroelectronics group, 1956-06-15, IAR-1.148
  11. "The Birth of CORDIC". Journal of VLSI Signal Processing (Hingham, MA, USA: Kluwer Academic Publishers) 25 (2 (Special issue on CORDIC)): 101–105. June 2000. doi:10.1023/A:1008110704586. ISSN 0922-5773. http://late-dpedago.urv.cat/site_media/papers/fulltext_2.pdf. Retrieved 2016-01-02.
  12. "CORDIC Technique Reduces Trigonometric Function Look-Up", Computer Design (Boston, MA, USA: Computer Design Publishing Corp.): 72–78, June 1971  (NB. Some sources erroneously refer to this as by P. Z. Perle or in Component Design.)
  13. Decimal Computation (1 (reprint) ed.). Malabar, Florida, USA: Robert E. Krieger Publishing Company. 1983. pp. 162, 165–176, 181–193. ISBN 0-89874-318-4. https://books.google.com/books?id=uEYZAQAAIAAJ. Retrieved 2016-01-03.  (NB. At least some batches of this reprint edition were misprints with defective pages 115–146.)
  14. "Decimal-Binary Conversions in CORDIC". IRE Transactions on Electronic Computers (The Institute of Radio Engineers, Inc. (IRE)) 8 (3): 335–339. September 1959. doi:10.1109/TEC.1959.5222694. EC-8(3):335–339. ISSN 0367-9950. https://www.researchgate.net/researcher/74881302_D_H_Daggett. Retrieved 2016-01-02. 
  15. Advanced Systems Group (1962-08-06), Technical Description of Fix-taking Tie-in Equipment, Fort Worth, Texas, USA: General Dynamics, FZE-052 
  16. Decimal Computation (1 ed.). Binghamton, New York, USA: John Wiley & Sons, Inc.. 1974. pp. 162, 165–176, 181–193. ISBN 0-471-76180-X. https://archive.org/details/decimalcomputati0000schm. Retrieved 2016-01-03. "So far CORDIC has been known to be implemented only in binary form. But, as will be demonstrated here, the algorithm can be easily modified for a decimal system.* […] *In the meantime it has been learned that Hewlett-Packard and other calculator manufacturers employ the decimal CORDIC techniques in their scientific calculators."
  17. "The HP 9100 Project: An Exothermic Reaction". 2010. http://www.hp9825.com/html/the_9100_part_2.html.
  18. "Pseudo Division and Pseudo Multiplication Processes". IBM Journal of Research and Development (Riverton, New Jersey, USA: IBM Corporation) 6 (2): 210–226, 287. 1961-08-29. April 1962. doi:10.1147/rd.62.0210. http://home.citycable.ch/pierrefleur/Jacques-Laporte/Meggitt_62.pdf. Retrieved 2016-01-09. "John E. Meggitt B.A., 1953; PhD, 1958, Cambridge University. Awarded the First Smith Prize at Cambridge in 1955 and elected a Research Fellowship at Emmanuel College. […] Joined IBM British Laboratory at Hursley, Winchester in 1958. Interests include error-correcting codes and small microprogrammed computers.".
  19. "A Quarter Century at HP". Computer History Museum / HP Memories. 2010-11-19. 7: Scientific Calculators, circa 1966. http://www.hpmemoryproject.org/timeline/dave_cochran/a_quarter_century_at_hp_00.htm#chapter_07. "I even flew down to Southern California to talk with Jack Volder who had implemented the transcendental functions in the Athena machine and talked to him for about an hour. He referred me to the original papers by Meggitt where he'd gotten the pseudo division, pseudo multiplication generalized functions. […] I did quite a bit of literary research leading to some very interesting discoveries. […] I found a treatise from 1624 by Henry Briggs discussing the calculation of common logarithms, interestingly used the same pseudo-division/pseudo-multiplication method that MacMillan and Volder used in Athena. […] We had purchased a LOCI-2 from Wang Labs and recognized that Wang Labs LOCI II used the same algorithm to do square root as well as log and exponential. After the introduction of the 9100 our legal department got a letter from Wang saying that we had infringed on their patent. And I just sent a note back with the Briggs reference in Latin and it said, "It looks like prior art to me." We never heard another word."
  20. About utilizing CORDIC for computing transcendental functions in BCD. 1966-03-14.
  21. "Tom Osborne's Story in His Own Words". 2010. http://www.hp9825.com/html/osborne_s_story.html.
  22. "The HP 9100: The Initial Journey". 2010. http://www.hp9825.com/html/the_9100_project.html. 
  23. "Internal Programming of the 9100A Calculator". Hewlett-Packard Journal (Palo Alto, California, USA: Hewlett-Packard): 14–16. September 1968. http://www.hpmemoryproject.org/timeline/dave_cochran/hpj_sep68.htm. Retrieved 2016-01-02.
  24. Extend your Personal Computing Power with the new LOCI-1 Logarithmic Computing Instrument, Wang Laboratories, Inc., 1964, pp. 2–3, http://www.oldcalculatormuseum.com/a-loci1br-23.html, retrieved 2016-01-03 
  25. "Wang LOCI-2". Old Calculator Web Museum. Beavercreek, Oregon City, Oregon, USA. 2013-08-31. http://www.oldcalculatormuseum.com/wangloci.html. 
  26. "Wang LOCI Service Manual". Wang Laboratories, Inc.. 1967. http://bitsavers.informatik.uni-stuttgart.de/pdf/wang/loci/Wang_LOCI_Service_Manual.pdf. 
  27. "Wang Model 360SE Calculator System". Old Calculator Web Museum. Beavercreek, Oregon City, Oregon, USA. 2004-10-23. http://www.oldcalculatormuseum.com/wang360.html. 
  28. "The HP-35 Design, A Case Study in Innovation". HP Memory Project. June 2010. http://www.hpmemoryproject.org/wb_pages/d_cochran_01.htm. "During the development of the desktop HP 9100 calculator I was responsible for developing the algorithms to fit the architecture suggested by Tom Osborne. Although the suggested methodology for the algorithms came from Malcolm McMillan I did considerable amount of reading to understand the core calculations […] Although Wang Laboratories had used similar methods of calculation, my study found prior art dated 1624 that read on their patents. […] This research enabled the adaption of the transcendental functions through the use of the algorithms to match the needs of the customer within the constraints of the hardware. This proved invaluable during the development of the HP-35, […] Power series, polynomial expansions, continued fractions, and Chebyshev polynomials were all considered for the transcendental functions. All were too slow because of the number of multiplications and divisions required. The generalized algorithm that best suited the requirements of speed and programming efficiency for the HP-35 was an iterative pseudo-division and pseudo-multiplication method first described in 1624 by Henry Briggs in 'Arithmetica Logarithmica' and later by Volder and Meggitt. This is the same type of algorithm that was used in previous HP desktop calculators. […] The complexity of the algorithms made multilevel programming a necessity. This meant the calculator had to have subroutine capability, […] To generate a transcendental function such as Arc-Hyperbolic-Tan required several levels of subroutines. […] Chris Clare later documented this as Algorithmic State Machine (ASM) methodology. Even the simple Sine or Cosine used the Tangent routine, and then calculated the Sine from trigonometric identities.
These arduous manipulations were necessary to minimize the number of unique programs and program steps […] The arithmetic instruction set was designed specifically for a decimal transcendental-function calculator. The basic arithmetic operations are performed by a 10's complement adder-subtractor which has data paths to three of the registers that are used as working storage." 
  29. Wang, An, "Calculating apparatus", US patent 3402285A, published 1968-09-17, issued 1968-09-17, assigned to Wang Laboratories
  30. Wang, An, "Rechenmaschine fuer logarithmische Rechnungen", DE patent 1499281B1, published 1970-05-06, issued 1970-05-06, assigned to Wang Laboratories
  31. Computer Arithmetic. 1 (2 ed.). Los Alamitos: IEEE Computer Society Press. 1990. ISBN 0818689315, 9780818689314. https://books.google.com/books?id=egIpAQAAMAAJ. Retrieved 2016-01-02.
  32. Petrocelli, Orlando R., ed. (1972), The Best Computer Papers of 1971, Auerbach Publishers, p. 71, ISBN 0877691274, https://books.google.com/books?id=f6ezAAAAIAAJ, retrieved 2016-01-02 
  33. "Algorithms and Accuracy in the HP-35". Hewlett-Packard Journal 23 (10): 10–11. June 1972. http://www.hpl.hp.com/hpjournal/72jun/jun72a2.pdf. Retrieved 2016-01-02. 
  34. "HP35 trigonometric algorithm". Paris, France. 2005-12-06. http://www.jacques-laporte.org//Trigonometry.htm.
  35. "The secret of the algorithms". L'Ordinateur Individuel (Paris, France) (24). February 2005. http://www.jacques-laporte.org/TheSecretOfTheAlgorithms.htm. Retrieved 2016-01-02.
  36. "Digit by digit methods". Paris, France. February 2012. http://www.jacques-laporte.org/digit_by_digit.htm.
  37. "HP 35 Logarithm Algorithm". Paris, France. February 2012. http://www.jacques-laporte.org/Logarithm_1.htm.
  38. "GH CORDIC-Based Architecture for Computing Nth Root of Single-Precision Floating-Point Number". IEEE Transactions on Very Large Scale Integration (VLSI) Systems 28 (4): 864–875. January 2020. doi:10.1109/TVLSI.2019.2959847. 
  39. "Low Complexity Generic VLSI Architecture Design Methodology for Nth Root and Nth Power Computations". IEEE Transactions on Circuits and Systems I: Regular Papers 66 (12): 4673–4686. September 2019. doi:10.1109/TCSI.2019.2939720. 
  40. "CORDIC as a Switched Nonlinear System". Circuits, Systems and Signal Processing 39 (6): 3234–3249. November 2019. doi:10.1007/s00034-019-01295-8. 
  41. "Use Decimal CORDIC for Generation of Many Transcendental Functions". EDN: 64–73. 1973-02-20. 
  42. An Analysis of Algorithms for Hardware Evaluation of Elementary Functions. Monterey, California, USA: Department of the Navy, Naval Postgraduate School. 1973-05-08. NPS-53FE73051A. http://calhoun.nps.edu/bitstream/handle/10945/29706/analysisofalgori00fran.pdf. Retrieved 2016-01-03. 
  43. Elementary Functions: Algorithms and Implementation (2 ed.). Boston: Birkhäuser. 2006. p. 134. ISBN 978-0-8176-4372-0. http://perso.ens-lyon.fr/jean-michel.muller/SecondEdition.html. Retrieved 2015-12-01.
  44. "Implementation of Transcendental Functions on a Numerics Processor". Microprocessing and Microprogramming 11 (3–4): 221–225. March 1983. doi:10.1016/0165-6074(83)90151-5.
  45. The 8087 Primer (1 ed.). John Wiley & Sons Australia, Limited. 1984. ISBN 0471875694, 9780471875697. https://archive.org/details/8087primer00palm. Retrieved 2016-01-02.
  46. "Math Coprocessors: A look at what they do, and how they do it". Byte 15 (1): 337–348. January 1990. ISSN 0360-5280. 
  47. "Implementing CORDIC algorithms – A single compact routine for computing transcendental functions". Dr. Dobb's Journal: 152–156. 1990-10-01. http://www.drdobbs.com/database/implementing-cordic-algorithms/184408428. Retrieved 2016-01-02.
  48. "Intel's Floating-Point Processors". Electro/88 Conference Record: 48/5/1–7. 1988.
  49. "50 Years of CORDIC: Algorithms, Architectures and Applications". IEEE Transactions on Circuits and Systems I: Regular Papers 56 (9): 1893–1907. 2008-08-22. 2009-09-09. doi:10.1109/TCSI.2009.2025803. https://eprints.soton.ac.uk/267873/1/tcas1_cordic_review.pdf. 
  50. "CORDIC Designs for Fixed Angle of Rotation". IEEE Transactions on Very Large Scale Integration (VLSI) Systems 21 (2): 217–228. February 2013. doi:10.1109/TVLSI.2012.2187080.
  51. "Technical Memorandum 70-2014-8: The Navigation System of the Lunar Roving Vehicle". Washington, D.C., USA: Bellcomm. 1970-12-11. p. 14. https://www.hq.nasa.gov/alsj/19790072520_1979072520.pdf. 
  52. "Technical Note D-7469: Lunar Roving Vehicle Navigation System Performance Review". Huntsville, Alabama, USA: Marshall Space Flight Center. November 1973. p. 17. https://www.hq.nasa.gov/alsj/19740003321_1974003321.pdf. 
  53. "Extracting ROM constants from the 8087 math coprocessor's die". May 2020. http://www.righto.com/2020/05/extracting-rom-constants-from-8087-math.html. "The ROM contains 16 arctangent values, the arctans of 2^−n. It also contains 14 log values, the base-2 logs of (1+2^−n). These may seem like unusual values, but they are used in an efficient algorithm called CORDIC, which was invented in 1958."
  54. "Getting started with the CORDIC accelerator using STM32CubeG4 MCU Package". STMicroelectronics. https://www.st.com/resource/en/application_note/dm00614795-getting-started-with-the-cordic-accelerator-using-stm32cubeg4-mcu-package-stmicroelectronics.pdf. 
  55. "CMSIS/CMSIS/DSP_Lib/Source/ControllerFunctions/arm_sin_cos_f32.c". ARM. https://github.com/ARM-software/CMSIS/blob/master/CMSIS/DSP_Lib/Source/ControllerFunctions/arm_sin_cos_f32.c. 
  56. "Error bounds of Taylor Expansion for Sine". https://math.stackexchange.com/q/2464759. 
  57. "A survey of CORDIC algorithms for FPGA based computers". ACM (North Kingstown, RI, USA: Andraka Consulting Group, Inc.). 1998. 0-89791-978-5/98/01. http://www.andraka.com/files/crdcsrvy.pdf. Retrieved 2016-05-08. 
  58. "Class Math". Java Platform Standard. Oracle Corporation. 2018. https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#scalb-double-int-. 
  59. "ldexp, ldexpf, ldexpl". 2015-06-11. http://en.cppreference.com/w/c/numeric/math/ldexp. 
  60. Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture. Intel Corporation. September 2016. pp. 8–22. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf. 
  61. Baykov, Vladimir. "The outline (autoreferat) of my PhD, published in 1972". http://baykov.de/CORDIC1972.htm. 
  62. Baykov, Vladimir. "Hardware implementation of elementary functions in computers". http://baykov.de/Cordic1975.htm. 
  63. Baykov, Vladimir. "Special-purpose processors: iterative algorithms and structures". http://baykov.de/Cordic1985.htm.

Further reading

External links