Bailey's FFT algorithm
The Bailey's FFT (also known as a 4-step FFT) is a high-performance algorithm for computing the fast Fourier transform (FFT). This variation of the Cooley–Tukey FFT algorithm was originally designed for systems with hierarchical memory common in modern computers (and was the first FFT algorithm in this so called "out of core" class). The algorithm treats the samples as a two dimensional matrix (thus yet another name, a matrix FFT algorithm[1]) and executes short FFT operations on the columns and rows of the matrix, with a correction multiplication by "twiddle factors" in between.[2]
The algorithm got its name after an article by David H. Bailey, FFTs in external or hierarchical memory, published in 1989. In this article Bailey credits the algorithm to W. M. Gentleman and G. Sande who published their paper, Fast Fourier Transforms: for fun and profit, some twenty years earlier in 1966.[3] The algorithm can be considered a radix-[math]\displaystyle{ \sqrt n }[/math] FFT decomposition.[4]
Here is a brief overview of how the "4-step" version of the Bailey FFT algorithm works:
- The data (in natural order) is first arranged into a matrix.
- Each column of a matrix is then independently processed using a standard FFT algorithm.
- Each element of a matrix is multiplied by a correction coefficient.
- Each row of a matrix is then independently processed using a standard FFT algorithm.
The result (in natural order) is read column-by-column. Since the operations are performed column-wise and row-wise, steps 2 and 4 (and reading of the result) might include a matrix transpose to rearrange the elements in a way convenient for processing. The algorithm resembles a 2-dimensional FFT, a 3-dimensional (and beyond) extensions are known as 5-step FFT, 6-step FFT, etc.[2][5]
The Bailey FFT is typically used for computing DFTs of large datasets, such as those used in scientific and engineering applications. The Bailey FFT is a very efficient algorithm, and it has been used to compute FFTs of datasets with billions of elements (when applied to the number-theoretic transform, the datasets of the order of 1012 elements were processed in mid-2000s[6]).
See also
- Row-column FFT algorithm
- Vector-radix FFT algorithm
References
- ↑ Arndt 2010, p. 438.
- ↑ 2.0 2.1 Hart, Tornaría & Watkins 2010, p. 191.
- ↑ Bailey 1989, p. 2.
- ↑ Frigo & Johnson 2005, p. 2.
- ↑ Al Na'mneh & Pan 2007, pp. 191-192.
- ↑ Al Na'mneh & Pan 2007.
Sources
- Bailey, D. H. (March 1989). "FFTS in external of hierarchical memory". Proceedings of the 1989 ACM/IEEE conference on Supercomputing - Supercomputing '89. 4. ACM Press. pp. 23–35. doi:10.1145/76263.76288. ISBN 0897913418. https://www.davidhbailey.com/dhbpapers/fftq.pdf.
- Frigo, M.; Johnson, S.G. (February 2005). "The Design and Implementation of FFTW3". Proceedings of the IEEE 93 (2): 216–231. doi:10.1109/JPROC.2004.840301. ISSN 0018-9219. Bibcode: 2005IEEEP..93..216F.
- Hart, William B.; Tornaría, Gonzalo; Watkins, Mark (2010). "Congruent Number Theta Coefficients to 1012". Algorithmic Number Theory. Lecture Notes in Computer Science. 6197. Springer Berlin Heidelberg. pp. 186–200. doi:10.1007/978-3-642-14518-6_17. ISBN 978-3-642-14517-9. http://wrap.warwick.ac.uk/41654/1/WRAP_Hart_0584144-ma-270913-congruent.pdf.
- Al Na'mneh, Rami; Pan, W. David (March 2007). "Five-step FFT algorithm with reduced computational complexity". Information Processing Letters 101 (6): 262–267. doi:10.1016/j.ipl.2006.10.009. ISSN 0020-0190.
- Arndt, Jörg (1 October 2010). "The Matrix Fourier Algorithm (MFA)". Matters Computational: Ideas, Algorithms, Source Code. Springer Science & Business Media. pp. 438–439. ISBN 978-3-642-14764-7. OCLC 1005788763. https://books.google.com/books?id=HsRHS6u7e80C&pg=PA438.
Original source: https://en.wikipedia.org/wiki/Bailey's FFT algorithm.
Read more |