Transform coding
Latest revision as of 04:25, 15 April 2026
Transform coding is a type of data compression for "natural" data like audio signals or photographic images. The transformation is typically lossless (perfectly reversible) on its own but is used to enable better (more targeted) quantization, which then results in a lower quality copy of the original input (lossy compression).
In transform coding, knowledge of the application is used to choose information to discard, thereby lowering its bandwidth. The remaining information can then be compressed via a variety of methods. When the output is decoded, the result may not be identical to the original input, but is expected to be close enough for the purpose of the application.
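The division of labour described above (a reversible transform, followed by quantization that discards detail) can be sketched in a few lines. This is a minimal illustration, not a description of any particular codec: the two-point sum/difference transform, the function names, and the step sizes `low_step` and `high_step` are all assumptions chosen for the example.

```python
import math

def forward(pair):
    """Orthonormal 2-point transform: returns (sum-like, difference-like) terms."""
    a, b = pair
    s = 1 / math.sqrt(2)
    return ((a + b) * s, (a - b) * s)

def inverse(coeffs):
    """Exact inverse of forward(); the transform alone loses nothing."""
    low, high = coeffs
    s = 1 / math.sqrt(2)
    return ((low + high) * s, (low - high) * s)

def quantize(value, step):
    """Map a coefficient to an integer index; this is the lossy step."""
    return round(value / step)

def dequantize(index, step):
    return index * step

def encode(samples, low_step=1.0, high_step=8.0):
    """Transform neighbouring pairs, quantizing the difference term coarsely.

    Assumes len(samples) is even. The step sizes are illustrative only.
    """
    codes = []
    for i in range(0, len(samples), 2):
        low, high = forward(samples[i:i + 2])
        codes.append((quantize(low, low_step), quantize(high, high_step)))
    return codes

def decode(codes, low_step=1.0, high_step=8.0):
    samples = []
    for q_low, q_high in codes:
        a, b = inverse((dequantize(q_low, low_step), dequantize(q_high, high_step)))
        samples.extend([a, b])
    return samples
```

Calling `decode(encode(samples))` returns values close to, but generally not equal to, the input. In this sketch the coarse step is applied to the difference term on the assumption that neighbouring samples of "natural" data tend to be similar, so that term is usually small.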
Colour television
NTSC
One of the most successful transform encoding systems is typically not referred to as such: NTSC color television. After an extensive series of studies in the 1950s, Alda Bedford showed that the human eye has high resolution only for black and white, somewhat less for "mid-range" colors like yellows and greens, and much less for colors at the ends of the spectrum, reds and blues.
Using this knowledge allowed RCA to develop a system in which they discarded most of the blue signal after it came from the camera, keeping most of the green and only some of the red; this is chroma subsampling in the YIQ color space.
The result is a signal with considerably less content, one that would fit within existing 6 MHz black-and-white signals as a phase-modulated differential signal. The average TV displays the equivalent of 350 pixels on a line, but the TV signal contains enough information for only about 50 pixels of blue and perhaps 150 of red. This is not apparent to the viewer in most cases, as the eye makes little use of the "missing" information anyway.
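As a rough digital analogue of this scheme, the sketch below converts one line of RGB pixels to YIQ and keeps the two colour-difference channels at reduced resolution. It is an illustration only: the matrix uses the commonly quoted approximate YIQ coefficients, while the subsampling factors `i_factor` and `q_factor` and the averaging method are assumptions and do not model the analog bandwidth limits of a real NTSC encoder.

```python
def rgb_to_yiq(r, g, b):
    """Convert one RGB pixel (components in 0..1) to YIQ.

    Coefficients are the commonly quoted three-decimal approximations.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance (black and white)
    i = 0.596 * r - 0.274 * g - 0.322 * b   # orange-blue colour axis
    q = 0.211 * r - 0.523 * g + 0.312 * b   # purple-green colour axis
    return y, i, q

def subsample(values, factor):
    """Replace each run of `factor` values with its average."""
    out = []
    for k in range(0, len(values), factor):
        group = values[k:k + factor]
        out.append(sum(group) / len(group))
    return out

def encode_line(pixels, i_factor=2, q_factor=4):
    """Keep luminance at full resolution and colour at reduced resolution.

    The factors are hypothetical; they only mimic the idea that the two
    colour channels carry less detail than the luminance channel.
    """
    yiq = [rgb_to_yiq(*p) for p in pixels]
    luma = [y for y, _, _ in yiq]
    i_channel = subsample([i for _, i, _ in yiq], i_factor)
    q_channel = subsample([q for _, _, q in yiq], q_factor)
    return luma, i_channel, q_channel
```

For a line of eight pixels, `encode_line` returns eight luminance values but only four I values and two Q values, so the colour information occupies a fraction of the space taken by the brightness information.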
PAL and SECAM
The PAL and SECAM systems use very similar methods to transmit colour; in both systems the colour information is subsampled.
Digital
The term is much more commonly used in digital media and digital signal processing. The most widely used transform coding technique in this regard is the discrete cosine transform (DCT),[1][2] proposed by Nasir Ahmed in 1972,[3][4] and presented by Ahmed with T. Natarajan and K. R. Rao in 1974.[5] This DCT, in the context of the family of discrete cosine transforms, is the DCT-II. It is the basis for the common JPEG image compression standard,[6] which examines small blocks of the image and transforms them to the frequency domain for more efficient quantization (lossy) and data compression. In video coding, the H.26x and MPEG standards extend this DCT-based image compression technique across the frames of a moving picture using motion compensation, further reducing the size compared to a series of JPEGs.
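The block-based approach can be illustrated with an orthonormal DCT-II applied to an 8×8 block, followed by quantization that grows coarser with frequency. This is a minimal sketch under stated assumptions: the `step_size` rule is a made-up linear ramp, not the quantization table of the JPEG standard, and the entropy coding stage that follows in a real codec is omitted.

```python
import math

N = 8  # block size used in this sketch

def dct_1d(x):
    """Orthonormal DCT-II of a sequence."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        scale = math.sqrt((1 if k == 0 else 2) / n_pts)
        out.append(scale * sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
            for n in range(n_pts)))
    return out

def idct_1d(coeffs):
    """Inverse of dct_1d (an orthonormal DCT-III)."""
    n_pts = len(coeffs)
    return [sum(
        math.sqrt((1 if k == 0 else 2) / n_pts) * coeffs[k]
        * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
        for k in range(n_pts)) for n in range(n_pts)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])

def idct_2d(coeffs):
    cols = [idct_1d(c) for c in transpose(coeffs)]
    return [idct_1d(r) for r in transpose(cols)]

def step_size(u, v, base=4.0, slope=2.0):
    """Hypothetical rule: higher frequencies get a larger (coarser) step."""
    return base + slope * (u + v)

def quantize(coeffs):
    return [[round(coeffs[u][v] / step_size(u, v)) for v in range(N)]
            for u in range(N)]

def dequantize(levels):
    return [[levels[u][v] * step_size(u, v) for v in range(N)]
            for u in range(N)]
```

A decoder would compute `idct_2d(dequantize(levels))`. For smooth blocks most of the quantized values are zero, which is what makes the subsequent data compression stage effective.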
In audio coding, MPEG audio compression analyzes the transformed data according to a psychoacoustic model that describes the human ear's sensitivity to parts of the signal, similar to the TV model. MP3 uses a hybrid coding algorithm, combining the modified discrete cosine transform (MDCT) and fast Fourier transform (FFT).[7] It was succeeded by Advanced Audio Coding (AAC), which uses a pure MDCT algorithm to significantly improve compression efficiency.[8]
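The MDCT differs from the block DCT in that consecutive frames overlap by half, yet each frame of 2N samples yields only N coefficients. The sketch below shows this with a direct (slow) implementation and a sine window; the window choice, the 2/N inverse scaling, the zero padding, and all function names are assumptions made for the illustration, and no psychoacoustic model or quantization is included.

```python
import math

def sine_window(length):
    """Sine window; satisfies w[n]**2 + w[n + length // 2]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def _angle(n, k, n_half):
    return math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5)

def mdct(frame):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    n_half = len(frame) // 2
    w = sine_window(len(frame))
    return [sum(w[n] * frame[n] * math.cos(_angle(n, k, n_half))
                for n in range(2 * n_half))
            for k in range(n_half)]

def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    n_half = len(coeffs)
    w = sine_window(2 * n_half)
    return [w[n] * (2 / n_half) * sum(coeffs[k] * math.cos(_angle(n, k, n_half))
                                      for k in range(n_half))
            for n in range(2 * n_half)]

def analyze(signal, n_half):
    """Split into half-overlapping frames; len(signal) must be a multiple of n_half."""
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    return [mdct(padded[s:s + 2 * n_half])
            for s in range(0, len(padded) - n_half, n_half)]

def synthesize(frames, n_half):
    """Overlap-add the inverse transforms; the aliasing terms cancel."""
    out = [0.0] * (n_half * (len(frames) + 1))
    for idx, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs)):
            out[idx * n_half + n] += value
    return out[n_half:-n_half]
```

Without quantization, `synthesize(analyze(signal, n_half), n_half)` reproduces the input up to floating-point error, even though no single frame can be inverted on its own. A codec would quantize the coefficients between the two calls.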
The basic process of digitizing an analog signal is a kind of transform coding that uses sampling in one or more domains as its transform.
See also
References
- Muchahary, D.; Mondal, A. J.; Parmar, R. S.; Borah, A. D.; Majumder, A. (2015). "A Simplified Design Approach for Efficient Computation of DCT". 2015 Fifth International Conference on Communication Systems and Network Technologies. pp. 483–487. doi:10.1109/CSNT.2015.134. ISBN 978-1-4799-1797-6.
- Chen, Wai Kai (2004). The Electrical Engineering Handbook. Elsevier. p. 906. ISBN 9780080477480. https://books.google.com/books?id=qhHsSlazGrQC&pg=PA906.
- Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z. https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform.
- Stanković, Radomir S.; Astola, Jaakko T. (2012). "Reminiscences of the Early Work in DCT: Interview with K.R. Rao". Reprints from the Early Days of Information Sciences 60. http://ticsp.cs.tut.fi/reports/ticsp-report-60-reprint-rao-corrected.pdf. Retrieved 13 October 2019.
- Ahmed, Nasir; Natarajan, T.; Rao, K. R. (January 1974). "Discrete Cosine Transform". IEEE Transactions on Computers C-23 (1): 90–93. doi:10.1109/T-C.1974.223784.
- "T.81 – Digital compression and coding of continuous-tone still images – Requirements and guidelines". CCITT. September 1992. https://www.w3.org/Graphics/JPEG/itu-t81.pdf.
- Guckert, John (Spring 2012). "The Use of FFT and MDCT in MP3 Audio Compression". http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf.
- Brandenburg, Karlheinz (1999). "MP3 and AAC Explained". http://graphics.ethz.ch/teaching/mmcom12/slides/mp3_and_aac_brandenburg.pdf.
