Spectral band replication

Short description: Low bitrate digital audio enhancement technique
Spectrogram of a recording of a violin. Note the harmonics occurring at whole-number multiples of the fundamental frequency. SBR exploits this redundancy.

Spectral band replication (SBR) is a technology to enhance audio or speech codecs, especially at low bit rates, and is based on harmonic redundancy in the frequency domain.

It can be combined with any audio compression codec: the codec itself transmits the lower and mid frequencies of the spectrum, while SBR replicates higher-frequency content at the decoder by transposing harmonics up from the lower and mid frequencies.[1] Some guidance information for reconstructing the high-frequency spectral envelope is transmitted as side information.
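The decoder-side idea can be illustrated with a minimal sketch. It assumes the spectrum is available as a plain list of magnitudes and that the side information is one target RMS value per group of high-band bins; the function name `replicate_high_band` and its parameters are hypothetical, and a real SBR decoder works on filter-bank subband samples over time rather than on a single magnitude list.

```python
def replicate_high_band(low_band, envelope, bins_per_band):
    """Rebuild a high band from decoded low-band magnitudes (illustrative only).

    low_band      -- magnitudes of the decoded low/mid-frequency bins
    envelope      -- assumed side info: target RMS magnitude per high-band group
    bins_per_band -- number of bins covered by each envelope value
    """
    high_band = []
    for band_index, target_rms in enumerate(envelope):
        start = band_index * bins_per_band
        # Copy ("patch") a slice of the low band up into the high band.
        patch = [low_band[(start + i) % len(low_band)] for i in range(bins_per_band)]
        patch_rms = (sum(m * m for m in patch) / len(patch)) ** 0.5
        # Scale the patch so its RMS matches the transmitted envelope value.
        gain = target_rms / patch_rms if patch_rms > 0 else 0.0
        high_band.extend(m * gain for m in patch)
    return high_band


low = [1.0, 0.5, 0.25, 0.125, 2.0, 1.0, 0.5, 0.25]  # decoded by the core codec
env = [0.1, 0.05]                                   # assumed envelope side info
high = replicate_high_band(low, env, bins_per_band=4)
full_spectrum = low + high
```

The fine structure of the high band comes entirely from the low band; only the coarse envelope values need to be transmitted, which is why the side information is small.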

When needed, it also reconstructs or adaptively mixes in noise-like information in selected frequency bands in order to faithfully replicate signals that originally contained no or fewer tonal components.
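A rough sketch of such adaptive noise mixing follows. It assumes a per-band value `noise_ratio` between 0 and 1 that states how much of the band's energy should be noise-like; that parameter, the function name `mix_noise`, and the Gaussian noise source are illustrative assumptions, not the signalling used by any particular codec.

```python
import random


def mix_noise(patch, noise_ratio, seed=0):
    """Blend a replicated band with noise while roughly keeping its energy.

    patch       -- replicated (tonal) coefficients of one band
    noise_ratio -- assumed side-info value in [0, 1]: share of the band's
                   energy that should be noise-like
    seed        -- fixed seed so the sketch is reproducible
    """
    rng = random.Random(seed)
    energy = sum(c * c for c in patch)
    noise = [rng.gauss(0.0, 1.0) for _ in patch]
    noise_energy = sum(n * n for n in noise)
    # Scale the noise so that, on its own, it would carry the full band energy.
    noise_scale = (energy / noise_energy) ** 0.5 if noise_energy > 0 else 0.0
    tonal_gain = (1.0 - noise_ratio) ** 0.5
    noise_gain = noise_ratio ** 0.5
    # Energy is preserved exactly for ratios 0 and 1, approximately in between.
    return [tonal_gain * c + noise_gain * noise_scale * n
            for c, n in zip(patch, noise)]


band = [0.4, -0.2, 0.1, 0.3]
noisy = mix_noise(band, noise_ratio=0.5)
```

With `noise_ratio=0.0` the band is returned unchanged, and with `noise_ratio=1.0` it is replaced by noise of the same energy.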

The SBR idea is based on the principle that human hearing tends to analyse higher frequencies with less accuracy; thus the harmonic phenomena associated with the spectral band replication process need only be accurate in a perceptual sense, not technically or mathematically exact.

History and use

The Swedish company Coding Technologies (acquired by Dolby in 2007) developed and pioneered the use of SBR in its MPEG-2 AAC-derived codec called aacPlus, which first appeared in 2001. This codec was submitted to MPEG and formed the basis of MPEG-4 High-Efficiency AAC (HE-AAC), standardized in 2003.[2] Lars Liljeryd, Kristofer Kjörling, and Martin Dietz received the IEEE Masaru Ibuka Consumer Electronics Award in 2013 for their work developing and marketing HE-AAC.[3][4] Coding Technologies' SBR method has also been used with WMA 10 Professional to create WMA 10 Pro LBR, and with MP3 to create mp3PRO.

HE-AAC, which uses SBR, is used in broadcast systems such as DAB+, Digital Radio Mondiale (including xHE-AAC), HD Radio, and XM Satellite Radio.[5]

If the player cannot use the side information transmitted alongside the "normal" compressed audio data, it may still be able to play the "baseband" data (e.g. sampled at 22.05 kHz instead of 44.1 kHz) as usual, resulting in a dull sound (since the high frequencies are missing) that is otherwise mostly acceptable. This is the case, for example, when an mp3PRO file is played back with MP3 software that cannot use the SBR information.

The CELT part of Opus performs spectral folding at the MDCT bin level, a far simpler but lower-delay technique than SBR.[6]

Dolby Digital Plus (E-AC3) performs Spectral Extension (SPX). SPX reduces high-frequency components to metadata and is similar to the E-AC3 multichannel coupling calculation.[7] Dolby AC-4 expands the technique to Advanced Spectral Extension (A-SPX), with the option of interleaving with regular, non-extended data in the time or frequency domain. As a result, SPX can be selectively disabled for difficult portions.[8]

Methods

Encoding of SBR produces a downsampled (usually 2:1) audio signal and guidance information. In an early publication, the guiding data is described as being produced by quadrature mirror filter (QMF) analysis and an envelope estimator.[9]
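The two encoder outputs can be sketched for a single frame as follows. This is only an illustration under loud assumptions: a plain DFT stands in for the QMF analysis bank, the crossover is fixed at half the signal bandwidth, the low-pass filter before decimation is a crude two-tap average, and the names `dft` and `encode_frame` are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (a stand-in for QMF analysis)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def encode_frame(frame, num_bands):
    """Split one frame into a half-rate core signal and high-band envelope data."""
    n = len(frame)
    spectrum = dft(frame)
    half = n // 2          # bins 0..half-1 span 0 Hz up to the Nyquist frequency
    crossover = half // 2  # assumed: the upper half of the bandwidth is left to SBR
    # Envelope estimator: mean energy per band across the high band.
    width = (half - crossover) // num_bands
    envelope = []
    for b in range(num_bands):
        bins = spectrum[crossover + b * width: crossover + (b + 1) * width]
        envelope.append(sum(abs(c) ** 2 for c in bins) / len(bins))
    # Crude 2:1 downsampling: two-tap average, then keep every other sample.
    core = [(frame[t] + frame[t + 1]) / 2 for t in range(0, n - 1, 2)]
    return core, envelope


# One low-frequency tone (bin 2) plus a weaker high-frequency tone (bin 10).
frame = [math.cos(2 * math.pi * 2 * t / 32) + 0.5 * math.cos(2 * math.pi * 10 * t / 32)
         for t in range(32)]
core, envelope = encode_frame(frame, num_bands=2)
```

Here `core` has half as many samples as `frame` and would be passed to the underlying codec, while `envelope` is the kind of compact guidance data that travels as side information.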

Decoding of SBR requires transposing harmonics, a case of audio time stretching and pitch scaling.[10]

  • A traditional approach starts with a discrete Fourier transform (DFT) over small intervals, followed by phase adjustments and an inverse DFT (IDFT), and ends with overlap-add. This method is sensitive to transient signals, which can cause echoes, requiring some padding (50% in USAC) in the DFT.
  • A newer approach is the QMF. One single filter bank can perform a whole time-stretch and pitch-scale operation for lower computational complexity.
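The DFT-based approach can be hinted at with a heavily simplified, single-frame sketch: each bin is moved to twice its frequency and its phase is doubled, so a partial at frequency f reappears at 2f. Windowing, overlap-add, and transient padding are all omitted, and the names `dft`, `idft`, and `transpose_frame` are hypothetical; this is not the transposer of any standard.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def transpose_frame(frame, factor=2):
    """Shift every partial of one frame up by an integer factor (no overlap-add)."""
    n = len(frame)
    spectrum = dft(frame)
    shifted = [0j] * n
    for k in range(1, n // 2):
        target = k * factor
        if target >= n // 2:
            break  # would land at or above the Nyquist bin
        magnitude = abs(spectrum[k])
        phase = cmath.phase(spectrum[k])
        # Phase adjustment: scale the phase by the transposition factor.
        value = cmath.rect(magnitude, factor * phase)
        shifted[target] = value
        shifted[n - target] = value.conjugate()  # keep the output real-valued
    return [c.real for c in idft(shifted)]


tone = [math.cos(2 * math.pi * 3 * t / 32) for t in range(32)]  # partial at bin 3
transposed = transpose_frame(tone)                              # partial now at bin 6
```

Processing each frame in isolation like this is what makes the method fragile around transients, which is the motivation for the padding mentioned in the first bullet.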

External links

  • Coding Technologies page describing SBR, as it appeared in 2007 at the Dolby acquisition

References

  1. Novak, Clark. "Spectral Band Replication and aacPlus Coding - An Overview". http://www.telos-systems.com/techtalk/aacplus/aacPlus_overview.pdf. 
  2. ISO (2003). "Bandwidth extension, ISO/IEC 14496-3:2001/Amd 1:2003". ISO. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38148. Retrieved 2009-10-13. 
  3. "IEEE Masaru Ibuka Consumer Electronics Award". http://www.ieee.org/about/awards/bios/ibuka_recipients.html. Retrieved 7 July 2015. 
  4. "Interview with Martin Dietz, Kristofer Kjörling, and Lars Liljeryd". https://www.youtube.com/watch?v=i-eKbP_K2Sg. Retrieved 7 July 2015. 
  5. "XM Radio – Fast Facts". http://sounds.xmradio.com/about/fast-facts/sound.xmc. Retrieved February 8, 2010. 
  6. "High-Quality, Low-Delay Music Coding in the Opus Codec". New York, NY: Xiph.Org Foundation. October 17–20, 2013. p. 2. http://jmvalin.ca/papers/aes135_opus_celt.pdf. 
  7. Andersen, Robert Loring; Crockett, B.; Davidson, G.; Davis, Mark; Fielder, L.; Turner, Stephen C.; Vinton, M.; Williams, P. (1 October 2004). "Introduction to Dolby Digital Plus, an Enhancement to the Dolby Digital Coding System". https://web.archive.org/web/20161119192949/https://www.dolby.com/us/en/technologies/aes-convention-paper-intro-to-dolby-digital-plus.pdf. 
  8. "Dolby® AC-4: Audio delivery for next-generation entertainment services". https://professional.dolby.com/siteassets/technologies/dolbt_atmos_ac-4_whitepaper.pdf. 
  9. Ekstrand, Per (November 2002). "Bandwidth extension of audio signals by spectral band replication". https://www.esat.kuleuven.be/psi/spraak/seminars/mpca/papers/ekstrand:mpca02.pdf. 
  10. Zhong, Haishan; Villemoes, Lars; Ekstrand, Per; Disch, Sascha; Nagel, Frederik; Wilde, Stephan; Chong, Kok Seng; Norimatsu, Takeshi (19 October 2011). "QMF Based Harmonic Spectral Band Replication" (in English). Audio Engineering Society. https://www.aes.org/e-lib/browse.cfm?elib=16043.