Spectral flux

Spectral flux is a measure of how quickly the power spectrum of a signal is changing, calculated by comparing the power spectrum for one frame against the power spectrum from the previous frame.[1]

More precisely, it is usually calculated as the L2-norm of the difference between the two normalised spectra, i.e. the Euclidean distance between them. Calculated this way, the spectral flux does not depend on overall power (since the spectra are normalised) or on phase (since only the magnitudes are compared).
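The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names, the choice of unit L2 norm as the normalisation, the small eps guard against division by zero, and the example spectra are all assumptions made for this sketch. The inputs are taken to be magnitude spectra that have already been computed for two consecutive frames.

```python
import math

def l2_normalise(spectrum, eps=1e-12):
    """Scale a magnitude spectrum to unit L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / (norm + eps) for x in spectrum]

def spectral_flux(prev_spectrum, spectrum):
    """Euclidean distance between two normalised magnitude spectra."""
    prev_n = l2_normalise(prev_spectrum)
    cur_n = l2_normalise(spectrum)
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev_n, cur_n)))

# Hypothetical magnitude spectra for three consecutive frames (4 bins each).
frame_a = [0.0, 1.0, 0.0, 0.0]
frame_b = [0.0, 5.0, 0.0, 0.0]  # same shape as frame_a, five times larger
frame_c = [0.0, 0.0, 0.0, 2.0]  # energy has moved to a different bin

flux_ab = spectral_flux(frame_a, frame_b)  # approximately 0: only the level changed
flux_bc = spectral_flux(frame_b, frame_c)  # approximately sqrt(2): the shape changed
```

Because both spectra are normalised before they are compared, scaling a frame leaves the flux essentially unchanged, while moving energy between bins produces a large value.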

The spectral flux can be used as a feature for characterising the timbre of an audio signal, or for onset detection,[2] among other applications.

Variations

Some implementations use the L1-norm rather than the L2-norm (i.e. the sum of absolute differences rather than the square root of the sum of squared differences).

Some implementations do not normalise the spectra.

For onset detection, increases in energy are important (not decreases), so some algorithms sum only over the bins in which the energy is increasing.
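The variations listed in this section can be expressed as options on a single function. The following is a hedged sketch: the name flux_variant and its parameters norm, normalise and rectify are hypothetical and are not taken from any particular library. The inputs are again assumed to be magnitude spectra of two consecutive frames.

```python
import math

def flux_variant(prev_spectrum, spectrum, norm="l2", normalise=True,
                 rectify=False, eps=1e-12):
    """Spectral flux with optional L1 norm, normalisation and rectification."""
    prev, cur = list(prev_spectrum), list(spectrum)
    if normalise:
        # Assumed normalisation: scale each spectrum to unit L2 norm.
        prev_norm = math.sqrt(sum(x * x for x in prev)) + eps
        cur_norm = math.sqrt(sum(x * x for x in cur)) + eps
        prev = [x / prev_norm for x in prev]
        cur = [x / cur_norm for x in cur]
    diffs = [c - p for p, c in zip(prev, cur)]
    if rectify:
        # Keep only bins whose value increased, as used for onset detection.
        diffs = [max(d, 0.0) for d in diffs]
    if norm == "l1":
        return sum(abs(d) for d in diffs)
    if norm == "l2":
        return math.sqrt(sum(d * d for d in diffs))
    raise ValueError("norm must be 'l1' or 'l2'")

# Hypothetical spectra: bin 0 rises by 2, bin 1 falls by 2, bin 2 is unchanged.
prev_frame = [1.0, 2.0, 0.0]
cur_frame = [3.0, 0.0, 0.0]

l1_flux = flux_variant(prev_frame, cur_frame, norm="l1", normalise=False)
l2_flux = flux_variant(prev_frame, cur_frame, norm="l2", normalise=False)
onset_flux = flux_variant(prev_frame, cur_frame, norm="l1",
                          normalise=False, rectify=True)
```

With rectification enabled, the falling bin no longer contributes, so in this example only the increase in bin 0 is counted.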

References

  1. Dimitrios Giannoulis; Michael Massberg; Joshua D. Reiss (October 2013). "Automating Dynamic Range Compression". Journal of the Audio Engineering Society (Audio Engineering Society) 61 (10): Section 2.1.3.
  2. S. Dixon (September 2006). "Onset Detection Revisited". Proceedings of the 9th International Conference on Digital Audio Effects (DAFx-06), Montreal, Canada, September 18-20, 2006.