Whitening transformation


A whitening transformation or sphering transformation is a linear transformation that transforms a vector of random variables with a known covariance matrix into a set of new variables whose covariance is the identity matrix, meaning that they are uncorrelated and each have variance 1.[1] The transformation is called "whitening" because it changes the input vector into a white noise vector. Several other transformations are closely related to whitening:

  1. the decorrelation transform removes only the correlations but leaves variances intact,
  2. the standardization transform sets variances to 1 but leaves correlations intact,
  3. a coloring transformation transforms a vector of white random variables into a random vector with a specified covariance matrix.[2]
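The three related transformations can be illustrated numerically. The following is a minimal NumPy sketch under stated assumptions: the covariance matrix `Sigma` is an arbitrary example, the decorrelation matrix is built as V^{1/2} Σ^{-1/2} (one possible construction that keeps the original variances, not the only one), and the coloring matrix is taken to be the Cholesky factor of `Sigma`.

```python
import numpy as np

# Arbitrary positive-definite covariance matrix, chosen only for illustration.
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
variances = np.diag(Sigma)

# Inverse symmetric square root Sigma^{-1/2} via the eigendecomposition.
lam, U = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = U @ np.diag(lam ** -0.5) @ U.T

# 1. Decorrelation: whiten, then rescale each component to its original
#    variance. The resulting covariance is diagonal with the variances of X.
D = np.diag(np.sqrt(variances)) @ Sigma_inv_sqrt
cov_decorrelated = D @ Sigma @ D.T

# 2. Standardization: divide each component by its standard deviation.
#    The resulting covariance is the correlation matrix of X.
S = np.diag(variances ** -0.5)
cov_standardized = S @ Sigma @ S.T

# 3. Coloring: if Z is white (identity covariance) and A A^T = Sigma,
#    then A Z has covariance A I A^T = Sigma.
A = np.linalg.cholesky(Sigma)
cov_colored = A @ np.eye(2) @ A.T

print(np.round(cov_decorrelated, 6))
print(np.round(cov_standardized, 6))
print(np.round(cov_colored, 6))
```

In this sketch only the decorrelated covariance is diagonal, only the standardized one has ones on the diagonal with non-zero off-diagonal entries, and the colored covariance reproduces `Sigma`.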


Suppose [math]\displaystyle{ X }[/math] is a random (column) vector with non-singular covariance matrix [math]\displaystyle{ \Sigma }[/math] and mean [math]\displaystyle{ 0 }[/math]. Then the transformation [math]\displaystyle{ Y = W X }[/math] with a whitening matrix [math]\displaystyle{ W }[/math] satisfying the condition [math]\displaystyle{ W^\mathrm{T} W = \Sigma^{-1} }[/math] yields the whitened random vector [math]\displaystyle{ Y }[/math], whose covariance matrix is the identity matrix.
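
A short derivation shows why this condition is sufficient. It assumes that [math]\displaystyle{ W }[/math] is square, which is an assumption added here for the argument; the non-singularity of [math]\displaystyle{ \Sigma^{-1} }[/math] then implies that [math]\displaystyle{ W }[/math] is invertible. Because [math]\displaystyle{ X }[/math] has mean zero,

[math]\displaystyle{ \operatorname{Cov}(Y) = \operatorname{E}[Y Y^\mathrm{T}] = W \operatorname{E}[X X^\mathrm{T}] W^\mathrm{T} = W \Sigma W^\mathrm{T}. }[/math]

Inverting the condition gives [math]\displaystyle{ \Sigma = (W^\mathrm{T} W)^{-1} = W^{-1} (W^\mathrm{T})^{-1} }[/math], and therefore

[math]\displaystyle{ W \Sigma W^\mathrm{T} = W W^{-1} (W^\mathrm{T})^{-1} W^\mathrm{T} = I. }[/math]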

There are infinitely many possible whitening matrices [math]\displaystyle{ W }[/math] that all satisfy the above condition. Commonly used choices are [math]\displaystyle{ W = \Sigma^{-1/2} }[/math] (Mahalanobis or ZCA whitening), [math]\displaystyle{ W = L^\mathrm{T} }[/math] where [math]\displaystyle{ L }[/math] is the lower-triangular Cholesky factor of [math]\displaystyle{ \Sigma^{-1} }[/math], i.e. [math]\displaystyle{ \Sigma^{-1} = L L^\mathrm{T} }[/math] (Cholesky whitening),[3] or [math]\displaystyle{ W = \Lambda^{-1/2} U^\mathrm{T} }[/math] where [math]\displaystyle{ \Sigma = U \Lambda U^\mathrm{T} }[/math] is the eigendecomposition of [math]\displaystyle{ \Sigma }[/math] (PCA whitening).[4]
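
The common choices can be compared numerically. The sketch below is a minimal NumPy illustration, not a reference implementation: `Sigma` is an arbitrary example matrix, `is_whitening` is a hypothetical helper, and PCA whitening is written as Λ^{-1/2} U^T with Σ = U Λ U^T. It also rotates the ZCA matrix by an orthogonal matrix `Q` to show that any such rotation of a whitening matrix is again a whitening matrix, which is why infinitely many exist.

```python
import numpy as np

# Arbitrary positive-definite covariance matrix, chosen only for illustration.
Sigma = np.array([[4.0, 1.2, 0.5],
                  [1.2, 1.0, 0.3],
                  [0.5, 0.3, 2.0]])
Sigma_inv = np.linalg.inv(Sigma)

# Eigendecomposition Sigma = U diag(lam) U^T (eigh handles symmetric input).
lam, U = np.linalg.eigh(Sigma)
inv_sqrt_lam = np.diag(lam ** -0.5)

W_zca = U @ inv_sqrt_lam @ U.T             # Sigma^{-1/2}
W_pca = inv_sqrt_lam @ U.T                 # Lambda^{-1/2} U^T
W_chol = np.linalg.cholesky(Sigma_inv).T   # L^T with Sigma^{-1} = L L^T

# Any orthogonal Q gives another whitening matrix: (Q W)^T (Q W) = W^T W.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
W_rot = Q @ W_zca


def is_whitening(W, Sigma):
    """Check W^T W = Sigma^{-1} and W Sigma W^T = I up to rounding error."""
    identity = np.eye(Sigma.shape[0])
    return bool(np.allclose(W.T @ W, np.linalg.inv(Sigma))
                and np.allclose(W @ Sigma @ W.T, identity))


checks = {name: is_whitening(W, Sigma)
          for name, W in [("ZCA", W_zca), ("PCA", W_pca),
                          ("Cholesky", W_chol), ("rotated ZCA", W_rot)]}
print(checks)
```

All four matrices pass the check even though they differ from one another; the ZCA matrix is symmetric and the Cholesky matrix is upper triangular.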

Optimal whitening transforms can be singled out by investigating the cross-covariance and cross-correlation of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math].[3] For example, the unique optimal whitening transformation achieving maximal component-wise correlation between original [math]\displaystyle{ X }[/math] and whitened [math]\displaystyle{ Y }[/math] is produced by the whitening matrix [math]\displaystyle{ W = P^{-1/2} V^{-1/2} }[/math] where [math]\displaystyle{ P }[/math] is the correlation matrix of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ V }[/math] is the diagonal matrix of its variances, so that [math]\displaystyle{ \Sigma = V^{1/2} P V^{1/2} }[/math].
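
The correlation-based whitening matrix can be sketched in NumPy as follows. The example matrix `Sigma` and the helpers `inv_sqrt_sym` and `cross_correlation` are hypothetical names introduced for illustration. As a rough check of the optimality statement, the sketch compares the sum of component-wise correlations between X and Y (the trace of the cross-correlation matrix) with the value obtained from Cholesky whitening.

```python
import numpy as np


def inv_sqrt_sym(M):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    lam, U = np.linalg.eigh(M)
    return U @ np.diag(lam ** -0.5) @ U.T


def cross_correlation(W, Sigma):
    """Correlation matrix between Y = W X and X.

    Cov(Y, X) = W Sigma; Y already has unit variances, so only the
    columns (components of X) are rescaled by their standard deviations.
    """
    return W @ Sigma @ np.diag(np.diag(Sigma) ** -0.5)


# Arbitrary positive-definite covariance matrix, chosen only for illustration.
Sigma = np.array([[4.0, 1.2, 0.5],
                  [1.2, 1.0, 0.3],
                  [0.5, 0.3, 2.0]])

V_inv_sqrt = np.diag(np.diag(Sigma) ** -0.5)   # V^{-1/2}
P = V_inv_sqrt @ Sigma @ V_inv_sqrt            # correlation matrix of X

W_cor = inv_sqrt_sym(P) @ V_inv_sqrt           # W = P^{-1/2} V^{-1/2}
W_chol = np.linalg.cholesky(np.linalg.inv(Sigma)).T

psi_cor = cross_correlation(W_cor, Sigma)
psi_chol = cross_correlation(W_chol, Sigma)

print("sum of component-wise correlations (P^{-1/2} V^{-1/2}):", np.trace(psi_cor))
print("sum of component-wise correlations (Cholesky):", np.trace(psi_chol))
```

For this choice of W the cross-correlation matrix equals P^{1/2}, which follows from substituting Σ = V^{1/2} P V^{1/2} into W Σ V^{-1/2}.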

Whitening a data matrix

Whitening a data matrix follows the same transformation as for random variables. An empirical whitening transform is obtained by estimating the covariance (e.g. by maximum likelihood) and subsequently constructing a corresponding estimated whitening matrix (e.g. by Cholesky decomposition).
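
The empirical procedure can be sketched as follows, assuming a data matrix with observations in rows and variables in columns. The simulated data, the mixing matrix `A`, and all variable names are illustrative assumptions. The covariance is estimated by maximum likelihood under a Gaussian model (division by n) and the whitening matrix is built by Cholesky decomposition, matching the examples named above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3

# Simulate correlated data: each row of X is one observation.
A = np.array([[2.0, 0.0, 0.0],
              [0.6, 1.0, 0.0],
              [0.3, 0.2, 1.5]])
X = rng.standard_normal((n, d)) @ A.T

# Center the columns, since the whitening condition assumes mean zero.
Xc = X - X.mean(axis=0)

# Maximum likelihood covariance estimate (bias=True divides by n).
Sigma_hat = np.cov(Xc, rowvar=False, bias=True)

# Estimated Cholesky whitening matrix: W = L^T with Sigma_hat^{-1} = L L^T.
W_hat = np.linalg.cholesky(np.linalg.inv(Sigma_hat)).T

# Observations are rows, so y_i = W x_i becomes Y = Xc W^T.
Y = Xc @ W_hat.T

cov_Y = np.cov(Y, rowvar=False, bias=True)
print(np.round(cov_Y, 6))
```

Because the same estimate is used to build the whitening matrix and to evaluate the result, the sample covariance of `Y` is the identity up to rounding error; on new data drawn from the same distribution it would only be approximately the identity.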

R implementation

An implementation of several whitening procedures in R, including ZCA whitening and PCA whitening but also CCA whitening, is available in the "whitening" R package[5] published on CRAN.

References

  1. Koivunen, A.C.; Kostinski, A.B. (1999). "The Feasibility of Data Whitening to Improve Performance of Weather Radar". Journal of Applied Meteorology 38 (6): 741–749. doi:10.1175/1520-0450(1999)038<0741:TFODWT>2.0.CO;2. ISSN 1520-0450. Bibcode: 1999JApMe..38..741K. https://digitalcommons.mtu.edu/cgi/viewcontent.cgi?article=1279&context=physics-fp.
  2. Hossain, Miliha. "Whitening and Coloring Transforms for Multivariate Gaussian Random Variables". Project Rhea. https://www.projectrhea.org/rhea/index.php/ECE662_Whitening_and_Coloring_Transforms_S14_MH. 
  3. Kessy, A.; Lewin, A.; Strimmer, K. (2018). "Optimal whitening and decorrelation". The American Statistician 72 (4): 309–314. doi:10.1080/00031305.2016.1277159.
  4. Friedman, J. (1987). "Exploratory Projection Pursuit". Journal of the American Statistical Association 82 (397): 249–266. doi:10.1080/01621459.1987.10478427. ISSN 0162-1459. 
  5. "whitening R package". https://cran.r-project.org/package=whitening. 
