Sherman–Morrison formula

In linear algebra, the Sherman–Morrison formula, named after Jack Sherman and Winifred J. Morrison, computes the inverse of a "rank-1 update" to a matrix whose inverse has previously been computed.[1][2][3] That is, given an invertible matrix [math]\displaystyle{ A }[/math] and the outer product [math]\displaystyle{ u v^\textsf{T} }[/math] of vectors [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v, }[/math] the formula cheaply computes an updated matrix inverse [math]\displaystyle{ \left(A + uv^\textsf{T}\right)^{-1}. }[/math]

The Sherman–Morrison formula is a special case of the Woodbury matrix identity. Though named after Sherman and Morrison, it had already appeared in earlier publications.[4]

Statement

Suppose [math]\displaystyle{ A\in\mathbb{R}^{n\times n} }[/math] is an invertible square matrix and [math]\displaystyle{ u,v\in\mathbb{R}^n }[/math] are column vectors. Then [math]\displaystyle{ A + uv^\textsf{T} }[/math] is invertible if and only if [math]\displaystyle{ 1 + v^\textsf{T} A^{-1}u \neq 0 }[/math]. In this case,

[math]\displaystyle{ \left(A + uv^\textsf{T}\right)^{-1} = A^{-1} - {A^{-1}uv^\textsf{T}A^{-1} \over 1 + v^\textsf{T}A^{-1}u}. }[/math]

Here, [math]\displaystyle{ uv^\textsf{T} }[/math] is the outer product of two vectors [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math]. The general form shown here is the one published by Bartlett.[5]
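
The formula is straightforward to check numerically. Below is a minimal sketch in Python with NumPy; the matrix size, random seed, and diagonal shift (used to keep [math]\displaystyle{ A }[/math] comfortably invertible) are illustrative choices, not part of the statement.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # comfortably invertible
    u = rng.standard_normal(n)
    v = rng.standard_normal(n)

    A_inv = np.linalg.inv(A)
    denom = 1.0 + v @ A_inv @ u                # must be nonzero
    B_inv = A_inv - np.outer(A_inv @ u, v @ A_inv) / denom

    # Compare against inverting A + u v^T from scratch.
    print(np.allclose(B_inv, np.linalg.inv(A + np.outer(u, v))))  # True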

Proof

([math]\displaystyle{ \Leftarrow }[/math]) To prove the backward direction (if [math]\displaystyle{ 1 + v^\textsf{T}A^{-1}u \neq 0 }[/math], then [math]\displaystyle{ A + uv^\textsf{T} }[/math] is invertible with inverse given as above), we verify the properties of the inverse. A matrix [math]\displaystyle{ Y }[/math] (in this case the right-hand side of the Sherman–Morrison formula) is the inverse of a matrix [math]\displaystyle{ X }[/math] (in this case [math]\displaystyle{ A + uv^\textsf{T} }[/math]) if and only if [math]\displaystyle{ XY = YX = I }[/math].

We first verify that the right-hand side ([math]\displaystyle{ Y }[/math]) satisfies [math]\displaystyle{ XY = I }[/math]:

[math]\displaystyle{ \begin{align} XY &= \left(A + uv^\textsf{T}\right)\left(A^{-1} - {A^{-1}uv^\textsf{T}A^{-1} \over 1 + v^\textsf{T} A^{-1}u}\right) \\[6pt] &= AA^{-1} + uv^\textsf{T}A^{-1} - {AA^{-1}uv^\textsf{T}A^{-1} + uv^\textsf{T}A^{-1}uv^\textsf{T}A^{-1} \over 1 + v^\textsf{T}A^{-1}u} \\[6pt] &= I + uv^\textsf{T}A^{-1} - {uv^\textsf{T}A^{-1} + uv^\textsf{T}A^{-1}uv^\textsf{T}A^{-1} \over 1 + v^\textsf{T}A^{-1}u} \\[6pt] &= I + uv^\textsf{T}A^{-1} - {u\left(1 + v^\textsf{T}A^{-1} u\right) v^\textsf{T}A^{-1} \over 1 + v^\textsf{T}A^{-1} u} \\[6pt] &= I + uv^\textsf{T}A^{-1} - uv^\textsf{T}A^{-1} \\[6pt] &= I. \end{align} }[/math]

To finish the proof of this direction, we verify in the same way that [math]\displaystyle{ YX = I }[/math]:

[math]\displaystyle{ YX = \left(A^{-1} - {A^{-1}uv^\textsf{T} A^{-1} \over 1 + v^\textsf{T} A^{-1}u}\right)(A + uv^\textsf{T}) = I. }[/math]

(In fact, the last step can be avoided since for square matrices [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], [math]\displaystyle{ XY = I }[/math] is equivalent to [math]\displaystyle{ YX = I }[/math].)

([math]\displaystyle{ \Rightarrow }[/math]) Conversely, if [math]\displaystyle{ 1 + v^\textsf{T}A^{-1}u = 0 }[/math], then by the matrix determinant lemma, [math]\displaystyle{ \det\!\left(A + uv^\textsf{T}\right) = \left(1 + v^\textsf{T} A^{-1} u\right) \det(A) = 0 }[/math], so [math]\displaystyle{ A + uv^\textsf{T} }[/math] is not invertible.
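
The forward direction can also be illustrated numerically: scaling [math]\displaystyle{ v }[/math] so that [math]\displaystyle{ 1 + v^\textsf{T}A^{-1}u = 0 }[/math] makes the updated matrix singular. A minimal sketch (sizes and seed are again illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    u = rng.standard_normal(n)
    x = rng.standard_normal(n)

    # Rescale x so that 1 + v^T A^{-1} u = 0 (up to roundoff).
    A_inv = np.linalg.inv(A)
    v = -x / (x @ A_inv @ u)

    print(1.0 + v @ A_inv @ u)                # ~0
    print(np.linalg.det(A + np.outer(u, v)))  # ~0: the update is singular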

Application

If the inverse of [math]\displaystyle{ A }[/math] is already known, the formula provides a numerically cheap way to compute the inverse of [math]\displaystyle{ A }[/math] corrected by the matrix [math]\displaystyle{ uv^\textsf{T} }[/math] (depending on the point of view, the correction may be seen as a perturbation or as a rank-1 update). The computation is relatively cheap because the inverse of [math]\displaystyle{ A + uv^\textsf{T} }[/math] does not have to be computed from scratch (which in general is expensive), but can be computed by correcting (or perturbing) [math]\displaystyle{ A^{-1} }[/math].

Using unit columns (columns from the identity matrix) for [math]\displaystyle{ u }[/math] or [math]\displaystyle{ v }[/math], individual columns or rows of [math]\displaystyle{ A }[/math] may be manipulated, and a correspondingly updated inverse computed relatively cheaply in this way.[6] In the general case, where [math]\displaystyle{ A^{-1} }[/math] is an [math]\displaystyle{ n }[/math]-by-[math]\displaystyle{ n }[/math] matrix and [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] are arbitrary vectors of dimension [math]\displaystyle{ n }[/math], the whole matrix is updated[5] and the computation takes [math]\displaystyle{ 3n^2 }[/math] scalar multiplications.[7] If [math]\displaystyle{ u }[/math] is a unit column, the computation takes only [math]\displaystyle{ 2n^2 }[/math] scalar multiplications; the same holds if [math]\displaystyle{ v }[/math] is a unit column. If both [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] are unit columns, the computation takes only [math]\displaystyle{ n^2 }[/math] scalar multiplications, as illustrated in the sketch below.
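
As a sketch of how such an update might be coded (the helper name sherman_morrison is hypothetical, not a standard library routine), the following Python/NumPy function performs the general [math]\displaystyle{ 3n^2 }[/math]-multiplication update and then uses a unit column [math]\displaystyle{ u = e_i }[/math] to replace a single row of [math]\displaystyle{ A }[/math]:

    import numpy as np

    def sherman_morrison(A_inv, u, v):
        """Return (A + u v^T)^{-1} given A^{-1}, in O(n^2) operations."""
        Au = A_inv @ u                    # n^2 multiplications
        vA = v @ A_inv                    # n^2 multiplications
        denom = 1.0 + v @ Au
        if abs(denom) < 1e-12:            # update would be (nearly) singular
            raise np.linalg.LinAlgError("A + uv^T is not invertible")
        return A_inv - np.outer(Au, vA) / denom  # n^2 multiplications

    # Example: replace row i of A. With u = e_i, the product A_inv @ u is
    # just column i of A_inv, so one of the three n^2 passes disappears.
    rng = np.random.default_rng(2)
    n, i = 4, 2
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    A_inv = np.linalg.inv(A)
    new_row = rng.standard_normal(n)

    u = np.zeros(n)
    u[i] = 1.0                            # unit column e_i
    v = new_row - A[i]                    # correction to row i
    A2 = A.copy()
    A2[i] = new_row
    print(np.allclose(sherman_morrison(A_inv, u, v), np.linalg.inv(A2)))  # True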

This formula also has application in theoretical physics. Namely, in quantum field theory, one uses this formula to calculate the propagator of a spin-1 field.[8] The inverse propagator (as it appears in the Lagrangian) has the form [math]\displaystyle{ A + uv^\textsf{T} }[/math]. One uses the Sherman–Morrison formula to calculate the inverse (satisfying certain time-ordering boundary conditions) of the inverse propagator, that is, the (Feynman) propagator, which is needed to perform any perturbative calculation[9] involving the spin-1 field.

One issue with the formula is that little is known about its numerical stability; there are no published results concerning its error bounds. Anecdotal evidence[10] suggests that the Woodbury matrix identity (a generalization of the Sherman–Morrison formula) may diverge even for seemingly benign examples (when both the original and modified matrices are well-conditioned).
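
A simple practical safeguard, not taken from the cited discussion but a standard sanity check, is to monitor the residual of the updated inverse and fall back to a direct inversion when it grows. A sketch, where update_with_check is a hypothetical helper:

    import numpy as np

    def update_with_check(A, A_inv, u, v, tol=1e-8):
        """Rank-1 update of A_inv with a residual check (a sketch,
        not a hardened routine)."""
        B = A + np.outer(u, v)
        denom = 1.0 + v @ A_inv @ u
        B_inv = A_inv - np.outer(A_inv @ u, v @ A_inv) / denom
        resid = np.linalg.norm(B @ B_inv - np.eye(len(A)))
        if not np.isfinite(resid) or resid > tol:
            B_inv = np.linalg.inv(B)      # fall back to a fresh inverse
        return B, B_inv

Note that forming the full residual costs [math]\displaystyle{ O(n^3) }[/math] operations, which defeats the savings of the update; in practice one would probe the residual with a few random vectors instead, to stay within [math]\displaystyle{ O(n^2) }[/math].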

Alternative verification

The following is an alternative verification of the Sherman–Morrison formula, using the easily verified identity

[math]\displaystyle{ \left(I + wv^\textsf{T}\right)^{-1} = I - \frac{wv^\textsf{T}}{1 + v^\textsf{T}w} }[/math].

Let [math]\displaystyle{ u = Aw }[/math], so that

[math]\displaystyle{ A + uv^\textsf{T} = A\left(I + wv^\textsf{T}\right). }[/math]

Then

[math]\displaystyle{ \left(A + uv^\textsf{T}\right)^{-1} = \left(I + wv^\textsf{T}\right)^{-1} A^{-1} = \left(I - \frac{wv^\textsf{T}}{1 + v^\textsf{T}w}\right)A^{-1} }[/math].

Substituting [math]\displaystyle{ w = A^{-1}u }[/math] gives

[math]\displaystyle{ \left(A + uv^\textsf{T}\right)^{-1} = \left(I - \frac{A^{-1}uv^\textsf{T}}{1 + v^\textsf{T}A^{-1}u} \right)A^{-1} = A^{-1} - \frac{A^{-1}uv^\textsf{T}A^{-1}}{1 + v^\textsf{T}A^{-1}u} }[/math].
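
The identity used at the start of this verification is itself easy to confirm numerically; a minimal Python/NumPy sketch (size and seed illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 4
    w = rng.standard_normal(n)
    v = rng.standard_normal(n)

    lhs = np.linalg.inv(np.eye(n) + np.outer(w, v))
    rhs = np.eye(n) - np.outer(w, v) / (1.0 + v @ w)
    print(np.allclose(lhs, rhs))  # True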

Generalization (Woodbury matrix identity)

Given a square invertible [math]\displaystyle{ n \times n }[/math] matrix [math]\displaystyle{ A }[/math], an [math]\displaystyle{ n \times k }[/math] matrix [math]\displaystyle{ U }[/math], and a [math]\displaystyle{ k \times n }[/math] matrix [math]\displaystyle{ V }[/math], let [math]\displaystyle{ B }[/math] be an [math]\displaystyle{ n \times n }[/math] matrix such that [math]\displaystyle{ B = A + UV }[/math]. Then, assuming [math]\displaystyle{ \left(I_k + VA^{-1} U\right) }[/math] is invertible, we have

[math]\displaystyle{ B^{-1} = A^{-1} - A^{-1}U \left(I_k + VA^{-1}U\right)^{-1} VA^{-1}. }[/math]
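
As with the rank-1 case, this rank-[math]\displaystyle{ k }[/math] identity only requires inverting the small [math]\displaystyle{ k \times k }[/math] matrix [math]\displaystyle{ I_k + VA^{-1}U }[/math]. A minimal numerical check (dimensions and seed are illustrative):

    import numpy as np

    rng = np.random.default_rng(4)
    n, k = 6, 2
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((k, n))

    A_inv = np.linalg.inv(A)
    K = np.eye(k) + V @ A_inv @ U        # small k-by-k "capacitance" matrix
    B_inv = A_inv - A_inv @ U @ np.linalg.inv(K) @ V @ A_inv

    print(np.allclose(B_inv, np.linalg.inv(A + U @ V)))  # True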

References

  1. Sherman, Jack; Morrison, Winifred J. (1949). "Adjustment of an Inverse Matrix Corresponding to Changes in the Elements of a Given Column or a Given Row of the Original Matrix (abstract)". Annals of Mathematical Statistics 20: 621. doi:10.1214/aoms/1177729959. 
  2. Sherman, Jack; Morrison, Winifred J. (1950). "Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given Matrix". Annals of Mathematical Statistics 21 (1): 124–127. doi:10.1214/aoms/1177729893. 
  3. Press, William H.; Teukolsky, Saul A.; Vetterling, William T.; Flannery, Brian P. (2007), "Section 2.7.1 Sherman–Morrison Formula", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8, http://apps.nrbook.com/empanel/index.html?pg=76, retrieved 2011-08-08 
  4. Hager, William W. (1989). "Updating the inverse of a matrix". SIAM Review 31 (2): 221–239. doi:10.1137/1031049. 
  5. 5.0 5.1 Bartlett, Maurice S. (1951). "An Inverse Matrix Adjustment Arising in Discriminant Analysis". Annals of Mathematical Statistics 22 (1): 107–111. doi:10.1214/aoms/1177729698. 
  6. Langville, Amy N.; Meyer, Carl D. (2006). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press. p. 156.
  7. Update of the inverse matrix by the Sherman–Morrison formula
  8. Propagator
  9. "Perturbative quantum field theory". https://ncatlab.org/nlab/show/perturbative+quantum+field+theory. 
  10. "MathOverflow discussion". https://mathoverflow.net/questions/80340/special-considerations-when-using-the-woodbury-matrix-identity-numerically. 
