Rank factorization

In mathematics, given a field [math]\displaystyle{ \mathbb F }[/math], nonnegative integers [math]\displaystyle{ m,n }[/math], and a matrix [math]\displaystyle{ A\in\mathbb F^{m\times n} }[/math], a rank decomposition or rank factorization of [math]\displaystyle{ A }[/math] is a factorization of the form [math]\displaystyle{ A = CF }[/math], where [math]\displaystyle{ C\in\mathbb F^{m\times r} }[/math] and [math]\displaystyle{ F\in\mathbb F^{r\times n} }[/math], with [math]\displaystyle{ r=\operatorname{rank} A }[/math] the rank of [math]\displaystyle{ A }[/math].

Existence

Every finite-dimensional matrix has a rank decomposition: Let [math]\displaystyle{ A }[/math] be an [math]\displaystyle{ m\times n }[/math] matrix whose column rank is [math]\displaystyle{ r }[/math]. Then there are [math]\displaystyle{ r }[/math] linearly independent columns in [math]\displaystyle{ A }[/math]; equivalently, the dimension of the column space of [math]\displaystyle{ A }[/math] is [math]\displaystyle{ r }[/math]. Let [math]\displaystyle{ \mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_r }[/math] be any basis for the column space of [math]\displaystyle{ A }[/math] and place them as column vectors to form the [math]\displaystyle{ m\times r }[/math] matrix [math]\displaystyle{ C = \begin{bmatrix}\mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_r\end{bmatrix} }[/math]. Every column vector of [math]\displaystyle{ A }[/math] is then a linear combination of the columns of [math]\displaystyle{ C }[/math]. To be precise, if [math]\displaystyle{ A = \begin{bmatrix}\mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n\end{bmatrix} }[/math] is an [math]\displaystyle{ m\times n }[/math] matrix with [math]\displaystyle{ \mathbf{a}_j }[/math] as the [math]\displaystyle{ j }[/math]-th column, then

[math]\displaystyle{ \mathbf{a}_j = f_{1j} \mathbf{c}_1 + f_{2j} \mathbf{c}_2 + \cdots + f_{rj} \mathbf{c}_r, }[/math]

where [math]\displaystyle{ f_{ij} }[/math]'s are the scalar coefficients of [math]\displaystyle{ \mathbf{a}_j }[/math] in terms of the basis [math]\displaystyle{ \mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_r }[/math]. This implies that [math]\displaystyle{ A = CF }[/math], where [math]\displaystyle{ f_{ij} }[/math] is the [math]\displaystyle{ (i,j) }[/math]-th element of [math]\displaystyle{ F }[/math].
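
This construction can be mirrored computationally. The following is a minimal sketch (assuming SymPy is available, and using the matrix from the example further below): [math]\displaystyle{ C }[/math] is assembled from a basis of the column space, and [math]\displaystyle{ F }[/math] is recovered by solving the full-column-rank system [math]\displaystyle{ CF = A }[/math].

<syntaxhighlight lang="python">
import sympy as sp

A = sp.Matrix([[1, 3, 1, 4],
               [2, 7, 3, 9],
               [1, 5, 3, 1],
               [1, 2, 0, 8]])

# Columns of C form a basis of the column space of A
# (SymPy returns the pivot columns of A as the basis).
C = sp.Matrix.hstack(*A.columnspace())

# C has full column rank, so C^T C is invertible and
# F = (C^T C)^{-1} C^T A is the unique solution of C F = A.
F = (C.T * C).inv() * C.T * A

assert C * F == A
</syntaxhighlight>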

Non-uniqueness

If [math]\displaystyle{ A = C_1 F_1 }[/math] is a rank factorization, taking [math]\displaystyle{ C_2 = C_1 R }[/math] and [math]\displaystyle{ F_2 = R^{-1} F_1 }[/math] gives another rank factorization for any invertible matrix [math]\displaystyle{ R }[/math] of compatible dimensions.

Conversely, if [math]\displaystyle{ A = C_{1}F_{1} = C_{2}F_{2} }[/math] are two rank factorizations of [math]\displaystyle{ A }[/math], then there exists an invertible matrix [math]\displaystyle{ R }[/math] such that [math]\displaystyle{ C_1 = C_2 R }[/math] and [math]\displaystyle{ F_1 = R^{-1} F_{2} }[/math].[1]
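
A small numerical illustration of this freedom (a sketch assuming NumPy; the matrices here are arbitrary, not taken from the article): any invertible [math]\displaystyle{ R }[/math] mixes the factors without changing the product.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

C1 = rng.standard_normal((4, 2))   # full column rank (generically)
F1 = rng.standard_normal((2, 3))   # full row rank (generically)
A = C1 @ F1                        # a rank-2 matrix

R = np.array([[2.0, 1.0],
              [0.0, 3.0]])         # any invertible 2x2 matrix

C2 = C1 @ R
F2 = np.linalg.inv(R) @ F1

# C2 F2 reproduces A, so the factorization is not unique.
assert np.allclose(C2 @ F2, A)
</syntaxhighlight>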

Construction

Rank factorization from reduced row echelon forms

In practice, we can construct one specific rank factorization as follows: we can compute [math]\displaystyle{ B }[/math], the reduced row echelon form of [math]\displaystyle{ A }[/math]. Then [math]\displaystyle{ C }[/math] is obtained by removing from [math]\displaystyle{ A }[/math] all non-pivot columns (which can be determined by looking for columns in [math]\displaystyle{ B }[/math] which do not contain a pivot), and [math]\displaystyle{ F }[/math] is obtained by eliminating any all-zero rows of [math]\displaystyle{ B }[/math].

Note: For a full-rank square matrix (i.e. when [math]\displaystyle{ n=m=r }[/math]), this procedure will yield the trivial result [math]\displaystyle{ C=A }[/math] and [math]\displaystyle{ F=B=I_n }[/math] (the [math]\displaystyle{ n\times n }[/math] identity matrix).

Example

Consider the matrix

[math]\displaystyle{ A = \begin{bmatrix} 1 & 3 & 1 & 4 \\ 2 & 7 & 3 & 9 \\ 1 & 5 & 3 & 1 \\ 1 & 2 & 0 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} = B\text{.} }[/math]

[math]\displaystyle{ B }[/math] is the reduced row echelon form of [math]\displaystyle{ A }[/math].

Then [math]\displaystyle{ C }[/math] is obtained by removing the third column of [math]\displaystyle{ A }[/math], the only one which is not a pivot column, and [math]\displaystyle{ F }[/math] by getting rid of the last row of zeroes from [math]\displaystyle{ B }[/math], so

[math]\displaystyle{ C = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 7 & 9 \\ 1 & 5 & 1 \\ 1 & 2 & 8 \end{bmatrix}\text{,}\qquad F = \begin{bmatrix} 1 & 0 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\text{.} }[/math]

It is straightforward to check that

[math]\displaystyle{ A = \begin{bmatrix} 1 & 3 & 1 & 4 \\ 2 & 7 & 3 & 9 \\ 1 & 5 & 3 & 1 \\ 1 & 2 & 0 & 8 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 4 \\ 2 & 7 & 9 \\ 1 & 5 & 1 \\ 1 & 2 & 8 \end{bmatrix} \begin{bmatrix} 1 & 0 & -2 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = CF\text{.} }[/math]
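
The construction is also easy to carry out mechanically. The following SymPy sketch (an illustration, not part of the original article) reproduces [math]\displaystyle{ C }[/math] and [math]\displaystyle{ F }[/math] for this example: rref returns both the reduced row echelon form and the indices of the pivot columns, which is exactly what the procedure above needs.

<syntaxhighlight lang="python">
import sympy as sp

A = sp.Matrix([[1, 3, 1, 4],
               [2, 7, 3, 9],
               [1, 5, 3, 1],
               [1, 2, 0, 8]])

B, pivots = A.rref()      # reduced row echelon form and pivot column indices
r = len(pivots)           # rank of A

C = A[:, list(pivots)]    # keep only the pivot columns of A
F = B[:r, :]              # keep only the nonzero rows of B

assert C * F == A
</syntaxhighlight>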

Proof

Let [math]\displaystyle{ P }[/math] be an [math]\displaystyle{ n\times n }[/math] permutation matrix such that [math]\displaystyle{ AP = (C, D) }[/math] in block partitioned form, where the columns of [math]\displaystyle{ C }[/math] are the [math]\displaystyle{ r }[/math] pivot columns of [math]\displaystyle{ A }[/math]. Every column of [math]\displaystyle{ D }[/math] is a linear combination of the columns of [math]\displaystyle{ C }[/math], so there is a matrix [math]\displaystyle{ G }[/math] such that [math]\displaystyle{ D = CG }[/math], where the columns of [math]\displaystyle{ G }[/math] contain the coefficients of each of those linear combinations. So [math]\displaystyle{ AP = (C, CG) = C(I_r, G) }[/math], [math]\displaystyle{ I_r }[/math] being the [math]\displaystyle{ r\times r }[/math] identity matrix. We will show now that [math]\displaystyle{ (I_r, G) = FP }[/math].

Transforming [math]\displaystyle{ A }[/math] into its reduced row echelon form [math]\displaystyle{ B }[/math] amounts to left-multiplying by a matrix [math]\displaystyle{ E }[/math] which is a product of elementary matrices, so [math]\displaystyle{ EAP = BP = EC(I_r, G) }[/math], where [math]\displaystyle{ EC = \begin{pmatrix} I_r \\ 0 \end{pmatrix} }[/math]. We can then write [math]\displaystyle{ BP = \begin{pmatrix} I_r & G \\ 0 & 0 \end{pmatrix} }[/math], which allows us to identify [math]\displaystyle{ (I_r, G) = FP }[/math], i.e. the [math]\displaystyle{ r }[/math] nonzero rows of the reduced row echelon form, with the same permutation applied to the columns as was applied to [math]\displaystyle{ A }[/math]. We thus have [math]\displaystyle{ AP = CFP }[/math], and since [math]\displaystyle{ P }[/math] is invertible this implies [math]\displaystyle{ A = CF }[/math], which completes the proof.

Singular value decomposition

If [math]\displaystyle{ \mathbb F\in\{\mathbb R,\mathbb C\}, }[/math] then one can also construct a full-rank factorization of [math]\displaystyle{ A }[/math] via a singular value decomposition

[math]\displaystyle{ A = U \Sigma V^* = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^* \\ V_2^* \end{bmatrix} = U_1 \left(\Sigma_r V_1^*\right) . }[/math]

Since [math]\displaystyle{ U_1 }[/math] is a full-column-rank matrix and [math]\displaystyle{ \Sigma_r V_1^* }[/math] is a full-row-rank matrix, we can take [math]\displaystyle{ C = U_1 }[/math] and [math]\displaystyle{ F = \Sigma_r V_1^* }[/math].
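
In floating-point arithmetic this is typically done with a numerical SVD together with a tolerance-based rank decision. The NumPy sketch below uses one common tolerance choice (an assumption, not a prescription from the article) and builds [math]\displaystyle{ C = U_1 }[/math] and [math]\displaystyle{ F = \Sigma_r V_1^* }[/math].

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 3, 1, 4],
              [2, 7, 3, 9],
              [1, 5, 3, 1],
              [1, 2, 0, 8]], dtype=float)

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Numerical rank: count singular values above a tolerance.
tol = max(A.shape) * np.finfo(float).eps * s[0]
r = int(np.sum(s > tol))

C = U[:, :r]                   # U_1: full column rank
F = s[:r, None] * Vh[:r, :]    # Sigma_r V_1^*: full row rank

assert np.allclose(C @ F, A)
</syntaxhighlight>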

Consequences

rank(A) = rank(A^T)

An immediate consequence of rank factorization is that the rank of [math]\displaystyle{ A }[/math] is equal to the rank of its transpose [math]\displaystyle{ A^\textsf{T} }[/math]. Since the columns of [math]\displaystyle{ A }[/math] are the rows of [math]\displaystyle{ A^\textsf{T} }[/math], the column rank of [math]\displaystyle{ A }[/math] equals its row rank.[2]

Proof: To see why this is true, let us first define rank to mean column rank. Since [math]\displaystyle{ A = CF }[/math], it follows that [math]\displaystyle{ A^\textsf{T} = F^\textsf{T}C^\textsf{T} }[/math]. From the definition of matrix multiplication, this means that each column of [math]\displaystyle{ A^\textsf{T} }[/math] is a linear combination of the columns of [math]\displaystyle{ F^\textsf{T} }[/math]. Therefore, the column space of [math]\displaystyle{ A^\textsf{T} }[/math] is contained within the column space of [math]\displaystyle{ F^\textsf{T} }[/math] and, hence, [math]\displaystyle{ \operatorname{rank}\left(A^\textsf{T}\right) \leq \operatorname{rank}\left(F^\textsf{T}\right) }[/math].

Now, [math]\displaystyle{ F^\textsf{T} }[/math] is [math]\displaystyle{ n \times r }[/math], so there are [math]\displaystyle{ r }[/math] columns in [math]\displaystyle{ F^\textsf{T} }[/math] and, hence, [math]\displaystyle{ \operatorname{rank}\left(A^\textsf{T}\right) \leq r = \operatorname{rank}\left(A\right) }[/math]. This proves that [math]\displaystyle{ \operatorname{rank}\left(A^\textsf{T}\right) \leq \operatorname{rank}\left(A\right) }[/math].

Now apply the result to [math]\displaystyle{ A^\textsf{T} }[/math] to obtain the reverse inequality: since [math]\displaystyle{ \left(A^\textsf{T}\right)^\textsf{T} = A }[/math], we can write [math]\displaystyle{ \operatorname{rank}\left(A\right)= \operatorname{rank}\left(\left(A^\textsf{T}\right)^\textsf{T}\right) \leq \operatorname{rank}\left(A^\textsf{T}\right) }[/math]. This proves [math]\displaystyle{ \operatorname{rank}\left(A\right) \leq \operatorname{rank}\left(A^\textsf{T}\right) }[/math].

We have, therefore, proved [math]\displaystyle{ \operatorname{rank}\left(A^\textsf{T}\right) \leq \operatorname{rank}\left(A\right) }[/math] and [math]\displaystyle{ \operatorname{rank}\left(A\right) \leq \operatorname{rank}\left(A^\textsf{T}\right) }[/math], so [math]\displaystyle{ \operatorname{rank}\left(A\right) = \operatorname{rank}\left(A^\textsf{T}\right) }[/math].
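
As a quick numerical sanity check (not a substitute for the proof above), the example matrix from earlier has the same rank as its transpose:

<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1, 3, 1, 4],
              [2, 7, 3, 9],
              [1, 5, 3, 1],
              [1, 2, 0, 8]], dtype=float)

# Both ranks are 3, matching the three pivot columns found above.
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T) == 3
</syntaxhighlight>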

Notes

  1. Piziak, R.; Odell, P. L. (1 June 1999). "Full Rank Factorization of Matrices". Mathematics Magazine 72 (3): 193. doi:10.2307/2690882. 
  2. Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388 

References

  • Banerjee, Sudipto; Roy, Anindya (2014), Linear Algebra and Matrix Analysis for Statistics, Texts in Statistical Science (1st ed.), Chapman and Hall/CRC, ISBN 978-1420095388 
  • Lay, David C. (2005), Linear Algebra and its Applications (3rd ed.), Addison Wesley, ISBN 978-0-201-70970-4 
  • Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations, Johns Hopkins Studies in Mathematical Sciences (3rd ed.), The Johns Hopkins University Press, ISBN 978-0-8018-5414-9 
  • Stewart, Gilbert W. (1998), Matrix Algorithms. I. Basic Decompositions, SIAM, ISBN 978-0-89871-414-2 
  • Piziak, R.; Odell, P. L. (1 June 1999). "Full Rank Factorization of Matrices". Mathematics Magazine 72 (3): 193. doi:10.2307/2690882.