Leibniz formula for determinants

In algebra, the Leibniz formula, named in honor of Gottfried Leibniz, expresses the determinant of a square matrix in terms of permutations of the matrix elements. If [math]\displaystyle{ A }[/math] is an [math]\displaystyle{ n \times n }[/math] matrix, where [math]\displaystyle{ a_{ij} }[/math] is the entry in the [math]\displaystyle{ i }[/math]-th row and [math]\displaystyle{ j }[/math]-th column of [math]\displaystyle{ A }[/math], the formula is

[math]\displaystyle{ \det(A) = \sum_{\tau \in S_n} \sgn(\tau) \prod_{i = 1}^n a_{i\tau(i)} = \sum_{\sigma \in S_n} \sgn(\sigma) \prod_{i = 1}^n a_{\sigma(i)i} }[/math]

where [math]\displaystyle{ \sgn }[/math] is the sign function of permutations in the permutation group [math]\displaystyle{ S_n }[/math], which returns [math]\displaystyle{ +1 }[/math] and [math]\displaystyle{ -1 }[/math] for even and odd permutations, respectively.
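The first sum in the formula can be transcribed almost literally into code. The following is a minimal sketch, not a practical algorithm: the names `sign` and `leibniz_det` are illustrative, indices are 0-based rather than 1-based, and the sign is obtained by counting inversions, which is one of several equivalent ways to compute the parity of a permutation.

```python
from itertools import permutations
from math import prod


def sign(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    `perm` is a tuple containing 0..n-1 in some order; the parity is found
    by counting inversions (pairs of entries that appear out of order).
    """
    n = len(perm)
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(a):
    """Determinant of a square matrix (list of rows) via the Leibniz formula."""
    n = len(a)
    # One signed product per permutation tau: entries a[i][tau(i)], i = 0..n-1.
    return sum(
        sign(tau) * prod(a[i][tau[i]] for i in range(n))
        for tau in permutations(range(n))
    )


print(leibniz_det([[1, 2], [3, 4]]))                   # 1*4 - 2*3 = -2
print(leibniz_det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

The sum has one term per permutation, so the running time grows factorially with the matrix size.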

Another common notation for the formula uses the Levi-Civita symbol together with the Einstein summation convention, in which it becomes

[math]\displaystyle{ \det(A) = \epsilon_{i_1\cdots i_n} {a}_{1i_1} \cdots {a}_{ni_n}, }[/math]

which may be more familiar to physicists.
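To make the implied summation concrete, the sketch below sums over all index tuples [math]\displaystyle{ (i_1, \dots, i_n) }[/math] explicitly. The function names are hypothetical and the indices are 0-based. The Levi-Civita symbol is taken to be zero whenever an index repeats and the sign of the permutation otherwise, so only the [math]\displaystyle{ n! }[/math] permutation tuples among the [math]\displaystyle{ n^n }[/math] tuples contribute.

```python
from itertools import product
from math import prod


def levi_civita(indices):
    """Levi-Civita symbol: 0 on a repeated index, else the permutation sign."""
    if len(set(indices)) != len(indices):
        return 0
    inversions = sum(
        1
        for p in range(len(indices))
        for q in range(p + 1, len(indices))
        if indices[p] > indices[q]
    )
    return -1 if inversions % 2 else 1


def det_levi_civita(a):
    """Sum epsilon[i1..in] * a[0][i1] * ... * a[n-1][in] over all index tuples."""
    n = len(a)
    return sum(
        levi_civita(idx) * prod(a[r][idx[r]] for r in range(n))
        for idx in product(range(n), repeat=n)
    )


print(det_levi_civita([[1, 2], [3, 4]]))  # -2
```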

Directly evaluating the Leibniz formula from the definition requires [math]\displaystyle{ \Omega(n! \cdot n) }[/math] operations in general, that is, a number of operations that grows at least as fast as [math]\displaystyle{ n }[/math] factorial, because [math]\displaystyle{ n! }[/math] is the number of permutations of [math]\displaystyle{ n }[/math] elements. This is impractical even for relatively small [math]\displaystyle{ n }[/math]. Instead, the determinant can be evaluated in [math]\displaystyle{ O(n^3) }[/math] operations by forming the LU decomposition [math]\displaystyle{ A = LU }[/math] (typically via Gaussian elimination or similar methods), in which case [math]\displaystyle{ \det A = \det L \cdot \det U }[/math] and the determinants of the triangular matrices [math]\displaystyle{ L }[/math] and [math]\displaystyle{ U }[/math] are simply the products of their diagonal entries. (In practical applications of numerical linear algebra, however, explicit computation of the determinant is rarely required.) See, for example, Trefethen and Bau. The determinant can also be evaluated in asymptotically fewer than [math]\displaystyle{ n^3 }[/math] operations by reducing the problem to matrix multiplication, but most such algorithms are not practical.
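The elimination approach can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: it uses row swaps (so it effectively factors a row-permuted copy of the matrix, with each swap flipping the sign of the determinant), it works in exact rational arithmetic via `fractions.Fraction` to avoid rounding questions, and the name `det_by_elimination` is made up for this example. Because [math]\displaystyle{ L }[/math] has a unit diagonal in this scheme, only the diagonal of [math]\displaystyle{ U }[/math] is accumulated.

```python
from fractions import Fraction


def det_by_elimination(a):
    """Determinant via Gaussian elimination with row swaps (O(n^3) arithmetic ops)."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]  # working copy, exact arithmetic
    det = Fraction(1)
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        pivot_row = next((r for r in range(k, n) if u[r][k] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # no pivot available: the matrix is singular
        if pivot_row != k:
            u[k], u[pivot_row] = u[pivot_row], u[k]
            det = -det  # each row swap flips the sign of the determinant
        det *= u[k][k]  # k-th diagonal entry of U
        for r in range(k + 1, n):
            factor = u[r][k] / u[k][k]
            for c in range(k, n):
                u[r][c] -= factor * u[k][c]
    return det


print(det_by_elimination([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

The three nested loops over rows and columns are what give the cubic operation count, in contrast to the factorial growth of the direct sum.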

Formal statement and proof

Theorem. There exists exactly one function [math]\displaystyle{ F : M_n (\mathbb K) \rightarrow \mathbb K }[/math] which is alternating and multilinear with respect to the columns and such that [math]\displaystyle{ F(I) = 1 }[/math].

Proof.

Uniqueness: Let [math]\displaystyle{ F }[/math] be such a function, and let [math]\displaystyle{ A = (a_i^j)_{i = 1, \dots, n}^{j = 1, \dots , n} }[/math] be an [math]\displaystyle{ n \times n }[/math] matrix. Call [math]\displaystyle{ A^j }[/math] the [math]\displaystyle{ j }[/math]-th column of [math]\displaystyle{ A }[/math], i.e. [math]\displaystyle{ A^j = (a_i^j)_{i = 1, \dots , n} }[/math], so that [math]\displaystyle{ A = \left(A^1, \dots, A^n\right). }[/math]

Also, let [math]\displaystyle{ E^k }[/math] denote the [math]\displaystyle{ k }[/math]-th column vector of the identity matrix.

Now one writes each of the [math]\displaystyle{ A^j }[/math]'s in terms of the [math]\displaystyle{ E^k }[/math], i.e.

[math]\displaystyle{ A^j = \sum_{k = 1}^n a_k^j E^k }[/math].

As [math]\displaystyle{ F }[/math] is multilinear, one has

[math]\displaystyle{ \begin{align} F(A)& = F\left(\sum_{k_1 = 1}^n a_{k_1}^1 E^{k_1}, \dots, \sum_{k_n = 1}^n a_{k_n}^n E^{k_n}\right) = \sum_{k_1, \dots, k_n = 1}^n \left(\prod_{i = 1}^n a_{k_i}^i\right) F\left(E^{k_1}, \dots, E^{k_n}\right). \end{align} }[/math]

From alternation it follows that any term with repeated indices is zero. The sum can therefore be restricted to tuples with non-repeating indices, i.e. permutations:

[math]\displaystyle{ F(A) = \sum_{\sigma \in S_n} \left(\prod_{i = 1}^n a_{\sigma(i)}^i\right) F(E^{\sigma(1)}, \dots , E^{\sigma(n)}). }[/math]

Because [math]\displaystyle{ F }[/math] is alternating and multilinear, swapping two columns changes its sign. The columns [math]\displaystyle{ E^{\sigma(1)}, \dots, E^{\sigma(n)} }[/math] can therefore be swapped pairwise until they form the identity matrix. The sign function [math]\displaystyle{ \sgn(\sigma) }[/math] records the parity of the number of swaps needed and hence the resulting sign change. One finally gets:
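The swap-counting description of the sign can be sketched in code. The function below (a hypothetical helper, using 0-based indices) sorts a permutation into the identity by transpositions and returns [math]\displaystyle{ +1 }[/math] or [math]\displaystyle{ -1 }[/math] according to whether the number of swaps used is even or odd. Different swap sequences may have different lengths, but their parity is always the same.

```python
def sign_by_swaps(sigma):
    """Sort sigma (a permutation of 0..n-1) by transpositions and
    return (-1) ** (number of swaps used)."""
    p = list(sigma)
    swaps = 0
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]  # moves the value j into its home slot j
            swaps += 1
    return -1 if swaps % 2 else 1


print(sign_by_swaps((1, 0, 2)))  # one swap: -1
print(sign_by_swaps((1, 2, 0)))  # two swaps (a 3-cycle): 1
```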

[math]\displaystyle{ \begin{align} F(A)& = \sum_{\sigma \in S_n} \sgn(\sigma) \left(\prod_{i = 1}^n a_{\sigma(i)}^i\right) F(I)\\ & = \sum_{\sigma \in S_n} \sgn(\sigma) \prod_{i = 1}^n a_{\sigma(i)}^i \end{align} }[/math]

as [math]\displaystyle{ F(I) }[/math] is required to be equal to [math]\displaystyle{ 1 }[/math].
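As a worked illustration of the uniqueness argument (added here for concreteness), take [math]\displaystyle{ n = 2 }[/math]. Multilinearity expands [math]\displaystyle{ F(A) }[/math] into four terms:

[math]\displaystyle{ F(A) = a_1^1 a_1^2 F(E^1, E^1) + a_1^1 a_2^2 F(E^1, E^2) + a_2^1 a_1^2 F(E^2, E^1) + a_2^1 a_2^2 F(E^2, E^2). }[/math]

The first and last terms vanish because their two columns coincide, and [math]\displaystyle{ F(E^2, E^1) = -F(E^1, E^2) = -F(I) = -1 }[/math], so

[math]\displaystyle{ F(A) = a_1^1 a_2^2 - a_2^1 a_1^2, }[/math]

which is the familiar [math]\displaystyle{ 2 \times 2 }[/math] determinant.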

Therefore no function other than the one defined by the Leibniz formula can be a multilinear alternating function with [math]\displaystyle{ F\left(I\right)=1 }[/math].

Existence: We now show that the function [math]\displaystyle{ F }[/math] defined by the Leibniz formula has these three properties: it is multilinear, it is alternating, and [math]\displaystyle{ F(I) = 1 }[/math].

Multilinear:

[math]\displaystyle{ \begin{align} F(A^1, \dots, cA^j, \dots) & = \sum_{\sigma \in S_n} \sgn(\sigma) ca_{\sigma(j)}^j\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\\ & = c \sum_{\sigma \in S_n} \sgn(\sigma) a_{\sigma(j)}^j\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\\ &=c F(A^1, \dots, A^j, \dots)\\ \\ F(A^1, \dots, b+A^j, \dots) & = \sum_{\sigma \in S_n} \sgn(\sigma)\left(b_{\sigma(j)} + a_{\sigma(j)}^j\right)\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\\ & = \sum_{\sigma \in S_n} \sgn(\sigma) \left( \left(b_{\sigma(j)}\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\right) + \left(a_{\sigma(j)}^j\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\right)\right)\\ & = \left(\sum_{\sigma \in S_n} \sgn(\sigma) b_{\sigma(j)}\prod_{i = 1, i \neq j}^n a_{\sigma(i)}^i\right) + \left(\sum_{\sigma \in S_n} \sgn(\sigma) \prod_{i = 1}^n a_{\sigma(i)}^i\right)\\ &= F(A^1, \dots, b, \dots) + F(A^1, \dots, A^j, \dots)\\ \\ \end{align} }[/math]
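The two identities above can be spot-checked numerically. The sketch below is only a sanity check on sample data, not a proof. The matrix is stored as a list of columns to match the notation [math]\displaystyle{ A = (A^1, \dots, A^n) }[/math], and the helper name `leibniz_det_columns` as well as the sample values of `A`, `b`, `c` and `j` are arbitrary choices made for this illustration (with 0-based indices).

```python
from itertools import permutations
from math import prod


def leibniz_det_columns(cols):
    """Leibniz sum for a matrix given as a list of columns:
    cols[i][r] is the entry in row r of column i."""
    n = len(cols)

    def sign(p):
        inv = sum(1 for x in range(n) for y in range(x + 1, n) if p[x] > p[y])
        return -1 if inv % 2 else 1

    return sum(
        sign(s) * prod(cols[i][s[i]] for i in range(n))
        for s in permutations(range(n))
    )


# Arbitrary sample data, chosen only for illustration.
A = [[2, 1, 1], [0, 3, 1], [1, 2, 4]]  # three columns
b = [5, -1, 2]
c, j = 7, 1

scaled = A[:j] + [[c * x for x in A[j]]] + A[j + 1:]             # column j times c
summed = A[:j] + [[x + y for x, y in zip(b, A[j])]] + A[j + 1:]  # column j plus b
with_b = A[:j] + [b] + A[j + 1:]                                 # column j replaced by b

# Homogeneity and additivity in column j; both comparisons hold exactly
# because the arithmetic is on integers.
print(leibniz_det_columns(scaled) == c * leibniz_det_columns(A))
print(leibniz_det_columns(summed)
      == leibniz_det_columns(with_b) + leibniz_det_columns(A))
```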

Alternating:

[math]\displaystyle{ \begin{align} F(\dots, A^{j_1}, \dots, A^{j_2}, \dots) & = \sum_{\sigma \in S_n} \sgn(\sigma) \left(\prod_{i = 1, i \neq j_1, i\neq j_2}^n a_{\sigma(i)}^i\right) a_{\sigma(j_1)}^{j_1} a_{\sigma(j_2)}^{j_2}\\ \end{align} }[/math]

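For any [math]\displaystyle{ \sigma \in S_n }[/math] let [math]\displaystyle{ \sigma' }[/math] be the permutation obtained from [math]\displaystyle{ \sigma }[/math] by exchanging its values at [math]\displaystyle{ j_1 }[/math] and [math]\displaystyle{ j_2 }[/math], i.e. [math]\displaystyle{ \sigma'(j_1) = \sigma(j_2) }[/math], [math]\displaystyle{ \sigma'(j_2) = \sigma(j_1) }[/math] and [math]\displaystyle{ \sigma'(i) = \sigma(i) }[/math] otherwise. Since [math]\displaystyle{ \sigma' }[/math] differs from [math]\displaystyle{ \sigma }[/math] by a single transposition, [math]\displaystyle{ \sgn(\sigma') = -\sgn(\sigma) }[/math]. Pairing each [math]\displaystyle{ \sigma }[/math] with its [math]\displaystyle{ \sigma' }[/math] gives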

[math]\displaystyle{ \begin{align} F(A) & = \sum_{\sigma\in S_{n},\sigma(j_{1})\lt \sigma(j_{2})}\left[\sgn(\sigma)\left(\prod_{i = 1, i \neq j_1, i\neq j_2}^na_{\sigma(i)}^{i}\right)a_{\sigma(j_{1})}^{j_{1}}a_{\sigma(j_{2})}^{j_{2}}+\sgn(\sigma')\left(\prod_{i = 1, i \neq j_1, i\neq j_2}^na_{\sigma'(i)}^{i}\right)a_{\sigma'(j_{1})}^{j_{1}}a_{\sigma'(j_{2})}^{j_{2}}\right]\\ & =\sum_{\sigma\in S_{n},\sigma(j_{1})\lt \sigma(j_{2})}\left[\sgn(\sigma)\left(\prod_{i = 1, i \neq j_1, i\neq j_2}^na_{\sigma(i)}^{i}\right)a_{\sigma(j_{1})}^{j_{1}}a_{\sigma(j_{2})}^{j_{2}}-\sgn(\sigma)\left(\prod_{i = 1, i \neq j_1, i\neq j_2}^na_{\sigma(i)}^{i}\right)a_{\sigma(j_{2})}^{j_{1}}a_{\sigma(j_{1})}^{j_{2}}\right]\\ & =\sum_{\sigma\in S_{n},\sigma(j_{1})\lt \sigma(j_{2})}\sgn(\sigma)\left(\prod_{i = 1, i \neq j_1, i\neq j_2}^na_{\sigma(i)}^{i}\right)\underbrace{\left(a_{\sigma(j_{1})}^{j_{1}}a_{\sigma(j_{2})}^{j_{2}}-a_{\sigma(j_{1})}^{j_{2}}a_{\sigma(j_{2})}^{j_{1}}\right)}_{=0\text{, if }A^{j_1}=A^{j_2}} \end{align} }[/math]

Thus if [math]\displaystyle{ A^{j_1} = A^{j_2} }[/math] then [math]\displaystyle{ F(\dots, A^{j_1}, \dots, A^{j_2}, \dots)=0 }[/math].
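Both ingredients of this argument can be checked by brute force for small sizes. The following sketch uses hypothetical helper names, 0-based indices and arbitrary sample columns. It verifies that exchanging the values of a permutation at two positions flips its sign, and that the Leibniz sum vanishes for a matrix with two equal columns.

```python
from itertools import permutations
from math import prod


def sign(p):
    """Parity of a permutation given as a tuple of 0..n-1, via inversion count."""
    n = len(p)
    inv = sum(1 for x in range(n) for y in range(x + 1, n) if p[x] > p[y])
    return -1 if inv % 2 else 1


def leibniz_det_columns(cols):
    """Leibniz sum for a matrix given as a list of columns
    (cols[i][r] is the entry in row r of column i)."""
    n = len(cols)
    return sum(
        sign(s) * prod(cols[i][s[i]] for i in range(n))
        for s in permutations(range(n))
    )


def pairing_flips_sign(n, j1, j2):
    """Check sgn(sigma') == -sgn(sigma) for every sigma with sigma(j1) < sigma(j2)."""
    for sigma in permutations(range(n)):
        if sigma[j1] < sigma[j2]:
            prime = list(sigma)
            prime[j1], prime[j2] = prime[j2], prime[j1]  # build sigma'
            if sign(tuple(prime)) != -sign(sigma):
                return False
    return True


v, w = [2, 0, 1], [1, 3, 1]            # arbitrary sample columns
print(pairing_flips_sign(4, 0, 2))     # True
print(leibniz_det_columns([v, w, v]))  # columns 0 and 2 are equal: 0
```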

Finally, [math]\displaystyle{ F(I)=1 }[/math]:

[math]\displaystyle{ \begin{align} F(I) & = \sum_{\sigma \in S_n} \sgn(\sigma) \prod_{i = 1}^n I^i_{\sigma(i)} = \sum_{\sigma \in S_n} \sgn(\sigma) \prod_{i = 1}^n \operatorname{\delta}_{i,\sigma(i)}\\ & = \sum_{\sigma \in S_n} \sgn(\sigma) \operatorname{\delta}_{\sigma,\operatorname{id}_{\{1\ldots n\}}} = \sgn(\operatorname{id}_{\{1\ldots n\}}) = 1 \end{align} }[/math]

Thus the function defined by the Leibniz formula is the only alternating multilinear function with [math]\displaystyle{ F(I)=1 }[/math], and it does have all three properties. Hence the determinant can be defined as the unique function [math]\displaystyle{ \det : M_n (\mathbb K) \rightarrow \mathbb K }[/math] with these three properties.
