Multilinear multiplication

From HandWiki

In multilinear algebra, applying a map that is the tensor product of linear maps to a tensor is called a multilinear multiplication.

Abstract definition

Let [math]\displaystyle{ F }[/math] be a field of characteristic zero, such as [math]\displaystyle{ \mathbb{R} }[/math] or [math]\displaystyle{ \mathbb{C} }[/math]. Let [math]\displaystyle{ V_k }[/math] be a finite-dimensional vector space over [math]\displaystyle{ F }[/math], and let [math]\displaystyle{ \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d }[/math] be an order-d simple tensor, i.e., there exist some vectors [math]\displaystyle{ \mathbf{v}_k \in V_k }[/math] such that [math]\displaystyle{ \mathcal{A} = \mathbf{v}_1 \otimes \mathbf{v}_2 \otimes \cdots \otimes \mathbf{v}_d }[/math]. If we are given a collection of linear maps [math]\displaystyle{ A_k : V_k \to W_k }[/math], then the multilinear multiplication of [math]\displaystyle{ \mathcal{A} }[/math] with [math]\displaystyle{ (A_1, A_2, \ldots, A_d) }[/math] is defined[1] as the action on [math]\displaystyle{ \mathcal{A} }[/math] of the tensor product of these linear maps,[2] namely

[math]\displaystyle{ \begin{align} A_1 \otimes A_2 \otimes \cdots \otimes A_d : V_1 \otimes V_2 \otimes \cdots \otimes V_d & \to W_1 \otimes W_2 \otimes \cdots \otimes W_d, \\ \mathbf{v}_1 \otimes \mathbf{v}_2 \otimes \cdots \otimes \mathbf{v}_d & \mapsto A_1(\mathbf{v}_1) \otimes A_2(\mathbf{v}_2) \otimes \cdots \otimes A_d(\mathbf{v}_d) \end{align} }[/math]

Since the tensor product of linear maps is itself a linear map,[2] and because every tensor admits a tensor rank decomposition,[1] the above expression extends linearly to all tensors. That is, for a general tensor [math]\displaystyle{ \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d }[/math], the multilinear multiplication is

[math]\displaystyle{ \begin{align} & \mathcal{B} := (A_1 \otimes A_2 \otimes \cdots \otimes A_d)(\mathcal{A}) \\[4pt] = {} & (A_1 \otimes A_2 \otimes \cdots \otimes A_d)\left(\sum_{i=1}^r \mathbf{a}_i^1 \otimes \mathbf{a}_i^2 \otimes \cdots \otimes \mathbf{a}_i^d\right) \\[5pt] = {} & \sum_{i=1}^r A_1(\mathbf{a}_i^1) \otimes A_2(\mathbf{a}_i^2) \otimes \cdots \otimes A_d( \mathbf{a}_i^d ) \end{align} }[/math]

where [math]\displaystyle{ \mathcal{A} = \sum_{i=1}^r \mathbf{a}_i^1 \otimes \mathbf{a}_i^2 \otimes \cdots \otimes \mathbf{a}_i^d }[/math] with [math]\displaystyle{ \mathbf{a}_i^k \in V_k }[/math] is one of [math]\displaystyle{ \mathcal{A} }[/math]'s tensor rank decompositions. The validity of the above expression is not limited to a tensor rank decomposition; in fact, it is valid for any expression of [math]\displaystyle{ \mathcal{A} }[/math] as a linear combination of pure tensors, which follows from the universal property of the tensor product.

It is standard to use the following shorthand notations in the literature for multilinear multiplications:[math]\displaystyle{ (A_1, A_2, \ldots, A_d) \cdot \mathcal{A} := (A_1 \otimes A_2 \otimes \cdots \otimes A_d)(\mathcal{A}) }[/math]and[math]\displaystyle{ A_k \cdot_k \mathcal{A} := (\operatorname{Id}_{V_1}, \ldots, \operatorname{Id}_{V_{k-1}}, A_k, \operatorname{Id}_{V_{k+1}}, \ldots, \operatorname{Id}_{V_{d}}) \cdot \mathcal{A}, }[/math]where [math]\displaystyle{ \operatorname{Id}_{V_k} : V_k \to V_k }[/math] is the identity operator.

Definition in coordinates

In computational multilinear algebra it is conventional to work in coordinates. Assume that an inner product is fixed on [math]\displaystyle{ V_k }[/math] and let [math]\displaystyle{ V_k^* }[/math] denote the dual vector space of [math]\displaystyle{ V_k }[/math]. Let [math]\displaystyle{ \{ e_1^k, \ldots, e_{n_k}^k \} }[/math] be a basis for [math]\displaystyle{ V_k }[/math], let [math]\displaystyle{ \{ (e_1^k)^*, \ldots, (e_{n_k}^k)^* \} }[/math] be the dual basis, and let [math]\displaystyle{ \{f_1^k, \ldots, f_{m_k}^k \} }[/math] be a basis for [math]\displaystyle{ W_k }[/math]. The linear map [math]\displaystyle{ M_k = \sum_{i=1}^{m_k} \sum_{j=1}^{n_k} m_{i,j}^{(k)} f_i^k \otimes (e_j^k)^* }[/math] is then represented by the matrix [math]\displaystyle{ \widehat{M}_k = [m_{i,j}^{(k)}] \in F^{m_k \times n_k} }[/math]. Likewise, with respect to the standard tensor product basis [math]\displaystyle{ \{ e_{j_1}^1 \otimes e_{j_2}^2 \otimes \cdots \otimes e_{j_d}^d \}_{j_1,j_2,\ldots,j_d} }[/math], the abstract tensor[math]\displaystyle{ \mathcal{A} = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} e_{j_1}^1 \otimes e_{j_2}^2 \otimes \cdots \otimes e_{j_d}^d }[/math]is represented by the multidimensional array [math]\displaystyle{ \widehat{\mathcal{A}} = [a_{j_1,j_2,\ldots,j_d}] \in F^{n_1 \times n_2 \times \cdots \times n_d} }[/math] . Observe that [math]\displaystyle{ \widehat{\mathcal{A}} = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d, }[/math]

where [math]\displaystyle{ \mathbf{e}_j^k \in F^{n_k} }[/math] is the jth standard basis vector of [math]\displaystyle{ F^{n_k} }[/math] and the tensor product of vectors is the affine Segre map [math]\displaystyle{ \otimes : (\mathbf{v}^{(1)}, \mathbf{v}^{(2)}, \ldots, \mathbf{v}^{(d)}) \mapsto [v_{i_1}^{(1)} v_{i_2}^{(2)} \cdots v_{i_d}^{(d)}]_{i_1,i_2,\ldots,i_d} }[/math]. It follows from the above choices of bases that the multilinear multiplication [math]\displaystyle{ \mathcal{B} = (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} }[/math] becomes

[math]\displaystyle{ \begin{align} \widehat{\mathcal{B}} &= (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\ &= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot (\mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d) \\ &= \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} (\widehat{M}_1 \mathbf{e}_{j_1}^1) \otimes (\widehat{M}_2 \mathbf{e}_{j_2}^2) \otimes \cdots \otimes (\widehat{M}_d \mathbf{e}_{j_d}^d). \end{align} }[/math]

The resulting tensor [math]\displaystyle{ \widehat{\mathcal{B}} }[/math] lives in [math]\displaystyle{ F^{m_1 \times m_2 \times \cdots \times m_d} }[/math].

Element-wise definition

From the above expression, an element-wise definition of the multilinear multiplication is obtained. Indeed, since [math]\displaystyle{ \widehat{\mathcal{B}} }[/math] is a multidimensional array, it may be expressed as [math]\displaystyle{ \widehat{\mathcal{B}} = \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} b_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d, }[/math]where [math]\displaystyle{ b_{j_1,j_2,\ldots,j_d} \in F }[/math] are the coefficients. Then it follows from the above formulae that

[math]\displaystyle{ \begin{align} & \left( (\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot \widehat{\mathcal{B}} \\ = {} & \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} b_{j_1,j_2,\ldots,j_d} \left( (\mathbf{e}_{i_1}^1)^T \mathbf{e}_{j_1}^1 \right) \otimes \left((\mathbf{e}_{i_2}^2)^T \mathbf{e}_{j_2}^2\right) \otimes \cdots \otimes \left( (\mathbf{e}_{i_d}^d)^T \mathbf{e}_{j_d}^d \right) \\ = {} & \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} b_{j_1,j_2,\ldots,j_d} \delta_{i_1, j_1} \cdot \delta_{i_2,j_2} \cdots \delta_{i_d, j_d} \\ = {} & b_{i_1,i_2,\ldots,i_d}, \end{align} }[/math]

where [math]\displaystyle{ \delta_{i,j} }[/math] is the Kronecker delta. Hence, if [math]\displaystyle{ \mathcal{B} = (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} }[/math], then

[math]\displaystyle{ \begin{align} & b_{i_1,i_2,\ldots,i_d} = \left( (\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot \widehat{\mathcal{B}} \\ = {} & \left( (\mathbf{e}_{i_1}^1)^T, (\mathbf{e}_{i_2}^2)^T, \ldots, (\mathbf{e}_{i_d}^d)^T \right) \cdot (\widehat{M}_1, \widehat{M}_2, \ldots, \widehat{M}_d) \cdot \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\ = {} & \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} ((\mathbf{e}_{i_1}^1)^T \widehat{M}_1 \mathbf{e}_{j_1}^1) \otimes ((\mathbf{e}_{i_2}^2)^T \widehat{M}_2 \mathbf{e}_{j_2}^2) \otimes \cdots \otimes ((\mathbf{e}_{i_d}^d)^T \widehat{M}_d \mathbf{e}_{j_d}^d) \\ = {} & \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} m_{i_1,j_1}^{(1)} \cdot m_{i_2,j_2}^{(2)} \cdots m_{i_d,j_d}^{(d)}, \end{align} }[/math]

where the [math]\displaystyle{ m_{i,j}^{(k)} }[/math] are the elements of [math]\displaystyle{ \widehat{M}_k }[/math] as defined above.

Properties

Let [math]\displaystyle{ \mathcal{A} \in V_1 \otimes V_2 \otimes \cdots \otimes V_d }[/math] be an order-d tensor over the tensor product of [math]\displaystyle{ F }[/math]-vector spaces.

Since a multilinear multiplication is the tensor product of linear maps, we have the following multilinearity property (in the construction of the map):[1][2]

[math]\displaystyle{ A_1 \otimes \cdots \otimes A_{k-1} \otimes (\alpha A_k + \beta B) \otimes A_{k+1} \otimes \cdots \otimes A_d = \alpha A_1 \otimes \cdots \otimes A_d + \beta A_1 \otimes \cdots \otimes A_{k-1} \otimes B \otimes A_{k+1} \otimes \cdots \otimes A_d }[/math]

Multilinear multiplication is a linear map:[1][2] [math]\displaystyle{ (M_1, M_2, \ldots, M_d) \cdot (\alpha \mathcal{A} + \beta \mathcal{B}) = \alpha \; (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} + \beta \; (M_1, M_2, \ldots, M_d) \cdot \mathcal{B} }[/math]

It follows from the definition that the composition of two multilinear multiplications is also a multilinear multiplication:[1][2]

[math]\displaystyle{ (M_1, M_2, \ldots, M_d) \cdot \left( (K_1, K_2, \ldots, K_d) \cdot \mathcal{A} \right) = (M_1 \circ K_1, M_2 \circ K_2, \ldots, M_d \circ K_d) \cdot \mathcal{A}, }[/math]

where [math]\displaystyle{ M_k : U_k \to W_k }[/math] and [math]\displaystyle{ K_k : V_k \to U_k }[/math] are linear maps.

Observe specifically that multilinear multiplications in different factors commute,

[math]\displaystyle{ M_k \cdot_k \left( M_\ell \cdot_\ell \mathcal{A} \right) = M_\ell \cdot_\ell \left( M_k \cdot_k \mathcal{A} \right) = M_k \cdot_k M_\ell \cdot_\ell \mathcal{A}, }[/math]

if [math]\displaystyle{ k \ne \ell. }[/math]

Computation

The factor-k multilinear multiplication [math]\displaystyle{ M_k \cdot_k\mathcal{A} }[/math] can be computed in coordinates as follows. Observe first that

[math]\displaystyle{ \begin{align} M_k \cdot_k \mathcal{A} &= M_k \cdot_k \sum_{j_1=1}^{n_1} \sum_{j_2=1}^{n_2} \cdots \sum_{j_d=1}^{n_d} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_1}^1 \otimes \mathbf{e}_{j_2}^2 \otimes \cdots \otimes \mathbf{e}_{j_d}^d \\ &= \sum_{j_1=1}^{n_1} \cdots \sum_{j_{k-1}=1}^{n_{k-1}} \sum_{j_{k+1}=1}^{n_{k+1}} \cdots \sum_{j_d=1}^{n_d} \mathbf{e}_{j_1}^1 \otimes \cdots \otimes \mathbf{e}_{j_{k-1}}^{k-1} \otimes M_k \left ( \sum_{j_k=1}^{n_k} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_k}^k \right) \otimes \mathbf{e}_{j_{k+1}}^{k+1} \otimes \cdots \otimes \mathbf{e}_{j_d}^d. \end{align} }[/math]

Next, since

[math]\displaystyle{ F^{n_1} \otimes F^{n_2} \otimes \cdots \otimes F^{n_d} \simeq F^{n_k} \otimes (F^{n_1} \otimes \cdots \otimes F^{n_{k-1}} \otimes F^{n_{k+1}} \otimes \cdots \otimes F^{n_d}) \simeq F^{n_k} \otimes F^{n_1 \cdots n_{k-1} n_{k+1} \cdots n_d}, }[/math]

there is a bijective map, called the factor-k standard flattening,[1] denoted by [math]\displaystyle{ (\cdot)_{(k)} }[/math], that identifies [math]\displaystyle{ M_k \cdot_k \mathcal{A} }[/math] with an element from the latter space, namely

[math]\displaystyle{ \left( M_k \cdot_k \mathcal{A} \right)_{(k)} := \sum_{j_1=1}^{n_1} \cdots \sum_{j_{k-1}=1}^{n_{k-1}} \sum_{j_{k+1}=1}^{n_{k+1}} \cdots \sum_{j_d=1}^{n_d} M_k \left ( \sum_{j_k=1}^{n_k} a_{j_1,j_2,\ldots,j_d} \mathbf{e}_{j_{k}}^{k} \right) \otimes \mathbf{e}_{\mu_k(j_1,\ldots,j_{k-1},j_{k+1},\ldots,j_d)} := M_k \mathcal{A}_{(k)}, }[/math]

where [math]\displaystyle{ \mathbf{e}_j }[/math]is the jth standard basis vector of [math]\displaystyle{ F^{N_k} }[/math], [math]\displaystyle{ N_k = n_1 \cdots n_{k-1} n_{k+1} \cdots n_d }[/math], and [math]\displaystyle{ \mathcal{A}_{(k)} \in F^{n_k} \otimes F^{N_k} \simeq F^{n_k \times N_k} }[/math] is the factor-k flattening matrix of [math]\displaystyle{ \mathcal{A} }[/math] whose columns are the factor-k vectors [math]\displaystyle{ [a_{j_1,\ldots,j_{k-1},i,j_{k+1},\ldots,j_d}]_{i=1}^{n_k} }[/math] in some order, determined by the particular choice of the bijective map

[math]\displaystyle{ \mu_k : [1,n_1] \times \cdots \times [1,n_{k-1}] \times [1,n_{k+1}] \times \cdots \times [1,n_d] \to [1,N_k]. }[/math]

In other words, the multilinear multiplication [math]\displaystyle{ (M_1, M_2, \ldots, M_d) \cdot \mathcal{A} }[/math] can be computed as a sequence of d factor-k multilinear multiplications, which themselves can be implemented efficiently as classic matrix multiplications.

Applications

The higher-order singular value decomposition (HOSVD) factorizes a tensor given in coordinates [math]\displaystyle{ \mathcal{A} \in F^{n_1 \times n_2 \times \cdots \times n_d} }[/math] as the multilinear multiplication [math]\displaystyle{ \mathcal{A} = (U_1, U_2, \ldots, U_d) \cdot \mathcal{S} }[/math], where [math]\displaystyle{ U_k \in F^{n_k \times n_k} }[/math] are orthogonal matrices and [math]\displaystyle{ \mathcal{S} \in F^{n_1 \times n_2 \times \cdots \times n_d} }[/math].

Further reading

  1. 1.0 1.1 1.2 1.3 1.4 1.5 M., Landsberg, J. (2012). Tensors : geometry and applications. Providence, R.I.: American Mathematical Society. ISBN 9780821869079. OCLC 733546583. 
  2. 2.0 2.1 2.2 2.3 2.4 (in en) Multilinear Algebra | Werner Greub | Springer. Universitext. Springer. 1978. ISBN 9780387902845. https://www.springer.com/gp/book/9780387902845.