Higher-order singular value decomposition


In multilinear algebra, the higher-order singular value decomposition (HOSVD) is a misnomer. There does not exist a single tensor decomposition that retains all the defining properties of the matrix SVD. The matrix SVD simultaneously yields a

  • rank-𝑅 decomposition and
  • orthonormal subspaces for the row and column spaces.

These properties are not realized within a single algorithm for higher-order tensors, but are instead realized by two distinct algorithmic developments and represent two distinct research directions. Harshman, as well as the team of Carroll and Chang, proposed the canonical polyadic decomposition (CPD), a variant of the tensor rank decomposition in which a tensor is approximated as a sum of K rank-1 tensors for a user-specified K. L. R. Tucker proposed a strategy for computing orthonormal subspaces for third-order tensors. Aspects of these algorithms can be traced as far back as F. L. Hitchcock in 1928.[1]


De Lathauwer et al.[2][3] brought conceptual clarity to the Tucker framework, while Vasilescu and Terzopoulos[4][5][6] brought algorithmic clarity. Vasilescu and Terzopoulos[4][6] introduced the M-mode SVD, which is the classic algorithm that is currently referred to in the literature as the Tucker or the HOSVD. The Tucker approach and De Lathauwer's implementation are both sequential and rely on iterative procedures such as gradient descent or the power method. By contrast, the M-mode SVD provides a closed-form solution that can be executed sequentially and is also well-suited for parallel computation.

This misattribution has had a lasting impact on the scholarly record, obscuring the original source of a widely adopted algorithm and complicating efforts to trace its development, reproduce results, and recognize the respective contributions of different research efforts.

The term M-mode SVD accurately reflects the algorithm employed. It captures the actual computation, a set of SVDs on mode-flattenings, without making assumptions about the structure of the core tensor or implying a rank decomposition.

Robust and L1-norm-based variants of this decomposition framework have since been proposed.[7][8][9][10]

Definition

For the purpose of this article, the abstract tensor $\mathcal{A}$ is assumed to be given in coordinates with respect to some basis as an M-way array, also denoted by $\mathcal{A} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_m \times \cdots \times I_M}$, where M is the number of modes and the order of the tensor. $\mathbb{C}$ denotes the complex numbers, which include both the real numbers and the pure imaginary numbers.

Let $\mathcal{A}_{[m]} \in \mathbb{C}^{I_m \times (I_1 I_2 \cdots I_{m-1} I_{m+1} \cdots I_M)}$ denote the mode-m flattening of $\mathcal{A}$, so that the left index of $\mathcal{A}_{[m]}$ corresponds to the m-th index of $\mathcal{A}$ and the right index of $\mathcal{A}_{[m]}$ corresponds to all other indices of $\mathcal{A}$ combined. Let $\mathbf{U}_m \in \mathbb{C}^{I_m \times I_m}$ be a unitary matrix containing a basis of the left singular vectors of $\mathcal{A}_{[m]}$, such that the j-th column $\mathbf{u}_j$ of $\mathbf{U}_m$ corresponds to the j-th largest singular value of $\mathcal{A}_{[m]}$. Observe that the mode/factor matrix $\mathbf{U}_m$ does not depend on the specific definition of the mode-m flattening. By the properties of multilinear multiplication, we have

$$\mathcal{A} = \mathcal{A} \times (\mathbf{I}, \mathbf{I}, \ldots, \mathbf{I}) = \mathcal{A} \times (\mathbf{U}_1 \mathbf{U}_1^H, \mathbf{U}_2 \mathbf{U}_2^H, \ldots, \mathbf{U}_M \mathbf{U}_M^H) = \left( \mathcal{A} \times (\mathbf{U}_1^H, \mathbf{U}_2^H, \ldots, \mathbf{U}_M^H) \right) \times (\mathbf{U}_1, \mathbf{U}_2, \ldots, \mathbf{U}_M),$$

where $\cdot^H$ denotes the conjugate transpose. The second equality holds because the $\mathbf{U}_m$'s are unitary matrices. Define now the core tensor

$$\mathcal{S} := \mathcal{A} \times (\mathbf{U}_1^H, \mathbf{U}_2^H, \ldots, \mathbf{U}_M^H).$$

Then the M-mode SVD (HOSVD)[2] of $\mathcal{A}$ is the decomposition

$$\mathcal{A} = \mathcal{S} \times (\mathbf{U}_1, \mathbf{U}_2, \ldots, \mathbf{U}_M).$$

The above construction shows that every tensor has an M-mode SVD (HOSVD).
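As a concrete illustration, the two primitives used in the definition above, the mode-m flattening and the mode-m (multilinear) product, can be sketched in NumPy. This is a minimal sketch; the helper names `unfold` and `mode_product` are ours, not standard API:

```python
import numpy as np

def unfold(A, m):
    """Mode-m flattening: mode m indexes the rows, all other modes the columns."""
    return np.moveaxis(A, m, 0).reshape(A.shape[m], -1)

def mode_product(A, B, m):
    """Mode-m product A x_m B: multiply the matrix B into the m-th mode of A."""
    return np.moveaxis(np.tensordot(B, np.moveaxis(A, m, 0), axes=1), 0, m)

# For a unitary U_m, A = (A x_m U_m^H) x_m U_m, as used in the derivation above.
A = np.random.rand(3, 4, 5)
U, _, _ = np.linalg.svd(unfold(A, 1))         # left singular vectors of mode-1 flattening
S = mode_product(A, U.conj().T, 1)            # project mode 1 onto that basis
assert np.allclose(mode_product(S, U, 1), A)  # the projection is lossless
```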

Compact M-mode SVD (HOSVD)

As in the case of the compact singular value decomposition of a matrix, where the rows and columns corresponding to vanishing singular values are dropped, it is also possible to consider a compact M-mode SVD (HOSVD), which is very useful in applications.

Assume that $\mathbf{U}_m \in \mathbb{C}^{I_m \times R_m}$ is a matrix with unitary columns containing a basis of the left singular vectors corresponding to the nonzero singular values of the standard factor-m flattening $\mathcal{A}_{[m]}$ of $\mathcal{A}$. Let the columns of $\mathbf{U}_m$ be sorted such that the $r_m$-th column $\mathbf{u}_{r_m}$ of $\mathbf{U}_m$ corresponds to the $r_m$-th largest nonzero singular value of $\mathcal{A}_{[m]}$. Since the columns of $\mathbf{U}_m$ form a basis for the image of $\mathcal{A}_{[m]}$, we have

$$\mathcal{A}_{[m]} = \mathbf{U}_m \mathbf{U}_m^H \mathcal{A}_{[m]} = \left( \mathcal{A} \times_m (\mathbf{U}_m \mathbf{U}_m^H) \right)_{[m]},$$

where the first equality is due to the properties of orthogonal projections (in the Hermitian inner product) and the last equality is due to the properties of multilinear multiplication. As flattenings are bijective maps and the above formula is valid for all $m = 1, 2, \ldots, M$, we find as before that

$$\mathcal{A} = \mathcal{A} \times (\mathbf{U}_1 \mathbf{U}_1^H, \mathbf{U}_2 \mathbf{U}_2^H, \ldots, \mathbf{U}_M \mathbf{U}_M^H) = \left( \mathcal{A} \times (\mathbf{U}_1^H, \mathbf{U}_2^H, \ldots, \mathbf{U}_M^H) \right) \times (\mathbf{U}_1, \mathbf{U}_2, \ldots, \mathbf{U}_M) = \mathcal{S} \times (\mathbf{U}_1, \mathbf{U}_2, \ldots, \mathbf{U}_M),$$

where the core tensor $\mathcal{S}$ is now of size $R_1 \times R_2 \times \cdots \times R_M$.

Multilinear rank

The multilinear rank[1] of $\mathcal{A}$ is denoted rank-$(R_1, R_2, \ldots, R_M)$. The multilinear rank is a tuple in $\mathbb{N}^M$ where $R_m := \operatorname{rank}(\mathcal{A}_{[m]})$. Not all tuples in $\mathbb{N}^M$ are multilinear ranks.[11] The multilinear ranks are bounded by $1 \le R_m \le I_m$ and must satisfy the constraint $R_m \le \prod_{i \ne m} R_i$.[11]
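Since each component of the multilinear rank is just the matrix rank of the corresponding mode flattening, it can be computed directly; a minimal NumPy sketch (the helper name `multilinear_rank` is illustrative):

```python
import numpy as np

def multilinear_rank(A):
    """Tuple (R_1, ..., R_M) with R_m the rank of the mode-m flattening of A."""
    return tuple(int(np.linalg.matrix_rank(
                     np.moveaxis(A, m, 0).reshape(A.shape[m], -1)))
                 for m in range(A.ndim))

# A rank-1 tensor u (x) v (x) w has multilinear rank (1, 1, 1).
u, v, w = np.random.rand(3), np.random.rand(4), np.random.rand(5)
A = np.einsum('i,j,k->ijk', u, v, w)
print(multilinear_rank(A))  # (1, 1, 1)
```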

The compact M-mode SVD (HOSVD) is a rank-revealing decomposition in the sense that the dimensions of its core tensor correspond with the components of the multilinear rank of the tensor.

Interpretation

The following geometric interpretation is valid for both the full and compact M-mode SVD (HOSVD). Let $(R_1, R_2, \ldots, R_M)$ be the multilinear rank of the tensor $\mathcal{A}$. Since $\mathcal{S} \in \mathbb{C}^{R_1 \times R_2 \times \cdots \times R_M}$ is a multidimensional array, we can expand it as follows:

$$\mathcal{S} = \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \cdots \sum_{r_M=1}^{R_M} s_{r_1, r_2, \ldots, r_M} \, \mathbf{e}_{r_1} \otimes \mathbf{e}_{r_2} \otimes \cdots \otimes \mathbf{e}_{r_M},$$

where $\mathbf{e}_{r_m}$ is the $r_m$-th standard basis vector of $\mathbb{C}^{R_m}$. By definition of the multilinear multiplication, it holds that

$$\mathcal{A} = \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \cdots \sum_{r_M=1}^{R_M} s_{r_1, r_2, \ldots, r_M} \, \mathbf{u}_{r_1} \otimes \mathbf{u}_{r_2} \otimes \cdots \otimes \mathbf{u}_{r_M},$$

where the $\mathbf{u}_{r_m}$ are the columns of $\mathbf{U}_m \in \mathbb{C}^{I_m \times R_m}$. It is easy to verify that $B = \{ \mathbf{u}_{r_1} \otimes \mathbf{u}_{r_2} \otimes \cdots \otimes \mathbf{u}_{r_M} \}_{r_1, r_2, \ldots, r_M}$ is an orthonormal set of tensors. This means that the M-mode SVD (HOSVD) can be interpreted as a way to express the tensor $\mathcal{A}$ with respect to a specifically chosen orthonormal basis $B$, with the coefficients given by the multidimensional array $\mathcal{S}$.

Computation

Let $\mathcal{A} \in \mathbb{F}^{I_1 \times I_2 \times \cdots \times I_M}$ be a tensor with multilinear rank $(R_1, R_2, \ldots, R_M)$, where the field $\mathbb{F}$ contains the real numbers as a subset.

Classic computation

While De Lathauwer et al. clarified Tucker's concepts through two influential papers, Vasilescu and Terzopoulos provided algorithmic clarity. The Tucker algorithm and De Lathauwer et al.'s[2] companion algorithm are sequential, relying on iterative methods such as gradient descent or the power method. In contrast, the M-mode SVD computes the orthonormal subspaces in closed form; it can be executed sequentially, but it is also well-suited for parallel computation.

M-mode SVD (also referred to as HOSVD or Tucker)

What is commonly referred to as the HOSVD or Tucker was developed by Vasilescu and Terzopoulos under the name M-mode SVD.[4][6]

  • For $m = 1, \ldots, M$, do the following:
  1. Construct the mode-m flattening $\mathcal{A}_{[m]}$;
  2. Compute the (compact) singular value decomposition
    $\mathcal{A}_{[m]} = \mathbf{U}_m \Sigma_m \mathbf{V}_m^T$, and store the left singular vectors $\mathbf{U}_m \in \mathbb{F}^{I_m \times R_m}$;
  • Compute the core tensor $\mathcal{S}$ via the mode-m products
    $\mathcal{S} = \mathcal{A} \times_1 \mathbf{U}_1^H \times_2 \mathbf{U}_2^H \cdots \times_m \mathbf{U}_m^H \cdots \times_M \mathbf{U}_M^H$
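The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a tuned implementation; the helper names `unfold`, `mode_product`, and `m_mode_svd` are ours:

```python
import numpy as np

def unfold(A, m):
    """Mode-m flattening: mode m indexes the rows."""
    return np.moveaxis(A, m, 0).reshape(A.shape[m], -1)

def mode_product(A, B, m):
    """Mode-m product A x_m B."""
    return np.moveaxis(np.tensordot(B, np.moveaxis(A, m, 0), axes=1), 0, m)

def m_mode_svd(A):
    """Classic M-mode SVD (HOSVD): returns core tensor S and factor matrices U_m."""
    U = []
    for m in range(A.ndim):
        # Step 1-2: SVD of each mode-m flattening, keep the left singular vectors.
        U_m, _, _ = np.linalg.svd(unfold(A, m), full_matrices=False)
        U.append(U_m)
    # Final step: core tensor S = A x_1 U_1^H x_2 U_2^H ... x_M U_M^H.
    S = A
    for m in range(A.ndim):
        S = mode_product(S, U[m].conj().T, m)
    return S, U

A = np.random.rand(3, 4, 5)
S, U = m_mode_svd(A)
A_rec = S
for m in range(A.ndim):
    A_rec = mode_product(A_rec, U[m], m)
assert np.allclose(A_rec, A)  # A = S x_1 U_1 x_2 U_2 x_3 U_3
```

Note that each mode's SVD is independent of the others, which is what makes this formulation amenable to parallel execution.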

Interlacing computation

A strategy that is significantly faster when some or all $R_m \ll I_m$ consists of interlacing the computation of the core tensor and the factor matrices, as follows:[5][12][13][14]

  • Set $\mathcal{A}^0 = \mathcal{A}$;
  • For $m = 1, 2, \ldots, M$ perform the following:
    1. Construct the standard mode-m flattening $\mathcal{A}^{m-1}_{[m]}$;
    2. Compute the (compact) singular value decomposition $\mathcal{A}^{m-1}_{[m]} = \mathbf{U}_m \Sigma_m \mathbf{V}_m^T$, and store the left singular vectors $\mathbf{U}_m \in \mathbb{F}^{I_m \times R_m}$;
    3. Set $\mathcal{A}^m = \mathbf{U}_m^H \times_m \mathcal{A}^{m-1}$, or, equivalently, $\mathcal{A}^m_{[m]} = \Sigma_m \mathbf{V}_m^T$.
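Under the same illustrative conventions as the sketch above (NumPy, hypothetical helper names), the interlaced computation shrinks the working tensor after each mode, so later SVDs operate on progressively smaller flattenings:

```python
import numpy as np

def unfold(A, m):
    """Mode-m flattening: mode m indexes the rows."""
    return np.moveaxis(A, m, 0).reshape(A.shape[m], -1)

def mode_product(A, B, m):
    """Mode-m product A x_m B."""
    return np.moveaxis(np.tensordot(B, np.moveaxis(A, m, 0), axes=1), 0, m)

def interlaced_m_mode_svd(A):
    """Interlaced M-mode SVD: factor matrices and core computed in one sweep."""
    U = []
    S = A  # A^0 = A
    for m in range(A.ndim):
        # Steps 1-2: SVD of the flattening of the current (partially projected) tensor.
        U_m, _, _ = np.linalg.svd(unfold(S, m), full_matrices=False)
        U.append(U_m)
        # Step 3: A^m = U_m^H x_m A^(m-1); the working tensor shrinks in mode m.
        S = mode_product(S, U_m.conj().T, m)
    return S, U

A = np.random.rand(3, 4, 5)
S, U = interlaced_m_mode_svd(A)
A_rec = S
for m in range(A.ndim):
    A_rec = mode_product(A_rec, U[m], m)
assert np.allclose(A_rec, A)
```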

In-place computation

The M-mode SVD (HOSVD) can be computed in place via the Fused In-place Sequentially Truncated Higher Order Singular Value Decomposition (FIST-HOSVD)[14] algorithm, which overwrites the original tensor with the M-mode SVD (HOSVD) core tensor, significantly reducing the memory consumption of computing the HOSVD.

Approximation

In applications, such as those mentioned below, a common problem consists of approximating a given tensor $\mathcal{A} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_m \times \cdots \times I_M}$ by one with a reduced multilinear rank. Formally, if the multilinear rank of $\mathcal{A}$ is denoted by $(R_1, R_2, \ldots, R_m, \ldots, R_M)$, then computing the optimal $\bar{\mathcal{A}}$ that approximates $\mathcal{A}$ for a given reduced multilinear rank $(\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_m, \ldots, \bar{R}_M)$ is a nonlinear, non-convex $\ell_2$-optimization problem

$$\min_{\bar{\mathcal{A}} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_M}} \frac{1}{2} \|\mathcal{A} - \bar{\mathcal{A}}\|_F^2 \quad \text{s.t.} \quad \operatorname{rank}(\bar{\mathcal{A}}) = (\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_M),$$

where $(\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_M) \in \mathbb{N}^M$ is the reduced multilinear rank with $1 \le \bar{R}_m < R_m \le I_m$, and the norm $\|\cdot\|_F$ is the Frobenius norm.

A simple idea for trying to solve this optimization problem is to truncate the (compact) SVD in step 2 of either the classic or the interlaced computation. A classically truncated M-mode SVD/HOSVD is obtained by replacing step 2 in the classic computation by

  • Compute a rank-$\bar{R}_m$ truncated SVD $\mathcal{A}_{[m]} \approx \mathbf{U}_m \Sigma_m \mathbf{V}_m^T$, and store the top $\bar{R}_m$ left singular vectors $\mathbf{U}_m \in \mathbb{F}^{I_m \times \bar{R}_m}$;

while a sequentially truncated M-mode SVD (HOSVD) (or successively truncated M-mode SVD(HOSVD)) is obtained by replacing step 2 in the interlaced computation by

  • Compute a rank-$\bar{R}_m$ truncated SVD $\mathcal{A}^{m-1}_{[m]} \approx \mathbf{U}_m \Sigma_m \mathbf{V}_m^T$, and store the top $\bar{R}_m$ left singular vectors $\mathbf{U}_m \in \mathbb{F}^{I_m \times \bar{R}_m}$.

Unfortunately, truncation does not result in an optimal solution to the best low multilinear rank optimization problem.[2][5][12][14] However, both the classically and sequentially truncated M-mode SVD/HOSVD result in a quasi-optimal solution:[5][12][13][15] if $\bar{\mathcal{A}}_t$ denotes the classically or sequentially truncated M-mode SVD (HOSVD) and $\bar{\mathcal{A}}^*$ denotes the optimal solution to the best low multilinear rank approximation problem, then

$$\|\mathcal{A} - \bar{\mathcal{A}}_t\|_F \le \sqrt{M} \, \|\mathcal{A} - \bar{\mathcal{A}}^*\|_F;$$

in practice this means that if there exists an optimal solution with a small error, then a truncated M-mode SVD/HOSVD will for many intended purposes also yield a sufficiently good solution.
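A sequentially truncated M-mode SVD can be sketched in NumPy as follows (an illustrative sketch with hypothetical helper names; it produces a quasi-optimal, not optimal, low multilinear rank approximation):

```python
import numpy as np

def unfold(A, m):
    """Mode-m flattening: mode m indexes the rows."""
    return np.moveaxis(A, m, 0).reshape(A.shape[m], -1)

def mode_product(A, B, m):
    """Mode-m product A x_m B."""
    return np.moveaxis(np.tensordot(B, np.moveaxis(A, m, 0), axes=1), 0, m)

def st_m_mode_svd(A, ranks):
    """Sequentially truncated M-mode SVD with target multilinear rank `ranks`."""
    U = []
    S = A
    for m in range(A.ndim):
        U_m, _, _ = np.linalg.svd(unfold(S, m), full_matrices=False)
        U_m = U_m[:, :ranks[m]]               # keep only the top R_m left singular vectors
        U.append(U_m)
        S = mode_product(S, U_m.conj().T, m)  # truncate mode m immediately
    return S, U

A = np.random.rand(10, 10, 10)
S, U = st_m_mode_svd(A, (4, 4, 4))
A_approx = S
for m in range(A.ndim):
    A_approx = mode_product(A_approx, U[m], m)
err = np.linalg.norm(A - A_approx)  # quasi-optimal: within sqrt(M) of the best error
assert S.shape == (4, 4, 4)
assert err <= np.linalg.norm(A)     # the approximation is an orthogonal projection of A
```

Because each truncation shrinks the working tensor before the next mode is processed, the later SVDs are cheaper than in the classically truncated variant.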

Applications

The M-mode SVD (HOSVD/Tucker) is most commonly applied to the extraction of relevant information from multi-way arrays.

Starting in the early 2000s, Vasilescu addressed causal questions by reframing the data analysis, recognition and synthesis problems as multilinear tensor problems. The power of the tensor framework was showcased by decomposing and representing an image in terms of its causal factors of data formation, in the context of Human Motion Signatures for gait recognition,[16] face recognition—TensorFaces[17][18] and computer graphics—TensorTextures.[19]

The M-mode SVD (HOSVD) has been successfully applied to signal processing and big data, e.g., in genomic signal processing.[20][21][22] These applications also inspired a higher-order GSVD (HO GSVD)[23] and a tensor GSVD.[24]

A combination of M-mode SVD (HOSVD) and SVD also has been applied for real-time event detection from complex data streams (multivariate data with space and time dimensions) in disease surveillance.[25]

It is also used in tensor product model transformation-based controller design.[26][27]

The concept of M-mode SVD (HOSVD) was carried over to functions by Baranyi and Yam via the TP model transformation.[26][27] This extension led to the definition of the M-mode SVD/HOSVD canonical form of tensor product functions and Linear Parameter Varying system models[28] and to convex hull manipulation based control optimization theory, see TP model transformation in control theories.

M-mode SVD (HOSVD) was proposed to be applied to multi-way data analysis in an unsupervised manner[29] and was successfully applied to in silico drug discovery from gene expression.[30]

Robust L1-norm variant

L1-Tucker is the L1-norm-based, robust variant of the Tucker decomposition.[8][9] L1-HOSVD is the analogue of the M-mode SVD (HOSVD) for solving L1-Tucker.[8][10]

References

  1. 1.0 1.1 Hitchcock, Frank L (1928-04-01). "Multiple Invariants and Generalized Rank of a M-Way Array or Tensor" (in en). Journal of Mathematics and Physics 7 (1–4): 39–79. doi:10.1002/sapm19287139. ISSN 1467-9590. 
  2. 2.0 2.1 2.2 2.3 De Lathauwer, L.; De Moor, B.; Vandewalle, J. (2000-01-01). "On the Best Rank-1 and Rank-(R1, R2, ..., RN) Approximation of Higher-Order Tensors". SIAM Journal on Matrix Analysis and Applications 21 (4): 1324–1342. doi:10.1137/S0895479898346995. ISSN 0895-4798. 
  3. De Lathauwer, L.; De Moor, B.; Vandewalle, J. (2000-01-01). "A Multilinear Singular Value Decomposition". SIAM Journal on Matrix Analysis and Applications 21 (4): 1253–1278. doi:10.1137/s0895479896305696. ISSN 0895-4798. 
  4. 4.0 4.1 4.2 M. A. O. Vasilescu, D. Terzopoulos (2002). "Multilinear Analysis of Image Ensembles: TensorFaces". Copenhagen, Denmark. https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=67eb22ae0baf86af77188fc0ab27edacf07a9140. 
  5. 5.0 5.1 5.2 5.3 Vasilescu, M.A.O.; Terzopoulos, D. (2003). "Multilinear Subspace Analysis for Image Ensembles". 2. Madison, WI. pp. 93-99. 
  6. 6.0 6.1 6.2 M. A. O. Vasilescu, D. Terzopoulos (2005). "Multilinear Independent Component Analysis". San Diego, CA. 
  7. Goldfarb, Donald; Qin, Zhiwei (2014). "Robust low-rank tensor recovery: Models and algorithms". SIAM Journal on Matrix Analysis and Applications 35 (1): 225–253. doi:10.1137/130905010. 
  8. 8.0 8.1 8.2 Chachlakis, Dimitris G.; Prater-Bennette, Ashley; Markopoulos, Panos P. (22 November 2019). "L1-norm Tucker Tensor Decomposition". IEEE Access 7: 178454–178465. doi:10.1109/ACCESS.2019.2955134. Bibcode2019IEEEA...7q8454C. 
  9. 9.0 9.1 Markopoulos, Panos P.; Chachlakis, Dimitris G.; Papalexakis, Evangelos (April 2018). "The Exact Solution to Rank-1 L1-Norm TUCKER2 Decomposition". IEEE Signal Processing Letters 25 (4): 511–515. doi:10.1109/LSP.2018.2790901. Bibcode2018ISPL...25..511M. 
  10. 10.0 10.1 Markopoulos, Panos P.; Chachlakis, Dimitris G.; Prater-Bennette, Ashley (21 February 2019). "L1-Norm Higher-Order Singular-Value Decomposition". 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP). pp. 1353–1357. doi:10.1109/GlobalSIP.2018.8646385. ISBN 978-1-7281-1295-4. 
  11. 11.0 11.1 Carlini, Enrico; Kleppe, Johannes (2011). "Ranks derived from multilinear maps". Journal of Pure and Applied Algebra 215 (8): 1999–2004. doi:10.1016/j.jpaa.2010.11.010. 
  12. 12.0 12.1 12.2 Vannieuwenhoven, N.; Vandebril, R.; Meerbergen, K. (2012-01-01). "A New Truncation Strategy for the Higher-Order Singular Value Decomposition". SIAM Journal on Scientific Computing 34 (2): A1027–A1052. doi:10.1137/110836067. ISSN 1064-8275. Bibcode2012SJSC...34A1027V. https://lirias.kuleuven.be/handle/123456789/337210. 
  13. 13.0 13.1 Hackbusch, Wolfgang (2012) (in en-gb). Tensor Spaces and Numerical Tensor Calculus | SpringerLink. Springer Series in Computational Mathematics. 42. doi:10.1007/978-3-642-28027-6. ISBN 978-3-642-28026-9. 
  14. 14.0 14.1 14.2 Cobb, Benjamin; Kolla, Hemanth; Phipps, Eric; Çatalyürek, Ümit V. (2022). "FIST-HOSVD: Fused in-Place Sequentially Truncated Higher Order Singular Value Decomposition" (in en). Platform for Advanced Scientific Computing(PASC). doi:10.1145/3539781.3539798. ISBN 978-1-4503-9410-9. 
  15. Grasedyck, L. (2010-01-01). "Hierarchical Singular Value Decomposition of Tensors". SIAM Journal on Matrix Analysis and Applications 31 (4): 2029–2054. doi:10.1137/090764189. ISSN 0895-4798. 
  16. M. A. O. Vasilescu (2002) "Human Motion Signatures: Analysis, Synthesis, Recognition," Proceedings of International Conference on Pattern Recognition (ICPR 2002), Vol. 3, Quebec City, Canada, Aug, 2002, 456–460.
  17. M.A.O. Vasilescu, D. Terzopoulos (2003) "Multilinear Subspace Analysis for Image Ensembles, M. A. O. Vasilescu, D. Terzopoulos, Proc. Computer Vision and Pattern Recognition Conf. (CVPR '03), Vol.2, Madison, WI, June, 2003, 93–99.
  18. M.A.O. Vasilescu, D. Terzopoulos (2002) "Multilinear Analysis of Image Ensembles: TensorFaces," Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark, May, 2002, in Computer Vision -- ECCV 2002, Lecture Notes in Computer Science, Vol. 2350, A. Heyden et al. (Eds.), Springer-Verlag, Berlin, 2002, 447–460.
  19. M.A.O. Vasilescu, D. Terzopoulos (2004) "TensorTextures: Multilinear Image-Based Rendering", M. A. O. Vasilescu and D. Terzopoulos, Proc. ACM SIGGRAPH 2004 Conference Los Angeles, CA, August, 2004, in Computer Graphics Proceedings, Annual Conference Series, 2004, 336–342.
  20. L. Omberg; G. H. Golub; O. Alter (November 2007). "A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data From Different Studies". PNAS 104 (47): 18371–18376. doi:10.1073/pnas.0709146104. PMID 18003902. Bibcode2007PNAS..10418371O. 
  21. L. Omberg; J. R. Meyerson; K. Kobayashi; L. S. Drury; J. F. X. Diffley; O. Alter (October 2009). "Global Effects of DNA Replication and DNA Replication Origin Activity on Eukaryotic Gene Expression". Molecular Systems Biology 5: 312. doi:10.1038/msb.2009.70. Highlight. PMID 19888207. 
  22. C. Muralidhara; A. M. Gross; R. R. Gutell; O. Alter (April 2011). "Tensor Decomposition Reveals Concurrent Evolutionary Convergences and Divergences and Correlations with Structural Motifs in Ribosomal RNA". PLOS ONE 6 (4). doi:10.1371/journal.pone.0018768. Highlight. PMID 21625625. Bibcode2011PLoSO...618768M. 
  23. S. P. Ponnapalli; M. A. Saunders; C. F. Van Loan; O. Alter (December 2011). "A Higher-Order Generalized Singular Value Decomposition for Comparison of Global mRNA Expression from Multiple Organisms". PLOS ONE 6 (12). doi:10.1371/journal.pone.0028072. Highlight. PMID 22216090. Bibcode2011PLoSO...628072P. 
  24. P. Sankaranarayanan; T. E. Schomay; K. A. Aiello; O. Alter (April 2015). "Tensor GSVD of Patient- and Platform-Matched Tumor and Normal DNA Copy-Number Profiles Uncovers Chromosome Arm-Wide Patterns of Tumor-Exclusive Platform-Consistent Alterations Encoding for Cell Transformation and Predicting Ovarian Cancer Survival". PLOS ONE 10 (4). doi:10.1371/journal.pone.0121396. AAAS EurekAlert! Press Release and NAE Podcast Feature. PMID 25875127. Bibcode2015PLoSO..1021396S. 
  25. Hadi Fanaee-T; João Gama (May 2015). "EigenEvent: An algorithm for event detection from complex data streams in Syndromic surveillance". Intelligent Data Analysis 19 (3): 597–616. doi:10.3233/IDA-150734. Bibcode2014arXiv1406.3496F. 
  26. 26.0 26.1 P. Baranyi (April 2004). "TP model transformation as a way to LMI based controller design". IEEE Transactions on Industrial Electronics 51 (2): 387–400. doi:10.1109/tie.2003.822037. 
  27. 27.0 27.1 P. Baranyi; D. Tikk; Y. Yam; R. J. Patton (2003). "From Differential Equations to PDC Controller Design via Numerical Transformation". Computers in Industry 51 (3): 281–297. doi:10.1016/s0166-3615(03)00058-7. 
  28. P. Baranyi; L. Szeidl; P. Várlaki; Y. Yam (July 3–5, 2006). "Definition of the HOSVD-based canonical form of polytopic dynamic models". 3rd International Conference on Mechatronics (ICM 2006). Budapest, Hungary. pp. 660–665. 
  29. Y-h. Taguchi (August 2017). "Tensor decomposition-based unsupervised feature extraction applied to matrix products for multi-view data processing". PLOS ONE 12 (8). doi:10.1371/journal.pone.0183933. PMID 28841719. Bibcode2017PLoSO..1283933T. 
  30. Y-h. Taguchi (October 2017). "Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets". Scientific Reports 7 (1): 13733. doi:10.1038/s41598-017-13003-0. PMID 29062063. Bibcode2017NatSR...713733T.