Bartels–Stewart algorithm

In numerical linear algebra, the Bartels–Stewart algorithm is used to numerically solve the Sylvester matrix equation [math]\displaystyle{ AX - XB = C }[/math]. Developed by R.H. Bartels and G.W. Stewart in 1971,[1] it was the first numerically stable method that could be systematically applied to solve such equations. The algorithm works by using the real Schur decompositions of [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] to transform [math]\displaystyle{ AX - XB = C }[/math] into a triangular system that can then be solved using forward or backward substitution. In 1979, G. Golub, C. Van Loan and S. Nash introduced an improved version of the algorithm,[2] known as the Hessenberg–Schur algorithm. It remains a standard approach for solving Sylvester equations when [math]\displaystyle{ X }[/math] is of small to moderate size.

The algorithm

Let [math]\displaystyle{ A \in \mathbb{R}^{m \times m} }[/math], [math]\displaystyle{ B \in \mathbb{R}^{n \times n} }[/math], and [math]\displaystyle{ X, C \in \mathbb{R}^{m \times n} }[/math], and assume that the eigenvalues of [math]\displaystyle{ A }[/math] are distinct from the eigenvalues of [math]\displaystyle{ B }[/math]. Then, the matrix equation [math]\displaystyle{ AX - XB = C }[/math] has a unique solution. The Bartels–Stewart algorithm computes [math]\displaystyle{ X }[/math] by applying the following steps:[2]

1. Compute the real Schur decompositions

[math]\displaystyle{ R = U^TAU, }[/math]
[math]\displaystyle{ S = V^TB^TV. }[/math]

The matrices [math]\displaystyle{ R }[/math] and [math]\displaystyle{ S }[/math] are block-upper triangular matrices, with diagonal blocks of size [math]\displaystyle{ 1 \times 1 }[/math] or [math]\displaystyle{ 2 \times 2 }[/math].

2. Set [math]\displaystyle{ F = U^TCV. }[/math]

3. Solve the simplified system [math]\displaystyle{ RY - YS^T = F }[/math], where [math]\displaystyle{ Y = U^TXV }[/math]. This can be done using forward substitution on the blocks. Specifically, if [math]\displaystyle{ s_{k-1, k} = 0 }[/math], then

[math]\displaystyle{ (R - s_{kk}I)y_k = f_{k} + \sum_{j = k+1}^n s_{kj}y_j, }[/math]

where [math]\displaystyle{ y_k }[/math] is the [math]\displaystyle{ k }[/math]th column of [math]\displaystyle{ Y }[/math]. When [math]\displaystyle{ s_{k-1, k} \neq 0 }[/math], columns [math]\displaystyle{ [ y_{k-1} \mid y_{k}] }[/math] should be concatenated and solved for simultaneously.

4. Set [math]\displaystyle{ X = UYV^T. }[/math]
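
A minimal sketch of these steps in Python, using SciPy's Schur decomposition, is given below. It assumes that [math]\displaystyle{ B }[/math] has only real eigenvalues, so that [math]\displaystyle{ S }[/math] has no [math]\displaystyle{ 2 \times 2 }[/math] diagonal blocks and the coupled-column case does not arise; the function name bartels_stewart is illustrative only.

    import numpy as np
    from scipy.linalg import schur

    def bartels_stewart(A, B, C):
        """Sketch of solving AX - XB = C by the steps above.
        Assumes B has real eigenvalues, so S has no 2x2 diagonal blocks."""
        # Step 1: real Schur decompositions A = U R U^T and B^T = V S V^T.
        R, U = schur(A, output='real')
        S, V = schur(B.T, output='real')

        # Step 2: transform the right-hand side.
        F = U.T @ C @ V

        # Step 3: solve R Y - Y S^T = F one column at a time.  Column k
        # depends on columns k+1, ..., n, so the loop runs from the last
        # column backwards.
        m, n = C.shape
        Y = np.zeros((m, n))
        for k in range(n - 1, -1, -1):
            rhs = F[:, k] + Y[:, k + 1:] @ S[k, k + 1:]
            Y[:, k] = np.linalg.solve(R - S[k, k] * np.eye(m), rhs)

        # Step 4: transform back to the original coordinates.
        return U @ Y @ V.T

A full implementation would handle the [math]\displaystyle{ 2 \times 2 }[/math] diagonal blocks of [math]\displaystyle{ S }[/math] by solving pairs of columns simultaneously, as described in step 3, and would exploit the quasi-triangular structure of [math]\displaystyle{ R - s_{kk}I }[/math] rather than calling a general dense solver.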

Computational cost

Using the QR algorithm, the real Schur decompositions in step 1 require approximately [math]\displaystyle{ 10(m^3 + n^3) }[/math] flops, so that the overall computational cost is [math]\displaystyle{ 10(m^3 + n^3) + 2.5(mn^2 + nm^2) }[/math] flops.[2]

Simplifications and special cases

In the special case where [math]\displaystyle{ B=-A^T }[/math] and [math]\displaystyle{ C }[/math] is symmetric, the solution [math]\displaystyle{ X }[/math] will also be symmetric. This symmetry can be exploited so that [math]\displaystyle{ Y }[/math] is found more efficiently in step 3 of the algorithm.[1]
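
This special case is the continuous-time Lyapunov equation [math]\displaystyle{ AX + XA^T = C }[/math]. As a brief illustration, SciPy provides a dedicated routine for it; the matrices below are arbitrary examples.

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    # With B = -A^T, the Sylvester equation AX - XB = C becomes the
    # Lyapunov equation AX + XA^T = C.
    A = np.array([[-2.0,  1.0],
                  [ 0.0, -3.0]])
    C = np.array([[1.0, 0.5],
                  [0.5, 2.0]])   # symmetric right-hand side

    X = solve_continuous_lyapunov(A, C)
    print(np.allclose(A @ X + X @ A.T, C))   # True
    print(np.allclose(X, X.T))               # the solution is symmetric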

The Hessenberg–Schur algorithm

The Hessenberg–Schur algorithm[2] replaces the decomposition [math]\displaystyle{ R = U^TAU }[/math] in step 1 with the decomposition [math]\displaystyle{ H = Q^TAQ }[/math], where [math]\displaystyle{ H }[/math] is an upper-Hessenberg matrix. This leads to a system of the form [math]\displaystyle{ HY - YS^T = F }[/math] that can be solved using forward substitution. The advantage of this approach is that [math]\displaystyle{ H = Q^TAQ }[/math] can be found using Householder reflections at a cost of [math]\displaystyle{ (5/3)m^3 }[/math] flops, compared to the [math]\displaystyle{ 10m^3 }[/math] flops required to compute the real Schur decomposition of [math]\displaystyle{ A }[/math].
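
A sketch of the modified first step, using SciPy's Hessenberg reduction, is shown below; the remaining steps proceed as in the Bartels–Stewart algorithm, except that each column solve now involves an upper-Hessenberg matrix. The example matrices are arbitrary.

    import numpy as np
    from scipy.linalg import hessenberg, schur

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    B = rng.standard_normal((3, 3))
    C = rng.standard_normal((5, 3))

    # Step 1 (Hessenberg-Schur variant): only B^T gets a full Schur
    # decomposition; A is merely reduced to upper Hessenberg form.
    H, Q = hessenberg(A, calc_q=True)   # A = Q H Q^T, H upper Hessenberg
    S, V = schur(B.T, output='real')    # B^T = V S V^T, S quasi-triangular

    # Transformed right-hand side for the system H Y - Y S^T = F.
    F = Q.T @ C @ V

Each subsequent column solve with [math]\displaystyle{ H - s_{kk}I }[/math] still costs [math]\displaystyle{ \mathcal{O}(m^2) }[/math] operations (for example via Gaussian elimination with partial pivoting), so the savings relative to the Bartels–Stewart algorithm come from the cheaper reduction in step 1.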

Software and implementation

The subroutines required for the Hessenberg–Schur variant of the Bartels–Stewart algorithm are implemented in the SLICOT library. These are used in the MATLAB control system toolbox.
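
Outside of MATLAB, SciPy exposes a dense Sylvester solver of this kind as scipy.linalg.solve_sylvester, whose documentation describes a Bartels–Stewart-style approach based on Schur decompositions. A minimal usage sketch follows; note that SciPy's sign convention is [math]\displaystyle{ AX + XB = Q }[/math], so [math]\displaystyle{ -B }[/math] is passed to solve [math]\displaystyle{ AX - XB = C }[/math].

    import numpy as np
    from scipy.linalg import solve_sylvester

    rng = np.random.default_rng(42)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((3, 3))
    C = rng.standard_normal((4, 3))

    # SciPy solves AX + XB = Q, so pass -B to obtain AX - XB = C.
    X = solve_sylvester(A, -B, C)
    print(np.allclose(A @ X - X @ B, C))   # True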

Alternative approaches

For large systems, the [math]\displaystyle{ \mathcal{O}(m^3 + n^3) }[/math] cost of the Bartels–Stewart algorithm can be prohibitive. When [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are sparse or structured, so that linear solves and matrix–vector products involving them are efficient, iterative algorithms can potentially perform better. These include projection-based methods, which use Krylov subspace iterations, methods based on the alternating direction implicit (ADI) iteration, and hybridizations that involve both projection and ADI.[3] Iterative methods can also be used to directly construct low-rank approximations to [math]\displaystyle{ X }[/math] when solving [math]\displaystyle{ AX-XB = C }[/math].

References

  1. Bartels, R. H.; Stewart, G. W. (1972). "Solution of the matrix equation AX + XB = C [F4]". Communications of the ACM 15 (9): 820–826. doi:10.1145/361573.361582. ISSN 0001-0782.
  2. Golub, G.; Nash, S.; Van Loan, C. (1979). "A Hessenberg–Schur method for the problem AX + XB = C". IEEE Transactions on Automatic Control 24 (6): 909–913. doi:10.1109/TAC.1979.1102170. ISSN 0018-9286.
  3. Simoncini, V. (2016). "Computational Methods for Linear Matrix Equations". SIAM Review 58 (3): 377–441. doi:10.1137/130912839. ISSN 0036-1445.