Biconjugate gradient method

In mathematics, more specifically in numerical linear algebra, the biconjugate gradient method is an algorithm to solve systems of linear equations

[math]\displaystyle{ A x= b.\, }[/math]

Unlike the conjugate gradient method, this algorithm does not require the matrix [math]\displaystyle{ A }[/math] to be self-adjoint, but instead one needs to perform multiplications by the conjugate transpose [math]\displaystyle{ A^* }[/math].

The algorithm

  1. Choose initial guess [math]\displaystyle{ x_0\, }[/math], two other vectors [math]\displaystyle{ x_0^* }[/math] and [math]\displaystyle{ b^*\, }[/math], and a preconditioner [math]\displaystyle{ M\, }[/math]
  2. [math]\displaystyle{ r_0 \leftarrow b-A\, x_0\, }[/math]
  3. [math]\displaystyle{ r_0^* \leftarrow b^*-x_0^*\, A^* }[/math]
  4. [math]\displaystyle{ p_0 \leftarrow M^{-1} r_0\, }[/math]
  5. [math]\displaystyle{ p_0^* \leftarrow r_0^*M^{-1}\, }[/math]
  6. for [math]\displaystyle{ k=0, 1, \ldots }[/math] do
    1. [math]\displaystyle{ \alpha_k \leftarrow {r_k^* M^{-1} r_k \over p_k^* A p_k}\, }[/math]
    2. [math]\displaystyle{ x_{k+1} \leftarrow x_k + \alpha_k \cdot p_k\, }[/math]
    3. [math]\displaystyle{ x_{k+1}^* \leftarrow x_k^* + \overline{\alpha_k}\cdot p_k^*\, }[/math]
    4. [math]\displaystyle{ r_{k+1} \leftarrow r_k - \alpha_k \cdot A p_k\, }[/math]
    5. [math]\displaystyle{ r_{k+1}^* \leftarrow r_k^*- \overline{\alpha_k} \cdot p_k^*\, A^* }[/math]
    6. [math]\displaystyle{ \beta_k \leftarrow {r_{k+1}^* M^{-1} r_{k+1} \over r_k^* M^{-1} r_k}\, }[/math]
    7. [math]\displaystyle{ p_{k+1} \leftarrow M^{-1} r_{k+1} + \beta_k \cdot p_k\, }[/math]
    8. [math]\displaystyle{ p_{k+1}^* \leftarrow r_{k+1}^*M^{-1} + \overline{\beta_k}\cdot p_k^*\, }[/math]

In the above formulation, the computed [math]\displaystyle{ r_k\, }[/math] and [math]\displaystyle{ r_k^* }[/math] satisfy

[math]\displaystyle{ r_k = b - A x_k,\, }[/math]
[math]\displaystyle{ r_k^* = b^* - x_k^*\, A^* }[/math]

and thus are the respective residuals corresponding to [math]\displaystyle{ x_k\, }[/math] and [math]\displaystyle{ x_k^* }[/math], as approximate solutions to the systems

[math]\displaystyle{ A x = b,\, }[/math]
[math]\displaystyle{ x^*\, A^* = b^*\,; }[/math]

Here the asterisk in [math]\displaystyle{ x^* }[/math] denotes the adjoint (a row vector), and [math]\displaystyle{ \overline{\alpha} }[/math] denotes the complex conjugate of [math]\displaystyle{ \alpha }[/math].
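
The recurrences above translate directly into code. The following NumPy sketch is one possible transcription for real matrices, where the conjugations drop out; the starred row vectors are stored as ordinary 1-D arrays, with the action of the adjoint realised as right-multiplication by [math]\displaystyle{ A }[/math], as in the unpreconditioned statement below, and the preconditioner is applied through solves with [math]\displaystyle{ M }[/math] and its transpose. The function name and the dense solves are illustrative choices rather than part of the method.

    import numpy as np

    def bicg_preconditioned(A, b, b_star, x0, x0_star, M, tol=1e-10, max_iter=None):
        # Preconditioned BiCG for a real square matrix A.  Starred quantities
        # are the row-vector iterates of the adjoint system, stored as 1-D
        # arrays, so "v A" is written v @ A and "v M^{-1}" becomes solve(M.T, v).
        max_iter = max_iter or 2 * len(b)
        x, xs = x0.astype(float), x0_star.astype(float)
        r = b - A @ x                                # r_0 = b - A x_0
        rs = b_star - xs @ A                         # r_0*, the adjoint residual
        p = np.linalg.solve(M, r)                    # p_0 = M^{-1} r_0
        ps = np.linalg.solve(M.T, rs)                # p_0* = r_0* M^{-1}
        rho = rs @ np.linalg.solve(M, r)             # r_0* M^{-1} r_0
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rho / (ps @ Ap)                  # alpha_k
            x += alpha * p                           # x_{k+1}
            xs += alpha * ps                         # x_{k+1}*
            r -= alpha * Ap                          # r_{k+1}
            rs -= alpha * (ps @ A)                   # r_{k+1}*
            if np.linalg.norm(r) < tol:
                break
            z = np.linalg.solve(M, r)                # M^{-1} r_{k+1}
            rho_new = rs @ z
            beta = rho_new / rho                     # beta_k
            p = z + beta * p                         # p_{k+1}
            ps = np.linalg.solve(M.T, rs) + beta * ps  # p_{k+1}*
            rho = rho_new
        return x, xs

With [math]\displaystyle{ M }[/math] equal to the identity this reduces to the unpreconditioned version below; a symmetric choice such as the diagonal of [math]\displaystyle{ A }[/math] (Jacobi preconditioning) keeps the two solves cheap.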

Unpreconditioned version of the algorithm

  1. Choose initial guess [math]\displaystyle{ x_0\, }[/math] and two other vectors [math]\displaystyle{ \hat{x}_0\, }[/math] and [math]\displaystyle{ \hat{b}\, }[/math]
  2. [math]\displaystyle{ r_0 \leftarrow b-A\, x_0\, }[/math]
  3. [math]\displaystyle{ \hat{r}_0 \leftarrow \hat{b} - \hat{x}_0A }[/math]
  4. [math]\displaystyle{ p_0 \leftarrow r_0\, }[/math]
  5. [math]\displaystyle{ \hat{p}_0 \leftarrow \hat{r}_0\, }[/math]
  6. for [math]\displaystyle{ k=0, 1, \ldots }[/math] do
    1. [math]\displaystyle{ \alpha_k \leftarrow {\hat{r}_k r_k \over \hat{p}_k A p_k}\, }[/math]
    2. [math]\displaystyle{ x_{k+1} \leftarrow x_k + \alpha_k \cdot p_k\, }[/math]
    3. [math]\displaystyle{ \hat{x}_{k+1} \leftarrow \hat{x}_k + \alpha_k \cdot \hat{p}_k\, }[/math]
    4. [math]\displaystyle{ r_{k+1} \leftarrow r_k - \alpha_k \cdot A p_k\, }[/math]
    5. [math]\displaystyle{ \hat{r}_{k+1} \leftarrow \hat{r}_k- \alpha_k \cdot \hat{p}_k A }[/math]
    6. [math]\displaystyle{ \beta_k \leftarrow {\hat{r}_{k+1} r_{k+1} \over \hat{r}_k r_k}\, }[/math]
    7. [math]\displaystyle{ p_{k+1} \leftarrow r_{k+1} + \beta_k \cdot p_k\, }[/math]
    8. [math]\displaystyle{ \hat{p}_{k+1} \leftarrow \hat{r}_{k+1} + \beta_k \cdot \hat{p}_k\, }[/math]
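
A direct transcription of these steps for real data may make the bookkeeping concrete. In the sketch below the hatted row vectors are stored as 1-D arrays, and the test problem at the end (a diagonally dominant random matrix, with [math]\displaystyle{ \hat{b} = b }[/math] and [math]\displaystyle{ \hat{x}_0 = 0 }[/math], so that [math]\displaystyle{ \hat{r}_0 = r_0 }[/math]) is arbitrary illustrative data, not part of the method.

    import numpy as np

    def bicg(A, b, b_hat, x0, x0_hat, tol=1e-10, max_iter=None):
        # Unpreconditioned BiCG for a real square matrix A; hatted quantities
        # are the row-vector iterates, stored as 1-D arrays ("v A" is v @ A).
        max_iter = max_iter or 2 * len(b)
        x, x_hat = x0.astype(float), x0_hat.astype(float)
        r = b - A @ x                         # step 2: r_0
        r_hat = b_hat - x_hat @ A             # step 3: r-hat_0
        p, p_hat = r.copy(), r_hat.copy()     # steps 4-5
        rho = r_hat @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rho / (p_hat @ Ap)        # loop step 1: alpha_k
            x += alpha * p                    # loop step 2
            x_hat += alpha * p_hat            # loop step 3
            r -= alpha * Ap                   # loop step 4
            r_hat -= alpha * (p_hat @ A)      # loop step 5
            if np.linalg.norm(r) < tol:
                break
            rho_new = r_hat @ r
            beta = rho_new / rho              # loop step 6: beta_k
            p = r + beta * p                  # loop step 7
            p_hat = r_hat + beta * p_hat      # loop step 8
            rho = rho_new
        return x, x_hat

    rng = np.random.default_rng(0)
    n = 40
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # nonsymmetric test matrix
    b = rng.standard_normal(n)
    x, _ = bicg(A, b, b_hat=b.copy(), x0=np.zeros(n), x0_hat=np.zeros(n))
    print(np.linalg.norm(A @ x - b))                  # close to zero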

Discussion

The biconjugate gradient method is numerically unstable[citation needed] (compare to the biconjugate gradient stabilized method), but it is very important from a theoretical point of view. Define the iteration steps by

[math]\displaystyle{ x_k:=x_j+ P_k A^{-1}\left(b - A x_j \right), }[/math]
[math]\displaystyle{ x_k^*:= x_j^*+\left(b^*- x_j^* A \right) P_k A^{-1}, }[/math]

where [math]\displaystyle{ j\lt k }[/math], using the related projection

[math]\displaystyle{ P_k:= \mathbf{u}_k \left(\mathbf{v}_k^* A \mathbf{u}_k \right)^{-1} \mathbf{v}_k^* A, }[/math]

with

[math]\displaystyle{ \mathbf{u}_k=\left[u_0, u_1, \dots, u_{k-1} \right], }[/math]
[math]\displaystyle{ \mathbf{v}_k=\left[v_0, v_1, \dots, v_{k-1} \right]. }[/math]

These related projections may themselves be iterated as

[math]\displaystyle{ P_{k+1}= P_k+ \left( 1-P_k\right) u_k \otimes {v_k^* A\left(1-P_k \right) \over v_k^* A\left(1-P_k \right) u_k}. }[/math]

A relation to Quasi-Newton methods is given by [math]\displaystyle{ P_k= A_k^{-1} A }[/math] and [math]\displaystyle{ x_{k+1}= x_k- A_{k+1}^{-1}\left(A x_k -b \right) }[/math], where

[math]\displaystyle{ A_{k+1}^{-1}= A_k^{-1}+ \left( 1-A_k^{-1}A\right) u_k \otimes {v_k^* \left(1-A A_k^{-1} \right) \over v_k^* A\left(1-A_k^{-1}A \right) u_k}. }[/math]
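
Both the projection update and the quasi-Newton form can be checked numerically; the block sizes, the random matrices, and the vectors [math]\displaystyle{ u_i, v_i }[/math] below are arbitrary test data (real, so [math]\displaystyle{ v^* }[/math] is simply a transpose), and the explicit inverse of [math]\displaystyle{ A }[/math] appears only for the check.

    import numpy as np

    rng = np.random.default_rng(1)
    n, k = 6, 3
    A = rng.standard_normal((n, n))
    U = rng.standard_normal((n, k + 1))          # columns u_0, ..., u_k
    V = rng.standard_normal((n, k + 1))          # columns v_0, ..., v_k
    Id = np.eye(n)

    def proj(j):
        # P_j = U_j (V_j^T A U_j)^{-1} V_j^T A, built from the first j columns
        Uj, Vj = U[:, :j], V[:, :j]
        return Uj @ np.linalg.solve(Vj.T @ A @ Uj, Vj.T @ A)

    Pk, Pk1 = proj(k), proj(k + 1)
    print(np.linalg.norm(Pk @ Pk - Pk))          # ~0: P_k is a projection

    # rank-one update: P_{k+1} = P_k + (1 - P_k) u_k w / (w u_k), w = v_k^T A (1 - P_k)
    u, w = U[:, k], (V[:, k] @ A) @ (Id - Pk)
    print(np.linalg.norm(Pk + (Id - Pk) @ np.outer(u, w) / (w @ u) - Pk1))      # ~0

    # quasi-Newton form: with A_k^{-1} := P_k A^{-1}, the update reproduces P_{k+1} A^{-1}
    Ainv = np.linalg.inv(A)
    Bk = Pk @ Ainv
    num = (Id - Bk @ A) @ np.outer(u, V[:, k] @ (Id - A @ Bk))
    den = V[:, k] @ A @ (Id - Bk @ A) @ u
    print(np.linalg.norm(Bk + num / den - Pk1 @ Ainv))                          # ~0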

The new directions

[math]\displaystyle{ p_k = \left(1-P_k \right) u_k, }[/math]
[math]\displaystyle{ p_k^* = v_k^* A \left(1- P_k \right) A^{-1} }[/math]

are then orthogonal to the residuals:

[math]\displaystyle{ v_i^* r_k= p_i^* r_k=0, }[/math]
[math]\displaystyle{ r_k^* u_j = r_k^* p_j= 0, }[/math]

which themselves satisfy

[math]\displaystyle{ r_k= A \left( 1- P_k \right) A^{-1} r_j, }[/math]
[math]\displaystyle{ r_k^*= r_j^* \left( 1- P_k \right) }[/math]

where [math]\displaystyle{ i,j\lt k }[/math].

The biconjugate gradient method now makes a special choice and uses the setting

[math]\displaystyle{ u_k = M^{-1} r_k,\, }[/math]
[math]\displaystyle{ v_k^* = r_k^* \, M^{-1}.\, }[/math]

With this particular choice, explicit evaluations of [math]\displaystyle{ P_k }[/math] and [math]\displaystyle{ A^{-1} }[/math] are avoided, and the algorithm takes the form stated above.

Properties

  • If [math]\displaystyle{ A= A^*\, }[/math] is self-adjoint, [math]\displaystyle{ x_0^*= x_0 }[/math] and [math]\displaystyle{ b^*=b }[/math], then [math]\displaystyle{ r_k= r_k^* }[/math], [math]\displaystyle{ p_k= p_k^* }[/math], and the conjugate gradient method produces the same sequence [math]\displaystyle{ x_k= x_k^* }[/math] at half the computational cost.
  • The sequences produced by the algorithm are biorthogonal, i.e., [math]\displaystyle{ p_i^*Ap_j=r_i^*M^{-1}r_j=0 }[/math] for [math]\displaystyle{ i \neq j }[/math] (see the numerical check after this list).
  • If [math]\displaystyle{ P_{j'}\, }[/math] is a polynomial with [math]\displaystyle{ \deg\left(P_{j'}\right)+j\lt k }[/math], then [math]\displaystyle{ r_k^*P_{j'}\left(M^{-1}A\right)u_j=0 }[/math]. The algorithm thus produces projections onto the Krylov subspace.
  • If [math]\displaystyle{ P_{i'}\, }[/math] is a polynomial with [math]\displaystyle{ i+\deg\left(P_{i'}\right)\lt k }[/math], then [math]\displaystyle{ v_i^*P_{i'}\left(AM^{-1}\right)r_k=0 }[/math].
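
A short numerical check of the biorthogonality property, which for [math]\displaystyle{ M = I }[/math] also covers the orthogonality relations [math]\displaystyle{ v_i^* r_k = 0 }[/math] of the discussion above: the unpreconditioned recurrences are run for a few steps on an arbitrary small real test problem, and the residual and direction pairs are stored only for the test.

    import numpy as np

    rng = np.random.default_rng(2)
    n, steps = 30, 6
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # arbitrary nonsymmetric test matrix
    b = rng.standard_normal(n)

    # unpreconditioned BiCG (M = I) started with r-hat_0 = r_0
    x = np.zeros(n)
    r = b - A @ x
    r_hat = r.copy()
    p, p_hat = r.copy(), r_hat.copy()
    Rs, Rs_hat, Ps, Ps_hat = [r.copy()], [r_hat.copy()], [p.copy()], [p_hat.copy()]
    rho = r_hat @ r
    for _ in range(steps):
        Ap = A @ p
        alpha = rho / (p_hat @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        r_hat = r_hat - alpha * (p_hat @ A)
        rho_new = r_hat @ r
        beta = rho_new / rho
        p, p_hat = r + beta * p, r_hat + beta * p_hat
        rho = rho_new
        Rs.append(r.copy()); Rs_hat.append(r_hat.copy())
        Ps.append(p.copy()); Ps_hat.append(p_hat.copy())

    # both maxima are at rounding level: r-hat_i r_j = 0 and p-hat_i A p_j = 0 for i != j
    print(max(abs(Rs_hat[i] @ Rs[j]) for i in range(steps) for j in range(steps) if i != j))
    print(max(abs(Ps_hat[i] @ A @ Ps[j]) for i in range(steps) for j in range(steps) if i != j))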
