Linear recurrence with constant coefficients

From HandWiki

In mathematics (including combinatorics, linear algebra, and dynamical systems), a linear recurrence with constant coefficients[1](ch. 17)[2](ch. 10) (also known as a linear recurrence relation or linear difference equation) sets equal to 0 a polynomial that is linear in the various iterates of a variable—that is, in the values of the elements of a sequence. The polynomial's linearity means that each of its terms has degree 0 or 1. A linear recurrence denotes the evolution of some variable over time, with the current time period or discrete moment in time denoted as t, one period earlier denoted as t − 1, one period later as t + 1, etc.

The solution of such an equation is a function of t, and not of any iterate values, giving the value of the iterate at any time. To find the solution it is necessary to know the specific values (known as initial conditions) of n of the iterates, where n is the order of the recurrence; normally these are the n oldest iterates. The equation or its variable is said to be stable if from any set of initial conditions the variable's limit as time goes to infinity exists; this limit is called the steady state.

Difference equations are used in a variety of contexts, such as in economics to model the evolution through time of variables such as gross domestic product, the inflation rate, the exchange rate, etc. They are used in modeling such time series because values of these variables are only measured at discrete intervals. In econometric applications, linear difference equations are modeled with stochastic terms in the form of autoregressive (AR) models and in models such as vector autoregression (VAR) and autoregressive moving average (ARMA) models that combine AR with other features.

Definitions

A linear recurrence with constant coefficients is an equation of the following form, written in terms of parameters [math]\displaystyle{ a_1, \ldots, a_n }[/math] and [math]\displaystyle{ b }[/math]:

[math]\displaystyle{ y_t = a_1y_{t-1} + \cdots + a_ny_{t-n} + b, }[/math]

or equivalently as

[math]\displaystyle{ y_{t+n}= a_1y_{t+n-1} + \cdots + a_ny_t + b. }[/math]

The positive integer [math]\displaystyle{ n }[/math] is called the order of the recurrence and denotes the longest time lag between iterates. The equation is called homogeneous if b = 0 and nonhomogeneous if b ≠ 0.

If the equation is homogeneous, the coefficients determine the characteristic polynomial (also "auxiliary polynomial" or "companion polynomial")

[math]\displaystyle{ p(\lambda)= \lambda^n - a_1\lambda^{n-1} - a_2\lambda^{n-2}-\cdots-a_{n} }[/math]

whose roots play a crucial role in finding and understanding the sequences satisfying the recurrence.

Conversion to homogeneous form

If b ≠ 0, the equation

[math]\displaystyle{ y_t = a_1y_{t-1} + \cdots + a_ny_{t-n} + b }[/math]

is said to be nonhomogeneous. To solve this equation it is convenient to convert it to homogeneous form, with no constant term. This is done by first finding the equation's steady state value—a value y* such that, if n successive iterates all had this value, so would all future values. This value is found by setting all values of y equal to y* in the difference equation, and solving, thus obtaining

[math]\displaystyle{ y^* = \frac{b}{1-a_1-\cdots - a_n} }[/math]

assuming the denominator is not 0. If it is zero, the steady state does not exist.

Given the steady state, the difference equation can be rewritten in terms of deviations of the iterates from the steady state, as

[math]\displaystyle{ \left(y_t -y^*\right)= a_1\left(y_{t-1}-y^*\right) + \cdots + a_n\left(y_{t-n}-y^*\right) }[/math]

which has no constant term, and which can be written more succinctly as

[math]\displaystyle{ x_t= a_1x_{t-1} + \cdots + a_nx_{t-n} }[/math]

where [math]\displaystyle{ x_t = y_t - y^* }[/math]. This is the homogeneous form.
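The conversion can be checked numerically. The following is a minimal sketch, assuming a second-order example with coefficients chosen only for illustration; the helper names `steady_state` and `iterate` are hypothetical and not part of any library. It computes y* and confirms that the deviations from y* follow the homogeneous recurrence.

```python
from fractions import Fraction

def steady_state(a, b):
    """Return y* = b / (1 - a_1 - ... - a_n), or None if the denominator is 0."""
    denom = 1 - sum(a)
    return None if denom == 0 else b / denom

def iterate(a, b, initial, steps):
    """Iterate y_t = a_1 y_{t-1} + ... + a_n y_{t-n} + b.

    `initial` lists the n oldest iterates in time order (oldest first).
    """
    y = list(initial)
    for _ in range(steps):
        # y[-1] is y_{t-1}, y[-2] is y_{t-2}, and so on.
        y.append(sum(a_k * y[-k] for k, a_k in enumerate(a, start=1)) + b)
    return y

a = [Fraction(1, 2), Fraction(1, 4)]   # a_1, a_2 (illustrative values)
b = Fraction(3)
y_star = steady_state(a, b)            # 3 / (1 - 3/4) = 12
y = iterate(a, b, [Fraction(0), Fraction(1)], 10)
x = [y_t - y_star for y_t in y]        # deviations from the steady state
# Iterating the deviations with b = 0 reproduces the same deviation sequence.
x_homogeneous = iterate(a, 0, x[:2], 10)
```

Exact rational arithmetic is used so that the two sequences can be compared with `==` rather than with a floating-point tolerance.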

If there is no steady state, the difference equation

[math]\displaystyle{ y_t = a_1y_{t-1} + \cdots + a_ny_{t-n} + b }[/math]

can be combined with its equivalent form

[math]\displaystyle{ y_{t-1}= a_1y_{t-2}+ \cdots + a_ny_{t-(n+1)}+ b }[/math]

to obtain (by solving both for b)

[math]\displaystyle{ y_t - a_1y_{t-1} - \cdots - a_ny_{t-n} = y_{t-1}- a_1y_{t-2}- \cdots - a_ny_{t-(n+1)} }[/math]

in which like terms can be combined to give a homogeneous equation of one order higher than the original.
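Combining like terms gives explicit coefficients for the higher-order equation. The sketch below, with the hypothetical helper `raise_order`, builds them and checks on the example y_t = y_{t-1} + 2 (which has no steady state because the denominator 1 − a_1 is zero) that the homogeneous equation reproduces the original sequence.

```python
def raise_order(a):
    """Coefficients of the homogeneous recurrence of order n + 1.

    Subtracting the lagged equation from the original gives
    y_t = (1 + a_1) y_{t-1} + (a_2 - a_1) y_{t-2} + ...
          + (a_n - a_{n-1}) y_{t-n} - a_n y_{t-(n+1)}.
    """
    padded = [-1] + list(a) + [0]          # treat a_0 = -1 and a_{n+1} = 0
    return [padded[k] - padded[k - 1] for k in range(1, len(padded))]

a, b = [1], 2                               # y_t = y_{t-1} + 2
original = [5]                              # illustrative starting value
for _ in range(6):
    original.append(sum(c * original[-k] for k, c in enumerate(a, 1)) + b)

h = raise_order(a)                          # [2, -1]: y_t = 2 y_{t-1} - y_{t-2}
converted = original[:len(h)]               # one extra initial condition is needed
for _ in range(len(original) - len(h)):
    converted.append(sum(c * converted[-k] for k, c in enumerate(h, 1)))
```

Note that the higher-order equation needs one more initial condition than the original, which is taken here from the original sequence.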

Solution example for small orders

The roots of the characteristic polynomial play a crucial role in finding and understanding the sequences satisfying the recurrence. If there are [math]\displaystyle{ d }[/math] distinct roots [math]\displaystyle{ r_1, r_2, \ldots, r_d, }[/math] then each solution to the recurrence takes the form [math]\displaystyle{ a_n = k_1 r_1^n + k_2 r_2^n + \cdots + k_d r_d^n, }[/math] where the coefficients [math]\displaystyle{ k_i }[/math] are determined in order to fit the initial conditions of the recurrence. When the same roots occur multiple times, the terms in this formula corresponding to the second and later occurrences of the same root are multiplied by increasing powers of [math]\displaystyle{ n }[/math]. For instance, if the characteristic polynomial can be factored as [math]\displaystyle{ (x-r)^3 }[/math], with the same root [math]\displaystyle{ r }[/math] occurring three times, then the solution would take the form [math]\displaystyle{ a_n = k_1 r^n + k_2 n r^n + k_3 n^2 r^n. }[/math][3]

Order 1

For order 1, the recurrence [math]\displaystyle{ a_{n}=r a_{n-1} }[/math] has the solution [math]\displaystyle{ a_n = r^n }[/math] with [math]\displaystyle{ a_0 = 1 }[/math] and the most general solution is [math]\displaystyle{ a_n = k r^n }[/math] with [math]\displaystyle{ a_0 = k }[/math]. The characteristic polynomial equated to zero (the characteristic equation) is simply [math]\displaystyle{ t - r = 0 }[/math].

Order 2

Solutions to such recurrence relations of higher order are found by systematic means, often using the fact that [math]\displaystyle{ a_n = r^n }[/math] is a solution for the recurrence exactly when [math]\displaystyle{ t = r }[/math] is a root of the characteristic polynomial. This can be approached directly or using generating functions (formal power series) or matrices.

Consider, for example, a recurrence relation of the form [math]\displaystyle{ a_{n} = Aa_{n-1}+Ba_{n-2}. }[/math]

When does it have a solution of the same general form as [math]\displaystyle{ a_n = r^n }[/math]? Substituting this guess (ansatz) in the recurrence relation, we find that [math]\displaystyle{ r^{n}=Ar^{n-1}+Br^{n-2} }[/math] must be true for all [math]\displaystyle{ n \gt 1 }[/math].

Dividing through by [math]\displaystyle{ r^{n-2} }[/math], we get that all these equations reduce to the same thing:

[math]\displaystyle{ \begin{align} r^2 &= Ar + B, \\ r^2 - Ar - B &= 0, \end{align} }[/math]

which is the characteristic equation of the recurrence relation. Solve for [math]\displaystyle{ r }[/math] to obtain the two roots [math]\displaystyle{ \lambda_1 }[/math], [math]\displaystyle{ \lambda_2 }[/math]: these roots are known as the characteristic roots or eigenvalues of the characteristic equation. Different solutions are obtained depending on the nature of the roots: If these roots are distinct, we have the general solution

[math]\displaystyle{ a_n = C\lambda_1^n+D\lambda_2^n }[/math]

while if they are identical (when [math]\displaystyle{ A^2 + 4 B = 0 }[/math]), we have

[math]\displaystyle{ a_n = C\lambda^n + D n\lambda^n }[/math]

This is the most general solution; the two constants [math]\displaystyle{ C }[/math] and [math]\displaystyle{ D }[/math] can be chosen based on two given initial conditions [math]\displaystyle{ a_0 }[/math] and [math]\displaystyle{ a_1 }[/math] to produce a specific solution.
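As an illustration of fitting C and D, the following sketch handles the case of distinct real roots only and applies it to the Fibonacci numbers (A = B = 1, a_0 = 0, a_1 = 1). The function name `closed_form_distinct` is an assumption made for this example.

```python
import math

def closed_form_distinct(A, B, a0, a1):
    """Return n -> a_n for a_n = A*a_{n-1} + B*a_{n-2} with distinct real roots."""
    disc = A * A + 4 * B
    if disc <= 0:
        raise ValueError("this sketch only covers distinct real roots")
    lam1 = (A + math.sqrt(disc)) / 2
    lam2 = (A - math.sqrt(disc)) / 2
    # Fit the constants: C + D = a_0 and C*lam1 + D*lam2 = a_1.
    D = (a0 * lam1 - a1) / (lam1 - lam2)
    C = a0 - D
    return lambda n: C * lam1 ** n + D * lam2 ** n

# Fibonacci numbers: A = B = 1 with a_0 = 0 and a_1 = 1.
fib = closed_form_distinct(1, 1, 0, 1)
values = [round(fib(n)) for n in range(11)]
```

The results are rounded because the roots are irrational and the closed form is evaluated in floating point.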

In the case of complex eigenvalues (which also gives rise to complex values for the solution parameters [math]\displaystyle{ C }[/math] and [math]\displaystyle{ D }[/math]), the use of complex numbers can be eliminated by rewriting the solution in trigonometric form. In this case we can write the eigenvalues as [math]\displaystyle{ \lambda_1, \lambda_2 = \alpha \pm \beta i. }[/math] Then it can be shown that

[math]\displaystyle{ a_n = C\lambda_1^n + D\lambda_2^n }[/math]

can be rewritten as[4]

[math]\displaystyle{ a_n = 2 M^n \left( E \cos(\theta n) + F \sin(\theta n)\right) = 2 G M^n \cos(\theta n - \delta), }[/math]

where

[math]\displaystyle{ \begin{array}{lcl} M = \sqrt{\alpha^2+\beta^2} & \cos (\theta) =\tfrac{\alpha}{M} & \sin( \theta) = \tfrac{\beta}{M} \\ C,D = E \mp F i & & \\ G = \sqrt{E^2+F^2} & \cos (\delta ) = \tfrac{E}{G} & \sin (\delta )= \tfrac{F}{G} \end{array} }[/math]

Here [math]\displaystyle{ E }[/math] and [math]\displaystyle{ F }[/math] (or equivalently, [math]\displaystyle{ G }[/math] and [math]\displaystyle{ \delta }[/math]) are real constants which depend on the initial conditions. Using [math]\displaystyle{ \lambda_1+\lambda_2=2 \alpha = A, }[/math] [math]\displaystyle{ \lambda_1 \cdot \lambda_2=\alpha^2+\beta^2=-B, }[/math]

one may simplify the solution given above as

[math]\displaystyle{ a_n = (-B)^{\frac{n}{2}} \left( E \cos(\theta n) + F \sin(\theta n)\right), }[/math]

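where the factor 2 has been absorbed into the constants [math]\displaystyle{ E }[/math] and [math]\displaystyle{ F }[/math], [math]\displaystyle{ a_1 }[/math] and [math]\displaystyle{ a_2 }[/math] are the initial conditions, and

[math]\displaystyle{ \begin{align} E &= \frac{-A a_1 + a_2}{B} \\ F &= \frac{A^2 a_1 - A a_2 +2 a_1 B}{B \sqrt{-(A^2+4B)}} \\ \theta &=\arccos \left (\frac{A}{2 \sqrt{-B}} \right ) \end{align} }[/math]

Because the roots are complex, [math]\displaystyle{ A^2+4B \lt 0 }[/math], so the square root in the expression for [math]\displaystyle{ F }[/math] is real.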

In this way there is no need to solve for [math]\displaystyle{ \lambda_1 }[/math] and [math]\displaystyle{ \lambda_2 }[/math].
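The trigonometric form can be compared against direct iteration. This is a sketch under stated assumptions: the constants E and F are fitted here to a_0 and a_1 (an indexing choice made for this example), and the function name `trig_solution` is hypothetical.

```python
import math

def trig_solution(A, B, a0, a1):
    """Closed form for a_n = A a_{n-1} + B a_{n-2} with complex roots.

    Assumes A^2 + 4B < 0, which forces B < 0. E and F are fitted to
    a_0 and a_1 rather than solved from the eigenvalues.
    """
    if A * A + 4 * B >= 0:
        raise ValueError("roots are real")
    M = math.sqrt(-B)                       # modulus of both roots
    theta = math.acos(A / (2 * M))          # argument of the root with beta > 0
    E = a0                                  # from the n = 0 equation
    F = (a1 - M * E * math.cos(theta)) / (M * math.sin(theta))   # from n = 1
    return lambda n: M ** n * (E * math.cos(theta * n) + F * math.sin(theta * n))

# a_n = 2 a_{n-1} - 2 a_{n-2}, whose roots are 1 + i and 1 - i.
a = trig_solution(2, -2, 0, 1)
```

Only real arithmetic is used, so the eigenvalues themselves never have to be computed.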

In all cases—real distinct eigenvalues, real duplicated eigenvalues, and complex conjugate eigenvalues—the equation is stable (that is, the variable [math]\displaystyle{ a }[/math] converges to a fixed value [specifically, zero]) if and only if both eigenvalues are smaller than one in absolute value. In this second-order case, this condition on the eigenvalues can be shown[5] to be equivalent to [math]\displaystyle{ |A| \lt 1 - B \lt 2 }[/math], which is equivalent to [math]\displaystyle{ |B| \lt 1 }[/math] and [math]\displaystyle{ |A| \lt 1 - B }[/math].
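The equivalence of the two stability tests can be spot-checked numerically. The sketch below compares the eigenvalue condition with the coefficient condition; both function names are assumptions made for this example, and points exactly on the stability boundary may be misclassified by floating-point rounding.

```python
import cmath

def stable_by_roots(A, B):
    """True if both roots of r^2 - A r - B = 0 lie strictly inside the unit circle."""
    s = cmath.sqrt(A * A + 4 * B)
    return abs((A + s) / 2) < 1 and abs((A - s) / 2) < 1

def stable_by_coefficients(A, B):
    """The coefficient test |A| < 1 - B < 2."""
    return abs(A) < 1 - B < 2

# Two illustrative parameter pairs: the first is stable, the second is not.
checks = [(stable_by_roots(A, B), stable_by_coefficients(A, B))
          for A, B in [(0.5, 0.3), (1.0, 0.5)]]
```

`cmath.sqrt` is used so that real and complex root pairs are handled by the same expression.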

General solution

Characteristic polynomial and roots

Solving the homogeneous equation

[math]\displaystyle{ x_t= a_1x_{t-1} + \cdots + a_nx_{t-n} }[/math]

involves first solving its characteristic equation

[math]\displaystyle{ \lambda^n = a_1\lambda^{n-1} + \cdots + a_{n-2}\lambda^2+a_{n-1} \lambda + a_n }[/math]

for its characteristic roots [math]\displaystyle{ \lambda_1, \ldots, \lambda_n }[/math]. These roots can be solved for algebraically if n ≤ 4, but not necessarily otherwise. If the solution is to be used numerically, all the roots of this characteristic equation can be found by numerical methods. However, for use in a theoretical context it may be that the only information required about the roots is whether any of them are greater than or equal to 1 in absolute value.

It may be that all the roots are real or instead there may be some that are complex numbers. In the latter case, all the complex roots come in complex conjugate pairs.

Solution with distinct characteristic roots

If all the characteristic roots are distinct, the solution of the homogeneous linear recurrence

[math]\displaystyle{ x_t= a_1x_{t-1} + \cdots + a_nx_{t-n} }[/math]

can be written in terms of the characteristic roots as

[math]\displaystyle{ x_t=c_1\lambda_1^t +\cdots + c_n\lambda_n^t }[/math]

where the coefficients [math]\displaystyle{ c_i }[/math] can be found by invoking the initial conditions. Specifically, for each time period for which an iterate value is known, this value and its corresponding value of t can be substituted into the solution equation to obtain a linear equation in the n as-yet-unknown parameters; n such equations, one for each initial condition, can be solved simultaneously for the n parameter values. If all characteristic roots are real, then all the coefficient values [math]\displaystyle{ c_i }[/math] will also be real; but with non-real complex roots, in general some of these coefficients will also be non-real.
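The simultaneous equations for the coefficients form a Vandermonde system in the roots. As a minimal sketch, assuming the roots are already known and rational, the following solves that system exactly for the third-order example x_t = 6x_{t-1} − 11x_{t-2} + 6x_{t-3}, whose characteristic roots are 1, 2 and 3. The helpers `solve_linear` and `fit_coefficients` and the initial values are illustrative choices.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def fit_coefficients(roots, initial):
    """Find c_1..c_n with x_t = sum of c_i * root_i**t for t = 0..n-1."""
    vandermonde = [[Fraction(r) ** t for r in roots] for t in range(len(roots))]
    return solve_linear(vandermonde, initial)

roots = [1, 2, 3]
c = fit_coefficients(roots, [0, 1, 5])      # x_0 = 0, x_1 = 1, x_2 = 5

def x(t):
    """Closed-form value x_t = c_1 * 1^t + c_2 * 2^t + c_3 * 3^t."""
    return sum(c_i * Fraction(r) ** t for c_i, r in zip(c, roots))
```

For these initial values the fitted solution is x_t = 3^t − 2^t. With complex or irrational roots the same system would be solved in floating-point or complex arithmetic instead.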

Converting complex solution to trigonometric form

If there are complex roots, they come in conjugate pairs and so do the complex terms in the solution equation. If two of these complex terms are [math]\displaystyle{ c_j\lambda_j^t }[/math] and [math]\displaystyle{ c_{j+1}\lambda_{j+1}^t }[/math], the roots can be written as

[math]\displaystyle{ \lambda_j, \lambda_{j+1}=\alpha \pm \beta i =M\left(\frac{\alpha}{M} \pm \frac{\beta}{M}i\right) }[/math]

where i is the imaginary unit and M is the modulus of the roots:

[math]\displaystyle{ M = \sqrt{\alpha^2+\beta^2}. }[/math]

Then the two complex terms in the solution equation can be written as

[math]\displaystyle{ \begin{align} c_j\lambda_j^t + c_{j+1}\lambda_{j+1}^t & = M^t\left(c_j\left(\frac{\alpha}{M} + \frac{\beta}{M}i\right)^t + c_{j+1}\left(\frac{\alpha}{M} - \frac{\beta}{M}i\right)^t\right) \\[6pt] & = M^t\left(c_j\left(\cos\theta + i\sin\theta\right)^t + c_{j+1}\left(\cos \theta - i\sin\theta\right)^t\right) \\[6pt] & = M^t\bigl(c_j\left(\cos\theta t + i\sin \theta t\right) + c_{j+1}\left(\cos\theta t - i\sin\theta t\right) \bigr) \end{align} }[/math]

where θ is the angle whose cosine is α/M and whose sine is β/M; the last equality here made use of de Moivre's formula.

Now the process of finding the coefficients [math]\displaystyle{ c_j }[/math] and [math]\displaystyle{ c_{j+1} }[/math] guarantees that they are also complex conjugates, which can be written as γ ± δi. Using this in the last equation gives this expression for the two complex terms in the solution equation:

[math]\displaystyle{ 2M^t\left(\gamma \cos\theta t - \delta \sin\theta t\right) }[/math]

which can also be written as

[math]\displaystyle{ 2\sqrt{\gamma^2+\delta^2}M^t \cos(\theta t + \psi) }[/math]

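where ψ is the angle whose cosine is [math]\displaystyle{ \tfrac{\gamma}{\sqrt{\gamma^2+\delta^2}} }[/math] and whose sine is [math]\displaystyle{ \tfrac{\delta}{\sqrt{\gamma^2+\delta^2}} }[/math].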

Cyclicity

Depending on the initial conditions, even with all roots real the iterates can experience a transitory tendency to go above and below the steady state value. But true cyclicity involves a permanent tendency to fluctuate, and this occurs if there is at least one pair of complex conjugate characteristic roots. This can be seen in the trigonometric form of their contribution to the solution equation, involving cos θt and sin θt.

Solution with duplicate characteristic roots

In the second-order case, if the two roots are identical (λ1 = λ2), they can both be denoted as λ and a solution may be of the form

[math]\displaystyle{ x_t = c_1 \lambda^t + c_2 t \lambda^t. }[/math]
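As a small worked case, the recurrence x_t = 4x_{t-1} − 4x_{t-2} has the double root λ = 2. The sketch below fits c_1 and c_2 to two initial values; the function name `repeated_root_solution` and the initial values are assumptions made for this example.

```python
def repeated_root_solution(lam, x0, x1):
    """x_t = (c1 + c2 t) lam^t fitted to x_0 and x_1 (requires lam != 0)."""
    c1 = x0                       # from t = 0
    c2 = x1 / lam - c1            # from t = 1: (c1 + c2) * lam = x1
    return lambda t: (c1 + c2 * t) * lam ** t

# x_t = 4 x_{t-1} - 4 x_{t-2} has the double root lam = 2.
x = repeated_root_solution(2, 1, 6)
```

With these initial values the solution is x_t = (1 + 2t) 2^t, which can be checked by substituting into the recurrence.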

Solution by conversion to matrix form

An alternative solution method involves converting the nth order difference equation to a first-order matrix difference equation. This is accomplished by writing [math]\displaystyle{ w_{1,t} = y_t }[/math], [math]\displaystyle{ w_{2,t} = y_{t-1} = w_{1,t-1} }[/math], [math]\displaystyle{ w_{3,t} = y_{t-2} = w_{2,t-1} }[/math], and so on. Then the original single nth-order equation

[math]\displaystyle{ y_t = a_1y_{t-1} + a_2y_{t-2} + \cdots + a_ny_{t-n} + b }[/math]

can be replaced by the following n first-order equations:

[math]\displaystyle{ \begin{align} w_{1, t} & = a_1w_{1, t-1} + a_2w_{2, t-1}+\cdots + a_nw_{n, t-1} + b \\ w_{2, t} & = w_{1, t-1} \\ & \,\,\,\vdots \\ w_{n,t} & =w_{n-1, t-1}. \end{align} }[/math]

Defining the vector [math]\displaystyle{ \mathbf{w}_i }[/math] as

[math]\displaystyle{ \mathbf{w}_i = \begin{bmatrix}w_{1,i} \\ w_{2,i} \\ \vdots \\ w_{n,i} \end{bmatrix} }[/math]

this can be put in matrix form as

[math]\displaystyle{ \mathbf{w}_t = \mathbf{A}\mathbf{w}_{t-1}+\mathbf{b} }[/math]

Here A is an n × n matrix in which the first row contains [math]\displaystyle{ a_1, \ldots, a_n }[/math] and every other row has a single 1 (row i has it in column i − 1) with all other elements being 0, and b is a column vector with first element b and with the rest of its elements being 0.

This matrix equation can be solved using the methods in the article Matrix difference equation. In the homogeneous case [math]\displaystyle{ y_i }[/math] is a para-permanent of a lower triangular matrix.[6]
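The matrix form can be illustrated by building A and b explicitly and applying one step at a time. This is a sketch only, written with plain lists rather than a linear algebra library; the function name `companion_step` is hypothetical.

```python
def companion_step(a, b, w):
    """One step of w_t = A w_{t-1} + b for the matrix A described above.

    `w` is [y_{t-1}, y_{t-2}, ..., y_{t-n}]; the result is [y_t, ..., y_{t-n+1}].
    """
    n = len(a)
    # A has a_1..a_n in the first row and a shifted identity below it.
    A = [list(a)] + [[1 if j == i else 0 for j in range(n)] for i in range(n - 1)]
    b_vec = [b] + [0] * (n - 1)
    return [sum(A[i][j] * w[j] for j in range(n)) + b_vec[i] for i in range(n)]

w = [1, 0]              # [y_1, y_0] for the Fibonacci recurrence y_t = y_{t-1} + y_{t-2}
for _ in range(9):
    w = companion_step([1, 1], 0, w)
# After nine steps w holds [y_10, y_9].
```

Rebuilding A on every call keeps the sketch short; an actual implementation would construct it once or use repeated squaring of A to reach large t quickly.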

Solution using generating functions

The recurrence

[math]\displaystyle{ y_t = a_1y_{t-1} + \cdots + a_ny_{t-n} + b, }[/math]

can be solved using the theory of generating functions. First, we write [math]\displaystyle{ Y(x) = \sum_{t \ge 0} y_t x^t }[/math]. The recurrence is then equivalent to the following generating function equation:

[math]\displaystyle{ Y(x) = a_1xY(x) + a_2x^2Y(x) + \cdots + a_nx^nY(x) + \frac{b}{1-x} + p(x) }[/math]

where [math]\displaystyle{ p(x) }[/math] is a polynomial of degree at most [math]\displaystyle{ n-1 }[/math] correcting the initial terms. From this equation we can solve to get

[math]\displaystyle{ Y(x) = \left(\frac{b}{1-x} + p(x)\right) \cdot \frac{1}{1 - a_1 x - a_2 x^2 - \cdots - a_n x^n}. }[/math]

In other words, not worrying about the exact coefficients, [math]\displaystyle{ Y(x) }[/math] can be expressed as a rational function [math]\displaystyle{ Y(x) = \frac{f(x)}{g(x)}. }[/math]

The closed form can then be derived via partial fraction decomposition. Specifically, if the generating function is written as [math]\displaystyle{ \frac{f(x)}{g(x)} = \sum_i \frac{f_i(x)}{(x - r_i)^{m_i}} }[/math]

then the polynomial [math]\displaystyle{ p(x) }[/math] determines the initial set of corrections [math]\displaystyle{ z(n) }[/math], the denominator [math]\displaystyle{ (x - r_i)^m }[/math] determines the exponential term [math]\displaystyle{ r_i^n }[/math], and the degree [math]\displaystyle{ m }[/math] together with the numerator [math]\displaystyle{ f_i(x) }[/math] determine the polynomial coefficient [math]\displaystyle{ k_i(n) }[/math].
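The rational generating function can also be expanded directly as a power series, which gives a check on the construction of Y(x). The sketch below does this for the homogeneous Fibonacci case (b = 0), where p(x) = x and the denominator is 1 − x − x²; the function name `series_coefficients` is an assumption made for this example.

```python
def series_coefficients(numerator, denominator, count):
    """First `count` power-series coefficients of numerator(x) / denominator(x).

    Polynomials are coefficient lists in increasing powers of x, and
    denominator[0] must be nonzero.
    """
    coeffs = []
    for t in range(count):
        f_t = numerator[t] if t < len(numerator) else 0
        # Match the x^t coefficient in Y(x) * denominator(x) = numerator(x).
        acc = sum(denominator[k] * coeffs[t - k]
                  for k in range(1, min(t, len(denominator) - 1) + 1))
        coeffs.append((f_t - acc) / denominator[0])
    return coeffs

# Fibonacci: y_t = y_{t-1} + y_{t-2}, y_0 = 0, y_1 = 1, b = 0.
# Here p(x) = x, so Y(x) = x / (1 - x - x^2).
fib = series_coefficients([0, 1], [1, -1, -1], 10)
```

The expansion reproduces the sequence generated by the recurrence, which is the sense in which the recurrence and the generating function equation are equivalent.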

Relation to solution to differential equations

The method for solving linear differential equations is similar to the method above—the "intelligent guess" (ansatz) for linear differential equations with constant coefficients is [math]\displaystyle{ e^{\lambda x} }[/math] where [math]\displaystyle{ \lambda }[/math] is a complex number that is determined by substituting the guess into the differential equation.

This is not a coincidence. Considering the Taylor series of the solution to a linear differential equation:

[math]\displaystyle{ \sum_{n=0}^\infin \frac{f^{(n)}(a)}{n!} (x-a)^n }[/math]

it can be seen that the coefficients of the series are given by the [math]\displaystyle{ n }[/math]-th derivative of [math]\displaystyle{ f(x) }[/math] evaluated at the point [math]\displaystyle{ a }[/math]. The differential equation provides a linear difference equation relating these coefficients.

This equivalence can be used to quickly solve for the recurrence relationship for the coefficients in the power series solution of a linear differential equation.

The rule of thumb (for equations in which the polynomial multiplying the first term is non-zero at zero) is that:

[math]\displaystyle{ y^{[k]} \to f[n+k] }[/math] and more generally [math]\displaystyle{ x^m*y^{[k]} \to n(n-1)...(n-m+1)f[n+k-m] }[/math]

Example: The recurrence relationship for the Taylor series coefficients of the equation:

[math]\displaystyle{ (x^2 + 3x -4)y^{[3]} -(3x+1)y^{[2]} + 2y = 0 }[/math]

is given by

[math]\displaystyle{ n(n-1)f[n+1] + 3nf[n+2] -4f[n+3] -3nf[n+1] -f[n+2]+ 2f[n] = 0 }[/math]

or

[math]\displaystyle{ -4f[n+3] +(3n-1)f[n+2] + n(n-4)f[n+1] +2f[n] = 0. }[/math]

This example shows how problems generally solved using the power series solution method taught in normal differential equation classes can be solved in a much easier way.

Example: The differential equation

[math]\displaystyle{ ay'' + by' +cy = 0 }[/math]

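has solutions of the form

[math]\displaystyle{ y=e^{\lambda x}, }[/math]

where [math]\displaystyle{ \lambda }[/math] is a root of [math]\displaystyle{ a\lambda^2 + b\lambda + c = 0 }[/math].

The conversion of the differential equation to a difference equation of the Taylor coefficients is

[math]\displaystyle{ af[n + 2] + bf[n + 1] + cf[n] = 0. }[/math]

It is easy to see that the [math]\displaystyle{ n }[/math]-th derivative of [math]\displaystyle{ e^{\lambda x} }[/math] evaluated at [math]\displaystyle{ 0 }[/math] is [math]\displaystyle{ \lambda^n }[/math], which is exactly the form of solution sought for the difference equation.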
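The correspondence can be checked numerically. The following sketch, with the hypothetical helper `taylor_from_recurrence`, generates the derivatives at 0 from the difference equation and sums the Taylor series; the example equation y″ − 3y′ + 2y = 0 with y(0) = 1 and y′(0) = 2 is chosen because its solution e^{2x} is known.

```python
import math

def taylor_from_recurrence(a, b, c, f0, f1, x, terms=40):
    """Sum the Taylor series at 0 of the solution of a y'' + b y' + c y = 0.

    f[n] stands for the n-th derivative of y at 0, and the difference
    equation a f[n+2] + b f[n+1] + c f[n] = 0 generates every f[n] from
    y(0) = f0 and y'(0) = f1.
    """
    f = [f0, f1]
    while len(f) < terms:
        f.append(-(b * f[-1] + c * f[-2]) / a)
    return sum(f_n * x ** n / math.factorial(n) for n, f_n in enumerate(f))

# y'' - 3 y' + 2 y = 0 with y(0) = 1, y'(0) = 2 has the solution y = exp(2x).
approx = taylor_from_recurrence(1, -3, 2, 1, 2, 0.5)
```

Here the recurrence produces f[n] = 2^n, so the partial sum at x = 0.5 approximates e.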

Solving with z-transforms

Certain difference equations, in particular linear constant-coefficient difference equations, can be solved using z-transforms. The z-transform is a discrete analogue of integral transforms such as the Laplace transform, and it leads to more convenient algebraic manipulations and more straightforward solutions. There are cases in which obtaining a direct solution would be all but impossible, yet solving the problem via a thoughtfully chosen transform is straightforward.

Stability

In the solution equation

[math]\displaystyle{ x_t = c_1\lambda_1^t +\cdots + c_n\lambda_n^t, }[/math]

a term with a real characteristic root converges to 0 as t grows indefinitely large if the absolute value of the root is less than 1. If the absolute value equals 1, the term stays constant as t grows if the root is +1 but fluctuates between two values if the root is −1. If the absolute value of the root is greater than 1, the term grows without bound in magnitude over time. A pair of terms with complex conjugate characteristic roots converges to 0 with damped fluctuations if the modulus M of the roots is less than 1; if the modulus equals 1, constant-amplitude fluctuations in the combined terms persist; and if the modulus is greater than 1, the combined terms show fluctuations of ever-increasing magnitude.

Thus the evolving variable x will converge to 0 if all of the characteristic roots have magnitude less than 1.

If the largest root has absolute value 1, neither convergence to 0 nor divergence to infinity will occur. If all roots with magnitude 1 are real and positive, x will converge to the sum of their constant terms ci; unlike in the stable case, this converged value depends on the initial conditions; different starting points lead to different points in the long run. If any root is −1, its term will contribute permanent fluctuations between two values. If any of the unit-magnitude roots are complex then constant-amplitude fluctuations of x will persist.

Finally, if any characteristic root has magnitude greater than 1, then x will diverge to infinity as time goes to infinity, or will fluctuate between increasingly large positive and negative values.

A theorem of Issai Schur states that all roots have magnitude less than 1 (the stable case) if and only if a particular string of determinants are all positive.[2]

If a non-homogeneous linear difference equation has been converted to homogeneous form which has been analyzed as above, then the stability and cyclicality properties of the original non-homogeneous equation will be the same as those of the derived homogeneous form, with convergence in the stable case being to the steady-state value y* instead of to 0.

References

  1. Chiang, Alpha (1984). Fundamental Methods of Mathematical Economics (Third ed.). New York: McGraw-Hill. ISBN 0-07-010813-7. https://archive.org/details/fundamentalmetho0000chia_h4v2. 
  2. Baumol, William (1970). Economic Dynamics (Third ed.). New York: Macmillan. ISBN 0-02-306660-1. https://archive.org/details/economicdynamics0000baum_c7i2. 
  3. Greene, Daniel H.; Knuth, Donald E. (1982), "2.1.1 Constant coefficients – A) Homogeneous equations", Mathematics for the Analysis of Algorithms (2nd ed.), Birkhäuser, p. 17 .
  4. Chiang, Alpha C., Fundamental Methods of Mathematical Economics, third edition, McGraw-Hill, 1984.
  5. Papanicolaou, Vassilis, "On the asymptotic stability of a class of linear difference equations," Mathematics Magazine 69(1), February 1996, 34–43.
  6. Zatorsky, Roman; Goy, Taras (2016). "Parapermanent of triangular matrices and some general theorems on number sequences". J. Int. Seq. 19: 16.2.2. https://cs.uwaterloo.ca/journals/JIS/VOL19/Goy/goy2.html.