Method of characteristics
In mathematics, the method of characteristics is a technique for solving partial differential equations. Typically, it applies to first-order equations, although more generally the method of characteristics is valid for any hyperbolic partial differential equation. The method is to reduce a partial differential equation to a family of ordinary differential equations along which the solution can be integrated from some initial data given on a suitable hypersurface.
Characteristics of first-order partial differential equations
For a first-order PDE (partial differential equation), the method of characteristics finds curves (called characteristic curves or just characteristics) along which the PDE becomes an ordinary differential equation (ODE).^{[1]} Once the ODE is found, it can be solved along the characteristic curves and transformed into a solution for the original PDE.
For the sake of simplicity, we confine our attention to the case of a function of two independent variables x and y for the moment. Consider a quasilinear PDE of the form

[math]\displaystyle{ a(x,y,z) \frac{\partial z}{\partial x}+b(x,y,z) \frac{\partial z}{\partial y}=c(x,y,z). }[/math]
(1)
Suppose that a solution z is known, and consider the surface graph z = z(x,y) in R^{3}. A normal vector to this surface is given by
 [math]\displaystyle{ \left(\frac{\partial z}{\partial x}(x,y),\frac{\partial z}{\partial y}(x,y),-1\right).\, }[/math]
As a result,^{[2]} equation (1) is equivalent to the geometrical statement that the vector field
 [math]\displaystyle{ (a(x,y,z),b(x,y,z),c(x,y,z))\, }[/math]
is tangent to the surface z = z(x,y) at every point, for the dot product of this vector field with the above normal vector is zero. In other words, the graph of the solution must be a union of integral curves of this vector field. These integral curves are called the characteristic curves of the original partial differential equation and are given by the Lagrange–Charpit equations^{[3]}
 [math]\displaystyle{ \begin{array}{rcl} \frac{dx}{dt}&=&a(x,y,z),\\ \frac{dy}{dt}&=&b(x,y,z),\\ \frac{dz}{dt}&=&c(x,y,z). \end{array} }[/math]
A parametrization-invariant form of the Lagrange–Charpit equations^{[3]} is:
 [math]\displaystyle{ \frac{dx}{a(x,y,z)} = \frac{dy}{b(x,y,z)} = \frac{dz}{c(x,y,z)} . }[/math]
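As a rough numerical illustration, the characteristic system can be integrated with a classical Runge–Kutta step. The sketch below assumes an illustrative PDE that does not appear in the text, namely ∂z/∂x + ∂z/∂y = z with initial data z(x, 0) = sin x, whose closed-form solution z = sin(x − y) e^{y} is used only as a check. The function names (`rk4_step`, `trace_characteristic`, and so on) are hypothetical.

```python
import math

def rk4_step(f, state, h):
    """Advance the autonomous system state' = f(state) by one classical RK4 step."""
    k1 = f(state)
    k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = f([s + h * k for s, k in zip(state, k3)])
    return [s + h / 6.0 * (p + 2 * q + 2 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def characteristic_rhs(state):
    """(a, b, c) for the assumed example PDE z_x + z_y = z."""
    x, y, z = state
    return [1.0, 1.0, z]

def trace_characteristic(x0, t_end, steps=200):
    """Integrate dx/dt = a, dy/dt = b, dz/dt = c from the point (x0, 0, sin(x0))."""
    state = [x0, 0.0, math.sin(x0)]
    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, h)
    return state

def exact_solution(x, y):
    """Closed-form solution of the assumed example, used only for comparison."""
    return math.sin(x - y) * math.exp(y)

x_end, y_end, z_end = trace_characteristic(x0=0.7, t_end=1.5)
print(x_end, y_end, z_end, exact_solution(x_end, y_end))
```

The value of z carried along the curve should agree with the closed-form solution evaluated at the end point, up to the integration error.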
Linear and quasilinear cases
Consider now a PDE of the form
 [math]\displaystyle{ \sum_{i=1}^n a_i(x_1,\dots,x_n,u) \frac{\partial u}{\partial x_i}=c(x_1,\dots,x_n,u). }[/math]
For this PDE to be linear, the coefficients a_{i} may be functions of the spatial variables only, and independent of u. For it to be quasilinear,^{[4]} a_{i} may also depend on the value of the function, but not on any derivatives. The distinction between these two cases is inessential for the discussion here.
For a linear or quasilinear PDE, the characteristic curves are given parametrically by
 [math]\displaystyle{ (x_1,\dots,x_n,u) = (x_1(s),\dots,x_n(s),u(s)) }[/math]
 [math]\displaystyle{ u(\mathbf{X}(s)) = U(s) }[/math]
such that the following system of ODEs is satisfied

[math]\displaystyle{ \frac{dx_i}{ds} = a_i(x_1,\dots,x_n,u) }[/math]
(2)

[math]\displaystyle{ \frac{du}{ds} = c(x_1,\dots,x_n,u). }[/math]
(3)
Equations (2) and (3) give the characteristics of the PDE.
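The same idea carries over to n independent variables. The following is a minimal sketch, not a reference implementation: `integrate_characteristic` is a hypothetical helper that integrates dx_i/ds = a_i and du/ds = c with RK4, and the test problem (u ∂u/∂x + ∂u/∂t = −u with u(x, 0) = x) is an assumed quasilinear example chosen because its characteristics have a closed form, x = x_0(2 − e^{−s}), t = s, u = x_0 e^{−s}.

```python
import math

def integrate_characteristic(a, c, x0, u0, s_end, steps=400):
    """RK4 integration of dx_i/ds = a_i(x, u) and du/ds = c(x, u)."""
    n = len(x0)

    def rhs(state):
        x, u = state[:n], state[n]
        return list(a(x, u)) + [c(x, u)]

    state = list(x0) + [u0]
    h = s_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([y + 0.5 * h * k for y, k in zip(state, k1)])
        k3 = rhs([y + 0.5 * h * k for y, k in zip(state, k2)])
        k4 = rhs([y + h * k for y, k in zip(state, k3)])
        state = [y + h / 6.0 * (p + 2 * q + 2 * r + w)
                 for y, p, q, r, w in zip(state, k1, k2, k3, k4)]
    return state[:n], state[n]

# Assumed example: variables (x, t), coefficients a = (u, 1), right-hand side c = -u.
def a_example(x, u):
    return [u, 1.0]

def c_example(x, u):
    return -u

x_start = 1.5
(x_end, t_end), u_end = integrate_characteristic(
    a_example, c_example, [x_start, 0.0], x_start, s_end=1.0)
print(x_end, t_end, u_end)
```

Because the coefficient a_1 = u depends on the unknown, the curve in the (x, t) plane cannot be traced without integrating u at the same time, which is why the system is solved as a whole.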
Proof for quasilinear case
In the quasilinear case, the use of the method of characteristics is justified by Grönwall's inequality. The above equation may be written as [math]\displaystyle{ \mathbf{a}(\mathbf{x},u) \cdot \nabla u(\mathbf{x}) = c(\mathbf{x},u) }[/math]
We must distinguish between the solutions to the ODE and the solutions to the PDE, which we do not know are equal a priori. Letting capital letters be the solutions to the ODE we find [math]\displaystyle{ \mathbf{X}'(s) = \mathbf{a}(\mathbf{X}(s),U(s)) }[/math] [math]\displaystyle{ U'(s) = c(\mathbf{X}(s), U(s)) }[/math]
Examining [math]\displaystyle{ \Delta(s) = |u(\mathbf{X}(s)) - U(s)|^2 }[/math], we find, upon differentiating, that [math]\displaystyle{ \Delta'(s) = 2\big(u(\mathbf{X}(s)) - U(s)\big) \Big(\mathbf{X}'(s)\cdot \nabla u(\mathbf{X}(s)) - U'(s)\Big) }[/math] which is the same as [math]\displaystyle{ \Delta'(s) = 2\big(u(\mathbf{X}(s)) - U(s)\big) \Big(\mathbf{a}(\mathbf{X}(s),U(s))\cdot \nabla u(\mathbf{X}(s)) - c(\mathbf{X}(s),U(s))\Big) }[/math]
We cannot conclude the above is 0 as we would like, since the PDE only guarantees us that this relationship is satisfied for [math]\displaystyle{ u(\mathbf{x}) }[/math], [math]\displaystyle{ \mathbf{a}(\mathbf{x},u) \cdot \nabla u(\mathbf{x}) = c(\mathbf{x},u) }[/math], and we do not yet know that [math]\displaystyle{ U(s) = u(\mathbf{X}(s)) }[/math].
However, we can see that [math]\displaystyle{ \Delta'(s) = 2\big(u(\mathbf{X}(s)) - U(s)\big) \Big(\mathbf{a}(\mathbf{X}(s),U(s))\cdot \nabla u(\mathbf{X}(s)) - c(\mathbf{X}(s),U(s)) - \big(\mathbf{a}(\mathbf{X}(s),u(\mathbf{X}(s))) \cdot \nabla u(\mathbf{X}(s)) - c(\mathbf{X}(s),u(\mathbf{X}(s)))\big)\Big) }[/math] since by the PDE, the last term is 0. This equals [math]\displaystyle{ \Delta'(s) = 2\big(u(\mathbf{X}(s)) - U(s)\big) \Big(\big(\mathbf{a}(\mathbf{X}(s),U(s)) - \mathbf{a}(\mathbf{X}(s),u(\mathbf{X}(s)))\big)\cdot \nabla u(\mathbf{X}(s)) - \big(c(\mathbf{X}(s),U(s)) - c(\mathbf{X}(s),u(\mathbf{X}(s)))\big)\Big) }[/math]
By the triangle inequality and the Cauchy–Schwarz inequality, we have [math]\displaystyle{ \Delta'(s) \leq 2\big|u(\mathbf{X}(s)) - U(s)\big| \Big(\big\|\mathbf{a}(\mathbf{X}(s),U(s)) - \mathbf{a}(\mathbf{X}(s),u(\mathbf{X}(s)))\big\| \ \|\nabla u(\mathbf{X}(s))\| + \big|c(\mathbf{X}(s),U(s)) - c(\mathbf{X}(s),u(\mathbf{X}(s)))\big|\Big) }[/math]
Assuming [math]\displaystyle{ \mathbf{a},c }[/math] are at least [math]\displaystyle{ C^1 }[/math], we can bound this for small times. Choose a neighborhood [math]\displaystyle{ \Omega }[/math] around [math]\displaystyle{ \mathbf{X}(0), U(0) }[/math] small enough such that [math]\displaystyle{ \mathbf{a},c }[/math] are locally Lipschitz. By continuity, [math]\displaystyle{ (\mathbf{X}(s),U(s)) }[/math] will remain in [math]\displaystyle{ \Omega }[/math] for small enough [math]\displaystyle{ s }[/math]. Since [math]\displaystyle{ U(0) = u(\mathbf{X}(0)) }[/math], we also have that [math]\displaystyle{ (\mathbf{X}(s), u(\mathbf{X}(s))) }[/math] will be in [math]\displaystyle{ \Omega }[/math] for small enough [math]\displaystyle{ s }[/math] by continuity. So, [math]\displaystyle{ (\mathbf{X}(s),U(s)) \in \Omega }[/math] and [math]\displaystyle{ (\mathbf{X}(s), u(\mathbf{X}(s))) \in \Omega }[/math] for [math]\displaystyle{ s \in [0,s_0] }[/math]. Additionally, [math]\displaystyle{ \|\nabla u(\mathbf{X}(s))\| \leq M }[/math] for some [math]\displaystyle{ M \in \mathbb{R} }[/math] for [math]\displaystyle{ s \in [0,s_0] }[/math] by compactness. From this, we find the above is bounded as [math]\displaystyle{ \Delta'(s) \leq C|u(\mathbf{X}(s)) - U(s)|^2 = C \Delta(s) }[/math] for some [math]\displaystyle{ C \in \mathbb{R} }[/math]. It is a straightforward application of Grönwall's inequality to show that since [math]\displaystyle{ \Delta(0) = 0 }[/math] we have [math]\displaystyle{ \Delta(s) = 0 }[/math] for as long as this inequality holds. We have some interval [math]\displaystyle{ [0, \epsilon) }[/math] such that [math]\displaystyle{ u(\mathbf{X}(s)) = U(s) }[/math] in this interval. Choose the largest [math]\displaystyle{ \epsilon }[/math] such that this is true. Then, by continuity, [math]\displaystyle{ U(\epsilon) = u(\mathbf{X}(\epsilon)) }[/math].
Provided the ODE still has a solution in some interval after [math]\displaystyle{ \epsilon }[/math], we can repeat the argument above to find that [math]\displaystyle{ u(X(s)) = U(s) }[/math] in a larger interval. Thus, so long as the ODE has a solution, we have [math]\displaystyle{ u(X(s)) = U(s) }[/math].
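The Grönwall step used in this proof can be spelled out directly. What follows is a sketch of the standard integrating-factor argument rather than a quotation from the cited sources; it uses only that Δ(s) is a square (hence non-negative), that Δ(0) = 0, and that Δ′(s) ≤ CΔ(s) on [0, s_0].

```latex
\begin{aligned}
\frac{d}{ds}\Big(e^{-Cs}\,\Delta(s)\Big)
  &= e^{-Cs}\big(\Delta'(s) - C\,\Delta(s)\big) \le 0
  && \text{for } s \in [0, s_0], \\
e^{-Cs}\,\Delta(s) &\le e^{0}\,\Delta(0) = 0
  && \text{(a non-increasing function stays below its initial value)}, \\
0 \le \Delta(s) &\le 0
  \;\Longrightarrow\; \Delta(s) = 0
  && \text{for } s \in [0, s_0].
\end{aligned}
```

Since Δ(s) = 0 means u(X(s)) = U(s), the ODE solution and the PDE solution agree along the curve on this interval.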
Fully nonlinear case
Consider the partial differential equation

[math]\displaystyle{ F(x_1,\dots,x_n,u,p_1,\dots,p_n)=0 }[/math]
(4)
where the variables p_{i} are shorthand for the partial derivatives
 [math]\displaystyle{ p_i = \frac{\partial u}{\partial x_i}. }[/math]
Let (x_{i}(s),u(s),p_{i}(s)) be a curve in R^{2n+1}. Suppose that u is any solution, and that
 [math]\displaystyle{ u(s) = u(x_1(s),\dots,x_n(s)). }[/math]
Along a solution, differentiating (4) with respect to s gives
 [math]\displaystyle{ \sum_i(F_{x_i} + F_u p_i)\dot{x}_i + \sum_i F_{p_i}\dot{p}_i = 0 }[/math]
 [math]\displaystyle{ \dot{u} - \sum_i p_i \dot{x}_i = 0 }[/math]
 [math]\displaystyle{ \sum_i (\dot{x}_i dp_i - \dot{p}_i dx_i)= 0. }[/math]
The second equation follows from applying the chain rule to a solution u, and the third follows by taking an exterior derivative of the relation [math]\displaystyle{ du - \sum_i p_i \, dx_i = 0 }[/math]. Manipulating these equations gives
 [math]\displaystyle{ \dot{x}_i=\lambda F_{p_i},\quad\dot{p}_i=-\lambda(F_{x_i}+F_up_i),\quad \dot{u}=\lambda\sum_i p_iF_{p_i} }[/math]
where λ is a constant. Writing these equations more symmetrically, one obtains the Lagrange–Charpit equations for the characteristic
 [math]\displaystyle{ \frac{\dot{x}_i}{F_{p_i}}=-\frac{\dot{p}_i}{F_{x_i}+F_up_i}=\frac{\dot{u}}{\sum p_iF_{p_i}}. }[/math]
Geometrically, the method of characteristics in the fully nonlinear case can be interpreted as requiring that the Monge cone of the differential equation should everywhere be tangent to the graph of the solution. Fully nonlinear first-order equations of this kind can be solved with Charpit's method.
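A small worked case may help. The sketch below assumes the eikonal equation F = p_1^2 + p_2^2 − 1 = 0, which is not discussed in the text, and takes λ = 1. Because this F depends on neither x nor u, the equation for p reduces to dp_i/ds = 0, while dx_i/ds = F_{p_i} = 2p_i and du/ds = Σ p_i F_{p_i}. Starting on the unit circle with u = 0 and p equal to the outward unit normal (an assumed choice of initial data), the strip should reproduce u = |x| − 1. The function name is hypothetical.

```python
import math

def eikonal_characteristic(theta, s_end, steps=100):
    """Trace one characteristic strip of the assumed equation p1^2 + p2^2 - 1 = 0.

    Initial data (an assumption for this sketch): a point on the unit circle at
    angle theta, u = 0 there, and p equal to the outward unit normal.
    """
    x1, x2 = math.cos(theta), math.sin(theta)
    p1, p2 = math.cos(theta), math.sin(theta)
    u = 0.0
    h = s_end / steps
    for _ in range(steps):
        # dx_i/ds = F_{p_i} = 2 p_i; p itself does not change for this F.
        dx1, dx2 = 2.0 * p1, 2.0 * p2
        # du/ds = sum_i p_i F_{p_i}
        du = p1 * dx1 + p2 * dx2
        # The right-hand sides are constant along the strip, so explicit
        # Euler steps are exact here up to rounding.
        x1 += h * dx1
        x2 += h * dx2
        u += h * du
    return (x1, x2), u, (p1, p2)

(x1, x2), u, (p1, p2) = eikonal_characteristic(theta=0.9, s_end=0.75)
print(u, math.hypot(x1, x2) - 1.0)
```

The carried value p also matches the gradient of |x| − 1 at the end point, which is x/|x|.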
Example
As an example, consider the advection equation (this example assumes familiarity with PDE notation, and solutions to basic ODEs).
 [math]\displaystyle{ a \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = 0 }[/math]
where [math]\displaystyle{ a }[/math] is constant and [math]\displaystyle{ u }[/math] is a function of [math]\displaystyle{ x }[/math] and [math]\displaystyle{ t }[/math]. We want to transform this linear first-order PDE into an ODE along the appropriate curve; i.e. something of the form
 [math]\displaystyle{ \frac{d}{ds}u(x(s), t(s)) = F(u, x(s), t(s)) , }[/math]
where [math]\displaystyle{ (x(s),t(s)) }[/math] is a characteristic line. First, we find
 [math]\displaystyle{ \frac{d}{ds}u(x(s), t(s)) = \frac{\partial u}{\partial x} \frac{dx}{ds} + \frac{\partial u}{\partial t} \frac{dt}{ds} }[/math]
by the chain rule. Now, if we set [math]\displaystyle{ \frac{dx}{ds} = a }[/math] and [math]\displaystyle{ \frac{dt}{ds} = 1 }[/math] we get
 [math]\displaystyle{ a \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} }[/math]
which is the left hand side of the PDE we started with. Thus
 [math]\displaystyle{ \frac{d}{ds}u = a \frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = 0. }[/math]
So, along the characteristic line [math]\displaystyle{ (x(s), t(s)) }[/math], the original PDE becomes the ODE [math]\displaystyle{ u_s = F(u, x(s), t(s)) = 0 }[/math]. That is to say that along the characteristics, the solution is constant. Thus, [math]\displaystyle{ u(x_s, t_s) = u(x_0, 0) }[/math] where [math]\displaystyle{ (x_s, t_s)\, }[/math] and [math]\displaystyle{ (x_0, 0) }[/math] lie on the same characteristic. Therefore, to determine the general solution, it is enough to find the characteristics by solving the characteristic system of ODEs:
 [math]\displaystyle{ \frac{dt}{ds} = 1 }[/math], letting [math]\displaystyle{ t(0)=0 }[/math] we know [math]\displaystyle{ t=s }[/math],
 [math]\displaystyle{ \frac{dx}{ds} = a }[/math], letting [math]\displaystyle{ x(0)=x_0 }[/math] we know [math]\displaystyle{ x=as+x_0=at+x_0 }[/math],
 [math]\displaystyle{ \frac{du}{ds} = 0 }[/math], letting [math]\displaystyle{ u(0)=f(x_0) }[/math] we know [math]\displaystyle{ u(x(t), t)=f(x_0)=f(x-at) }[/math].
In this case, the characteristic lines are straight lines with slope [math]\displaystyle{ a }[/math], and the value of [math]\displaystyle{ u }[/math] remains constant along any characteristic line.
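The result u(x, t) = f(x − at) can be checked numerically. The sketch below assumes an advection speed a = 2 and a Gaussian initial profile f, both chosen only for illustration; it evaluates the PDE residual with central differences and samples u along one characteristic line.

```python
import math

A = 2.0  # advection speed a (assumed value for illustration)

def f(x):
    """Initial profile u(x, 0); a Gaussian is an arbitrary choice."""
    return math.exp(-x * x)

def u(x, t):
    """Solution obtained from the characteristics: u(x, t) = f(x - a t)."""
    return f(x - A * t)

def pde_residual(x, t, h=1e-5):
    """Central-difference estimate of a u_x + u_t, which should be close to 0."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return A * u_x + u_t

def along_characteristic(x0, times):
    """Values of u on the line x = x0 + a t; each should equal f(x0)."""
    return [u(x0 + A * t, t) for t in times]

print(pde_residual(0.3, 0.4))
print(along_characteristic(0.5, [0.0, 0.5, 1.0, 2.0]), f(0.5))
```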
Characteristics of linear differential operators
Let X be a differentiable manifold and P a linear differential operator
 [math]\displaystyle{ P : C^\infty(X) \to C^\infty(X) }[/math]
of order k. In a local coordinate system x^{i},
 [math]\displaystyle{ P = \sum_{|\alpha|\le k} P^{\alpha}(x)\frac{\partial}{\partial x^\alpha} }[/math]
in which α denotes a multi-index. The principal symbol of P, denoted σ_{P}, is the function on the cotangent bundle T^{∗}X defined in these local coordinates by
 [math]\displaystyle{ \sigma_P(x,\xi) = \sum_{|\alpha|=k} P^\alpha(x)\xi_\alpha }[/math]
where the ξ_{i} are the fiber coordinates on the cotangent bundle induced by the coordinate differentials dx^{i}. Although this is defined using a particular coordinate system, the transformation law relating the ξ_{i} and the x^{i} ensures that σ_{P} is a well-defined function on the cotangent bundle.
The function σ_{P} is homogeneous of degree k in the ξ variable. The zeros of σ_{P}, away from the zero section of T^{∗}X, are the characteristics of P. A hypersurface of X defined by the equation F(x) = c is called a characteristic hypersurface at x if
 [math]\displaystyle{ \sigma_P(x,dF(x)) = 0. }[/math]
Invariantly, a characteristic hypersurface is a hypersurface whose conormal bundle is in the characteristic set of P.
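For constant-coefficient operators the principal symbol is a polynomial that is easy to evaluate. The sketch below makes several assumptions not stated in the text: ξ_α is read as the monomial ξ_1^{α_1}⋯ξ_n^{α_n}, the coefficients P^α are constants, and the two test operators are the wave operator ∂_t^2 − C^2∂_x^2 in coordinates (t, x) and the Laplacian. The helper name `principal_symbol` is hypothetical.

```python
def principal_symbol(coeffs, xi):
    """Evaluate the sum over top-order multi-indices of P^alpha * xi^alpha.

    coeffs maps each multi-index alpha (a tuple with |alpha| = k) to a constant
    coefficient; xi is a covector given as a tuple of the same length.
    """
    total = 0.0
    for alpha, coeff in coeffs.items():
        term = coeff
        for xi_i, a_i in zip(xi, alpha):
            term *= xi_i ** a_i
        total += term
    return total

C = 3.0  # wave speed (assumed value for illustration)

# Wave operator d_t^2 - C^2 d_x^2 and Laplacian d_t^2 + d_x^2, coordinates (t, x).
wave = {(2, 0): 1.0, (0, 2): -C * C}
laplace = {(2, 0): 1.0, (0, 2): 1.0}

# The level sets of F(t, x) = x - C t have conormal dF = (-C, 1).
dF = (-C, 1.0)
print(principal_symbol(wave, dF))     # vanishes: the lines x - C t = const are characteristic
print(principal_symbol(laplace, dF))  # nonzero for this covector
```

For the Laplacian the symbol is a sum of squares, so it vanishes only at ξ = 0 and the operator has no real characteristics.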
Qualitative analysis of characteristics
Characteristics are also a powerful tool for gaining qualitative insight into a PDE.
One can use the crossings of the characteristics to find shock waves for potential flow in a compressible fluid. Intuitively, we can think of each characteristic line as determining the value of [math]\displaystyle{ u }[/math] along itself. Thus, when two characteristics cross, the function becomes multi-valued, resulting in a non-physical solution. Physically, this contradiction is removed by the formation of a shock wave, a tangential discontinuity or a weak discontinuity, and can result in non-potential flow, violating the initial assumptions.^{[5]}
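The crossing of characteristics can be illustrated on a much simpler model than compressible potential flow. The sketch below assumes the inviscid Burgers equation u_t + u u_x = 0, which is a stand-in chosen for this illustration and is not the system referred to above. Its characteristics are the straight lines x = x_0 + u_0(x_0) t, so two of them starting at x_a < x_b meet when the one behind is faster. The function names are hypothetical.

```python
import math

def crossing_time(u0, xa, xb):
    """Time at which the straight characteristics from xa < xb meet, or None.

    Each characteristic is x = x0 + u0(x0) * t (assumed Burgers model).
    """
    speed_gap = u0(xa) - u0(xb)
    if speed_gap <= 0:
        return None  # the rear characteristic is not faster, so they never meet for t > 0
    return (xb - xa) / speed_gap

def earliest_crossing(u0, x_min, x_max, n=2000):
    """Scan neighbouring characteristics on a uniform grid for the first crossing."""
    dx = (x_max - x_min) / n
    times = []
    for i in range(n):
        xa = x_min + i * dx
        t = crossing_time(u0, xa, xa + dx)
        if t is not None:
            times.append(t)
    return min(times) if times else None

# For u0 = sin the steepest descent of the profile has slope -1 (at x = pi),
# so neighbouring characteristics first meet near t = 1.
t_break = earliest_crossing(math.sin, 0.0, 2.0 * math.pi)
print(t_break)
```

An increasing profile such as u_0(x) = x gives no crossings at all; the characteristics spread apart instead.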
Characteristics may fail to cover part of the domain of the PDE. This is called a rarefaction, and indicates the solution typically exists only in a weak, i.e. integral equation, sense.
The direction of the characteristic lines indicates the flow of values through the solution, as the example above demonstrates. This kind of knowledge is useful when solving PDEs numerically as it can indicate which finite difference scheme is best for the problem.
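As one concrete instance of this remark, a first-order upwind scheme for the advection equation u_t + a u_x = 0 takes its spatial difference from the side the characteristics come from. The sketch below is an illustration under assumptions (a periodic grid, explicit time stepping, a hypothetical function name `upwind_step`), not a recommendation of a particular scheme.

```python
def upwind_step(u, a, dx, dt):
    """One explicit first-order upwind step for u_t + a u_x = 0 on a periodic grid."""
    n = len(u)
    nu = a * dt / dx  # Courant number; the scheme is stable for |nu| <= 1
    if a >= 0:
        # Characteristics move to the right, so values arrive from the left:
        # use the backward difference (u[i - 1] wraps around at i = 0).
        return [u[i] - nu * (u[i] - u[i - 1]) for i in range(n)]
    # Characteristics move to the left, so values arrive from the right:
    # use the forward difference.
    return [u[i] - nu * (u[(i + 1) % n] - u[i]) for i in range(n)]

pulse = [0.0, 1.0, 0.0, 0.0]
print(upwind_step(pulse, a=1.0, dx=1.0, dt=1.0))   # pulse moves one cell to the right
print(upwind_step(pulse, a=-1.0, dx=1.0, dt=1.0))  # pulse moves one cell to the left
```

With a Courant number of exactly 1 the step copies each value from one cell upstream, which is the discrete counterpart of u being constant along a characteristic.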
Notes
 1. Zachmanoglou, E. C.; Thoe, Dale W. (1976), "Linear Partial Differential Equations: Characteristics, Classification, and Canonical Forms", Introduction to Partial Differential Equations with Applications, Baltimore: Williams & Wilkins, pp. 112–152, ISBN 0486652513
 2. John, Fritz (1991), Partial differential equations (4th ed.), Springer, ISBN 9780387906096, https://archive.org/details/partialdifferent00john_0
 3. Delgado, Manuel (1997), "The Lagrange–Charpit Method", SIAM Review 39 (2): 298–304, doi:10.1137/S0036144595293534, Bibcode: 1997SIAMR..39..298D
 4. "Partial Differential Equations (PDEs)—Wolfram Language Documentation". https://reference.wolfram.com/language/tutorial/DSolveLinearAndQuasiLinearFirstOrderPDEs.html.
 5. Debnath, Lokenath (2005), "Conservation Laws and Shock Waves", Nonlinear Partial Differential Equations for Scientists and Engineers (2nd ed.), Boston: Birkhäuser, pp. 251–276, ISBN 0817643230
References
 Courant, Richard; Hilbert, David (1962), Methods of Mathematical Physics, Volume II, Wiley-Interscience
 Evans, Lawrence C. (1998), Partial Differential Equations, Providence: American Mathematical Society, ISBN 0821807722
 Polyanin, A. D.; Zaitsev, V. F.; Moussiaux, A. (2002), Handbook of First Order Partial Differential Equations, London: Taylor & Francis, ISBN 041527267X
 Polyanin, A. D. (2002), Handbook of Linear Partial Differential Equations for Engineers and Scientists, Boca Raton: Chapman & Hall/CRC Press, ISBN 1584882999
 Sarra, Scott (2003), "The Method of Characteristics with applications to Conservation Laws", Journal of Online Mathematics and Its Applications, http://www.scottsarra.org/shock/shock.html.
 Streeter, VL; Wylie, EB (1998), Fluid mechanics (International 9th Revised ed.), McGraw-Hill Higher Education
Original source: https://en.wikipedia.org/wiki/Method of characteristics.