Calculus on Euclidean space
In mathematics, calculus on Euclidean space is a generalization of calculus of functions in one or several variables to calculus of functions on Euclidean space [math]\displaystyle{ \mathbb{R}^n }[/math] as well as a finite-dimensional real vector space. This calculus is also known as advanced calculus, especially in the United States. It is similar to multivariable calculus but is somewhat more sophisticated in that it uses linear algebra (or some functional analysis) more extensively and covers some concepts from differential geometry such as differential forms and Stokes' formula in terms of differential forms. This extensive use of linear algebra also allows a natural generalization of multivariable calculus to calculus on Banach spaces or topological vector spaces.
Calculus on Euclidean space is also a local model of calculus on manifolds, a theory of functions on manifolds.
Basic notions
Functions in one real variable
This section is a brief review of function theory in one-variable calculus.
A real-valued function [math]\displaystyle{ f : \mathbb{R} \to \mathbb{R} }[/math] is continuous at [math]\displaystyle{ a }[/math] if it is approximately constant near [math]\displaystyle{ a }[/math]; i.e.,
- [math]\displaystyle{ \lim_{h \to 0} (f(a + h) - f(a)) = 0. }[/math]
In contrast, the function [math]\displaystyle{ f }[/math] is differentiable at [math]\displaystyle{ a }[/math] if it is approximately linear near [math]\displaystyle{ a }[/math]; i.e., there is some real number [math]\displaystyle{ \lambda }[/math] such that
- [math]\displaystyle{ \lim_{h \to 0} \frac{f(a + h) - f(a) - \lambda h}{h} = 0. }[/math][1]
(For simplicity, suppose [math]\displaystyle{ f(a) = 0 }[/math]. Then the above means that [math]\displaystyle{ f(a + h) = \lambda h + g(a, h) }[/math] where [math]\displaystyle{ g(a, h) }[/math] goes to 0 faster than [math]\displaystyle{ h }[/math] does as [math]\displaystyle{ h \to 0 }[/math] (that is, [math]\displaystyle{ g(a, h)/h \to 0 }[/math]) and, in that sense, [math]\displaystyle{ f(a + h) }[/math] behaves like [math]\displaystyle{ \lambda h }[/math].)
The number [math]\displaystyle{ \lambda }[/math] depends on [math]\displaystyle{ a }[/math] and thus is denoted as [math]\displaystyle{ f'(a) }[/math]. If [math]\displaystyle{ f }[/math] is differentiable on an open interval [math]\displaystyle{ U }[/math] and if [math]\displaystyle{ f' }[/math] is a continuous function on [math]\displaystyle{ U }[/math], then [math]\displaystyle{ f }[/math] is called a [math]\displaystyle{ C^1 }[/math] function. More generally, [math]\displaystyle{ f }[/math] is called a [math]\displaystyle{ C^k }[/math] function if its derivative [math]\displaystyle{ f' }[/math] is a [math]\displaystyle{ C^{k-1} }[/math] function. Taylor's theorem states that a [math]\displaystyle{ C^k }[/math] function can be approximated near each point by a polynomial of degree at most [math]\displaystyle{ k }[/math], with an error that goes to 0 faster than the [math]\displaystyle{ k }[/math]-th power of the distance to the point.
If [math]\displaystyle{ f : \mathbb{R} \to \mathbb{R} }[/math] is a [math]\displaystyle{ C^1 }[/math] function and [math]\displaystyle{ f'(a) \ne 0 }[/math] for some [math]\displaystyle{ a }[/math], then either [math]\displaystyle{ f'(a) \gt 0 }[/math] or [math]\displaystyle{ f'(a) \lt 0 }[/math]; i.e., by the continuity of [math]\displaystyle{ f' }[/math], [math]\displaystyle{ f }[/math] is either strictly increasing or strictly decreasing in some open interval containing [math]\displaystyle{ a }[/math]. In particular, [math]\displaystyle{ f : f^{-1}(U) \to U }[/math] is bijective for some open interval [math]\displaystyle{ U }[/math] containing [math]\displaystyle{ f(a) }[/math]. The inverse function theorem then says that the inverse function [math]\displaystyle{ f^{-1} }[/math] is differentiable on [math]\displaystyle{ U }[/math] with the derivative given, for [math]\displaystyle{ y \in U }[/math], by
- [math]\displaystyle{ (f^{-1})'(y) = {1 \over f'(f^{-1}(y))}. }[/math]
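As a rough numerical sanity check of the inverse-derivative formula (a sketch, not part of the original text), the snippet below uses the illustrative function f(x) = x^3 + x, which is strictly increasing because f'(x) = 3x^2 + 1 > 0. The helper names `f_inverse` and `central_difference`, the bisection bracket, and the step size are all assumptions made for this example.

```python
def f(x):
    # Illustrative C^1 function with f'(x) = 3x^2 + 1 > 0 everywhere.
    return x ** 3 + x

def f_prime(x):
    return 3 * x ** 2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iters=200):
    # Bisection; valid because f is strictly increasing on [lo, hi]
    # and y lies between f(lo) and f(hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def central_difference(func, t, h=1e-5):
    # Symmetric difference quotient approximating func'(t).
    return (func(t + h) - func(t - h)) / (2 * h)

y = 2.0                                     # f(1) = 2, so f_inverse(2) is 1
x = f_inverse(y)
numeric = central_difference(f_inverse, y)  # difference quotient of the inverse
formula = 1.0 / f_prime(x)                  # 1 / f'(f^{-1}(y)) = 1/4
print(x, numeric, formula)
```

The two derivative values should agree up to the discretization error of the difference quotient.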
Derivative of a map and chain rule
For functions [math]\displaystyle{ f }[/math] defined in the plane or more generally on a Euclidean space [math]\displaystyle{ \mathbb{R}^n }[/math], it is necessary to consider functions that are vector-valued or matrix-valued. It is also conceptually helpful to do this in an invariant manner (i.e., a coordinate-free way). Derivatives of such maps at a point are then vectors or linear maps, not real numbers.
Let [math]\displaystyle{ f : X \to Y }[/math] be a map from an open subset [math]\displaystyle{ X }[/math] of [math]\displaystyle{ \mathbb{R}^n }[/math] to an open subset [math]\displaystyle{ Y }[/math] of [math]\displaystyle{ \mathbb{R}^m }[/math]. Then the map [math]\displaystyle{ f }[/math] is said to be differentiable at a point [math]\displaystyle{ x }[/math] in [math]\displaystyle{ X }[/math] if there exists a (necessarily unique) linear transformation [math]\displaystyle{ f'(x) : \mathbb{R}^n \to \mathbb{R}^m }[/math], called the derivative of [math]\displaystyle{ f }[/math] at [math]\displaystyle{ x }[/math], such that
- [math]\displaystyle{ \lim_{ h \to 0 } \frac{1}{|h|} |f(x + h) - f(x) - f'(x)h| = 0 }[/math]
where [math]\displaystyle{ f'(x)h }[/math] is the application of the linear transformation [math]\displaystyle{ f'(x) }[/math] to [math]\displaystyle{ h }[/math].[2] If [math]\displaystyle{ f }[/math] is differentiable at [math]\displaystyle{ x }[/math], then it is continuous at [math]\displaystyle{ x }[/math] since
- [math]\displaystyle{ |f(x + h) - f(x)| \le (|h|^{-1}|f(x + h) - f(x) - f'(x)h|) |h| + |f'(x)h| \to 0 }[/math] as [math]\displaystyle{ h \to 0 }[/math].
As in the one-variable case, there is
Chain rule — [3] Let [math]\displaystyle{ f }[/math] be as above and [math]\displaystyle{ g : Y \to Z }[/math] a map for some open subset [math]\displaystyle{ Z }[/math] of [math]\displaystyle{ \mathbb{R}^l }[/math]. If [math]\displaystyle{ f }[/math] is differentiable at [math]\displaystyle{ x }[/math] and [math]\displaystyle{ g }[/math] differentiable at [math]\displaystyle{ y = f(x) }[/math], then the composition [math]\displaystyle{ g \circ f }[/math] is differentiable at [math]\displaystyle{ x }[/math] with the derivative
- [math]\displaystyle{ (g \circ f)'(x) = g'(y) \circ f'(x). }[/math]
This is proved exactly as for functions in one variable. Indeed, with the notation [math]\displaystyle{ \widetilde{h} = f(x + h) - f(x) }[/math], we have:
- [math]\displaystyle{ \begin{align} & \frac{1}{|h|} |g(f(x + h)) - g(y) - g'(y) f'(x) h| \\ & \le \frac{1}{|h|} |g(y + \widetilde{h}) - g(y) - g'(y)\widetilde{h}| + \frac{1}{|h|} |g'(y)(f(x+h) - f(x) - f'(x) h)|. \end{align} }[/math]
Here, since [math]\displaystyle{ f }[/math] is differentiable at [math]\displaystyle{ x }[/math], the second term on the right goes to zero as [math]\displaystyle{ h \to 0 }[/math]. As for the first term, it can be written as:
- [math]\displaystyle{ \begin{cases} \frac{|\widetilde{h}|}{|h|} |g(y+ \widetilde{h}) - g(y) - g'(y)\widetilde{h}|/|\widetilde{h}|, & \widetilde{h} \neq 0, \\ 0, & \widetilde{h} = 0. \end{cases} }[/math]
Now, by the argument showing the continuity of [math]\displaystyle{ f }[/math] at [math]\displaystyle{ x }[/math], we see [math]\displaystyle{ \frac{|\widetilde{h}|}{|h|} }[/math] is bounded. Also, [math]\displaystyle{ \widetilde{h} \to 0 }[/math] as [math]\displaystyle{ h \to 0 }[/math] since [math]\displaystyle{ f }[/math] is continuous at [math]\displaystyle{ x }[/math]. Hence, the first term also goes to zero as [math]\displaystyle{ h \to 0 }[/math] by the differentiability of [math]\displaystyle{ g }[/math] at [math]\displaystyle{ y }[/math]. [math]\displaystyle{ \square }[/math]
The map [math]\displaystyle{ f }[/math] as above is called continuously differentiable or [math]\displaystyle{ C^1 }[/math] if it is differentiable on the domain and also the derivatives vary continuously; i.e., [math]\displaystyle{ x \mapsto f'(x) }[/math] is continuous.
Corollary — If [math]\displaystyle{ f, g }[/math] are continuously differentiable, then [math]\displaystyle{ g \circ f }[/math] is continuously differentiable.
As a linear transformation, [math]\displaystyle{ f'(x) }[/math] is represented by an [math]\displaystyle{ m \times n }[/math]-matrix, called the Jacobian matrix [math]\displaystyle{ Jf(x) }[/math] of [math]\displaystyle{ f }[/math] at [math]\displaystyle{ x }[/math] and we write it as:
- [math]\displaystyle{ (Jf)(x) = \begin{bmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x) \end{bmatrix}. }[/math]
Taking [math]\displaystyle{ h }[/math] to be [math]\displaystyle{ h e_j }[/math], [math]\displaystyle{ h }[/math] a real number and [math]\displaystyle{ e_j = (0, \cdots, 1, \cdots, 0) }[/math] the j-th standard basis element, we see that the differentiability of [math]\displaystyle{ f }[/math] at [math]\displaystyle{ x }[/math] implies:
- [math]\displaystyle{ \lim_{h \to 0} \frac{f_i(x + h e_j) - f_i(x)}{h} = \frac{\partial f_i}{\partial x_j}(x) }[/math]
where [math]\displaystyle{ f_i }[/math] denotes the i-th component of [math]\displaystyle{ f }[/math]. That is, each component of [math]\displaystyle{ f }[/math] is differentiable at [math]\displaystyle{ x }[/math] in each variable with the derivative [math]\displaystyle{ \frac{\partial f_i}{\partial x_j}(x) }[/math]. In terms of Jacobian matrices, the chain rule says [math]\displaystyle{ J(g \circ f)(x) = Jg(y) Jf(x) }[/math]; i.e., as [math]\displaystyle{ (g \circ f)_i = g_i \circ f }[/math],
- [math]\displaystyle{ \frac{\partial (g_i \circ f)}{\partial x_j}(x) = \frac{\partial g_i}{\partial y_1} (y) \frac{\partial f_1}{\partial x_j}(x) + \cdots + \frac{\partial g_i}{\partial y_m} (y) \frac{\partial f_m}{\partial x_j}(x), }[/math]
which is the form of the chain rule that is often stated.
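The matrix form of the chain rule can be checked numerically. The following is a minimal sketch: the maps `f` and `g`, the base point, and the finite-difference helper `jacobian` are illustrative choices made here, not taken from the text, and central differences only approximate the true Jacobians.

```python
import math

def f(x):
    # Illustrative map f : R^2 -> R^3.
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def g(y):
    # Illustrative map g : R^3 -> R^2.
    y1, y2, y3 = y
    return [y1 + y2 * y3, math.exp(y1) * y3]

def jacobian(func, x, h=1e-6):
    # Central-difference approximation of the m x n Jacobian matrix;
    # entry (i, j) approximates d func_i / d x_j.
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = [0.7, -0.3]
y = f(x)
lhs = jacobian(lambda t: g(f(t)), x)            # J(g o f)(x)
rhs = matmul(jacobian(g, y), jacobian(f, x))    # Jg(y) Jf(x)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The printed gap should be tiny compared with the matrix entries, reflecting only finite-difference and rounding error.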
A partial converse to the above holds. Namely, if the partial derivatives [math]\displaystyle{ {\partial f_i}/{\partial x_j} }[/math] are all defined and continuous, then [math]\displaystyle{ f }[/math] is continuously differentiable.[4] This is a consequence of the mean value inequality:
Mean value inequality — [5] Given the map [math]\displaystyle{ f : X \to Y }[/math] as above and points [math]\displaystyle{ x, x + y }[/math] in [math]\displaystyle{ X }[/math] such that the line segment between [math]\displaystyle{ x }[/math] and [math]\displaystyle{ x + y }[/math] lies in [math]\displaystyle{ X }[/math], if [math]\displaystyle{ t \mapsto f(x + ty) }[/math] is continuous on [math]\displaystyle{ [0, 1] }[/math] and is differentiable on the interior, then, for any vector [math]\displaystyle{ v \in \mathbb{R}^m }[/math],
- [math]\displaystyle{ |\Delta_y f(x) - v| \le \sup_{0 \lt t \lt 1} \left| \frac{d}{dt}f(x + ty) - v \right| }[/math]
where [math]\displaystyle{ \Delta_y f(x) = f(x + y) - f(x). }[/math]
(This version of the mean value inequality follows from the mean value inequality in Mean value theorem § Mean value theorem for vector-valued functions, where a proof is given, applied to the function [math]\displaystyle{ [0, 1] \to \mathbb{R}^m, \, t \mapsto f(x + ty) - tv }[/math].)
Indeed, let [math]\displaystyle{ g(x) = (Jf)(x) }[/math]. We note that, if [math]\displaystyle{ y = y_i e_i }[/math], then
- [math]\displaystyle{ \frac{d}{dt}f(x + ty) = \frac{\partial f}{\partial x_i}(x+ty)y_i = g(x + ty)(y_i e_i). }[/math]
For simplicity, assume [math]\displaystyle{ n = 2 }[/math] (the argument for the general case is similar). Then, by mean value inequality, with the operator norm [math]\displaystyle{ \| \cdot \| }[/math],
- [math]\displaystyle{ \begin{align} &|\Delta_y f (x) - g(x)y| \\ &\le |\Delta_{y_1 e_1} f(x_1, x_2 + y_2) - g(x)(y_1 e_1)| + |\Delta_{y_2 e_2} f(x_1, x_2) - g(x)(y_2 e_2)| \\ &\le |y_1| \sup_{0 \lt t \lt 1}\|g(x_1 + t y_1, x_2 + y_2) - g(x)\| + |y_2| \sup_{0 \lt t \lt 1}\|g(x_1, x_2 + ty_2) - g(x)\|, \end{align} }[/math]
which implies [math]\displaystyle{ |\Delta_y f (x) - g(x)y|/|y| \to 0 }[/math] as required. [math]\displaystyle{ \square }[/math]
Example: Let [math]\displaystyle{ U }[/math] be the set of all invertible real square matrices of size n. Note [math]\displaystyle{ U }[/math] can be identified with an open subset of [math]\displaystyle{ \mathbb{R}^{n^2} }[/math] with coordinates [math]\displaystyle{ x_{ij}, 1 \le i, j \le n }[/math]. Consider the function [math]\displaystyle{ f(g) = g^{-1} }[/math], the inverse matrix of [math]\displaystyle{ g }[/math], defined on [math]\displaystyle{ U }[/math]. To guess its derivatives, assume [math]\displaystyle{ f }[/math] is differentiable and consider the curve [math]\displaystyle{ c(t) = ge^{tg^{-1}h} }[/math] where [math]\displaystyle{ e^A }[/math] means the matrix exponential of [math]\displaystyle{ A }[/math]. By the chain rule applied to [math]\displaystyle{ f(c(t)) = e^{-t g^{-1}h} g^{-1} }[/math], we have:
- [math]\displaystyle{ f'(c(t)) \circ c'(t) = -g^{-1}h e^{-t g^{-1}h} g^{-1} }[/math].
Taking [math]\displaystyle{ t = 0 }[/math], we get:
- [math]\displaystyle{ f'(g) h = -g^{-1}h g^{-1} }[/math].
We then have:[6]
- [math]\displaystyle{ \|(g+h)^{-1} - g^{-1} + g^{-1}h g^{-1}\| \le \| (g+h)^{-1} \| \|h\| \|g^{-1} h g^{-1}\|. }[/math]
Since the operator norm is equivalent to the Euclidean norm on [math]\displaystyle{ \mathbb{R}^{n^2} }[/math] (any two norms on a finite-dimensional vector space are equivalent), this implies [math]\displaystyle{ f }[/math] is differentiable. Finally, from the formula for [math]\displaystyle{ f' }[/math], we see the partial derivatives of [math]\displaystyle{ f }[/math] are smooth (infinitely differentiable); whence, [math]\displaystyle{ f }[/math] is smooth too.
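The derivative formula for matrix inversion can be illustrated numerically in the 2 × 2 case. This is only a sketch: the matrices `g` and `h0`, the helper names, and the use of the Frobenius norm (instead of the operator norm) are assumptions made for the example. The remainder divided by the size of the increment should shrink roughly in proportion to the increment.

```python
def inv2(a):
    # Inverse of a 2x2 matrix [[p, q], [r, s]] via the adjugate formula.
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def mul(a, b):
    # Product of two 2x2 matrices.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def frob(a):
    # Frobenius norm (the Euclidean norm on R^4).
    return sum(v * v for row in a for v in row) ** 0.5

g = [[2.0, 1.0], [0.5, 3.0]]       # illustrative invertible matrix
h0 = [[0.3, -0.2], [0.1, 0.4]]     # illustrative direction
g_inv = inv2(g)

ratios = []
for eps in (1e-1, 1e-2, 1e-3):
    h = [[eps * v for v in row] for row in h0]
    gh_inv = inv2([[g[i][j] + h[i][j] for j in range(2)] for i in range(2)])
    linear = mul(mul(g_inv, h), g_inv)            # g^{-1} h g^{-1}
    # Remainder (g+h)^{-1} - g^{-1} + g^{-1} h g^{-1}.
    remainder = [[gh_inv[i][j] - g_inv[i][j] + linear[i][j]
                  for j in range(2)] for i in range(2)]
    ratios.append(frob(remainder) / frob(h))
print(ratios)
```

Each tenfold reduction of the increment should reduce the ratio by roughly a factor of ten, consistent with a remainder of second order.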
Higher derivatives and Taylor formula
If [math]\displaystyle{ f : X \to \mathbb{R}^m }[/math] is differentiable where [math]\displaystyle{ X \subset \mathbb{R}^n }[/math] is an open subset, then the derivatives determine the map [math]\displaystyle{ f' : X \to \operatorname{Hom}(\mathbb{R}^n, \mathbb{R}^m) }[/math], where [math]\displaystyle{ \operatorname{Hom} }[/math] stands for homomorphisms between vector spaces; i.e., linear maps. If [math]\displaystyle{ f' }[/math] is differentiable, then [math]\displaystyle{ f'' : X \to \operatorname{Hom}(\mathbb{R}^n, \operatorname{Hom}(\mathbb{R}^n, \mathbb{R}^m)) }[/math]. Here, the codomain of [math]\displaystyle{ f'' }[/math] can be identified with the space of bilinear maps by:
- [math]\displaystyle{ \operatorname{Hom}(\mathbb{R}^n, \operatorname{Hom}(\mathbb{R}^n, \mathbb{R}^m)) \overset{\varphi}\underset{\sim}\to \{ (\mathbb{R}^n)^2 \to \mathbb{R}^m \text{ bilinear}\} }[/math]
where [math]\displaystyle{ \varphi(g)(x, y) = g(x)y }[/math] and [math]\displaystyle{ \varphi }[/math] is bijective with the inverse [math]\displaystyle{ \psi }[/math] given by [math]\displaystyle{ (\psi(g)x)y = g(x, y) }[/math].[lower-alpha 1] In general, [math]\displaystyle{ f^{(k)} = (f^{(k-1)})' }[/math] is a map from [math]\displaystyle{ X }[/math] to the space of [math]\displaystyle{ k }[/math]-multilinear maps [math]\displaystyle{ (\mathbb{R}^n)^k \to \mathbb{R}^m }[/math].
Just as [math]\displaystyle{ f'(x) }[/math] is represented by a matrix (Jacobian matrix), when [math]\displaystyle{ m = 1 }[/math] (a bilinear map is a bilinear form), the bilinear form [math]\displaystyle{ f''(x) }[/math] is represented by a matrix called the Hessian matrix of [math]\displaystyle{ f }[/math] at [math]\displaystyle{ x }[/math]; namely, the square matrix [math]\displaystyle{ H }[/math] of size [math]\displaystyle{ n }[/math] such that [math]\displaystyle{ f''(x)(y, z) = (Hy, z) }[/math], where the pairing refers to an inner product of [math]\displaystyle{ \mathbb{R}^n }[/math], and [math]\displaystyle{ H }[/math] is none other than the Jacobian matrix of [math]\displaystyle{ f' : X \to (\mathbb{R}^n)^* \simeq \mathbb{R}^n }[/math]. The [math]\displaystyle{ (i, j) }[/math]-th entry of [math]\displaystyle{ H }[/math] is thus given explicitly as [math]\displaystyle{ H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(x) }[/math].
Moreover, if [math]\displaystyle{ f'' }[/math] exists and is continuous, then the matrix [math]\displaystyle{ H }[/math] is symmetric, a fact known as the symmetry of second derivatives.[7] This is seen using the mean value inequality. For vectors [math]\displaystyle{ u, v }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math], using the mean value inequality twice, we have:
- [math]\displaystyle{ |\Delta_v \Delta_u f(x) - f''(x)(u, v)| \le \sup_{0 \lt t_1, t_2 \lt 1} | f''(x + t_1 u + t_2 v)(u, v) - f''(x)(u, v) |, }[/math]
which says
- [math]\displaystyle{ f''(x)(u, v) = \lim_{s, t \to 0} (\Delta_{tv} \Delta_{su} f(x))/(st). }[/math]
Since the right-hand side is symmetric in [math]\displaystyle{ u, v }[/math], so is the left-hand side: [math]\displaystyle{ f''(x)(u, v) = f''(x)(v, u) }[/math]. By induction, if [math]\displaystyle{ f }[/math] is [math]\displaystyle{ C^k }[/math], then the k-multilinear map [math]\displaystyle{ f^{(k)}(x) }[/math] is symmetric; i.e., the order of taking partial derivatives does not matter.[7]
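The second-difference quotient used in this argument can be evaluated numerically. The sketch below uses an illustrative function `f` on the plane together with its Hessian computed by hand. The function, the point `p`, the vectors `u`, `v`, and the step sizes are assumptions chosen for the example, and the quotient only approximates the limit.

```python
import math

def f(p):
    # Illustrative smooth function on R^2.
    x, y = p
    return math.exp(x) * math.sin(y) + x ** 3 * y

def shift(p, v, s):
    # The point p + s*v.
    return [p[i] + s * v[i] for i in range(len(p))]

def second_difference(func, p, u, v, s, t):
    # Delta_{tv} Delta_{su} f(p) = f(p+su+tv) - f(p+su) - f(p+tv) + f(p).
    return (func(shift(shift(p, u, s), v, t)) - func(shift(p, u, s))
            - func(shift(p, v, t)) + func(p))

def hessian_form(p, u, v):
    # f''(p)(u, v) using the second partial derivatives computed by hand.
    x, y = p
    fxx = math.exp(x) * math.sin(y) + 6 * x * y
    fxy = math.exp(x) * math.cos(y) + 3 * x ** 2
    fyy = -math.exp(x) * math.sin(y)
    return (fxx * u[0] * v[0] + fxy * (u[0] * v[1] + u[1] * v[0])
            + fyy * u[1] * v[1])

p = [0.4, 1.1]
u = [1.0, 2.0]
v = [-0.5, 0.3]
s = t = 1e-4
quotient_uv = second_difference(f, p, u, v, s, t) / (s * t)
quotient_vu = second_difference(f, p, v, u, t, s) / (s * t)
exact = hessian_form(p, u, v)
print(quotient_uv, quotient_vu, exact)
```

Swapping the roles of the two vectors leaves the second difference unchanged up to rounding, which is the symmetry the argument exploits.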
As in the case of one variable, the Taylor series expansion can then be proved by integration by parts:
- [math]\displaystyle{ f(z+(h,k))=\sum_{a+b\lt n} \partial_x^a\partial_y^b f(z){h^a k^b\over a! b!} + n\int_0^1 (1-t)^{n-1} \sum_{a+b=n} \partial_x^a\partial_y^b f(z+t(h,k)){h^a k^b\over a! b!} \, dt. }[/math]
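To see the polynomial part of the formula at work, here is a small numerical sketch for the order n = 3 case (terms with a + b < 3) at z = (0, 0). The function e^x cos y and the sample increments are illustrative assumptions. Its partial derivatives at the origin are computed by hand in the comments.

```python
import math

def f(x, y):
    # Illustrative smooth function of two variables.
    return math.exp(x) * math.cos(y)

def taylor2(h, k):
    # Sum over a + b < 3 of d_x^a d_y^b f(0,0) h^a k^b / (a! b!).
    # At the origin: f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1.
    return 1.0 + h + 0.5 * h ** 2 - 0.5 * k ** 2

errors = {}
for eps in (1e-1, 1e-2):
    h = k = eps
    # Remainder of the expansion; the formula predicts third-order size.
    errors[eps] = abs(f(h, k) - taylor2(h, k))
print(errors)
```

Shrinking the increment by a factor of 10 should shrink the error by roughly a factor of 1000, in line with a remainder built from third-order derivatives.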
Taylor's formula has the effect of factoring coordinate functions out of a function, as illustrated by the following typical theoretical use of the formula.
Example:[8] Let [math]\displaystyle{ T : \mathcal{S} \to \mathcal{S} }[/math] be a linear map between the vector space [math]\displaystyle{ \mathcal{S} }[/math] of smooth functions on [math]\displaystyle{ \mathbb{R}^n }[/math] with rapidly decreasing derivatives; i.e., [math]\displaystyle{ \sup |x^{\beta} \partial^{\alpha} \varphi| \lt \infty }[/math] for any multi-index [math]\displaystyle{ \alpha, \beta }[/math]. (The space [math]\displaystyle{ \mathcal{S} }[/math] is called a Schwartz space.) For each [math]\displaystyle{ \varphi }[/math] in [math]\displaystyle{ \mathcal{S} }[/math], Taylor's formula implies we can write:
- [math]\displaystyle{ \varphi - \psi \varphi(y) = \sum_{j=1}^n (x_j - y_j) \varphi_j }[/math]
with [math]\displaystyle{ \varphi_j \in \mathcal{S} }[/math], where [math]\displaystyle{ \psi }[/math] is a smooth function with compact support and [math]\displaystyle{ \psi(y) = 1 }[/math]. Now, assume [math]\displaystyle{ T }[/math] commutes with coordinates; i.e., [math]\displaystyle{ T (x_j \varphi) = x_j T\varphi }[/math]. Then
- [math]\displaystyle{ T\varphi - \varphi(y) T\psi = \sum_{j=1}^n (x_j - y_j) T\varphi_j }[/math].
Evaluating the above at [math]\displaystyle{ y }[/math], we get [math]\displaystyle{ T\varphi(y) = \varphi(y) T\psi(y). }[/math] In other words, [math]\displaystyle{ T }[/math] is a multiplication by some function [math]\displaystyle{ m }[/math]; i.e., [math]\displaystyle{ T\varphi = m \varphi }[/math]. Now, assume further that [math]\displaystyle{ T }[/math] commutes with partial differentiations. We then easily see that [math]\displaystyle{ m }[/math] is a constant; [math]\displaystyle{ T }[/math] is a multiplication by a constant.
(Aside: the above discussion almost proves the Fourier inversion formula. Indeed, let [math]\displaystyle{ F, R : \mathcal{S} \to \mathcal{S} }[/math] be the Fourier transform and the reflection; i.e., [math]\displaystyle{ (R \varphi)(x) = \varphi(-x) }[/math]. Then, dealing directly with the integral that is involved, one can see [math]\displaystyle{ T = RF^2 }[/math] commutes with coordinates and partial differentiations; hence, [math]\displaystyle{ T }[/math] is a multiplication by a constant. This is almost a proof since one still has to compute this constant.)
A partial converse to the Taylor formula also holds; see Borel's lemma and Whitney extension theorem.
Inverse function theorem and submersion theorem
Inverse function theorem — Let [math]\displaystyle{ f : X \to Y }[/math] be a map between open subsets [math]\displaystyle{ X, Y }[/math] in [math]\displaystyle{ \mathbb{R}^n, \mathbb{R}^m }[/math]. If [math]\displaystyle{ f }[/math] is continuously differentiable (or more generally [math]\displaystyle{ C^k }[/math]) and [math]\displaystyle{ f'(x) }[/math] is bijective, there exist neighborhoods [math]\displaystyle{ U, V }[/math] of [math]\displaystyle{ x, f(x) }[/math] and the inverse [math]\displaystyle{ f^{-1} : V \to U }[/math] that is continuously differentiable (or respectively [math]\displaystyle{ C^k }[/math]).
A [math]\displaystyle{ C^k }[/math]-map with the [math]\displaystyle{ C^k }[/math]-inverse is called a [math]\displaystyle{ C^k }[/math]-diffeomorphism. Thus, the theorem says that, for a map [math]\displaystyle{ f }[/math] satisfying the hypothesis at a point [math]\displaystyle{ x }[/math], [math]\displaystyle{ f }[/math] is a diffeomorphism near [math]\displaystyle{ x, f(x). }[/math] For a proof, see Inverse function theorem § A proof using successive approximation.
The implicit function theorem says:[9] given a map [math]\displaystyle{ f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m }[/math], if [math]\displaystyle{ f(a, b) = 0 }[/math], [math]\displaystyle{ f }[/math] is [math]\displaystyle{ C^k }[/math] in a neighborhood of [math]\displaystyle{ (a, b) }[/math] and the derivative of [math]\displaystyle{ y \mapsto f(a, y) }[/math] at [math]\displaystyle{ b }[/math] is invertible, then there exists a differentiable map [math]\displaystyle{ g : U \to V }[/math] for some neighborhoods [math]\displaystyle{ U, V }[/math] of [math]\displaystyle{ a, b }[/math] such that [math]\displaystyle{ f(x, g(x)) = 0 }[/math]. The theorem follows from the inverse function theorem; see Inverse function theorem § Implicit function theorem.
Another consequence is the submersion theorem.
Integrable functions on Euclidean spaces
A partition of an interval [math]\displaystyle{ [a, b] }[/math] is a finite sequence [math]\displaystyle{ a = t_0 \le t_1 \le \cdots \le t_k = b }[/math]. A partition [math]\displaystyle{ P }[/math] of a rectangle [math]\displaystyle{ D }[/math] (product of intervals) in [math]\displaystyle{ \mathbb{R}^n }[/math] then consists of partitions of the sides of [math]\displaystyle{ D }[/math]; i.e., if [math]\displaystyle{ D = \prod_1^n [a_i, b_i] }[/math], then [math]\displaystyle{ P }[/math] consists of [math]\displaystyle{ P_1, \dots, P_n }[/math] such that [math]\displaystyle{ P_i }[/math] is a partition of [math]\displaystyle{ [a_i, b_i] }[/math].[10]
Given a function [math]\displaystyle{ f }[/math] on [math]\displaystyle{ D }[/math], we then define the upper Riemann sum of it as:
- [math]\displaystyle{ U(f, P) = \sum_{Q \in P} (\sup_Q f) \operatorname{vol}(Q) }[/math]
where
- [math]\displaystyle{ Q }[/math] is a partition element of [math]\displaystyle{ P }[/math]; i.e., [math]\displaystyle{ Q = \prod_{i = 1}^n [t_{i, j_i}, t_{i, j_i+1}] }[/math] when [math]\displaystyle{ P_i : a_i = t_{i, 0} \le \cdots \le t_{i, k_i} = b_i }[/math] is a partition of [math]\displaystyle{ [a_i, b_i] }[/math].[11]
- The volume [math]\displaystyle{ \operatorname{vol}(Q) }[/math] of [math]\displaystyle{ Q }[/math] is the usual Euclidean volume; i.e., [math]\displaystyle{ \operatorname{vol}(Q) = \prod_{i = 1}^n (t_{i, j_i+1} - t_{i, j_i}) }[/math].
The lower Riemann sum [math]\displaystyle{ L(f, P) }[/math] of [math]\displaystyle{ f }[/math] is then defined by replacing [math]\displaystyle{ \sup }[/math] by [math]\displaystyle{ \inf }[/math]. Finally, the function [math]\displaystyle{ f }[/math] is called integrable if it is bounded and [math]\displaystyle{ \sup \{ L(f, P) \mid P \} = \inf \{ U(f, P) \mid P \} }[/math]. In that case, the common value is denoted as [math]\displaystyle{ \int_D f \, dx }[/math].[12]
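The definitions of the upper and lower Riemann sums can be made concrete with a short sketch. The function f(x, y) = xy on the unit square and the uniform partitions are illustrative assumptions. Because this f is increasing in each variable on the square, the supremum and infimum over each partition element are attained at corners, which keeps the code simple.

```python
def riemann_sums(k):
    # Uniform partition of [0,1]^2 into k*k squares for f(x, y) = x*y.
    # f is increasing in each variable here, so on each square the sup is
    # at the upper-right corner and the inf at the lower-left corner.
    vol = (1.0 / k) ** 2
    lower = upper = 0.0
    for i in range(k):
        for j in range(k):
            lower += (i / k) * (j / k) * vol
            upper += ((i + 1) / k) * ((j + 1) / k) * vol
    return lower, upper

results = {k: riemann_sums(k) for k in (4, 16, 64)}
for k, (lower, upper) in results.items():
    print(k, lower, upper)
```

As the partition is refined, the lower and upper sums squeeze toward the common value 1/4, the integral of xy over the unit square.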
A subset of [math]\displaystyle{ \mathbb{R}^n }[/math] is said to have measure zero if for each [math]\displaystyle{ \epsilon \gt 0 }[/math], there are countably many (possibly infinitely many) rectangles [math]\displaystyle{ D_1, D_2, \dots, }[/math] whose union contains the set and such that [math]\displaystyle{ \sum_i \operatorname{vol}(D_i) \lt \epsilon. }[/math][13]
A key theorem is
Theorem — [14] A bounded function [math]\displaystyle{ f }[/math] on a closed rectangle is integrable if and only if the set [math]\displaystyle{ \{ x | f \text{ is not continuous at } x \} }[/math] has measure zero.
The next theorem allows us to compute the integral of a function as an iteration of integrals of the function in one variable:
Fubini's theorem — If [math]\displaystyle{ f }[/math] is a continuous function on a closed rectangle [math]\displaystyle{ D = \prod [a_i, b_i] }[/math] (in fact, this assumption is too strong), then
- [math]\displaystyle{ \int_D f \, dx = \int_{a_n}^{b_n} \cdots \left( \int_{a_1}^{b_1} f(x_1, \dots, x_n) dx_1 \right) dx_2 \cdots dx_n. }[/math]
In particular, the order of integrations can be changed.
Finally, if [math]\displaystyle{ M \subset \mathbb{R}^n }[/math] is a bounded open subset and [math]\displaystyle{ f }[/math] a function on [math]\displaystyle{ M }[/math], then we define [math]\displaystyle{ \int_M f \, dx := \int_D \chi_M f \, dx }[/math] where [math]\displaystyle{ D }[/math] is a closed rectangle containing [math]\displaystyle{ M }[/math] and [math]\displaystyle{ \chi_M }[/math] is the characteristic function on [math]\displaystyle{ M }[/math]; i.e., [math]\displaystyle{ \chi_M(x) = 1 }[/math] if [math]\displaystyle{ x \in M }[/math] and [math]\displaystyle{ =0 }[/math] if [math]\displaystyle{ x \not\in M, }[/math] provided [math]\displaystyle{ \chi_M f }[/math] is integrable.[15]
Surface integral
If a bounded surface [math]\displaystyle{ M }[/math] in [math]\displaystyle{ \mathbb{R}^3 }[/math] is parametrized by [math]\displaystyle{ \textbf{r} = \textbf{r}(u, v) }[/math] with domain [math]\displaystyle{ D }[/math], then the surface integral of a measurable function [math]\displaystyle{ F }[/math] on [math]\displaystyle{ M }[/math] is defined and denoted as:
- [math]\displaystyle{ \int_M F \, dS := \int \int_D (F \circ \textbf{r}) | \textbf{r}_u \times \textbf{r}_v | \, du dv }[/math]
If [math]\displaystyle{ F : M \to \mathbb{R}^3 }[/math] is vector-valued, then we define
- [math]\displaystyle{ \int_M F \cdot dS := \int_M (F \cdot \textbf{n}) \, dS }[/math]
where [math]\displaystyle{ \textbf{n} }[/math] is an outward unit normal vector to [math]\displaystyle{ M }[/math]. Since [math]\displaystyle{ \textbf{n} = \frac{\textbf{r}_u \times \textbf{r}_v}{|\textbf{r}_u \times \textbf{r}_v|} }[/math], we have:
- [math]\displaystyle{ \int_M F \cdot dS = \int \int_D (F \circ \textbf{r}) \cdot (\textbf{r}_u \times \textbf{r}_v) \, du dv = \int \int_D \det(F \circ \textbf{r}, \textbf{r}_u, \textbf{r}_v) \, dudv. }[/math]
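The determinant form of the flux integral lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the unit sphere with the spherical parametrization `r(u, v)`, the field F(x, y, z) = (x, y, z), and a midpoint rule on a uniform grid are all choices made here. For this field the flux through the unit sphere is 4π.

```python
import math

def r(u, v):
    # Spherical parametrization of the unit sphere, 0 <= u <= pi, 0 <= v <= 2*pi.
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

def r_u(u, v):
    return (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u))

def r_v(u, v):
    return (-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0)

def det3(a, b, c):
    # Determinant of the matrix with rows a, b, c, equal to a . (b x c).
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def flux(F, n_u=200, n_v=200):
    # Midpoint rule for the double integral of det(F o r, r_u, r_v) du dv.
    du = math.pi / n_u
    dv = 2 * math.pi / n_v
    total = 0.0
    for i in range(n_u):
        u = (i + 0.5) * du
        for j in range(n_v):
            v = (j + 0.5) * dv
            total += det3(F(r(u, v)), r_u(u, v), r_v(u, v)) * du * dv
    return total

value = flux(lambda p: p)   # F(x, y, z) = (x, y, z)
print(value, 4 * math.pi)
```

With this parametrization the cross product of the two partial derivatives points outward, so no sign correction is needed.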
Vector analysis
Tangent vectors and vector fields
Let [math]\displaystyle{ c : [0, 1] \to \mathbb{R}^n }[/math] be a differentiable curve. Then the tangent vector to the curve [math]\displaystyle{ c }[/math] at [math]\displaystyle{ t }[/math] is a vector [math]\displaystyle{ v }[/math] at the point [math]\displaystyle{ c(t) }[/math] whose components are given as:
- [math]\displaystyle{ v = (c_1'(t), \dots, c_n'(t)) }[/math].[16]
For example, if [math]\displaystyle{ c(t) = (a \cos(t), a \sin(t), bt), a \gt 0, b \gt 0 }[/math] is a helix, then the tangent vector at t is:
- [math]\displaystyle{ c'(t) = (-a \sin(t), a \cos(t), b). }[/math]
It corresponds to the intuition that a point on the helix moves up at a constant speed.
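A few lines of code make the helix example concrete. This is a minimal sketch with illustrative parameter values a = 2 and b = 0.5 (assumptions, not from the text): the vertical component of the tangent vector is always b, and the length of the tangent vector is the same at every t.

```python
import math

a, b = 2.0, 0.5   # illustrative helix parameters

def helix_tangent(t):
    # c'(t) for c(t) = (a cos t, a sin t, b t).
    return (-a * math.sin(t), a * math.cos(t), b)

samples = (0.0, 1.0, 2.5, 7.0)
vertical = [helix_tangent(t)[2] for t in samples]
speeds = [math.sqrt(sum(c * c for c in helix_tangent(t))) for t in samples]
print(vertical, speeds)
```

Every speed equals the square root of a^2 + b^2, since the squared sine and cosine terms add up to a^2.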
If [math]\displaystyle{ M \subset \mathbb{R}^n }[/math] is a differentiable curve or surface, then the tangent space to [math]\displaystyle{ M }[/math] at a point p is the set of all tangent vectors to the differentiable curves [math]\displaystyle{ c: [0, 1] \to M }[/math] with [math]\displaystyle{ c(0) = p }[/math].
A vector field X is an assignment to each point p in M of a tangent vector [math]\displaystyle{ X_p }[/math] to M at p such that the assignment varies smoothly.
Differential forms
The dual notion of a vector field is a differential form. Given an open subset [math]\displaystyle{ M }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math], by definition, a differential 1-form (often just 1-form) [math]\displaystyle{ \omega }[/math] is an assignment to each point [math]\displaystyle{ p }[/math] in [math]\displaystyle{ M }[/math] of a linear functional [math]\displaystyle{ \omega_p }[/math] on the tangent space [math]\displaystyle{ T_p M }[/math] to [math]\displaystyle{ M }[/math] at [math]\displaystyle{ p }[/math] such that the assignment varies smoothly. For a (real or complex-valued) smooth function [math]\displaystyle{ f }[/math], define the 1-form [math]\displaystyle{ df }[/math] by: for a tangent vector [math]\displaystyle{ v }[/math] at [math]\displaystyle{ p }[/math],
- [math]\displaystyle{ df_p(v) = v(f) }[/math]
where [math]\displaystyle{ v(f) }[/math] denotes the directional derivative of [math]\displaystyle{ f }[/math] in the direction [math]\displaystyle{ v }[/math] at [math]\displaystyle{ p }[/math].[17] For example, if [math]\displaystyle{ x_i }[/math] is the [math]\displaystyle{ i }[/math]-th coordinate function, then [math]\displaystyle{ dx_{i, p}(v) = v_i }[/math]; i.e., [math]\displaystyle{ dx_{i,p} }[/math] are the dual basis to the standard basis on [math]\displaystyle{ T_p M }[/math]. Then every differential 1-form [math]\displaystyle{ \omega }[/math] can be written uniquely as
- [math]\displaystyle{ \omega = f_1 \, dx_1 + \cdots + f_n \, dx_n }[/math]
for some smooth functions [math]\displaystyle{ f_1, \dots, f_n }[/math] on [math]\displaystyle{ M }[/math] (since, for every point [math]\displaystyle{ p }[/math], the linear functional [math]\displaystyle{ \omega_p }[/math] is a unique linear combination of [math]\displaystyle{ dx_i }[/math] over real numbers). More generally, a differential k-form is an assignment to each point [math]\displaystyle{ p }[/math] in [math]\displaystyle{ M }[/math] of a vector [math]\displaystyle{ \omega_p }[/math] in the [math]\displaystyle{ k }[/math]-th exterior power [math]\displaystyle{ \bigwedge^k T^*_p M }[/math] of the dual space [math]\displaystyle{ T^*_p M }[/math] of [math]\displaystyle{ T_p M }[/math] such that the assignment varies smoothly.[17] In particular, a 0-form is the same as a smooth function. Also, any [math]\displaystyle{ k }[/math]-form [math]\displaystyle{ \omega }[/math] can be written uniquely as:
- [math]\displaystyle{ \omega = \sum_{i_1 \lt \cdots \lt i_k} f_{i_1 \dots i_k} \, dx_{i_1} \wedge \cdots \wedge dx_{i_k} }[/math]
for some smooth functions [math]\displaystyle{ f_{i_1 \dots i_k} }[/math].[17]
Like a smooth function, we can differentiate and integrate differential forms. If [math]\displaystyle{ f }[/math] is a smooth function, then [math]\displaystyle{ df }[/math] can be written as:[18]
- [math]\displaystyle{ df = \sum_{i=1}^n \frac{\partial f}{\partial x_i} \, dx_i }[/math]
since, for [math]\displaystyle{ v = \partial / \partial x_j |_p }[/math], we have: [math]\displaystyle{ df_p(v) = \frac{\partial f}{\partial x_j}(p) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(p) \, dx_i(v) }[/math]. Note that, in the above expression, the left-hand side (whence the right-hand side) is independent of coordinates [math]\displaystyle{ x_1, \dots, x_n }[/math]; this property is called the invariance of differential.
The operation [math]\displaystyle{ d }[/math] is called the exterior derivative and it extends to any differential forms inductively by the requirement (Leibniz rule)
- [math]\displaystyle{ d(\alpha \wedge \beta) = d \alpha \wedge \beta + (-1)^p \alpha \wedge d \beta }[/math]
where [math]\displaystyle{ \alpha }[/math] is a p-form and [math]\displaystyle{ \beta }[/math] is a q-form.
The exterior derivative has the important property that [math]\displaystyle{ d \circ d = 0 }[/math]; that is, the exterior derivative [math]\displaystyle{ d }[/math] of a differential form [math]\displaystyle{ d \omega }[/math] is zero. This property is a consequence of the symmetry of second derivatives (mixed partials are equal).
Boundary and orientation
A circle can be oriented clockwise or counterclockwise. Mathematically, we say that a subset [math]\displaystyle{ M }[/math] of [math]\displaystyle{ \mathbb{R}^n }[/math] is oriented if there is a consistent choice of normal vectors to [math]\displaystyle{ M }[/math] that varies continuously. For example, a circle or, more generally, an n-sphere can be oriented; i.e., it is orientable. On the other hand, a Möbius strip (a surface obtained by identifying two opposite sides of a rectangle in a twisted way) cannot be oriented: if we start with a normal vector and travel around the strip, the normal vector at the end will point in the opposite direction.
Proposition — A bounded differentiable region [math]\displaystyle{ M }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math] of dimension [math]\displaystyle{ k }[/math] is oriented if and only if there exists a nowhere-vanishing [math]\displaystyle{ k }[/math]-form on [math]\displaystyle{ M }[/math] (called a volume form).
The proposition is useful because it allows us to give an orientation by giving a volume form.
Integration of differential forms
If [math]\displaystyle{ \omega = f \, dx_1 \wedge \cdots \wedge dx_n }[/math] is a differential n-form on an open subset M of [math]\displaystyle{ \mathbb{R}^n }[/math] (any n-form is of that form), then its integral over [math]\displaystyle{ M }[/math] with the standard orientation is defined as:
- [math]\displaystyle{ \int_M \omega = \int_M f \, dx_1 \cdots dx_n. }[/math]
If M is given the orientation opposite to the standard one, then [math]\displaystyle{ \int_M \omega }[/math] is defined as the negative of the right-hand side.
Then we have the fundamental formula relating exterior derivative and integration:
Stokes' formula — For a bounded region [math]\displaystyle{ M }[/math] in [math]\displaystyle{ \mathbb{R}^n }[/math] of dimension [math]\displaystyle{ k }[/math] whose boundary is a union of finitely many [math]\displaystyle{ C^1 }[/math]-subsets, if [math]\displaystyle{ M }[/math] is oriented, then
- [math]\displaystyle{ \int_{\partial M} \omega = \int_M d\omega }[/math]
for any differential [math]\displaystyle{ (k-1) }[/math]-form [math]\displaystyle{ \omega }[/math] on the boundary [math]\displaystyle{ \partial M }[/math] of [math]\displaystyle{ M }[/math].
Here is a sketch of the proof of the formula.[19] If [math]\displaystyle{ f }[/math] is a smooth function on [math]\displaystyle{ \mathbb{R}^n }[/math] with compact support, then we have:
- [math]\displaystyle{ \int d(f \omega) = 0 }[/math]
(since, by the fundamental theorem of calculus, the above can be evaluated on boundaries of the set containing the support.) On the other hand,
- [math]\displaystyle{ \int d(f \omega) = \int df \wedge \omega + \int f \, d\omega. }[/math]
Let [math]\displaystyle{ f }[/math] approach the characteristic function on [math]\displaystyle{ M }[/math]. Then the second term on the right goes to [math]\displaystyle{ \int_M d \omega }[/math] while the first goes to [math]\displaystyle{ -\int_{\partial M} \omega }[/math], by an argument similar to the proof of the fundamental theorem of calculus. [math]\displaystyle{ \square }[/math]
The formula generalizes the fundamental theorem of calculus as well as Stokes' theorem in multivariable calculus. Indeed, if [math]\displaystyle{ M = [a, b] }[/math] is an interval and [math]\displaystyle{ \omega = f }[/math], then [math]\displaystyle{ d\omega = f' \, dx }[/math] and the formula says:
- [math]\displaystyle{ \int_M f' \, dx = f(b) - f(a) }[/math].
Similarly, if [math]\displaystyle{ M }[/math] is an oriented bounded surface in [math]\displaystyle{ \mathbb{R}^3 }[/math] and [math]\displaystyle{ \omega = f\,dx + g\,dy + h\,dz }[/math], then [math]\displaystyle{ d(f\,dx) = df \wedge dx = \frac{\partial f}{\partial y} \, dy \wedge dx + \frac{\partial f}{\partial z} \,dz \wedge dx }[/math] and similarly for [math]\displaystyle{ d(g\,dy) }[/math] and [math]\displaystyle{ d(h\,dz) }[/math]. Collecting the terms, we thus get:
- [math]\displaystyle{ d\omega = \left( \frac{\partial h}{\partial y} - \frac{\partial g}{\partial z} \right) dy \wedge dz + \left( \frac{\partial f}{\partial z} - \frac{\partial h}{\partial x} \right) dz \wedge dx + \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx \wedge dy. }[/math]
Then, from the definition of the integration of [math]\displaystyle{ \omega }[/math], we have [math]\displaystyle{ \int_M d \omega = \int_M (\nabla \times F) \cdot dS }[/math] where [math]\displaystyle{ F = (f, g, h) }[/math] is the vector-valued function and [math]\displaystyle{ \nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) }[/math]. Hence, Stokes’ formula becomes
- [math]\displaystyle{ \int_M (\nabla \times F) \cdot dS = \int_{\partial M} (f\,dx + g\,dy + h\,dz), }[/math]
which is the usual form of Stokes' theorem on surfaces. Green’s theorem is also a special case of Stokes’ formula.
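The planar case (Green's theorem) lends itself to a quick numerical check. The sketch below is illustrative only: it assumes the region is the unit disk with the counterclockwise boundary orientation, uses the midpoint rule, and takes the hypothetical form [math]\displaystyle{ \omega = -y^3 \, dx + x^3 \, dy }[/math], for which [math]\displaystyle{ d\omega = 3(x^2 + y^2) \, dx \wedge dy }[/math]. The function names are not from the source.

```python
import math

def boundary_integral(P, Q, n=2000):
    """Integrate P dx + Q dy counterclockwise around the unit circle (midpoint rule)."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += P(x, y) * dx + Q(x, y) * dy
    return total

def interior_integral(coeff, n_r=400, n_t=400):
    """Integrate coeff(x, y) dx dy over the unit disk, using polar coordinates."""
    total = 0.0
    dr, dt = 1.0 / n_r, 2 * math.pi / n_t
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += coeff(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

# omega = -y^3 dx + x^3 dy, so d(omega) = 3 (x^2 + y^2) dx ^ dy.
lhs = boundary_integral(lambda x, y: -y ** 3, lambda x, y: x ** 3)
rhs = interior_integral(lambda x, y: 3 * (x * x + y * y))
```

Both `lhs` and `rhs` approximate the same number, which a direct computation in polar coordinates shows to be [math]\displaystyle{ 3\pi/2 }[/math].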
Stokes' formula also yields a general version of Cauchy's integral formula. To state and prove it, for the complex variable [math]\displaystyle{ z = x + iy }[/math] and the conjugate [math]\displaystyle{ \bar z }[/math], let us introduce the operators
- [math]\displaystyle{ \frac{\partial}{\partial z} = \frac{1}{2}\left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right), \, \frac{\partial}{\partial \bar{z}} = \frac{1}{2}\left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right). }[/math]
In these notations, a function [math]\displaystyle{ f }[/math] is holomorphic (complex-analytic) if and only if [math]\displaystyle{ \frac{\partial f}{\partial \bar z} = 0 }[/math] (the Cauchy–Riemann equations). Also, we have:
- [math]\displaystyle{ df = \frac{\partial f}{\partial z}dz + \frac{\partial f}{\partial \bar{z}}d \bar{z}. }[/math]
Let [math]\displaystyle{ D_{\epsilon} = \{ z \in \mathbb{C} \mid \epsilon \lt |z - z_0| \lt r \} }[/math] be a punctured disk with center [math]\displaystyle{ z_0 }[/math]. Since [math]\displaystyle{ 1/(z - z_0) }[/math] is holomorphic on [math]\displaystyle{ D_{\epsilon} }[/math], we have:
- [math]\displaystyle{ d \left( \frac{f}{z-z_0} dz \right) = \frac{\partial f}{\partial \bar z} \frac{d \bar{z} \wedge dz}{z - z_0} }[/math].
By Stokes’ formula,
- [math]\displaystyle{ \int_{D_{\epsilon}} \frac{\partial f}{\partial \bar z} \frac{d \bar{z} \wedge dz}{z - z_0} = \left( \int_{|z - z_0| = r} - \int_{|z - z_0| = \epsilon} \right) \frac{f}{z-z_0} dz. }[/math]
Letting [math]\displaystyle{ \epsilon \to 0 }[/math] we then get:[20][21]
- [math]\displaystyle{ 2\pi i \, f(z_0) = \int_{|z - z_0| = r} \frac{f}{z-z_0} dz + \int_{|z - z_0| \le r} \frac{\partial f}{\partial \bar z} \frac{dz \wedge d \bar z}{z - z_0}. }[/math]
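When [math]\displaystyle{ f }[/math] is holomorphic, the area term vanishes and the formula reduces to the classical Cauchy integral formula. The following sketch checks that special case numerically; the choice [math]\displaystyle{ f = \exp }[/math], the point `z0`, the radius `r`, and the helper `circle_integral` are assumptions made for illustration.

```python
import cmath
import math

def circle_integral(g, z0, r, n=2000):
    """Approximate the integral of g(z) dz over |z - z0| = r, counterclockwise."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        z = z0 + r * cmath.exp(1j * t)
        dz = 1j * r * cmath.exp(1j * t) * dt
        total += g(z) * dz
    return total

f = cmath.exp            # holomorphic, so the d-bar term in the formula is zero
z0 = 0.3 + 0.2j
r = 1.0

lhs = 2j * math.pi * f(z0)
rhs = circle_integral(lambda z: f(z) / (z - z0), z0, r)
```

For a non-holomorphic [math]\displaystyle{ f }[/math], the area integral of [math]\displaystyle{ \partial f / \partial \bar z }[/math] would have to be added to `rhs`; that case is not covered by this sketch.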
Winding numbers and Poincaré lemma
A differential form [math]\displaystyle{ \omega }[/math] is called closed if [math]\displaystyle{ d\omega = 0 }[/math] and is called exact if [math]\displaystyle{ \omega = d\eta }[/math] for some differential form [math]\displaystyle{ \eta }[/math] (often called a potential). Since [math]\displaystyle{ d \circ d = 0 }[/math], an exact form is closed. But the converse does not hold in general; there might be a non-exact closed form. A classic example of such a form is:[22]
- [math]\displaystyle{ \omega = \frac{-y}{x^2 + y^2} \, dx + \frac{x}{x^2 + y^2} \, dy }[/math],
which is a differential form on [math]\displaystyle{ \mathbb{R}^2 - 0 }[/math]. Suppose we switch to polar coordinates: [math]\displaystyle{ x = r \cos \theta, y = r \sin \theta }[/math] where [math]\displaystyle{ r = \sqrt{x^2 + y^2} }[/math]. Then
- [math]\displaystyle{ \omega = r^{-2}(-r \sin \theta \, dx + r \cos \theta \, dy) = d \theta. }[/math]
This does not show that [math]\displaystyle{ \omega }[/math] is exact: the trouble is that [math]\displaystyle{ \theta }[/math] is not a well-defined continuous function on [math]\displaystyle{ \mathbb{R}^2 - 0 }[/math]. Since any function [math]\displaystyle{ f }[/math] on [math]\displaystyle{ \mathbb{R}^2 - 0 }[/math] with [math]\displaystyle{ df = \omega }[/math] would differ from [math]\displaystyle{ \theta }[/math] by a constant, this means that [math]\displaystyle{ \omega }[/math] is not exact. The calculation, however, shows that [math]\displaystyle{ \omega }[/math] is exact, for example, on [math]\displaystyle{ \mathbb{R}^2 - \{ x = 0 \} }[/math] since we can take [math]\displaystyle{ \theta = \arctan(y/x) }[/math] there.
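Another way to see that the form [math]\displaystyle{ (-y \, dx + x \, dy)/(x^2 + y^2) }[/math] is not exact on [math]\displaystyle{ \mathbb{R}^2 - 0 }[/math] is to integrate it around a loop: an exact form integrates to zero over every closed curve, whereas this form picks up [math]\displaystyle{ 2\pi }[/math] for each turn around the origin (the winding number). The sketch below is a rough numerical illustration; the function `integrate_omega` and the three sample loops are hypothetical names introduced here.

```python
import math

def integrate_omega(curve, n=4000):
    """Approximate the integral of (-y dx + x dy)/(x^2 + y^2) along curve(t), 0 <= t <= 1."""
    total = 0.0
    x0, y0 = curve(0.0)
    for k in range(1, n + 1):
        x1, y1 = curve(k / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # midpoint of the segment
        total += (-ym * (x1 - x0) + xm * (y1 - y0)) / (xm * xm + ym * ym)
        x0, y0 = x1, y1
    return total

unit_circle = lambda t: (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
twice_around = lambda t: (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))
away_from_origin = lambda t: (3 + math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

around_once = integrate_omega(unit_circle)        # close to 2*pi
around_twice = integrate_omega(twice_around)      # close to 4*pi
not_around = integrate_omega(away_from_origin)    # close to 0
```

Dividing each result by [math]\displaystyle{ 2\pi }[/math] and rounding recovers the number of times the loop winds around the origin.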
There is a result (Poincaré lemma) that gives a condition that guarantees closed forms are exact. To state it, we need some notions from topology. Given two continuous maps [math]\displaystyle{ f, g : X \to Y }[/math] between subsets of [math]\displaystyle{ \mathbb{R}^m, \mathbb{R}^n }[/math] (or more generally topological spaces), a homotopy from [math]\displaystyle{ f }[/math] to [math]\displaystyle{ g }[/math] is a continuous function [math]\displaystyle{ H : X \times [0, 1] \to Y }[/math] such that [math]\displaystyle{ f(x) = H(x, 0) }[/math] and [math]\displaystyle{ g(x) = H(x, 1) }[/math]. Intuitively, a homotopy is a continuous variation of one function to another. A loop in a set [math]\displaystyle{ X }[/math] is a curve whose starting point coincides with the end point; i.e., [math]\displaystyle{ c : [0, 1] \to X }[/math] such that [math]\displaystyle{ c(0) = c(1) }[/math]. Then a subset of [math]\displaystyle{ \mathbb{R}^n }[/math] is called simply connected if every loop is homotopic to a constant function. A typical example of a simply connected set is a disk [math]\displaystyle{ D = \{ (x, y) \mid \sqrt{x^2 + y^2} \le r \} \subset \mathbb{R}^2 }[/math]. Indeed, given a loop [math]\displaystyle{ c : [0, 1] \to D }[/math], we have the homotopy [math]\displaystyle{ H : [0, 1]^2 \to D, \, H(x, t) = (1-t) c(x) + t c(0) }[/math] from [math]\displaystyle{ c }[/math] to the constant function [math]\displaystyle{ c(0) }[/math]. A punctured disk, on the other hand, is not simply connected.
Poincaré lemma — If [math]\displaystyle{ M }[/math] is a simply connected open subset of [math]\displaystyle{ \mathbb{R}^n }[/math], then each closed 1-form on [math]\displaystyle{ M }[/math] is exact.
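On a disk centered at the origin (or any region star-shaped about the origin), a potential for a closed 1-form can be written down by integrating along straight segments from the origin. The sketch below illustrates this special case only, not the general simply connected case; the name `potential` and the sample form [math]\displaystyle{ 2xy \, dx + x^2 \, dy }[/math] are assumptions for illustration.

```python
def potential(P, Q, x, y, n=1000):
    """Integrate P dx + Q dy along the straight segment from (0, 0) to (x, y)."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n                       # midpoint rule in the parameter t
        total += (P(t * x, t * y) * x + Q(t * x, t * y) * y) / n
    return total

# Closed form: d(2xy dx + x^2 dy) = (2x - 2x) dx ^ dy = 0.
P = lambda x, y: 2 * x * y
Q = lambda x, y: x * x

f_value = potential(P, Q, 1.5, -2.0)            # compare with x^2 * y at (1.5, -2.0)

# The x-derivative of the potential should reproduce P.
h = 1e-4
dfdx = (potential(P, Q, 1.5 + h, -2.0) - potential(P, Q, 1.5 - h, -2.0)) / (2 * h)
```

Here the potential is [math]\displaystyle{ f(x, y) = x^2 y }[/math], so `f_value` approximates [math]\displaystyle{ 1.5^2 \cdot (-2) }[/math] and `dfdx` approximates [math]\displaystyle{ P(1.5, -2) }[/math].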
Geometry of curves and surfaces
Moving frame
Vector fields [math]\displaystyle{ E_1, E_2, E_3 }[/math] on [math]\displaystyle{ \mathbb{R}^3 }[/math] are called a frame field if they are orthonormal at each point; i.e., [math]\displaystyle{ E_i \cdot E_j = \delta_{ij} }[/math] at each point.[23] The basic example is the standard frame [math]\displaystyle{ U_i }[/math]; i.e., [math]\displaystyle{ U_i(x) }[/math] is the i-th standard basis vector at each point [math]\displaystyle{ x }[/math] in [math]\displaystyle{ \mathbb{R}^3 }[/math]. Another example is the cylindrical frame
- [math]\displaystyle{ E_1 = \cos \theta U_1 + \sin \theta U_2, \, E_2 = -\sin \theta U_1 + \cos \theta U_2, \, E_3 = U_3. }[/math][24]
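For the study of the geometry of a curve, the important frame to use is the Frenet frame [math]\displaystyle{ T, N, B }[/math] on a unit-speed curve [math]\displaystyle{ \beta : I \to \mathbb{R}^3 }[/math] with nonvanishing curvature [math]\displaystyle{ \kappa = \| T' \| }[/math], given as:
- [math]\displaystyle{ T = \beta', \, N = T' / \kappa, \, B = T \times N. }[/math]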
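As a concrete instance, the sketch below computes a Frenet frame for a unit-speed helix using the standard definitions [math]\displaystyle{ T = \beta' }[/math], [math]\displaystyle{ N = T'/\|T'\| }[/math], [math]\displaystyle{ B = T \times N }[/math]. The helix, its parameters `a` and `b`, and the function names are assumptions chosen for illustration, and the derivatives are entered by hand rather than computed symbolically.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors."""
    return sum(p * q for p, q in zip(u, v))

def frenet_frame_helix(s, a=2.0, b=1.0):
    """Frenet frame (T, N, B) and curvature of the unit-speed helix
    beta(s) = (a cos(s/c), a sin(s/c), b s/c), where c = sqrt(a^2 + b^2)."""
    c = math.hypot(a, b)
    T = (-a / c * math.sin(s / c), a / c * math.cos(s / c), b / c)            # beta'(s)
    T_prime = (-a / c ** 2 * math.cos(s / c), -a / c ** 2 * math.sin(s / c), 0.0)
    kappa = math.sqrt(dot(T_prime, T_prime))    # curvature, the length of T'
    N = tuple(x / kappa for x in T_prime)       # principal normal
    B = cross(T, N)                             # binormal
    return T, N, B, kappa

T, N, B, kappa = frenet_frame_helix(0.7)
```

The three vectors returned are orthonormal at every parameter value, so they form a frame field along the curve in the sense above; for this helix the curvature is the constant [math]\displaystyle{ a/(a^2 + b^2) }[/math].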
The Gauss–Bonnet theorem
The Gauss–Bonnet theorem relates the topology of a surface and its geometry.
The Gauss–Bonnet theorem — [25] For each bounded surface [math]\displaystyle{ M }[/math] in [math]\displaystyle{ \mathbb{R}^3 }[/math], we have:
- [math]\displaystyle{ 2\pi \, \chi(M) = \int_M K \, dS }[/math]
where [math]\displaystyle{ \chi(M) }[/math] is the Euler characteristic of [math]\displaystyle{ M }[/math] and [math]\displaystyle{ K }[/math] is the Gaussian curvature.
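The formula can be tested numerically on two familiar closed surfaces. The sketch below assumes the standard parametrizations of a round sphere (Euler characteristic 2) and a torus of revolution (Euler characteristic 0) together with their well-known Gaussian curvatures; the function `total_curvature` and the chosen radii are hypothetical.

```python
import math

def total_curvature(K, area_element, u_range, v_range, n=400):
    """Midpoint-rule approximation of the integral of K dS over a parametrized surface."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += K(u, v) * area_element(u, v) * du * dv
    return total

# Sphere of radius rho: K = 1/rho^2 and dS = rho^2 sin(v) du dv.
rho = 3.0
sphere = total_curvature(lambda u, v: 1 / rho ** 2,
                         lambda u, v: rho ** 2 * math.sin(v),
                         (0, 2 * math.pi), (0, math.pi))

# Torus with radii R > r: K = cos(v) / (r (R + r cos v)) and dS = r (R + r cos v) du dv.
R, r = 2.0, 0.5
torus = total_curvature(lambda u, v: math.cos(v) / (r * (R + r * math.cos(v))),
                        lambda u, v: r * (R + r * math.cos(v)),
                        (0, 2 * math.pi), (0, 2 * math.pi))
```

The value `sphere` approximates [math]\displaystyle{ 4\pi = 2\pi \cdot 2 }[/math] regardless of the radius, and `torus` is approximately zero because the positively curved outer half cancels the negatively curved inner half.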
Calculus of variations
Method of Lagrange multiplier
Lagrange multiplier — [26] Let [math]\displaystyle{ g : U \to \mathbb{R}^r }[/math] be a differentiable function from an open subset of [math]\displaystyle{ \mathbb{R}^n }[/math] such that [math]\displaystyle{ g' }[/math] has rank [math]\displaystyle{ r }[/math] at every point in [math]\displaystyle{ g^{-1}(0) }[/math]. For a differentiable function [math]\displaystyle{ f : \mathbb{R}^n \to \mathbb{R} }[/math], if [math]\displaystyle{ f }[/math] attains either a maximum or minimum at a point [math]\displaystyle{ p }[/math] in [math]\displaystyle{ g^{-1}(0) }[/math], then there exist real numbers [math]\displaystyle{ \lambda_1, \dots, \lambda_r }[/math] such that
- [math]\displaystyle{ \nabla f(p) = \sum_{i=1}^r \lambda_i \nabla g_i(p) }[/math].
In other words, [math]\displaystyle{ p }[/math] is a stationary point of [math]\displaystyle{ f - \sum_1^r \lambda_i g_i }[/math].
The set [math]\displaystyle{ g^{-1}(0) }[/math] is usually called a constraint.
Example:[27] Suppose we want to find the minimum distance between the circle [math]\displaystyle{ x^2 + y^2 = 1 }[/math] and the line [math]\displaystyle{ x + y = 4 }[/math]. That means that we want to minimize the function [math]\displaystyle{ f(x, y, u, v) = (x - u)^2 + (y - v)^2 }[/math], the squared distance between a point [math]\displaystyle{ (x, y) }[/math] on the circle and a point [math]\displaystyle{ (u, v) }[/math] on the line, under the constraint [math]\displaystyle{ g = (x^2 + y^2 - 1, u + v - 4) }[/math]. We have:
- [math]\displaystyle{ \nabla f = (2(x - u), 2(y - v), -2(x - u), -2(y - v)). }[/math]
- [math]\displaystyle{ \nabla g_1 = (2x, 2y, 0, 0), \nabla g_2 = (0, 0, 1, 1). }[/math]
Since the Jacobian matrix of [math]\displaystyle{ g }[/math] has rank 2 everywhere on [math]\displaystyle{ g^{-1}(0) }[/math], the Lagrange multiplier gives:
- [math]\displaystyle{ x - u = \lambda_1 x, \, y - v = \lambda_1 y, \, 2(x-u) = -\lambda_2, \, 2(y-v) = -\lambda_2. }[/math]
If [math]\displaystyle{ \lambda_1 = 0 }[/math], then [math]\displaystyle{ x = u, y = v }[/math], which is not possible because the circle and the line do not intersect. Thus, [math]\displaystyle{ \lambda_1 \ne 0 }[/math] and
- [math]\displaystyle{ x = \frac{x-u}{\lambda_1}, \, y = \frac{y-v}{\lambda_1}. }[/math]
From this, it easily follows that [math]\displaystyle{ x = y = 1/\sqrt{2} }[/math] and [math]\displaystyle{ u = v = 2 }[/math]. Hence, the minimum distance is [math]\displaystyle{ 2\sqrt{2} - 1 }[/math] (as a minimum distance clearly exists).
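The claimed solution can be checked directly. The sketch below verifies that the multiplier condition holds at [math]\displaystyle{ (x, y, u, v) = (1/\sqrt{2}, 1/\sqrt{2}, 2, 2) }[/math] and compares the resulting distance with a brute-force search over the circle; the variable names and the grid size are choices made here for illustration.

```python
import math

x = y = 1 / math.sqrt(2)      # claimed closest point on the circle
u = v = 2.0                   # claimed closest point on the line

grad_f = (2 * (x - u), 2 * (y - v), -2 * (x - u), -2 * (y - v))
grad_g1 = (2 * x, 2 * y, 0.0, 0.0)
grad_g2 = (0.0, 0.0, 1.0, 1.0)

lam1 = (x - u) / x            # from x - u = lambda_1 x
lam2 = -2 * (x - u)           # from 2 (x - u) = -lambda_2

# Largest component of grad f - lambda_1 grad g_1 - lambda_2 grad g_2.
residual = max(abs(gf - lam1 * a - lam2 * b)
               for gf, a, b in zip(grad_f, grad_g1, grad_g2))
distance = math.hypot(x - u, y - v)

# Independent check: distance from a point (cos t, sin t) to the line x + y = 4.
steps = 100000
brute = min(abs(math.cos(t) + math.sin(t) - 4) / math.sqrt(2)
            for t in (2 * math.pi * k / steps for k in range(steps)))
```

Both `distance` and `brute` approximate [math]\displaystyle{ 2\sqrt{2} - 1 }[/math], and `residual` is zero up to rounding.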
Here is an application to linear algebra.[28] Let [math]\displaystyle{ V }[/math] be a finite-dimensional real inner product space and [math]\displaystyle{ T : V \to V }[/math] a self-adjoint operator. We shall show [math]\displaystyle{ V }[/math] has a basis consisting of eigenvectors of [math]\displaystyle{ T }[/math] (i.e., [math]\displaystyle{ T }[/math] is diagonalizable) by induction on the dimension of [math]\displaystyle{ V }[/math]. Choosing an orthonormal basis of [math]\displaystyle{ V }[/math], we can identify [math]\displaystyle{ V = \mathbb{R}^n }[/math], and [math]\displaystyle{ T }[/math] is represented by the symmetric matrix [math]\displaystyle{ [a_{ij}] }[/math]. Consider the function [math]\displaystyle{ f(x) = (Tx, x) }[/math], where the bracket means the inner product. Then [math]\displaystyle{ \nabla f = 2(\sum a_{1i} x_i, \dots, \sum a_{ni} x_i) }[/math]. On the other hand, for [math]\displaystyle{ g = \sum x_i^2 - 1 }[/math], since [math]\displaystyle{ g^{-1}(0) }[/math] is compact, [math]\displaystyle{ f }[/math] attains a maximum or minimum at a point [math]\displaystyle{ u }[/math] in [math]\displaystyle{ g^{-1}(0) }[/math]. Since [math]\displaystyle{ \nabla g = 2(x_1, \dots, x_n) }[/math], by Lagrange multiplier, we find a real number [math]\displaystyle{ \lambda }[/math] such that [math]\displaystyle{ 2 \sum_i a_{ji} u_i = 2 \lambda u_j, 1 \le j \le n. }[/math] But that means [math]\displaystyle{ Tu = \lambda u }[/math]. Let [math]\displaystyle{ W }[/math] be the orthogonal complement of [math]\displaystyle{ u }[/math]; since [math]\displaystyle{ T }[/math] is self-adjoint, [math]\displaystyle{ (Tw, u) = (w, Tu) = \lambda (w, u) = 0 }[/math] for [math]\displaystyle{ w }[/math] in [math]\displaystyle{ W }[/math], so [math]\displaystyle{ T }[/math] maps [math]\displaystyle{ W }[/math] into itself. By the inductive hypothesis, the self-adjoint operator [math]\displaystyle{ T : W \to W }[/math] has a basis consisting of eigenvectors. Hence, we are done. [math]\displaystyle{ \square }[/math]
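The first step of this argument can be seen in a small example. The sketch below takes a hypothetical symmetric 2×2 matrix `A`, maximizes [math]\displaystyle{ (Ax, x) }[/math] over a grid of points on the unit circle, and checks that the maximizer is an eigenvector whose eigenvalue is the maximum value. The grid search is an illustrative stand-in for the compactness argument, not a practical eigenvalue algorithm.

```python
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]          # symmetric, hence self-adjoint for the standard inner product

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def quadratic_form(A, x):
    """The function f(x) = (A x, x)."""
    return sum(p * q for p, q in zip(apply(A, x), x))

n = 3600
candidates = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
              for k in range(n)]
u = max(candidates, key=lambda x: quadratic_form(A, x))   # maximizer on the unit circle
lam = quadratic_form(A, u)                                # equals the multiplier, since |u| = 1
residual = max(abs(t - lam * c) for t, c in zip(apply(A, u), u))
```

Here `lam` approximates the larger eigenvalue 3 of `A`, and `residual` measures how far `u` is from satisfying [math]\displaystyle{ Au = \lambda u }[/math].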
Weak derivatives
Up to measure-zero sets, two functions can be determined to be equal or not by means of integration against other functions (called test functions). Namely, we have the following, sometimes called the fundamental lemma of calculus of variations:
Lemma[29] — If [math]\displaystyle{ f, g }[/math] are locally integrable functions on an open subset [math]\displaystyle{ M \subset \mathbb{R}^n }[/math] such that
- [math]\displaystyle{ \int (f - g) \varphi \, dx = 0 }[/math]
for every [math]\displaystyle{ \varphi \in C_c^{\infty}(M) }[/math] (called a test function), then [math]\displaystyle{ f = g }[/math] almost everywhere. If, in addition, [math]\displaystyle{ f, g }[/math] are continuous, then [math]\displaystyle{ f = g }[/math].
Given a continuous function [math]\displaystyle{ f }[/math], by the lemma, a continuously differentiable function [math]\displaystyle{ u }[/math] satisfies [math]\displaystyle{ \frac{\partial u}{\partial x_i} = f }[/math] if and only if
- [math]\displaystyle{ \int \frac{\partial u}{\partial x_i} \varphi \, dx = \int f \varphi \, dx }[/math]
for every [math]\displaystyle{ \varphi \in C_c^{\infty}(M) }[/math]. But, by integration by parts, the partial derivative on the left-hand side can be moved from [math]\displaystyle{ u }[/math] to [math]\displaystyle{ \varphi }[/math]; i.e.,
- [math]\displaystyle{ -\int u \frac{\partial \varphi}{\partial x_i} \, dx = \int f \varphi \, dx }[/math]
where there is no boundary term since [math]\displaystyle{ \varphi }[/math] has compact support. Now the key point is that this expression makes sense even if [math]\displaystyle{ u }[/math] is not necessarily differentiable and thus can be used to give sense to a derivative of such a function.
Note that each locally integrable function [math]\displaystyle{ u }[/math] defines the linear functional [math]\displaystyle{ \varphi \mapsto \int u \varphi \, dx }[/math] on [math]\displaystyle{ C_c^{\infty}(M) }[/math] and, moreover, each locally integrable function can be identified with such a linear functional, because of the earlier lemma. Hence, quite generally, if [math]\displaystyle{ u }[/math] is a linear functional on [math]\displaystyle{ C_c^{\infty}(M) }[/math], then we define [math]\displaystyle{ \frac{\partial u}{\partial x_i} }[/math] to be the linear functional [math]\displaystyle{ \varphi \mapsto -\left \langle u, \frac{\partial \varphi}{\partial x_i} \right\rangle }[/math] where the bracket means [math]\displaystyle{ \langle \alpha, \varphi \rangle = \alpha(\varphi) }[/math]. It is then called the weak derivative of [math]\displaystyle{ u }[/math] with respect to [math]\displaystyle{ x_i }[/math]. If [math]\displaystyle{ u }[/math] is continuously differentiable, then its weak derivative coincides with the usual one; i.e., the linear functional [math]\displaystyle{ \frac{\partial u}{\partial x_i} }[/math] is the same as the linear functional determined by the usual partial derivative of [math]\displaystyle{ u }[/math] with respect to [math]\displaystyle{ x_i }[/math]. A usual derivative is then often called a classical derivative. When a linear functional on [math]\displaystyle{ C_c^{\infty}(M) }[/math] is continuous with respect to a certain topology on [math]\displaystyle{ C_c^{\infty}(M) }[/math], such a linear functional is called a distribution, an example of a generalized function.
A classic example of a weak derivative is that of the Heaviside function [math]\displaystyle{ H }[/math], the characteristic function of the interval [math]\displaystyle{ (0, \infty) }[/math].[30] For every test function [math]\displaystyle{ \varphi }[/math], we have:
- [math]\displaystyle{ \langle H', \varphi \rangle = -\int_0^{\infty} \varphi' \, dx = \varphi(0). }[/math]
Let [math]\displaystyle{ \delta_a }[/math] denote the linear functional [math]\displaystyle{ \varphi \mapsto \varphi(a) }[/math], called the Dirac delta function (although not exactly a function). Then the above can be written as:
- [math]\displaystyle{ H' = \delta_0. }[/math]
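The pairing [math]\displaystyle{ \langle H', \varphi \rangle = -\int H \varphi' \, dx }[/math] can be evaluated numerically against a concrete test function. The sketch below uses the standard bump function [math]\displaystyle{ \exp(-1/(1 - x^2)) }[/math] on [math]\displaystyle{ (-1, 1) }[/math], shifted by a hypothetical amount `shift` so that it is not symmetric about 0; all names are chosen here for illustration.

```python
import math

def bump(x):
    """Smooth test function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def bump_prime(x):
    """Derivative of bump, obtained from the chain rule."""
    return bump(x) * (-2.0 * x / (1.0 - x * x) ** 2) if abs(x) < 1 else 0.0

def heaviside(x):
    """Characteristic function of (0, infinity)."""
    return 1.0 if x > 0 else 0.0

def pair_with_weak_derivative(u, phi_prime, a, b, n=30000):
    """Midpoint-rule approximation of <u', phi> = -(integral of u * phi' over [a, b])."""
    h = (b - a) / n
    return -h * sum(u(a + (k + 0.5) * h) * phi_prime(a + (k + 0.5) * h)
                    for k in range(n))

shift = 0.3                                  # hypothetical shift of the test function
phi = lambda x: bump(x - shift)
phi_prime = lambda x: bump_prime(x - shift)

# The interval [-1, 2] contains the support [-0.7, 1.3] of phi.
value = pair_with_weak_derivative(heaviside, phi_prime, -1.0, 2.0)
```

The computed `value` agrees with `phi(0.0)` up to discretization error, which is the statement that the weak derivative of the Heaviside function acts on test functions by evaluation at 0.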
Cauchy's integral formula has a similar interpretation in terms of weak derivatives. For the complex variable [math]\displaystyle{ z = x + iy }[/math], let [math]\displaystyle{ E_{z_0}(z) = \frac{1}{\pi (z - z_0)} }[/math]. For a test function [math]\displaystyle{ \varphi }[/math], if the disk [math]\displaystyle{ | z - z_0 | \le r }[/math] contains the support of [math]\displaystyle{ \varphi }[/math], by Cauchy's integral formula, we have:
- [math]\displaystyle{ \varphi(z_0) = {1 \over 2 \pi i} \int \frac{\partial \varphi}{\partial \bar z} \frac{dz \wedge d \bar z}{z - z_0}. }[/math]
Since [math]\displaystyle{ dz \wedge d \bar z = -2i dx \wedge dy }[/math], this means:
- [math]\displaystyle{ \varphi(z_0) = -\int E_{z_0} \frac{\partial \varphi}{\partial \bar z} dxdy = \left\langle \frac{\partial E_{z_0}}{\partial \bar z}, \varphi \right \rangle, }[/math]
or
- [math]\displaystyle{ \frac{\partial E_{z_0}}{\partial \bar z} = \delta_{z_0}. }[/math][31]
In general, a generalized function is called a fundamental solution for a linear partial differential operator if the application of the operator to it is the Dirac delta. Hence, the above says [math]\displaystyle{ E_{z_0} }[/math] is the fundamental solution for the differential operator [math]\displaystyle{ \partial/\partial \bar z }[/math].
Hamilton–Jacobi theory
Calculus on manifolds
Definition of a manifold
- This section requires some background in general topology.
A manifold is a Hausdorff topological space that is locally modeled by a Euclidean space. By definition, an atlas of a topological space [math]\displaystyle{ M }[/math] is a set of maps [math]\displaystyle{ \varphi_i : U_i \to \mathbb{R}^n }[/math], called charts, such that
- [math]\displaystyle{ U_i }[/math] are an open cover of [math]\displaystyle{ M }[/math]; i.e., each [math]\displaystyle{ U_i }[/math] is open and [math]\displaystyle{ M = \cup_i U_i }[/math],
- [math]\displaystyle{ \varphi_i : U_i \to \varphi_i(U_i) }[/math] is a homeomorphism and
- [math]\displaystyle{ \varphi_j \circ \varphi_i^{-1} : \varphi_i(U_i \cap U_j) \to \varphi_j(U_i \cap U_j) }[/math] is smooth; thus a diffeomorphism.
By definition, a manifold is a second-countable Hausdorff topological space with a maximal atlas (called a differentiable structure); "maximal" means that it is not contained in a strictly larger atlas. The dimension of the manifold [math]\displaystyle{ M }[/math] is the dimension of the model Euclidean space [math]\displaystyle{ \mathbb{R}^n }[/math]; namely, [math]\displaystyle{ n }[/math], and a manifold is called an n-manifold when it has dimension n. A function [math]\displaystyle{ f }[/math] on a manifold [math]\displaystyle{ M }[/math] is said to be smooth if [math]\displaystyle{ f|_U \circ \varphi^{-1} }[/math] is smooth on [math]\displaystyle{ \varphi(U) }[/math] for each chart [math]\displaystyle{ \varphi : U \to \mathbb{R}^n }[/math] in the differentiable structure.
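To make the chart and transition-map conditions concrete, consider the unit circle with its two stereographic projections. The sketch below is an illustration under that assumption; the chart names `phi_north` and `phi_south` are hypothetical. It checks numerically that the transition map between the two charts is [math]\displaystyle{ t \mapsto 1/t }[/math], which is smooth on the overlap [math]\displaystyle{ t \ne 0 }[/math].

```python
import math

def phi_north(p):
    """Chart on the circle minus the north pole (0, 1): stereographic projection."""
    x, y = p
    return x / (1 - y)

def phi_south(p):
    """Chart on the circle minus the south pole (0, -1)."""
    x, y = p
    return x / (1 + y)

def phi_north_inverse(t):
    """Inverse of phi_north, mapping the real line onto the circle minus (0, 1)."""
    return (2 * t / (t * t + 1), (t * t - 1) / (t * t + 1))

def transition(t):
    """The transition map phi_south o phi_north^{-1}, defined for t != 0."""
    return phi_south(phi_north_inverse(t))

samples = [-3.0, -0.5, 0.25, 2.0]
errors = [abs(transition(t) - 1 / t) for t in samples]
```

The two chart domains cover the circle, each chart is a homeomorphism onto the real line, and the transition map is smooth, so these two charts form an atlas in the sense above.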
A manifold is paracompact; this has an implication that it admits a partition of unity subordinate to a given open cover.
If [math]\displaystyle{ \mathbb{R}^n }[/math] is replaced by an upper half-space [math]\displaystyle{ \mathbb{H}^n }[/math], then we get the notion of a manifold-with-boundary. The set of points that map to the boundary of [math]\displaystyle{ \mathbb{H}^n }[/math] under charts is denoted by [math]\displaystyle{ \partial M }[/math] and is called the boundary of [math]\displaystyle{ M }[/math]. This boundary may not be the topological boundary of [math]\displaystyle{ M }[/math]. Since the interior of [math]\displaystyle{ \mathbb{H}^n }[/math] is diffeomorphic to [math]\displaystyle{ \mathbb{R}^n }[/math], a manifold is a manifold-with-boundary with empty boundary.
The next theorem furnishes many examples of manifolds.
Theorem — [32] Let [math]\displaystyle{ g: U \to \mathbb{R}^r }[/math] be a differentiable map from an open subset [math]\displaystyle{ U \subset \mathbb{R}^n }[/math] such that [math]\displaystyle{ g'(p) }[/math] has rank [math]\displaystyle{ r }[/math] for every point [math]\displaystyle{ p }[/math] in [math]\displaystyle{ g^{-1}(0) }[/math]. Then the zero set [math]\displaystyle{ g^{-1}(0) }[/math] is an [math]\displaystyle{ (n-r) }[/math]-manifold.
For example, for [math]\displaystyle{ g(x) = x_1^2 + \cdots + x_{n+1}^2 - 1 }[/math], the derivative [math]\displaystyle{ g'(x) = \begin{bmatrix}2 x_1 & 2 x_2 & \cdots & 2 x_{n+1}\end{bmatrix} }[/math] has rank one at every point [math]\displaystyle{ p }[/math] in [math]\displaystyle{ g^{-1}(0) }[/math]. Hence, the n-sphere [math]\displaystyle{ g^{-1}(0) }[/math] is an n-manifold.
The theorem is proved as a corollary of the inverse function theorem.
Many familiar manifolds are subsets of [math]\displaystyle{ \mathbb{R}^n }[/math]. The next theoretically important result says that there is no other kind of manifold. An immersion is a smooth map whose differential is injective. An embedding is an immersion that is a homeomorphism (thus a diffeomorphism) onto its image.
Whitney's embedding theorem — Each [math]\displaystyle{ k }[/math]-manifold can be embedded into [math]\displaystyle{ \mathbb{R}^{2k} }[/math].
The proof that a manifold can be embedded into [math]\displaystyle{ \mathbb{R}^N }[/math] for some N is considerably easier and can be readily given here. It is known[citation needed] that a [math]\displaystyle{ k }[/math]-manifold has a finite atlas [math]\displaystyle{ \{ \varphi_i : U_i \to \mathbb{R}^k \mid 1 \le i \le r \} }[/math]. Let [math]\displaystyle{ \lambda_i }[/math] be smooth functions such that [math]\displaystyle{ \operatorname{Supp}(\lambda_i) \subset U_i }[/math] and [math]\displaystyle{ \{ \lambda_i = 1 \} }[/math] cover [math]\displaystyle{ M }[/math] (e.g., a partition of unity). Consider the map
- [math]\displaystyle{ f = (\lambda_1 \varphi_1, \dots, \lambda_r \varphi_r, \lambda_1, \dots, \lambda_r) : M \to \mathbb{R}^{(k+1)r} }[/math]
It is easy to see that [math]\displaystyle{ f }[/math] is an injective immersion. It may not be an embedding. To fix that, we shall use:
- [math]\displaystyle{ (f, g) : M \to \mathbb{R}^{(k+1)r+1} }[/math]
where [math]\displaystyle{ g }[/math] is a smooth proper map. The existence of a smooth proper map is a consequence of a partition of unity. See [1] for the rest of the proof in the case of an immersion. [math]\displaystyle{ \square }[/math]
Nash's embedding theorem says that, if [math]\displaystyle{ M }[/math] is equipped with a Riemannian metric, then the embedding can be taken to be isometric at the expense of increasing the dimension [math]\displaystyle{ 2k }[/math] of the target space; for this, see T. Tao's blog.
Tubular neighborhood and transversality
A technically important result is:
Tubular neighborhood theorem — Let M be a manifold and [math]\displaystyle{ N \subset M }[/math] a compact closed submanifold. Then there exists a neighborhood [math]\displaystyle{ U }[/math] of [math]\displaystyle{ N }[/math] such that [math]\displaystyle{ U }[/math] is diffeomorphic to the normal bundle [math]\displaystyle{ \nu_N = TM|_N/TN }[/math] to [math]\displaystyle{ i : N \hookrightarrow M }[/math] and [math]\displaystyle{ N }[/math] corresponds to the zero section of [math]\displaystyle{ \nu_N }[/math] under the diffeomorphism.
This can be proved by putting a Riemannian metric on the manifold [math]\displaystyle{ M }[/math]. Indeed, the choice of metric makes the normal bundle [math]\displaystyle{ \nu_N }[/math] a complementary bundle to [math]\displaystyle{ TN }[/math]; i.e., [math]\displaystyle{ TM|_N }[/math] is the direct sum of [math]\displaystyle{ TN }[/math] and [math]\displaystyle{ \nu_N }[/math]. Then, using the metric, we have the exponential map [math]\displaystyle{ \exp : U \to V }[/math] from some neighborhood [math]\displaystyle{ U }[/math] of [math]\displaystyle{ N }[/math] in the normal bundle [math]\displaystyle{ \nu_N }[/math] to some neighborhood [math]\displaystyle{ V }[/math] of [math]\displaystyle{ N }[/math] in [math]\displaystyle{ M }[/math]. The exponential map here may not be injective, but it is possible to make it injective (thus a diffeomorphism) by shrinking [math]\displaystyle{ U }[/math] (for now, see [2]).
Integration on manifolds and distribution densities
The starting point for the topic of integration on manifolds is that there is no invariant way to integrate functions on manifolds. This becomes apparent if we ask: what is the integral of a function on a finite-dimensional real vector space? (In contrast, there is an invariant way to do differentiation since, by definition, a manifold comes with a differentiable structure). There are several ways to introduce integration theory to manifolds:
- Integrate differential forms.
- Do integration against some measure.
- Equip a manifold with a Riemannian metric and do integration against such a metric.
For example, if a manifold is embedded into a Euclidean space [math]\displaystyle{ \mathbb{R}^n }[/math], then it acquires the Lebesgue measure restricted from the ambient Euclidean space and then the second approach works. The first approach is fine in many situations but it requires the manifold to be oriented (and there are non-orientable manifolds that are not pathological). The third approach generalizes, and that gives rise to the notion of a density.
Generalizations
Extensions to infinite-dimensional normed spaces
Notions such as differentiability extend to normed spaces.
See also
Notes
- ↑ This is just the tensor-hom adjunction.
Citations
- ↑ Spivak 1965, Ch 2. Basic definitions.
- ↑ Hörmander 2015, Definition 1.1.4.
- ↑ Hörmander 2015, (1.1.3.)
- ↑ Hörmander 2015, Theorem 1.1.6.
- ↑ Hörmander 2015, (1.1.2)'
- ↑ Hörmander 2015, p. 8
- ↑ 7.0 7.1 Hörmander 2015, Theorem 1.1.8.
- ↑ Hörmander 2015, Lemma 7.1.4.
- ↑ Spivak 1965, Theorem 2-12.
- ↑ Spivak 1965, p. 46
- ↑ Spivak 1965, p. 47
- ↑ Spivak 1965, p. 48
- ↑ Spivak 1965, p. 50
- ↑ Spivak 1965, Theorem 3-8.
- ↑ Spivak 1965, p. 55
- ↑ Spivak 1965, Exercise 4.14.
- ↑ 17.0 17.1 17.2 Spivak 1965, p. 89
- ↑ Spivak 1965, Theorem 4-7.
- ↑ Hörmander 2015, p. 151
- ↑ Theorem 1.2.1. in Hörmander, Lars (1990). An Introduction to Complex Analysis in Several Variables (Third ed.). North Holland.
- ↑ Spivak 1965, Exercise 4-33.
- ↑ Spivak 1965, p. 93
- ↑ O'Neill 2006, Definition 6.1.
- ↑ O'Neill 2006, Example 6.2. (1)
- ↑ O'Neill 2006, Theorem 6.10.
- ↑ Spivak 1965, Exercise 5-16.
- ↑ Edwards 1994, Ch. II, § 5. Example 9.
- ↑ Spivak 1965, Exercise 5-17.
- ↑ Hörmander 2015, Theorem 1.2.5.
- ↑ Hörmander 2015, Example 3.1.2.
- ↑ Hörmander 2015, p. 63
- ↑ Spivak 1965, Theorem 5-1.
References
- do Carmo, Manfredo P. (1976), Differential Geometry of Curves and Surfaces, Prentice-Hall, ISBN 978-0-13-212589-5
- Edwards, Charles Henry (1994), Advanced Calculus of Several Variables, Mineola, New York: Dover Publications, ISBN 0-486-68336-2
- Folland, Gerald, Real Analysis: Modern Techniques and Their Applications (2nd ed.)
- Cartan, Henri (1971) (in fr), Calcul Differentiel, Hermann, ISBN 9780395120330
- Hirsch, Morris (1994), Differential Topology (2nd ed.), Springer-Verlag
- Hörmander, Lars (2015), The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis, Classics in Mathematics (2nd ed.), Springer, ISBN 9783642614972
- Loomis, Lynn Harold; Sternberg, Shlomo (1968), Advanced Calculus, Addison-Wesley (revised 1990, Jones and Bartlett; reprinted 2014, World Scientific) [this text in particular discusses density]
- O'Neill, Barrett (2006), Elementary Differential Geometry (revised 2nd ed.), Amsterdam: Elsevier/Academic Press, ISBN 0-12-088735-5
- Rudin, Walter (1976), Principles of Mathematical Analysis (3rd ed.), New York: McGraw Hill, pp. 204–299, ISBN 978-0-07-054235-8, https://archive.org/details/1979RudinW
- Spivak, Michael (1965). Calculus on Manifolds: A Modern Approach to Classical Theorems of Advanced Calculus. San Francisco: Benjamin Cummings. ISBN 0-8053-9021-9.
Original source: https://en.wikipedia.org/wiki/Calculus on Euclidean space.