Raising and lowering indices

From HandWiki

In mathematics and mathematical physics, raising and lowering indices are operations on tensors which change their type. Raising and lowering indices are a form of index manipulation in tensor expressions.

Vectors, covectors and the metric

Mathematical formulation

Mathematically, vectors are elements of a vector space [math]\displaystyle{ V }[/math] over a field [math]\displaystyle{ K }[/math], and for use in physics [math]\displaystyle{ V }[/math] is usually defined with [math]\displaystyle{ K=\mathbb{R} }[/math] or [math]\displaystyle{ \mathbb{C} }[/math]. Concretely, if the dimension [math]\displaystyle{ n=\text{dim}V }[/math] of [math]\displaystyle{ V }[/math] is finite, then, after making a choice of basis, we can view such vector spaces as [math]\displaystyle{ \mathbb{R}^n }[/math] or [math]\displaystyle{ \mathbb{C}^n }[/math].

The dual space is the space of linear functionals mapping [math]\displaystyle{ V\rightarrow K }[/math]. Concretely, in matrix notation these can be thought of as row vectors, which give a number when applied to column vectors. We denote this by [math]\displaystyle{ V^*:= \text{Hom}(V,K) }[/math], so that [math]\displaystyle{ \alpha \in V^* }[/math] is a linear map [math]\displaystyle{ \alpha:V\rightarrow K }[/math].

Then under a choice of basis [math]\displaystyle{ \{e_i\} }[/math], we can view a vector [math]\displaystyle{ v\in V }[/math] as an element of [math]\displaystyle{ K^n }[/math] with components [math]\displaystyle{ v^i }[/math] (vectors are taken by convention to have indices up). This picks out a choice of basis [math]\displaystyle{ \{e^i\} }[/math] for [math]\displaystyle{ V^* }[/math], defined by the set of relations [math]\displaystyle{ e^i(e_j) = \delta^i_j }[/math].
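The defining relations of the dual basis can be checked numerically. The following is a minimal sketch, assuming a hypothetical basis of [math]\mathbb{R}^2[/math] chosen only for illustration: if the basis vectors are stored as the columns of a matrix, the rows of its inverse are the dual basis covectors.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def apply_covector(row, column):
    """Evaluate a covector (row vector) on a vector (column vector)."""
    return sum(r * c for r, c in zip(row, column))

# Hypothetical basis of R^2 (an assumption for this example).
basis_vectors = [[1, 0], [1, 2]]          # e_1, e_2
basis_columns = [[1, 1],                  # the same vectors stored as columns
                 [0, 2]]

# Rows of the inverse matrix are the dual basis covectors e^1, e^2.
dual_rows = inverse_2x2(basis_columns)

# pairings[i][j] = e^i(e_j), which should be the Kronecker delta.
pairings = [[apply_covector(dual_rows[i], basis_vectors[j]) for j in range(2)]
            for i in range(2)]
```

Exact fractions are used so that the comparison with the Kronecker delta is not affected by rounding.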

For applications, raising and lowering is done using a structure known as the (pseudo-)metric tensor (the 'pseudo-' refers to the fact we allow the metric to be indefinite). Formally, this is a non-degenerate, symmetric bilinear form

[math]\displaystyle{ g:V\times V\rightarrow K \text{ a bilinear form} }[/math]
[math]\displaystyle{ g(u,v) = g(v,u) \text{ for all }u,v\in V \text{ (Symmetric)} }[/math]
[math]\displaystyle{ \forall v\in V\setminus\{0\}, \exists u\in V \text{ such that } g(v,u)\neq 0 \text{ (Non-degenerate)} }[/math]

In this basis, it has components [math]\displaystyle{ g(e_i,e_j) = g_{ij} }[/math], and can be viewed as a symmetric matrix in [math]\displaystyle{ \text{Mat}_{n\times n}(K) }[/math] with these components. The inverse metric exists due to non-degeneracy and is denoted [math]\displaystyle{ g^{ij} }[/math], and as a matrix is the inverse to [math]\displaystyle{ g_{ij} }[/math].
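As an illustration, the components [math]g_{ij} = g(e_i,e_j)[/math] and the inverse metric can be computed for a concrete case. This is a sketch under stated assumptions: the bilinear form is taken to be the standard dot product on [math]\mathbb{R}^2[/math], and the skewed basis is hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    """Standard dot product, used here as the bilinear form g."""
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("form is degenerate in this basis")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical non-orthonormal basis e_1, e_2 (an assumption for this example).
basis = [(1, 0), (1, 2)]

# Metric components g_ij = g(e_i, e_j): a symmetric matrix.
g = [[dot(ei, ej) for ej in basis] for ei in basis]

# Inverse metric g^ij, which exists because g is non-degenerate.
g_inv = inverse_2x2(g)

# g^ij g_jk should be the identity matrix.
identity = matmul(g_inv, g)
```

In a non-orthonormal basis the metric is no longer the identity matrix, even though the underlying form is the ordinary dot product.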

Raising and lowering vectors and covectors

Raising and lowering is then done in coordinates. Given a vector with components [math]\displaystyle{ v^i }[/math], we can contract with the metric to obtain a covector:

[math]\displaystyle{ g_{ij}v^j = v_i }[/math]

and this is what we mean by lowering the index. Conversely, contracting a covector with the inverse metric gives a vector:

[math]\displaystyle{ g^{ij}\alpha_j=\alpha^i. }[/math]

This process is called raising the index.

Raising an index and then lowering it (or conversely) returns the original components. The two operations are inverse to each other, which reflects the fact that the metric and inverse metric tensors are, as the terminology suggests, inverse to each other:

[math]\displaystyle{ g^{ij}g_{jk}=g_{kj}g^{ji}={\delta^i}_k={\delta_k}^i }[/math]

where [math]\displaystyle{ \delta^i_j }[/math] is the Kronecker delta or identity matrix.
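The round trip can be checked with a short computation. This is a minimal sketch, assuming a hypothetical non-diagonal metric on [math]\mathbb{R}^2[/math] whose inverse is hard-coded; the helper name `contract` is introduced only for this example.

```python
from fractions import Fraction

# Hypothetical metric g_ij and its inverse g^ij (assumed values for illustration).
g = [[1, 1],
     [1, 5]]
g_inv = [[Fraction(5, 4), Fraction(-1, 4)],
         [Fraction(-1, 4), Fraction(1, 4)]]

def contract(matrix, components):
    """Return the list with entries matrix[i][j] * components[j], summed over j."""
    return [sum(row[j] * components[j] for j in range(len(components)))
            for row in matrix]

v_up = [3, -2]                     # vector components v^i
v_down = contract(g, v_up)         # lowering: v_i = g_ij v^j
v_back = contract(g_inv, v_down)   # raising:  v^i = g^ij v_j
```

With a non-diagonal metric the lowered components differ visibly from the original ones, while raising them again recovers the starting vector.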

Finite-dimensional real vector spaces with (pseudo-)metrics are classified up to signature, a coordinate-free property which is well-defined by Sylvester's law of inertia. Possible metrics on real space are indexed by signature [math]\displaystyle{ (p,q) }[/math]. This is a metric associated to [math]\displaystyle{ n=p+q }[/math] dimensional real space. The metric has signature [math]\displaystyle{ (p,q) }[/math] if there exists a basis (referred to as an orthonormal basis) such that in this basis, the metric takes the form [math]\displaystyle{ (g_{ij}) = \text{diag}(+1, \cdots, +1, -1, \cdots, -1) }[/math] with [math]\displaystyle{ p }[/math] positive ones and [math]\displaystyle{ q }[/math] negative ones.

The concrete space with elements which are [math]\displaystyle{ n }[/math]-vectors and this concrete realization of the metric is denoted [math]\displaystyle{ \mathbb{R}^{p,q}=(\mathbb{R}^n,g_{ij}) }[/math], where the 2-tuple [math]\displaystyle{ (\mathbb{R}^n, g_{ij}) }[/math] is meant to make it clear that the underlying vector space of [math]\displaystyle{ \mathbb{R}^{p,q} }[/math] is [math]\displaystyle{ \mathbb{R}^n }[/math]: equipping this vector space with the metric [math]\displaystyle{ g_{ij} }[/math] is what turns the space into [math]\displaystyle{ \mathbb{R}^{p,q} }[/math].

Examples:

  • [math]\displaystyle{ \mathbb{R}^3 }[/math] is a model for 3-dimensional space. The metric is equivalent to the standard dot product.
  • [math]\displaystyle{ \mathbb{R}^{n,0} = \mathbb{R}^n }[/math], equivalent to [math]\displaystyle{ n }[/math]-dimensional real space as an inner product space with [math]\displaystyle{ g_{ij} = \delta_{ij} }[/math]. In Euclidean space, raising and lowering is not necessary because vector and covector components are the same in an orthonormal basis.
  • [math]\displaystyle{ \mathbb{R}^{1,3} }[/math] is Minkowski space (or rather, Minkowski space in a choice of orthonormal basis), a model for spacetime with weak curvature. It is common convention to use Greek indices when writing expressions involving tensors in Minkowski space, while Latin indices are reserved for Euclidean space.

Well-formulated expressions are constrained by the rules of Einstein summation: any index may appear at most twice, and an index that appears twice must appear once raised and once lowered, so that a raised index contracts with a lowered index. With these rules we can immediately see that an expression such as

[math]\displaystyle{ g_{ij}v^iu^j }[/math]

is well formulated while

[math]\displaystyle{ g_{ij}v_iu_j }[/math]

is not.
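The index rules can be expressed as a small checker. This is a sketch, not a full tensor-expression parser: it assumes the usual convention that an index occurs at most twice in a term and that a repeated index occurs once raised and once lowered. The function name and the encoding of each factor as a pair of strings (upper indices, lower indices) are hypothetical.

```python
from collections import Counter

def check_einstein(factors):
    """Check one product of tensor factors against the summation rules.

    factors: list of (upper_indices, lower_indices) pairs of strings,
             one pair per factor, with single-letter index names.
    Returns (is_well_formed, sorted list of free indices).
    """
    ups = Counter()
    downs = Counter()
    for upper, lower in factors:
        ups.update(upper)
        downs.update(lower)

    free = []
    for index in sorted(set(ups) | set(downs)):
        total = ups[index] + downs[index]
        if total > 2:
            return False, []          # an index may appear at most twice
        if total == 2 and ups[index] != 1:
            return False, []          # a repeated index needs one up, one down
        if total == 1:
            free.append(index)
    return True, free

# g_{ij} v^i u^j: both indices are contracted up-with-down.
good = check_einstein([("", "ij"), ("i", ""), ("j", "")])
# g_{ij} v_i u_j: repeated indices are both lowered.
bad = check_einstein([("", "ij"), ("", "i"), ("", "j")])
# g_{ij} v^j: j is contracted, i is left free.
lowered = check_einstein([("", "ij"), ("j", "")])
```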

Example in Minkowski spacetime

The covariant 4-position is given by

[math]\displaystyle{ X_\mu = (-ct, x, y, z) }[/math]

with components:

[math]\displaystyle{ X_0 = -ct, \quad X_1 = x, \quad X_2 = y, \quad X_3 = z }[/math]

(where x,y,z are the usual Cartesian coordinates) and the Minkowski metric tensor with metric signature (− + + +) is defined as

[math]\displaystyle{ \eta_{\mu \nu} = \eta^{\mu \nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} }[/math]

in components:

[math]\displaystyle{ \eta_{00} = -1, \quad \eta_{i0} = \eta_{0i} = 0,\quad \eta_{ij} = \delta_{ij}\,(i,j \neq 0). }[/math]

To raise the index, multiply by the tensor and contract:

[math]\displaystyle{ X^\lambda = \eta^{\lambda\mu}X_\mu = \eta^{\lambda 0}X_0 + \eta^{\lambda i}X_i }[/math]

then for λ = 0:

[math]\displaystyle{ X^0 = \eta^{00}X_0 + \eta^{0i}X_i = -X_0 }[/math]

and for λ = j = 1, 2, 3:

[math]\displaystyle{ X^j = \eta^{j0}X_0 + \eta^{ji}X_i = \delta^{ji}X_i = X_j \,. }[/math]

So the index-raised contravariant 4-position is:

[math]\displaystyle{ X^\mu = (ct, x, y, z)\,. }[/math]

This operation is equivalent to the matrix multiplication

[math]\displaystyle{ \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. }[/math]
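The same computation can be written as a short program. This is a minimal sketch with assumed sample values (units with c = 1 and arbitrary integer coordinates); the function name `raise_index` is introduced only for this example.

```python
# Minkowski metric with signature (- + + +); this matrix is its own inverse.
ETA_INV = [[-1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 0, 1]]

def raise_index(eta_inv, covector):
    """Return X^lambda = eta^{lambda mu} X_mu, summed over mu."""
    n = len(covector)
    return [sum(eta_inv[lam][mu] * covector[mu] for mu in range(n))
            for lam in range(n)]

# Assumed sample values: c = 1 and an arbitrary event (t, x, y, z).
c, t, x, y, z = 1, 2, 3, 4, 5

X_lower = [-c * t, x, y, z]            # covariant components X_mu
X_upper = raise_index(ETA_INV, X_lower)  # contravariant components X^mu
```

Only the time component changes sign, as in the component calculation above.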

Given two vectors, [math]\displaystyle{ X^\mu }[/math] and [math]\displaystyle{ Y^\mu }[/math], we can write down their (pseudo-)inner product in two ways:

[math]\displaystyle{ \eta_{\mu\nu}X^\mu Y^\nu. }[/math]

By lowering indices, we can write this expression as

[math]\displaystyle{ X_\mu Y^\mu. }[/math]

What is this in matrix notation? The first expression can be written as

[math]\displaystyle{ \begin{pmatrix} X^0 & X^1 & X^2 & X^3 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3\end{pmatrix} }[/math]

while the second is, after lowering the index of [math]\displaystyle{ X^\mu }[/math],

[math]\displaystyle{ \begin{pmatrix} -X^0 & X^1 & X^2 & X^3 \end{pmatrix}\begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3\end{pmatrix}. }[/math]
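Both ways of writing the (pseudo-)inner product can be compared numerically. This is a sketch with assumed sample components; the function names are hypothetical.

```python
# Minkowski metric with signature (- + + +).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

def inner_via_metric(x_up, y_up):
    """eta_{mu nu} X^mu Y^nu: contract both vectors with the metric."""
    return sum(ETA[m][n] * x_up[m] * y_up[n]
               for m in range(4) for n in range(4))

def lower(x_up):
    """X_mu = eta_{mu nu} X^nu."""
    return [sum(ETA[m][n] * x_up[n] for n in range(4)) for m in range(4)]

def inner_via_lowering(x_up, y_up):
    """X_mu Y^mu: lower the index of X first, then contract with Y."""
    x_down = lower(x_up)
    return sum(x_down[m] * y_up[m] for m in range(4))

# Assumed sample components X^mu and Y^mu.
X = [2, 3, 4, 5]
Y = [1, -1, 0, 2]

first = inner_via_metric(X, Y)
second = inner_via_lowering(X, Y)
```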

Coordinate free formalism

It is instructive to consider what raising and lowering means in the abstract linear algebra setting.

We first fix definitions: [math]\displaystyle{ V }[/math] is a finite-dimensional vector space over a field [math]\displaystyle{ K }[/math]. Typically [math]\displaystyle{ K=\mathbb{R} }[/math] or [math]\displaystyle{ \mathbb{C} }[/math].

[math]\displaystyle{ \phi }[/math] is a non-degenerate bilinear form, that is, [math]\displaystyle{ \phi:V\times V\rightarrow K }[/math] is a map which is linear in both arguments, making it a bilinear form.

By [math]\displaystyle{ \phi }[/math] being non-degenerate we mean that for each nonzero [math]\displaystyle{ v\in V }[/math], there is a [math]\displaystyle{ u\in V }[/math] such that

[math]\displaystyle{ \phi(v,u)\neq 0. }[/math]

In concrete applications, [math]\displaystyle{ \phi }[/math] is often considered a structure on the vector space, for example an inner product or more generally a metric tensor which is allowed to have indefinite signature, or a symplectic form [math]\displaystyle{ \omega }[/math]. Together these cover the cases where [math]\displaystyle{ \phi }[/math] is either symmetric or anti-symmetric, but in full generality [math]\displaystyle{ \phi }[/math] need not be either of these cases.

There is a partial evaluation map associated to [math]\displaystyle{ \phi }[/math],

[math]\displaystyle{ \phi(\cdot, - ):V\rightarrow V^*; v\mapsto \phi(v,-) }[/math]

where [math]\displaystyle{ \cdot }[/math] denotes an argument which is to be evaluated, and [math]\displaystyle{ - }[/math] denotes an argument whose evaluation is deferred. Then [math]\displaystyle{ \phi(v,-) }[/math] is an element of [math]\displaystyle{ V^* }[/math], which sends [math]\displaystyle{ u\mapsto \phi(v,u) }[/math].

We made a choice to define this partial evaluation map as being evaluated on the first argument. We could just as well have defined it on the second argument, and non-degeneracy is also independent of argument chosen. Also, when [math]\displaystyle{ \phi }[/math] has well defined (anti-)symmetry, evaluating on either argument is equivalent (up to a minus sign for anti-symmetry).

Non-degeneracy shows that the partial evaluation map is injective, or equivalently that the kernel of the map is trivial. In finite dimension, the dual space [math]\displaystyle{ V^* }[/math] has equal dimension to [math]\displaystyle{ V }[/math], so non-degeneracy is enough to conclude the map is a linear isomorphism. If [math]\displaystyle{ \phi }[/math] is a structure on the vector space, we sometimes call this the canonical isomorphism [math]\displaystyle{ V\rightarrow V^* }[/math].

It therefore has an inverse, [math]\displaystyle{ \phi^{-1}:V^*\rightarrow V, }[/math] and this is enough to define an associated bilinear form on the dual:

[math]\displaystyle{ \phi^{-1}:V^*\times V^*\rightarrow K, \phi^{-1}(\alpha,\beta) = \phi(\phi^{-1}(\alpha),\phi^{-1}(\beta)). }[/math]

where the repeated use of [math]\displaystyle{ \phi^{-1} }[/math] is disambiguated by the argument taken. That is, [math]\displaystyle{ \phi^{-1}(\alpha) }[/math] is the inverse map, while [math]\displaystyle{ \phi^{-1}(\alpha,\beta) }[/math] is the bilinear form.

Checking these expressions in coordinates makes it evident that this is what raising and lowering indices means abstractly.
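As a sketch of that check, take [math]\displaystyle{ \phi = g }[/math] symmetric with components [math]\displaystyle{ g_{ij} }[/math] (an assumption made here only to avoid tracking index order). The partial evaluation map sends [math]\displaystyle{ v }[/math] to the covector with components

[math]\displaystyle{ g(v,-)_j = g_{ij}v^i = v_j, }[/math]

which is lowering the index. Its inverse sends [math]\displaystyle{ \alpha }[/math] to the vector with components [math]\displaystyle{ g^{ij}\alpha_j = \alpha^i }[/math], which is raising the index. The induced form on the dual is then

[math]\displaystyle{ g^{-1}(\alpha,\beta) = g_{ij}\,(g^{ik}\alpha_k)(g^{jl}\beta_l) = {\delta_j}^k\, g^{jl}\alpha_k\beta_l = g^{kl}\alpha_k\beta_l, }[/math]

so its components are those of the inverse metric.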

Tensors

We will not develop the abstract formalism for tensors straightaway. For now, an [math]\displaystyle{ (r,s) }[/math] tensor is an object described via its components, and has [math]\displaystyle{ r }[/math] indices up and [math]\displaystyle{ s }[/math] indices down. A generic [math]\displaystyle{ (r,s) }[/math] tensor is written

[math]\displaystyle{ T^{\mu_1\cdots \mu_r}{}_{\nu_1\cdots \nu_s}. }[/math]

We can use the metric tensor and its inverse to lower and raise tensor indices, just as we lowered vector indices and raised covector indices.

Examples

  • A (0,0) tensor is a number in the field [math]\displaystyle{ K }[/math].
  • A (1,0) tensor is a vector.
  • A (0,1) tensor is a covector.
  • A (0,2) tensor is a bilinear form. An example is the metric tensor [math]\displaystyle{ g_{\mu\nu}. }[/math]
  • A (1,1) tensor is a linear map. An example is the delta, [math]\displaystyle{ \delta^\mu{}_\nu }[/math], which is the identity map, or a Lorentz transformation [math]\displaystyle{ \Lambda^\mu{}_\nu. }[/math]

Example of raising and lowering

For a (0,2) tensor,[1] twice contracting with the inverse metric tensor and contracting in different indices raises each index:

[math]\displaystyle{ A^{\mu\nu}=g^{\mu\rho}g^{\nu\sigma}A_{\rho \sigma}. }[/math]

Similarly, twice contracting with the metric tensor and contracting in different indices lowers each index:

[math]\displaystyle{ A_{\mu\nu}=g_{\mu\rho}g_{\nu\sigma}A^{\rho\sigma} }[/math]
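The double contraction can be sketched in code. This assumes a hypothetical two-dimensional metric with a hard-coded inverse and an arbitrary, deliberately non-symmetric tensor [math]A_{\rho\sigma}[/math]; the function name `transform_both` is introduced only for this example.

```python
from fractions import Fraction

# Hypothetical metric g_ij and its inverse g^ij (assumed values for illustration).
g = [[1, 1],
     [1, 5]]
g_inv = [[Fraction(5, 4), Fraction(-1, 4)],
         [Fraction(-1, 4), Fraction(1, 4)]]

def transform_both(m, a):
    """Return m[mu][rho] * m[nu][sigma] * a[rho][sigma], summed over rho, sigma."""
    n = len(a)
    return [[sum(m[mu][rho] * m[nu][sigma] * a[rho][sigma]
                 for rho in range(n) for sigma in range(n))
             for nu in range(n)]
            for mu in range(n)]

# Arbitrary (0,2) tensor components A_{rho sigma}, not symmetric on purpose.
A_lower = [[1, 2],
           [3, 4]]

A_upper = transform_both(g_inv, A_lower)   # raise both indices
A_again = transform_both(g, A_upper)       # lower both indices again
```

Each index is contracted with its own copy of the metric, so raising both and then lowering both returns the original components.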

Let's apply this to the theory of electromagnetism.

The contravariant electromagnetic tensor in the (+ − − −) signature is given by[2]

[math]\displaystyle{ F^{\alpha\beta} = \begin{pmatrix} 0 & -\frac{E_x}{c} & -\frac{E_y}{c} & -\frac{E_z}{c} \\ \frac{E_x}{c} & 0 & -B_z & B_y \\ \frac{E_y}{c} & B_z & 0 & -B_x \\ \frac{E_z}{c} & -B_y & B_x & 0 \end{pmatrix}. }[/math]

In components,

[math]\displaystyle{ F^{0i} = -F^{i0} = - \frac{E^i}{c} ,\quad F^{ij} = - \varepsilon^{ijk} B_k }[/math]

To obtain the covariant tensor [math]\displaystyle{ F_{\alpha\beta} }[/math], contract with the metric tensor:

[math]\displaystyle{ \begin{align} F_{\alpha\beta} & = \eta_{\alpha\gamma} \eta_{\beta\delta} F^{\gamma\delta} \\ & = \eta_{\alpha 0} \eta_{\beta 0} F^{0 0} + \eta_{\alpha i} \eta_{\beta 0} F^{i 0} + \eta_{\alpha 0} \eta_{\beta i} F^{0 i} + \eta_{\alpha i} \eta_{\beta j} F^{i j} \end{align} }[/math]

and since [math]\displaystyle{ F^{00} = 0 }[/math] and [math]\displaystyle{ F^{0i} = -F^{i0} }[/math], this reduces to

[math]\displaystyle{ F_{\alpha\beta} = \left(\eta_{\alpha i} \eta_{\beta 0} - \eta_{\alpha 0} \eta_{\beta i} \right) F^{i 0} + \eta_{\alpha i} \eta_{\beta j} F^{i j} }[/math]

Now for α = 0, β = k = 1, 2, 3:

[math]\displaystyle{ \begin{align} F_{0k} & = \left(\eta_{0i} \eta_{k0} - \eta_{00} \eta_{ki} \right) F^{i0} + \eta_{0i} \eta_{kj} F^{ij} \\ & = \bigl(0 - (-\delta_{ki}) \bigr) F^{i0} + 0 \\ & = F^{k0} = - F^{0k} \\ \end{align} }[/math]

and by antisymmetry, for α = k = 1, 2, 3, β = 0:

[math]\displaystyle{ F_{k0} = - F^{k0} }[/math]

then finally for α = k = 1, 2, 3, β = l = 1, 2, 3:

[math]\displaystyle{ \begin{align} F_{kl} & = \left(\eta_{ k i} \eta_{ l 0} - \eta_{ k 0} \eta_{ l i} \right) F^{i 0} + \eta_{ k i} \eta_{ l j} F^{i j} \\ & = 0 + \delta_{ k i} \delta_{ l j} F^{i j} \\ & = F^{k l} \\ \end{align} }[/math]

The (covariant) lower indexed tensor is then:

[math]\displaystyle{ F_{\alpha\beta} = \begin{pmatrix} 0 & \frac{E_x}{c} & \frac{E_y}{c} & \frac{E_z}{c} \\ -\frac{E_x}{c} & 0 & -B_z & B_y \\ -\frac{E_y}{c} & B_z & 0 & -B_x \\ -\frac{E_z}{c} & -B_y & B_x & 0 \end{pmatrix} }[/math]

This operation is equivalent to the matrix multiplication

[math]\displaystyle{ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & -\frac{E_x}{c} & -\frac{E_y}{c} & -\frac{E_z}{c} \\ \frac{E_x}{c} & 0 & -B_z & B_y \\ \frac{E_y}{c} & B_z & 0 & -B_x \\ \frac{E_z}{c} & -B_y & B_x & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} =\begin{pmatrix} 0 & \frac{E_x}{c} & \frac{E_y}{c} & \frac{E_z}{c} \\ -\frac{E_x}{c} & 0 & -B_z & B_y \\ -\frac{E_y}{c} & B_z & 0 & -B_x \\ -\frac{E_z}{c} & -B_y & B_x & 0 \end{pmatrix}. }[/math]
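The lowering of both indices can be reproduced numerically. This is a sketch with assumed sample field values in units where c = 1; the function name `lower_both` is hypothetical. Because the metric enters twice, its overall sign cancels, so the (+ − − −) and (− + + +) forms of the Minkowski metric give the same lowered tensor.

```python
def lower_both(eta, f_upper):
    """Return eta[a][g] * eta[b][d] * f_upper[g][d], summed over g and d."""
    n = len(eta)
    return [[sum(eta[a][g] * eta[b][d] * f_upper[g][d]
                 for g in range(n) for d in range(n))
             for b in range(n)]
            for a in range(n)]

ETA_PLUS = [[1, 0, 0, 0],      # signature (+ - - -)
            [0, -1, 0, 0],
            [0, 0, -1, 0],
            [0, 0, 0, -1]]
ETA_MINUS = [[-entry for entry in row] for row in ETA_PLUS]  # signature (- + + +)

# Assumed sample field components, with c = 1.
Ex, Ey, Ez = 1, 2, 3
Bx, By, Bz = 4, 5, 6

# Contravariant electromagnetic tensor F^{alpha beta}.
F_upper = [[0, -Ex, -Ey, -Ez],
           [Ex, 0, -Bz, By],
           [Ey, Bz, 0, -Bx],
           [Ez, -By, Bx, 0]]

F_lower = lower_both(ETA_PLUS, F_upper)
F_lower_alt = lower_both(ETA_MINUS, F_upper)
```

The time-space components change sign while the space-space components are unchanged, matching the component calculation above.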

General rank

For a tensor of order n, indices are raised by (compatible with above):[1]

[math]\displaystyle{ g^{j_1i_1}g^{j_2i_2}\cdots g^{j_ni_n}A_{i_1i_2\cdots i_n} = A^{j_1j_2\cdots j_n} }[/math]

and lowered by:

[math]\displaystyle{ g_{j_1i_1}g_{j_2i_2}\cdots g_{j_ni_n}A^{i_1i_2\cdots i_n} = A_{j_1j_2\cdots j_n} }[/math]

and for a mixed tensor:

[math]\displaystyle{ g_{p_1i_1}g_{p_2i_2}\cdots g_{p_ni_n}g^{q_1j_1}g^{q_2j_2}\cdots g^{q_mj_m}{A^{i_1i_2\cdots i_n}}_{j_1j_2\cdots j_m} = {A_{p_1p_2\cdots p_n}}^{q_1q_2\cdots q_m} }[/math]

We need not raise or lower all indices at once: it is perfectly fine to raise or lower a single index. Lowering an index of an [math]\displaystyle{ (r,s) }[/math] tensor gives an [math]\displaystyle{ (r-1,s+1) }[/math] tensor, while raising an index gives an [math]\displaystyle{ (r+1,s-1) }[/math] tensor (where [math]\displaystyle{ r,s }[/math] have suitable values; for example, we cannot lower an index of a [math]\displaystyle{ (0,2) }[/math] tensor, since it has no raised index).
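Lowering one index at a time can be sketched for a tensor of any rank. The representation below (a dictionary from index tuples to components) and the names `lower_index` and `type_after_lowering` are assumptions made for this example, as is the two-dimensional metric with one negative and one positive direction.

```python
from itertools import product

def lower_index(tensor, metric, position):
    """Contract the metric with the index in the given slot.

    tensor: dict mapping index tuples to components (all tuples the same length).
    Returns a new dict with result[..., a, ...] = metric[a][b] * tensor[..., b, ...].
    """
    dim = len(metric)
    rank = len(next(iter(tensor)))
    lowered = {}
    for idx in product(range(dim), repeat=rank):
        a = idx[position]
        lowered[idx] = sum(
            metric[a][b] * tensor[idx[:position] + (b,) + idx[position + 1:]]
            for b in range(dim)
        )
    return lowered

def type_after_lowering(r, s):
    """Type of the result of lowering one index of an (r, s) tensor."""
    if r == 0:
        raise ValueError("no raised index left to lower")
    return (r - 1, s + 1)

# Hypothetical 2D metric, diagonal with entries -1 and +1.
eta = [[-1, 0],
       [0, 1]]

# Arbitrary (2,0) tensor T^{ab}.
T_upper = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}

T_mixed = lower_index(T_upper, eta, 0)   # first index lowered: type (1,1)
T_lower = lower_index(T_mixed, eta, 1)   # second index lowered: type (0,2)
```

Raising a single index works the same way with the inverse metric in place of the metric.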

References

  1. Kay, D. C. (1988). Tensor Calculus. Schaum’s Outlines. New York: McGraw Hill. ISBN 0-07-033484-6.
  2. NB: Some texts, such as Griffiths, David J. (1987). Introduction to Elementary Particles. Wiley, John & Sons, Inc. ISBN 0-471-60386-4, will show this tensor with an overall factor of −1. This is because they used the negative of the metric tensor used here: (− + + +), see metric signature. In older texts such as Jackson (2nd edition), there are no factors of c since they are using Gaussian units. Here SI units are used.