# Radon–Nikodym theorem

In mathematics, the **Radon–Nikodym theorem** is a result in measure theory that expresses the relationship between two measures defined on the same measurable space. A *measure* is a set function that assigns a consistent magnitude to the measurable subsets of a measurable space. Examples of a measure include area and volume, where the subsets are sets of points; or the probability of an event, which is a subset of possible outcomes within a wider probability space.

One way to derive a new measure from one already given is to assign a density to each point of the space, then integrate over the measurable subset of interest. This can be expressed as

- [math]\displaystyle{ \nu(A) = \int_A f \, d\mu, }[/math]

where *ν* is the new measure being defined for any measurable subset *A* and the function *f* is the density at a given point. The integral is with respect to an existing measure *μ*, which is often the canonical Lebesgue measure on the real line **R** or the n-dimensional Euclidean space **R**^{n} (corresponding to our standard notions of length, area and volume). For example, if *f* represents mass density and *μ* is the Lebesgue measure in three-dimensional space **R**^{3}, then *ν*(*A*) equals the total mass in a spatial region *A*.
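The construction can be illustrated on a finite space, where the integral reduces to a weighted sum. The following is a minimal sketch; the space `X = {0, 1, 2, 3}`, the point weights of `mu`, and the density `f` are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical finite space X = {0, 1, 2, 3}; every subset is measurable.
# mu is described by its point weights mu({x}); f is an assumed density.
mu_weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
f = {0: Fraction(2), 1: Fraction(0), 2: Fraction(4), 3: Fraction(1)}


def integrate(g, weights, A):
    """Integral of g over the set A with respect to the weighted measure."""
    return sum(g[x] * weights[x] for x in A)


def nu(A):
    """New measure nu(A), defined as the integral of f over A with respect to mu."""
    return integrate(f, mu_weights, A)


print(nu({0, 2}))          # measure of a subset
print(nu({0, 1, 2, 3}))    # total mass of the space
```

Because `nu` is defined by a sum over points, it is automatically additive on disjoint sets, which is the finite analogue of countable additivity.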

The Radon–Nikodym theorem essentially states that, under certain conditions (σ-finiteness of the measures and absolute continuity of *ν* with respect to *μ*), any measure *ν* can be expressed in this way with respect to another measure *μ* on the same space. The function *f* is then called the **Radon–Nikodym derivative** and is denoted by [math]\displaystyle{ \tfrac{d\nu}{d\mu} }[/math].^{[1]} An important application is in probability theory, leading to the probability density function of a random variable.

The theorem is named after Johann Radon, who proved it in 1913 for the special case where the underlying space is **R**^{n}, and Otto Nikodym, who proved the general case in 1930.

## Properties

- Let *ν*, *μ*, and *λ* be σ-finite measures on the same measurable space. If *ν* ≪ *λ* and *μ* ≪ *λ* (*ν* and *μ* are both absolutely continuous with respect to *λ*), then [math]\displaystyle{ \frac{d(\nu+\mu)}{d\lambda} = \frac{d\nu}{d\lambda}+\frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere}. }[/math]
- If *ν* ≪ *μ* ≪ *λ*, then [math]\displaystyle{ \frac{d\nu}{d\lambda}=\frac{d\nu}{d\mu}\frac{d\mu}{d\lambda}\quad\lambda\text{-almost everywhere}. }[/math]
- In particular, if *μ* ≪ *ν* and *ν* ≪ *μ*, then [math]\displaystyle{ \frac{d\mu}{d\nu}=\left(\frac{d\nu}{d\mu}\right)^{-1}\quad\nu\text{-almost everywhere}. }[/math]
- If *μ* ≪ *λ* and *g* is a *μ*-integrable function, then [math]\displaystyle{ \int_X g\,d\mu = \int_X g\frac{d\mu}{d\lambda}\,d\lambda. }[/math]
- If *ν* is a finite signed or complex measure, then [math]\displaystyle{ {d|\nu|\over d\mu} = \left|{d\nu\over d\mu}\right|. }[/math]
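On a finite space in which every point has positive weight, each Radon–Nikodym derivative is simply a ratio of point weights, so the identities above can be checked exactly. The sketch below assumes three hypothetical measures `lam`, `mu` and `nu` on the points `a`, `b`, `c`; the helper `rn_derivative` is an illustrative name, not a library function.

```python
from fractions import Fraction


def rn_derivative(nu, mu):
    """Pointwise ratio nu({x}) / mu({x}); assumes mu({x}) > 0 for every x."""
    return {x: nu[x] / mu[x] for x in mu}


# Hypothetical point weights; all strictly positive, so every pair of
# measures is mutually absolutely continuous.
lam = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(3)}
mu = {"a": Fraction(1, 2), "b": Fraction(1), "c": Fraction(6)}
nu = {"a": Fraction(3), "b": Fraction(1, 3), "c": Fraction(2)}

dnu_dlam = rn_derivative(nu, lam)
dnu_dmu = rn_derivative(nu, mu)
dmu_dlam = rn_derivative(mu, lam)
dmu_dnu = rn_derivative(mu, nu)

# Sum rule: d(nu + mu)/d(lam) = d(nu)/d(lam) + d(mu)/d(lam)
nu_plus_mu = {x: nu[x] + mu[x] for x in lam}
sum_rule_holds = rn_derivative(nu_plus_mu, lam) == {
    x: dnu_dlam[x] + dmu_dlam[x] for x in lam
}

# Chain rule: d(nu)/d(lam) = d(nu)/d(mu) * d(mu)/d(lam)
chain_rule_holds = all(dnu_dlam[x] == dnu_dmu[x] * dmu_dlam[x] for x in lam)

# Reciprocal rule: d(mu)/d(nu) = (d(nu)/d(mu))^(-1)
reciprocal_holds = all(dmu_dnu[x] == 1 / dnu_dmu[x] for x in lam)

# Change of measure in an integral: int g d(mu) = int g * d(mu)/d(lam) d(lam)
g = {"a": Fraction(5), "b": Fraction(-1), "c": Fraction(2)}
lhs = sum(g[x] * mu[x] for x in lam)
rhs = sum(g[x] * dmu_dlam[x] * lam[x] for x in lam)

print(sum_rule_holds, chain_rule_holds, reciprocal_holds, lhs == rhs)
```

Exact rational arithmetic is used so that the equalities hold without floating-point tolerance.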

## Applications

### Probability theory

The theorem is very important in extending the ideas of probability theory from probability masses and probability densities defined over real numbers to probability measures defined over arbitrary sets. It tells whether and how it is possible to change from one probability measure to another. Specifically, the probability density function of a random variable is the Radon–Nikodym derivative of the induced measure with respect to some base measure (usually the Lebesgue measure for continuous random variables).

For example, it can be used to prove the existence of conditional expectation for probability measures. The latter itself is a key concept in probability theory, as conditional probability is just a special case of it.
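The link to conditional expectation can be sketched on a finite probability space. If *X* is a random variable and the sub-σ-algebra is generated by a partition, then the measure *A* ↦ E[*X*; *A*] restricted to that sub-σ-algebra is absolutely continuous with respect to the probability measure, and its Radon–Nikodym derivative is constant on each block of the partition. The fair die, the face-value variable and the odd/even partition below are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical probability space: a fair six-sided die.
P = {w: Fraction(1, 6) for w in range(1, 7)}
X = {w: Fraction(w) for w in P}  # random variable: the face value

# Sub-sigma-algebra generated by the partition {odd faces, even faces}.
partition = [frozenset({1, 3, 5}), frozenset({2, 4, 6})]


def nu(A):
    """nu(A) = E[X; A], the integral of X over A with respect to P."""
    return sum(X[w] * P[w] for w in A)


def prob(A):
    """P(A) for a subset A of the sample space."""
    return sum(P[w] for w in A)


# The conditional expectation is the ratio nu(B) / P(B) on each block B,
# i.e. the Radon-Nikodym derivative of nu with respect to P on the
# coarser sigma-algebra.
cond_exp = {}
for block in partition:
    value = nu(block) / prob(block)
    for w in block:
        cond_exp[w] = value

print(cond_exp[1], cond_exp[2])
```

By construction, integrating `cond_exp` over any block gives the same value as integrating `X` over that block, which is the defining property of conditional expectation.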

### Financial mathematics

Amongst other fields, financial mathematics uses the theorem extensively, in particular via the Girsanov theorem. Such changes of probability measure are the cornerstone of the rational pricing of derivatives and are used for converting actual probabilities into risk-neutral probabilities.
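A one-period, two-state market gives a small illustration of such a change of measure. All numbers below (initial price 100, terminal prices 120 and 90, zero interest rate, real-world probabilities of one half, a call option with strike 100) are hypothetical; the sketch only shows that an expectation under the risk-neutral measure *Q* equals an expectation under the actual measure *P* weighted by the Radon–Nikodym derivative *dQ*/*dP*.

```python
from fractions import Fraction

# Hypothetical one-period market with zero interest rate.
s0 = Fraction(100)
s_up, s_down = Fraction(120), Fraction(90)
strike = Fraction(100)

# Assumed real-world probabilities of the two states.
P = {"up": Fraction(1, 2), "down": Fraction(1, 2)}

# Risk-neutral probability: chosen so that the expected terminal price
# under Q equals the initial price (no discounting, since the rate is zero).
q_up = (s0 - s_down) / (s_up - s_down)
Q = {"up": q_up, "down": 1 - q_up}

# Radon-Nikodym derivative dQ/dP on the two-point space.
dQ_dP = {w: Q[w] / P[w] for w in P}

# Payoff of a call option with the assumed strike.
payoff = {"up": max(s_up - strike, 0), "down": max(s_down - strike, 0)}

price_under_Q = sum(payoff[w] * Q[w] for w in P)
price_via_density = sum(payoff[w] * dQ_dP[w] * P[w] for w in P)

print(price_under_Q, price_via_density)
```

The two prices agree because integrating against *Q* is the same as integrating against *P* after multiplying by *dQ*/*dP*.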

### Information divergences

If *μ* and *ν* are measures over X, and *μ* ≪ *ν*, then:

- The Kullback–Leibler divergence from *ν* to *μ* is defined to be [math]\displaystyle{ D_\text{KL}(\mu \parallel \nu) = \int_X \log \left( \frac{d \mu}{d \nu} \right) \; d\mu. }[/math]
- For *α* > 0, *α* ≠ 1 the Rényi divergence of order *α* from *ν* to *μ* is defined to be [math]\displaystyle{ D_\alpha(\mu \parallel \nu) = \frac{1}{\alpha - 1} \log\left(\int_X\left(\frac{d\mu}{d\nu}\right)^{\alpha-1}\; d\mu\right). }[/math]
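For probability measures on a finite set, the derivative *dμ*/*dν* is the ratio of point masses and both integrals become finite sums. The following sketch assumes the measures are given as dictionaries of point masses; the function names and the two example distributions are illustrative.

```python
import math


def _check_absolutely_continuous(mu, nu):
    """Raise if mu puts mass on a point where nu has none (mu << nu fails)."""
    for x, m in mu.items():
        if m > 0 and nu.get(x, 0) == 0:
            raise ValueError("mu is not absolutely continuous with respect to nu")


def kl_divergence(mu, nu):
    """Sum over x of mu(x) * log(mu(x) / nu(x)), the discrete KL divergence."""
    _check_absolutely_continuous(mu, nu)
    return sum(m * math.log(m / nu[x]) for x, m in mu.items() if m > 0)


def renyi_divergence(mu, nu, alpha):
    """Discrete Renyi divergence of order alpha (alpha > 0, alpha != 1)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    _check_absolutely_continuous(mu, nu)
    total = sum(m * (m / nu[x]) ** (alpha - 1) for x, m in mu.items() if m > 0)
    return math.log(total) / (alpha - 1)


# Hypothetical distributions on a two-point space.
mu = {0: 0.5, 1: 0.5}
nu = {0: 0.25, 1: 0.75}

print(kl_divergence(mu, nu))
print(renyi_divergence(mu, nu, 2.0))
```

Points where *μ* has no mass are skipped, since they contribute nothing to an integral taken with respect to *μ*.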

## The assumption of σ-finiteness

The Radon–Nikodym theorem above makes the assumption that the measure *μ* with respect to which one computes the rate of change of *ν* is σ-finite.

### Negative example

Here is an example in which *μ* is not σ-finite and the Radon–Nikodym theorem fails to hold.

Consider the Borel σ-algebra on the real line. Let the counting measure, μ, of a Borel set A be defined as the number of elements of A if A is finite, and ∞ otherwise. One can check that μ is indeed a measure. It is not σ-finite, as not every Borel set is at most a countable union of finite sets. Let ν be the usual Lebesgue measure on this Borel algebra. Then, ν is absolutely continuous with respect to μ, since for a set A one has *μ*(*A*) = 0 only if A is the empty set, and then *ν*(*A*) is also zero.

Assume that the Radon–Nikodym theorem holds, that is, for some measurable function *f* one has

- [math]\displaystyle{ \nu(A) = \int_A f \,d\mu }[/math]

for all Borel sets. Taking A to be a singleton set, *A* = {*a*}, and using the above equality, one finds

- [math]\displaystyle{ 0 = f(a) }[/math]

for all real numbers a. This implies that the function *f*, and therefore the Lebesgue measure ν, is zero, which is a contradiction.

### Positive result

Assuming [math]\displaystyle{ \nu\ll\mu, }[/math] the Radon–Nikodym theorem also holds if [math]\displaystyle{ \mu }[/math] is localizable and [math]\displaystyle{ \nu }[/math] is *accessible with respect to* [math]\displaystyle{ \mu }[/math],^{[2]}^{(p. 189, Exercise 9O)} i.e., [math]\displaystyle{ \nu(A)=\sup\{\nu(B):B\in{\cal P}(A)\cap\mu^\operatorname{pre}(\R_{\ge0})\} }[/math] for all [math]\displaystyle{ A\in\Sigma. }[/math]^{[3]}^{(Theorem 1.111 (Radon-Nikodym, II))}^{[2]}^{(p. 190, Exercise 9T(ii))}

## Proof

This section gives a measure-theoretic proof of the theorem. There is also a functional-analytic proof, using Hilbert space methods, that was first given by von Neumann.

For finite measures μ and ν, the idea is to consider functions *f* with *f dμ* ≤ *dν*. The supremum of all such functions, along with the monotone convergence theorem, then furnishes the Radon–Nikodym derivative. The fact that the remaining part of ν is singular with respect to μ follows from a technical fact about finite measures. Once the result is established for finite measures, extending to σ-finite, signed, and complex measures can be done naturally. The details are given below.

### For finite measures

**Constructing an extended-valued candidate** First, suppose μ and ν are both finite-valued nonnegative measures. Let F be the set of those extended-valued measurable functions *f* : *X* → [0, ∞] such that:

- [math]\displaystyle{ \forall A \in \Sigma:\qquad \int_A f\,d\mu \leq \nu(A) }[/math]

*F* ≠ ∅, since it contains at least the zero function. Now let *f*_{1}, *f*_{2} ∈ *F*, and suppose A is an arbitrary measurable set, and define:

- [math]\displaystyle{ \begin{align} A_1 &= \left\{ x \in A : f_1(x) \gt f_2(x) \right\}, \\ A_2 &= \left\{ x \in A : f_2(x) \geq f_1(x) \right\}, \end{align} }[/math]

Then one has

- [math]\displaystyle{ \int_A\max\left\{f_1, f_2\right\}\,d\mu = \int_{A_1} f_1\,d\mu + \int_{A_2} f_2\,d\mu \leq \nu\left(A_1\right) + \nu\left(A_2\right) = \nu(A), }[/math]

and therefore, max{*f*_{1}, *f*_{2}} ∈ *F*.
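The closure of *F* under pointwise maxima can be checked by brute force on a small finite space, where membership in *F* means testing the inequality on every subset. The measures and the functions `f1`, `f2` below are hypothetical; the example also shows that the pointwise sum of two members need not belong to *F*, which is why the maximum is used.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite space with point weights for mu and nu.
X = (0, 1, 2)
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1)}
nu = {0: Fraction(3), 1: Fraction(2), 2: Fraction(5)}


def subsets(points):
    """All subsets of the given tuple of points, as tuples."""
    return chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1)
    )


def in_F(f):
    """True if the integral of f over A w.r.t. mu is at most nu(A) for every A."""
    return all(
        sum(f[x] * mu[x] for x in A) <= sum(nu[x] for x in A)
        for A in subsets(X)
    )


f1 = {0: Fraction(3), 1: Fraction(0), 2: Fraction(1)}
f2 = {0: Fraction(1), 1: Fraction(1), 2: Fraction(4)}
f_max = {x: max(f1[x], f2[x]) for x in X}
f_sum = {x: f1[x] + f2[x] for x in X}

print(in_F(f1), in_F(f2), in_F(f_max), in_F(f_sum))
```

Taking the maximum never decreases the integral over *X*, so replacing a sequence in *F* by its running maxima keeps it inside *F* while pushing the integrals towards the supremum.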

Now, let {*f*_{n}} be a sequence of functions in F such that

- [math]\displaystyle{ \lim_{n\to\infty}\int_X f_n\,d\mu = \sup_{f\in F} \int_X f\,d\mu. }[/math]

By replacing *f*_{n} with the maximum of the first n functions, one can assume that the sequence {*f*_{n}} is increasing. Let g be an extended-valued function defined as

- [math]\displaystyle{ g(x) := \lim_{n\to\infty}f_n(x). }[/math]

By Lebesgue's monotone convergence theorem, one has

- [math]\displaystyle{ \lim_{n\to\infty} \int_A f_n\,d\mu = \int_A \lim_{n\to\infty} f_n(x)\,d\mu(x) = \int_A g\,d\mu \leq \nu(A) }[/math]

for each *A* ∈ Σ, and hence, *g* ∈ *F*. Also, by the construction of g,

- [math]\displaystyle{ \int_X g\,d\mu = \sup_{f\in F}\int_X f\,d\mu. }[/math]

**Proving equality** Now, since *g* ∈ *F*,

- [math]\displaystyle{ \nu_0(A) := \nu(A) - \int_A g\,d\mu }[/math]

defines a nonnegative measure on Σ. To prove equality, we show that *ν*_{0} = 0.

Suppose ν_{0} ≠ 0; then, since μ is finite, there is an *ε* > 0 such that *ν*_{0}(*X*) > *ε μ*(*X*). To derive a contradiction from ν_{0} ≠ 0, we look for a positive set *P* ∈ Σ for the signed measure *ν*_{0} − *ε μ* (i.e. a measurable set P, all of whose measurable subsets have non-negative *ν*_{0}−*ε μ* measure), where also P has positive μ-measure. Conceptually, we're looking for a set P, where *ν*_{0} ≥ *ε μ* in every part of P. A convenient approach is to use the Hahn decomposition (*P*, *N*) for the signed measure *ν*_{0} − *ε μ*.

Note then that for every *A* ∈ Σ one has *ν*_{0}(*A* ∩ *P*) ≥ *ε μ*(*A* ∩ *P*), and hence,

- [math]\displaystyle{ \begin{align} \nu(A) &= \int_A g\,d\mu + \nu_0(A) \\ &\geq \int_A g\,d\mu + \nu_0(A\cap P)\\ &\geq \int_A g\,d\mu + \varepsilon\mu(A\cap P) = \int_A\left(g + \varepsilon 1_P\right)\,d\mu, \end{align} }[/math]

where 1_{P} is the indicator function of P. Also, note that *μ*(*P*) > 0 as desired; for if *μ*(*P*) = 0, then (since ν is absolutely continuous in relation to μ) *ν*_{0}(*P*) ≤ *ν*(*P*) = 0, so *ν*_{0}(*P*) = 0 and

- [math]\displaystyle{ \nu_0(X) - \varepsilon\mu(X) = \left(\nu_0 - \varepsilon\mu\right)(N) \leq 0, }[/math]

contradicting the fact that *ν*_{0}(*X*) > *εμ*(*X*).

Then, since also

- [math]\displaystyle{ \int_X\left(g + \varepsilon1_P\right)\,d\mu \leq \nu(X) \lt +\infty, }[/math]

*g* + *ε* 1_{P} ∈ *F* and satisfies

- [math]\displaystyle{ \int_X\left(g + \varepsilon 1_P\right)\,d\mu \gt \int_X g\,d\mu = \sup_{f\in F}\int_X f\,d\mu. }[/math]

This is impossible because it violates the definition of a supremum; therefore, the initial assumption that *ν*_{0} ≠ 0 must be false. Hence, *ν*_{0} = 0, as desired.

**Restricting to finite values** Now, since g is μ-integrable, the set {*x* ∈ *X* : *g*(*x*) = ∞} is μ-null. Therefore, if *f* is defined as

- [math]\displaystyle{ f(x) = \begin{cases} g(x) & \text{if }g(x) \lt \infty \\ 0 & \text{otherwise,} \end{cases} }[/math]

then *f* has the desired properties.

**Uniqueness** As for the uniqueness, let *f*, *g* : *X* → [0, ∞) be measurable functions satisfying

- [math]\displaystyle{ \nu(A) = \int_A f\,d\mu = \int_A g\,d\mu }[/math]

for every measurable set A. Then, *g* − *f* is μ-integrable, and

- [math]\displaystyle{ \int_A(g - f)\,d\mu = 0. }[/math]

This holds in particular for *A* = {*x* ∈ *X* : *f*(*x*) > *g*(*x*)} and for *A* = {*x* ∈ *X* : *f*(*x*) < *g*(*x*)}. It follows that

- [math]\displaystyle{ \int_X(g - f)^+\,d\mu = 0 = \int_X(g - f)^-\,d\mu, }[/math]

and so (*g* − *f*)^{+} = 0 μ-almost everywhere; the same is true for (*g* − *f*)^{−}, and thus *f* = *g* μ-almost everywhere, as desired.

### For σ-finite positive measures

If μ and ν are σ-finite, then X can be written as the union of a sequence {*B*_{n}}_{n} of disjoint sets in Σ, each of which has finite measure under both μ and ν. For each n, by the finite case, there is a Σ-measurable function *f*_{n} : *B*_{n} → [0, ∞) such that

- [math]\displaystyle{ \nu(A) = \int_A f_n\,d\mu }[/math]

for each Σ-measurable subset A of *B*_{n}. The sum [math]\displaystyle{ \left(\sum_n f_n 1_{B_n}\right) := f }[/math] of those functions is then the required function such that [math]\displaystyle{ \nu(A) = \int_A f \, d\mu }[/math].

As for the uniqueness, since each of the *f*_{n} is μ-almost everywhere unique, so is *f*.

### For signed and complex measures

If ν is a σ-finite signed measure, then it can be Hahn–Jordan decomposed as *ν* = *ν*^{+} − *ν*^{−} where one of the measures is finite. Applying the previous result to those two measures, one obtains two functions, *g*, *h* : *X* → [0, ∞), satisfying the Radon–Nikodym theorem for *ν*^{+} and *ν*^{−} respectively, at least one of which is μ-integrable (i.e., its integral with respect to μ is finite). It is clear then that *f* = *g* − *h* satisfies the required properties, including uniqueness, since both g and h are unique up to μ-almost everywhere equality.

If ν is a complex measure, it can be decomposed as *ν* = *ν*_{1} + *iν*_{2}, where both *ν*_{1} and *ν*_{2} are finite-valued signed measures. Applying the above argument, one obtains two real-valued functions, *g*, *h* : *X* → **R**, satisfying the required properties for *ν*_{1} and *ν*_{2}, respectively. Clearly, *f* = *g* + *ih* is the required function.
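The signed case can be sketched on a finite space, where the Hahn–Jordan decomposition amounts to separating the positive and negative point weights. The measures below are hypothetical, and `mu` is assumed to have strictly positive weights so that the pointwise ratios are defined.

```python
from fractions import Fraction

# Hypothetical positive measure mu and signed measure nu on {a, b, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
nu = {"a": Fraction(1), "b": Fraction(-3), "c": Fraction(0)}

# Jordan decomposition nu = nu_plus - nu_minus into nonnegative measures.
nu_plus = {x: max(w, Fraction(0)) for x, w in nu.items()}
nu_minus = {x: max(-w, Fraction(0)) for x, w in nu.items()}

# Nonnegative derivatives of the two parts, then their difference.
g = {x: nu_plus[x] / mu[x] for x in mu}
h = {x: nu_minus[x] / mu[x] for x in mu}
f = {x: g[x] - h[x] for x in mu}

# Derivative of the total variation |nu| = nu_plus + nu_minus.
abs_nu_density = {x: (nu_plus[x] + nu_minus[x]) / mu[x] for x in mu}

print(f)
print(abs_nu_density == {x: abs(f[x]) for x in mu})
```

The last line checks, for this example, that the derivative of the total variation is the absolute value of the derivative of the signed measure.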

## The Lebesgue decomposition theorem

Lebesgue's decomposition theorem shows that the assumptions of the Radon–Nikodym theorem can be met even in a situation which is seemingly more general. Consider a σ-finite positive measure [math]\displaystyle{ \mu }[/math] on the measurable space [math]\displaystyle{ (X,\Sigma) }[/math] and a σ-finite signed measure [math]\displaystyle{ \nu }[/math] on [math]\displaystyle{ \Sigma }[/math], without assuming any absolute continuity. Then there exist unique signed measures [math]\displaystyle{ \nu_a }[/math] and [math]\displaystyle{ \nu_s }[/math] on [math]\displaystyle{ \Sigma }[/math] such that [math]\displaystyle{ \nu=\nu_a+\nu_s }[/math], [math]\displaystyle{ \nu_a\ll\mu }[/math], and [math]\displaystyle{ \nu_s\perp\mu }[/math]. The Radon–Nikodym theorem can then be applied to the pair [math]\displaystyle{ \nu_a,\mu }[/math].
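On a finite space the decomposition can be written down directly: the singular part is the portion of ν carried by the points to which μ assigns zero weight, and the absolutely continuous part is the rest. The sketch below uses hypothetical point weights and sets the density to zero on the μ-null points, which is one admissible choice since the derivative is only determined μ-almost everywhere.

```python
from fractions import Fraction

# Hypothetical measures on {a, b, c}; mu vanishes at "c", so nu is not
# absolutely continuous with respect to mu.
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(0), "c": Fraction(5)}

# Absolutely continuous part: nu restricted to the points with mu > 0.
nu_a = {x: (nu[x] if mu[x] > 0 else Fraction(0)) for x in mu}

# Singular part: nu restricted to the mu-null points.
nu_s = {x: (nu[x] if mu[x] == 0 else Fraction(0)) for x in mu}

# Radon-Nikodym derivative of nu_a with respect to mu (set to 0 on the
# mu-null points, where its value does not matter).
density = {x: (nu_a[x] / mu[x] if mu[x] > 0 else Fraction(0)) for x in mu}

print(nu_a, nu_s, density)
```

Here `nu_s` and `mu` are concentrated on disjoint sets of points, which is the meaning of mutual singularity in this setting.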

## Notes

1. Billingsley, Patrick (1995). *Probability and Measure* (Third ed.). New York: John Wiley & Sons. pp. 419–427. ISBN 0-471-00710-2.
2. Brown, Arlen; Pearcy, Carl (1977). *Introduction to Operator Theory I: Elements of Functional Analysis*. ISBN 0-398-90257-0.
3. Fonseca, Irene; Leoni, Giovanni. *Modern Methods in the Calculus of Variations: L^{p} Spaces*. Springer. p. 68. ISBN 978-0-387-35784-3.

## References

- Lang, Serge (1969). *Analysis II: Real analysis*. Addison-Wesley. Contains a proof for vector measures assuming values in a Banach space.
- Royden, H. L.; Fitzpatrick, P. M. (2010). *Real Analysis* (4th ed.). Pearson. Contains a lucid proof in case the measure *ν* is not σ-finite.
- Shilov, G. E.; Gurevich, B. L. (1978). *Integral, Measure, and Derivative: A Unified Approach*. Richard A. Silverman, trans. Dover Publications. ISBN 0-486-63519-8.
- Stein, Elias M.; Shakarchi, Rami (2005). *Real analysis: measure theory, integration, and Hilbert spaces*. Princeton lectures in analysis. Princeton, N.J.: Princeton University Press. ISBN 978-0-691-11386-9. Contains a proof of the generalisation.
- Teschl, Gerald. "Topics in Real and Functional Analysis". https://www.mat.univie.ac.at/~gerald/ftp/book-fa/index.html.
