Schuette–Nesbitt formula


In mathematics, the Schuette–Nesbitt formula is a generalization of the inclusion–exclusion principle. It is named after Donald R. Schuette and Cecil J. Nesbitt. The probabilistic version of the Schuette–Nesbitt formula has practical applications in actuarial science, where it is used to calculate the net single premium for life annuities and life insurances based on the general symmetric status.

Combinatorial versions

Consider a set Ω and subsets A1, ..., Am. Let

[math]\displaystyle{ N(\omega)=\sum_{n=1}^m 1_{A_n}(\omega),\qquad \omega\in\Omega, }[/math]

(1)

denote the number of subsets to which ω ∈ Ω belongs, where we use the indicator functions of the sets A1, ..., Am. Furthermore, for each k ∈ {0, 1, ..., m}, let

[math]\displaystyle{ N_k(\omega)=\sum_{\scriptstyle J\subset\{1,\ldots,m\}\atop\scriptstyle|J|=k} 1_{\cap_{j\in J}A_j}(\omega),\qquad\omega\in\Omega, }[/math]

(2)

denote the number of intersections of exactly k sets out of A1, ..., Am, to which ω belongs, where the intersection over the empty index set is defined as Ω, hence N0 = 1Ω. Let V denote a vector space over a field R such as the real or complex numbers (or more generally a module over a ring R with multiplicative identity). Then, for every choice of c0, ..., cm ∈ V,

[math]\displaystyle{ \sum_{n=0}^m 1_{\{N=n\}}c_n = \sum_{k=0}^m N_k\sum_{l=0}^k (-1)^{k-l}\binom klc_l, }[/math]

(3)

where 1{N=n} denotes the indicator function of the set of all ω ∈ Ω with N(ω) = n, and [math]\displaystyle{ \textstyle\binom kl }[/math] is a binomial coefficient. Equality (3) says that the two V-valued functions defined on Ω are the same.
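The identity (3) lends itself to a direct numerical check. The following Python sketch (an illustration added here, with a made-up random family of subsets and V taken to be the real numbers) compares both sides of (3) pointwise:

<syntaxhighlight lang="python">
from itertools import combinations
from math import comb
import random

random.seed(0)

# A small universe and a random family of subsets A_1, ..., A_m
Omega = list(range(10))
m = 4
A = [set(random.sample(Omega, random.randint(0, len(Omega)))) for _ in range(m)]

# Arbitrary real coefficients c_0, ..., c_m (here V is the real numbers)
c = [random.uniform(-1, 1) for _ in range(m + 1)]

for w in Omega:
    # N(w): number of the sets A_1, ..., A_m containing w, see (1)
    N = sum(1 for a in A if w in a)
    # N_k(w): number of k-element index sets J with w in the intersection of the A_j, see (2)
    N_k = [sum(1 for J in combinations(range(m), k)
               if all(w in A[j] for j in J)) for k in range(m + 1)]
    lhs = c[N]                                     # left-hand side of (3) at w
    rhs = sum(N_k[k] * sum((-1) ** (k - l) * comb(k, l) * c[l] for l in range(k + 1))
              for k in range(m + 1))               # right-hand side of (3) at w
    assert abs(lhs - rhs) < 1e-9
</syntaxhighlight>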

Proof of (3)

We prove that (3) holds pointwise. Take ω ∈ Ω and define n = N(ω). Then the left-hand side of (3) equals cn. Let I denote the set of all those indices i ∈ {1, ..., m} such that ωAi, hence I contains exactly n indices. Given J ⊂ {1, ..., m} with k elements, then ω belongs to the intersection jJAj if and only if J is a subset of I. By the combinatorial interpretation of the binomial coefficient, there are Nk = [math]\displaystyle{ \textstyle\binom nk }[/math] such subsets (the binomial coefficient is zero for k > n). Therefore the right-hand side of (3) evaluated at ω equals

[math]\displaystyle{ \sum_{k=0}^m \binom nk\sum_{l=0}^k (-1)^{k-l}\binom klc_l =\sum_{l=0}^m\underbrace{\sum_{k=l}^n (-1)^{k-l}\binom nk \binom kl}_{=:\,(*)} c_l, }[/math]

where we used that the first binomial coefficient is zero for k > n. Note that the sum (*) is empty and therefore defined as zero for n < l. Using the factorial formula for the binomial coefficients, it follows that

[math]\displaystyle{ \begin{align} (*) &=\sum_{k=l}^n (-1)^{k-l}\frac{n!}{k!\,(n-k)!}\,\frac{k!}{l!\,(k-l)!}\\ &=\underbrace{\frac{n!}{l!\,(n-l)!}}_{=\binom nl}\underbrace{\sum_{k=l}^n (-1)^{k-l}\frac{(n-l)!}{(n-k)!\,(k-l)!}}_{=:\,(**)}\\ \end{align} }[/math]

Rewriting (**) with the summation index j = k – l and using the binomial formula for the third equality shows that

[math]\displaystyle{ \begin{align} (**) &=\sum_{j=0}^{n-l} (-1)^{j}\frac{(n-l)!}{(n-l-j)!\,j!}\\ &=\sum_{j=0}^{n-l} (-1)^{j}\binom{n-l}{j} =(1-1)^{n-l} =\delta_{ln}, \end{align} }[/math]

which is the Kronecker delta. Substituting this result into the above formula and noting that n choose l equals 1 for l = n, it follows that the right-hand side of (3) evaluated at ω also reduces to cn.
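The binomial identity (*) = δln that carries the proof can also be spot-checked on its own; a minimal Python sketch using only the standard library:

<syntaxhighlight lang="python">
from math import comb

# Check (*) = sum_{k=l}^{n} (-1)^(k-l) C(n,k) C(k,l) = delta_{l,n} for 0 <= l <= n
for n in range(8):
    for l in range(n + 1):
        star = sum((-1) ** (k - l) * comb(n, k) * comb(k, l) for k in range(l, n + 1))
        assert star == (1 if l == n else 0)
</syntaxhighlight>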

Representation in the polynomial ring

As a special case, take for V the polynomial ring R[x] with the indeterminate x. Then (3) can be rewritten in a more compact way as

[math]\displaystyle{ \sum_{n=0}^m 1_{\{N=n\}}x^n = \sum_{k=0}^m N_k(x-1)^k. }[/math]

(4)

This is an identity for two polynomials whose coefficients depend on ω, which is implicit in the notation.

Proof of (4) using (3): Substituting cn = xn for n ∈ {0, ..., m} into (3) and using the binomial formula shows that

[math]\displaystyle{ \sum_{n=0}^m 1_{\{N=n\}}x^n =\sum_{k=0}^m N_k\underbrace{\sum_{l=0}^k \binom kl(-1)^{k-l}x^l}_{=\,(x-1)^k}, }[/math]

which is exactly (4).
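Since (4) is an identity between polynomials, it can also be checked by comparing coefficients. A short Python sketch (a hypothetical example with concrete sets, not taken from the article) does this for one fixed ω:

<syntaxhighlight lang="python">
from itertools import combinations
from math import comb

# A concrete point omega and a small family of subsets A_1, ..., A_4 (made-up toy data)
A = [{1, 2, 3}, {2, 3}, {3, 4}, {1, 3}]
omega = 2
m = len(A)

N = sum(1 for a in A if omega in a)                      # N(omega) from (1)
N_k = [sum(1 for J in combinations(range(m), k)
           if all(omega in A[j] for j in J)) for k in range(m + 1)]  # from (2)

# Compare the coefficient of x^l on both sides of (4)
for l in range(m + 1):
    lhs = 1 if l == N else 0
    rhs = sum(N_k[k] * comb(k, l) * (-1) ** (k - l) for k in range(l, m + 1))
    assert lhs == rhs
</syntaxhighlight>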

Representation with shift and difference operators

Consider the linear shift operator E and the linear difference operator Δ, which we define here on the sequence space of V by

[math]\displaystyle{ \begin{align} E:V^{\mathbb{N}_0}&\to V^{\mathbb{N}_0},\\ E(c_0,c_1,c_2,c_3,\ldots)&=(c_1,c_2,c_3,\ldots),\\ \end{align} }[/math]

and

[math]\displaystyle{ \begin{align} \Delta:V^{\mathbb{N}_0}&\to V^{\mathbb{N}_0},\\ \Delta(c_0,c_1,c_2,c_3,\ldots)&=(c_1-c_0,c_2-c_1,c_3-c_2,\ldots).\\ \end{align} }[/math]

Substituting x = E in (4) shows that

[math]\displaystyle{ \sum_{n=0}^m 1_{\{N=n\}}E^n = \sum_{k=0}^m N_k\Delta^k, }[/math]

(5)

where we used that Δ = E – I, with I denoting the identity operator. Note that E0 and Δ0 equal the identity operator I on the sequence space, and that Ek and Δk denote the k-fold compositions.

Let (Δkc)0 denote the 0th component of the k-fold composition Δk applied to c = (c0, c1, ..., cm, ...), where Δ0 denotes the identity. Then (3) can be rewritten in a more compact way as

[math]\displaystyle{ \sum_{n=0}^m 1_{\{N=n\}}c_n = \sum_{k=0}^m N_k(\Delta^k c)_0. }[/math]

(6)
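The following Python sketch (illustrative only; it truncates the sequences to their first m + 1 components, which is all that (6) uses) implements the difference operator and checks (6) pointwise for the same toy data as above:

<syntaxhighlight lang="python">
from itertools import combinations

def delta(c):
    """Difference operator: (c_0, c_1, c_2, ...) -> (c_1 - c_0, c_2 - c_1, ...)."""
    return [c[i + 1] - c[i] for i in range(len(c) - 1)]

def delta_k_at_zero(c, k):
    """0th component of the k-fold difference operator applied to c."""
    for _ in range(k):
        c = delta(c)
    return c[0]

# A concrete point omega and a small family of subsets (made-up toy data)
A = [{1, 2, 3}, {2, 3}, {3, 4}, {1, 3}]
omega, m = 2, len(A)
c = [5.0, -1.0, 2.0, 7.0, 0.5]                           # c_0, ..., c_m

N = sum(1 for a in A if omega in a)                      # N(omega) from (1)
N_k = [sum(1 for J in combinations(range(m), k)
           if all(omega in A[j] for j in J)) for k in range(m + 1)]  # from (2)

lhs = c[N]                                               # left-hand side of (6) at omega
rhs = sum(N_k[k] * delta_k_at_zero(c, k) for k in range(m + 1))
assert abs(lhs - rhs) < 1e-12
</syntaxhighlight>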

Probabilistic versions

Consider arbitrary events A1, ..., Am in a probability space (Ω, F, [math]\displaystyle{ \mathbb{P} }[/math]) and let [math]\displaystyle{ \mathbb{E} }[/math] denote the expectation operator. Then N from (1) is the random number of these events which occur simultaneously. Using Nk from (2), define

[math]\displaystyle{ S_k=\mathbb{E}[N_k]=\sum_{\scriptstyle J\subset\{1,\ldots,m\}\atop\scriptstyle|J|=k} \mathbb{P}\biggl(\bigcap_{j\in J}A_j\biggr),\qquad k\in\{0,\ldots,m\}, }[/math]

(7)

where the intersection over the empty index set is again defined as Ω, hence S0 = 1. If the ring R is also an algebra over the real or complex numbers, then taking the expectation of the coefficients in (4) and using the notation from (7),

[math]\displaystyle{ \sum_{n=0}^m \mathbb{P}(N=n)x^n = \sum_{k=0}^m S_k(x-1)^k }[/math]

(4')

in R[x]. If R is the field of real numbers, then this is the probability-generating function of the probability distribution of N.
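As a sanity check of (4'), both sides can be evaluated as functions of x on a toy probability space; a sketch with made-up, equally likely outcomes (an assumption for illustration only):

<syntaxhighlight lang="python">
from itertools import combinations
from fractions import Fraction

# Illustrative finite probability space with equally likely outcomes
Omega = list(range(6))
p = Fraction(1, len(Omega))
A = [{0, 1, 2}, {1, 2, 3}, {2, 4}]        # events A_1, A_2, A_3
m = len(A)

# S_k from (7)
S = [sum(p * sum(1 for w in Omega if all(w in A[j] for j in J))
         for J in combinations(range(m), k)) for k in range(m + 1)]

# Compare both sides of (4') as functions of x at a few rational points
for x in (Fraction(0), Fraction(1), Fraction(1, 2), Fraction(3), Fraction(-2)):
    lhs = sum(p * x ** sum(1 for a in A if w in a) for w in Omega)   # E[x^N]
    rhs = sum(S[k] * (x - 1) ** k for k in range(m + 1))
    assert lhs == rhs
</syntaxhighlight>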

Similarly, (5) and (6) yield

[math]\displaystyle{ \sum_{n=0}^m \mathbb{P}(N=n)E^n=\sum_{k=0}^m S_k\Delta^k }[/math]

(5')

and, for every sequence c = (c0, c1, c2, c3, ..., cm, ...),

[math]\displaystyle{ \sum_{n=0}^m \mathbb{P}(N=n)\,c_n=\sum_{k=0}^m S_k\,(\Delta^kc)_0. }[/math]

(6')

The quantity on the left-hand side of (6') is the expected value of cN.
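A brute-force verification of (6') on the same kind of toy probability space (again with made-up, equally likely outcomes, and exact rational arithmetic) looks as follows:

<syntaxhighlight lang="python">
from itertools import combinations
from fractions import Fraction

def delta_k_at_zero(c, k):
    """0th component of the k-fold difference operator applied to the sequence c."""
    for _ in range(k):
        c = [c[i + 1] - c[i] for i in range(len(c) - 1)]
    return c[0]

# Toy probability space with equally likely outcomes (an illustrative assumption)
Omega = list(range(6))
p = Fraction(1, len(Omega))
A = [{0, 1, 2}, {1, 2, 3}, {2, 4}]                         # events A_1, A_2, A_3
m = len(A)
c = [Fraction(3), Fraction(-2), Fraction(5), Fraction(1)]  # c_0, ..., c_m

# S_k from (7)
S = [sum(p * sum(1 for w in Omega if all(w in A[j] for j in J))
         for J in combinations(range(m), k)) for k in range(m + 1)]

# E[c_N], computed directly over the outcomes (left-hand side of (6'))
lhs = sum(p * c[sum(1 for a in A if w in a)] for w in Omega)
# Right-hand side of (6')
rhs = sum(S[k] * delta_k_at_zero(c, k) for k in range(m + 1))
assert lhs == rhs
</syntaxhighlight>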

Remarks

  1. In actuarial science, the name Schuette–Nesbitt formula refers to equation (6'), where V denotes the set of real numbers.
  2. The left-hand side of equation (5') is a convex combination of the powers of the shift operator E; it can be seen as the expected value of the random operator EN. Accordingly, the left-hand side of equation (6') is the expected value of the random component cN. Note that N has a discrete probability distribution with finite support, hence both expectations are just well-defined finite sums.
  3. The probabilistic version of the inclusion–exclusion principle can be derived from equation (6') by choosing the sequence c = (0, 1, 1, ...): the left-hand side reduces to the probability of the event {N ≥ 1}, which is the union of A1, ..., Am, and the right-hand side is S1 – S2 + S3 – ... – (–1)mSm, because (Δ0c)0 = 0 and (Δkc)0 = –(–1)k for k ∈ {1, ..., m}. (A numerical check of this appears after the list below.)
  4. Equations (5), (5'), (6) and (6') are also true when the shift operator and the difference operator are considered on a subspace like the ℓp spaces.
  5. If desired, the formulae (5), (5'), (6) and (6') can be considered in finite dimensions, because only the first m + 1 components of the sequences matter. Hence, represent the linear shift operator E and the linear difference operator Δ as mappings of the (m + 1)-dimensional Euclidean space into itself, given by the (m + 1) × (m + 1)-matrices
[math]\displaystyle{ E=\begin{pmatrix} 0&1&0&\cdots&0\\ 0&0&1&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&0\\ 0&\cdots&0&0&1\\ 0&\cdots&0&0&0 \end{pmatrix}, \qquad \Delta=\begin{pmatrix} -1&1&0&\cdots&0\\ 0&-1&1&\ddots&\vdots\\ \vdots&\ddots&\ddots&\ddots&0\\ 0&\cdots&0&-1&1\\ 0&\cdots&0&0&-1 \end{pmatrix}, }[/math]
and let I denote the (m + 1)-dimensional identity matrix. Then (6) and (6') hold for every vector c = (c0, c1, ..., cm)T in (m + 1)-dimensional Euclidean space, where the exponent T in the definition of c denotes the transpose.
  6. Equations (5) and (5') hold for an arbitrary linear operator E as long as Δ is the difference of E and the identity operator I.
  7. The probabilistic versions (4'), (5') and (6') can be generalized to every finite measure space.
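As announced in remark 3, here is a short numerical check (with the same kind of made-up toy space as above) of the coefficients (Δkc)0 for c = (0, 1, 1, ...) and of the resulting inclusion–exclusion formula:

<syntaxhighlight lang="python">
from itertools import combinations
from fractions import Fraction

def delta_k_at_zero(c, k):
    """0th component of the k-fold difference operator applied to c."""
    for _ in range(k):
        c = [c[i + 1] - c[i] for i in range(len(c) - 1)]
    return c[0]

m = 3
c = [0] + [1] * (m + 1)                  # the sequence (0, 1, 1, 1, 1)
# The coefficients claimed in remark 3
assert delta_k_at_zero(c, 0) == 0
for k in range(1, m + 1):
    assert delta_k_at_zero(c, k) == -(-1) ** k

# Inclusion-exclusion on a toy space: P(A_1 ∪ ... ∪ A_m) = S_1 - S_2 + ... - (-1)^m S_m
Omega = list(range(6))
p = Fraction(1, len(Omega))
A = [{0, 1, 2}, {1, 2, 3}, {2, 4}]
S = [sum(p * sum(1 for w in Omega if all(w in A[j] for j in J))
         for J in combinations(range(m), k)) for k in range(m + 1)]
assert p * len(set().union(*A)) == sum(-(-1) ** k * S[k] for k in range(1, m + 1))
</syntaxhighlight>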

For textbook presentations of the probabilistic Schuette–Nesbitt formula (6') and its applications to actuarial science, cf. (Gerber 1997), Chapter 8, or (Bowers Gerber), Chapter 18 and the Appendix, pp. 577–578.

History

For independent events, the formula (6') appeared in a discussion by Donald R. Schuette and Cecil J. Nesbitt of a paper by Robert P. White and T. N. E. Greville, see (Schuette Nesbitt). In the two-page note (Gerber 1979), Hans U. Gerber called it the Schuette–Nesbitt formula and generalized it to arbitrary events. Christian Buchta, see (Buchta 1994), noticed the combinatorial nature of the formula and published an elementary combinatorial proof of (3).

Cecil J. Nesbitt, PhD, F.S.A., M.A.A.A., received his mathematical education at the University of Toronto and the Institute for Advanced Study in Princeton. He taught actuarial mathematics at the University of Michigan from 1938 to 1980. He served the Society of Actuaries from 1985 to 1987 as Vice-President for Research and Studies. Professor Nesbitt died in 2001. (Short CV taken from (Bowers Gerber), page xv.)

Donald Richard Schuette was a PhD student of C. Nesbitt; he later became a professor at the University of Wisconsin–Madison.

The probabilistic version of the Schuette–Nesbitt formula (6') generalizes much older formulae of Waring, which express the probability of the events {N = n} and {N ≥ n} in terms of S1, S2, ..., Sm. More precisely, with [math]\displaystyle{ \textstyle\binom kn }[/math] denoting the binomial coefficient,

[math]\displaystyle{ \mathbb{P}(N=n)=\sum_{k=n}^m(-1)^{k-n}\binom kn S_k, \qquad n\in\{0,\ldots,m\}, }[/math]

(8)

and

[math]\displaystyle{ \mathbb{P}(N\ge n)=\sum_{k=n}^m(-1)^{k-n}\binom{k-1}{n-1}S_k,\qquad n\in\{1,\ldots,m\}, }[/math]

(9)

see (Feller 1968), Sections IV.3 and IV.5, respectively.

To see that these formulae are special cases of the probabilistic version of the Schuette–Nesbitt formula, note that by the binomial theorem

[math]\displaystyle{ \Delta^k=(E-I)^k=\sum_{j=0}^k\binom kj (-1)^{k-j}E^j,\qquad k\in\mathbb{N}_0. }[/math]

Applying this operator identity to the sequence c = (0, ..., 0, 1, 0, 0, ...) with n leading zeros and noting that (Ejc)0 = 1 if j = n and (Ejc)0 = 0 otherwise, the formula (8) for {N = n} follows from (6').

Applying the identity to c = (0, ..., 0, 1, 1, 1, ...) with n leading zeros and noting that (Ejc)0 = 1 if j ≥ n and (Ejc)0 = 0 otherwise, equation (6') implies that

[math]\displaystyle{ \mathbb{P}(N\ge n)=\sum_{k=n}^m S_k\sum_{j=n}^k\binom kj(-1)^{k-j}. }[/math]

Expanding (1 – 1)k = 0 using the binomial theorem and using the identity [math]\displaystyle{ \textstyle\sum_{j=0}^{n-1}(-1)^j\binom kj=(-1)^{n-1}\binom{k-1}{n-1} }[/math] for alternating partial sums of binomial coefficients, we obtain

[math]\displaystyle{ \sum_{j=n}^k\binom kj(-1)^{k-j} =-\sum_{j=0}^{n-1}\binom kj(-1)^{k-j} =(-1)^{k-n}\binom{k-1}{n-1}. }[/math]

Hence, we have the formula (9) for {N ≥ n}.
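Both Waring formulas can be compared against a direct enumeration; a Python sketch on a small, made-up probability space (the same illustrative setup as above):

<syntaxhighlight lang="python">
from itertools import combinations
from fractions import Fraction
from math import comb

# Toy probability space with equally likely outcomes (an illustrative assumption)
Omega = list(range(6))
p = Fraction(1, len(Omega))
A = [{0, 1, 2}, {1, 2, 3}, {2, 4}]
m = len(A)

# S_k from (7)
S = [sum(p * sum(1 for w in Omega if all(w in A[j] for j in J))
         for J in combinations(range(m), k)) for k in range(m + 1)]

# Distribution of N by direct counting
P_N = [p * sum(1 for w in Omega if sum(1 for a in A if w in a) == n) for n in range(m + 1)]

# Waring's formula (8) for P(N = n)
for n in range(m + 1):
    assert P_N[n] == sum((-1) ** (k - n) * comb(k, n) * S[k] for k in range(n, m + 1))

# Waring's formula (9) for P(N >= n)
for n in range(1, m + 1):
    tail = sum(P_N[n:])
    assert tail == sum((-1) ** (k - n) * comb(k - 1, n - 1) * S[k] for k in range(n, m + 1))
</syntaxhighlight>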

Applications

In actuarial science

Problem: Suppose there are m persons aged x1, ..., xm with independent remaining random lifetimes T1, ..., Tm. Suppose the group signs a life insurance contract which pays them after t years the amount cn if exactly n persons out of m are still alive after t years. What is the expected payout of this insurance contract after t years?

Solution: Let Aj denote the event that person j survives t years, which means that Aj = {Tj > t}. In actuarial notation the probability of this event is denoted by t pxj and can be taken from a life table. Use independence to calculate the probability of intersections. Calculate S1, ..., Sm and use the probabilistic version of the Schuette–Nesbitt formula (6') to calculate the expected value of cN.
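A sketch of this recipe in code, with made-up survival probabilities and payout amounts (they stand in for values that would come from a life table and from the contract):

<syntaxhighlight lang="python">
from itertools import combinations

def delta_k_at_zero(c, k):
    """0th component of the k-fold difference operator applied to c."""
    for _ in range(k):
        c = [c[i + 1] - c[i] for i in range(len(c) - 1)]
    return c[0]

# Hypothetical t-year survival probabilities p_j = P(T_j > t) for m = 3 persons
p = [0.95, 0.90, 0.80]
m = len(p)
# Hypothetical payouts c_n if exactly n persons are alive after t years
c = [0.0, 10_000.0, 25_000.0, 60_000.0]

# S_k from (7); by independence, P of the intersection over J is the product of the p_j, j in J
S = []
for k in range(m + 1):
    s = 0.0
    for J in combinations(range(m), k):
        prod = 1.0
        for j in J:
            prod *= p[j]
        s += prod
    S.append(s)

# Expected payout via the Schuette-Nesbitt formula (6')
expected_payout = sum(S[k] * delta_k_at_zero(c, k) for k in range(m + 1))
print(expected_payout)   # about 48 435 for these made-up numbers
</syntaxhighlight>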

In probability theory

Let σ be a random permutation of the set {1, ..., m} and let Aj denote the event that j is a fixed point of σ, meaning that Aj = {σ(j) = j}. When all the numbers in a subset J of {1, ..., m} are fixed points, there are (m – |J|)! ways to permute the remaining m – |J| numbers, hence

[math]\displaystyle{ \mathbb{P}\biggl(\bigcap_{j\in J}A_j\biggr)=\frac{(m-|J|)!}{m!}. }[/math]

By the combinatorial interpretation of the binomial coefficient, there are [math]\displaystyle{ \textstyle\binom mk }[/math] different choices of a subset J of {1, ..., m} with k elements, hence (7) simplifies to

[math]\displaystyle{ S_k=\binom mk \frac{(m-k)!}{m!}=\frac1{k!}. }[/math]

Therefore, using (4'), the probability-generating function of the number N of fixed points is given by

[math]\displaystyle{ \mathbb{E}[x^N]=\sum_{k=0}^m\frac{(x-1)^k}{k!},\qquad x\in\mathbb{R}. }[/math]

This is the partial sum of the infinite series giving the exponential function at x – 1, which in turn is the probability-generating function of the Poisson distribution with parameter 1. Therefore, as m tends to infinity, the distribution of N converges to the Poisson distribution with parameter 1.
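For small m, the exact distribution obtained from S_k = 1/k! (the classical rencontres and derangement probabilities) can be checked by enumerating all permutations; a short Python sketch, with the distribution read off via Waring's formula (8):

<syntaxhighlight lang="python">
from itertools import permutations
from math import comb, factorial
from fractions import Fraction

m = 6

# Exact distribution of the number N of fixed points by enumerating all m! permutations
counts = [0] * (m + 1)
for sigma in permutations(range(m)):
    counts[sum(1 for j in range(m) if sigma[j] == j)] += 1
exact = [Fraction(cnt, factorial(m)) for cnt in counts]

# Distribution predicted by (4') with S_k = 1/k!, equivalently by Waring's formula (8)
predicted = [sum(Fraction((-1) ** (k - n) * comb(k, n), factorial(k)) for k in range(n, m + 1))
             for n in range(m + 1)]

assert exact == predicted
</syntaxhighlight>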
