Covering number

In mathematics, a covering number is the number of balls of a given size needed to completely cover a given space, with possible overlaps between the balls. The covering number quantifies the size of a set and can be applied to general metric spaces. Two related concepts are the packing number, the number of disjoint balls that fit in a space, and the metric entropy, the number of points that fit in a space when constrained to lie at some fixed minimum distance apart.

Definition

Let (M, d) be a metric space, let K be a subset of M, and let r be a positive real number. Let Br(x) denote the ball of radius r centered at x. A subset C of M is an r-external covering of K if:

[math]\displaystyle{ K \subseteq \bigcup_{x \in C} B_r(x) }[/math].

In other words, for every [math]\displaystyle{ y\in K }[/math] there exists [math]\displaystyle{ x\in C }[/math] such that [math]\displaystyle{ d(x,y)\leq r }[/math].

If furthermore C is a subset of K, then it is an r-internal covering.

The external covering number of K, denoted [math]\displaystyle{ N^{\text{ext}}_r(K) }[/math], is the minimum cardinality of any r-external covering of K. The internal covering number, denoted [math]\displaystyle{ N^{\text{int}}_r(K) }[/math], is the minimum cardinality of any r-internal covering.

A subset P of K is an r-packing if the balls [math]\displaystyle{ \{B_r(x)\}_{x \in P} }[/math] are pairwise disjoint. The packing number of K, denoted [math]\displaystyle{ N^{\text{pack}}_r(K) }[/math], is the maximum cardinality of any r-packing of K.

A subset S of K is r-separated if every pair of distinct points x and y in S satisfies d(x, y) ≥ r. The metric entropy of K, denoted [math]\displaystyle{ N^{\text{met}}_r(K) }[/math], is the maximum cardinality of any r-separated subset of K.
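
For intuition, all four quantities can be computed by brute force on a small finite metric space. The following Python sketch does so for a handful of points on the real line; the function names are illustrative rather than standard, and the exhaustive search is only feasible for tiny examples.

<syntaxhighlight lang="python">
from itertools import combinations

def dist(x, y):
    return abs(x - y)  # the metric; here: points on the real line

def is_r_cover(C, K, r):
    # every point of K lies within distance r of some center in C
    return all(any(dist(c, y) <= r for c in C) for y in K)

def internal_covering_number(K, r):
    # minimum size of a subset of K whose radius-r balls cover K
    return min(n for n in range(1, len(K) + 1)
               if any(is_r_cover(C, K, r) for C in combinations(K, n)))

def packing_number(K, r):
    # maximum size of a subset whose closed radius-r balls are
    # pairwise disjoint, i.e. all pairwise distances exceed 2r
    return max(n for n in range(1, len(K) + 1)
               if any(all(dist(x, y) > 2 * r for x, y in combinations(P, 2))
                      for P in combinations(K, n)))

def metric_entropy(K, r):
    # maximum size of an r-separated subset (all pairwise distances >= r)
    return max(n for n in range(1, len(K) + 1)
               if any(all(dist(x, y) >= r for x, y in combinations(S, 2))
                      for S in combinations(K, n)))

K = [0.0, 0.3, 1.0, 1.4, 2.0]
r = 0.5
print(internal_covering_number(K, r))  # 3: e.g. {0.3, 1.0, 2.0} covers K
print(packing_number(K, r))            # 2: e.g. {0.0, 1.4}
print(metric_entropy(K, r))            # 3: e.g. {0.0, 1.0, 2.0}
</syntaxhighlight>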

Examples

  1. The metric space is the real line [math]\displaystyle{ \mathbb{R} }[/math]. [math]\displaystyle{ K\subset \mathbb{R} }[/math] is a set of real numbers whose absolute value is at most [math]\displaystyle{ k }[/math]. Then the interval [math]\displaystyle{ [-k, k] }[/math] can be covered by [math]\displaystyle{ \left\lceil \frac{2k}{r} \right\rceil }[/math] intervals of length [math]\displaystyle{ r }[/math], i.e. balls of radius [math]\displaystyle{ r/2 }[/math], and a fortiori by as many balls of radius [math]\displaystyle{ r }[/math] (a numerical check follows this list). Hence:
    [math]\displaystyle{ N^{\text{ext}}_r(K) \leq \left\lceil \frac{2 k}{r} \right\rceil }[/math]
  2. The metric space is the Euclidean space [math]\displaystyle{ \mathbb{R}^m }[/math] with the Euclidean metric. [math]\displaystyle{ K\subset \mathbb{R}^m }[/math] is a set of vectors whose length (norm) is at most [math]\displaystyle{ k }[/math]. If [math]\displaystyle{ K }[/math] lies in a d-dimensional subspace of [math]\displaystyle{ \mathbb{R}^m }[/math], then:[1]:337
    [math]\displaystyle{ N^{\text{ext}}_r(K) \leq \left(\frac{2 k \sqrt{d}}{r}\right)^d }[/math].
  3. The metric space is the space of real-valued functions, with the l-infinity metric. The covering number [math]\displaystyle{ N^{\text{int}}_r(K) }[/math] is the smallest number [math]\displaystyle{ k }[/math] such that there exist [math]\displaystyle{ h_1,\ldots,h_k \in K }[/math] with the property that for every [math]\displaystyle{ h\in K }[/math] there is an [math]\displaystyle{ i\in\{1,\ldots,k\} }[/math] for which the supremum distance between [math]\displaystyle{ h }[/math] and [math]\displaystyle{ h_i }[/math] is at most [math]\displaystyle{ r }[/math]. The bound from example 2 is not relevant here, since the space is infinite-dimensional. However, when [math]\displaystyle{ K }[/math] is a compact set, every covering of it has a finite sub-covering, so [math]\displaystyle{ N^{\text{int}}_r(K) }[/math] is finite.[2]:61
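
As a sanity check of example 1, the construction can be carried out explicitly: lay [math]\displaystyle{ \left\lceil \frac{2k}{r} \right\rceil }[/math] intervals of length [math]\displaystyle{ r }[/math] side by side over [math]\displaystyle{ [-k, k] }[/math]. A minimal Python sketch (the helper name is illustrative):

<syntaxhighlight lang="python">
import math

def interval_cover_centers(k, r):
    # centers of ceil(2k/r) intervals of length r (balls of radius r/2)
    # laid side by side over [-k, k]
    n = math.ceil(2 * k / r)
    return [-k + r / 2 + i * r for i in range(n)]

k, r = 3.0, 0.4
centers = interval_cover_centers(k, r)
# every point of a fine grid on [-k, k] is within r/2 of some center,
# hence certainly within r, so the centers form an r-external covering
grid = [-k + 2 * k * i / 10_000 for i in range(10_001)]
assert all(min(abs(x - c) for c in centers) <= r / 2 + 1e-9 for x in grid)
print(len(centers), "== ceil(2k/r) ==", math.ceil(2 * k / r))
</syntaxhighlight>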

Properties

  1. The internal and external covering numbers, the packing number, and the metric entropy are all closely related. The following chain of inequalities holds for any subset K of a metric space and any positive real number r (it is checked numerically in the sketch after this list).[3]
    [math]\displaystyle{ N^{\text{met}}_{2 r}(K) \leq N^{\text{pack}}_r(K) \leq N^{\text{ext}}_r(K) \leq N^{\text{int}}_r(K) \leq N^{\text{met}}_r(K) }[/math]
  2. Each of these quantities except the internal covering number is non-increasing in r and non-decreasing in K. The internal covering number is non-increasing in r but not necessarily monotone in K.
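
Since all four quantities can be computed by brute force on finite sets, the chain can be verified empirically. A minimal Python sketch on random finite subsets of the line (helper names are illustrative; closed balls are used throughout, and with continuous random data boundary ties such as d(x, y) = 2r, where packing conventions differ, almost surely do not occur). The external covering number is sandwiched between the packing number and the internal covering number, so it is not computed separately.

<syntaxhighlight lang="python">
import random
from itertools import combinations

def d(x, y):
    return abs(x - y)

def N_int(K, r):   # minimum internal r-covering
    return min(n for n in range(1, len(K) + 1)
               if any(all(any(d(c, y) <= r for c in C) for y in K)
                      for C in combinations(K, n)))

def N_pack(K, r):  # maximum r-packing (disjoint closed balls: d > 2r)
    return max(n for n in range(1, len(K) + 1)
               if any(all(d(x, y) > 2 * r for x, y in combinations(P, 2))
                      for P in combinations(K, n)))

def N_met(K, r):   # maximum r-separated subset (d >= r)
    return max(n for n in range(1, len(K) + 1)
               if any(all(d(x, y) >= r for x, y in combinations(S, 2))
                      for S in combinations(K, n)))

random.seed(0)
for _ in range(100):
    K = [random.uniform(0, 1) for _ in range(6)]
    r = random.uniform(0.05, 0.5)
    assert N_met(K, 2 * r) <= N_pack(K, r) <= N_int(K, r) <= N_met(K, r)
print("chain of inequalities verified on 100 random examples")
</syntaxhighlight>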

The following properties relate to covering numbers in the standard Euclidean space, [math]\displaystyle{ \mathbb{R}^m }[/math]:[1]:338

  1. If all vectors in [math]\displaystyle{ K }[/math] are translated by a constant vector [math]\displaystyle{ k_0\in \mathbb{R}^m }[/math], then the covering number does not change (this and the next property are checked numerically in the sketch after this list).
  2. If all vectors in [math]\displaystyle{ K }[/math] are multiplied by a scalar [math]\displaystyle{ k \in \mathbb{R} }[/math], then:
    for all [math]\displaystyle{ r }[/math]: [math]\displaystyle{ N^{\text{ext}}_{|k|\cdot r}(k\cdot K) = N^{\text{ext}}_{r}(K) }[/math]
  3. If all vectors in [math]\displaystyle{ K }[/math] are mapped by a Lipschitz function [math]\displaystyle{ \phi }[/math] with Lipschitz constant [math]\displaystyle{ k }[/math], then:
    for all [math]\displaystyle{ r }[/math]: [math]\displaystyle{ N^{\text{ext}}_{|k|\cdot r}(\phi\circ K) \leq N^{\text{ext}}_{r}(K) }[/math]
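
Properties 1 and 2 can be sanity-checked numerically. The sketch below does so for the internal covering number of a small finite set, which is exactly computable and obeys the same two invariances, since translation and scaling map internal covers to internal covers (the helper name is illustrative):

<syntaxhighlight lang="python">
from itertools import combinations

def N_int(K, r):
    # minimum number of points of K whose radius-r balls cover K
    return min(n for n in range(1, len(K) + 1)
               if any(all(any(abs(c - y) <= r for c in C) for y in K)
                      for C in combinations(K, n)))

K = [0.0, 0.7, 1.1, 2.0, 2.2]
r = 0.5
shifted = [x + 10.0 for x in K]   # property 1: translation by k_0 = 10
scaled  = [-3.0 * x for x in K]   # property 2: scaling by k = -3
assert N_int(shifted, r) == N_int(K, r)
assert N_int(scaled, abs(-3.0) * r) == N_int(K, r)
print("translation and scaling invariance hold for this example")
</syntaxhighlight>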

Application to machine learning

Let [math]\displaystyle{ K }[/math] be a space of real-valued functions, with the l-infinity metric (see example 3 above). Suppose all functions in [math]\displaystyle{ K }[/math] are bounded by a real constant [math]\displaystyle{ M }[/math]. Then, the covering number can be used to bound the generalization error of learning functions from [math]\displaystyle{ K }[/math], relative to the squared loss:[2]:61

[math]\displaystyle{ \operatorname{Prob}\left[ \sup_{h\in K} \big\vert\text{GeneralizationError}(h) - \text{EmpiricalError}(h)\big\vert \geq \epsilon \right] \leq N^{\text{int}}_r(K)\, 2\exp\left(-\frac{m\epsilon^2}{2M^4}\right) }[/math]

where [math]\displaystyle{ r = {\epsilon \over 8M} }[/math] and [math]\displaystyle{ m }[/math] is the number of samples.
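
To get a feel for the bound, the right-hand side can be evaluated numerically once a covering number estimate is available. A minimal sketch; the covering number value below is hypothetical, supplied only to illustrate the exponential decay in the number of samples:

<syntaxhighlight lang="python">
import math

def generalization_bound(N_r, m, eps, M):
    # right-hand side of the displayed inequality:
    # N_r * 2 * exp(-m * eps^2 / (2 * M^4))
    return N_r * 2.0 * math.exp(-m * eps**2 / (2.0 * M**4))

# hypothetical numbers: a class with covering number 10**6 at r = eps/(8M)
for m in (10_000, 100_000, 1_000_000):
    print(m, generalization_bound(N_r=10**6, m=m, eps=0.1, M=1.0))
</syntaxhighlight>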

References

  1. Shalev-Shwartz, Shai; Ben-David, Shai (2014). Understanding Machine Learning – from Theory to Algorithms. Cambridge University Press. ISBN 9781107057135.
  2. Mohri, Mehryar; Rostamizadeh, Afshin; Talwalkar, Ameet (2012). Foundations of Machine Learning. Cambridge, MA: MIT Press. ISBN 9780262018258.
  3. Tao, Terence. "Metric entropy analogues of sum set theory". http://terrytao.wordpress.com/2014/03/19/metric-entropy-analogues-of-sum-set-theory/. Retrieved 2 June 2014.