Covering design

From HandWiki
Short description: Collection of subsets covering all t-element subsets


In combinatorial design theory, a covering design, or a (v, k, t)-covering design, is a collection of k-element subsets, called blocks, chosen from a v-element set, such that every t-element subset is contained in at least one block. The minimum number of blocks required is the covering number, denoted C(v,k,t). Covering designs generalize the notion of a Steiner system and are dual to the concept of a packing design. They have connections to coding theory, Turán theory, and finite geometry.[1]

Definition

Let V be a set of v elements. A (v, k, t)-covering design is a family of k-element subsets of V (each called a block) such that every t-element subset of V is contained in at least one block in . The parameters must satisfy vkt0.

The covering number C(v,k,t) is the smallest possible number of blocks in a (v,k,t)-covering design.[2]

The density of a covering design with || blocks is the average number of blocks containing a given t-element subset, equal to[1]

||(kt)(vt).

The density is always at least 1, and equals 1 if and only if every t-element subset is covered by exactly one block, which is the defining condition of a Steiner system.

Example

Consider v=5, k=3, t=2. The four blocks

{1,2,3},{1,4,5},{2,4,5},{3,4,5}

form a (5, 3, 2)-covering design over {1,2,3,4,5}, since every pair of elements appears in at least one block. This is optimal: each block covers three of the ten possible pairs, but any two distinct 3-element subsets of a 5-element set share at most one pair, so three blocks can cover at most nine pairs. Therefore C(5,3,2)=4.

Bounds

Lower bounds

The most widely used general lower bound is the Schönheim bound, based on a recursive inequality due to Schönheim (1964):[3]

C(v,k,t)vkC(v1,k1,t1).

This inequality has an intuitive interpretation: the element of a covering that appears in the fewest blocks appears in at most (k/v)C(v,k,t) blocks, and removing that element and all other blocks yields a (v1,k1,t1)-covering.[1] Applying the inequality recursively yields the closed-form Schönheim bound:

C(v,k,t)L(v,k,t)=vkv1k1vt+1kt+1.

A bound due to de Caen is sometimes tighter, particularly when k and t are not too small:[4]

C(v,k,t)(t+1)(vt)(k+1)(vk)(vt)(kt).

Upper bounds

Rödl (1985) proved that for fixed k and t, there exist coverings whose density approaches 1 as v. This implies that the covering number is asymptotically equal to (vt)/(kt), matching the trivial lower bound.[5]

A weaker but fully explicit bound, valid for all parameter values, was given by Erdős and Spencer (1974) using the probabilistic method:[6]

C(v,k,t)(vt)(kt)(1+ln(kt)).

This bound can be improved by at most a constant factor (at most 4ln22.77) asymptotically, as demonstrated by the case (v,v1,v/2), where the Schönheim lower bound gives density asymptotic to v/4 while the Erdős–Spencer upper bound gives density asymptotic to vln2.[1]

Construction methods

Several general methods are known for constructing covering designs.

Greedy algorithm

A greedy covering is constructed by the following procedure:

  1. Arrange all k-element subsets of the v-set in a list.
  2. Select the k-subset that covers the greatest number of currently uncovered t-subsets (breaking ties by position in the list).
  3. Repeat until all t-subsets are covered.

This approach is analogous to the greedy algorithm of Conway and Sloane for constructing lexicographic codes.[7] The list of k-subsets may be arranged in lexicographic order, colexicographic order, Gray code order, or random order; different orderings can yield coverings of different sizes. The greedy method is completely general (it applies to all valid parameters) and in practice produces quite good results: approximately 42% of the table entries computed by Gordon, Kuperberg, and Patashnik came from greedy coverings. Notably, the Steiner system S(24,8,5) arises as a greedy covering under lexicographic order.[1]

The main disadvantage is computational cost. For fixed k and t, the algorithm requires time and space O(vk).[1]

Finite geometry coverings

Finite geometries yield very good and often optimal coverings for certain parameter families.[8][9]

In the projective geometry PG(m,q) over the field GF(q), the k-dimensional subspaces (k-flats) cover every set of k+1 independent points. Taking the (qm+11)/(q1) points as elements and the k-flats as blocks gives the bound:

C(qm+11q1,qk+11q1,k+1)(m+1k+1)q

where (nk)q denotes the Gaussian binomial coefficient. Similarly, the affine geometry AG(m,q) gives:

C(qm,qk,k+1)qmk(mk)q.

In both cases, equality holds when k=m1 or k=1. The coverings by lines (k=1) are Steiner systems.[1]

Induced coverings

An induced covering constructs a smaller (v,k,t)-covering from a larger (v,k,t)-covering, where v<v and k<k. The procedure is to randomly choose v elements of the v-set, restrict each block to the chosen elements, and then adjust block sizes: blocks smaller than k are padded with arbitrary elements, while blocks larger than k are replaced by a covering of their elements. The resulting family forms a valid (v,k,t)-covering.[1]

This method tends to work best when k/kv/v, and when a good covering, such as one from a finite geometry, is used as the starting point.[1]

Combining smaller coverings

A (v1+v2,k,t)-covering can be assembled from coverings on disjoint sets of sizes v1 and v2. For each partition of t elements into s elements from the first set and ts from the second, one uses a (v1,,s)-covering and a (v2,k,ts)-covering. Optimizing over for each value of s yields the bound:[1]

C(v1+v2,k,t)s=0tminC(v1,,s)C(v2,k,ts).

The products in this sum can have redundancy, which can be reduced using dynamic programming to combine adjacent terms and produce tighter bounds. This method accounts for approximately 30% of the table entries computed by Gordon, Kuperberg, and Patashnik.[1] It includes as special cases a number of elementary constructions, such as:

  • C(v,k+1,t)C(v,k,t) (adding a random element to each block),
  • C(v+1,k+1,t)C(v,k,t) (adding a new element to every block),
  • C(v+1,k,t)C(v,k,t)+C(v,k1,t1) (adding a new element to only the blocks of the second covering).

Other methods

  • Cyclic coverings: If the target size of a covering is v, one can choose a single k-subset and take all v1 cyclic shifts. For some parameters, this produces optimal coverings.[1]
  • Hill-climbing and simulated annealing: Starting from a random collection of blocks and iteratively improving by replacing weak blocks. Nurmela and Östergård used simulated annealing to find many good coverings.[10]
  • Replication: Replacing each element by m copies yields C(mv,mk,t)C(v,k,t).[1]

Steiner systems

A Steiner system S(t,k,v) is a covering design in which every t-element subset is contained in exactly one block (equivalently, a covering design of density 1). When a Steiner system exists, it is simultaneously an optimal covering and an optimal packing, and C(v,k,t)=L(v,k,t). Steiner systems exist only for certain parameter sets; necessary divisibility conditions must be satisfied, and existence is not guaranteed even when these hold. Projective and affine geometries over finite fields provide infinite families of Steiner systems with t=2.[1]

Turán numbers

The Turán number T(n,,r) is the minimum number of r-element subsets of an n-set such that every -element subset contains at least one of the chosen r-subsets. By passing to complementary subsets, one obtains the identity[1]

C(v,k,t)=T(v,vt,vk).

Although covering numbers and Turán numbers are thus equivalent, they have historically been studied in different parameter ranges: covering problems typically have v large relative to k and t, while Turán problems typically have n large relative to and r (corresponding to k and t close to v).[1]

Turán (1941) determined T(n,,2) exactly for all n and , which implies that C(v,v2,t)=L(v,v2,t) for all v and t.[11][12]

Tables of covering numbers

Extensive tables of upper bounds on C(v,k,t) have been compiled using combinations of the methods described above. Gordon, Kuperberg, and Patashnik (1995) tabulated bounds for v32, k16, and t8, covering 1631 nontrivial parameter sets, of which approximately 93% were obtained from the constructions in their paper.[1] An online, regularly updated repository of the best known covering designs is maintained by Gordon.[13]

For t=2, most known covering numbers are optimal, with the ratio between the best known upper and lower bounds being at most approximately 1.12. This ratio increases with t: up to about 1.89 for t=4, 2.98 for t=6, and 3.72 for t=8.[1]

Small covering numbers C(v,k,2)
vk 3 4 5 6 7 8
3 1
4 3 1
5 4 3 1
6 6 3 3 1
7 7 5 3 3 1
8 11 6 4 3 3 1
9 12 8 5 3 3 3
10 17 9 6 4 3 3
11 19 11 7 6 4 3

The covering numbers C(v,3,2) for v = 3, 4, 5...:

1, 3, 4, 6, 7, 11, 12, 17, 19, 24, 26, 33, 35, 43, 46, 54, 57, 67... OEISA011975

The covering numbers C(v,4,2) for v = 4, 5, 6...:

1, 3, 3, 5, 6, 8, 9, 11, 12, 13, 18, 19, 20, 26, 27, 31, 35, 37, 39... OEISA011976

Similar sequences for other parameters exist, but unlike the previous two, these have a limited number of known entries:

See also

References

  1. 1.00 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 Gordon, Daniel M.; Kuperberg, Greg; Patashnik, Oren (1995). "New constructions for covering designs". Journal of Combinatorial Designs 3 (4): 269–284. doi:10.1002/jcd.3180030404. https://www.dmgordon.org/papers/cover.pdf. 
  2. Mills, W. H.; Mullin, R. C. (1992). "Coverings and packings". in Dinitz, Jeffrey H.; Stinson, Douglas R.. Contemporary Design Theory: A Collection of Surveys. Wiley. pp. 371–399. ISBN 0-471-53141-3. 
  3. Schönheim, J. (1964). "On coverings". Pacific Journal of Mathematics 14 (4): 1405–1411. doi:10.2140/pjm.1964.14.1405. 
  4. de Caen, D. (1983). "Extension of a theorem of Moon and Moser on complete subgraphs". Ars Combinatoria 16: 5–10. https://books.google.com/books/about/Extension_of_a_Theorem_of_Moon_and_Moser.html?id=HiUmNAEACAAJ. 
  5. Rödl, Vojtěch (1985). "On a packing and covering problem". European Journal of Combinatorics 5 (1): 69–78. doi:10.1016/S0195-6698(85)80023-8. 
  6. Erdős, Paul; Spencer, Joel (1974). Probabilistic Methods in Combinatorics. Academic Press. pp. 74–75. doi:10.1002/net.3230070309. https://real-eod.mtak.hu/19636/1/505_642.pdf. 
  7. Conway, John H.; Sloane, N. J. A. (1986). "Lexicographic codes: Error-correcting codes from game theory". IEEE Transactions on Information Theory 32 (3): 337–348. doi:10.1109/TIT.1986.1057187. http://neilsloane.com/doc/Me122.pdf. 
  8. Ray-Chaudhuri, D. K. (1968). "Combinatorial information retrieval systems for files". SIAM Journal on Applied Mathematics 16 (5): 973–992. doi:10.1137/0116079. 
  9. Abraham, C. T.; Ghosh, S. P.; Ray-Chaudhuri, D. K. (1968). "File organization schemes based on finite geometries". Information and Control 12 (2): 143–163. doi:10.1016/S0019-9958(68)90251-9. 
  10. Nurmela, Kari J.; Östergård, Patric R. J. (1993). "Upper bounds for covering designs by simulated annealing". Congressus Numerantium 96: 93–111. https://users.aalto.fi/~pat/patric_pub.html. 
  11. Turán, Paul (1941). "Eine Extremalaufgabe aus der Graphentheorie" (in hu). Mat. Fiz. Lapok 48: 436–452. 
  12. de Caen, D. (1994). "The current status of Turán's problem on hypergraphs". in Frankl, P.; Füredi, Z.; Katona, G. et al.. Extremal Problems for Finite Sets. Budapest: János Bolyai Mathematical Society. pp. 187–197. ISBN 9638022817. 
  13. 13.0 13.1 Gordon, Daniel M.. "La Jolla Covering Repository". https://www.dmgordon.org/cover/.