Witness set

From HandWiki
Short description: Computational learning concept

In combinatorics and computational learning theory, a witness set is a set of elements that distinguishes a given Boolean function from a given class of other Boolean functions. Let C be a concept class over a domain X (that is, a family of Boolean functions over X) and c be a concept in X (a single Boolean function). A subset S of X is a witness set for c in X if S distinguishes c from all the other functions in C, in the sense that no other function in C has the same values on S.[1]

For a concept class with |C| concepts, there exists a concept that has a witness of size at most log2|C|; this bound is tight when C consists of all Boolean functions over X.[1] By a result of (Bondy 1972) there exists a single witness set of size at most |C|1 that is valid for all concepts in C; this bound is tight when C consists of the indicator functions of the empty set and some singleton sets.[1][2] One way to construct this set is to interpret the concepts as bitstrings, and the domain elements as positions in these bitstrings. Then the set of positions at which a trie of the bitstrings branches forms the desired witness set. This construction is central to the operation of the fusion tree data structure.[3]

The minimum size of a witness set for c is called the witness size or specification number and is denoted by wC(c). The value max{wC(c):cC} is called the teaching dimension of C. It represents the number of examples of a concept that need to be presented by a teacher to a learner, in the worst case, to enable the learner to determine which concept is being presented.[4]

Witness sets have also been called teaching sets, keys, specifying sets, or discriminants.[5] The "witness set" terminology is from (Kushilevitz Linial),[5][6] who trace the concept of witness sets to work by (Cover 1965).[6][7]

References

  1. 1.0 1.1 1.2 Jukna, Stasys (2011), "Chapter 11: Witness sets and isolation", Extremal Combinatorics, Texts in Theoretical Computer Science. An EATCS Series, Springer, pp. 155–163, doi:10.1007/978-3-642-17364-6, ISBN 978-3-642-17363-9 
  2. "Induced subsets", Journal of Combinatorial Theory, Series B 12 (2): 201–202, 1972, doi:10.1016/0095-8956(72)90025-1 
  3. "Surpassing the information-theoretic bound with fusion trees", Journal of Computer and System Sciences 47 (3): 424–436, 1993, doi:10.1016/0022-0000(93)90040-4 
  4. "On the complexity of teaching", Journal of Computer and System Sciences 50 (1): 20–31, 1995, doi:10.1006/jcss.1995.1003 
  5. 5.0 5.1 Balbach, Frank J. (2008), "Measuring teachability using variants of the teaching dimension", Theoretical Computer Science 397 (1–3): 94–113, doi:10.1016/j.tcs.2008.02.025, https://core.ac.uk/download/pdf/82471792.pdf 
  6. 6.0 6.1 Kushilevitz, Eyal (1996), "Witness sets for families of binary vectors", Journal of Combinatorial Theory, Series A 73 (2): 376–380, doi:10.1016/S0097-3165(96)80015-X, https://www.cs.huji.ac.il/~nati/PAPERS/klrs.pdf 
  7. "Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition", IEEE Transactions on Electronic Computers EC-14 (3): 326–334, June 1965, doi:10.1109/pgec.1965.264137