Natarajan dimension
In the theory of Probably Approximately Correct Machine Learning, the Natarajan dimension characterizes the complexity of learning a set of functions, generalizing from the Vapnik-Chervonenkis dimension for boolean functions to multi-class functions. Originally introduced as the Generalized Dimension by Natarajan,[1] it was subsequently renamed the Natarajan Dimension by Haussler and Long.[2]
Definition
Let [math]\displaystyle{ H }[/math] be a set of functions from a set [math]\displaystyle{ X }[/math] to a set [math]\displaystyle{ Y }[/math]. [math]\displaystyle{ H }[/math] shatters a set [math]\displaystyle{ C \subset X }[/math] if there exist two functions [math]\displaystyle{ f_0, f_1 \in H }[/math] such that
- For every [math]\displaystyle{ x \in C, f_0(x) \neq f_1(x) }[/math].
- For every [math]\displaystyle{ X\subset C }[/math], there exists a function [math]\displaystyle{ h \in H }[/math] such that
for all [math]\displaystyle{ x \in B, h(x) = f_0(x) }[/math] and for all [math]\displaystyle{ x \in C - B, h(x) = f_1(x) }[/math].
The Natarajan dimension of H is the maximal cardinality of a set shattered by [math]\displaystyle{ H }[/math].
It is easy to see that if [math]\displaystyle{ |Y| = 2 }[/math], the Natarajan dimension collapses to the Vapnik Chervonenkis dimension.
Shalev-Shwartz and Ben-David [3] present comprehensive material on multi-class learning and the Natarajan dimension, including uniform convergence and learnability.
References
- ↑ Natarajan, Balas Kausik (1989). "On Learning sets and functions". Machine Learning 4: 67–97. doi:10.1007/BF00114804.
- ↑ Haussler, David; Long, Philip (1995). "A Generalization of Sauer's Lemma". Journal of Combinatorial Theory 71: 219–240.
- ↑ Shalev-Shwartz, Shai; Ben-David, Shai (2013). Understanding machine learning. From theory to algorithms. Cambridge University Press.
Original source: https://en.wikipedia.org/wiki/Natarajan dimension.
Read more |