Hyperplane separation theorem

Hyperplane separation theorem
	Illustration of the hyperplane separation theorem.
Type	Theorem
Field	Convex geometry; Topological vector spaces; Collision detection
Conjectured by	Hermann Minkowski
Open problem	No
Generalizations	Hahn–Banach separation theorem

Short description: On the existence of hyperplanes separating disjoint convex sets

In geometry, the hyperplane separation theorem is a theorem about disjoint convex sets in n-dimensional Euclidean space. There are several closely related versions. In one version, if both sets are closed and at least one of them is compact, then there is a hyperplane between them, and even two parallel hyperplanes between them separated by a gap. In another version, if both disjoint convex sets are open, then there is a hyperplane between them, but not necessarily any gap. An axis that is orthogonal to a separating hyperplane is a separating axis, because the orthogonal projections of the convex bodies onto the axis are disjoint.

The hyperplane separation theorem is due to Hermann Minkowski. The Hahn–Banach separation theorem generalizes the result to topological vector spaces.

A related result is the supporting hyperplane theorem.

In the context of support-vector machines, the optimally separating hyperplane or maximum-margin hyperplane is a hyperplane which separates two convex hulls of points and is equidistant from the two.^[1]^[2]^[3]

Statements and proof

Hyperplane separation theorem^[4] — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be two disjoint nonempty convex subsets of [math]\displaystyle{ \R^n }[/math]. Then there exist a nonzero vector [math]\displaystyle{ v }[/math] and a real number [math]\displaystyle{ c }[/math] such that

[math]\displaystyle{ \langle x, v \rangle \ge c \, \text{ and } \langle y, v \rangle \le c }[/math]

for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ B }[/math]; i.e., the hyperplane [math]\displaystyle{ \langle \cdot, v \rangle = c }[/math], [math]\displaystyle{ v }[/math] the normal vector, separates [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math].

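If both sets are closed, and at least one of them is compact, then the separation can be strict, that is, [math]\displaystyle{ \langle x, v \rangle \gt c_1 \, \text{ and } \langle y, v \rangle \lt c_2 }[/math] for some [math]\displaystyle{ c_1 \gt c_2 }[/math] and all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ B }[/math].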

In all cases, assume [math]\displaystyle{ A, B }[/math] to be disjoint, nonempty, and convex subsets of [math]\displaystyle{ \R^n }[/math]. The results are summarized as follows:

summary table
[math]\displaystyle{ A }[/math]	[math]\displaystyle{ B }[/math]	[math]\displaystyle{ \langle x, v\rangle }[/math]	[math]\displaystyle{ \langle y, v\rangle }[/math]
		[math]\displaystyle{ \geq c }[/math]	[math]\displaystyle{ \leq c }[/math]
closed compact	closed	[math]\displaystyle{ \gt c_1 }[/math]	[math]\displaystyle{ \lt c_2 }[/math] with [math]\displaystyle{ c_2 \lt c_1 }[/math]
closed	closed compact	[math]\displaystyle{ \gt c_1 }[/math]	[math]\displaystyle{ \lt c_2 }[/math] with [math]\displaystyle{ c_2 \lt c_1 }[/math]
open		[math]\displaystyle{ \gt c }[/math]	[math]\displaystyle{ \leq c }[/math]
open	open	[math]\displaystyle{ \gt c }[/math]	[math]\displaystyle{ \lt c }[/math]

The number of dimensions must be finite. In infinite-dimensional spaces there are examples of two closed, convex, disjoint sets which cannot be separated by a closed hyperplane (a hyperplane where a continuous linear functional equals some constant) even in the weak sense where the inequalities are not strict.^[5]

For strict separation, the compactness hypothesis cannot be relaxed; see the example in the section Counterexamples and uniqueness. The strict version of the separation theorem does generalize to infinite dimensions; the generalization is more commonly known as the Hahn–Banach separation theorem.

The proof is based on the following lemma:

Lemma — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be two disjoint closed subsets of [math]\displaystyle{ \R^n }[/math], and assume [math]\displaystyle{ A }[/math] is compact. Then there exist points [math]\displaystyle{ a_0 \in A }[/math] and [math]\displaystyle{ b_0 \in B }[/math] minimizing the distance [math]\displaystyle{ \|a - b\| }[/math] over [math]\displaystyle{ a \in A }[/math] and [math]\displaystyle{ b \in B }[/math].

Proof of lemma

Let [math]\displaystyle{ a\in A }[/math] and [math]\displaystyle{ b \in B }[/math] be any pair of points, and let [math]\displaystyle{ r_1 = \|b - a\| }[/math]. Since [math]\displaystyle{ A }[/math] is compact, it is contained in some ball centered on [math]\displaystyle{ a }[/math]; let the radius of this ball be [math]\displaystyle{ r_2 }[/math]. Let [math]\displaystyle{ S = B \cap \overline{B_{r_1 + r_2}(a)} }[/math] be the intersection of [math]\displaystyle{ B }[/math] with a closed ball of radius [math]\displaystyle{ r_1 + r_2 }[/math] around [math]\displaystyle{ a }[/math]. Then [math]\displaystyle{ S }[/math] is compact and nonempty because it contains [math]\displaystyle{ b }[/math]. Since the distance function is continuous, there exist points [math]\displaystyle{ a_0 }[/math] and [math]\displaystyle{ b_0 }[/math] whose distance [math]\displaystyle{ \|a_0 - b_0\| }[/math] is the minimum over all pairs of points in [math]\displaystyle{ A \times S }[/math]. It remains to show that [math]\displaystyle{ a_0 }[/math] and [math]\displaystyle{ b_0 }[/math] in fact have the minimum distance over all pairs of points in [math]\displaystyle{ A \times B }[/math]. Suppose for contradiction that there exist points [math]\displaystyle{ a' }[/math] and [math]\displaystyle{ b' }[/math] such that [math]\displaystyle{ \|a' - b'\| \lt \|a_0 - b_0\| }[/math]. Then in particular, [math]\displaystyle{ \|a' - b'\| \lt r_1 }[/math], and by the triangle inequality, [math]\displaystyle{ \|a - b'\| \le \|a' - b'\| + \|a - a'\| \lt r_1 + r_2 }[/math]. Therefore [math]\displaystyle{ b' }[/math] is contained in [math]\displaystyle{ S }[/math], which contradicts the fact that [math]\displaystyle{ a_0 }[/math] and [math]\displaystyle{ b_0 }[/math] had minimum distance over [math]\displaystyle{ A \times S }[/math]. [math]\displaystyle{ \square }[/math]

Proof illustration.

Proof of theorem

We first prove the second case. (See the diagram.)

Without loss of generality, [math]\displaystyle{ A }[/math] is compact. By the lemma, there exist points [math]\displaystyle{ a_0 \in A }[/math] and [math]\displaystyle{ b_0 \in B }[/math] of minimum distance to each other. Since [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are disjoint, we have [math]\displaystyle{ a_0 \neq b_0 }[/math]. Now construct two hyperplanes [math]\displaystyle{ L_A, L_B }[/math] perpendicular to the line segment [math]\displaystyle{ [a_0, b_0] }[/math], with [math]\displaystyle{ L_A }[/math] passing through [math]\displaystyle{ a_0 }[/math] and [math]\displaystyle{ L_B }[/math] passing through [math]\displaystyle{ b_0 }[/math]. We claim that neither [math]\displaystyle{ A }[/math] nor [math]\displaystyle{ B }[/math] enters the space between [math]\displaystyle{ L_A, L_B }[/math], and thus the hyperplanes perpendicular to the segment that pass through a point of the open segment [math]\displaystyle{ (a_0, b_0) }[/math] satisfy the requirement of the theorem.

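Algebraically, the hyperplanes [math]\displaystyle{ L_A, L_B }[/math] are defined by the vector [math]\displaystyle{ v:= b_0 - a_0 }[/math], and two constants [math]\displaystyle{ c_A := \langle v, a_0\rangle \lt c_B := \langle v, b_0\rangle }[/math], such that [math]\displaystyle{ L_A = \{x: \langle v, x\rangle = c_A\}, L_B = \{x: \langle v, x\rangle = c_B\} }[/math]. Our claim is that [math]\displaystyle{ \forall a\in A, \langle v, a\rangle \leq c_A }[/math] and [math]\displaystyle{ \forall b\in B, \langle v, b\rangle \geq c_B }[/math]. (With this choice of [math]\displaystyle{ v }[/math], the set [math]\displaystyle{ A }[/math] lies on the lower side; replacing [math]\displaystyle{ v }[/math] by [math]\displaystyle{ -v }[/math] and negating the constants gives the orientation used in the statement of the theorem.)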

Suppose there is some [math]\displaystyle{ a\in A }[/math] such that [math]\displaystyle{ \langle v, a\rangle \gt c_A }[/math]. Let [math]\displaystyle{ a' }[/math] be the point of the line segment [math]\displaystyle{ [a_0, a] }[/math] closest to [math]\displaystyle{ b_0 }[/math] (the foot of the perpendicular from [math]\displaystyle{ b_0 }[/math], or [math]\displaystyle{ a }[/math] itself if the foot falls beyond [math]\displaystyle{ a }[/math]). Since [math]\displaystyle{ A }[/math] is convex, [math]\displaystyle{ a' }[/math] lies in [math]\displaystyle{ A }[/math], and by planar geometry, [math]\displaystyle{ a' }[/math] is closer to [math]\displaystyle{ b_0 }[/math] than [math]\displaystyle{ a_0 }[/math] is, contradicting the minimality of [math]\displaystyle{ \|a_0 - b_0\| }[/math]. A similar argument applies to [math]\displaystyle{ B }[/math].
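The construction in this case can be tried on a small example. The following is a minimal sketch, assuming a hypothetical setup that is not from the source: [math]\displaystyle{ A }[/math] is a single point (compact) and [math]\displaystyle{ B }[/math] is an axis-aligned box (closed and convex), so the closest point [math]\displaystyle{ b_0 }[/math] is obtained by clamping coordinates. The function names are illustrative only.

```python
# Sketch: A = {p} is a single point, B is an axis-aligned box [lo, hi].
# Both choices are assumptions made for illustration.

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

def closest_point_in_box(p, lo, hi):
    # Clamping each coordinate gives the Euclidean projection onto a box.
    return [min(max(pi, l), h) for pi, l, h in zip(p, lo, hi)]

def separating_slab(p, lo, hi):
    """Return (v, c_A, c_B) with v = b0 - a0, as in the proof."""
    a0 = list(p)
    b0 = closest_point_in_box(p, lo, hi)
    v = [bi - ai for ai, bi in zip(a0, b0)]
    if all(vi == 0 for vi in v):
        raise ValueError("the point lies in the box; the sets are not disjoint")
    return v, dot(v, a0), dot(v, b0)

p = (3.0, 4.0)
lo, hi = (0.0, 0.0), (1.0, 1.0)
v, c_A, c_B = separating_slab(p, lo, hi)

# A linear functional on a box attains its minimum at a corner,
# so checking the corners is enough to confirm <v, b> >= c_B on B.
corners = [(x, y) for x in (lo[0], hi[0]) for y in (lo[1], hi[1])]
box_on_upper_side = all(dot(v, c) >= c_B for c in corners)
```

Here any constant strictly between `c_A` and `c_B` defines a hyperplane with normal `v` that strictly separates the point from the box.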

Now for the first case.

Approximate both [math]\displaystyle{ A, B }[/math] from the inside by [math]\displaystyle{ A_1 \subseteq A_2 \subseteq \cdots \subseteq A }[/math] and [math]\displaystyle{ B_1 \subseteq B_2 \subseteq \cdots \subseteq B }[/math], such that each [math]\displaystyle{ A_k, B_k }[/math] is compact and convex, and the unions are the relative interiors [math]\displaystyle{ \mathrm{relint}(A), \mathrm{relint}(B) }[/math]. (See relative interior page for details.)

Now by the second case, for each pair [math]\displaystyle{ A_k, B_k }[/math] there exists some unit vector [math]\displaystyle{ v_k }[/math] and real number [math]\displaystyle{ c_k }[/math], such that [math]\displaystyle{ \langle v_k, A_k\rangle \lt c_k \lt \langle v_k, B_k\rangle }[/math].

Since the unit sphere is compact, we can take a convergent subsequence, so that [math]\displaystyle{ v_k \to v }[/math]. Let [math]\displaystyle{ c_A := \sup_{a\in A} \langle v, a\rangle, c_B := \inf_{b\in B} \langle v, b\rangle }[/math]. We claim that [math]\displaystyle{ c_A \leq c_B }[/math], thus separating [math]\displaystyle{ A, B }[/math].

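Assume not. Then there exist some [math]\displaystyle{ a\in A, b\in B }[/math] such that [math]\displaystyle{ \langle v, a\rangle \gt \langle v, b\rangle }[/math]. Since the relative interior of a convex set is dense in the set, we may take [math]\displaystyle{ a \in \mathrm{relint}(A) }[/math] and [math]\displaystyle{ b \in \mathrm{relint}(B) }[/math], so that [math]\displaystyle{ a \in A_k }[/math] and [math]\displaystyle{ b \in B_k }[/math] for all large enough [math]\displaystyle{ k }[/math]. Since [math]\displaystyle{ v_k \to v }[/math], for large enough [math]\displaystyle{ k }[/math] we have [math]\displaystyle{ \langle v_k, a\rangle \gt \langle v_k, b\rangle }[/math], which contradicts [math]\displaystyle{ \langle v_k, a\rangle \lt c_k \lt \langle v_k, b\rangle }[/math].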

Since a hyperplane that separates two sets cannot meet the interior of either of them, we have a corollary:

Separation theorem I — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be two disjoint nonempty convex sets. If [math]\displaystyle{ A }[/math] is open, then there exist a nonzero vector [math]\displaystyle{ v }[/math] and real number [math]\displaystyle{ c }[/math] such that

[math]\displaystyle{ \langle x, v \rangle \gt c \, \text{ and } \langle y, v \rangle \le c }[/math]

for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ B }[/math]. If both sets are open, then there exist a nonzero vector [math]\displaystyle{ v }[/math] and real number [math]\displaystyle{ c }[/math] such that

[math]\displaystyle{ \langle x, v \rangle\gt c \, \text{ and } \langle y, v \rangle\lt c }[/math]

for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ B }[/math].

Case with possible intersections

If the sets [math]\displaystyle{ A, B }[/math] may intersect, but their relative interiors are disjoint, then the proof of the first case still applies without change, yielding:

Separation theorem II — Let [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] be two nonempty convex subsets of [math]\displaystyle{ \R^n }[/math] with disjoint relative interiors. Then there exist a nonzero vector [math]\displaystyle{ v }[/math] and a real number [math]\displaystyle{ c }[/math] such that

[math]\displaystyle{ \langle x, v \rangle \ge c \, \text{ and } \langle y, v \rangle \le c }[/math]

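for all [math]\displaystyle{ x }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ y }[/math] in [math]\displaystyle{ B }[/math]. In particular, we have the supporting hyperplane theorem.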

Supporting hyperplane theorem — if [math]\displaystyle{ A }[/math] is a convex set in [math]\displaystyle{ \mathbb{R}^n, }[/math] and [math]\displaystyle{ a_0 }[/math] is a point on the boundary of [math]\displaystyle{ A }[/math], then there exists a supporting hyperplane of [math]\displaystyle{ A }[/math] containing [math]\displaystyle{ a_0 }[/math].

Proof

If the affine span of [math]\displaystyle{ A }[/math] is not all of [math]\displaystyle{ \mathbb{R}^n }[/math], then extend the affine span to a supporting hyperplane. Otherwise, [math]\displaystyle{ \mathrm{relint}(A) = \mathrm{int}(A) }[/math] is disjoint from [math]\displaystyle{ \mathrm{relint}(\{a_0\}) = \{a_0\} }[/math], so apply the above theorem.
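As an illustration, the closed unit disk in [math]\displaystyle{ \R^2 }[/math] has a supporting hyperplane at each boundary point [math]\displaystyle{ a_0 }[/math], with normal vector [math]\displaystyle{ a_0 }[/math] itself. The following minimal sketch checks this numerically on random sample points; the choice of set, the function name, and the sampling scheme are assumptions made for the example.

```python
import math
import random

def dot(u, w):
    return u[0] * w[0] + u[1] * w[1]

def supporting_hyperplane_of_unit_disk(theta):
    """Supporting line of the unit disk at a0 = (cos theta, sin theta).

    Returns (v, c) describing the line {x : <v, x> = c}; the disk
    lies in the half-plane <v, x> <= c.
    """
    a0 = (math.cos(theta), math.sin(theta))
    v = a0              # the outward normal at a0 is a0 itself
    c = dot(v, a0)      # equals 1 up to rounding
    return v, c

random.seed(0)
v, c = supporting_hyperplane_of_unit_disk(math.pi / 3)

# Sample points of the disk and record the largest value of <v, x>.
samples = []
for _ in range(1000):
    r = math.sqrt(random.random())
    phi = random.uniform(0.0, 2.0 * math.pi)
    samples.append((r * math.cos(phi), r * math.sin(phi)))
worst = max(dot(v, x) for x in samples)
```

By the Cauchy–Schwarz inequality, `worst` cannot exceed `c` beyond rounding error, which is what the supporting hyperplane property requires.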

Converse of theorem

Note that the existence of a hyperplane that only "separates" two convex sets in the weak sense of both inequalities being non-strict does not imply that the two sets are disjoint: both sets could have points located on the hyperplane.

Counterexamples and uniqueness

The theorem does not apply if one of the bodies is not convex, and in that case there are many possible counterexamples. For example, A and B could be concentric circles. A more subtle counterexample is one in which A and B are both closed but neither one is compact. For example, if A is a closed half plane and B is bounded by one arm of a hyperbola, then there is no strictly separating hyperplane:

[math]\displaystyle{ A = \{(x,y) : x \le 0\} }[/math]

[math]\displaystyle{ B = \{(x,y) : x \gt 0, y \geq 1/x \}.\ }[/math]

(Although, by an instance of the second theorem, there is a hyperplane that separates their interiors.) Another type of counterexample has A compact and B open. For example, A can be a closed square and B can be an open square that touches A.
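
For the half-plane and hyperbola example above, the failure of strict separation can be checked directly. This is a short worked argument added for illustration, with [math]\displaystyle{ b_t }[/math] a name introduced here. For every [math]\displaystyle{ t \gt 0 }[/math] the point [math]\displaystyle{ b_t = (1/t, t) }[/math] lies in [math]\displaystyle{ B }[/math], and its distance to [math]\displaystyle{ A }[/math] is its first coordinate [math]\displaystyle{ 1/t }[/math], so

[math]\displaystyle{ \inf_{a \in A,\, b \in B} \|a - b\| \le \inf_{t \gt 0} \frac{1}{t} = 0. }[/math]

On the other hand, if [math]\displaystyle{ \langle x, v \rangle \gt c_1 }[/math] on [math]\displaystyle{ A }[/math] and [math]\displaystyle{ \langle y, v \rangle \lt c_2 }[/math] on [math]\displaystyle{ B }[/math] with [math]\displaystyle{ c_1 \gt c_2 }[/math], then by the Cauchy–Schwarz inequality

[math]\displaystyle{ \|x - y\| \, \|v\| \ge \langle x - y, v \rangle \gt c_1 - c_2 \gt 0 }[/math]

for all [math]\displaystyle{ x \in A }[/math] and [math]\displaystyle{ y \in B }[/math], which would force the distance between the two sets to be at least [math]\displaystyle{ (c_1 - c_2)/\|v\| \gt 0 }[/math]. The two statements are incompatible, so no separation with a gap exists.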

In the first version of the theorem (closed sets, at least one compact, separated by a gap), the separating hyperplane is never unique. In the second version (open sets), it may or may not be unique. Technically a separating axis is never unique because it can be translated; in the second version of the theorem, a separating axis can be unique up to translation.

The horn angle provides a good counterexample to many hyperplane separations. For example, in [math]\displaystyle{ \R^2 }[/math], the unit disk is disjoint from the open interval [math]\displaystyle{ ((1, 0), (1,1)) }[/math], but the only line separating them contains the entirety of [math]\displaystyle{ ((1, 0), (1,1)) }[/math]. This shows that if [math]\displaystyle{ A }[/math] is closed and [math]\displaystyle{ B }[/math] is relatively open, then there does not necessarily exist a separation that is strict for [math]\displaystyle{ B }[/math]. However, if [math]\displaystyle{ A }[/math] is a closed polytope then such a separation exists.^[6]

More variants

Farkas' lemma and related results can be understood as hyperplane separation theorems when the convex bodies are defined by finitely many linear inequalities.
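
To make the connection concrete, one common form of Farkas' lemma (stated here as background, not taken from the text above) says that for a matrix [math]\displaystyle{ M }[/math] and a vector [math]\displaystyle{ b }[/math], exactly one of the following holds: either [math]\displaystyle{ Mx = b }[/math] has a solution with [math]\displaystyle{ x \ge 0 }[/math], or there is a vector [math]\displaystyle{ y }[/math] with [math]\displaystyle{ M^{\mathsf T} y \ge 0 }[/math] and [math]\displaystyle{ \langle b, y \rangle \lt 0 }[/math]. In the second case the hyperplane [math]\displaystyle{ \langle \cdot, y \rangle = 0 }[/math] separates [math]\displaystyle{ b }[/math] from the convex cone generated by the columns of [math]\displaystyle{ M }[/math]. The sketch below only verifies a given certificate [math]\displaystyle{ y }[/math] on a hand-picked example; the function names and data are hypothetical.

```python
def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

def transpose_times(M, y):
    # Computes M^T y for a matrix M given as a list of rows.
    rows, cols = len(M), len(M[0])
    return [sum(M[i][j] * y[i] for i in range(rows)) for j in range(cols)]

def is_farkas_certificate(M, b, y):
    """True if y satisfies M^T y >= 0 and <b, y> < 0."""
    return all(t >= 0 for t in transpose_times(M, y)) and dot(b, y) < 0

# The columns of the identity generate the nonnegative quadrant.
M = [[1.0, 0.0],
     [0.0, 1.0]]
b_outside = [1.0, -1.0]   # not in the quadrant
b_inside = [1.0, 1.0]     # in the quadrant
y = [0.0, 1.0]            # candidate normal of a separating hyperplane

separates_outside = is_farkas_certificate(M, b_outside, y)
separates_inside = is_farkas_certificate(M, b_inside, y)
```

In this example `y` certifies that `b_outside` is not a nonnegative combination of the columns of `M`, while the same `y` is not a certificate for `b_inside`.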

Further results can be found in Stoer and Witzgall.^[6]

Use in collision detection

In collision detection, the hyperplane separation theorem is usually used in the following form:

Separating axis theorem — Two closed convex objects are disjoint if there exists a line ("separating axis") onto which the two objects' projections are disjoint.

Regardless of dimensionality, the separating axis is always a line. For example, in 3D, the space is separated by planes, but the separating axis is perpendicular to the separating plane.

The separating axis theorem can be applied for fast collision detection between polygon meshes. Each face's normal or other feature direction is used as a separating axis. Note that this yields possible separating axes, not separating lines/planes.

In 3D, using face normals alone will fail to separate some edge-on-edge non-colliding cases. Additional axes, consisting of the cross-products of pairs of edges, one taken from each object, are required.^[7]

For increased efficiency, parallel axes may be calculated as a single axis.
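
A minimal sketch of the test in two dimensions is given below, assuming both objects are convex polygons given as lists of vertices in order. In 2D the edge normals of the two polygons are enough as candidate axes. The function names are illustrative and not taken from any particular engine.

```python
def edge_normals(poly):
    """Candidate axes: one vector perpendicular to each edge."""
    normals = []
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        normals.append((y1 - y2, x2 - x1))
    return normals

def project(poly, axis):
    """Interval covered by the polygon's projection onto the axis."""
    dots = [x * axis[0] + y * axis[1] for x, y in poly]
    return min(dots), max(dots)

def find_separating_axis(p, q):
    """Return an axis on which the projections are disjoint, or None."""
    for axis in edge_normals(p) + edge_normals(q):
        pmin, pmax = project(p, axis)
        qmin, qmax = project(q, axis)
        if pmax < qmin or qmax < pmin:
            return axis
    return None

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
far_square = [(2, 0), (3, 0), (3, 1), (2, 1)]
overlapping_square = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]

axis_far = find_separating_axis(square, far_square)
axis_overlap = find_separating_axis(square, overlapping_square)
```

The function stops at the first axis with disjoint projections, which is why the test tends to be fast for objects that are far apart. The axes are left unnormalized because only the ordering of the projected intervals matters. A 3D version would also need the edge cross-product axes mentioned above.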

Notes

[1] Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second ed.). New York: Springer. pp. 129–135. https://web.stanford.edu/~hastie/Papers/ESLII.pdf#page=148.
[2] Witten, Ian H.; Frank, Eibe; Hall, Mark A.; Pal, Christopher J. (2016). Data Mining: Practical Machine Learning Tools and Techniques (Fourth ed.). Morgan Kaufmann. pp. 253–254. ISBN 9780128043578. https://books.google.com/books?id=1SylCgAAQBAJ&pg=PA253.
[3] Deisenroth, Marc Peter; Faisal, A. Aldo; Ong, Cheng Soon (2020). Mathematics for Machine Learning. Cambridge University Press. pp. 337–338. ISBN 978-1-108-45514-5.
[4] Boyd & Vandenberghe 2004, Exercise 2.22.
[5] Haïm Brezis, Analyse fonctionnelle : théorie et applications, 1983, remarque 4, p. 7.
[6] Stoer, Josef; Witzgall, Christoph (1970). Convexity and Optimization in Finite Dimensions I. Springer Berlin, Heidelberg. (2.12.9). doi:10.1007/978-3-642-46216-0. ISBN 978-3-642-46216-0. https://link.springer.com/book/10.1007/978-3-642-46216-0.
[7] "Advanced vector math". https://docs.godotengine.org/en/stable/tutorials/math/vectors_advanced.html#collision-detection-in-3d.

References

Boyd, Stephen P.; Vandenberghe, Lieven (2004). Convex Optimization. Cambridge University Press. ISBN 978-0-521-83378-3. https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf.
Golshtein, E. G.; Tretyakov, N.V. (1996). Modified Lagrangians and monotone maps in optimization. New York: Wiley. p. 6. ISBN 0-471-54821-9.
Shimizu, Kiyotaka; Ishizuka, Yo; Bard, Jonathan F. (1997). Nondifferentiable and two-level mathematical programming. Boston: Kluwer Academic Publishers. p. 19. ISBN 0-7923-9821-1.

Soltan, V. (2021). Support and separation properties of convex sets in finite dimension. Extracta Math. Vol. 36, no. 2, 241-278.

External links

Collision detection and response


Original source: https://en.wikipedia.org/wiki/Hyperplane separation theorem.

