Well-separated pair decomposition


In computational geometry, a well-separated pair decomposition (WSPD) of a set of points [math]\displaystyle{ S \subset \mathbb{R}^d }[/math] is a sequence of pairs of sets [math]\displaystyle{ (A_i, B_i) }[/math] such that each pair is well-separated and, for any two distinct points [math]\displaystyle{ p, q \in S }[/math], there exists precisely one pair that separates the two. The graph induced by a well-separated pair decomposition can serve as a k-spanner of the complete Euclidean graph, and is useful in approximating solutions to several problems defined on that graph.[1]

Definition

Visual representation of well-separated pair

Let [math]\displaystyle{ A, B }[/math] be two disjoint sets of points in [math]\displaystyle{ \mathbb{R}^d }[/math], [math]\displaystyle{ R(X) }[/math] denote the axis-aligned minimum bounding box for the points in [math]\displaystyle{ X }[/math], and [math]\displaystyle{ s \gt 0 }[/math] denote the separation factor.

We consider [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] to be well-separated if, for each of [math]\displaystyle{ R(A) }[/math] and [math]\displaystyle{ R(B) }[/math], there exists a d-ball of radius [math]\displaystyle{ \rho }[/math] containing it, such that the two balls have a minimum distance of at least [math]\displaystyle{ s \rho }[/math].[2]
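The definition can be turned into a simple test. The following is a minimal sketch, assuming that each ball is centred on the centre of its bounding box and that both balls use the larger of the two half-diagonals as their common radius. This choice is conservative: it never accepts a pair that is not well-separated, but it may reject a pair for which a better placement of the balls exists. The function names are illustrative, and `math.dist` requires Python 3.8 or later.

```python
import math

def bounding_box(points):
    """Axis-aligned bounding box R(X) as (lower corner, upper corner)."""
    dims = range(len(points[0]))
    lo = tuple(min(p[i] for p in points) for i in dims)
    hi = tuple(max(p[i] for p in points) for i in dims)
    return lo, hi

def well_separated(A, B, s):
    """Conservative test: balls are centred on the bounding-box centres."""
    (alo, ahi), (blo, bhi) = bounding_box(A), bounding_box(B)
    ca = tuple((l + h) / 2 for l, h in zip(alo, ahi))
    cb = tuple((l + h) / 2 for l, h in zip(blo, bhi))
    # Half the box diagonal is the radius of the smallest ball around a box;
    # both balls use the larger of the two radii.
    rho = max(math.dist(alo, ahi), math.dist(blo, bhi)) / 2
    # Distance between the two balls is centre distance minus both radii.
    gap = math.dist(ca, cb) - 2 * rho
    return gap >= s * rho
```

For example, `well_separated([(0, 0), (1, 0)], [(10, 0), (11, 0)], 2)` holds, while the same two sets are not well-separated for s = 20. Two distinct single points are well-separated for every s, because their bounding boxes have radius zero.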

We consider a sequence of well-separated pairs of subsets of [math]\displaystyle{ S }[/math], [math]\displaystyle{ (A_1, B_1), (A_2, B_2), \ldots, (A_m,B_m) }[/math] to be a well-separated pair decomposition (WSPD) of [math]\displaystyle{ S }[/math] if for any two distinct points [math]\displaystyle{ p, q \in S }[/math], there exists precisely one [math]\displaystyle{ i }[/math], [math]\displaystyle{ 1 \leq i \leq m }[/math], such that either

  • [math]\displaystyle{ p \in A_i }[/math] and [math]\displaystyle{ q \in B_i }[/math], or
  • [math]\displaystyle{ q \in A_i }[/math] and [math]\displaystyle{ p \in B_i }[/math].[1]
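The covering condition above can be checked by brute force on small inputs. The sketch below assumes that points are hashable tuples and that S contains no duplicates; it checks only that every pair of distinct points is separated exactly once, not that the pairs are well-separated. The function name is illustrative.

```python
from itertools import combinations

def covers_each_pair_once(points, pairs):
    """True if every unordered pair of distinct points is split by exactly one (A, B)."""
    count = {frozenset(pq): 0 for pq in combinations(points, 2)}
    for A, B in pairs:
        for a in A:
            for b in B:
                key = frozenset((a, b))
                if key not in count:  # a == b, or a point that is not in S
                    return False
                count[key] += 1
    return all(c == 1 for c in count.values())
```

For S = {(0,0), (1,0), (10,0)}, the two pairs ({(0,0)}, {(1,0)}) and ({(0,0), (1,0)}, {(10,0)}) satisfy the condition; dropping or repeating either pair violates it.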

Construction

Split tree

By constructing a fair split tree, it is possible to construct a WSPD of size [math]\displaystyle{ O(s^d n) }[/math] in [math]\displaystyle{ O(n \lg n) }[/math] time.[2]

The general principle of the split tree of a point set S is that each node u of the tree represents a set of points S_u, and that the bounding box R(S_u) of S_u is split along its longest side into two equal parts, which form the two children of u together with their point sets. This is done recursively until there is only one point in the set.

Let Lmax(R(X)) denote the size of the longest interval of the bounding hyperrectangle of point set X and let Li(R(X)) denote the size of the i-th dimension of the bounding hyperrectangle of point set X. We give pseudocode for the Split tree computation below.

SplitTree(S)
    Let u be the node for S
    if |S| = 1
        R(u) := R(S) // R(S) is a hyperrectangle whose sides all have length zero.
        Store in u the only point in S.
    else
        Compute R(S)
        Let the i-th dimension be the one where Lmax(R(S)) = Li(R(S))
        Split R(S) along the i-th dimension into two hyperrectangles of equal size and take the points contained in these hyperrectangles to form the two sets Sv and Sw.
        v := SplitTree(Sv)
        w := SplitTree(Sw)
        Store v and w as, respectively, the left and right children of u.
        R(u) := R(S)
    return u

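This algorithm runs in [math]\displaystyle{ O(n^2) }[/math] time in the worst case, because a split may separate only a single point from the rest while each level of the recursion still scans all the remaining points.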
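The simple recursive procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: points are distinct tuples of numbers, the class name `Node` and its fields are hypothetical, and points lying exactly on the splitting hyperplane are assigned to the left child.

```python
class Node:
    def __init__(self, points, lo, hi):
        self.points = points        # the point set S_u
        self.lo, self.hi = lo, hi   # opposite corners of R(S_u)
        self.left = self.right = None

def split_tree(points):
    """Quadratic-time split tree of a non-empty list of distinct points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[i] for p in points) for i in dims)
    hi = tuple(max(p[i] for p in points) for i in dims)
    u = Node(points, lo, hi)
    if len(points) > 1:
        # Dimension i in which the bounding box is longest.
        i = max(dims, key=lambda k: hi[k] - lo[k])
        mid = (lo[i] + hi[i]) / 2
        if mid >= hi[i]:            # guard against floating-point rounding
            mid = lo[i]
        # Both halves are non-empty because lo[i] <= mid < hi[i].
        u.left = split_tree([p for p in points if p[i] <= mid])
        u.right = split_tree([p for p in points if p[i] > mid])
    return u
```

Each call scans all of its points to compute the bounding box and to partition them, which is where the quadratic worst case comes from when the splits are very unbalanced.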

We give a more efficient algorithm that runs in [math]\displaystyle{ O(n \lg n) }[/math] time below. The goal is to loop over the list in only [math]\displaystyle{ O(n) }[/math] operations per step of the recursion but only call the recursion on at most half the points each time.

Let S_i^j be the i-th coordinate of the j-th point in S, where S_i denotes S sorted according to the i-th coordinate, and let p(S_i^j) be the point itself. Also, let h(R(S)) be the hyperplane that splits the longest side of R(S) in two. Here is the algorithm in pseudocode:

SplitTree(S, u)
    if |S| = 1
        R(u) := R(S) // R(S) is a hyperrectangle whose sides all have length zero.
        Store in u the only point in S.
    else
        size := |S|
        repeat
            Compute R(S)
            R(u) := R(S)
            j := 1
            k := |S|
            Let the i-th dimension be the one where Lmax(R(S)) = Li(R(S))
            Sv := ∅
            Sw := ∅
            while S_i^(j+1) < h(R(S)) and S_i^(k-1) > h(R(S))
                size := size - 1
                Sv := Sv ∪ {p(S_i^j)}
                Sw := Sw ∪ {p(S_i^k)}
                j := j + 1
                k := k - 1

            Let v and w be, respectively, the left and right children of u.
            if S_i^(j+1) > h(R(S))
                Sw := S \ Sv
                u := w
                S := Sw
                SplitTree(Sv, v)
            else if S_i^(k-1) < h(R(S))
                Sv := S \ Sw
                u := v
                S := Sv
                SplitTree(Sw, w)
        until size ≤ n/2
        SplitTree(S, u)

To maintain the sorted lists for each node, linked lists are used. Each list keeps cross-pointers to the others so that a point can be retrieved in constant time. In the algorithm above, a recursive call is made in each iteration of the loop. In practice, to reconstruct the lists without the overhead of re-sorting the points, it is necessary to rebuild the sorted lists once all points have been assigned to their nodes. To do the rebuilding, walk along each list for each dimension, add each point to the corresponding list of its node, and add cross-pointers in the original list so that the cross-pointers for the new lists can be added. Finally, call the recursion on each node and its set.

WSPD computation

Visual representation of a well-separated pair computed with the bounding boxes

The WSPD can be extracted from such a split tree by calling the recursive FindPairs(v,w) function on the children of every node in the split tree. Let ul and ur denote the left and right children of a node u. We give pseudocode for the FindWSPD(T, s) function below.

FindWSPD(T, s)
    for each node u that is not a leaf in the split tree T do
        FindPairs(ul, ur)

We give pseudocode for the FindPairs(v, w) function below.

FindPairs(v, w)
    if Sv and Sw are well-separated with respect to s 
        report pair(Sv, Sw)
    else
        if Lmax(R(v)) ≤ Lmax(R(w))
            Recursively call FindPairs(v, wl) and FindPairs(v, wr)
        else
            Recursively call FindPairs(vl, w) and FindPairs(vr, w)

Combining the s-well-separated pairs from all the calls of FindPairs(v,w) gives the WSPD for separation s.
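The two procedures can be sketched together in Python. The sketch is self-contained, so it rebuilds a simple (quadratic-time) split tree; it assumes distinct points given as tuples and uses a conservative well-separation test with balls centred on the bounding-box centres. All names (`Node`, `build`, `lmax`, `find_pairs`, `find_wspd`) are illustrative.

```python
import math

class Node:
    def __init__(self, points):
        dims = range(len(points[0]))
        self.points = points
        self.lo = tuple(min(p[i] for p in points) for i in dims)
        self.hi = tuple(max(p[i] for p in points) for i in dims)
        self.left = self.right = None

def build(points):
    """Simple split tree: halve the bounding box along its longest side."""
    u = Node(points)
    if len(points) > 1:
        i = max(range(len(u.lo)), key=lambda k: u.hi[k] - u.lo[k])
        mid = (u.lo[i] + u.hi[i]) / 2
        if mid >= u.hi[i]:          # guard against floating-point rounding
            mid = u.lo[i]
        u.left = build([p for p in points if p[i] <= mid])
        u.right = build([p for p in points if p[i] > mid])
    return u

def lmax(u):
    """Length of the longest side of R(u)."""
    return max(h - l for l, h in zip(u.lo, u.hi))

def well_separated(v, w, s):
    """Conservative test with balls centred on the box centres."""
    cv = [(l + h) / 2 for l, h in zip(v.lo, v.hi)]
    cw = [(l + h) / 2 for l, h in zip(w.lo, w.hi)]
    rho = max(math.dist(v.lo, v.hi), math.dist(w.lo, w.hi)) / 2
    return math.dist(cv, cw) - 2 * rho >= s * rho

def find_pairs(v, w, s, out):
    if well_separated(v, w, s):
        out.append((v.points, w.points))
    elif lmax(v) <= lmax(w):        # split the node with the longer box
        find_pairs(v, w.left, s, out)
        find_pairs(v, w.right, s, out)
    else:
        find_pairs(v.left, w, s, out)
        find_pairs(v.right, w, s, out)

def find_wspd(root, s):
    """Call find_pairs on the two children of every internal node."""
    out, stack = [], [root]
    while stack:
        u = stack.pop()
        if u.left is not None:
            find_pairs(u.left, u.right, s, out)
            stack += [u.left, u.right]
    return out
```

A call such as `find_wspd(build(points), 2.0)` returns a list of pairs of point lists. The recursion always terminates because two distinct single points pass the test for every s, and a node that is split always has a box with a positive longest side, so it is never a leaf.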

Proof of correctness of the algorithm

It is clear that the pairs returned by the algorithm are well-separated, because FindPairs reports a pair only after checking that it is well-separated.

Now, we have to prove that for any distinct points [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] in [math]\displaystyle{ S }[/math], there is a unique pair [math]\displaystyle{ \{A, B\} }[/math] so that (i) [math]\displaystyle{ p \in A }[/math] and [math]\displaystyle{ q \in B }[/math] or (ii) [math]\displaystyle{ p \in B }[/math] and [math]\displaystyle{ q \in A }[/math]. Assume without loss of generality that (i) holds.

Let [math]\displaystyle{ u }[/math] be the lowest common ancestor of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] in the split tree and let [math]\displaystyle{ v }[/math] and [math]\displaystyle{ w }[/math] be the children of [math]\displaystyle{ u }[/math]. By the assumption above, [math]\displaystyle{ p }[/math] is in the subtree of [math]\displaystyle{ v }[/math] and [math]\displaystyle{ q }[/math] is in the subtree of [math]\displaystyle{ w }[/math]. FindWSPD necessarily makes a call to FindPairs(v,w). Each time FindPairs recurses, its two recursive calls together cover all the points of the current call, so there is a sequence of calls to FindPairs that leads to a reported pair with [math]\displaystyle{ p }[/math] in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ q }[/math] in [math]\displaystyle{ B }[/math].

Because [math]\displaystyle{ u }[/math] is the lowest common ancestor of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math], calling FindPairs on the children of a higher node would place [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] on the same side, so they would not be separated by any resulting pair, and calling FindPairs on the children of a node in one of the subtrees of [math]\displaystyle{ u }[/math] would result in [math]\displaystyle{ p }[/math] or [math]\displaystyle{ q }[/math] not being in any pair. Thus, the pair [math]\displaystyle{ \{A, B\} }[/math] is the unique one separating [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math].

Each time the recursion tree splits in two, one more pair is added to the decomposition. So, the running time of the algorithm is proportional to the number of pairs in the final decomposition.

Callahan and Kosaraju proved that this algorithm finds a well-separated pair decomposition (WSPD) of size [math]\displaystyle{ O(s^d n) }[/math].[2]

Properties

Lemma 1: Let [math]\displaystyle{ \{A, B\} }[/math] be a well-separated pair with respect to [math]\displaystyle{ s }[/math]. Let [math]\displaystyle{ p, p' \in A }[/math] and [math]\displaystyle{ q \in B }[/math]. Then, [math]\displaystyle{ |pp'| \leq (2/s)|pq| }[/math].

Proof: Because [math]\displaystyle{ p }[/math] and [math]\displaystyle{ p' }[/math] are in the same set, we have that [math]\displaystyle{ |pp'| \leq 2\rho }[/math], where [math]\displaystyle{ \rho }[/math] is the radius of the enclosing balls of [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]. Because [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] are in two well-separated sets, whose enclosing balls are at distance at least [math]\displaystyle{ s\rho }[/math] from each other, we have that [math]\displaystyle{ |pq| \geq s\rho }[/math]. We obtain that:

[math]\displaystyle{ \begin{align} & \frac{|pp'|}{2} \leq \rho \leq \frac{|pq|}{s} \\ \Leftrightarrow & \\ & \frac{|pp'|}{2} \leq \frac{|pq|}{s} \\ \Leftrightarrow & \\ & |pp'| \leq \frac{2}{s}|pq| \\ \end{align} }[/math]

Lemma 2: Let [math]\displaystyle{ \{A, B\} }[/math] be a well-separated pair with respect to [math]\displaystyle{ s }[/math]. Let [math]\displaystyle{ p, p' \in A }[/math] and [math]\displaystyle{ q, q' \in B }[/math]. Then, [math]\displaystyle{ |p'q'| \leq (1+ 4/s)|pq| }[/math].

Proof: By the triangle inequality, we have:

[math]\displaystyle{ |p'q'| \leq |p'p| + |pq| + |qq'| }[/math]

From Lemma 1, we obtain:

[math]\displaystyle{ \begin{align} |p'q'| & \leq (2/s)|pq| + |pq| + (2/s)|pq| \\ & = (1+4/s)|pq| \end{align} }[/math]
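Both lemmas can be checked numerically on a small example. The sketch below assumes a planar instance: two discs of radius ρ whose centres are 2ρ + sρ apart, so that the discs are exactly sρ apart, with sample points drawn inside each disc. The helper names, the seed and the tolerance `eps` are illustrative choices.

```python
import math
import random

def sample_ball(center, rho, n, rng):
    """n points inside the disc of radius rho around center."""
    pts = []
    for _ in range(n):
        r = rho * math.sqrt(rng.random())
        t = 2 * math.pi * rng.random()
        pts.append((center[0] + r * math.cos(t), center[1] + r * math.sin(t)))
    return pts

def lemmas_hold(A, B, s, eps=1e-9):
    """Check Lemma 1 and Lemma 2 for every choice of p, p' in A and q, q' in B."""
    for p in A:
        for q in B:
            pq = math.dist(p, q)
            # Lemma 1: |pp'| <= (2/s)|pq|
            if any(math.dist(p, p2) > (2 / s) * pq + eps for p2 in A):
                return False
            # Lemma 2: |p'q'| <= (1 + 4/s)|pq|
            if any(math.dist(p2, q2) > (1 + 4 / s) * pq + eps
                   for p2 in A for q2 in B):
                return False
    return True

rng = random.Random(0)
s, rho = 4.0, 1.0
A = sample_ball((0.0, 0.0), rho, 20, rng)
B = sample_ball((2 * rho + s * rho, 0.0), rho, 20, rng)
```

With these sets, `lemmas_hold(A, B, s)` holds, as the lemmas predict. For sets that are not well-separated, such as A = {(0,0), (10,0)} and B = {(1,0)} with s = 4, the check fails, since |pp'| = 10 exceeds (2/4)|pq| = 0.5.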

Applications

The well-separated pair decomposition has application in solving a number of problems. WSPD can be used to:

  • Solve the closest pair problem in [math]\displaystyle{ O(n \lg n) }[/math] time.[1]
  • Solve the k-closest pairs problem in [math]\displaystyle{ O(n \lg n + k) }[/math] time.[1]
  • Solve the k-closest pair problem in [math]\displaystyle{ O(n \lg n) }[/math] time.[3]
  • Solve the all-nearest neighbors problem in [math]\displaystyle{ O(n \lg n) }[/math] time.[1]
  • Provide a [math]\displaystyle{ (1-\epsilon) }[/math]-approximation of the diameter of a point set in [math]\displaystyle{ O(n \lg n) }[/math] time.[1]
  • Directly induce a t-spanner of a point set.[1]
  • Provide a t-approximation of the Euclidean minimum spanning tree in d dimensions in [math]\displaystyle{ O(n \lg n) }[/math] time.[1]
  • Provide a [math]\displaystyle{ (1+\epsilon) }[/math]-approximation of the Euclidean minimum spanning tree in d dimensions in [math]\displaystyle{ O(n \lg n + (\epsilon^{-2} \lg ^2 \frac{1}{\epsilon})n) }[/math] time.[4]
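As an illustration of the spanner application, the following sketch takes a WSPD that has already been computed and adds one edge per pair, joining an arbitrary representative of each side. The relation s = 4(t+1)/(t-1) between the stretch factor t and the separation factor is not stated in this article; it is a choice commonly given in the spanner literature and is used here as an assumption. The function names are illustrative.

```python
def separation_for_stretch(t):
    """Separation factor s = 4(t+1)/(t-1) for a target stretch t > 1 (assumed rule)."""
    if t <= 1:
        raise ValueError("stretch factor must exceed 1")
    return 4 * (t + 1) / (t - 1)

def spanner_edges(pairs):
    """One edge per pair (A, B), joining the first point of A to the first point of B."""
    return [(A[0], B[0]) for A, B in pairs]
```

The resulting graph has exactly one edge per pair of the decomposition, so a WSPD of linear size yields a graph with a linear number of edges.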

References

  1. Smid, Michiel (16 August 2005). "The well-separated pair decomposition and its applications". http://people.scs.carleton.ca/~michiel/aa-handbook.pdf. Retrieved 26 March 2014.
  2. Callahan, P. B.; Kosaraju, S. R. (January 1995). "A Decomposition of Multidimensional Point Sets with Applications to k-Nearest-Neighbors and n-Body Potential Fields". Journal of the ACM 42 (1): 67–90. doi:10.1145/200836.200853.
  3. Bespamyatnikh, Sergei; Segal, Michael (2002). "Fast Algorithms for Approximating Distances". Algorithmica 33 (2): 263–269. doi:10.1007/s00453-001-0114-7.
  4. Arya, Sunil; Mount, David M. (2016). "A fast and simple algorithm for computing approximate euclidean minimum spanning trees". Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms.