Distance oracle

In computing, a distance oracle (DO) is a data structure for calculating distances between vertices in a graph.

Introduction

Let G(V,E) be an undirected, weighted graph, with n = |V| nodes and m = |E| edges. We would like to answer queries of the form "what is the distance between the nodes s and t?".

One way to do this is just run the Dijkstra algorithm. This takes time [math]\displaystyle{ O(m + n \log n) }[/math], and requires no extra space (besides the graph itself).

In order to answer many queries more efficiently, we can spend some time in pre-processing the graph and creating an auxiliary data structure.

A simple data structure that achieves this goal is a matrix which specifies, for each pair of nodes, the distance between them. This structure allows us to answer queries in constant time [math]\displaystyle{ O(1) }[/math], but requires [math]\displaystyle{ O(n^2) }[/math] extra space. It can be initialized in time [math]\displaystyle{ O(n^3) }[/math] using an all-pairs shortest paths algorithm, such as the Floyd–Warshall algorithm.

A DO lies between these two extremes. It uses less than [math]\displaystyle{ O(n^2) }[/math] space in order to answer queries in less than [math]\displaystyle{ O(m + n \log n) }[/math] time. Most DOs have to compromise on accuracy, i.e. they don't return the accurate distance but rather a constant-factor approximation of it.

Approximate DO

Thorup and Zwick^[1] describe more than 10 different DOs. They then suggest a new DO that, for every k, requires space [math]\displaystyle{ O(k n^{1+1/k}) }[/math], such that any subsequent distance query can be approximately answered in time [math]\displaystyle{ O(k) }[/math]. The approximate distance returned is of stretch at most [math]\displaystyle{ 2k-1 }[/math], that is, the quotient obtained by dividing the estimated distance by the actual distance lies between 1 and [math]\displaystyle{ 2k-1 }[/math]. The initialization time is [math]\displaystyle{ O(kmn^{1/k}) }[/math].

Some special cases include:

For [math]\displaystyle{ k=1 }[/math] we get the simple distance matrix.
For [math]\displaystyle{ k=2 }[/math] we get a structure using [math]\displaystyle{ O(n^{1.5}) }[/math] space which answers each query in constant time and approximation factor at most 3.
For [math]\displaystyle{ k=\lfloor \log n \rfloor }[/math], we get a structure using [math]\displaystyle{ O(n \log n) }[/math] space, query time [math]\displaystyle{ O(\log n) }[/math], and stretch [math]\displaystyle{ O(\log n) }[/math].

Higher values of k do not improve the space or preprocessing time.

DO for general metric spaces

The oracle is built of a decreasing collection of k+1 sets of vertices:

[math]\displaystyle{ A_0 = V }[/math]
For every [math]\displaystyle{ i=1,\ldots,k-1 }[/math]: [math]\displaystyle{ A_i }[/math] contains each element of [math]\displaystyle{ A_{i-1} }[/math], independently, with probability [math]\displaystyle{ n^{-1/k} }[/math]. Note that the expected size of [math]\displaystyle{ A_i }[/math] is [math]\displaystyle{ n^{1-i/k} }[/math]. The elements of [math]\displaystyle{ A_i }[/math] are called i-centers.
[math]\displaystyle{ A_k = \emptyset }[/math]

For every node v, calculate its distance from each of these sets:

For every [math]\displaystyle{ i=0,\ldots,k-1 }[/math]: [math]\displaystyle{ d(A_i,v) = \min{(d(w,v)|w\in A_i} }[/math] and [math]\displaystyle{ p_i(v) = \arg \min{(d(w,v)|w\in A_i} }[/math]. I.e., [math]\displaystyle{ p_i(v) }[/math] is the i-center nearest to v, and [math]\displaystyle{ d(A_i,v) }[/math] is the distance between them. Note that for a fixed v, this distance is weakly increasing with i. Also note that for every v, [math]\displaystyle{ d(A_0,v)=0 }[/math] and [math]\displaystyle{ p_0(v)=v }[/math].
[math]\displaystyle{ d(A_k,v)=\infty }[/math].

For every node v, calculate:

For every [math]\displaystyle{ i=0,\ldots,k-1 }[/math]: [math]\displaystyle{ B_i(v) = \{w\in A_i\setminus A_{i+1}|d(w,v)\lt d(A_{i+1},v)\} }[/math]. [math]\displaystyle{ B_i(v) }[/math] contains all vertices in [math]\displaystyle{ A_i }[/math] which are strictly closer to v than all vertices in [math]\displaystyle{ A_{i+1} }[/math]. The partial unions of [math]\displaystyle{ B_i }[/math]s are balls in increasing diameter, that contain vertices with distances up to the first vertex of the next level.

For every v, compute its bunch:

[math]\displaystyle{ B(v) = \bigcup_{i=0}^{k-1}B_i(v). }[/math]

It is possible to show that the expected size of [math]\displaystyle{ B(v) }[/math] is at most [math]\displaystyle{ kn^{1/k} }[/math].

For every bunch [math]\displaystyle{ B(v) }[/math], construct a hash table that holds, for every [math]\displaystyle{ w\in B(V) }[/math], the distance [math]\displaystyle{ d(w,v) }[/math].

The total size of the data structure is [math]\displaystyle{ O(kn+\Sigma|B(v)|)=O(kn+nkn^{1/k})=O(kn^{1+1/k}) }[/math]

Having this structure initialized, the following algorithm finds the distance between two nodes, u and v:

[math]\displaystyle{ w:=u, i:=0 }[/math]
while [math]\displaystyle{ w\notin B(v) }[/math]:
- [math]\displaystyle{ i:=i+1 }[/math]
- [math]\displaystyle{ (u,v):=(v,u) }[/math] (swap the two input nodes; this does not change the distance between them since the graph is undirected).
- [math]\displaystyle{ w := p_i(u) }[/math]
return [math]\displaystyle{ d(w,u)+d(w,v) }[/math]

It is possible to show that, in each iteration, the distance [math]\displaystyle{ d(w,u) }[/math] grows by at most [math]\displaystyle{ d(v,u) }[/math]. Since [math]\displaystyle{ A_{k-1}\subseteq B(v) }[/math], there are at most k-1 iterations, so finally [math]\displaystyle{ d(w,u)\leq (k-1)d(v,u) }[/math]. Now by the triangle inequality, [math]\displaystyle{ d(w,v)\leq d(w,u)+d(u,v)\leq kd(v,u) }[/math], so the distance returned is at most [math]\displaystyle{ (2k-1)d(u,v) }[/math].

Improvements

The above result was later improved by Patrascu and Roditty^[2] who suggest a DO of size [math]\displaystyle{ O(n^{4/3} m^{1/3}) }[/math] which returns a factor 2 approximation.

Reduction from set intersection oracle

If there is a DO with an approximation factor of at most 2, then it is possible to build a set intersection oracle (SIO) with query time [math]\displaystyle{ O(1) }[/math] and space requirements [math]\displaystyle{ O(N+n) }[/math], where n is the number of sets and N the sum of their sizes; see set intersection oracle#Reduction to approximate distance oracle.

It is believed that the SIO problem does not have a non-trivial solution. I.e., it requires [math]\displaystyle{ \Omega(n^2) }[/math] space to answer queries in time [math]\displaystyle{ O(1) }[/math], e.g. using an n-by-n table with the intersection between each two sets. If this conjecture is true, this implies that there is no DO with an approximation factor of less than 2 and a constant query time.^[2]

References

↑ Thorup, M.; Zwick, U. (2005). "Approximate distance oracles". Journal of the ACM 52: 1–24. doi:10.1145/1044731.1044732.
↑ ^2.0 ^2.1 Patrascu, M.; Roditty, L. (2010). "Distance Oracles beyond the Thorup–Zwick Bound". 2010 IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS). pp. 815. doi:10.1109/FOCS.2010.83. ISBN 978-1-4244-8525-3.

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/Distance oracle. Read more

[1] Thorup, M.; Zwick, U. (2005). "Approximate distance oracles". Journal of the ACM 52: 1–24. doi:10.1145/1044731.1044732.

[pr10-2] 2.0 ^2.1 Patrascu, M.; Roditty, L. (2010). "Distance Oracles beyond the Thorup–Zwick Bound". 2010 IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS). pp. 815. doi:10.1109/FOCS.2010.83. ISBN 978-1-4244-8525-3.

[1]

[2]

Anonymous

Search

Distance oracle

Namespaces

More

Page actions

Contents

Introduction

Approximate DO

DO for general metric spaces

Improvements

Reduction from set intersection oracle

References

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Distance oracle

Introduction

Approximate DO

DO for general metric spaces

Improvements

Reduction from set intersection oracle

References

Navigation

Wiki tools

Page tools

Other projects

Categories