Shadow heap

From HandWiki

In computer science, a shadow heap is a mergeable heap data structure which supports efficient heap merging in the amortized sense. More specifically, shadow heaps make use of the shadow merge algorithm to achieve insertion in O(f(n)) amortized time and deletion in O((log n log log n)/f(n)) amortized time, for any choice of 1 ≤ f(n) ≤ log log n.[1]

Throughout this article, it is assumed that A and B are binary heaps with |A| ≤ |B|.

Shadow merge

Shadow merge is an algorithm for merging two binary heaps efficiently if these heaps are implemented as arrays. Specifically, the running time of shadow merge on two heaps [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] is [math]\displaystyle{ O(|A| + \min\{\log |B| \log \log |B|, \log |A| \log |B|\}) }[/math].

Algorithm

We wish to merge the two binary min-heaps [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]. The algorithm is as follows:

  1. Concatenate the array [math]\displaystyle{ A }[/math] at the end of the array [math]\displaystyle{ B }[/math] to obtain an array [math]\displaystyle{ C }[/math].
  2. Identify the shadow of [math]\displaystyle{ A }[/math] in [math]\displaystyle{ C }[/math]; that is, the ancestors of the last [math]\displaystyle{ |A| }[/math] nodes in [math]\displaystyle{ C }[/math] which destroy the heap property.
  3. Identify the following two parts of the shadow from [math]\displaystyle{ C }[/math]:
    • The path [math]\displaystyle{ P }[/math]: the set of nodes in the shadow for which there are at most 2 at any depth of [math]\displaystyle{ C }[/math];
    • The subtree [math]\displaystyle{ T }[/math]: the remainder of the shadow.
  4. Extract and sort the smallest [math]\displaystyle{ |P| }[/math] nodes from the shadow into an array [math]\displaystyle{ S }[/math].
  5. Transform [math]\displaystyle{ S }[/math] as follows:
    • If [math]\displaystyle{ |S| \gt |C| }[/math], then starting from the smallest element in the sorted array, sequentially insert each element of [math]\displaystyle{ S }[/math] into [math]\displaystyle{ C }[/math], replacing them with [math]\displaystyle{ C }[/math]'s smallest elements.
    • If [math]\displaystyle{ |S| \leq |C| }[/math], then extract and sort the [math]\displaystyle{ |P| }[/math] smallest elements from [math]\displaystyle{ C }[/math], and merge this sorted list with [math]\displaystyle{ S }[/math].
  6. Replace the elements of [math]\displaystyle{ S }[/math] into their original positions in [math]\displaystyle{ C }[/math].
  7. Make a heap out of [math]\displaystyle{ T }[/math].

Running time

Again, let [math]\displaystyle{ P }[/math] denote the path, and [math]\displaystyle{ T }[/math] denote the subtree of the concatenated heap [math]\displaystyle{ C }[/math]. The number of nodes in [math]\displaystyle{ P }[/math] is at most twice the depth of [math]\displaystyle{ C }[/math], which is [math]\displaystyle{ O(\log |B|) }[/math]. Moreover, the number of nodes in [math]\displaystyle{ T }[/math] at depth [math]\displaystyle{ d }[/math] is at most 3/4 the number of nodes at depth [math]\displaystyle{ d + 1 }[/math], so the subtree has size [math]\displaystyle{ O(|A|) }[/math]. Since there are at most 2 nodes at each level on [math]\displaystyle{ P }[/math], then reading the smallest [math]\displaystyle{ |P| }[/math] elements of the shadow into the sorted array [math]\displaystyle{ S }[/math] takes [math]\displaystyle{ O(\log |B|) }[/math] time.

If [math]\displaystyle{ |S| \gt |C| }[/math], then combining [math]\displaystyle{ P }[/math] and [math]\displaystyle{ C }[/math] as in step 5 above takes time [math]\displaystyle{ O(\log |A| \log |B|) }[/math]. Otherwise, the time taken in this step is [math]\displaystyle{ O(|A| + \log |B| \log \log |B|) }[/math]. Finally, making a heap of the subtree [math]\displaystyle{ T }[/math] takes [math]\displaystyle{ O(|A|) }[/math] time. This amounts to a total running time for shadow merging of [math]\displaystyle{ O(|A| + \min\{\log |A| \log |B|, \log |B| \log \log |B|\}) }[/math].

Structure

A shadow heap [math]\displaystyle{ H }[/math] consists of threshold function [math]\displaystyle{ f(H) }[/math], and an array for which the usual array-implemented binary heap property is upheld in its first entries, and for which the heap property is not necessarily upheld in the other entries. Thus, the shadow heap is essentially a binary heap [math]\displaystyle{ B }[/math] adjacent to an array [math]\displaystyle{ A }[/math]. To add an element to the shadow heap, place it in the array [math]\displaystyle{ A }[/math]. If the array becomes too large according to the specified threshold, we first build a heap out of [math]\displaystyle{ A }[/math] using Floyd's algorithm for heap construction,[2] and then merge this heap with [math]\displaystyle{ B }[/math] using shadow merge. Finally, the merging of shadow heaps is simply done through sequential insertion of one heap into the other using the above insertion procedure.

Analysis

We are given a shadow heap [math]\displaystyle{ H = (B, A) }[/math], with threshold function [math]\displaystyle{ \log |H| \leq f(H) \leq \log |H| \log \log |H| }[/math] as above. Suppose that the threshold function is such that any change in [math]\displaystyle{ |B| }[/math] induces no larger a change than in [math]\displaystyle{ f(H) }[/math]. We derive the desired running time bounds for the mergeable heap operations using the potential method for amortized analysis. The potential [math]\displaystyle{ \Psi(H) }[/math] of the heap is chosen to be:

[math]\displaystyle{ \Psi(H) = |A| (1 + \min\{\log |B| \log \log |B|, \log |B| \log |A|\}/f(H)) }[/math]

Using this potential, we can obtain the desired amortized running times:

create(H): initializes a new empty shadow heap [math]\displaystyle{ H }[/math]

Here, the potential [math]\displaystyle{ \Psi }[/math] is unchanged, so the amortized cost of creation is [math]\displaystyle{ O(1) }[/math], the actual cost.

insert(x, H): inserts [math]\displaystyle{ x }[/math] into the shadow heap [math]\displaystyle{ H }[/math]

There are two cases:
  • If the merge is employed, then the drop in the potential function is exactly the actual cost of merging [math]\displaystyle{ B }[/math] and [math]\displaystyle{ A }[/math], so the amortized cost is [math]\displaystyle{ O(1) }[/math].
  • If the merge is not done, then the amortized cost is [math]\displaystyle{ O(1 + \min\{\log |B| \log \log |B|, \log |B| \log |A|\}/f(H)) }[/math]
By choice of the threshold function, we thus obtain that the amortized cost of insertion is:
[math]\displaystyle{ O(\log |H| \log \log |H|/f(H)) }[/math]

delete_min(H): deletes the minimum priority element from [math]\displaystyle{ H }[/math]

Finding and deleting the minimum takes actual time [math]\displaystyle{ O(|A| + \log |B|) }[/math]. Moreover, the potential function can only increase after this deletion if the value of [math]\displaystyle{ f(H) }[/math] decreases. By choice of [math]\displaystyle{ f }[/math], we have that the amortized cost of this operation is the same as the actual cost.

Related algorithms & data structures

A naive binary heap merging algorithm will merge the two heaps [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] in time [math]\displaystyle{ O(|B|) }[/math] by simply concatenating both heaps and making a heap out of the resulting array using Floyd's algorithm for heap construction. Alternatively, the heaps can simply be merged by sequentially inserting each element of [math]\displaystyle{ A }[/math] into [math]\displaystyle{ B }[/math], taking time [math]\displaystyle{ O(|A| \log |B|) }[/math].

Sack and Strothotte proposed an algorithm for merging the binary heaps in [math]\displaystyle{ O(|A| + \log |A| \log |B|) }[/math] time.[3] Their algorithm is known to be more efficient than the second naive solution described above roughly when [math]\displaystyle{ |A| \gt \log |B| }[/math]. Shadow merge performs asymptotically better than their algorithm, even in the worst case.

There are several other heaps which support faster merge times. For instance, Fibonacci heaps can be merged in [math]\displaystyle{ O(1) }[/math] time. Since binary heaps require [math]\displaystyle{ \Omega(|A|) }[/math] time to merge,[4] shadow merge remains efficient.

References

  1. Template:Cite tech report
  2. Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of Floyd's Heap-Construction Program", Fundamenta Informaticae (IOS Press) 120 (1): 75–92, doi:10.3233/FI-2012-751, http://www.deepdyve.com/lp/ios-press/elementary-yet-precise-worst-case-analysis-of-floyd-s-heap-50NW30HMxU 
  3. Sack, Jörg-R.; Strothotte, Thomas (1985), "An Algorithm for Merging Heaps", Acta Informatica (Springer-Verlag) 22 (2): 171–186, doi:10.1007/BF00264229 .
  4. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.