Brodal queue

From HandWiki
Short description: Optimal data structure for priority queue operations
Brodal queue
TypeHeap/priority queue
Invented1996
Invented byGerth Stølting Brodal
Time complexity in big O notation
Algorithm Average Worst case

In computer science, the Brodal queue is a heap/priority queue structure with very low worst case time bounds: O(1) for insertion, find-minimum, meld (merge two queues) and decrease-key and O(log(n)) for delete-minimum and general deletion. They are the first heap variant to achieve these bounds without resorting to amortization of operational costs. Brodal queues are named after their inventor Gerth Stølting Brodal.[1]

While having better asymptotic bounds than other priority queue structures, they are, in the words of Brodal himself, "quite complicated" and "[not] applicable in practice."[1] Brodal and Okasaki describe a persistent (purely functional) version of Brodal queues.[2]

Definition

A Brodal queue is a set of two trees T1 and T2 and 5 guides. The definition of the guide data structure can be found in the following section. For both trees, each node has a rank, this rank is useful for later operations and intuitively corresponds to the logarithm of the size of the subtree rooted in the node. We note arityi(x) the number of children of the node x with rank i. We will also use t1 for the root of tree T1 and t2 for the root of tree T2. At each given time, every subtree rooted in a node needs to fulfill these 5 invariants (which will be later called RANK invariants):

  • LEAF-RANK : If x is a leaf, then rank(x)=0,
  • PARENT-RANK : rank(x)<rank(parent(x)),
  • NEXT-RANK-ARITY : If rank(x)>0, then arityrank(x)1(x)2,
  • ARITY-BOUND: arityi(x){0,2,3,,7} we highlight that arityi(x)1,
  • ROOT-RANK : T2= or rank(t1)rank(t2).

Here NEXT-RANK-ARITY guarantees us that the size of the subtree rooted in a node is at least exponential to the rank of that node. In addition, ARITY-BOUND bounds the number of children of each rank for a given node, this implies that all nodes have rank and degrees in O(logn).

In a Brodal queue, not every node will have a bigger value than its parent, the nodes vialoating this condition will be called violating nodes. However, we want to keep the number of violating nodes relatively small. To keep track of violating nodes, we create for each node two sets V(x) and W(x) of nodes larger than x. Intuitively, V(x) are the nodes larger of x with large rank (such that yV(x) if rank(y)rank(t1)), and W(x) are the nodes with small rank (rank(y)<rank(t1)). These sets are implemented using doubly linked list meaning they have an order. In particular, all violating nodes added to V(x) are added at the front of the list, and all violating nodes added to W(x) are inserted next to a node of same rank. We let wi(x) denote the number of nodes in W(x) of rank iThe V(x) and W(x) lists fulfill these 5 invariants (we will call the SETS invariants):

  • MINIMUM-NODE : t1=min(T1T2)
  • VIOLATING-CONDITION : If yV(x)W(x) then yx
  • PARENT-VIOLATING: If y<parent(y) then there exist a node xy such that yV(x)W(x)
  • W-RANK-BOUND: wi(x)6
  • V-RANK-BOUND: By denoting V(x)=(v|V(x)|,,v2,v1), we have: rank(vi)i1α for a certain constant α.

Since all nodes have rank in O(logn) the W-RANK-BOUND and V-RANK-BOUND, all V(x) and W(x) are in size O(logn).

We also have some invariants of the roots of the trees T1 and T2: t1 and t2 (called the ROOTS invariants).

  • ROOT-ARITY : ti{2,3,,7} for i{0,1,,rank(ti)1},
  • V-SIZE-BOUND: |V(x)|α rank(t1),
  • W-ELEMENTS-RANK: if yW(t1), then rank(y)<rank(t1).

The V-SIZE-BOUND invariant essentially tells us that if we increase the rank of t1 by one, we have at most α new "large" violations (here large means having a high rank) without violating the V-RANK-BOUND invariant. On the other hand, the W-ELEMENTS-RANKinvariant tells us that all violations in W(x) are "small", this invariant is true per the definition of W. Maintaining the invariants W-RANK-BOUND and ROOT-ARITY is not trivial, to maintain these we will use the DecreaseKey operation which can be implemented using a guide as defined in the next section. Each time we will call the DecreaseKey operation, we will essentially :

  1. Add the new violation to V(t1) or W(t1) depending on the rank of that violation.
  2. To avoid V(t1) and W(t1) from getting too large, we incrementally do two kinds of transformations:
    1. Moving the sons of t2 to t1 to increase the rank of t1
    2. Reducing the number of violations in W(t1) by replacing two violations of rank k to one violation of rank k+1

The guide data structure

This definition is based on the definition from Brodal's paper.[3]

We assume that we have a sequence of variables xk,,x1 and we want to make sure that ik,xiT for some threshold T. The only operation allowed is REDUCE(i) which decreases xi by at least 2 and increases xi+1 by at most 1. We can assume without loss of generality that REDUCE(i) reduces xi by 2 and increases xi+1by 1.

If a xj is increased by one, the goal of the guide is to tell us on which indices i to apply REDUCE(i) in order to respect the threshold. The guide is only allowed to make O(1) calls to the REDUCE function for each increase.

The guide has access to another sequence x'k,,x'1 such that xix'i and x'i{T2,T1,T}. As long as after the increase of xj we have xjx'j we do not need to ask help from our guide since xj is "far" bellow T. However, if xj=x'j before the increase, then we have xj+1>x'j after the change.

To simplify the explanation, we can assume that T=2, so that x'i{0,1,2}. The guide will create blocks in the sequence of x'i of form 2,1,1,,1,0 where we allow there to be no 1. The guide maintains the invariant that each element which isn't in a block is either a 1 or a 0. For example, here are the blocks for a sequence of x'i.

1,2,1,1,0_,1,1,2,0_,2,0_,1,0,2,1,0_

The guide is composed of 3 arrays :

  • x the array of xk,,x1
  • xthe array of x'k,,x'1
  • p an array of pointers where all the pi for which x'i are in the same block will point to the same memory cell containing a value. If a x'i is not in a block, then pi points to a memory cell containing .

With this definition, a guide has two important properties :

  1. For each element in a block, we can find the most left element of the block in time O(1).
  2. We can destroy a block in time O(1) by assigning to the memory cell pointed to by each element of the block.

This way, the guide is able to decide which indices to REDUCE in time O(1). Here is an example :

2,1,1,0_,2,1,1,1,0_2,1,1,0_,2,2,1,1,0_Increment x'i2,1,1,1_,0,2,1,1,0_REDUCE2,1,1,1_,1,0,1,1,0_REDUCE2,1,1,1,1,0_,1,1,0reestablish blocks

To reestablish the blocks, the pointers of the 1 and 0 added to the first block now point to the same cell as all the other elements from the first block, and the value of the second block's cell is changed to . In the previous example, only two REDUCE operations were needed, this is actually the case for all instances. Therefore, the queue only needs O(1) operations to reestablish the property.

Operations of a Brodal Queue

To implement the different priority queue operations we first need to describe some essential transformations for the trees.

Transformations

Linking trees

To link trees, we need three nodes x1,x2 and x3of equal rank. We can calculate the minimum of these three nodes with two comparisons. Here we assume that x1 is the minimum but the process is similar for all xi. We can now make the nodes x2 and x3 the two leftmost sons of x1 and increase the rank of x1 by one. This preserves all the RANK and SETS invariants.

Delinking trees

If x has exactly two or three sons of rank rank(x)1, we can remove these sons and x gets the rank of its new largest son plus one. From the ARITY-BOUND condition, we know that that the NEXT-RANK-ARITY invariant will be preserved. Then, all the RANK and SETS invariants remain satisfied. If x has 4 or more children, we can simply cut of two of them and all the invariants remain true. Therefore, the delinking of a tree of rank k will always result in two or three trees of rank k1 (from the 2 or 3 children cut off) and one additional tree of rank at most k.

Maintaining the sons of a root

When we add and remove sons of a root, we want to keep the ROOT-RANK invariant true. For this purpose, we use 4 guides, two for each roots t1 and t2. To have constant time access to the son of t1 we create an extendible array of pointed that has for each rank i{0,,rank(t1)1} a pointer to a son of t1 of rank i. One guide will maintain the condition that arityi(t1)7 and the other maintains arityi(t1)2 both for i{0,,rank(t1)3}. The sons of t1of rank rank(t1)1 and rank(t1)2 are treated separately in a straight forward way to maintant their number between 2 and 7. The equivalent to the x'i variable in the definition of the guide will have the values {5,6,7} for the higher bound guide and {4,3,2} for the lower bound.

In this context, when we add a child of rank i to the root, we increase x'i by one, and apply the REDUCE operations. The RECUCE(i) operation here consists of linking three trees of rank i which creates a new child of rank i+1. Therefore, we decrease arityi(t1) by three and increase arityi+1(t1) by one. If this increase results in too many sons of rank rank(t1)2 or rank(t1)1 we link some of these sons together and possibly increase the rank of t1. If we increase the rank of t1, we have to increase the length of the extendible array managed by the guides.

Cutting off a son from t1 is very similar, except here the REDUCE operation corresponds to the delinking of a tree.

For the root t2 the situation is nearly the same. However, since MINIMUM-NODE guarantees us that t1 is the minimum element, we know that we won't create any violation by linking or delinking children of t1. The same cannot be said for t2. Linking sons will never create new violations but the delinking of sons can create up to three new violations. The tree left over by a delinking is made a son of t1 if it has rank less than rank(t1) and otherwise it becomes a son of t2. The new violations which have rank larger than rank(t1) are added to V(t1). To maintain the invariants on the V(t1) set (namely V-RANK-BOUND and V-SIZE-BOUND), we have to guarantee that the rank of t1 will be increased and that α in those invariants is chosen large enough.

Violation reducing

The goal of this transformation is to reduce the total amount of potential violations, meaning reducing |xT1T2V(x)W(x)|.

We assume we have two potential violations x1 and x2 of equal rank k. We then have several cases:

  1. If one of the nodes turns out not to be a violation, we just remove it from its corresponding violation set.
  2. Otherwise, both nodes are violations. Because of ARITY-BOUND, we know that both x1 and x2 have at least one brother. Then:
    1. If x1 and x2 are not brothers, then we can assume without losing generality that parent(x1)parent(x2), we can then swap the subtrees rooted in x1 and x2. The number of violations can only go down during that swap.
    2. Else, x1 and x2 are brothers of a node we will call y.
      1. If x1 has more than one brother of rank k, we can just cut off x1 and make it a non violating node of t1as described in the previous subsection.
      2. Else, x1 and x2 are the only children of rank k of y..
        1. If rank(y)>k+1, we can cut off both x1 and x2 nodes from y and make them non violating nodes of t1 as described in the previous subsection
        2. Else, rank(y)=k+1. We will cut off x1,x2 and y, the new rank of y will be one plus the rank of its leftmost son, We replace y by a son on t1 of rank k+1, which can be cut off as described in the previous subsection. If the replacement for y becomes a violating node of rank k+1, we add it to W(t1). Finally we make x1,x2 and y new sons of t1 as described above.

Avoiding too many violations

The only violation sets where we will add violations are V(t1) and W(t1). As described above, the invariants on those sets are maintained using guides. When we add a violation to W(t1) we have two cases:

  1. If there are exactly 6 violations of the given rank and there are at least two violating nodes which aren't sons of t2, we apply the REDUCE operations given by the guide.
  2. If there are more than 4 violations which are sons of t2, we cut the additional violations off and link them below t1. This removes the violation created by these nodes and doesn't affect the guide maintaining the sons of t2.

For each priority queue operation that is performed, we increase the rank of t1 by at least one by moving a constant number of sons of t2 to t1 (provided that T2). Increasing the rank of t1 allows us to add violations to V(t1) while still maintaining all our invariants. If T2 and rank(t2)rank(t1)+2 we can cut the largest sons of t2, link them to t1 and then make t2 a son of t1. This satisfies all the invariants. Otherwise, we cut off a son of t2 of rank rank(t1)+2, delink this son and add the resulting trees to t1. If T2=, we know that t1 is the node of largest rank, therefore we know that no large violations can be created.

Priority queue operations

MakeQueue

MakeQueue() just returns a pair of empty trees.

FindMin

FindMin(Q) returns t1.

Insert

Insert(Q,e) is only a special case of Meld(Q1,Q2) where Q2 is a queue only containing e and Q1=Q.

Meld

Meld(Q1,Q2) involves four trees (two for each queue). The tree having the minimum root becomes the new T1 tree. If this tree is also the tree of maximum rank, we can add all the other trees below as described previously. In this case, no violating node is created so no transformation is done on violating nodes. Otherwise, the tree of maximum rank becomes the new T2 tree and the other trees are added below as described in the section "Maintaining the sons of a root". If some trees have equal rank to this new T2, we can delink them before adding them. The violations created are handled as explained in the section : "Avoiding too many violations".

DecreaseKey

DecreaseKey(Q,e,e) replaces the element of e by e (with ee). If e<t1, we swap the two nodes, otherwise, we handle the potential new violation as explained in the section "Avoiding too many violations".

DeleteMin

DeleteMin(Q) is allowed to take worst case time O(logn). First, we completely empty T2 by moving all the sons of t2 to t1 then making t2 a rank 0 son of t1. Then, t1 is deleted, this leaves us with at most O(logn) independent trees. The new minimum is then found by looking at the violating sets of the old root and looking at all the roots of the new trees. If the minimum element is not a root, we can swap a root from a tree of equal rank to it. This creates at most one violation. Then, we make the independent trees sons of the new minimum element by performing O(logn) linking and delinking operations. This reestablishes the RANK and ROOTS invariants. By merging the V and W sets of the new root along with the V and W sets of the old root together, we get one new violation set of size O(logn). By doing at most O(logn) violation reducing transformations we can make the violation set only contain at most one element of each rank. This set will be our new W set and the new V set is empty. This reestablishes the SETS invariants. We also have to initialise a new guide for the new root t1.

Delete

Here, denotes the smallest possible element. Delete(Q,e) can simply be implemented by calling DecreaseKey(Q,e,) followed by DeleteMin(Q).

Implementation details

In this section, we summarize some implementation details for the Brodal Queue data structure.

In each tree, each node is a record having the following fields:

  • The element associated with the node (its value),
  • The rank of the node,
  • pointers to the node's left and right brothers,
  • a pointer to the father node,
  • a pointer to the leftmost son,
  • pointers to the first element of the node's V and W sets,
  • pointers to the following and previous element in the violation set the node belongs to. If this node is the first node of the violation set V(x) or W(x) it belongs to, the previous pointer points to x.
  • an array of pointers to sons of t1 of rank i (with i{0,,rank(t1)1}),
  • a similar array for t2,
  • an array of pointers to nodes in W(t1) of rank i (with i{0,,rank(t1)1}).

Finally, we have 5 guides: three to maintain the upper bounds on arityi(t1), arityi(t2) and wi(t1) and two to maintain the lower bounds on arityi(t1) and arityi(t2).

Due to its high amount of pointers and sets to keep track of, the Brodal Queue is extremely hard to implement. For this reason it is best described as a purely theoretical object to lower time complexity on algorithms like Dijkstra's algorithm. However, the Brodal has been implemented in Scala (the GitHub repository can be found here: https://github.com/ruippeixotog/functional-brodal-queues). In his paper, Gerth Stølting Brodal mentions that: "An important issue for further work is to simplify the construction to make it applicable in practice".[3]

Summary of running times

Here are time complexities[4] of various heap data structures. Function names assume a min-heap. For the meaning of "O(f)" and "Θ(f)" see Big O notation.

Operation find-min delete-min insert decrease-key meld
Binary[4] Θ(1) Θ(log n) O(log n) O(log n) Θ(n)
Leftist Θ(1) Θ(log n) Θ(log n) O(log n) Θ(log n)
Binomial[4][5] Θ(1) Θ(log n) Θ(1)[lower-alpha 1] Θ(log n) O(log n)[lower-alpha 2]
Fibonacci[4][6] Θ(1) O(log n)[lower-alpha 1] Θ(1) Θ(1)[lower-alpha 1] Θ(1)
Pairing[7] Θ(1) O(log n)[lower-alpha 1] Θ(1) o(log n)[lower-alpha 1][lower-alpha 3] Θ(1)
Brodal[10][lower-alpha 4] Θ(1) O(log n) Θ(1) Θ(1) Θ(1)
Rank-pairing[12] Θ(1) O(log n)[lower-alpha 1] Θ(1) Θ(1)[lower-alpha 1] Θ(1)
Strict Fibonacci[13] Θ(1) O(log n) Θ(1) Θ(1) Θ(1)
2-3 heap[14] O(log n) O(log n)[lower-alpha 1] O(log n)[lower-alpha 1] Θ(1) ?
  1. 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 Amortized time.
  2. n is the size of the larger heap.
  3. Lower bound of Ω(loglogn),[8] upper bound of O(22loglogn).[9]
  4. Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[11]

Gerth Stølting Brodal

Gerth Stølting Brodal is a professor at the University of Aarhus, Denmark.[15] He is best known for the Brodal queue.

References

  1. 1.0 1.1 Gerth Stølting Brodal (1996). Worst-case efficient priority queues. Proc. 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58
  2. Gerth Stølting Brodal and Chris Okasaki (1996). Optimal purely functional priority queues. Journal of Functional Programming.
  3. 3.0 3.1 Brodal, Gerth Stølting (1996). "Worst-Case Efficient Priority Queues". https://www.cs.au.dk/~gerth/papers/soda96.pdf. 
  4. 4.0 4.1 4.2 4.3 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8. 
  5. "Binomial Heap | Brilliant Math & Science Wiki" (in en-us). https://brilliant.org/wiki/binomial-heap/. 
  6. Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms". Journal of the Association for Computing Machinery 34 (3): 596-615. doi:10.1145/28869.28874. http://bioinfo.ict.ac.cn/~dbu/AlgorithmCourses/Lectures/Fibonacci-Heap-Tarjan.pdf. 
  7. Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2, http://john2.poly.edu/papers/swat00/paper.pdf 
  8. Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures". Journal of the Association for Computing Machinery 46 (4): 473–501. doi:10.1145/320211.320214. http://www.uqac.ca/azinflou/Fichiers840/EfficiencyPairingHeap.pdf. 
  9. Pettie, Seth (2005). "Towards a Final Analysis of Pairing Heaps". FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0. http://web.eecs.umich.edu/~pettie/papers/focs05.pdf. 
  10. Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues", Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58, http://www.cs.au.dk/~gerth/papers/soda96.pdf 
  11. Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338-341. ISBN 0-471-46983-1. 
  12. Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps". SIAM J. Computing 40 (6): 1463–1485. doi:10.1137/100785351. http://sidsen.org/papers/rp-heaps-journal.pdf. 
  13. Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). "Strict Fibonacci heaps". Proceedings of the 44th symposium on Theory of Computing - STOC '12. pp. 1177–1184. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5. http://www.cs.au.dk/~gerth/papers/stoc12.pdf. 
  14. Takaoka, Tadao (1999), Theory of 2–3 Heaps, pp. 12, https://ir.canterbury.ac.nz/bitstream/handle/10092/14769/2-3heaps.pdf 
  15. "Website of Gerth Stølting Brodal, at the University of Aarhus". http://www.cs.au.dk/~gerth/. Retrieved 18 February 2016.