Heaps and Priority Queues¶
Our major container structures that we have seen so far are:
- Trees
- Hash Tables
Height-balanced trees provide ordering of elements, but every operation takes logarithmic time. Hash tables provide constant-time insertion and lookup, but do not provide ordering of elements. So, each structure introduces tradeoffs between features, requirements of the data being stored in the structure, and running time. Heaps offer new alternatives to these tradeoffs.
Priority Queues¶
So, we have a bunch of items that we want to store in and remove from a data structure, BUT we want to get the highest priority (or lowest priority) items out first, regardless of the order in which the items were inserted. We need:
- A key indicating an item’s priority
- A data structure that arranges items by this key
Some obvious approaches:
- A linked list. Insertion in O(1) and removal in O(n) (or the reverse, if we keep the list sorted)
- A binary tree. Should we consider balanced trees, splay trees?
- A binary tree. Should we consider balanced trees, splay trees?
NOTE
Trees provide a LOT of operations, but a priority queue, much like a queue, does not need that many operations. We just need:
- enqueue
- dequeue
The only difference is that dequeue should return the highest-priority item in the structure.
A priority queue has only 2 required operations/goals!
- Remove the minimum (or highest priority element) quickly
- support quick insertions
This is why trees are overkill. We don’t need a total ordering of elements and a bunch of operations. Instead, we can get away with a PARTIAL ORDERING of elements, since we don’t need to traverse the items in order. It’s OK if the elements themselves have a total ordering, but the priority queue will only represent a partial ordering of the items.
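The enqueue/dequeue-by-priority behavior described above can be sketched with Python's standard-library `heapq` module (the variable names and sample priorities here are illustrative only):

```python
import heapq

# A minimal priority queue sketch using heapq (a binary min-heap on a list).
# Items are (priority, value) pairs; the smallest priority dequeues first,
# regardless of insertion order.
pq = []
heapq.heappush(pq, (3, "low"))
heapq.heappush(pq, (1, "urgent"))
heapq.heappush(pq, (2, "normal"))

# Dequeue everything: items come out in priority order, not insertion order.
order = [heapq.heappop(pq)[1] for _ in range(len(pq))]
```

Note that `heapq` never sorts the whole list; it only maintains the partial ordering needed to keep the minimum at index 0.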
Applications of priority queues:
Anything with Quality Of Service (QOS) levels.
- What about multi-QOS internet protocols at routers?
Heap sort.
Also just a useful algorithmic tool (to improve the time complexity of other algorithms).
A Partial Order is (from Wikipedia):

A partial order is a binary relation ≤ over a set P which is reflexive, antisymmetric, and transitive, i.e., for all a, b, c in P, we have that:
- a ≤ a (reflexivity);
- if a ≤ b and b ≤ a, then a = b (antisymmetry);
- if a ≤ b and b ≤ c, then a ≤ c (transitivity).

For elements a, b of a partially ordered set P, if a ≤ b or b ≤ a, then a and b are comparable. Otherwise they are incomparable. A partial order under which every pair of elements is comparable is called a Total order.

If P is totally ordered under ≤, then the following statements hold for all a, b, c in P:
- if a ≤ b and b ≤ a, then a = b (antisymmetry);
- if a ≤ b and b ≤ c, then a ≤ c (transitivity);
- a ≤ b or b ≤ a (totality).
Graphically, partial orders look like a lattice:
Binary Heaps¶
A binary heap is a natural fit for a priority queue. It enforces a partial order over elements, leading to fast enqueue, dequeue, and build times.
Definition
A binary heap is a binary tree with a few properties:
- The heap is a complete binary tree. So, we can use the array implementation of a tree
- The heap order property is maintained. For a min heap, this property states that for every node X (except the root, which has no parent), the key of X is greater than or equal to the key of X’s parent

Note that for property 2, there is NO ordering among the children of a node. This is the partial ordering part.

When inserting or deleting, we must maintain the structural property of having a complete tree, but we don’t need rotations! Rotations are a consequence of total ordering.
Instead, structural operations only involve swapping the items between a parent and child!
- moving an item up or down along a path to a root is called a PERCOLATION
INSERT Operation: Add the new node to the end of the tree (maintaining the complete tree property), then percolate up:
DELETE highest priority item Operation (deleteMin):
1. Find the min element. It is ALWAYS at the root!
2. Swap the item in the root with the item in the LAST node of the tree
3. Delete the last node of the tree. (It’s ALWAYS a leaf, so that part is easy)
4. Percolate the new root down.
When percolating down ALWAYS swap the item with smaller of its two child items.
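The two percolation operations can be sketched as an array-based min-heap in Python (the class and method names here are my own, not from the notes):

```python
class BinaryMinHeap:
    """Array-based binary min-heap: children of index i live at 2i+1 and 2i+2."""

    def __init__(self):
        self._a = []

    def insert(self, key):
        # Add at the end (keeps the tree complete), then percolate UP.
        self._a.append(key)
        i = len(self._a) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self._a[i] < self._a[parent]:
                self._a[i], self._a[parent] = self._a[parent], self._a[i]
                i = parent
            else:
                break

    def delete_min(self):
        # Min is at the root; swap it with the last leaf, delete the leaf,
        # then percolate the new root DOWN.
        a = self._a
        a[0], a[-1] = a[-1], a[0]
        smallest = a.pop()
        i, n = 0, len(a)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            if left >= n:
                break
            # Always swap with the SMALLER of the two children.
            child = left
            if right < n and a[right] < a[left]:
                child = right
            if a[child] < a[i]:
                a[i], a[child] = a[child], a[i]
                i = child
            else:
                break
        return smallest
```

Both loops walk one root-to-leaf path of a complete tree, which is what bounds their running time.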
Exercise
What is the time complexity of heap insert and heap deleteMin?
Other operations on heaps
Heaps are great at removing the highest priority item and inserting new items, but not at other operations. For example:
- findMax – must traverse the entire tree (because of the partial ordering, we don’t know if it will be the leftmost leaf, the rightmost leaf, or somewhere in between)
- decreaseKey – we must traverse the heap to find the item, before decreasing the key
- increaseKey – same as decreaseKey
- Remove arbitrary element – use decreaseKey to make the item the smallest in the heap, then just use deleteMin
If we can locate an element quickly, we can speed up some of these operations. Perhaps a hash table that maps:
HashTable: item -> heap index
OR
HashTable: item -> element
OR
HashTable: item -> node pointer
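Here is a sketch of the first option (item → heap index), assuming the index map is kept up to date on every swap; the class and method names are illustrative:

```python
class IndexedMinHeap:
    """Min-heap of (key, item) pairs that also keeps an item -> array-index
    dict, so decreaseKey can locate an item in O(1) instead of scanning."""

    def __init__(self):
        self._a = []    # list of (key, item)
        self._pos = {}  # item -> index in _a

    def _swap(self, i, j):
        a = self._a
        a[i], a[j] = a[j], a[i]
        # Keep the hash table consistent with the array.
        self._pos[a[i][1]] = i
        self._pos[a[j][1]] = j

    def _percolate_up(self, i):
        while i > 0 and self._a[i][0] < self._a[(i - 1) // 2][0]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def insert(self, key, item):
        self._a.append((key, item))
        self._pos[item] = len(self._a) - 1
        self._percolate_up(len(self._a) - 1)

    def peek_min(self):
        return self._a[0]

    def decrease_key(self, item, new_key):
        i = self._pos[item]            # O(1) lookup, no heap traversal
        assert new_key <= self._a[i][0]
        self._a[i] = (new_key, item)
        self._percolate_up(i)          # restore the heap order property
```

With the index map, decreaseKey drops from O(n) (find) + O(log n) (fix) to O(log n) total.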
So far, the heap hasn’t given us any time complexity advantages. This is where building a heap shines!

We can build a heap by repeatedly inserting items, leading to O(n log n) time.

Or, we can just throw all the items into an array, then percolate down EVERY ITEM beginning from the END of the array. This leads to O(n) time:
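The O(n) build can be sketched as follows (this is the standard bottom-up heapify; the function name is my own):

```python
def build_min_heap(a):
    """Heapify a list in place in O(n): percolate DOWN every internal node,
    starting from the last internal node and moving back toward the root."""
    n = len(a)
    # Indices n//2 .. n-1 are leaves and need no work.
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while True:
            left, right = 2 * j + 1, 2 * j + 2
            smallest = j
            if left < n and a[left] < a[smallest]:
                smallest = left
            if right < n and a[right] < a[smallest]:
                smallest = right
            if smallest == j:
                break
            a[j], a[smallest] = a[smallest], a[j]
            j = smallest
    return a
```

Most nodes sit near the bottom and percolate only a short distance, which is why the total work sums to O(n) rather than O(n log n).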
Double Ended Priority Queues (DEPQs)¶
Sometimes, it is not enough to be able to get just the highest priority element. Occasionally, we want to get rid of some of the lowest priority elements if, for example, the heap is getting too large. Normal heaps don’t provide a nice way to do this. Double ended priority queues offer a solution.
One Approach:
- use 2 normal priority queues.
- One is a min heap, the other is a max heap
- each element goes into both
- keep pointers between identical elements
Clearly, we are duplicating space.
A better approach: a correspondence structure:
- use a min and a max heap, but each only stores ⌊n/2⌋ of the n nodes
- if n is odd, keep 1 node in a buffer
- keep a total correspondence between nodes in each heap (every node in a heap is associated with exactly 1 node in the other heap)
- We must be able to support a remove operation in the heaps
Definition: Total Correspondence: Each element in the min priority queue is paired with (has a pointer to) an element in the max priority queue that is >= to it
Example:
Insert:
- If buffer is empty, place in buffer
- Otherwise, insert smaller of new element and buffer into min queue, and larger into max queue
- Establish the correspondence between the newly inserted elements
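The insert path above can be sketched as follows. This is only the insert side of the correspondence structure (remove needs arbitrary deletion from a heap, which `heapq` does not provide); the class name and the `pairs` dict are my own illustrative choices:

```python
import heapq

class DEPQInsertSketch:
    """Sketch of correspondence-DEPQ insertion. Each heap stores half the
    elements; `pairs` records the total correspondence min-side -> max-side."""

    def __init__(self):
        self.buffer = None
        self.min_heap = []   # a real min-heap
        self.max_heap = []   # max-heap simulated with negated keys
        self.pairs = {}      # element in min heap -> its partner in max heap

    def insert(self, x):
        if self.buffer is None:
            self.buffer = x          # odd element waits in the buffer
            return
        lo, hi = min(self.buffer, x), max(self.buffer, x)
        self.buffer = None
        heapq.heappush(self.min_heap, lo)   # smaller goes to the min queue
        heapq.heappush(self.max_heap, -hi)  # larger goes to the max queue
        self.pairs[lo] = hi                 # invariant: lo <= hi

    def get_min(self):
        best = self.min_heap[0] if self.min_heap else self.buffer
        if self.buffer is not None:
            best = min(best, self.buffer)
        return best

    def get_max(self):
        best = -self.max_heap[0] if self.max_heap else self.buffer
        if self.buffer is not None:
            best = max(best, self.buffer)
        return best
```

Because every min-side element is <= its max-side partner, the global minimum can only be in the min heap or the buffer, and symmetrically for the maximum.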
RemoveMin:
- If the buffer holds the min, remove the buffer
- Otherwise:
  - Remove the min from the min heap
  - Remove the corresponding element from the max heap
  - Reinsert the element removed from the max heap (that element and the buffer form a new pair)
Question:
Is the DEPQ a good solution for internet routers?
Roundup on Basic Heaps¶
- Build heap from unsorted complete tree: O(n)
- Remove Min: O(lg n)
- Insert: O(lg n)
- Traverse: it’s a tree: O(n)
- Heap sort?
Advanced Heaps¶
These heaps use some fancy tricks to improve the time complexity of some operations.
Leftist Heaps¶
Up until now, we have spent a lot of time trying to make sure trees are balanced. Leftist heaps take the opposite approach:
- Make the tree predictably unbalanced
- If the tree is unbalanced towards the left, then…
- The tree is deep on the left parts, But the tree is shallow on the right parts
So we can focus our operations on the right parts!
Furthermore, findMin in a (min) heap does not need to search down to a leaf! So having a deep part of the tree does not affect performance much
STRUCTURE: Leftist heaps are binary trees, but not necessarily complete trees, so we can’t just use the array implementation of a binary tree (we are stuck with linked nodes)
OPERATIONS: Leftist heaps can do everything a normal heap can do. But, we can meld two leftist heaps in O(log n) time. For normal heaps, the best we can do is append the arrays and re-build the heap in O(n) time, so leftist heaps have a big advantage there.
Preliminaries:
Let’s treat a binary tree as an extended binary tree. Basically, just draw the NULL pointers, and call them external nodes.
Definition: the s() function:
The s() function defines the NULL PATH LENGTH
For any node x in an extended binary tree, let s(x) be the length of a shortest path from x to an external node in the subtree rooted at x.
- If x is an external node, then s(x) = 0
- Otherwise, s(x) = min( s(leftChild(x)), s(rightChild(x)) ) + 1
DEFINITION: A Height Biased Leftist Tree:
a binary tree is a (height biased) leftist tree iff for every internal node x:
s( leftChild( x ) ) >= s( rightChild( x ) )
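The s() function and the leftist check translate directly into code (node representation and function names are my own):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def s(x):
    """Null path length: length of a shortest path from x down to an
    external (None) node in the subtree rooted at x."""
    if x is None:
        return 0  # external nodes have s = 0
    return min(s(x.left), s(x.right)) + 1

def is_leftist(x):
    """Height-biased leftist: s(left) >= s(right) at every internal node."""
    if x is None:
        return True
    return (s(x.left) >= s(x.right)
            and is_leftist(x.left)
            and is_leftist(x.right))
```

For example, a node with only a left child is leftist (s = 1 on the left, 0 on the right), while a node with only a right child is not.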
Properties of Leftist Trees:
- In a leftist tree, the rightmost path is a shortest root-to-external-node path, and the length of this path is s(root).
- The number of internal nodes is at least 2^s(root) − 1. (Because levels 1 through s(root) have no external nodes.)
- The length of the rightmost path is O(log n), where n is the number of (internal) nodes in a leftist tree.
So… the rightmost path has a guaranteed bound, and is the shortest path through the tree…
Make sure all operations are performed on the rightmost path!!
Melding leftist trees
The major operation is meld. In fact, insert, removeMin, and buildHeap will ALL be implemented with the meld operation!
Always meld with the right (shortest) subtree!
Merge Algorithm:
- Take right subheap from root of heap with smallest root
- Merge that with other heap
- repeat
Once a merge with an empty tree takes place, walk back up the tree:
- Swap child nodes if leftist property does not hold
Meld takes O(log n), since the rightmost branch is logarithmically bounded in height. However, because the rightmost branch can be arbitrarily short, meld is often much faster in practice!
Insert Operation: the new item is a singleton leftist tree. Meld it with the existing tree.
Remove Min Operation: remove the root node. The root’s children are two leftist trees. Meld them together.
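The meld-based operations can be sketched as follows. Each node caches its s() value so the leftist property can be restored in O(1) on the way back up (names are my own):

```python
class LNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.s = 1  # null path length of this node

def meld(a, b):
    """Meld two min leftist heaps, working only along rightmost paths."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a                # a is the heap with the smaller root
    a.right = meld(a.right, b)     # recurse into a's (short) right subtree
    # Walking back up: swap children if the leftist property fails.
    sl = a.left.s if a.left else 0
    sr = a.right.s if a.right else 0
    if sl < sr:
        a.left, a.right = a.right, a.left
        sl, sr = sr, sl
    a.s = sr + 1                   # s is 1 + s of the (smaller) right child
    return a

def insert(root, key):
    # The new item is a singleton leftist tree; meld it in.
    return meld(root, LNode(key))

def delete_min(root):
    # The min is the root; meld its two subtrees back together.
    return root.key, meld(root.left, root.right)
```

Because every call descends one step along a rightmost path, the recursion depth is bounded by s of the two roots, hence O(log n).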
Build a leftist Heap:
- Create n single-node min leftist trees and place them in a FIFO queue.
- Repeatedly remove two min leftist trees from the FIFO queue, meld them, and put the resulting min leftist tree into the FIFO queue.
- The process terminates when only 1 min leftist tree remains in the FIFO queue.
- Analysis is the same as for heap initialization
- Remember a binary heap is a complete tree and satisfies the leftist property
Skew Heaps¶
Very similar to a leftist tree, except that no s() values are stored in the nodes.
When melding, instead of swapping the left and right subtrees only when s(l(x)) < s(r(x)), ALWAYS swap them!
- Amortized complexity of each operation is O(log n)
- Don’t need to store s() values, so no extra information is needed in a node
- Easy to implement, just need a merge (meld) operation
- (the relationship to leftist heaps is similar to the relationship of splay trees to AVL trees)
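Because there are no s() values, the skew meld is tiny. A sketch using plain `(key, left, right)` tuples as nodes (an illustrative representation, not the only one):

```python
def skew_meld(a, b):
    """Skew heap meld: like the leftist meld, but ALWAYS swap children,
    so no s() values are needed. Nodes are (key, left, right) tuples."""
    if a is None:
        return b
    if b is None:
        return a
    if b[0] < a[0]:
        a, b = b, a                 # a holds the smaller root
    key, left, right = a
    # Meld b into the right subtree, then unconditionally swap:
    # the melded subtree becomes the new LEFT child.
    return (key, skew_meld(right, b), left)
```

Insert is `skew_meld(root, (k, None, None))` and removeMin is `skew_meld(root[1], root[2])`, exactly as with leftist heaps; the unconditional swap is what makes the O(log n) bound amortized rather than worst-case.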
Exercise:
Perform a meld on the trees used in the leftist tree meld example, but treat them as skew heaps.