In this post, I am going to tell you about different heap data structures that are common in Computer Science. I will start with the basic heap structure, i.e. the binary heap, and then discuss binomial and Fibonacci heaps. The code for each data structure is in Python. For each heap structure, only the most important methods are given for the sake of brevity - for other methods/functions consult the references given at the end. Note that the term "heap", as used here, has nothing to do with the memory heap that is a part of the runtime environment of languages like Python and C.
A binary heap is a binary tree that satisfies 2 properties -
Heap order property - There are 2 kinds of heaps - max-heaps and min-heaps. In a max-heap, the key of any node other than the root is at most equal to the key of its parent. Similarly, in a min-heap, the key of a non-root node is at least equal to the key of its parent.
Complete binary tree property - The heap tree is a complete binary tree, in which all levels except possibly the last are full and the nodes in the last level occupy the leftmost positions. In other words, each level i of the tree has 2^i nodes, except the last level, which may have fewer nodes (occupying the leftmost positions). In an array A representing a heap with n nodes, the nodes are the successive elements from A[0] to A[n-1], read level by level from left to right (see the small example below).
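For example (a small illustration, not from the original post), with 0-based indexing the parent and children of the node at index i can be located with simple arithmetic - the same formulas appear in the class further below:

# Index arithmetic for a heap stored in a Python list (0-based indexing).
A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]  # a max-heap with 10 nodes

i = 4                   # the node with key 7
parent = (i - 1) // 2   # index 1, key 14 - at least as large as 7
left = 2*i + 1          # index 9, key 1 - at most 7
right = 2*i + 2         # index 10 - no right child, since 10 >= len(A)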
A diagram of a max-heap is shown below. It satisfies the above properties.
The code that builds a heap takes as input an array (list) of numbers. It makes a copy of that list and builds a heap out of it.
class BinaryHeap:
    def __init__(self, input):
        '''Build a binary heap from an iterable of keys.'''
        self.heap = list(input)
        self.build_max_heap()

    def build_max_heap(self):
        '''Bottom-up heap construction.'''
        self.last_index = len(self.heap) - 1
        parent_of_last = self.parent(self.last_index)
        for i in range(parent_of_last, -1, -1):
            self.max_heapify(i)

    def parent(self, i):
        '''Return index of parent of element with index i.'''
        return (i - 1) // 2

    def left(self, i):
        '''Return index of left child of element with index i.'''
        return 2*i + 1

    def right(self, i):
        '''Return index of right child of element with index i.'''
        return 2*i + 2

    def max_heapify(self, i):
        '''Downheap bubbling of an element.'''
        l = self.left(i)
        r = self.right(i)
        if l <= self.last_index and self.heap[l] > self.heap[i]:
            largest = l
        else:
            largest = i
        if r <= self.last_index and self.heap[r] > self.heap[largest]:
            largest = r
        if largest != i:
            self.heap[i], self.heap[largest] = self.heap[largest], self.heap[i]
            self.max_heapify(largest)
We construct the heap with downheap bubbling on all elements from the parent of the last element up to the root. The elements that we don't consider are already max-heaps of one element each. max_heapify is a simple function that takes an index i and compares the value at that index with the values in the left and right children of that node. If the largest of the three is not at the root of the subtree, the function swaps it into place and recurs with largest.
max_heapify has a running time of O(h), where h is the height of the binary tree. A complete binary tree of n nodes has height floor(log n), so each call costs O(log n). Since max_heapify is called O(n) times, the heap is built in time O(n log n). But this is not a tight bound - it can be shown that the running time of the build_max_heap function is in fact linear, i.e. O(n). This method of building the heap is a bottom-up construction.
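For example (a small usage sketch, not in the original post), building a heap from a list of ten keys:

# Build a max-heap and inspect the underlying array.
bh = BinaryHeap([4, 1, 3, 2, 16, 9, 10, 14, 8, 7])
print(bh.heap)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] - the maximum is at the root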
A binary heap is used in heapsort and to build efficient priority queues.
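As an illustration, here is a hedged sketch of an extract-max operation - it is not part of the class above, but it is the piece that both heapsort and a max-priority queue are built on:

def extract_max(h):
    '''Remove and return the largest key from a BinaryHeap h.
    A sketch, not part of the original class.'''
    root = h.heap[0]
    h.heap[0] = h.heap[h.last_index]  # move the last leaf to the root
    h.heap.pop()
    h.last_index -= 1
    if h.last_index > 0:
        h.max_heapify(0)              # downheap-bubble to restore heap order
    return root

# Repeatedly extracting the maximum yields the keys in descending order,
# which is essentially heapsort.
bh = BinaryHeap([5, 13, 2, 25, 7, 17, 20, 8, 4])
print([extract_max(bh) for _ in range(len(bh.heap))])
# [25, 20, 17, 13, 8, 7, 5, 4, 2]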
A binomial heap is a collection of binomial trees. It is essential to understanding Fibonacci heaps, which are discussed later. Binomial heaps and Fibonacci heaps are known as mergeable heaps because they support an efficient union operation. Otherwise, performance-wise they are similar to binary heaps.
A binomial tree \(B_k\) has the following properties:
There are \(2^k\) nodes
The height of the tree is k.
There are exactly \({k \choose i}\) nodes at depth i for \(i = 0, 1, \ldots, k\), where \({k \choose i} = \frac {k!}{i!(k-i)!}\)
The root has the highest degree, k. The k children of the root are themselves roots of binomial trees \(B_{k-1}, B_{k-2}, \ldots, B_1, B_0\), from left to right.
"Binomial trees" are so named because of property 3 above, which is the binomial coefficient. The binomial tree \(B_0\) consists of a single node. The tree \(B_k\) is formed by linking 2 \(B_{k-1}\) trees - the root of one tree becomes the leftmost child of the root of the other. The diagram below shows the trees \(B_0\), \(B_1\), \(B_2\) and \(B_3\).
A binomial heap H is a set of binomial trees with the following properties -
Nodes of a tree are ordered - in min-heap ordering, any node has a key that is greater than or equal to that of its parent. (Similarly, there can be max-heap ordering.)
The degrees of the roots of the constituent trees are distinct - there is at most one tree of each degree.
The following diagram shows a binomial heap with trees \(B_0\), \(B_2\) and \(B_3\) -
Binomial heaps are represented with the left-child, right-sibling representation. Each node consists of a parent node pointer p, a key field, a degree field, a left-child node pointer child and a right-sibling node pointer sibling. In this way, we avoid having to store pointers to all children in a parent node. The representation is shown below:
The code shown below is for the union of 2 binomial heaps, as that is the operation that sets such heaps apart from binary heaps. The union method uses 2 other methods - link and merge. link links 2 binomial trees whose roots have the same degree. merge merges the root lists of 2 heaps, giving a root list with possibly repeated root degrees.
class Node:
    '''A node of a binomial tree in left-child, right-sibling representation.'''
    def __init__(self, key):
        self.p = None
        self.key = key
        self.degree = 0
        self.child = None
        self.sibling = None

class BinomialHeap:
    def __init__(self):
        '''Empty binomial heap.'''
        self.head = None

    @staticmethod
    def link(r1, r2):  # 2 root nodes. r2 becomes the resulting root
        '''Make binomial tree with root r1 the left-child of binomial tree
        with root r2.'''
        r1.p = r2
        r1.sibling = r2.child
        r2.child = r1
        r2.degree += 1

    @staticmethod
    def merge(h1, h2):  # parameters are 2 binomial heaps
        '''Merge the root lists of 2 binomial heaps.'''
        x = h1.head
        y = h2.head
        start = z = Node(0)  # dummy node
        while x is not None and y is not None:
            if x.degree <= y.degree:
                z.sibling = x
                x = x.sibling
            else:
                z.sibling = y
                y = y.sibling
            z = z.sibling
        if x is None:
            z.sibling = y
        if y is None:
            z.sibling = x
        return start.sibling  # the actual start of the merged root list

    @staticmethod
    def union(h1, h2):  # parameters are 2 binomial heaps
        '''Merge 2 binomial heaps.'''
        h = BinomialHeap()
        h.head = BinomialHeap.merge(h1, h2)  # h1 and h2 objects can be deleted afterwards
        if h.head is None:
            return h
        prev_x = None
        x = h.head
        next_x = x.sibling
        while next_x is not None:
            if ((x.degree != next_x.degree) or                # 1
                    ((next_x.sibling is not None) and
                     (next_x.sibling.degree == x.degree))):   # 2
                prev_x = x
                x = next_x
            elif x.key <= next_x.key:
                x.sibling = next_x.sibling
                BinomialHeap.link(next_x, x)                  # 3
            else:
                if prev_x is None:
                    h.head = next_x
                else:
                    prev_x.sibling = next_x
                BinomialHeap.link(x, next_x)                  # 4
                x = next_x
            next_x = x.sibling
        return h
The union method works as follows. First, an empty binomial heap is created. The root lists of the 2 input heaps h1 and h2 are merged with the merge method, which simply merges 2 linked lists sorted by root degree. Three pointers, prev_x, x and next_x, are set up to point to the previous, current and next root nodes. In this algorithm, at most 3 successive root nodes can have the same degree (though right after the merge step there can only be pairs of successive roots with the same degree). The conditions at 1 and 2 check whether 2 consecutive roots have different degrees, or whether 3 consecutive roots have the same degree. In these 2 cases the pointers are simply moved ahead by a step. If conditions 1 and 2 are both false, then one of 2 cases arises. If the key of x is less than or equal to the key of next_x, the tree at next_x is made a child of the tree at x by the link call at 3. Otherwise, x is made a child of the next_x tree by the link call at 4. Before 4, there is an if statement that adjusts the pointers, taking into account the case where x is at the start of the root list.
The running time of the union method is O(log n), based on the fact that the number of roots in a heap of n nodes is at most \(\lfloor \log n\rfloor + 1\). This is an improvement over the O(n) time needed to merge 2 binary heaps.
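To exercise union, we need a way to get keys into a heap in the first place. The standard approach (not part of the code above) is to implement insert via union with a one-node heap; a hedged sketch of that, plus a small usage example, is given below.

# A sketch of insert via union: build a one-node heap and union it with
# the existing heap. Assumes the Node and BinomialHeap classes above.
def binomial_insert(h, key):
    '''Insert key into binomial heap h and return the resulting heap.'''
    single = BinomialHeap()
    single.head = Node(key)
    return BinomialHeap.union(h, single)

h1 = BinomialHeap()
for k in [12, 7, 25]:
    h1 = binomial_insert(h1, k)

h2 = BinomialHeap()
for k in [18, 3, 37, 6]:
    h2 = binomial_insert(h2, k)

h = BinomialHeap.union(h1, h2)
# The 7 keys end up in trees of degrees 0, 1 and 2 (B_0, B_1 and B_2),
# matching the binary representation of 7 = 111.
root = h.head
while root is not None:
    print(root.degree, root.key)
    root = root.sibling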
Finally, we consider Fibonacci heaps. A Fibonacci heap is a collection of ordered (either min-heap or max-heap) trees. It is loosely based on the binomial heap. It is used in fast algorithms for finding minimum spanning trees and for the single-source shortest path problem. However, because of the complexity of programming such heaps, they do not find wide use. The term "Fibonacci" in the name comes from the analysis of the running time, as will be explained later.
When only mergeable heap operations are involved (namely, make-heap, insert, minimum, extract-min and union), the trees in a Fibonacci heap are unordered binomial trees. An unordered binomial tree is a binomial tree whose root node of degree k has child nodes of degrees k-1, k-2, ..., 1, 0 in no specific order. When decrease-key or delete operations are involved, we can get trees that violate the unordered binomial tree properties. Such a heap is shown in the diagram below.
Performance-wise Fibonacci heaps are an improvement over binomial heaps. The amortized cost of most of the operations, including union, is O(1). (Loosely speaking, amortization is a technique to compute the average cost spread over all the operations.) The reason behind this is that in all operations except extract-min, the "consolidation" of the heap is postponed until later. It is performed during the extract-min operation. As this operation is the most important one for this heap, we will examine the code for it.
The diagram below shows an example of a Fibonacci heap.
A Fibonacci heap is represented in code with the following structure. Apart from the key key, the degree degree and a boolean field mark (described below), a node within the heap contains a number of pointers: p is the parent pointer and child points to any one of the node's children. The children of a node form a circular doubly-linked list, with left and right pointing to a node's siblings.
mark is set to True when a node loses a child after the node itself was made a child of another node. It is set to False for newly created nodes and for nodes that have just been made a child of another node. The Fibonacci heap shown above has 3 marked nodes. In this article, we will not consider operations that remove child nodes, so the mark field will always remain False.
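Based on this description, the node might be coded as follows. This is a sketch - the class name FibNode is mine, not from the original post - but the field names match those used in the heap code below.

class FibNode:
    '''A node of a Fibonacci heap.'''
    def __init__(self, key):
        self.key = key
        self.degree = 0
        self.mark = False
        self.p = None                  # parent pointer
        self.child = None              # pointer to any one child
        self.left = self.right = self  # circular doubly-linked sibling list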
The heap object will have 2 fields. A min field will point to the node in the root list with the minimum key and n will be the number of nodes currently in the heap. The code to perform the extract-min operation is shown below.
from math import floor, log

class FibonacciHeap:
    def __init__(self):
        '''Empty Fibonacci heap.'''
        self.min = None
        self.n = 0

    def extract_min(self):
        '''Remove and return the node with the minimum key from the heap.'''
        z = self.min
        if z is not None:
            x = z.child
            if x is not None:
                # clear the parent pointers of all children of the min node
                start = x
                while x.right is not start:
                    x.right.p = None
                    x = x.right
                start.p = None
                self.concatenate_lists(x)
            self.delete_from_root_list(z)
            if z is z.right:
                self.min = None
            else:
                self.min = z.right
                self.consolidate()
            self.n -= 1
        return z

    def concatenate_lists(self, x):
        '''Concatenate the child list of the min node with the root list.'''
        x.left.right = self.min
        y = x.left
        x.left = self.min.left
        self.min.left.right = x
        self.min.left = y

    def delete_from_root_list(self, node):
        '''Delete node from its list. The node will retain its pointers.'''
        node.left.right = node.right
        node.right.left = node.left

    def consolidate(self):
        '''Consolidate the heap by successively linking nodes of the same
        degree in the root list.'''
        a = [None] * (floor(log(self.n, 1.618)) + 1)  # 1.618 approximates the golden ratio
        w = self.min
        while w not in a:
            x = w
            d = x.degree
            while a[d] is not None:
                y = a[d]
                if x.key > y.key:
                    x, y = y, x
                self.link(y, x)
                a[d] = None
                d += 1
            a[d] = x
            w = x.right
        self.min = None
        z = None
        for i in range(len(a)):
            if a[i] is not None:
                z = self.add_to_list(z, a[i])
                if self.min is None or a[i].key < self.min.key:
                    self.min = a[i]

    @staticmethod
    def add_to_list(z, node):
        '''Add node to the beginning of the circular list pointed to by z.'''
        if z is None:
            node.left = node.right = node
            return node
        node.right = z
        n = z.left
        z.left = node
        n.right = node
        node.left = n
        return node

    def link(self, y, x):
        '''Make root node y a child of root node x.'''
        self.delete_from_root_list(y)
        x.child = self.add_to_list(x.child, y)
        y.p = x
        x.degree += 1  # x has gained a child
        y.mark = False
The extract_min operation removes and returns the min node. It first adds the child nodes of the min node to the root list; this is what the concatenate_lists method does - it is a simple procedure to concatenate 2 circular doubly-linked lists. The min node is then removed from the root list. If it was the only node in the heap, min is set to None and extract_min ends by returning the extracted node. Otherwise, min is set to a temporary node value (the right sibling of the extracted node) and the heap is consolidated. Finally, the count of nodes is decreased by 1 and the extracted node is returned.
The consolidate method starts off by setting up an auxiliary list a whose size is the maximum possible degree of any node in the heap, plus one. The formula uses the golden ratio (about 1.618) as the logarithm base. Then, for each node in the root list, it checks the list a, with the node's degree as the index, to see whether a root of that degree has already been recorded. If so, the 2 nodes x and y are linked by the link method, with the node having the larger key becoming the child. The a entry is then cleared and higher indexes are checked in the same way. On exit from the inner while loop, the entry at the final index is set to the consolidated root node. In this way, the list a is built up. After the outer while loop, min is set to None and a new root list is built from the nodes in a, with min ending up pointing to the node with the smallest key.
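The original post does not show the insert operation, but a hedged sketch of it makes extract_min easy to try out. Insertion simply splices a new node into the root list in O(1) time; the sketch below uses the FibNode class from earlier and the add_to_list helper of FibonacciHeap.

def fib_insert(h, key):
    '''Insert key into Fibonacci heap h and return the new node (a sketch).'''
    node = FibNode(key)
    FibonacciHeap.add_to_list(h.min, node)  # splice into the root list
    if h.min is None or node.key < h.min.key:
        h.min = node
    h.n += 1
    return node

fh = FibonacciHeap()
for k in [23, 7, 21, 3, 18, 52, 38]:
    fib_insert(fh, k)

m = fh.extract_min()
print(m.key)       # 3, the smallest key
print(fh.min.key)  # 7, the new minimum found during consolidation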
If k is the degree of any node in a heap of n nodes, then it can be shown that \(k \le \lfloor \log_\phi n \rfloor\), where \(\phi\) is the golden ratio, i.e. \(\phi = (1 + \sqrt {5})/2\). The proof rests on the fact that a node of degree k has at least \(F_{k+2}\) descendants (counting itself), where \(F_{k+2}\) is the \((k+2)\)nd Fibonacci number - hence the name of this heap structure. (The \(n\)th Fibonacci number is the integer closest to \(\phi^n/\sqrt{5}\).)
Now you have a good idea about binary, binomial and Fibonacci heaps. Binary heaps
are used more often than binomial or Fibonacci heaps. There are other advanced heap
structures that I haven't gone into, like the relaxed heap. It offers better
performance than the Fibonacci heap. You can read about it here.
Introduction to Algorithms, second edition. By Cormen, Leiserson, Rivest and Stein.
Data Structures and Algorithms in Python, Wiley student edition. By Goodrich, Tamassia and Goldwasser.