Back to index

## Heap data structures in CS

In this post, I am going to tell you about different heap data structures that are common in Computer Science. I will start with the basic heap structure, i.e. the binary heap, and then discuss binomial and Fibonacci heaps. The code for each data structure is in Python. For each heap structure, only the most important methods are given for the sake of brevity - for other methods/functions consult the references given at the end. Note that the term "heap", as used here, has nothing to do with the memory heap that is a part of the runtime environment of languages like Python and C.

#### Binary heap

A binary heap is a binary tree that satisfies 2 properties -

1. Heap order property There are 2 kinds of heaps - max-heap and min-heap. For a max-heap, any element (or node) other than the root is at most equal to its parent. Similarly, in a min-heap, a non-root element is at least equal to its parent.

2. Complete binary tree property - The heap tree is a complete binary tree - in which all levels except the last are necessarily full and the nodes in the last level occupy leftmost positions. In other words, each level i of the tree, has 2^i nodes except the last level which may have fewer nodes (occupying leftmost positions). In an array A representing a heap with n nodes, the nodes are the successive elements from A[0] to A[n-1].

A diagram of a max-heap is shown below. It satisfies the above properties.

A max heap

The code that builds a heap takes as input an array (list) of numbers. It makes a copy of that list and builds a heap out of it.

class BinaryHeap:
def __init__(self, input):
'''Build a binary heap from iterable of keys.'''

self.heap = list(input)
self.build_max_heap()

def build_max_heap(self):
'''Bottom-up heap construction.'''

self.last_index = len(self.heap) - 1
parent_of_last = self.parent(self.last_index)
for i in range(parent_of_last, -1, -1):
self.max_heapify(i)

def parent(self, i):
'''Return index of parent of element with index i.'''

return (i-1)//2

def left(self, i):
'''Return index of left-child of element with index i.'''

return 2*i + 1

def right(self, i):
'''Return index of right-child of element with index i.'''

return 2*i + 2

def max_heapify(self, i):
'''Downheap bubbling of an element.'''

l = self.left(i)
r = self.right(i)

if l <= self.last_index and self.heap[l] > self.heap[i]:
largest = l
else:
largest = i

if r <= self.last_index and self.heap[r] > self.heap[largest]:
largest = r

if largest != i:
self.heap[i], self.heap[largest] = self.heap[largest], self.heap[i]
self.max_heapify(largest)


We construct the heap with downheap-bubbling on all elements from the parent of the last element upto the root. The elements that we don't consider are already max-heaps of one element each. max_heapify is a simple function that that takes an index and compares the value at that index with the values in the left and right children of the node (with index i). If the largest of the three is not at the root of the subtree, the function recurs with largest.

max_heapify has a running time of O(h), where h is the height of the binary tree. A complete binary tree of n nodes has height floor(log n). So the efficiency of the algorithm is O(log n). Since max_heapify is repeated O(n) times, so the heap is built in time O(nlog n). But this is not a tight bound. It can be shown that the running time of build_max_heap function is linear, i.e. O(n). This method of building the heap is a bottom-up construction.

A binary heap is used in heapsort and to build efficient priority queues.

#### Binomial heap

A binomial heap is a collection of binomial trees. It is essential to understanding Fibonacci heaps which are discussed later. Binomial heaps and Fibonacci heaps are known as mergeable heaps as they define an efficient union operation. Otherwise, performance-wise they are similar to binary heaps.

A binomial tree $$B_k$$ has the following properties:

1. There are $$2^k$$ nodes

2. The height of the tree is k.

3. There are exactly $${k \choose i}$$ nodes at depth i for i = 0,1,...k where $${k \choose i} = \frac {k!}{i!(k-i)!}$$

4. The root has the highest degree, k. The k children of the root are subtrees of degree k - 1, k - 2, ..., 1, 0, from left to right.

"Binomial trees" are so named because of property 3 above which is the formula for binomial coefficient. The binomial tree $$B_0$$ consists of a single node. The tree $$B_k$$ is formed by linking 2 $$B_{k-1}$$ trees - the root of one such tree has the other tree as its leftmost child. The diagram below shows the trees $$B_0$$, $$B_1$$, $$B_2$$ and $$B_3$$.

Binomial trees of degree 0, 1, 2 and 3

A binomial heap H is a set of binomial trees with the following properties -

1. Nodes of a tree are ordered - in min-heap ordering, any node has a key that is equal to or greater than that of its parent. (Similarly, there can be max heap ordering.)

2. The degrees of the roots of the constituent trees are unique.

The following diagram shows a binomial heap with trees $$B_0$$, $$B_2$$ and $$B_3$$ -

Binomial heap with 3 binomial trees
From: CLRS. See Ref. 1

Binomial heaps are represented with the left-child, right-sibling representation. Each node consists of a parent node pointer p, a key field, a degree field, a left-child node pointer child and a right-sibling node pointer sibling. In this way, we avoid having to store pointers to all children in a parent node. The representation is shown below:

Representation of the binomial heap shown above
From: CLRS. See Ref. 1

The code shown below is for the union of 2 binomial heaps as it is the operation that sets such heaps apart from binary heaps. The union method uses 2 other methods - link and merge. link links 2 binomial trees with the same root degree. merge merges the root lists of 2 heaps giving a heap with possibly repeated root degrees.

class Node:
'''A node of a binomial tree in left-child, right-sibling representation.'''
def __init__(self, key):
self.p = None
self.key = key
self.degree = 0
self.child = None
self.sibling = None

class BinomialHeap:
def __init__(self):
'''Empty binomial heap.'''

@staticmethod
def link(r1, r2):  # 2 root nodes. r2 becomes the resulting root
'''Make binomial tree with root r1 the left-child of binomial tree
with root r2.'''
r1.p = r2
r1.sibling = r2.child
r2.child = r1
r2.degree += 1

@staticmethod
def merge(h1, h2):  # parameters are 2 binomial heaps
'''Merge the root lists of 2 binomial heaps.'''
start = z = Node(0)  # dummy node
while x is not None and y is not None:
if x.degree <= y.degree:
z.sibling = x
x = x.sibling
else:
z.sibling = y
y = y.sibling
z = z.sibling
if x is None:
z.sibling = y
if y is None:
z.sibling = x
return start.sibling  # The actual start of the merged root list

@staticmethod
def union(h1, h2):  # parameters are 2 binomial heaps
'''Merge 2 binomial heaps.'''
h = BinomialHeap()
h.head = merge(h1, h2) # h1 and h2 objects can be deleted afterwards
return h
prev_x = None
next_x = x.sibling

while next_x is not None:
if ((x.degree != next_x.degree) or  # 1
((next_x.sibling is not None) and
(next_x.sibling.degree == x.degree)))  # 2
prev_x = x
x = next_x
elif x.key <= next_x.key:
x.sibling = next_x.sibling
else:
if prev_x is None:
else:
prev_x.sibling = next_x
x = next_x
next_x = x.sibling

return h


The union method works as follows. First an empty binomial heap is created. The 2 input heaps h1 and h2 are merged with the merge method which simply merges 2 linked lists. 3 pointers prev_x, x and next_x are set-up to point to the previous, current and next root nodes. In this algorithm, at most 3 successive root nodes can have the same degree (though right after the merge method, there can be only pairs of successive roots with same degree). The if statements at 1 and 2 check if 2 consecutive nodes are of different degrees or if 3 consecutive nodes have the same degree. In these 2 cases, the pointers are simply moved ahead by a step. If the conditions 1 and 2 are false, then either of 2 cases arise. If next_x has a greater key, then the tree at next_x is made a child of the tree at x by the link method at 3. But if x has a greater key, then it is made a child of the next_x tree by the link method at 4. Before 4, there is an if to adjust the pointers taking into account the starting scenario.

The running time of the union method is O(log n), based on the fact that the number of roots in a heap of size n nodes is at most $$\lfloor log n\rfloor + 1$$. This is an improvement over the O(n) time of merging 2 binary heaps.

#### Fibonacci heap

Finally, we consider Fibonacci heaps. A Fibonacci heap is a collection of ordered (either min-heap or max-heap) trees. It is loosely based on the binomial heap. It is used in fast algorithms for finding minimum spanning trees and for the single-source shortest path problem. However because of the complexity of programming of such heaps, it does not find wide usage. The term "Fibonacci" in the name comes from the analysis of the running time, as will be explained later.

When only mergeable heap operations are involved (namely, make-heap, insert, minimum, extract-min and union), the trees in a Fibonacci heap are unordered binomial trees. An unordered binomial tree is a binomial tree whose root node of degree k has child nodes of degrees k-1, k-2, ... 1, 0 in no specific order. When decrease-key or delete operations are involved, we can get trees that violate the unordered binomial tree properties. Such a heap is shown in the diagram below.

Performance-wise Fibonacci heaps are an improvement over binomial heaps. The amortized cost of most of the operations, including union, is O(1). (Loosely speaking, amortization is a technique to compute the average cost spread over all the operations.) The reason behind this is that in all operations except extract-min, the "consolidation" of the heap is postponed until later. It is performed during the extract-min operation. As this operation is the most important one for this heap, we will examine the code for it.

The diagram below shows an example of a Fibonacci heap.

A Fibonacci heap
From: CLRS. See Ref. 1

A Fibonacci heap is represented in code with the following structure. A node within the heap contains a number of pointers apart from the key key, the degree degree and a boolean field mark described below. p is the parent pointer, child is the pointer to any one of the node's children. The children of a node are in a circular doubly-linked list with left and right being pointers to the siblings of a node.

mark is set to True when a node loses a child node after it is made a child of another node. It is set to False for newly created nodes and nodes that are made a child of another. The Fibonacci heap shown above has 3 marked nodes. In this article, we will not consider operations which remove child nodes, so the mark field will always be False.

The heap object will have 2 fields. A min field will point to the node in the root list with the minimum key and n will be the number of nodes currently in the heap. The code to perform the extract-min operation is shown below.

from math import floor, log

class FibonacciHeap:
def __init__(self):
'''Empty Fibonacci heap.'''

self.min = None
self.n = 0

def extract_min(self):
'''Remove and return the node with the minimum key from the heap.'''

z = self.min
if z is not None:
x = z.child
if x is not None:
while x.right is not x:
x.right.p = None
x = x.right
x.p = None
self.concatenate_lists(x)
self.delete_from_root_list(z)
if z is z.right:
self.min = None
else:
self.min = z.right
self.consolidate()
self.n -= 1
return z

def concatenate_lists(self, x):
'''Concatenate the child list of the min node with the root list.'''

x.left.right = self.min
y = x.left
x.left = self.min.left
self.min.left.right = x
self.min.left = y

def delete_from_root_list(self, node):
'''Delete node from the heap. The node will retain its pointers.'''

node.left.right = node.right
node.right.left = node.left

def consolidate(self):
'''Consolidate the heap by successively linking nodes of the same
degree in the root list.'''

a = [None] * (floor(log(self.n, 1.618))+1)  # 1.618 is the golden ratio
w = self.min
while w not in a:
x = w
d = x.degree
while a[d] is not None:
y = a[d]
if x.key > y.key:
x, y = y, x
a[d] = None
d += 1
a[d] = x
w = x.right
self.min = None
z = None
for i in range(len(a)):
if a[i] is not None:
if self.min is None or a[i].key < self.min.key:
self.min = a[i]

@staticmethod
'''Add node to the beginning of the list pointed to by z.'''

if z is None:
node.left = node.right = node
return node
node.right = z
n = z.left
z.left = node
n.right = node
node.left = n
return node

'''Make root node y a child of root node x.'''

self.delete_from_root_list(y)
y.p = x
y.mark = False


The extract_min operation removes and returns the min node. It adds the child nodes of the min node to the root list. This is what concatenate_lists method does. It is a simple procedure to concatenate 2 circular doubly-linked lists. If the min node is the only node in the heap, then set min to None and end the extract_min method by returning the extracted node. If not, then set min to a temporary node value and consolidate the heap. Decrease the count of nodes by 1 and return the extracted node.

The consolidate method starts off by setting up an auxiliary list of size equal to the maximum degree possible of any node in the heap. The formula uses the golden ratio as the logarithm base (1.618). Then for each node in the root list, it checks the list a, with its degree as the index, if there is a node listed. If found, it links the 2 nodes x and y according to the keys in the link method. Then the a list entry is made None and higher indexes are checked. On exit from the inner while loop, the list index on exit is set to the consolidated root node. In this way, list a is built up. After the outer while loop, min is set to None and a new root list is built up from the nodes in a. min is set to the node with the least key.

If k is the degree of any node in a heap of size n, then it can be shown that $$k <= \lfloor log_\phi n \rfloor$$ where $$\phi$$ is the golden ratio i.e. $$\phi = (1 + \sqrt {5})/2$$. The $$n^{th}$$ Fibonacci number is the integer closest to $$\phi^n$$, and hence the name of this heap structure.

Now you have a good idea about binary, binomial and Fibonacci heaps. Binary heaps are used more often than binomial or Fibonacci heaps. There are other advanced heap structures that I haven't gone into, like the relaxed heap. It offers better performance than the Fibonacci heap. You can read about it here.

### References

1. Introduction to algorithms - second edition. By Cormen, Leiserson, Rivest and Stein

2. Data structures and algorithms in Python - Wiley student edition. By Goodrich, Tamassia and Goldwasser

Back to index