Graph Algorithms: Minimum Spanning Tree, Kruskal and Prim Algorithms

Minimum Spanning Tree header

The “Minimum Spanning Tree” (MST) of a connected graph is a subset of the edges of that graph that forms a tree, and this tree must connect all the nodes of the graph. Furthermore, the sum of the weights of the edges in this tree must be as low as possible.

[wpda_org_chart tree_id=43 theme_id=50]

The Minimum Spanning Tree algorithm

In other words, imagine you have a graph where each edge has an associated weight. A Minimum Spanning Tree of this graph is a set of edges that connects all the nodes without forming cycles and the sum of the weights of these edges is the lowest possible among all the trees that can be formed by connecting all the nodes of the graph.

This problem is relevant in many practical situations. For example, it might represent the street network of a city, where you want to build a series of streets so that every area of the city is accessible and the sum of the street lengths is minimal. Or it could represent a computer network, where the arcs represent links between computers and you want to minimize the cost of connectivity.

Algorithms such as Kruskal’s algorithm and Prim’s algorithm are used to solve the Minimum Spanning Tree problem. These algorithms are implemented to efficiently find the set of edges that constitute a Minimum Spanning Tree in a weighted graph.

Kruskal algorithm

The Kruskal algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) of an undirected and weighted graph. Its basic idea is to select the lightest edges in ascending order and add them to the tree, avoiding forming loops. The algorithm ends when all nodes are connected or when all edges have been considered.

Here is a detailed description of how Kruskal’s algorithm works:

  • Initialization: Sort the edges of the graph in ascending order with respect to their weights.
  • Main loop: Iterate through the arcs in ascending order. For each arc, check whether the nodes it connects belong to different connected components. If they don’t form a loop, add the edge to the tree and join the connected components of the two nodes.
  • Termination: Repeat step 2 until all nodes are connected or all edges have been considered.

Features of the Kruskal algorithm:

  • Greedy: The algorithm makes local optimal choices at each step in the hope of obtaining a globally optimal solution.
  • Does not form loops: To avoid forming loops, checks whether adding an edge creates a loop in the current MST.
  • Union-Find data structure: Kruskal’s algorithm often uses a data structure called Union-Find to keep track of connected components and help check for loops.

Here is an example of implementation

class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = []

    def add_edge(self, u, v, w):
        self.graph.append([u, v, w])

    def kruskal(self):
        self.graph = sorted(self.graph, key=lambda item: item[2])
        parent = [i for i in range(self.V)]
        result = []

        def find(x):
            if parent[x] != x:
                parent[x] = find(parent[x])
            return parent[x]

        def union(x, y):
            root_x = find(x)
            root_y = find(y)
            if root_x != root_y:
                parent[root_x] = root_y
                return True
            return False

        for edge in self.graph:
            u, v, w = edge
            if union(u, v):
                result.append([u, v, w])

        return result

# Example of usage
g = Graph(4)
g.add_edge(0, 1, 10)
g.add_edge(0, 2, 6)
g.add_edge(0, 3, 5)
g.add_edge(1, 3, 15)
g.add_edge(2, 3, 4)

mst = g.kruskal()
print("Minimum Spanning Tree:")
for edge in mst:
    print(edge)

In questo esempio, g.kruskal() restituirà il Minimo Albero Ricoprente del grafo definito. I pesi degli archi sono dati dalla terza componente di ciascun arco nella rappresentazione [u, v, w].

Executing the code you will get the following result:

Minimum Spanning Tree:
[2, 3, 4]
[0, 3, 5]
[0, 1, 10]

Prim algorithm

The Prim algorithm is another greedy algorithm used to find the Minimum Spanning Tree (MST) of an undirected and weighted graph. Unlike Kruskal, Prim’s algorithm starts from an arbitrary node and iteratively adds the lightest edge connecting a node already included in the tree to one outside the tree.

Here’s how Prim’s algorithm works:

  • Initialization: Choose a starting node. Initializes a set of nodes already included in the tree and a set of tree edges, both initially empty.
  • Updating sets: As long as there are nodes outside the tree, find the lightest edge that connects a node inside the tree to a node outside the tree. Add the node and bow to the tree.
  • Termination: Continue until all nodes are included in the tree.

Features of Prim’s algorithm:

  • Greedy: Like Kruskal, Prim’s algorithm makes locally optimal choices at each step in the hope of obtaining a globally optimal solution.
  • Does not form loops: Since only one node is added in each step, Prim’s algorithm does not form loops.
  • Heap data structure: The algorithm often uses a priority queue (heap) to keep track of the lightest edges.

Here is an example Python implementation of Prim’s algorithm:

import heapq

class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = {}

    def add_edge(self, u, v, w):
        if u not in self.graph:
            self.graph[u] = {}
        if v not in self.graph:
            self.graph[v] = {}

        self.graph[u][v] = w
        self.graph[v][u] = w

    def prim(self, start_node):
        heap = [(0, start_node)]
        included = set()
        mst = []

        while heap:
            weight, node = heapq.heappop(heap)
            if node not in included:
                included.add(node)

                for neighbor, edge_weight in self.graph[node].items():
                    heapq.heappush(heap, (edge_weight, neighbor))
                if len(mst) < self.V - 1:
                    mst.append((node, neighbor, edge_weight))

        return mst

# Example of usage
g = Graph(4)
g.add_edge(0, 1, 10)
g.add_edge(0, 2, 6)
g.add_edge(0, 3, 5)
g.add_edge(1, 3, 15)
g.add_edge(2, 3, 4)

mst = g.prim(0)
print("Minimum Spanning Tree:")
for edge in mst:
    print(edge)

In this example, g.prim(0) will return the Minimum Spanning Tree of the defined graph, starting from node 0 as the starting node. The weights of the edges are given by the values associated with the edges in the graph dictionary.

Executing you will get the following result:

Minimum Spanning Tree:
(0, 3, 5)
(3, 2, 4)
(2, 3, 4)

Some considerations on these algorithms

Both the Kruskal and Prim algorithms for calculating the Minimum Spanning Tree (MST) in a weighted graph have asymptotic complexities that mainly depend on the way they are implemented and the data structures used.

Kruskal algorithm:
The asymptotic complexity of Kruskal’s algorithm depends mainly on the ordering of the edges and the Union-Find data structure used. Assuming that an efficient sorting algorithm such as Merge Sort or Heap Sort is used, and a Union-Find data structure based on union-per-rank and path compression, the asymptotic complexity of Kruskal’s algorithm is O(E log E), where E is the number of edges in the graph.

Prim algorithm:
The asymptotic complexity of Prim’s algorithm depends mainly on the implementation of the priority queue (heap). Using a min-heap-based priority queue, Prim’s algorithm has an asymptotic complexity of O((V + E) log V), where V is the number of nodes and E is the number of edges in the graph.

In both cases, the complexity is dominated by the sorting operation (for Kruskal) or the priority queue operations (for Prim). By choosing efficient algorithms for these operations, it is possible to achieve acceptable performance even for large graphs.

Data Structures Used:

  • Kruskal: Use the Union-Find data structure to manage connected components and check for loops.
  • Prim: Uses a priority queue (heap) to manage the lightest edges to add to the tree.

Time Complexity:

  • Kruskal: The asymptotic complexity is O(E log E), where E is the number of edges. This complexity is often better for sparse graphs.
  • Prim: The asymptotic complexity is O((V + E) log V), where V is the number of nodes and E is the number of edges. This complexity may be better for dense graphs.

Best Case and Worst Case:

  • Kruskal: Usually performs well even on dense graphs, but the best case is when the edges are already ordered.
  • Prim: Usually performs well on dense graphs and the best case is when the priority queue is implemented efficiently.

Tree Structure Produced:

  • Kruskal: The algorithm selects edges based on weight and may create multiple temporary trees before obtaining a unique MST.
  • Prim: The algorithm builds the tree from the starting node, guaranteeing a unique MST without the need for subsequent merges.

Implementation and Memory:

  • Kruskal: Requires an arc sorting step and implementing Union-Find.
  • Prim: Requires a priority queue and does not require sorting, but queue management may require additional memory.

Choice of Starting Node:

  • Kruskal: Independent of the choice of starting node.
  • Prim: The result may vary based on the starting node, but all the MSTs obtained are equivalent in weight.

Use in Specific Contexts:

  • Kruskal: May be preferred when the graph is sparse and searching for lighter edges is more efficient.
  • Prim: May be preferred when the graph is dense and managing a priority queue is more efficient.

In general, the choice between Kruskal and Prim depends on the specific characteristics of the graph and the performance requirements of the context in which they are used. Both algorithms are efficient and provide optimal solutions for the Minimum Spanning Tree problem.

Leave a Reply