Graph Algorithms

Minimum Spanning Tree

● Advanced ⏱ 13 min read graph-algorithms

A Minimum Spanning Tree (MST) of a connected, undirected, weighted graph is a spanning tree (connects all V vertices with V−1 edges) whose total edge weight is minimised. Two classic algorithms: Kruskal’s (edge-based) and Prim’s (vertex-based).

What Is an MST?

Text
Graph (V=4, E=5):
  A -1- B
  |   / |
  4  2  3
  | /   |
  C -5- D

All spanning trees connect A,B,C,D with 3 edges.
MST picks the minimum-weight set:
  A-B (1), B-C (2), B-D (3)  → total = 6

Applications: network cable layout, road construction,
cluster analysis, approximation algorithms for TSP.

Union-Find (Disjoint Set Union)

Kruskal’s algorithm needs to detect whether adding an edge would create a cycle. Union-Find maintains a forest of connected components and answers “are u and v connected?” in near-O(1) with two optimisations: path compression and union by rank.

Python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank   = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False          # already in same component — would form cycle
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx       # union by rank: attach smaller tree under larger
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

uf = UnionFind(4)    # 4 nodes: 0,1,2,3
uf.union(0, 1)
uf.union(1, 2)
print(uf.find(0) == uf.find(2))   # True — connected
print(uf.find(0) == uf.find(3))   # False — not connected

Kruskal’s Algorithm

Sort all edges by weight, then greedily add each edge to the MST if it does not form a cycle (checked via Union-Find).

Python
def kruskal(vertices, edges):
    """
    vertices: list of vertex labels
    edges: list of (weight, u, v)
    Returns (mst_edges, total_weight)
    """
    v_to_idx = {v: i for i, v in enumerate(vertices)}
    uf       = UnionFind(len(vertices))
    mst      = []
    total    = 0

    for w, u, v in sorted(edges):           # sort by weight
        ui, vi = v_to_idx[u], v_to_idx[v]
        if uf.union(ui, vi):                # no cycle → include edge
            mst.append((u, v, w))
            total += w
            if len(mst) == len(vertices) - 1:
                break                       # MST complete (V-1 edges)

    return mst, total

vertices = ["A", "B", "C", "D"]
edges    = [(1,"A","B"), (4,"A","C"), (2,"B","C"), (3,"B","D"), (5,"C","D")]
mst, cost = kruskal(vertices, edges)
print(mst)    # [('A', 'B', 1), ('B', 'C', 2), ('B', 'D', 3)]
print(cost)   # 6

Prim’s Algorithm

Start from any vertex and greedily grow the MST by always adding the cheapest edge that connects a new vertex to the current tree. Uses a min-heap to track the cheapest available edge.

Python
import heapq
from collections import defaultdict

def prim(graph, start):
    """
    graph: adjacency list {u: [(v, weight), ...]}
    Returns (mst_edges, total_weight)
    """
    in_mst  = {start}
    heap    = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    mst     = []
    total   = 0

    while heap and len(in_mst) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_mst:
            continue                     # vertex already in MST
        in_mst.add(v)
        mst.append((u, v, w))
        total += w
        for nb, weight in graph[v]:
            if nb not in in_mst:
                heapq.heappush(heap, (weight, v, nb))

    return mst, total

g = defaultdict(list)
for w, u, v in [(1,"A","B"),(4,"A","C"),(2,"B","C"),(3,"B","D"),(5,"C","D")]:
    g[u].append((v, w))
    g[v].append((u, w))

mst, cost = prim(g, "A")
print(mst)    # [('A', 'B', 1), ('B', 'C', 2), ('B', 'D', 3)]
print(cost)   # 6

Kruskal’s vs Prim’s

PropertyKruskal’sPrim’s
StrategySort edges globally, pick cheapest non-cycle edgeGrow tree vertex by vertex via cheapest outgoing edge
Best forSparse graphs (small E)Dense graphs (large E, small V)
Data structureUnion-FindMin-heap
TimeO(E log E)O((V+E) log V) with heap
Disconnected graphProduces forestOnly handles connected component of start

Complexity Table

AlgorithmTimeSpaceNote
Kruskal’sO(E log E)O(V)Dominated by sorting edges
Prim’s (binary heap)O((V+E) log V)O(V)Better on dense graphs
Union-Find (with optimisations)O(α(V)) per opO(V)α is inverse Ackermann ≈ O(1) practically