Skip to main content

Kosaraju's Algorithm

A strongly connected component, or SCC, of a directed graph is a maximal set of vertices in which every pair is mutually reachable. Kosaraju's algorithm finds all SCCs with two linear passes of depth first search.

The key observation is that a graph and its transpose (the same graph with every edge reversed) have identical SCCs. If we run DFS on the original graph and record the order in which vertices finish, then run DFS on the transpose starting from vertices in reverse-finish order, each DFS tree of the second pass is exactly one SCC. The first pass gives us a topological order of the condensation; the second pass peels off sinks of that condensation one at a time, and a sink SCC cannot reach anything else, so each second-pass DFS stays inside its own component.

Compared to Tarjan's one-pass algorithm, Kosaraju is conceptually cleaner but traverses the graph twice and requires materialising the transpose.

Codes

def solve(xs):
"""xs = [n, edges]; edges are directed [u, v].
Returns list of SCCs, each a sorted list of vertices."""
n, edges = xs
g = [[] for _ in range(n)]
gr = [[] for _ in range(n)]
for u, v in edges:
g[u].append(v)
gr[v].append(u)

order, seen = [], [False] * n
# first pass: finish-time order on original graph
for start in range(n):
if seen[start]:
continue
stack = [(start, 0)]
seen[start] = True
while stack:
u, i = stack[-1]
if i < len(g[u]):
stack[-1] = (u, i + 1)
v = g[u][i]
if not seen[v]:
seen[v] = True
stack.append((v, 0))
else:
order.append(u)
stack.pop()

# second pass: DFS on transpose in reverse finish order
comp = [-1] * n
sccs = []
for start in reversed(order):
if comp[start] != -1:
continue
cid = len(sccs)
comp[start] = cid
stack = [start]
members = []
while stack:
u = stack.pop()
members.append(u)
for v in gr[u]:
if comp[v] == -1:
comp[v] = cid
stack.append(v)
sccs.append(sorted(members))
return sccs

Example

Loading Python runner...

Description

Run time analysis

O(V+E)O(V + E). Two DFS passes over the graph plus building the transpose, each linear in the size of the graph.

Space analysis

O(V+E)O(V + E). The transpose adjacency list is the same size as the original; the finish-order list, component label array, and explicit DFS stack each take O(V)O(V).

Proof of correctness

Let CC be an SCC of the original graph and let uu be the vertex in CC with the largest finish time during the first DFS. We show the second DFS started from uu (on the transpose) visits exactly CC.

First, every vertex vCv \in C is reachable from uu in the transpose. That is because vv is reachable from uu in the original graph (they share an SCC), and reversing all edges swaps reachability. So every vertex of CC ends up in the component DFS rooted at uu.

Second, no vertex outside CC is reached. Suppose wCw \notin C is reachable from uu in the transpose; then uu is reachable from ww in the original graph. Since uu has the largest first-pass finish time in CC, and we process vertices in decreasing finish-time order in the second pass, ww must have been assigned to a component before uu, unless ww and uu are in the same SCC — contradicting wCw \notin C. A standard property of DFS finish times (the "white path" theorem, CLRS 22.4) formalises that if uu has a larger finish time than any vertex of its SCC, and ww is reachable from uu in the transpose, then ww shares uu's SCC. Therefore the second-pass DFS tree rooted at uu is precisely CC. \blacksquare

Extensions

Applications

References

  • Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms, 3rd ed., Chapter 22.5 (Strongly Connected Components).
  • Sharir, Micha. "A strong-connectivity algorithm and its applications in data flow analysis." Computers and Mathematics with Applications, 7(1):67-72, 1981.
  • cp-algorithms — Strongly connected components, Kosaraju