Kosaraju's Algorithm
A strongly connected component, or SCC, of a directed graph is a maximal set of vertices in which every pair is mutually reachable. Kosaraju's algorithm finds all SCCs with two linear passes of depth first search.
The key observation is that a graph and its transpose (the same graph with every edge reversed) have identical SCCs. If we run DFS on the original graph and record the order in which vertices finish, then run DFS on the transpose starting from vertices in reverse-finish order, each DFS tree of the second pass is exactly one SCC. The first pass gives us a topological order of the condensation; the second pass peels off sinks of that condensation one at a time, and a sink SCC cannot reach anything else, so each second-pass DFS stays inside its own component.
Compared to Tarjan's one-pass algorithm, Kosaraju is conceptually cleaner but traverses the graph twice and requires materialising the transpose.
Codes
- Python
- C++
def solve(xs):
"""xs = [n, edges]; edges are directed [u, v].
Returns list of SCCs, each a sorted list of vertices."""
n, edges = xs
g = [[] for _ in range(n)]
gr = [[] for _ in range(n)]
for u, v in edges:
g[u].append(v)
gr[v].append(u)
order, seen = [], [False] * n
# first pass: finish-time order on original graph
for start in range(n):
if seen[start]:
continue
stack = [(start, 0)]
seen[start] = True
while stack:
u, i = stack[-1]
if i < len(g[u]):
stack[-1] = (u, i + 1)
v = g[u][i]
if not seen[v]:
seen[v] = True
stack.append((v, 0))
else:
order.append(u)
stack.pop()
# second pass: DFS on transpose in reverse finish order
comp = [-1] * n
sccs = []
for start in reversed(order):
if comp[start] != -1:
continue
cid = len(sccs)
comp[start] = cid
stack = [start]
members = []
while stack:
u = stack.pop()
members.append(u)
for v in gr[u]:
if comp[v] == -1:
comp[v] = cid
stack.append(v)
sccs.append(sorted(members))
return sccs
#include <algorithm>
#include <vector>
std::vector<std::vector<int>> kosaraju(
int n, const std::vector<std::pair<int,int>>& edges) {
std::vector<std::vector<int>> g(n), gr(n);
for (auto [u, v] : edges) {
g[u].push_back(v);
gr[v].push_back(u);
}
std::vector<int> order;
std::vector<char> seen(n, 0);
for (int s = 0; s < n; ++s) {
if (seen[s]) continue;
std::vector<std::pair<int,int>> st = {{s, 0}};
seen[s] = 1;
while (!st.empty()) {
auto& [u, i] = st.back();
if (i < (int)g[u].size()) {
int v = g[u][i++];
if (!seen[v]) { seen[v] = 1; st.push_back({v, 0}); }
} else {
order.push_back(u);
st.pop_back();
}
}
}
std::vector<int> comp(n, -1);
std::vector<std::vector<int>> sccs;
for (int idx = (int)order.size() - 1; idx >= 0; --idx) {
int s = order[idx];
if (comp[s] != -1) continue;
int cid = (int)sccs.size();
sccs.push_back({});
std::vector<int> st = {s};
comp[s] = cid;
while (!st.empty()) {
int u = st.back(); st.pop_back();
sccs[cid].push_back(u);
for (int v : gr[u]) {
if (comp[v] == -1) { comp[v] = cid; st.push_back(v); }
}
}
std::sort(sccs[cid].begin(), sccs[cid].end());
}
return sccs;
}
Example
Description
Run time analysis
. Two DFS passes over the graph plus building the transpose, each linear in the size of the graph.
Space analysis
. The transpose adjacency list is the same size as the original; the finish-order list, component label array, and explicit DFS stack each take .
Proof of correctness
Let be an SCC of the original graph and let be the vertex in with the largest finish time during the first DFS. We show the second DFS started from (on the transpose) visits exactly .
First, every vertex is reachable from in the transpose. That is because is reachable from in the original graph (they share an SCC), and reversing all edges swaps reachability. So every vertex of ends up in the component DFS rooted at .
Second, no vertex outside is reached. Suppose is reachable from in the transpose; then is reachable from in the original graph. Since has the largest first-pass finish time in , and we process vertices in decreasing finish-time order in the second pass, must have been assigned to a component before , unless and are in the same SCC — contradicting . A standard property of DFS finish times (the "white path" theorem, CLRS 22.4) formalises that if has a larger finish time than any vertex of its SCC, and is reachable from in the transpose, then shares 's SCC. Therefore the second-pass DFS tree rooted at is precisely .
Extensions
Applications
References
- Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms, 3rd ed., Chapter 22.5 (Strongly Connected Components).
- Sharir, Micha. "A strong-connectivity algorithm and its applications in data flow analysis." Computers and Mathematics with Applications, 7(1):67-72, 1981.
- cp-algorithms — Strongly connected components, Kosaraju