Skip to main content

KD Tree

A kd-tree is a binary space partitioning tree that recursively splits a set of kk-dimensional points by alternating along the coordinate axes. At depth dd the split axis is dmodkd \bmod k, so in the plane the root splits by xx, its children split by yy, the grandchildren split by xx again, and so on. The split value is usually chosen as the median of the current point set so the tree stays balanced.

Once built, the tree answers orthogonal range queries by pruning: whenever the query rectangle is disjoint from a subtree's bounding region, the whole subtree is skipped; whenever the bounding region is fully contained, every point in the subtree is reported. Only the "boundary" subtrees need to be explored recursively, and a classical analysis of de Berg et al. bounds the number of visited nodes by O(n+m)O(\sqrt n + m) in two dimensions, where mm is the output size.

Codes

def solve(xs):
points, queries = xs

def build(idx, depth):
if not idx:
return None
axis = depth % 2
idx.sort(key=lambda i: points[i][axis])
mid = len(idx) // 2
return {
"i": idx[mid],
"axis": axis,
"left": build(idx[:mid], depth + 1),
"right": build(idx[mid + 1:], depth + 1),
}

root = build(list(range(len(points))), 0)

def query(node, x1, y1, x2, y2, out):
if node is None:
return
x, y = points[node["i"]]
if x1 <= x <= x2 and y1 <= y <= y2:
out.append(node["i"])
axis = node["axis"]
lo, hi = (x1, x2) if axis == 0 else (y1, y2)
val = points[node["i"]][axis]
if lo <= val:
query(node["left"], x1, y1, x2, y2, out)
if val <= hi:
query(node["right"], x1, y1, x2, y2, out)

result = []
for x1, y1, x2, y2 in queries:
out = []
query(root, x1, y1, x2, y2, out)
result.append(sorted(out))
return result

Example

Loading Python runner...

Description

Run time analysis

Build is O(nlogn)O(n \log n) and each 2D orthogonal range report runs in O(n+m)O(\sqrt n + m) where mm is the number of reported points. The square-root bound follows from counting how many subtrees at alternating levels can straddle one side of the query rectangle.

Space analysis

O(n)O(n). The tree stores each input point once and the pointer overhead is linear in the number of nodes.

Proof of correctness

Every reported point satisfies the query rectangle because we only push a point into the output after an explicit in-range test. Conversely, for a point pp inside the rectangle, induction on depth shows the recursion reaches the leaf of pp: at each level the splitting coordinate of pp lies in the query interval on that axis, so at least one of the two "lo/hi" guards fires and the search descends into the side containing pp. See de Berg et al., Chapter 5, for the full pruning argument and the O(n)O(\sqrt n) query bound.

Extensions

Applications

References

  • de Berg, Cheong, van Kreveld, Overmars. Computational Geometry: Algorithms and Applications, 3rd ed., Chapter 5 (Orthogonal Range Searching).
  • Bentley, J. L. "Multidimensional binary search trees used for associative searching." Communications of the ACM, 1975.
  • cp-algorithms — K-d tree