Skip to main content

Binary Search

Binary search is a search algorithm that finds the first (lower_bound) or last (upper_bound) position of a target value within a sorted array.

It is the most common used search algorithm in computer science, if you find any O(nlogn)O(n\log n) algorithm, it is likely to be a binary search algorithm. You should always consider using binary search first especially when there is "test and trial" pattern in the problem and the required solution is a single number.

Binary search finds a target in a sorted array by halving the candidate interval on every step. More generally, it locates the first index of a sorted array at which some monotone predicate pp flips from false to true: as long as pp is monotone over the search space, you can binary search it even when the space is abstract (answers to an optimization problem, a real interval, the height of a tree, etc.).

The template below uses the half-open interval [lo,hi)[lo, hi) with the termination condition lo < hi and the shrinking rules hi = mid or lo = mid + 1. This layout is robust against off-by-one bugs: the loop only terminates when lo == hi, and that single index is the answer (or len(nums) if no element satisfies the predicate).

For a classic sorted-array lookup, the predicate is "the element at this index is at least the target". The returned index is either the position of the target or the insertion index where it would be placed, so a final equality check recovers the "found / not found" answer.

Codes

Original compact template:

public int BinarySearch(int[] nums, int target) {
int i=0;
int j=nums.length;
while (i<j){
if(nums[(i+j)/2]>target){
j=(i+j)/2;
}else if(nums[(i+j)/2]<target){
i=(i+j)/2+1;
}else{
return (i+j)/2;
}
}
return -1;
}

Example

Loading Python runner...

Description

Run time analysis

O(logn)O(\log n). Each iteration halves the size of the interval hilohi - lo, so the loop runs at most log2n\lceil \log_2 n \rceil times and each iteration is constant work.

Space analysis

O(1)O(1) auxiliary. Only two index variables are kept; the iterative form needs no call stack.

Proof of correctness

Maintain the invariant "the first index ii with p(i)p(i) true lies in the half-open interval [lo,hi)[lo, hi), or no such index exists and the answer is nn". Initially lo=0lo = 0 and hi=nhi = n, so the invariant holds trivially. At each step, let mm be the midpoint. If p(m)p(m) is true, by monotonicity every index at mm or above is also true, so shrinking to [lo,m)[lo, m) by hi = mid preserves the invariant. If p(m)p(m) is false, by monotonicity every index at mm or below is also false, so shrinking to [m+1,hi)[m+1, hi) by lo = mid + 1 preserves the invariant. Because hilohi - lo strictly decreases on every iteration, the loop terminates with lo=hilo = hi, at which point the invariant forces lolo to be the first true index (or nn).

Extensions

Applications

Leetcode 2528

It takes a while to realize that binary search is all I need. I tried to use segment tree to solve this problem, but it is too complex to implement and the time complexity is not guaranteed to be O(nlogn)O(n\log n) in the insertion of queries.

Intuition

I see I only need to return the minimum power of the city that can be powered by the given cities.

Each test solution can be verified using a single iteration over the cities.

Solution
class Solution:
def maxPower(self, stations: List[int], r: int, k: int) -> int:
def canPower(power: int,k: int) -> bool:
# stored the power range of the latest power station constructed
# formated as (expiration (inclusive), powersupply)
power_range=[]
additional_power=0
quota=0
for i,e in enumerate(stations):
# at most one entry is popped each time
if power_range and power_range[0][0]<i:
additional_power -= power_range.pop(0)[1]
if e + additional_power < power:
required_power = power - e - additional_power
k -= required_power
power_range.append((i+2*(r+1)-1, required_power))
if k < 0:
return False
additional_power += required_power
return True
# first updated the true power of the cities
n=len(stations)
cur_power=sum(stations[:r+1])
true_power=[0]*n
for i in range(n):
true_power[i]=cur_power
if i-r-1>=0:
cur_power -= stations[i-r-1]
if i+r<n:
cur_power += stations[i+r]
power=true_power
lo, hi = min(stations), max(stations)+k+1
while lo < hi:
mid = (lo + hi) // 2
if canPower(mid,k):
hi = mid
else:
lo = mid + 1
return lo

References

  • Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms, 3rd ed., Section 2.3.
  • cp-algorithms — Binary search
  • Bentley, Jon. Programming Pearls, 2nd ed., Column 4 ("Writing Correct Programs").