Problem

Given a sorted sequence, find the exact position of a given element.

A simple approach is to do a linear search (check each element one by one). The time complexity of a linear search is O(n). Another approach to perform the same task in O(log n) is using Binary Search.

Considering a sorted list that contains 1 million items:
- If we perform a linear search we could have to make 1 million iterations and 1 million comparisons
- With a binary search we will find it for sure with no more than 20 iterations and 20 comparisons (asymptotically 5000 times faster!)

Applications

  • Find data index, more importantly within huge sequences database.
  • Key idea in binary search trees, a fundamental data structure.
  • It is used to estimate the square roots of numbers (e.g. Newton's method).

Homonyms

  • Dichotomic search
  • Half-interval search
  • Logarithmic search
  • Binary chop

Binary search is a decrease and conquer search algorithm that can be used on a sorted array. It operates by determining whether the search value is less or higher than the middle value. It then loops on the lower or upper half of the sequence until either the value is found or not (half is an empty sequence).

Steps

  • 1. Compare search value with the middle element.
  • 2. If they match, we are done: return the middle index.
  • 3. If the search value is higher: loop within the upper half sequence.
  • 4. If the search value is smaller: loop within the lower half sequence.


Code

Index BinarySearch(Iterator begin, Iterator end, const T key)
{
  auto index = -1;
  auto middle = begin + distance(begin, end) / 2;

  // While there is still objects between the two iterators and no object has been found yet
  while(begin < end && index < 0)
  {
    if (IsEqual(key, middle))   // Object Found: Retrieve index
      index = position(middle);
    else if (key > middle)      // Search key within upper sequence
      begin = middle + 1;
    else                        // Search key within lower sequence
      end = middle;

    middle = begin + distance(begin, end) / 2;
  }

  return index;
} 

As the analysis visualization is quite trivial: it has been included within the header image carousel of this article.

Time

Best

The best configuration occurs when the first middle point selected is the search value; leading to a complexity of O(1).

Average / Worst

The worst case scenario occurs when the search value is not in the sequence.
After the first look-up, we loop the search on the lower/upper half and so on. Let us put this into a recurrence relation: $$T(n) = O(1) + T(\frac{n}{2})$$ Using the master theorem for divide-and-conquer recurrences: T(n) = O(log n).

Space

Binary search does not use any buffer nor does make any recursion. Thus, it requires O(1) space in all case.