**Problem**

I’ve heard that it’s possible to implement binary search over a doubly-linked list in *O*(*n*) time. Accessing a random element of a doubly-linked list takes *O*(*n*) time, and binary search accesses *O*(*log n*) different elements, so shouldn’t the runtime be *O*(*n log n*) instead?

It’s technically correct to say that the runtime of binary search on a doubly-linked list is *O*(*n log n*), but that’s not a tight upper bound. Using a slightly better implementation of binary search and a more clever analysis, it’s possible to get binary search to run in time *O*(*n*).

The basic idea behind binary search is the following:

- If the list is empty, the element being searched for doesn’t exist.
- Otherwise:
  - Look at the middle element.
  - If it matches the element in question, return it.
  - If it’s bigger than the element in question, discard the back half of the list.
  - If it’s smaller than the element in question, discard the front half of the list.
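As a quick sketch, here is that procedure on a plain Python list, where indexing takes *O*(1) time (the function name `binary_search` is just illustrative):

```python
def binary_search(items, target):
    """Classic binary search over a sorted sequence; returns an index or None."""
    lo, hi = 0, len(items)          # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid              # found the element in question
        elif items[mid] > target:
            hi = mid                # discard the back half
        else:
            lo = mid + 1            # discard the front half
    return None                     # the range is empty: no such element
```

On a linked list the logic stays exactly the same; only the cost of reaching the middle element changes, which is what the rest of the analysis is about.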

A **naive implementation** of binary search on a doubly-linked list would work by computing the indices to look up on each iteration (just like in the array case), then accessing each one by starting at the front of the list and scanning forward the appropriate number of steps. Since each of the *O*(*log n*) lookups scans *O*(*n*) nodes from the front, this is indeed very slow: *O*(*n log n*) in total.
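A sketch of that naive approach (the names `Node`, `from_list`, and `naive_binary_search` are illustrative, not from the original): every lookup restarts at the head, so each of the *O*(*log n*) iterations pays *O*(*n*) for the scan.

```python
class Node:
    """A node in a doubly-linked list."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def from_list(values):
    """Build a doubly-linked list from an iterable; return its head (or None)."""
    head = prev = None
    for v in values:
        node = Node(v)
        if prev is None:
            head = node
        else:
            prev.next, node.prev = node, prev
        prev = node
    return head

def naive_binary_search(head, length, target):
    """Binary search over a sorted doubly-linked list, the slow way."""
    lo, hi = 0, length                  # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        node = head                     # restart at the front every time...
        for _ in range(mid):
            node = node.next            # ...and scan forward mid steps: O(n)
        if node.value == target:
            return node
        elif node.value > target:
            hi = mid                    # discard the back half
        else:
            lo = mid + 1                # discard the front half
    return None
```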

We can **speed this up** by a factor of *Θ(log n)* by being more clever with our approach. The reason the previous algorithm is slow is that every time we need to look up an element, we start the search from the beginning of the list. However, we don’t need to do this. After looking up the middle element the first time, we’re already in the middle of the list, and we know that the next lookup we’re going to make will be either at position n / 4 or 3n / 4, which is only distance n / 4 from where we left off (compared to n / 4 or 3n / 4 if we start from the beginning of the list). What if we just trekked out from our stopping position (n / 2) to the next position, rather than restarting at the front of the list?

Here’s our new algorithm. Begin by scanning to the middle of the list, which requires n / 2 steps. Then, determine whether to visit the element at the middle of the first half of the list or the middle of the second half of the list. Reaching that element from position n / 2 only requires n / 4 total steps. From there, going to the midpoint of the quarter of the list containing the element takes only n / 8 steps, and going from there to the midpoint of the eighth of the list containing the element takes only n / 16 steps, etc. This means that the total number of steps made is given by

n / 2 + n / 4 + n / 8 + n / 16 + …

= n (1/2 + 1/4 + 1/8 + …)

≤ n

This follows from the fact that the sum of the infinite geometric series 1/2 + 1/4 + 1/8 + … is 1. Therefore, the total work done in the worst case is only *Θ(n)*, which is much better than the *Θ(n log n)* worst case from before.
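The sliding-cursor algorithm described above might look like this in Python (a sketch under the same assumptions as before; `Node`, `from_list`, and `fast_binary_search` are illustrative names, and the list is assumed sorted):

```python
class Node:
    """A node in a doubly-linked list."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def from_list(values):
    """Build a doubly-linked list from an iterable; return its head (or None)."""
    head = prev = None
    for v in values:
        node = Node(v)
        if prev is None:
            head = node
        else:
            prev.next, node.prev = node, prev
        prev = node
    return head

def fast_binary_search(head, length, target):
    """Binary search over a sorted doubly-linked list with a sliding cursor.

    Instead of restarting at the head, the cursor walks from the previous
    midpoint to the next one. The walk lengths are n/2 + n/4 + n/8 + ... <= n,
    so the whole search runs in O(n) time.
    """
    lo, hi = 0, length                  # search the half-open range [lo, hi)
    node, pos = head, 0                 # the cursor and its current index
    while lo < hi:
        mid = (lo + hi) // 2
        while pos < mid:                # walk forward to the midpoint...
            node, pos = node.next, pos + 1
        while pos > mid:                # ...or backward, using the prev links
            node, pos = node.prev, pos - 1
        if node.value == target:
            return node
        elif node.value > target:
            hi = mid                    # discard the back half
        else:
            lo = mid + 1                # discard the front half
    return None
```

Note that the backward walks are why the list being doubly linked matters: after discarding the back half, the next midpoint lies behind the cursor, and we reach it through the `prev` pointers.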