Sunday, January 7, 2018

TOTW: Binary Search

Hi, everyone. Let's talk about binary search algorithms.

Let's say we have an array of numbers, and that we're trying to see whether a certain value is in that array. Take the following array A as an example:

A[] = {1, 3, 5, 7, 9, 11, 13, 15, 17}

I want to find whether x = 7 is in this array. At first, we think, why not just go through every single value in the array and see if we find 7? That's a sure fire way to find whether our x value is in there. 

So we begin by comparing our x value to the first element of the array, which is 1. Is that it? Nope. Let's move to the second element, 3. Is that it? Still no. What about the third? No. The fourth? Yes! We found 7 in the array, so it is indeed there. We return the index of the element, which in this case is A[3], and we're done. 

What we've just described is a sequential search. This is the most basic search, where you recursively step through every single element of the array or other data structure and find the value you were looking for. However, as you might be able to tell, this is not very efficient. If we were looking for 17 in this array, we would have to step through every single element until then. If you were looking for 19, you'd step through the whole array and finally return that it's not in there after comparing x to every element. This means the most number of comparisons you can make in a sequential search is n, the amount of elements in the array. 

A sequential search algorithm is quite simple, though. Given an array A of length n and a search value x (and a counter value i), it looks something like this:

1. Initialize i to zero.
2. If A[i] = x, you're done the search. The program terminates successfully and returns i. 
3. If A[i] does not = x, increase i by 1. 
4. If i < n (that is to say, if there are elements remaining to be compared to), return to step 2. Otherwise, terminate and return -1. 

Written in c++, this would look like:

int sequentialSearch(A, n, x) {
          for(int i = 0; i < n; i++) {
                    if(A[i] == x)
                              return (i);
                    else
                              return (-1);
          }
}

However, like we said before, this isn't the best, fastest, most efficient way to do it. An algorithm called binary search might be better. 

Whereas in sequential search, we do not require an array to be sorted, in binary search, we do need it to be sorted in increasing or decreasing order. This is because in binary search, we start by comparing our value not to the first element, but to the middle one. Based on whether it is equal to, greater than, or lesser than that middle value, the range which our algorithm will look for the x value in changes. Let's look at an example.

For the following array A, we want to find x = 11:

A[] = {1, 3, 5, 7, 9, 11, 13}

First we look at the middle element. To know which element is the middle, we take the sum of the indexes of the first and last elements, or the lower and upper bounds, and divide by 2 to get the index of the middle element. In this case, we take 0 and 6, and divide by 2 to get 3. We go to A[3] and see that it is 9. 9 is lesser than our x value of 11, so we keep our upper bound as 13 but change our new lower bound to be 1 higher than the old middle value, which is A[4], or 9. Again we find the middle index value: (4+6)/2 = 5. We look at A[5] and see that it is 11. Yay! We found our value in the array. 

If that explanation was a little confusing, here is a great visual for what is happening: 
Enter a value to search for in the top left corner, indicate whether you want a small or large sample, and then pick a type of search. 

A binary search algorithm is a little bit more complicated than a sequential search algorithm. Given an array A which is in ascending order, and a search value x (and a counter value i), it is:

1. If middle element of array segment is desired value, then done
2. Else, if the middle element is too large, repeat binary search in first half of array segment
3. Else, if the middle element is too small, repeat binary search on the second half of array segment

More specifically,

int binarySearch(int A[], int first, int last, int x) {
          if (first > last)
                    return -1;
          middle = (first + last) / 2;
          if (A[middle] == x)
                    return middle;
          if (A[middle] < value)
                    return binarySearch(array, middle + 1, last, x)
          if (A[middle] > value)
                    return binarySearch(array, first, middle - 1, x)
}

What's happening in this code fragment is that, if the value at the middle element, which we calculate the index of, is not equal to x, we identify whether it is lesser or greater. If it is lesser, we want our new range to be to the left of the middle element, so we keep the first element as the first, but we change the last element of what our program is examining to be one element to the left of middle, which we say is middle minus 1. If x is greater than the element at the middle index, we want our new range to be to the right of the middle element, so we keep the last element as the last, but we change the first element of what our program is examining to be one element to the right of middle, which we say is middle plus 1. 

I encourage you to write these programs for yourself. It takes a few times of writing and running the program to figure out what really is going on. I wrote them in C++ because that is the language I am currently learning in my course work, but you can do the exact same in Java or any other computer language. Don't forget, an algorithm is just a series of steps you take, or a particular way to solve a problem. How you implement these steps is up to you. 

Please feel free to leave a comment about your code in another language, or about any questions or comments you may have. As always, you can reach out to me personally as well through my email.

Happy coding! See you next week with another cool topic. 

No comments:

Post a Comment