I am studying iterators and have been stuck for 3 days trying to figure out why we use:

auto mid = text.begin() + (end - beg) / 2;

Code:

#include <cstdlib>
#include <vector>
using std::vector;

int main()
{
    vector<int> text{ 1,2,3,4,5,6,7,8,9,10 };
    int sought = 3;
    // text must be sorted in ascending order
    // beg and end will denote the range we're searching
    auto beg = text.begin(), end = text.end();
    auto mid = text.begin() + (end - beg) / 2; // original midpoint
    // while there are still elements to look at and we haven't yet found sought
    while (mid != end && *mid != sought) {
        if (sought < *mid) // is the element we want in the first half?
            end = mid; // if so, adjust the range to ignore the second half
        else // the element we want is in the second half
            beg = mid + 1; // start looking with the element just after mid
        mid = beg + (end - beg) / 2; // new midpoint
    }

    system("pause");
}

Why do we write

auto mid = text.begin() + (end - beg) / 2;

and not:

auto mid = text.begin() + text.size() / 2;

Please help.

Yavar
jibzoiderz
  • Do ***we*** use "(end - begin)/2"? Where did you find this? – Wolf Jul 25 '16 at 06:06
  • @Wolf - C++ Primer, 5th edition. It's a little misleading, as the book says in chapter 3.4 that this is a "classic algorithm", so I assumed this was a common occurrence (correct me if I'm wrong). – jibzoiderz Jul 25 '16 at 06:12
  • The reason this is confusing is that the example implements the binary search inside the main function. If it were properly extracted into a function that takes only an iterator range to search, it would be clear why you can't call size on the container: you have no way of referring to the container. – Sebastian Redl Jul 25 '16 at 07:48
  • [Overflow issues when implementing math formulas](http://stackoverflow.com/q/10882368/995714) – phuclv Aug 02 '16 at 07:11

2 Answers

This is done to avoid the overflow that can happen when two very large integers are added: the sum may exceed the maximum value the integer type can hold and produce wrong results.

Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken

From the blog:

So what's the best way to fix the bug? Here's one way:
    int mid = low + ((high - low) / 2);

Probably faster, and arguably as clear is:
    int mid = (low + high) >>> 1;

In C and C++ (where you don't have the >>> operator), you can do this:
    mid = ((unsigned int)low + (unsigned int)high) >> 1;
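
To make the first fix concrete, here is a minimal sketch of the difference form for plain int indexes. The name safe_mid and its precondition are my own assumptions for illustration; they come from neither the blog nor the book. Note that with iterators the question never arises in this shape: two vector iterators cannot be added at all, so beg + (end - beg) / 2 is the only form that compiles.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not from the blog): midpoint of two non-negative
// indexes, computed without ever forming low + high.
int safe_mid(int low, int high) {
    assert(0 <= low && low <= high);  // precondition assumed by this sketch
    // high - low is at most high, so it cannot overflow; adding half of it
    // to low gives a result no larger than high.
    return low + (high - low) / 2;
}
```

With low = INT_MAX - 2 and high = INT_MAX, the naive (low + high) / 2 would overflow a signed int, which is undefined behavior in C++, while safe_mid stays in range.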
Yavar
  • Pointers don't support addition at all; big integer indexes seldom overflow (the size of the array is usually much smaller than the limit of integers). – vincent163 Jul 25 '16 at 09:38
  • Well I am talking about a general technique and not specifically related to pointers. The question suggests the OP is only asking why not (beg + end)/2? Please read the link (Google Research Blog) in my answer to understand more in detail. – Yavar Jul 25 '16 at 10:33
Binary search is traditionally written this way. The form helps readers follow the algorithm, because a standard binary search uses only three positions: start, end, and middle.

You could use size() rather than end - start before the loop, but inside the while loop you have to use end - start, because the range keeps shrinking and size() no longer describes it. For consistency, avoid size() in both places.
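
One way to see this is to pull the search out of main into a function that receives only an iterator range. The sketch below is my own illustration, not code from the book; find_sorted is a hypothetical name, and it assumes random-access iterators over a range sorted in ascending order. Inside the function there is no container to call size() on, so end - beg is the only way to measure the range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: binary search over the half-open range [beg, end).
// Returns an iterator to an element equal to sought, or the original end
// if no such element exists.
template <typename It, typename T>
It find_sorted(It beg, It end, const T& sought) {
    assert(std::is_sorted(beg, end));  // precondition assumed by this sketch
    const It last = end;               // remember the caller's end
    while (beg != end) {
        // The range shrinks on every pass, so the midpoint must be
        // computed from the current beg and end, not from a fixed size.
        It mid = beg + (end - beg) / 2;
        if (*mid == sought)
            return mid;
        if (sought < *mid)
            end = mid;       // keep searching the first half
        else
            beg = mid + 1;   // keep searching the second half
    }
    return last;             // not found
}
```

Calling find_sorted(text.begin(), text.end(), 3) on the sorted vector from the question returns an iterator to the 3, and the same function works unchanged on a sub-range such as text.begin() + 5 to text.end(), where text.size() would give the wrong length.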

vincent163