binary search nearest match with last occurrence

Question

I am implementing an efficient algorithm to find the last occurrence of either the key or, if the key is absent, its nearest match (the upper bound).

So far, I have this:

long bin_search_closest_match_last_occurance ( long  * lArray, long sizeArray, long lnumber)
{
    long left, right, mid, last_occur;

    left = 0;
    right = sizeArray - 1;
    last_occur = -1;

    while ( left <= right )
    {
        mid = ( left + right ) / 2;

        if ( lArray[mid] == lnumber  )
        {
            last_occur = mid;
            left = mid +1;
        }

        if ( lArray[mid] > lnumber ) 
            right = mid - 1;
        else 
            left = mid + 1;
    }
    return last_occur!=-1?last_occur:mid;
}

Take the array {0,0,1,5,9,9,9,9} and the key 6. The function should return index 7, but mine returns 4.

Please note that I do not want to iterate linearly to the last matching index.

One solution I have in mind is to add start and end index parameters to the function and run a second binary search inside it, from the found upper bound to the end of the array (only if I don't find an exact match, i.e. last_occur == -1).

Is there a better or cleaner way to implement it?

Not sure why it should return 7. It asks to find the last occurrence of the key, OR the nearest match, so if the key is not in the list (which is your example), returning 4 should be just fine, as I understand the task description. — amit, Nov 25 '14 at 11:36
You open a { right after your while instruction and you never close it. — Daniel Daranas, Nov 25 '14 at 11:44
I haven't the slightest idea why it could be useful, but the solution is to run the upper_bound search twice. — n. m. could be an AI, Nov 25 '14 at 12:09
@n.m. One scenario where it is possible is where elements are not integers, but objects, and you search according to comparator, but want the last object that matches the comparator (because 'identical' objects according to this comparator are ordered in some significant way as well and you need a specific one of them). — amit, Nov 25 '14 at 12:49
@amit I'm not assuming they are integers. Find the first element larger than x, call it y. Then find first element larger than y, and return the previous element. — n. m. could be an AI, Nov 25 '14 at 13:03

j_random_hacker · Accepted Answer · 2014-11-25T14:31:45.500

n.m.'s 2-search approach will work, and it keeps the optimal time complexity, but it's likely to increase the constant factor by around 2, or by around 1.5 if you begin the second search from where the first search ended.
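A minimal sketch of the two-search idea in C, assuming a sorted ascending array. The helper upper_bound_from and the function name last_occurrence_or_upper are hypothetical. The exact-match check and the -1 result when nothing is larger than the key are assumptions that n.m.'s comment does not spell out. The second search starts where the first one ended.

```c
/* Hypothetical helper: first index in [lo, n) whose element is > key,
   or n if there is no such element. Assumes a[] is sorted ascending. */
static long upper_bound_from(const long *a, long lo, long n, long key)
{
    long hi = n;
    while (lo < hi) {
        long mid = lo + (hi - lo) / 2;
        if (a[mid] > key)
            hi = mid;        /* a[mid] is a candidate; look further left */
        else
            lo = mid + 1;    /* a[mid] <= key; the answer lies to the right */
    }
    return lo;
}

/* Last occurrence of key if present; otherwise the last occurrence of the
   smallest value greater than key; -1 if no element is >= key (assumption). */
long last_occurrence_or_upper(const long *a, long n, long key)
{
    long ub = upper_bound_from(a, 0, n, key);

    if (ub > 0 && a[ub - 1] == key)
        return ub - 1;       /* exact match: element just before the upper bound */
    if (ub == n)
        return -1;           /* nothing larger than key exists */

    /* Second search, resumed from ub: skip past every copy of a[ub]. */
    return upper_bound_from(a, ub, n, a[ub]) - 1;
}
```

With the array {0,0,1,5,9,9,9,9} from the question, this sketch returns 7 for the key 6 and 3 for the key 5.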

If instead you take an "ordinary" binary search that finds the first instance of lnumber (or, if it doesn't exist, a lower bound), and change it so that the algorithm logically "reverses" the array by changing every array access lArray[x] to lArray[sizeArray - 1 - x] (for any expression x), and also "reverse" the ordering by changing the > lnumber test to < lnumber, then only a single binary search is needed.

The only array accesses this algorithm actually performs are two lookups to lArray[mid], which an optimising compiler is very likely to evaluate only once if it can prove that nothing will change the value in between the accesses (this might require adding restrict to the declaration of long * lArray; alternatively, you could just load the element into a local variable and test it twice instead).

Either way, if only a single array lookup per iteration is needed, then changing the index from mid to sizeArray - 1 - mid will add just 2 extra subtractions per iteration (or just 1 if you --sizeArray before entering the loop), which I expect will not increase the constant nearly as much as n.m.'s approach. Of course, as with anything, if performance is critical then test it; and if it's not, then don't worry too much about saving microseconds.

You will also need to "reverse" the return value:

return last_occur!=-1?last_occur:sizeArray - 1 - mid;
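Put together, one possible reading of the mirrored search looks like the sketch below. It is not the answerer's exact code: the function name and the local variables last and value are hypothetical, the array is assumed to be sorted ascending, and the element is loaded once into a local variable as suggested above.

```c
/* Sketch only: binary search over a mirrored view of lArray.
   Mirrored index m corresponds to original index last - m, so the
   mirrored view is sorted descending and its first match is the
   last match in the original array. */
long bin_search_last_occurrence_mirrored(const long *lArray, long sizeArray,
                                         long lnumber)
{
    long left = 0;
    long right = sizeArray - 1;
    long mid = 0;
    long last_occur = -1;
    long last = sizeArray - 1;   /* precomputed once, saves a subtraction per iteration */

    while (left <= right) {
        mid = left + (right - left) / 2;
        long value = lArray[last - mid];   /* single array load per iteration */

        if (value == lnumber) {
            last_occur = last - mid;       /* record the original index */
            right = mid - 1;               /* earlier in the mirrored view = later in the original */
        } else if (value < lnumber) {
            right = mid - 1;               /* comparison reversed: larger values come first */
        } else {
            left = mid + 1;
        }
    }
    /* No exact match: fall back to the mirrored index of the last probe. */
    return last_occur != -1 ? last_occur : last - mid;
}
```

When the key is present, this returns its last occurrence in a single pass, for example index 7 for the key 9 in {0,0,1,5,9,9,9,9}. When the key is absent, the fallback is only the position of the last probe, so it is worth testing that case against whatever "nearest match" definition you need.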
