break sub task of parallel_for_each

Question

I have a big vector of items that are sorted based on one of their fields, e.g. a cost attribute, and I want to do a bit of processing on each of these items to find the maximum value of a different attribute... The constraint here is that we cannot use an item to calculate a maximum value if that item's cost exceeds some arbitrary price.

The single threaded for-loop looks like this:

auto maxValue = -FLT_MAX;
for(const auto& foo: foos) {

    // Break if the cost is too high.
    if(foo.getCost() > 46290) {
        break;
    }

    maxValue = max(maxValue, foo.getValue());
}

I've been able to somewhat convert this into a parallel_for_each. (Disclaimer: I'm new to PPL.)

combinable<float> localMaxValue([]{ return -FLT_MAX; });

parallel_for_each(begin(foos), end(foos), [&](const auto& foo) {

    // Attempt to early out if the cost is too high.
    if(foo.getCost() > 46290) {
        return; 
    }

    localMaxValue.local() = max(localMaxValue.local(), foo.getValue());
});

auto maxValue = localMaxValue.combine(
    [](const auto& first, const auto& second) { 
        return max<float>(first, second); 
    });

The return statement inside the parallel_for_each feels inefficient, since the lambda is still executed for every item. In this case it's quite possible that several sub tasks end up iterating over portions of the vector where every item is costed too high.

How can I take advantage of the fact that the vector is already sorted by cost?

I looked into using a cancellation token, but that approach seems incorrect: it would cancel all sub tasks of the parallel_for_each, which means I may get the wrong maximum value.

Is there something like a cancellation token that cancels only one specific sub task of the parallel_for_each, or is there a better tool than parallel_for_each in this case?

Jonathan · Answer 1 · 2015-08-25T06:46:23.543

If the vector is sorted by cost, then you only need to iterate over the items whose cost is lower than the cost limit.

If the cost limit is x, find an iterator to the first item whose cost is equal to or larger than x; you can use std::lower_bound for that. Then run your parallel_for_each from the beginning of the vector to the iterator you found.

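combinable<float> localMaxValue([]{ return -FLT_MAX; });

//I'm assuming foos is std::vector.
int cost_limit = 46290;

// Binary search for the first item whose cost is >= cost_limit.
// Items whose cost equals cost_limit are excluded here; std::upper_bound would include them.
auto it_end = std::lower_bound(foos.begin(), foos.end(), cost_limit, [](const auto& foo, int cost_limit)
{
    return foo.getCost() < cost_limit;
});

// Only process the items in front of it_end.
parallel_for_each(foos.begin(), it_end, [&](const auto& foo) {
    localMaxValue.local() = max(localMaxValue.local(), foo.getValue());
});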

auto maxValue = localMaxValue.combine(
    [](const auto& first, const auto& second) { 
        return max<float>(first, second); 
    });
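
For reference, here is a minimal standard-C++ sketch of the same idea without PPL, so it can be tried on any compiler. The Foo struct and the maxValueUpTo name are made up for illustration, since the question does not show the real type. It uses std::upper_bound rather than std::lower_bound so that items whose cost equals the limit are still counted, which matches the `cost > limit` break in the question's loop. The reduction is a plain sequential loop here; a PPL version would presumably run parallel_for_each over the same [begin, itEnd) range.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <vector>

// Hypothetical item type; the real Foo is not shown in the question.
struct Foo {
    int cost;
    float value;
    int getCost() const { return cost; }
    float getValue() const { return value; }
};

// Returns the largest value among items with cost <= costLimit,
// or -FLT_MAX if there is no such item.
// Precondition: foos is sorted by ascending cost.
float maxValueUpTo(const std::vector<Foo>& foos, int costLimit) {
    assert(std::is_sorted(foos.begin(), foos.end(),
        [](const Foo& a, const Foo& b) { return a.getCost() < b.getCost(); }));

    // First item whose cost is strictly greater than costLimit.
    // Note the argument order of the comparator: (value, element).
    auto itEnd = std::upper_bound(foos.begin(), foos.end(), costLimit,
        [](int limit, const Foo& foo) { return limit < foo.getCost(); });

    // Reduce only over the affordable prefix [begin, itEnd).
    float maxValue = -FLT_MAX;
    for (auto it = foos.begin(); it != itEnd; ++it) {
        maxValue = std::max(maxValue, it->getValue());
    }
    return maxValue;
}
```

The binary search costs O(log n) comparisons, so it is usually cheap next to the per-item work that follows.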

I could see how that would work well for this example; the practical problem I'm trying to solve is a bit different though and I'm not sure identifying the 'max cost' index prior to the parallel_for would end up saving time. Still worth a try though. Thanks for the suggestion. — user8709, Aug 25 '15 at 06:40
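
If locating the cut-off up front is not practical, another way to use the sort order is to split the vector into chunks yourself and let each chunk break on its own, which is roughly the "cancel only this sub task" behaviour asked about. Below is a rough sketch under some assumptions: it uses std::async from the standard library instead of PPL, and Item, chunkedMaxUpTo and chunkSize are hypothetical names chosen for the example. Whether it beats the lower_bound approach depends on the real workload and has not been measured.

```cpp
#include <algorithm>
#include <cfloat>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical item type standing in for the question's Foo.
struct Item {
    int cost;
    float value;
};

// Largest value among items with cost <= costLimit, or -FLT_MAX if none.
// Precondition: items is sorted by ascending cost.
float chunkedMaxUpTo(const std::vector<Item>& items, int costLimit,
                     std::size_t chunkSize) {
    if (chunkSize == 0) chunkSize = 1;

    std::vector<std::future<float>> partials;
    for (std::size_t start = 0; start < items.size(); start += chunkSize) {
        // Sorted input: if the first item of a chunk is too costly,
        // this chunk and every later one cannot contribute, so stop launching.
        if (items[start].cost > costLimit) break;

        const std::size_t stop = std::min(start + chunkSize, items.size());
        partials.push_back(std::async(std::launch::async,
            [&items, costLimit, start, stop] {
                float localMax = -FLT_MAX;
                for (std::size_t i = start; i < stop; ++i) {
                    // Early out for this chunk only; other chunks keep running.
                    if (items[i].cost > costLimit) break;
                    localMax = std::max(localMax, items[i].value);
                }
                return localMax;
            }));
    }

    // Combine the per-chunk maxima, like combinable::combine in PPL.
    float result = -FLT_MAX;
    for (auto& partial : partials) {
        result = std::max(result, partial.get());
    }
    return result;
}
```

Only the chunk that straddles the cost limit does a partial scan; chunks entirely past the limit are never started.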
