I am writing an image processing filter, and I want to speed up the computation using OpenMP. The structure of my pseudo-code looks like this:
for (every pixel in the image) {
    // do some stuff here
    for (every combination of parameters) {
        // do other stuff here and filter
    }
}
The code filters every pixel with each combination of parameters and chooses the optimal combination.
My question is which is faster: parallelizing the outer loop over the pixels across the processors, or visiting the pixels sequentially and parallelizing the inner loop over the parameter combinations?
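To make the two options concrete, here is a minimal sketch of what I have in mind. Everything in it is a placeholder I made up for illustration: `filter_cost` stands in for the real per-pixel filtering and scoring, the image is a flat array of doubles, and a "parameter combination" is just an integer index. It is not my actual filter.

```c
#include <float.h>

/* Hypothetical stand-in for "filter this pixel with parameter set p and
   return a score" (lower is better). The real filter would go here. */
static double filter_cost(double pixel, int p)
{
    double d = pixel - (double)p;
    return d * d;
}

/* Option A: parallelize the outer loop. The parallel region is entered
   once and each thread processes whole pixels independently. */
void best_params_outer(const double *img, int *best, int n_pixels, int n_params)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n_pixels; ++i) {
        double best_cost = DBL_MAX;
        int best_p = 0;
        for (int p = 0; p < n_params; ++p) {
            double c = filter_cost(img[i], p);
            if (c < best_cost) { best_cost = c; best_p = p; }
        }
        best[i] = best_p;   /* each thread writes a distinct element */
    }
}

/* Option B: visit the pixels sequentially and parallelize the parameter
   loop. A parallel region is entered once per pixel, and the per-thread
   minima have to be merged afterwards. */
void best_params_inner(const double *img, int *best, int n_pixels, int n_params)
{
    for (int i = 0; i < n_pixels; ++i) {
        double best_cost = DBL_MAX;
        int best_p = 0;
        #pragma omp parallel
        {
            double local_cost = DBL_MAX;   /* private to each thread */
            int local_p = 0;
            #pragma omp for nowait
            for (int p = 0; p < n_params; ++p) {
                double c = filter_cost(img[i], p);
                if (c < local_cost) { local_cost = c; local_p = p; }
            }
            /* Merge the thread-local minimum into the shared result;
               ties go to the lowest parameter index, as in option A. */
            #pragma omp critical
            {
                if (local_cost < best_cost ||
                    (local_cost == best_cost && local_p < best_p)) {
                    best_cost = local_cost;
                    best_p = local_p;
                }
            }
        }
        best[i] = best_p;
    }
}
```

Both functions should produce the same `best` array; they differ only in where the work is split. Compiled without OpenMP support, the pragmas are ignored and both run sequentially.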
I think the question could be stated more generally: which is faster, giving each thread a large amount of work, or splitting the work into many small pieces that each involve only a few operations?
I don't care about the implementation details for now; I think I can handle them with my previous experience with OpenMP. Thanks!