I have an array of n items of type T and a categorization function f(t) that assigns each item a category number from 0 to k-1, where k is the number of categories. The goal is to divide the array into k contiguous segments, one per category, and rearrange the items so that each one ends up in its category's segment.

With separate input and output arrays I could do it in O(n), but I need to do it in place (i.e. using swaps as the basic operation) and, if possible, with a parallelizable algorithm.
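For reference, the out-of-place O(n) version could look roughly like the sketch below: count the items per category, turn the counts into segment start offsets, then copy each item to the next free slot of its segment. The name `partition_out_of_place` and the returned `starts` list are illustrative choices, not part of the original problem statement.

```python
def partition_out_of_place(items, f, k):
    """Return (out, starts): items grouped by category, and each segment's start index.

    Assumes f(t) returns an integer in range(k). Uses O(n + k) extra space.
    """
    # Pass 1: count how many items fall into each category.
    counts = [0] * k
    for t in items:
        counts[f(t)] += 1

    # Prefix sums give the start index of each category's segment.
    starts = [0] * k
    for c in range(1, k):
        starts[c] = starts[c - 1] + counts[c - 1]

    # Pass 2: copy each item into the next free slot of its segment.
    out = [None] * len(items)
    next_free = list(starts)
    for t in items:
        c = f(t)
        out[next_free[c]] = t
        next_free[c] += 1
    return out, starts
```

Both passes are linear, so the total is O(n + k); the cost is the second array, which is exactly what the in-place requirement rules out.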

One idea would be to handle one segment after another: first swap all 0's into a segment at the beginning, [0, i0], then all 1's (starting after i0) into a new segment after that, and so on. This would be O(n * k) (with the scanned range getting smaller on each pass), but it is not parallelizable.
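A minimal sketch of this segment-by-segment idea, assuming f(t) returns an integer in range(k); the function name `partition_by_passes` and the returned list of segment end indices are made up for illustration:

```python
def partition_by_passes(items, f, k):
    """Group items by category in place, one category per pass.

    Returns a list of k exclusive end indices, one per category segment.
    Assumes k >= 1 and that f(t) is an integer in range(k).
    """
    ends = []
    start = 0  # everything before `start` is already in its final segment
    for c in range(k - 1):  # the last category falls into place by itself
        for i in range(start, len(items)):
            if f(items[i]) == c:
                # Swap the item into the growing segment for category c.
                items[start], items[i] = items[i], items[start]
                start += 1
        ends.append(start)
    ends.append(len(items))
    return ends
```

Each pass scans only the part of the array that is not yet placed, which is where the "n getting smaller" comes from, but pass c cannot start before pass c-1 has fixed its boundary, so the passes are inherently sequential.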

Another way would be to use a sorting algorithm in O(n log n) that may be parallelizable, but this is likely not optimal because most items compare as equal.

My question is: what would be a good approach for this problem, and what is this problem called in the literature?

tmlen
  • Here's a partial duplicate (it doesn't discuss parallelization): http://stackoverflow.com/questions/15682100/sorting-in-linear-time-and-in-place. One way to parallelize is to split the array into pieces, arrange each piece, and then do an in-place parallel merge (that last step is not O(n)). Hope that gives you some search keys. – rici Oct 19 '14 at 23:03

1 Answer

As a quick note, this problem is related to, but not exactly the same as, the Dutch national flag problem. In that problem, you have an array of balls in three different colors (red, white, and blue), and the goal is to reorder the elements so that red comes first, then white, then blue.

Using ideas from the Dutch national flag problem, I think you can solve this relatively efficiently and in place. For example, you may want to use a quicksort variant that's specifically designed to handle duplicate elements. The Bentley-McIlroy 3-way partitioning algorithm, for example, was designed for inputs with many duplicate keys: its partitioning step splits the elements into three groups (less than the pivot key, equal to it, and greater than it) and then recursively sorts only the "less" and "greater" groups. If the array contains only k distinct values, the runtime will be O(n log k) in expectation, since each recursive call is made on a subarray with roughly half as many distinct keys in it. This isn't O(n), but it does work in place and it parallelizes really well (have different threads handle each subarray).
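As a rough illustration, here is a sketch of a three-way quicksort that compares category numbers instead of the items themselves. Note that it uses the simpler Dijkstra-style (Dutch national flag) partitioning loop as a stand-in rather than the actual Bentley-McIlroy scheme, and that the name `three_way_category_sort` and the random pivot choice are assumptions made for this example.

```python
import random

def three_way_category_sort(items, f, lo=0, hi=None):
    """Group items[lo:hi] by category f(t) in place using 3-way quicksort."""
    if hi is None:
        hi = len(items)
    while hi - lo > 1:
        # Pick the category of a random element as the pivot key.
        pivot = f(items[random.randrange(lo, hi)])
        lt, i, gt = lo, lo, hi
        # Invariant: items[lo:lt] < pivot, items[lt:i] == pivot,
        # items[gt:hi] > pivot, items[i:gt] not yet examined.
        while i < gt:
            c = f(items[i])
            if c < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif c > pivot:
                gt -= 1
                items[gt], items[i] = items[i], items[gt]
            else:
                i += 1
        # The "equal" block items[lt:gt] is final. The two remaining ranges
        # are disjoint, so they could be handed to different threads.
        three_way_category_sort(items, f, lo, lt)
        lo = gt  # continue with the "greater" range in this loop
```

The two sub-ranges left after each partitioning step never overlap, which is what makes it safe to process them concurrently; this sketch just runs them one after the other.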

templatetypedef