5

I have N integers, for example 3, 1, 4, 5, 2, 8, 7. There may be some duplicates. I want to divide this sequence into contiguous subsequences such that we can form from them non-decreasing sequence. How to calculate minimal number of cuts? For the example mentioned above, the answer is 6, because we can partition this sequence into {3}, {1}, {4, 5}, {2}, {7}, {8} and then form {1, 2, 3, 4, 5, 7, 8}. What is the fastest way to do this?

Does anyone know how to solve it assuming that some numbers may be equal?

user128409235
  • 239
  • 1
  • 11
  • Why is the question tagged as graph-theory or multiset? If you think, for example, that graphs can be useful here, please share your findings with us. As it stands, your question shows no effort or research – Armen Tsirunyan Nov 28 '15 at 15:29
  • I tagged it as graph theory because many hard problems which does not look as concerned with graphs, have solutions in a graph theory. I have no idea how to solve this problem. – user128409235 Nov 28 '15 at 15:32
  • @Paul No it is not. Take a look at this example: 3, 1, 4, 5, 2, 8, 7. Your algorithm would partition it into {3}, {1, 4, 5}, {2, 8}, {7}, but we cant make a non-decreasing seqeunce out of these pieces. – user128409235 Nov 28 '15 at 15:34
  • 1
    This could be considered a Decision Problem in that you can reformulate the question as: Is this the minimal number of cuts I can make on this sequence? This is probably NP-Complete in this case but I can see some psuedo-polynomial dynamic programming possibilities. – Lucas Nov 28 '15 at 15:43

2 Answers2

2

I would cut the array into non-decreasing segments at points where the values decrease, and then use these segments as input into a (single) merge phase - as in a sort-merge - keeping with the same segment, where possible, in the case of ties. Create additional locations for cuts when you have to switch from one segment to another.

The output is sorted, so this produces enough cuts to do the job. Cuts are produced at points where the sequence decreases, or at points where a gap must be created because the original sequence jumps across a number present elsewhere - so no sequence without all of these cuts can be rearranged into sorted order.

Worst case for the merge overhead is if the initial sequence is decreasing. If you use a heap to keep track of what sequences to pick next then this turns into heapsort with cost n log n. Handle ties by pulling all occurrences of the same value from the heap and only then deciding what to do.

mcdowella
  • 19,301
  • 2
  • 19
  • 25
1

This approach works if the list does not contain duplicates. Perhaps those could be taken care of efficiently first.

We can compute the permutation inversion vector in O(n * log n) time and O(n) space using a Fenwick tree. Continuous segments of the vector with the same number can represent sections that need not be cut. Unfortunately, they can also return false positives, like,

Array: {8,1,4,5,7,6,3}
Vector: 0,1,1,1,1,2,5

where the 6 and 3 imply cuts in the sequence, [1,4,5,7]. To counter this, we take a second inversion vector representing the number of smaller elements following each element. Continuous segments parallel in both vectors need not be cut:

Array:  {8,1,4,5,7,6,3,0}
Vector:  0,1,1,1,1,2,5,7  // # larger preceding
Vector:  7,1,2,2,3,2,1,0  // # smaller following
            |---|  // not cut

Array:   {3,1,4,5,2,8,7}
Vectors:  0,1,0,0,3,0,1
          2,0,1,1,0,1,0
             |---|  // not cut

Array:   {3,1,2,4}
Vectors:  0,1,1,0
          2,0,0,0
           |---|  // not cut
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61