3

I have this method to merge 2 sorted arrays into one sorted array:

    public void merge(T[] a, int l1, int r1, T[] b, int l2, int r2, T[] c, int l3) {
        while (l1 < r1 && l2 < r2) {
            if (a[l1].compareTo(b[l2]) < 0) {
                c[l3++] = a[l1++];
            } else
                c[l3++] = b[l2++];
        }

        while (l1 < r1)
            c[l3++] = a[l1++];
        while (l2 < r2)
            c[l3++] = b[l2++];
    }

But now I want to do it with 4 arrays at once.

I tried really long to come up with a solution, but wasn’t really successful. Does somebody have an idea how to do it?

Alexander Ivanchenko
Kicksy
  • I have a C++ example of 4 way merge sort using a nested tree of if else statements for the 4 way merge at the end of [this answer](https://stackoverflow.com/questions/34844613/34845789#34845789). The example uses pointers, but indexes could be used instead. For more than 4 arrays, a minheap could be used, but that will be slower than just merging 4 at a time. The C++ example is old code using goto's. For Java, parts of the code would need to be duplicated in lieu of the goto's, and more if statements needed when dropping down to 3-way, 2-way merge and 1-way copy. – rcgldr May 09 '22 at 19:52
  • @Dan You are drawing the wrong conclusion because you are not reading the problem carefully. The problem of merging N **sorted arrays** differs a lot in terms of time complexity from the task of merging arrays that are unsorted. Multiple method invocations will increase the memory consumption (because of intermediate arrays) and also the total number of comparisons required will be greater (i.e. performance will degrade). Therefore, this question **does make** sense. – Alexander Ivanchenko May 09 '22 at 20:58

4 Answers

4

There is a much simpler way using Java8 streams than doing this by hand:

  1. combine all arrays into one stream (I've used 2, but you can use as many as you want):
int[] arr1 = {1, 7, 10};
int[] arr2 = {1, 2, 4, 9};

Stream<int[]> ints = Stream.of(arr1, arr2);
  2. then flatMap and sort them in a stream:
IntStream intStream = ints.flatMapToInt(Arrays::stream).sorted();

and when you print them you will see all the numbers sorted:

intStream.forEach(System.out::println);

1
1
2
4
7
9
10

combined in a function, it could look something like this:

public int[] merge(int[]... arrays) {
  return Stream.of(arrays)
                 .flatMapToInt(Arrays::stream)
                 .sorted()
                 .toArray();
}

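EDIT: An advantage of streams is that you can further transform the values as you like, e.g. by leveraging the distinct function you can easily remove duplicates. Note that the stream printed above has already been consumed by forEach, so it has to be built again:

intStream = Stream.of(arr1, arr2).flatMapToInt(Arrays::stream).sorted().distinct(); // a consumed stream cannot be reused
intStream.forEach(System.out::println);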

1
2
4
7
9
10
trpouh
  • Arrays received as an input are **already sorted**, your solution doesn't take advantage of that, but introduces an additional overhead of sorting instead. – Alexander Ivanchenko May 09 '22 at 20:12
  • that's not true. combining two sorted arrays does not necessarily mean that the combined array is sorted. the example above, without applying the `sorted` (and `distinct`) method (and both input arrays sorted) would result in `1, 7, 10, 1, 2, 4, 9` – trpouh May 09 '22 at 20:20
  • You should reread the problem statement as it is described by OP carefully: "*I have this method to merge 2 **sorted arrays** into one sorted array*". And it seems you didn't examine what the code provided by OP does, I suggest you do so. – Alexander Ivanchenko May 09 '22 at 20:34
  • By the way, the question doesn't mention that duplicates should be discarded, and the code listed by OP will **preserve duplicated values**. Hence, there's no need to apply `distinct()`, otherwise your result is not aligned with what OP wants to achieve. – Alexander Ivanchenko May 09 '22 at 21:17
  • I didn't try to understand OP's code due to its obfuscated nature. However - "*...into one sorted array*" suggests that the result *should* be sorted too, which `.sorted()` ensures. In my answer I did not conclude that `distinct` is necessary to achieve OP's result. – trpouh May 09 '22 at 21:39
  • @AlexanderIvanchenko As much as the sorting feels like horribly extra work, Java is likely to use https://en.wikipedia.org/wiki/Timsort under the hood. Which is good at efficiently discovering and taking advantage of sorted runs. – btilly May 09 '22 at 21:47
  • The fact remains that you've overlooked that the **input arrays are sorted** and introduced unnecessary overhead. – Alexander Ivanchenko May 09 '22 at 21:50
  • Having run both OPs code (using sorted arrays as inputs) and mine with and without `sorted` shows that `sorted` is necessary to achieve the result OP implied. Due to lack of proof in your comments, I conclude that you have neither tested your hypothesis nor plan to contribute to this answer in any way. – trpouh May 09 '22 at 22:03
  • @trpouh *Having run both OPs code* - Here is the [demo with OP's code](https://www.jdoodle.com/iembed/v0/qEL), which proves that it does what is intended, feel free to check it. And even if it were broken somehow, that doesn't change the fact that you've misinterpreted the problem statement: **merge 2 sorted arrays**. – Alexander Ivanchenko May 09 '22 at 22:31
  • It's irrelevant whether the 2 arrays are sorted or not. If you can prove otherwise, please suggest an edit to my answer, I will gladly change it. – trpouh May 09 '22 at 22:52
  • @trpouh As Linus Torvalds said: "Talk is cheap. Show me the code.” If you've measured the performance, it'll be interesting to see your benchmarks, otherwise how can you state that it's irrelevant? If you encounter a question about merging arrays that are sorted during the job interview, I kindly advise you not to repeat the same thing you've posted here. – Alexander Ivanchenko May 10 '22 at 21:39
2

I've generalized the problem to "merging N sorted arrays into a single sorted array".

The code provided in the question uses generics, which introduces a problem because arrays are not type-safe. In short, there's a substantial difference in their behavior: arrays are covariant, whereas generics are invariant. Because of that, the compiler will not be able to detect certain problems when generics and arrays are mixed. It's good practice to avoid generic arrays.

Also, since this is clearly an algorithmic problem (so its audience is broader than readers with the deep knowledge of Java needed to grasp a generics-based implementation), I've decided to provide two flavors of the solution: one using arrays exclusively, the other using generics and the Collections framework.
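
To illustrate the covariance point, here is a minimal sketch (the class and method names are made up for this example). Assigning a String[] to an Object[] variable compiles, and the invalid store is only caught at runtime, whereas the equivalent assignment with generic lists is rejected by the compiler.

```java
class CovarianceDemo {
    // Hypothetical helper: returns true if storing an Integer through an
    // Object[] reference to a String[] fails at runtime.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1]; // compiles: arrays are covariant
        try {
            objects[0] = 42;              // compiles too, but the runtime type check rejects it
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails()); // true
        // List<Object> list = new ArrayList<String>(); // would not compile: generics are invariant
    }
}
```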

Non-generic version

Below is the description of how to merge an arbitrary number of sorted arrays of primitives:

  • find the total number of elements and create a resulting array based on it;
  • define an array that will maintain a current position in each of the source arrays;
  • using a nested for loop for each position in the resulting array, pick the lowest value of all currently accessible values.

The time complexity of this algorithm is O(n * m), where n is the total number of elements in all arrays and m is the number of arrays.

The implementation might look like this:

public static int[] mergeNSorted(int[]... arrays) {
    int[] result = new int[getTotalLength(arrays)];
    int[] positions = new int[arrays.length]; // position for each array
    
    for (int pos = 0; pos < result.length; pos++) {
        int minCurVal = Integer.MAX_VALUE;
        int curArr = 0;
        for (int i = 0; i < arrays.length; i++) {
            if (positions[i] < arrays[i].length && arrays[i][positions[i]] < minCurVal) {
                minCurVal = arrays[i][positions[i]];
                curArr = i;
            }
        }
        result[pos] = minCurVal;
        positions[curArr]++;
    }
    return result;
}

public static int getTotalLength(int[][] arrays) {
    long totalLen = 0;
    for (int[] arr : arrays) totalLen += arr.length;
    
    if (totalLen > Integer.MAX_VALUE) throw new IllegalArgumentException("total length exceeded Integer.MAX_VALUE");
    return (int) totalLen;
}

main() - demo

public static void main(String[] args) {
    int[][] input =
        {{1, 3}, {}, {2, 6, 7}, {10}, {4, 5, 8, 9}};

    System.out.println(Arrays.toString(mergeNSorted(input)));
}

Output

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
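
As a side note, the same linear scan can be written without the Integer.MAX_VALUE sentinel by remembering which array currently holds the smallest value. The following is only a sketch of that variant (the names ScanMerge and mergeByScan are made up here, and it assumes the total length fits in an int), shown with four arrays as in the question:

```java
import java.util.Arrays;

class ScanMerge {
    // Hypothetical variant of the scan: tracks the index of the array holding
    // the smallest current value instead of a sentinel value.
    static int[] mergeByScan(int[]... arrays) {
        int total = 0;
        for (int[] arr : arrays) total += arr.length; // assumption: the sum fits in an int
        int[] result = new int[total];
        int[] positions = new int[arrays.length];     // current position in each array

        for (int pos = 0; pos < total; pos++) {
            int best = -1; // index of the array with the smallest current value
            for (int i = 0; i < arrays.length; i++) {
                if (positions[i] == arrays[i].length) continue; // array i is exhausted
                if (best == -1 || arrays[i][positions[i]] < arrays[best][positions[best]]) {
                    best = i;
                }
            }
            // at least one array is not exhausted while pos < total, so best != -1
            result[pos] = arrays[best][positions[best]++];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] merged = mergeByScan(new int[]{1, 5}, new int[]{2, 6}, new int[]{3, 7}, new int[]{4, 8});
        System.out.println(Arrays.toString(merged)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```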

Generic version

In this version, the input is a list containing multiple lists of a generic type T, which is expected to implement the Comparable interface.

This solution enhances the array-based implementation provided above, reducing the overall time complexity to O(n * log m), where n is the total number of elements in all arrays and m is the number of arrays.

Instead of performing m iterations for each resulting element, it maintains a PriorityQueue, which in this case represents a Min-Heap (i.e. when the head element is retrieved from it, it'll have the lowest value of all the elements that are present in the queue).

Every element in the queue wraps the value of a particular element retrieved from one of the given lists, as well as the data regarding the source of this value (i.e. the index of the list and the position inside this list).

This wrapper over the element of the nested list can be represented by the class shown below.

public class ElementWrapper<V extends Comparable<V>> implements Comparable<ElementWrapper<V>> {
    private V value;
    private int listIndex;
    private int position;
    
    public ElementWrapper(V value, int listIndex, int position) {
        this.value = value;
        this.listIndex = listIndex;
        this.position = position;
    }
    
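    public V getValue() { return value; }
    public int getListIndex() { return listIndex; }
    public int getPosition() { return position; }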
    
    @Override
    public int compareTo(ElementWrapper<V> o) {
        return value.compareTo(o.getValue());
    }
}

Note that this class implements the Comparable interface based on the value of the wrapped list element.

The queue is prepopulated with the first element of each non-empty list. Then, while the queue is not empty, its lowest element is removed and added to the resulting list. If the list that this element came from has more elements, the next one is added to the queue.

Note that, according to the documentation, both adding a new element to the priority queue with add() and removing its head element with remove() cost O(log n) time (where n is the number of elements in the queue).

The same time complexity can be achieved by utilizing a TreeSet instead, but in practice PriorityQueue will perform better because a heap is easier to maintain than a red-black tree.

The code might look like this:

public static <T extends Comparable<T>> List<T> mergeNSorted(List<List<T>> lists) {
    List<T> result = new ArrayList<>();
    Queue<ElementWrapper<T>> queue = getInitializedQueue(lists);
    
    while (!queue.isEmpty()) {
        ElementWrapper<T> next = queue.remove();
        result.add(next.getValue());
        
        if (next.getPosition() + 1 < lists.get(next.getListIndex()).size()) {
            queue.add(new ElementWrapper<>(lists.get(next.getListIndex()).get(next.getPosition() + 1),
                                           next.getListIndex(),
                                           next.getPosition() + 1));
        }
    }
    return result;
}

public static <T extends Comparable<T>> Queue<ElementWrapper<T>> getInitializedQueue(List<List<T>> lists) {
    Queue<ElementWrapper<T>> queue = new PriorityQueue<>();
    for (int i = 0; i < lists.size(); i++) {
        if (lists.get(i).isEmpty()) continue;
        queue.add(new ElementWrapper<>(lists.get(i).get(0), i, 0));
    }
    return queue;
}

main() - demo

public static void main(String[] args) {
    List<List<Integer>> genericInput =
        List.of(List.of(1, 3), List.of(), List.of(2, 6, 7), List.of(10), List.of(4, 5, 8, 9));
    
    System.out.println(mergeNSorted(genericInput));
}

Output

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Alexander Ivanchenko
  • 1
    If you have `k` arrays this is `O(k)` comparisons per element produced, most of which will be between values that didn't change. It is possible to do better. – btilly May 09 '22 at 21:41
  • @btilly I've provided the implementation which runs in **O(n * log k)** time. – Alexander Ivanchenko May 10 '22 at 21:15
  • Good. For non-primitive arrays Java will use Timsort. If an array of `n` things has `k` sorted runs, that will also sort it in `O(n * log(k))` time. You'd have to benchmark it to find out whether the constants are better or worse. – btilly May 10 '22 at 22:57
  • 1
    @btilly OK, you've convinced me, I'll take a time to measure the performance. – Alexander Ivanchenko May 11 '22 at 10:37
1

I'm not a Java programmer so I'll just give Pythonesque pseudo-code.

First turn each non-empty array into a triplet:

(next_value, index, array)

Now put those into a priority queue sorted by next value.

while 0 < queue.size():
    (next_value, index, array) = queue.poll()
    answer.append(next_value)
    if index+1 < array.length:
        queue.add((array[index+1], index+1, array))

If you have k arrays, this will take O(log(k)) comparisons per element produced.

Sadly, Java does not seem to have anything corresponding to the swaptop method. In practice, if one array has a run of values, using .peek() to get the top element and then .swaptop(...) when possible lets you go through those runs with O(1) work per element.
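
For Java readers, here is a rough sketch of the same idea using java.util.PriorityQueue (the names HeapMerge and mergeWithHeap are made up for this example, and it assumes int arrays whose total length fits in an int). Since there is no swaptop, it approximates the run optimization differently: after polling an entry, it keeps copying from the same array for as long as its next value does not exceed the value at the head of the queue, and only then puts the array back.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class HeapMerge {
    // Each queue entry is {arrayIndex, position}; entries are ordered by the
    // value they point at. Entries are never mutated while inside the queue.
    static int[] mergeWithHeap(int[]... arrays) {
        int total = 0;
        for (int[] arr : arrays) total += arr.length; // assumption: the sum fits in an int
        int[] answer = new int[total];

        PriorityQueue<int[]> queue = new PriorityQueue<>(
                (x, y) -> Integer.compare(arrays[x[0]][x[1]], arrays[y[0]][y[1]]));
        for (int i = 0; i < arrays.length; i++) {
            if (arrays[i].length > 0) queue.add(new int[]{i, 0});
        }

        int out = 0;
        while (!queue.isEmpty()) {
            int[] top = queue.poll();
            int[] array = arrays[top[0]];
            int index = top[1];
            int[] next = queue.peek(); // smallest entry among the other arrays, or null

            // Copy the polled value, then the rest of the run: values from the same
            // array that are not larger than the head of the queue (O(1) each).
            do {
                answer[out++] = array[index++];
            } while (index < array.length
                    && (next == null || array[index] <= arrays[next[0]][next[1]]));

            if (index < array.length) {
                queue.add(new int[]{top[0], index}); // O(log k) re-insertion
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        int[] merged = mergeWithHeap(new int[]{1, 3}, new int[]{}, new int[]{2, 6, 7},
                                     new int[]{10}, new int[]{4, 5, 8, 9});
        System.out.println(Arrays.toString(merged)); // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    }
}
```

Whether this beats the plain poll/add loop depends on how long the runs are in the input, so it would need to be benchmarked.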

btilly
0

This could also be a good example using List<String> in addition to int[]:

import org.testng.annotations.Test;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TestClass {

    public static List<String> list(String... elems) {

        return new ArrayList<>(Arrays.asList(elems));
    }

    public static List<String> mergedListSorted(List<String>... listsVarArgs) {

        return Stream.of(listsVarArgs).flatMap(List::stream).sorted().collect(Collectors.toList());
    }

    @Test
    public void sortedListsTest() {

        // Sorted sub lists
        List<String> AGMS = list("A", "G", "M", "S");
        List<String> BHNT = list("B", "H", "N", "T");
        List<String> CIOU = list("C", "I", "O", "U");
        List<String> DJPV = list("D", "J", "P", "V");
        List<String> EKQW = list("E", "K", "Q", "W");
        List<String> FLRX = list("F", "L", "R", "X");

        System.out.println(mergedListSorted(AGMS, BHNT, CIOU, DJPV, EKQW, FLRX));
        System.out.println(mergedListSorted(BHNT, BHNT, CIOU, BHNT));

    }

}

The corresponding output of the two calls:

[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X]
[B, B, B, C, H, H, H, I, N, N, N, O, T, T, T, U]