Check if an array is a merge of 2 arrays

Question

Implement a function checking if a given array can be constructed as a merge of the 2 other arrays in any way.

public static boolean isMerge(int[] arr1, int[] arr2, int[] merge){
   //...
}

Examples:

isMerge([3, 1, 2, 2], [2, 1, 1], [3, 1, 2, 2, 2, 1, 1]) -- true

isMerge([1, 2, 3], [4, 5, 6], [1, 2, 3, 4, 5, 6]) -- true

isMerge([1, 2, 3], [4, 5, 6], [1, 4, 5, 2, 3, 6]) -- true

isMerge([1, 2], [2, 3], [1, 2, 3, 2]) -- true

isMerge([1, 2], [3, 4], [1, 2, 3, 4, 5]) -- false

isMerge([1, 2], [3, 4], [1, 2, 5, 3, 4]) -- false

isMerge([1, 2], [3, 4], [5, 1, 2, 3, 4]) -- false

My first thought was a solution with three iterators, one per array: at each step, check whether the current element of merge matches the current element of either arr1 or arr2.

If it does, advance merge and the matching array, and keep iterating until a mismatch is found or merge is exhausted, which proves that the result can be constructed by merging arr1 and arr2.

But this solution fails on isMerge([1, 2], [2, 3], [1, 2, 3, 2]): it reports false instead of true.
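The code for this attempt is not shown, so the following is only an assumed reconstruction of the three-iterator idea (the class and method names are hypothetical). It always prefers arr1 when both arrays offer a matching element, which is one plausible reason for the failure: on the problem input it takes both 1 and 2 from arr1 and then has no way to produce the 3.

```java
final class GreedyMergeCheck {
    // Hypothetical reconstruction: one index per array, always preferring arr1 on a tie.
    static boolean isMergeGreedy(int[] arr1, int[] arr2, int[] merge) {
        int i = 0; // next unused position in arr1
        int j = 0; // next unused position in arr2
        for (int value : merge) {
            if (i < arr1.length && arr1[i] == value) {
                i++; // greedy choice: never reconsidered later
            } else if (j < arr2.length && arr2[j] == value) {
                j++;
            } else {
                return false; // neither head matches the current merge element
            }
        }
        // every element of both source arrays must have been used
        return i == arr1.length && j == arr2.length;
    }

    public static void main(String[] args) {
        // works when the choice is never ambiguous
        System.out.println(isMergeGreedy(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 4, 5, 2, 3, 6})); // true
        // fails here: the 2 is taken from arr1, but it had to come from arr2
        System.out.println(isMergeGreedy(new int[]{1, 2}, new int[]{2, 3}, new int[]{1, 2, 3, 2})); // false
    }
}
```

A correct solution has to keep both options open whenever the heads of arr1 and arr2 are equal, rather than committing to one.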

I'm looking for the most efficient solution in terms of time and memory, but would be interested to know of any working approach.

I would use a HashMap to keep the merged array's elements with their counts, or just an int array if the range of elements is not too high. — aydinugur, Apr 20 '22 at 21:43
In terms of performance: does your problem statement allow duplicates in the input arrays and the resulting array? Are the elements already sorted? (I see from some of the examples that they are not.) — Athlan, Apr 20 '22 at 21:43
@aydinugur Unfortunately it wouldn't check if it's a merge, but a permutation. — St.Antario, Apr 20 '22 at 21:49
@Athlan Yeah, the arrays are unsorted and duplicates are allowed. — St.Antario, Apr 20 '22 at 21:49
Most likely, as is usually the case, you won't be able to optimize both memory and performance: a `Map` will give you good performance, but because of the load factor it will consume more memory than is actually required to store the values. Which is preferred here, or are you happy to just see some approaches supported with arguments as to where each performs well or badly? — Mushroomator, Apr 20 '22 at 21:52
@Turing85 Well, I described at the bottom of the question the approach with 3 iterators I tried, but it failed on some case. — St.Antario, Apr 20 '22 at 21:57
Rough sketch: Create two [`Map`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Map.html)s. - Iterate over `arr1`, add `(arr1[i], 1)` to the map if no entry for `arr1[i]` exists, otherwise increment the current value for `arr1[i]` in the map by 1. - Repeat the process for `arr2`, still using the 1st `Map`. - Repeat the process for `arr3`, using the 2nd `Map`. — Turing85, Apr 20 '22 at 22:05
Can you show a minimal reproducible example illustrating [what you've tried so far](http://idownvotedbecau.se/noattempt/)? — EJoshuaS - Stand with Ukraine, Apr 20 '22 at 22:08
While it was eventually clear what you're asking (whether the third array can be created by taking elements one at a time from the heads of both arrays), you should really specify what you mean by merge, as many people might think you're talking about unions, and your examples don't clarify this. — kcsquared, Apr 20 '22 at 22:08
One more question: what happens when we have all possible values in the arrays to merge them to get the merge array but also additional values that are not required for the merge e.g. `arr1 = [1, 2]`, `arr2=[3,4,5]` and `merge=[1,2,3]`. We can merge the array but then we have `4` and `5` unused. What should I return in this case? `true` or `false`? — Mushroomator, Apr 20 '22 at 22:36
Suppose `a = [1, 2]`, `b = [2, 3]` and m = `[1, 2, 3, 2]`. In pseudocode, 1) create `c = a + b => [1, 2, 2, 3]`; 2) create a frequency hash from `c`, `h = { 1=>1, 2=>2, 3=>1 }`; 3) for each element `e` of `m` see if `h` has a key `e` such that `h[e] > 0`. If yes, set `h[e] = h[e] - 1` and examine the next element of `m`; if no, `m` cannot be formed by "merging" `a` and `b`. If "yes" for all elements of `m`, `m` can be formed from `a` and `b`... — Cary Swoveland, Apr 20 '22 at 22:57
...If `c` contains `n` elements, 1) and 2) both have O(n) computational complexity. 3) has "close to" O(n) complexity, as hash key lookups are almost O(1) (constant). Therefore the overall computational complexity is close to O(n). That's considerably faster than sorting both `c` and `m` and checking to see if the sorted arrays are equal, which has O(n log(n)) computational complexity. Obviously, if `c` and `m` are different sizes one can answer the question immediately. — Cary Swoveland, Apr 20 '22 at 22:59
This is tangentially related to a Leetcode problem, [Largest merge of two strings](https://leetcode.com/problems/largest-merge-of-two-strings/). You may want to use their terminology to explain what a merge is in your edit. Also, you can solve this with dynamic programming in O(n^2) time, by tracking all the valid numbers of elements that could have been removed from the front of arr1[], after the first `k` elements of merge[] have been filled in. — kcsquared, Apr 20 '22 at 22:59
@St.Antario : Should `isMerge(new int [] {1, 2, 3}, new int [] {4, 5, 6}, new int [] {1, 3, 2, 4, 5, 6})` return `false`? — Old Dog Programmer, Apr 21 '22 at 02:28
I have a different interpretation of the problem than @Mushroomator and others. My code returns `false` for `{1, 2, 3}, {4, 5, 6}, {1, 3, 2, 4, 5, 6}` while the code in that answer returns `true`. — Old Dog Programmer, Apr 21 '22 at 15:49
@St.Antario, May I ask where the problem came from? This might be a good problem to put onto one of the coding practice sites, such as CodeSignal or HackerRank. Would the creator of this problem be interested in putting it there? — Old Dog Programmer, Apr 23 '22 at 19:09
Suggestion for improving the wording: The contents of two 'source' arrays have been combined into a 'destination' array as follows: Randomly pick one of the elements remaining in the source arrays. Delete that element, and append it into the destination array. Repeat until the source arrays are empty. Write a method `boolean isMerge` that accepts as arguments the original source arrays and the destination array. Return `true` if it is possible the destination array was created as described, and `false` if impossible. — Old Dog Programmer, Apr 23 '22 at 19:18
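The dynamic-programming idea from kcsquared's comment can be sketched as follows, under the order-preserving reading of "merge" (elements are taken one at a time from the heads of the two arrays). This is only an illustrative sketch: the class and method names are hypothetical, and it assumes every element of both source arrays must be used.

```java
final class InterleavingCheck {
    // Hypothetical name. reachable[i] == true means: the first k elements of merge can be
    // built from the first i elements of arr1 and the first (k - i) elements of arr2.
    static boolean isInterleaving(int[] arr1, int[] arr2, int[] merge) {
        if (arr1.length + arr2.length != merge.length) return false;
        boolean[] reachable = new boolean[arr1.length + 1];
        reachable[0] = true; // nothing consumed yet
        for (int k = 0; k < merge.length; k++) {
            boolean[] next = new boolean[arr1.length + 1];
            for (int i = 0; i <= Math.min(k, arr1.length); i++) {
                if (!reachable[i]) continue;
                int j = k - i; // elements already taken from arr2
                if (i < arr1.length && arr1[i] == merge[k]) next[i + 1] = true; // take from arr1
                if (j < arr2.length && arr2[j] == merge[k]) next[i] = true;     // take from arr2
            }
            reachable = next;
        }
        return reachable[arr1.length];
    }

    public static void main(String[] args) {
        System.out.println(isInterleaving(new int[]{1, 2}, new int[]{2, 3}, new int[]{1, 2, 3, 2}));             // true
        System.out.println(isInterleaving(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 3, 2, 4, 5, 6})); // false
    }
}
```

The time is proportional to arr1.length * merge.length and the extra memory to arr1.length. Unlike a pure frequency count, this rejects `{1, 2, 3}, {4, 5, 6}, {1, 3, 2, 4, 5, 6}`, because the relative order within arr1 is not preserved.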

score 1 · Accepted Answer · answered Apr 20 '22 at 23:34

Here is how I would do it. This is a relatively simple approach using two Maps that count the occurrences of values in the two source arrays (combined) and in the merge array. The counting can be done in one simple loop over the longest of the three arrays. That only makes sense, of course, if the merge array does not contain more values than the two other arrays combined. If it does, we can immediately return false, as there is no way the array was merged from the two. I also return false if merge is empty and either of the other two arrays is not.

When we have counted the values we just need to compare keys and values of both maps. If all keys and their respective values match, the arrays can be created by merging both.

The runtime is O(n): there are two passes over at most n elements (one to count, one to compare the map entries), so roughly 2n + k operations, with n being the number of elements in the biggest provided array and k being a small constant (depending on how you count one operation) for all the operations in the function other than the loops.

In terms of memory, let a, b and n be the lengths of arr1, arr2 and merge (a = arr1.length, b = arr2.length and n = merge.length). The capacity requested for each map is the expected number of entries divided by the load factor of 0.75, plus 1 (as in the code below), so the worst-case number of table slots for the two Maps is:

for merge: (n / 0.75 + 1) * 2
for arr1 and arr2: ((a + b) / 0.75 + 1) * 2

Java internally rounds the requested capacity up to the next power of 2 for the backing array, so in the worst case we need almost double the space that was actually requested, hence the * 2.

See the code from Java's HashMap that determines the size of the backing array:

/**
* Returns a power of two size for the given target capacity.
*/
static final int tableSizeFor(int cap) {
    int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

Assuming you have n elements in merge and n/2 elements in each of arr1 and arr2, the memory required for the maps in the worst case would be (n / 0.75 + 1) * 2 + ((n/2 + n/2) / 0.75 + 1) * 2 slots, which equals 4 * (n / 0.75 + 1) = 16n/3 + 4, or roughly 5.33n + 4. You could additionally add the space required for local variables to this, but they are quite insignificant really.
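To make the power-of-two rounding concrete, here is a small sketch that copies the `tableSizeFor` logic quoted above into a standalone class (the JDK method itself is not public, so it cannot be called directly). The class name, the `requestedCapacity` helper and the constant declaration are assumptions for illustration; the helper mirrors the initial-capacity expression `(int)((float) n / 0.75f + 1.0f)` used in this answer's code.

```java
final class TableSizeDemo {
    // Same value java.util.HashMap uses for its maximum table size.
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Copy of the quoted JDK logic: smallest power of two >= cap.
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Hypothetical helper: initial capacity requested for a given number of expected entries.
    static int requestedCapacity(int expectedEntries) {
        return (int) ((float) expectedEntries / 0.75f + 1.0f);
    }

    public static void main(String[] args) {
        for (int n : new int[]{6, 12, 24}) {
            int requested = requestedCapacity(n);
            System.out.println(n + " entries -> requested " + requested
                    + " -> table size " + tableSizeFor(requested));
        }
        // e.g. 12 entries -> requested 17 -> table size 32
    }
}
```

With 12 expected entries the request is 17, one above a power of two, so the table ends up with 32 slots. That is the near-doubling the worst-case estimate accounts for.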

All in all this approach has O(n) runtime and is therefore asymptotically optimal, as you have to "look" at each value at least once, though there might be implementations with (significantly) smaller constants.

In terms of memory there certainly are implementations that take much less than this, but most likely memory isn't a big concern on modern hardware for Integer arrays.

import java.util.*;

public class Application {
    public static void main(String[] args) {
        System.out.println(isMerge(new int[]{3, 1, 2, 2}, new int[]{2, 1, 1}, new int[]{3, 1, 2, 2, 2, 1, 1}));
        System.out.println(isMerge(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 2, 3, 4, 5, 6}));
        System.out.println(isMerge(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 4, 5, 2, 3, 6}));
        System.out.println(isMerge(new int[]{1, 2}, new int[]{2, 3}, new int[]{1, 2, 3, 2}));
        System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{1, 2, 3, 4, 5}));
        System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{1, 2, 5, 3, 4}));
        System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{5, 1, 2, 3, 4}));
    }

    public static boolean isMerge(int[] arr1, int[] arr2, int[] merge) {
        // early out if arr1 + arr2 hold fewer values than merge, or if merge is empty and either of the others is not
        // this could be changed to arr1.length + arr2.length != merge.length if you don't want arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3] to return true. That also makes calculating the space easier and drastically reduces the average-case runtime for random inputs
        if (arr1.length + arr2.length < merge.length || (merge.length == 0 && (arr1.length != 0 || arr2.length != 0))) return false;
        // prevent possible rehashing by requesting the maximum possible number of entries divided by the load factor, while using as little memory as possible
        // one could change the load factor and measure: a lower load factor -> faster lookups, more space; a higher load factor -> less space, slower lookups
        // the calculation for the expected Map size is done like this in Guava and JDK8
        var twoArrValCount = new HashMap<Integer, Integer>((int)((float)(arr1.length + arr2.length) / 0.75f + 1.0f));
        var mergeValCount = new HashMap<Integer, Integer>((int)((float)merge.length / 0.75f + 1.0f));

        // determine longest array
        var longestOverall = Math.max(arr1.length, arr2.length);
        longestOverall = Math.max(longestOverall, merge.length);

        // count values in merge array and in two arrays combined
        for (int i = 0; i < longestOverall; i++) {
            // add 1 as count if its key is not present yet, add one to current value otherwise
            if (i < arr1.length) twoArrValCount.compute(arr1[i], (k, v) -> (v == null) ? 1 : v + 1);
            if (i < arr2.length) twoArrValCount.compute(arr2[i], (k, v) -> (v == null) ? 1 : v + 1);
            if (i < merge.length) mergeValCount.compute(merge[i], (k, v) -> (v == null) ? 1 : v + 1);
        }

        // compare both maps: if all values match return true, return false otherwise
        return mergeValCount
                .entrySet()
                .stream()
                .allMatch(entry -> {
                    // if the two source arrays do not contain a value that occurs in merge -> return false
                    if (!twoArrValCount.containsKey(entry.getKey())) return false;
                    // merge must not use a value more often than arr1 and arr2 together provide it
                    // if you want to return true for e.g. arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3]
                    return entry.getValue() <= twoArrValCount.get(entry.getKey());
                    // if you want to return false for e.g. arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3]
                    // (together with the != length check described at the top of this method)
                    // return Objects.equals(twoArrValCount.get(entry.getKey()), entry.getValue());
                });
    }
}

Expected output:

true
true
true
true
false
false
false
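For comparison, the single-map variant described in Cary Swoveland's comments under the question might look roughly like this. It is a sketch under the strict assumption that all elements of arr1 and arr2 must be used, and the class and method names are hypothetical. Like the code above, it only compares element counts and ignores order.

```java
import java.util.HashMap;
import java.util.Map;

final class SingleMapCount {
    // Hypothetical name. Counts the values of arr1 and arr2 once, then "spends" them on merge.
    static boolean sameElements(int[] arr1, int[] arr2, int[] merge) {
        if (arr1.length + arr2.length != merge.length) return false;
        Map<Integer, Integer> remaining = new HashMap<>();
        for (int v : arr1) remaining.merge(v, 1, Integer::sum);
        for (int v : arr2) remaining.merge(v, 1, Integer::sum);
        for (int v : merge) {
            Integer left = remaining.get(v);
            if (left == null || left == 0) return false; // value missing or already used up
            remaining.put(v, left - 1);
        }
        return true; // equal lengths and nothing overspent, so the counts match exactly
    }

    public static void main(String[] args) {
        System.out.println(sameElements(new int[]{1, 2}, new int[]{2, 3}, new int[]{1, 2, 3, 2}));    // true
        System.out.println(sameElements(new int[]{1, 2}, new int[]{3, 4}, new int[]{1, 2, 5, 3, 4})); // false
    }
}
```

This needs only one map, but it shares the limitation raised in the comments: it also accepts `{1, 2, 3}, {4, 5, 6}, {1, 3, 2, 4, 5, 6}`, which an order-preserving interpretation of "merge" would reject.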

How did you post this? The question was closed 40 minutes before you answered; I thought that was impossible. Also, this answer seems to assume that 'merge' in the question means 'set union', which seems unlikely to be OP's interpretation. Not your fault, by the way; this ambiguity in explanation is why the question was closed. — kcsquared, Apr 20 '22 at 23:40
Didn't even notice it was closed :D Surely have typed the first character of my answer before the question was closed. Did finish the answer much later (when it already was closed) but just clicked on `Post`. Don't know why SO let me post tbh. Also thought it would prevent people from doing this. — Mushroomator, Apr 20 '22 at 23:44
It's not a set union btw, because a set union would return `true` for `arr1 = [1, 2]`, `arr2 = [1, 2]` and `merge=[1, 2]`, I return `false` in this case as I count values. I would return true though for `arr1 = [1, 2]`, `arr2 = [1, 2]` and `merge=[1, 1, 2, 2]`. But as you said, not really clear what OP is asking for. — Mushroomator, Apr 20 '22 at 23:51
