-3

What is the space complexity of the code?

I think it is O(1), since I don't technically store all the inputs into one array, i sort of split it up into 2 arrays.

This code is supposed to take an array with duplicates, and find the two numbers that don't have duplicates.

My Code:

public int[] singleNumber(int[] nums) {
        int[] res = new int[2];
        HashSet<Integer> temp = new HashSet<>();
        HashSet<Integer> temp2 = new HashSet<>();

        for (int i : nums) {
            if (!temp.contains(i)) temp.add(i);
            else temp2.add(i);
        }

        for (int i : nums) {
            if (!temp2.contains(i)) {
                if (res[0] == 0) res[0] = i;
                else res[1] = i;
            }
        }

        return res;
    }
data_pi
  • 801
  • 2
  • 11
  • 30
  • 1
    O(1) would mean (roughly speaking) that no matter how large nums was, the function would never use additional memory. This is clearly not the case. The more elements are in nums, the more memory the function will use, and there is some kind of direct scaling between the number of elements of nums, and the amount of used memory. The only question is, what exactly is that scaling? – Dave Cousineau Jul 16 '17 at 03:10
  • To add on to what @Sahuagin said, unless otherwise specified, Big-O notation is for worst-case, and worst case here is that all input values are distinct, which means that `Set` size will equal array size, which means _O(n)_. – Andreas Jul 16 '17 at 03:12
  • O(n) is shorthand for some constant times the input size (i.e. a linear relationship) where the constant can be anything, greater than or less than 1 but positive. I.e. O(n) means `size=f(n)` where `f(n)=Kn` for some `K > 0`. This is what your algorithm is where `K` will be less than 1 and will depend on the number of duplicates. – Jim Garrison Jul 16 '17 at 03:13

1 Answers1

1

The space complexity is "best case" O(1) and "worst-case" O(N). The best case is when all of the numbers in nums are the same, and the worst case occurs in a variety of situations ... including when there close to N/2 duplicates, which is your intended use-case.

The time complexity is O(N) in all cases.


Here is my reasoning.

Time complexity.

It we assume that the hash function is well-behaved, then the time complexity of HashSet.add and HashSet.contains are both O(1).

The time complexity of Integer.valueOf(int) is also O(1).

These O(1) operations are performed at most "some constant" times in two O(N) loops, making the entire computation O(N) in time.

Space complexity.

This is a bit more complex, but let us just consider the worst case where all int values are unique. This statement

        if (!temp.contains(i)) temp.add(i);
        else temp2.add(i);

is going to add Integer.valueOf(i) to either temp or temp2. And in this particular case, they will all end up in temp. (Think about it ...) So that means we end up with N unique entries in the temp set, and none in the temp2 set.

Now the space required for a HashSet with N entries is O(N). (The constant of proportionality is large ... but we are taling about space complexity here so that is not pertinent.)

The best case occurs when all of the int values are the same. Then you end up with one entry in temp and one entry in temp2. (The same value is repeatedly added to the temp2 set ... which does nothing.) The space usage of two maps with one entry each is O(1).

But (I hear you ask) what about the objects created by Integer.valueOf(int)? I argue that they don't count:

  1. Unless they are made reachable (via one of the HashSet objects), the Integer objects will be garbage collected. Therefore they don't count as space usage in the sense that it is normally considered in Java1.

  2. A smart compiler could actually optimize away the need for Integer objects entirely. (Not current generation HotSpot Java compilers, but in the future we could see such things in the production compilers.)


1 - If you start considering temporary (i.e. immediately unreachable) Java objects as space utilization and sum up this utilization, you will get meaningless results; e.g. an application with O(N^2) "space utilization" that runs in an O(N) sized heap. Temporary objects do count, but only while they are reachable. In your example, the contribution is O(1).

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216