
In the scenario I present below, my solution should run in O(n^2 * log n), while the "pointers" solution (which I assume is the fastest known way to solve the "3SUM" problem) runs in O(n^2 * 1). That leaves the question of whether O(1) is faster than O(log n), illustrated with my code.
Could someone explain why this seems to be the case? My intuition tells me that O(log n) should be about as fast as O(1), if not faster.
I hope the comments in my solution's code are clear.

Edit: I know that this does not sound very smart... log(n) grows with the input (n -> ∞), while 1... is just 1. BUT, in this case, to find a number, how can it be faster to do sums and subtractions instead of using binary search (log n)? I just cannot wrap my head around it.
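One back-of-the-envelope way to frame the question (these are rough estimates I am adding for illustration, not measurements): the pointer walk pays one comparison per inner step, while every binary search pays about log2(n) comparisons, so the searches multiply the quadratic pair count by that factor.

```csharp
using System;

public static class WorkEstimate
{
    // Rough model: the two pointers together sweep the array once per outer
    // index, so the pointer solution does ~n^2 constant-cost steps in total.
    public static double PointerSteps(int n) => (double)n * n;

    // The binary-search solution pays ~log2(n) probes for each of the
    // ~n^2 / 2 (i, j) pairs, so the log factor multiplies the whole count.
    public static double SearchSteps(int n) => n * (n / 2.0) * Math.Log2(n);

    public static void Main()
    {
        int n = 3_000; // same input size as the benchmarks below
        Console.WriteLine($"pointers: ~{PointerSteps(n):N0} comparisons");
        Console.WriteLine($"search:   ~{SearchSteps(n):N0} comparisons");
    }
}
```

For n = 3,000 the model gives roughly 9 million comparisons for the pointers and over 50 million for the searches, which matches the direction of the runtimes observed below.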


LeetCode 3SUM problem description


O(n^2 * log n)

For an input of 3,000 values:

  • Iterations: 1,722,085 (61% less than the "pointers solution")
  • Runtime: ~92 ms (270% slower than the typical O(n^2) solution)
public IList<IList<int>> MySolution(int[] nums)
{
    IList<IList<int>> triplets = new List<IList<int>>();

    Array.Sort(nums);

    for (int i = 0; i < nums.Length; i++)
    {
        // Avoid duplicating results.
        if (i > 0 && nums[i] == nums[i - 1])
            continue;

        for (int j = i+1; j < nums.Length - 1; j++)
        {
            // Avoid duplicating results.
            if (j > (i+1) && nums[j] == nums[j - 1])
                continue;

            // The solution for this triplet.
            int numK = -(nums[i] + nums[j]);

            // * This is the problem.
            // Search for 'k' index in the array.
            int kSearch = Array.BinarySearch(nums, j + 1, nums.Length - (j + 1), numK);

            // 'numK' exists in the array.
            if (kSearch > 0)
            {
                triplets.Add(new List<int>() { nums[i], nums[j], numK });
            }
            // 'numK' is smaller than every element left in the search range, and
            // increasing 'j' only decreases 'numK', so no later 'j' can match: stop.
            else if (~kSearch == (j + 1))
            {
                break;
            }
        }
    }

    return triplets;
}

O(n^2)

For the same input of 3,000 values:

  • Iterations: 4,458,579
  • Runtime: ~34 ms
public IList<IList<int>> PointersSolution(int[] nums)
{
    IList<IList<int>> triplets = new List<IList<int>>();

    Array.Sort(nums);

    for (int i = 0; i < nums.Length; i++)
    {
        if (i > 0 && nums[i] == nums[i - 1])
            continue;

        int l = i + 1, r = nums.Length - 1;

        while (l < r)
        {
            int sum = nums[i] + nums[l] + nums[r];

            if (sum < 0)
            {
                l++;
            }
            else if (sum > 0)
            {
                r--;
            }
            else
            {
                triplets.Add(new List<int>() { nums[i], nums[l], nums[r] });

                do
                {
                    l++;
                }
                while (l < r && nums[l] == nums[l - 1]);
            }
        }
    }

    return triplets;
}
Bita
  • _"My logic tells me that O(log n) should be as fast as O(1), if not faster."_ - why does it tell you that? – Guru Stron Jul 26 '22 at 23:35
  • I suspect the implementation of Array.BinarySearch() is affecting the performance. It is the only library method in the code that could significantly impact performance. C# is managed, and there is significant overhead in calling a method. – jdweng Jul 26 '22 at 23:39
  • How have you counted the iterations? – Guru Stron Jul 27 '22 at 00:17
  • @GuruStron Counting the number of times the last loop runs. – Bita Jul 27 '22 at 00:24
  • @Bita then you are not counting iterations which will be done by `Array.BinarySearch` and I would say if you were able to account for those you would get the expected result of `O(n^2 * log n)` having both more iterations and higher runtime (which you already have). – Guru Stron Jul 27 '22 at 00:27
  • @GuruStron Ohhh! Of course, that explains the incoherence in the iteration counting. Thank you very much, Stron. I hope you get the time to write up an explanation of the main question. – Bita Jul 27 '22 at 00:32
  • TBH I don't understand what you are missing here =) Having a fixed number of operations will always be faster for `n -> ∞` than having a number of operations which grows with `n`. – Guru Stron Jul 27 '22 at 00:36
  • @GuruStron Yes, that is obvious, but I cannot abstract it from the example of finding a number. To my primitive thinking, searching for a number in an array should be much faster using logarithms (binary search) than sums and subtractions (pointers). – Bita Jul 27 '22 at 00:39
  • @Bita That sounds like a logical proposal, but (I would not be able to prove this mathematically) you are missing that the first solution has 3 "loops" (n, n, log n) while the second has only two (n and n). So while binary search is faster than iterating over elements, it adds an extra "loop" here (which you also forgot to count in the iterations). – Guru Stron Jul 27 '22 at 00:45
  • @GuruStron I recognize that, for some reason, I assumed that O(n^2 * log n) equals O(n^2), omitting the third loop. But wouldn't it be possible to solve this problem in O(n log n) using binary search? This is my last question; you have already helped me a lot. – Bita Jul 27 '22 at 00:51
  • @Bita I'm not smart enough to prove that, but it seems that no =) – Guru Stron Jul 27 '22 at 00:55

3 Answers


It seems that your conceptual misunderstanding comes from missing the fact that Array.BinarySearch does some iterations too (this was indicated by the initial iteration counts in the question, which you have since changed).

So while the assumption that a binary search should be faster than simple iteration through the collection is pretty valid, you are missing that the binary search is basically an extra loop. You should not compare those two loops directly; instead, compare the second for loop plus the binary search in the first solution against the second loop of the second solution.
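To make the hidden loop visible, here is a sketch (my own instrumentation, not the exact counters from the question) that replaces Array.BinarySearch with a hand-rolled equivalent that counts every probe. The probe count is exactly what the question's original iteration counter was missing.

```csharp
using System;

public static class ProbeCounter
{
    // Hand-rolled binary search with the same contract as Array.BinarySearch
    // (index if found, bitwise complement of the insertion point otherwise),
    // except that it also counts every probe it performs.
    public static int Search(int[] a, int lo, int hi, int value, ref long probes)
    {
        while (lo <= hi)
        {
            probes++; // each probe is one "hidden" iteration
            int mid = lo + ((hi - lo) >> 1);
            if (a[mid] == value) return mid;
            if (a[mid] < value) lo = mid + 1;
            else hi = mid - 1;
        }
        return ~lo;
    }

    // Runs the same (i, j) double loop as MySolution (without the early-exit
    // pruning) and returns both the visible iteration count and the probes
    // done inside the searches.
    public static (long outer, long probes) Count(int[] nums)
    {
        Array.Sort(nums);
        long outer = 0, probes = 0;
        for (int i = 0; i < nums.Length; i++)
            for (int j = i + 1; j < nums.Length - 1; j++)
            {
                outer++;
                Search(nums, j + 1, nums.Length - 1, -(nums[i] + nums[j]), ref probes);
            }
        return (outer, probes);
    }
}
```

For a few thousand elements the probe total comes out several times larger than the outer count, which is the log n factor in O(n^2 * log n).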

P.S.

To argue about time complexity based on runtimes with at least some degree of certainty, you need to perform several tests with increasing numbers of elements (e.g. 100, 1,000, 10,000, 100,000, ...) and see how the runtime changes. Different inputs for the same number of elements are also recommended, because in theory you can hit optimal cases for one algorithm that are worst-case scenarios for another.
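A minimal sketch of that experiment (Stopwatch-based, with a stand-in quadratic workload rather than the full 3SUM solutions, which I use here only to keep the example short) could look like this:

```csharp
using System;
using System.Diagnostics;

public static class ScalingSketch
{
    // Stand-in O(n^2) workload; swap in the real solutions to compare them.
    public static long CountZeroSumPairs(int[] nums)
    {
        long count = 0;
        for (int i = 0; i < nums.Length; i++)
            for (int j = i + 1; j < nums.Length; j++)
                if (nums[i] + nums[j] == 0) count++;
        return count;
    }

    public static void Main()
    {
        var rng = new Random(1);
        foreach (int n in new[] { 1_000, 2_000, 4_000, 8_000 })
        {
            var nums = new int[n];
            for (int i = 0; i < n; i++) nums[i] = rng.Next(-n, n);

            var sw = Stopwatch.StartNew();
            CountZeroSumPairs(nums);
            sw.Stop();

            // For a true O(n^2) routine, doubling n should roughly 4x the time;
            // an O(n^2 * log n) routine grows slightly faster than that.
            Console.WriteLine($"n = {n,6}: {sw.ElapsedMilliseconds} ms");
        }
    }
}
```

Run this in Release mode without a debugger attached; otherwise the timings say more about the debugger than about the algorithm.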

halfer
Guru Stron

Quick interjection: I'm not sure your second (pointers) solution is O(n^2); it has a third inner loop. (See Stron's response below.)

I took a moment to profile your code with a generic .NET profiler:

[Profiler screenshot]

That ought to do it, huh? ;)

After checking the implementation, I found that BinarySearch internally uses CompareTo, which I imagine isn't ideal (although, being generic over an unmanaged type, it shouldn't be that bad...).

To "improve" it, I dragged BinarySearch out, kicking and screaming, and replaced the CompareTo call with plain comparison operators. I named this benchmark MyImproved. Here are the results:

[Flamegraph]

BenchmarkDotNet results:

Interestingly, BenchmarkDotNet disregards common sense and puts MyImproved ahead of Pointers. This may be due to some optimization that the profiler had turned off.

| Method     | Complexity     | Mean     | Error    | StdDev   | Code Size |
|------------|----------------|----------|----------|----------|-----------|
| Pointers   | O(n^2)???      | 76.76 ms | 1.465 ms | 1.628 ms | 1,781 B   |
| My         | O(n^2 * log n) | 93.08 ms | 1.831 ms | 3.980 ms | 1,999 B   |
| MyImproved | O(n^2 * log n) | 62.53 ms | 1.234 ms | 2.226 ms | 1,980 B   |

TL;DR:

.CompareTo() seemed to be bogging down the implementation of .BinarySearch(). Removing it and using direct integer comparisons seemed to help a lot. Either that, or it's some funky interface dispatch that I'm not prepared to investigate :)
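The answer doesn't include the MyImproved source, but the idea it describes (a binary search specialized to int, comparing with < and > directly instead of going through CompareTo) would look roughly like this. The class name and signature here are my guesses, not the answerer's actual code:

```csharp
public static class IntBinarySearch
{
    // Sketch of the "MyImproved" idea: same contract as Array.BinarySearch
    // (the index if found, otherwise the bitwise complement of the insertion
    // point), but with raw int comparisons instead of generic CompareTo calls.
    public static int Search(int[] a, int index, int length, int value)
    {
        int lo = index, hi = index + length - 1;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1); // overflow-safe midpoint
            if (a[mid] == value) return mid;
            if (a[mid] < value) lo = mid + 1;
            else hi = mid - 1;
        }
        return ~lo;
    }
}
```

Because the contract matches, `Array.BinarySearch(nums, j + 1, nums.Length - (j + 1), numK)` in MySolution could be swapped for `IntBinarySearch.Search(nums, j + 1, nums.Length - (j + 1), numK)` without touching the surrounding logic.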

Mooshua
    The third inner loop is actually "the same" as the second, because it moves the left and right indexes toward each other. – Guru Stron Jul 27 '22 at 01:13
  • Thanks for pointing that out--I misread the code a little ;) – Mooshua Jul 27 '22 at 01:18
  • This community is impressive, I think that your "MyImproved()" is right the "3SUM" solution I was looking for. I will study this. Thank you very much for your time @Mooshua. – Bita Jul 27 '22 at 01:19

Two tips:

  1. Use sharplab.io to see your lowered code; it may reveal something (link)

  2. Try running these separate tests through the BenchmarkDotNet NuGet package; it'll give you more accurate timings, and if the memory usage or allocations are considerably higher in one case, that could be your answer.

Anyway, are you running these tests in Debug or Release mode? I just had a thought that I haven't tested recently, but I believe the debugger overhead can significantly affect the performance of a binary search.

Give it a go and let me know.

K Scandrett
Andy Wynn
  • The optimal solution is known to be O(n^2); the question is why that one is optimal. But thanks for the tips, I will follow them. – Bita Jul 26 '22 at 23:53
  • @Bita if you have only tried this in debug, try running it in Release mode with a stopwatch and console output. You may see the performance improvement you were expecting. – Andy Wynn Jul 27 '22 at 00:14
  • Using BenchmarkDotNet in Release mode, without debugging, widens the performance difference between O(n^2 * log n) and O(n^2), the first being up to 5 times slower than the second. Thank you for the tip; this is the right way to measure code. Regarding the question in the post, the answer may be in the `.CompareTo()` function used by `.BinarySearch()`, as @Mooshua pointed out. – Bita Jul 29 '22 at 11:48