
I was looking at this question where the OP wanted to know how to compare items in two arrays without looping through each array.

The command given was:

$array3 = @(Compare-Object $array1 $array2 | select -Expand InputObject)

My question is two-fold:

One, does this actually avoid iterating through the arrays in any form? Or does it simply hide the iteration from the user by doing it behind the scenes?

Two, as far as performance goes, is this the best method for comparing objects? It appears to me that it is actually significantly slower.

I made a really crude test:

$Array1 = @("1","2","Orchid","Envy","Sam","Map Of the World","Short String","s","V","DM","qwerty","1234567891011")
$Array2 = @("Bob", "Helmet", "Jane")

$Date1 = Get-Date
$Array2 | ForEach-Object `
    {
    if ($Array1 -contains $_){}

    }
$Date2 = Get-Date
$Time1 = [TimeSpan]$Date2.Subtract($Date1)
Write-Host $Time1

$Date1 = Get-Date
$Array3 = @(Compare-Object $Array1 $Array2)
$Date2 = Get-Date
$Time2 = [TimeSpan]$Date2.Subtract($Date1)
Write-Host $Time2

And my times came out:

ForEach-Object: 00:00:00.0030001

Compare-Object: 00:00:00.0030002
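
A single run timed with Get-Date is dominated by clock resolution at this scale, so a steadier measurement may help. The following is only a sketch, assuming the same two arrays as above: it repeats each approach inside Measure-Command and also shows a Stopwatch around a single call. The `$Iterations` value and the `$LoopTime` / `$CompareTime` / `$Stopwatch` names are illustrative choices, not part of the original test.

```powershell
# Sketch: repeat each approach many times so timer resolution matters less.
# $Iterations is an arbitrary, hypothetical repeat count.
$Array1 = @("1","2","Orchid","Envy","Sam","Map Of the World","Short String","s","V","DM","qwerty","1234567891011")
$Array2 = @("Bob", "Helmet", "Jane")
$Iterations = 1000

# Approach 1: pipeline loop with -contains, discarding the boolean result.
$LoopTime = Measure-Command {
    for ($i = 0; $i -lt $Iterations; $i++) {
        $Array2 | ForEach-Object { $null = $Array1 -contains $_ }
    }
}

# Approach 2: Compare-Object, which also builds a result object per difference.
$CompareTime = Measure-Command {
    for ($i = 0; $i -lt $Iterations; $i++) {
        $null = @(Compare-Object $Array1 $Array2)
    }
}

# Alternative: time one region with a Stopwatch instead of Get-Date.
$Stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
$Array3 = @(Compare-Object $Array1 $Array2)
$Stopwatch.Stop()

Write-Host ("ForEach-Object: {0:N3} ms for {1} runs" -f $LoopTime.TotalMilliseconds, $Iterations)
Write-Host ("Compare-Object: {0:N3} ms for {1} runs" -f $CompareTime.TotalMilliseconds, $Iterations)
Write-Host ("Single Compare-Object call: {0:N3} ms" -f $Stopwatch.Elapsed.TotalMilliseconds)
```

The two approaches still do different amounts of work (the loop throws its result away, while Compare-Object constructs output objects), so the numbers are indicative at best.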


Edit

I updated the script to make it more fair, and it essentially evened out the times.

So what is the behind-the-scenes difference between Compare-Object and a traditional loop? Am I correct in assuming there is none?


Edit 2

I found this code using the decompiler:

internal int Compare(ObjectCommandPropertyValue first, ObjectCommandPropertyValue second)
    {
      if (first.IsExistingProperty && second.IsExistingProperty)
        return this.Compare(first.PropertyValue, second.PropertyValue);
      if (first.IsExistingProperty)
        return -1;
      return second.IsExistingProperty ? 1 : 0;
    }

    public int Compare(object first, object second)
    {
      if (ObjectCommandComparer.IsValueNull(first) && ObjectCommandComparer.IsValueNull(second))
        return 0;
      PSObject psObject1 = first as PSObject;
      if (psObject1 != null)
        first = psObject1.BaseObject;
      PSObject psObject2 = second as PSObject;
      if (psObject2 != null)
        second = psObject2.BaseObject;
      try
      {
        return LanguagePrimitives.Compare(first, second, !this.caseSensitive, (IFormatProvider) this.cultureInfo) * (this.ascendingOrder ? 1 : -1);
      }
      catch (InvalidCastException ex)
      {
      }
      catch (ArgumentException ex)
      {
      }
      return string.Compare(((object) PSObject.AsPSObject(first)).ToString(), ((object) PSObject.AsPSObject(second)).ToString(), !this.caseSensitive, this.cultureInfo) * (this.ascendingOrder ? 1 : -1);
    }

I have traced it around as best I can, and I believe these are the two worker methods. It appears Compare-Object actually only does a 1 <==> 1 check down the list. Am I missing something here?
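
One way to probe the 1 <==> 1 question from the outside, without the decompiler, is to feed Compare-Object two arrays that hold the same items in a different order. This is a small sketch of observable behavior only, not a statement about the internals; the `$Left`, `$Right`, `$Default` and `$Positional` names are made up for the example. The `-SyncWindow` parameter is the documented one.

```powershell
# Same items, different order.
$Left  = 1, 2, 3
$Right = 3, 2, 1

# Default SyncWindow: items can be matched even when they sit at different positions.
$Default = @(Compare-Object $Left $Right)

# SyncWindow 0: only items at the same position are considered a match.
$Positional = @(Compare-Object $Left $Right -SyncWindow 0)

Write-Host ("Differences with default SyncWindow: {0}" -f $Default.Count)
Write-Host ("Differences with -SyncWindow 0:      {0}" -f $Positional.Count)
$Positional | Format-Table InputObject, SideIndicator
```

With the default setting the reordered arrays report no differences, whereas `-SyncWindow 0` reports the positional mismatches. That suggests the cmdlet does more than a strict element-by-element pass unless it is told to.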

Austin T French
  • You're cheating with `ForEach-Object`. To be fair you'll also need to construct a result comparison object like `Compare-Object` does. – Andy Arismendi May 22 '13 at 04:59
  • I thought about that last night, and removed *| select -Expand InputObject* and now the Compare-Object is faster... – Austin T French May 22 '13 at 11:20
  • I asked a similar [question](http://stackoverflow.com/questions/15822046/wildcard-matching-in-get-childitem-in-powershell) a while back: how to check code of specific cmdlets. Apparently, no one knows it. You could try http://www.jetbrains.com/decompiler/ (I never used it), or you could simply put a breakpoint with `Set-PSBreakpoint .\your-script.ps1 -Line x` and then monitor the executed code and try to reconstruct the code. Please post back when you know more. – Davor Josipovic May 22 '13 at 13:11
  • @davor is right, the inner workings of `Compare-Object` are not available in public documents, as with pretty much all MS cmdlets. You'll have to de-compile the DLL hosting the Compare-Object code and dig through it. In addition to jetbrains, Telerik has [one](http://www.telerik.com/products/decompiler.aspx) and you can also look at [ILSpy](http://ilspy.net/). – Andy Arismendi May 22 '13 at 13:35
  • @AndyArismendi, Davor, Updated. It looks like it is not the same but I want a second opinion – Austin T French May 22 '13 at 20:11
  • Does order matter? If so, keep in mind Compare-Object has a SyncWindow param. Set it to 0 if the order matters. Should speed it up a bit. Compare-Object is pretty clunky in general. I only use it for pass/fail results and really need a more robust version of it. Write us a better one would ya? Maybe one that accepts a comparer? For decompiling I use dotPeek, and it has been invaluable. Learned more from it than I ever will from Technet. Also, Get-Date is a terrible way to benchmark. Use Measure-Command or [System.Diagnostics.Stopwatch]. – skataben Apr 16 '14 at 07:01
  • @skataben yes, Stopwatch is certainly better. This question being almost a year old though I have no clue why I did not use it! And writing a more robust version of it would not be hard as they are written in C#... – Austin T French Apr 16 '14 at 12:57
  • lol, I must have *really* been procrastinating last night (tax day) not to notice when this was posted. – skataben Apr 17 '14 at 00:33

0 Answers