I have a function that converts a PSObject into a hashtable. The function works well, but there's a subtlety that I'm trying to understand and can't quite wrap my head around.

I'm using PowerShell Core 7.0.3

The function:

function Convert-PSObjectToHashtable
{
    param (
        [Parameter(ValueFromPipeline)]
        $InputObject
    )

    process
    {
        if ($null -eq $InputObject) { return $null }

        if ($InputObject -is [System.Collections.IEnumerable] -and $InputObject -isnot [string])
        {
            $collection = @(
                foreach ($object in $InputObject) { Convert-PSObjectToHashtable $object }
            )

            # buggy
            #Write-Output -NoEnumerate $collection
            
            # correct
            $collection
        }
        elseif ($InputObject -is [psobject])
        {
            $hash = @{}

            foreach ($property in $InputObject.PSObject.Properties)
            {
                $hash[$property.Name] = Convert-PSObjectToHashtable $property.Value
            }

            $hash
        }
        else
        {
            $InputObject
        }
    }
}

I execute the following code:

$obj = "{level1: ['e','f']}"
$x = $obj | ConvertFrom-Json | Convert-PSObjectToHashtable
[Newtonsoft.Json.JsonConvert]::SerializeObject($x)

The "buggy" code returns:

{"level1":{"CliXml":"<Objs Version=\"1.1.0.1\" xmlns=\"http://schemas.microsoft.com/powershell/2004/04\">\r\n  <Obj RefId=\"0\">\r\n    <TN RefId=\"0\">\r\n      <T>System.Object[]</T>\r\n      <T>System.Array</T>\r\n      <T>System.Object</T>\r\n    </TN>\r\n    <LST>\r\n      <S>e</S>\r\n      <S>f</S>\r\n    </LST>\r\n  </Obj>\r\n</Objs>"}}

The correct code returns:

{"level1":["e","f"]}

Why doesn't the buggy code work when, from inside PowerShell, the two results look equivalent?

Thank you!

DOMZE
  • The difference is `-noenumerate`. Using a variable as output unrolls the array. So each array element is sent to the pipeline. When `-noenumerate` is used, the array as a single object is sent to the pipeline. If you remove `-noenumerate`, then `write-output` will work the same as the variable. – AdminOfThings Aug 25 '20 at 15:57
  • @AdminOfThings I actually just tried removing `-NoEnumerate` and leaving `Write-Output $collection` and I get the same "buggy" output. – DOMZE Aug 25 '20 at 18:08
  • As an aside: To test if a given object is a _custom_ object ("property bag"), you must use `-is [System.Management.Automation.PSCustomObject]` - `-is [psobject]` does _not_ work reliably, and neither does `-is [pscustomobject]`, because - confusingly - `[psobject]` and `[pscustomobject]` both refer to `[System.Management.Automation.PSObject]`, and incidentally `PSObject`-wrapped objects of any type, as explained in Mathias' answer, therefore also report `$true` for `-is [psobject]` - see https://github.com/PowerShell/PowerShell/issues/11921 – mklement0 Aug 26 '20 at 05:25
  • @AdminOfThings: In this particular case - because the command output is assigned to a property and `$collection` is `[object[]]`-typed - `$collection` (letting the engine enumerate and collect in a new `[object[]]` array), `Write-Output $collection` (ditto) and `Write-Output -NoEnumerate $collection` (output the array as a whole) all happen to have the same effect - _except_ that use of a cmdlet, such as `Write-Output` here, creates normally invisible `[psobject]` wrappers, which happen to surface in the `[Newtonsoft.Json.JsonConvert]::SerializeObject()` call, as explained in Mathias' answer. – mklement0 Aug 26 '20 at 18:47

1 Answer

This occurs because PowerShell loves to wrap things in PSObjects.

The "plumbing" through which Write-Output (and all other binary cmdlets) emit standard output is implemented in a way that forces this explicit wrapping of the input objects in a PSObject.

So from a PowerShell user's perspective, these two variables have identical values:

$a = 1..3 |Write-Output
$b = 1..3

By any reasonable indication, both variables hold an array containing the integers 1,2,3:

PS ~> $a.GetType().Name
Object[]
PS ~> $b.GetType().Name
Object[]
PS ~> $a[0] -is [int]
True
PS ~> $a[0] -eq $b[0]
True

Behind the scenes though, the object hierarchy actually looks like this:

$a = 1..3 |Write-Output
# Behaves like: @(1,2,3)
# Is actually:  @([psobject]::new(1),[psobject]::new(2),[psobject]::new(3))

$b = 1..3
# Behaves like: @(1,2,3)
# Is actually:  @(1,2,3)

You might think this would pose a problem, but PowerShell goes to great lengths to keep this wrapper layer completely hidden from the user. When the runtime subsequently evaluates a statement like $a[1] and finds a PSObject wrapper, it transparently returns the base value (e.g. 2) as if it were the actual value referenced by the underlying array.

But [JsonConvert]::SerializeObject() isn't written in PowerShell, and when it starts traversing the object hierarchy outside the confines of the PowerShell language engine, it encounters the wrapping PSObject instances and picks its default serialization format (CliXml) instead of what should have otherwise been treated as native JSON types.

The expression $collection on the other hand is not a binary cmdlet and there are no downstream pipeline consumers, so its value is enumerated and written directly to the output stream, bypassing the PSObject wrapping/boxing step. The resulting array therefore references the output values directly instead of their respective PSObject wrappers, and the serialization works as expected again.
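
As a rough illustration of the difference described above, the following sketch compares an array emitted through a cmdlet with the same array produced by a plain expression. The variable names are made up for this example, and the `-is [psobject]` test is the one suggested in the comments for checking a single object or a collection itself:

```powershell
# Hypothetical variable names; a sketch, not part of the original function.
$viaCmdlet     = Write-Output -NoEnumerate @('e', 'f')   # array emitted through a cmdlet
$viaExpression = @('e', 'f')                             # plain expression, no cmdlet involved

# The wrapper only shows up when you ask for it explicitly.
$viaCmdlet -is [psobject]       # the array came through Write-Output
$viaExpression -is [psobject]   # the array is referenced directly

# From inside PowerShell both still report the same type.
$viaCmdlet.GetType().Name
$viaExpression.GetType().Name
```

Both values behave identically in ordinary PowerShell code; only code outside the language engine, such as a .NET serializer, sees the difference.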


You can unwrap objects by referencing the ImmediateBaseObject property on the hidden psobject memberset:

$a = 1,2 |Write-Output
# Actual: @([psobject]::new(1),[psobject]::new(2))

$a = $a |ForEach-Object { $_.psobject.ImmediateBaseObject }
# Actual: @(1,2)
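
The same idea can be applied to a collection that was wrapped as a whole, which is the situation in the question. The sketch below uses `BaseObject` rather than `ImmediateBaseObject`, following the suggestion in the comments; the variable names are assumptions made for illustration:

```powershell
# Hypothetical variable names; a sketch of unwrapping a wrapped array.
$wrapped = Write-Output -NoEnumerate @('e', 'f')   # array wrapped as a single object

# Reach through the hidden psobject member to the underlying array.
$raw = $wrapped.psobject.BaseObject

# Store the raw array so that non-PowerShell code sees a plain Object[].
$hash = @{ level1 = $raw }

$raw -is [psobject]   # check whether the wrapper is gone
$raw.Count
```

Note that, as pointed out in the comments, `BaseObject` yields `$null` for `[pscustomobject]` instances, so this kind of unwrapping should be limited to values that are not custom objects.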

Beware that wrapping re-occurs every time an object goes through |:

$a = 1,2
# Actual: @(1,2)

$a = $a |ForEach-Object { $_ }
# Actual: @([psobject]::new(1),[psobject]::new(2))

If you wonder whether an expression returns PSObject-wrapped objects from within PowerShell, pass the output to Type.GetTypeArray():

PS ~> [type]::GetTypeArray(@(1..3|Write-Output)).Name
PSObject
PSObject
PSObject
PS ~> [type]::GetTypeArray(@(1..3)).Name
Int32
Int32
Int32
Mathias R. Jessen
  • Nicely done; the relevant GitHub issue with a list of real-world ramifications is at https://github.com/PowerShell/PowerShell/issues/5579 (I've just added the case at hand to it). To test a single object or a collection itself, you can use `-is [psobject]`. Any reason you're recommending `.ImmediateBaseObject` over `.BaseObject`? Note that applying either to a `[pscustomobject]` instance yields `$null`. – mklement0 Aug 26 '20 at 05:20
  • @mklement0 Thanks for the link, was having trouble finding it last night! I'm using `ImmediateBaseObject` because I was trying to obtain a reference to _the immediate base object_ ^_^ but you have a point, using `BaseObject` is probably a safer lesson to impart on future readers. – Mathias R. Jessen Aug 26 '20 at 17:39
  • wow! amazing! thank you, really well written and explained – DOMZE Aug 27 '20 at 01:28