
It occurred to me today that, after so many years of habitually passing -NoTypeInformation to Export-Csv/ConvertTo-Csv to prevent that undesirable comment line from being emitted, perhaps Import-Csv/ConvertFrom-Csv would be able to reconstruct objects with their original property types (instead of all of them being String) if only I hadn't suppressed that type information. I gave it a try...

PS> Get-Service | ConvertTo-Csv

...and, having not actually seen in a long time what gets emitted by omitting -NoTypeInformation, was reminded that it only includes the type of the input objects (just the first object, in fact), not the types of the members...

#TYPE System.ServiceProcess.ServiceController
"Name","RequiredServices","CanPauseAndContinue","CanShutdown","CanStop","DisplayName","DependentServices","MachineName","ServiceName","ServicesDependedOn","ServiceHandle","Status","ServiceType","StartType","Site","Container"
...

Comparing the result of serializing with type information and then deserializing...

PS> Get-Service | ConvertTo-Csv | ConvertFrom-Csv | Get-Member


   TypeName: CSV:System.ServiceProcess.ServiceController

Name                MemberType   Definition
----                ----------   ----------
Equals              Method       bool Equals(System.Object obj)
GetHashCode         Method       int GetHashCode()
GetType             Method       type GetType()
ToString            Method       string ToString()
CanPauseAndContinue NoteProperty string CanPauseAndContinue=False
CanShutdown         NoteProperty string CanShutdown=False
CanStop             NoteProperty string CanStop=True
Container           NoteProperty object Container=null
DependentServices   NoteProperty string DependentServices=System.ServiceProcess.ServiceController[]
DisplayName         NoteProperty string DisplayName=Adobe Acrobat Update Service
MachineName         NoteProperty string MachineName=.
Name                NoteProperty string Name=AdobeARMservice
RequiredServices    NoteProperty string RequiredServices=System.ServiceProcess.ServiceController[]
ServiceHandle       NoteProperty string ServiceHandle=
ServiceName         NoteProperty string ServiceName=AdobeARMservice
ServicesDependedOn  NoteProperty string ServicesDependedOn=System.ServiceProcess.ServiceController[]
ServiceType         NoteProperty string ServiceType=Win32OwnProcess
Site                NoteProperty string Site=
StartType           NoteProperty string StartType=Automatic
Status              NoteProperty string Status=Running

...to the result of serializing without type information and then deserializing...

PS> Get-Service | ConvertTo-Csv -NoTypeInformation | ConvertFrom-Csv | Get-Member


   TypeName: System.Management.Automation.PSCustomObject

Name                MemberType   Definition
----                ----------   ----------
Equals              Method       bool Equals(System.Object obj)
GetHashCode         Method       int GetHashCode()
GetType             Method       type GetType()
ToString            Method       string ToString()
CanPauseAndContinue NoteProperty string CanPauseAndContinue=False
CanShutdown         NoteProperty string CanShutdown=False
CanStop             NoteProperty string CanStop=True
Container           NoteProperty object Container=null
DependentServices   NoteProperty string DependentServices=System.ServiceProcess.ServiceController[]
DisplayName         NoteProperty string DisplayName=Adobe Acrobat Update Service
MachineName         NoteProperty string MachineName=.
Name                NoteProperty string Name=AdobeARMservice
RequiredServices    NoteProperty string RequiredServices=System.ServiceProcess.ServiceController[]
ServiceHandle       NoteProperty string ServiceHandle=
ServiceName         NoteProperty string ServiceName=AdobeARMservice
ServicesDependedOn  NoteProperty string ServicesDependedOn=System.ServiceProcess.ServiceController[]
ServiceType         NoteProperty string ServiceType=Win32OwnProcess
Site                NoteProperty string Site=
StartType           NoteProperty string StartType=Automatic
Status              NoteProperty string Status=Running

...the only difference is that the TypeName changes from CSV:System.ServiceProcess.ServiceController (the same type specified by the #TYPE comment prefixed with CSV:) to System.Management.Automation.PSCustomObject. All the members are the same, all the properties are of type String, and in both cases you have deserialized objects that do not contain the methods of and are not in any way connected to or proxies of the original objects.
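The comparison can be reproduced without any services. This is a minimal sketch: the CSV lines are hand-written sample data (the `Spooler` row is made up for illustration), so it does not depend on whether the local `ConvertTo-Csv` emits the `#TYPE` line by default.

```powershell
# Hand-written sample CSV; the row values are illustrative, not real output.
$csvLines = @(
    '#TYPE System.ServiceProcess.ServiceController'
    '"Name","Status","CanStop"'
    '"Spooler","Running","True"'
)

# Same data, with and without the leading #TYPE line.
$withType    = $csvLines | ConvertFrom-Csv
$withoutType = $csvLines | Select-Object -Skip 1 | ConvertFrom-Csv

$withType.PSObject.TypeNames[0]      # CSV:System.ServiceProcess.ServiceController
$withoutType.PSObject.TypeNames[0]   # System.Management.Automation.PSCustomObject

# Both objects carry only string properties; nothing is converted back to bool or an enum.
$withType.CanStop.GetType().FullName      # System.String
$withoutType.CanStop.GetType().FullName   # System.String
```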

Evidently Microsoft felt that not only could it be desirable to include this type information in the CSV output, but that it should be done by default. Since I can't really ask "Why did they do this?", what I'm wondering is how could this information potentially be useful? Do the *-Csv serialization cmdlets use it for anything other than setting the TypeName of each object? Why would I ever want this type information communicated "in-band" in the CSV output as opposed to just...knowing that this file containing service information came from System.ServiceProcess.ServiceController instances, or even just not caring what the original type was as long as it has the properties I expect?

The only two use cases I can think of are these: if you receive a CSV file created by an unknown PowerShell script, the type information can aid in determining what application/library/module was used to produce the data; or you might have a ridiculously general script that attempts to refresh arbitrary input based on the type information, like this...

Import-Csv ... `
    | ForEach-Object -Process {
        # Original type name of the first record, not necessarily this record!
        $firstTypeName = $_.PSObject.TypeNames[0];

        if ($firstTypeName -eq 'CSV:System.ServiceProcess.ServiceController')
        {
            Get-Service ...
        }
        elseif ('CSV:System.IO.DirectoryInfo', 'CSV:System.IO.FileInfo' -contains $firstTypeName)
        {
            Get-ChildItem ...
        }
        elseif ($firstTypeName -eq 'CSV:Microsoft.ActiveDirectory.Management.ADObject')
        {
            Get-ADObject ...
        }
        ...
    }

...but those aren't very compelling examples, especially considering that, as I noted, this type information was deemed so important that it is included by default. Are there other use cases I'm not thinking of? Or is this simply a case of "It's better (and cheap) to include it and not need it than to need it and not include it"?
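The caveat in the code comment above (the type name comes from the first record, not necessarily the current one) can be checked with a small sketch. The two sample values are arbitrary, and the `$typeSwitch` splat is only a version shim I'm assuming is needed: PowerShell 6+ requires `-IncludeTypeInformation` to get the `#TYPE` line, while Windows PowerShell emits it by default and has no such switch.

```powershell
# Pass -IncludeTypeInformation only where the parameter exists (PowerShell 6+).
$typeSwitch = @{}
if ((Get-Command ConvertTo-Csv).Parameters.ContainsKey('IncludeTypeInformation')) {
    $typeSwitch['IncludeTypeInformation'] = $true
}

# Two unrelated types in one pipeline (sample values chosen for illustration).
$mixed = [version]'1.2.3.4', [uri]'https://example.com/'

$records = $mixed | ConvertTo-Csv @typeSwitch | ConvertFrom-Csv

# Each record reports the first input object's type, including the row that came from the [uri].
$records | ForEach-Object { $_.PSObject.TypeNames[0] }
```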

Related: PowerShell GitHub issue with discussion about making -NoTypeInformation the default in a then-future version (6.0, according to the cmdlet documentation), as well as an apparent consensus that there's no point in ever omitting -NoTypeInformation.

Lance U. Matthews
  • Yeah... weird design decisions from 10 years ago. I suspect in 6.1 the switch will be off by default (considering it was always enabled in the current landscape and you only get cursing sysadmins exporting csvs when they forgot) – Maximilian Burszley Aug 08 '18 at 00:23

1 Answer


If you have a thing which understands the header and also knows how to construct another thing of that type, then that first thing could conceivably create one or more of those other things from the string representation. I'm not saying it's useful, but I am saying that is a potential use if a .csv might be usable where a file of another type might not. I may or may not have actually done this for reasons similar to what I mention here in answer to your question.
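As a rough sketch of what such a "thing" might look like: `ConvertTo-OriginalType` below is a hypothetical helper, not a built-in cmdlet, and it assumes the original type has a parameterless constructor and settable properties. `System.UriBuilder` and the sample row are chosen only because they satisfy those assumptions.

```powershell
# Hypothetical helper (not a built-in cmdlet): rebuilds typed objects from CSV rows.
# Assumes the original type has a parameterless constructor and settable properties.
function ConvertTo-OriginalType {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [psobject] $Row
    )
    process {
        # Import-Csv/ConvertFrom-Csv prefix the #TYPE name with 'CSV:'.
        $typeName = $Row.PSObject.TypeNames[0] -replace '^CSV:', ''
        $instance = New-Object -TypeName $typeName

        foreach ($column in $Row.PSObject.Properties) {
            $target = $instance.PSObject.Properties[$column.Name]
            if ($null -ne $target -and $target.IsSettable) {
                # PowerShell converts the string to the property's declared type on assignment.
                $instance.($column.Name) = $column.Value
            }
        }
        $instance
    }
}

# Hand-written sample CSV for illustration.
$csvLines = @(
    '#TYPE System.UriBuilder'
    '"Scheme","Host","Port","Path"'
    '"https","example.com","8443","/api"'
)
$rebuilt = $csvLines | ConvertFrom-Csv | ConvertTo-OriginalType

$rebuilt.GetType().FullName        # System.UriBuilder
$rebuilt.Port.GetType().FullName   # System.Int32
```

This would not work for something like `ServiceController`, whose properties are mostly read-only, so a real implementation would need type-specific construction logic.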

No Refunds No Returns
  • So you're saying that even though `ConvertFrom-Csv`/`Import-Csv` either can't or simply don't rehydrate the CSV data into instances of the original type, you might have a class/script/"thing" that _can_ and uses the CSV-derived data to initialize the properties of the real objects? So the "thing" is able to construct objects with both the type and properties being specified entirely by the CSV data, not hard-coded. I could see that; perhaps a directory of `.csv` files representing the various type-specific collections that make up an application's data? – Lance U. Matthews Aug 08 '18 at 02:55
  • Or a collection of objects all of the same type. It has the potential to be less verbose than either XML or JSON since you have the column ... er... property names listed only once and then all of the values in a nice orderly row. This falls down a bit if you have lots of optional properties that appear as strings of commas and falls over if you have wildly different types on each row. – No Refunds No Returns Aug 08 '18 at 20:54
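The verbosity point can be illustrated with a small sketch, using made-up sample objects: the property name appears once in the CSV header but once per object in the JSON.

```powershell
# Three sample objects of the same shape (illustrative data).
$items = 1..3 | ForEach-Object { [pscustomobject]@{ Id = $_; Name = "item$_" } }

$csv  = ($items | ConvertTo-Csv -NoTypeInformation) -join "`n"
$json = $items | ConvertTo-Json -Compress

# Count case-sensitive occurrences of the property name 'Name' in each format.
$csvCount  = [regex]::Matches($csv, 'Name').Count    # 1 (header only)
$jsonCount = [regex]::Matches($json, 'Name').Count   # 3 (once per object)
```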