2

I am having a helluva time trying to understand why this script is not working as intended. It is a simple script in which I am attempting to import a CSV, select a few columns that I want, then export the CSV and copy over itself. (Basically we have archived data that I only need a few columns from for another project due to memory size constraints). This script is very simple, which apparently has an inverse relationship with how much frustration it causes when it doesn't work... Right now the end result is I end up with an empty csv instead of a csv containing only the columns I selected with Select-Object.

$RootPath = "D:\SomeFolder"

$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv | 
    ForEach-Object{
        Import-CSV $_ |
        Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, Data_Avg_ms, Data_StdDev | 
        Export-Csv $_.FullName -NoType -Force
}
Clayton Lewis
  • 394
  • 2
  • 16

3 Answers3

5

Unless you read the input file into memory in full, up front, you cannot safely read from and write back to the same file in a given pipeline.

Specifically, a command such as Import-Csv file.csv | ... | Export-Csv file.csv will erase the content of file.csv.[1]

The simplest solution is to enclose the command that reads the input file in (...), but note that:

  • The file's content (transformed into objects) must fit into memory as a whole.

  • There is a slight risk of data loss if the pipeline is interrupted before all (transformed) objects have been written back to the file.

Applied to your command:

$RootPath = "D:\SomeFolder"

Get-ChildItem $RootPath -Recurse -Include *.csv -OutVariable csvFiles | 
    ForEach-Object{
        (Import-CSV $_.FullName) | # NOTE THE (...)
          Select-Object Test_Name, Test_DataName, Device_Model, Device_FW, 
                        Data_Avg_ms, Data_StdDev | 
            Export-Csv $_.FullName -NoType -Force
}

Note that I've used -OutVariable csvFiles in order to collect the CSV file-info objects in output variable $csvFiles. Your attempt to collect the file paths via $csvFilePaths = ... doesn't work, because it attempts to collects Export-Csv's output, but Export-Csv produces no output.

Also, to be safe, I've changed the Import-Csv argument from $_ to $_.FullName to ensure that Import-Csv finds the input file (because, regrettably, file-info object $_ is bound as a string, which sometimes expands to the mere file name).


A safer solution would be to output to a temporary file first, and (only) on successful completion replace the original file.

With either approach, the replacement file will have default file attributes and permissions; if the original file had special attributes and/or permissions that you want to preserve, you must recreate them explicitly.


[1] Note that in other contexts (on Windows) you may simply get an error that doesn't result in the input file's content getting erased; e.g., the following pipeline: Get-Content file.txt | ForEach-Object { "[$_]" } | Set-Content file.txt reports error The process cannot access the file '...\file.txt' because it is being used by another process, without erasing the content of file.txt

mklement0
  • 382,024
  • 64
  • 607
  • 775
  • 1
    Thanks for this! I just needed a quick and dirt method to reduce the sizes of these files and I am not modifying the actual archives, but a copy of them I made for this project so I am not overly concerned about error proofing or accidental data loss, as the data is archived and easily retrievable. I went with a more straightforward method of just declaring a variable in my for loop that performs the Import-Csv function, but your solution is definitely more elegant. One of those situations where when the answer is presented, all I can say is "duh." to myself... – user3749224 Jul 11 '18 at 17:56
1

As Matt commented, your last $PSItem ($_) not related to the Get-ChildItem anymore but for the Select-Object cmdlet which don't have a FullName Property

You can use differnt foreach approach:

$RootPath = "D:\SomeFolder"
$csvFilePaths = Get-ChildItem $RootPath -Recurse -Include *.csv

foreach ($csv in $csvFilePaths)
{
Import-CSV $csv.FullName |
Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev | 
Export-Csv $csv.FullName -NoType -Force
}

Or keeping your code, add $CsvPath Variable containing the csv path and use it later on:

$RootPath = "D:\SomeFolder"
Get-ChildItem $RootPath -Recurse -Include *.csv | ForEach-Object{
$CsvPath = $_.FullName
Import-CSV $CsvPath |
Select-Object Test_Name,Test_DataName,Device_Model,Device_FW,Data_Avg_ms,Data_StdDev | 
Export-Csv $CsvPath -NoType -Force
}
Avshalom
  • 8,657
  • 1
  • 25
  • 43
  • 1
    Thank you all for helping out, unfortunately none of these suggestions work. Whenever I export the CSV it is exported as an empty CSV. What is funny is I started with the more straightforward ForEach loop then moved to piping into the ForEach_Object cmdlet because the ForEach method was producing the same error. This is making me want to pull my hair out... – user3749224 Jul 10 '18 at 23:12
  • No, `$_` still refers to the `Get-ChildItem` input object, because it it isn't used in a script block. The real problem is reading and writing the same file in a single pipeline (without reading the file in full first). – mklement0 Jul 11 '18 at 00:57
0

So I have figured it out. I was attempting to pipe through the Import-Csv cmdlet directly instead of declaring it as a variable in the o.g. code. Here is the code snippet that gets what I wanted to get done, done. I was trying to pipe in the Import-Csv cmdlet directly before, I simply had to declare a variable that uses the Import-Csv cmdlet as its definition and pipe that variable through to Select-Object then Export-Csv cmdlets. Thank you all for your assistance, I appreciate it!

$RootPath = "\someDirectory\"
$CsvFilePaths = @(Get-ChildItem $RootPath -Recurse -Include *.csv)
$ColumnsWanted = @('Test_Name','Test_DataName','Device_Model','Device_FW','Data_Avg_ms','Data_StdDev')
for($i=0;$i -lt $CsvFilePaths.Length; $i++){
  $csvPath = $CsvFilePaths[$i]
  Write-Host $csvPath
  $importedCsv = Import-CSV $csvPath 
  $importedCsv | Select-Object $ColumnsWanted | Export-CSV $csvPath -NoTypeInformation
}