Background
I've been looking through several posts here on Stack and can only find answers to "how to add one single row of data to a CSV file" (notably this one). While they are good, they only refer to the specific case of adding a single entry from memory. Suppose I have 100,000 rows I want to add to a CSV file, then the speed of the query will be orders of magnitude slower if I for each row write it to file. I imagine that it will be much faster to keep everything in memory, and once I've built a variable that contains all the data that I want to add, only then write it to file.
Current situation
I have log files that I receive from customers containing about half a million rows. Some of these rows begin with a datetime and how much memory the server is using. In order to get a better view of how the memory usage looks like, I want to plot the memory usage over time using this information. (Note: yes, the best solution would be to ask the developers to add this information as it is fairly common we need this, but since we don't have that yet, I need to work with what I got)
I am able to read the log files, extract the contents, create two variables called $timeStamp and $memoryUsage that finds all the relevant entries. The problem occurs when I occurs when I try to add this to a custom PSObject
. It would seem that using a $csvObject += $newRow
only adds a pointer to the $newRow
variable rather than the actual row itself. Here's the code that I've got so far:
$header1 = "Time Stamp"
$header2 = "Memory Usage"
$csvHeaders = @"
$header1;$header2
"@
# The following two lines are a workaround to make sure that the $csvObject becomes a PSObject that matches the output I'm trying to achieve.
$csvHeaders | Out-File -FilePath $csvFullPath
$csvObject = Import-Csv -Path $csvFullPath -Delimiter ";"
foreach ($TraceFile in $traceFilesToLookAt) {
$curTraceFile = Get-Content $TraceFile.FullName
Write-Host "Starting on file: $($TraceFile.Name)`n"
foreach ($line in $curTraceFile) {
try {
if (($line.Substring(4,1) -eq '-') -and ($line.Substring(7,1) -eq '-')) {
$TimeStamp = $line.Split("|",4)[0]
$memoryUsage = $($line.Split("|",4)[2]).Replace(",","")
$newRow = New-Object PSObject -Property @{
$header1 = $TimeStamp;
$header2 = $memoryUsage
}
$reorderedRow = $newRow | Select-Object -Property $header1,$header2
$reorderedRow | Export-Csv -Path $csvFullPath -Append -Delimiter ";"
}
} catch {
Out-Null
}
This works fine as it appends the row each time it finds one to the CSV file. The problem is that it's not very efficient.
End goal
I would ideally like to solve it with something like:
$newRow = New-Object PSObject -Property @{
$header1 = $TimeStamp;
$header2 = $memoryUsage
}
$rowsToAddToCSV += $newRow
And then in the final step do a:
$rowsToAddToCSV | Export-Csv -Path $csvFullPath -Append -Delimiter ";"
I have not been able to create any form of workaround for this. Among other things, PowerShell tells me that op_Addition is not part of the object, that the object I'm trying to export (the collection of rows) doesn't match the CSV file etc.