I have a small process for ingesting a .xel file, converting it to custom objects with the dbatools module, turning those into single-line JSON, and exporting the result to a file that gets sent off to wherever it goes. Here it is:
$path = 'C:\temp\big_xe_file.xel'

# Read the extended events file into objects
$xes = Read-DbaXEFile -Path $path

foreach ($xe in $xes) {
    # Format the date column as ISO 8601 (HH = 24-hour clock)
    $xe.timestamp = ($xe.timestamp.DateTime).ToString("yyyy-MM-ddTHH:mm:ss.ffff")

    # Convert to compressed (single-line) JSON, then turn the \uXXXX escape
    # sequences that ConvertTo-Json produces back into literal characters
    $xe = [Regex]::Replace(
        ($xe | ConvertTo-Json -Compress),
        '\\u(?<Value>[0-9a-fA-F]{4})',
        { param($m)
          ([char]([int]::Parse($m.Groups['Value'].Value,
              [System.Globalization.NumberStyles]::HexNumber))).ToString() }
    )

    # Append the JSON line to the output file
    Add-Content -Value "$($xe)`n" -Path 'C:\temp\myevents.json' -Encoding utf8 -NoNewline
}
This fits the bill and does exactly what I need it to. The nasty regex in the middle is there because ConvertTo-Json HANDILY escapes all Unicode characters as \uXXXX sequences, and the regex magically turns them back into the characters we know and love.
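To show the problem in isolation, here is a minimal sketch (the sample object is made up, and it assumes Windows PowerShell 5.1, where ConvertTo-Json escapes non-ASCII characters by default):

$sample = [pscustomobject]@{ name = 'Łódź' }

$json = $sample | ConvertTo-Json -Compress
# $json now looks something like: {"name":"\u0141\u00f3d\u017a"}

# The same regex trick turns the \uXXXX sequences back into literal characters
$unescaped = [Regex]::Replace($json, '\\u(?<Value>[0-9a-fA-F]{4})', {
        param($m)
        ([char]([int]::Parse($m.Groups['Value'].Value,
            [System.Globalization.NumberStyles]::HexNumber))).ToString()
    })
# $unescaped is back to: {"name":"Łódź"}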
However, it's all a bit too slow. We churn out lots of .xel files, usually around 500 MB each, and we would like a shorter delay between the traces being written and being ingested. As it stands, it takes roughly 35 minutes to process one file serially this way, and the backlog would only grow if we fell behind, which seems likely at that speed.
I've already sped this up quite a bit. I tried [System.Text.RegularExpressions.Regex]::Unescape in place of the regex I have, but it is only slightly faster and does not give the formatting we need anyway. My next step is to split the files into smaller pieces and process them in parallel (sketched below), but that would be significantly more CPU intensive and I'd like to avoid it if possible.
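For context, a rough sketch of what I have in mind, though it parallelises the per-event conversion within one file rather than splitting the files themselves. It assumes PowerShell 7+ for ForEach-Object -Parallel, the throttle limit is arbitrary, and I haven't benchmarked it:

$path    = 'C:\temp\big_xe_file.xel'
$outPath = 'C:\temp\myevents.json'

$events = Read-DbaXEFile -Path $path

# Convert events to JSON lines across several runspaces
# (note: output order is not guaranteed to match input order)
$lines = $events | ForEach-Object -Parallel {
    $xe = $_
    $xe.timestamp = ($xe.timestamp.DateTime).ToString("yyyy-MM-ddTHH:mm:ss.ffff")
    # In PowerShell 7 the unescape step may be unnecessary, depending on -EscapeHandling
    [Regex]::Replace(
        ($xe | ConvertTo-Json -Compress),
        '\\u(?<Value>[0-9a-fA-F]{4})',
        { param($m)
          ([char]([int]::Parse($m.Groups['Value'].Value,
              [System.Globalization.NumberStyles]::HexNumber))).ToString() }
    )
} -ThrottleLimit 4

# One buffered write instead of Add-Content per event
[System.IO.File]::WriteAllLines($outPath, $lines)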
Any help optimizing this is much appreciated!