1

The following scripts converts an XML into a specific CSV format to feed a specific source system. It works fine, but the performance is incredibly slow. I believe the issue is because Out-File is opening - closing the file for every line. Is there a better way to do this?

$url = 'http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist.xml'
$result = Invoke-RestMethod  -Uri $url
$elements = $result.Envelope.Cube

foreach($element in $elements)
{
    foreach($x in $element.Cube)
    {
        foreach($y in $x.Cube)
        {
            $time = $x.time.ToString() -replace "-"
            $output =  $time + "`t" + $y.currency.ToString() + "`t" + $y.rate.ToString()
            $output | Out-File -Append ".\rates.csv"
        }
    }   
}
Mik1893
  • 317
  • 4
  • 14

2 Answers2

3

I believe the issue is because Out-File is opening - closing the file for every line

That is indeed the reason, so the key to speeding up your command is to pipe all data to a single invocation of Out-File, which you can achieve by wrapping your foreach loop in a script block ({ ... }) that you invoke with &, the call operator:

& {
  foreach ($element in $elements) {
    foreach ($x in $element.Cube) {
      foreach ($y in $x.Cube) {
        $time = $x.time.ToString() -replace "-"
        # Synthesize and output the line to save.
        $time + "`t" + $y.currency.ToString() + "`t" + $y.rate.ToString()
      }
    }   
  }
} | Out-File .\rates.csv

The above preserves PowerShell's typical streaming pipeline behavior, sending output lines one by one to Out-File.

Given that your data is already in memory anyway, you can speed up the operation a little by using $(...) rather than & { ... } around your foreach loop, i.e. by using $(), the subexpression operator.


That said, in-memory data allows even faster processing:

  • Through bypassing the pipeline and instead passing all output lines as an argument.

  • Additionally, given that you're saving text to a file, by using Set-Content to speed things up a bit.

    • Note: In Windows PowerShell, Set-Content's default encoding differs from Out-File's: the active ANSI code page's encoding vs. UTF-16LE ("Unicode"); in PowerShell [Core] 7+, all cmdlets consistently default to BOM-less UTF-8; use the -Encoding parameter as needed.
  • Finally, you can eliminate one level of nesting from your foreach loops by taking advantage of PowerShell's member-access enumeration.

# Adjust -Encoding as needed.
Set-Content -Encoding utf8 .\rates.csv -Value $(
  foreach ($x in $elements.Cube) {
    foreach ($y in $x.Cube) {
      $time = $x.time.ToString() -replace "-"
      # Synthesize and output the output line.
      $time + "`t" + $y.currency.ToString() + "`t" + $y.rate.ToString()
    }
  }   
)
mklement0
  • 382,024
  • 64
  • 607
  • 775
  • 1
    the last one is the fastest method - i'm now down to 1.2 seconds! thanks a lot for the extensive explanation, really helpful! – Mik1893 Jan 19 '21 at 22:47
1

Try this:

$url = 'http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist.xml'
$result = Invoke-RestMethod  -Uri $url
$elements = $result.Envelope.Cube

$outputAll = New-Object -TypeName "System.Collections.ArrayList"

foreach($element in $elements)
{
    foreach($x in $element.Cube)
    {
        foreach($y in $x.Cube)
        {
            $time = $x.time.ToString() -replace "-"
            $output =  $time + "`t" + $y.currency.ToString() + "`t" + $y.rate.ToString()
            $null = $outputAll.add($output)
        }
    }   
}

$outputAll | Out-File -Append ".\rates.csv"
Farbkreis
  • 604
  • 3
  • 12
  • works like a charm, takes 4 seconds vs 5 minutes before! thanks – Mik1893 Dec 29 '20 at 20:06
  • 2
    I would like to add that the reason for this is that `Out-File` is an cmdlet that involves IO operations. These are generally very expensive, and should be used as little as possible. As you have found, collecting the result first and then calling `Out-File` once speeds things up considerably. – Rno Dec 29 '20 at 20:20