
I’m getting unexpected slowdowns when processing a very large Array in PowerShell.

I use Get-ChildItem to obtain an Array of FileInfo objects; the resulting Array has about 75,000 elements.

I then start uploading the files with a foreach ($file in $queue) construct, passing the file name to WinSCP’s .NET Assembly. After each file is uploaded, I clear its archive bit.
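
A minimal sketch of the kind of loop described above, not the actual script. `Send-FileStub` is a hypothetical stand-in for the WinSCP upload call, and the temporary folder with five files is invented so the example runs without touching real data:

```powershell
# Hypothetical stand-in for the WinSCP upload; the real script would call
# the Session object's upload method here instead.
function Send-FileStub {
    param([string]$FullName)
    return $true   # pretend the upload succeeded
}

# Build a small throwaway queue (assumption: five demo files in a temp folder).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("queue-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
1..5 | ForEach-Object { Set-Content -Path (Join-Path $root "file$_.txt") -Value 'demo' }

$queue = Get-ChildItem -Path $root -File   # array of FileInfo objects

$uploaded = 0
foreach ($file in $queue) {
    if (Send-FileStub -FullName $file.FullName) {
        # Clear the archive bit by masking it out of the attribute flags.
        $cleared = [int]$file.Attributes -band (-bnot [int][System.IO.FileAttributes]::Archive)
        $file.Attributes = [System.IO.FileAttributes]$cleared
        $uploaded++
    }
}

"Uploaded $uploaded files"

# Remove the demo folder again.
Remove-Item -Path $root -Recurse -Force
```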

I’m uploading the files via SFTP to an OpenSSH server running on localhost, on a solid-state drive. No Internet access is involved; everything is in memory.

The script runs fine at first (I estimate about 5 files uploaded per second), but after processing 10,000–20,000 files it slows to a crawl. It eventually gets so slow that it appears frozen, and I have to kill and restart the script. As soon as I restart the script, it runs (initially) at its original zippy speed.

The slowdown occurs both when running in Visual Studio Code, and in a standalone instance of PowerShell. It also persists after a system reboot.

I don’t touch the array itself during the process, and system memory use never gets above 30-35% (I have a 64-bit system with 16 GB RAM).

Does PowerShell have any known memory management problems with large data sets? Would using an ArrayList and deleting each element after its file is uploaded make things any faster?

Reproducing the problem takes several hours of runtime before the slowdown becomes unbearable, so I thought I’d see if anyone has experienced this kind of problem before I resort to brute force testing. And, does anyone know how long it would take to create a 75,000 element ArrayList from an Array?
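
The conversion cost can be measured directly rather than guessed. The following is a minimal sketch, assuming a synthetic array of 75,000 integers as a stand-in for the real `FileInfo` array; the actual time will depend on the machine, so no figure is claimed here:

```powershell
# Synthetic 75,000-element array (assumption: stands in for the Get-ChildItem result).
$array = [object[]](1..75000)

# Time the copy into an ArrayList using the constructor that accepts a collection.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$list = [System.Collections.ArrayList]::new($array)
$sw.Stop()

"Converted $($list.Count) elements in $($sw.Elapsed.TotalMilliseconds) ms"
```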

Any insights would be greatly appreciated.

My $PSVersionTable:

Name                           Value
----                           -----
PSVersion                      7.3.6
PSEdition                      Core
GitCommitId                    7.3.6
OS                             Microsoft Windows 10.0.19045
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}       
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
aksarben
  • 2
    Have you tried using the pipeline instead of a giant array? – Olaf Sep 02 '23 at 17:46
  • No. Why would using a pipeline make any difference? There's considerable logic involved before I pass the file name & server destination to the WinSCP Session object, so I'm not sure a pipeline would even be possible. And I have to pass multiple params to the Session object, so I don't know if that's possible with a pipeline. – aksarben Sep 02 '23 at 17:48
  • 1
    Objects passed through the pipeline are processed one at a time. If you have issues with a giant array, I could imagine that the pipeline would perform more reliably in such a case. Anyway, it would not hurt to try. `¯\_(ツ)_/¯` – Olaf Sep 02 '23 at 17:50
  • Here's the call to the WinSCP Session: `$session.PutFileToDirectory($fullFileName, $pathBelowRoot, $remove, $options)`. It takes four parameters for each call. Is it even possible to do that with a pipeline? – aksarben Sep 02 '23 at 17:54
  • 3
    Why shouldn't it be possible? You use `ForEach-Object` and put basically the same code in it as you do in a `foreach` loop. Please show your code. (Add it to your question, not as a comment!) – Olaf Sep 02 '23 at 18:06
  • Is there any way to see the actual code? – Douda Sep 02 '23 at 18:49
  • [PowerShell scripting performance considerations](https://learn.microsoft.com/powershell/scripting/dev-cross-plat/performance/script-authoring-considerations) – iRon Sep 02 '23 at 19:04
  • 1
    += kills puppies. – js2010 Sep 02 '23 at 19:52
  • 2
    If you want advice regarding your slow performing code... you need to share the code. How else would you imagine we could help you? I'm voting to close until you actually present a MRE. You are not new here, so you should know what's acceptable and what's just noise. – Doug Maurer Sep 02 '23 at 21:39
  • I agree with Doug, you need to post the code if you need help with this. I suspect you are accruing too many objects in memory. PowerShell has issues with garbage collection during the execution of a function, especially if you are using parallel. You may need to find ways to periodically run `[gc]::Collect(); [System.GC]::WaitForPendingFinalizers(); [gc]::Collect()` – Tolga Sep 03 '23 at 03:39
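
The two suggestions in the comments (streaming files through `ForEach-Object` and forcing a periodic garbage collection) can be combined as in the sketch below. It is untested against the real workload and may or may not affect the slowdown. `Send-FileStub`, the temporary demo folder, and the interval `$collectEvery` are all hypothetical; the real script would call the WinSCP Session object instead of the stub:

```powershell
# Hypothetical stand-in for the real upload call.
function Send-FileStub {
    param([string]$FullName)
    return $true
}

# Throwaway demo data (assumption: ten small files in a temp folder).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("pipeline-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
1..10 | ForEach-Object { Set-Content -Path (Join-Path $root "file$_.txt") -Value 'demo' }

$processed = 0
$collectEvery = 5   # arbitrary interval chosen for the demo

# Files stream through the pipeline one at a time instead of being
# collected into one large array first.
Get-ChildItem -Path $root -File | ForEach-Object {
    [void](Send-FileStub -FullName $_.FullName)   # $_ is the current FileInfo
    $processed++

    # Periodically force a full collection, as suggested in the comments.
    if ($processed % $collectEvery -eq 0) {
        [gc]::Collect()
        [gc]::WaitForPendingFinalizers()
        [gc]::Collect()
    }
}

"Processed $processed files"

# Remove the demo folder again.
Remove-Item -Path $root -Recurse -Force
```

Because the `ForEach-Object` script block runs in the caller's scope, any extra values the upload call needs can be ordinary variables defined before the pipeline.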

0 Answers