
I have a folder on a server that contains thousands of log files. New files are written every second and the folder keeps growing. Once a week I want to copy those files into another folder, then run a Python script to process the logs in the new folder. My script processes around 70K logs a week and takes over an hour.

How can I make it more efficient/faster?

$ScriptProperties = @{
    "FolderDir"      = "\\server\folder1\Temp"
    "FolderName"     = "Orignal_Folder"
    "OldFolderName"  = "New_Folder"
    "TempFolderName" = "Temp_Folder"
}

#$DateStamp = get-date -uformat "%Y-%m-%d@%H-%M-%S"

# if (Test-Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)"){
#   #write-host "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)   folder folder"
#   Copy-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)" -Destination "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)-$($DateStamp)" -force -recurse
#   #New-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)" -type directory -force
# }#else{
#   #write-host "folder not found."
#   New-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)" -type directory
#}

write-host "Start"


if (Test-Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)"){
    #Renaming folder to Temp directory
    Rename-Item "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)" -NewName "$($ScriptProperties.FolderDir)\$($ScriptProperties.TempFolderName)"

    #Creating new Log directory
    New-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)" -type directory

    #Creating new Move folder if not found
    if (-Not (Test-Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)")) {New-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)" -type directory -force}
    #Moving content to move directory
    Move-Item -Path "$($ScriptProperties.FolderDir)\$($ScriptProperties.TempFolderName)\*" -Destination "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)" -Force
    #Removing temp directory
    Remove-Item "$($ScriptProperties.FolderDir)\$($ScriptProperties.TempFolderName)" -Recurse -Force

}


write-host "Complete"
mattk
    You're looking for [`robocopy`](https://technet.microsoft.com/de-de/library/cc733145%28v=ws.10%29.aspx). – Ansgar Wiechers Feb 29 '16 at 20:38
  • It depends on where your bottleneck(s) are. If your bottleneck is the network, then transferring less data or transferring the data with less overhead will help. If the bottleneck is the source or destination CPU, memory, or I/O (i.e. PCIe) channels, then you'd need to improve the situation on the server (reduce load or increase server capacity). Assuming it is the network, zipping the log files will significantly reduce the amount of data to be transferred (log files are highly compressible). To be most effective, the servers must not be too near full CPU utilization. – Χpẘ Mar 01 '16 at 04:45
  • Ansgar - I have been testing robocopy (a multithreaded sketch follows these comments). I tested moving 600 files and it took approximately one minute. In production I will be moving around 70K-80K files, which based on my test will take about 2 hours. Speed is essential, but maybe this is the best I can do. – mattk Mar 02 '16 at 15:55
  • Χpẘ - I think it is a little of both. The source or destination probably doesn't have enough capacity, but increasing that is not an option because of cost. The network might be slow as well. I have tried zipping the folder, but that also takes over an hour. – mattk Mar 02 '16 at 15:58
  • Transferring tens of thousands of files is going to take time, whichever way you turn. – Ansgar Wiechers Mar 02 '16 at 22:39
  • Would a runspace pool/job sequence not work in this instance? It would allow you to thread the process so it would work asynchronously...? – Fredulom Mar 16 '16 at 14:35
  • Running the script locally instead of over a UNC path should also increase the speed (try to do a major ACL change on a folder with a lot of child folders and files over UNC, then try the same locally and watch the speed difference). – bluuf Mar 16 '16 at 17:15
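
As a hedged illustration of the robocopy route mentioned in the comments above, a multithreaded move could look roughly like the sketch below, invoked from PowerShell. The folder names reuse the question's values; the /MT thread count, retry settings, and log path are assumptions to tune, not tested values.

# Minimal sketch: let robocopy move the logs with multiple copy threads.
# /MOV deletes each source file after a successful copy; /MT:32 uses 32 threads (an assumed value).
$src  = "\\server\folder1\Temp\Orignal_Folder"
$dest = "\\server\folder1\Temp\New_Folder"

robocopy $src $dest *.* /MOV /MT:32 /R:2 /W:5 /NP /NFL /NDL /LOG:"\\server\folder1\Temp\robocopy.log"

# Robocopy exit codes below 8 indicate success (0 = nothing copied, 1 = files copied, etc.).
if ($LASTEXITCODE -ge 8) { Write-Warning "robocopy reported failures (exit code $LASTEXITCODE)." }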

1 Answer


@Ansgar recommended robocopy; why?! You are already in PowerShell, so don't go back a decade to cmd and batch scripts if you can avoid it!

Also, be careful with a move! If something interrupts your transfer, you can end up with corrupted or lost data. You aren't even checking whether the transferred file is equal in size to the source; what if the copy failed? Here you delete it regardless. I would add some logic to compare the file sizes before and after copying, along the lines of the sketch below.
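
A minimal sketch of that idea, assuming the question's $ScriptProperties layout (the loop and the warning message are mine, not part of the original script):

# Copy each log, verify the size, and only then delete the source.
$source      = "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)"
$destination = "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)"

Get-ChildItem -Path $source -File | ForEach-Object {
    $target = Join-Path $destination $_.Name
    Copy-Item -Path $_.FullName -Destination $target -Force

    # Only remove the source once the copy is confirmed to be the same size.
    if ((Get-Item -Path $target).Length -eq $_.Length) {
        Remove-Item -Path $_.FullName -Force
    }
    else {
        Write-Warning "Size mismatch for $($_.Name); source file kept."
    }
}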

Consider using a BITS transfer to avoid this, and delete the files you moved only once you have confirmed the file sizes are equal. The other benefit is that it will continue your copy from where it left off if it is interrupted for any reason.

You can also perform the operation asynchronously and possibly start processing data before your copy is finished. This page should give you more detail on the options you have: https://msdn.microsoft.com/en-us/library/ee663885(v=vs.85).aspx, and this is the TechNet page for the Start-BitsTransfer cmdlet, for syntax and parameter options: https://technet.microsoft.com/en-us/library/dd819420.aspx.
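
A rough sketch of what that could look like, reusing the question's folder layout (the display name, polling loop, and sleep interval are my assumptions, not required by Start-BitsTransfer):

# Queue the weekly copy as an asynchronous BITS job, wait for it, then commit it.
Import-Module BitsTransfer

$source      = "$($ScriptProperties.FolderDir)\$($ScriptProperties.FolderName)\*"
$destination = "$($ScriptProperties.FolderDir)\$($ScriptProperties.OldFolderName)"

$job = Start-BitsTransfer -Source $source -Destination $destination -Asynchronous -DisplayName "WeeklyLogCopy"

# BITS keeps the job alive across interruptions and resumes where it left off.
while (($job.JobState -eq 'Queued') -or ($job.JobState -eq 'Connecting') -or ($job.JobState -eq 'Transferring')) {
    Start-Sleep -Seconds 30
}

if ($job.JobState -eq 'Transferred') {
    Complete-BitsTransfer -BitsJob $job   # commits the copied files to the destination
    # The source files could then be removed after a size check like the one shown earlier.
}
else {
    Write-Warning "BITS job ended in state $($job.JobState)."
}

Once the job reports Transferred and has been completed, the same size comparison as above can decide whether the originals are safe to delete.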