5

I searched but haven't found how to do this yet.
I am working on filtering data from large files (~2GB).
I used Where-Object, and when it finds a match it continues searching for other matches, which makes sense.

Is it possible to stop it on the first match?

For example (#1):

Get-Process | Where-Object {$_.ProcessName.StartsWith("svchost")}

The output will be:

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
    666      38    26928      18672    92             568 svchost
    596      28    11516      16560    92             792 svchost
    425      14     5364       7036    45             832 svchost
    406      17     7032       8416    39            1004 svchost

What I want is to return the output after the first match:

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
    666      38    26928      18672    92             568 svchost

This is what I tried (also with ForEach-Object):

Get-Process | Where-Object {if($_.ProcessName.StartsWith("svchost")){return $_}}
Get-Process | Where-Object {if($_.ProcessName.StartsWith("svchost")){return $_;break;}}    
Get-Process | ForEach-Object {if($_.ProcessName.StartsWith("svchost")){return $_}}

But it still returns the full output.
Reference:
How to break Foreach loop in Powershell?
Is it possible to terminate or stop a PowerShell pipeline from within a filter

EDIT (explanation about the problem with large data):
Example (#2):
I have two XMLs:
A.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Events>
  <Event>
    <EventData Name="Time">09/10/2017 12:54:16</EventData>
    <EventData Name="WorkstationName">USER2-PC</EventData>
    <EventData Name="UserName">user2</EventData>
  </Event>  
</Events>

B.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Events>
   <Event>
    <EventData Name="Time">09/10/2017 14:54:16</EventData>
    <EventData Name="WorkstationName">USER1-PC</EventData>
    <EventData Name="UserName">user1</EventData>
  </Event>
  <Event>
    <EventData Name="Time">09/10/2017 13:54:16</EventData>
    <EventData Name="WorkstationName">USER2-PC</EventData>
    <EventData Name="UserName">user2</EventData>
  </Event> 
 ... (100,000 more events like the two above)
</Events>

These XMLs are being loaded as objects:

$fileA = "C:\tmp\A.xml"
$a = New-Object Xml.XmlDocument
$a.Load($fileA)

$fileB = "C:\tmp\B.xml"
$b = New-Object Xml.XmlDocument
$b.Load($fileB)

Then I want to search for the first match of the same username:

$result = $b.Events.Event | Where-Object {
    (($_.EventData | where-object {$_.Name -eq "UserName"})."#text" -eq $username)
}

$result.EventData

In this case it is a waste of time to run over the remaining 99,999 events if I have a match on the first event.
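
For the XML case specifically, one option that avoids the pipeline entirely is an XPath query through `SelectSingleNode`, which returns only the first matching node. The following is a minimal sketch, not a benchmarked solution: the in-memory document is a small stand-in for B.xml, and the value assigned to `$username` is a hypothetical example.

```powershell
# Small in-memory stand-in for B.xml (the real file would be loaded with $b.Load($fileB)).
$b = [xml]@'
<Events>
  <Event>
    <EventData Name="Time">09/10/2017 14:54:16</EventData>
    <EventData Name="WorkstationName">USER1-PC</EventData>
    <EventData Name="UserName">user1</EventData>
  </Event>
  <Event>
    <EventData Name="Time">09/10/2017 13:54:16</EventData>
    <EventData Name="WorkstationName">USER2-PC</EventData>
    <EventData Name="UserName">user2</EventData>
  </Event>
</Events>
'@

$username = 'user2'   # hypothetical value, not taken from the question

# Match the first <Event> that has an <EventData Name="UserName"> child
# whose text equals $username. SelectSingleNode returns one node or $null.
$xpath  = "/Events/Event[EventData[@Name='UserName'] = '$username']"
$result = $b.SelectSingleNode($xpath)

$result.EventData
```

If no event matches, `$result` is `$null`, so it is worth checking before reading `$result.EventData`.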

EDIT (SOLVED):
After reading Nick's answer, I found nothing in it that I hadn't already tried.
The command:

Get-Process | Where-Object {if($_.ProcessName.StartsWith("svchost")){ $_;break;}}  

This does stop Where-Object, but it doesn't return the item.
That can be worked around by saving the item in a variable before breaking:

Get-Process | Where-Object {if($_.ProcessName.StartsWith("svchost")){ $someVar = $_;break;}}  

Therefore I accepted his answer.

E235
  • `... | Where-Object { $_.ProcessName -like 'svchost*' } | Select-Object -First 1`? – Ansgar Wiechers Sep 20 '17 at 16:49
  • If you're filtering file data, why not use Select-String with the -List option to make it stop on the first match? – mjolinor Sep 20 '17 at 16:54
  • @AnsgarWiechers It will still pass over all the processes, and only after it has the object with **all** the 'svchost.exe' processes will it select the first one. You can see that it passes over all the objects with: `Get-Process | Where-Object { $_.ProcessName -like 'svchost*'; Write-Host $_} | Select-Object -First 1` – E235 Sep 20 '17 at 22:18

4 Answers

4

If efficiency is what you need, you can try breaking out of a loop:

Get-Process | foreach {If ($_.ProcessName.StartsWith("svchost")){$_;break}}

You can confirm it works with this check:

$i=0; Get-Process | foreach {$i++;$i; If ($_.ProcessName.StartsWith("svchost")){$_;break}}

This makes the loop print a number on each iteration. In my case it got to 115, and (Get-Process).Count reports 157 processes, so it looped over my processes, found the one we want, and then stopped.

As other answers here state, you can use [0]. On any array or list you can select an individual element by putting its index in square brackets. Be careful, though, because attempting this on a null or empty object will throw an exception:

(Get-Process | Where-Object {$_.ProcessName.StartsWith("svchost")})[0]

Or you can use Select-Object, which works in a similar way but has more options than just an index and will not throw an error if the object is null or empty.

Get-Process | Where-Object {$_.ProcessName.StartsWith("svchost")} | Select-Object -First 1

However, both of these options will still evaluate the entire list before you select the first result.

Nick
  • Efficiency is what I need; it is important. Regarding the last two options you mentioned, as you wrote, they still evaluate the entire list, so I am not interested in them. Your first suggestion is good, and I had also tried it, but it doesn't return the requested item. I can work around that by saving the requested item in a variable, `{$someVar = $_;break}`, and that solves it. – E235 Sep 21 '17 at 07:25
  • I'm seeing this work nicely as an independent command, but it doesn't work within a script - the script exits when it hits the `break`. Per https://ss64.com/ps/foreach.html, it looks like piping into `foreach` treats it as an alias for `ForEach-Object`, which would explain why `break` acts like it would act outside a loop. – Tydaeus Feb 13 '19 at 22:57
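
One commonly used workaround for the script-exit behaviour described in the comment above is to wrap the pipeline in a dummy loop, so that `break` has an enclosing loop to leave. This is a sketch under assumptions: the number range stands in for `Get-Process`, and `$found` and `$seen` are illustrative names.

```powershell
$seen  = @{ Count = 0 }   # counter kept in a hashtable so any scope can update it
$found = $null

# The do/while runs its body exactly once; it exists only so that
# 'break' terminates this loop instead of the whole script.
do {
    1..100000 | ForEach-Object {
        $seen.Count++
        if ($_ -ge 5) {
            $found = $_   # save the item, as in the question's workaround
            break
        }
    }
} while ($false)

"Found $found after $($seen.Count) items; the script continues here"
```

Statements after the loop still run, which is the difference from a bare `break` inside `ForEach-Object` in a script.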
3

Both Where-Object and ForEach-Object are cmdlets. You cannot break out of cmdlets (commands). What you can do instead is use the foreach keyword, like this:

$process = Get-Process

foreach ($item in $process) {
    if ($item.Name -eq 'svchost') {
        $item
        return
    }
}
vrdse
  • I think you intend to use `break` rather than `return` here. – Bill_Stewart Sep 20 '17 at 19:02
  • @vrdse The problem with this workaround is that you still pass over all the objects. Worse, you do it twice: first with `$process = Get-Process`, and then with the loop `foreach ($item in $process)`. In this case, Where-Object is faster. – E235 Sep 20 '17 at 22:24
  • I think that depends heavily on when the first item is found in the loop, doesn't it? – vrdse Sep 21 '17 at 10:41
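
To illustrate the `break` versus `return` point raised in the comments above: inside a `foreach` statement, `break` leaves only the loop, while `return` leaves the enclosing function or script. A minimal sketch, with a number range standing in for the process list and hypothetical variable names:

```powershell
$items   = 1..100000   # stand-in for the output of Get-Process
$visited = 0
$found   = $null

foreach ($item in $items) {
    $visited++
    if ($item -ge 5) {
        $found = $item
        break          # exits the foreach loop only; code below still runs
    }
}

"Found $found after visiting $visited items"
```

The collection is materialized before the loop starts, but the loop body itself runs only until the first match.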
1

Super interesting. I don't know why, but this article contradicts our findings!

https://community.idera.com/database-tools/powershell/powertips/b/tips/posts/save-time-with-select-object-first

I even tested it. Since PowerShell 3.0, Select-Object -First stops the pipeline.

user242114
  • I agree this answer is not well written but actually I believe it is the correct answer to the question, as adding this to the pipeline will actually stop its execution, much more elegantly than the other answers. I.e.: `Get-Process | Where-Object {$_.ProcessName.StartsWith("svchost")} | Select-Object -First 1`. Not to mention it gives you full control over how many results you want to get. – joniba Jan 17 '23 at 13:59
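
A quick way to check this claim yourself is to count how many items the upstream filter inspects. The sketch below assumes PowerShell 3.0 or later; the number range and the `$counter` hashtable are illustrative stand-ins, not part of the original answers.

```powershell
$counter = @{ Inspected = 0 }   # hashtable so the filter block can update it

$first = 1..100000 |
    Where-Object { $counter.Inspected++; $_ -ge 5 } |
    Select-Object -First 1

"First match: $first; items inspected: $($counter.Inspected)"
```

If `Select-Object -First 1` stops the upstream commands, the inspected count equals the position of the first match rather than the length of the input.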
0

For filtering data from large files use a StreamReader instead of regular PowerShell cmdlets:

$filename = 'C:\path\to\your.txt'
$word     = 'something'

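$rdr = [IO.File]::OpenText($filename)
try {
    # Read line by line and stop at the first line that contains $word.
    while ($rdr.Peek() -ge 0) {
        $line = $rdr.ReadLine()
        if ($line -like "*${word}*") { break }
    }
}
finally {
    # Dispose also closes the reader, so a separate Close() call is not needed.
    $rdr.Dispose()
}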
Ansgar Wiechers
  • I edited my question and added examples of the large data I am working with. With this kind of data I load the files as XML objects, so I don't see how to use `StreamReader` in this case. The best option I could think of is piping to `Where-Object`, but I can't stop it on the first match. – E235 Sep 20 '17 at 22:48
  • I can use `foreach($event in $b.Events.Event){...}`, but I thought that using `ForEach-Object` in the pipeline might be faster. – E235 Sep 20 '17 at 23:18