What distinguishes this question from the near-duplicate "How do I return only the matching regular expression when I select-string(grep) in PowerShell?" is the desire to extract substrings of interest based on surrounding in-line context that should not itself be included in the match:
PS> Select-String '(?<=load is: )\d+' .\log.html | ForEach-Object { $_.Matches[0].Value }
0
5875
6077
6072
5846
1900
1900
If you want to output actual numbers, simply place [int] (for instance) before $_.Matches[0].Value to cast (convert) the text results to integers.
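As a minimal sketch of the cast, the following feeds a few hypothetical sample lines (standing in for the contents of log.html, which isn't shown here) to Select-String via the pipeline and converts each match to [int], so that the results can be used in arithmetic:

```powershell
# Hypothetical sample lines; the real input would come from .\log.html.
$sampleLines = 'load is: 0', 'load is: 5875', 'no number here', 'load is: 1900'

# Select-String also accepts strings from the pipeline;
# lines that don't match produce no output.
$loads = $sampleLines |
  Select-String '(?<=load is: )\d+' |
  ForEach-Object { [int] $_.Matches[0].Value }

$loads                                        # three [int] values
$total = ($loads | Measure-Object -Sum).Sum   # numeric results can be summed
$total
```

Without the cast, the values would be strings, and operators such as + would concatenate rather than add.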
Select-String can accept file paths directly, so for a single file or a group of files matched by a wildcard expression you generally don't need to pipe from Get-Content. (For processing entire directory subtrees, pipe from Get-ChildItem -File -Recurse.)
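To illustrate the directory-subtree case, here is a self-contained sketch that builds a throwaway directory tree (the folder and file names are made up for the demonstration) and searches all files in it recursively:

```powershell
# Build a temporary directory tree so the sketch doesn't depend on existing files.
$root = Join-Path ([IO.Path]::GetTempPath()) ("ss-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Force -Path (Join-Path $root 'sub')
Set-Content -Path (Join-Path $root 'a.html') -Value 'load is: 10', 'other text'
Set-Content -Path (Join-Path $root 'sub/b.html') -Value 'load is: 20'

# Get-ChildItem emits file-info objects, which Select-String
# treats as files to search (rather than as strings).
$found = Get-ChildItem -LiteralPath $root -File -Recurse |
  Select-String '(?<=load is: )\d+' |
  ForEach-Object { $_.Matches[0].Value }

$found   # the matched numbers from both files

# Clean up the temporary tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

Each output object also has .Path and .LineNumber properties, should you need to know where a given match came from.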
Regex '(?<=load is: )\d+' uses a (positive) lookbehind assertion ((?<=...)) to require that "load is: " precede the match without including it in the result; only the \d+ part - a nonempty run of digits - ends up in the match.
Select-String outputs [Microsoft.PowerShell.Commands.MatchInfo] instances whose .Matches property contains the results of the regex matching operation; each element's .Value property contains the text the regex matched.
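For illustration, the properties just described can be inspected on a single output object; the input line below is a made-up example:

```powershell
# Hypothetical input line, piped directly to Select-String.
$m = 'cpu load is: 5875 units' | Select-String '(?<=load is: )\d+'

$m.GetType().FullName   # Microsoft.PowerShell.Commands.MatchInfo
$m.Line                 # the whole input line
$m.Matches[0].Value     # the digits only; the lookbehind text is not part of the match
$m.Matches[0].Index     # 0-based position of the match within the line
```

The elements of .Matches are regular .NET [System.Text.RegularExpressions.Match] objects. By default only the first match per line is reported; add the -AllMatches switch if a line may contain several.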
In the case at hand, the lookbehind solution is probably simplest, but an alternative solution is to use a capture group, which is ultimately more flexible:
# Same output as above.
Select-String 'load is: (\d+)' .\log.html | ForEach-Object {$_.Matches[0].Groups[1].Value}
What the capture group (the parenthesized subexpression, (...)) matched is available in the .Groups collection of each element of the output objects' .Matches property: the element at index 0 contains the overall match, element 1 contains what the 1st capture group matched, 2 the 2nd, and so on.
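The added flexibility shows once more than one piece of a line is of interest. The following sketch assumes a hypothetical log line that also contains a host name, and combines a numbered capture group with a named one ((?<load>...)), which can be accessed by name instead of by index:

```powershell
# Hypothetical log line with two pieces of interest.
$line = 'host web01 load is: 5875'

$result = $line |
  Select-String 'host (\w+) load is: (?<load>\d+)' |
  ForEach-Object {
    $groups = $_.Matches[0].Groups
    [pscustomobject] @{
      Whole = $groups[0].Value              # index 0: the overall match
      Host  = $groups[1].Value              # index 1: the 1st (unnamed) capture group
      Load  = [int] $groups['load'].Value   # named group, accessed by name
    }
  }

$result
```

A lookbehind-only regex could not return both values from a single match, which is why capture groups are the more general tool.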