tl;dr
-match indeed makes the regex on the RHS (right-hand side) match substrings by default:
'foo' -match 'o' # true
However, you can anchor the regex with ^ to match the start of the input string, and/or $ to match the end:
'foo' -match '^foo$' # true - full match
'foot' -match '^foo$' # false
Read on for details and information about other string-matching operators.
Preface:
PowerShell string-comparison operators are case-insensitive by default (unlike the string operators, which use the invariant culture, the regex operators seem to use the current culture, though that difference rarely matters in regex operations).
- You can opt into case-sensitive matching by using prefix c; e.g., -cmatch instead of -match.
All comparison operators can be negated with prefix not; e.g., -notmatch negates -match.
With a single string as the LHS, the comparison operators return $True or $False, but with an array of strings they act as filters; that is, they return the subarray of elements for which the comparison is true.
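The three points above can be seen side by side in a minimal sketch; the sample strings and variable names are made up for illustration:

```powershell
# Case-insensitive by default; the c prefix opts into case-sensitive matching.
$insensitive = 'FOO' -match 'foo'     # $true
$sensitive   = 'FOO' -cmatch 'foo'    # $false

# The not prefix negates the operator.
$negated = 'foo' -notmatch 'x'        # $true

# With an array as the LHS, the operator filters instead of returning a Boolean.
$filtered = 'foo', 'bar', 'baz' -match '^ba'   # 'bar', 'baz'
```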
EBGreen's comment on the question provides the best explanation (lightly edited and emphasis added):
[...] by default, -match will return $True if the [RHS] pattern (regex) can be found anywhere in the string. If you want to find the string at certain positions within the string, use ^ to indicate the beginning of the string and $ to indicate the end of the string. To match the entire string, use both.
Applied to part of your code:
$Reg2 = '^[0-9]{4}-[0-9]{2}-[0-9]{2}[A-Z]{1}[0-9]{2}_[0-9]{2}_[0-9]{2}$'
# ...
$c -match $Reg2
Note the ^ at the start and the $ at the end to ensure that the entire input string must match.
Also note that I've omitted the [regex] cast, as it isn't necessary, given that -match can accept strings directly.
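To see what the anchors change, here is a small sketch that runs the same pattern with and without them. The two input strings are hypothetical samples, not taken from your code:

```powershell
# The same pattern, without and with anchors.
$unanchored = '[0-9]{4}-[0-9]{2}-[0-9]{2}[A-Z]{1}[0-9]{2}_[0-9]{2}_[0-9]{2}'
$anchored   = '^' + $unanchored + '$'

# Hypothetical inputs: one is exactly a timestamp, the other merely contains one.
$exact    = '2016-05-03T12_30_45'
$embedded = 'backup_2016-05-03T12_30_45.log'

$r1 = $exact    -match $unanchored   # $true
$r2 = $embedded -match $unanchored   # $true  - substring match
$r3 = $exact    -match $anchored     # $true  - full match
$r4 = $embedded -match $anchored     # $false - extra text before and after
```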
On a related note, you can use assertion \b to modify substring matching so that matching only succeeds at word boundaries (where a word is defined as any nonempty run of letters, digits, and underscores); e.g., 'a10' -match 'a1' is true, but 'a10' -match 'a1\b' is not, because the 1 in the input string is not at the end of a word.
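A few more inputs (made up for illustration) show where \b does and does not find a boundary:

```powershell
$b1 = 'a10'  -match 'a1\b'   # $false - 1 is followed by a digit (a word character)
$b2 = 'a1_0' -match 'a1\b'   # $false - the underscore counts as a word character too
$b3 = 'a1 0' -match 'a1\b'   # $true  - 1 is followed by a space
$b4 = 'a1-0' -match 'a1\b'   # $true  - 1 is followed by a hyphen
$b5 = 'a1'   -match 'a1\b'   # $true  - 1 is at the end of the string
```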
Note that using -match with a single string as the LHS (as opposed to an array) records the details of the most recent match in the automatic $Matches variable, which is a hash table whose 0 entry contains the entire match (the part of the input string that matched). If capture groups (subexpressions enclosed in (...)) were used in the regex, entry 1 contains what the 1st capture group captured, 2 what the 2nd one captured, and so on; named capture groups (e.g., (?<foo>...)) get entries by their name (e.g., foo).
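As a quick illustration of $Matches, the sketch below uses a made-up key=value string and hypothetical group names (key, value):

```powershell
# Numbered capture groups.
$null = 'foo=bar' -match '^(\w+)=(\w+)$'
$whole  = $Matches[0]   # 'foo=bar' - the entire match
$first  = $Matches[1]   # 'foo'     - 1st capture group
$second = $Matches[2]   # 'bar'     - 2nd capture group

# Named capture groups; entries are keyed by group name.
$null = 'foo=bar' -match '^(?<key>\w+)=(?<value>\w+)$'
$key   = $Matches['key']   # 'foo'
$value = $Matches.value    # 'bar' - dot notation works as well
```

The results are copied into variables right away because $Matches is overwritten by the next successful -match operation.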
Also, instead of a wordy if / elseif construct for matching multiple regexes in sequence, you can use the switch statement with the -regex option:
Instead of:
if ($c -match $reg1) {
  $c = $c -replace $regyear
}
elseif ($c -match $reg2) {
  $c = $c -replace $reg2
}
you could write more cleanly:
switch -regex ($c) {
  $reg1 { $c = $c -replace $regyear; break }
  $reg2 { $c = $c -replace $reg2; break }
  default { <# handles the case where nothing above matched #> }
}
break ensures that no further matching is performed.
switch's default matching (or with option -exact) works like the -eq operator (see below).
You can also make it perform wildcard-expression matching - like the -like operator (see below) - with the -wildcard option.
The -casesensitive option makes matching case-sensitive for any of the matching modes.
If the input is an array, matching is performed on each element; note that break then stops processing of further elements, whereas continue instantly proceeds to the next element.
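The difference between continue and break with array input is easiest to see in a small sketch; the input words and output strings are invented for illustration:

```powershell
# continue: skip the remaining branches and proceed to the next array element.
$withContinue = switch -Regex ('foo', 'bar', 'baz') {
    '^f'    { "f-word: $_"; continue }
    'o'     { "contains o: $_" }   # never reached for 'foo', due to continue above
    default { "other: $_" }
}
# $withContinue: 'f-word: foo', 'other: bar', 'other: baz'

# break: leave the switch statement altogether; later elements are not processed.
$withBreak = switch -Regex ('foo', 'bar', 'baz') {
    '^b'    { "b-word: $_"; break }
    default { "other: $_" }
}
# $withBreak: 'other: foo', 'b-word: bar' ('baz' is never examined)
```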
Other methods of string matching in PowerShell:
-like allows you to match strings based on wildcard expressions.
Simply put, * matches any run of characters, including none, ? matches exactly 1 character, and [...] matches any one character in a specified set or range of characters.
Unlike -match, -like always matches the entire string, but note that wildcard expressions have fundamentally different syntax from regular expressions and are far less powerful - you cannot use -like and -match interchangeably.
Thus, to get substring matching, place a * on both ends of your expression; e.g.:
'ingot' -like '*go*' # true
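A few more wildcard cases, sketched with made-up inputs, cover each of the constructs mentioned above:

```powershell
$w1 = 'foo' -like 'o'        # $false - the pattern must match the entire string
$w2 = 'foo' -like 'f*'       # $true  - * matches 'oo'
$w3 = 'foo' -like 'f?o'      # $true  - ? matches exactly one character
$w4 = 'foo' -like '[a-f]oo'  # $true  - [a-f] matches the single character 'f'
$w5 = 'foo' -like 'F*'       # $true  - case-insensitive by default
$w6 = 'foo' -clike 'F*'      # $false - the c prefix makes it case-sensitive
```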
-eq compares entire strings, literally (except for case variations).
Note that PowerShell has no literal substring-matching operator, but you can (somewhat clumsily) emulate one with -match and [regex]::Escape():
'Cost: 7$.' -match [regex]::Escape('7$') # true
[regex]::Escape() escapes its argument so that its content is treated literally when interpreted as a regex (which the RHS of -match invariably is).
This is somewhat inefficient, as there is no good reason to use regexes to begin with.
Direct use of the .NET [string] type's .IndexOf() method is an option, but is also nontrivial; the following is the equivalent of the previous command:
'Cost: 7$.'.IndexOf('7$', [StringComparison]::InvariantCultureIgnoreCase) -ne -1 # true
Note the need to use InvariantCultureIgnoreCase to match PowerShell's default behavior, and the need to compare to -1, given that the character index of where the substring starts is returned.
On the flip side, this method gives you more control over how matching is performed, via the other members of the [System.StringComparison] enumeration.
If you're looking for case-sensitive substring matching based on the current culture, then you can simply rely on the default behavior of .IndexOf(); e.g.,
'I am here.'.IndexOf('am') -ne -1 # true
vs.
'I am here.'.IndexOf('AM') -ne -1 # false, because matching is case-sensitive
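If you need literal substring tests often, you could wrap the .IndexOf() call in a small helper. The following is only a sketch: Test-LiteralSubstring and its parameter names are hypothetical, not a built-in command.

```powershell
# Hypothetical helper: literal substring test via String.IndexOf().
function Test-LiteralSubstring {
    param(
        [Parameter(Mandatory)] [string] $InputString,
        [Parameter(Mandatory)] [string] $Substring,
        # Defaults to the comparison that mirrors PowerShell's operators.
        [StringComparison] $Comparison = [StringComparison]::InvariantCultureIgnoreCase
    )
    # IndexOf() returns -1 if the substring is not found.
    $InputString.IndexOf($Substring, $Comparison) -ne -1
}

$t1 = Test-LiteralSubstring 'Cost: 7$.' '7$'                        # $true
$t2 = Test-LiteralSubstring 'I am here.' 'AM'                       # $true  - case ignored
$t3 = Test-LiteralSubstring 'I am here.' 'AM' -Comparison Ordinal   # $false - case-sensitive
```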
Finally, note that the Select-String cmdlet performs string matching in the pipeline, and it supports both regexes (by default) and literal substring matching (with the -SimpleMatch switch).
Unlike the comparison operators, Select-String outputs a match-information object of type [Microsoft.PowerShell.Commands.MatchInfo] for each matching input line; each object contains the original line plus metadata about the match.
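As a brief sketch of the two Select-String modes, using two made-up input lines:

```powershell
$lines = 'Cost: 7$', 'Total: 17'

# Regex matching (the default): 7$ means "a 7 at the end of the line".
$regexHit = $lines | Select-String '7$'

# Literal substring matching: 7$ means the two characters 7 and $.
$literalHit = $lines | Select-String -SimpleMatch '7$'

$regexHit.Line     # 'Total: 17'
$literalHit.Line   # 'Cost: 7$'
```

The .Line property of each output object holds the original input line.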