I usually recommend using regex capture groups, so you can break your matching problem down into smaller, simpler pieces. In most cases I use \d, \w and \s to match digits, word characters and whitespace.
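As a minimal sketch of that idea, a pattern can be assembled from small pieces, each with its own capture group. The variable names ($number, $word, $gap) and the input '12 Main' are hypothetical and only serve as illustration.

```powershell
# Hypothetical building blocks; the names are for illustration only
$number = '(\d+)'   # one or more digits, captured
$word   = '(\w+)'   # one or more word characters, captured
$gap    = '\s+'     # one or more whitespace characters, not captured

# Single quotes stop PowerShell from trying to expand $ inside the pieces
$pattern = '^' + $number + $gap + $word + '$'

$m = [regex]::Match('12 Main', $pattern)
$m.Groups[1].Value   # 12
$m.Groups[2].Value   # Main
```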
I usually experiment on https://regex101.com before I put it into code, because it provides a nice interactive way to play with expressions and samples.
Regarding your question, the expression I came up with is:
$regexp = '^(\d+)\s*((\w+\s*)+),\s*(\w+),\s*(\w+),\s*((\w\d)*)$'
In PowerShell I like to use the [regex] class directly, because it offers more granularity than the standard -match operator.
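To make the difference concrete, here is a small side-by-side sketch. The input string and the pattern are hypothetical and kept deliberately simple; they are not taken from the question.

```powershell
# Hypothetical sample input, for illustration only
$line = '42 apples'
$pattern = '^(\d+)\s+(\w+)$'

# -match returns a Boolean and fills the automatic $Matches hashtable
if ($line -match $pattern) {
    $count = $Matches[1]   # 42
    $item  = $Matches[2]   # apples
}

# [regex]::Match returns a Match object that also exposes Success,
# and Index and Length for every group
$m = [regex]::Match($line, $pattern)
$m.Success            # True
$m.Groups[2].Index    # 3, where "apples" starts in the input
```

With -match you only get the captured strings back, while the Match object also tells you where each group was found.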
# Example match and results
$sample = "1460 Finch Ave East, Toronto, Ontario, A1A1A1"
$match = [regex]::Match($sample, $regexp)
$match.Success
$match | Select-Object -ExpandProperty Groups | Format-Table Name, Value
# Constructed fields
@{
    number   = $match.Groups[1].Value
    street   = $match.Groups[2].Value
    city     = $match.Groups[4].Value
    state    = $match.Groups[5].Value
    areacode = $match.Groups[6].Value
}
This gives $match.Success as $true, and the following numbered capture groups are listed in the Groups list:
Name Value
---- -----
0    1460 Finch Ave East, Toronto, Ontario, A1A1A1
1    1460
2    Finch Ave East
3    East
4    Toronto
5    Ontario
6    A1A1A1
7    A1
When constructing the fields, you can ignore groups 3 and 7, since they are partial groups that only hold the last repetition of an inner group:
Name     Value
----     -----
areacode A1A1A1
street   Finch Ave East
city     Toronto
state    Ontario
number   1460
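If you would rather not have the partial groups show up at all, one option is to make the inner groups non-capturing with (?:...) and to name the remaining ones with (?<name>...). This is only a sketch of a possible variant of the expression above, not something the original pattern requires; the variable names $named, $m and $fields are my own.

```powershell
# Variant of the pattern: named groups for the fields,
# non-capturing (?:...) groups for the repeated helper parts
$named  = '^(?<number>\d+)\s*(?<street>(?:\w+\s*)+),\s*(?<city>\w+),\s*(?<state>\w+),\s*(?<areacode>(?:\w\d)*)$'
$sample = '1460 Finch Ave East, Toronto, Ontario, A1A1A1'

$m = [regex]::Match($sample, $named)

# Look the groups up by name instead of by position
$fields = [ordered]@{
    number   = $m.Groups['number'].Value
    street   = $m.Groups['street'].Value
    city     = $m.Groups['city'].Value
    state    = $m.Groups['state'].Value
    areacode = $m.Groups['areacode'].Value
}
$fields
```

Looking groups up by name also means the field construction does not break if you later add or remove a group in the expression.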