This is the address format I need to match.

The street number can vary (12 or 412, for example), and so can the number of words in the street name (Finch Ave East here).

1460 Finch Ave East, Toronto, Ontario, A1A1A1

So I tried this:

^[0-9]+\s+[a-zA-Z]+\s+[a-zA-Z]+\s+[a-zA-Z]+[,]{1}+\s[a-zA-Z]+[,]{1}+\s+[a-zA-Z]+[,]{1}+\s[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d$
muratiakos

2 Answers

I usually recommend using regex capture groups, so you can break your matching problem down into smaller parts. In most cases I use \d, \w and \s to match digits, word characters and whitespace.
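As a minimal sketch of that idea, the snippet below captures just two parts of a short string. The sample text and variable names are made up for illustration and are not from the question.

```powershell
# Hypothetical sample: a street number followed by a single word
$text = '412 Main'

# (\d+) captures the digits, \s+ skips the whitespace, (\w+) captures the word
$m = [regex]::Match($text, '^(\d+)\s+(\w+)$')

$m.Groups[1].Value   # 412
$m.Groups[2].Value   # Main
```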

I usually experiment on https://regex101.com before I put it into code, because it provides a nice interactive way to play with expressions and samples.

For your question, the expression I came up with is:

$regexp = "^(\d+)\s*((\w+\s*)+),\s*(\w+),\s*(\w+),\s*((\w\d)*)$"

In PowerShell I like to use the direct regex class, because it offers more granularity than the standard -match operator.

# Example match and results
$sample = "1460 Finch Ave East, Toronto, Ontario, A1A1A1"
$match = [regex]::Match($sample, $regexp)
$match.Success
$match | Select -ExpandProperty groups | Format-Table Name, Value

# Constructed fields (.Value stores the matched text rather than the Group object)
@{
    number = $match.Groups[1].Value
    street = $match.Groups[2].Value
    city = $match.Groups[4].Value
    state = $match.Groups[5].Value
    areacode = $match.Groups[6].Value
}

This gives $match.Success equal to $true, and the Groups list contains the following numbered capture groups:

Name Value
---- -----
0    1460 Finch Ave East, Toronto, Ontario, A1A1A1
1    1460
2    Finch Ave East
3    East
4    Toronto
5    Ontario
6    A1A1A1
7    A1

When constructing the fields, you can ignore groups 3 and 7: they are the inner repeated groups and only hold the last repetition ("East" and "A1"):

Name     Value
----     -----
areacode A1A1A1
street   Finch Ave East
city     Toronto
state    Ontario
number   1460
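If the partial groups get in the way, one option is to make the inner repeated groups non-capturing with (?:...), so that only the five fields are numbered. This is a sketch of that variation on the pattern above, not something the original pattern does; the $address variable name is just for illustration.

```powershell
# Same pattern as above, but the inner repeated groups use (?:...) and are not captured
$regexp = '^(\d+)\s*((?:\w+\s*)+),\s*(\w+),\s*(\w+),\s*((?:\w\d)*)$'
$sample = '1460 Finch Ave East, Toronto, Ontario, A1A1A1'
$match  = [regex]::Match($sample, $regexp)

# Groups 1 to 5 now map directly onto the five fields
$address = [ordered]@{
    number   = $match.Groups[1].Value
    street   = $match.Groups[2].Value
    city     = $match.Groups[3].Value
    state    = $match.Groups[4].Value
    areacode = $match.Groups[5].Value
}
$address
```

Using [ordered] keeps the fields in the order they were written, which a plain hashtable does not guarantee.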
muratiakos

To add to mákos' excellent answer, I would suggest using named capture groups and the $Matches automatic variable. This makes it super easy to grab the individual fields and turn them into objects for multiple input strings:

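function Split-CanadianAddress {
  param(
    [Parameter(Mandatory,ValueFromPipeline)]
    [string[]]$InputString
  )

  begin {
    # The named groups become the property names of the output objects
    $Pattern = "^(?<Number>\d+)\s*(?<Street>(\w+\s*)+),\s*(?<City>(\w+\s*)+),\s*(?<State>(\w+\s*)+),\s*(?<AreaCode>(\w\d)*)$"
  }

  process {
    # process runs once per pipeline item; without it only the last piped string is handled
    foreach($String in $InputString){
      if($String -match $Pattern){
        $Fields = @{}
        # $Matches also holds the numbered groups, so keep only the named (non-integer) keys
        $Matches.Keys |Where-Object {$_ -isnot [int]} |ForEach-Object {
          $Fields.Add($_,$Matches[$_])
        }
        [pscustomobject]$Fields
      }
    }
  }
}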

The $Matches hashtable contains both the numbered and the named capture groups, which is why I copy only the named entries to the $Fields variable before creating the pscustomobject.

Now you can use it like:

PS C:\> $sample |Split-CanadianAddress

Street   : Finch Ave East
State    : Ontario
AreaCode : A1A1A1
Number   : 1460
City     : Toronto

I've updated the pattern to allow for spaces in city and state names as well (think "New Westminster, British Columbia").
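A quick way to check the multi-word case is to run the pattern directly against such an address. The address below is made up for illustration, and its postal code is simply reused from the question's sample.

```powershell
# Pattern copied from Split-CanadianAddress above
$Pattern = '^(?<Number>\d+)\s*(?<Street>(\w+\s*)+),\s*(?<City>(\w+\s*)+),\s*(?<State>(\w+\s*)+),\s*(?<AreaCode>(\w\d)*)$'

# Hypothetical address, used only to exercise multi-word city and province names
$address = '100 Main Street, New Westminster, British Columbia, A1A1A1'

if ($address -match $Pattern) {
    $Matches['City']    # New Westminster
    $Matches['State']   # British Columbia
}
```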

Mathias R. Jessen