Here is the list; I'm doing this to "normalize" a data set of addresses for easier look-ups.

I've tried using strtr() and str_ireplace(), but neither works well. Here is a shortened version of the list for testing.

<?php
function street_abbreviations_regex($input) {
  $list = array(
    ' ave'  => ' avenue',
    ' blvd' => ' boulevard',
    ' cir'  => ' circle',
    ' ct'   => ' court',
    ' expy' => ' expressway',
    ' fwy'  => ' freeway',
    ' ln'   => ' lane',
    ' pky'  => ' parkway',
    ' rd'   => ' road',
    ' sq'   => ' square',
    ' st'   => ' street',
    ' tpke' => ' turnpike',
    ' n'    => ' north',
    ' e'    => ' east',
    ' s'    => ' south',
    ' w'    => ' west',
    ' ne'   => ' northeast',
    ' se'   => ' southeast',
    ' sw'   => ' southwest',
    ' nw'   => ' northwest',
  );
//   $input = strtr(strtolower($input), $list);
  $input = str_ireplace(array_keys($list), array_values($list), strtolower($input));
  // Strip everything except letters and digits to build the look-up key.
  $regex_street = preg_replace("/[^A-Za-z0-9]/", "", $input);
  return $regex_street;
}
?>

Input

echo street_abbreviations_regex('10 E Union St.') . " <br>\n";
echo street_abbreviations_regex('10 E Union Street') . " <br>\n";

Output from strtr()

10eastunionsoutht
10eastunionsouthtreet

Output from str_ireplace()

10eastunionsouthtreet
10eastunionsouthtreetreet
mikeytown2

1 Answer

I work for a company called SmartyStreets, where we do address parsing, standardization, and so on, and I can say that the task you're attempting is incredibly complicated. I know from experience!

Instead of listing all the types of input, valid and invalid, that would defeat any regular expression, trust me that addresses come in many shapes and sizes, and standardizing them accurately is not easy.
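To make one such case concrete, here is a minimal sketch rather than a recommended solution. It assumes a hypothetical helper, expand_street_tokens(), and a shortened abbreviation map, neither of which comes from the question. It replaces whole tokens instead of substrings, which avoids the ' s' inside ' st' collision shown in the question's output, yet it still gets an ordinary address wrong.

```php
<?php
// Hypothetical helper (an assumption for illustration): expands abbreviations
// only when they appear as whole tokens, so "s" can no longer match inside "st".
function expand_street_tokens(string $input): string {
    // Shortened map for illustration; keys are whole lowercase tokens.
    $map = array(
        'ct' => 'court',
        'rd' => 'road',
        'st' => 'street',
        'n'  => 'north',
        'e'  => 'east',
        's'  => 'south',
        'w'  => 'west',
    );
    // Lowercase, drop punctuation, then split on runs of whitespace.
    $clean  = preg_replace('/[^a-z0-9\s]/', '', strtolower($input));
    $tokens = preg_split('/\s+/', trim($clean));
    foreach ($tokens as $i => $token) {
        if (isset($map[$token])) {
            $tokens[$i] = $map[$token];
        }
    }
    // Join without separators, matching the question's look-up key format.
    return implode('', $tokens);
}

echo expand_street_tokens('10 E Union St.'), "\n";    // 10eastunionstreet
echo expand_street_tokens('10 E Union Street'), "\n"; // 10eastunionstreet
echo expand_street_tokens('5 St. James Ct'), "\n";    // 5streetjamescourt
```

The first two inputs now normalize to the same key, but in the third "St." meant "Saint", and no token map can tell that apart from "Street" without knowing the real street names.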

The USPS has certified a handful of providers to perform address normalization using its official data. Look into CASS-Certified services. You can start your search with the LiveAddress API (which is free). It's really easy to use with PHP, because LiveAddress returns a JSON string, which PHP parses natively.
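As a rough sketch of the "parses natively" part: PHP's built-in json_decode() turns a JSON string into arrays. The payload and the field name delivery_line_1 below are assumptions made for illustration, as is the helper first_delivery_line(); check the service's documentation for the actual response format.

```php
<?php
// Hypothetical helper: returns the first candidate's street line, or null
// when the JSON is invalid or contains no candidates.
function first_delivery_line(string $json): ?string {
    // Passing true as the second argument yields associative arrays.
    $candidates = json_decode($json, true);
    if (!is_array($candidates) || count($candidates) === 0) {
        return null;
    }
    // 'delivery_line_1' is an assumed field name, not a confirmed one.
    return $candidates[0]['delivery_line_1'] ?? null;
}

// Made-up sample body standing in for a real API response.
$sample = '[{"delivery_line_1":"10 E Union St"}]';
echo first_delivery_line($sample), "\n";
```

The standardized value returned by the service, rather than a home-grown key, would then be what you store and look up.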

If you have any further questions about this, I'll be happy to answer them personally.

Martijn Pieters
Matt