0

I want to validate a postal address in PHP.

For example, validate the first line of the address and the postcode:

123 Some Road

and

 W3 1TJ

Both fields contain numbers, letters and a space.

I've tried a couple of regular expression patterns, but they don't accept the space, and this is where I need help.

Here's the code I used so far:

$address1CheckPattern = '/^[a-z0-9-]+$/';

$adress1HasError = !preg_match($address1CheckPattern, $address1);
hakre
  • could you post what you have tried and what error you got ? – Madra David Jul 29 '13 at 09:15
  • Why do you want to validate an address with a regex? it is totally useless... – NDM Aug 01 '13 at 14:45
  • Are you saying you want each field to contain at least one space, one letter, and one number? All these zip codes from a specific country? I'm a little unclear on the validation rules you want. – ficuscr Aug 06 '13 at 06:54
  • You've shared your code and you've explained what you're missing. Question is, why you can't add the space to your pattern? Only if we know *why you fail* this question can be answered (what I mean is, just try to explain the problem after the code in your own words, like what your mental concept is to insert a space to the pattern and why you don't know/see where to add it). Especially as you're looking for "official sources" now with your bounty. – hakre Aug 06 '13 at 08:05
  • I agree with @NickyDeMaeyer, don't use Regex. I decided to use Geocoding for my validation, plenty of APIs out there for that. – Dan Power Aug 07 '13 at 09:40
  • Why don't you use Google Maps API Geocoding https://developers.google.com/maps/documentation/geocoding/ if a result is returned then it is a valid address. – cmorrissey Aug 05 '13 at 20:46

12 Answers

3

That works in the general case, but not every address line 1 has a number. Some just have a name, e.g. House Name, Street Name.

If you're happy with your regex and you just want it to accept a space, add a space to the character class:

$add_check = '/^[a-z0-9- ]+$/i';

But it's still not a good way to match addresses. A public API that uses real data from Royal Mail would be best: the Google API (free but rate limited) or a paid service like Postcode Anywhere will serve you much better.
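A quick way to see the effect of the added space is to run the pattern against the two sample values from the question. This is only a sketch: the helper name `hasOnlyAllowedChars` is made up for illustration, and the hyphen is moved to the end of the class so that it cannot be read as part of a range.

```php
<?php
// Hypothetical helper (not from the question): true when $value consists
// only of letters, digits, spaces and hyphens.
function hasOnlyAllowedChars(string $value): bool
{
    // The hyphen is last in the class, so it is always a literal.
    // The i flag makes a-z match capital letters as well.
    return preg_match('/^[a-z0-9 -]+$/i', $value) === 1;
}

var_dump(hasOnlyAllowedChars('123 Some Road')); // bool(true)
var_dump(hasOnlyAllowedChars('W3 1TJ'));        // bool(true)
var_dump(hasOnlyAllowedChars("St. John's Rd")); // bool(false): dot and apostrophe are not in the class
```

As the last call shows, a character whitelist like this rejects legitimate first lines unless every punctuation mark that can occur is added to the class.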

exussum
  • The OP's regex doesn't require there to be a number, so it will match against an address even if it doesn't contain a number (other than the space). Also, I think you made the same mistake I did: the extra '-' in between 9 and space. This sets up another range which will probably be either empty or contain a lot of additional characters. – brianmearns Jul 31 '13 at 10:25
  • `0-9-` followed by a space will match "0123456789- ", with the - as a literal. Really he's just checking that it only contains those chars. But `'` is also a legal char in a first line – exussum Jul 31 '13 at 10:29
  • You're right, I just verified that the extra dash doesn't create another range, it just adds a literal dash character to the allowed set. – brianmearns Jul 31 '13 at 10:36
1

Not exactly answering the question, but I once had the task of validating postal codes.

I built the following regexes for that purpose; hope that helps.

As some have stated before me, there is no way to validate EVERY address in a given country, let alone on the planet! It will have to be a general text field; maybe just check for characters you don't want, such as "?<>=;@$%^*!", and sanitize the input so it can't be used for SQL injection.

In my application, the postal code was tested depending on the country field, but if you know you will only test a select few you can combine them, e.g. US/UK = /^(?:[\d]{5}(-[\d]{4})?|[A-Z]?[A-Z][0-9][A-Z0-9]?\s[0-9][A-Z]{2})$/

Afghanistan,Angola,Belize,Benin,Hong Kong,Ireland, Macau
no Postal Code

Argentina
^[A-Z][0-9]{4}[A-Z]{3}$

Canada
^(?!.*[DFIOQU].*)([A-Z][0-9]){3}$       

US
^[\d]{5}(-[\d]{4})?$

UK
^[A-Z]?[A-Z][0-9][A-Z0-9]?\s[0-9][A-Z]{2}$


Latvia
^LV[-\s]?[\d]{4}$

Hungary,Denmark,Cyprus,Georgia,Bangladesh,Austria,Armenia,Australia,Albania,Belgium,Bulgaria,Cape Verde,Philippines,Paraguay
Norway,New Zealand,Liechtenstein,Luxembourg,South Africa,Tunisia,Switzerland
^[\d]{4}$

Netherlands
^[\d]{4}\s[A-Z]{2}$

Portugal
^[\d]{4}[\s-][\d]{3}$

Israel,Iraq,Indonesia,Greece,Germany,Guam,Croatia,Costa Rica,Estonia,Egypt,France,Finland,American Samoa,Algeria
Brazil,Bosnia and Herzegovina,Cambodia,Palau,Morocco,Montenegro,Northern Mariana Islands,Lithuania,Italy,Malaysia
Mexico,Marshall Islands,Micronesia,Serbia,Puerto Rico,San Marino,Taiwan,Thailand,Spain,Sri Lanka,Turkey,Ukraine,U.S. Virgin Islands,Vatican
^[\d]{5}$

Poland
^[\d]{2}[\s-]?[\d]{3}$

Czech Republic, Slovakia, Sweden
^[\d]{3}[\s-]?[\d]{2}$

Iran
^[\d]{5}[\s-][\d]{5}$


China,Colombia,Belarus,Panama,Pakistan,Nigeria,Kazakhstan,Singapore,Romania,Russia
^[\d]{6}$
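One way to use a list like this is to key the patterns by country and pick the right one from the country field. The sketch below copies a few of the patterns above with `/` delimiters added; the constant name, the function name and the two-letter keys are assumptions made for illustration.

```php
<?php
// Illustrative subset of the patterns listed above, keyed by country.
// POSTAL_PATTERNS and postalCodeLooksValid() are hypothetical names.
const POSTAL_PATTERNS = [
    'US' => '/^\d{5}(-\d{4})?$/',
    'UK' => '/^[A-Z]?[A-Z][0-9][A-Z0-9]?\s[0-9][A-Z]{2}$/',
    'NL' => '/^\d{4}\s[A-Z]{2}$/',
    'PL' => '/^\d{2}[\s-]?\d{3}$/',
];

// Returns true or false for a known country, null when no pattern is stored.
function postalCodeLooksValid(string $country, string $code): ?bool
{
    if (!array_key_exists($country, POSTAL_PATTERNS)) {
        return null;
    }
    // The patterns expect capitals, so normalise the input first.
    $normalised = strtoupper(trim($code));
    return preg_match(POSTAL_PATTERNS[$country], $normalised) === 1;
}

var_dump(postalCodeLooksValid('UK', 'w3 1tj'));     // bool(true)
var_dump(postalCodeLooksValid('US', '12345-6789')); // bool(true)
var_dump(postalCodeLooksValid('US', '1234'));       // bool(false)
var_dump(postalCodeLooksValid('XX', '1234'));       // NULL
```

Returning null for an unknown country keeps "no rule available" separate from "rule failed", so the caller can decide whether to accept such input as free text.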
Georges Brisset
  • I am from the Netherlands. I can tell you that your regex is incorrect. I guess it will be the same for a lot of other regular expressions. There are two problems, the first one the \d, because a zero is not allowed as the first number and secondly the check using the \s. A line feed or carriage return is not a valid character in the expression. – Loek Bergman Aug 06 '13 at 08:02
  • Thanks Loek, it is true that these are quite broad regex and they do not go to specifics like which numbers are valid (except Canada where some letters are not used) and they may allow a non valid postal code. but isn't `\s` the character for space? – Georges Brisset Aug 08 '13 at 21:21
  • First of all: I really appreciate your list of regular expressions of all countries. It is very convenient to have one. \s Means 'any whitespace character', see: http://www.php.net/manual/en/regexp.reference.escape.php. When you look at the function trim, then do you see what is included in \s: http://php.net/manual/en/function.trim.php. – Loek Bergman Aug 09 '13 at 07:33
  • Loek, I have to thank you: I never realized that \s included chr 0,9,10,11, 13 etc. always thought it was limited to chr32 or nbsp. I like your purist approach - thx again – Georges Brisset Aug 09 '13 at 14:35
0

You can include a space in acceptable characters simply by putting a space in the square brackets. So it would look like this:

$add_check = '/^[a-zA-Z0-9- ]+$/';

Is that what you're looking for?

Notice I also added "A-Z" range to the regex: you probably want to allow addresses to contain capital letters, unless you're already reducing the address to all lower case or something.

I would also recommend adding a dot to the character class, since abbreviations like "st." and "ave." may show up. Then it would look like this:

$add_check = '/^[a-zA-Z0-9-. ]+$/';
brianmearns
0

Your regex only allows letters and digits, with no provision for spaces or tabs. Try the following regex, which works:

$add_check = '/^\s*[a-z0-9\s]+$/i';
if(!preg_match($add_check,$address1)) {
    $error_message .= '&add1=Please enter a valid Address Line 1.';
}
Poonam
0

What about special chars in the address line?

In Germany there is often a "-" in the street name, or the German word for "street" is abbreviated to "str.", so there is a dot in the line. When I had to handle German addresses at work I had to deal with lines like "NW 10 Straße 39b", "Elsa-Brandström-Str 128 Haus 3", etc.

The same goes for postcodes. In Germany we have 5 digits, which is easy, but other countries also have alphanumeric chars in them, as in your example 'W3 1TJ'.

So I would claim there is no general validation for address lines.
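If a character whitelist is still wanted for lines like these, PCRE's Unicode properties avoid listing every accented letter by hand. The following is a minimal sketch, assuming the input is UTF-8; the function name and the exact set of allowed punctuation are illustrative choices, not a complete rule for German addresses.

```php
<?php
// Hypothetical helper: accepts letters from any script, digits, spaces and a
// small set of punctuation. The punctuation list is an assumption.
function looksLikeAddressLine(string $line): bool
{
    // \p{L} = any letter, \p{N} = any digit; the u flag enables UTF-8 mode.
    // The hyphen is last in the class, so it is a literal.
    return preg_match('/^[\p{L}\p{N} .,\'-]+$/u', $line) === 1;
}

var_dump(looksLikeAddressLine('NW 10 Straße 39b'));                // bool(true)
var_dump(looksLikeAddressLine('Elsa-Brandström-Str. 128 Haus 3')); // bool(true)
var_dump(looksLikeAddressLine('Main St <script>'));                // bool(false)
```

This only filters characters. It says nothing about whether the address exists, which is the point made above.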

bish
0

I took the question literally: an address has a house number, then a space, then a road name.

//address first line [1 pass 3 fails]
$add_check = '/[0-9] [a-zA-Z]/';
echo preg_match($add_check,"123 Some Road");
echo preg_match($add_check,"Some Road");
echo preg_match($add_check,"123Some Road");
echo preg_match($add_check,"2ND Road");

A postcode has some letters, some numbers, a space, some numbers, and finally more letters.

//UK type postal code [1 pass 3 fails]
$add_check = '/[a-zA-Z][0-9] [0-9][a-zA-Z]/';
echo preg_match($add_check,"W3 1TJ");
echo preg_match($add_check,"W 1TJ");
echo preg_match($add_check,"W3 1");
echo preg_match($add_check,"3 1TJ");

Of course, I might not live at '1 The Road' but at 'myHouse, The Road'. Ouch: what to do about that comma, and what's happened to the numbers? This is probably a good time to put validateAddressLine($address) in a function so you can improve it later. Save a log message every time it returns false; then you can check whether you have blocked some format you weren't expecting. [This also implies you should not be too strict with HTML validation; let the server check and log it.]
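A minimal sketch of that idea follows. It wraps the literal "number, space, letter" check from this answer in `validateAddressLine()` and logs every rejected value; the log message format is an assumption, and `json_encode()` is used only to keep the logged value on one line.

```php
<?php
// Sketch only: the rule is the literal check from this answer and is meant
// to be replaced later without touching the callers.
function validateAddressLine(string $address): bool
{
    $isValid = preg_match('/[0-9] [a-zA-Z]/', $address) === 1;
    if (!$isValid) {
        // Record rejected input so unexpected formats can be reviewed later.
        error_log('Address line rejected: ' . json_encode($address));
    }
    return $isValid;
}

var_dump(validateAddressLine('123 Some Road'));     // bool(true)
var_dump(validateAddressLine('myHouse, The Road')); // bool(false), and logged
```

Because callers only see a boolean, the rule inside the function can be relaxed once the log shows which legitimate formats were being blocked.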

Robin
0

Here are some functions I made recently for a basic site. They are basic: they don't look up the postcode to see if it exists, or check that the postcode structure is correct (letter(s) : number(s) : letter(s) etc). They only allow letters, digits and a space, as you stated (some addresses use a comma or an apostrophe, which can easily be added to the regex).

I've added a function that allows a space. It doesn't check various things it could, such as whether the space is in the right place or whether there is only one of them. I didn't need any of this for the basic site, so feel free to amend it where you need to.

//Address validation: returns an error message, or false when the value is ok
function fnc_address_validation($strFncAddress)
  {
    if (strlen($strFncAddress) > 50)
      {
        return '50 characters maximum';
      }
    elseif (strlen($strFncAddress) < 5)
      {
        return '5 characters minimum';
      }
    //A-Za-z is used because the range A-z also matches [ \ ] ^ _ and `
    elseif (preg_match("/^[A-Za-z0-9 ]+$/", $strFncAddress) != 1)
      {
        return 'Invalid characters: Allowed 0-9, A-Z, a-z, space';
      }
    else
      {
        return false;
      }
  }



//Postcode validation without space
//(separate name, because PHP cannot declare two functions with the same name)
function fnc_postcode_validation_no_space($strFncPostcode)
  {
    if (preg_match("/^[A-Za-z0-9]+$/", $strFncPostcode) != 1)
      {
        return 'Invalid characters: Allowed A-Z, a-z, 0-9';
      }
    elseif (strlen($strFncPostcode) > 7)
      {
        return '7 characters maximum';
      }
    elseif (strlen($strFncPostcode) < 4)
      {
        return '4 characters minimum';
      }
    else
      {
        return false;
      }
  }


//Postcode validation with space
function fnc_postcode_validation($strFncPostcode)
  {
    if (preg_match("/^[A-Za-z0-9 ]+$/", $strFncPostcode) != 1)
      {
        return 'Invalid characters: Allowed A-Z, a-z, 0-9, 1 space';
      }
    elseif (strlen($strFncPostcode) > 8)
      {
        return '8 characters maximum';
      }
    elseif (strlen($strFncPostcode) < 5)
      {
        return '5 characters minimum';
      }
    else
      {
        return false;
      }
  }

The longest UK postcode (without the space) is 7 characters (afaik), so note that when a space is allowed, as in the last function, strlen accepts 5-8.

Then on the form page, something like:

//default to an empty string when the form has not been submitted yet
$strPostAddress = $_POST['address'] ?? '';
$strPostPostcode = $_POST['postcode'] ?? '';

$strCheckAddress = fnc_address_validation($strPostAddress);
$strCheckPostcode = fnc_postcode_validation($strPostPostcode);


if ($strCheckAddress === false && $strCheckPostcode === false)
  {
    //do whatever as all ok - insert into DB etc
  }


//escape the submitted values before echoing them back into the HTML
echo '<p>Address <input type="text" name="address" size="35" maxlength="50"
     value="'.htmlspecialchars($strPostAddress).'"> '.$strCheckAddress.'</p>';
echo '<p>Postcode <input type="text" name="postcode" size="35" maxlength="8"
      value="'.htmlspecialchars($strPostPostcode).'"> '.$strCheckPostcode.'</p>';

You can change the `return false` in the functions to return text instead, if you want the form to confirm that each field is, for example, "Ok". Then just check that all the returned strings are == "Ok".

James
0

On this site there are some regular expressions for different countries: http://www.pixelenvision.com/1708/zip-postal-code-validation-regex-php-code-for-12-countries/

The regular expression for the Netherlands has the same problem as pointed out for the regular expression provided by Georges Brisset. It should be:

'/^[1-9][0-9]{3} ?[a-zA-Z]{2}$/'.

Georges Brisset has delivered you a lot of regular expressions and that is great. You could create a properties file and when you need the validation get the proper regular expression from this file and test the input with it.

The content of such a file would be something like this:

us=/.../
uk=/.../
nl=/.../
ge=/.../

The key value is the official abbreviation of a country. That list can be found here: http://www.iso.org/iso/country_codes/iso_3166_code_lists/country_names_and_code_elements.htm

After that you just read the file using:

$expressions = file('expressions.txt');

which returns the content of the file as an array of lines. Next you loop through the list (I would order it by how likely you are to have to test the zip code of a particular country) and run the expression that follows the = character.
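The file handling could look roughly like this. It is a sketch under the assumption that expressions.txt holds one `code=/pattern/` entry per line, as in the layout shown above; the function names are hypothetical. If the country is already known, a direct lookup by key can stand in for the loop.

```php
<?php
// Reads "code=/pattern/" lines into an array keyed by country code.
// loadExpressions() and matchesCountry() are names invented for this sketch.
function loadExpressions(string $path): array
{
    $expressions = [];
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return $expressions; // file could not be read
    }
    foreach ($lines as $line) {
        // Split on the first "=" only, because a pattern may contain "=".
        $parts = explode('=', $line, 2);
        if (count($parts) === 2) {
            $expressions[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $expressions;
}

// True when a pattern exists for $country and $postcode matches it.
function matchesCountry(array $expressions, string $country, string $postcode): bool
{
    return isset($expressions[$country])
        && preg_match($expressions[$country], $postcode) === 1;
}
```

Usage would be along the lines of `matchesCountry(loadExpressions('expressions.txt'), 'nl', '1234 AB')`.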

Loek Bergman
0
// preg_match() needs delimiters around the pattern, here "/"
$address1CheckPattern = '/^[a-zA-Z0-9][a-zA-Z0-9- ,()\/]*$/';
// consider the following types of address as well:
// new part 5/12, georgia
// 12 old street -10
// old street block 4 (b)

$adress1HasError = !preg_match($address1CheckPattern, $address1);
Notepad
0

It should work if you put the whitespace ahead of the hyphen:

$address1CheckPattern = '/[^A-Za-z0-9_ -]/';

after the hyphen it would be interpreted as the range from underscore to space.
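Note that this character class is negated, so the pattern matches when the input contains a character that is not allowed. A short sketch of how it would plug into the question's variables (the sample values are made up):

```php
<?php
// The class is negated: a match means a disallowed character was found,
// so the result is the error flag itself, without the "!" from the question.
$address1CheckPattern = '/[^A-Za-z0-9_ -]/';

$address1 = '123 Some Road';
$adress1HasError = preg_match($address1CheckPattern, $address1) === 1;
var_dump($adress1HasError); // bool(false)

$address1WithBang = '123 Some Road!';
$bangHasError = preg_match($address1CheckPattern, $address1WithBang) === 1;
var_dump($bangHasError); // bool(true)
```

With this approach an empty string reports no error, so a separate length or required-field check is still needed.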

Hope that helps :)

Severin
0

For street names in the Netherlands I would suggest:

/^[a-zA-Z]+\s[0-9]+[a-zA-Z]?$/

Where [a-zA-Z]+ is the street name
\s the space between the street name and the number
[0-9]+ all the digits after the street name
[a-zA-Z]? the possible add-on after the digits

Possible streetnames:

Amsterdamseweg 2
Hilversumsestraat 38a
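A quick check of the pattern against these examples, written with `/` delimiters because `preg_match()` does not accept a backslash as delimiter. The function name and the third sample are added purely for illustration; the third one shows that a street name containing a space is rejected.

```php
<?php
// Hypothetical wrapper around the pattern suggested in this answer.
function isDutchStreetLine(string $line): bool
{
    return preg_match('/^[a-zA-Z]+\s[0-9]+[a-zA-Z]?$/', $line) === 1;
}

var_dump(isDutchStreetLine('Amsterdamseweg 2'));      // bool(true)
var_dump(isDutchStreetLine('Hilversumsestraat 38a')); // bool(true)
var_dump(isDutchStreetLine('Van Goghstraat 12'));     // bool(false): space inside the name
```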

user2707496
0

To validate postal codes you can use something like this. In this case it is the Spanish postal code validation, but you can find more examples at https://rgxdb.com/r/3EZMSVWM

$add_check = '/^(?:0[1-9]|[1-4]\d|5[0-2])\d{3}$/';

if(!preg_match($add_check, $postal_code)){
  //error
}