5

I have a bunch of strings in PHP that all look like this:

10 NE HARRISBURG
4 E HASWELL
2 SE OAKLEY
6 SE REDBIRD
PROVO
6 W EADS
21 N HARRISON

What I need to do is remove the numbers and the letters that come before the city names. The problem is that the prefix varies a lot from city to city; the data is almost never the same. Is it possible to remove this data and keep it in a separate string?

hakre
shinjuo

4 Answers

7

Check out regular expressions and preg_replace:

$nameOfCity = preg_replace("/^\d+\s+\w{1,2}\s+/", "", $source);

Explained:

  1. ^ matches the beginning of the string
  2. \d+\s+ matches one or more digits followed by one or more white-space characters
  3. \w{1,2}\s+ then matches one or two word characters (letters, digits, or underscore) followed by one or more white-space characters
  4. The rest of the string should be the name of the city.
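To make the steps above concrete, here is a minimal sketch that runs the same pattern over a few of the sample strings from the question. The $samples and $cities variable names are only illustrative.

```php
<?php
// Sample strings taken from the question.
$samples = [
    '10 NE HARRISBURG',
    '4 E HASWELL',
    'PROVO',
    '21 N HARRISON',
];

$cities = [];
foreach ($samples as $source) {
    // Strip a leading "<digits> <one or two word characters> " prefix, if present.
    // Strings without such a prefix (e.g. "PROVO") are returned unchanged.
    $cities[] = preg_replace('/^\d+\s+\w{1,2}\s+/', '', $source);
}

print_r($cities);
```

Note that this only returns the city name; the removed prefix is discarded rather than kept.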

Cases not covered

  • If there's only the text qualifier before the city name
  • If there's only a number qualifier before the city name
  • If there's only a number qualifier and the city's name is two letters long.

If you want to be more precise, you could enumerate all the possible letters before the city name, (S|SE|E|NE|N|NW|W|SW), instead of matching any one- or two-letter string.
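A possible way to combine the enumerated directions with the question's request to keep the removed data is to capture the parts with preg_match instead of deleting them. The sketch below assumes the prefix is either fully present (number plus direction) or absent; the splitLocation function and its array keys are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: splits "10 NE HARRISBURG" into distance, direction and city.
// Assumes the prefix is either "<number> <direction> " or missing entirely.
function splitLocation(string $source): array
{
    $source = trim($source);

    // Optional prefix: digits, then one of the eight compass abbreviations.
    // Whatever follows (possibly several words) is captured as the city.
    $pattern = '/^(?:(\d+)\s+(NE|NW|SE|SW|N|E|S|W)\s+)?(.+)$/';

    if (!preg_match($pattern, $source, $m)) {
        // Only reached for an empty string.
        return ['distance' => null, 'direction' => null, 'city' => $source];
    }

    // Groups that did not take part in the match come back as empty strings.
    return [
        'distance'  => $m[1] !== '' ? (int) $m[1] : null,
        'direction' => $m[2] !== '' ? $m[2] : null,
        'city'      => $m[3],
    ];
}

print_r(splitLocation('10 NE HARRISBURG'));
print_r(splitLocation('PROVO'));
```

A string with only a direction or only a number in front (for example "E HASWELL") is treated as a city name in its entirety, so those cases would still need separate handling.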

Aleksi Yrttiaho
1

For each line, try this:

$arr = preg_split('/ /', $line);

if(count($arr) === 3)
{
    // $arr[0] is the number
    // $arr[1] is the letter
    // $arr[2] is your city
}
else
{
    // Like "PROVO" no number, no letter
}

Yes, this code is horrible, but it works, and it keeps all your data. The important thing is to use preg_split rather than the deprecated split function.

William Durand
  • Wouldn't preg_match be a better fit? Especially if the city name contains two or more words, e.g. New York, San Marino, Frankfurt am Main – Aleksi Yrttiaho Mar 01 '11 at 00:57
  • Given that you're splitting on a non-regular expression, you can use `explode()`. However, the above fails for cities with multiple words such as New York and Sun Valley. – David Harkness Mar 01 '11 at 00:59
  • Yes... `explode()` should work fine with $limit = 3 according to the [explode doc](http://php.net/manual/en/function.explode.php). – William Durand Mar 01 '11 at 01:06
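The explode() variant suggested in the comments could look roughly like the sketch below. It assumes the fields are separated by single spaces, and the splitWithExplode name and the returned array shape are hypothetical. The extra ctype_digit() check is an addition, not part of the original answer: without it, a three-word city name with no prefix would be split apart.

```php
<?php
// Hypothetical helper built on explode(); assumes single-space separators.
function splitWithExplode(string $line): array
{
    // Limit 3: everything after the second space stays in the last element,
    // so multi-word city names are kept together.
    $parts = explode(' ', $line, 3);

    // Only treat the line as "number, direction, city" when it starts with digits.
    if (count($parts) === 3 && ctype_digit($parts[0])) {
        return $parts;
    }

    // No recognisable prefix: the whole line is the city.
    return [null, null, $line];
}

print_r(splitWithExplode('2 SE NEW YORK'));
print_r(splitWithExplode('PROVO'));
```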
1

See if the following works for you:

$new_str = preg_replace('/^([0-9]* \w+ )?(.*)$/', '$2', $str);
1

If you want to get a list of cities as an array, try:

if(preg_match_all("/(\w+$)/", $source, $_matches)) {
  $cities = $_matches[1];
}
Ben Rowe
  • Like William Durand's solution, this doesn't match cities whose names contain more than one word. The expression must also contain the unwanted parts, but with a zero-or-one quantifier. Furthermore, preg_match_all is a bit unnecessary, as there is only one possible match. It doesn't hurt, though. – Aleksi Yrttiaho Mar 01 '11 at 01:01