2

Hey, folks. I'm looking for some regular expressions to help grab street addresses and phone numbers from free-form text (a la Gmail).

Given some text: "John, I went to the store today, and it was awesome! Did you hear that they moved to 500 Green St.? ... Give me a call at +14252425424 when you get a chance."

I'd like to be able to pull out:

500 Green St. (recognized as a street address)

+14252425424 (recognized as a phone number)

What makes this problem easier is that I don't care about parsing the text that gets pulled out. That is, I don't care that Green is the name of the road or that 425 is the area code. I just want to grab strings that "look like" addresses or telephone numbers.

Unfortunately, this needs to work internationally, as best as possible.

Anyone have any leads? Thanks!

Juha Syrjälä
spitzanator

3 Answers

1

Phone numbers are easy as long as you have a list of all country codes and number formats. For street addresses I have no idea; the only advice I can give you is to validate each set of words at addressdoctor.com.
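For the "looks like a phone number" half, a loose candidate matcher might be enough as a first pass. The sketch below is only an illustration: the pattern, the 8 to 15 digit range, and the name `find_phone_candidates` are assumptions rather than an established regex, and it does not check country codes or national formats.

```python
import re

# Hypothetical loose pattern (an assumption, not a validated format list):
# an optional "+", then 8 to 15 digits, allowing spaces, dots, hyphens,
# and parentheses between the digits.
PHONE_CANDIDATE = re.compile(
    r"(?<![\w+])"              # do not start inside a word or a longer number
    r"\+?\(?\d"                # optional "+", optional "(", first digit
    r"(?:[\s().-]*\d){7,14}"   # 7 to 14 more digits with optional separators
    r"(?!\d)"                  # do not stop in the middle of a digit run
)

def find_phone_candidates(text):
    """Return substrings that look like phone numbers (no validation)."""
    return [m.group(0) for m in PHONE_CANDIDATE.finditer(text)]

text = ("Did you hear that they moved to 500 Green St.? ... "
        "Give me a call at +14252425424 when you get a chance.")
print(find_phone_candidates(text))
```

On the sample text from the question this picks up +14252425424 and skips the "500" in the address, because three digits fall below the minimum length. Checking candidates against real country codes and formats would still be a separate step.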

Alix Axel
1

You can give RecogniContact (-> address-parser.com) a try, it recognizes both postal addresses and phone numbers.

Mike Warner
0

Take a look at Chapter 7 of Dive Into Python. It touches on both phone numbers and street addresses. I believe you can use this as a starting point. The international part seems tough. I suggest you build a first draft, try it on several locales, iterate, and improve.
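To make the "first draft" idea concrete, here is a rough sketch of a street-address candidate matcher. It is not taken from Dive Into Python; the suffix list, the pattern, and the name `find_address_candidates` are all assumptions for illustration. It only covers English-style "number, name, suffix" addresses, so other locales would need their own patterns.

```python
import re

# Assumption: a short, illustrative list of English street suffixes.
SUFFIXES = r"(?:St|Street|Ave|Avenue|Rd|Road|Blvd|Boulevard|Dr|Drive|Ln|Lane)"

ADDRESS_CANDIDATE = re.compile(
    r"\b\d{1,6}\s+"                # house number
    r"(?:[A-Z][\w'-]*\s+){1,4}"    # one to four capitalized name words
    + SUFFIXES + r"\b\.?"          # street suffix, optional trailing period
)

def find_address_candidates(text):
    """Return substrings that look like street addresses (no validation)."""
    return [m.group(0) for m in ADDRESS_CANDIDATE.finditer(text)]

text = ("Did you hear that they moved to 500 Green St.? ... "
        "Give me a call at +14252425424 when you get a chance.")
print(find_address_candidates(text))
```

On the sample text from the question this returns only "500 Green St.". Trying it on several locales, as suggested above, would quickly show where the suffix list and the word-order assumption break down.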

Yuval F
  • Ah, but I imagine this problem is already solved. Do you know of any already-existing regular expressions that I may employ? Thanks. – spitzanator May 22 '09 at 21:41
  • Well, you can check http://regexlib.com/. It's the #1 source of regex solutions for problems that shouldn't be solved with regexes. ;) – Alan Moore May 22 '09 at 22:00
  • Alan, this looks like a great resource, thanks. A cursory search gave me several international phone number regexes, but no international street address ones. I still believe this is hard. – Yuval F May 23 '09 at 04:24