I'm trying to extract an address (written in French) out of a listing using regex. Here is an example:

"Don't wait, this home won't be on the market for long! Pictures can be forwarded upon request.

123 de la street - city 345-555-1234 "

Imagine that whole thing is item.description. Here is a working set so far:

In "item.description", replace "^\d{1,4} des|de la|du [^,\s]+$" with "whatever"

and the address (123 de la street) is correctly overwritten with "whatever". But if I try to make the address the only thing kept from the description, with something like this, it doesn't work:

In "item.description" replace "(.)(^\d{1,4} des|de la|du [^,\s]+$)(.)" with "$2"

What would be the best way to replace the whole description with just the address?

Thanks!

Juha Syrjälä
JB Lesage

1 Answer

Try adding * to the first and last groups, so that (.) becomes (.*), and watch out for the ^ and $ anchors, which match the start and end of the text:

"^(.*)(\d{1,4} des|de la|du [^,\s]+)(.*)$"
Miroslav Bajtoš
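Two details of the pattern in the answer above may be worth checking: the alternation `des|de la|du` is not grouped, and the leading `(.*)` is greedy. The sketch below uses Python's `re` module rather than Yahoo Pipes, so it assumes the two engines treat these constructs alike; the sample text and variable names are made up for illustration.

```python
import re

# Sample text modeled on the listing in the question, kept on one line.
text = "Pictures can be forwarded upon request. 123 de la street - city 345-555-1234"

# Ungrouped alternation is parsed as: (\d{1,4} des) | (de la) | (du [^,\s]+)
loose = re.compile(r"\d{1,4} des|de la|du [^,\s]+")

# Grouped alternation: a number, then one of the articles, then the street name.
address = re.compile(r"\d{1,4} (?:des|de la|du) [^,\s]+")

loose_match = loose.search(text).group(0)
address_match = address.search(text).group(0)
print(loose_match)    # de la
print(address_match)  # 123 de la street

# A greedy prefix swallows as many leading digits as it can get away with;
# a lazy prefix (.*?) stops at the first place the address can start.
greedy = re.sub(r"^(.*)(\d{1,4} (?:des|de la|du) [^,\s]+)(.*)$", r"\2", text)
lazy = re.sub(r"^(.*?)(\d{1,4} (?:des|de la|du) [^,\s]+)(.*)$", r"\2", text)
print(greedy)  # 3 de la street
print(lazy)    # 123 de la street
```

In Python the replacement refers to the second group as `\2`; the `$2` form from the question is the Yahoo Pipes (Perl-style) spelling.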
  • Thanks Miroslav, I tried this as well with no luck. I would have assumed this to work though... have a look at the comment I left on David's answer to see if that changes anything – JB Lesage Jun 04 '09 at 12:45
  • Since your text spans multiple lines, I would assume the problem is that "." doesn't match newline characters. I am not familiar with Yahoo Pipes, so I can't advise you on how to change this behaviour. – Miroslav Bajtoš Jun 04 '09 at 13:09
  • The multiline was the problem, I just removed all <br> tags before running this regex and it worked. Thank you! – JB Lesage Jun 04 '09 at 15:41
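The newline issue discussed in these comments can be reproduced outside Yahoo Pipes. The following is a minimal sketch in Python, assuming its `re` module behaves like the Pipes engine on this point; the pattern uses a grouped alternation and a lazy prefix, which are adjustments for the demo rather than the exact pattern from the answer.

```python
import re

# The listing from the question, with its line breaks preserved.
description = (
    "Don't wait, this home won't be on the market for long! "
    "Pictures can be forwarded upon request.\n"
    "\n"
    "123 de la street - city 345-555-1234 "
)

pattern = r"^(.*?)(\d{1,4} (?:des|de la|du) [^,\s]+)(.*)$"

# Without DOTALL, "." stops at each newline, so the prefix group cannot
# reach the address line and the text comes back unchanged.
unchanged = re.sub(pattern, r"\2", description)

# With DOTALL, "." also matches "\n", so the whole description is replaced.
address_only = re.sub(pattern, r"\2", description, flags=re.DOTALL)

# Alternative: collapse all whitespace (including line breaks) first,
# in the spirit of the workaround described in the last comment.
flattened = " ".join(description.split())
address_flat = re.sub(pattern, r"\2", flattened)

print(unchanged == description)  # True
print(address_only)              # 123 de la street
print(address_flat)              # 123 de la street
```

Whether Yahoo Pipes exposes an equivalent of the DOTALL flag is not something this sketch can confirm; flattening the text beforehand avoids the question entirely.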