
I have a string that I need to parse using regex. This string is:

http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 Crachet 7/12

I am trying to separate the URL from the comment, so I tried:

(\S+)\s(.+)

but as a result, I get:

$1 => http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 Crachet

$2 => 7/12

So it seems that the first space-like character is not a space!

I tried replacing \s with 'X' and got:

http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 CrachetX7/12

I am sure there is something strange in the string.

I tried replacing each whitespace character (\n, \t, etc.) with 'X', but I cannot find out what this "space lookalike" is.

How can I identify this character and split my string?
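One way to identify such a character outside of Pipes is to print the code point of everything in the string that is not a visible ASCII character. The following is only a sketch in Python, not something Pipes itself offers. The function name `describe_suspects` and the sample string are made up for illustration, and the no-break space in the sample is an assumption, not a confirmed diagnosis of my data.

```python
import unicodedata


def describe_suspects(text):
    """List every character that is not a visible ASCII character.

    Returns (index, code point, Unicode name) tuples, so ordinary spaces
    and any "space lookalikes" show up side by side.
    """
    report = []
    for index, ch in enumerate(text):
        # Visible ASCII runs from "!" (U+0021) to "~" (U+007E).
        if not ("!" <= ch <= "~"):
            name = unicodedata.name(ch, "<unnamed control character>")
            report.append((index, f"U+{ord(ch):04X}", name))
    return report


# Hypothetical sample: a no-break space (U+00A0) where a space is expected.
sample = "id=1\u00a0Crachet 7/12"
for entry in describe_suspects(sample):
    print(entry)
```

If every reported entry is a plain U+0020 SPACE, the string itself is fine and the problem is more likely in how the regex is applied.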

EDIT:

If you want to play with my code, this is a Yahoo! Pipe: http://pipes.yahoo.com/pipes/pipe.edit?_id=a732be6cf2b7cb92cec5f9ee6ebca756

According to the Pipes documentation, it looks like it uses fairly standard regex syntax.

Some tests:

[Two screenshots of the Pipes test results; images not available]

Waza_Be
  • What language? Please tag it. Your regex works perfectly fine in perl, for example. – Brian Roach Sep 29 '11 at 16:20
  • That's Yahoo! Pipes (added to my question) – Waza_Be Sep 29 '11 at 16:22
  • Your pipes example ... works fine. It shows `title` as `Crachet 7/12` (?) – Brian Roach Sep 29 '11 at 16:27
  • In your [previous question](http://stackoverflow.com/q/7596250/20670), `\S+` matched the part up to the first space before `Crachet` correctly as evidenced by the second screenshot. So what did change in-between? – Tim Pietzcker Sep 29 '11 at 16:28
  • It worked only for the first one... I need a few minutes of reflection to put all the information together. Thanks a lot for your help. – Waza_Be Sep 29 '11 at 16:32
  • You still haven't anchored the regex to the start of the line (`^` anchor) or used the multiline modifier (`m`) --> `^(\S+)...` and check the `m` checkbox. – Tim Pietzcker Sep 29 '11 at 16:34
  • OK, now \S+ is working fine! I just need the opposite of (\S+), so I can extract the other field. Please post it as an answer, so I can accept it ;-) – Waza_Be Sep 29 '11 at 16:45

1 Answer

Try the regex

^(\S+)\s+(.*)$

with the g and m modifier checkboxes checked.
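To check the pattern outside of Pipes, here is a minimal sketch in Python. It assumes that `re.MULTILINE` behaves like the m checkbox and that `findall` stands in for the g checkbox; Pipes may differ in details. The helper name `split_url_and_comment` and the second input line are invented for the example.

```python
import re

# ^ and $ anchor at every line start and end because of re.MULTILINE.
PATTERN = re.compile(r"^(\S+)\s+(.*)$", re.MULTILINE)


def split_url_and_comment(text):
    """Return a (url, comment) pair for each line that matches."""
    return PATTERN.findall(text)


text = (
    "http://carto1.wallonie.be/documents/terrils/fiche_terril.idc?TERRIL_id=1 Crachet 7/12\n"
    "http://example.com/page?id=2 Another comment"  # hypothetical second line
)

for url, comment in split_url_and_comment(text):
    print(url, "|", comment)
```

Without the `^` anchor the match is free to start anywhere in the input, which is why anchoring matters here. Note that `\s+` can also match a newline, so a line with a URL and no comment would pull the following line into the second group.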

Tim Pietzcker