
I have a string containing a list of website names, each followed by a delimiter and then the URL. Each name/URL pair is followed by a space and a set of newline delimiters. Unfortunately the newline delimiters are not always present. There is always a space between pairs, but because site names can contain spaces I can't simply split on space.

I have a regex that, according to RegexPlanet, matches all but the last pair.

Is it possible to get the last pair also?

Regex:
(.+?(?=\|)).(.+?(?= ))

Example String:
Website 1|https://site1.example.com \r\nWeb Site 2|https://2.example.co.uk \r\nSite 3|https://w3.example.com.au site 4|https://s4.example.org \r\nWeb Site5|https://s5.other.example.ac.uk/

RegexPlanet reports that the regex matches the first four sites but not the fifth.

Any ideas would be greatly welcomed.

Gavin

2 Answers


Just add |$ inside the final lookahead, so that it matches either before a space or at the end of the string:

(.+?(?=\|)).(.+?(?= |$))
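
To see the difference, here is a minimal Python sketch that runs both patterns over the example string from the question. It assumes the \r\n sequences in the example are real carriage return and line feed characters, and the variable names are made up for illustration. Python's re module is used here only as a convenient test bed; other engines may treat . and $ slightly differently.

```python
import re

# Assumption: the sample contains real CR/LF characters, not the
# literal text backslash-r backslash-n.
SAMPLE = (
    "Website 1|https://site1.example.com \r\n"
    "Web Site 2|https://2.example.co.uk \r\n"
    "Site 3|https://w3.example.com.au site 4|https://s4.example.org \r\n"
    "Web Site5|https://s5.other.example.ac.uk/"
)

# Regex from the question: the URL must be followed by a space.
ORIGINAL = re.compile(r"(.+?(?=\|)).(.+?(?= ))")
# Regex from this answer: the URL may also be followed by end of string.
FIXED = re.compile(r"(.+?(?=\|)).(.+?(?= |$))")

# With two capture groups, findall returns a list of (name, url) tuples.
original_pairs = ORIGINAL.findall(SAMPLE)
fixed_pairs = FIXED.findall(SAMPLE)

print(len(original_pairs))  # pairs found by the original regex
print(len(fixed_pairs))     # pairs found by the fixed regex
print(fixed_pairs[-1])      # the last pair, now captured
```

Note that when two pairs are separated only by a space (as with site 4), the captured name keeps that leading space, so you may want to trim the names afterwards.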
Hülya

You may use this regex with 2 capture groups:

([^|]+)\|(.+?(?=\s|\z))

Regex Details:

  • ([^|]+): Capture group #1 to match 1+ of any character that is not |
  • \|: Match a literal |
  • (.+?(?=\s|\z)): Capture group #2 to lazily match 1+ of any character, stopping at the first position that is followed by a whitespace character or the end of the input (\z)
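
As a rough illustration of the breakdown above, here is a small Python sketch. It is an adaptation rather than the exact pattern: Python's re module has long spelled the end-of-input anchor \Z, so \Z stands in for \z. The helper name parse_pairs is hypothetical, and the sample assumes the \r\n in the question are real newline characters.

```python
import re

# \Z is Python's end-of-input anchor, standing in for \z from the answer.
PAIR = re.compile(r"([^|]+)\|(.+?(?=\s|\Z))")


def parse_pairs(text):
    """Return a list of (name, url) tuples (hypothetical helper)."""
    # [^|] also matches the whitespace and newlines left between pairs,
    # so the captured name is stripped before it is returned.
    return [(m.group(1).strip(), m.group(2)) for m in PAIR.finditer(text)]


# Assumption: real CR/LF characters, as in the question's example.
sample = (
    "Website 1|https://site1.example.com \r\n"
    "Web Site 2|https://2.example.co.uk \r\n"
    "Site 3|https://w3.example.com.au site 4|https://s4.example.org \r\n"
    "Web Site5|https://s5.other.example.ac.uk/"
)

for name, url in parse_pairs(sample):
    print(f"{name} -> {url}")
```

Because the first group is a negated character class rather than a dot, it also swallows the separators between pairs, which is why the sketch strips each name.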
anubhava
    Both answers solve my problem, but this one also improves the regex I had and explains it. Thanks. – Gavin Nov 11 '20 at 10:51