I have a few hundred html files that I want to show on a website. They all have links in them in the following format:

[My Test URL|https://www.mywebsite.com/test?param=123]

The problem is that some URLs are broken up by a stray space, like so:

[My Test URL|https://www.mywebsite.c om/test?param=123]

I want to replace all of those with their HTML counterpart, like so:

<a href="https://www.mywebsite.com/test?param=123">My Test URL</a>

I know that with the regex "/[(.*?)]/" I can match the brackets, but how can I split by the pipe, remove the whitespace in the URL, and convert everything to a string?

user754730
  • If you want to match both parts, add a capture group for each (and escape `[` and `]`); then it's just a case of looping over the matches to output the links: https://3v4l.org/2Jaid – Lawrence Cherone Dec 30 '21 at 17:43
  • Thanks for your help, it works with the given regex. But how can I replace the links in the text itself? I can't just echo the links; I need to replace them in place. – user754730 Dec 30 '21 at 17:45
  • Use [preg_replace](https://www.php.net/manual/en/function.preg-replace.php): https://3v4l.org/9Depu – Lawrence Cherone Dec 30 '21 at 17:49
  • Thx a lot:) But then again I have the problem with the whitespaces in the URLs which won't be replaced. Is there a way around that as well? – user754730 Dec 30 '21 at 17:55
  • Oh, I didn't see the space. You can use preg_replace_callback to call a function that processes the matches before they are replaced: https://3v4l.org/5PbGD. That said, `https://www.mywebsite.c om/test?param=123` does not contain a valid domain name, so it would be better to fix whatever is introducing the spaces. If the text is hard-coded, a find-and-replace in your editor (`mywebsite.c om` -> `mywebsite.com`) would do, with no code needed. – Lawrence Cherone Dec 30 '21 at 18:02
  • or just do `$str = str_replace('mywebsite.c om', 'mywebsite.com', $str);` before you do the regex – Lawrence Cherone Dec 30 '21 at 18:13
  • Awesome! Your latest 3v4l works fine! Thanks a lot! Unfortunately I can't replace them because the space is always somewhere else in the URL and the URL is also different every time. Thanks a lot for your help! – user754730 Dec 30 '21 at 18:21

3 Answers

If you just want to remove whitespace in the URL part of these bracketed links, a single preg_replace is enough:

$text = preg_replace('~(?:\G(?!\A)|\[[^][|]*\|)[^][\s]*\K\s+(?=[^][]*])~', '', $text);

See the regex demo. Details:

  • (?:\G(?!\A)|\[[^][|]*\|) - end of the previous match or [, then zero or more chars other than [, ] and | and then a | char
  • [^][\s]* - zero or more chars other than [, ] and whitespace
  • \K - discard all text matched so far
  • \s+ - one or more whitespaces
  • (?=[^][]*]) - there must be zero or more chars other than [ and ] and then a ] immediately to the right of the current location.
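
To see the whitespace-only pattern at work on more than one link, here is a minimal sketch. The sample text and the variable names `$pattern` and `$clean` are illustrative assumptions, not part of the original question. Note that spaces in the link text before the pipe are left alone.

```php
<?php
// Illustrative input: two bracketed links whose URL parts contain stray spaces.
$text = 'See [My Test URL|https://www.mywebsite.c om/test?param=123] and [Other Link|https://exa mple.org/a b].';

// Same pattern as above: only whitespace after the pipe and before the closing ] is matched.
$pattern = '~(?:\G(?!\A)|\[[^][|]*\|)[^][\s]*\K\s+(?=[^][]*])~';

$clean = preg_replace($pattern, '', $text);

echo $clean, PHP_EOL;
// See [My Test URL|https://www.mywebsite.com/test?param=123] and [Other Link|https://example.org/ab].
```

The `\G` branch is what lets the pattern pick up a second or third space inside the same URL, since each new match may start where the previous one ended.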

If you want to remove spaces inside the URL part and also convert the links to HTML, preg_replace_callback is the better fit:

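$text = '[My Test URL|https://ww w.mywebsite.c om/t  est?param=123]';

echo preg_replace_callback('/\[([^][|]*)\|([^][]+)]/', function ($m) {
    // $m[1] is the link text, $m[2] the URL; strip every whitespace character from the URL.
    return '<a href="' . preg_replace('/\s+/', '', $m[2]) . '">' . $m[1] . '</a>';
}, $text);
// <a href="https://www.mywebsite.com/test?param=123">My Test URL</a>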

See the PHP demo. Details:

  • \[ - a [ char
  • ([^][|]*) - Group 1: any zero or more chars other than [, ] and |
  • \| - a | char
  • ([^][]+) - Group 2: any one or more chars other than ] and [
  • ] - a ] char.
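
Because the result is dropped into HTML, it may be worth escaping the captured parts before building the tag. The following is a sketch only: the function name `convertBracketLinks` is hypothetical, and it assumes that all whitespace in the URL is a stray artifact and that entities already present should not be encoded twice. If the link text can itself contain markup, escaping `$label` would not be appropriate.

```php
<?php
// Hypothetical helper: converts every [text|url] pair in $text to an <a> tag.
function convertBracketLinks(string $text): string
{
    $result = preg_replace_callback('/\[([^][|]*)\|([^][]+)]/', function (array $m): string {
        // Assumption: any whitespace inside the URL is unwanted, so drop it all.
        $url = preg_replace('/\s+/', '', $m[2]);
        // Escape for attribute and element context; the final "false" avoids
        // re-encoding entities such as &amp; that are already present.
        $href  = htmlspecialchars($url, ENT_QUOTES, 'UTF-8', false);
        $label = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8', false);
        return '<a href="' . $href . '">' . $label . '</a>';
    }, $text);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo convertBracketLinks('[My Test URL|https://www.mywebsite.c om/test?param=123]'), PHP_EOL;
// <a href="https://www.mywebsite.com/test?param=123">My Test URL</a>
```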
Wiktor Stribiżew

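If you only need to remove space characters from a string in PHP (this does not touch tabs or newlines), all you need is:

$string = str_replace(' ', '', $string);

Here $string should hold only the URL; applied to the whole link, it would also strip the spaces from the link text.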
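
One caveat is worth illustrating: run over the whole bracketed string, str_replace also removes the spaces in the link text. A small sketch, with illustrative variable names, that splits at the first pipe and cleans only the URL half:

```php
<?php
$string = '[My Test URL|https://www.mywebsite.c om/test?param=123]';

// Applied to everything, the link text loses its spaces too.
$naive = str_replace(' ', '', $string);
// $naive is '[MyTestURL|https://www.mywebsite.com/test?param=123]'

// Strip the brackets, split at the first pipe, and clean only the URL half.
list($label, $url) = explode('|', trim($string, '[]'), 2);
$url = str_replace(' ', '', $url);

echo $label, ' -> ', $url, PHP_EOL;
// My Test URL -> https://www.mywebsite.com/test?param=123
```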

I think this will do it for you

$html = "[My Test URL|https://www.mywebsite.c om/test?param=123] [My Test URL|https://www.mywebsite.com/test?param=123]";
// Match each [text|url] pair; "~" is the delimiter, so "\|" is plainly a literal pipe.
$html = preg_replace_callback('~\[.*?\|.*?\]~', function ($matches) {
    // Drop the surrounding brackets, then split on the first pipe only.
    list($anchor, $link) = explode('|', substr($matches[0], 1, -1), 2);
    return '<a href="' . str_replace(' ', '', $link) . '">' . $anchor . '</a>';
}, $html);

echo $html;
// echoes <a href="https://www.mywebsite.com/test?param=123">My Test URL</a> <a href="https://www.mywebsite.com/test?param=123">My Test URL</a>