
I have the following string:

$linkString="The Following is a link to google <a class='links' href='http://google.com'>

http://google.com
</a>
";

In this string, the link text of the HTML link is on a new line. I want to remove, and possibly replace, the whole link (both its HTML tags and its link text) from the string, so I tried the following:

<?php
$linkString="The Following is a link to google <a class='links' href='http://google.com'>

http://google.com
</a>
";

//Remove link tag:

echo preg_replace('/<[^>]*>/','',$linkString);

However, the above example prints out:

The Following is a link to google 

http://google.com

This is an online DEMO: http://codepad.org/whw81bwa

I am looking for a regex that can remove the whole link (tags and link text).

SaidbakR

2 Answers


Instead of using a regex, you can let DOM do this for you.

$doc = new DOMDocument;
@$doc->loadHTML($html); // load the HTML data

$xpath = new DOMXPath($doc);  

foreach ($xpath->query('//a') as $tag) {
   $tag->parentNode->removeChild($tag);
}

echo $doc->saveHTML();
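
The comments below ask about replacing the link rather than removing it, and about UTF-8 handling. The following is a minimal sketch of one way to do that, not a definitive implementation. The helper name `replaceLinks`, the wrapping `<div>`, and the `<meta>` charset prefix are assumptions made for illustration; `$linkString` is the string from the question.

```php
<?php
// Input from the question: the link text sits on its own lines.
$linkString = "The Following is a link to google <a class='links' href='http://google.com'>

http://google.com
</a>
";

// Hypothetical helper: replaces every <a> element (tags and text) with plain text.
function replaceLinks(string $html, string $replacement): string
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Assumption: the meta tag hints UTF-8 to the parser, and the <div>
    // gives the fragment a single known parent so it can be extracted again.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//a') as $link) {
        // Swap the whole element for a text node instead of removing it.
        $link->parentNode->replaceChild($doc->createTextNode($replacement), $link);
    }

    // Serialize only the children of the wrapper, not the full document.
    $wrapper = $xpath->query('//body/div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo replaceLinks($linkString, 'A Hidden Link');
```

For the string from the question this should print `The Following is a link to google A Hidden Link`. Serializing only the wrapper's children avoids the doctype, `<html>`, and `<body>` wrappers that `saveHTML()` adds when called without an argument.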
hwnd
  • Thank you for the valuable idea, but I found another regex that also allows replacing the link. It is shown in my answer. – SaidbakR Feb 16 '15 at 00:00
  • I tried to use your solution and got this error: `Strict Standards: Non-static method DOMDocument::loadHTML() should not be called statically in E:\XXX\viewtopic.php on line 1422`, plus a UTF-8 issue in the rendered text. @hwnd – SaidbakR Feb 16 '15 at 00:29
  • See the update. You probably also need to add a meta header to force UTF-8 interpretation. – hwnd Feb 16 '15 at 00:49
  • I found that passing `mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8");` to the `loadHTML` method solves the encoding issue ([check this question](http://stackoverflow.com/questions/1154528/how-to-force-xpath-to-use-utf8)). However, is there a way to `replaceChild` instead of `removeChild`? @hwnd – SaidbakR Feb 16 '15 at 06:39
  • Oh, sorry, there is indeed already a method called `replaceChild`. Thank you. – SaidbakR Feb 16 '15 at 06:45

The following regex solves the issue. The `s` modifier is needed so that `.` also matches the newlines inside the link:

/(?i)<a([^>]+)>(.+?)<\/a>/s

So,

<?php
$linkString="The Following is a link to google <a class='links' href='http://google.com'>

http://google.com
</a>
";

// Replace the whole link (tags and link text):

echo preg_replace('/(?i)<a([^>]+)>(.+?)<\/a>/s','A Hidden Link',$linkString);
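
Because the link text in the question spans several lines, it matters whether `.` is allowed to match newlines. The sketch below compares the same pattern with and without the `s` (DOTALL) modifier. The pattern is a slight variation chosen for illustration (`\b` after `a` and `[^>]*` for the attributes), not the exact pattern above.

```php
<?php
// Input from the question: the link text sits on its own lines.
$linkString = "The Following is a link to google <a class='links' href='http://google.com'>

http://google.com
</a>
";

// Without the s modifier, "." stops at newlines, so the multi-line link
// is never matched and the string comes back unchanged.
$withoutS = preg_replace('/<a\b[^>]*>.*?<\/a>/i', 'A Hidden Link', $linkString);

// With the s modifier, "." also matches newlines, so the whole link
// (opening tag, text, closing tag) is replaced.
$withS = preg_replace('/<a\b[^>]*>.*?<\/a>/is', 'A Hidden Link', $linkString);

var_dump($withoutS === $linkString); // whether the string was left untouched
echo $withS;
```

The `\b` keeps the pattern from matching other tags that merely start with `a`, such as `<abbr>`. For HTML that is not under your control, a regex like this remains fragile, which is the argument for the DOM approach.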
SaidbakR