Your code doesn't work because your regex looks for <scrip, followed by one or more characters other than >, followed by script>.
There is no such substring in your content. In your $string, after <scrip you have a t (which matches [^>]+) and then you have a > instead of script>. So, no match.
Here's what you need to do instead:
$string = preg_replace("/<script.*?<\/script>/si", "", $string);
You cannot use [^<] or [^>] because JavaScript code may itself contain many < and > characters.
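To make that concrete, here is a minimal sketch. The sample $html string and the variable names are made up for illustration and are not taken from your code. It runs a pattern built on [^<]* and the lazy pattern above against a script whose body contains < and > itself.

```php
<?php
// Hypothetical sample input: the script body itself contains < and > characters.
$html = "<p>before</p><script>if (a < b && c > d) { run(); }</script><p>after</p>";

// Negated character class: [^<]* stops at the first < inside the JavaScript,
// so </script> is never reached and nothing is replaced.
$withCharClass = preg_replace("/<script[^<]*<\/script>/si", "", $html);

// Lazy .*? accepts any character, so it runs up to the first </script>.
$withLazy = preg_replace("/<script.*?<\/script>/si", "", $html);

var_dump($withCharClass === $html);                   // true when the input is left unchanged
var_dump($withLazy === "<p>before</p><p>after</p>");  // true when the script block is removed
```

The character-class version silently leaves the script in place, which is easy to miss because preg_replace reports no error when nothing matches.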
Here's what the above regex does:
• Search for <script
I intentionally did not include the closing > bracket here, because the script tag may have attributes, like <script type='text/javascript'>
• Followed by any sequence of characters, using lazy (non-greedy) matching
Note the .*? instead of .*: this captures as few characters as possible to find a match, instead of as many as possible. This avoids the following problem:
<script>something</script> other content <script>more script</script>
Without lazy matching, it would remove everything from the first <script> to the last </script>, including the other content in between.
• Followed by </script> to mark the end of the script section
Note I'm escaping the slash (\/ instead of /) because / is the regex delimiter character here. We could also have used a different character at the beginning and end of the regex, like #, and then the / would not have to be escaped.
• Finally, I added the s and i modifiers
s makes the dot match newline characters too. JavaScript code can of course contain line breaks, and we want .*? to match across them. i makes the match case insensitive, because I assume you want to replace <Script> or <SCRIPT> too.
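The points in the list above can be checked with a short sketch. The sample $html string and the variable names are again hypothetical. It compares the lazy pattern with a greedy one, with the # delimiter variant, and with versions that drop the s or the i modifier.

```php
<?php
// Hypothetical sample input: two script blocks, one upper-case with attributes and line breaks.
$html = "<script>something</script> other content <SCRIPT type='text/javascript'>\nmore script\n</SCRIPT>";

// Lazy match with s and i: each script block is removed on its own.
$lazy = preg_replace("/<script.*?<\/script>/si", "", $html);

// Greedy match: everything from the first <script to the last </script> is removed.
$greedy = preg_replace("/<script.*<\/script>/si", "", $html);

// Same lazy pattern with # as the delimiter, so the slash needs no escaping.
$hashDelimiter = preg_replace("#<script.*?</script>#si", "", $html);

// Without s, the dot cannot cross the line breaks in the second block.
$withoutS = preg_replace("/<script.*?<\/script>/i", "", $html);

// Without i, the upper-case <SCRIPT> block is not recognised.
$withoutI = preg_replace("/<script.*?<\/script>/s", "", $html);

var_dump($lazy);           // only " other content " is left
var_dump($greedy);         // empty string: the text between the blocks is lost too
var_dump($hashDelimiter === $lazy);
var_dump($withoutS);       // the second block is still present
var_dump($withoutI);       // the second block is still present
```

Keep in mind that this only removes well-formed <script>...</script> pairs. For untrusted input, a real HTML parser or sanitizer is the safer choice.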