I need to split a string on any non-alphanumeric character except / and -. For example, with preg_split():
/[^a-zA-Z0-9\/\-]/
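To make the current behaviour concrete, here is a minimal sketch of how that pattern behaves. The variable names and the PREG_SPLIT_NO_EMPTY flag are my own additions for illustration (the flag just drops the empty strings produced between adjacent delimiters).

```php
<?php
// Split on any character that is not alphanumeric, "/" or "-".
$basicPattern = '/[^a-zA-Z0-9\/\-]/';

// Plain text splits as intended.
$plainParts = preg_split($basicPattern, 'My string. More strings.', -1, PREG_SPLIT_NO_EMPTY);
// $plainParts is ['My', 'string', 'More', 'strings']

// A URL, however, is broken apart at ":", ".", "?" and "=".
$urlParts = preg_split(
    $basicPattern,
    'My string. https://my-url.com?q=3 More strings.',
    -1,
    PREG_SPLIT_NO_EMPTY
);
// $urlParts is ['My', 'string', 'https', '//my-url', 'com', 'q', '3', 'More', 'strings']
```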
This works great, but now I want to split the string at all these points except when the characters are found in a URL (i.e. I want to keep the URL together). I consider a URL to be a whitespace-delimited substring that starts with http:// or https://. In other words:
My string. https://my-url.com?q=3 More strings.
Should get split into:
[0] My
[1] string
[2] https://my-url.com?q=3
[3] More
[4] strings
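One way this output might be produced while still using preg_split() is with PCRE's (*SKIP)(*FAIL) backtracking verbs: match a URL first, discard that match, and only then fall back to the ordinary delimiter class. This is a sketch under the URL definition given above (whitespace-delimited, starting with http:// or https://), not a general URL parser; the variable names are illustrative.

```php
<?php
$input = 'My string. https://my-url.com?q=3 More strings.';

// First alternative: a URL that starts at a whitespace boundary. (*SKIP)(*FAIL)
// makes the engine give up on that text and resume scanning after it, so
// nothing inside the URL can be used as a delimiter.
// Second alternative: a run of the usual delimiter characters.
$pattern = '~(?<!\S)https?://\S+(*SKIP)(*FAIL)|[^a-zA-Z0-9/\-]+~';

$parts = preg_split($pattern, $input, -1, PREG_SPLIT_NO_EMPTY);
// $parts is ['My', 'string', 'https://my-url.com?q=3', 'More', 'strings']
```

Note that punctuation directly attached to a URL (for example a trailing full stop) stays part of the URL under this definition, because only whitespace ends it.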
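I've tried some naive approaches like /[^a-zA-Z0-9\/\-(https?\:\/\/.\s)]+/ but, obviously, that does not give me the results I want: everything between the brackets is treated as a set of individual characters rather than as a sub-pattern. Unfortunately, I don't know how to express this outside a character class.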
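For reference, this is what that attempt actually does. Inside a character class, (, ), ?, : and . are literal characters, so the pattern simply stops treating those characters (plus h, t, p, s and whitespace) as delimiters. The variable names here are only for illustration.

```php
<?php
$input = 'My string. https://my-url.com?q=3 More strings.';

// The bracketed "sub-pattern" only adds more literal characters to the
// negated class, so almost nothing is left to split on.
$naivePattern = '/[^a-zA-Z0-9\/\-(https?\:\/\/.\s)]+/';

$naiveParts = preg_split($naivePattern, $input, -1, PREG_SPLIT_NO_EMPTY);
// The only remaining delimiter in this input is "=":
// $naiveParts is ['My string. https://my-url.com?q', '3 More strings.']
```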
I am using PHP for now, and I'm hoping to just use preg_split(), but I am open to better, more comprehensive approaches.
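As an example of the kind of alternative I would accept, the problem could also be inverted: instead of describing the delimiters, describe the tokens and collect them with preg_match_all(). The following is a rough sketch assuming the same URL definition (whitespace-delimited, starting with http:// or https://); the names are illustrative.

```php
<?php
$input = 'My string. https://my-url.com?q=3 More strings.';

// A token is either a whole URL starting at a whitespace boundary,
// or a run of alphanumerics, "/" and "-". The URL alternative comes
// first so that it wins wherever a URL begins.
$tokenPattern = '~(?<!\S)https?://\S+|[a-zA-Z0-9/\-]+~';

preg_match_all($tokenPattern, $input, $matches);
$tokens = $matches[0];
// $tokens is ['My', 'string', 'https://my-url.com?q=3', 'More', 'strings']
```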