This question is a follow-up to the post: Javascript regex: Find all URLs outside <a> tags - Nested Tags
I discovered that the code:
\b((https?|ftps?):\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)
is far less efficient than running it separately for the http and ftp parts, like this:
\b(https?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)
and
\b(ftps?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)
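To make sure the comparison is fair, the two approaches should return the same URLs. Here is a minimal sketch of how that could be checked in JavaScript. The sample string and the helper names (`findAll`, `findSeparately`) are made up for illustration and are not from my real page:

```javascript
// Hypothetical sample input: one bare http URL, one URL inside an href,
// one URL used as link text, and one bare ftp URL.
const sampleHtml =
  'Visit http://example.com/a and ' +
  '<a href="http://example.com/b">http://example.com/c</a> ' +
  'or ftp://files.example.com/d now';

// The three patterns from the question, with the g flag so matchAll can be used.
const combined = /\b((https?|ftps?):\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;
const httpOnly = /\b(https?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;
const ftpOnly = /\b(ftps?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;

// Return every URL (capture group 1) that the regex finds in the text.
function findAll(regex, text) {
  return Array.from(text.matchAll(regex), (m) => m[1]);
}

// Run the http and ftp patterns separately, then merge the matches
// back into document order using each match's index.
function findSeparately(text) {
  const matches = [...text.matchAll(httpOnly), ...text.matchAll(ftpOnly)];
  matches.sort((a, b) => a.index - b.index);
  return matches.map((m) => m[1]);
}

const combinedUrls = findAll(combined, sampleHtml);
const separateUrls = findSeparately(sampleHtml);

// Both lists should contain only the two URLs outside the <a> tag.
console.log(combinedUrls);
console.log(separateUrls);
```

The merge by `index` is only there so the two result lists can be compared directly; if order does not matter, concatenating the two result arrays would be enough.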
Here are examples at regex101.com:
- 1st method - 6395 steps
- 2nd method - 3393 steps + 863 steps
However, on one of my HTML pages the two approaches compare as 85628 steps vs. 7258 + 795 steps, which is a huge difference.
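The step counts above come from regex101, so they may not reflect what a JavaScript engine actually does. A rough way to compare wall-clock time in the browser or Node could look like the sketch below. The helper `timeRegex`, the generated `page` string, and the run count are assumptions for illustration only, and the timings will vary by engine and machine:

```javascript
// Hypothetical test page: a small HTML fragment repeated many times.
const fragment =
  'Visit http://example.com/a and ' +
  '<a href="http://example.com/b">http://example.com/c</a> ' +
  'or ftp://files.example.com/d now';
const page = Array(200).fill(fragment).join('\n');

// The three patterns from the question, with the g flag for matchAll.
const combinedRe = /\b((https?|ftps?):\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;
const httpRe = /\b(https?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;
const ftpRe = /\b(ftps?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/g;

// Run the regex over the text `runs` times and report the elapsed
// milliseconds together with the number of matches in a single pass.
function timeRegex(regex, text, runs) {
  const start = performance.now();
  let matchCount = 0;
  for (let i = 0; i < runs; i++) {
    matchCount = Array.from(text.matchAll(regex)).length;
  }
  return { ms: performance.now() - start, matchCount };
}

const combinedResult = timeRegex(combinedRe, page, 20);
const httpResult = timeRegex(httpRe, page, 20);
const ftpResult = timeRegex(ftpRe, page, 20);

console.log('combined:', combinedResult.ms, 'ms');
console.log('http + ftp:', httpResult.ms + ftpResult.ms, 'ms');
```

The match counts are returned alongside the timings so it is easy to confirm that the combined pattern and the two separate patterns are doing the same amount of useful work.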
As far as I have seen, using an (x|y) alternation usually reduces the number of steps, but here, for some reason, it does the opposite.
Any help would be appreciated.