Consider this string (notice the horizontal scroll - the string is long):
$content = 'Xxxxxx xx xxxx xxxxxx/xxxx xxxxxxx xx xxxxx xx xxx XXXXXXX/XXXXX XXXX XXXXXXX XXXX XXXXXX XXXXX XXXXXX XXXXXX XXXXXX XXXXX XXXXXX';
I have my own mb_trim()
function to support Unicode strings, but I found it performs really badly on this string in particular.
After debugging, I realized that only the "end-of-string" part performs badly; the "beginning-of-string" part is fine.
So, just doing this (minimal code):
$trim = preg_replace('/\s+$/u', '', $content);
This takes 2 to 3 seconds. Even without the u
modifier, it still takes ~1.6s.
If I replace the spaces in the middle of the string with some letter, the preg_replace
takes essentially 0s.
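For anyone wanting to reproduce the measurements above without my real data: here is a self-contained sketch I'd use, with a synthetic stand-in string (`str_repeat` builds long runs of internal spaces, which seem to be the trigger; the exact word/space counts are arbitrary):

```php
<?php
// Synthetic stand-in for the real content: words separated by long
// runs of internal spaces, ending in a non-whitespace character,
// so the trailing-trim pattern ultimately matches nothing.
$content = str_repeat('word' . str_repeat(' ', 50), 500) . 'end';

$start   = microtime(true);
$trim    = preg_replace('/\s+$/u', '', $content);
$elapsed = microtime(true) - $start;

// The timing (not the result) is what differs between hosts.
printf("len=%d elapsed=%.3fs\n", strlen($content), $elapsed);
```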
Is there a way to fix this performance issue?
It's funny that if I run this:
$trim = preg_replace('/\s{2,}/u', ' ', $content);
$trim = preg_replace('/\s+$/u', '', $trim);
This will run fast.
But I don't understand why spaces in the middle of the string are a problem for an "end-of-string" regex. I'd have thought it would be optimized to look only at the end of the string, not at the middle.
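To illustrate the "only look at the end" idea: it can be written directly, without letting the regex engine scan the whole string for match start positions. This is only a sketch of a hypothetical mb_rtrim (not my actual mb_trim), and it assumes the string is valid UTF-8 with mb_internal_encoding set to UTF-8:

```php
<?php
// Hypothetical regex-free trailing trim: inspect one character at a
// time from the end, so internal whitespace is never even looked at.
function mb_rtrim(string $s): string
{
    // /\s$/u tests a single character; with the u modifier PHP's
    // \s also covers Unicode whitespace such as NBSP.
    while ($s !== '' && preg_match('/\s$/u', mb_substr($s, -1))) {
        $s = mb_substr($s, 0, -1); // drop the last character
    }
    return $s;
}
```

This does more total work per trailing-whitespace character than a single well-behaved regex would, but its cost depends only on how much trailing whitespace there is, not on what's in the middle of the string.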
--
UPDATE - This takes the ~2s on a server running AlmaLinux (even though it has a very good CPU and plenty of RAM) and in a Docker container running CentOS 7 on a Windows machine. But if I run the script on Windows itself, it finishes instantly. It also runs fast on 3v4l.
I tried on another Linux host running PHP 7.4, and it took 5.4s.
I wonder what could be causing the hang on the Linux systems above?
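In case it helps with reproducing: my guess (unverified) is that the hosts differ in their PCRE build or JIT setting, so here is how I'd dump both on each machine:

```php
<?php
// Report the PCRE library version PHP was built against and the
// pcre.jit ini setting; either could differ between the hosts above.
echo 'PCRE ' . PCRE_VERSION . "\n";
var_dump(ini_get('pcre.jit'));
```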