7

I'm stuck on this and cannot find a solution. I would like to convert a string to lower case using preg_replace, but I cannot come up with the right regex. The reason is that the plain strtolower does not support Unicode characters. I know I could use mb_strtolower, but that function seems to be quite slow, and besides, not everyone has the mbstring extension available.

Any clue?

Regards, Radek

EDIT: OK, thanks a lot for your help, guys. I think my approach was not quite right. It would be much better to first check the string as described in "How do I detect non-ASCII characters in a string?" and then use either strtolower or, if the string needs it and the function is available, mb_strtolower.
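The idea in the EDIT could be sketched roughly as follows. This is only a minimal sketch: the function name `to_lower_safe` is hypothetical, the byte-range regex is one common way to detect non-ASCII bytes (not necessarily the one from the linked question), and the input is assumed to be UTF-8.

```php
<?php
// Hypothetical helper: use the fast native function for pure-ASCII input
// and fall back to mbstring only when it is needed and available.
function to_lower_safe(string $s): string
{
    // No byte outside the 7-bit ASCII range: strtolower is sufficient.
    if (!preg_match('/[^\x00-\x7F]/', $s)) {
        return strtolower($s);
    }

    // Non-ASCII bytes present: prefer mbstring if the extension is loaded.
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($s, 'UTF-8'); // assumes UTF-8 input
    }

    // Last resort: only the ASCII letters are reliably lowercased here.
    return strtolower($s);
}
```

With this split, the slower mb_strtolower call is only paid for strings that actually contain non-ASCII bytes.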

Radek Suski

2 Answers

5

A regex cannot change characters by itself; it can only reorder them, add characters, or delete some of them.

There is preg_replace_callback and the /e flag, but they can only call existing functions, and therefore can't do better than strtolower.

If you can't rely on the existence of the mb_strtolower function, you will have to implement it yourself.

Nameless
0

You shouldn't use preg_replace for this, because preg_replace is meant to match a certain pattern and replace it with something else. What you want is to replace every single uppercase character with its lowercase one, so there is no need to match a pattern.

mb_strtolower would be the way to go, and if you don't have the mb_ functions, you'll have to write a function yourself using a lot of str_replace calls...
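A hand-written fallback along those lines might look like the sketch below. It uses strtr with a lookup table instead of many str_replace calls (a swapped-in technique, not what the answer literally proposes). The function name `lower_with_map` is hypothetical, the table is deliberately tiny and would have to be extended for real use, and UTF-8 input is assumed.

```php
<?php
// Hypothetical fallback for systems without mbstring: a byte-safe
// lookup table applied with strtr. Only the listed characters are handled.
function lower_with_map(string $s): string
{
    static $map = null;
    if ($map === null) {
        // ASCII A-Z plus a few illustrative non-ASCII letters (UTF-8 source file).
        $map = array_combine(range('A', 'Z'), range('a', 'z')) + [
            'Ä' => 'ä', 'Ö' => 'ö', 'Ü' => 'ü',
            'É' => 'é', 'Ł' => 'ł', 'Ż' => 'ż',
        ];
    }

    // With an array argument, strtr replaces each key wherever it occurs
    // and never rescans text it has already replaced.
    return strtr($s, $map);
}
```

Because the ASCII letters are part of the table, this sketch does not depend on strtolower or on the current locale at all.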

gitaarik
  • Yes, but mb_strtolower is so damn slow. I ran some tests to compare mb_strtolower with the native strtolower, and the MB version seems to be about 30 times slower than the native one. My biggest problem is that at the time I have to call strtolower I don't know whether the string contains Unicode characters or not. – Radek Suski Mar 30 '12 at 08:21
  • Maybe use strtolower first, then use preg_replace_callback to match all characters that are not plain lowercase characters (/[^a-z]+/) and apply mb_strtolower to those. – gitaarik Mar 30 '12 at 09:07
  • I came here because I was looking for a way to "decapitalize" a string (e.g. turn "`The War of NextGen`" into "`the war of nextGen`"), so lower-casing the entire string will NOT work for me. I tried preg_replace with `"/\b(\w)/"` as the pattern and `strtolower("$1")` as the replacement string, but it doesn't work! Surely there should be a way? – Yuval A. Jun 10 '17 at 12:24
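The two ideas from the comments above can both be sketched with preg_replace_callback. Passing `strtolower("$1")` as the replacement does not work because PHP evaluates it before preg_replace runs, so the replacement is just the literal `$1`; a callback receives the actual match instead. The function names `decapitalize_words` and `lower_hybrid` are hypothetical, and the hybrid version matches runs of non-ASCII bytes rather than `/[^a-z]+/` as in the comment, which is an assumption made here so that mb_strtolower only sees the part that needs it.

```php
<?php
// Hypothetical: lowercase only the first letter of each word.
function decapitalize_words(string $s): string
{
    return preg_replace_callback(
        '/\b(\w)/',                       // first word character after a boundary
        static function (array $m): string {
            return strtolower($m[1]);     // runs once per match, on the real text
        },
        $s
    );
}

// Hypothetical: native strtolower for ASCII, mb_strtolower for the rest.
// Requires mbstring and assumes UTF-8 input; on PHP versions before 8.2
// strtolower is locale-dependent, so a non-"C" locale may need extra care.
function lower_hybrid(string $s): string
{
    return preg_replace_callback(
        '/[^\x00-\x7F]+/',                // runs of non-ASCII bytes
        static function (array $m): string {
            return mb_strtolower($m[0], 'UTF-8');
        },
        strtolower($s)
    );
}

echo decapitalize_words('The War of NextGen'), "\n"; // the war of nextGen
```

Without the `u` modifier, `\w` only matches ASCII word characters, so words starting with a non-ASCII letter are left untouched by the first function.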