6

I'm trying to replace everything that is not alphanumeric, or is a space, with _:

$filename = preg_replace("([^a-zA-Z0-9]|^\s)", "_", $filename);

What am I doing wrong here? It doesn't seem to work. I've tried several regex combinations... (and I'm generally not very bright).

jonnnnnnnnnie
  • I'm slightly confused--do you want to replace spaces with "_" or no? – Mike Park Nov 17 '10 at 23:57
  • Well, for one thing you’ve managed to neglect quite a few characters: `ˋunichars -a '[\p{Alpha}\p{Number}]' '[^a-zA-Z0-9]' | wc -lˋ == 14717`. Not a good place to start. – tchrist Nov 18 '10 at 00:03
  • You should put the `\s` in the square brackets. Otherwise `^\s` matches just whitespaces at the start `^` of the subject. Also use `/../` for enclosing, round brackets are only for capturing. – mario Nov 18 '10 at 00:04
  • Yes, I want to replace spaces and everything that isn't alphanumeric with an underscore _ – jonnnnnnnnnie Nov 18 '10 at 00:15
  • 2
    `[^\pL\pN]` is any single nonalphanumeric character. – tchrist Nov 18 '10 at 00:19
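
To illustrate the comments above, here is a minimal sketch of the Unicode-aware variant. It assumes the input is valid UTF-8 and adds the `u` modifier so that `\pL` (letters) and `\pN` (numbers) apply to whole characters rather than bytes. The helper name `sanitize_unicode` is made up for this example and does not come from the thread.

```php
<?php
// Hypothetical helper, named here for illustration only.
function sanitize_unicode(string $filename): string
{
    // [^\pL\pN] matches any single character that is neither a Unicode
    // letter nor a Unicode number. The /u modifier tells PCRE to treat
    // the pattern and the subject as UTF-8.
    $result = preg_replace('/[^\pL\pN]/u', '_', $filename);

    // preg_replace() returns null on error, e.g. for invalid UTF-8 input.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

// "\u{E9}" is the letter é, written as an escape so the file encoding
// does not matter.
echo sanitize_unicode("r\u{E9}sum\u{E9} 2010.txt"), "\n"; // résumé_2010_txt
```

By contrast, the ASCII-only class `[^a-zA-Z0-9]` without `/u` works byte by byte, so each é (two bytes in UTF-8) would turn into two underscores.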

4 Answers

12

Try this:

$filename = preg_replace("/[^a-zA-Z0-9 ]/", "_", $filename);
cdhowie
7
$filename = preg_replace('~[\W\s]~', '_', $filename);

If I understand your question correctly, you want to replace any whitespace (\s) or non-alphanumeric (\W) character with a '_'. This should do fine. Note that \W is uppercase, as opposed to lowercase \w, which matches alphanumeric characters and the underscore.

lheurt
  • 2
    The meaning of `\W` varies from one flavor to the next, but in PHP it matches any character that's not an ASCII word character, i.e. `[A-Za-z0-9_]`. That includes ASCII whitespace characters (so the `\s` is redundant) and alphanumeric characters from other scripts. Even accented Latin letters are considered non-word characters by `\W`. – Alan Moore Nov 18 '10 at 01:34
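
A quick way to check the point about `\s` being redundant is to run both patterns on the same string. This is only a sketch with a made-up sample filename; it sticks to ASCII input, so it does not exercise the accented-letter case mentioned in the comment.

```php
<?php
// Sample input invented for this example.
$name = "my file_v2 (final).txt";

// Pattern from the answer: \W plus an explicit \s.
$withS = preg_replace('~[\W\s]~', '_', $name);

// \W alone: whitespace is already a non-word character.
$withoutS = preg_replace('~\W~', '_', $name);

var_dump($withS === $withoutS); // bool(true)

// The existing underscore is a word character, so it is left alone;
// the spaces, parentheses and dot are each replaced.
echo $withoutS, "\n"; // my_file_v2__final__txt
```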
2

The solution that works for me is:

$filename = preg_replace('/\W+/', '_', $filename);

The + matches runs of one or more \W characters, so each run is replaced by a single underscore. \W matches any non-word character, which includes spaces and every character that is not a letter, digit, or underscore.
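
As a small illustration of what the + changes (the sample filename is invented for this example), compare the output with and without the quantifier:

```php
<?php
// Sample input invented for this example.
$name = "report - final (v2).pdf";

// Without +, every non-word character becomes its own underscore.
$perChar = preg_replace('/\W/', '_', $name);

// With +, each run of consecutive non-word characters collapses
// into a single underscore.
$perRun = preg_replace('/\W+/', '_', $name);

echo $perChar, "\n"; // report___final__v2__pdf
echo $perRun, "\n";  // report_final_v2_pdf
```

Which one is preferable depends on whether the number of replaced characters should stay visible in the result.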

Marcus Adams
WonderWorker
0

Try

$filename = preg_replace("/[^a-zA-Z0-9]|\s/", "_", $filename);
Tableking
  • Wow. Um, like what is that `\s` doing *outside* the character class? And what about the thousands of alphanumerics you forgot about, eh? – tchrist Nov 18 '10 at 00:17