
I want to collapse any run of TWO OR MORE non-alphanumeric characters into a single "."

An earlier filter already runs before this one, so the only three such characters I need to worry about are "_", "-", and "."

This is what I came up with:

OutNameNoExt := RegExReplace(OutNameNoExt, "[\._-]+", ".")

Sadly, it fails because I have only read the first 3 chapters of my regex book.

I would like to clean up a string such as this:

98788._Interview__with_a_booger..876789_-_.avi

so that it would read

98788.Interview.with.a.booger.876789.avi

I also assume I would need a different operator so that every occurrence is replaced and not just the first one, right?

Ready for the knowledge to flow!

dwilbank

1 Answer

OutNameNoExt := RegExReplace(OutNameNoExt, "[^A-Za-z0-9]{2,}", ".")

[^A-Za-z0-9] matches a single non-alphanumeric character (a leading ^ inside a bracket expression negates the set); {2,} matches two or more repetitions of the preceding expression. The whole pattern is equivalent to [^A-Za-z0-9][^A-Za-z0-9]+.
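As a quick check, here is a minimal sketch that runs the pattern on the sample string from the question. It assumes AutoHotkey v1 syntax, and `Cleaned` is just an illustrative variable name. It also touches on the "all occurrences" worry: RegExReplace's optional Limit parameter defaults to -1, which means every match is replaced, so no extra operator should be needed.

```autohotkey
; AutoHotkey v1 syntax (assumed); sample string taken from the question.
OutNameNoExt := "98788._Interview__with_a_booger..876789_-_.avi"

; Replace every run of two or more non-alphanumeric characters with one dot.
; Limit is left at its default of -1, so all matches are replaced.
Cleaned := RegExReplace(OutNameNoExt, "[^A-Za-z0-9]{2,}", ".")

MsgBox % Cleaned  ; shows 98788.Interview.with_a_booger.876789.avi
```

Note that lone separators such as the underscores in with_a_booger are left alone, which is what "two or more" asks for. If single separators should become dots too, as in the desired output shown in the question, `+` in place of `{2,}` would likely be the quantifier to try.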

João Silva
  • That works, sir, and I see the curly braces operate on occurrences of patterns just as they do on occurrences of characters! Thanks! – dwilbank Aug 25 '12 at 03:49
  • But for the sake of education, could I have done it without using the excluding ^ symbol? Just by listing the three target characters inside the square brackets? – dwilbank Aug 25 '12 at 03:52
  • Ah, yes, you could. I thought you wanted to exclude more characters than just those three. Use `[._-]{2,}` instead, which is almost the expression you already had in the beginning. No need to escape `.` inside a bracket expression. – João Silva Aug 25 '12 at 03:53
  • Well, your first one worked great. This second one... eh... actually added spaces back into the final string. Looks like the pros would use the first way. – dwilbank Aug 25 '12 at 04:02
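
The difference discussed in the comments can be sketched as follows. This assumes AutoHotkey v1 syntax; `Spaced` is a made-up input and all variable names are illustrative. One possible reading of the "added spaces" observation is that the replacement itself cannot insert spaces (it is always "."), but spaces already present in the input are only matched by the negated class, not by the three-character list.

```autohotkey
; AutoHotkey v1 syntax (assumed). Sample is from the question; Spaced is hypothetical.
Sample := "98788._Interview__with_a_booger..876789_-_.avi"
Spaced := "a _ b"

; Runs of two or more of the three listed characters collapse; single ones stay.
ListTwoPlus := RegExReplace(Sample, "[._-]{2,}", ".")   ; 98788.Interview.with_a_booger.876789.avi

; With + instead of {2,}, single separators are converted as well.
ListOnePlus := RegExReplace(Sample, "[._-]+", ".")      ; 98788.Interview.with.a.booger.876789.avi

; Spaces are non-alphanumeric, so the negated class treats " _ " as one run.
NegatedSpaced := RegExReplace(Spaced, "[^A-Za-z0-9]{2,}", ".")   ; a.b

; Spaces are not in [._-], so the lone "_" and the spaces around it are untouched.
ListSpaced := RegExReplace(Spaced, "[._-]{2,}", ".")             ; a _ b

MsgBox % ListTwoPlus "`n" ListOnePlus "`n" NegatedSpaced "`n" ListSpaced
```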