
If escaped characters in a regular expression created in JavaScript with the RegExp object need to be escaped again, how does the following code in sizzle.js work?

identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+"

If \\\\ = \\ and \\w = \w, then how does \0 = \0 when only a single backslash is used?

When run in the Google Chrome console, identifier is "(?:\\.|[\w-]|[^-\xa0])+"

Is this a mistake, or am I not understanding correctly? If this is correct and this is how it is intended to work, what is the purpose of \0?

usr56777
  • Instead of `\0`, use `\\x00` – Wiktor Stribiżew Feb 07 '16 at 20:49
  • `\0` is a simple character that gets parsed as such right away. You only potentially need multiple backslashes for special character classes or other special things like `\w`, because those aren’t just characters. Is this what you’re asking? – Sebastian Simon Feb 07 '16 at 20:51
  • @Xufox Yes and thank you. I am still new to this and I noticed that my post contained way too many backslashes. – usr56777 Feb 07 '16 at 21:06
  • @Wiktor Stribiżew Thank you for your answer. I found a website which gives a little more insight to what exactly the \0 character is and why, like you mention, \\x00 should be used instead. According to the website, \0 is an 'Octal escape sequence' and has been deprecated in ES5. \\x00, or a 'Hexadecimal escape sequence' should be used instead. Here is the link that I found - https://mathiasbynens.be/notes/javascript-escapes – usr56777 Feb 08 '16 at 00:15
  • I only dropped a comment. My answers cannot be that short. Glad it made you perform a deeper research. – Wiktor Stribiżew Feb 08 '16 at 06:43

1 Answer


If your regular expression needs to contain a backslash (for example, because you need something like `\(`, which matches a literal `(`, or `\w`, which matches a letter, digit, or underscore) and you're creating the regular expression from a string literal, then you need to write `\\` in the string literal, which ends up as `\` in the regular expression.
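As a rough illustration of the doubling rule, the sketch below compares patterns built from string literals with the equivalent regex literals. The variable names are made up for this example and do not come from sizzle.js.

```javascript
// "\\w+" in source code is a 3-character string: backslash, w, plus.
const fromString = new RegExp("\\w+");
// The same pattern written as a regex literal needs only one backslash.
const fromLiteral = /\w+/;
console.log(fromString.source === fromLiteral.source); // true

// With a single backslash the string parser drops it: "\w+" is just "w+".
const single = new RegExp("\w+");
console.log(single.source); // "w+"

// Same idea for an escaped parenthesis: "\\(" reaches the regex engine as \(
const parenFromString = new RegExp("\\(");
console.log(parenFromString.test("f(x)")); // true
```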

But in your \0 example, the regular expression doesn't need to contain a backslash. It just needs to contain the character U+0000 (which matches itself). So the string literal can just contain \0, which ends up as the character U+0000.
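A small sketch of this, using the `identifier` string quoted in the question. The `^`/`$` anchors and the test inputs are additions for the demo, not part of sizzle.js; the last two lines compare it with the `\\x00` spelling suggested in the comments.

```javascript
// The string literal as quoted from sizzle.js in the question.
const identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+";

// "\0" is resolved by the string parser into the single character U+0000,
// so no backslash is left in the string at that position.
const nulIndex = identifier.indexOf("\0");
console.log(nulIndex);                        // 15
console.log(identifier.charCodeAt(nulIndex)); // 0

// Anchored copy of the pattern (anchors added here only for the demo).
const anchored = new RegExp("^" + identifier + "$");
console.log(anchored.test("foo-bar"));   // true
console.log(anchored.test("foo bar"));   // false: space lies inside U+0000-U+00A0
console.log(anchored.test("caf\u00e9")); // true: U+00E9 lies above U+00A0

// A raw U+0000 in the pattern and the \x00 regex escape behave the same here.
const viaString = new RegExp("[^\0-\\xa0]");
const viaHex = new RegExp("[^\\x00-\\xa0]");
console.log(viaString.test("\u00e9") === viaHex.test("\u00e9")); // true
```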

ruakh