I'm trying to match the hash fragment of a jQuery Mobile URL like this:

    matches = window.location.hash.match ///
        #                   # we're interested in the hash fragment
        (?:.*/)?            # the path; the full page path might be /dir/dir/map.html, /map.html or map.html
                            # note the path is not captured
        (\w+\.html)$        # the name at the end of the string
        ///

The problem is that the # is stripped from the regex in the compiled JS file because it is treated as the start of a comment. I know I could switch to a normal regex, but is there any way to use a literal # in a heregex?

Alex Korban

2 Answers


Escape it in the usual fashion:

    matches = window.location.hash.match ///
        \#                  # we're interested in the hash fragment
        (?:.*/)?            # the path; the full page path might be /dir/dir/map.html, /map.html or map.html
                            # note the path is not captured
        (\w+\.html)$        # the name at the end of the string
        ///

That will compile to this regex:

    /\#(?:.*\/)?(\w+\.html)$/

And \# is the same as # in a JavaScript regex.
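
As a quick check of that equivalence, here is a minimal sketch in plain JavaScript. It assumes a regex literal without the `u` flag (with `u`, `\#` is rejected as an invalid escape), and the sample hash strings are made up for illustration, modelled on the paths mentioned in the question.

```javascript
// The regex the heregex compiles to, as shown above.
const escaped = /\#(?:.*\/)?(\w+\.html)$/;
// The same pattern with an unescaped #, which is legal in a plain JS regex literal.
const plain = /#(?:.*\/)?(\w+\.html)$/;

// Hypothetical hash fragments (not taken from a real page).
const samples = ['#/dir/dir/map.html', '#/map.html', '#map.html', '#map'];

// For each sample, record the captured page name (or null) from both regexes.
const results = samples.map((hash) => {
  const a = hash.match(escaped);
  const b = hash.match(plain);
  return { hash, escaped: a && a[1], plain: b && b[1] };
});

// Both columns agree: 'map.html' for the first three samples, null for '#map'.
results.forEach((r) => console.log(r.hash, '->', r.escaped, r.plain));
```

Both regexes capture the same page name for every sample, so the backslash only matters to the CoffeeScript heregex, not to the JavaScript that comes out of it.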

You could also use the Unicode escape \u0023:

    matches = window.location.hash.match ///
        \u0023              # we're interested in the hash fragment
        (?:.*/)?            # the path; the full page path might be /dir/dir/map.html, /map.html or map.html
                            # note the path is not captured
        (\w+\.html)$        # the name at the end of the string
        ///

But not many people are going to recognize \u0023 as a hash symbol, so \# is probably the better choice.

mu is too short

Implementer here. Heregex comments are stripped together with whitespace by a single simple regex, `/\s+(?:#.*)?/g`, so a `#` survives whenever it is preceded by a non-whitespace character or sits at the very beginning of the heregex.

    $ coffee -bcs
      /// [#] ///
      /// (?:#) ///
      ///#///

    // Generated by CoffeeScript 1.2.1-pre
    /[#]/;

    /(?:#)/;

    /#/;
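
To see why those forms survive, the stripping step can be approximated in plain JavaScript. This is only a sketch: `HEREGEX_OMIT` and `stripHeregex` are illustrative names, and the real compiler does more than this (for example, it escapes forward slashes), so treat it as a model of the comment and whitespace removal only.

```javascript
// The stripping regex quoted above: a run of whitespace, optionally
// followed by a # comment that extends to the end of the line.
const HEREGEX_OMIT = /\s+(?:#.*)?/g;

// Hypothetical helper that models only the whitespace/comment removal.
function stripHeregex(body) {
  return body.replace(HEREGEX_OMIT, '');
}

// A bare # after whitespace starts a comment, so it disappears.
const bare = stripHeregex('\n    #   # the hash fragment\n    (\\w+\\.html)$   # the name\n');
console.log(bare);      // (\w+\.html)$

// \# is not preceded by whitespace alone: the backslash protects it.
const escaped = stripHeregex('\n    \\#   # the hash fragment\n    (\\w+\\.html)$   # the name\n');
console.log(escaped);   // \#(\w+\.html)$

// The three forms from the session above.
console.log(stripHeregex(' [#] '));    // [#]
console.log(stripHeregex(' (?:#) '));  // (?:#)
console.log(stripHeregex('#'));        // #
```

In every surviving case the `#` either follows a non-whitespace character or is the first character of the body, so `\s+` never gets a chance to match in front of it.
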
matyr
  • Is there an "official" approach that is guaranteed to work for all time? I would (of course) suggest `\#` and a quick note about it in [the documentation](http://coffeescript.org/#regexes). – mu is too short Feb 29 '12 at 17:40
  • As explained, any sequence that doesn't match `/\s+#/` works. `\#` does seem most appropriate for the OP's case (I didn't list it because you already did), but writing e.g. `[?\#]` is redundant since `[?#]` is sufficient. – matyr Feb 29 '12 at 19:54
  • There is a difference between something that just happens to work and something that is documented and guaranteed to work now and in the future. One of the nicest things about complete and thorough specifications is future proofing. – mu is too short Feb 29 '12 at 23:57
  • At the moment CoffeeScript has no specifications or future proofing. It's still an ad-hoc language. – matyr Mar 01 '12 at 02:52