
There is something mysterious to me about the escape status of a backslash within a single-quoted string literal passed as an argument to String#tr. Can you explain the contrast between the three examples below? I particularly do not understand the second one. To avoid complication, I am using 'd' here, whose meaning does not change when escaped in double quotes ("\d" = "d").

'\\'.tr('\\', 'x')      #=> "x"
'\\'.tr('\\d', 'x')     #=> "\\"
'\\'.tr('\\\d', 'x')    #=> "x"
sawa
  • Interesting. And why does `'\\rs'.tr('\\rs','x')` return `\\xx`?! I would have expected a return value of `xxx`. And in the [documentation example](http://ruby-doc.org/core/classes/String.html#M001199) `"hello".tr('aeiou', '*')`, `*` is shorter than `aeiou`, so what does *If to_str is shorter than from_str, it is padded with its last character* mean? Is it a bug? Is it a fly? Is it Superman? :D – Zabba May 08 '11 at 09:26
  • @Zabba It means that `"hello".tr('aeiou', '*#')` is the same as `"hello".tr('aeiou', '*####')`. :D – sawa May 08 '11 at 09:27
  • What output would you expect (don't execute it just yet, please!), from `'\\d'.tr('\\d','xx)` ? I don't expect what it *does* give.. – Zabba May 08 '11 at 09:30
  • @Zabba What I would have expected is 'xx'. – sawa May 08 '11 at 09:36
  • Zabba, obviously a syntax error. ;) – nitro2k01 May 08 '11 at 09:40
  • @nitro2k01, yes, that was a typo.. :) @sawa, yes I expected `xx` too, but the actual result is `\x` (at least on 1.8.7). – Zabba May 08 '11 at 10:14
  • If I wanted to understand `tr`'s behaviour, as well as listing the three examples above, I would have seen how it handled normal characters when there's differences in string length. – Andrew Grimm May 08 '11 at 23:33
  • Watch out does not work with UTF8 strings. – lzap Jul 20 '12 at 22:16

1 Answer


Escaping in tr

The first argument of tr works much like a bracketed character class in regular expressions. You can put ^ at the start of the argument to negate the match (replace every character that is not listed) and write e.g. a-f to match a range of characters. Because ^ and - have special meaning, tr also does its own backslash escaping, so you can write \^ and \- to match them as literal characters.

print 'abcdef'.tr('b-e', 'x')  # axxxxf
print 'abcdef'.tr('b\-e', 'x') # axcdxf
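
To make the range, negation and escaping rules concrete, here is a minimal sketch. The variable names are only illustrative, and the inputs for the escaped cases are borrowed from the examples in the String#tr documentation.

```ruby
# Range: every character from b through e is replaced.
range = 'abcdef'.tr('b-e', 'x')

# Negation: a leading ^ replaces every character NOT in the set.
negated = 'abcdef'.tr('^b-e', 'x')

# Escaped caret: \^ is a literal ^, so ^ and the vowels are replaced.
escaped_caret = 'hello^world'.tr('\^aeiou', '*')

# Escaped dash: \- is a literal -, so a, -, e and o are replaced.
escaped_dash = 'hello-world'.tr('a\-eo', '*')

puts range          # axxxxf
puts negated        # xbcdex
puts escaped_caret  # h*ll**w*rld
puts escaped_dash   # h*ll**w*rld
```

Note that the arguments are single-quoted, so the backslash reaches tr untouched and tr itself does the unescaping.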

Escaping in Ruby single quote strings

Furthermore, in single-quoted strings Ruby keeps a backslash as a literal character unless it escapes another backslash or a single quote.

# Single quotes
print '\\'    # \
print '\d'    # \d
print '\\d'   # \d
print '\\\d'  # \\d

# Double quotes
print "\\"    # \
print "\d"    # d
print "\\d"   # \d
print "\\\d"  # \d

The examples revisited

With all that in mind, let's look at the examples again.

'\\'.tr('\\', 'x')      #=> "x"

The string defined as '\\' becomes the literal string \ because the first backslash escapes the second. No surprises there.

'\\'.tr('\\d', 'x')     #=> "\\"

The string defined as '\\d' becomes the literal string \d. The tr engine, in turn, uses the backslash in that string to escape the d. Result: tr replaces instances of d with x. The input contains only a backslash and no d, so it comes back unchanged.

'\\'.tr('\\\d', 'x')    #=> "x"

The string defined as '\\\d' becomes the literal \\d. First \\ becomes \. Then \d stays \d, i.e. the backslash is preserved. (This particular behavior differs from double-quoted strings, where the backslash would be eaten alive, leaving only a lonesome d.)

The literal string \\d then makes tr replace all characters that are either a backslash or a d with the replacement string.
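
The three cases can be checked side by side with a small sketch. The variable names are made up for illustration. The extra input '\\d' (a backslash followed by d) is not in the question's examples, but it makes the difference between the second and third argument visible.

```ruby
# The three first arguments from the question, as written in single quotes.
from_args = ['\\', '\\d', '\\\d']

# Apply each one to a string holding a single backslash.
outcomes = from_args.map { |from| '\\'.tr(from, 'x') }

from_args.zip(outcomes).each do |from, result|
  puts "tr receives #{from.chars.inspect} and returns #{result.inspect}"
end

# An input that contains both a backslash and a d:
only_d = '\\d'.tr('\\d', 'x')   # tr sees \d, an escaped d: only d is replaced
both   = '\\d'.tr('\\\d', 'x')  # tr sees \\d: backslash and d are both replaced

puts only_d  # \x
puts both    # xx
```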

nitro2k01