4

I've worked through Why's (Poignant) Guide to Ruby and a couple of other guides, and then the Ruby style guide, to see how Rubyists think. But this is the first time I've seen trailing underscores. What are they? Are they useful, and if so, when do we use them, and how do we use them with splat operators?

(Ruby style guide link is anchored to the actual example)

James Pond

2 Answers

8

_ is a legal variable name, and something like *_ works the same way as *x. The term "trailing underscore variable" refers to an underscore used as the last variable name in a comma-separated series of variables on the left side of an assignment statement, e.g.:

a, b, _ = [1, 2, 3, 4]

The splat operator has two uses:

  1. Explode an array into its individual items.
  2. Gather items into an array.

Which of those happens depends on the context that the splat operator is used in.
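As a rough illustration of the two uses, here is a minimal sketch. The method names gather_rest and explode_into are made up for this example and do not come from the style guide.

```ruby
# Hypothetical helper names, for illustration only.

# Gathering: on the left side of an assignment, the splat collects
# the leftover items into an array.
def gather_rest(list)
  first, *rest = list
  [first, rest]
end

# Exploding: inside an array literal (or an argument list), the splat
# expands an array into its individual items.
def explode_into(list)
  [0, *list]
end

p gather_rest([1, 2, 3])   # prints [1, [2, 3]]
p explode_into([1, 2, 3])  # prints [0, 1, 2, 3]
```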

Here are the examples that the Ruby style guide says are bad:

a, b, _ = *foo

The trailing underscore variable in that example is unnecessary because you can assign the first two elements of foo to the variables a and b by writing:

a, b = *foo

The underscore variable is used to say, "I don't care about this value," and therefore it isn't necessary in that example if all you want to do is assign to a and b.

The example might also be considered bad style because the * operator isn't needed either (credit: Cary Swoveland):

a, b = [1, 2, 3]
p a, b

--output:--
1
2

The * can be used on the right hand side to good effect like this:

x, y, z = 10, [20, 30]
p x, y, z

--output:--
10
[20, 30]
nil

x, y, z = 10, *[20, 30]
p x, y, z

--output:--
10
20
30

So, just keep in mind that in the rest of the examples from the style guide the * is superfluous on the right hand side.

The next bad example is:

a, _, _ = *foo

Here is a more concrete example:

a, _, _ = *[1, 2, 3, 4]
p a, _

puts "-" * 10

a, _ = *[1, 2, 3, 4]
p a, _

--output:--
1
3
----------
1
2

The following shows the way the assignment works in the first section of the example:

 a, _, _
 ^  ^  ^
 |  |  |
[1, 2, 3, 4]

In any case, if you get rid of the second underscore variable on the left, then a will be assigned the same thing. What about getting rid of both underscore variables?

a = *[1, 2, 3, 4]
p a

--output:--
[1, 2, 3, 4]

Nope. So the first underscore variable on the left appears to be necessary. However, there is another syntax to get the same result without using a trailing underscore variable:

a, = *[1, 2, 3, 4]
p a

--output:--
1

Therefore, the third bad example:

a, *_ = *foo

can also be written as:

a, = *foo 

and thereby avoid a trailing underscore variable.
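To compare the spellings side by side, here is a small sketch. The three method names are hypothetical and exist only for this illustration, and the superfluous * on the right-hand side is left out.

```ruby
# Hypothetical helper names, for illustration only.
# Each method takes the first element of a list and ignores the rest.

def first_with_underscore(list)
  a, _ = list   # trailing underscore variable
  a
end

def first_with_splat_underscore(list)
  a, *_ = list  # trailing splat underscore variable
  a
end

def first_with_comma(list)
  a, = list     # trailing comma only
  a
end

p first_with_underscore([1, 2, 3, 4])        # prints 1
p first_with_splat_underscore([1, 2, 3, 4])  # prints 1
p first_with_comma([1, 2, 3, 4])             # prints 1
```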

Finally, the style guide offers this cryptic advice:

Trailing underscore variables are necessary when there is a splat variable defined on the left side of the assignment, and the splat variable is not an underscore.

I think that may be referring to something like this:

*a = *[1, 2, 3, 4]
p a

--output:--
[1, 2, 3, 4]

If you want a to be assigned the first three elements of the array, then you have to write:

*a, _ = *[1, 2, 3, 4]
p a

--output:--
[1, 2, 3]

For whatever reason, the parser cannot handle:

*a, = *[1, 2, 3, 4]

--output:--
*a, = *[1, 2, 3, 4]
     ^
1.rb:6: syntax error, unexpected '\n', expecting :: or '[' or '.'

Here is one of the good examples:

*a, b, _ = *foo

The trailing underscore variable is necessary there, IF you want to assign the second-to-last element of foo to b.
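A concrete run may make this clearer. The sketch below uses made-up method names and a five-element array, and drops the superfluous * on the right-hand side.

```ruby
# Hypothetical helper names, for illustration only.

# With the trailing underscore: the last element is discarded,
# so b receives the second-to-last element.
def with_trailing_underscore(foo)
  *a, b, _ = foo
  [a, b]
end

# Without it: b receives the last element instead.
def without_trailing_underscore(foo)
  *a, b = foo
  [a, b]
end

p with_trailing_underscore([1, 2, 3, 4, 5])     # prints [[1, 2, 3], 4]
p without_trailing_underscore([1, 2, 3, 4, 5])  # prints [[1, 2, 3, 4], 5]
```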

The following good examples are a little perplexing:

a, _b = *[1, 2, 3, 4]
a, _b, = *[1, 2, 3, 4]

Let's try them out:

a, _b = *[1, 2, 3, 4]
p a, _b

puts "-" * 10

a, _b, = *[1, 2, 3, 4]
p a, _b

--output:--
1
2
----------
1
2

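In Ruby, a variable named _b behaves the same at runtime as one named _ or b. The interpreter makes only small allowances for names that start with an underscore: running with warnings enabled (ruby -w) does not report them as assigned but unused, and _ may be repeated within a single parameter list. In functional languages, like Erlang, the variables _ and _B and B have different effects--but in Ruby the assignment itself works identically for all three.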
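The "I don't care" hint also shows up often in block parameters. Here is a minimal sketch, with a made-up method name, that relies on Ruby allowing _ to appear more than once in a single parameter list:

```ruby
# Hypothetical helper name, for illustration only.
# Each row is destructured into three block parameters; the first two
# are named _ to signal that they are ignored.
def third_items(rows)
  rows.map { |_, _, third| third }
end

p third_items([[1, 2, 3], [4, 5, 6]])  # prints [3, 6]
```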

By the way, I wouldn't spend five minutes learning that style--it's too esoteric.

7stud
  • It can't handle `*a, = ...` for the same reason it can't handle `*a, *b = ...`: How does it decide how many items to put in `a` and how many in `b`? Or in the case of `*a, = ...`, how does it decide how many items to put in `a` and how many to ignore? It can't, so the parser doesn't even allow it. – Jordan Running Oct 08 '15 at 05:40
  • @Jordan: Yeah, but look at the error message. "Or in the case of `*a, = ...`, how does it decide how many items to put in `a` and how many to ignore?" The parser could greedily assign to `*a,`, with the result being that `a` is equal to the whole array except the last element. – 7stud Oct 08 '15 at 05:51
  • So there is no actual difference between `a, b, = *bar`, `a, _b, = *bar` and `a, _b = *bar`, thus no actual difference between `_b` and `b` or even `a` and `_a`, right? (last example) – James Pond Oct 08 '15 at 05:54
  • 1
    @JamesPond, Right. No difference between `x` and `_x` and `_`. It's a little confusing when you see `*_`, but that is equivalent to `*x`. The underscore is merely a hint to the reader, taken from functional languages, that the variable is not going to be used. – 7stud Oct 08 '15 at 05:55
  • @7stud I'm targeting Haskell right now, driven by its successes at the [ICFP](https://en.wikipedia.org/wiki/ICFP_Programming_Contest) over the last couple of years. What do you think? I don't get the point of supporting leading underscores then, but I guess some people come from a functional background and it's easier for them. – James Pond Oct 08 '15 at 05:58
  • @JamesPond, I studied Haskell for a little while, but it gets very theoretical fast, as in you need to know number theory to understand what the hell Haskell programmers are talking about. – 7stud Oct 08 '15 at 06:00
  • 1
    @7stud The parser *could* do that, but it would be arbitrary. If it were greedy, why would it capture everything *except* the last element? Why wouldn't it capture every element? The trailing comma is essentially greedy itself, hence the analogy to `*a, *b = ...` It just means "throw everything else away." To change its meaning to "throw the last element away" when it follows a splat variable wouldn't make any sense. – Jordan Running Oct 08 '15 at 06:02
  • @Jordan, *why would it capture everything except the last element?* Because of the comma. *The parser could do that, but it would be arbitrary.* I don't agree. I think the way the parser works in all these cases is completely arbitrary to begin with. The parser could treat either the splat as greedy or the comma as greedy depending on which it sees first. – 7stud Oct 08 '15 at 06:02
  • Forgive me for repeating myself, but: "[The trailing comma] just means 'throw everything else away.' To change its meaning to 'throw the last element away' when it follows a splat variable wouldn't make any sense." There's no scenario in which the trailing comma indicates a single element. Making a special case where it means "throw the last element away" doesn't make any more sense than making it mean "throw everything except the first element away." It's equally arbitrary and equally inconsistent with the existing behavior. – Jordan Running Oct 08 '15 at 06:10
  • @Jordan: *The trailing comma is essentially greedy itself... It just means "throw everything else away.*--Really?? So in the line `x, y, z = [1, 2, 3]` the comma after `x` means throw everything away? Ahh, so the comma can mean different things depending on the context, right? – 7stud Oct 08 '15 at 06:17
  • "Trailing" as in "at the end of the list." There is no trailing comma in that example. If there's some term other than "trailing" you'd prefer to use for "after the last variable name and before the `=`" I'll be happy to use it instead. – Jordan Running Oct 08 '15 at 06:21
  • @Jordan, I understand that you think the way the parser is currently programmed is the only way that things could be--as if the parser is obeying some fundamental law of physics. However, in all the examples you mentioned, the parser could do other things using other algorithms that make perfect sense. For instance, `*a, *b = *foo`, could divide the array in half and assign portions to both `a` and `b`. Of course, then you'll claim that that isn't possible because the array could have an uneven number of elements. So, I suggest you just downvote my post, and leave me alone. – 7stud Oct 08 '15 at 06:32
  • I upvoted your answer an hour ago. It's a good answer. I commented with the intention of helping you (and OP) understand why the language's designers may have made the choice they did (it clearly having been a design choice and not an issue of parser constraints), since it seemed like you'd be interested in some insight on that topic, but it appears that you're more invested in being right. Much more so than I am. And so: You're right. I hope you keep on being right for as long as it continues to make you happy. Good evening. – Jordan Running Oct 08 '15 at 06:41
  • 2
    Wow! That's quite a tour de force. I'm guessing that you didn't expect your answer to be nearly so long and far-reaching when you started writing it. You noted that `a, = *[1, 2, 3, 4]` results in `a #=> 1`. I may have missed it in your answer, but the same result is obtained when you remove the splat. It's common to see `a,_ = [1,2,3,4]`, but in fact only the comma is needed. – Cary Swoveland Oct 08 '15 at 07:07
  • @CarySwoveland, *I may have missed it in your answer, but the same result is obtained when you remove the splat.* Good catch! I blindly followed the examples in the style guide. I did present your same point in a post 6 years ago: https://www.ruby-forum.com/topic/191999#new. Does that count? By the way, the David Black in that thread is the same David Black who wrote 'The Well Grounded Rubyist'. – 7stud Oct 08 '15 at 22:33
1

Prepending a variable name with an underscore _ does not have any syntactic meaning; it is just a convention (i.e., it is not a strict rule, and it can always be broken without breaking the code).

It is not used so often in Ruby, but it is Ruby's counterpart to prepending a function name with an at sign @ in LaTeX programming. To my understanding, it expresses a notion such as "inferior", "sub-routine", "core", or "element". It can be thought of as the opposite of "pluralization".

Example 1. It is common to pluralize the variable name of an array whose elements are to be referred to with a singular name:

items = [1, 2, 3]
items.each{|item| ...}

It may be a bit strange, but this practice is occasionally seen even when the elements are referred to with a non-word variable name:

is = [1, 2, 3]
is.each{|i| ...}

Now, suppose the array was named in advance with a singular name, and then you wanted to distinguish the element names from it. In such a case, you can prepend an underscore to the element names:

item = [1, 2, 3]
item.each{|_item| ...}

x = [1, 2, 3]
x.each{|_x| ...}

Example 2. Suppose you have a method that calls another, and the latter is used only as a sub-routine and does most of what the former does. In such a case, you can prepend an underscore to the name of the latter:

def foo(a, b)
  # ... do something just a bit here
  _foo(a, b)
end

def _foo(a, b)
  # ... do still a bit here
  __foo(a, b)
end

def __foo(a, b)
  # ... most of the job is done here
end

Example 3. This is a bit different in that it is built into Ruby itself, uses two underscores (instead of one), and surrounds the name (instead of being prepended to it). There is a method send, and a second method __send__ that does the same thing. This is so that, even when send is redefined to something else, __send__ can still be used.
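Here is a minimal sketch of that situation. The class Courier and its methods are made up for this illustration: the class defines its own send with an unrelated meaning, and __send__ still performs the usual dynamic dispatch.

```ruby
# Hypothetical class, for illustration only.
class Courier
  # Shadows Object#send with an unrelated, domain-specific method.
  def send(parcel)
    "shipping #{parcel}"
  end

  def greet(name)
    "hello, #{name}"
  end
end

courier = Courier.new
p courier.send("box")               # prints "shipping box"
p courier.__send__(:greet, "Ruby")  # prints "hello, Ruby"
```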

sawa
  • 1
    Thank you Sir, the previous answer lacked only the leading underscore explanation, for which I wasn't asking, but received from you. You are a true scholar, and a gentleman. – James Pond Oct 08 '15 at 08:52
  • By the way, do the _foo(a, b) and __foo(a, b) calls in the second example do anything other than inform "I'm going to use this method in the bigger method _this_method_name"? Or did I miss the point there? – James Pond Oct 08 '15 at 08:56
  • @JamesPond Informing that is the only intention. It implies that it is not used by itself. – sawa Oct 08 '15 at 09:00
  • Ah. Correct me if I am wrong. It's named the same, with the leading underscore, to show that this can be invoked only in the method without the underscore. We cannot call _foo outside of foo (in the main part of the program), the same way we cannot invoke __foo outside of _foo (can we call it in foo, though?) – James Pond Oct 08 '15 at 09:07
  • Almost correct. It is to show that the method **is (usually/intended to be)** invoked only in the method without the underscore. As I wrote, it is only a convention, and there is nothing syntactically wrong with directly calling a method with a prepended underscore. And yes, what `_foo` is to `foo` is what `__foo` is to `_foo`. – sawa Oct 08 '15 at 09:09
  • Thus, it is against convention to call __foo in foo? – James Pond Oct 08 '15 at 09:13
  • 1
    I don't think it is that strict. If `__foo` is to be used both in `foo` and `_foo`, then it is okay. People use it under rather a loose interpretation, so you don't need to take this too seriously. – sawa Oct 08 '15 at 09:14