
I am curious about having a trailing comma in a block in Ruby.

For example:

[[1, 2], [3, 4]].collect { |x, | x }
# returns [1, 3]

It's as if there is an optional argument after the first argument.

However:

(proc { |x, | x }).arity
# returns 1

If the arity is 1, then the array should not be decomposed into x.

Checking (proc { |x, | x }).parameters gives no hint that there is any "secret" second parameter.

Are there methods of introspection to tell that proc { |x, | } is different from proc { |x| }?

I understand the basics of decomposition, and I can see that a trailing comma effectively creates a "secret" parameter in the sense that |x, | and |x, _| work the same.

What surprises me is that there is no introspective way to detect the trailing comma, short of getting into the AST.

Mitch VanDuyn
  • I'm not sure the decomposition part is relevant to `arity`. There is no secret second parameter, there's just an operation that moves the first element into `x`, discarding the rest. Rust has a similar operation. – tadman Feb 28 '23 at 19:36
  • Seems like Ruby is treating this as an edge case of [array decomposition](https://ruby-doc.org/3.2.1/syntax/methods_rdoc.html#label-Array+Decomposition). `proc { |(x,y,z)| puts x, y, z }` also has an arity of 1. – Schwern Feb 28 '23 at 19:39
  • Does `RubyVM::AbstractSyntaxTree` (or a Ruby parser) count? – cremno Feb 28 '23 at 19:41
  • This thread has some valuable related information: https://stackoverflow.com/questions/41865139/ruby-block-taking-array-or-multiple-parameters –  Feb 28 '23 at 20:07
  • I'm curious. Is this just for your own understanding of the inner workings, or do you have a specific use case in mind? –  Mar 01 '23 at 18:58
  • @MichaelB - a bit of both. I have some debug tools that use the parameters and arity methods and I was not aware of this case, which they cannot detect. Except by digging in the AST. Also am involved in the Opal project and it got me curious how MRI implements this. – Mitch VanDuyn Mar 01 '23 at 19:01
  • I don't have anything to add with regard to the inner workings, but for debugging; do you have access to the source code text? If so, could you possibly just create a modified debugging tool that uses the source code in string form? Just spitballing. –  Mar 01 '23 at 19:06
  • @MichaelB - yes, getting the AST will work. So at this point it's mostly curiosity. It's not like Ruby to have this odd, basically undocumented feature that doesn't seem to work very well with the rest of the system. – Mitch VanDuyn Mar 01 '23 at 19:18

1 Answer


In most situations, a trailing comma in a parameter list has no effect. However, in the exact situation you just listed, it matters.

Ruby has two types of procedure objects: strict and non-strict. Blocks created with { ... } or with the proc function are non-strict, while -> lambdas are strict. Likewise, def methods are always strict.

A strict function must be called with the correct number of arguments. A non-strict function will accept any number of arguments and will pad or truncate the argument list as needed. Non-strict functions also have special behavior with regard to array destructuring. If a non-strict function is (a) declared to take more than one argument, and (b) given exactly one argument which is an array, then the array is destructured and treated as multiple arguments. This is why most of Enumerable just "works" on Hash, even though blocks passed to Hash methods seem to take two arguments. When you write

my_hash.map { |k, v| ... }

the map from Enumerable passes in a single array [k, v], but the block is smart enough to destructure it. However, if the block is declared to take exactly one argument, it won't destructure. The trailing comma instructs Ruby to destructure the argument, even though there's only one declared parameter.
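The strict versus non-strict behavior described above can be checked directly. This is a minimal sketch rather than anything from the original answer; the names loose and strict are made up for illustration.

```ruby
# Non-strict: a plain proc pads or truncates its argument list.
loose = proc { |a, b| [a, b] }
# Strict: a lambda insists on exactly the declared number of arguments.
strict = ->(a, b) { [a, b] }

loose_results = [
  loose.call(1),        # missing argument is padded with nil
  loose.call(1, 2, 3),  # extra argument is dropped
  loose.call([1, 2])    # a single array argument is destructured
]

# The lambda does not destructure: one array is still one argument.
strict_error =
  begin
    strict.call([1, 2])
    nil
  rescue ArgumentError => e
    e.class
  end

p loose_results  # prints [[1, nil], [1, 2], [1, 2]]
p strict_error   # prints ArgumentError
```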

[[1, 2], [3, 4]].collect { |x| x }

There is one declared parameter here, so no destructuring occurs. This returns [[1, 2], [3, 4]].

[[1, 2], [3, 4]].collect { |x,| x }

The trailing comma says "destructure anyway". So the block will be called with 1 and then 3. The "extraneous" arguments 2 and 4 are dropped, since blocks are non-strict. The result of this block of code is [1, 3].
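The two blocks above can be compared side by side. The sketch below (variable names are illustrative only) suggests that arity and parameters report the same thing for both, even though the blocks behave differently on the same input, which matches what the question observed.

```ruby
plain    = proc { |x| x }   # one parameter, no destructuring
trailing = proc { |x,| x }  # trailing comma: destructure anyway

pairs = [[1, 2], [3, 4]]

# Reflection reports the same shape for both procs.
same_arity      = plain.arity == trailing.arity
same_parameters = plain.parameters == trailing.parameters

# The behavior on identical input still differs.
plain_result    = pairs.map(&plain)
trailing_result = pairs.map(&trailing)

p same_arity, same_parameters
p plain_result     # prints [[1, 2], [3, 4]]
p trailing_result  # prints [1, 3]
```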

Silvio Mayolo
  • "*This is why most of Enumerable just "works" on Hash, despite Hash functions seeming to take two arguments.*" +1. procs are magic to pull double-duty as syntax. – Schwern Feb 28 '23 at 19:42
  • Interestingly enough explicit decomposition e.g. `{|(k,v)| }` works the same however the parser does not like `{|(k,)| }` and raises a `SyntaxError`. So it is unclear if this is decomposition or Multiple Assignment as described in the [documentation](https://ruby-doc.org/core-3.0.0/doc/syntax/assignment_rdoc.html#label-Multiple+Assignment) e.g. `a = 1,2` works fine with Array literal, `a,=[1,2]` works fine with Multiple Assignment, `(a,) = [1,2]` works fine with decomposition but `proc {|(a,)| }` does not. Maybe someone (unlikely to be me) can find the explicit implementation in the parser. – engineersmnky Feb 28 '23 at 22:01
  • Thanks for the answers, all of which I basically understand. My real question, which perhaps I was not clear on, is HOW does the implementation KNOW that there is a trailing comma? In other words, is there any kind of introspection (like .parameters or .arity) that can be used to determine that the trailing comma exists in the argument list? – Mitch VanDuyn Feb 28 '23 at 22:16
  • @MitchVanDuyn not on any Ruby version that I have available to me. All versions return parameters `[[:opt, :x]]` and arity `1`. – engineersmnky Feb 28 '23 at 23:34
  • @MitchVanDuyn: There is but both are implementation-specific and unstable (see [`RubyVM`](https://docs.ruby-lang.org/en/3.2/RubyVM.html)). Is this acceptable? – cremno Mar 01 '23 at 16:52
  • @cremno - I checked RubyVM and didn't see anything close. Can you give an example of how you would get this info using that API? If so, put it in an answer and I will definitely accept it. – Mitch VanDuyn Mar 01 '23 at 17:50