What does this syntax using "on:" mean in Ruby on Rails?
Since you are specifically asking about syntax, not semantics, I will answer your question about syntax.
This is really hard to do a google search about because I have no idea if it's a Ruby thing or a Rails thing
That is easy to answer: Ruby does not allow user code to modify its syntax, so it cannot possibly be a Rails thing. Anything related to syntax must be a "Ruby thing", since neither Rails nor any other user code can change the syntax of Ruby.
What you are asking about is just basic Ruby syntax, nothing more, and nothing to do with Rails.
I am not sure what after_commit on: %i[create update] do
is supposed to mean.
What you see here is called a message send. (In other programming languages like Java or C#, and in some parts of the Ruby documentation, it might be called a method call, and in programming languages like C++, it might be called a virtual member function call.) More precisely, it is a message send with an implicit receiver.
A message is always sent to a specific receiver (just like a message in the real world). The general syntax of a message send looks like this:
foo.bar(baz, quux: 23) {|garple| glorp(garple) }
Here,
- foo
is the receiver, i.e. the object that receives the message. Note that foo
can of course be any arbitrary Ruby expression, e.g. in (2 + 3).to_s
, the message to_s
is sent to the result of evaluating the expression 2 + 3
, which in turn is actually just the message +
sent to the result of evaluating the expression 2
, passing the result of evaluating the expression 3
as the single positional argument.
- bar
is the message selector, or simply message. It tells the receiver object what to do.
- The parentheses after the message selector contain the argument list. Here, we have one positional argument, which is the expression
baz
(which could be either a local variable or another message send, more on that later), and one keyword argument which is the keyword quux
with the value 23
. (Again, the value can be any arbitrary Ruby expression.) Note: it is actually not necessarily true that this is a keyword argument. It could also be a Hash
. More on that later.
- After the argument list comes the literal block argument. Every message send in Ruby can have a literal block argument … it is up to the method that gets invoked to ignore it, use it, or do whatever it wants with it.
- A block is a lightweight piece of executable code, and so, just like methods, it has a parameter list and a body. The parameter list is delimited by
|
pipe symbols – in this case, there is only one positional parameter named garple
, but it can have all the same kinds of parameters methods can have, plus block-local variables. And the body, of course, can contain arbitrary Ruby expressions.
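As a minimal sketch of the general form above, the following defines hypothetical stand-ins for the placeholder names (the class Foo, the method bodies, and the value 19 are assumptions made up for illustration) so that the example line can actually be run:

```ruby
# Hypothetical receiver class; the names mirror the placeholders used above.
class Foo
  # One positional parameter, one keyword parameter, and a block parameter.
  def bar(baz, quux:, &block)
    block.call(baz + quux)
  end
end

# Hypothetical helper that the block body sends as a receiver-less message.
def glorp(value)
  "glorp got #{value}"
end

foo = Foo.new
baz = 19

# Receiver, message selector, argument list, and literal block argument.
result = foo.bar(baz, quux: 23) { |garple| glorp(garple) }
puts result
```

Here the block receives the sum 42 as garple and hands it to glorp, so the script prints "glorp got 42".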
Now, the important thing here is that a lot of those elements are optional:
- You can leave out the parentheses:
foo.bar(baz, quux: 23)
is the same as foo.bar baz, quux: 23
, which also implies that foo.bar()
is the same as foo.bar
.
- You can leave out the explicit receiver, in which case the implicit receiver is
self
, i.e. self.foo(bar, baz: 23)
is the same as foo(bar, baz: 23)
, which is of course then the same as foo bar, baz: 23
.
- If you put the two together, that means that e.g.
self.foo()
is the same as foo
, which I was alluding to earlier: if you just write foo
on its own without context, you don't actually know whether it is a local variable or a message send. Only if you see either a receiver or an argument (or both), can you be sure that it is a message send, and only if you see an assignment in the same scope can you be sure that it is a variable. If you see neither of those things it could be either.
- You can leave out the block parameter list if you're not using it, and you can leave out the block altogether as well.
- If the last argument of the argument list (before the block, obviously, which is passed after the closing parenthesis of the argument list) is a
Hash
literal, you can leave off the curly braces, i.e. foo.bar(baz, { quux: 23, garple: 42 })
can also be written as foo.bar(baz, quux: 23, garple: 42)
which can also be written as foo.bar baz, quux: 23, garple: 42
. That's what I alluded to earlier: the syntax for passing a new-style Hash
literal and the syntax for passing a keyword argument are actually the same. You have to look at the parameter list of the method definition to figure out which of the two it is, and there are some corner cases whose exact interpretation has changed a couple of times between Ruby 2.0, when keyword parameters and arguments were first introduced, and Ruby 3.2.
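The optional elements listed above can be checked with a small sketch. The class Foo and its method bar are hypothetical and only return what they received, so the different spellings of the same message send can be compared:

```ruby
# Hypothetical class; bar returns its arguments so call styles can be compared.
class Foo
  def bar(baz = nil, options = {})
    [baz, options]
  end

  # Inside an instance method, self is the implicit receiver.
  def call_all_ways
    explicit  = self.bar(1, quux: 23)
    implicit  = bar(1, quux: 23)
    no_parens = bar 1, quux: 23
    [explicit, implicit, no_parens]
  end
end

foo = Foo.new
baz = 1

with_parens    = foo.bar(baz, quux: 23)
without_parens = foo.bar baz, quux: 23
with_braces    = foo.bar(baz, { quux: 23 })

puts with_parens == without_parens       # parentheses are optional
puts with_parens == with_braces          # so are the braces of a trailing Hash
puts foo.bar() == foo.bar                # an empty argument list can be dropped
puts foo.call_all_ways.uniq.length == 1  # explicit and implicit self agree
```

Each line prints true. Note that bar is deliberately defined with positional parameters only, so the braceless form arrives as a Hash; a method with keyword parameters would treat it differently.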
So let's dissect the syntax of what you are seeing here:
after_commit on: %i[create update] do
__elasticsearch__.index_document
end
The first layer is
after_commit … some stuff … do
… some stuff …
end
We know that this is a message send and not a local variable, because there is a literal block argument, and variables don't take arguments, only message sends do.
So, this is sending the message after_commit
to the implicit receiver self
(which in a module definition body is just the module itself), and passes some arguments, including a literal block.
If we add the optional elements back in, we can see that
after_commit … some stuff … do
… some stuff …
end
is equivalent to
self.after_commit(… some stuff …) do
… some stuff …
end
The block has no parameter list, only a body. The content of the body is
__elasticsearch__.index_document
Again, we know that index_document
is a message send because it has a receiver. Whenever you see either an argument or a receiver or both, you know that you have a message send. So, this is sending the message index_document
to the receiver expression __elasticsearch__
.
Now, what is __elasticsearch__
? As I mentioned above, we can't actually know without context what it is: it could be either a receiver-less message send with no argument list, i.e. a message send to the implicit receiver self
, roughly equivalent to self.__elasticsearch__()
. Or, it could be a local variable. The way this ambiguity is resolved is by looking at the preceding context: if there has been an assignment to __elasticsearch__
parsed (not necessarily executed) before this point, it will be treated as a local variable, otherwise, as a message send.
In this particular case, there is no assignment to __elasticsearch__
, therefore, it must be a message send, i.e. it is sending the message __elasticsearch__
to the implicit receiver self
(which is here still the module itself, because blocks lexically capture self
, although that is part of the language semantics and you asked strictly about syntax).
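The "parsed, not necessarily executed" rule can be seen in a short sketch. The name greeting is hypothetical and stands in for any bare identifier:

```ruby
# A method and, later, a local variable that share the same name.
def greeting
  "from the method"
end

# No assignment to greeting has been parsed yet, so this is a message send.
before = greeting

if false
  # Never executed, but the parser has now seen an assignment to greeting.
  greeting = "from the variable"
end

# From here on, a bare greeting is a local variable. It is nil because
# the assignment was parsed but never ran.
after = greeting

# An argument list (even an empty one) forces a message send again.
forced = greeting()

p before
p after
p forced
```

This prints "from the method", then nil, then "from the method" again.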
If we add the optional elements back in, we can see that
__elasticsearch__.index_document
is equivalent to
self.__elasticsearch__().index_document()
So far, we have dissected the body of the block as well as the outermost message send. If we put together what we have found so far and add all the optional syntax elements back in, we see that
after_commit on: %i[create update] do
__elasticsearch__.index_document
end
is equivalent to
self.after_commit(on: %i[create update]) do
self.__elasticsearch__().index_document()
end
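To check that the two spellings really are the same message send, here is a sketch with a hypothetical class Article whose after_commit only records its arguments. This is an assumption for illustration and not how Rails implements after_commit; the blocks are never run, so __elasticsearch__ does not need to exist:

```ruby
# Hypothetical stand-in for a model class; NOT the Rails implementation.
class Article
  def self.recorded_calls
    @recorded_calls ||= []
  end

  # Hypothetical class method that only records what it was sent.
  def self.after_commit(on:, &block)
    recorded_calls << { on: on, block_given: !block.nil? }
  end

  # Short form, as in the question.
  after_commit on: %i[create update] do
    __elasticsearch__.index_document
  end

  # Same message send with the optional elements written out.
  self.after_commit(on: %i[create update]) do
    self.__elasticsearch__().index_document()
  end
end

first, second = Article.recorded_calls
puts first == second
```

Both sends record the same arguments and a block, so this prints true.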
Now, let's look at the argument list:
(on: %i[create update])
And specifically, let's first focus on the expression %i[create update]
.
This is a percent literal, more precisely, a Symbol Array percent literal. It has the form
- % character
- i character
- opening-delimiter
- Symbols separated by whitespace
- closing-delimiter
If the opening-delimiter is one of <
, [
, (
, or {
, then the closing-delimiter must be the corresponding >
, ]
, )
, or }
. Otherwise, the opening-delimiter can be another non-alphanumeric character such as | or !, and the closing-delimiter must be that same character.
These percent Array literals allow you to concisely create Arrays from whitespace-separated bare words.
In this case,
%i[create update]
is equivalent to
[:create, :update]
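A quick way to convince yourself of this equivalence, and of the delimiter rules, is to compare the literals directly:

```ruby
# The percent literal from the question and its long-hand equivalent.
symbols = %i[create update]
puts symbols == [:create, :update]

# Paired delimiters and same-character delimiters build the same Array.
same = [
  %i(create update),
  %i{create update},
  %i<create update>,
  %i|create update|,
  %i!create update!
]
puts same.all? { |array| array == symbols }
```

Both lines print true.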
As mentioned above, there is an ambiguity related to the interpretation of the argument list here: this could either be a keyword argument on
whose value is the result of evaluating the expression [:create, :update]
or it could be a Hash
literal equivalent to { :on => [:create, :update] }
.
We can't know which is which without knowing what the definition of after_commit
looks like.
So, there you have it:
If after_commit
is defined with a keyword parameter something like this:
def after_commit(on:); end
Then the whole thing will be interpreted like this:
self.after_commit(on: [:create, :update]) do
self.__elasticsearch__().index_document()
end
Whereas, if after_commit
is defined with a positional parameter something like this:
def after_commit(condition); end
Then the whole thing will be interpreted like this:
self.after_commit({ :on => [:create, :update] }) do
self.__elasticsearch__().index_document()
end
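The two interpretations can be observed with a pair of hypothetical methods (with_keyword and with_positional are made-up names, not part of Rails) that are sent exactly the same argument syntax:

```ruby
# Hypothetical method with a keyword parameter.
def with_keyword(on:)
  [:keyword, on]
end

# Hypothetical method with a single positional parameter.
def with_positional(condition)
  [:positional, condition]
end

# Identical syntax at both call sites.
keyword_result    = with_keyword on: %i[create update]
positional_result = with_positional on: %i[create update]

# The keyword parameter receives just the Array ...
puts keyword_result == [:keyword, [:create, :update]]
# ... while the positional parameter receives a Hash wrapping it.
puts positional_result == [:positional, { on: [:create, :update] }]
```

Both comparisons print true: the call sites are indistinguishable, and only the parameter list of the method decides how the argument is received.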
This involves the following syntax elements:
- message sends
- arguments
  - either keyword arguments
  - or positional arguments
    - with a trailing Hash literal
- block literals
- percent literals
But I'm still not sure how this syntax "on:" is created.
It's not quite clear what you mean by "how this syntax is created". The way all Ruby syntax is created (and in fact, all syntax for any programming language is created), is by writing down the rules for the syntax in the programming language specification. Now, unfortunately, Ruby does not have a single unified specification document, but for example, you can find parts of the syntax specification in the ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification. You can also find bits and pieces in the ruby/spec, for example on Symbol
Array
percent literals. Other sources are the RDoc documentation generated from the YARV source code, the Ruby issue tracker, and the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists.