Arity of BETWEEN expression

Question

What is the arity of the sql BETWEEN expression? I thought it was three (ternary) since the expression usually looks like:

WHERE...
   1 BETWEEN 2 AND 3

But it's listed as binary on BigQuery's documentation, and I assume other places as well.

Source: Operators.

What is the arity of the BETWEEN expression and why? I think the answer is 3 from the following example:

select 
  ~ (SELECT -1 AS expr_1)                              AS 'bitwise_arity_1',
    (SELECT 1 AS expr_1) * (SELECT 2 AS expr_2)        AS 'times_arity_2',
    (SELECT 1 AS expr_1) BETWEEN 
      (SELECT 2 AS expr_2) AND (SELECT 3 AS expr_3)    AS 'bitwise_arity_3?'

I suppose one way to interpret it might just be that the grammar is:

expr 'BETWEEN' logicalAndExpr

And so the two expressions in the logicalAnd are just grouped into one. Is that a correct understanding?

SQLFiddle: http://www.sqlfiddle.com/#!9/b28da2/2156

any function with arity `N` - can be redefined as function with arity `N-1` because any two arguments can be presented as tuple, so `between is binary because it accepts value and range` at the same time `between is ternary because it accepts value, low and high` — Iłya Bursov, Nov 05 '22 at 23:06
@IłyaBursov I see, thanks for the explanation. So is this sort of open to interpretation, as to whether it is considered binary or ternary? — David542, Nov 05 '22 at 23:08
yes, from [wiki](https://en.wikipedia.org/wiki/Arity#n-ary) `From a mathematical point of view, a function of n arguments can always be considered as a function of one single argument which is an element of some product space.` and `The same is true for programming languages, where functions taking several arguments could always be defined as functions taking a single argument of some composite type such as a tuple` — Iłya Bursov, Nov 05 '22 at 23:09
I'm not sure what that SQL fiddle is intended to show, but it's reasonably clear that the `AND` in a `BETWEEN`...`AND` operation has nothing to do with the logical `AND`. They don't even have the same precedence, which is one of the annoyances in parsing SQL (and, for similar reasons, COBOL). — rici, Nov 06 '22 at 02:09

rici · Accepted Answer · 2022-11-06T02:28:15.103

It's binary, in syntactic terms. See below for a discussion of syntax vs. semantics, where I note that a better syntactic term is "infix".

Similarly, function calls and array subscripting are postfix unary operators and the C family's conditional operator (often misnamed "the ternary operator" as though it were the only such thing) is also infix. The reason is that the interior operands (the operands between BETWEEN...AND, (...), [...], and ?...:, respectively) are fenced off from the rest of the syntax by the pair of surrounding terminal tokens which function as a syntactic barrier, like parentheses. Precedence does not penetrate to the enclosed operands; only the outer operand(s) remain floating in the syntax.

The semantic view is quite different, of course. BETWEEN...AND and ?...: are certainly three-argument functions, although since the latter is short-circuiting, only two of the three arguments are ever evaluated, which makes it hard to discuss in strict mathematical terms [Note 1]. Moreover, the semantic view is complicated by the fact that there is not just a single way to look at what an argument is. As noted in a comment, you can always curry functions into a series of unary applications of higher-order functions. Although you might be tempted to try to redefine "arity" as the length of that sequence, you will soon find higher-order functions which have different sequence lengths depending on the values of their arguments. Also, in most programming languages (unlike SQL) the function being called is a full expression which does not need to be evaluated at compile-time, and since different functions have different argument counts, there is no good way to describe the arity of a function call unless you respecify the call to be the application of a list-of-arguments object to a callable object. That's often done, but it's a bit unsatisfying because (in most languages), the list object does not really exist and cannot be observed as an object.

I'd suggest taking the Wikipedia article on arity with a good-sized saline dosage, because it completely misses the distinction between semantics and syntactic structure, giving rise to the confusing ambiguity between the semantic and syntactic view of SQL's range operator or C's conditional operator. Personally, I prefer to reserve "arity" for the semantic meaning, using "fixity" or "valence" for the syntactic feature. (The advantage of "fixity" is that it encourages the distinction between prefix and postfix, which is a real distinction hidden by calling both cases "unary operators".)

Notes

BETWEEN...AND could short-circuit, too, but standard SQL doesn't guarantee short-circuiting, as far as I know (although some SQL implementations do.)

thanks so much for this detailed answer. One thing I'd add is SQL has the conditional expressions which short-cirtuit, but that's it as far as I'm aware. For example, `COALESCE`, `CASE`, `[NULL]IF` , `IF[NULL]`. — David542, Nov 06 '22 at 00:23
@David542: I might be wrong, but I believe that guaranteed short-circuiting is a (fairly common) extension rather than part of the SQL standard. For example, Microsoft's [documentation for SQL Server](https://learn.microsoft.com/en-us/sql/t-sql/language-elements/case-transact-sql?view=sql-server-ver16#remarks) warns that "The CASE expression cannot be used to control the flow of execution... In some situations, an expression is evaluated before a CASE expression receives the results of the expression as its input. Errors in evaluating these expressions are possible." — rici, Nov 06 '22 at 01:03
I see, thanks for that. What do you mean by `Similarly, function calls and array subscripting are postfix unary operators` ? Aren't functions similar to global tokens that accept zero or more input parameters, such as `NOW()`, `ABS(1)`, `GREATER(1,2)`. How would that be the same as something like `[1,2,3][0]` ? Wouldn't the function be: `identifier '(' [expr [,...]] ')'` and subscripting would be: `expr '[' expr? ']'` ? where the first `expr` would need to evaluate to an array ? — David542, Nov 06 '22 at 04:57
@David542: In SQL, functions are not first-class objects; I was thinking of other languages, in which the function being called is in expression (although most of the time, you use a simple expression consisting of an identifier). So `f(3)` has the syntax `expr '(' expr_list ')'`, which is syntactically the postfix operator `'(' expr_list ')'` applied to the first `expr`. I know that's a bit mind-bending at first, but if you think about it it makes sense. — rici, Nov 06 '22 at 05:06
In the same way, consider the doubly-subscripted `a[2][i]`. What's going on here, syntactically? Obviously, it's not `a([2][i])` :-). So it must be `(a[2])[i]`. Syntactically, the operator is `[...]`, which must be a postfix operator (and postfix operators, remember, generally have the highest precedence of all, which is why `-a[i]` is `-(a[i])`). — rici, Nov 06 '22 at 05:09
We could write function application and subscript as infix operators, say `f $ 3` (Haskell uses that syntax, as an alternative to `f 3` with different precedence), or `ary AT i`. Those are two argument infix operators. But now the precedence rules get wonky. (Try it and you might see the issues.) If you had to write `ary AT (i + 1)` for `ary[i+1]`, you would reasonably ask , what's the point of the `AT` operator? — rici, Nov 06 '22 at 05:12
None of that is relevant to SQL, I think. In C or Python, you can, for example, put a bunch of functions (with the same prototype) into an array, and then select which one to call: `funcarray[index](20.0)`. The function expression in the function call is the result of an array selector, which is what makes that syntax useful. But SQL wouldn't know how to deal with that. — rici, Nov 06 '22 at 05:19

Arity of BETWEEN expression

1 Answers1

Notes