3

Let's say I have any one of the following expression:

SELECT
    DATE '2014-01-01' < '2014-02-01',
    DATE '2014-01-01' < '321',
    9 < '10',
    9 < 'a'

Is there a recommendation or requirement in the SQL standard of how these should be compared? I suppose the three levels or 'strictness' would be the following:

  1. Raise an error [most strict] -- all 4 expressions above would fail.
  2. Try casting the string to the non-string type, if it doesn't work raise an error -- expressions 2 and 4 above would fail.
  3. Try casting the string to the non-string type, if it fails fallback to casting the non-string to a string -- all 4 above work.
  4. Cast the non-string operand to a string -- all 4 above work.

It seems BigQuery uses the second approach, postgres uses something like 2/3 (only the last one fails), mysql uses either 3 or 4 (no fail).

Does the standard make any recommendations here?

Salman A
  • 262,204
  • 82
  • 430
  • 521
David542
  • 104,438
  • 178
  • 489
  • 842
  • What are you doing here -- it does not make sense to have these comparisons in the select statement. – Hogan May 09 '23 at 17:13
  • 2
    @Hogan - its a question about what the sql standard say (if anything) for handling incompatible data types - not about a particular sql query. As far as I can tell you have to pay for a copy of "the sql standard" so personally I've never seen it ;) – topsail May 09 '23 at 17:57
  • I guess another possibility is coerce to type of left operand. No idea of what the standard says. Interesting question. – Álvaro González May 09 '23 at 18:19
  • I don't know the standard and for SQL server, implicit conversions are not always reliable especially for datetime (ie: '20230511' would correctly convert to a datetime implicitly when comparing to a datetime but '2023/05/11' or '2023-05-11' may not - have seen it fail under some specific language, date settings, and SQL server version). Probably best is to use explicit conversions. I think postgreSQL uses 2 and only 1 and 3 succeed. – Cetin Basoz May 11 '23 at 16:45
  • 1
    @CetinBasoz, another case where implicit conversion can cause confusion is `9 < '10'` - which will give different results depending on if you cast to char or to numeric. – jarlh May 12 '23 at 15:08
  • @jarlh, exactly. – Cetin Basoz May 12 '23 at 20:02

3 Answers3

5

Does the standard make any recommendations here?

I believe you're asking if there are any official rules regarding automatic type conversion for comparison? The answer is no*.

For comparison operation, the specs mention that data types [...] shall be comparable. Comparison of different data types is not described but it does say that implicit type conversion can occur in expressions [...]. So RDBMS is allowed to convert the data type of one or both operands to compare them.

That being said, the rules for automatic type conversion are implementation-dependent. The rules vastly differ across RDBMS. Consult your RDBMS' documentation to understand them.

SQL Server for example, uses data type precedence to convert the data type of one of the operands to match the other:

  • For DATE '2014-01-01' < '2014-02-01', the varchar value will be converted to date for comparison
  • For 9 < '10', the varchar value will be converted to int for comparison

MySQL has a different set of rules:

  • For DATE '2014-01-01' < '2014-02-01', both values will be converted to timestamps (not timestamp data type) for comparison
  • For 9 < '10', both values will be converted to floating point numbers for comparison

Unfortunately implicit conversions have too many gotchas and they should be avoided. For example 2.0 = '2.01' is true in SQL Server and 2 = '2foo' is true in MySQL.

Use the CAST function, variables of correct type (DECLARE @userdate AS DATE = '20120201') or appropriate functions (STR_TO_DATE('01-02-2012','%d-%m-%Y')) on strings.


* The answer is based on SQL-92 standard — very old but still relevant to the question.

Salman A
  • 262,204
  • 82
  • 430
  • 521
  • thanks for this that's very helpful. Is there any such mention as a 'super-type' or class of types or something of that sort. For example, 2 (int) being comparable with 2.4 (float) or DATE '2014-01-01' being comparable with DATETIME '2014-01-02' (I've yet to see an implementation that doesn't support comparing those similar data-type classes). – David542 May 11 '23 at 22:13
  • 2
    Like I said it is implementation dependent. *General rule* is that the data type having smaller range is converted to the larger one to avoid _unexpected behavior_ e.g. 2 = 2.4 returning true. So, at least in SQL server, date will be promoted to datetime with h:m:s set to 0. Int will be promoted to float. Two decimals having different scale and precision... both will be converted to have same precision and scale (larger of both). And again SQL server... you can't compare two `text` data types, cuz specs don't define the exact rules. – Salman A May 11 '23 at 22:26
2

The ISO/IEC 9075 SQL:2016 latest new features (also discussed here or in this PDF) did not include much regarding casting.

That standard does not dictate every detail of how a specific implementation should handle all situations. Instead, it sets general guidelines and requirements that implementations should adhere to, while allowing some flexibility for individual vendors to tailor their products.

That means any solution based on CASTing will need to be checked against the specific documentation of your SQL server.

SELECT
    DATE '2014-01-01' < DATE '2014-02-01',
    DATE '2014-01-01' < CAST('321' AS DATE),
    CAST(9 AS VARCHAR) < '10',
    CAST(9 AS VARCHAR) < 'a'

or

SELECT
    DATE '2014-01-01' < CAST('2014-02-01' AS DATE),
    9 < CAST('10' AS INTEGER)

While CAST is generally more widely supported, you might need to use a different function or syntax depending on the specific SQL implementation you are working with.
For example, SQL Server uses the CONVERT function, which is similar to CAST but also allows you to specify a style argument for certain conversions.

VonC
  • 1,262,500
  • 529
  • 4,410
  • 5,250
-2

The SQL standard for conversion is using the CAST keyword

CAST ( expression AS data_type )

I would suggest using that if you want to be "standard"

Hogan
  • 69,564
  • 10
  • 76
  • 117
  • Right -- but without an explicit cast, the question is how are different data types coerced (or not). For example here is how SQLServer does it: https://learn.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-ver16 (scroll down to see the graphic). – David542 May 09 '23 at 18:54
  • @David542 -- the question does not make sense. You are talking about behavior that is not covered by the standards and then asking what the standard way is. It is by definition not something that has a standard -- each platform will be different. – Hogan May 09 '23 at 21:31
  • sure, but how do you know it's not covered by the standard? – David542 May 09 '23 at 22:05
  • @David542 how do I know you are not a potato? – Hogan May 09 '23 at 22:06
  • 1
    "You are talking about behavior that is not covered by the standards" - Well, that's exactly what's being asked here. For any feature, a given standard can define it exactly, leave it open to implementations, declare it undefined behaviour or omit it altogether. ANSI standards are not freely available (model is similar to research papers publishing) and, even if they were, a plain English explanation is always useful. Remember that Stack Overflow aims to be a Q&A repository rather than a help forum. – Álvaro González May 10 '23 at 09:48
  • @ÁlvaroGonzález -- How else do you want me to answer this question. I said -- the standard is to use the cast keyword to convert types. I really don't know any other way to say what the standard does more clearly. – Hogan May 10 '23 at 13:59
  • 1
    Sorry then, perhaps I didn't understand the answer. Do you mean that using the less than operator with operands with different data types is an open violation of the SQL ANSI standard, and DBMS's that implement any kind of implicit casting (as opposed to rejecting the query altogether) are not ANSI compliant? – Álvaro González May 10 '23 at 15:47
  • @ÁlvaroGonzález -- yes. But no DBMS would reject the query, users would not put up with it. There are MANY parts of SQL that are not ANSI compliant -- there is no standard for DDL for example. For SPs. The list goes on and on. Standard SQL is actually quite small. – Hogan May 10 '23 at 20:29
  • 1
    The SQL standard includes both DDL and stored procedures (PSM). – jarlh May 12 '23 at 14:31
  • 3
    @David542 "implicit type conversion" occurs only one time in section 4.12 of the final (1999) ANSI SQL standard. It does not define how "Implicit Type Conversions" should be performed only that they may be but goes on to say if you want to explicity convert types use cast. `Implicit type conversion can occur in expressions, fetch operations, single row select operations, inserts, deletes, and updates. Explicit type conversions can be specified by the use of the CAST operator.` – rabbit May 12 '23 at 15:23