0

What's the best way to check if the VARIANT TYPE with JSONs with value similar to TIMESTAMP_NTZ has correct format?

For example, this is the correct format that I would like to have

 2020-12-26T12:12:11.215581Z

but there are times when it looks different in database, like this

2021-11-26T12:12:11.215581Z[UTC]

I would like to detect records which are in a different format than the reference.

I tried with simple LIKE, but it omits formats that could be different than this.

LIKE '%[UTC]%'
Tims
  • 31
  • 6
  • 4
    Which dbms are you using? (When it comes to date/time, many products have their own, non-ANSI functions.) – jarlh Jan 20 '22 at 11:43
  • 2
    Column data type? – jarlh Jan 20 '22 at 11:43
  • Does it need to have exactly 6 digits after decimal (or any decimal portion at all)? – Salman A Jan 20 '22 at 11:48
  • A column defined with the data type `timestamp` does not have any "format". So you don't have to "check" it because it won't allow to store invalid values. –  Jan 20 '22 at 11:52
  • @a_horse_with_no_name sorry generally it is VARIANT TYPE with JSONs, so I would like to check if specific key has this correct value format. – Tims Jan 20 '22 at 11:59
  • @SalmanA I am not 100% sure, but probably yes – Tims Jan 20 '22 at 12:01
  • 2
    Well, then it's not a "timestamp" column, but a string/text column containing something that is supposed to be a timestamp. If you had chosen a proper data type, you wouldn't have this problem to begin with. –  Jan 20 '22 at 12:01
  • @a_horse_with_no_name you are right, but sometimes happens that this value has different format than I expecting, so I would like to detect such situations. So should I use regex or something else? – Tims Jan 20 '22 at 12:03
  • 1
    Please tag the rdbms. Regex is needed. – Salman A Jan 20 '22 at 12:04
  • 1
    The tag is a _general_ tag, but here we need the tag for the specific dbms product you're using. (Different products have different functionality.) – jarlh Jan 20 '22 at 12:26

2 Answers2

0

If all the records you want to detect come in addition to the end of the format that should be then you can use length.

Basically:

WHERE NOT len(column_name)=27

27 because your correct format 2020-12-26T12:12:11.215581Z has 27 characters.

Eren Temelli
  • 333
  • 1
  • 14
0

You may use a regular expression to check the equality of the desired format. I will extend this answer if the rdbms is known.

[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z

MySQL & SQLite:

SELECT * FROM table 
WHERE timestamp not REGEXP '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z';

Oracle:

SELECT * FROM table 
WHERE not REGEXP_LIKE(timestamp, '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}Z');
Dominik Klug
  • 139
  • 1
  • 5
  • Of course it fails to catch dates like '2022-02-29T...' – Salman A Jan 20 '22 at 14:05
  • May you clarify, why this should fail on a leap date? – Dominik Klug Jan 20 '22 at 15:04
  • It passes regex but it will fail (or do something unexpected) when someone actually tries to cast it to a datetime (e.g. this `select cast('2022-02-29T12:12:11.215581Z' as datetimeoffset)` will fail on on SQL server). – Salman A Jan 20 '22 at 15:08
  • I derived from the questions comment section that the column is of type string/text. So no need to pay attention to the tedious handling of certain dates. Op asked how to determine different formats, not how to cast those strings. – Dominik Klug Jan 20 '22 at 15:15