0

I have a very messy date field that is fed from several systems, each with their own date format.

I have:

DD-MM-YY hh:mm:ss.ssss

DD-MON-YY hh:mm:ss

DD-MM-YYYY hh:mm:ss.ssss

All of this is stored in a varchar2 field. Now I have to do searches between date ranges and this is causing me problems. How can I approach this?

Here are some code snippets I have tried:

A standard substr works well, but I can't account for different date formats:

select substr(created_on, 1,9) as date2 from rtl.HK_Alerts

Ideally if I can get this to work:

select to_date(created_on, 'dd-mon-yy') as date_convert from rtl.HK_Alerts

Then I can do this:

select * from my_table 
    where to_date(created_on, 'dd-mon-yy') > '01-Jan-1970'
    and to_date(***strong text***created_on, 'dd-mon-yy') < '31-Jan-1970'

Also, how do I account for the different date formats from different systems? Unfortunately there is no system identifier I can work with.

here is a picture of some of the dates: created_on_date

07-JAN-19 01.53.47.702000000
07-JAN-19 01.53.47.992000000
07-JAN-19 01.53.48.186000000
07-JAN-19 01.53.48.360000000
07-JAN-19 01.53.48.548000000
07-JAN-19 01.53.48.709000000
07-JAN-19 01.53.48.900000000
20-JAN-19 22.49.30.801000000
20-JAN-19 22.49.30.014000000
20-JAN-19 22.49.33.968000000

vwdewaal
  • 975
  • 2
  • 11
  • 25
  • 1
    [Please post formatted text, not pictures](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557). But all of the values in that image seem to be tiehr strings in the same format, or actual timestamp that your client is formatting like that for display. Please include sample data that actually shows the issue, in your question (not as a comment), and as formatted text. And the actual data type of the column in your table. – Alex Poole May 10 '19 at 10:15

2 Answers2

3

You could write some switch logic which converts each timestamp string to a bona fide timestamp using the appropriate mask. Here is an example:

SELECT
    created_on,
    CASE WHEN REGEXP_LIKE (created_on, '^[0-9]{2}-[0-9]{2}-[0-9]{4}')
         THEN TO_TIMESTAMP(created_on, 'DD-MM-YYYY HH24:MI:SS.FF') END AS 
         WHEN REGEXP_LIKE (created_on, '^[0-9]{2}-[0-9]{2}-[0-9]{2}')
         THEN TO_TIMESTAMP(created_on, 'DD-MM-RR HH24:MI:SS.FF')
         WHEN REGEXP_LIKE (created_on, '^[0-9]{2}-[A-Z]{3}-[0-9]{2}')
         THEN TO_TIMESTAMP(created_on, 'DD-MON-RR HH24:MI:SS')

created_on_ts FROM rtl.HK_Alerts;

enter image description here

Demo

But the best long term fix here might be to fix your source data such that the incoming timestamps are all standardized, and in a format which Oracle can easily consume.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • 1
    @AlexPoole You're the resident Oracle expert on this site :-) ... I pushed the four year pattern first, which gets around the problem you pointed out. Better would have been to use word boundaries. Then, the order of the `CASE` expression wouldn't even matter. – Tim Biegeleisen May 10 '19 at 10:45
3

You may be overthinking this. Oracle's date conversion is pretty flexible by default, which is both a good and bad thing. In this case if you only have the formats you showed then conversion can be done with a single format mask:

to_timestamp(created_on, 'DD-MM-RRRR HH24:MI:SS.FF')

Demo with some made-up data:

-- CTE for sample data
with hk_alerts (created_on) as (
            select '10-05-19 12:34:56' from dual
  union all select '10-05-19 12:34:56.789' from dual
  union all select '10-May-19 12:34:56' from dual
  union all select '10-May-19 12:34:56.789' from dual
  union all select '10-May-2019 12:34:56' from dual
  union all select '10-May-2019 12:34:56.789' from dual
)
-- actual query
select created_on, to_timestamp(created_on, 'DD-MM-RRRR HH24:MI:SS.FF') as date_convert
from hk_alerts;

CREATED_ON               DATE_CONVERT                 
------------------------ -----------------------------
10-05-19 12:34:56        2019-05-10 12:34:56.000000000
10-05-19 12:34:56.789    2019-05-10 12:34:56.789000000
10-May-19 12:34:56       2019-05-10 12:34:56.000000000
10-May-19 12:34:56.789   2019-05-10 12:34:56.789000000
10-May-2019 12:34:56     2019-05-10 12:34:56.000000000
10-May-2019 12:34:56.789 2019-05-10 12:34:56.789000000

Of course, this still has to assume that any strings using month names/abbreviations are in the same language your session is using. Not much you can do about that though.

Your comparison can then be done against timestamp literals:

where to_timestamp(created_on, 'DD-MM-RRRR HH24:MI:SS.FF') >= timestamp '1970-01-01 00:00:00'
and to_timestamp(created_on, 'DD-MM-RRRR HH24:MI:SS.FF') < timestamp '1970-02-01 00:00:00'
Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • Oracle's [String-to-Date conversion rules are in its documentation](https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements004.htm#sthref478) and you could also use `DD-MM-YY HH24:MI:SS.FF9` as the format model. – MT0 May 10 '19 at 11:02
  • @MT0 - surely RR is safer than YY if there are 2-digit years involved? (Though agree it could be RR rather than RRRR.) – Alex Poole May 10 '19 at 11:05
  • That really depends on what the OP's data means. Is `01-JAN-70` the year `2070` or `1970` and there could be a use case for both (hopefully not intermixed in the same table) and the OP should take the time to understand the difference between the two format models. – MT0 May 10 '19 at 11:13
  • Something called `created_on` is unlikely to have future dates, I hope *8-) But yes, I agree. – Alex Poole May 10 '19 at 11:18