1

I'm working on use case where i need to convert the string type to integer type in AWS Athena

Below is my input data -

year    month
2022    jan
2022    feb
2012    apr
2023    may
2019    jun

Below is what I'm expecting -

year    month   yearmon
2022    jan     202201
2022    feb     202202
2012    apr     201204
2023    may     202305
2019    dec     201912

Initially I thought to use case statement wherein I hardcode for each month, concatenate this with year column and finally parse the yearmon as int. Something like -

concat(year, case when month = 'jan' then '01'
...
... end) 

Finally cast it to int

I tried below formats -

Select month("jan")
Select cast("jan" as date)

But nothing seem to have worked.

Is there a better way to cast this use case than hard coding the values in case statement?

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
djm
  • 45
  • 6

1 Answers1

1

You can try using date_parse and date_format. The following works in Trino on which the Athena is based on:

-- sample data
with dataset(year, month) as (
    values (2022, 'jan'),
    (2022, 'feb'),
    (2012, 'apr'),
    (2023, 'may'),
    (2019, 'jun')
)

-- query
select date_format(
             date_parse(cast(year as varchar) || '-' || month, '%Y-%b'),
             '%Y%m'
           )
from dataset;

Which produces the following output (TBH I was a bit surprised that it can handle lowercase month names):

_col0
202201
202202
201204
202305
201906

If this will not work for your actual data then using case-when is the approach you might want to take.

Guru Stron
  • 102,774
  • 10
  • 95
  • 132