I am working with AWS data pipeline. In this context, I am passing several parameters from pipeline definition to sql file as follows:
s3://reporting/preprocess.sql,-d,RUN_DATE=#{@scheduledStartTime.format('YYYYMMdd')}"
My sql file looks like below:
CREATE EXTERNAL TABLE RESULT (
STUDENT_ID STRING,
REMARKS STRING,
EXAM_DATE STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE LOCATION 's3://result/data/run_date=${previous day of RUN_DATE}'; <----(1)
...
insert into temp
select a.roll_number, remarks
from student a inner join result b
on a.student_id = b.student_id
where exam_date>='<start date of previous month of RUN_DATE>' <---- (2)
and exam_date<='<end date of previous month of RUN_DATE>';<---- (3)
Here in the above sql, I am not getting any idea how I will achieve (1) ,(2) and (3) from RUN_DATE.
So if RUN_DATE=20190101, then value in (1) should be "20181231", value in (2) should be "2018-12-01" and (3) should be "2018-12-31" .