I find the way Airflow scheduling works very confusing.
I would like to schedule a DAG that runs on Friday, and I would like to use its result on Saturday. So I wrote the crontab expression like this: 00 16 * * 5
However, as of today, 2020-03-10, the last execution date I got from the Airflow run is 2020-02-28. This is not what I want, as the most recent Friday is actually 2020-03-06. I couldn't get 2020-03-06 to run unless I schedule the DAG every day and skip it if it is not Friday. Is there a way to set up this schedule correctly?
JOHN
1 Answer
A lot of people get confused by how Airflow's execution_date and schedule_interval values work, namely that Airflow waits for a period of time to "close" before it executes a run for that period. Here's a portion from a previous answer I gave:
Think of it like this: If you ran a process quarterly and generated a report from data for that quarter, would you name the report for the quarter you were in when you created the file, or for the quarter the data in the report is from? That's what the execution_date is.
Try moving your start_date back by one whole schedule interval. The DAG should then run on 03/06, but its execution_date will say 02/28.
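To make the dates concrete, here is a minimal sketch in plain Python (standard library only, not Airflow's own API) that models the behaviour described above for a weekly Friday 16:00 schedule. The function name latest_weekly_run and its parameters are hypothetical, made up for this illustration; it assumes the classic execution_date semantics, where a run is labelled with the start of the period it covers.

```python
from datetime import datetime, timedelta


def latest_weekly_run(now, weekday=4, hour=16):
    """Model a weekly schedule (default: Friday 16:00, i.e. '00 16 * * 5').

    Returns (execution_date, actual_start) for the most recent run that
    would have been triggered at or before `now`.
    Hypothetical helper for illustration; it does not call Airflow.
    """
    # Find the most recent schedule tick at or before `now`.
    days_back = (now.weekday() - weekday) % 7
    tick = (now - timedelta(days=days_back)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if tick > now:
        tick -= timedelta(weeks=1)

    # The run that starts at this tick covers the period that just closed,
    # so it is labelled with the previous tick.
    execution_date = tick - timedelta(weeks=1)
    return execution_date, tick


execution_date, actual_start = latest_weekly_run(datetime(2020, 3, 10))
print("execution_date:", execution_date)  # 2020-02-28 16:00:00
print("actually ran at:", actual_start)   # 2020-03-06 16:00:00
```

So on 2020-03-10 the latest run did happen on Friday 03/06; it is only labelled 02/28 because that is the start of the week it covers.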

joebeeson