i have a DAG (DAG1) where i copy a bunch of files. I would then like to kick off another DAG (DAG2) for each file that was copied. As the number of files copied will vary per DAG1 run, i would like to essentially loop over the files and call DAG2 with the appropriate parameters.
eg:
with DAG( 'DAG1',
description="copy files over",
schedule_interval="* * * * *",
max_active_runs=1
) as dag:
t_rsync = RsyncOperator( task_id='rsync_data',
source='/source/',
target='/destination/' )
t_trigger_preprocessing = TriggerDagRunOperator( task_id='trigger_preprocessing',
trigger_daq_id='DAG2',
python_callable=trigger
)
t_rsync >> t_trigger_preprocessing
i was hoping to use the python_callable trigger
to pull the relevant xcom data from t_rsync
and then trigger DAG2; but its not clear to me how to do this.
i would prefer to put the logic of calling DAG2 here to simplify the contents of DAG2 (and also provide stacking schematics with the max_active_runs
)