I have a CSV file with several columns of data that serve as inputs to the Python functions called by my PythonOperators. I want my DAG to read the CSV and feed each row's values into my operators. How can I iterate my DAG across the CSV rows?

1 Answer

If you want to read a CSV file and process each row in its own task, you can read the file in one task and use Dynamic Task Mapping (available since Airflow 2.3.0) to process the rows:

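from airflow import DAG
from airflow.decorators import task

# dag_id may not contain spaces; "dag_id" here is only a placeholder name
with DAG(dag_id="dag_id", start_date=...) as dag: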

    @task
    def read_csv():
        # here load the csv file and prepare the data to process
        csv_file = ... # read csv_file

        data_process = ... # a list of data calculated from the csv_file
        
        return data_process # ex: [{"row":1, "x":1}, {"row":2, "x":1}, {"row":3, "x":2}]


    @task
    def processing(data_to_process):
        # implement your processing function
        print(f"row data: {data_to_process}")

    data_to_process = read_csv()
    processing.expand(data_to_process=data_to_process)
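
The body of read_csv is left open above. As a minimal sketch, assuming the file has a header row and fits comfortably in memory, the loading step could use the standard library's csv.DictReader. The helper name load_csv_rows and its csv_path parameter are hypothetical and not part of Airflow:

```python
import csv
from typing import Dict, List


def load_csv_rows(csv_path: str) -> List[Dict[str, str]]:
    """Return one dict per CSV row, keyed by the header names.

    Hypothetical helper: csv_path is whatever location your workers can read.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader treats the first line as the header.
        # Every value comes back as a string, so cast inside the processing task if needed.
        return [dict(row) for row in csv.DictReader(handle)]
```

Inside read_csv you would then return load_csv_rows(...) with your own path. The returned list is passed between tasks through XCom, so it should stay small and serializable, and the number of mapped task instances is capped by the max_map_length setting (1024 by default, as far as I know).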
Hussein Awala