I'm using the function below:

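import apache_beam as beam
from apache_beam.io import fileio
from apache_beam.io.gcp.bigquery import WriteToBigQuery

# pipeline_options, user_options and DataTransformFn are defined elsewhere.
def final_pipeline():
    with beam.Pipeline(options=pipeline_options) as pipeline:
        # Match the CSV files in the bucket and open them for reading.
        readable_files = (
            pipeline
            | fileio.MatchFiles(file_pattern="gs://bucket_name/zipfile/*.csv")
            | fileio.ReadMatches()
            | beam.Reshuffle()
        )
        # Parse each file and write the rows to BigQuery.
        files_parsing = (
            readable_files
            | beam.ParDo(DataTransformFn())
            | WriteToBigQuery(user_options.TABLE_NAME)
        )

if __name__ == '__main__':
    final_pipeline()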

Is there a way to pass the staging bucket location dynamically, as an argument to final_pipeline() or to beam.Pipeline(), so that the same Apache Beam pipeline can read files from different source locations when run on Google Cloud Platform (GCP) Dataflow?

I don't want to hard-code a file pattern in fileio.MatchFiles(), because that would lead to a separate Dataflow job for each kind of file. Instead, I want to pass a dynamic bucket path to final_pipeline() in main; I'll fetch the path from Composer through user_options.

I need a single Dataflow pipeline that handles different file types (.csv, .txt, .asc, etc.), so instead of a fixed fileio.MatchFiles() pattern I need to supply the path dynamically.

Andromeda
  • You can pass any option you want: https://beam.apache.org/releases/pydoc/current/apache_beam.options.pipeline_options.html. Another example is here: https://github.com/GoogleCloudPlatform/dataflow-cookbook/blob/main/Python/bigquery/write_bigquery.py#L46 – XQ Hu Jul 23 '23 at 01:00
  • That method (`_add_argparse_args()`) is used to fetch the variables passed from Composer (in my case). The variable that holds the bucket path then has to be passed into the `run()` function above so that the pipeline can read or match the files. – Rohan Anand Jul 27 '23 at 05:29
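
One way to follow up on the comments above is sketched below. It assumes the job is launched with the path already known at construction time (for example, started directly from Composer rather than from a classic template). The flag names `--input_path` and `--extensions` and the helper names `parse_args`, `build_patterns` and `run` are hypothetical, chosen only for illustration. The path is still matched against file patterns, but the patterns are built from the argument at launch, so a single pipeline can cover several file types.

```python
import argparse


def parse_args(argv=None):
    """Split the command line into our own options and Beam's options."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag names; use whatever Composer actually passes.
    parser.add_argument("--input_path", required=True,
                        help="Folder to read, e.g. gs://bucket_name/zipfile")
    parser.add_argument("--extensions", default="csv,txt,asc",
                        help="Comma-separated file extensions to match")
    # Flags we do not recognise (--runner, --project, ...) are left for Beam.
    return parser.parse_known_args(argv)


def build_patterns(input_path, extensions):
    """Return one glob pattern per extension under input_path."""
    base = input_path.rstrip("/")
    return [
        f"{base}/*.{ext.strip().lstrip('.')}"
        for ext in extensions.split(",")
        if ext.strip()
    ]


def run(argv=None):
    # Imported here so the helpers above can be checked without Beam installed.
    import apache_beam as beam
    from apache_beam.io import fileio
    from apache_beam.options.pipeline_options import PipelineOptions

    known_args, beam_args = parse_args(argv)
    patterns = build_patterns(known_args.input_path, known_args.extensions)

    with beam.Pipeline(options=PipelineOptions(beam_args)) as pipeline:
        readable_files = (
            pipeline
            | "Patterns" >> beam.Create(patterns)
            | "Match" >> fileio.MatchAll()  # one match per pattern element
            | "Read" >> fileio.ReadMatches()
            | "Reshuffle" >> beam.Reshuffle()
        )
        # The ParDo(DataTransformFn()) and WriteToBigQuery steps from the
        # question would be applied to readable_files here.
```

Calling `run()` from the `if __name__ == '__main__':` block and launching with something like `--input_path gs://bucket_name/zipfile --runner DataflowRunner` would then read every .csv, .txt and .asc file under that folder in one job. The same two values could instead be declared on a `PipelineOptions` subclass through `_add_argparse_args()`, as the first comment suggests.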
