I have a piece of Apache Beam pipeline code that reads a file from a GCS bucket and prints its contents. It works perfectly with the DirectRunner and prints the file output, but with the DataflowRunner it prints nothing and raises no errors either.
Do we need to do anything special/different for the Dataflow runner?
The code looks like this:
import apache_beam as beam
from apache_beam.io.textio import ReadFromTextWithFilename

p = beam.Pipeline(options=pipeline_options)

read_file_pipe = (
    p
    | "Create {}".format(file_name) >> beam.Create(["Start"])
    | "Read File {}".format(file_name)
    >> ReadFromTextWithFilename(file_path, skip_header_lines=1)
    | beam.Map(print)  # prints each (filename, line) tuple
)

p.run().wait_until_finish()
The command I use to run it is:

python3 Test_Pipe.py --region us-central1 --output_project= --runner=DataflowRunner --project= --temp_location= --service_account_email= --experiments=use_network_tags=default-uscentral1 --subnetwork --no_use_public_ips
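
For context, pipeline_options is a regular PipelineOptions object built from those flags. A minimal sketch of how it can be constructed (the exact construction is not shown in my snippet above, so treat the details here as an assumption):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

# With no arguments, PipelineOptions parses the Dataflow flags from sys.argv
# (--runner, --project, --region, --temp_location, and so on).
pipeline_options = PipelineOptions()
# save_main_session ships the main module's global state to the Dataflow workers.
pipeline_options.view_as(SetupOptions).save_main_session = True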