My Python code for the Dataflow job looks like this:
import apache_beam as beam
from apache_beam.io.external.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

topic1 = "topic1"
# Kafka consumer settings; the broker runs on a GCP instance.
conf = {'bootstrap.servers': 'gcp_instance_public_ip:9092'}

# PipelineOptions() with no arguments reads its flags from sys.argv.
pipeline = beam.Pipeline(options=PipelineOptions())
(pipeline
 | ReadFromKafka(consumer_config=conf, topics=[topic1])
)
pipeline.run()
Because I am using KafkaIO in Python code, someone suggested that I use Dataflow Runner v2 (I think v1 doesn't support it from Python).
As per the Dataflow documentation, I am passing this parameter to enable Runner v2: --experiments=use_runner_v2
(I have not made any code-level changes for switching from v1 to v2.)
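For reference, here is a minimal sketch of how the flags might be assembled before handing them to PipelineOptions. The helper name build_dataflow_flags, the project ID, and the bucket are placeholders made up for illustration. Only --experiments=use_runner_v2 is the flag I actually described above, and --streaming is an assumption, since a Kafka read is unbounded.

```python
def build_dataflow_flags(project, region, temp_location, use_runner_v2=True):
    """Return command-line style flags for a Dataflow job.

    Hypothetical helper: the argument values are placeholders, not a real setup.
    """
    flags = [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        "--streaming",  # assumption: Kafka reads are unbounded
    ]
    if use_runner_v2:
        # The experiment flag from the Dataflow documentation.
        flags.append("--experiments=use_runner_v2")
    return flags


flags = build_dataflow_flags(
    project="my-project-id",             # placeholder
    region="us-central1",
    temp_location="gs://my-bucket/tmp",  # placeholder
)
print(" ".join(flags))
# The list could then be passed as PipelineOptions(flags) instead of
# relying on PipelineOptions() reading sys.argv.
```

Building the list in one place makes it easy to see exactly which experiments the job requests.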
I am getting the error below:
http_response, method_config=method_config, request=request)
apitools.base.py.exceptions.HttpBadRequestError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/metal-voyaasfger-23424/locations/us-central1/jobs?alt=json>: response: <{'vary': 'Origin, X-Origin, Referer', 'content-type': 'application/json; charset=UTF-8', 'date': 'Wed, 08 Jul 2020 07:23:21 GMT', 'server': 'ESF', 'cache-control': 'private', 'x-xss-protection': '0', 'x-frame-options': 'SAMEORIGIN', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'status': '400', 'content-length': '544', '-content-encoding': 'gzip'}>, content <{
"error": {
"code": 400,
"message": "(5fd1bf4d41e8b7e): The workflow could not be created. Causes: (5fd1bf4d41e8018): The workflow could not be created due to misconfiguration. If you are trying any experimental feature, make sure your project and the specified region support that feature. Contact Google Cloud Support for further help. Experiments enabled for project: [enable_streaming_engine, enable_windmill_service, shuffle_mode=service], experiments requested for job: [use_runner_v2]",
"status": "INVALID_ARGUMENT"
}
}
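As far as I can tell, the message says that the experiment I requested is not among those enabled for my project. Here is a small sketch that pulls both lists out of the message text and compares them. The regular expression is my own guess at the message layout, not an official format, and parse_list is a hypothetical helper.

```python
import re

# The relevant tail of the error message, copied from the response above.
message = (
    "Experiments enabled for project: [enable_streaming_engine, "
    "enable_windmill_service, shuffle_mode=service], "
    "experiments requested for job: [use_runner_v2]"
)


def parse_list(label, text):
    """Extract the comma-separated items in the brackets following `label`."""
    match = re.search(re.escape(label) + r":\s*\[([^\]]*)\]", text)
    if match is None:
        return set()
    return {item.strip() for item in match.group(1).split(",") if item.strip()}


enabled = parse_list("Experiments enabled for project", message)
requested = parse_list("experiments requested for job", message)

# Experiments the job asked for that the project does not list as enabled.
missing = requested - enabled
print(sorted(missing))  # prints ['use_runner_v2']
```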
I have already set up a service account (with project owner permission) using the export GOOGLE_APPLICATION_CREDENTIALS= command.
Can someone help me find my mistake? Am I using Runner v2 incorrectly?
I would be really thankful if someone could briefly explain the difference between Runner v1 and Runner v2.
Thanks ... :)