I'm trying to create a Dataflow template that takes its input parameter as a RuntimeValueProvider. Following the example from the docs

import re

import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.options.pipeline_options import PipelineOptions

# [START example_wordcount_templated]
class WordcountTemplatedOptions(PipelineOptions):
  @classmethod
  def _add_argparse_args(cls, parser):
    # Use add_value_provider_argument for arguments to be templatable
    # Use add_argument as usual for non-templatable arguments
    parser.add_value_provider_argument(
        '--input', help='Path of the file to read from')
    parser.add_argument(
        '--output', required=True, help='Output file to write results to.')

pipeline_options = PipelineOptions(['--output', 'some/output_path'])
with beam.Pipeline(options=pipeline_options) as p:

  wordcount_options = pipeline_options.view_as(WordcountTemplatedOptions)
  lines = p | 'Read' >> ReadFromText(wordcount_options.input)
# [END example_wordcount_templated]

(taken directly from the official snippets) gives the following error when I try to create a template using the following command (with specifics filled in):

 python -m examples.mymodule \
    --runner DataflowRunner \
    --project YOUR_PROJECT_ID \
    --staging_location gs://YOUR_BUCKET_NAME/staging \
    --temp_location gs://YOUR_BUCKET_NAME/temp \
    --template_location gs://YOUR_BUCKET_NAME/templates/YOUR_TEMPLATE_NAME

  File "lib/python3.7/site-packages/apache_beam/options/value_provider.py", line 139, in _f
    raise error.RuntimeValueProviderError('%s not accessible' % obj)
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: None) not accessible

The docs also state that:

Some I/O connectors contain methods that accept ValueProvider objects. To determine support for I/O connectors and their methods, see the API reference documentation for the connector. The following I/O connectors accept runtime parameters:

File-based IOs: textio, avroio, tfrecordio

I'm not sure why the example code is giving errors. Can someone give me a hand?

For what it's worth I'm using:

apache-beam = {extras = ["gcp"], version = "^2.19.0"}
minikomi
  • This looks like the same issue as https://stackoverflow.com/questions/59940069/runtimevalueprovidererror-when-creating-a-google-cloud-dataflow-template-with-ap but that should've been fixed in 2.19.0. Can you try using 2.17.0 and check if it works? – AMargheriti Feb 17 '20 at 12:19
  • @AMargheriti I went with another workaround (using a lambda to get runtime vars), but I believe this is slated to be fixed in 2.20.0. I'll leave this open for now – minikomi Feb 20 '20 at 09:27
  • Ah you're right. I meant to say that it should be fixed in 2.20.0. – AMargheriti Feb 20 '20 at 13:02

1 Answer

This has been fixed in Beam 2.20.0, released on 4/15/2020.
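
To check whether an environment already has the fix, something like the sketch below may help. It assumes the fix version is 2.20.0 (as the comments on the question indicate) and Python 3.8+ for importlib.metadata. The helper names parse_version, has_fix and installed_beam_version are made up for this illustration and are not part of Beam.

```python
from importlib import metadata

# Assumption: the release that contains the fix, per the comments on the question.
FIXED_IN = (2, 20, 0)


def parse_version(text):
    """Turn '2.19.0' into (2, 19, 0). Suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(version_text):
    """True if the given version string is at or above FIXED_IN."""
    return parse_version(version_text) >= FIXED_IN


def installed_beam_version():
    """Return the installed apache-beam version string, or None if absent."""
    try:
        return metadata.version("apache-beam")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_beam_version()
    if version is None:
        print("apache-beam is not installed")
    else:
        print(version, "has the fix" if has_fix(version) else "needs upgrading")
```

Note that a constraint like "^2.19.0" allows 2.19.x to be resolved, so the installed version is worth checking rather than the constraint alone.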

As of that release, beam.io operations such as beam.io.ReadFromText(wordcount_options.input) accept a RuntimeValueProvider without raising at template-creation time. If you get the same error as in the question, try upgrading your Beam dependency to that release or later.
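
If upgrading is not an option, the workaround mentioned in the comments (reading the value inside a lambda) relies on deferring the read until the job runs. The sketch below is a toy model of that idea in plain Python, not Beam's implementation: DeferredValue and set_at_runtime are hypothetical names, and the "gs://bucket/" value is invented for the example.

```python
class DeferredValue:
    """Toy stand-in for an option whose value only exists at runtime."""

    def __init__(self, name):
        self.name = name
        self._value = None
        self._ready = False

    def set_at_runtime(self, value):
        # Hypothetical hook: in this model, the "runner" supplies the value.
        self._value = value
        self._ready = True

    def is_accessible(self):
        return self._ready

    def get(self):
        if not self._ready:
            raise RuntimeError("%s not accessible" % self.name)
        return self._value


prefix = DeferredValue("prefix")

# Construction time: build the function, but do not call prefix.get() yet.
add_prefix = lambda word: prefix.get() + word

# Reading the value now fails, because nothing has supplied it.
try:
    prefix.get()
    failed_early = False
except RuntimeError:
    failed_early = True

# Runtime: the value arrives, and the deferred read inside the lambda succeeds.
prefix.set_at_runtime("gs://bucket/")
result = add_prefix("file.txt")
```

In a real pipeline the analogous move would be to call .get() on the option inside a function that Beam executes per element (for example one passed to beam.Map), rather than while the pipeline is being constructed.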

minikomi