
Good Morning,

I created a Dataflow template that reads some data from BigQuery, applies some transformations, and writes the result to a new BigQuery table.

This template takes 2 parameters:

  • Input query
  • Project's name

I wanted to write the project's name to a BigQuery table through the 'WriteToBigQuery' transform, but instead of writing the project name supplied by the user, the pipeline fails with an error.

Do you know how I can get this value and write it, please?

Thank you for your help!

CODE:

    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            '--query',
            default='',
            help='q')
        parser.add_value_provider_argument(
            '--projet',
            default='',
            help='d')

[...]

    my_options = pipeline_options.view_as(BqReaderOptions).query
    myProjet = pipeline_options.view_as(BqReaderOptions).projet

    nb_val = (
        p
        | 'Readl' >> beam.io.ReadFromBigQuery(query=my_options, use_standard_sql=True)
        | beam.Map(lambda elem: elem == ' 0')
        | 'countVal' >> beam.combiners.Count.PerElement()
        | beam.Map(lambda elem: {"Nb": int(elem), 'projet': myProjet}))
                    



ERROR:

    default_encoder "Object of type '%s' is not JSON serializable" % type(obj).__name__) TypeError: Object of type 'RuntimeValueProvider' is not JSON serializable [while running 'writeToBigQuery1/BigQueryBatchFileLoads/ParDo(WriteRecordsToFile)/ParDo(WriteRecordsToFile)/ParDo(WriteRecordsToFile)']
amine

1 Answer


You're getting that error because your transform outputs the ValueProvider object itself, and the BigQuery write then attempts a default JSON encoding of it, which fails. What it looks like you intended, however, is to output the project as a string instead of the raw ValueProvider. You can read the details on how to use ValueProvider in your own functions, but basically the value only exists at runtime, so you need to make a DoFn object that holds the ValueProvider and call its get method inside process, like so:

    import apache_beam as beam

    class MyFn(beam.DoFn):
        def __init__(self, project):  # Pass in project as a ValueProvider
            self.project = project

        def process(self, elem):
            # get() resolves the runtime value; call it only inside process()
            yield {"Nb": int(elem), "project": self.project.get()}
Daniel Oliveira