I have a ModelBatchPredictOp component in my pipeline. This component generates 3 artifacts: batchpredictionjob, bigquery_output_table, and gcs_output_directory. The pipeline runs fine.

What I need is a way to access the tableId property of the bigquery_output_table artifact, so I can use it in the next component (a BigqueryQueryJobOp).

This is what I want: [image showing what I want]

The artifact's URI would work as well, since it contains the full path of the created table and I can extract the part I need from it.
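To make the URI idea concrete, here is a minimal sketch of the extraction step in plain Python. It assumes the URI looks like either `https://www.googleapis.com/bigquery/v2/projects/<project>/datasets/<dataset>/tables/<table>` or `bq://<project>.<dataset>.<table>`; I have not confirmed which shape the artifact actually carries. The names `TableRef` and `parse_bq_table_uri` are hypothetical helpers, not part of any library.

```python
import re
from typing import NamedTuple


class TableRef(NamedTuple):
    """Hypothetical container for the three parts of a BigQuery table path."""
    project: str
    dataset: str
    table: str


# Assumed REST-style shape: .../projects/<p>/datasets/<d>/tables/<t>
_REST_PATTERN = re.compile(r"projects/([^/]+)/datasets/([^/]+)/tables/([^/?#]+)")


def parse_bq_table_uri(uri: str) -> TableRef:
    """Extract project, dataset and table from a BigQuery table URI."""
    match = _REST_PATTERN.search(uri)
    if match is not None:
        return TableRef(*match.groups())

    # Assumed short form: bq://<project>.<dataset>.<table>
    if uri.startswith("bq://"):
        # Split from the right so a project ID containing dots stays intact.
        parts = uri[len("bq://"):].rsplit(".", 2)
        if len(parts) == 3 and all(parts):
            return TableRef(*parts)

    raise ValueError(f"Unrecognized BigQuery table URI: {uri!r}")
```

Something like `parse_bq_table_uri(uri).table` would then give the table ID. Note that this only works on a real string, so it would have to run inside a component at pipeline run time, not while the pipeline is being compiled.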

Below are my 2nd component, which creates the batch prediction, and my 3rd component, which should consume the outputs of the 2nd.

# Component to do the batch prediction
batch_predict_op = ModelBatchPredictOp(
    project=project_id,
    location=DEFAULT_VERTEX_REGION,
    instances_format='bigquery',
    predictions_format='bigquery',
    model=importer_spec.outputs['artifact'],
    job_display_name='teste_batch_predict',
    bigquery_source_input_uri=f'bq://{input_data_table_ref}',
    bigquery_destination_output_uri=f'bq://{output_bq}',
).after(input_data_table_op)

top_predictions_table_ref = f'{project_id}.{bigquery_dataset}.test'

# Component to create the table based on previous component
top_predictions_op = bq.BigqueryQueryJobOp(
    project_id,
    location=bigquery_job_location,
    query=predict_dataset.get_query(
        output_table=top_predictions_table_ref,
        source_table=batch_predict_op.outputs['bigquery_output_table'],
        query_name='query_top_100.sql',
        DEBUG=DEBUG,
    ),
).after(batch_predict_op)
