I have a ModelBatchPredictOp component in my pipeline. This component generates 3 artifacts: batchpredictionjob, big_query_table, and gcs_output_directory. The pipeline is running fine.
What I need is a way to access the tableId property of the big_query_table artifact, so I can use it in the next component (a BigqueryQueryJobOp).
That is what I want:
The URI would work as well, since it contains the full path of the created table and I can extract the part I need from it.
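To illustrate the kind of extraction I mean, here is a minimal sketch in plain Python. It assumes the artifact URI is either the REST-style path (.../projects/<project>/datasets/<dataset>/tables/<table>) or the bq://project.dataset.table form used elsewhere in my pipeline; I have not confirmed which form the artifact actually carries. The names TableRef and parse_bq_table_uri are hypothetical helpers of mine, not part of any library.

```python
import re
from typing import NamedTuple


class TableRef(NamedTuple):
    """Hypothetical container for the three parts of a BigQuery table path."""
    project: str
    dataset: str
    table: str


# Assumed REST-style form: .../projects/<p>/datasets/<d>/tables/<t>
_REST_PATTERN = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/datasets/(?P<dataset>[^/]+)"
    r"/tables/(?P<table>[^/]+)/?$"
)


def parse_bq_table_uri(uri: str) -> TableRef:
    """Split a BigQuery table URI into project, dataset, and table IDs.

    Handles the two URI shapes assumed above and raises ValueError
    for anything else instead of guessing.
    """
    match = _REST_PATTERN.search(uri)
    if match:
        return TableRef(match["project"], match["dataset"], match["table"])

    if uri.startswith("bq://"):
        # Simple split; project IDs that themselves contain dots
        # are not handled by this sketch.
        parts = uri[len("bq://"):].split(".")
        if len(parts) == 3 and all(parts):
            return TableRef(*parts)

    raise ValueError(f"Unrecognized BigQuery table URI: {uri!r}")
```

The catch is that this has to run while the pipeline executes, for example inside a small custom component that receives the artifact as an input and is given its uri, because the URI does not exist yet when the pipeline is compiled.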
Below are my 2nd component, which creates the batch prediction, and my 3rd component, which should consume the outputs of the 2nd.
# Component to do the batch prediction
batch_predict_op = ModelBatchPredictOp(
    project=project_id,
    location=DEFAULT_VERTEX_REGION,
    instances_format='bigquery',
    predictions_format='bigquery',
    model=importer_spec.outputs['artifact'],
    job_display_name='teste_batch_predict',
    bigquery_source_input_uri=f'bq://{input_data_table_ref}',
    bigquery_destination_output_uri=f'bq://{output_bq}',
).after(input_data_table_op)

top_predictions_table_ref = f'{project_id}.{bigquery_dataset}.test'

# Component to create the table based on previous component
top_predictions_op = bq.BigqueryQueryJobOp(
    project_id,
    location=bigquery_job_location,
    query=predict_dataset.get_query(
        output_table=top_predictions_table_ref,
        source_table=batch_predict_op.outputs['bigquery_output_table'],
        query_name='query_top_100.sql',
        DEBUG=DEBUG,
    ),
).after(batch_predict_op)