How to get the actual string value of a tfx.orchestration.data_types.RuntimeParameter during pipeline execution?

Hi,

I'm defining a runtime parameter such as data_root = tfx.orchestration.data_types.RuntimeParameter(name='data-root', ptype=str) to hold a base path. From it I derive subfolders for various components, for example str(data_root)+'/model' as the model serving path in tfx.components.Pusher().

This worked like a charm until I moved to tfx==1.12.0: str(data_root) now returns a JSON dump. To work around that, I tried defining a separate runtime parameter for the model path, model_root = tfx.orchestration.data_types.RuntimeParameter(name='model-root', ptype=str), and passing it to the Pusher component the way I saw in many tutorials:

pusher = Pusher(model=trainer.outputs['model'],
                model_blessing=evaluator.outputs['blessing'],
                push_destination=tfx.proto.PushDestination(
                    filesystem=tfx.proto.PushDestination.Filesystem(base_directory=model_root)))

but I get a TypeError saying that tfx.proto.PushDestination.Filesystem does not accept a RuntimeParameter.

This completely breaks the existing setup, because I receive those parameters from an external client for each Kubeflow run.

Thanks a lot for any help.

1 Answer

I was able to fix it.

First of all, the docstring does not make clear which parameters of Pusher can be a RuntimeParameter. I ended up reading the __init__ definition of the Pusher component, which shows that only push_destination can be a RuntimeParameter:

  def __init__(
      self,
      model: Optional[types.BaseChannel] = None,
      model_blessing: Optional[types.BaseChannel] = None,
      infra_blessing: Optional[types.BaseChannel] = None,
      push_destination: Optional[Union[pusher_pb2.PushDestination,
                                       data_types.RuntimeParameter]] = None,
      custom_config: Optional[Dict[str, Any]] = None,
      custom_executor_spec: Optional[executor_spec.ExecutorSpec] = None):

Then I defined the component accordingly, passing my RuntimeParameter directly as push_destination:

model_root = tfx.orchestration.data_types.RuntimeParameter(name='model-serving-location', ptype=str)
pusher = Pusher(model=trainer.outputs['model'],
                model_blessing=evaluator.outputs['blessing'],
                push_destination=model_root)

Since push_destination is expected to be a tfx.proto.pusher_pb2.PushDestination proto message, the value supplied when instantiating and launching a pipeline run has to follow that message's schema, serialized as a JSON string. The parameter should look like:

{'name': 'model-serving-location', 'value': '{"filesystem": {"base_directory": "path/to/model/serving/for/the/run"}}'}
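
For the client side, here is a minimal sketch of how that JSON string could be built from a base path, so the subfolder logic from the question moves out of the pipeline definition and into the caller. The helper name push_destination_value and the data_root value are hypothetical, and how the resulting dictionary is handed to the Kubeflow client depends on your setup; only the parameter name 'model-serving-location' and the filesystem/base_directory layout come from the answer above.

```python
import json
import posixpath


def push_destination_value(base_directory: str) -> str:
    """Serialize a filesystem push destination as a JSON string.

    The layout mirrors the PushDestination message fields used above:
    filesystem.base_directory.
    """
    return json.dumps({"filesystem": {"base_directory": base_directory}})


# Hypothetical base path received from the external client for this run.
data_root = "path/to/data/root"

# Derive the serving folder on the client instead of inside the pipeline.
model_dir = posixpath.join(data_root, "model")

# Runtime parameter values keyed by the RuntimeParameter name.
params = {"model-serving-location": push_destination_value(model_dir)}

print(params["model-serving-location"])
```

Building the string with json.dumps rather than by hand avoids quoting mistakes when the path contains characters that need escaping.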

Regards