
I have deployed a model on a batch endpoint, and it works when I create the job via the GUI, selecting the input and output data locations through the wizard.

I need to run it from a notebook. In particular, the Microsoft tutorial suggests using the "invoke" method to call the batch endpoint, which I do this way:

from azure.ai.ml import MLClient, Input
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

endpoint_name="endpoint-name" 
endpoint = ml_client.batch_endpoints.get(name = "loaded-models-endpoint")

The credentials are correct: through the ml_client I'm able to retrieve the endpoint. I continue by setting up the input object:

data_path_input = "folder/file.parquet"
datastore_input = ml_client.datastores.get(name="datastorename")
input_data = Input(type=AssetTypes.URI_FILE, path=f"{datastore_input.id}/paths/{data_path_input}")

print(input_data)

The result is:

{'type': 'uri_folder', 'path': 'datastore_id_path.../paths/folder/file.parquet'}

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    inputs={"deployement-name": input_data}
)

ValidationException: Invalid input path

I tried not using the dictionary but a plain input, since I have only a single deployment on that endpoint:

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    input=input_data
)

I also tried providing the URI to the file directly:

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    input=uri_to_file
)

In this case it raises:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Exception: BY_POLICY

1 Answer

  • The input path you provided is incorrect. Check that the path is correct and that the file exists at the specified location. You could also try the InputFileDatasetConfig class to create the input object instead of the Input class. For example:
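As a minimal sketch, the documented long-form azureml:// datastore URI may be less error-prone than concatenating the datastore's ARM resource ID; the datastore and file names below are the placeholders from the question:

# Long-form datastore URI: azureml://datastores/<datastore_name>/paths/<path_on_datastore>
input_data = Input(
    type=AssetTypes.URI_FILE,
    path="azureml://datastores/datastorename/paths/folder/file.parquet",
)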

  • You're using both inputs and input as parameter names. The correct parameter name is inputs (plural): use inputs when passing a dictionary of multiple inputs, or a single input, to the invoke method.

from azure.ai.ml import MLClient, Input, Output
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
endpoint_name = "endpoint-name"
endpoint = ml_client.batch_endpoints.get(name="loaded-models-endpoint")

data_path_input = "folder/file.parquet"
datastore_input = ml_client.datastores.get(name="datastorename")
input_data = Input(type=AssetTypes.URI_FILE, path=f"{datastore_input.id}/paths/{data_path_input}")

inputs = {"deployement-name": input_data}

job = ml_client.batch_endpoints.invoke(endpoint_name=endpoint.name, inputs=inputs)


  • BY_POLICY: this could be due to a security policy blocking the request, so check whether any such policy applies to the identity you're authenticating with. You can also monitor the job returned by invoke to confirm it runs to completion, as in the sketch below.
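A minimal sketch, assuming you want the notebook to block until the scoring job finishes (ml_client.jobs.stream prints the job's logs until it reaches a terminal state):

# Invoke the endpoint, then stream the job's logs to follow it to completion.
job = ml_client.batch_endpoints.invoke(endpoint_name=endpoint.name, inputs=inputs)
ml_client.jobs.stream(job.name)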

  • Setting the output for a batch endpoint:

output_folder_path = "output_folder"  # relative path to the output folder on the datastore
output_datastore = ml_client.datastores.get(name="datastorename")
output = Output(type=AssetTypes.URI_FOLDER, path=f"{output_datastore.id}/paths/{output_folder_path}")

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    inputs=inputs,
    outputs={"score": output},
)
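Note that invoke takes outputs the same way it takes inputs, as a dictionary keyed by output name. The key "score" follows the Azure ML docs, where a model batch deployment writes its predictions to a named output called score; if your deployment defines a different output name, use that key instead.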

Reference:

  1. Run batch endpoints from Azure Data Factory
– Suresh Chikkam
  • the "input"/"inputs" point is correct, but it is noted that in case of single deployment on endpoint even input is acceptable. And it works now, the problem was that i have not registrated the endpoint i nthe active directory register app, From it I got the credentials to instantiate teh EnvironmentCredentials. Another thing, is it possible set the output? I use the "output" argument with an Output object to a specific folder, but it is ignored (no errors) – Alberto Bertoncini Jun 26 '23 at 09:15
  • Great to hear @AlbertoBertoncini that registering the endpoint in AAD resolved it. It's possible to specify the output location using the outputs parameter: set the Output object's path in the format `datastore_id_path.../output_folder`. Please recheck my solution, which was just updated. – Suresh Chikkam Jun 26 '23 at 11:14