I am attempting to host a Python MLflow model using Databricks model serving. While the serving endpoint functions correctly without private Python packages, I am encountering difficulties when attempting to include them.
Context:
- Without Private Packages: The serving endpoint works fine.
- With Private Packages: The only approach that works is setting --index-url to my private PyPI server, as detailed in this answer.
I want to avoid storing my token for the private PyPI index in plain text. Since init scripts are not supported with model serving, I don't know how to inject the token as a secret at build time. (Is this even possible?)
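For illustration, the only configuration that currently works for me looks roughly like the sketch below (the index URL, token, and package pin are placeholders), and it is exactly what I want to avoid, since the token ends up in plain text in the logged requirements:
mlflow.pyfunc.log_model(
    "hello-world",
    python_model=model,
    registered_model_name="hello-world",
    pip_requirements=[
        # Token embedded directly in the index URL -- this is the plain-text problem.
        "--index-url https://user:MY_TOKEN@my-private-pypi.example.com/simple",
        "private_package==0.1.10",
    ],
)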
Attempted Solution:
Following this tutorial, I built the .whl files, uploaded them to DBFS, and listed them in pip_requirements in mlflow.pyfunc.log_model. Unfortunately, the file on DBFS cannot be found at build time, which prevents the endpoint from being created.
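Roughly how the wheel ended up on DBFS (sketched here as a notebook copy with a placeholder local source path; the exact upload mechanism shouldn't matter, the DBFS target is the real path used below):
# Copy the locally built wheel onto DBFS (local source path is a placeholder).
dbutils.fs.cp(
    "file:/tmp/private_package-0.1.10-py3-none-any.whl",
    "dbfs:/FileStore/tables/private_package-0.1.10-py3-none-any.whl",
)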
Code:
Here's how I'm logging the model:
mlflow.pyfunc.log_model(
    "hello-world",
    python_model=model,
    registered_model_name="hello-world",
    pip_requirements=[
        "/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl"
    ],
)
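To double-check what actually gets logged, I believe the generated requirements can be inspected with something like the following (the run ID is a placeholder); as far as I can tell, the /dbfs/... path is written into requirements.txt verbatim:
import mlflow

# Placeholder run ID: prints the requirements.txt that the serving build installs from.
reqs_path = mlflow.pyfunc.get_model_dependencies("runs:/<run_id>/hello-world")
with open(reqs_path) as f:
    print(f.read())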
I have tried different paths in pip_requirements, and I have verified that the file exists on DBFS through both the Databricks CLI and a notebook. In pip_requirements I have tried:
- /dbfs/FileStore...
- dbfs/FileStore...
- /dbfs:/FileStore...
- dbfs:/FileStore...
Command to view the package in a Databricks notebook (this command works; I can see the file):
dbutils.fs.ls("dbfs:/FileStore/tables/private_package-0.1.10-py3-none-any.whl")
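As an extra sanity check, the FUSE path can also be probed from the notebook (I would expect True on the interactive cluster, even though the serving build apparently cannot see it):
import os

# Same path that pip later fails to find during the serving image build.
print(os.path.exists("/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl"))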
Error:
The build logs from the Databricks serving endpoint show the following error:
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/private_package-0.1.10-py3-none-any.whl'
CondaEnvException: Pip failed
My hypothesis is that there is a permission issue and that Databricks model serving might not have access to DBFS. Being new to Databricks, I am unsure how to debug this. Any guidance or insights on how to resolve this issue would be greatly appreciated!