Questions tagged [mlops]

This tag is for programming questions about MLOps, which is the application of DevOps principles in the design and deployment of Machine Learning (ML) systems.

Related tags

  • mlflow
  • kubeflow
  • feature-store
228 questions
0
votes
1 answer

Automate batch predictions with VertexAI pipeline and Kubeflow component

The code below loads a model already trained in VertexAI and runs a pipeline for batch predictions. However, I get a JSON decoder error whose source I cannot figure out. The input file is in JSONL format and it works fine if I…
Annalix
0
votes
2 answers

Does MLflow allow logging artifacts from remote locations like S3?

My setting: I have developed an environment for ML experiments in which training happens in the AWS cloud with SageMaker Training Jobs. The trained model is stored in the /opt/ml/model directory, which is reserved by SageMaker…
Javier Beltrán
0
votes
1 answer

How does autologging work in MLOps platforms like Comet or MLflow?

I was wondering how logging is implemented such that you only need to create an experiment object from comet_ml and it auto-detects and reports all the statistics of the trained experiment. Is there some sort of logging system used?
0
votes
1 answer

MLflow log_model: not able to predict with spark_udf, but it works with Python

I want to log a model on MLflow. Once I do, I am able to predict probabilities with the model loaded in Python but not with spark_udf. The thing is, I still need to have a preprocessing function inside the model. Here is a toy reproducible…
Tom
0
votes
1 answer

Logging the git_sha as a parameter on MLflow using Kedro hooks

I would like to log the git_sha parameter on MLflow as shown in the documentation. It appears to me that simply running the following portion of code should be enough to get git_sha logged in the MLflow UI. Am I right? @hook_impl def…
Downforu
0
votes
1 answer

How to increase scheduler memory in GKE for Dask

I have deployed a Kubernetes cluster on GCP with a combination of Prefect and Dask. The jobs run fine in a normal scenario but fail to scale to twice the data. So far, I have narrowed it down to the scheduler getting shut off due to high…
0
votes
1 answer

Azure Data Lake Storage Gen2 (ADLS Gen2) as a data source for Kedro pipeline

According to Kedro's documentation, Azure Blob Storage is one of the available data sources. Does this extend to ADLS Gen2? I haven't tried Kedro yet, but before I invest some time in it, I wanted to make sure I could connect to ADLS Gen2. Thank you…
Downforu
0
votes
3 answers

Azure ML release bug AZUREML_COMPUTE_USE_COMMON_RUNTIME

On 2021-10-13, our application on the Azure ML platform started getting a new error that causes failures in pipeline steps: Python module import failures, along with a warning stack (the warning that leads to the pipeline runtime error). We needed to set it to false. Why is…
0
votes
1 answer

How to separate development and production requirements.txt for a Machine Learning project?

I'm looking for a better AI/ML project code structure. I know that cookiecutter is there and I really like it. Here is the problem: I want my Jupyter Notebook added to the project structure like cookiecutter. But when I want to deploy the model and…
0
votes
1 answer

Error in deploying PyTorch model using SageMaker Pipeline and RegisterModel

Can anyone provide an example for deploying a PyTorch model using SageMaker Pipeline? I've used the MLOps template (MLOps template for model building, training and deployment) of SageMaker Studio to build an MLOps project. The template is using…
user_5
0
votes
1 answer

MLOps monitoring with QuickSight

We currently have 3 machine learning models in production in our team (2 classifiers and one time-series model). SageMaker Studio with SageMaker model monitoring wasn't the right option for us because of our CI/CD architecture. So now we have an ECS…
JanBennk
0
votes
2 answers

MLflow PyTorch model

I have a trained YOLO model in model.pt format, and I am able to upload the model to create an artifact in MLflow. However, when I look at the YAML file, it has a few dependencies listed. I am sure that I am loading in the wrong…
0
votes
1 answer

ClearML: how to get configurable hyperparameters?

How do I get args like epochs to show up in the UI configuration panel under hyperparameters? I want to be able to change the number of epochs and the learning rate from within the UI.
0
votes
1 answer

Do we need the dataset on each worker when using ParameterServerStrategy?

The ParameterServerTraining tutorial from the TensorFlow API has the following snippet of code in the model.fit section: def dataset_fn(input_context): global_batch_size = 64 batch_size =…
0
votes
1 answer

SageMaker is not authorized to perform: iam:PassRole

I'm following the automate_model_retraining_workflow example from SageMaker examples, and I'm running it in an AWS SageMaker Jupyter notebook. I followed all the steps given in the example for creating the roles and policies. But when I try to run…