Questions tagged [mlops]

This tag is for programming questions about MLOps, the application of DevOps principles to the design and deployment of Machine Learning (ML) systems.

Related tags

  • mlflow
  • kubeflow
  • feature-store
228 questions
2
votes
1 answer

DVC experiments with large data in kubernetes

We have a computer vision project. The raw data is stored in S3. Every day the labeling team sends a new increment of labeled data. We want to automate the training process with this new data. We use DVC for reproducing pipelines and MLflow for logging and deploying…
RazDva
  • 21
  • 3
2
votes
2 answers

Disk Full Error When running Azure ML Jobs using Custom Environments from DevOps

I get a disk full error while running a model training job using the Azure ML SDK launched from Azure DevOps. I created a custom environment inside the Azure ML Workspace and used it. I am using Azure CLI tasks in Azure DevOps to launch these training…
Imperial_J
  • 306
  • 1
  • 7
  • 23
2
votes
1 answer

How do I set Dask autoscaling using Iguazio?

I need to create a new Dask cluster in Iguazio. I want to take advantage of Dask's autoscaling features that are described here: https://docs.dask.org/en/stable/how-to/adaptive.html Does Iguazio support Dask cluster autoscaling and, if so, how do I…
Brennan
  • 39
  • 1
  • 5
2
votes
0 answers

Getting Provisioning failed for CloudFormation when creating a project in SageMaker Studio

I am testing out MLOps using SageMaker Studio and am creating a project using a template for MLOps provided by SageMaker: MLOps template for model building, training, and deployment with third-party Git repositories using CodePipeline. I am getting…
2
votes
0 answers

How to set custom path for Databricks MLflow artifacts on S3

I've created an empty experiment from the Databricks experiments console and given the path for my artifacts on S3, i.e. s3:///. When I run the scripts, the artifacts are stored at s3:////<32 char id>/artifacts/model-Elasticnet/model.pkl. I want…
shahidammer
  • 1,026
  • 2
  • 10
  • 24
2
votes
0 answers

Custom MLflow scoring_server for model serving

I would like to know whether MLflow currently supports any customization of its scoring_server that would allow registering new endpoints on the published REST API. By default the scoring server provides /ping and /invocations…
jarey
  • 323
  • 2
  • 8
2
votes
0 answers

How to track the big data stored in Gdrive through DVC?

I am currently working on an ML project with around 10 GB of data. The data is stored in Google Drive, and it's impossible for me to download it to my local machine. So how can I use DVC (Data Version Control) to track that data? Thank you in…
dave vedant
  • 329
  • 2
  • 4
  • 11
2
votes
1 answer

Vertex AI - No module named 'google_cloud_pipeline_components.remote' on ModelDeployOp(...)

I have created a simple pipeline that trains a model and deploys it to a Vertex AI endpoint. I have noticed that while attempting to deploy the model using the google_cloud_pipeline_components.aiplatform.ModelDeployOp() component, it returns an…
2
votes
0 answers

Error can't get attribute Net when saving PyTorch model with MLflow

After installing MLflow using one-click-mlflow, I save a PyTorch model using the default command that I found in the user guide. You can find the command below: mlflow.pytorch.log_model(net, artifact_path="model", pickle_module=pickle) The neural…
ucsky
  • 442
  • 6
  • 13
2
votes
1 answer

Diagnose crashes of FiftyOne app – logs or other tools

We need to make a FiftyOne instance available to multiple users via a web browser. We need to start a process and have it run, even after we log off from the session that initiated the app processes. I’m using the following command to start the…
Cola
  • 2,097
  • 4
  • 24
  • 30
2
votes
0 answers

Kubeflow: Notebook server stuck on loading

Whenever I try to create a Kubeflow notebook server to build a pipeline from a Jupyter notebook, it keeps loading forever without displaying any error. I'm currently using a Kubeflow dashboard that's already up and running on a server, so I didn't…
2
votes
1 answer

Waiting for nodes to finish in Kedro

I have a pipeline in Kedro that looks like this: from kedro.pipeline import Pipeline, node from .nodes import * def foo(): return Pipeline([ node(a, inputs=["train_x", "test_x"], outputs=dict(bar_a="bar_a"), name="A"), node(b,…
João Areias
  • 1,192
  • 11
  • 41
2
votes
1 answer

Kubeflow-kale: How to integrate the kubeflow-kale extension to run pipelines on a separate standalone cluster of Kubeflow Pipelines

I am currently trying to use the Kubeflow Kale Jupyter extension on my local JupyterLab server, without Kubernetes and Kubeflow installed, and trying to run my code pipeline on a GCP AI pipeline server or any other cloud Kubeflow pipeline server. I am…
1
vote
3 answers

How to pass only necessary features to pipeline after SelectKBest

I have a regular tabular dataset with 100 features added from the database. I want to feed it into a regular sklearn.pipeline in which there will be preprocessing, encoding, some custom transformers, etc. The penultimate estimator would be…
1
vote
1 answer

How to integrate stable_baselines3 with dagshub and MLflow?

I am trying to integrate stable_baselines3 with DagsHub and MLflow. I am new to MLOps. Here is some sample code that is easy to run: import mlflow import gym from gym import spaces import numpy as np from stable_baselines3 import PPO import…
TheGainadl
  • 523
  • 1
  • 6
  • 14