Questions tagged [mlops]

This tag is for programming questions about MLOps, which is the application of DevOps principles in the design and deployment of Machine Learning (ML) systems.

Related tags

  • mlflow
  • kubeflow
  • feature-store
228 questions
0
votes
1 answer

In an Iguazio cluster, when initializing a Spark session without any configuration changes, what is the default spark.executor.memory?

When initializing a Spark session without any configuration changes, what is spark.executor.memory set to by default?
xsqian
  • 199
  • 5
  • 13
0
votes
0 answers

Heroku deployment of NLP model shows an error (app runs fine locally)

I have deployed my Flask app (an NLP model) on Heroku. It was basically a price prediction model in which some columns were in Japanese, to which I applied NLP and the Nagisa library for tokenization, and some columns were numerical data. I pickled vectorizers and…
0
votes
1 answer

How do I use secrets in Iguazio?

How do I define secrets in Iguazio? Are they cluster-wide or just at the project level? Once the secret is defined, how do I use it in my jobs? What about distributed jobs that have many pods/workers like Spark and Dask, can I use secrets with those…
Brennan
  • 39
  • 1
  • 5
0
votes
1 answer

Where are my monitoring metrics in Iguazio?

Iguazio is supposed to monitor models as well as resource metrics across the cluster, such as resource usage per pod or service. When I open the Grafana service I only see a few pre-built dashboards for model monitoring and Nuclio functions…
Brennan
  • 39
  • 1
  • 5
0
votes
1 answer

Pod "no2-pipeline-x5kpd-2954674781" is invalid: spec.volumes[3].name: Duplicate value: "no2-pvc"

Hi, I am trying to run a Kubeflow pipeline. Two steps will run in parallel and dump data to two different folders of a PVC, then the third component will collect data from those two folders, merge it, and dump the merged data to another PVC…
0
votes
1 answer

Rate limiting Kedro API requests

I have a few datasets from the government that I'm using in my ML model. The problem is that their server is not that great, to put it nicely. Whenever I run my pipeline and pull from their API all at once, their server goes down for a few…
João Areias
  • 1,192
  • 11
  • 41
0
votes
1 answer

Can I use credentials in Nuclio functions?

I am using Nuclio functions and I need to provide credentials in the function for things like accessing a database. Is there a way to store these credentials securely (not in plain text)?
Brennan
  • 39
  • 1
  • 5
0
votes
1 answer

Iguazio job is stuck on 'Pending' status

I have a job I am running in Iguazio. It starts and then the status is "Pending" and the icon is blue. It stays like this indefinitely and there is nothing in the logs that describes what is going on. How do I fix this?
Brennan
  • 39
  • 1
  • 5
0
votes
2 answers

Kubeflow Pipeline Training Component Failing | Unknown return type:

I am running an ML pipeline and the training component/step (see code below) continues to fail with the following error: "RuntimeError: Unknown return type: . Must be one of str, int, float, a subclass of Artifact, or a…
0
votes
1 answer

How do I log a model with metrics and plots in MLRun?

I'm training a model using MLRun and would like to log the model using experiment tracking. What kinds of things can I log with the model? I'm specifically looking for metrics (e.g. accuracy, F1) and plots such as loss over time.
Nick Schenone
  • 209
  • 1
  • 7
0
votes
0 answers

Save matplotlib to variable rather than save as a file

I want to save a matplotlib figure to a variable so I can pass it to a separate script and save it using output parameters defined there. It is not possible to have this output directory in the script that makes the plots.
0
votes
1 answer

When would I use Spark Operator vs Spark Standalone in Iguazio?

I see in the services UI that I can create a Spark cluster. I also see that I can use the Spark operator runtime when executing a job. What is the use case for each and why would I choose one vs the other?
Nick Schenone
  • 209
  • 1
  • 7
0
votes
1 answer

Can I use Iguazio to serve a model on a REST API?

Does Iguazio support Flask, FastAPI, or any other framework to serve models? And how do I secure the endpoints?
0
votes
1 answer

Spark job fails on image pull in Iguazio

I am using the code examples in the MLRun documentation for running a Spark job on the Iguazio platform. The docs say I can use a default Spark Docker image provided by the platform, but when I try to run the job the pod hangs with the error ImagePullBackOff. Here…
Brennan
  • 39
  • 1
  • 5
0
votes
0 answers

How can I pass request body for sklearn model, decision tree classifier in AWS Sagemaker while invoking endpoint?

I am trying to invoke a multi-model endpoint in AWS Sagemaker and I am getting a ModelError -ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "invalid literal…