Questions tagged [mlops]

This tag is for programming questions about MLOps, which is the application of DevOps principles in the design and deployment of Machine Learning (ML) systems.

Related tags

  • mlflow
  • kubeflow
  • feature-store
228 questions
0 votes, 0 answers

I am struggling to write a transform Python file for the Transform component of a TFX pipeline. I am working on an object detection task

I am building a pipeline using TensorFlow Extended. The dataset is already converted to TFRecord for ingestion. I want to do data preprocessing such as scaling and data augmentation, along with the bounding boxes, and prepare the data for the object detection…
0 votes, 0 answers

mlflow: INVALID_PARAMETER_VALUE Parameters can not be modified after a value is set

I get the following exception raised, please help: mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Response: { 'Error': { 'Code': 'ValidationError', 'Severity': None, 'Message': 'Parameters can not be modified after a value…
Kaushik J
  • 962
  • 7
  • 17
0 votes, 1 answer

Training in Azure ML using a Docker container which already has a training script

I was looking into training in Azure ML using a custom Docker container which already has a training script, but so far I haven't found anything in the docs. Is it possible to upload a custom container (containing the training script) to the…
0 votes, 1 answer

Issue during import, MLRunNotFoundError

I installed the Python package MLRun correctly, but I got this error in Jupyter: --------------------------------------------------------------------------- HTTPError Traceback (most recent call…
JIST
  • 1,139
  • 2
  • 8
  • 30
0 votes, 1 answer

Basic question on downloading data for a Kubeflow pipeline

I'm a newbie to Kubeflow and have just started exploring. I've set up a MicroK8s cluster and Charmed Kubeflow. I have executed a few examples to understand the different components. Now I'm trying to set up a pipeline from scratch for a classification…
0 votes, 0 answers

Vetiver Error when connecting to board: Error in order(results$name) : argument 1 is not a vector

Summary/Context: I am attempting to deploy a model from R to Google Cloud. I am following these steps/tutorial from the vetiver GitHub. I am using the sample data provided in the tutorial, except I am attempting to deploy the model to Google Cloud,…
Jaskeil
  • 1,044
  • 12
  • 33
0 votes, 0 answers

MLflow run with Docker environment raises error "repository name must be lowercase"

I have built a Docker image for my docker_env image in MLflow. However, when I run "mlflow run ." the error "docker: invalid reference format: repository name must be lowercase" comes up despite using lowercase for the repository name for both local and…
Andrewchi
  • 11
  • 1
0 votes, 2 answers

Reaching batch prediction quota limit when not submitting that many batch predictions

I'm using Vertex AI batch predictions using a custom XGBoost model with Explainable AI using Shapley values. The explanation part is quite computationally intensive so I've tried to split up the input dataset into chunks and submit 5 batch…
0 votes, 0 answers

Triggering training whenever new data is received in a specific folder (e.g. my_data)

I am having an issue. I have completed my TensorFlow code for image classification on the MNIST dataset. It is working perfectly. But I want to add some improvements to it. Let's say I have a folder named my_data. In this folder I have only…
sameer
  • 23
  • 6
0 votes, 1 answer

Loading pandas data frame from pickle file in S3 bucket to AWS Lambda - problem with type

I created a machine-learning model with a KNN classifier. Then, I made a pickle file of the test dataset and uploaded it to the AWS S3 bucket using AWS SDK. For testing purposes, I have downloaded it and tested the type with the following: with…
AbelAI
  • 25
  • 3
0 votes, 1 answer

Shared Python Packages Among Docker Containers

I have multiple Docker containers hosting Flask apps that run machine learning services. Let's say container 1 uses PyTorch, and container 2 also uses PyTorch. When I build the images, both copies of PyTorch take up space on disk. For some…
0 votes, 1 answer

Metrics for monitoring LDA Model

We use LDA for topic modelling in production. I was wondering if there are any metrics we could use to monitor the quality of this model, to understand when the model starts to perform poorly and we need to retrain it (for example, if we have too…
ulie
  • 15
  • 3
0 votes, 1 answer

Deploying NLP model to AWS for beginners

I have the task of optimizing search on the website. The search should find both pictures and text from a text query. I have already developed, trained, tested and selected a machine learning model that transforms images and text into a feature vector…
0 votes, 3 answers

Kedro - Getting path to item in the datacatalog

I'm training an NLP model using spaCy. I have the preprocessing steps all written as a pipeline, and now I need to do the training. According to spaCy's documentation, I need to run the following command: python -m spacy train config.cfg --output…
João Areias
  • 1,192
  • 11
  • 41
0 votes, 1 answer

How to use the same code for machine learning "data transformations" before prediction in both research & production

I wonder what the best practice is for taking, for example, a Jupyter notebook that contains the whole flow from EDA to prediction and using the same code for "data transformations from the raw data" through to predictions, in case that…
george k
  • 1
  • 1