Recently I developed an ML model for a classification problem, and now I would like to put it into production to classify actual production data. While exploring, I came across two methods, deploying and serving an ML model. What is the basic difference between them?
- Deploying is the process of putting the model onto the server. Serving is the process of making the model accessible from the server (for example via a REST API or web sockets). – pplonski Apr 09 '21 at 15:02
1 Answer
Based on my own readings and understanding, here's the difference:
Deploying = creating a server/API (e.g. a REST API) around your model so that it can predict on new, unlabelled data.
Serving = running a server that is specialized for hosting prediction models. The idea is that one serving process can host multiple models and route different requests to them.
Basically, if your use case requires deploying multiple ML models, you might want to look at a serving framework like TorchServe. But if it's just one model, for me, Flask is already good enough (see the sketch below).
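For illustration, here is a minimal sketch of the single-model Flask approach. The model file name (model.pkl), the /predict route, and the expected JSON shape are all assumptions for this example, not a fixed convention:

```python
# Minimal sketch: "deploying" one trained model behind a Flask REST API.
# Assumes a scikit-learn-style classifier pickled to model.pkl (illustrative name).
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained model once at startup.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[5.1, 3.5, 1.4, 0.2]]}.
    features = request.get_json()["features"]
    predictions = model.predict(features).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A serving framework like TorchServe plays a similar role but hosts many models in one process and exposes a separate endpoint per model, so you don't have to write this HTTP layer yourself.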
– Tony
- I found this on the internet: "Model Serving allows you to host machine learning models from Model Registry as REST endpoints that are updated automatically based on the availability of model versions and their stages." So does that mean deploying = REST API and serving = REST endpoints? – alex3465 Apr 09 '21 at 10:11
- Both deployment and serving can have a REST API (or endpoint). Deployment doesn't necessarily require a REST API (any API is okay). – Tony Apr 09 '21 at 11:46
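- To illustrate the point above: from the client's side, calling a deployed model and a served model looks much the same, since both expose HTTP endpoints. A minimal sketch, assuming the Flask app from the answer is running locally (the TorchServe URL pattern is its standard inference API, shown only for comparison):

```python
# Minimal client sketch; assumes the Flask app above runs on localhost:5000.
import requests

# Single-model deployment (the Flask sketch in the answer):
resp = requests.post(
    "http://localhost:5000/predict",
    json={"features": [[5.1, 3.5, 1.4, 0.2]]},
)
print(resp.json())  # e.g. {"predictions": [0]}

# Multi-model serving: TorchServe exposes one endpoint per hosted model,
# e.g. POST http://localhost:8080/predictions/<model_name>
```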