
I was thinking about how one should deploy multiple models for serving. I am currently working with TensorFlow, and I was referring to this and this article.

But I am not able to find any article that addresses serving several models in a distributed manner.

Q.1: Does TensorFlow Serving only serve models from a single machine? Is there any way to set up a cluster of machines running TensorFlow Serving, so that multiple machines serve the same model (working somewhat as master and slave, or load-balancing between them) while serving different models?

Q.2: Does similar functionality exist for other deep learning frameworks, say Keras, MXNet, etc. (i.e. not restricted to TensorFlow, and serving models from different frameworks)?

MsA

1 Answer


A1: Serving TensorFlow models in a distributed fashion is made easy with Kubernetes, a container orchestration system that takes much of the pain of running a distributed system away from you, including load balancing. Please check the serving with Kubernetes tutorial.
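To make that concrete: you run several identical TensorFlow Serving replicas (e.g. the tensorflow/serving Docker image) behind a Kubernetes Service, which gives clients a single stable endpoint while Kubernetes load-balances requests across the replicas. A minimal client-side sketch, assuming a hypothetical Service address `tf-serving.example.com`, TF Serving's default REST port 8501, and a hypothetical model named `my_model`:

```python
# Minimal sketch: query TensorFlow Serving replicas through a single
# Kubernetes Service endpoint. Host, model name, and input shape are
# hypothetical; adjust them to your deployment.
import json
import requests

SERVING_HOST = "tf-serving.example.com"  # hypothetical Service address
MODEL_NAME = "my_model"                  # hypothetical model name

# TensorFlow Serving's REST API: POST /v1/models/<name>:predict
url = "http://%s:8501/v1/models/%s:predict" % (SERVING_HOST, MODEL_NAME)
payload = {"instances": [[1.0, 2.0, 3.0, 4.0]]}  # one 4-float example

resp = requests.post(url, data=json.dumps(payload))
resp.raise_for_status()
print(resp.json()["predictions"])
```

Since each replica is stateless and serves the same SavedModel, Kubernetes can distribute requests among them freely. Serving several different models then just means running one Deployment/Service pair per model, or pointing a single server at a model config file that lists several models.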

A2: Sure, check for instance PredictionIO. It's not specific to deep learning, but it can be used to deploy models built with e.g. Spark MLlib.
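If you go that route, a deployed PredictionIO engine exposes an HTTP query endpoint (port 8000 by default after `pio deploy`), which you can hit from the official Python SDK. A minimal sketch, assuming a hypothetical recommendation engine already deployed on localhost whose queries take a user id and a result count:

```python
# Minimal sketch: query a deployed PredictionIO engine via the official
# Python SDK (pip install predictionio). The engine URL and the query
# fields are hypothetical; they depend on the engine template you deployed.
import predictionio

engine = predictionio.EngineClient(url="http://localhost:8000")

# Query fields are defined by the engine template; a recommendation
# template typically takes a user id and how many items to return.
result = engine.send_query({"user": "u1", "num": 4})
print(result)
```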

Lukasz Tracewski