So, I have developed a chatbot-based application composed of multiple services (several Node.js servers + Flask servers), dockerized and deployed as Kubernetes pods on minikube with the Ingress-Nginx controller. The problem I am facing is that my Chatbot service requires GPU support, which I am unable to provide with minikube. Is there any way or approach through which I can use a GPU for this service, with minimal changes to my current architecture?
Here is a more detailed explanation of my current architecture:
Client - React service with server-side rendering.
Auth Service - handles authentication and session creation. Node.js app.
Profanity Service - sanitizes data. Flask app; no GPU needed here.
Communication Service - stores data and responses and handles communication between the different services. Node.js app.
Chatbot Service - Hugging Face LLM model served by a Flask app, for generating responses. This is where I need a GPU.
Note: All services are deployed as Kubernetes pods in my local minikube cluster with the Ingress-Nginx controller.
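For reference, this is roughly what I understand the Chatbot Service Deployment would need if GPU scheduling were available in the cluster. It assumes the NVIDIA device plugin is running; the names, image, and port below are just placeholders, not my real manifest:

```yaml
# Sketch of the Chatbot Service Deployment with a GPU request.
# Assumes the NVIDIA device plugin is installed in the cluster;
# image name and port are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatbot-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatbot-service
  template:
    metadata:
      labels:
        app: chatbot-service
    spec:
      containers:
        - name: chatbot
          image: chatbot-service:latest   # placeholder image name
          ports:
            - containerPort: 5000         # Flask app port
          resources:
            limits:
              nvidia.com/gpu: 1           # one GPU for the Hugging Face LLM
```

The rest of the services (Auth, Profanity, Communication, Client) would stay unchanged; only this one Deployment needs the GPU resource, which is why I'm hoping the fix doesn't require restructuring everything.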