So, I have developed a chatbot-based application composed of multiple services (several Node.js servers + Flask servers), dockerized and deployed as Kubernetes pods on minikube with the Ingress-Nginx controller. The problem I am facing is that my Chatbot service requires GPU support, which I am unable to provide with minikube. Is there any way or approach through which I can use a GPU for this service, with minimal changes to my current architecture?
Here is a more detailed explanation of my current architecture:
Client - React service with server-side rendering.
Auth Service - handles authentication and session creation. Node.js app.
Profanity Service - sanitizes data. Flask app; no GPU needed here.
Communication Service - stores data and responses and handles communication between the different services. Node.js app.
Chatbot Service - Hugging Face LLM model served by a Flask app, for generating responses. This is where I need a GPU.
Note: All services are deployed as Kubernetes pods in my local minikube cluster with the Ingress-Nginx controller.
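For reference, this is roughly what I understand the Chatbot Service Deployment would need if GPU scheduling were available in the cluster. It assumes the NVIDIA device plugin is running; the names, image, and port below are just placeholders, not my real manifest:

```yaml
# Sketch of the Chatbot Service Deployment with a GPU request.
# Assumes the NVIDIA device plugin is installed in the cluster;
# image name and port are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatbot-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatbot-service
  template:
    metadata:
      labels:
        app: chatbot-service
    spec:
      containers:
        - name: chatbot
          image: chatbot-service:latest   # placeholder image name
          ports:
            - containerPort: 5000         # Flask app port
          resources:
            limits:
              nvidia.com/gpu: 1           # one GPU for the Hugging Face LLM
```

The rest of the services (Auth, Profanity, Communication, Client) would stay unchanged; only this one Deployment needs the GPU resource, which is why I'm hoping the fix doesn't require restructuring everything.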