I am developing an API using Django that serves an AI model, but prediction is slow, and I have many requests that take a long time to execute through the AI API. I need help handling multiple requests at the same time.
If your AI model is stateless, meaning a new request can be processed independently of previous requests, then you can run multiple instances of your AI model. You can use a Deployment with multiple replicas, and then use a Service to load-balance between the instances.
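A minimal sketch of what that could look like as Kubernetes manifests; the image name, the container port 8000, and the replica count of 3 are placeholders you would replace with your own values:

```yaml
# Sketch only: adapt image, port, and replica count to your setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-model-api
spec:
  replicas: 3                      # run 3 identical instances of the model server
  selector:
    matchLabels:
      app: ai-model-api
  template:
    metadata:
      labels:
        app: ai-model-api
    spec:
      containers:
        - name: ai-model-api
          image: your-registry/ai-model-api:latest   # placeholder image name
          ports:
            - containerPort: 8000                    # port your Django server listens on
---
apiVersion: v1
kind: Service
metadata:
  name: ai-model-api
spec:
  selector:
    app: ai-model-api              # matches the pods created by the Deployment above
  ports:
    - port: 80
      targetPort: 8000
```

The Service distributes incoming requests across all pods carrying the `app: ai-model-api` label, so increasing `replicas` raises throughput without changing the Django code.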

Emruz Hossain
- Can I use Kubernetes for this? – Ayaz Khan Mar 21 '22 at 16:36
- Yes. By the way, where are you running your application now? – Emruz Hossain Mar 21 '22 at 16:38
- I am running it on an AWS instance, but if I make replicas of it, it becomes expensive for us. – Ayaz Khan Mar 21 '22 at 16:52