I am developing an API using Django that serves an AI model, but prediction is slow, and I have many requests that take a long time to execute through the AI API. I need help handling multiple requests at the same time.
If your AI model is stateless, meaning a new request can be processed independently of previous requests, then you can run multiple instances of your AI model. You can use a Deployment with multiple replicas, and then use a Service to load-balance between the instances.
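A minimal sketch of what that could look like as Kubernetes manifests; the image name, the container port 8000, and the replica count of 3 are placeholders you would replace with your own values:

```yaml
# Sketch only: adapt image, port, and replica count to your setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-model-api
spec:
  replicas: 3                      # run 3 identical instances of the model server
  selector:
    matchLabels:
      app: ai-model-api
  template:
    metadata:
      labels:
        app: ai-model-api
    spec:
      containers:
        - name: ai-model-api
          image: your-registry/ai-model-api:latest   # placeholder image name
          ports:
            - containerPort: 8000                    # port your Django server listens on
---
apiVersion: v1
kind: Service
metadata:
  name: ai-model-api
spec:
  selector:
    app: ai-model-api              # matches the pods created by the Deployment above
  ports:
    - port: 80
      targetPort: 8000
```

The Service distributes incoming requests across all pods carrying the `app: ai-model-api` label, so increasing `replicas` raises throughput without changing the Django code.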

Emruz Hossain
- Can I use Kubernetes for this? – Ayaz Khan Mar 21 '22 at 16:36
- Yes. By the way, where are you running your application now? – Emruz Hossain Mar 21 '22 at 16:38
- I am running it on an AWS instance, but if I make replicas of it, it becomes expensive for us. – Ayaz Khan Mar 21 '22 at 16:52