
I'm using Google AutoML Translation with custom models. It works fine but there is one issue that bothers me.

If I do not use a language pair (e.g. English-Spanish) for a while, there is a cold start and the request takes more than 10 seconds, whereas the next call is very quick, around 0.5 s. My application supports many language pairs, and the issue appears for each language pair.

I didn't find any information about cold starts or readiness probes in the documentation.

The question is: Is it possible to somehow avoid a cold start?
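One workaround worth trying (not documented by Google, so purely an assumption on my part) is a periodic keep-alive: send a tiny translation request to every language pair often enough that the models never go cold. A minimal Python sketch, where `translate` is a placeholder for your actual AutoML Translation call and the 5-minute interval is a guess, not a documented threshold:

```python
import threading


def warm_up_once(translate, language_pairs):
    """Send one tiny request per language pair; return the pairs that failed."""
    failed = []
    for source, target in language_pairs:
        try:
            # A one-word request is enough to keep the model warm
            # (assumption: any request resets the cold-start timer).
            translate("ping", source, target)
        except Exception:
            failed.append((source, target))
    return failed


def keep_warm(translate, language_pairs, interval_s=300.0, stop_event=None):
    """Repeatedly warm every pair until stop_event is set (run in a thread)."""
    stop_event = stop_event or threading.Event()
    while not stop_event.is_set():
        warm_up_once(translate, language_pairs)
        stop_event.wait(interval_s)  # interval is a guess; tune empirically
```

In practice `translate` would wrap the google-cloud-automl client's predict call for the custom model of each pair. Note that every keep-alive request is billed like any other, so the interval should be no shorter than the observed time before the cold start kicks in.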

mskuratowski
  • Could you please specify how long you left the model without use? And how are you calling the model: REST API, Python code, or both? I ran some tests of my own and was not able to reproduce the issue; once the model is deployed, it is supposed to be available without any re-deploy or cold start. – Enrique Zetina Jan 05 '21 at 21:36
  • @EnriqueZetina I updated my question. It's weird, because I face the same issue when I try it on GCP. – mskuratowski Jan 15 '21 at 11:15

0 Answers