
I have deployed a service using Cloud Run on GKE, which uses Knative as an abstraction over Kubernetes. The default MaxRevisionTimeoutSeconds is set to 600s in the Knative default config, but according to this PR it is customizable.

I couldn't find anything about this in the official Knative documentation. Can anybody help me out here?

UPDATE:

After digging a bit more into the Knative source code and documentation, it looks like MaxRevisionTimeoutSeconds is defined in resource=ConfigMap/config-defaults, so I have to update it there with a custom value.

From this, it looks like we can use something called an operator to modify the ConfigMap resource, but that did not work, probably because GCP does not use the operator to install Knative components. Anyway, I went on to install the operator and then used resource=knativeserving to overwrite config-defaults. This also did not work when I tried re-deploying the service.

The next option is to edit config-defaults directly using kubectl edit. I tried this too, but encountered weird behavior. After editing the YAML, when I used kubectl describe to check the changed value, it sometimes showed the modified value, sometimes showed the old value, and sometimes did not show that key-value pair at all. The change also has no effect when I re-deploy the service after the edit.
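
For reference, the direct edit described above can also be expressed as a merge patch. The following is only a sketch: it builds the patch and the `kubectl patch` command line without running anything. The key name `max-revision-timeout-seconds` and the `knative-serving` namespace are assumptions based on upstream Knative Serving and may differ on a given installation. The helper names are hypothetical, and, per the comments further down, Cloud Run on GKE appears to revert manual changes to this ConfigMap.

```python
import json
import shlex


def build_timeout_patch(max_timeout_seconds: int) -> str:
    """Return a JSON merge patch for the config-defaults ConfigMap.

    ConfigMap data values must be strings, so the number is stringified.
    The key name is an assumption taken from upstream Knative Serving.
    """
    patch = {"data": {"max-revision-timeout-seconds": str(max_timeout_seconds)}}
    return json.dumps(patch)


def build_kubectl_command(max_timeout_seconds: int,
                          namespace: str = "knative-serving") -> list:
    """Return the kubectl argument list that would apply the patch.

    The namespace default is an assumption; adjust it to the cluster.
    """
    return [
        "kubectl", "patch", "configmap", "config-defaults",
        "--namespace", namespace,
        "--type", "merge",
        "--patch", build_timeout_patch(max_timeout_seconds),
    ]


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed first.
    print(shlex.join(build_kubectl_command(1500)))
```

Printing the command rather than executing it keeps the sketch side-effect free; whether the patched value persists depends on whether the platform reconciles the ConfigMap back to its own defaults.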

If anyone can help me with this, it would be really great.

Srijan Singh

1 Answer


MaxRevisionTimeoutSeconds is a cluster-global setting which enforces the max value for TimeoutSeconds on each Revision. This value exists so that cluster administrators can set upper bounds on the amount of time a single HTTP request can be in the system. Knowing an upper bound can be useful when configuring graceful shutdown settings on the HTTP routing components to prevent dropped requests during upgrades.
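
To make the relationship concrete, here is a minimal sketch of the kind of check this implies: a Revision's TimeoutSeconds is accepted only if it does not exceed the cluster-wide maximum. This is an illustration in Python, not Knative's actual validation code, and the function name is hypothetical. The 600s default is the value mentioned in the question.

```python
# Cluster-wide upper bound; 600s is the default cited in the question.
DEFAULT_MAX_REVISION_TIMEOUT_SECONDS = 600


def validate_revision_timeout(
    timeout_seconds: int,
    max_timeout_seconds: int = DEFAULT_MAX_REVISION_TIMEOUT_SECONDS,
) -> int:
    """Return timeout_seconds if it is within the cluster-wide bound.

    Raises ValueError when the per-Revision timeout is negative or
    exceeds the cluster-wide maximum.
    """
    if timeout_seconds < 0:
        raise ValueError("timeout_seconds must not be negative")
    if timeout_seconds > max_timeout_seconds:
        raise ValueError(
            f"timeout_seconds={timeout_seconds} exceeds the cluster maximum "
            f"of {max_timeout_seconds}"
        )
    return timeout_seconds


if __name__ == "__main__":
    print(validate_revision_timeout(300))  # within the bound, prints 300
    try:
        validate_revision_timeout(1500)
    except ValueError as err:
        print(f"rejected: {err}")
```

Under this model, asking for a 1500s Revision timeout only works once the cluster-wide maximum itself has been raised to at least 1500s.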

It's possible that Cloud Run on GKE has overridden these configurations so that they can upgrade the underlying Istio and Knative components on a predictable schedule. (If you have a 10% upgrade budget and it takes 10m to drain a component, your minimum upgrade time is probably around 110m, taking into account additional scheduling / image fetch / startup time.)
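
The arithmetic behind that estimate can be sketched as follows. A 10% budget means roughly ten sequential batches, each of which has to wait for the drain time. The one minute of per-batch overhead is an assumption chosen here to reproduce the rough 110m figure; real scheduling, image fetch and startup times will vary.

```python
def estimate_upgrade_minutes(budget_percent: int,
                             drain_minutes: int,
                             overhead_minutes_per_batch: int) -> int:
    """Estimate the minimum rolling-upgrade time in minutes.

    budget_percent is the share of components upgraded at once, so the
    upgrade needs ceil(100 / budget_percent) sequential batches.
    """
    if not 0 < budget_percent <= 100:
        raise ValueError("budget_percent must be in the range 1..100")
    batches = -(-100 // budget_percent)  # integer ceiling division
    return batches * (drain_minutes + overhead_minutes_per_batch)


if __name__ == "__main__":
    # 10 batches * (10m drain + 1m assumed overhead) = 110m
    print(estimate_upgrade_minutes(10, 10, 1))  # prints 110
```

With the same assumptions, raising the drain time to 25m (to cover a 1500s request timeout) would push the estimate to 260m, which illustrates why a managed platform might cap the value.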

E. Anderson
  • Yes, they did override this particular value to 900s in configmap/config-defaults. Is there any way to increase this limit further, to, let's say, 1500s? – Srijan Singh Jun 14 '20 at 20:36
  • @AhmetB-Google I did try that as mentioned in the update. But it doesn't work. Do I need special permission? I am currently Cloud Run admin and Editor. – Srijan Singh Jun 15 '20 at 19:25
  • Even after getting the GKE admin role, changes to config-defaults do not stick. Every time I edit the YAML, it goes back to the default set by GCP. I have no idea why it would not let a GKE/Run admin modify it. – Srijan Singh Jun 16 '20 at 20:27
  • Thanks for reporting, I'll follow up on this on our side. – ahmet alp balkan Jun 17 '20 at 03:12
  • I won't be able to provide updates here. If you’d like to be updated, consider opening an issue on Google Cloud Issue Trackers. – ahmet alp balkan Jun 25 '20 at 18:02