4

I'm running a Jupyter notebook on Vertex AI Workbench and getting the error below. The instance sits loading for an hour or two, then reverts to a stopped state. It's a managed notebook, so I can't SSH into it to recover my code, which is all I want to do, since there's a lot of work there that I haven't backed up (I didn't expect a big cloud service to randomly fail like this...). Any ideas on how to get this back up and running? I'm happy to delete it and start a new instance once I recover the code.

[Screenshot of the error; image not available]

Tanishq Kumar
  • Have you made modifications before the notebook was inaccessible? Also can you share the logs for this error? I believe you can also see it on your activity page on your console – Nestor Ceniza Jr Aug 04 '22 at 01:28
  • I did not make modifications, and had been opening/starting/stopping the instance just fine for weeks. In my activities, I just see some errors `[my email] failed to execute google.cloud.notebooks.v1.ManagedNotebookService.StartRuntime on rio-plates-playground` with `Aborted (HTTP 409): unable to queue the operation`, and then more recently, I see `[my email] has executed google.cloud.notebooks.v1.ManagedNotebookService.StartRuntime on rio-plates-playground` with no errors EVEN THOUGH the notebook continues to fail to start. Help much appreciated, as getting GCP support is nigh impossible! – Tanishq Kumar Aug 04 '22 at 13:57
  • I actually had the same issue, but I just refreshed the page and retried, and it was able to start afterwards. Can you check the logging page for the notebook? You can also check your quota page to see if you hit some kind of limit. – Nestor Ceniza Jr Aug 08 '22 at 22:57
  • I have refreshed it 1000 times and have not hit any quotas, I have checked. The logging page gave the errors I outlined above. Extremely disappointed and frustrated in GCP as it seems I'm losing my work, will look to use AWS in future instead. – Tanishq Kumar Aug 08 '22 at 23:24
  • If you want to keep the data, I would recommend filing a case with [Google Cloud Support](https://cloud.google.com/support-hub). – Jose Gutierrez Paliza Aug 11 '22 at 17:18

2 Answers

2

Managed to get it running again myself in the meantime. Make sure your idle timeout is not set too high: it appears the same timeout value also applies in scenarios like this one, which gives you the opportunity to start the notebook again as soon as it "gives up" after being idle (in the "Starting" state) for too long.

0

You can always use gcloud if the console fails. Use the stop command for managed notebooks, as below:

gcloud notebooks runtimes stop <NOTEBOOK_NAME> --location=<LOCATION>
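
If the runtime is stuck in "Starting", one thing worth trying is a stop followed by a start from the command line. The sketch below wraps both calls in a small function. It assumes that stopping and restarting clears the stuck state, which nothing in this thread confirms. The function name `restart_runtime` and the `DRY_RUN` switch are hypothetical helpers added for illustration, not part of gcloud.

```shell
#!/usr/bin/env bash
# Sketch: stop a stuck managed-notebook runtime, then start it again.
# restart_runtime and DRY_RUN are hypothetical names, not gcloud features.

restart_runtime() {
  local runtime="$1" location="$2"
  local run=""
  # With DRY_RUN=1, print each command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi

  # Stop first; give up if the stop request itself fails.
  $run gcloud notebooks runtimes stop "$runtime" --location="$location" || return 1
  # Then ask for a fresh start.
  $run gcloud notebooks runtimes start "$runtime" --location="$location"
}
```

To preview the commands without touching the instance, run `DRY_RUN=1 restart_runtime <NOTEBOOK_NAME> <LOCATION>`. Drop `DRY_RUN=1` to actually issue them.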
nimbous