I've been running a long job on GCE with a GPU. It is not a preemptible instance.
I was monitoring the job from a local terminal over SSH, with tmux running on the instance so the job keeps running if the SSH connection breaks. The screen froze, so I tried to SSH in from another terminal window, but that session froze as well.
I went to the Google Cloud console to see what was going on, and the instance is showing a lot of disk reads.
I'm pretty sure that nothing I've done has caused the disk reads.
Any idea what is going on? I hope my job is still running, and I don't want to start over, so I'd rather not stop and restart the instance.