I have a Google Cloud project in which I'm unable to access a CentOS 8 VM. It is running kernel version 4.18.0-193.19.1.el8_2.x86_64 on x86_64, with SELinux enabled.

Before running sudo yum update on my CentOS 8 VM yesterday, I was able to SSH and authenticate via OTP without issue. Today, all of my OTP codes are failing. I don't know for sure the update was the cause of the issue, but it's the only major change I've made before this issue surfaced.

I've tried resyncing on my phone's Google Authenticator app, which did not help. Further, I've tried each of my emergency scratch codes created at the time of running google-authenticator, and none of them have worked either. As far as I can tell, the times are sufficiently in sync between client and server.

In response, I've enabled serial console access, but as far as I can recall, I never set up a password for my CentOS user, just SSH keys. So I'm at the point where I can't authenticate via the serial console and I can't authenticate via SSH.

Is there anything else I can try?

Curtis

1 Answer

First, I'd recommend checking the logs. Restart your VM instance, if possible, then go to Compute Engine -> VM instances -> click on NAME_OF_YOUR_VM -> in the VM instance details, find the Logs section and click on Serial port 1 (console). Look for any error or warning messages that could explain what happened to your VM instance.
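
The same serial output can also be fetched from the command line, which makes it easier to search. This is a minimal sketch, assuming the gcloud CLI is installed and authenticated; the zone argument, the `run` helper, and the `DRY_RUN` switch are illustrative assumptions rather than part of the original steps:

```shell
#!/usr/bin/env bash
# Sketch: fetch the serial port 1 output of a VM from the command line.
# The VM name and zone are placeholders you must supply.

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# fetch_serial_log VM ZONE: print the serial console log of the instance.
fetch_serial_log() {
  run gcloud compute instances get-serial-port-output "$1" --zone "$2" --port 1
}
```

For example, `fetch_serial_log NAME_OF_YOUR_VM YOUR_ZONE | grep -iE 'error|fail|denied'` would narrow the log down to lines worth reading first; the grep pattern is only a starting guess.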

To get access via the serial console, follow the steps below:

  1. Enable the serial console connection with the gcloud command:

     gcloud compute instances add-metadata NAME_OF_YOUR_VM_INSTANCE \
     --metadata serial-port-enable=TRUE
    

or go to Compute Engine -> VM instances -> click on NAME_OF_YOUR_VM_INSTANCE -> click on EDIT -> go to section Remote access and check Enable connecting to serial ports

  2. Create a temporary user and password to log in: shut down your VM and set a startup script by adding, in the Custom metadata section, the key startup-script with the value:

     #!/bin/bash
     # Create a temporary sudo-capable user; replace "password" with a strong one.
     useradd --groups google_sudoers tempuser
     echo "tempuser:password" | chpasswd
    

and then start your VM.

  3. Connect to your VM via the serial port with the gcloud command:

     gcloud compute connect-to-serial-port NAME_OF_YOUR_VM_INSTANCE
    

or go to Compute Engine -> VM instances -> click on NAME_OF_YOUR_VM_INSTANCE -> and click on Connect to serial console

  4. Check what went wrong after the update.

  5. Disable access via the serial port with the gcloud command:

     gcloud compute instances add-metadata NAME_OF_YOUR_VM_INSTANCE \
     --metadata serial-port-enable=FALSE
    

or go to Compute Engine -> VM instances -> click on NAME_OF_YOUR_VM_INSTANCE -> click on EDIT -> go to section Remote access and uncheck Enable connecting to serial ports
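
The steps above can also be strung together from the command line. This is a rough sketch rather than a definitive procedure: it assumes the startup script from step 2 has been saved to a local file, and the function names, the `run` helper, the `DRY_RUN` switch, and the zone argument are all hypothetical additions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the rescue flow: stop the VM, attach the startup script,
# start the VM, and open the serial console. Placeholders: VM, ZONE, SCRIPT.

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# rescue_vm VM ZONE SCRIPT: steps 2 and 3 (serial port already enabled in step 1).
rescue_vm() {
  local vm="$1" zone="$2" script="$3"
  run gcloud compute instances stop "$vm" --zone "$zone"
  run gcloud compute instances add-metadata "$vm" --zone "$zone" \
    --metadata-from-file startup-script="$script"
  run gcloud compute instances start "$vm" --zone "$zone"
  run gcloud compute connect-to-serial-port "$vm" --zone "$zone"
}

# cleanup_vm VM ZONE: remove the startup script and disable the serial port (step 5).
cleanup_vm() {
  local vm="$1" zone="$2"
  run gcloud compute instances remove-metadata "$vm" --zone "$zone" \
    --keys startup-script
  run gcloud compute instances add-metadata "$vm" --zone "$zone" \
    --metadata serial-port-enable=FALSE
}
```

Running `DRY_RUN=1 rescue_vm NAME_OF_YOUR_VM_INSTANCE YOUR_ZONE rescue.sh` prints the commands without executing them. Removing the startup script in `cleanup_vm` is an assumption on my part: otherwise it would run again on every boot. For step 4, once logged in, `sudo yum history`, `sudo journalctl -u sshd -b`, and `sudo ausearch -m avc -ts recent` are reasonable places to start looking.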

Keep in mind that, according to the documentation Interacting with the serial console:

Caution: The interactive serial console does not support IP-based access restrictions such as IP whitelists. If you enable the interactive serial console on an instance, clients can attempt to connect to that instance from any IP address. Anybody can connect to that instance if they know the correct SSH key, username, project ID, zone, and instance name. Use firewall rules to control access to your network and specific ports.

In addition, have a look at the documentation Troubleshooting SSH and the third-party article Resolving getting locked out of a Compute Engine.

Serhii Rohoza
  • Thanks so much for going into so much detail. It seems like the startup script is not getting run (tried two subsequent boots after adding); the console tells me "Login incorrect". The only other thing I can think of is to try mounting disk to another instance, as suggested by @Michael Hampton. – Curtis Sep 23 '20 at 18:43
  • Thanks, you can find step-by-step instructions on how to mount the disk in the Troubleshooting SSH documentation that I mentioned above. Please check and let me know if you were able to fix your issue. – Serhii Rohoza Sep 24 '20 at 05:03
  • I was able to mount the disk on an Ubuntu server I had up. I made changes to the sshd_config file and the pam.d ssh config file to temporarily disable PAM authentication. Unfortunately, there now seems to be a permissions issue. After re-attaching the disk and rebooting the server, the serial logs show the machine is unable to read sshd_config, so I'm essentially still locked out. I'm assuming this is an SELinux issue, so I plan to do more reading. – Curtis Sep 27 '20 at 00:28
  • I'm marking this as the solution, since the SSH troubleshooting steps are what helped me to move forward. The SELinux issues seemed to be at the heart of my problem--the `~/.google_authenticator` file was also never generated, which I believe is why my OTP rescue codes didn't work. – Curtis Sep 28 '20 at 19:54
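
The unreadable sshd_config described in the comments is consistent with files losing their SELinux labels when edited from another OS, though the thread does not confirm that this was the cause. One common remedy, sketched here under that assumption, is to create the `.autorelabel` flag file on the mounted root filesystem so that SELinux relabels everything on the next boot. The function name and the `/mnt/rescue` mount point are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: request a full SELinux relabel on the next boot of the rescued disk.
# Run as root on the rescue instance while the CentOS root filesystem is mounted.

# request_relabel ROOT: create ROOT/.autorelabel if ROOT looks like a root filesystem.
request_relabel() {
  local root="$1"
  if [ ! -d "$root/etc" ]; then
    echo "not a root filesystem: $root" >&2
    return 1
  fi
  touch "$root/.autorelabel"
}
```

For example, `request_relabel /mnt/rescue` before unmounting the disk. The first boot afterwards takes longer while the relabel runs. If only a single file is affected and you already have a shell on the VM, `restorecon -v /etc/ssh/sshd_config` is a narrower alternative.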