1

My VM had been running for two years or more of uptime with no problems. Several days ago I could no longer reach the web site it hosts. I tried SSH but it failed to connect, so I restarted the VM, and now it fails to boot. I am attaching the log from the serial console. I see problems resolving the metadata server. What could be wrong?

Thanks!

Nov 27 13:21:54 (none) /etc/mysql/debian-start[2326]: Upgrading MySQL tables if necessary.
[info] Checking for tables which need an upgrade, are corrupt or were 
not closed cleanly..
curl: (6) Couldn't resolve host 'metadata.google.internal'
Nov 27 13:21:54 (none) google: 
Nov 27 13:21:54 (none) google: No startup script found in metadata.
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2330]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2330]: Looking for 'mysql' as: /usr/bin/mysql
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2330]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2330]: This installation of MySQL is already upgraded to 5.5.52, use --force if you still need to run mysql_upgrade
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2377]: Checking for insecure root accounts.
Nov 27 13:21:54 (none) /etc/mysql/debian-start[2382]: Triggering myisam-recover for all MyISAM tables
Nov 27 13:21:56 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:01 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:06 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:11 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:16 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:21 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:26 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:31 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:37 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:42 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>
Nov 27 13:22:47 (none) accounts-from-metadata: WARNING error while trying to update accounts: <urlopen error [Errno 101] Network is unreachable>

I reviewed the operation logs, and the problem may have been caused by this event:

Operation type

compute.instances.migrateOnHostMaintenance

Status message

Instance migrated during Compute Engine maintenance.
user2988257

2 Answers

1

You can enable interactive access to the serial console so you can more easily troubleshoot instances that are not booting properly or that are otherwise inaccessible. See Interacting with the Serial Console for more information.

Kamran
  • I answered you on Google Groups, but let's continue here because my comments do not always show up on Google Groups. Anyway, the console is not responsible; I think the problem is the network. How can I debug it? – user2988257 Nov 29 '17 at 18:07
  • This [article](https://cloud.google.com/compute/docs/troubleshooting) might help you. If you need to make changes to configuration files in the VM follow the section "Inspect an instance without shutting it down". – Carlos Nov 30 '17 at 17:25
0

Regarding the error message [1]: the metadata service normally responds to HTTP, DNS, and ICMP echo requests, so a failure to resolve it points to a network problem inside the instance. We suggest checking the firewall; link [2] could also be helpful.
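To narrow this down, a small check like the one below could be run inside the instance (for example from the interactive serial console). It is only a sketch: the function names are made up for illustration, and it assumes Python 3 is available on the VM. It relies on the metadata server's link-local address 169.254.169.254 and the `Metadata-Flavor: Google` request header, and separates "the name does not resolve" from "the server is not reachable at all".

```python
import socket
import urllib.request

METADATA_HOST = "metadata.google.internal"
METADATA_IP = "169.254.169.254"  # link-local address of the metadata server


def resolve(host):
    """Return the IPv4 address for host, or None if DNS resolution fails."""
    try:
        return socket.gethostbyname(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def http_ok(target, timeout=3):
    """Return True if the metadata endpoint on target answers with HTTP 200."""
    request = urllib.request.Request(
        f"http://{target}/computeMetadata/v1/instance/hostname",
        headers={"Metadata-Flavor": "Google"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, timeouts and "Network is unreachable"
        return False


def diagnose(resolver=resolve, fetcher=http_ok):
    """Classify the failure as 'network', 'dns' or 'ok' (hypothetical labels)."""
    if not fetcher(METADATA_IP):
        # Not even the raw IP answers: interface, route or firewall problem.
        return "network"
    if resolver(METADATA_HOST) is None:
        # The IP answers but the name does not resolve: resolver configuration.
        return "dns"
    return "ok"
```

Calling `print(diagnose())` on the VM gives one of the three labels. A result of "network" would match the `Network is unreachable` lines in the log and suggests looking at the interface configuration and routes before the firewall rules.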

Regarding the operation [3]: it seems a “hostError” message should also have been logged. A host error means that there was a hardware or software issue on the physical machine hosting your virtual machine that caused your virtual machine to crash. When Compute Engine detects such an event, it records a compute.instances.hostError [4] entry in the operation log.
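One way to check the operation log for both event types is to export it as JSON (for example with `gcloud compute operations list --format=json`) and filter it. The sketch below assumes each entry carries the `operationType`, `targetLink`, `insertTime` and `statusMessage` fields of a Compute Engine operation resource; the function name and the instance name passed to it are hypothetical.

```python
import json

# Operation types discussed in this thread.
DISRUPTIVE_TYPES = {
    "compute.instances.migrateOnHostMaintenance",
    "compute.instances.hostError",
}


def disruptive_events(operations, instance_name):
    """Return (time, type, message) tuples for one instance, oldest first."""
    events = []
    for op in operations:
        if op.get("operationType") not in DISRUPTIVE_TYPES:
            continue
        # targetLink is assumed to end with .../instances/<name>
        if not op.get("targetLink", "").endswith("/instances/" + instance_name):
            continue
        events.append(
            (op.get("insertTime", ""), op["operationType"], op.get("statusMessage", ""))
        )
    return sorted(events)


def load_operations(path):
    """Read a JSON export of the operation log from a file."""
    with open(path) as handle:
        return json.load(handle)
```

If only the migration entry shows up for the instance and no hostError entry does, that would suggest the VM did not crash with its host, and the network configuration inside the guest is the more likely place to look.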

To protect applications and services from potentially disruptive system events like these, please see “Understanding types of failures” in document [5], quoted below:

Understanding types of failures:

At some point, one or more of your VM instances might be lost due to system or hardware failures. Some of the failures include but are not limited to:

Unexpected single instance failure

Unexpected single instance failures can be due to hardware or system failure. To mitigate these events, use persistent disks and startup scripts to save your data and re-enable software after you restart the instance.

Unexpected single instance reboot

At some point in time, you will experience an unexpected single instance failure and reboot. Unlike unexpected single instance failures, your instance fails and is automatically rebooted by the Compute Engine service. To help mitigate these events, back up your data, use persistent disks, and use startup scripts to quickly re-configure software.

Zone or region failures

Zone and region failures are rare failures that can cause all of your instances in a given zone or region to be inaccessible or fail. To mitigate these failures, create diversity across regions and zones and implement load balancing. You should also back up your data or replicate your persistent disks across multiple zones.

Make sure you design robust systems. For more information about designing them, please see document [5].

[1] Couldn't resolve host 'metadata.google.internal'

[2] Why can't I access Metadata Server of GCP Instance?

[3] compute.instances.migrateOnHostMaintenance

[4] https://cloud.google.com/compute/docs/faq#hosterror

[5] https://cloud.google.com/compute/docs/tutorials/robustsystems

Shafiq I