
I have a remote SuperMicro server that is failing to boot.

It's located on a different continent and access is limited.

I can get into the BIOS and do whatever I need to, but as it gets through the POST messages it appears to Blue Screen!

This is not the Windows Blue Screen of Death. The installed OS is actually Ubuntu Server 12.04.

Has anyone seen this before on SuperMicro Super Servers?


Things I've tried so far.

  1. Cold start multiple times
  2. Reset BIOS to optimal defaults
  3. Warm reset
  4. Disabling USB (I recall a similar issue to do with keyboards and USB)... anyway that didn't help.

None of that made any difference.

Ok, here is the screenshot.

[Screenshot]

hookenz
  • Any POST errors, does it actually finish POST? – xeon Jul 25 '14 at 03:19
  • Hard to tell what's happening. I've just turned on the system event log in IPMI. Hopefully it'll give me a code. – hookenz Jul 25 '14 at 03:21
  • I can't actually tell whether it got through the POST. There is nothing in the IPMI log. I'm guessing it didn't. IPMI across the world is rather slow as you can imagine. – hookenz Jul 25 '14 at 03:28
  • Get physical access and see if you get beeps or more information on screen. – xeon Jul 25 '14 at 03:32
  • I'll have to send in my colleague who's based there. Hopefully he'll have an idea. Apparently the USB issue last time was due to having an OLD USB keyboard plugged in. There are no keyboards plugged in right now. – hookenz Jul 25 '14 at 03:43

2 Answers


Has anyone seen this before on SuperMicro Super Servers?

Yes, this just happened to me. Right before the screen turns blue there is an error message (I had to record it on my phone and play the video back to read it):

error: Invalid mode: text

I booted off a USB drive, mounted the file system and changed /etc/default/grub. Comment out the line:

GRUB_GFXMODE="text"

... and update Grub. You should get the normal menu the next time you boot. I don't know why it changed (a non-SuperMicro server running 12.04.4 didn't have that line and I'm sure no one here changed it manually) but it solved the problem for us.
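
For anyone repeating this from a live USB, here is a rough sketch of the edit. The function name `disable_gfxmode` and the `/mnt` mount point are my own assumptions, not anything from the original setup, and `sed -i` here assumes GNU sed as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: comment out any active GRUB_GFXMODE line in a
# grub defaults file. Lines that are already commented are left alone.
# Usage: disable_gfxmode /mnt/etc/default/grub
disable_gfxmode() {
    # Insert '#' in front of GRUB_GFXMODE=, keeping any leading whitespace.
    sed -i 's/^\([[:space:]]*\)\(GRUB_GFXMODE=\)/\1#\2/' "$1"
}
```

With the installed root file system mounted at `/mnt` (an assumption), you would call `disable_gfxmode /mnt/etc/default/grub`, then regenerate the config from inside the installed system, for example with `chroot /mnt update-grub` after bind-mounting `/dev`, `/proc` and `/sys` into `/mnt`.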

Good luck!

D Parker
  • hmm, I'll have a look at that. That could be a possibility – hookenz Jul 27 '14 at 19:50
  • I'd like to say that was it, but it wasn't. I have GRUB_GFXMODE= commented out, and I have GRUB_TERMINAL=console set. Unfortunately, because of the slow IPMI screen update and the long distance to the server (10,600+ km direct), you can imagine that seeing what the error is isn't easy. I'll get my colleague to have a look. – hookenz Jul 27 '14 at 21:42

I managed to get it back up.

This particular 1U server has 12 regular hard drives and an SSD in it. Somehow, during a recent 'apt-get update && apt-get upgrade', GRUB got corrupted or confused. I'm not sure whether that is related to having so many drives.

The solution was to boot the Ubuntu mini.iso over the IPMI virtual CD-ROM.

I then selected the rescue option and reinstalled GRUB to the right disk. After that I went to the command prompt and ran "update-initramfs -u" and "update-grub" for good measure.
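
Roughly, the sequence from the rescue shell looks like the sketch below. The `run` and `reinstall_grub` helpers and the `DRY_RUN` switch are hypothetical names added for illustration, and the disk argument is a placeholder: check which disk the BIOS actually boots from (for example with `lsblk` or `blkid`) before running it for real.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reinstall GRUB to the given boot disk, then rebuild the initramfs and
# regenerate grub.cfg. Stops at the first command that fails.
# $1 is the boot disk (placeholder; confirm the device before running).
reinstall_grub() {
    run grub-install "$1" &&
    run update-initramfs -u &&
    run update-grub
}
```

Setting `DRY_RUN=1` before calling `reinstall_grub /dev/sdX` only prints the three commands, which is a cheap sanity check over a slow IPMI link before touching the boot loader.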

Cold booted the server and all was OK.

hookenz
  • Was this *Software* RAID? – ewwhite Jul 28 '14 at 00:58
  • No. A Ceph cluster node. – hookenz Jul 28 '14 at 01:03
  • I had the same boot issues with my SuperMicro after a fresh install of Ubuntu 14.04. It started working after I changed the HD boot priority to a different drive. I have no idea why that would happen, but the server was freezing before it even got to grub. – zidarsk8 Jul 28 '14 at 07:51