0

I have a Mac OS X 10.5 server, with a RAID set in it, that went down due to a power outage on Thursday, and the machine is not happily booting right now*.

It is possible to find out when the machine went down, while not booted off the internal drive? (I'm booted off an external drive, waiting for the RAID sets to initialize.)

Normally, I'd run last. The man page doesn't indicate that I can run it against a different startup volume. It looks possible to parse /var/log/utmpx, but I don't think it'd be worthwhile to try to do that from scratch for this one-off problem.


Update: In re-reading the man page, I see that the synopsis shows:

last [-n] [-f file] [-h host] [-t tty] [user ...]

Nowhere else is the -f file parameter mentioned. When I try last -f /var/run/utmpx I get a little bit of information and then a segmentation fault.

I googled 'man last' and came up with a man page for OS X 10.6, and it doesn't even show the -f flag in the synopsis section. I'm assuming from this that the -f flag has a known bug in it.


*** I'm still trying to figure out why it isn't happy, and may ask a follow-up question. Right now I can see that UserNotificationCenter crashed repeatedly early Thursday morning, and that securityd, mdworker, and ARDAgent crash shortly after startup [I think -- I want to verify when the box went up and down]. The login window does not come up right (I think it is crashing or not able to cope with a dead securityd). The box is supposed to be set to go down when the UPS tells it power is out; at the moment, I'm wondering if it went down, and turned back on multiple times! I sure hope not.

Clinton Blackmore
  • 3,520
  • 6
  • 36
  • 61

2 Answers2

2

This is just a guess since I'm a Linux guy, but I usually get a -- MARK -- in /var/log/messages every 20 minutes that the computer is on. Just look for when there's a gap in the system log files?

rescdsk
  • 165
  • 3
  • Thanks for the idea. It appears to me that /var/log/system.log on OS X is analogous to /var/log/messages. I tried looking for gaps in when messages are output (`cat /Volumes/ServerHD/var/log/system.log | cut -b 1-12 | uniq -c`), but more helpful appears to be to search for `fsck_hfs` which appears to appear in the log early on after startup. I don't get any `-- MARK --` messages. – Clinton Blackmore May 04 '10 at 19:58
  • I am now thinking that searching system.log for "Darwin" is an even better bet. There is a line stating the kernel version that always appears in the logs shortly after startup. – Clinton Blackmore May 11 '10 at 15:28
0

I don't particularly want to accept this as an answer, as it skirts around the original question, but:

  • finally, after the RAID finished initializing, I booted into the server in single user mode, ran last and wrote down the results.

  • I did compare against the /var/log/system.log, and it appears that at each bootup, it does output a line with fsck in it -- either saying that journalling is on and a check is not needed, or saying that a check has been forced. (I don't see any particular correlation with when the computer shuts down.)

  • As I was examining /var/log/secure.log, I noticed it was having trouble reading /etc/authorization. I looked at the file (which should be an XML file) and it was gobbledegook. I ran plutil /etc/authorization, just in case it was a binary plist, and it complained about it. So, I copied /etc/authorization from another server, and, after rebooting, my server is up and running and appears happy.


Update: I'm just added some information from my logs to aid anyone else with this problem in finding my solution to it.

In /var/log/system.log, I saw things like:

Apr 29 06:11:41 servername /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow[59]: ERROR | main | session setup failed (status = 226), launched by init = 1
Apr 29 06:11:41 servername /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow[59]: ERROR | main | loginwindow argument passed = console
Apr 29 06:11:41 servername /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow[59]: SessionGetInfo(0xffffffff) -> Mach 1102
...
Apr 29 06:11:42 servername loginwindow[59]: GetOurLSSessionIDInit(), returned error 0xe2, securitySessionID=0x0 attrs=0 uid=0 euid=0.  cf Radar 5123589.
Apr 29 06:11:42 servername /System/Library/CoreServices/coreservicesd[93]: Client must provide a valid sessionID but passed 0, 0x0/0x0 version=10500000 uid=0 euid=0. cf Radar 5123589.
Apr 29 06:11:42 servername loginwindow[59]: LaunchServices/5123589: coreservicesd is running an unsupported version, 0 ( we are 10500000 ), so we cannot talk to it.
...
Apr 29 06:11:47 servername ReportCrash[88]: Formulating crash report for process loginwindow[59]
Apr 29 06:11:48 servername com.apple.launchd[1] (com.apple.UserEventAgent-LoginWindow[195]): Exited: Terminated
Apr 29 06:11:48 servername com.apple.launchd[1] (com.apple.loginwindow[59]): Exited abnormally: Abort trap
Apr 29 06:11:48 servername /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow[220]: Login Window Application Started -- Threaded auth
Apr 29 06:11:48 servername ReportCrash[88]: Saved crashreport to /Library/Logs/CrashReporter/loginwindow_2010-04-29-061142_servername.crash using uid: 0 gid: 0, euid: 0 egid: 0

And in /var/log/secure.log:

Apr 29 06:15:46 servername loginwindow[772]: SessionGetInfo(0xffffffff) -> Mach 1102
Apr 29 06:15:46 servername loginwindow[772]: GetOurLSSessionIDInit(), returned error 0xe2, securitySessionID=0x0 attrs=0 uid=0 euid=0.  cf Radar 5123589.
Apr 29 06:15:46 servername loginwindow[772]: LaunchServices/5123589: coreservicesd is running an unsupported version, 0 ( we are 10500000 ), so we cannot talk to it.
Apr 29 06:15:56: --- last message repeated 4 times ---
...
Apr 29 06:37:23 servername loginwindow[554]: SessionGetInfo(0xffffffff) -> Mach 1102
Apr 29 06:37:23 servername loginwindow[554]: GetOurLSSessionIDInit(), returned error 0xe2, securitySessionID=0x0 attrs=0 uid=0 euid=0.  cf Radar 5123589.
Apr 29 06:37:23 servername loginwindow[554]: LaunchServices/5123589: coreservicesd is running an unsupported version, 0 ( we are 10500000 ), so we cannot talk to it.
Apr 30 08:04:28 localhost com.apple.SecurityServer[36]: Entering service
Apr 30 08:04:33 servername com.apple.SecurityServer[36]: Parsing rules file "/etc/authorization": XML parser error:\n   Unexpected character  at line 1\nOld-style plist parser error:\n    Unexpected character '0x0' at line 1
Apr 30 08:04:33 servername com.apple.SecurityServer[36]: Parsing rules file "/etc/authorization": XML parser error:\n   Unexpected character  at line 1\nOld-style plist parser error:\n    Unexpected character '0x0' at line 1
Apr 30 08:05:20 servername loginwindow[60]: SessionGetInfo(0xffffffff) -> Mach 1102
Apr 30 08:05:20 servername loginwindow[60]: GetOurLSSessionIDInit(), returned error 0xe2, securitySessionID=0x0 attrs=0 uid=0 euid=0.  cf Radar 5123589.
Apr 30 08:05:20 servername loginwindow[60]: LaunchServices/5123589: coreservicesd is running an unsupported version, 0 ( we are 10500000 ), so we cannot talk to it.

The key things from the logs are that nothing could talk to coreservicesd (and the login window crashed because of that), and that SecurityServer could not parse /etc/authorization. I should note that the server is running 10.5 (probably 10.5.8).

Clinton Blackmore
  • 3,520
  • 6
  • 36
  • 61