Could someone give me a tip on how to log the first 5 rows of top output? I was thinking about grep, but I don't know how to pick rows.

I need to understand what sometimes freezes the server. Maybe there are some tools for this?

Thanks ;)

hmontoliu
Somebody

3 Answers

You can do what you've requested with top in batch mode (-b), which writes plain text suitable for piping:

~$ top -b -n 1 | head -n 5
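
If the goal is to keep a history rather than a single snapshot, the one-liner can be wrapped in a small function and called periodically. This is only a sketch: the names LOGFILE, LINES_TO_KEEP and log_top_snapshot, the default path, and the choice of 12 lines (roughly the summary area, the column header and about five processes) are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical defaults; override them from the environment as needed.
LOGFILE="${LOGFILE:-/tmp/top-snapshots.log}"
LINES_TO_KEEP="${LINES_TO_KEEP:-12}"

# Append one timestamped snapshot of the first lines of top to LOGFILE.
log_top_snapshot() {
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # -b: batch mode (plain text, no terminal control codes)
        # -n 1: run a single iteration and exit
        top -b -n 1 | head -n "$LINES_TO_KEEP"
    } >> "$LOGFILE"
}
```

The function could be called from cron once a minute, or from a loop such as: while sleep 60; do log_top_snapshot; done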

In addition to Janne's list of commands and symcbean's advice to check your system logs, I would suggest atop:

   ...
   It  shows  the  occupation  of  the  most  critical  hardware
   resources (from a performance point of view) on system level, i.e. cpu,
   memory, disk and network.
   It  also  shows  which processes are responsible for the indicated load
   with respect to cpu- and memory load on process level.   Disk  load  is
   shown if per process "storage accounting" is active in the kernel or if
   the kernel patch `cnt' has been installed.  Network load is only  shown
   per process if the kernel patch `cnt' has been installed...

   [...]

   Every  interval  (default:  10  seconds) information is shown about the
   resource occupation on system level (cpu,  memory,  disks  and  network
   layers),  followed by a list of processes which have been active during
   the last interval (note that all processes that were  unchanged  during
   the  last interval are not shown, unless the key 'a' has been pressed).
   If the list of active processes does not entirely fit  on  the  screen,
   only the top of the list is shown (sorted in order of activity).

With atop you can also look back in time at the state of your server, because it stores that data in its log files. For example, I have the following snippet in a script that is launched when the server's load average passes an arbitrary limit. The atop output and other related system information are then sent by mail to my account:

atop -r /var/log/atop.log -M -b "$(date +'%H:%M' -d '30 minutes ago')" -e "$(date -d now +'%H:%M')"

Basically I get a report of the current state of the server and of what was happening on it during the past 30 minutes (with detailed information for each 10-minute interval).
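
The surrounding script is not shown in the answer, so the following is only a guess at how such a trigger might look. The function names, the threshold of 8 and the mail recipient are hypothetical; the atop invocation is the one quoted above, and it assumes GNU date and an atop log at /var/log/atop.log.

```shell
#!/bin/sh
# Succeed if the load exceeds a threshold.
# $1: threshold; $2: optional load value (defaults to the current
# 1-minute load average read from /proc/loadavg on Linux).
load_exceeds() {
    load="${2:-$(cut -d' ' -f1 /proc/loadavg)}"
    awk -v l="$load" -v t="$1" 'BEGIN { exit !(l > t) }'
}

# Replay the last 30 minutes of atop's log in memory-oriented view (-M).
report_last_30_minutes() {
    atop -r /var/log/atop.log -M \
        -b "$(date +'%H:%M' -d '30 minutes ago')" \
        -e "$(date +'%H:%M')"
}

# Example use (threshold and recipient are placeholders):
#   if load_exceeds 8; then
#       report_last_30_minutes | mail -s "High load on $(hostname)" admin@example.com
#   fi
```

Passing the load value as an optional second argument keeps the comparison easy to check by hand, for example load_exceeds 4 5.2.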

hmontoliu

First, it's unlikely that the server is freezing because of user processes. Second, even if that were the case, why would the culprit be one of the top 5?

If you've already checked your logs and found nothing, then try implementing a watchdog to write a heartbeat to the logs - and check that the system really is freezing (as opposed to being observed to pause when remotely accessed).
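
A heartbeat of this kind can be very small. The sketch below is one possible shape, not a specific tool: the file path and the function names write_heartbeat and find_gaps are made up for illustration, and it writes to its own file rather than to syslog. Each heartbeat records the epoch time, so gaps larger than the expected interval show when the machine really stopped running.

```shell
#!/bin/sh
# Hypothetical log location; override from the environment as needed.
HEARTBEAT_LOG="${HEARTBEAT_LOG:-/tmp/heartbeat.log}"

# Append one line: epoch seconds, readable time, load averages (Linux).
write_heartbeat() {
    printf '%s %s load=%s\n' \
        "$(date +%s)" \
        "$(date '+%Y-%m-%dT%H:%M:%S')" \
        "$(cut -d' ' -f1-3 /proc/loadavg 2>/dev/null)" >> "$HEARTBEAT_LOG"
}

# Print every gap between consecutive heartbeats longer than $1 seconds.
find_gaps() {
    awk -v max="$1" '
        NR > 1 && ($1 - prev) > max {
            print "gap of " ($1 - prev) "s ending at epoch " $1
        }
        { prev = $1 }
    ' "$HEARTBEAT_LOG"
}
```

With write_heartbeat run every 10 seconds or so, find_gaps 30 after a suspected freeze lists the periods in which no heartbeat was written. If no gaps appear, the machine kept running and the pause is more likely on the network or remote-access side.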

How long does the freeze last for? How frequent? Is there a blip in the load when it comes back? Where are you observing these freezes? Is this a dedicated or virtual machine?

symcbean

top | head -n 12 >>your_file.txt would do it, but with caveats: it would save all the control and ANSI escape characters as well, so browsing the output with less etc. would not be nice (running top in batch mode with -b avoids them). And, to be honest, it's not nearly the best way to catch the rebel process.

For overall trends in your server's performance, tools like SNMP+MRTG, Cacti or Munin can be very useful: they graph CPU usage, memory usage and so on. For command-line use, the sysstat package with its sadc data collector and sar reporting utility can also show you how your server is doing. The psacct package adds BSD process accounting and can provide overall statistics about per-user/process time spent.

For a real-time view, just stay logged in to your server and watch tools like iostat -x 1, vmstat and top.

Consider setting up a separate syslog server. If you have spare hardware (or a central syslog server already installed), use it! You just configure your server to send its syslogs and other logs to your syslog server. Sometimes those freezes (if caused by a kernel panic) happen in such a way that your server cannot write information to disk but is still able to send its last words to the syslog server.

Test your server with memtest86. If you can take your server offline for a night or so, let it run memtest86 to see if it has faulty RAM or something else.

Sorry, I'm unable to be more helpful than this since you did not actually tell us much about your problem.

Janne Pikkarainen
  • I'm not disagreeing with the idea of memtest86, but for people researching this and running into the answer, I normally add the caveat that software memory testers can tell you if memory is bad, but cannot reliably tell you that it's good. That is, if you run it and find a problem, your memory is bad. But if you run it and find nothing, there *could* still be a problem with it. It's like asking your brain to figure out if it has a problem with itself. Unless it's an obvious problem it may fake you out :-) – Bart Silverstrim Aug 05 '11 at 12:00