Solaris: what's a kernel trap?

Question

I'm trying to understand whether the high trap count reported by top on Solaris 10

Kernel: 1659 ctxsw, 1069 trap, 4433 intr, 3837 syscall, 5 fork
Memory: 8192M phys mem, 299M free mem, 4103M total swap, 3236M free swap

is a problem or not.

Googling for kernel traps mainly returns large documents on kernel architecture; cliff notes would be much appreciated.

Thanks

PS. Never mind the swapping

score 0 · Answer 1 · answered Jun 15 '12 at 12:22


A trap is a kind of interrupt, most likely raised by the CPU, but the details are not important right now. See man trapstat and read the links you have gathered at your leisure.

You are wrong to say never mind the swapping: especially since your server has not crashed, the traps are probably memory-related and caused by TLB misses. Use trapstat -T to see whether you might benefit from large pages (and maybe buy some more RAM).

Whether it is a problem, who can say?

As an aside, prstat is normally used instead of top.


ramruma


Probably? Not important? Who can say? Not a very useful answer. – moodywoody Jun 15 '12 at 12:28
I've given a likely explanation of what you are seeing, and two new commands to try. There is an element of guesswork as there is not much to go on. My CPUs are 30 per cent busy: is that a problem? Should I buy more? Have I bought too many if they are running at below 50 per cent? It is not really a technical question: more of a business decision. If you are worried, call Oracle support. – ramruma Jun 15 '12 at 13:18
I think ramruma's answer is perfectly fine considering the lack of detail that you provided. Without seeing other metrics about your current system it is impossible to say if 1069 traps is good or bad. – mghocke Jun 15 '12 at 13:31

mghocke · Answer 2 · 2012-06-15T14:12:28.820

I think I should've posted an answer instead of a comment. I tried to figure out a way to delete it but didn't know how and then I exceeded the five minute limit. Anyway... here is a better comment shaped as an answer:

A trap is a mechanism built into the CPU that lets program execution continue at another, well-defined location (in this case it also switches from user context to kernel context, hence the name kernel trap). One use of traps is when the hardware encounters an error and needs the CPU to continue with the error-handling code (division by zero, memory access errors, etc.). In UNIX systems traps are also used to execute system calls (see McDougall and Mauro's excellent book "Solaris Internals", chapter 2.8 in particular). In your case the kernel was entered through a trap 1069 times over the interval that top sampled.

Without knowing much more about your system, the processes it is running at this point, and the hardware it is running on, it is unfortunately impossible to say whether your system is in a good or bad state.

Cheers, let me try to reword: let's say I want to assess whether the box has a problem. Let's further assume I see (superficially) nothing else extraordinary. Should the trap count then prompt me to continue with the search for problems or should it (in absence of further symptoms) be ignored? If investigation should continue, which steps do you recommend? — moodywoody, Jun 16 '12 at 06:20
It really depends on the baseline of your system. You have to monitor the system and collect this data over time. Here are some numbers for you to compare yours with (these are from a relatively quiet system): "Kernel: 676 ctxsw, 118 trap, 396 intr, 2258 syscall, 91 flt", "Kernel: 464 ctxsw, 1827 trap, 352 intr, 2472 syscall, 5 fork, 1252 flt". Nothing's wrong with my system. — mghocke, Jun 18 '12 at 13:45
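To make the baseline comparison concrete, here is a minimal Python sketch that parses the "Kernel:" line printed by top and checks whether the trap count from the question falls inside the range seen on the quiet system quoted above. The function name parse_kernel_line is made up for illustration, and the only data used are the three lines quoted in this thread. Two samples are far too few for a real baseline, so treat this as a shape for the idea rather than a verdict on the box.

```python
import re

def parse_kernel_line(line):
    """Turn a top 'Kernel:' line into a dict such as {'ctxsw': 1659, 'trap': 1069}.

    Hypothetical helper: it assumes the 'NUMBER name, NUMBER name' layout
    shown in the lines quoted in this thread.
    """
    if not line.startswith("Kernel:"):
        raise ValueError("expected a line starting with 'Kernel:'")
    return {name: int(count)
            for count, name in re.findall(r"(\d+)\s+([a-z]+)", line)}

# The line from the question.
QUESTION = "Kernel: 1659 ctxsw, 1069 trap, 4433 intr, 3837 syscall, 5 fork"

# The two samples quoted from the relatively quiet system.
QUIET_SAMPLES = [
    "Kernel: 676 ctxsw, 118 trap, 396 intr, 2258 syscall, 91 flt",
    "Kernel: 464 ctxsw, 1827 trap, 352 intr, 2472 syscall, 5 fork, 1252 flt",
]

observed = parse_kernel_line(QUESTION)["trap"]
baseline_traps = [parse_kernel_line(s)["trap"] for s in QUIET_SAMPLES]

# Crude check: is the observed value inside the range the baseline has seen?
within_range = min(baseline_traps) <= observed <= max(baseline_traps)
print(observed, min(baseline_traps), max(baseline_traps), within_range)
# prints: 1069 118 1827 True
```

With samples collected from your own machine over days or weeks in place of QUIET_SAMPLES, the same comparison tells you whether a given reading is unusual for that system, which is the only sense in which a trap count is "high".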
