I recently ran into a problem where a Solaris server could not establish a TCP connection on port 2126. A packet capture shows the following (note: A is the Solaris server, B is a router):

  1. A sends SYN to B
  2. B sends SYN, ACK to A

Notice that A (Solaris) never acknowledges the SYN, ACK from B, so the three-way handshake does not complete.
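
To make the symptom concrete, here is a minimal Python sketch that reports how far a handshake got from a list of observed packets. The function name `handshake_progress` and the `(sender, flags)` packet representation are hypothetical, chosen only for illustration; this is not a snoop parser.

```python
def handshake_progress(packets, client, server):
    """Return how many steps of the TCP three-way handshake completed (0 to 3).

    packets: iterable of (sender, flags) pairs in capture order, where flags
    is a set of flag names such as {"SYN"} or {"SYN", "ACK"}.
    This representation is an assumption made for this example.
    """
    step = 0
    for sender, flags in packets:
        flags = set(flags)
        if step == 0 and sender == client and flags == {"SYN"}:
            step = 1  # client opened the connection
        elif step == 1 and sender == server and flags == {"SYN", "ACK"}:
            step = 2  # server answered
        elif step == 2 and sender == client and flags == {"ACK"}:
            step = 3  # client acknowledged; connection established
    return step
```

For the capture described above, `handshake_progress([("A", {"SYN"}), ("B", {"SYN", "ACK"})], "A", "B")` stops at step 2: the final ACK from A is the missing piece.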

Because of the business impact, I had to reboot the server to clear the problem. The next time it occurs, what can I do to find the root cause (i.e. before rebooting the server)?

Thanks in advance.

anurag kohli
  • Where and how do you capture the traffic? Is this reproducible? Why are you rebooting the server? – jlliagre Dec 21 '10 at 14:07
  • We captured traffic on the Solaris server using snoop. Unfortunately this is not a reproducible problem. Why reboot? Well, there are 35 routers. I found the socket failure always occurred with the 35th router. In other words, I had to take one router "out of service" to open a socket to another. This looks to me like some kind of resource starvation issue. – anurag kohli Dec 21 '10 at 22:30

1 Answer

You didn't mention which Solaris version this is. The best option is to check how current the system is and whether Oracle has released any relevant patches. Do you have a support contract for this system? Also, consider setting up DTrace probes to collect data from the system when the problem occurs.
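
Since resource starvation is suspected, one thing that could be captured before a reboot is a tally of TCP connections by state. The following is a rough Python sketch, assuming the output of `netstat -an -P tcp` has been saved to a file and that each connection line ends with an upper-case state name. The function name `count_tcp_states` is hypothetical, and the column layout should be checked against the real output on the affected system.

```python
import re
from collections import Counter

# Assumption: a connection line has at least 7 whitespace-separated fields and
# its last field is an upper-case state name (e.g. ESTABLISHED, TIME_WAIT).
_STATE = re.compile(r"^[A-Z][A-Z0-9_]+$")


def count_tcp_states(netstat_text):
    """Count connections per TCP state in saved netstat output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Header and separator lines fail the state pattern and are skipped.
        if len(fields) >= 7 and _STATE.match(fields[-1]):
            counts[fields[-1]] += 1
    return counts
```

Comparing the counts taken while the problem is happening with counts from a healthy period may show whether some state (or the total number of sockets) is piling up.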

plluksie