
I ran netstat -s on my dedicated server running Debian. I'd like to interpret the results because I'm experiencing connectivity problems with TCP, but I don't know how to read them. Can anyone help, please?

The context: it's a public TCP server with clients from all around the world, most of them on 3G/UMTS networks. Sockets stay open for about 1 hour on average. Some TCP links stall for 10-60 seconds, every 10 minutes or so. The TCP server is a custom Java program.

Here is the output of netstat -s. Does it show any obvious connectivity problem?

    Ip:
        33780786 total packets received
        0 forwarded
        0 incoming packets discarded
        33780059 incoming packets delivered
        33577363 requests sent out
        1 outgoing packets dropped
        1442 reassemblies required
        715 packets reassembled ok
    Icmp:
        4675 ICMP messages received
        98 input ICMP message failed.
        ICMP input histogram:
            destination unreachable: 2901
            timeout in transit: 152
            echo requests: 1334
            echo replies: 226
        2109 ICMP messages sent
        0 ICMP messages failed
        ICMP output histogram:
            destination unreachable: 550
            echo request: 225
            echo replies: 1334
    IcmpMsg:
            InType0: 226
            InType3: 2901
            InType8: 1334
            InType11: 152
            OutType0: 1334
            OutType3: 550
            OutType8: 225
    Tcp:
        8752 active connections openings
        287296 passive connection openings
        58164 failed connection attempts
        74065 connection resets received
        30 connections established
        32997886 segments received
        32357425 segments send out
        438184 segments retransmited
        587 bad segments received.
        75868 resets sent
    Udp:
        777245 packets received
        550 packets to unknown port received.
        0 packet receive errors
        779944 packets sent
    TcpExt:
        28674 invalid SYN cookies received
        56570 resets received for embryonic SYN_RECV sockets
        998 packets pruned from receive queue because of socket buffer overrun
        9 ICMP packets dropped because they were out-of-window
        27402 packets rejects in established connections because of timestamp
        1266543 delayed acks sent
        1399 delayed acks further delayed because of locked socket
        Quick ack mode was activated 143367 times
        1556 times the listen queue of a socket overflowed
        1556 SYNs to LISTEN sockets dropped
        25884635 packets directly queued to recvmsg prequeue.
        785180902 bytes directly in process context from backlog
        1800599695 bytes directly received in process context from prequeue
        2879633 packet headers predicted
        7627605 packets header predicted and directly queued to user
        3218508 acknowledgments not containing data payload received
        14774120 predicted acknowledgments
        52 times recovered from packet loss due to fast retransmit
        24519 times recovered from packet loss by selective acknowledgements
        4 bad SACK blocks received
        Detected reordering 146 times using FACK
        Detected reordering 77 times using SACK
        Detected reordering 2239 times using time stamp
        3548 congestion windows fully recovered without slow start
        15840 congestion windows partially recovered using Hoe heuristic
        8832 congestion windows recovered without slow start by DSACK
        127403 congestion windows recovered without slow start after partial ack
        12080 TCP data loss events
        TCPLostRetransmit: 3
        179 timeouts after reno fast retransmit
        21328 timeouts after SACK recovery
        1481 timeouts in loss state
        32373 fast retransmits
        5349 forward retransmits
        26402 retransmits in slow start
        230593 other TCP timeouts
        4 classic Reno fast retransmits failed
        2367 SACK retransmits failed
        563 times receiver scheduled too late for direct processing
        243774 packets collapsed in receive queue due to low socket buffer
        151068 DSACKs sent for old packets
        45306 DSACKs sent for out of order packets
        238987 DSACKs received
        14 DSACKs for out of order packets received
        27627 connections reset due to unexpected data
        4045 connections reset due to early user close
        4992 connections aborted due to timeout
    IpExt:
Joel

2 Answers

1 outgoing packets dropped

There is almost no packet loss, which is good, but we don't have latency data. At a glance, I'd say you're using the wrong tools for the job.
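As a rough cross-check on the loss picture, the posted `Tcp:` counters can be turned into a ratio of retransmitted segments to segments sent out. The sketch below only does that arithmetic with the two numbers from the question; the class and method names are made up for illustration, and the ratio (about 1.35% here) says nothing about latency or about which clients are affected.

```java
import java.util.Locale;

// Hypothetical helper, not part of any library: plain arithmetic on two
// counters copied from the "Tcp:" section of the posted netstat -s output.
class RetransmitRatio {

    // Retransmitted segments as a percentage of segments sent out.
    static double percent(long retransmitted, long sentOut) {
        if (sentOut <= 0) {
            throw new IllegalArgumentException("sentOut must be positive");
        }
        return 100.0 * retransmitted / sentOut;
    }

    public static void main(String[] args) {
        long sentOut = 32_357_425L;      // "segments send out"
        long retransmitted = 438_184L;   // "segments retransmited"
        System.out.println(String.format(Locale.ROOT,
                "retransmitted / sent out = %.2f%%",
                percent(retransmitted, sentOut)));
    }
}
```

Since these counters are totals since boot, sampling them twice and comparing the difference would say more about the current state than the absolute values do.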

Is there a database involved? Are there any periodic jobs that slow the system down around the 10-minute mark? Does the machine only run this TCP server, or is it serving other resources too?

Netstat is not the right tool for what you want to do. To be sure your application is performing as intended, you need an infrastructure in place featuring the following:

  • Hooks into your application to collect proper metrics. You're the developer, so you can do this, and it will ease your work massively. By hooks I mean facilities to fetch diagnostic and performance data, coded directly into your application.
  • A graphing/monitoring infrastructure. Cacti and Nagios are examples I'm familiar with, but there are more.
  • A plan. What do you want to achieve? What level of service do you want to provide to your users? Implement diagnostics and performance metrics as you develop your application, and if you get wind that this could turn into something big, make it scalable. *Really* scalable.
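To make the "hooks" idea concrete, here is a minimal sketch of one such hook for a Java server: it remembers when each connection last moved data and lists the ones that have been silent for longer than a threshold. Everything in it (the `StallTracker` class, the connection ids, the 10-second threshold taken from the low end of the stalls described in the question) is an assumption for illustration, not something from the asker's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical diagnostic hook: tracks the last time each connection made
// progress so that silent links can be logged or graphed.
class StallTracker {
    private final Map<String, Long> lastActivityMillis = new ConcurrentHashMap<>();
    private final long stallThresholdMillis;

    StallTracker(long stallThresholdMillis) {
        this.stallThresholdMillis = stallThresholdMillis;
    }

    // Call from the read/write path whenever bytes move on a connection.
    void recordActivity(String connectionId, long nowMillis) {
        lastActivityMillis.put(connectionId, nowMillis);
    }

    // Call when a connection closes so it is no longer reported.
    void remove(String connectionId) {
        lastActivityMillis.remove(connectionId);
    }

    // Returns the ids of connections idle for at least the threshold.
    List<String> findStalled(long nowMillis) {
        List<String> stalled = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastActivityMillis.entrySet()) {
            if (nowMillis - entry.getValue() >= stallThresholdMillis) {
                stalled.add(entry.getKey());
            }
        }
        return stalled;
    }

    public static void main(String[] args) {
        StallTracker tracker = new StallTracker(10_000); // assumed 10 s threshold
        tracker.recordActivity("client-a", 0);
        tracker.recordActivity("client-b", 9_000);
        // At t = 12 s, only client-a has been silent for 10 s or more.
        System.out.println(tracker.findStalled(12_000));
    }
}
```

In a real server the timestamps would come from `System.currentTimeMillis()` and `findStalled` would be called from a periodic task, with the result written to a log or fed to the monitoring system. Passing the time in as a parameter just keeps the sketch easy to test.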
ItsGC

Some things to try and help you understand the problem:

  • How does your receiving program handle connections from the network? Is it multithreaded? How does it handle clients? Is there a timeout being reached?
  • How have you tested the server code? Have you run it on your local machine and tried out how many connections you can get to it? Have you tested the effect of long sessions?
  • Try running "netstat -p" or "lsof -i TCP" and see what is happening. What does the send queue look like? Run a "ps auxwww", what is the state of the server program?
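For the send-queue check, one option is to capture the output of `netstat -tn` and pick out the connections whose Send-Q column is not draining. The sketch below assumes the usual Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the class name, the threshold, and the sample lines are made up for illustration and use documentation-only IP addresses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: scans lines of "netstat -tn" style output and
// reports TCP connections with a large Send-Q (bytes not yet acknowledged).
class SendQueueScan {

    // Returns "foreignAddress sendQ" for each TCP line with Send-Q >= minSendQ.
    static List<String> backedUp(List<String> netstatLines, long minSendQ) {
        List<String> result = new ArrayList<>();
        for (String line : netstatLines) {
            String[] fields = line.trim().split("\\s+");
            // Assumed columns: Proto Recv-Q Send-Q Local Foreign State
            if (fields.length < 6 || !fields[0].startsWith("tcp")) {
                continue; // header or unrelated line
            }
            long sendQ;
            try {
                sendQ = Long.parseLong(fields[2]);
            } catch (NumberFormatException e) {
                continue; // not a connection line after all
            }
            if (sendQ >= minSendQ) {
                result.add(fields[4] + " " + sendQ);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed format.
        List<String> sample = List.of(
            "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
            "tcp        0      0 192.0.2.10:9000         198.51.100.4:51515      ESTABLISHED",
            "tcp        0   5120 192.0.2.10:9000         203.0.113.7:40312       ESTABLISHED");
        System.out.println(backedUp(sample, 1024));
    }
}
```

A Send-Q that stays high across several samples for the same client suggests the data is stuck between the server's kernel and that client rather than inside the Java program.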