Thanks for the replies, guys.
My goal here is to understand why the spikes in the graph are regular and cyclical, and then make them stop. The failures come in at a slow trickle (5 or so at a time). I understand that TCP connections fail for one reason or another, but there is obviously something afoot because of their regularity. I suspect a cron job somewhere, a load balancer misconfiguration, or a monitoring device as the culprit.
I've decided to take the UTSL approach, and here's where I'm at so far:
netstat -s gets its statistics from /proc/net/snmp.
Great. So the kernel updates an SNMP counter every time there is a failed connection attempt.
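For anyone who wants to watch the same number without going through netstat, here is a minimal sketch in C that pulls AttemptFails out of /proc/net/snmp. It assumes the usual layout of that file: a "Tcp:" line of field names followed by a "Tcp:" line of values in the same order. The function names (next_token, snmp_field, read_attempt_fails) are my own, not anything from the kernel or net-tools.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Skip leading whitespace in *p, return the start of the next token and
 * its length, and advance *p past it. Returns NULL when no token is left. */
const char *next_token(const char **p, size_t *len)
{
    const char *s = *p + strspn(*p, " \t\n");
    *len = strcspn(s, " \t\n");
    *p = s + *len;
    return *len ? s : NULL;
}

/* Walk a "Tcp: Name1 Name2 ..." header line and the matching
 * "Tcp: v1 v2 ..." value line in lockstep; when the header token equals
 * `name`, store the value at the same position. Returns 0 on success. */
int snmp_field(const char *header, const char *values,
               const char *name, long long *out)
{
    size_t hlen, vlen, nlen;
    const char *h, *v;

    assert(header && values && name && out);
    nlen = strlen(name);
    while ((h = next_token(&header, &hlen)) != NULL &&
           (v = next_token(&values, &vlen)) != NULL) {
        if (hlen == nlen && strncmp(h, name, nlen) == 0) {
            *out = strtoll(v, NULL, 10);
            return 0;
        }
    }
    return -1;
}

/* Read the two "Tcp:" lines from an snmp-style file (normally
 * /proc/net/snmp) and extract AttemptFails. Returns 0 on success. */
int read_attempt_fails(const char *path, long long *out)
{
    char header[1024] = "", line[1024];
    FILE *f = fopen(path, "r");
    int rc = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "Tcp:", 4) != 0)
            continue;
        if (header[0] == '\0') {
            strcpy(header, line);   /* first Tcp: line holds the names */
        } else {
            rc = snmp_field(header, line, "AttemptFails", out);
            break;
        }
    }
    fclose(f);
    return rc;
}
```

Calling read_attempt_fails("/proc/net/snmp", &n) once a second and printing a timestamp whenever n changes should at least pin down exactly when each spike happens, even though it says nothing about which IP was involved.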
What I'd like to accomplish is, every time there is a failed connection, to not only update this counter but also log "failed connection attempt from IP $foo at $timestamp".
So... what qualifies as a failed connection attempt?
downloads kernel source
greps around
concludes that the counter I'm looking for is TCP_MIB_ATTEMPTFAILS
Grepping through the 2.6.18 source tree, I've found two places where this is referenced:
./net/ipv4/tcp_minisocks.c:
struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
                           struct request_sock *req,
                           struct request_sock **prev)
{
    /* ... */
    if (flg & (TCP_FLAG_RST|TCP_FLAG_SYN)) {
        TCP_INC_STATS_BH(TCP_MIB_ATTEMPTFAILS);
        goto embryonic_reset;
    }
    /* ... */
}
... and include/net/tcp.h:
static inline void tcp_done(struct sock *sk)
{
    if(sk->sk_state == TCP_SYN_SENT || sk->sk_state == TCP_SYN_RECV)
        TCP_INC_STATS_BH(TCP_MIB_ATTEMPTFAILS);

    tcp_set_state(sk, TCP_CLOSE);
    tcp_clear_xmit_timers(sk);

    sk->sk_shutdown = SHUTDOWN_MASK;

    if (!sock_flag(sk, SOCK_DEAD))
        sk->sk_state_change(sk);
    else
        inet_csk_destroy_sock(sk);
}
I'm a bit surprised that these are the only two conditions that generate a counter update, but whatever.
The first block seems pretty obvious. "If you see a TCP packet with both the RST and SYN flags set, update the TCP_MIB_ATTEMPTFAILS counter and reset the connection".
In an attempt to catch these, I put in the following iptables rules:
iptables -A INPUT -p tcp -m tcp --tcp-flags SYN,RST SYN,RST -j LOG
iptables -A OUTPUT -p tcp -m tcp --tcp-flags SYN,RST SYN,RST -j LOG
waits ….. nothing. =(
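One thing worth double-checking before trusting that silence: in C, `flg & (A|B)` is nonzero when either bit is set, not only when both are. So the kernel test would fire on a packet carrying RST or SYN, while `--tcp-flags SYN,RST SYN,RST` only matches packets carrying both at once, which should be rare. A minimal sketch of the difference, using illustrative flag constants of my own (not the kernel's TCP_FLAG_* definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits as they appear in the TCP header's flags byte.
 * These are NOT the kernel's TCP_FLAG_* macros, which are masks over a
 * 32-bit header word. */
#define FLAG_SYN 0x02u
#define FLAG_RST 0x04u
#define FLAG_ACK 0x10u

static_assert((FLAG_SYN & FLAG_RST) == 0, "flags must be distinct bits");

/* Same shape as the kernel test: true if RST or SYN (or both) is set. */
bool either_set(uint32_t flg)
{
    return (flg & (FLAG_RST | FLAG_SYN)) != 0;
}

/* What "--tcp-flags SYN,RST SYN,RST" asks for: both bits set at once. */
bool both_set(uint32_t flg)
{
    return (flg & (FLAG_RST | FLAG_SYN)) == (FLAG_RST | FLAG_SYN);
}
```

A plain RST satisfies either_set() but not both_set(), which might explain why the rules above never log anything. Note that simply loosening the rules to match any SYN or any RST would log far more than failed attempts, since the kernel check only runs for connections that are still being set up.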
The second block is a little more mysterious, but I'm going to guess and assume that this means "update the TCP_MIB_ATTEMPTFAILS counter if the socket is in a certain state when tearing it down"
I have no idea how to test for this.
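Reading the tcp_done() condition literally, the counter is bumped whenever a socket is torn down while still in SYN_SENT or SYN_RECV. SYN_SENT is the state of an outgoing connect() that has not completed, so as far as I can tell the counter may include outbound attempts from this box that fail, not just inbound ones. A small sketch of that predicate, with a local enum standing in for the kernel's state constants (the names and values here are illustrative only):

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the kernel's socket states; illustrative only. */
enum sketch_state {
    ST_ESTABLISHED = 1,
    ST_SYN_SENT,    /* we sent a SYN: outgoing connect() in progress */
    ST_SYN_RECV,    /* we received a SYN: incoming handshake in progress */
    ST_CLOSE
};

/* Mirrors the condition in tcp_done(): closing a socket that never left
 * the handshake states counts as a failed attempt. */
bool counts_as_attempt_fail(enum sketch_state st)
{
    assert(st >= ST_ESTABLISHED && st <= ST_CLOSE);
    return st == ST_SYN_SENT || st == ST_SYN_RECV;
}
```

If that reading is right, a local process (a cron job or monitoring agent on this host) repeatedly trying to connect to something that refuses or times out would produce the same regular spikes as a remote client would.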
My questions are now:
1) Am I misinterpreting this graph somehow?
2) Am I on the right track?
3) Short of writing a kernel module (which I'm nowhere near skilled enough to do), how can I accomplish my logging goal?
Thanks
-s