
I've got a Linux server running CentOS 7, and I've been facing an issue where the RX dropped packet count keeps increasing. I believe it may be caused by a faulty cable/connector in the network; however, I have no means of checking every connection to find the problem. Is there any way to see from which IP are the dropped packets coming?

1 Answer


Using nmap you can get a variety of information about your network, and it can be incredibly useful for tracking packet loss. The --packet-trace option prints every packet that nmap sends and receives. While there are a great number of examples and guides for using nmap available online, a simple way to start would be something like:

nmap -sP --packet-trace 192.168.1.0/24

This will scan the network for hosts and trace the packets. In the case of a timeout, I got something like this on my local network:

 nsock_read(): Read request from IOD #1 [192.168.1.42:53] (timeout: -1ms) EID 66

The nmap call can be customized and scripted in a variety of ways to get more information about your network and try to weed out bad connections.
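
As one possible way to script this, here is a minimal Python sketch (not part of the original answer) that runs a host-discovery scan and collects the hosts that responded. It assumes nmap is installed and on the PATH, that your version accepts `-sn` (the newer spelling of `-sP`), and that the grepable output (`-oG -`) contains lines of the form `Host: <ip> (<name>)` followed by a tab and `Status: Up`. The function names are illustrative only.

```python
import re
import subprocess

# Matches grepable-output lines such as "Host: 192.168.1.42 (printer)\tStatus: Up".
HOST_UP = re.compile(r"^Host:\s+(\S+)\s+\(.*?\)\s+Status:\s+Up\s*$")


def parse_hosts_up(grepable_output):
    """Return the addresses reported as 'Up' in nmap grepable (-oG) output."""
    hosts = []
    for line in grepable_output.splitlines():
        match = HOST_UP.match(line)
        if match:
            hosts.append(match.group(1))
    return hosts


def scan_hosts_up(network):
    """Run an nmap host-discovery scan and return the hosts that answered."""
    result = subprocess.run(
        ["nmap", "-sn", "-oG", "-", network],
        capture_output=True, text=True, check=True,
    )
    return parse_hosts_up(result.stdout)
```

Calling `scan_hosts_up("192.168.1.0/24")` during business hours would then give a list of live hosts to feed into further checks. Keeping the parsing separate from the nmap call makes it easy to test against saved output.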

In addition to nmap, there is a graphical tool called Wireshark that can be used for a whole host of network profiling tasks, including finding bad links and sources of dropped packets.

Depending on the switches and networking hardware used, as well as your level of access, you may be able to use the built-in management interface on the switches in your environment to look for physical ports with abnormally high rates of dropped packets. With certain brands of manageable switches this sort of task could probably even be scripted or automated; the exact semantics of such an operation would depend on the specific network hardware and configuration. In the right sort of environment, Simple Network Management Protocol (SNMP) may be a good way to gather all of this information (dropped packets on physical switch ports, etc.) from your network without too much work.
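
To illustrate the SNMP idea, the following Python sketch (an assumption-laden example, not something from the original answer) compares two snapshots of per-port discard counters and ranks the ports by how much they grew. It assumes you have already collected the counters into dictionaries yourself, for instance by walking `IF-MIB::ifInDiscards` on the switch at two points in time; the collection step is not shown, and the port names and function names are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # IF-MIB ifInDiscards is a 32-bit wrapping counter


def discard_deltas(before, after):
    """Return {port: discards since 'before'} for ports present in both samples."""
    deltas = {}
    for port, old_value in before.items():
        if port in after:
            # Modular subtraction tolerates a single counter wrap-around.
            deltas[port] = (after[port] - old_value) % COUNTER32_MODULUS
    return deltas


def worst_ports(before, after, top=5):
    """Return (port, delta) pairs with non-zero deltas, largest first."""
    deltas = discard_deltas(before, after)
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(port, delta) for port, delta in ranked if delta > 0][:top]
```

Ranking by the change between two samples, rather than by the absolute counter value, highlights ports that are dropping packets now instead of ports that merely have a long uptime.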

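EDIT: In the above example, EID is most likely the event ID that nmap's nsock library assigns to the read request (IOD being the I/O descriptor it was issued on). It should not be confused with the DNS Resource Record (RR) type EID, which is an abbreviation for Endpoint IDentifier.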

Matt
  • I will give `nmap` a try tomorrow (the packet dropping only happens during business hours). Thank you for your help! – Gustavo Silva Jul 13 '15 at 23:08
  • If you have access to a large percentage of the machines during non-business hours and would be able to either reboot them over the network into a RAM-only image or install nmap on a large number of systems, it would be pretty simple to write scripts to generate and monitor huge amounts of traffic on the network. If that is the sort of thing you would be interested in trying or would find valuable, I can modify my answer with more details. – Matt Jul 13 '15 at 23:13
  • Actually, I have root access to the server, so it is possible to change any config or install anything; however I have no access to the client machines. That is why I am looking for a server-side only approach. – Gustavo Silva Jul 13 '15 at 23:19
  • Then running `nmap` and/or other network profiling tools when the traffic is occurring is probably your best bet. If you have admin/physical access to the switches/networking equipment, there is a chance you would be able to get some useful information from those. A lot of networking equipment, even entry level, has some sort of management interface that can report on the physical ports, which can be a great tool for finding bad cables, redundant connections, etc. – Matt Jul 13 '15 at 23:24
  • I'm not 100% sure that the dropped packets are coming from bad wiring. It may also be caused by some server-side issue, such as a small RX queue. I increased the queue size by about 10 times; if the dropping stops, then that was the cause. That is, however, not a really clever way to discover the root cause; do you know any other? – Gustavo Silva Jul 13 '15 at 23:43
  • It depends on the amount of time and resources you are willing to invest, as well as the willingness of the network users to assist or at the very least tolerate potential service disruptions. If you have administrator access to all of the networking equipment (switches, routers, etc.) as well as access to the servers running certain network services, you could work out what the issue is. I am not certain how clever it would be considered, but if you control your network you can become a "global passive observer" of your network and start monitoring all network traffic until you find the source. – Matt Jul 14 '15 at 16:44
  • So I gave `nmap` a try, and after a quick scan I got this: `NSOCK (0.0630s) Read request from IOD #1 [192.1.1.1:53] (timeout: -1ms) EID 34`. What exactly does it mean? What is the EID? – Gustavo Silva Jul 15 '15 at 16:51
  • See the edited post above. Typically when I have seen this, it has been in response to `nmap` hitting one of the DNS servers on a local network, though there could be other causes. – Matt Jul 15 '15 at 17:04
  • My next approach will be to use `nmap` just to list the online hosts, then use `ping` to send a number of packets to each host and check for timeouts. Do you believe it may work? I guess if the problem is hardware, then the protocol or port doesn't matter, and an ICMP packet should behave the same as a packet sent by the applications used daily by the client machines. – Gustavo Silva Jul 16 '15 at 01:41
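
The plan in the last comment (ping each online host and look for loss) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes the Linux iputils `ping` with its usual summary line (`N packets transmitted, M received, ...`), and the function names and packet count are made up for the example.

```python
import re
import subprocess

# Matches the summary line printed by ping, e.g.
# "10 packets transmitted, 8 received, 20% packet loss, time 9012ms".
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_loss(ping_output):
    """Return the lost fraction (0.0 to 1.0) from ping's summary line, or None."""
    match = SUMMARY.search(ping_output)
    if match is None:
        return None
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        return None
    # Duplicate replies can push 'received' above 'sent'; clamp at zero loss.
    return max(sent - received, 0) / sent


def ping_loss(host, count=20):
    """Ping a host 'count' times and return the observed loss fraction."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-q", host],
        capture_output=True, text=True,
    )
    return parse_loss(result.stdout)
```

Looping `ping_loss` over the hosts found by the scan and sorting by the returned fraction would point at the clients with the lossiest paths to the server, which is a reasonable place to start checking cables.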