On a server that is heavily used for file downloads, the machine stops responding to SSH, HTTP, and ping requests every few hours. It returns to normal after a restart.
The provider's technician guesses that it could be due to a network failure. How can I investigate and possibly resolve this problem?
Here are the last lines of the dmesg log. The server has been restarted twice in the last 24 hours.
[ 7.266682] ioatdma 0000:00:16.0: setting latency timer to 64
[ 7.266726] alloc irq_desc for 65 on node -1
[ 7.266728] alloc kstat_irqs on node -1
[ 7.266731] alloc irq_2_iommu on node -1
[ 7.266736] ioatdma 0000:00:16.0: irq 65 for MSI/MSI-X
[ 7.266879] ioatdma 0000:00:16.1: enabling device (0000 -> 0002)
[ 7.266882] alloc irq_desc for 44 on node -1
[ 7.266883] alloc kstat_irqs on node -1
[ 7.266886] alloc irq_2_iommu on node -1
[ 7.266891] ioatdma 0000:00:16.1: PCI INT B -> GSI 44 (level, low) -> IRQ 44
[ 7.266902] ioatdma 0000:00:16.1: setting latency timer to 64
[ 7.266936] alloc irq_desc for 66 on node -1
[ 7.266938] alloc kstat_irqs on node -1
[ 7.266940] alloc irq_2_iommu on node -1
[ 7.266944] ioatdma 0000:00:16.1: irq 66 for MSI/MSI-X
[ 7.267097] ioatdma 0000:00:16.2: enabling device (0000 -> 0002)
[ 7.267101] alloc irq_desc for 45 on node -1
[ 7.267103] alloc kstat_irqs on node -1
[ 7.267107] alloc irq_2_iommu on node -1
[ 7.267113] ioatdma 0000:00:16.2: PCI INT C -> GSI 45 (level, low) -> IRQ 45
[ 7.267126] ioatdma 0000:00:16.2: setting latency timer to 64
[ 7.267162] alloc irq_desc for 67 on node -1
[ 7.267163] alloc kstat_irqs on node -1
[ 7.267165] alloc irq_2_iommu on node -1
[ 7.267170] ioatdma 0000:00:16.2: irq 67 for MSI/MSI-X
[ 7.267307] ioatdma 0000:00:16.3: enabling device (0000 -> 0002)
[ 7.267312] alloc irq_desc for 46 on node -1
[ 7.267314] alloc kstat_irqs on node -1
[ 7.267317] alloc irq_2_iommu on node -1
[ 7.267324] ioatdma 0000:00:16.3: PCI INT D -> GSI 46 (level, low) -> IRQ 46
[ 7.267339] ioatdma 0000:00:16.3: setting latency timer to 64
[ 7.267383] alloc irq_desc for 68 on node -1
[ 7.267386] alloc kstat_irqs on node -1
[ 7.267389] alloc irq_2_iommu on node -1
[ 7.267395] ioatdma 0000:00:16.3: irq 68 for MSI/MSI-X
[ 7.267527] ioatdma 0000:00:16.4: enabling device (0000 -> 0002)
[ 7.267531] ioatdma 0000:00:16.4: PCI INT A -> GSI 43 (level, low) -> IRQ 43
[ 7.267543] ioatdma 0000:00:16.4: setting latency timer to 64
[ 7.267587] alloc irq_desc for 69 on node -1
[ 7.267589] alloc kstat_irqs on node -1
[ 7.267593] alloc irq_2_iommu on node -1
[ 7.267599] ioatdma 0000:00:16.4: irq 69 for MSI/MSI-X
[ 7.267743] ioatdma 0000:00:16.5: enabling device (0000 -> 0002)
[ 7.267746] ioatdma 0000:00:16.5: PCI INT B -> GSI 44 (level, low) -> IRQ 44
[ 7.267759] ioatdma 0000:00:16.5: setting latency timer to 64
[ 7.267794] alloc irq_desc for 70 on node -1
[ 7.267796] alloc kstat_irqs on node -1
[ 7.267798] alloc irq_2_iommu on node -1
[ 7.267803] ioatdma 0000:00:16.5: irq 70 for MSI/MSI-X
[ 7.267950] ioatdma 0000:00:16.6: enabling device (0000 -> 0002)
[ 7.267955] ioatdma 0000:00:16.6: PCI INT C -> GSI 45 (level, low) -> IRQ 45
[ 7.267970] ioatdma 0000:00:16.6: setting latency timer to 64
[ 7.268012] alloc irq_desc for 71 on node -1
[ 7.268013] alloc kstat_irqs on node -1
[ 7.268016] alloc irq_2_iommu on node -1
[ 7.268021] ioatdma 0000:00:16.6: irq 71 for MSI/MSI-X
[ 7.268152] ioatdma 0000:00:16.7: enabling device (0000 -> 0002)
[ 7.268157] ioatdma 0000:00:16.7: PCI INT D -> GSI 46 (level, low) -> IRQ 46
[ 7.268173] ioatdma 0000:00:16.7: setting latency timer to 64
[ 7.268217] alloc irq_desc for 72 on node -1
[ 7.268219] alloc kstat_irqs on node -1
[ 7.268222] alloc irq_2_iommu on node -1
[ 7.268228] ioatdma 0000:00:16.7: irq 72 for MSI/MSI-X
[ 7.273295] i801_smbus 0000:00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[ 7.277431] Monitor-Mwait will be used to enter C-1 state
[ 7.277533] Monitor-Mwait will be used to enter C-2 state
[ 7.278051] Monitor-Mwait will be used to enter C-3 state
[ 7.278131] processor LNXCPU:00: registered as cooling_device0
[ 7.278197] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input7
[ 7.278226] ACPI: Power Button [PWRF]
[ 7.278892] processor LNXCPU:01: registered as cooling_device1
[ 7.279463] processor LNXCPU:02: registered as cooling_device2
[ 7.280028] processor LNXCPU:03: registered as cooling_device3
[ 7.280564] processor LNXCPU:04: registered as cooling_device4
[ 7.283535] processor LNXCPU:05: registered as cooling_device5
[ 7.284159] processor LNXCPU:06: registered as cooling_device6
[ 7.284768] processor LNXCPU:07: registered as cooling_device7
[ 7.285364] processor LNXCPU:08: registered as cooling_device8
[ 7.285879] processor LNXCPU:09: registered as cooling_device9
[ 7.286595] processor LNXCPU:0a: registered as cooling_device10
[ 7.287125] processor LNXCPU:0b: registered as cooling_device11
[ 7.287720] processor LNXCPU:0c: registered as cooling_device12
[ 7.288295] processor LNXCPU:0d: registered as cooling_device13
[ 7.288825] processor LNXCPU:0e: registered as cooling_device14
[ 7.289485] processor LNXCPU:0f: registered as cooling_device15
[ 7.290069] processor LNXCPU:10: registered as cooling_device16
[ 7.290675] processor LNXCPU:11: registered as cooling_device17
[ 7.296242] Error: Driver 'pcspkr' is already registered, aborting...
[ 7.299964] processor LNXCPU:12: registered as cooling_device18
[ 7.300702] processor LNXCPU:13: registered as cooling_device19
[ 7.301409] processor LNXCPU:14: registered as cooling_device20
[ 7.302091] processor LNXCPU:15: registered as cooling_device21
[ 7.302741] processor LNXCPU:16: registered as cooling_device22
[ 7.303410] processor LNXCPU:17: registered as cooling_device23
[ 7.447430] Adding 8787960k swap on /dev/md1. Priority:-1 extents:1 across:8787960k
[ 7.502237] loop: module loaded
[ 7.660050] EXT4-fs (sdd1): mounted filesystem with ordered data mode
[ 7.668827] EXT4-fs (sda3): mounted filesystem with ordered data mode
[ 7.669375] EXT4-fs (sdc): Unrecognized mount option "0" or missing value
[ 7.824669] ADDRCONF(NETDEV_UP): eth0: link is not ready
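
A minimal sketch for pulling the suspicious-looking lines out of a log like the one above. The keyword list and the name `suspicious_lines` are assumptions chosen for illustration, and a match does not necessarily point to the cause of the hang.

```python
import re

# Hypothetical keyword list (an assumption, not a standard); adjust as needed.
SUSPICIOUS = re.compile(r"error|fail|unrecognized|not ready|timed out", re.IGNORECASE)

def suspicious_lines(log_text):
    """Return the log lines that match any keyword in SUSPICIOUS."""
    return [line for line in log_text.splitlines() if SUSPICIOUS.search(line)]

# A few lines copied from the dmesg output above.
sample = """[ 7.296242] Error: Driver 'pcspkr' is already registered, aborting...
[ 7.299964] processor LNXCPU:12: registered as cooling_device18
[ 7.669375] EXT4-fs (sdc): Unrecognized mount option "0" or missing value
[ 7.824669] ADDRCONF(NETDEV_UP): eth0: link is not ready"""

for line in suspicious_lines(sample):
    print(line)
```

On the sample this prints the pcspkr, EXT4-fs (sdc), and eth0 lines and skips the cooling_device line.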