I have a production server that was working normally until 4pm yesterday, 12/16/2020. After that time it started rejecting incoming TCP connections, including connections made through localhost.

The server blocks all of these connections:

• MySQL
• Ping (Can't ping or be pinged by client but can ping sites like google)
• Tracert

Sometimes MySQL connections go through, but 95% of the time I get a 10060 timeout error. The server hosts a website and an API, both of which are still accessible remotely.

I've tried the following:

• Turning firewall off/on
• Restarting server
• Updating all available updates
• Scanning for malware
• Made sure port 3306 was listening
• Pinging the server from client

I have no idea why this happened. I believe it's not a firewall issue, but I can't think of anything else that would have changed. No one logged onto the server and the normal cron-jobs etc. don't modify anything that would be network related. Could it be the server provider?

EDIT I've enabled firewall logging and it is showing a lot of dropped UDP packets. Every TCP connection is received, however. A quick search shows RDP runs over TCP, which would explain why I can RDP into the server. So why is the server dropping UDP packets?

EDIT Server test results Client test results

  • Ping is UDP, so are you saying both TCP and UDP are not working? – Eric C. Singer Dec 18 '20 at 00:39
  • My first question would be: have you checked your TCP/IP settings? Do you have the right DNS servers, the right DHCP server (if applicable), and the right subnet mask and gateway? – Eric C. Singer Dec 18 '20 at 00:41
  • When you ping, is it by name or IP? – Eric C. Singer Dec 18 '20 at 00:41
  • Both tcp and udp are not working yes that's correct my mistake, I've tried pinging the domain name and the ip both timeout. dns gateway and subnet look correct, I can access the server with rdp so wouldn't that mean these settings are more or less correct anyway? Thanks – equallyhero Dec 18 '20 at 02:40
  • So rdp works? Are you sure there isn’t an AV that perhaps has a FW turned on besides the windows fw. – Eric C. Singer Dec 18 '20 at 03:29
  • The other thing I would do, as more of a diagnostic, is run a packet capture. Ping from something else to this box and see if you see the packets come in. Secondly, is this a VM or a physical box? If it's a VM, there are all kinds of other things that might be going on. – Eric C. Singer Dec 18 '20 at 03:34
  • Yes rdp works, and the only av is defender and it has always been on, turning it off does nothing. I tried pinging from a different network and machine and it was 100% packet loss again. It's a physical box server. I can connect to the sql server through sql workbench so it has to be allowing packets through no? A majority of the queries that are sent timeout however. I can access files on the server remotely just fine. – equallyhero Dec 18 '20 at 03:42
  • if your successful RDP is from a host on the same segment it will work even if all of the routing and switch settings are wrong because your local system will find the remote system by arp and that will reply before DNS even comes into play. Since you say it happened at a specific time, what is in the Windows Event log from that time period? rather than throwing things at it to see what sticks, you need to determine what change caused the issue. and then fix that. – Rowan Hawkins Dec 18 '20 at 08:18
  • My apologies, I'm not a system or network admin; the company I work for is small, so I do everything. I didn't know about the event log, obviously very handy. I looked through it and didn't see anything that stuck out. I looked into event ID 10016 but it doesn't seem related. I don't understand how it can refuse a connection coming from itself though, or what could cause that. – equallyhero Dec 18 '20 at 09:37
  • Try this, and update us with your finding in your post. Fire up powershell to run all these commands and report the output. Clean up as needed to protect private info. – Eric C. Singer Dec 18 '20 at 16:10
  • Are you running this server as a VMWare VM? If so, make sure you are running a recent version of VMWare tools - older versions have bugs contributing to port exhaustion. – R. StackUser Dec 18 '20 at 21:15
  • It's a physical server, not a vm. – equallyhero Dec 18 '20 at 22:47

1 Answer

I'm going to switch to the answer block, because honestly trying to do anything in that conversation block is a PITA.

If you don't have it already, download and install PowerShell 7.1 on the server you're having issues with.

In PowerShell (PWSH.EXE) run the following script blocks and report the output.

This will perform some tests on your IP config

$All_IPConfigs = Get-NetIPConfiguration | Where-Object {$null -ne $_.IPv4Address.IPAddress}

Foreach ($IPConfig in $All_IPConfigs)
    {
    Write-Host "###################################"
    Write-Host "Testing interface $($IPConfig.InterfaceAlias)"

    Write-Host "Testing ip $($IPConfig.IPv4Address.IPAddress)"
    $Test_Self = $null
    $Test_Self = Test-Connection -ComputerName $($IPConfig.IPv4Address.IPAddress) -Ping -Count 2 -Quiet -ErrorAction SilentlyContinue
    Write-Host "[Can ping self?]: $($Test_Self)"

    Foreach ($Gateway in $($IPConfig.IPv4DefaultGateway.NextHop))
        {
        Write-Host "Testing gateway $($Gateway)"
        $Test_Gateway = $null
        $Test_Gateway = Test-Connection -ComputerName $($Gateway) -Ping -Count 2 -Quiet -ErrorAction SilentlyContinue
        Write-Host "[Can ping gateway?]: $($Test_Gateway)"
        }

    

    Foreach ($DNS_Server in $($IPConfig.DNSServer.ServerAddresses))
        {
        Write-Host "Testing DNS IP $($DNS_Server)"
        $Test_DNS_Network = $null
        $Test_DNS_Network = Test-NetConnection -ComputerName $DNS_Server -Port 53 -ErrorAction SilentlyContinue
        
        # Ask this DNS server to resolve the local machine name and a public name
        $Test_Resolve_Self = $null
        $Test_Resolve_Self = Resolve-DnsName -Name "$($env:computername)" -Server $DNS_Server -Type A -ErrorAction SilentlyContinue | Select-Object -First 1

        $Test_Resolve_GMail = $null
        $Test_Resolve_GMail = Resolve-DnsName -Name "gmail.com" -Server $DNS_Server -Type A -ErrorAction SilentlyContinue | Select-Object -First 1

        # Note: with -Port, Test-NetConnection falls back to ping only when the TCP test fails,
        # so PingSucceeded can read False even when the server is reachable
        Write-Host "[Can ping DNS server?]: $($Test_DNS_Network.PingSucceeded)"
        Write-Host "[Can connect to DNS server TCP port 53?]: $($Test_DNS_Network.TcpTestSucceeded)"
        Write-Host "[Can resolve self?]: $($Test_Resolve_Self.IPAddress)"
        Write-Host "[Can resolve gmail?]: $($Test_Resolve_GMail.IPAddress)"
        }

    Write-Host "Testing Google DNS IP 8.8.8.8"
    $Test_Google_DNS = $null
    $Test_Google_DNS = Resolve-DnsName -Name "gmail.com" -Server "8.8.8.8" -Type A -ErrorAction SilentlyContinue | Select-Object -First 1
    Write-Host "[Can resolve gmail via Google DNS?]: $($Test_Google_DNS.IPAddress)"
    }
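
The script above only covers ping and DNS. Since the reported symptom is MySQL connections that time out most of the time, a repeated TCP connect test may help put a number on the failure rate. This is a minimal sketch, not part of the original script: the function name Test-TcpPortRepeated and its parameters are made up for illustration, and port 3306 is taken from the question.

```powershell
# Hypothetical helper: tries to open a TCP connection several times and counts the successes.
function Test-TcpPortRepeated {
    param(
        [Parameter(Mandatory)][string]$ComputerName,
        [int]$Port = 3306,        # MySQL port mentioned in the question
        [int]$Count = 20,         # number of connection attempts
        [int]$TimeoutMs = 3000    # how long to wait for each attempt
    )

    $succeeded = 0
    for ($i = 0; $i -lt $Count; $i++) {
        $client = [System.Net.Sockets.TcpClient]::new()
        try {
            $task = $client.ConnectAsync($ComputerName, $Port)
            # Wait() returns $false on timeout; a refused connection throws instead.
            if ($task.Wait($TimeoutMs) -and $client.Connected) { $succeeded++ }
        }
        catch {
            # Refused or unreachable: counted as a failure below.
        }
        finally {
            $client.Dispose()
        }
    }

    [pscustomobject]@{
        Target    = "$($ComputerName):$Port"
        Attempts  = $Count
        Succeeded = $succeeded
        Failed    = $Count - $succeeded
    }
}

# Example (replace the address with the server's IP):
# Test-TcpPortRepeated -ComputerName '127.0.0.1' -Port 3306 -Count 20
```

Running it once on the server against its own address and once from the client would show whether the failure rate differs between the two.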

IPConfig

Get-NetIPConfiguration   

All network TCP connections

Get-NetTCPConnection
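
The raw connection list can be long, so a per-state count may be easier to read and to paste back here. The sketch below groups whatever Get-NetTCPConnection returns by its State property. The function name Get-TcpStateSummary is made up for illustration, and reading a large TimeWait or Bound count as a hint of port exhaustion is a rule of thumb, not a definitive diagnosis.

```powershell
# Hypothetical helper: counts TCP connections per state.
function Get-TcpStateSummary {
    param(
        # Defaults to the live connection table; any objects with a State property also work.
        [object[]]$Connections = (Get-NetTCPConnection)
    )

    $Connections |
        Group-Object -Property State |
        Sort-Object -Property Count -Descending |
        Select-Object -Property Name, Count
}

# Example on the server:
# Get-TcpStateSummary
#
# To confirm something is listening on the MySQL port from the question:
# Get-NetTCPConnection -State Listen -LocalPort 3306
```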

Also, please run the same tests on the client that is failing to connect (except the all TCP connections one).
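
Since the question's edit mentions firewall logging showing many dropped UDP packets, it may also help to summarise the drops by protocol and destination port. This is a rough sketch that assumes the standard Windows Firewall log layout (date, time, action, protocol, src-ip, dst-ip, src-port, dst-port, ...). The function name Get-FirewallDropSummary is made up for illustration, and the log path in the usage comment is only the usual default, so adjust it to wherever logging was configured.

```powershell
# Hypothetical helper: counts DROP entries per protocol and destination port.
function Get-FirewallDropSummary {
    param(
        [string[]]$Lines   # raw lines from the firewall log
    )

    $Lines |
        Where-Object { $_ -and -not $_.StartsWith('#') } |   # skip blank and header lines
        ForEach-Object {
            $fields = $_ -split '\s+'
            # Assumed field order: date time action protocol src-ip dst-ip src-port dst-port ...
            if ($fields.Count -ge 8 -and $fields[2] -eq 'DROP') {
                [pscustomobject]@{ Protocol = $fields[3]; DestinationPort = $fields[7] }
            }
        } |
        Group-Object -Property Protocol, DestinationPort |
        Sort-Object -Property Count -Descending |
        Select-Object -Property Count, Name
}

# Example (default log location, may differ on your server):
# Get-FirewallDropSummary -Lines (Get-Content "$env:SystemRoot\System32\LogFiles\Firewall\pfirewall.log")
```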

Eric C. Singer
  • added the test results to the post – equallyhero Dec 18 '20 at 19:50
  • When you say you’ve enabled the fw, how did you do it? Try running the following command (as administrator): "netsh advfirewall set allprofiles state off" – Eric C. Singer Dec 18 '20 at 23:16
  • That will disable the fw in ALL profiles – Eric C. Singer Dec 18 '20 at 23:16
  • Via the firewall utility. Running that command turned off the firewall as expected, still no connectivity change and the problems still exist. At this point I think it makes sense to wipe it and reinstall the os. I have no idea what happened to block these connections but as it's a production server it can't be down for several days while I figure it out unfortunately. I appreciate the advice and hope someone in the future finds this question/answer helpful. – equallyhero Dec 19 '20 at 01:16