
I am getting a lot of errors like the ones below:

read tcp xx.xx.xx.xx:80: use of closed network connection

read tcp xx.xx.xx.xx:80: connection reset by peer

// function for making the HTTP request, with separate connection and read timeouts

func GetResponseBytesByURL_raw(restUrl, connectionTimeOutStr, readTimeOutStr string) ([]byte, error) {
    connectionTimeOut, err := time.ParseDuration(connectionTimeOutStr)
    if err != nil {
        return nil, err
    }
    readTimeOut, err := time.ParseDuration(readTimeOutStr)
    if err != nil {
        return nil, err
    }
    // Client.Timeout covers the whole exchange: dial, request, and body read.
    timeout := connectionTimeOut + readTimeOut
    client := http.Client{
        Timeout: timeout,
    }
    resp, err := client.Get(restUrl)
    if err != nil {
        logger.SetLog("Error GetResponseBytesByURL_raw |err: ", logs.LevelError, err)
        return nil, err
    }
    defer resp.Body.Close()
    return ioutil.ReadAll(resp.Body)
}

Update (July 14):

Server : NumCPU=8, RAM=24GB, GO=go1.4.2.linux-amd64

I am getting these errors during high traffic: 20,000-30,000 requests per minute, and I have a time frame of 500ms to fetch the response from a third-party API.

netstat state frequencies from my server (using: netstat -nat | awk '{print $6}' | sort | uniq -c | sort -n):

      1 established)
      1 Foreign
      9 LISTEN
     33 FIN_WAIT1
    338 ESTABLISHED
   5530 SYN_SENT
  32202 TIME_WAIT

sysctl -p
fs.file-max = 2097152
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 65536
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv6.bindv6only = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
error: "net.ipv4.icmp_ignore_bogus_error_messages" is an unknown key
kernel.exec-shield = 1
kernel.randomize_va_space = 1
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
  • Looks like the server is closing the connections without proper content length headers. Did you check the headers and content returned? – Not_a_Golfer Jul 13 '15 at 14:23
  • What is your rate of request/roughly how much data are you trying to pull per minute? I've only ever received responses like that as a matter of hodge podge rate limiting... – evanmcdonnal Jul 13 '15 at 16:25
  • As the others have said, this is because the server is closing the connection. Other possible reasons I've encountered: the server has a timeout that consistently closes the connection at about the same time you're making the next request, or it's a poor reverse-proxy/server combination that is incorrectly closing requests without a `Connection: close` header. – JimB Jul 13 '15 at 19:01
  • @evanmcdonnal added details in questions – niraj.nijju Jul 14 '15 at 06:09
  • @JimB I think same, but tcp_keepalive_time = 300 & tcp_keepalive_time = 300 – niraj.nijju Jul 14 '15 at 06:10
  • `tcp_keepalive_time` is irrelevant. That just is the default interval for keepalive probes (plus the default http.Transport sets a 30s keepalive to begin with). – JimB Jul 14 '15 at 14:19

2 Answers


When making connections at a high rate over the internet, it's very likely you're going to encounter some connection problems. You can't mitigate them completely, so you may want to add retry logic around the request. The actual error type at this point probably doesn't matter, but matching the error string for use of closed network connection or connection reset by peer is about the best you can do if you want to be specific. Make sure to limit the retries with a backoff, as some systems will drop or reset connections as a way to limit request rates, and you may get more errors the faster you reconnect.

Depending on the number of remote hosts you're communicating with, you will want to increase Transport.MaxIdleConnsPerHost (the default is only 2). The fewer hosts you talk to, the higher you can set this. This will decrease the number of new connections made, and speed up the requests overall.

If you can, try the go1.5 beta. There have been a couple changes around keep-alive connections that may help reduce the number of errors you see.

  • I am not sure, but I think you are suggesting something like: body,err := GetResponseBytesByURL_raw( url, t1, t2) ; if nil != err{body,err = GetResponseBytesByURL_raw( url, t1, t2)} – niraj.nijju Jul 15 '15 at 12:54
  • @niraj.nijju: I'm not sure what that's doing, but it might be a simple way to test a retry. – JimB Jul 15 '15 at 13:19
  • I looked for MaxIdleConnsPerHost and I should use it, but I am confused with method, please suggest a way http://play.golang.org/p/4yVkuqS2iU – niraj.nijju Jul 16 '15 at 12:49
  • @niraj.nijju: There's no reason to create a new Client and Transport each time. Not only does it waste resources, but you won't be able to reuse your connections at all. In your second example, it looks like you're initializing `client` twice for some reason (though the syntax is wrong). For your Transport, look at the configuration of http.DefaultTransport; you probably want a similar Dialer and TLSHandshakeTimeout. – JimB Jul 16 '15 at 13:35
  • 1
    Good news, this has been addressed in go 1.16. https://tip.golang.org/doc/go1.16#net – Public Profile Dec 18 '20 at 22:03

I recommend implementing an exponential back-off or some other rate-limiting mechanism on your side of the wire. There's not really anything you can do about those errors, and using exponential back-off won't necessarily make you get the data any faster, but it can ensure that you get all the data, and the API you're pulling from will surely appreciate the reduced traffic. Here's a link to one I found on GitHub: https://github.com/cenkalti/backoff

There was another popular option as well, though I haven't used either. Implementing one yourself isn't terribly difficult, and I could provide a sample on request. One thing I do recommend, based on my experience: make sure your retry function has an abort channel. If you get to really long back-off times, you'll want some way for the caller to kill it.

  • 1
    As I have to return response within 500ms otherwise my client will not care about my response and took decision on it's own, so I can't use large back off slots, but surely use retry if failed within 200ms. (This may increase load during peak, but let me try it.) – niraj.nijju Jul 16 '15 at 13:04