2

After I've found leaks in my program, I've solved the problem. However, now I'm trying to find out how to " test " leaking connections in a Go test? This is my question.

I've tried to change the number of requests in my test, didn't matter. No matter what I do, the current number of TCP connections in my test stay the same.

func TestLeakingConnections(t *testing.T) {
    getter := myhttp.New()

    s := newServer(ok)
    defer s.Close()

    cur := tcps(t)
    for i := 0; i < 1000; i++ {
        r, _ := getter.GetWithTimeout(s.URL, time.Millisecond*10)
        r.Body.Close()
    }

    for tries := 10; tries >= 0; tries-- {
        growth := tcps(t) - cur
        if growth > 5 {
            t.Error("leaked")
            return
        }
    }
}

// find tcp connections
func tcps(t *testing.T) (conns int) {
    lsof, err := exec.Command("lsof", "-n", "-p", strconv.Itoa(os.Getpid())).Output()
    if err != nil {
        t.Skip("skipping test; error finding or running lsof")
    }

    for _, ls := range strings.Split(string(lsof), "\n") {
        if strings.Contains(ls, "TCP") {
            conns++
        }
    }
    return
}

func newServer(f http.HandlerFunc) *httptest.Server {
    return httptest.NewServer(http.HandlerFunc(f))
}

func ok(w http.ResponseWriter, r *http.Request) {
    w.Header().Add("Content-Type", "application/xml")
    io.WriteString(w, "<xml></xml>")
}

// myhttp package

// ...other code omitted for clarification

func (g *Getter) GetWithTimeout(
    url string,
    timeout time.Duration,
) (
    *http.Response, error,
) {
    // this is the leaking part
    // moving this out of here will stop leaks
    transport := http.Transport{
        DialContext: (&net.Dialer{
            Timeout: timeout,
        }).DialContext,
        TLSHandshakeTimeout:   timeout,
        ResponseHeaderTimeout: timeout,
        ExpectContinueTimeout: timeout,
    }

    client := http.Client{
        Timeout:   timeout,
        Transport: &transport,
    }

    return client.Get(url)
}

// fixture worker package

// some outside code injects getter into fixture_worker like this:
getter := myhttp.New()

// NewWithTimeout creates a new fetcher with timeout threshold
func NewWithTimeout(
    getter myhttp.HTTPGetter,
    fetchURL string,
    timeout time.Duration,
) *Fetcher {
    return &Fetcher{getter, fetchURL, timeout}
}

// Fetch fetches fixture xml
func (f *Fetcher) Fetch() (*parser.FixtureXML, error) {
    res, err := f.getter.GetWithTimeout(f.fetchURL, f.timeout)
    if err != nil {
        if res != nil && res.Body != nil {
            res.Body.Close()
        }
        return nil, errors.Wrap(err, ErrFetch.Error())
    }
    defer res.Body.Close()

    ioutil.ReadAll(res.Body)

    return &parser.FixtureXML{}, nil
}

leaking conns gif


Output of fixture worker lsof: https://pastebin.com/fDUkpYsE

Output of test: https://pastebin.com/uGpK0czn

Test never leaks whereas fixture worker it leaks.

Fixture worker is using the same code as the test, to request http gets using myhttp package.

Inanc Gumus
  • 25,195
  • 9
  • 85
  • 101
  • You're not reading the body at all, and the http client is saving you by throwing away the connections entirely (that's not guaranteed behavior, and can't be relied upon). `GetWithTimeout` is also breaking the documented pattern of closing the body, by adding an error value when there is a legitimate response. This means the caller has no way to know if the body requires being close or not on an error without somehow inspecting for `ErrStatus`, possibly causing more leaks. I don't understand the reason for the `dialFunc`, and you should use `DialContext` now anyway. – JimB Aug 18 '17 at 02:11

1 Answers1

0

If you want your test to represent reality, you need to use it in the same manner that you do outside of tests. In this case you're not reading any of the responses, and the transport happens to be preventing lost connections because it's discarding them as quickly as possible since they can't be reused.

Reading the response will use the connection, and get it into a state where it's "leaked". You also need to properly handle errors in all cases, and you always need to Close() the response body. The pattern to handle an http response and make sure it's closed is very simple, and doesn't necessarily require testing (see What could happen if I don't close response.Body in golang?)

    resp, err := GetWithTimeout(s.URL)
    if err != nil {
        t.Fatal(err)
    }
    ioutil.ReadAll(resp.Body)
    resp.Body.Close()

This is arguably of limited usefulness, since the most common bugs would result from improper error and response handling, and you're not testing that because the test needs to do it correctly in the first place.

The remaining problem here is that your GetWithTimeout method returns an error value and a valid http response, which contradicts the http package documentation as well as most user's expectations. If you're going to insert an error, it would be better to also handle the response at the same point to ensure the body is closed and discarded.

Finally, most of GetWithTimeout is superfluous, since not only is creating Transports every time incorrect, but creating an http.Client every request is usually unnecessary as they are meant to be reused and are safe for concurrent use.

JimB
  • 104,193
  • 13
  • 262
  • 255
  • @InancGumus: The `lsof` check you're doing works OK (though you may want to filter things like TIME_WAIT, and only look for client connections), and fixing the code in your tests will show the expected results. What other behavior are you trying to test? – JimB Aug 18 '17 at 13:53
  • What is the output of the `fixture_worker` `lsof`? You're counting total lines there, rather than just those with `TCP`, and that isn't the same process either. What is `fixture_worker`? – JimB Aug 18 '17 at 14:38
  • 1
    If `fixture_worker` is executing the exact same code, with the exact same version of Go, you should see the same results. As I've stated, you need to actually read the response body to get consistent results. If `fixture_worker` actually does something, it must be reading the response which you're not doing in the unit test. – JimB Aug 18 '17 at 14:55
  • @InancGumus: re: Fetcher.Fetch, don't call `resp.Body.Close` if there was an error. There's no guarantee that `resp.Body` isn't nil. Just follow the documentation and handle the body in all non-error cases. – JimB Aug 18 '17 at 14:57
  • The point was that you don't need to call `resp.Body.Close` at if there's an error, ever, it's documented as such and it's simply making the code more confusing because someone needs to think about _why_ you would ever do that. It's more code for no reason. – JimB Aug 18 '17 at 15:04
  • So you're saying that the code in `// ... omitted` somehow converts the request to a `*parser.FixtureXML` without reading the http response in any way? – JimB Aug 18 '17 at 15:08
  • 1
    `TestLeakingConnections` still does not read the response body. _That_ is the difference I pointed out first, and the reason the connections can be dropped immediately. – JimB Aug 18 '17 at 15:15