
I have a Docker swarm on three nodes plus an external MySQL service (outside of the swarm). I am building micro-services with an API Gateway in Go and gRPC. I have two problems.

The first problem: when I push an update to a service with docker service update --image ... myservice, my API Gateway gets the error transport is closing from each micro-service for the first three requests to the gateway. (I assume each task needs to reconnect to each service?) How can I fix this? Each service has update-delay set to 30s and update-parallelism set to 1. Shouldn't the API gateway stay connected to each service during a rolling update?

The second problem: after some idle time (I'm not sure how long) I get the same issue as above: the connections to the services are closed, and I have to make three requests to the API gateway before it works. Any help appreciated. I am running Ubuntu 16.04 with Docker version 17.12.0-ce, build c97c6d6, and am building with go version go1.9.4 darwin/amd64.

A service example:

    // listen on address
    lis, err := net.Listen(network, address)
    if err != nil {
        log.Fatalf("failed to listen: %v", err)
    }
    defer lis.Close()

    // create users server
    s := grpc.NewServer()
    pb.RegisterUsersServer(s, &server{})

    // register reflection service on gRPC server.
    reflection.Register(s)

    // log a startup message after one second; it only appears if Serve has not already failed
    go func() {
        time.Sleep(time.Second)
        log.Printf("started users service on %s", address)
    }()
    if err := s.Serve(lis); err != nil {
        log.Fatalf("failed to serve: %v", err)
    }

My API gateway connection:

// connect to users
usersConnection, err := grpc.Dial(usersAddress, grpc.WithInsecure())
if err != nil {
    log.Fatalf("did not connect: %v", err)
}
users = userspb.NewUsersClient(usersConnection)
log.Println("connected to users")
Trevor V
  • When you do a service update it will delete each task (container) in that service and replace it with a new container. Wouldn't you see the connection close as each task stops? – Bret Fisher Mar 01 '18 at 18:43
  • On connection timeouts, I'm not sure by your description which connection you are talking about. Is it the go-to-mysql connection? – Bret Fisher Mar 01 '18 at 18:45
  • I close the DB and other connections in a signal interrupt handler before exiting. It does close properly on my dev computer. When swarm deletes the task, does it stop it without sending the interrupt? – Trevor V Mar 01 '18 at 20:05
  • Swarm treats container startup/shutdown the same as docker run. It sends `SIGTERM` to the container and if the container takes longer than 10s to shut down, it will then send `SIGKILL`. You can have it wait longer with the `--stop-grace-period` option on service create. – Bret Fisher Mar 01 '18 at 22:16
  • I did confirm that the connections are closed before shutdown, but I still have the same issue when idle. – Trevor V Mar 02 '18 at 06:19
  • Could the default 15-minute timeout on idle connections in the Linux IPVS feature be the issue? https://github.com/moby/moby/issues/32195#issuecomment-315143802 – Bret Fisher Mar 02 '18 at 16:08
  • I confirm this problem. There is no problem on docker for mac. But I have 2 linux machines and just 1 machine has this problem. Still looking for the solution. – noomz May 15 '18 at 14:51
  • @noomz What I did is have the client dial the service on app startup and then create a new client from that dial connection on each API request. That seemed to work. – Trevor V May 15 '18 at 14:55
  • @TrevorVarwig Creating a new connection on every request makes my req/s drop. So in this case I create a new connection only once, when the service starts. – noomz May 25 '18 at 11:59
  • @TrevorVarwig Ah, sorry, I misunderstood your text. Yes, I do the same as what you do, but mine is unstable. – noomz May 28 '18 at 08:13

0 Answers