I have a service that I want to run on preemptible instances on Google Cloud Platform. The instances will live behind a load balancer. Most requests take < 10 seconds to handle.

I can't modify the service itself, but there is an nginx instance on the same image that sits in front of the service that I can configure.

As far as I can see an ACPI soft shutdown signal is sent to the instance 30 seconds before it's shut down, at which point I'd like it to stop receiving requests.

I could create a shutdown script that reconfigures nginx to stop forwarding health checks to the service, and instead respond with a thumbs down itself, but this seems a bit hacky and I feel like there should be a better way. (It also feels a bit wrong to say the service is not healthy – it just wants to be taken out of the pool.)

What would be the appropriate way of telling the load balancer to stop sending requests to this instance, so it can (hopefully) fulfil its current requests and then shut down without having received any new requests in the meantime?

                             ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
                           ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─  │
                         ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─  │
                          Preemptible instance                  │   │
                         │                                        │
                                                                │   │
                         │                                        │
                                                                │   │
        ┌─────────┐      │   ┌─────────┐           ┌─────────┐    │
        │  Load   │          │         │  /health  │  some   │  │   │
   ────▶│balancer │──────┼──▶│  nginx  │──────────▶│ service │    │
        │         │          │         │   /api/…  │         │  │   │
        └─────────┘      │   └─────────┘           └─────────┘    │
                                                                │   │
                         │                                        │
                                                                │   │
                         │                                        │─
                                                                │─
                         └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
beta

1 Answer

This answer is more of a "what I would do" than a "this is the best solution", but it works if the health check responds quickly.

I understand the idea of cutting off new requests before the instance stops, so that in-flight requests aren't cut off midway and answered with a timeout. That's already what a graceful stop of a service does.

A service in graceful stop no longer accepts new requests but waits for its in-flight requests to finish before exiting. Create a dependency between the nginx service and the other services so that they only stop after nginx has stopped.

That way, if your load balancer checks every second whether the server is up and the server answers right away, then when Google sends the signal to stop the instance, the LB will remove it from its list of possible targets almost immediately, the in-flight requests will finish normally, and the server should then stop cleanly. You should lose close to no requests.
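
To get a feel for how much the check frequency matters, here is a rough back-of-the-envelope sketch in Python. It takes the 30-second shutdown window and the roughly 10-second request time from the question. The function name `drain_budget` and the `unhealthy_threshold` parameter (how many consecutive failed checks the load balancer needs before it removes a backend) are my own assumptions for illustration. The estimate ignores probe timeouts and any propagation delay inside the load balancer, so treat it as a rough upper bound, not as a guarantee.

```python
def drain_budget(check_interval_s, unhealthy_threshold,
                 grace_s=30.0, max_request_s=10.0):
    """Estimate the slack left in the shutdown window.

    check_interval_s    -- seconds between health checks (assumed setting)
    unhealthy_threshold -- consecutive failures before removal (assumed setting)
    grace_s             -- shutdown window, 30 s per the question
    max_request_s       -- longest typical request, about 10 s per the question

    Returns (detection_s, slack_s). A negative slack means requests that
    arrive just before the instance is removed may not finish in time.
    """
    # Rough worst case: every one of the required failed checks is a full
    # interval apart, counted from the moment the shutdown signal arrives.
    detection_s = check_interval_s * unhealthy_threshold
    # The last request accepted before removal still needs time to finish.
    slack_s = grace_s - detection_s - max_request_s
    return detection_s, slack_s


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show the trend.
    for interval, threshold in [(1, 1), (1, 2), (5, 2), (10, 3)]:
        detection, slack = drain_budget(interval, threshold)
        verdict = "fits" if slack >= 0 else "too slow"
        print(f"interval={interval}s threshold={threshold}: "
              f"removed after ~{detection}s, slack {slack}s ({verdict})")
```

With a one-second interval the instance is out of rotation within a couple of seconds, which leaves plenty of room for a 10-second request. With slow checks and a high threshold the window can be used up before the load balancer even notices.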

night-gold