Autoscaling nginx in Kubernetes
Kubernetes is an amazing tool: it has allowed me to scale my infrastructure dynamically and reduce costs dramatically over the past 9 months. The cluster autoscaler, together with the horizontal pod autoscalers, provides dynamic provisioning that ensures only the necessary resources are provisioned at any given moment.
But sometimes there are caveats... And this time it was nginx.
The problem was that every time the Kubernetes HPA scales down, the pods being removed are sent SIGTERM, and when nginx receives this signal it shuts down immediately, terminating all established connections.
The most popular fix I could find online was to run a preStop command:
lifecycle:
  preStop:
    exec:
      command: ["/usr/sbin/nginx", "-s", "quit"]
This is not ideal: the command is non-blocking, and as soon as it returns Kubernetes sends SIGTERM right away, killing the nginx process that is still trying to shut down gracefully.
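To see why a non-blocking quit command is a problem, here is a small simulation that needs neither nginx nor a cluster. The worker below is a hypothetical stand-in for nginx, and SIGUSR1 stands in for the graceful-quit signal; both are assumptions made for the sake of the sketch. The point is only that sending a signal returns at once, while the process that received it is still busy draining.

```shell
#!/bin/sh
# Hypothetical stand-in for a server that needs time to drain:
# on USR1 it "drains" for one second and then exits.
sh -c 'trap "sleep 1; exit 0" USR1; while :; do sleep 0.1; done' &
worker_pid=$!

# Give the worker a moment to install its trap.
sleep 0.5

# Sending the signal returns immediately, much like `nginx -s quit`.
kill -USR1 "$worker_pid"

# kill -0 sends nothing; it only checks whether the pid still exists.
if kill -0 "$worker_pid" 2> /dev/null; then
  still_draining=yes
else
  still_draining=no
fi
echo "worker still draining right after the signal: $still_draining"

# Let the worker finish so nothing is left running.
wait "$worker_pid"
```

If the preStop hook returns at the point where the signal has just been sent, the SIGTERM that follows arrives while the worker is in exactly this draining state.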
Another common suggestion is to raise the pod's termination grace period:
terminationGracePeriodSeconds: 60
As described by the pod spec, this is not really an option either, since the grace period is the time between SIGTERM and SIGKILL:
(...) The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. (...)
By the time the grace period matters, nginx has already shut down forcefully on SIGTERM, so a longer one changes nothing. The ideal option is to send SIGQUIT to nginx and give it some time to quit gracefully.
Therefore, my solution was to write a simple script that I add to /usr/local/bin on my nginx images:
#!/bin/bash
# quit.sh

echo "Quitting nginx!"

# Time to sleep (in seconds) between checks
POLLING_TIME=.2

# Get nginx's pid before we try to kill it
NGINX_PID=$(cat /var/run/nginx.pid)

# Gracefully quit nginx
nginx -s quit

# kill -0 sends no signal; it only checks that the process still exists
while kill -0 "$NGINX_PID" 2> /dev/null; do
  echo "waiting for nginx to quit..."
  sleep "$POLLING_TIME"
done

echo "nginx is dead, bye!"
and run it as a preStop command:
lifecycle:
  preStop:
    exec:
      command: ["quit.sh"]
This way, the preStop hook blocks until the nginx process exits, so we can scale down as fast as possible without tearing open connections apart.
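One thing to keep in mind is that the time spent in the preStop hook counts against the pod's termination grace period, so a wait loop with no upper bound can still be cut short by a kill signal. A possible variation, sketched below, is to give the loop its own deadline so the script can at least report that it gave up. The helper name wait_for_exit and its arguments are hypothetical, not part of nginx or Kubernetes, and the fractional sleep assumes a sleep implementation that accepts decimals (GNU coreutils and most BusyBox builds do).

```shell
#!/bin/sh
# Hypothetical helper: wait until a process exits, or give up after a timeout.
# Usage: wait_for_exit PID TIMEOUT_SECONDS
# Returns 0 if the process exited in time, 1 if it was still running.
wait_for_exit() {
  pid=$1
  # Polling every 0.2s means 5 checks per second of timeout.
  remaining_checks=$(( $2 * 5 ))

  # kill -0 sends no signal; it only checks whether the pid still exists.
  while kill -0 "$pid" 2> /dev/null; do
    if [ "$remaining_checks" -le 0 ]; then
      return 1
    fi
    remaining_checks=$(( remaining_checks - 1 ))
    sleep 0.2
  done
  return 0
}
```

In quit.sh the polling loop could then become something like `wait_for_exit "$NGINX_PID" 50 || echo "nginx did not quit in time"`, with the timeout chosen to be a little shorter than the pod's terminationGracePeriodSeconds.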