NGINX as an HTTP Load Balancer

Quick Intro

You've probably used NGINX as a WebServer, and you might be wondering how it works as a load balancer.

We don't need a different NGINX file or to install a different package to make it a load balancer.

NGINX works as a load balancer out of the box, as long as we configure it to act as one.

In this article, we're going to introduce the NGINX HTTP load balancer.

If you need more details, please refer to the official NGINX documentation.

NGINX also supports TCP and UDP load balancing as well as health monitors, but these are not covered here.

NGINX as a WebServer by default

Once NGINX is installed, it works as a WebServer by default:

The response above came from our client-facing NGINX Load Balancer-to-be, which is currently just a WebServer.

Let's make it an HTTP load balancer, shall we?

NGINX as a Load Balancer

Disable WebServer

Let's comment out the default file in /etc/nginx/sites-enabled to disable our local WebServer, just in case:

Then I reload the config:

If we try to connect now, it won't work because we disabled the default page:

Now, we're ready to create the load balancer's file!

Creating Load Balancer's file

Essentially, the default NGINX config file (/etc/nginx/nginx.conf) already has an http block which references the /etc/nginx/conf.d directory.
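As a rough illustration, the relevant part of a Debian-packaged /etc/nginx/nginx.conf typically looks like the excerpt below. The exact contents vary by distribution and version, so treat this as a sketch rather than a copy of the file used in this lab.

```nginx
# Excerpt of /etc/nginx/nginx.conf (Debian-style layout; other settings omitted)
http {
    # Files placed in conf.d are pulled into the http context
    include /etc/nginx/conf.d/*.conf;

    # Debian packages also include the Apache-like sites-enabled directory
    include /etc/nginx/sites-enabled/*;
}
```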

With that in mind, we can simply create our file in /etc/nginx/conf.d/ and whatever is in there will be in the http context:

Within the upstream directive, we add our backend servers along with the port they're listening on.

These are the servers the NGINX load balancer will forward requests to.

Lastly, we create a server config for our listener with proxy_pass pointing to the upstream name (backends).
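Putting these pieces together, a minimal sketch of such a file could look like this. The file name, the backend IP addresses and the ports are placeholders assumed for illustration; only the upstream name backends comes from the description above.

```nginx
# /etc/nginx/conf.d/loadbalancer.conf  (hypothetical file name)

# Pool of backend servers that requests are forwarded to
upstream backends {
    server 10.0.0.11:80;   # server 1 (placeholder address)
    server 10.0.0.12:80;   # server 2 (placeholder address)
    server 10.0.0.13:80;   # server 3 (placeholder address)
}

# Client-facing listener
server {
    listen 80;

    location / {
        # Forward every request to the upstream group named "backends"
        proxy_pass http://backends;
    }
}
```

With no balancing method specified inside the upstream block, NGINX falls back to round robin.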

We now reload NGINX again:

Lab tests

NGINX supports several load balancing methods, and round robin (optionally weighted) is the default one.

Round robin test

I tried the first time:

Second time:

Third time:

Fourth time it goes back to server 1:

That's because the default load balancing method for NGINX is round robin.

Weight test

Now I've added weight=2 to server 2, and I expect it to take twice as many requests as each of the other servers:
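A sketch of that change, assuming the same placeholder backend addresses (they are not the ones from the original lab):

```nginx
upstream backends {
    server 10.0.0.11:80;            # server 1, default weight=1
    server 10.0.0.12:80 weight=2;   # server 2 gets roughly twice the share
    server 10.0.0.13:80;            # server 3, default weight=1
}
```

With these weights, out of every four requests two should land on server 2 and one on each of the others.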

Once again I reload the configuration after the changes:

First request goes to Server 3:

Next one to Server 2:

Then again to Server 2:

And lastly Server 1:

Administratively shutting down a server

We can also administratively shut down server 2 for maintenance, for example, by adding the down keyword:
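For example, a sketch with placeholder addresses assumed for the three backends:

```nginx
upstream backends {
    server 10.0.0.11:80;
    server 10.0.0.12:80 down;   # server 2 is taken out of rotation
    server 10.0.0.13:80;
}
```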

When we issue requests now, they only go to server 1 or 3:

Other Load Balancing methods

Least connections

Least connections sends requests to the server with the fewest active connections, taking into account any configured weights:
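A minimal sketch of how this method is enabled, assuming placeholder backend addresses:

```nginx
upstream backends {
    least_conn;             # pick the server with the fewest active connections

    server 10.0.0.11:80;
    server 10.0.0.12:80;
    server 10.0.0.13:80;
}
```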

If there's a tie, round robin is used, taking into account the optionally configured weights.

For more information on the least connections method, see the least_conn directive in the official NGINX documentation.

IP Hash

The server is chosen by a hash calculated from the first three octets of the client's IPv4 address, or from the whole IPv6 address.

This method makes sure requests from the same client are always sent to the same server, unless it's unavailable.

Note: If a server has to be taken out temporarily, marking it as down (rather than removing it) preserves the current hashing of client IP addresses for when it comes back up.
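A minimal sketch of this method, assuming placeholder backend addresses and showing one server marked down so that the hashing is preserved:

```nginx
upstream backends {
    ip_hash;                    # same client address -> same backend

    server 10.0.0.11:80;
    server 10.0.0.12:80 down;   # temporarily out, hash mapping is kept
    server 10.0.0.13:80;
}
```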

For more information on the IP hash method, see the ip_hash directive in the official NGINX documentation.

Custom or Generic Hash

We can also define a key for the hash itself.

In the example below, the key is the variable $request_uri, which represents the URI present in the HTTP request sent by the client:
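A minimal sketch, assuming placeholder backend addresses:

```nginx
upstream backends {
    hash $request_uri;      # requests for the same URI go to the same backend

    server 10.0.0.11:80;
    server 10.0.0.12:80;
    server 10.0.0.13:80;
}
```

The hash directive also accepts an optional consistent parameter, which limits how many keys are remapped when servers are added to or removed from the group.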

Least Time (NGINX Plus only)

This method is not supported in the free version of NGINX. With it, NGINX Plus picks the server with the lowest average latency and the lowest number of active connections.

The lowest average latency is calculated based on an argument passed to least_time with the following values:

  • header: time it took to receive the response header from the server
  • last_byte: time it took to receive the full response from the server
  • last_byte inflight: time it took to receive the full response from the server, taking incomplete requests into account
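A minimal sketch of the last option, assuming placeholder backend addresses. This only loads on NGINX Plus.

```nginx
upstream backends {
    # NGINX Plus only: lowest average time to the full response,
    # with incomplete (in-flight) requests included in the calculation
    least_time last_byte inflight;

    server 10.0.0.11:80;
    server 10.0.0.12:80;
    server 10.0.0.13:80;
}
```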

In the above example, NGINX Plus makes its load balancing decision based on the time it took to receive the full HTTP response from our servers, and also includes incomplete requests in the calculation.

Note that /etc/nginx/conf.d/ is the default directory for NGINX config files. In my case, because I installed NGINX directly from the Debian APT repository, the sites-enabled directory was added as well. Some Linux distributions add it to make things easier for those who are used to Apache.

For more information on the least time method, see the least_time directive in the official NGINX documentation.


Random

In this method, each request is passed to a randomly selected server. It is best suited to environments where multiple load balancers pass requests to the same set of backend servers.

Note that the least_time parameter of this method is only available in NGINX Plus.
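A minimal sketch using the optional two parameter, which picks two servers at random and then chooses between them with the given method. The backend addresses are assumed placeholders, and least_conn is used here because it is available in the free version.

```nginx
upstream backends {
    # Pick two servers at random, then prefer the one with fewer
    # active connections (least_time=... here would require NGINX Plus)
    random two least_conn;

    server 10.0.0.11:80;
    server 10.0.0.12:80;
    server 10.0.0.13:80;
}
```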

For more details about this method, see the random directive in the official NGINX documentation.

Published Apr 09, 2020
Version 1.0
