This chapter describes how to use NGINX and NGINX Plus as a load balancer.
Load balancing across multiple application instances is a commonly used technique for optimizing resource utilization, maximizing throughput, reducing latency, and ensuring fault-tolerant configurations.
Watch the NGINX Load Balancing Software Webinar on demand for a deep dive on techniques that NGINX users employ to build large-scale, highly available web services.
NGINX can be used in different deployment scenarios as a very efficient HTTP load balancer.
To start using NGINX with a group of servers, you first need to define the group with the upstream directive, which is placed in the http context.
Servers in the group are configured using the server directive (not to be confused with the server block that defines a virtual server running on NGINX). For example, the following configuration defines a group named backend that consists of three server configurations (which may resolve to more than three actual servers):
http {
    upstream backend {
        server backend1.example.com weight=5;
        server backend2.example.com;
        server 192.0.0.1 backup;
    }
}
To pass requests to a server group, the name of the group is specified in the proxy_pass directive (or the fastcgi_pass, memcached_pass, uwsgi_pass, or scgi_pass directive, depending on the protocol). In the next example, a virtual server running on NGINX passes all requests to the backend server group defined in the previous example:
server {
    location / {
        proxy_pass http://backend;
    }
}
The following example combines the two snippets above and shows requests being proxied to the backend server group. The group consists of three servers: two of them run instances of the same application, while the third is a backup server. NGINX applies HTTP load balancing to distribute the requests:
http {
    upstream backend {
        server backend1.example.com;
        server backend2.example.com;
        server 192.0.0.1 backup;
    }

    server {
        location / {
            proxy_pass http://backend;
        }
    }
}
NGINX supports four load balancing methods, and NGINX Plus supports an additional fifth method:
The round-robin method: requests are distributed evenly across the servers with server weights taken into consideration. This method is used by default (there is no directive for enabling it):
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
}
The least_conn method: a request is sent to the server with the least number of active connections with server weights taken into consideration:
upstream backend {
    least_conn;
    server backend1.example.com;
    server backend2.example.com;
}
The ip_hash method: the server to which a request is sent is determined from the client IP address. In this case, either the first three octets of the IPv4 address or the whole IPv6 address are used to calculate the hash value. The method guarantees that requests from the same address get to the same server unless that server is unavailable:
upstream backend {
    ip_hash;
    server backend1.example.com;
    server backend2.example.com;
}
If one of the servers needs to be temporarily removed, it can be marked with the down parameter in order to preserve the current hashing of client IP addresses. Requests that were to be processed by this server are automatically sent to the next server in the group:
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    server backend3.example.com down;
}
The generic hash method: the server to which a request is sent is determined from a user-defined key, which can be a text string, a variable, or a combination of the two. For example, the key may be a paired source IP address and port, or a URI:
upstream backend {
    hash $request_uri consistent;
    server backend1.example.com;
    server backend2.example.com;
}
The optional consistent parameter of the hash directive enables ketama consistent-hash load balancing. Requests are evenly distributed across all upstream servers based on the user-defined hashed key value. If an upstream server is added to or removed from an upstream group, only a few keys are remapped, which minimizes cache misses in the case of load balancing cache servers and other applications that accumulate state.
The least_time method (NGINX Plus): for each request, NGINX Plus selects the server with the lowest average latency and the least number of active connections, where the lowest average latency is calculated based on which of the following parameters is included in the least_time directive:
header – time to receive the first byte from the server
last_byte – time to receive the full response from the server
upstream backend {
    least_time header;
    server backend1.example.com;
    server backend2.example.com;
}
Note: When configuring any method other than round-robin, put the corresponding directive (least_conn, ip_hash, hash, least_time) above the list of server directives in the upstream block.
By default, NGINX distributes requests among the servers in the group according to their weights using the round-robin method. The weight parameter of the server directive sets the weight of a server; by default, it is 1:
upstream backend {
    server backend1.example.com weight=5;
    server backend2.example.com;
    server 192.0.0.1 backup;
}
In the example, backend1.example.com has weight 5; the other two servers have the default weight (1), but the one with IP address 192.0.0.1 is marked as a backup server and does not receive requests unless both of the other servers are unavailable. With this configuration of weights, out of every six requests, five are sent to backend1.example.com and one to backend2.example.com.
The server slow start feature prevents a recently recovered server from being overwhelmed by connections, which may time out and cause the server to be marked as failed again.
In NGINX Plus, slow start allows an upstream server to gradually recover its weight from zero to its nominal value after it has been recovered or become available. This can be done with the slow_start parameter of the server directive:
upstream backend {
    server backend1.example.com slow_start=30s;
    server backend2.example.com;
    server 192.0.0.1 backup;
}
The time value sets the duration during which the server recovers its weight.
Note that if there is only a single server in a group, the max_fails, fail_timeout, and slow_start parameters are ignored and the server is never considered unavailable.
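For illustration, here is a minimal sketch of such a single-server group (the hostname is carried over from the earlier examples); the failure parameters below have no effect:
upstream backend {
    # With a single server in the group, max_fails and fail_timeout are ignored,
    # and the server is never marked unavailable.
    server backend1.example.com max_fails=3 fail_timeout=30s;
}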
Session persistence means that NGINX Plus identifies user sessions and routes the requests from a given session to the same upstream server.
NGINX Plus supports three session persistence methods. The methods are set with the sticky directive.
The sticky cookie method. With this method, NGINX Plus adds a session cookie to the first response from the upstream group, identifying the server that sent the response. When the client issues the next request, it contains the cookie value and NGINX Plus routes the request to the same upstream server:
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
In the example, the srv_id parameter sets the name of the cookie which will be set or inspected. The optional expires parameter sets the time for the browser to keep the cookie. The optional domain parameter defines the domain for which the cookie is set, and the optional path parameter defines the path for which the cookie is set. This is the simplest session persistence method.
The sticky route method. With this method, NGINX Plus assigns a “route” to the client when it receives the first request. All subsequent requests are compared with the route parameter of the server directive to identify the server to which requests are proxied. The route information is taken from either a cookie or the URI:
upstream backend {
    server backend1.example.com route=a;
    server backend2.example.com route=b;
    sticky route $route_cookie $route_uri;
}
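The $route_cookie and $route_uri variables are assumed to be defined elsewhere in the configuration, typically with map blocks in the http context that extract the route from a cookie or from the request URI. A sketch under that assumption (the jsessionid cookie name and the patterns are illustrative):
map $cookie_jsessionid $route_cookie {
    # Take the part of the cookie value after the last dot as the route
    ~.+\.(?P<route>\w+)$ $route;
}
map $request_uri $route_uri {
    # Fall back to extracting the route from a jsessionid in the URI
    ~jsessionid=.+\.(?P<route>\w+)$ $route;
}
With these maps, a request whose session identifier ends in “.a” is routed to the server marked route=a.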
The cookie learn method. With this method, NGINX Plus first finds session identifiers by inspecting requests and responses. Then NGINX Plus “learns” which upstream server corresponds to which session identifier. Generally, these identifiers are passed in an HTTP cookie. If a request contains a session identifier already “learned”, NGINX Plus forwards the request to the corresponding server:
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    sticky learn
        create=$upstream_cookie_examplecookie
        lookup=$cookie_examplecookie
        zone=client_sessions:1m
        timeout=1h;
}
In the example, one of the upstream servers creates a session by setting the cookie “EXAMPLECOOKIE” in the response.
The obligatory create parameter specifies a variable that indicates how a new session is created. In our example, new sessions are created from the cookie “EXAMPLECOOKIE” sent by the upstream server.
The obligatory lookup parameter specifies how to search for existing sessions. In our example, existing sessions are searched for in the cookie “EXAMPLECOOKIE” sent by the client.
The obligatory zone parameter specifies a shared memory zone where all information about sticky sessions is kept. In our example, the zone is named client_sessions and has a size of 1 megabyte.
This is a more sophisticated session persistence method as it does not require keeping any cookies on the client side: all info is kept server-side in the shared memory zone.
With NGINX Plus, it is possible to limit the number of connections to an upstream server by setting the connection limit with the max_conns parameter.
If the max_conns limit has been reached, a request can be placed in a queue for further processing, provided that the queue directive is specified. The directive sets the maximum number of requests that can be in the queue at the same time:
upstream backend {
    server backend1.example.com max_conns=3;
    server backend2.example.com;
    queue 100 timeout=70;
}
If the queue is filled up with requests, or the upstream server cannot be selected during the timeout specified by the optional timeout parameter, the client will receive an error.
Note that the max_conns limit will be ignored if there are idle keepalive connections opened in other worker processes. As a result, the total number of connections to the server may exceed the max_conns value in a configuration where the memory is shared with multiple worker processes.
When NGINX considers a server unavailable, it temporarily stops sending requests to that server until it is considered active again. The following parameters of the server directive configure the conditions under which a server is considered unavailable:
fail_timeout – sets the time during which the specified number of failed attempts must happen for the server to be considered unavailable, and also the time for which the server is then considered unavailable.
max_fails – sets the number of failed attempts that must happen during the fail_timeout period for the server to be considered unavailable.
The default values are 10 seconds and 1 attempt. So if NGINX fails to send a request to some server or does not receive a response from it at least once, it immediately considers the server unavailable for 10 seconds. The following example shows how to set these parameters:
upstream backend {
    server backend1.example.com;
    server backend2.example.com max_fails=3 fail_timeout=30s;
    server backend3.example.com max_fails=2;
}
The following are some more sophisticated features for tracking server availability, available in NGINX Plus.
NGINX Plus can monitor the availability of your servers by periodically sending special requests to each server and checking for a response that satisfies certain conditions.
To enable this type of health monitoring in your nginx.conf file, the location that passes requests to the group should include the health_check directive. In addition, the server group should be dynamically configurable with the zone directive:
http {
    upstream backend {
        zone backend 64k;
        server backend1.example.com;
        server backend2.example.com;
        server backend3.example.com;
        server backend4.example.com;
    }

    server {
        location / {
            proxy_pass http://backend;
            health_check;
        }
    }
}
This configuration defines a server group and a virtual server with a single location that passes all requests to the group. It also enables health monitoring with default parameters. In this case, every 5 seconds NGINX Plus sends a request for “/” to each server in the backend group. If any communication error or timeout occurs (or a proxied server responds with a status code other than 2xx or 3xx), the health check fails for that proxied server. Any server that fails a health check is considered unhealthy, and NGINX Plus stops sending client requests to it until it once again passes a health check.
The zone directive defines a memory zone that is shared among worker processes and is used to store the configuration of the server group. This enables the worker processes to use the same set of counters to keep track of responses from the servers in the group. The zone directive also makes the group dynamically configurable.
This behavior can be overridden using the parameters of the health_check directive:
location / {
    proxy_pass http://backend;
    health_check interval=10 fails=3 passes=2;
}
Here, the duration between two consecutive health checks has been increased to 10 seconds with the interval parameter. In addition, a server is considered unhealthy after 3 consecutive failed health checks, as set by the fails=3 parameter. Finally, with the passes parameter, a server needs to pass 2 consecutive checks to be considered healthy again.
It is possible to set a specific URI to request in a health check with the uri parameter:
location / {
    proxy_pass http://backend;
    health_check uri=/some/path;
}
The provided URI is appended to the server domain name or IP address set for the server in the upstream block. For example, for the first server in the backend group declared above, a health check request has the URI http://backend1.example.com/some/path.
Finally, it is possible to set custom conditions that a healthy response should satisfy. The conditions are specified in a match block, which is referenced by the match parameter of the health_check directive:
http {
    ...
    match server_ok {
        status 200-399;
        body !~ "maintenance mode";
    }

    server {
        ...
        location / {
            proxy_pass http://backend;
            health_check match=server_ok;
        }
    }
}
Here a health check is passed if the response has the status in the range from 200 to 399, and its body does not match the provided regular expression.
The match directive allows NGINX Plus to check the status, header fields, and body of a response. Using this directive it is possible to verify whether the status is in a specified range, whether a response includes a header, or whether the header or body matches a regular expression. The match directive can contain one status condition, one body condition, and multiple header conditions. To correspond to the match block, the response must satisfy all of the conditions specified within it.
For example, the following match directive looks for responses that have status code 200, contain the Content-Type header with the exact value text/html, and have the text “Welcome to nginx!” in the body:
match welcome {
    status 200;
    header Content-Type = text/html;
    body ~ "Welcome to nginx!";
}
In the following example using the exclamation point (!), the conditions match responses where the status code is anything other than 301, 302, 303, or 307, and Refresh is not among the headers:
match not_redirect {
    status ! 301-303 307;
    header ! Refresh;
}
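To apply this block, reference it from the health_check directive, as in the earlier example; a brief sketch reusing the backend group:
location / {
    proxy_pass http://backend;
    health_check match=not_redirect;
}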
Health checks can also be enabled for non-HTTP protocols, such as FastCGI, uwsgi, SCGI, and memcached.
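For example, a health check for a FastCGI upstream group is enabled with the same health_check directive in the location that passes requests to the group. A minimal sketch, assuming a group of FastCGI servers named fastcgi_backend:
location / {
    include fastcgi_params;
    fastcgi_pass fastcgi_backend;
    health_check;
}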
If an upstream block does not include the zone directive, each worker process keeps its own copy of the server group configuration and maintains its own set of related counters. The counters include the current number of connections to each server in the group and the number of failed attempts to pass a request to a server. As a result, the server group configuration cannot be changed.
If an upstream block does include the zone directive, the configuration of the server group is kept in a memory area shared among all worker processes. This scenario is dynamically configurable, because the worker processes access the same copy of the group configuration and utilize the same related counters.
The zone directive is mandatory for health checks and on-the-fly reconfiguration of the server group. However, other features of server groups can benefit from the use of this directive as well.
For example, if the configuration of a group is not shared, each worker process maintains its own counter for failed attempts to pass a request to a server (see the max_fails parameter). In this case, each request gets to only one worker process. When the worker process that is selected to process a request fails to transmit the request to a server, the other worker processes don’t know anything about it. While some worker process may consider a server unavailable, others may still send requests to this server. For a server to be definitively considered unavailable, the number of failed attempts within the timeframe set by fail_timeout must reach max_fails multiplied by the number of worker processes. The zone directive, on the other hand, guarantees the expected behavior.
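A brief sketch of sharing the group state, based on the failure-handling example above (the zone name and size are illustrative):
upstream backend {
    # Counters for max_fails and fail_timeout are now shared by all worker processes
    zone backend 64k;
    server backend1.example.com;
    server backend2.example.com max_fails=3 fail_timeout=30s;
}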
Similarly, the least_conn load balancing method might not work as expected without the zone directive, at least under low load. This method of TCP and HTTP load balancing passes a request to the server with the least number of active connections. Again, if the configuration of the group is not shared, each worker process uses its own counter for the number of connections, so one worker process may pass a request to a server at the same time as another worker process passes a request to the same server. This effect diminishes as the load grows: under high load, requests are distributed among worker processes evenly, and the least_conn method works as expected.
There are no exact recommended zone sizes, because usage patterns differ widely. Each feature, such as sticky cookie/route/learn session persistence, health checks, or re-resolving, affects the zone size.
For example, a 256 KB zone with the sticky route session persistence method and a single health check can hold information for only a limited number of upstream servers; the exact capacity depends on how the servers are defined.
The configuration of a server group can be modified at run time using DNS.
NGINX Plus can monitor changes of the IP addresses that correspond to a domain name of a server and automatically apply these changes to NGINX without a restart. This is done with the resolver directive, which must be specified in the http block, and the resolve parameter of the server directive in a server group:
http {
    resolver 10.0.0.1 valid=300s ipv6=off;
    resolver_timeout 10s;

    server {
        location / {
            proxy_pass http://backend;
        }
    }

    upstream backend {
        zone backend 32k;
        least_conn;
        ...
        server backend1.example.com resolve;
        server backend2.example.com resolve;
    }
}
In the example, the resolve parameter of the server directive tells NGINX Plus to periodically re-resolve backend1.example.com and backend2.example.com into IP addresses. By default, NGINX re-resolves DNS records based on their TTL, but the TTL value can be overridden with the valid parameter of the resolver directive; in our example, it is 5 minutes (300 seconds).
The optional ipv6=off parameter restricts resolution to IPv4 addresses, though both IPv4 and IPv6 resolution is supported.
If a domain name resolves to several IP addresses, the addresses are saved to the upstream configuration and load balanced. In our example, the servers are load balanced according to the least_conn method. If one or more IP addresses change or are added or removed, the servers are re-balanced.
In Release 7 and later, NGINX Plus can proxy Microsoft Exchange traffic to a server or a group of servers and load balance it.
To set up load balancing of Microsoft Exchange Servers:
1. In a location block, configure proxying to the upstream group of Microsoft Exchange servers with the proxy_pass directive:
location / {
    proxy_pass https://exchange;
    ...
}
2. In the location block, set the proxy_http_version directive value to 1.1, and set the proxy_set_header directive to clear the Connection header, just as for keepalive connections:
location / {
    ...
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    ...
}
3. In the http block, configure a load balancing group with Microsoft Exchange servers. Specify the upstream directive with the name of the server group previously specified in the proxy_pass directive. Then specify the ntlm directive to allow the group to accept requests with NTLM authentication:
http {
    ...
    upstream exchange {
        zone exchange 64k;
        ntlm;
        ...
    }
}
4. Add the Microsoft Exchange servers to the upstream group:
http {
    ...
    upstream exchange {
        zone exchange 64k;
        ntlm;
        server exchange1.example.com;
        server exchange2.example.com;
        ...
    }
}
The following complete example combines the steps above:
http {
    ...
    upstream exchange {
        zone exchange 64k;
        ntlm;
        server exchange1.example.com;
        server exchange2.example.com;
    }

    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/ssl/company.com.crt;
        ssl_certificate_key /etc/nginx/ssl/company.com.key;
        ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;

        location / {
            proxy_pass https://exchange;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
For more information about configuring Microsoft Exchange and NGINX Plus, see the Load Balance Microsoft Exchange Servers guide (PDF).
With NGINX Plus, the configuration of a server group can be modified on the fly using the HTTP interface. A configuration command can be used to view all servers or a particular server in a group, modify the parameters of a particular server, and add or remove servers.
First, specify the zone directive in the upstream block. The zone directive configures a zone in shared memory and sets the zone name and size. The configuration of the server group is kept in this zone, so all worker processes use the same configuration:
http {
    ...
    upstream appservers {
        zone appservers 64k;
        server appserv1.example.com weight=5;
        server appserv2.example.com:8080 fail_timeout=5s;
        server reserve1.example.com:8080 backup;
        server reserve2.example.com:8080 backup;
    }
}
Then, place the upstream_conf directive in a separate location:
server {
    location /upstream_conf {
        upstream_conf;
        ...
    }
}
It is highly recommended to restrict access to this location, for example, by allowing access only from the 127.0.0.1 address:
server {
    location /upstream_conf {
        upstream_conf;
        allow 127.0.0.1;
        deny all;
    }
}
A complete example of this configuration:
http {
    ...
    # Configuration of the server group
    upstream appservers {
        zone appservers 64k;
        server appserv1.example.com weight=5;
        server appserv2.example.com:8080 fail_timeout=5s;
        server reserve1.example.com:8080 backup;
        server reserve2.example.com:8080 backup;
    }

    server {
        # Location that proxies requests to the group
        location / {
            proxy_pass http://appservers;
            health_check;
        }

        # Location for configuration requests
        location /upstream_conf {
            upstream_conf;
            allow 127.0.0.1;
            deny all;
        }
    }
}
In the example, access to the second location is allowed only from the 127.0.0.1 IP address. Access from all other IP addresses is denied.
The configuration from the previous example stores on-the-fly changes only in shared memory. These changes are discarded whenever the NGINX Plus configuration file is reloaded.
To make these changes persist across configuration reloads, move the list of upstream servers from the upstream block to a special file that keeps the state of the upstream servers. The path to the file is set with the state directive. The recommended path for Linux distributions is /var/lib/nginx/state/, and for FreeBSD distributions, /var/db/nginx/state/:
http {
    ...
    upstream appservers {
        zone appservers 64k;
        state /var/lib/nginx/state/appservers.conf;

        # All these servers should be moved to the file using the upstream_conf API:
        # server appserv1.example.com weight=5;
        # server appserv2.example.com:8080 fail_timeout=5s;
        # server reserve1.example.com:8080 backup;
        # server reserve2.example.com:8080 backup;
    }
}
Keep in mind that this file can be modified only with configuration commands from the upstream_conf API interface; avoid modifying the file directly.
To pass a configuration command to NGINX, send an HTTP request. The request should have a URI that reaches the location containing the upstream_conf directive, and it should also include the upstream argument set to the name of the server group.
For example, to view all backup servers (marked with backup) in the group, send:
http://127.0.0.1/upstream_conf?upstream=appservers&backup=
To add a new server to the group, send a request with the add and server arguments:
http://127.0.0.1/upstream_conf?add=&upstream=appservers&server=appserv3.example.com:8080&weight=2&max_fails=3
To remove a server, send a request with the remove command and the id argument identifying the server:
http://127.0.0.1/upstream_conf?remove=&upstream=appservers&id=2
To modify a parameter of a specific server, send a request with the id argument identifying the server and the parameter to change:
http://127.0.0.1/upstream_conf?upstream=appservers&id=2&down=
See the upstream_conf module for more examples.