Load balancing is a common solution for distributing web applications horizontally across multiple hosts while providing the users with a single point of access to the service. HAProxy is one of the most popular open-source load-balancing applications, and it also offers high availability and proxy functionality.
HAProxy aims to optimise resource usage, maximise throughput, minimise response time, and avoid overloading any single resource. It is available for installation on many Linux distributions, such as Debian 8 used in this guide, as well as Ubuntu 16 and CentOS 7.
HAProxy is particularly suited for very high-traffic websites and is, therefore, often used to improve web service reliability and performance for multi-server configurations. This guide lays out the steps for setting up HAProxy as a load balancer on its own Debian 8 cloud host, which then directs the traffic to your web servers.
As a prerequisite for the best results, you should have a minimum of two web servers and a separate server for the load balancer. The web servers must be running at least a basic web service, such as Apache2 or nginx, so that you can test the load balancing between them.
Installing HAProxy 1.7
Because HAProxy is a fast-developing open-source application, the version available in the Debian default repositories might not be the latest release. To find out which version is offered through the official channels, enter the following command.
sudo aptitude show haproxy
HAProxy always maintains three active stable releases: the two latest versions in development plus a third, older version that still receives critical updates. You can always check the newest stable version listed on the HAProxy website and then decide which version you wish to go with.
While the latest stable version of HAProxy, 1.7, is not yet available through the package manager by default, it can be found in the backports repository. To install HAProxy from backports, you will need to add the source using the following command.
echo "deb http://httpredir.debian.org/debian jessie-backports main" | sudo tee /etc/apt/sources.list.d/backports.list
Next, update your sources list.
sudo aptitude update
Then, install HAProxy from the backport using the command below.
sudo aptitude install -t jessie-backports haproxy
Afterwards, you can double-check the installed version number with the following command.
haproxy -v
HA-Proxy version 1.7.5-2~bpo8+1 2017/05/27
Copyright 2000-2017 Willy Tarreau <willy@haproxy.org>
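If you script the installation, the version check can be automated as well. The following is only a minimal sketch: the function names haproxy_version and version_at_least are hypothetical helpers, not part of HAProxy, and the comparison assumes GNU sort with the -V option, as shipped on Debian.

```shell
#!/bin/sh
# Hypothetical helpers for checking the installed HAProxy version.

# Read `haproxy -v` output on stdin and print the bare version number,
# for example 1.7.5 for "HA-Proxy version 1.7.5-2~bpo8+1 2017/05/27".
haproxy_version() {
    sed -n 's/^HA-*Proxy version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 is the same as or newer than version $2.
# Relies on GNU sort -V (version sort), which is an assumption about the host.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

The two could be combined, for example, as: version_at_least "$(haproxy -v | haproxy_version)" 1.7 && echo "HAProxy 1.7 or newer is installed"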
The installation is then complete. Continue below with instructions on configuring the load balancer to redirect requests to your web servers.
Configuring the load balancer
Setting up HAProxy for load balancing is quite straightforward. All you need to do is tell HAProxy what kind of connections it should be listening for and where the connections should be relayed.
This is done by defining these settings in the configuration file /etc/haproxy/haproxy.cfg. You can read about the configuration options on the HAProxy documentation page if you wish to find out more.
Load balancing on layer 4
Once installed, HAProxy should have a template for configuring the load balancer. Open the configuration file, for example, using nano with the command underneath.
sudo nano /etc/haproxy/haproxy.cfg
Add the following sections to the end of the file. Replace each <server name> with whatever you want to call your servers on the statistics page and each <private IP> with the private IP of a server you wish to direct the web traffic to. You can check the private IPs in your UpCloud Control Panel, in the Private Network tab under the Network menu.
frontend http_front
   bind *:80
   stats uri /haproxy?stats
   default_backend http_back

backend http_back
   balance roundrobin
   server <server1 name> <private IP 1>:80 check
   server <server2 name> <private IP 2>:80 check
This defines a layer 4 load balancer with a frontend named http_front listening on port 80, which directs the traffic to the default backend named http_back. The additional stats uri /haproxy?stats line enables the statistics page at that address.
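With more than a couple of web servers, typing the server lines by hand becomes error-prone. As a rough sketch, the shell function below prints the same frontend and backend sections from a list of name=IP pairs. The function name render_lb_config and the example server names and addresses are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the frontend and backend sections shown above
# for any number of web servers given as name=private_ip arguments.
render_lb_config() {
    printf 'frontend http_front\n'
    printf '    bind *:80\n'
    printf '    stats uri /haproxy?stats\n'
    printf '    default_backend http_back\n'
    printf '\n'
    printf 'backend http_back\n'
    printf '    balance roundrobin\n'
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        printf '    server %s %s:80 check\n' "$name" "$ip"
    done
}

# Example call with placeholder names and addresses:
render_lb_config web1=10.0.0.11 web2=10.0.0.12
```

You could redirect the output to a scratch file, review it, and only then copy it to the end of /etc/haproxy/haproxy.cfg.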
Different load-balancing algorithms
Listing the servers in the backend section allows HAProxy to balance the load between them according to the roundrobin algorithm whenever they are available.
The balancing algorithms are used to decide which server each connection is transferred to at the backend. Some of the useful options include the following:
- Roundrobin: Each server is used in turns according to its weights. This is the smoothest and fairest algorithm when the server’s processing time remains equally distributed. This dynamic algorithm allows server weights to be adjusted on the fly.
- Leastconn: The server with the lowest number of connections is chosen. Round-robin is performed between servers with the same load. This algorithm is recommended for long sessions such as LDAP, SQL, TSE, etc., but it is not very well suited for short sessions such as HTTP.
- First: The first server with available connection slots receives the connection. The servers are chosen from the lowest numeric identifier to the highest, which defaults to the server’s position in the farm. Once a server reaches its maxconn value, the next server is used.
- Source: The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request. This way, the same client IP address will always reach the same server while the servers stay the same.
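To make the difference between the algorithms more concrete, the following toy functions mimic how roundrobin and source pick a server. This is only a simplified sketch: it ignores weights and health checks, it uses cksum as a stand-in for HAProxy's own hash function, and the function and server names are hypothetical.

```shell
#!/bin/sh
# Toy models of two balancing algorithms. Weights and health checks are ignored.

# roundrobin: request number N (counting from 0) goes to server N modulo
# the number of servers, so the servers take turns.
pick_round_robin() {
    request_number=$1
    shift
    shift $(( request_number % $# ))
    printf '%s\n' "$1"
}

# source: hash the client IP and map the hash onto the server list, so the
# same client keeps reaching the same server while the list is unchanged.
# cksum is only a stand-in here; HAProxy uses its own hash function.
pick_by_source() {
    client_ip=$1
    shift
    hash=$(printf '%s' "$client_ip" | cksum | cut -d ' ' -f 1)
    shift $(( hash % $# ))
    printf '%s\n' "$1"
}

pick_round_robin 0 web1 web2 web3    # prints web1
pick_round_robin 1 web1 web2 web3    # prints web2
pick_round_robin 3 web1 web2 web3    # prints web1
```

Calling pick_by_source repeatedly with the same address and server list returns the same server every time, which is the property that makes the source algorithm useful for keeping a client on one host.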
Configuring load balancing for layer 7
Another possibility is configuring the load balancer to work on layer 7, which is useful when parts of your web application are located on different hosts. This can be accomplished by making the choice of backend conditional on, for example, the URL path.
Open the HAProxy configuration file with a text editor.
sudo nano /etc/haproxy/haproxy.cfg
Then, set the front and backend segments according to the example below.
frontend http_front
   bind *:80
   stats uri /haproxy?stats
   acl url_blog path_beg /blog
   use_backend blog_back if url_blog
   default_backend http_back

backend http_back
   balance roundrobin
   server <server name> <private IP>:80 check
   server <server name> <private IP>:80 check

backend blog_back
   server <server name> <private IP>:80 check
The frontend declares an ACL rule named url_blog that applies to all connections with paths that begin with /blog. The use_backend line sends connections matching the url_blog condition to the backend named blog_back, while all other requests are handled by the default backend.
On the backend side, the configuration sets up two server groups: http_back like before and a new one called blog_back, which serves connections specifically to example.com/blog.
After making the changes, save the file and restart HAProxy with the next command.
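The routing decision itself is simple enough to model in a few lines of shell. The sketch below is purely illustrative, with a hypothetical function name; it shows that path_beg is a plain prefix match, so a path such as /blogger would also be sent to blog_back.

```shell
#!/bin/sh
# Toy model of the routing rule above: path_beg is a prefix match on the path.
choose_backend() {
    case "$1" in
        /blog*) echo blog_back ;;   # acl url_blog path_beg /blog
        *)      echo http_back ;;   # default_backend http_back
    esac
}

choose_backend /blog/first-post   # prints blog_back
choose_backend /about             # prints http_back
choose_backend /blogger           # prints blog_back, since it is a prefix match too
```

If that last case is not what you want, a stricter prefix such as /blog/ in the ACL would be one option to consider.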
sudo systemctl restart haproxy
If you get any errors or warnings at startup, check the configuration for typos and then try restarting again.
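HAProxy can also check a configuration file without starting the service, using its -c option together with -f. The wrapper below is a small sketch built on that, assuming the default configuration path used in this guide; the function name safe_restart is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: validate the configuration first and restart HAProxy
# only if the check passes. Uses the default path unless one is given.
safe_restart() {
    config=${1:-/etc/haproxy/haproxy.cfg}
    if haproxy -c -f "$config"; then
        sudo systemctl restart haproxy
    else
        echo "Configuration check failed; HAProxy was not restarted." >&2
        return 1
    fi
}
```

Running safe_restart instead of restarting directly keeps the currently running load balancer untouched when the edited file contains a mistake.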
Testing the setup
With HAProxy configured and running, open your load balancer server’s public IP in a web browser and check that you connect to your backend correctly. The stats uri parameter in the configuration enables the statistics page at the defined address.
http://<load balancer public IP>/haproxy?stats
When you load the statistics page and all your servers are listed in green, your configuration succeeded!
The statistics page contains helpful information to keep track of your web hosts, including up and down times and session counts. If a server is listed in red, check that it is powered on and that you can ping it from the load balancer machine.
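When several servers show up in red, pinging each of them by hand gets tedious. The following sketch loops over a list of private IPs from the load balancer machine. It assumes the Linux ping command with the -c (count) and -W (timeout in seconds) options, and the function name check_backends is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: ping each given IP once and report the result.
# Returns a non-zero status if at least one host did not answer.
check_backends() {
    failed=0
    for ip in "$@"; do
        if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip unreachable"
            failed=1
        fi
    done
    return "$failed"
}
```

Call it with the same private IPs you used on the server lines, for example check_backends <private IP 1> <private IP 2>. Keep in mind that a host can answer pings while its web service is down, so this only rules out network problems.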
If your load balancer does not reply, check that HTTP connections are not getting blocked by a firewall. Also, confirm that HAProxy is running using the command below.
sudo systemctl status haproxy
Password protecting the statistics page
However, having the statistics page on the frontend leaves it publicly open for anyone to view, which might not be such a good idea. Instead, you can move it to its own port number by adding the example below to the end of your haproxy.cfg file. Replace the username and password with something secure.
listen stats
   bind *:8181
   stats enable
   stats uri /
   stats realm Haproxy\ Statistics
   stats auth username:password
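For the password on the stats auth line, one option is to generate a random string instead of inventing one. This is a minimal sketch that assumes /dev/urandom and the standard od and tr utilities are available; the function name generate_password is hypothetical. It produces only hexadecimal characters, which avoids spaces and colons that could be misread in the configuration file.

```shell
#!/bin/sh
# Hypothetical helper: print a random hexadecimal password built from the
# given number of random bytes (12 bytes, so 24 characters, by default).
generate_password() {
    bytes=${1:-12}
    od -An -N"$bytes" -tx1 /dev/urandom | tr -d ' \n'
    echo
}
```

The output can be pasted in place of the password placeholder. Store it somewhere safe, since you will need it to log in to the statistics page.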
After adding the new listen section, remove the old stats uri line from the frontend section. When done, save the file and restart HAProxy again.
sudo systemctl restart haproxy
Then open the load balancer again with the new port number, and log in with the username and password you set in the configuration file.
http://<load balancer public IP>:8181
Check that your servers are still reporting all green, and then open just the load balancer IP without any port number in your web browser.
http://<load balancer public IP>/
If your backend servers have at least slightly different landing pages, you will notice that each time you reload the page, you get a reply from a different host. You can try out the different balancing algorithms described in the configuration section above or in the full documentation.
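Reloading the page by hand works, but the distribution is easier to see when the requests are counted. The sketch below assumes that curl is installed and that the landing pages of your servers differ on their first line; both function names are hypothetical.

```shell
#!/bin/sh
# Count how many times each distinct line appears on stdin and print
# "count line" pairs sorted by the line text.
tally_replies() {
    awk '{ seen[$0]++ } END { for (line in seen) print seen[line], line }' | sort -k 2
}

# Request the page COUNT times (10 by default) from URL and tally the first
# line of each reply. Assumes the landing pages differ on their first line.
sample_replies() {
    url=$1
    count=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        curl -s "$url" | head -n 1
        i=$(( i + 1 ))
    done | tally_replies
}
```

With the roundrobin algorithm and two healthy servers, running sample_replies against http://<load balancer public IP>/ should show roughly the same count for each host.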
Conclusions
Congratulations on successfully configuring HAProxy! A basic load balancer setup can considerably increase your web application performance and availability. This guide is, however, just an introduction to load balancing with HAProxy, which is capable of much more than what could be covered in first-time setup instructions. We recommend experimenting with different configurations with the help of the extensive documentation available for HAProxy before you start planning the load balancing for your production environment.
While using multiple hosts protects your web service with redundancy, the load balancer itself can still be a single point of failure. You can further improve high availability by setting up a floating IP between multiple load balancers. You can find out more about this in our article on floating IPs on UpCloud.