= What is load balancing?

:Author: Seth Kenlon

When the personal computer was still young, a household was likely to have one or fewer computers in it.
Children played games on the household computer during the day, and parents did accounting or programming or roamed through a BBS in the evening.
Imagine a one-computer household today, though, and you can predict the conflict it would create.
Everyone would want to use the computer at the same time, but there just wouldn't be enough keyboard and mouse to go around.
This is, more or less, the same scenario that's been happening to the IT industry as computers have become more and more ubiquitous: demand for services and servers has increased to the point that they could grind to a halt from overuse.
As a result, we now have the concept of load balancing.

== What is load balancing?

Load balancing is a generic term referring to anything you do to ensure the resources you manage are distributed efficiently.
For the systems administrator of a web server, load balancing usually means ensuring that the web server software (such as https://opensource.com/business/15/4/nginx-open-source-platform[Nginx]) is configured with enough worker nodes to handle a spike in incoming visitors.
In other words, should a site suddenly become very popular and its visitor count quadruple in a matter of minutes, the software running the server must be able to respond to each visitor without any single visitor noticing degradation in service.
For simple sites, this is as simple as a one-line configuration option, but for complex sites with dynamic content and several database queries for each user, it can be a serious problem.

This problem is supposed to have been solved thanks to cloud computing, but it's not impossible for a web app to fail to scale out when it experiences an unexpected surge.

The important thing to keep in mind when it comes to load balancing is that distributing resources _efficiently_ doesn't necessarily mean distributing them _evenly_.
Not all tasks require all available resources at all times.
A smart load balancing strategy provides resources to users and tasks only when those resources are required.
This is often the domain of the application developer rather than the IT infrastructure.
Asynchronous applications are a vital part of ensuring that a user who has walked away from their computer for a coffee break isn't also occupying valuable resources of the server.
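
To make the coffee-break idea concrete, here's a minimal sketch using Python's `asyncio`. The handler name and the wait times are invented for illustration; the point is only that a waiting user gives control back to the event loop, so active users are served first.

```python
import asyncio


async def handle_user(name: str, idle_seconds: float) -> str:
    # Hypothetical handler: awaiting hands control back to the event loop,
    # so a user who is idle does not block anybody else.
    await asyncio.sleep(idle_seconds)
    return f"{name} served"


async def main() -> list:
    # One user is away on a coffee break (long wait); two are active.
    tasks = [
        asyncio.create_task(handle_user("idle-user", 0.2)),
        asyncio.create_task(handle_user("active-1", 0.01)),
        asyncio.create_task(handle_user("active-2", 0.02)),
    ]
    finished = []
    # as_completed yields results in the order the tasks finish.
    for done in asyncio.as_completed(tasks):
        finished.append(await done)
    return finished


order = asyncio.run(main())
print(order)
```

Even though the idle user was queued first, the two active users finish ahead of them.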

== How does load balancing work?

Load balancing avoids bottlenecks by distributing a workload across multiple computational nodes.
Those nodes may be physical servers in a data center, or containers in a cloud, or strategically placed servers enlisted for Edge computing, or even separate JVMs in a complex application framework or daemons running on a single Linux server.
The idea is to divide a large problem into small tasks, and to assign each task to a dedicated computer.
For a website that requires its users to log in, for instance, the website itself might be hosted on Server A, while the login page, and all the authentication lookups that go along with it, is hosted on Server B.
This way, the process of a new user logging into their account doesn't steal resources from users actively using the site.
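
As a rough sketch of that split, the routing decision can be as small as a lookup table. The backend names `server-a` and `server-b` simply mirror the Server A and Server B example above, and the `/login` path is an assumption made for illustration.

```python
# Hypothetical routing table: path prefixes mapped to backend names.
ROUTES = {
    "/login": "server-b",  # login page and authentication lookups
    "/": "server-a",       # the rest of the website
}


def pick_backend(path: str) -> str:
    """Return the backend for a request path (the longest matching prefix wins)."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        # Anything unexpected falls back to the main site.
        return ROUTES["/"]
    return ROUTES[max(matches, key=len)]


print(pick_backend("/login/reset"))
print(pick_backend("/articles/42"))
```

A real proxy or load balancer does this work for you, but the principle is the same: requests for logins never compete with requests for content.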

=== Load balancing the cloud

Cloud computing uses https://opensource.com/resources/what-are-linux-containers[containers], so there often aren't separate physical servers to handle distinct tasks (actually, there are many separate servers, but they're clustered together to act as one computational "brain").
Instead, a "pod" is created from several containers.
When one pod starts to run out of resources due to its user or task load, an identical pod is generated.
Pods share storage and network resources, and each pod is assigned to a compute node as it's created.
Pods can be created or destroyed on-demand as the load requires, so users experience consistent quality of service regardless of how many users there are.
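
How many pods are enough? One common approach is a proportional rule: scale the pod count by the ratio of observed load to target load. The sketch below is an illustration of that idea, not the behaviour of any particular orchestrator, and the function name, the load figures, and the `max_pods` cap are all assumptions.

```python
import math


def desired_pods(current_pods: int, current_load: float,
                 target_load: float, max_pods: int = 10) -> int:
    """Proportional scaling: adjust the pod count so the average load
    per pod moves back toward the target."""
    wanted = math.ceil(current_pods * current_load / target_load)
    # Always keep at least one pod, and never exceed the assumed cap.
    return max(1, min(wanted, max_pods))


# Two pods running at 90% load with a 50% target: scale out.
print(desired_pods(2, 90.0, 50.0))
# Four pods running at 20% load with a 50% target: scale in.
print(desired_pods(4, 20.0, 50.0))
```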

=== Edge computing

https://opensource.com/article/18/5/edge-computing[Edge computing] takes the physical world into account when load balancing.
The cloud is by nature a distributed system, but in practice the nodes of a cloud are usually concentrated in a few data centers.
The farther a user is from the data center running the cloud, the more physical barriers the user must overcome for optimal service.
Even with fibre connections and proper load balancing, the response time of a server located 3000 miles away is likely greater than the response time of something just 300 miles away.
Edge computing brings compute nodes to the "edge" of the cloud in an attempt to bridge the geographic divide, forming a sort of satellite network for the cloud, and so it also plays a part in a good load balancing effort.
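
For a sense of scale, here's a back-of-the-envelope calculation of the best possible round trip over those two distances. It assumes light travels through optical fibre at roughly 200,000 km per second (about two thirds of its speed in a vacuum) and that the cable runs in a straight line. Real routes are longer and add switching delays, so treat the results as lower bounds.

```python
# Assumed propagation speed of light in optical fibre, in km per second.
FIBRE_SPEED_KM_PER_S = 200_000
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles: float) -> float:
    """Minimum round-trip time in milliseconds over an ideal fibre link."""
    distance_km = distance_miles * KM_PER_MILE
    return 2 * distance_km / FIBRE_SPEED_KM_PER_S * 1000


for miles in (300, 3000):
    print(f"{miles} miles: at least {round_trip_ms(miles):.1f} ms round trip")
```

Under these assumptions, the nearby server can answer in about 5 ms at best, while the distant one needs about 48 ms before any actual work is done.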

== What is a load balancing algorithm?

There are many strategies for load balancing, and they range in complexity depending on what technology is involved and what your requirements demand.
Load balancing doesn't have to be complicated, and even when you use specialized software like https://opensource.com/resources/what-is-kubernetes[Kubernetes] or https://www.redhat.com/sysadmin/keepalived-basics[keepalived], it's important to design for load balancing from inception.
Don't rely on containers to load balance when you could instead design your application to take simple precautions on its own.
If you design your application to be modular and ephemeral from the start, then you'll benefit from the load balancing opportunities made available by clever network design, container orchestration, and whatever tomorrow's technology brings.

There are some popular algorithms that can guide your efforts as either an application developer or a network engineer.

* Assign tasks to servers sequentially (this is often referred to as *round robin*).
* Assign tasks to the server that's currently the least busy.
* Assign tasks to the server with the best response time.
* Assign tasks randomly.
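
Each of these four strategies fits in a few lines of Python. This is a sketch only: the server names and their metrics are made up, and a real load balancer would measure them continuously instead of reading them from a static dictionary.

```python
import itertools
import random

# Hypothetical servers with invented metrics, for illustration only.
servers = {
    "alpha": {"active_tasks": 4, "response_ms": 120},
    "beta": {"active_tasks": 1, "response_ms": 250},
    "gamma": {"active_tasks": 7, "response_ms": 80},
}

# An endless iterator over the server names, in a fixed order.
_rotation = itertools.cycle(servers)


def round_robin() -> str:
    """Assign tasks to servers sequentially."""
    return next(_rotation)


def least_busy() -> str:
    """Assign the task to the server with the fewest active tasks."""
    return min(servers, key=lambda name: servers[name]["active_tasks"])


def best_response_time() -> str:
    """Assign the task to the server that currently responds fastest."""
    return min(servers, key=lambda name: servers[name]["response_ms"])


def random_server() -> str:
    """Assign the task to a server chosen at random."""
    return random.choice(list(servers))
```

With this sample data, `least_busy()` picks `beta`, `best_response_time()` picks `gamma`, and repeated calls to `round_robin()` cycle through `alpha`, `beta`, `gamma`, and back to `alpha`.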

All of these principles can be combined, or weighted to favour, for instance, the most powerful server in a group when assigning particularly complex tasks.
https://opensource.com/article/20/11/orchestration-vs-automation[Orchestration] is commonly used so that an administrator doesn't have to drum up the perfect algorithm or strategy for load balancing, although sometimes it's up to the admin to choose which combination of load balancing schemes is used.
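
Weighting is easy to picture with a weighted random choice. In this sketch, the server names and weights are hypothetical: `big` is assumed to be three times as capable as either of the smaller servers, so it should receive roughly three times as many tasks.

```python
import random

# Hypothetical weights expressing relative server capacity.
WEIGHTS = {"big": 3, "small-1": 1, "small-2": 1}


def weighted_pick(rng: random.Random) -> str:
    """Pick a server at random, favouring those with a higher weight."""
    names = list(WEIGHTS)
    return rng.choices(names, weights=[WEIGHTS[n] for n in names], k=1)[0]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = {name: 0 for name in WEIGHTS}
for _ in range(10_000):
    counts[weighted_pick(rng)] += 1
print(counts)
```

The same idea applies to round robin: a heavier server simply appears in the rotation more often.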

== Expect the unexpected

Load balancing isn't really about ensuring that all of your resources are used evenly across your network.
Load balancing is all about guaranteeing a reliable user experience even when the unexpected happens.
Good infrastructure can withstand a computer crash, application overload, an onslaught of network traffic, and user error.
Think about how your service can be resilient, and design load balancing in from the ground up.