#3 System Design: Load Balancers

Think of it as a Traffic control system :)

It evenly distributes traffic across multiple servers based on their availability and responsiveness, and prevents any single server from being overloaded.

At home, if everyone shares the work of dishes, laundry, mopping, and organizing, mom/dad is not overloaded!!

Why do we need Load Balancers?

Think of a coffee shop where Mary is the barista and Jane is the receptionist/billing assistant.

Jane takes the order from the customers and passes it to Mary, Mary brews the coffee and hands it to the customer. This is very common and the coffee shop sales go smoothly.

What if there is a concert happening near the coffee shop and customers suddenly start pouring in? Jane now has to serve the customers in a queue and pass the orders to Mary. Mary does her best by doubling her pace, but she is still a single person doing the work (vertical scaling). The customers are still unhappy, since they are waiting in the queue for their turn. The manager (load balancer) then decides to bring in Jack and Hose to reduce the load on Mary, so all three serve the customers (horizontal scaling), thereby increasing efficiency.

Similarly, web applications are deployed and hosted on servers, which ultimately live on hardware machines with finite resources such as memory (RAM), processor (CPU), and network connections, and they have to handle different kinds of traffic patterns. As the traffic to an application increases, these resources can become limiting factors and prevent the machine from serving requests.

This limit is known as the system's capacity. At first, some of these scaling problems can be solved by simply increasing the memory or CPU of the server or by using the available resources more efficiently, such as multithreading.

At a certain point, though, increased traffic will cause any application to exceed the capacity that a single server can provide.

The only solution to this problem is to add more servers to the system, also known as horizontal scaling. When more than one server can be used to serve a request, it becomes necessary to decide which server to send the request to. That's where load balancers come into play.
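To get a feel for how many servers horizontal scaling calls for, here is a rough back-of-the-envelope sketch. The function name, the traffic numbers, and the 20% headroom are all made up for illustration; real capacity planning depends on measuring your own servers.

```python
import math


def servers_needed(peak_rps, per_server_rps, headroom=0.2):
    """Estimate the pool size for a given peak load (illustrative only)."""
    # Keep each server below (1 - headroom) of its measured capacity,
    # so a small burst does not immediately overload it.
    usable_rps = per_server_rps * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)


# Hypothetical numbers: 1200 requests/second at peak,
# and one server comfortably handles about 500 requests/second.
print(servers_needed(peak_rps=1200, per_server_rps=500))
```

Once the answer is more than one server, something has to decide which server gets each request.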

In horizontal scaling, where we add more servers, the goal is to keep the load on each server almost the same at any given point in time, i.e. even when a server crashes or when new servers are added to the pool to accommodate an increase in traffic. This is just like how adding Jack and Hose helped the coffee shop scale to serve more customers. To top it off, the good manager also made sure that the work was evenly distributed among all three.

Now you know what load balancing is all about !!!!

How do they work?

A good load balancer efficiently distributes traffic to its targets, maximizing capacity utilization and minimizing queuing time.

They rely on a few techniques, such as:

Round Robin:

The first request goes to the first server, the next to the second, then the third, then the fourth, and then the loop repeats: first, second, third, fourth...

Because servers are assigned in a repeating order, the next server assigned is guaranteed to be the least recently used.
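Here is a minimal sketch of round robin in Python. The class name and the server names are hypothetical, and a real load balancer would also deal with servers joining and leaving the pool.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() loops over the list forever: 1, 2, 3, 4, 1, 2, ...
        self._servers = cycle(list(servers))

    def next_server(self):
        # Each call moves one step further around the loop.
        return next(self._servers)


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3", "server-4"])
picks = [balancer.next_server() for _ in range(6)]
print(picks)
# ['server-1', 'server-2', 'server-3', 'server-4', 'server-1', 'server-2']
```

Notice that round robin never looks at how busy a server is. It only remembers whose turn it is.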

Least connections:

Assigns each request to the server currently handling the fewest requests.

Even when application servers have similar specifications, one of them may become overloaded because of longer-lived connections. This technique takes the active connection load into consideration, so the servers with the fewest active connections (including idle ones) get new requests first.
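A small sketch of least connections, assuming the balancer is told when a request starts and when it finishes. The class and method names (`acquire`, `release`) are my own for illustration and not taken from any particular product.

```python
class LeastConnectionsBalancer:
    """Sends each new request to the server with the fewest active ones."""

    def __init__(self, servers):
        # Active connection count per server; every server starts idle.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Pick the server with the lowest count.
        # Ties go to whichever server was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request finishes so the count stays accurate.
        self.active[server] -= 1


lc = LeastConnectionsBalancer(["server-1", "server-2", "server-3"])
first = lc.acquire()    # server-1 (all idle, first one wins the tie)
second = lc.acquire()   # server-2
third = lc.acquire()    # server-3
lc.release(second)      # server-2 finishes its request early
fourth = lc.acquire()   # server-2 again, since it is now the least loaded
```

Unlike round robin, the fourth request does not go back to server-1, because server-1 is still busy with a long-lived connection.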

Consistent hashing:

Similar to database sharding, the server can be assigned consistently based on IP address or URL.
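To make the idea a bit more concrete, here is a small hash ring sketch. It assumes MD5 as the hash function and 100 virtual nodes per server, both arbitrary choices for illustration, and the client IPs are made up.

```python
import bisect
import hashlib


def stable_hash(key):
    # Python's built-in hash() changes between runs for strings,
    # so use a hash that always gives the same number for the same key.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=100):
        self.replicas = replicas   # virtual nodes per server (assumed value)
        self._ring = []            # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        # Place several virtual nodes per server to spread the load.
        for i in range(self.replicas):
            bisect.insort(self._ring, (stable_hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [entry for entry in self._ring if entry[1] != server]

    def get(self, key):
        if not self._ring:
            raise LookupError("no servers in the ring")
        # Walk clockwise to the first node at or after the key's position,
        # wrapping around to the start of the ring if needed.
        index = bisect.bisect(self._ring, (stable_hash(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["server-1", "server-2", "server-3"])
clients = [f"10.0.0.{n}" for n in range(1, 201)]
before = {ip: ring.get(ip) for ip in clients}

ring.remove("server-3")
after = {ip: ring.get(ip) for ip in clients}
moved = [ip for ip in clients if before[ip] != after[ip]]
print(f"{len(moved)} of {len(clients)} clients were reassigned")
```

The point of the ring is in the last few lines: when server-3 goes away, only the clients that were on server-3 move. Everyone else keeps the server they had.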

Tell me more about it! It deserves a blog of its own. If you're curious, I learnt it from:

Hashing

Consistent Hashing

Since load balancers must handle the traffic for the entire server pool, they need to be efficient and highly available. Depending on the chosen strategy and performance requirements, load balancers can operate at higher or lower network layers (HTTP vs. TCP/IP) or even be implemented in hardware.

Engineering teams typically don't implement their own load balancers and instead use an industry-standard reverse proxy (like HAProxy or Nginx) to perform load balancing and other functions such as SSL termination and health checks. Most cloud providers also offer out-of-the-box load balancers, such as Amazon's Elastic Load Balancer (ELB).

When to use a load balancer?

When you think the system you're creating could benefit from more capacity or redundancy, you should use a load balancer.

Do you want your Coffee shop to serve all customers even during peak hours or during events?

Load balancers are frequently placed between external traffic and application servers. Load balancers are commonly used in front of each internal service in a microservice architecture so that each portion of the system can be scaled independently.

Keep in mind that load balancing won't solve all of your system's scaling issues. Database performance, algorithmic complexity, and other sorts of resource conflicts, for example, can all cause an application to fail.

Inefficient calculations, slow database queries, and unreliable third-party APIs will not be mitigated by adding more servers.

In these instances, a system that can handle jobs asynchronously, such as a job queue, may be required.

Load balancing is notably distinct from rate limiting, which is when traffic is intentionally throttled or dropped in order to prevent abuse by a particular user or organization.

Pros of Load Balancers

Scalability. Load balancers make it easier to scale up and down with demand by adding or removing backend servers.
Reliability. Load balancers provide redundancy and can minimize downtime by automatically detecting and replacing unhealthy servers.

Health checks are done to ensure only healthy servers handle requests!

Performance. By distributing the workload evenly across servers, load balancers can improve the average response time.
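The reliability point can be sketched as a pool that only routes to servers that passed their last health check. The `probe` callable is a stand-in I made up. A real load balancer would typically run it periodically in the background, for example by calling a health endpoint on each server.

```python
class HealthCheckedPool:
    """Round robin over the servers that passed the last health check."""

    def __init__(self, servers, probe):
        self.servers = list(servers)
        self.probe = probe             # hypothetical callable: server -> bool
        self.healthy = list(servers)   # assume everyone is healthy at start
        self._next = 0

    def run_health_checks(self):
        # Keep only the servers whose probe succeeded.
        self.healthy = [s for s in self.servers if self.probe(s)]

    def next_server(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        server = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        return server


down = {"server-2"}   # pretend server-2 has crashed
pool = HealthCheckedPool(
    ["server-1", "server-2", "server-3"],
    probe=lambda server: server not in down,
)
pool.run_health_checks()
routed = [pool.next_server() for _ in range(4)]
print(routed)   # server-2 never shows up while it is down
```

When server-2 recovers and the next check passes, it rejoins the pool without anyone changing the configuration.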

Walk with Care??? Why?? :

As scale increases, load balancers can themselves become a bottleneck or single point of failure, so multiple load balancers must be used to guarantee availability.
DNS round robin can be used to balance traffic across different load balancers.
User sessions. The same user's requests can be served from different backends unless the load balancer is configured otherwise. This could be problematic for applications that rely on session data that isn't shared across servers.
Longer deploys. Deploying new server versions can take longer and require more machines since the load balancer needs to roll over traffic to the new servers and drain requests from the old machines.
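For the user sessions point, one common workaround is "sticky" routing: send the same session to the same backend every time. Below is a minimal sketch that assumes a session ID is available on each request. The function name and IDs are hypothetical.

```python
import hashlib


def pick_backend(session_id, backends):
    # Hash the session ID so the same session always maps to the same index.
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()
    return backends[int(digest, 16) % len(backends)]


backends = ["server-1", "server-2", "server-3"]
first_request = pick_backend("session-abc123", backends)
second_request = pick_backend("session-abc123", backends)
print(first_request == second_request)   # True: same session, same backend
```

This simple modulo version reshuffles most sessions whenever the list of backends changes, which is one of the reasons consistent hashing exists. Storing session data in a shared place is the other way out.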

Hope you enjoyed learning about Load Balancers the fun way. I'll try to collect my learning journey on consistent hashing as a separate blog.

Until then .. Stay tuned..
