Load Balancer: Comprehensive Guide to Traffic Distribution and High Availability

Introduction

A Load Balancer is a critical component in distributed systems that distributes incoming network traffic across multiple servers to ensure high availability, reliability, and optimal resource utilization. Understanding load balancing is essential for system design interviews and building scalable applications.

This guide covers:

Load Balancing Fundamentals: Algorithms, types, and architectures
Health Checks: Server monitoring and failover
Session Persistence: Sticky sessions and state management
SSL Termination: HTTPS handling at the load balancer
Best Practices: Performance, security, and reliability

What is a Load Balancer?

A Load Balancer is a device or software that:

Distributes Traffic: Routes requests across multiple servers
Improves Availability: Handles server failures gracefully
Optimizes Performance: Balances load for optimal response times
Scales Horizontally: Adds/removes servers dynamically
Provides Redundancy: Multiple load balancers for high availability

Key Concepts

Backend Server: Application server that handles requests

Health Check: Monitoring server availability

Session Persistence: Routing same client to same server

SSL Termination: Decrypting HTTPS at load balancer

Sticky Session: Maintaining client-server affinity

Failover: Switching to backup when primary fails

Architecture

High-Level Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │     │   Client    │     │   Client    │
│  (Browser)  │     │  (Mobile)   │     │    (API)    │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                           ▼
             ┌───────────────────────────┐
             │       Load Balancer       │
             │                           │
             │      ┌─────────────┐      │
             │      │  Algorithm  │      │
             │      │  (Routing)  │      │
             │      └──────┬──────┘      │
             │             │             │
             │      ┌──────┴──────┐      │
             │      │   Health    │      │
             │      │   Checks    │      │
             │      └─────────────┘      │
             └─────────────┬─────────────┘
                           │
             ┌─────────────┴─────────────┐
             │                           │
      ┌──────▼──────┐             ┌──────▼──────┐
      │   Backend   │             │   Backend   │
      │  Server 1   │             │  Server 2   │
      └─────────────┘             └─────────────┘

Explanation:

Clients: Web browsers, mobile apps, or API clients that make requests to the application.
Load Balancer: Distributes incoming requests across multiple backend servers to improve performance and availability.
Algorithm (Routing): Determines which backend server receives each request (e.g., round-robin, least connections, IP hash).
Health Checks: Monitors backend server health and removes unhealthy servers from the pool.
Backend Servers: Application servers that process requests and generate responses.

Load Balancing Algorithms

1. Round Robin

How it Works:

Distributes requests sequentially
Each server gets equal share
Simple and fair

Example:

Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (cycle repeats)

Use Cases:

Servers with similar capacity
Stateless applications
Simple distribution
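
The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements it; the class name `RoundRobinBalancer` and the server names are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever: 1, 2, 3, 1, 2, 3, ...
        self._pool = cycle(list(servers))

    def next_server(self):
        return next(self._pool)


balancer = RoundRobinBalancer(["server1", "server2", "server3"])
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # ['server1', 'server2', 'server3', 'server1']
```

The balancer keeps no information about server load, which is why round robin fits pools of similar, stateless servers.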

2. Least Connections

How it Works:

Routes to server with fewest active connections
Considers current load
Better for long-lived connections

Example:

Server 1: 10 connections
Server 2: 5 connections  ← Selected
Server 3: 15 connections

Use Cases:

Long-lived connections
Varying request processing times
Database connections
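
A rough sketch of the selection rule, assuming the load balancer tracks one counter per server that goes up when a connection opens and down when it closes. The class and method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Picks the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() over the dict keys, compared by their connection count.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when the connection to `server` closes.
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer(["server1", "server2", "server3"])
lb.active.update({"server1": 10, "server2": 5, "server3": 15})
chosen = lb.acquire()
print(chosen, lb.active["server2"])  # server2 6
```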

3. Least Response Time

How it Works:

Routes to server with lowest response time
Considers both connections and latency
Adapts to actual server performance, but requires tracking latency per server

Example:

Server 1: 50ms response time
Server 2: 30ms response time  ← Selected
Server 3: 80ms response time

Use Cases:

Performance-critical applications
Varying server performance
Real-time applications
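
One possible way to track "response time" is a moving average per server. The sketch below assumes an exponentially weighted moving average with a smoothing factor `alpha`; both the averaging method and the names are illustrative choices, not something the algorithm prescribes.

```python
class LeastResponseTimeBalancer:
    """Picks the server with the lowest smoothed response time."""

    def __init__(self, servers, alpha=0.2):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.avg_ms = {server: None for server in servers}

    def record(self, server, latency_ms):
        previous = self.avg_ms[server]
        if previous is None:
            self.avg_ms[server] = float(latency_ms)
        else:
            self.avg_ms[server] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        # Servers without any sample are tried first so each gets measured.
        for server, avg in self.avg_ms.items():
            if avg is None:
                return server
        return min(self.avg_ms, key=self.avg_ms.get)


lrt = LeastResponseTimeBalancer(["server1", "server2", "server3"])
lrt.record("server1", 50)
lrt.record("server2", 30)
lrt.record("server3", 80)
first_pick = lrt.pick()
print(first_pick)  # server2
```

A slow response from Server 2 raises its average, so later requests drift to the other servers without any manual reconfiguration.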

4. Weighted Round Robin

How it Works:

Assigns weights to servers
Higher weight = more requests
Accounts for server capacity

Example:

Server 1: Weight 3 (receives 3 of every 6 requests)
Server 2: Weight 2 (receives 2 of every 6 requests)
Server 3: Weight 1 (receives 1 of every 6 requests)

Use Cases:

Servers with different capacities
Gradual capacity scaling
Resource optimization
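
There are several ways to turn weights into a request order. The sketch below uses a "smooth" variant that interleaves the picks rather than sending three requests in a row to Server 1; treat it as one illustrative option, with hypothetical names throughout.

```python
from collections import Counter


class SmoothWeightedRoundRobin:
    """Spreads picks in proportion to weight while interleaving servers."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.current = {server: 0 for server in self.weights}
        self.total = sum(self.weights.values())

    def next_server(self):
        # Every server earns its weight each round...
        for server, weight in self.weights.items():
            self.current[server] += weight
        # ...the leader is chosen and pays back the total weight.
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


wrr = SmoothWeightedRoundRobin({"server1": 3, "server2": 2, "server3": 1})
counts = Counter(wrr.next_server() for _ in range(6))
print(dict(counts))
```

Over any six consecutive picks the counts match the weights 3, 2, and 1.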

5. IP Hash

How it Works:

Hashes client IP address
Routes the same client IP to the same server (sticky)
Preserves session persistence as long as the server pool and the client's IP do not change

Example:

Client IP: 192.168.1.100
Hash: hash(192.168.1.100) % 3 = 1
Route to: the server at index 1 of the 3-server pool (indices 0 to 2)

Use Cases:

Session persistence required
Stateful applications
Cache affinity
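
A minimal sketch of the hashing step. It uses `hashlib` rather than Python's built-in `hash()`, because the built-in is randomized per process for strings and would not give stable routing across restarts. The function name `pick_server` and the choice of SHA-256 are assumptions for the example.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to one server, always the same one for a given pool."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server1", "server2", "server3"]
first = pick_server("192.168.1.100", servers)
second = pick_server("192.168.1.100", servers)
print(first == second)  # True
```

Because the index depends on `len(servers)`, adding or removing a server remaps many clients; that is the main weakness of plain modulo hashing.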

6. Geographic Routing

How it Works:

Routes based on client location
Nearest server selection
Reduces latency

Example:

US Client → US Server
EU Client → EU Server
Asia Client → Asia Server

Use Cases:

Global applications
Latency optimization
Data locality
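
A simplified sketch of the routing decision, assuming the client's region has already been determined (for example by a GeoIP lookup, which is not shown). The region keys, server names, and fallback rule are all hypothetical.

```python
REGION_SERVERS = {
    "us": "us-server",
    "eu": "eu-server",
    "asia": "asia-server",
}
FALLBACK_REGION = "us"  # assumed default when the region is unknown or down


def route_by_region(client_region, healthy=None):
    """Return the regional server, or the fallback if it is unknown or unhealthy."""
    if healthy is None:
        healthy = set(REGION_SERVERS.values())
    server = REGION_SERVERS.get(client_region)
    if server in healthy:
        return server
    return REGION_SERVERS[FALLBACK_REGION]


print(route_by_region("eu"))  # eu-server
```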

Load Balancer Types

1. Layer 4 (Transport Layer)

Characteristics:

Operates at TCP/UDP level
Routes based on IP and port
Fast and efficient
No application awareness

Use Cases:

Simple routing
High throughput
TCP/UDP protocols

Example:

Client → LB (IP:Port) → Backend Server

2. Layer 7 (Application Layer)

Characteristics:

Operates at HTTP/HTTPS level
Content-aware routing
SSL termination
More features (header and cookie inspection, URL-based rules) at a higher per-request cost

Use Cases:

HTTP/HTTPS applications
Content-based routing
SSL termination

Example:

Client → LB (HTTP Headers) → Backend Server
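
To make "content-aware" concrete, here is a small sketch of path-based routing, something a Layer 4 balancer cannot do because it never parses the HTTP request. The route table and pool names are invented for illustration.

```python
ROUTES = {
    "/api/": "api-pool",
    "/static/": "static-pool",
}
DEFAULT_POOL = "web-pool"


def choose_pool(path):
    """Pick a backend pool by the longest matching URL prefix."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return DEFAULT_POOL
    return ROUTES[max(matches, key=len)]


print(choose_pool("/api/users"))   # api-pool
print(choose_pool("/index.html"))  # web-pool
```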

3. Hardware Load Balancer

Characteristics:

Dedicated hardware device
High performance
Expensive
Limited flexibility

Examples:

F5 BIG-IP
Citrix NetScaler
A10 Networks

4. Software Load Balancer

Characteristics:

Software-based solution
Flexible and configurable
Cost-effective
Runs on standard hardware

Examples:

NGINX
HAProxy
Apache HTTP Server
AWS ELB/ALB (managed cloud service)

Architecture Patterns

1. Single Load Balancer

        Clients
           │
           ▼
    ┌──────────────┐
    │ Load Balancer│
    └──────┬───────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│Server1│    │Server2│
└───────┘    └───────┘

Pros:

Simple setup
Low cost

Cons:

Single point of failure
Limited scalability

2. Multiple Load Balancers (Active-Passive)

          Clients
             │
     ┌───────┴───────┐
     │               │
┌────▼────┐     ┌────▼────┐
│   LB1   │     │   LB2   │
│(Active) │     │(Standby)│
└────┬────┘     └─────────┘
     │
     ▼
Backend Servers

Pros:

High availability
Failover capability

Cons:

Standby capacity sits idle
Brief interruption while failover takes place
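
The failover decision can be sketched as a heartbeat timeout: the standby promotes itself when it stops hearing from the active node. In practice this is usually delegated to a protocol such as VRRP; the class below is only a simplified model, and its names and the 3-second timeout are assumptions.

```python
class StandbyMonitor:
    """Standby load balancer that takes over when heartbeats stop."""

    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s
        self.last_heartbeat = None
        self.role = "standby"

    def heartbeat(self, now):
        # Called whenever a heartbeat from the active node arrives.
        self.last_heartbeat = now

    def tick(self, now):
        # Called periodically; promotes the standby after a silent period.
        silent = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        )
        if self.role == "standby" and silent:
            self.role = "active"
        return self.role


monitor = StandbyMonitor(timeout_s=3.0)
monitor.heartbeat(now=0.0)
print(monitor.tick(now=2.0))  # standby
print(monitor.tick(now=3.5))  # active
```

The timeout is the "failover time" listed above: a shorter value recovers faster but risks promoting the standby during a brief network hiccup.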

3. Multiple Load Balancers (Active-Active)

          Clients
             │
     ┌───────┴───────┐
     │               │
┌────▼────┐     ┌────▼────┐
│   LB1   │     │   LB2   │
│(Active) │     │(Active) │
└────┬────┘     └────┬────┘
     │               │
     └───────┬───────┘
             │
     ┌───────┴───────┐
     │               │
┌────▼────┐     ┌────▼────┐
│ Server1 │     │ Server2 │
└─────────┘     └─────────┘

Pros:

No resource waste
Better performance
Load distribution

Cons:

More complex
Session persistence state must be shared between the load balancers (or derived statelessly, e.g., from a cookie or hash)

4. DNS-Based Load Balancing

        Clients
           │
           ▼
    ┌──────────────┐
    │  DNS Server  │
    │ (Round Robin)│
    └──────┬───────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│Server1│    │Server2│
└───────┘    └───────┘

Pros:

Simple
No additional hardware

Cons:

Limited control
DNS caching issues
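No built-in health checks: plain round-robin DNS keeps returning a failed server until the record is changed and caches expire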

Health Checks

Health Check Types

1. TCP Health Check:

LB → TCP Connection → Server
     (Success/Failure)

2. HTTP Health Check:

LB → GET /health → Server
     (200 OK / Error)

3. Custom Health Check:

LB → Custom Endpoint → Server
     (Application Logic)

Health Check Configuration

Interval:

How often to check
Example: Every 5 seconds

Timeout:

Maximum wait time
Example: 3 seconds

Threshold:

Success/failure count
Example: 2 failures = unhealthy

Example:

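upstream backend {
    zone backend 64k;  # shared memory zone, required for active health checks
    server server1.example.com:8080;
    server server2.example.com:8080;
}

server {
    location / {
        proxy_pass http://backend;
        # Active health check (NGINX Plus only); belongs in the location block
        health_check interval=5s fails=2 passes=1;
    }
}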
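
The interval, timeout, and threshold settings combine into a small state machine. The sketch below models the threshold logic with defaults that mirror the example values above (2 failures to mark a server unhealthy, 1 pass to restore it); `HealthTracker` and `tcp_check` are hypothetical names, and the probe is a plain TCP connect.

```python
import socket


def tcp_check(host, port, timeout_s=3.0):
    """TCP health check: healthy if a connection opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


class HealthTracker:
    """Flips a server's state only after enough consecutive contrary results."""

    def __init__(self, fails=2, passes=1):
        self.fails = fails
        self.passes = passes
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, ok):
        if ok == self.healthy:
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.passes if ok else self.fails
        if self._streak >= needed:
            self.healthy = ok
            self._streak = 0
        return self.healthy


tracker = HealthTracker(fails=2, passes=1)
states = [tracker.record(ok) for ok in (False, False, True)]
print(states)  # [True, False, True]
```

A scheduler would call `tracker.record(tcp_check(host, port))` once per interval. Requiring two failures avoids removing a server because of a single dropped probe.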

Session Persistence

Sticky Sessions

Cookie-Based:

Client → LB → Server1
LB sets cookie: SERVER_ID=server1
Next request: LB reads cookie → Server1
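
The cookie flow can be sketched as a single routing function. It is a simplified model: the cookie name `SERVER_ID` comes from the example above, while the function name, the `pick_new` callback, and the plain-text cookie value are assumptions.

```python
def route_with_cookie(cookies, healthy_servers, pick_new):
    """Return (server, cookie_to_set); cookie_to_set is None when affinity holds."""
    server = cookies.get("SERVER_ID")
    if server in healthy_servers:
        return server, None
    # No cookie yet, or the pinned server is gone: choose again and re-pin.
    server = pick_new()
    return server, ("SERVER_ID", server)


healthy = {"server1", "server2"}
first = route_with_cookie({}, healthy, pick_new=lambda: "server1")
repeat = route_with_cookie({"SERVER_ID": "server1"}, healthy, pick_new=lambda: "server2")
print(first)   # ('server1', ('SERVER_ID', 'server1'))
print(repeat)  # ('server1', None)
```

If the pinned server fails its health checks, the client is moved to another server and loses any session state held only on the old one.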

IP Hash:

Client IP: 192.168.1.100
Hash: hash(192.168.1.100) % 3 = 1
Route to: the server at index 1 of the 3-server pool (indices 0 to 2)

Session Sharing

Shared Session Store:

Server1 → Redis → Server2
(Shared session data)

Database Sessions:

Server1 → Database → Server2
(Shared session table)
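
Both variants above (Redis or a database table) move session data out of the individual servers, so any server can handle any request. The sketch below uses an in-memory dictionary as a stand-in for the external store; `SessionStore` and `AppServer` are hypothetical names, and a real deployment would call a Redis or database client instead.

```python
class SessionStore:
    """In-memory stand-in for a shared store such as Redis or a session table."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, data):
        self._data[session_id] = dict(data)

    def load(self, session_id):
        return dict(self._data.get(session_id, {}))


class AppServer:
    """Backend server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, session_id, user):
        self.store.save(session_id, {"user": user})

    def whoami(self, session_id):
        return self.store.load(session_id).get("user")


store = SessionStore()
server1 = AppServer("server1", store)
server2 = AppServer("server2", store)
server1.login("abc123", "alice")
print(server2.whoami("abc123"))  # alice
```

With the state externalized, the load balancer no longer needs sticky sessions and can use any algorithm.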

SSL Termination

SSL Termination at Load Balancer

Client → HTTPS → LB (Decrypt) → HTTP → Backend

Benefits:

Offloads SSL processing
Centralized certificate management
Frees backend CPU for application work

Configuration:

server {
    listen 443 ssl;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    
    location / {
        proxy_pass http://backend;
    }
}

End-to-End SSL

Client → HTTPS → LB → HTTPS → Backend

Benefits:

Encrypted all the way
No plaintext traffic on the internal network

Drawbacks:

More CPU usage
Complex certificate management

Load Balancer Providers

NGINX

Features:

High performance
Layer 7 routing
SSL termination
Reverse proxy

Configuration:

upstream backend {
    least_conn;
    server server1.example.com;
    server server2.example.com;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}

HAProxy

Features:

Layer 4 and Layer 7
Advanced algorithms
Health checks
Statistics

Configuration:

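# Accept client connections on port 80
frontend web
    bind *:80
    default_backend servers

# Pool of backend servers; "check" enables periodic health checks
backend servers
    balance roundrobin
    server server1 192.168.1.10:8080 check
    server server2 192.168.1.11:8080 check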

AWS ELB/ALB

Features:

Fully managed
Auto-scaling
Health checks
SSL termination

Types:

Classic ELB: Layer 4 and Layer 7
Application LB: Layer 7 only
Network LB: Layer 4 only

Best Practices

1. High Availability

Use multiple load balancers
Active-active configuration
Health checks
Automatic failover

2. Performance

Choose appropriate algorithm
Monitor backend health
Optimize health check frequency
Use connection pooling

3. Security

SSL/TLS termination
DDoS protection
Rate limiting
Access control

4. Monitoring

Track request rates
Monitor response times
Health check status
Backend server metrics

What Interviewers Look For

Load Balancing Understanding

Algorithm Selection
- Understanding of different algorithms
- When to use each algorithm
- Trade-offs between algorithms
- Red Flags: Wrong algorithm, no understanding, no trade-offs
Architecture Design
- Single vs multiple load balancers
- Active-active vs active-passive
- High availability setup
- Red Flags: Single point of failure, no HA, poor design
Health Checks
- Health check configuration
- Failover mechanisms
- Server monitoring
- Red Flags: No health checks, poor failover, no monitoring

Problem-Solving Approach

Traffic Distribution
- Algorithm selection
- Load balancing strategy
- Performance optimization
- Red Flags: No strategy, poor algorithm, no optimization
High Availability
- Redundancy design
- Failover mechanisms
- Disaster recovery
- Red Flags: No redundancy, poor failover, no recovery

System Design Skills

Scalability
- Horizontal scaling
- Dynamic server addition
- Load distribution
- Red Flags: No scaling, static setup, poor distribution
Performance
- Response time optimization
- Throughput maximization
- Resource utilization
- Red Flags: No optimization, poor performance, wasted resources

Communication Skills

Clear Explanation
- Explains load balancing concepts
- Discusses trade-offs
- Justifies design decisions
- Red Flags: Unclear explanations, no justification, confusing

Meta-Specific Focus

System Design Expertise
- Understanding of distributed systems
- Load balancing mastery
- High availability patterns
- Key: Demonstrate system design expertise
Performance Optimization
- Algorithm selection
- Performance tuning
- Resource optimization
- Key: Show performance optimization skills

Summary

Load Balancer Key Points:

Traffic Distribution: Routes requests across multiple servers
Algorithms: Round robin, least connections, weighted, IP hash
High Availability: Multiple load balancers, health checks, failover
Session Persistence: Sticky sessions, session sharing
SSL Termination: HTTPS handling at load balancer
Performance: Optimizes response times and resource utilization

Common Use Cases:

Web applications
API services
Microservices
High-traffic websites
Global applications

Best Practices:

Use appropriate algorithm
Implement health checks
Configure high availability
Enable SSL termination
Monitor performance
Implement session persistence when needed
Use multiple load balancers for critical systems

Load balancers are essential for building scalable, highly available, and performant distributed systems.
