Introduction

A Load Balancer is a critical component in distributed systems that distributes incoming network traffic across multiple servers to ensure high availability, reliability, and optimal resource utilization. Understanding load balancing is essential for system design interviews and building scalable applications.

This guide covers:

  • Load Balancing Fundamentals: Algorithms, types, and architectures
  • Health Checks: Server monitoring and failover
  • Session Persistence: Sticky sessions and state management
  • SSL Termination: HTTPS handling at the load balancer
  • Best Practices: Performance, security, and reliability

What is a Load Balancer?

A Load Balancer is a device or software that:

  • Distributes Traffic: Routes requests across multiple servers
  • Improves Availability: Handles server failures gracefully
  • Optimizes Performance: Balances load for optimal response times
  • Scales Horizontally: Adds/removes servers dynamically
  • Provides Redundancy: Multiple load balancers for high availability

Key Concepts

Backend Server: Application server that handles requests

Health Check: Monitoring server availability

Session Persistence: Routing same client to same server

SSL Termination: Decrypting HTTPS at load balancer

Sticky Session: Maintaining client-server affinity

Failover: Switching to backup when primary fails

Architecture

High-Level Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│   Client    │────▶│   Client    │
│  (Browser)  │     │  (Mobile)   │     │  (API)      │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                    │                    │
       └────────────────────┴────────────────────┘
                            │
                            ▼
              ┌─────────────────────────┐
              │   Load Balancer         │
              │                         │
              │  ┌──────────┐           │
              │  │ Algorithm │           │
              │  │ (Routing) │           │
              │  └────┬─────┘           │
              │       │                 │
              │  ┌────┴─────┐           │
              │  │ Health   │           │
              │  │ Checks   │           │
              │  └──────────┘           │
              └──────┬──────────────────┘
                     │
       ┌─────────────┴─────────────┐
       │                           │
┌──────▼──────┐           ┌───────▼──────┐
│  Backend    │           │  Backend     │
│  Server 1   │           │  Server 2    │
└─────────────┘           └─────────────┘

Explanation:

  • Clients: Web browsers, mobile apps, or API clients that make requests to the application.
  • Load Balancer: Distributes incoming requests across multiple backend servers to improve performance and availability.
  • Algorithm (Routing): Determines which backend server receives each request (e.g., round-robin, least connections, IP hash).
  • Health Checks: Monitors backend server health and removes unhealthy servers from the pool.
  • Backend Servers: Application servers that process requests and generate responses.

Load Balancing Algorithms

1. Round Robin

How it Works:

  • Distributes requests sequentially
  • Each server gets equal share
  • Simple and fair

Example:

Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (cycle repeats)

Use Cases:

  • Servers with similar capacity
  • Stateless applications
  • Simple distribution

2. Least Connections

How it Works:

  • Routes to server with fewest active connections
  • Considers current load
  • Better for long-lived connections

Example:

Server 1: 10 connections
Server 2: 5 connections  ← Selected
Server 3: 15 connections

Use Cases:

  • Long-lived connections
  • Varying request processing times
  • Database connections

3. Least Response Time

How it Works:

  • Routes to server with lowest response time
  • Considers both connections and latency
  • Most intelligent algorithm

Example:

Server 1: 50ms response time
Server 2: 30ms response time  ← Selected
Server 3: 80ms response time

Use Cases:

  • Performance-critical applications
  • Varying server performance
  • Real-time applications

4. Weighted Round Robin

How it Works:

  • Assigns weights to servers
  • Higher weight = more requests
  • Accounts for server capacity

Example:

Server 1: Weight 3 (handles 3 requests)
Server 2: Weight 2 (handles 2 requests)
Server 3: Weight 1 (handles 1 request)

Use Cases:

  • Servers with different capacities
  • Gradual capacity scaling
  • Resource optimization

5. IP Hash

How it Works:

  • Hashes client IP address
  • Routes to same server (sticky)
  • Ensures session persistence

Example:

Client IP: 192.168.1.100
Hash: hash(192.168.1.100) % 3 = 1
Route to: Server 1

Use Cases:

  • Session persistence required
  • Stateful applications
  • Cache affinity

6. Geographic Routing

How it Works:

  • Routes based on client location
  • Nearest server selection
  • Reduces latency

Example:

US Client → US Server
EU Client → EU Server
Asia Client → Asia Server

Use Cases:

  • Global applications
  • Latency optimization
  • Data locality

Load Balancer Types

1. Layer 4 (Transport Layer)

Characteristics:

  • Operates at TCP/UDP level
  • Routes based on IP and port
  • Fast and efficient
  • No application awareness

Use Cases:

  • Simple routing
  • High throughput
  • TCP/UDP protocols

Example:

Client → LB (IP:Port) → Backend Server

2. Layer 7 (Application Layer)

Characteristics:

  • Operates at HTTP/HTTPS level
  • Content-aware routing
  • SSL termination
  • More features

Use Cases:

  • HTTP/HTTPS applications
  • Content-based routing
  • SSL termination

Example:

Client → LB (HTTP Headers) → Backend Server

3. Hardware Load Balancer

Characteristics:

  • Dedicated hardware device
  • High performance
  • Expensive
  • Limited flexibility

Examples:

  • F5 BIG-IP
  • Citrix NetScaler
  • A10 Networks

4. Software Load Balancer

Characteristics:

  • Software-based solution
  • Flexible and configurable
  • Cost-effective
  • Runs on standard hardware

Examples:

  • NGINX
  • HAProxy
  • Apache HTTP Server
  • AWS ELB/ALB

Architecture Patterns

1. Single Load Balancer

        Clients
           │
           ▼
    ┌──────────────┐
    │ Load Balancer│
    └──────┬───────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│Server1│    │Server2│
└───────┘    └───────┘

Pros:

  • Simple setup
  • Low cost

Cons:

  • Single point of failure
  • Limited scalability

2. Multiple Load Balancers (Active-Passive)

        Clients
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│  LB1  │    │  LB2  │
│(Active)│    │(Standby)│
└───┬───┘    └───────┘
    │
    ▼
Backend Servers

Pros:

  • High availability
  • Failover capability

Cons:

  • Standby resource waste
  • Failover time

3. Multiple Load Balancers (Active-Active)

        Clients
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│  LB1  │    │  LB2  │
│(Active)│    │(Active)│
└───┬───┘    └───┬───┘
    │             │
    └──────┬──────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│Server1│    │Server2│
└───────┘    └───────┘

Pros:

  • No resource waste
  • Better performance
  • Load distribution

Cons:

  • More complex
  • Session sharing needed

4. DNS-Based Load Balancing

        Clients
           │
           ▼
    ┌──────────────┐
    │  DNS Server  │
    │ (Round Robin)│
    └──────┬───────┘
           │
    ┌──────┴──────┐
    │             │
┌───▼───┐    ┌───▼───┐
│Server1│    │Server2│
└───────┘    └───────┘

Pros:

  • Simple
  • No additional hardware

Cons:

  • Limited control
  • DNS caching issues
  • No health checks

Health Checks

Health Check Types

1. TCP Health Check:

LB → TCP Connection → Server
     (Success/Failure)

2. HTTP Health Check:

LB → GET /health → Server
     (200 OK / Error)

3. Custom Health Check:

LB → Custom Endpoint → Server
     (Application Logic)

Health Check Configuration

Interval:

  • How often to check
  • Example: Every 5 seconds

Timeout:

  • Maximum wait time
  • Example: 3 seconds

Threshold:

  • Success/failure count
  • Example: 2 failures = unhealthy

Example:

upstream backend {
    server server1.example.com:8080;
    server server2.example.com:8080;
    
    # Health check
    health_check interval=5s fails=2 passes=1;
}

Session Persistence

Sticky Sessions

Cookie-Based:

Client → LB → Server1
LB sets cookie: SERVER_ID=server1
Next request: LB reads cookie → Server1

IP Hash:

Client IP: 192.168.1.100
Hash: hash(192.168.1.100) % 3 = 1
Route to: Server1

Session Sharing

Shared Session Store:

Server1 → Redis → Server2
(Shared session data)

Database Sessions:

Server1 → Database → Server2
(Shared session table)

SSL Termination

SSL Termination at Load Balancer

Client → HTTPS → LB (Decrypt) → HTTP → Backend

Benefits:

  • Offloads SSL processing
  • Centralized certificate management
  • Better performance

Configuration:

server {
    listen 443 ssl;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    
    location / {
        proxy_pass http://backend;
    }
}

End-to-End SSL

Client → HTTPS → LB → HTTPS → Backend

Benefits:

  • Encrypted all the way
  • Better security

Drawbacks:

  • More CPU usage
  • Complex certificate management

Load Balancer Providers

NGINX

Features:

  • High performance
  • Layer 7 routing
  • SSL termination
  • Reverse proxy

Configuration:

upstream backend {
    least_conn;
    server server1.example.com;
    server server2.example.com;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}

HAProxy

Features:

  • Layer 4 and Layer 7
  • Advanced algorithms
  • Health checks
  • Statistics

Configuration:

frontend web
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server server1 192.168.1.10:8080 check
    server server2 192.168.1.11:8080 check

AWS ELB/ALB

Features:

  • Fully managed
  • Auto-scaling
  • Health checks
  • SSL termination

Types:

  • Classic ELB: Layer 4 and Layer 7
  • Application LB: Layer 7 only
  • Network LB: Layer 4 only

Best Practices

1. High Availability

  • Use multiple load balancers
  • Active-active configuration
  • Health checks
  • Automatic failover

2. Performance

  • Choose appropriate algorithm
  • Monitor backend health
  • Optimize health check frequency
  • Use connection pooling

3. Security

  • SSL/TLS termination
  • DDoS protection
  • Rate limiting
  • Access control

4. Monitoring

  • Track request rates
  • Monitor response times
  • Health check status
  • Backend server metrics

What Interviewers Look For

Load Balancing Understanding

  1. Algorithm Selection
    • Understanding of different algorithms
    • When to use each algorithm
    • Trade-offs between algorithms
    • Red Flags: Wrong algorithm, no understanding, no trade-offs
  2. Architecture Design
    • Single vs multiple load balancers
    • Active-active vs active-passive
    • High availability setup
    • Red Flags: Single point of failure, no HA, poor design
  3. Health Checks
    • Health check configuration
    • Failover mechanisms
    • Server monitoring
    • Red Flags: No health checks, poor failover, no monitoring

Problem-Solving Approach

  1. Traffic Distribution
    • Algorithm selection
    • Load balancing strategy
    • Performance optimization
    • Red Flags: No strategy, poor algorithm, no optimization
  2. High Availability
    • Redundancy design
    • Failover mechanisms
    • Disaster recovery
    • Red Flags: No redundancy, poor failover, no recovery

System Design Skills

  1. Scalability
    • Horizontal scaling
    • Dynamic server addition
    • Load distribution
    • Red Flags: No scaling, static setup, poor distribution
  2. Performance
    • Response time optimization
    • Throughput maximization
    • Resource utilization
    • Red Flags: No optimization, poor performance, waste

Communication Skills

  1. Clear Explanation
    • Explains load balancing concepts
    • Discusses trade-offs
    • Justifies design decisions
    • Red Flags: Unclear explanations, no justification, confusing

Meta-Specific Focus

  1. System Design Expertise
    • Understanding of distributed systems
    • Load balancing mastery
    • High availability patterns
    • Key: Demonstrate system design expertise
  2. Performance Optimization
    • Algorithm selection
    • Performance tuning
    • Resource optimization
    • Key: Show performance optimization skills

Summary

Load Balancer Key Points:

  • Traffic Distribution: Routes requests across multiple servers
  • Algorithms: Round robin, least connections, weighted, IP hash
  • High Availability: Multiple load balancers, health checks, failover
  • Session Persistence: Sticky sessions, session sharing
  • SSL Termination: HTTPS handling at load balancer
  • Performance: Optimizes response times and resource utilization

Common Use Cases:

  • Web applications
  • API services
  • Microservices
  • High-traffic websites
  • Global applications

Best Practices:

  • Use appropriate algorithm
  • Implement health checks
  • Configure high availability
  • Enable SSL termination
  • Monitor performance
  • Implement session persistence when needed
  • Use multiple load balancers for critical systems

Load balancers are essential for building scalable, highly available, and performant distributed systems.