Introduction

etcd is a distributed, reliable key-value store for the most critical data of a distributed system. It’s used for service discovery, configuration management, and coordination. Understanding etcd is essential for system design interviews involving distributed systems and coordination.

This guide covers:

  • etcd Fundamentals: Key-value storage, Raft consensus, and watch mechanism
  • Operations: Get, put, delete, and watch operations
  • Leases: TTL-based key expiration
  • Transactions: Atomic operations
  • Best Practices: Performance, security, and monitoring

What is etcd?

etcd is a key-value store that is:

  • Distributed: Runs across multiple nodes
  • Consistent: Uses the Raft consensus algorithm
  • Reliable: Replicates and persists data
  • Watchable: Pushes real-time change notifications
  • Fast: Serves low-latency reads and writes

Key Concepts

Key: String identifier for stored value

Value: Data stored in etcd

Revision: Cluster-wide, monotonically increasing version number incremented on every change (see the example below)

Lease: TTL-based expiration mechanism

Watch: Real-time change notification

Transaction: Atomic multi-operation
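
For example, etcdctl can print a key's revision metadata directly; the -w json flag switches the output format (key and value here are illustrative):

etcdctl put /config/app/name "demo"
etcdctl get /config/app/name -w json
# the JSON response carries the cluster revision plus the key's
# create_revision, mod_revision, and version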

Architecture

High-Level Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │     │   Client    │     │   Client    │
│ Application │     │ Application │     │ Application │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┴───────────────────┘
                           │
                           ▼
              ┌─────────────────────────┐
              │     etcd Cluster        │
              │                         │
              │  ┌──────────┐           │
              │  │  Leader  │           │
              │  │   Node   │           │
              │  └────┬─────┘           │
              │       │                 │
              │  ┌────┴──────┐          │
              │  │ Followers │          │
              │  │  (Raft)   │          │
              │  └───────────┘          │
              │                         │
              │  ┌───────────────────┐  │
              │  │  Key-Value Store  │  │
              │  │   (Distributed)   │  │
              │  └───────────────────┘  │
              └─────────────────────────┘

Explanation:

  • Client Applications: Applications (e.g., Kubernetes, distributed systems) that connect to etcd for distributed key-value storage and coordination.
  • etcd Cluster: A collection of etcd nodes working together. Typically 3, 5, or 7 nodes for fault tolerance.
  • Leader Node: One node in the cluster that coordinates writes using the Raft consensus algorithm.
  • Follower Nodes: Other nodes in the cluster that replicate the leader’s writes and participate in consensus voting.
  • Key-Value Store: Distributed key-value storage that provides strong consistency and high availability.

Core Architecture

┌─────────────────────────────────────────────────────────┐
│              etcd Cluster                               │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │         etcd Node 1 (Leader)                     │   │
│  │  (Raft Consensus, Key-Value Store)               │   │
│  └──────────────────────────────────────────────────┘   │
│                          │                              │
│  ┌──────────────────────────────────────────────────┐   │
│  │         etcd Node 2 (Follower)                   │   │
│  │  (Replication, Consensus)                        │   │
│  └──────────────────────────────────────────────────┘   │
│                          │                              │
│  ┌──────────────────────────────────────────────────┐   │
│  │         etcd Node 3 (Follower)                   │   │
│  │  (Replication, Consensus)                        │   │
│  └──────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘

Basic Operations

Put (Write)

CLI:

etcdctl put /config/database/url "postgresql://localhost:5432/mydb"

Go Client:

package main

import (
    "context"
    "fmt"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{"localhost:2379"},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    _, err = cli.Put(ctx, "/config/database/url",
        "postgresql://localhost:5432/mydb")
    cancel()

    if err != nil {
        panic(err)
    }
    fmt.Println("put /config/database/url")
}

Python Client:

import etcd3

client = etcd3.client(host='localhost', port=2379)
client.put('/config/database/url', 'postgresql://localhost:5432/mydb')

Get (Read)

CLI:

etcdctl get /config/database/url

Go Client:

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
resp, err := cli.Get(ctx, "/config/database/url")
cancel()

if err != nil {
    panic(err)
}

for _, ev := range resp.Kvs {
    fmt.Printf("%s: %s\n", ev.Key, ev.Value)
}

With Revision:

resp, err := cli.Get(ctx, "/config/database/url", 
    clientv3.WithRev(100))
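
With Prefix (a brief sketch; lists every key under a prefix, reusing cli and ctx):

// Range read over everything stored under /config/, sorted by key
resp, err := cli.Get(ctx, "/config/",
    clientv3.WithPrefix(),
    clientv3.WithSort(clientv3.SortByKey, clientv3.SortAscend))

Python Client (a sketch using the client from the Put example; python-etcd3's get returns the value together with key metadata):

value, metadata = client.get('/config/database/url')
print(value, metadata.mod_revision)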

Delete

CLI:

etcdctl del /config/database/url

Go Client:

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
_, err := cli.Delete(ctx, "/config/database/url")
cancel()

Delete with Prefix:

_, err := cli.Delete(ctx, "/config/", clientv3.WithPrefix())

Watch Mechanism

Watch Key

Go Client:

watchChan := cli.Watch(context.Background(), "/config/database/url")

for watchResp := range watchChan {
    for _, ev := range watchResp.Events {
        switch ev.Type {
        case clientv3.EventTypePut:
            fmt.Printf("PUT: %s = %s\n", ev.Kv.Key, ev.Kv.Value)
        case clientv3.EventTypeDelete:
            fmt.Printf("DELETE: %s\n", ev.Kv.Key)
        }
    }
}

Watch with Revision:

watchChan := cli.Watch(context.Background(), "/config/database/url",
    clientv3.WithRev(100))

Watch with Prefix:

watchChan := cli.Watch(context.Background(), "/config/",
    clientv3.WithPrefix())

Python Client:

import etcd3

client = etcd3.client(host='localhost', port=2379)

def watch_callback(event):
    if isinstance(event, etcd3.events.PutEvent):
        print(f"PUT: {event.key} = {event.value}")
    elif isinstance(event, etcd3.events.DeleteEvent):
        print(f"DELETE: {event.key}")

client.add_watch_callback('/config/database/url', watch_callback)

Leases

Create Lease

Go Client:

// Grant lease with 60 second TTL
lease, err := cli.Grant(context.TODO(), 60)
if err != nil {
    panic(err)
}

// Put with lease
_, err = cli.Put(context.TODO(), "/session/lock", "locked",
    clientv3.WithLease(lease.ID))

// Keep lease alive
keepAliveChan, err := cli.KeepAlive(context.TODO(), lease.ID)
if err != nil {
    panic(err)
}

go func() {
    for ka := range keepAliveChan {
        fmt.Printf("Lease kept alive: %d\n", ka.ID)
    }
}()

Revoke Lease:

_, err := cli.Revoke(context.TODO(), lease.ID)
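
Python Client (a sketch based on python-etcd3's lease helpers, reusing the client from earlier examples):

# Grant a 60-second lease and attach a key to it
lease = client.lease(60)
client.put('/session/lock', 'locked', lease=lease)

# Refresh (keep alive) the lease, and revoke it when done
lease.refresh()
lease.revoke()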

Use Cases

Session Management:

// Create session with lease
lease, _ := cli.Grant(context.TODO(), 30)
cli.Put(context.TODO(), "/sessions/user123", "active",
    clientv3.WithLease(lease.ID))

// Keep session alive
cli.KeepAlive(context.TODO(), lease.ID)

Distributed Locks:

// Acquire lock with lease
lease, _ := cli.Grant(context.TODO(), 10)
cli.Put(context.TODO(), "/locks/resource1", "locked",
    clientv3.WithLease(lease.ID))

// Keep lock alive
cli.KeepAlive(context.TODO(), lease.ID)
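
A plain Put with a lease does not by itself guarantee mutual exclusion, since two clients can both write the key. The Go client ships a concurrency package that builds proper locks on top of leases and transactions; a minimal sketch:

import (
    "context"

    clientv3 "go.etcd.io/etcd/client/v3"
    "go.etcd.io/etcd/client/v3/concurrency"
)

func withLock(cli *clientv3.Client) error {
    // The session grants a lease and keeps it alive in the background.
    session, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
    if err != nil {
        return err
    }
    defer session.Close()

    mutex := concurrency.NewMutex(session, "/locks/resource1")
    if err := mutex.Lock(context.TODO()); err != nil {
        return err
    }
    defer mutex.Unlock(context.TODO())

    // ... critical section ...
    return nil
}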

Transactions

Compare-and-Swap

Go Client:

txn := cli.Txn(context.TODO())

txn.If(clientv3.Compare(clientv3.Value("/config/version"), "=", "1")).
    Then(clientv3.OpPut("/config/version", "2")).
    Else(clientv3.OpGet("/config/version"))

resp, err := txn.Commit()
if err != nil {
    panic(err)
}

if resp.Succeeded {
    fmt.Println("Compare succeeded; version updated")
} else {
    fmt.Println("Compare failed; current value returned by the Else branch")
}
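
A related, very common pattern is "create only if the key does not already exist": a key that has never been written has CreateRevision 0, so the transaction compares against 0. A brief sketch (key name is illustrative):

resp, err := cli.Txn(context.TODO()).
    If(clientv3.Compare(clientv3.CreateRevision("/locks/resource1"), "=", 0)).
    Then(clientv3.OpPut("/locks/resource1", "owner-a")).
    Commit()
if err != nil {
    panic(err)
}
if resp.Succeeded {
    fmt.Println("key created")
} else {
    fmt.Println("key already exists")
}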

Atomic Operations

Increment Counter (compare-and-swap; succeeds only if the counter currently equals "10"; a general retry version follows below):

txn := cli.Txn(context.TODO())

txn.If(clientv3.Compare(clientv3.Value("/counter"), "=", "10")).
    Then(clientv3.OpPut("/counter", "11")).
    Else(clientv3.OpGet("/counter"))

resp, err := txn.Commit()
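
The example above only commits when the counter happens to be exactly "10". A general increment re-reads the value and retries the compare-and-swap until it wins; a sketch (needs strconv in addition to the imports used earlier):

// incrementCounter atomically increments a decimal counter stored at key.
func incrementCounter(cli *clientv3.Client, key string) (int, error) {
    for {
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        getResp, err := cli.Get(ctx, key)
        if err != nil {
            cancel()
            return 0, err
        }

        current := 0
        modRev := int64(0) // 0 means the key does not exist yet
        if len(getResp.Kvs) > 0 {
            current, err = strconv.Atoi(string(getResp.Kvs[0].Value))
            if err != nil {
                cancel()
                return 0, err
            }
            modRev = getResp.Kvs[0].ModRevision
        }

        // Commit only if nobody modified the key since our read.
        txnResp, err := cli.Txn(ctx).
            If(clientv3.Compare(clientv3.ModRevision(key), "=", modRev)).
            Then(clientv3.OpPut(key, strconv.Itoa(current+1))).
            Commit()
        cancel()
        if err != nil {
            return 0, err
        }
        if txnResp.Succeeded {
            return current + 1, nil
        }
        // Lost the race; re-read and retry.
    }
}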

Raft Consensus

Leader Election

How it Works:

  1. Nodes start as followers
  2. A follower that hears nothing from a leader before its election timeout becomes a candidate and requests votes
  3. A candidate that wins a majority of votes becomes the leader
  4. The leader handles all writes and appends them to its log
  5. Followers replicate the leader's log and acknowledge entries

Consistency Guarantees

  • Linearizability: The default for reads and writes; every request reflects all previously completed writes
  • Total ordering: Every change receives a monotonically increasing revision, giving a single global order of updates
  • Serializable reads: An opt-in mode that serves reads from a member's local state for lower latency, at the cost of possibly stale data
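
In the Go client the read mode is chosen per request; a brief sketch, reusing cli and ctx from earlier examples:

// Linearizable read (default): confirmed through the quorum, never stale.
resp, err := cli.Get(ctx, "/config/database/url")

// Serializable read: answered from the contacted member's local state;
// lower latency, but may lag slightly behind the leader.
resp, err = cli.Get(ctx, "/config/database/url", clientv3.WithSerializable())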

Performance Optimization

Connection Pooling

Reuse Connections: Create one client per process and share it; the Go client maintains gRPC connections to all configured endpoints and is safe for concurrent use. Message size limits can be raised for large requests:

cli, err := clientv3.New(clientv3.Config{
    Endpoints:   []string{"localhost:2379"},
    DialTimeout: 5 * time.Second,
    MaxCallSendMsgSize: 10 * 1024 * 1024, // 10MB
    MaxCallRecvMsgSize: 10 * 1024 * 1024, // 10MB
})

Batch Operations

Batch Put:

ops := []clientv3.Op{
    clientv3.OpPut("/key1", "value1"),
    clientv3.OpPut("/key2", "value2"),
    clientv3.OpPut("/key3", "value3"),
}

_, err := cli.Txn(context.TODO()).Then(ops...).Commit()
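
Note that the server caps the number of operations in a single transaction (128 by default), so very large batches must be split or the limit raised when starting the server:

etcd --max-txn-ops=256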

Best Practices

1. Key Design

  • Use hierarchical keys
  • Keep keys short
  • Use consistent naming
  • Avoid too many keys

Good:

/config/database/url
/config/database/port
/sessions/user123

Bad:

key1
data
info
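
One payoff of hierarchical keys is that related entries can be read or watched with a single prefix query:

etcdctl get /config/ --prefix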

2. Value Size

  • Keep values small (etcd's default request size limit is 1.5 MiB; KB-sized values perform best)
  • Use compression for large values
  • Store references instead of data

3. Leases

  • Use leases for temporary data
  • Keep leases alive
  • Handle lease expiration
  • Use appropriate TTL

4. Watch

  • Watch specific keys when possible
  • Use revisions to resume watches after reconnecting
  • Handle watch errors such as compaction (see the sketch below)
  • Limit watch scope
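
Putting these together, a resilient watcher re-creates the watch after errors and resumes from the last revision it processed; a sketch, reusing cli:

lastRev := int64(0)
for {
    opts := []clientv3.OpOption{clientv3.WithPrefix()}
    if lastRev > 0 {
        // Resume just after the last event we processed.
        opts = append(opts, clientv3.WithRev(lastRev+1))
    }
    watchChan := cli.Watch(context.Background(), "/config/", opts...)
    for watchResp := range watchChan {
        if err := watchResp.Err(); err != nil {
            // e.g. the requested revision was compacted; fall back to
            // watching from "now" (a real system would re-read state first).
            fmt.Printf("watch error: %v\n", err)
            lastRev = 0
            break
        }
        for _, ev := range watchResp.Events {
            lastRev = ev.Kv.ModRevision
            fmt.Printf("%s %s = %s\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
        }
    }
    // The channel closed or an error occurred; the outer loop re-watches.
}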

What Interviewers Look For

Distributed Systems Understanding

  1. etcd Concepts
    • Understanding of key-value store
    • Raft consensus
    • Watch mechanism
    • Red Flags: No etcd understanding, wrong consensus, no watch
  2. Coordination Patterns
    • Distributed locks
    • Leader election
    • Configuration management
    • Red Flags: No coordination, poor locks, no election
  3. Consistency
    • Raft consensus
    • Linearizability
    • Eventual consistency
    • Red Flags: No consensus, wrong consistency, poor understanding

Problem-Solving Approach

  1. Key-Value Design
    • Key organization
    • Value size management
    • Lease usage
    • Red Flags: Poor keys, large values, no leases
  2. Coordination Design
    • Lock implementation
    • Leader election
    • Configuration management
    • Red Flags: Poor locks, no election, no config

System Design Skills

  1. Distributed Architecture
    • etcd cluster design
    • Raft configuration
    • Performance optimization
    • Red Flags: No cluster, poor config, no optimization
  2. Scalability
    • Cluster sizing
    • Performance tuning
    • Load distribution
    • Red Flags: Poor sizing, no tuning, no distribution

Communication Skills

  1. Clear Explanation
    • Explains etcd concepts
    • Discusses trade-offs
    • Justifies design decisions
    • Red Flags: Unclear explanations, no justification, confusing

Meta-Specific Focus

  1. Distributed Systems Expertise
    • Understanding of distributed coordination
    • etcd mastery
    • Consensus algorithms
    • Key: Demonstrate distributed systems expertise
  2. System Design Skills
    • Can design coordination systems
    • Understands consensus challenges
    • Makes informed trade-offs
    • Key: Show practical distributed systems design skills

Summary

etcd Key Points:

  • Distributed Key-Value Store: Reliable key-value storage
  • Raft Consensus: Strong consistency guarantees
  • Watch Mechanism: Real-time change notifications
  • Leases: TTL-based expiration
  • Transactions: Atomic operations

Common Use Cases:

  • Service discovery
  • Configuration management
  • Distributed locks
  • Leader election
  • Coordination
  • Metadata storage

Best Practices:

  • Design hierarchical keys
  • Keep values small
  • Use leases for temporary data
  • Watch specific keys
  • Optimize connection usage
  • Monitor cluster health
  • Handle failures gracefully

etcd is a powerful distributed key-value store that provides reliable coordination for distributed systems.