Introduction
Designing a publish-subscribe (pub/sub) messaging system is a complex distributed systems problem that tests your ability to build low-latency, real-time messaging infrastructure. The system must support publishers sending messages to topics and subscribers receiving those messages in near real-time, with the key characteristic that messages are ephemeral—they are lost if no subscribers are currently listening.
This post provides a detailed walkthrough of designing a pub/sub system, covering key architectural decisions, topic routing, connection management, ephemeral delivery guarantees, hot-topic handling, backpressure, and horizontal scaling. This is a common system design interview question that tests your understanding of distributed systems, WebSocket connections, message fan-out, and real-time communication patterns.
Table of Contents
- Problem Statement
- Requirements
- Capacity Estimation
- Core Entities
- API
- Data Flow
- High-Level Design
- Deep Dive
- What Interviewers Look For
- Summary
Problem Statement
Design a publish-subscribe messaging system where:
- Publishers can send messages to topics
- Subscribers can listen to topics and receive messages in real-time
- Messages are ephemeral—lost if no subscribers are currently listening
- Low-latency delivery to connected subscribers
- Support for topic management and permissions
- Handle hot topics with many subscribers
- Support horizontal scaling
Scale Requirements:
- 1M+ concurrent subscribers
- 100K+ topics
- 1M+ messages per second
- < 50ms message delivery latency (P95)
- Support topics with 100K+ subscribers
- Handle connection churn (frequent connect/disconnect)
Key Characteristics:
- Ephemeral Delivery: Messages only delivered to currently connected subscribers
- At-Most-Once: Messages may be lost (no durability)
- Real-Time: Low-latency delivery (< 50ms)
- No Retention: Messages not stored or replayed
- Topic-Based: Messages routed by topic
Requirements
Functional Requirements
Core Features:
- Topic Management: Create topics, manage permissions (who can publish/subscribe)
- Publish Messages: Publishers send messages to topics
- Subscribe to Topics: Subscribers connect and listen to topics
- Unsubscribe: Subscribers can unsubscribe from topics
- Real-Time Delivery: Messages delivered immediately to connected subscribers
- Presence: See who’s subscribed to a topic
- Ephemeral Delivery: Messages dropped if no subscribers connected
Topic Permissions:
- Public: Anyone can publish/subscribe
- Private: Only authorized users can publish/subscribe
- Publish-Only: Anyone can publish, only authorized can subscribe
- Subscribe-Only: Only authorized can publish, anyone can subscribe
Out of Scope:
- Message persistence/durability
- Message replay/history
- Message ordering guarantees (best-effort)
- Message acknowledgments
- Dead letter queues
Non-Functional Requirements
- Availability: 99.9% uptime
- Performance:
- Message delivery latency: < 50ms (P95)
- Topic creation: < 100ms
- Subscribe/unsubscribe: < 100ms
- Scalability: Handle 1M+ concurrent subscribers, 1M+ messages/second
- Consistency: Eventually consistent for presence (who’s subscribed)
- Ephemeral: No message storage, messages lost if no subscribers
- Real-Time: Messages delivered immediately to connected subscribers
Capacity Estimation
Traffic Estimates
- Concurrent Subscribers: 1M+
- Topics: 100K+
- Messages per Second: 1M+ (peak)
- Average Subscribers per Topic: 10
- Hot Topics: 100K+ subscribers per topic
- Connection Churn: 10% per minute (100K connects/disconnects per minute)
Storage Estimates
Topic Metadata:
- 100K topics × 1KB = 100MB
- Topic permissions, settings
Connection State:
- 1M connections × 256 bytes = 256MB (in-memory)
- Connection-to-topic mappings
Presence Data:
- 1M subscribers × 128 bytes = 128MB (in-memory)
- Who’s subscribed to which topics
Total Storage: ~500MB (mostly in-memory, no message storage)
Core Entities
Topic
- Attributes: topic_id, name, owner_id, permissions, created_at, subscriber_count
- Permissions: PUBLIC, PRIVATE, PUBLISH_ONLY, SUBSCRIBE_ONLY
- Relationships: Has many subscribers, receives many messages
Subscription
- Attributes: subscription_id, topic_id, user_id, subscribed_at, connection_id
- Relationships: Links user to topic via connection
- Purpose: Track active subscriptions
Message
- Attributes: message_id, topic_id, publisher_id, payload, timestamp
- Relationships: Belongs to topic, sent by publisher
- Purpose: Ephemeral message (not stored)
Connection
- Attributes: connection_id, user_id, connected_at, last_heartbeat, topics
- Relationships: Has many subscriptions
- Purpose: WebSocket/SSE connection
API
1. Create Topic
POST /api/v1/topics
Headers:
- Authorization: Bearer <token>
Body:
- name: string
- permissions: string (PUBLIC, PRIVATE, PUBLISH_ONLY, SUBSCRIBE_ONLY)
Response:
- topic_id: string
- name: string
- permissions: string
2. Publish Message
POST /api/v1/topics/{topic_id}/publish
Headers:
- Authorization: Bearer <token>
Body:
- payload: object (JSON)
Response:
- message_id: string
- delivered_to: integer (number of subscribers)
3. Subscribe to Topic (WebSocket)
WS /ws/subscribe?topic_id={topic_id}&token={auth_token}
Messages:
- Incoming: {"type": "message", "topic_id": "...", "payload": {...}}
- Incoming: {"type": "presence", "topic_id": "...", "subscribers": [...]}
- Outgoing: {"type": "subscribe", "topic_id": "..."}
- Outgoing: {"type": "unsubscribe", "topic_id": "..."}
4. Subscribe to Topic (SSE)
GET /api/v1/topics/{topic_id}/subscribe
Headers:
- Authorization: Bearer <token>
- Accept: text/event-stream
Response:
- Stream of Server-Sent Events
- data: {"type": "message", "payload": {...}}
5. Unsubscribe from Topic
DELETE /api/v1/topics/{topic_id}/subscribe
Headers:
- Authorization: Bearer <token>
Response:
- success: boolean
6. Get Topic Info
GET /api/v1/topics/{topic_id}
Headers:
- Authorization: Bearer <token>
Response:
- topic_id: string
- name: string
- subscriber_count: integer
- permissions: string
7. List Subscriptions
GET /api/v1/topics/{topic_id}/subscribers
Headers:
- Authorization: Bearer <token>
Response:
- subscribers: array of user objects
- count: integer
Data Flow
Publish Message Flow
1. Publisher → API Gateway
2. API Gateway → Auth Service (validate token)
3. API Gateway → Pub/Sub Service
4. Pub/Sub Service:
a. Validate topic exists
b. Check publish permission
c. Get list of active subscribers for topic
d. If no subscribers: Drop message (ephemeral)
e. If subscribers exist:
- Create message object
- Fan-out to all subscriber connections
- Return delivery count
Subscribe Flow
1. Client → WebSocket/SSE Connection
2. Connection Manager:
a. Authenticate connection
b. Create connection record
c. Add subscription to topic
d. Update topic subscriber count
e. Send initial presence info
f. Start delivering messages
Message Delivery Flow
1. Publisher publishes message to topic
2. Pub/Sub Service:
a. Get all active subscriptions for topic
b. For each subscription:
- Get connection
- Check connection is alive (heartbeat)
- Send message via WebSocket/SSE
c. If connection dead: Remove subscription
d. Return delivery count
Unsubscribe Flow
1. Client → Unsubscribe request
2. Connection Manager:
a. Remove subscription from topic
b. Update topic subscriber count
c. Stop delivering messages
d. Return success
High-Level Design
┌─────────────────────────────────────────────────────────┐
│ Publishers │
│ (Send Messages to Topics) │
└────────────────────┬────────────────────────────────────┘
│
│ HTTP POST
│
┌────────────────────▼────────────────────────────────────┐
│ API Gateway / LB │
│ (Rate Limiting, Auth) │
└────────────────────┬────────────────────────────────────┘
│
│
┌────────────────────▼────────────────────────────────────┐
│ Pub/Sub Service │
│ (Topic Management, Message Routing, Fan-Out) │
└──────┬───────────────────────────────────┬──────────────┘
│ │
│ │
┌──────▼──────────┐ ┌─────────▼───────────┐
│ Topic Router │ │ Connection Manager │
│ (Route by Topic)│ │ (WebSocket/SSE) │
└──────┬──────────┘ └─────────┬───────────┘
│ │
│ │
┌──────▼───────────────────────────────────▼───────────┐
│ Redis Cluster │
│ (Topic Metadata, Subscriptions, Presence) │
└──────┬─────────────────────────────────────────────────┘
│
│
┌──────▼─────────────────────────────────────────────────┐
│ Message Queue (Optional) │
│ (For Cross-Server Fan-Out, Hot Topics) │
└───────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Subscribers │
│ (WebSocket/SSE Connections) │
└─────────────────────────────────────────────────────────┘
Deep Dive
Component Design
1. Pub/Sub Service
- Responsibilities: Topic management, message routing, fan-out
- Optimization:
- Fast topic lookups
- Efficient subscriber enumeration
- Message batching for hot topics
2. Topic Router
- Responsibilities: Route messages to correct topic handlers
- Optimization:
- Hash-based routing
- Topic sharding
- Load balancing
3. Connection Manager
- Responsibilities: Manage WebSocket/SSE connections, subscriptions
- Optimization:
- Connection pooling
- Efficient connection-to-topic mapping
- Heartbeat management
4. Redis Cluster
- Topic Metadata: Topic info, permissions
- Subscriptions: Topic → [connection_ids]
- Presence: Who’s subscribed to which topics
- Connection State: Connection info, heartbeat
Topic Routing
Topic Sharding
Challenge: Distribute topics across multiple servers.
Solution: Hash-based sharding by topic_id
class TopicRouter:
def __init__(self, num_shards=100):
self.num_shards = num_shards
self.shards = [TopicShard(i) for i in range(num_shards)]
def get_shard(self, topic_id):
"""Route topic to shard"""
shard_index = hash(topic_id) % self.num_shards
return self.shards[shard_index]
def publish(self, topic_id, message):
"""Route publish to correct shard"""
shard = self.get_shard(topic_id)
return shard.publish(topic_id, message)
Benefits:
- Even distribution of topics
- Independent scaling per shard
- Isolated failures
Topic Metadata Storage
Redis Hash:
# Topic metadata
topic:{topic_id} = {
"name": "chat-room-1",
"owner_id": "user_123",
"permissions": "PUBLIC",
"subscriber_count": 1000,
"created_at": "2025-11-25T10:00:00Z"
}
Connection Management
WebSocket Connection
Connection Lifecycle:
- Connect: Client opens WebSocket connection
- Authenticate: Validate token, create connection record
- Subscribe: Client subscribes to topics
- Heartbeat: Ping/pong every 30 seconds
- Disconnect: Clean up subscriptions
Connection Mapping:
class ConnectionManager:
def __init__(self):
self.connections = {} # connection_id -> Connection
self.topic_subscriptions = {} # topic_id -> [connection_ids]
self.user_connections = {} # user_id -> [connection_ids]
def add_connection(self, connection_id, user_id, websocket):
"""Add new connection"""
connection = Connection(
connection_id=connection_id,
user_id=user_id,
websocket=websocket,
connected_at=now()
)
self.connections[connection_id] = connection
if user_id not in self.user_connections:
self.user_connections[user_id] = []
self.user_connections[user_id].append(connection_id)
def subscribe(self, connection_id, topic_id):
"""Subscribe connection to topic"""
if topic_id not in self.topic_subscriptions:
self.topic_subscriptions[topic_id] = []
if connection_id not in self.topic_subscriptions[topic_id]:
self.topic_subscriptions[topic_id].append(connection_id)
# Update Redis
redis.sadd(f"topic:{topic_id}:subscribers", connection_id)
def unsubscribe(self, connection_id, topic_id):
"""Unsubscribe connection from topic"""
if topic_id in self.topic_subscriptions:
self.topic_subscriptions[topic_id].remove(connection_id)
# Update Redis
redis.srem(f"topic:{topic_id}:subscribers", connection_id)
Heartbeat Management
Purpose: Detect dead connections
Implementation:
class ConnectionManager:
def heartbeat(self, connection_id):
"""Update connection heartbeat"""
if connection_id in self.connections:
self.connections[connection_id].last_heartbeat = now()
redis.setex(
f"connection:{connection_id}:heartbeat",
60, # 60 second TTL
str(now())
)
def check_connections(self):
"""Check for dead connections"""
for connection_id, connection in self.connections.items():
if now() - connection.last_heartbeat > 60:
# Connection dead, cleanup
self.remove_connection(connection_id)
def remove_connection(self, connection_id):
"""Remove dead connection"""
connection = self.connections[connection_id]
# Unsubscribe from all topics
for topic_id in connection.subscribed_topics:
self.unsubscribe(connection_id, topic_id)
# Remove connection
del self.connections[connection_id]
self.user_connections[connection.user_id].remove(connection_id)
Message Fan-Out
Direct Fan-Out
For Small Topics (< 1000 subscribers):
def publish_message(topic_id, message):
"""Publish message to topic"""
# Get all subscribers
subscribers = get_topic_subscribers(topic_id)
if not subscribers:
# No subscribers, drop message (ephemeral)
return {"delivered_to": 0}
# Fan-out to all subscribers
delivered = 0
for connection_id in subscribers:
connection = get_connection(connection_id)
if connection and connection.is_alive():
try:
connection.send(message)
delivered += 1
except:
# Connection dead, remove subscription
unsubscribe(connection_id, topic_id)
return {"delivered_to": delivered}
Distributed Fan-Out (Hot Topics)
For Large Topics (> 1000 subscribers):
def publish_message_hot_topic(topic_id, message):
"""Publish to hot topic using message queue"""
# Publish to Redis pub/sub or Kafka
redis.publish(f"topic:{topic_id}", json.dumps(message))
# Or use Kafka for better scalability
kafka.produce(f"topic-{topic_id}", message)
Redis Pub/Sub:
# Publisher
redis.publish(f"topic:{topic_id}", json.dumps(message))
# Subscriber (each server)
redis_subscriber = redis.pubsub()
redis_subscriber.subscribe(f"topic:{topic_id}")
for message in redis_subscriber.listen():
# Fan-out to local connections
fan_out_to_local_connections(topic_id, message)
Ephemeral Delivery
No Message Storage
Key Design: Messages are never stored, only delivered to active subscribers.
Implementation:
def publish_message(topic_id, message):
"""Publish message (ephemeral)"""
# Get active subscribers
subscribers = get_active_subscribers(topic_id)
if not subscribers:
# No subscribers, message is dropped
return {"delivered_to": 0, "dropped": True}
# Deliver to active subscribers only
delivered = fan_out_to_subscribers(subscribers, message)
# Message is not stored anywhere
return {"delivered_to": delivered, "dropped": False}
Benefits:
- No storage overhead
- Simple implementation
- Low latency (no disk I/O)
Trade-offs:
- Messages lost if no subscribers
- No message replay
- No durability guarantees
Hot-Topic Handling
Challenge
Topics with 100K+ subscribers create fan-out bottlenecks.
Solution: Multi-Level Fan-Out
Approach 1: Redis Pub/Sub
# Publisher publishes to Redis channel
redis.publish(f"topic:{topic_id}", message)
# Each server subscribes to Redis channel
# Then fans out to local connections
redis_subscriber.subscribe(f"topic:{topic_id}")
for message in redis_subscriber.listen():
local_subscribers = get_local_subscribers(topic_id)
fan_out_to_connections(local_subscribers, message)
Approach 2: Kafka Topics
# Publisher publishes to Kafka topic
kafka.produce(f"pubsub-topic-{topic_id}", message)
# Each server consumes from Kafka
# Then fans out to local connections
for message in kafka.consume(f"pubsub-topic-{topic_id}"):
local_subscribers = get_local_subscribers(topic_id)
fan_out_to_connections(local_subscribers, message)
Approach 3: Topic Partitioning
# Partition hot topic across multiple servers
def get_subscriber_shard(topic_id, connection_id):
"""Route subscriber to shard"""
return hash(f"{topic_id}:{connection_id}") % num_shards
# Each shard handles subset of subscribers
# Publisher sends to all shards
for shard in range(num_shards):
shard_connections = get_shard_subscribers(topic_id, shard)
fan_out_to_connections(shard_connections, message)
Backpressure
Challenge
Subscribers can’t keep up with message rate.
Solution: Connection-Level Backpressure
WebSocket Backpressure:
class Connection:
def __init__(self):
self.send_queue = queue.Queue(maxsize=1000)
self.is_ready = True
def send(self, message):
"""Send message with backpressure"""
try:
self.send_queue.put_nowait(message)
if self.is_ready:
self.flush_queue()
except queue.Full:
# Queue full, drop message or disconnect
self.handle_backpressure()
def flush_queue(self):
"""Flush queued messages"""
self.is_ready = False
while not self.send_queue.empty():
message = self.send_queue.get()
try:
self.websocket.send(message)
except:
# Connection dead, stop flushing
return
self.is_ready = True
Server-Level Backpressure:
def publish_message(topic_id, message):
"""Publish with server-level backpressure"""
subscribers = get_topic_subscribers(topic_id)
# Check server load
if server_load > 0.8:
# High load, drop low-priority messages
if message.priority == "LOW":
return {"delivered_to": 0, "dropped": "backpressure"}
# Fan-out with rate limiting
delivered = 0
for connection_id in subscribers:
connection = get_connection(connection_id)
if connection and connection.can_send():
connection.send(message)
delivered += 1
return {"delivered_to": delivered}
Protocols
WebSocket
Advantages:
- Full-duplex communication
- Low overhead
- Binary support
Implementation:
# Server
import websockets
async def handle_websocket(websocket, path):
connection_id = generate_id()
user_id = authenticate(websocket)
connection_manager.add_connection(connection_id, user_id, websocket)
try:
async for message in websocket:
data = json.loads(message)
if data['type'] == 'subscribe':
connection_manager.subscribe(connection_id, data['topic_id'])
elif data['type'] == 'unsubscribe':
connection_manager.unsubscribe(connection_id, data['topic_id'])
except websockets.exceptions.ConnectionClosed:
connection_manager.remove_connection(connection_id)
Server-Sent Events (SSE)
Advantages:
- Simpler than WebSocket
- Automatic reconnection
- HTTP-based
Implementation:
# Server
from flask import Response, stream_with_context
@app.route('/api/v1/topics/<topic_id>/subscribe')
def subscribe_sse(topic_id):
user_id = authenticate(request)
def event_stream():
connection_id = generate_id()
connection_manager.add_connection(connection_id, user_id, None)
connection_manager.subscribe(connection_id, topic_id)
try:
while True:
# Get messages for this topic
message = message_queue.get(topic_id, connection_id)
yield f"data: {json.dumps(message)}\n\n"
except GeneratorExit:
connection_manager.remove_connection(connection_id)
return Response(stream_with_context(event_stream()),
mimetype='text/event-stream')
Failure Handling
Connection Failures
Scenario: Connection drops during message delivery.
Solution:
- Detect dead connections via heartbeat
- Remove subscription on disconnect
- Continue delivering to other subscribers
Server Failures
Scenario: Server crashes, connections lost.
Solution:
- Clients reconnect automatically
- Re-subscribe to topics on reconnect
- Load balancer routes to healthy servers
Topic Shard Failures
Scenario: Shard handling topic fails.
Solution:
- Failover to replica shard
- Re-route topic to different shard
- Clients reconnect and re-subscribe
Trade-offs and Optimizations
Trade-offs
- Durability vs Latency
- Choice: Ephemeral (no durability)
- Reason: Low latency, simplicity
- Benefit: Fast delivery, no storage overhead
- At-Most-Once vs At-Least-Once
- Choice: At-most-once (may lose messages)
- Reason: Simpler, no deduplication needed
- Benefit: Lower latency, simpler implementation
- Ordering vs Performance
- Choice: Best-effort ordering
- Reason: Performance over strict ordering
- Benefit: Better throughput, lower latency
Optimizations
- Message Batching
- Batch multiple messages for same topic
- Reduce WebSocket overhead
- Better throughput
- Connection Pooling
- Reuse WebSocket connections
- Reduce connection overhead
- Better resource utilization
- Topic Sharding
- Distribute topics across servers
- Better load distribution
- Isolated failures
- Caching
- Cache topic metadata
- Cache subscription lists
- Reduce database queries
What Interviewers Look For
Distributed Systems Skills
- Low-Latency Fan-Out
- Efficient message distribution
- Direct fan-out for small topics
- Distributed fan-out for hot topics
- Red Flags: Inefficient fan-out, no hot-topic handling, high latency
- Connection Management
- WebSocket/SSE connection handling
- Heartbeat management
- Connection lifecycle
- Red Flags: No heartbeat, dead connections, poor lifecycle management
- Ephemeral Delivery
- No message storage
- Drop messages if no subscribers
- Clear delivery guarantees
- Red Flags: Storing messages, durability, over-engineering
Problem-Solving Approach
- Trade-off Awareness
- At-most-once vs at-least-once
- Durability vs latency
- Ordering vs performance
- Red Flags: No trade-offs, wrong trade-offs, over-engineering
- Hot-Topic Handling
- Recognize hot topics
- Multi-level fan-out
- Topic partitioning
- Red Flags: No hot-topic handling, single-level fan-out, bottlenecks
- Backpressure
- Connection-level backpressure
- Server-level backpressure
- Message dropping strategies
- Red Flags: No backpressure, resource exhaustion, no dropping
System Design Skills
- Component Design
- Clear service boundaries
- Proper abstractions
- Efficient interfaces
- Red Flags: Monolithic design, poor abstractions, inefficient APIs
- Scalability
- Horizontal scaling
- Topic sharding
- Load balancing
- Red Flags: Vertical scaling only, no sharding, no load balancing
- Protocols
- WebSocket vs SSE trade-offs
- Protocol selection
- Implementation details
- Red Flags: Wrong protocol, no protocol choice, poor implementation
Communication Skills
- Clear Explanation
- Explains ephemeral delivery
- Discusses trade-offs
- Justifies design decisions
- Red Flags: Unclear explanations, no justification, confusing
- Architecture Diagrams
- Clear component diagram
- Shows message flow
- Fan-out architecture
- Red Flags: No diagrams, unclear diagrams, missing components
Meta-Specific Focus
- Real-Time Systems
- Understanding of low-latency requirements
- WebSocket expertise
- Connection management
- Key: Demonstrate real-time systems expertise
- Pub/Sub Patterns
- Topic-based routing
- Message fan-out
- Ephemeral delivery
- Key: Show pub/sub pattern mastery
- Production-Grade Robustness
- Failure handling
- Backpressure
- Hot-topic handling
- Key: Demonstrate production thinking
Summary
Designing a pub/sub system requires careful consideration of ephemeral delivery, low-latency fan-out, connection management, and horizontal scaling. Key design decisions include:
Architecture Highlights:
- Topic-based routing with hash-based sharding
- WebSocket/SSE for real-time delivery
- Ephemeral message delivery (no storage)
- Multi-level fan-out for hot topics
- Connection management with heartbeat
Key Patterns:
- Ephemeral Delivery: Messages dropped if no subscribers
- At-Most-Once: No durability, messages may be lost
- Topic Routing: Hash-based sharding for scalability
- Connection Management: WebSocket/SSE with heartbeat
- Hot-Topic Handling: Multi-level fan-out (Redis/Kafka)
Scalability Solutions:
- Horizontal scaling (multiple servers)
- Topic sharding (distribute topics)
- Connection pooling (efficient connections)
- Message queue for cross-server fan-out
Trade-offs:
- Durability vs latency (ephemeral for low latency)
- At-most-once vs at-least-once (at-most-once for simplicity)
- Ordering vs performance (best-effort ordering)
This design handles 1M+ concurrent subscribers, 1M+ messages/second, and maintains < 50ms message delivery latency while ensuring messages are ephemeral and dropped if no subscribers are listening. The system is scalable, fault-tolerant, and optimized for real-time pub/sub messaging.