Introduction
Designing a simple chat application is a system design interview question that tests your ability to model messaging systems, handle message persistence, and manage conversation state. This question focuses on:
- Object-oriented modeling: Users, Messages, Conversations
- Message queues: Local message handling
- Persistence: Storing messages and conversations
- State management: Read/unread status, online/offline
This guide covers the design of a simple chat application with proper entity modeling and message handling.
Table of Contents
- Problem Statement
- Requirements
- Data Modeling
- Class Design
- Message Handling
- Read/Unread State
- Message History
- Implementation
- Summary
Problem Statement
Design a simple chat application that:
- Manages users and conversations
- Handles messages (send, receive, store)
- Tracks read/unread status
- Stores message history for retrieval
- Supports conversations (1-on-1, group)
- Handles online/offline status
Scale Requirements:
- Support 1K-100K users
- Support 10K-1M messages
- Fast message delivery: < 50ms
- Efficient history retrieval
Requirements
Functional Requirements
- Send Message: Send message to user or group
- Receive Message: Receive and display messages
- Mark as Read: Mark messages as read
- Get History: Retrieve conversation history
- Create Conversation: Start new conversation
- Get Unread Count: Get unread message count
Non-Functional Requirements
Performance:
- Fast message delivery
- Efficient history retrieval
- Quick read status updates
Consistency:
- No message loss
- Accurate read status
- Correct message ordering
Data Modeling
Database Schema
CREATE TABLE users (
    user_id BIGINT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    status VARCHAR(20) DEFAULT 'offline', -- online, offline, away
    last_seen TIMESTAMP NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_username (username)
);

CREATE TABLE conversations (
    conversation_id BIGINT PRIMARY KEY AUTO_INCREMENT,
    conversation_type VARCHAR(20) DEFAULT 'direct', -- direct, group
    name VARCHAR(100) NULL, -- For group chats
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

CREATE TABLE conversation_participants (
    participant_id BIGINT PRIMARY KEY AUTO_INCREMENT,
    conversation_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    joined_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_read_at TIMESTAMP NULL,
    UNIQUE KEY unique_conversation_user (conversation_id, user_id),
    INDEX idx_user_id (user_id),
    INDEX idx_conversation_id (conversation_id),
    FOREIGN KEY (conversation_id) REFERENCES conversations(conversation_id),
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);

CREATE TABLE messages (
    message_id BIGINT PRIMARY KEY AUTO_INCREMENT,
    conversation_id BIGINT NOT NULL,
    sender_id BIGINT NOT NULL,
    content TEXT NOT NULL,
    message_type VARCHAR(20) DEFAULT 'text', -- text, image, file
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_conversation_id (conversation_id),
    INDEX idx_created_at (created_at),
    INDEX idx_sender_id (sender_id),
    FOREIGN KEY (conversation_id) REFERENCES conversations(conversation_id),
    FOREIGN KEY (sender_id) REFERENCES users(user_id)
);

CREATE TABLE message_reads (
    read_id BIGINT PRIMARY KEY AUTO_INCREMENT,
    message_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    read_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY unique_message_user (message_id, user_id),
    INDEX idx_message_id (message_id),
    INDEX idx_user_id (user_id),
    FOREIGN KEY (message_id) REFERENCES messages(message_id),
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);
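To illustrate how the last_read_at column in conversation_participants can drive an unread count, here is a minimal sketch using Python's built-in sqlite3 module. It is not the guide's storage layer: it assumes a trimmed-down SQLite version of two of the tables above (the original DDL uses MySQL-style syntax), integer timestamps for brevity, and the hypothetical helper names make_db and unread_count.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory SQLite database with a reduced version of the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE conversation_participants (
            conversation_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            last_read_at INTEGER,            -- NULL means nothing read yet
            UNIQUE (conversation_id, user_id)
        );
        CREATE TABLE messages (
            message_id INTEGER PRIMARY KEY AUTOINCREMENT,
            conversation_id INTEGER NOT NULL,
            sender_id INTEGER NOT NULL,
            content TEXT NOT NULL,
            created_at INTEGER NOT NULL      -- assumption: integer timestamps
        );
        CREATE INDEX idx_conv_created ON messages (conversation_id, created_at);
    """)
    return conn


def unread_count(conn: sqlite3.Connection, conversation_id: int, user_id: int) -> int:
    """Count messages created after the participant's last_read_at."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM messages m
        JOIN conversation_participants p
          ON p.conversation_id = m.conversation_id
        WHERE m.conversation_id = ?
          AND p.user_id = ?
          AND (p.last_read_at IS NULL OR m.created_at > p.last_read_at)
        """,
        (conversation_id, user_id),
    ).fetchone()
    return row[0]
```

A user who is not a participant gets a count of 0 because the join produces no rows. Like the guide's own read-tracking code, this sketch counts the user's own messages too; filtering on sender_id would be a reasonable refinement.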
Data Model Classes
from enum import Enum
from datetime import datetime
from typing import Optional, List
from dataclasses import dataclass, field


class ConversationType(Enum):
    DIRECT = "direct"
    GROUP = "group"


class UserStatus(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    AWAY = "away"


@dataclass
class User:
    user_id: Optional[int]  # None until the database assigns an ID
    username: str
    email: str
    status: UserStatus
    last_seen: Optional[datetime]


@dataclass
class Conversation:
    conversation_id: Optional[int]  # None until the database assigns an ID
    conversation_type: ConversationType
    name: Optional[str]  # Only used for group chats
    created_at: datetime
    updated_at: datetime
    participants: List[int]  # user_ids


@dataclass
class Message:
    message_id: Optional[int]  # None until the database assigns an ID
    conversation_id: int
    sender_id: int
    content: str
    message_type: str
    created_at: datetime
    # user_ids who have read the message; default_factory gives each
    # instance its own list instead of a shared mutable default
    read_by: List[int] = field(default_factory=list)
Class Design
Chat Application
class ChatApplication:
    def __init__(self, db):
        self.db = db
        self.message_queue = {}    # conversation_id -> List[Message]
        self.online_users = set()  # user_ids currently online

    def create_user(self, username: str, email: str) -> User:
        """Create new user."""
        user = User(
            user_id=None,
            username=username,
            email=email,
            status=UserStatus.OFFLINE,
            last_seen=None
        )
        return self.db.create_user(user)

    def create_conversation(self, user_ids: List[int],
                            conversation_type: ConversationType = ConversationType.DIRECT,
                            name: Optional[str] = None) -> Conversation:
        """Create new conversation."""
        if conversation_type == ConversationType.DIRECT and len(user_ids) != 2:
            raise ValueError("Direct conversation requires exactly 2 users")
        now = datetime.now()  # one timestamp so created_at == updated_at
        conversation = Conversation(
            conversation_id=None,
            conversation_type=conversation_type,
            name=name,
            created_at=now,
            updated_at=now,
            participants=list(user_ids)  # copy so the caller's list is not shared
        )
        conversation = self.db.create_conversation(conversation)
        # Add participants
        for user_id in user_ids:
            self.db.add_participant(conversation.conversation_id, user_id)
        return conversation
    def send_message(self, conversation_id: int, sender_id: int,
                     content: str, message_type: str = 'text') -> Message:
        """Send message to conversation."""
        # Validate conversation and sender
        conversation = self.db.get_conversation(conversation_id)
        if not conversation:
            raise ValueError("Conversation not found")
        if sender_id not in conversation.participants:
            raise ValueError("User not in conversation")
        # Create message
        now = datetime.now()
        message = Message(
            message_id=None,
            conversation_id=conversation_id,
            sender_id=sender_id,
            content=content,
            message_type=message_type,
            created_at=now
        )
        message = self.db.create_message(message)
        # Update conversation timestamp to match the new message
        conversation.updated_at = now
        self.db.update_conversation(conversation)
        # Add to message queue for real-time delivery
        if conversation_id not in self.message_queue:
            self.message_queue[conversation_id] = []
        self.message_queue[conversation_id].append(message)
        return message

    def get_messages(self, conversation_id: int, user_id: int,
                     limit: int = 50, offset: int = 0) -> List[Message]:
        """Get conversation messages and mark the returned ones as read."""
        # Validate user is participant
        if not self.db.is_participant(conversation_id, user_id):
            raise ValueError("User not in conversation")
        messages = self.db.get_messages(conversation_id, limit=limit, offset=offset)
        # Mark as read for this user (fetching a page has this side effect)
        unread_messages = [m for m in messages if user_id not in m.read_by]
        for message in unread_messages:
            self.mark_as_read(message.message_id, user_id)
        return messages
    def mark_as_read(self, message_id: int, user_id: int) -> bool:
        """Mark message as read."""
        message = self.db.get_message(message_id)
        if not message:
            return False
        # Check if already read
        if self.db.is_message_read(message_id, user_id):
            return True
        # Mark as read
        self.db.mark_message_read(message_id, user_id)
        # Update participant's last_read_at
        self.db.update_participant_last_read(message.conversation_id, user_id)
        return True

    def mark_conversation_as_read(self, conversation_id: int, user_id: int):
        """Mark all messages in conversation as read."""
        # Get unread messages
        unread_messages = self.db.get_unread_messages(conversation_id, user_id)
        for message in unread_messages:
            self.mark_as_read(message.message_id, user_id)
        # Update last_read_at
        self.db.update_participant_last_read(conversation_id, user_id)

    def get_unread_count(self, user_id: int) -> dict:
        """Get unread message count per conversation."""
        conversations = self.db.get_user_conversations(user_id)
        unread_counts = {}
        for conversation in conversations:
            count = self.db.get_unread_count(conversation.conversation_id, user_id)
            if count > 0:
                unread_counts[conversation.conversation_id] = count
        return unread_counts

    def get_conversation_history(self, conversation_id: int, user_id: int,
                                 limit: int = 50) -> List[Message]:
        """Get conversation history."""
        if not self.db.is_participant(conversation_id, user_id):
            raise ValueError("User not in conversation")
        return self.db.get_messages(conversation_id, limit=limit)

    def set_user_status(self, user_id: int, status: UserStatus):
        """Set user online/offline status."""
        user = self.db.get_user(user_id)
        if not user:
            return
        user.status = status
        user.last_seen = datetime.now()
        self.db.update_user(user)
        if status == UserStatus.ONLINE:
            self.online_users.add(user_id)
        else:
            self.online_users.discard(user_id)
Message Handling
Local Message Queue
import queue


class ChatApplication:
    def __init__(self, db):
        # ... existing code ...
        self.message_queue = {}  # conversation_id -> queue.Queue

    def send_message(self, conversation_id: int, sender_id: int, content: str) -> Message:
        """Send message with queue."""
        # ... create message ...
        # Add to queue for delivery
        if conversation_id not in self.message_queue:
            self.message_queue[conversation_id] = queue.Queue()
        self.message_queue[conversation_id].put(message)
        # Notify participants (in real system, use WebSocket/SSE)
        self._notify_participants(conversation_id, message)
        return message

    def get_pending_messages(self, conversation_id: int, user_id: int) -> List[Message]:
        """Get pending messages from queue."""
        if conversation_id not in self.message_queue:
            return []
        messages = []
        queue_obj = self.message_queue[conversation_id]
        # Note: the queue is shared by the whole conversation, so draining it
        # here removes the pending messages for every participant, not just
        # for this user.
        while not queue_obj.empty():
            try:
                message = queue_obj.get_nowait()
                # Only return messages not from this user
                if message.sender_id != user_id:
                    messages.append(message)
            except queue.Empty:
                break
        return messages

    def _notify_participants(self, conversation_id: int, message: Message):
        """Notify conversation participants of new message."""
        conversation = self.db.get_conversation(conversation_id)
        for user_id in conversation.participants:
            if user_id != message.sender_id and user_id in self.online_users:
                # Send notification (in real system, push via WebSocket)
                pass
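Because the queue above is keyed by conversation, the first participant who polls it empties it for everyone else in a group chat. One possible alternative, sketched here under stated assumptions, keeps a separate FIFO per recipient. The class name PendingInbox, its deliver and drain methods, and the tuple used in place of the Message dataclass are all hypothetical and exist only to keep the example self-contained.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

# (conversation_id, sender_id, content): a stand-in for the Message dataclass
PendingMessage = Tuple[int, int, str]


class PendingInbox:
    """Keeps one FIFO queue of undelivered messages per recipient."""

    def __init__(self) -> None:
        self._queues: Dict[int, Deque[PendingMessage]] = defaultdict(deque)

    def deliver(self, participants: Iterable[int], message: PendingMessage) -> None:
        """Enqueue the message for every participant except its sender."""
        _, sender_id, _ = message
        for user_id in participants:
            if user_id != sender_id:
                self._queues[user_id].append(message)

    def drain(self, user_id: int) -> List[PendingMessage]:
        """Return and clear this user's pending messages, oldest first."""
        pending = self._queues.pop(user_id, deque())
        return list(pending)
```

With this layout, one user draining their inbox leaves the other participants' pending messages untouched, at the cost of storing one reference per recipient.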
Read/Unread State
Efficient Read Tracking
class ChatApplication:
    def get_unread_messages(self, conversation_id: int, user_id: int) -> List[Message]:
        """Get unread messages for user."""
        # Get participant's last_read_at
        participant = self.db.get_participant(conversation_id, user_id)
        if not participant or not participant.last_read_at:
            # No read timestamp, all messages are unread
            return self.db.get_messages(conversation_id)
        # Get messages after last_read_at
        return self.db.get_messages_after(conversation_id, participant.last_read_at)

    def get_unread_count_fast(self, conversation_id: int, user_id: int) -> int:
        """Fast unread count using last_read_at."""
        participant = self.db.get_participant(conversation_id, user_id)
        if not participant or not participant.last_read_at:
            return self.db.get_message_count(conversation_id)
        return self.db.get_message_count_after(conversation_id, participant.last_read_at)
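The last_read_at idea can also be shown without a database. The following sketch assumes the conversation's message timestamps are already held in ascending order and uses bisect to count those strictly after the read marker; the function name unread_count_after is hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional


def unread_count_after(sorted_timestamps: List[datetime],
                       last_read_at: Optional[datetime]) -> int:
    """Count timestamps strictly later than last_read_at.

    sorted_timestamps must be in ascending order. A missing read marker
    means every message is unread.
    """
    if last_read_at is None:
        return len(sorted_timestamps)
    # bisect_right returns the index just past all entries <= last_read_at
    return len(sorted_timestamps) - bisect_right(sorted_timestamps, last_read_at)
```

A message whose timestamp equals last_read_at is treated as read here. If two messages can share a timestamp, one of them may be misclassified, which is one reason a design might track the last-read message ID instead of a time.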
Message History
Pagination
class ChatApplication:
    def get_message_history(self, conversation_id: int, user_id: int,
                            page: int = 1, page_size: int = 50) -> dict:
        """Get message history with pagination (pages are 1-based)."""
        if page < 1 or page_size < 1:
            # Guard against negative offsets and division by zero below
            raise ValueError("page and page_size must be at least 1")
        if not self.db.is_participant(conversation_id, user_id):
            raise ValueError("User not in conversation")
        offset = (page - 1) * page_size
        messages = self.db.get_messages(conversation_id, limit=page_size, offset=offset)
        total = self.db.get_message_count(conversation_id)
        return {
            'messages': messages,
            'page': page,
            'page_size': page_size,
            'total': total,
            'total_pages': (total + page_size - 1) // page_size  # ceiling division
        }
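Offset pagination is simple, but pages shift when new messages arrive between requests. A common alternative is cursor-based (keyset) pagination, where the client passes the ID of the oldest message it has seen. The sketch below illustrates the idea in memory; the function name page_before and the (message_id, content) tuples are assumptions made for the example, not part of the guide's design.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (message_id, content), a stand-in for Message


def page_before(rows: List[Row], before_id: Optional[int] = None,
                page_size: int = 50) -> Tuple[List[Row], Optional[int]]:
    """Return up to page_size rows with message_id < before_id, newest first.

    rows must be sorted by ascending message_id. The second return value is
    the cursor for the next (older) page, or None when no older rows remain.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    ids = [message_id for message_id, _ in rows]
    # Index of the first row that is NOT older than the cursor
    end = len(rows) if before_id is None else bisect_left(ids, before_id)
    start = max(0, end - page_size)
    page = rows[start:end][::-1]
    next_cursor = page[-1][0] if start > 0 else None
    return page, next_cursor
```

In SQL the same idea would typically become a WHERE message_id < ? ORDER BY message_id DESC LIMIT ? query, which can use an index instead of skipping rows.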
Implementation
Complete Example
# Initialize
# Database is a placeholder for any storage layer that provides the
# methods ChatApplication calls on self.db (create_user, get_messages, ...)
db = Database()
chat_app = ChatApplication(db)

# Create users
user1 = chat_app.create_user("alice", "alice@example.com")
user2 = chat_app.create_user("bob", "bob@example.com")

# Create conversation
conversation = chat_app.create_conversation([user1.user_id, user2.user_id])

# Send messages
message1 = chat_app.send_message(conversation.conversation_id, user1.user_id, "Hello!")
message2 = chat_app.send_message(conversation.conversation_id, user2.user_id, "Hi there!")

# Get messages (this also marks the returned messages as read for user1)
messages = chat_app.get_messages(conversation.conversation_id, user1.user_id)

# Mark as read (already done by get_messages above; shown for the explicit API)
chat_app.mark_as_read(message2.message_id, user1.user_id)

# Get unread count
unread = chat_app.get_unread_count(user1.user_id)
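The walkthrough above relies on a Database class that this guide does not define. As a rough sketch of what such a storage layer could look like, the hypothetical InMemoryDatabase below implements only the db methods that the walkthrough exercises, backed by dictionaries. It is duck-typed: it assumes the objects passed in expose the same attributes as the User, Conversation, and Message dataclasses. Excluding a user's own messages from the unread count is also an assumption of this sketch.

```python
from datetime import datetime
from itertools import count


class InMemoryDatabase:
    """Dictionary-backed stand-in for the db object used by ChatApplication."""

    def __init__(self):
        self._ids = count(1)     # one shared ID sequence keeps the sketch short
        self.users = {}          # user_id -> user
        self.conversations = {}  # conversation_id -> conversation
        self.messages = {}       # message_id -> message, in insertion order
        self.last_read = {}      # (conversation_id, user_id) -> datetime

    def create_user(self, user):
        user.user_id = next(self._ids)
        self.users[user.user_id] = user
        return user

    def create_conversation(self, conversation):
        conversation.conversation_id = next(self._ids)
        self.conversations[conversation.conversation_id] = conversation
        return conversation

    def add_participant(self, conversation_id, user_id):
        participants = self.conversations[conversation_id].participants
        if user_id not in participants:  # avoid duplicating existing members
            participants.append(user_id)

    def get_conversation(self, conversation_id):
        return self.conversations.get(conversation_id)

    def update_conversation(self, conversation):
        self.conversations[conversation.conversation_id] = conversation

    def is_participant(self, conversation_id, user_id):
        conversation = self.conversations.get(conversation_id)
        return conversation is not None and user_id in conversation.participants

    def get_user_conversations(self, user_id):
        return [c for c in self.conversations.values() if user_id in c.participants]

    def create_message(self, message):
        message.message_id = next(self._ids)
        self.messages[message.message_id] = message
        return message

    def get_message(self, message_id):
        return self.messages.get(message_id)

    def get_messages(self, conversation_id, limit=50, offset=0):
        # Oldest first; the guide does not specify an order, so this is a choice
        in_conversation = [m for m in self.messages.values()
                           if m.conversation_id == conversation_id]
        return in_conversation[offset:offset + limit]

    def is_message_read(self, message_id, user_id):
        return user_id in self.messages[message_id].read_by

    def mark_message_read(self, message_id, user_id):
        self.messages[message_id].read_by.append(user_id)

    def update_participant_last_read(self, conversation_id, user_id):
        self.last_read[(conversation_id, user_id)] = datetime.now()

    def get_unread_count(self, conversation_id, user_id):
        # Assumption: a user's own messages never count as unread
        return sum(1 for m in self.messages.values()
                   if m.conversation_id == conversation_id
                   and m.sender_id != user_id
                   and user_id not in m.read_by)
```

With the classes from this guide in scope, db = InMemoryDatabase() can stand in for Database() in the walkthrough. Methods the walkthrough does not call, such as get_user, update_user, and get_unread_messages, are left out and would need to be added for the remaining ChatApplication methods.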
What Interviewers Look For
Object-Oriented Modeling Skills
- Entity Design
  - Proper modeling of User, Message, Conversation
  - Clear relationships and responsibilities
  - Red Flags: Unclear entities, poor relationships
- State Management
  - Read/unread state tracking
  - Online/offline status
  - Red Flags: Missing state, incorrect tracking
Message Handling
- Message Queue Design
  - Local queue implementation
  - Message delivery logic
  - Red Flags: No queue, incorrect delivery
- Persistence Strategy
  - Message storage
  - History retrieval
  - Red Flags: No persistence, inefficient retrieval
Problem-Solving Approach
- Read/Unread Tracking
  - Efficient tracking mechanism
  - last_read_at approach
  - Red Flags: Inefficient tracking, wrong approach
- Message History
  - Pagination implementation
  - Efficient queries
  - Red Flags: No pagination, slow queries
- Edge Cases
  - Group conversations
  - Message ordering
  - Concurrent messages
  - Red Flags: Ignoring edge cases
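For the message ordering and concurrent messages edge cases, timestamps alone can tie when two messages arrive in the same instant. One way to get a strict order, sketched here as an assumption and not as part of the design above, is to hand out a per-conversation sequence number under a lock. The class name SequenceAllocator and its next_sequence method are hypothetical.

```python
import threading
from collections import defaultdict
from typing import Dict


class SequenceAllocator:
    """Hands out strictly increasing sequence numbers per conversation."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._last: Dict[int, int] = defaultdict(int)  # conversation_id -> last number

    def next_sequence(self, conversation_id: int) -> int:
        """Return the next number for this conversation, starting at 1."""
        # The lock makes read-increment-return atomic across threads
        with self._lock:
            self._last[conversation_id] += 1
            return self._last[conversation_id]
```

Sorting messages by this number gives a total order within a conversation even when their created_at values are equal. In a database-backed version, the auto-incrementing message_id could serve the same purpose.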
Code Quality
- Data Consistency
  - Accurate read status
  - Correct message ordering
  - Red Flags: Wrong status, incorrect ordering
- Error Handling
  - Validation of operations
  - Meaningful errors
  - Red Flags: No validation, unclear errors
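The class design in this guide raises ValueError for every failure. One way to make errors more meaningful, shown here only as a sketch, is a small exception hierarchy that callers can catch selectively. All names below (ChatError, ConversationNotFoundError, NotAParticipantError, validate_sender) are hypothetical, and the dictionary of participants stands in for the database lookup.

```python
from typing import Dict, List


class ChatError(Exception):
    """Base class for errors raised by the chat application."""


class ConversationNotFoundError(ChatError):
    def __init__(self, conversation_id: int) -> None:
        super().__init__(f"Conversation {conversation_id} not found")
        self.conversation_id = conversation_id


class NotAParticipantError(ChatError):
    def __init__(self, conversation_id: int, user_id: int) -> None:
        super().__init__(
            f"User {user_id} is not in conversation {conversation_id}")
        self.conversation_id = conversation_id
        self.user_id = user_id


def validate_sender(participants_by_conversation: Dict[int, List[int]],
                    conversation_id: int, user_id: int) -> None:
    """Raise a specific error if the user may not post to the conversation."""
    participants = participants_by_conversation.get(conversation_id)
    if participants is None:
        raise ConversationNotFoundError(conversation_id)
    if user_id not in participants:
        raise NotAParticipantError(conversation_id, user_id)
```

Callers can then catch ChatError for generic handling or a specific subclass when the response should differ, for example mapping a missing conversation and a membership failure to different status codes.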
Interview Focus
- Local System Design
  - No distributed messaging
  - Focus on data structures
  - Key: Show local system understanding
- State Management
  - Proper state tracking
  - Efficient updates
  - Key: Demonstrate state management skills
Summary
Key Takeaways
- Entity Modeling: Users, Conversations, Messages, Participants
- Message Handling: Send, receive, store, queue
- Read/Unread State: Track with last_read_at timestamp
- Message History: Pagination for efficient retrieval
- State Management: Online/offline, read status
Common Interview Questions
- How would you model Users, Messages, and Conversations?
  - Users: user_id, username, status
  - Messages: message_id, conversation_id, sender_id, content
  - Conversations: conversation_id, type, participants
- How would you mark messages as read/unread?
  - Track last_read_at per participant
  - Messages after last_read_at are unread
  - Update on read action
- How would you handle message history retrieval?
  - Pagination: limit/offset
  - Index on conversation_id, created_at
  - Efficient queries with proper indexes
Understanding chat application design is crucial for Meta interviews focusing on:
- Object-oriented modeling
- Message handling
- State management
- Persistence
- Read/unread tracking