Introduction
A chess game system with unlimited undo and ELO rating requires managing game state, move history, undo operations, and player ratings. Systems like Chess.com and Lichess handle millions of games, support unlimited move undo, and maintain accurate ELO ratings for millions of players.
This post provides a detailed walkthrough of designing a chess game system with unlimited undo and ELO rating, covering game state management, move history storage, undo/redo operations, ELO calculation, real-time gameplay, and handling concurrent games. This is a common system design interview question that tests your understanding of distributed systems, state management, data structures, rating algorithms, and real-time communication.
Table of Contents
- Problem Statement
- Requirements
- Capacity Estimation
- Core Entities
- API
- Data Flow
- Database Design
- High-Level Design
- Deep Dive
- Summary
Problem Statement
Design a chess game system with unlimited undo and ELO rating with the following features:
- Create and manage chess games
- Make moves and validate them
- Support unlimited undo moves
- Support redo moves (after undo)
- Calculate and update ELO ratings
- Real-time game updates
- Game history and replay
- Support multiple game modes (blitz, rapid, classical)
- Handle time controls and clocks
- Support game analysis and move suggestions
Scale Requirements:
- 10 million+ users
- 1 million+ active games simultaneously
- 100 million+ moves per day
- Peak: 10,000 moves per second
- Average moves per game: 40
- Maximum undo depth: Unlimited
- Must support real-time gameplay
- ELO rating updates after each game
Requirements
Functional Requirements
Core Features:
- Game Management: Create, join, leave games
- Move Making: Make valid chess moves
- Move Validation: Validate moves according to chess rules
- Unlimited Undo: Undo any number of moves
- Redo Support: Redo moves after undo
- ELO Rating: Calculate and update ELO ratings
- Real-Time Updates: Broadcast moves to both players
- Game History: Store complete game history
- Game Replay: Replay games from history
- Time Controls: Support different time controls
Out of Scope:
- Chess engine integration (focus on game management)
- Move analysis (focus on basic gameplay)
- Tournament management
- Spectator mode
- Mobile app (focus on web API)
Non-Functional Requirements
- Availability: 99.9% uptime
- Reliability: No move loss, accurate game state
- Performance:
- Move processing: < 100ms
- Undo operation: < 50ms
- Real-time update: < 200ms
- ELO calculation: < 10ms
- Scalability: Handle 10K+ moves per second
- Consistency: Strong consistency for game state
- Data Integrity: Accurate move history and ELO ratings
Capacity Estimation
Traffic Estimates
- Total Users: 10 million
- Daily Active Users (DAU): 1 million
- Active Games: 1 million simultaneously
- Moves per Day: 100 million
- Average Moves per Game: 40
- Peak Move Rate: 10,000 per second
- Normal Move Rate: 1,000 per second
- Undo Operations: 10% of moves = 10 million per day
- ELO Updates: 2.5 million per day (games completed)
Storage Estimates
Game Data:
- 1M active games × 5KB = 5GB
- Game state, move history, metadata
Move History:
- 100M moves/day × 200 bytes = 20GB/day
- 30-day retention: ~600GB
- 1-year archive: ~7.3TB
Game Archive:
- 2.5M completed games/day × 10KB = 25GB/day
- 1-year retention: ~9TB
Player Data:
- 10M users × 1KB = 10GB
- User profiles, ELO ratings
ELO History:
- 2.5M ELO updates/day × 100 bytes = 250MB/day
- 1-year retention: ~90GB
Total Storage: ~16TB
Bandwidth Estimates
Normal Traffic:
- 1,000 moves/sec × 2KB = 2MB/s = 16Mbps
- Move data, game state updates
Peak Traffic:
- 10,000 moves/sec × 2KB = 20MB/s = 160Mbps
Real-Time Updates:
- 1,000 moves/sec × 2KB × 2 players = 4MB/s = 32Mbps
Total Peak: ~160Mbps
Core Entities
Game
game_id(UUID)white_player_id(user_id)black_player_id(user_id)game_status(waiting, active, finished, abandoned)current_turn(white, black)board_state(FEN string)move_count(INT)time_control(blitz, rapid, classical)white_time_remaining(INT, milliseconds)black_time_remaining(INT, milliseconds)result(white_wins, black_wins, draw, ongoing)created_at(TIMESTAMP)updated_at(TIMESTAMP)finished_at(TIMESTAMP)
Move
move_id(UUID)game_id(UUID)move_number(INT)player_id(user_id)color(white, black)from_square(VARCHAR, e.g., “e2”)to_square(VARCHAR, e.g., “e4”)piece(pawn, rook, knight, bishop, queen, king)captured_piece(VARCHAR, optional)promotion(VARCHAR, optional)is_castle(BOOLEAN)is_en_passant(BOOLEAN)is_check(BOOLEAN)is_checkmate(BOOLEAN)board_state_after(FEN string)timestamp(TIMESTAMP)is_undone(BOOLEAN)
Undo History
undo_id(UUID)game_id(UUID)move_id(UUID)undo_timestamp(TIMESTAMP)redo_timestamp(TIMESTAMP, optional)is_redone(BOOLEAN)
Player
user_id(UUID)username(VARCHAR)elo_rating(INT)elo_blitz(INT)elo_rapid(INT)elo_classical(INT)games_played(INT)games_won(INT)games_lost(INT)games_drawn(INT)created_at(TIMESTAMP)updated_at(TIMESTAMP)
ELO History
elo_history_id(UUID)user_id(UUID)game_id(UUID)time_control(VARCHAR)old_elo(INT)new_elo(INT)elo_change(INT)opponent_elo(INT)result(win, loss, draw)timestamp(TIMESTAMP)
API
1. Create Game
POST /api/v1/games
Request:
{
"time_control": "blitz",
"initial_time_minutes": 5,
"increment_seconds": 3
}
Response:
{
"game_id": "uuid",
"white_player_id": "user123",
"black_player_id": null,
"game_status": "waiting",
"time_control": "blitz",
"created_at": "2025-11-13T10:00:00Z"
}
2. Make Move
POST /api/v1/games/{game_id}/moves
Request:
{
"from_square": "e2",
"to_square": "e4",
"promotion": null
}
Response:
{
"move_id": "uuid",
"game_id": "uuid",
"move_number": 1,
"from_square": "e2",
"to_square": "e4",
"board_state": "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1",
"is_check": false,
"is_checkmate": false,
"current_turn": "black",
"timestamp": "2025-11-13T10:01:00Z"
}
3. Undo Move
POST /api/v1/games/{game_id}/undo
Request:
{
"move_count": 1
}
Response:
{
"game_id": "uuid",
"moves_undone": 1,
"current_move_number": 0,
"board_state": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
"current_turn": "white",
"undo_history": [
{
"move_id": "uuid",
"move_number": 1,
"undone_at": "2025-11-13T10:02:00Z"
}
]
}
4. Redo Move
POST /api/v1/games/{game_id}/redo
Request:
{
"move_count": 1
}
Response:
{
"game_id": "uuid",
"moves_redone": 1,
"current_move_number": 1,
"board_state": "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1",
"current_turn": "black"
}
5. Get Game State
GET /api/v1/games/{game_id}
Response:
{
"game_id": "uuid",
"white_player": {...},
"black_player": {...},
"current_turn": "white",
"board_state": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
"move_count": 0,
"move_history": [],
"time_remaining": {
"white": 300000,
"black": 300000
},
"game_status": "active"
}
6. Get Move History
GET /api/v1/games/{game_id}/moves?limit=50&offset=0
Response:
{
"moves": [
{
"move_id": "uuid",
"move_number": 1,
"from_square": "e2",
"to_square": "e4",
"piece": "pawn",
"timestamp": "2025-11-13T10:01:00Z",
"is_undone": false
}
],
"total": 1,
"limit": 50,
"offset": 0
}
7. Finish Game
POST /api/v1/games/{game_id}/finish
Request:
{
"result": "white_wins",
"reason": "checkmate"
}
Response:
{
"game_id": "uuid",
"result": "white_wins",
"elo_updates": {
"white": {
"old_elo": 1500,
"new_elo": 1516,
"change": 16
},
"black": {
"old_elo": 1500,
"new_elo": 1484,
"change": -16
}
}
}
Data Flow
Move Making Flow
- Player Makes Move:
- Player submits move (from_square, to_square)
- Client sends request to API Gateway
- API Gateway routes to Game Service
- Move Validation:
- Game Service:
- Validates it’s player’s turn
- Validates move is legal (chess rules)
- Checks game is active
- Validates time remaining
- Game Service:
- Move Processing:
- Game Service:
- Creates move record
- Updates board state (FEN)
- Increments move count
- Switches turn
- Updates time remaining
- Checks for check/checkmate
- Game Service:
- State Update:
- Game Service:
- Updates game state in Database
- Stores move in Move History
- Publishes move event to Message Queue
- Game Service:
- Real-Time Notification:
- Notification Service:
- Broadcasts move to both players via WebSocket
- Updates game state for both clients
- Notification Service:
- Response:
- Game Service returns move details
- Client updates UI
Undo Move Flow
- Player Requests Undo:
- Player clicks undo button
- Client sends undo request
- API Gateway routes to Game Service
- Undo Validation:
- Game Service:
- Validates game is active
- Validates moves exist to undo
- Validates undo is allowed (not opponent’s turn)
- Game Service:
- Move Undo:
- Game Service:
- Gets last move(s) from history
- Reverts board state to previous position
- Marks moves as undone
- Updates move count
- Switches turn back
- Restores time remaining
- Game Service:
- Undo History:
- Game Service:
- Creates undo history record
- Stores previous board state
- Enables redo capability
- Game Service:
- State Update:
- Game Service:
- Updates game state in Database
- Publishes undo event
- Game Service:
- Notification:
- Notification Service broadcasts undo to both players
ELO Calculation Flow
- Game Finished:
- Game ends (checkmate, resignation, draw, timeout)
- Game Service determines result
- ELO Calculation:
- ELO Service:
- Gets both players’ current ELO ratings
- Calculates expected score for each player
- Calculates ELO change based on actual result
- Updates player ELO ratings
- ELO Service:
- ELO Update:
- ELO Service:
- Updates player ELO in Database
- Creates ELO history record
- Updates player statistics
- ELO Service:
- Notification:
- Notification Service notifies players of ELO changes
Database Design
Schema Design
Games Table:
CREATE TABLE games (
game_id UUID PRIMARY KEY,
white_player_id UUID NOT NULL,
black_player_id UUID NOT NULL,
game_status VARCHAR(50) NOT NULL,
current_turn VARCHAR(10) NOT NULL,
board_state VARCHAR(100) NOT NULL,
move_count INT DEFAULT 0,
time_control VARCHAR(50) NOT NULL,
white_time_remaining INT NOT NULL,
black_time_remaining INT NOT NULL,
result VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
finished_at TIMESTAMP NULL,
INDEX idx_white_player (white_player_id),
INDEX idx_black_player (black_player_id),
INDEX idx_status (game_status),
INDEX idx_created_at (created_at DESC)
);
Moves Table (Sharded by game_id):
CREATE TABLE moves_0 (
move_id UUID PRIMARY KEY,
game_id UUID NOT NULL,
move_number INT NOT NULL,
player_id UUID NOT NULL,
color VARCHAR(10) NOT NULL,
from_square VARCHAR(2) NOT NULL,
to_square VARCHAR(2) NOT NULL,
piece VARCHAR(20) NOT NULL,
captured_piece VARCHAR(20),
promotion VARCHAR(20),
is_castle BOOLEAN DEFAULT FALSE,
is_en_passant BOOLEAN DEFAULT FALSE,
is_check BOOLEAN DEFAULT FALSE,
is_checkmate BOOLEAN DEFAULT FALSE,
board_state_after VARCHAR(100) NOT NULL,
timestamp TIMESTAMP DEFAULT NOW(),
is_undone BOOLEAN DEFAULT FALSE,
INDEX idx_game_id (game_id),
INDEX idx_game_move_number (game_id, move_number),
INDEX idx_is_undone (is_undone)
);
-- Similar tables: moves_1, moves_2, ..., moves_N
Undo History Table:
CREATE TABLE undo_history (
undo_id UUID PRIMARY KEY,
game_id UUID NOT NULL,
move_id UUID NOT NULL,
move_number INT NOT NULL,
board_state_before VARCHAR(100) NOT NULL,
undo_timestamp TIMESTAMP DEFAULT NOW(),
redo_timestamp TIMESTAMP NULL,
is_redone BOOLEAN DEFAULT FALSE,
INDEX idx_game_id (game_id),
INDEX idx_move_id (move_id),
INDEX idx_is_redone (is_redone)
);
Players Table:
CREATE TABLE players (
user_id UUID PRIMARY KEY,
username VARCHAR(100) UNIQUE NOT NULL,
elo_rating INT DEFAULT 1500,
elo_blitz INT DEFAULT 1500,
elo_rapid INT DEFAULT 1500,
elo_classical INT DEFAULT 1500,
games_played INT DEFAULT 0,
games_won INT DEFAULT 0,
games_lost INT DEFAULT 0,
games_drawn INT DEFAULT 0,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
INDEX idx_username (username),
INDEX idx_elo_rating (elo_rating DESC)
);
ELO History Table:
CREATE TABLE elo_history (
elo_history_id UUID PRIMARY KEY,
user_id UUID NOT NULL,
game_id UUID NOT NULL,
time_control VARCHAR(50) NOT NULL,
old_elo INT NOT NULL,
new_elo INT NOT NULL,
elo_change INT NOT NULL,
opponent_elo INT NOT NULL,
result VARCHAR(10) NOT NULL,
timestamp TIMESTAMP DEFAULT NOW(),
INDEX idx_user_id (user_id),
INDEX idx_game_id (game_id),
INDEX idx_timestamp (timestamp DESC)
);
Database Sharding Strategy
Moves Table Sharding:
- Shard by game_id using consistent hashing
- 1000 shards:
shard_id = hash(game_id) % 1000 - All moves for a game in same shard
- Enables efficient move history queries
Shard Key Selection:
game_idensures all moves for a game are in same shard- Enables efficient queries for game move history
- Prevents cross-shard queries for single game
Replication:
- Each shard replicated 3x for high availability
- Master-replica setup for read scaling
- Writes go to master, reads can go to replicas
High-Level Design
┌─────────────┐
│ Client │
│ (Web App) │
└──────┬──────┘
│
│ HTTP/WebSocket
│
┌──────▼──────────────────────────────────────────────┐
│ API Gateway / Load Balancer │
│ - Rate Limiting │
│ - Request Routing │
└──────┬──────────────────────────────────────────────┘
│
├──────────────┬──────────────┬──────────────┬──────────────┐
│ │ │ │ │
┌──────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐ ┌─────▼──────┐
│ Game │ │ Move │ │ ELO │ │ Notification│ │ Chess │
│ Service │ │ Validator │ │ Service │ │ Service │ │ Engine │
└──────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘
│ │ │ │ │
│ │ │ │ │
┌──────▼──────────────▼──────────────▼──────────────▼──────────────▼──────┐
│ Message Queue (Kafka) │
│ - Move events │
│ - Undo events │
│ - Game events │
└──────┬───────────────────────────────────────────────────────────────────┘
│
│
┌──────▼───────────────────────────────────────────────────────────────────┐
│ Database Cluster (Sharded) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Games │ │ Moves │ │ Players │ │
│ │ DB │ │ (Sharded)│ │ DB │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Undo │ │ ELO │ │
│ │ History │ │ History │ │
│ │ DB │ │ DB │ │
│ └──────────┘ └──────────┘ │
└───────────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────────┐
│ Cache Layer (Redis) │
│ - Active game states │
│ - Move history (recent) │
│ - Player ELO ratings │
└───────────────────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────────────────┐
│ WebSocket Service │
│ - Real-time move updates │
│ - Game state synchronization │
└───────────────────────────────────────────────────────────────────────────┘
Deep Dive
Component Design
1. Game Service
Responsibilities:
- Manage game lifecycle
- Handle move making
- Manage undo/redo operations
- Update game state
Key Design Decisions:
- State Machine: Use state machine for game status
- Move History: Store complete move history
- Board State: Use FEN notation for board representation
- Undo Stack: Maintain undo stack for redo capability
Implementation:
class GameService:
def __init__(self):
self.db = Database()
self.move_validator = MoveValidator()
self.chess_engine = ChessEngine()
self.cache = RedisCache()
def make_move(self, game_id, player_id, from_square, to_square, promotion=None):
# Get game
game = self.get_game(game_id)
# Validate turn
if not self.is_player_turn(game, player_id):
raise InvalidMoveError("Not your turn")
# Validate move
if not self.move_validator.is_valid_move(
game.board_state, from_square, to_square, game.current_turn
):
raise InvalidMoveError("Invalid move")
# Make move
new_board_state = self.chess_engine.make_move(
game.board_state, from_square, to_square, promotion
)
# Check game state
is_check = self.chess_engine.is_check(new_board_state, game.current_turn)
is_checkmate = self.chess_engine.is_checkmate(new_board_state, game.current_turn)
# Create move record
move = Move(
game_id=game_id,
move_number=game.move_count + 1,
player_id=player_id,
color=game.current_turn,
from_square=from_square,
to_square=to_square,
board_state_after=new_board_state,
is_check=is_check,
is_checkmate=is_checkmate
)
move.save()
# Update game state
game.board_state = new_board_state
game.move_count += 1
game.current_turn = 'black' if game.current_turn == 'white' else 'white'
game.updated_at = datetime.now()
game.save()
# Clear undo stack (can't redo after new move)
self.clear_redo_stack(game_id)
# Publish move event
self.publish_move_event(game_id, move)
return move
def undo_move(self, game_id, player_id, move_count=1):
# Get game
game = self.get_game(game_id)
# Validate undo is allowed
if game.move_count == 0:
raise InvalidOperationError("No moves to undo")
# Get moves to undo
moves_to_undo = self.db.get_last_moves(game_id, move_count)
if not moves_to_undo:
raise InvalidOperationError("Not enough moves to undo")
# Undo moves
undone_moves = []
for move in reversed(moves_to_undo):
# Get previous board state
if move.move_number == 1:
# Initial position
previous_state = self.get_initial_board_state()
else:
previous_move = self.db.get_move_by_number(game_id, move.move_number - 1)
previous_state = previous_move.board_state_after
# Mark move as undone
move.is_undone = True
move.save()
# Create undo history
undo_history = UndoHistory(
game_id=game_id,
move_id=move.move_id,
move_number=move.move_number,
board_state_before=previous_state
)
undo_history.save()
undone_moves.append(move)
# Update game state
if undone_moves:
last_undone = undone_moves[-1]
if last_undone.move_number == 1:
game.board_state = self.get_initial_board_state()
game.move_count = 0
game.current_turn = 'white'
else:
previous_move = self.db.get_move_by_number(
game_id, last_undone.move_number - 1
)
game.board_state = previous_move.board_state_after
game.move_count = previous_move.move_number - 1
game.current_turn = previous_move.color
game.updated_at = datetime.now()
game.save()
# Publish undo event
self.publish_undo_event(game_id, undone_moves)
return {
'moves_undone': len(undone_moves),
'current_move_number': game.move_count,
'board_state': game.board_state
}
def redo_move(self, game_id, player_id, move_count=1):
# Get game
game = self.get_game(game_id)
# Get undone moves that can be redone
undone_moves = self.db.get_undone_moves_for_redo(game_id, move_count)
if not undone_moves:
raise InvalidOperationError("No moves to redo")
# Redo moves
redone_moves = []
for move in undone_moves:
# Mark move as not undone
move.is_undone = False
move.save()
# Update undo history
undo_history = self.db.get_undo_history_by_move_id(move.move_id)
if undo_history:
undo_history.is_redone = True
undo_history.redo_timestamp = datetime.now()
undo_history.save()
redone_moves.append(move)
# Update game state to last redone move
if redone_moves:
last_redone = redone_moves[-1]
game.board_state = last_redone.board_state_after
game.move_count = last_redone.move_number
game.current_turn = 'black' if last_redone.color == 'white' else 'white'
game.updated_at = datetime.now()
game.save()
# Publish redo event
self.publish_redo_event(game_id, redone_moves)
return {
'moves_redone': len(redone_moves),
'current_move_number': game.move_count,
'board_state': game.board_state
}
2. ELO Service
Responsibilities:
- Calculate ELO rating changes
- Update player ELO ratings
- Track ELO history
- Handle different time controls
Key Design Decisions:
- ELO Formula: Use standard ELO formula
- K-Factor: Different K-factors for different ratings
- Time Control: Separate ELO for blitz, rapid, classical
- History Tracking: Track all ELO changes
Implementation:
class ELOService:
def __init__(self):
self.db = Database()
self.k_factor_new = 32 # For new players (< 30 games)
self.k_factor_standard = 16 # For standard players
self.k_factor_expert = 10 # For expert players (> 2400)
def calculate_elo_change(self, player_elo, opponent_elo, result):
"""
Calculate ELO change using standard formula
result: 1 for win, 0.5 for draw, 0 for loss
"""
# Expected score
expected_score = 1 / (1 + 10 ** ((opponent_elo - player_elo) / 400))
# Determine K-factor
k_factor = self.get_k_factor(player_elo)
# Calculate ELO change
elo_change = k_factor * (result - expected_score)
return round(elo_change)
def get_k_factor(self, elo_rating):
"""Get K-factor based on rating"""
if elo_rating < 2100:
return self.k_factor_standard
elif elo_rating < 2400:
return self.k_factor_standard
else:
return self.k_factor_expert
def update_elo_ratings(self, game_id, white_player_id, black_player_id, result):
"""Update ELO ratings after game"""
# Get current ratings
white_player = self.db.get_player(white_player_id)
black_player = self.db.get_player(black_player_id)
# Determine result (1 = white wins, 0.5 = draw, 0 = black wins)
if result == 'white_wins':
white_result = 1
black_result = 0
elif result == 'black_wins':
white_result = 0
black_result = 1
else: # draw
white_result = 0.5
black_result = 0.5
# Get game time control
game = self.db.get_game(game_id)
time_control = game.time_control
# Calculate ELO changes
white_elo_change = self.calculate_elo_change(
white_player.get_elo(time_control),
black_player.get_elo(time_control),
white_result
)
black_elo_change = self.calculate_elo_change(
black_player.get_elo(time_control),
white_player.get_elo(time_control),
black_result
)
# Update player ELOs
white_new_elo = white_player.get_elo(time_control) + white_elo_change
black_new_elo = black_player.get_elo(time_control) + black_elo_change
white_player.set_elo(time_control, white_new_elo)
black_player.set_elo(time_control, black_new_elo)
# Update statistics
white_player.games_played += 1
black_player.games_played += 1
if result == 'white_wins':
white_player.games_won += 1
black_player.games_lost += 1
elif result == 'black_wins':
black_player.games_won += 1
white_player.games_lost += 1
else:
white_player.games_drawn += 1
black_player.games_drawn += 1
white_player.save()
black_player.save()
# Create ELO history records
self.create_elo_history(
white_player_id, game_id, time_control,
white_player.get_elo(time_control) - white_elo_change,
white_new_elo, white_elo_change,
black_player.get_elo(time_control), white_result
)
self.create_elo_history(
black_player_id, game_id, time_control,
black_player.get_elo(time_control) - black_elo_change,
black_new_elo, black_elo_change,
white_player.get_elo(time_control), black_result
)
return {
'white': {
'old_elo': white_player.get_elo(time_control) - white_elo_change,
'new_elo': white_new_elo,
'change': white_elo_change
},
'black': {
'old_elo': black_player.get_elo(time_control) - black_elo_change,
'new_elo': black_new_elo,
'change': black_elo_change
}
}
def create_elo_history(self, user_id, game_id, time_control, old_elo, new_elo,
elo_change, opponent_elo, result):
"""Create ELO history record"""
elo_history = ELOHistory(
user_id=user_id,
game_id=game_id,
time_control=time_control,
old_elo=old_elo,
new_elo=new_elo,
elo_change=elo_change,
opponent_elo=opponent_elo,
result='win' if result == 1 else ('draw' if result == 0.5 else 'loss')
)
elo_history.save()
3. Move Validator
Responsibilities:
- Validate chess moves
- Check move legality
- Validate game rules
Key Design Decisions:
- Chess Engine: Use chess library for validation
- Rule Checking: Check all chess rules (check, castling, en passant)
- Performance: Fast validation (< 10ms)
Implementation:
class MoveValidator:
def __init__(self):
self.chess_engine = ChessEngine()
def is_valid_move(self, board_state, from_square, to_square, color):
"""Validate if move is legal"""
try:
# Parse board state
board = self.chess_engine.parse_fen(board_state)
# Check if it's the correct color's turn
if board.turn != color:
return False
# Check if move is legal
move = self.chess_engine.create_move(from_square, to_square)
if move not in board.legal_moves:
return False
return True
except Exception as e:
return False
def validate_move_details(self, board_state, from_square, to_square, promotion=None):
"""Validate move and return details"""
board = self.chess_engine.parse_fen(board_state)
move = self.chess_engine.create_move(from_square, to_square, promotion)
if move not in board.legal_moves:
raise InvalidMoveError("Illegal move")
# Get move details
piece = board.piece_at(from_square)
captured_piece = board.piece_at(to_square)
is_castle = board.is_castling(move)
is_en_passant = board.is_en_passant(move)
# Make move to check for check/checkmate
board.push(move)
is_check = board.is_check()
is_checkmate = board.is_checkmate()
board.pop()
return {
'piece': piece.symbol(),
'captured_piece': captured_piece.symbol() if captured_piece else None,
'is_castle': is_castle,
'is_en_passant': is_en_passant,
'is_check': is_check,
'is_checkmate': is_checkmate
}
Detailed Design
Unlimited Undo Implementation
Challenge: Support unlimited undo while maintaining performance
Solution:
- Move History: Store all moves in database
- Undo Stack: Maintain undo stack in memory
- Board State Reconstruction: Reconstruct from move history
- Efficient Storage: Store FEN strings for board states
Implementation:
def undo_unlimited(self, game_id, move_count):
"""Undo any number of moves"""
# Get all moves up to current position
current_moves = self.db.get_moves_up_to_current(game_id)
if len(current_moves) < move_count:
raise InvalidOperationError("Not enough moves to undo")
# Undo moves
moves_to_undo = current_moves[-move_count:]
for move in reversed(moves_to_undo):
move.is_undone = True
move.save()
# Create undo history
undo_history = UndoHistory(
game_id=game_id,
move_id=move.move_id,
move_number=move.move_number
)
undo_history.save()
# Reconstruct board state
remaining_moves = current_moves[:-move_count]
if remaining_moves:
last_move = remaining_moves[-1]
board_state = last_move.board_state_after
move_count = last_move.move_number
else:
board_state = self.get_initial_board_state()
move_count = 0
# Update game
game = self.get_game(game_id)
game.board_state = board_state
game.move_count = move_count
game.current_turn = 'white' if move_count % 2 == 0 else 'black'
game.save()
ELO Rating Calculation
Challenge: Calculate accurate ELO ratings
Solution:
- Standard ELO Formula: Use proven ELO formula
- K-Factor Adjustment: Adjust K-factor based on player strength
- Expected Score: Calculate expected score from rating difference
- Rating Update: Update based on actual vs expected score
Formula:
Expected Score = 1 / (1 + 10^((Opponent Rating - Player Rating) / 400))
ELO Change = K * (Actual Score - Expected Score)
New Rating = Old Rating + ELO Change
Scalability Considerations
Horizontal Scaling
Game Service:
- Stateless service, horizontally scalable
- Multiple instances behind load balancer
- Shared database and cache
- WebSocket connections distributed
Caching Strategy
Redis Cache:
- Active Games: Cache active game states
- Recent Moves: Cache recent move history
- Player ELO: Cache player ELO ratings
- TTL: 1 hour for active games
Security Considerations
Move Validation
- Server-Side Validation: Always validate moves server-side
- Cheat Prevention: Detect impossible moves
- Rate Limiting: Limit move rate per player
Game Integrity
- State Verification: Verify game state consistency
- Move Ordering: Ensure moves are processed in order
- Undo Restrictions: Prevent undo abuse
Monitoring & Observability
Key Metrics
Game Metrics:
- Active games count
- Moves per second
- Undo operations per second
- Average game duration
- Games completed per day
Performance Metrics:
- Move processing latency
- Undo operation latency
- ELO calculation latency
- WebSocket message latency
Business Metrics:
- Total players
- Average ELO rating
- Games played per player
- ELO distribution
Trade-offs and Optimizations
Trade-offs
1. Undo Storage: Full History vs Snapshot
- Full History: More storage, faster undo
- Snapshot: Less storage, slower undo
- Decision: Full history for unlimited undo
2. Board State: FEN vs Full State
- FEN: Compact, requires parsing
- Full State: Larger, faster access
- Decision: FEN for storage, parsed in memory
3. ELO Update: Real-Time vs Batch
- Real-Time: Immediate updates, higher load
- Batch: Delayed updates, lower load
- Decision: Real-time for user experience
Optimizations
1. Move History Compression
- Compress move history
- Reduce storage
- Improve performance
2. Board State Caching
- Cache parsed board states
- Reduce parsing overhead
- Improve move validation speed
3. Batch ELO Updates
- Batch ELO calculations
- Reduce database writes
- Improve throughput
What Interviewers Look For
State Management Skills
- Game State Representation
- FEN notation
- Efficient board state storage
- Red Flags: Inefficient representation, high storage, slow operations
- Unlimited Undo
- Complete move history
- Efficient state reconstruction
- Red Flags: Limited undo, inefficient reconstruction, no history
- Move Validation
- Fast validation
- Accurate rules
- Red Flags: Slow validation, incorrect rules, no validation
Algorithm Skills
- ELO Calculation
- Standard ELO formula
- Proper K-factors
- Accurate rating updates
- Red Flags: Incorrect formula, wrong K-factor, inaccurate ratings
- State Reconstruction
- Efficient undo/redo
- Fast state recovery
- Red Flags: Slow reconstruction, inefficient, no optimization
- Move History Management
- Complete history storage
- Efficient retrieval
- Red Flags: Incomplete history, slow retrieval, no history
Problem-Solving Approach
- Unlimited Undo Design
- Complete move history
- Efficient storage
- Red Flags: Limited undo, inefficient storage, no history
- Edge Cases
- Concurrent moves
- Invalid moves
- Game end conditions
- Red Flags: Ignoring edge cases, no handling
- Trade-off Analysis
- Storage vs performance
- Accuracy vs speed
- Red Flags: No trade-offs, dogmatic choices
System Design Skills
- Component Design
- Game service
- Move validation service
- ELO service
- Red Flags: Monolithic, unclear boundaries
- Real-Time Updates
- WebSocket architecture
- Low-latency delivery
- Red Flags: Polling, high latency, no real-time
- Database Design
- Proper indexing
- Sharding strategy
- Red Flags: Missing indexes, no sharding
Communication Skills
- State Management Explanation
- Can explain FEN notation
- Understands state reconstruction
- Red Flags: No understanding, vague
- Algorithm Explanation
- Can explain ELO calculation
- Understands undo/redo
- Red Flags: No understanding, vague
Meta-Specific Focus
- State Management Expertise
- Deep understanding of state
- Efficient algorithms
- Key: Show state management expertise
- Correctness Focus
- Accurate game rules
- Data integrity
- Key: Demonstrate reliability focus
Summary
Designing a chess game system with unlimited undo and ELO rating requires careful consideration of:
- Game State Management: Efficient board state representation and storage
- Unlimited Undo: Complete move history with efficient undo/redo
- Move Validation: Fast and accurate move validation
- ELO Calculation: Accurate rating system with proper K-factors
- Real-Time Updates: WebSocket-based real-time game updates
- Move History: Complete move history for replay and analysis
- Scalability: Handle 10K+ moves per second
- Data Integrity: Accurate game state and ELO ratings
- Performance: Sub-100ms move processing
- Undo/Redo: Efficient undo/redo with state reconstruction
Key architectural decisions:
- FEN Notation for board state representation
- Complete Move History for unlimited undo
- Standard ELO Formula for rating calculation
- Sharded Database for move storage
- WebSocket for real-time updates
- Redis Cache for active game states
- Horizontal Scaling for all services
The system handles 10,000 moves per second, supports unlimited undo operations, maintains accurate ELO ratings, and provides real-time gameplay with sub-200ms update latency.