System Design Interview Summary

This guide serves as both a reference for system design interviews and a practical resource for designing real-world systems. Use it as a checklist for interview practice, or as a framework when architecting production systems. Tailor depth to role level and domain.

What interviewers look for

Problem framing: clarify product goals, constraints, traffic, SLOs.
Architecture: right components, sync vs async, data stores, caches.
Reasoning: back-of-the-envelope capacity, trade-offs, consistency.
Operational maturity: SLOs, error budgets, rollout, observability, cost.
Evolution: phased plan, de-risking, failure modes and mitigations.

Company focus cues

Google: SLOs/error budgets, capacity math, storage and indexing, consistency semantics, multi-region.
Meta: product thinking + practical scale; caching/fanout/feed patterns; experimentation and metrics.
Salesforce: multi-tenancy, data security, compliance (SOX/PCI), integration patterns, API governance.
Robinhood/Fintech: strong consistency for money/orders, audit/regulatory artifacts, risk engines.

30/60/90 minute pacing

0–5: restate scope, define users, success metrics, SLOs.
5–15: APIs, high-level diagram, data model; call out consistency domains.
15–30: deep dive critical path; capacity and storage math.
30–45: reliability (circuit breakers, retries, idempotency), caching, indexing, queues.
45–55: failure drills, observability, rollout, cost.
55–60: recap trade-offs, next steps, evolution.

Capacity checklist

QPS/read-write mix; payload sizes; fanout; egress. Hot vs cold paths; cache hit rate targets.
Storage/day (raw vs. compressed), retention; index sizes; partitioning keys.

Consistency checklist

Strong vs eventual; idempotency keys; dedupe; outbox/inbox; exactly-once illusions.

Reliability checklist

SLOs (p95 latency, availability), error budgets, brownouts; backpressure and load shedding.
Multi-AZ default; multi-region read strategy; state reconciliation.

Security/compliance

AuthZ models, secrets/KMS, encryption; audit trails/immutability; data retention; PII/PCI/SOX.

Practice links (from this blog)

Embedded System Design Showcase

One last tip

Explicitly state SLOs and capacity numbers early. It anchors trade-offs and helps interviewers follow your choices.

Scoring and evaluation

Rubric dimensions (commonly used across Google/Meta/Salesforce/fintech)

Problem framing & requirements: clarifies scope, users, constraints, success metrics, SLOs.
Architecture & decomposition: chooses appropriate components; clear boundaries; sync vs async separation.
Data & storage: correct data models, indexing/partitioning, hot vs cold storage, TTL/retention.
Capacity math: reasonable back-of-the-envelope for QPS, storage/day, egress, cache hit rates.
Reliability & operations: SLOs/error budgets, backpressure, retries/idempotency, rollout, canaries, oncall readiness.
Consistency & correctness: identifies strong vs eventual domains; idempotency keys; dedupe/outbox; ordering.
API design & contracts: versioning, pagination, idempotency, authn/z; clear interfaces.
Security, privacy, compliance: IAM/KMS, encryption, PII/PCI/SOX, auditing.
Observability & testing: metrics/logs/traces, synthetic tests, chaos/failure drills.
Communication & collaboration: structure, trade-offs, active listening, time management.

Score bands (signal examples)

Strong hire: Quantifies scale early, states SLOs, proposes viable architecture with trade-offs, does capacity math, handles consistency explicitly, walks through failures and mitigations, presents evolution plan.
Leaning hire: Good architecture and trade-offs, some scale math, partial SLOs, covers a few failures; minor gaps in consistency or ops.
Leaning no: Vague requirements, jumps to tech buzzwords, no numbers, misses consistency or failure modes, unclear data model.
No hire: Fundamentally incorrect design (e.g., single-node bottleneck at target scale), ignores constraints/security, cannot reason about trade-offs.

Red flags

No numbers anywhere; cannot estimate even roughly.
Ignores SLOs/availability; no plan for retries/backpressure/idempotency.
Hand-wavy storage/indexing; ignores pagination, hot keys, or partitioning.
Over-indexes on tech names over problem fit; solutionism.
Inflexible communication; doesn’t incorporate hints.

Leveling expectations (rough guide)

L3/Junior: Solid fundamentals, clear communication, correct baseline architecture; may need prompting for capacity/SLOs.
L4/Mid: Drives structure independently, produces reasonable numbers, identifies key trade-offs and basic failure handling.
L5/Senior: Anticipates product/ops concerns, quantifies thoroughly, robust consistency/ops plan, strong evolution path.
L6+/Staff: Multi-region, multi-tenant thinking; cost/operability; risk and compliance where relevant; crisp prioritization and simplification.

C++ & Systems Blog

System Design Interview Summary

System Design Interview Summary

What interviewers look for

Company focus cues

30/60/90 minute pacing

Capacity checklist

Consistency checklist

Reliability checklist

Security/compliance

Practice links (from this blog)

One last tip

Scoring and evaluation

Related Posts

Recent Posts