Introduction

Docker is a platform for developing, shipping, and running applications using containerization. It packages applications and their dependencies into containers that can run consistently across different environments. Understanding Docker is essential for system design interviews involving containerization and microservices.

This guide covers:

  • Docker Fundamentals: Containers, images, and Dockerfile
  • Image Management: Building, tagging, and pushing images
  • Container Management: Running, stopping, and managing containers
  • Docker Compose: Multi-container applications
  • Best Practices: Security, optimization, and performance

What is Docker?

Docker is a containerization platform that:

  • Containerization: Packages applications in containers
  • Isolation: Isolates applications from host
  • Portability: Runs consistently across environments
  • Efficiency: Lightweight compared to VMs
  • Scalability: Easy to scale applications

Key Concepts

Container: Running instance of an image

Image: Read-only template for creating containers

Dockerfile: Instructions for building images

Registry: Repository for storing images

Volume: Persistent data storage

Network: Container networking

Architecture

High-Level Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Developer  │────▶│   DevOps    │────▶│   CI/CD     │
│             │     │  Engineer   │     │  Pipeline   │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                    │                    │
       └────────────────────┴────────────────────┘
                            │
                            │ docker build/run
                            │
                            ▼
              ┌─────────────────────────┐
              │   Docker Host            │
              │                         │
              │  ┌──────────┐           │
              │  │  Docker  │           │
              │  │  Daemon  │           │
              │  └────┬─────┘           │
              │       │                 │
              │  ┌────┴─────┐           │
              │  │ Containers│          │
              │  │ (Running) │           │
              │  └──────────┘           │
              │                         │
              │  ┌───────────────────┐  │
              │  │  Images           │  │
              │  │  (Templates)      │  │
              │  └───────────────────┘  │
              └─────────────────────────┘

Explanation:

  • Users: Developers, DevOps engineers, and CI/CD pipelines that build and run containerized applications.
  • Docker Host: Machine (physical or virtual) that runs the Docker daemon and containers.
  • Docker Daemon: Background service that manages Docker objects (containers, images, networks, volumes).
  • Containers: Running instances of Docker images that execute applications in isolated environments.
  • Images: Read-only templates used to create containers. Images are built from Dockerfiles or pulled from registries.

Core Architecture

┌─────────────────────────────────────────────────────────┐
│              Docker Host                                 │
│                                                           │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Docker Daemon                            │    │
│  │  (Container Management, Image Management)          │    │
│  └──────────────────────────────────────────────────┘    │
│                          │                                │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Containers                               │    │
│  │  (Running Applications)                           │    │
│  └──────────────────────────────────────────────────┘    │
│                          │                                │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Images                                    │    │
│  │  (Application Templates)                          │    │
│  └──────────────────────────────────────────────────┘    │
└───────────────────────────────────────────────────────────┘

Dockerfile

Basic Dockerfile

Dockerfile:

FROM node:18-alpine

WORKDIR /app

COPY package*.json ./
RUN npm install

COPY . .

EXPOSE 3000

CMD ["node", "index.js"]

Multi-Stage Build

# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/index.js"]

Best Practices

Optimize Layers:

# Bad: Creates multiple layers
RUN apt-get update
RUN apt-get install -y python3
RUN pip install flask

# Good: Single layer
RUN apt-get update && \
    apt-get install -y python3 && \
    pip install flask

Use .dockerignore:

node_modules
.git
.env
*.log

Image Management

Build Image

Build:

docker build -t myapp:1.0 .

Tag Image:

docker tag myapp:1.0 myapp:latest
docker tag myapp:1.0 registry.example.com/myapp:1.0

Push to Registry

Push:

docker push registry.example.com/myapp:1.0

Pull Image

Pull:

docker pull nginx:latest

Container Management

Run Container

Basic Run:

docker run nginx:latest

Run with Options:

docker run -d \
  --name my-nginx \
  -p 8080:80 \
  -v /host/path:/container/path \
  nginx:latest

Run with Environment Variables:

docker run -e DATABASE_URL=postgresql://localhost:5432/mydb \
  myapp:latest

Container Commands

List Containers:

docker ps          # Running containers
docker ps -a       # All containers

Stop Container:

docker stop my-nginx

Start Container:

docker start my-nginx

Remove Container:

docker rm my-nginx

View Logs:

docker logs my-nginx
docker logs -f my-nginx  # Follow logs

Docker Compose

docker-compose.yml

version: '3.8'

services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://db:5432/mydb
    depends_on:
      - db
  
  db:
    image: postgres:14
    environment:
      - POSTGRES_DB=mydb
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:

Commands:

docker-compose up          # Start services
docker-compose up -d       # Start in background
docker-compose down        # Stop services
docker-compose logs        # View logs

Volumes

Named Volume

Create Volume:

docker volume create my-volume

Use Volume:

docker run -v my-volume:/data nginx:latest

Bind Mount

Mount Host Directory:

docker run -v /host/path:/container/path nginx:latest

Volume in Dockerfile

VOLUME ["/data"]

Networking

Default Networks

Bridge Network:

docker network create my-network
docker run --network my-network nginx:latest

Host Network:

docker run --network host nginx:latest

None Network:

docker run --network none nginx:latest

Custom Network

Create Network:

docker network create --driver bridge my-network

Connect Containers:

docker network connect my-network container1
docker network connect my-network container2

Best Practices

1. Image Optimization

  • Use multi-stage builds
  • Minimize layers
  • Use .dockerignore
  • Use appropriate base images

2. Security

  • Don’t run as root
  • Scan images for vulnerabilities
  • Use official images
  • Keep images updated

3. Performance

  • Use appropriate base images
  • Cache layers effectively
  • Minimize image size
  • Use health checks

4. Resource Management

  • Set resource limits
  • Monitor resource usage
  • Clean up unused resources
  • Use volumes for data

What Interviewers Look For

Containerization Understanding

  1. Docker Concepts
    • Understanding of containers, images, Dockerfile
    • Container lifecycle
    • Image management
    • Red Flags: No Docker understanding, wrong concepts, poor image management
  2. Containerization Patterns
    • Multi-stage builds
    • Volume management
    • Networking
    • Red Flags: No patterns, poor volumes, no networking
  3. Orchestration
    • Docker Compose
    • Container orchestration
    • Scaling strategies
    • Red Flags: No orchestration, poor scaling, no strategy

Problem-Solving Approach

  1. Image Design
    • Dockerfile optimization
    • Layer management
    • Security considerations
    • Red Flags: Poor Dockerfile, many layers, security issues
  2. Container Management
    • Resource allocation
    • Volume management
    • Networking configuration
    • Red Flags: No resources, poor volumes, no networking

System Design Skills

  1. Containerization Architecture
    • Docker deployment design
    • Multi-container applications
    • Resource management
    • Red Flags: No architecture, poor containers, no resources
  2. Scalability
    • Container scaling
    • Resource optimization
    • Performance tuning
    • Red Flags: No scaling, poor optimization, no tuning

Communication Skills

  1. Clear Explanation
    • Explains Docker concepts
    • Discusses trade-offs
    • Justifies design decisions
    • Red Flags: Unclear explanations, no justification, confusing

Meta-Specific Focus

  1. Containerization Expertise
    • Understanding of containers
    • Docker mastery
    • Deployment strategies
    • Key: Demonstrate containerization expertise
  2. System Design Skills
    • Can design containerized systems
    • Understands containerization challenges
    • Makes informed trade-offs
    • Key: Show practical containerization design skills

Summary

Docker Key Points:

  • Containerization: Packages applications in containers
  • Portability: Runs consistently across environments
  • Efficiency: Lightweight compared to VMs
  • Isolation: Isolates applications from host
  • Scalability: Easy to scale applications

Common Use Cases:

  • Application deployment
  • Microservices
  • Development environments
  • CI/CD pipelines
  • Cloud-native applications
  • Multi-container applications

Best Practices:

  • Use multi-stage builds
  • Minimize image layers
  • Use .dockerignore
  • Set resource limits
  • Use volumes for data
  • Implement health checks
  • Keep images updated

Docker is a powerful platform for containerizing applications and enabling consistent deployment across different environments.