Introduction

Apache Pinot is a real-time, distributed OLAP datastore designed to provide low-latency analytics on large-scale data, serving analytical queries with sub-second latency even under high query concurrency. Understanding Pinot is essential for system design interviews involving real-time analytics and OLAP workloads.

This guide covers:

  • Pinot Fundamentals: Tables, segments, and real-time ingestion
  • Data Model: Dimensions, metrics, and time columns
  • Querying: SQL queries and the broker query REST API
  • Performance: Star-tree indexing and optimization
  • Best Practices: Schema design, partitioning, and scaling

What is Apache Pinot?

Apache Pinot is a real-time OLAP database that provides:

  • Real-Time Ingestion: Streaming ingestion from sources such as Kafka
  • Low Latency: Sub-second query performance
  • Column-Oriented Storage: Optimized for analytical queries
  • Scalability: Horizontal scaling of brokers and servers
  • SQL Interface: Standard SQL queries

Key Concepts

Table: Logical collection of data; can be OFFLINE (batch), REALTIME (streaming), or hybrid (both)

Segment: Unit of storage, replication, and query execution; a table consists of many segments

Dimension: Column used for filtering and grouping

Metric: Column used for aggregation

Time Column: Timestamp column used to organize segments by time and apply retention

Star-Tree Index: Pre-aggregated index for fast aggregation queries

Architecture

High-Level Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │     │   Client    │     │   Client    │
│ Application │     │ Application │     │ Application │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┴───────────────────┘
                           │
                           │ SQL Queries
                           ▼
              ┌─────────────────────────┐
              │      Pinot Broker       │
              │     (Query Router)      │
              └────────────┬────────────┘
                           │  scatter / gather
              ┌────────────▼────────────┐
              │      Server Nodes       │
              │  (Segments, Query Exec) │
              └────────────┬────────────┘
                           │  load / back up segments
                           ▼
              ┌─────────────────────────┐      ┌─────────────────┐
              │      Deep Storage       │◀────▶│     Minion      │
              │      (S3, HDFS)         │      │  (Background    │
              │                         │      │   Tasks)        │
              └─────────────────────────┘      └─────────────────┘

Explanation:

  • Client Applications: Applications that query Pinot for real-time OLAP analytics (e.g., dashboards, analytics platforms, ad-hoc queries).
  • Pinot Broker: Accepts queries, routes them to the servers hosting the relevant segments, and merges the partial results (scatter/gather).
  • Server Nodes: Store and serve data segments, execute queries against them, and return results to brokers. Real-time servers also consume directly from streams such as Kafka.
  • Minion: Runs background maintenance tasks such as segment merge/rollup, purging, and format conversion; it is not on the query path.
  • Deep Storage: Long-term storage for segments (e.g., S3, HDFS). Completed segments are backed up to deep storage, and servers download segments from it.
  • Controller (not shown above): Manages cluster metadata and segment assignment via Apache Helix and ZooKeeper; see the core architecture diagram below.
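A minimal way to exercise this query path is to POST SQL to the broker's REST endpoint (default port 8099). A sketch, assuming a local cluster and an events table like the one defined later in this guide:

curl -s -X POST http://localhost:8099/query/sql \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT country, COUNT(*) FROM events GROUP BY country LIMIT 10"}'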

Core Architecture

┌─────────────────────────────────────────────────────────┐
│              Pinot Cluster                               │
│                                                           │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Controller                                │    │
│  │  (Metadata Management, Segment Assignment)         │    │
│  └──────────────────────────────────────────────────┘    │
│                          │                                │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Broker                                     │    │
│  │  (Query Routing, Result Merging)                   │    │
│  └──────────────────────────────────────────────────┘    │
│                          │                                │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Server                                    │    │
│  │  (Segment Storage, Query Execution)                 │    │
│  └──────────────────────────────────────────────────┘    │
│                          │                                │
│  ┌──────────────────────────────────────────────────┐    │
│  │         Minion                                    │    │
│  │  (Background Tasks: Merge/Rollup, Purge)            │    │
│  └──────────────────────────────────────────────────┘    │
└───────────────────────────────────────────────────────────┘
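To see these roles concretely, a local cluster can be started one component at a time with the admin script. A quick-start-style sketch; ports and addresses are the usual defaults:

pinot-admin.sh StartZookeeper -zkPort 2181
pinot-admin.sh StartController -zkAddress localhost:2181 -controllerPort 9000
pinot-admin.sh StartBroker -zkAddress localhost:2181
pinot-admin.sh StartServer -zkAddress localhost:2181
pinot-admin.sh StartMinion -zkAddress localhost:2181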

Table Schema

Schema Definition

{
  "schemaName": "events",
  "dimensionFieldSpecs": [
    {
      "name": "user_id",
      "dataType": "LONG"
    },
    {
      "name": "event_type",
      "dataType": "STRING"
    },
    {
      "name": "country",
      "dataType": "STRING"
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "count",
      "dataType": "LONG",
      "defaultNullValue": 0
    },
    {
      "name": "revenue",
      "dataType": "DOUBLE",
      "defaultNullValue": 0.0
    }
  ],
  "timeFieldSpec": {
    "incomingGranularitySpec": {
      "timeType": "MILLISECONDS",
      "dataType": "LONG",
      "name": "timestamp"
    },
    "outgoingGranularitySpec": {
      "timeType": "MILLISECONDS",
      "dataType": "LONG",
      "name": "timestamp"
    }
  }
}
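Note that newer Pinot versions usually declare the time column with dateTimeFieldSpecs rather than the legacy timeFieldSpec shown here. Once saved (e.g., as schema.json), the schema is registered with the controller; a sketch assuming a local controller on the default port 9000:

pinot-admin.sh AddSchema \
  -schemaFile schema.json \
  -controllerHost localhost \
  -controllerPort 9000 \
  -exec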

Table Configuration

Offline Table

{
  "tableName": "events_OFFLINE",
  "tableType": "OFFLINE",
  "segmentsConfig": {
    "timeColumnName": "timestamp",
    "timeType": "MILLISECONDS",
    "segmentPushType": "APPEND",
    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
    "replication": "1",
    "schemaName": "events"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP"
  },
  "metadata": {
    "customConfigs": {}
  }
}

Real-Time Table

{
  "tableName": "events_REALTIME",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "timestamp",
    "timeType": "MILLISECONDS",
    "schemaName": "events",
    "replicasPerPartition": "1"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.topic.name": "events",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.broker.list": "localhost:9092"
    }
  }
}
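Either table config can also be created through the controller REST API instead of the CLI. A sketch, assuming the config above is saved as table-config.json and the controller runs locally on port 9000:

curl -X POST http://localhost:9000/tables \
  -H "Content-Type: application/json" \
  -d @table-config.json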

Data Ingestion

Batch Ingestion

Create Table:

pinot-admin.sh AddTable \
  -schemaFile schema.json \
  -tableConfigFile table-config.json \
  -exec

Push Data:

pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile job-spec.yaml
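The job spec describes where the raw data lives, how to read it, and which cluster to push segments to. A minimal standalone sketch for CSV input; the directory paths are placeholders:

executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/data/events/raw'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/data/events/segments'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
tableSpec:
  tableName: 'events'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'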

Real-Time Ingestion

Kafka Stream:

{
  "streamConfigs": {
    "streamType": "kafka",
    "stream.kafka.topic.name": "events",
    "stream.kafka.consumer.type": "lowlevel",
    "stream.kafka.broker.list": "localhost:9092",
    "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
  }
}
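With this stream config, any JSON message on the events topic whose fields match the schema is ingested as a row and becomes queryable within seconds. A quick local test using Kafka's console producer (the payload is illustrative and mirrors the schema above):

echo '{"user_id": 12345, "event_type": "click", "country": "US", "count": 1, "revenue": 9.99, "timestamp": 1609459200000}' | \
  kafka-console-producer.sh --bootstrap-server localhost:9092 --topic events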

Querying

SQL Queries

Basic Query:

SELECT 
  country,
  COUNT(*) AS events,
  SUM(revenue) AS total_revenue
FROM events
WHERE timestamp >= 1609459200000
GROUP BY country
ORDER BY total_revenue DESC
LIMIT 10

Time-Based Query:

SELECT 
  DATETRUNC('hour', timestamp) AS hour_millis,
  COUNT(*) AS events
FROM events
WHERE timestamp >= 1609459200000
GROUP BY DATETRUNC('hour', timestamp)
ORDER BY DATETRUNC('hour', timestamp)

Filtering:

SELECT 
  event_type,
  COUNT(*) AS events
FROM events
WHERE timestamp >= 1609459200000
  AND country = 'US'
  AND user_id > 1000
GROUP BY event_type

Query REST API

PQL was Pinot's original query language and is now deprecated; current versions accept standard SQL submitted to the broker's REST endpoint (POST /query/sql) as a JSON body. Aggregation example:

{
  "sql": "SELECT country, COUNT(*), SUM(revenue) FROM events WHERE timestamp >= 1609459200000 GROUP BY country"
}

Star-Tree Index

Configuration

{
  "tableIndexConfig": {
    "starTreeIndexConfigs": [
      {
        "dimensionsSplitOrder": ["country", "event_type"],
        "functionColumnPairs": ["COUNT", "SUM__revenue"],
        "maxLeafRecords": 10
      }
    ]
  }
}

Benefits:

  • Pre-aggregated results for the configured dimension/metric combinations
  • Bounded, predictable latency for matching GROUP BY queries
  • Trades additional storage for query speed (pre-aggregated records are stored alongside the raw data)
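For example, a query that groups by the dimensions in dimensionsSplitOrder and aggregates the configured function/column pairs can be answered largely from pre-aggregated star-tree nodes instead of scanning raw rows:

SELECT 
  country,
  event_type,
  COUNT(*) AS events,
  SUM(revenue) AS total_revenue
FROM events
GROUP BY country, event_type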

Performance Optimization

Indexing

Inverted Index:

{
  "tableIndexConfig": {
    "invertedIndexColumns": ["country", "event_type"]
  }
}

Range Index:

{
  "tableIndexConfig": {
    "rangeIndexColumns": ["timestamp", "user_id"]
  }
}
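To make the mapping concrete, the earlier filtering query shows which index serves which predicate, assuming both configurations above are applied to the events table:

-- country and event_type equality filters are served by the inverted index
-- the user_id and timestamp range filters are served by the range index
SELECT COUNT(*) AS events
FROM events
WHERE country = 'US'
  AND event_type = 'click'
  AND user_id > 1000
  AND timestamp BETWEEN 1609459200000 AND 1612137600000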

Partitioning

Time-Based Partitioning:

Segments are organized along the time column, so queries with time filters only read the segments that cover the requested range.

{
  "segmentsConfig": {
    "timeColumnName": "timestamp",
    "timeType": "MILLISECONDS",
    "segmentPushType": "APPEND"
  }
}
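Segments can additionally be partitioned on a dimension so that the broker prunes segments by partition as well. A sketch using a Murmur hash on user_id; the partition count is illustrative and must match how the upstream data (e.g., the Kafka topic) is partitioned:

{
  "tableIndexConfig": {
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "user_id": {
          "functionName": "Murmur",
          "numPartitions": 8
        }
      }
    }
  },
  "routing": {
    "segmentPrunerTypes": ["partition"]
  }
}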

Compression

Compression Codec (raw-encoded columns):

{
  "tableIndexConfig": {
    "noDictionaryColumns": ["revenue"]
  },
  "fieldConfigList": [
    {
      "name": "revenue",
      "encodingType": "RAW",
      "compressionCodec": "SNAPPY"
    }
  ]
}

Best Practices

1. Schema Design

  • Use appropriate data types
  • Design dimensions for filtering
  • Design metrics for aggregation
  • Use time column for partitioning

2. Table Configuration

  • Configure appropriate replication
  • Set segment sizes and real-time flush thresholds (see the sketch below)
  • Use star-tree for common queries
  • Optimize indexes
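For real-time tables, segment size is governed by flush thresholds in streamConfigs. A sketch of commonly used settings; the values are illustrative and the property names have varied slightly across Pinot versions:

{
  "streamConfigs": {
    "realtime.segment.flush.threshold.time": "6h",
    "realtime.segment.flush.threshold.rows": "5000000"
  }
}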

3. Query Performance

  • Always use time filters
  • Filter on dimensions early
  • Use appropriate aggregations
  • Limit result sets

4. Scaling

  • Scale servers for storage
  • Scale brokers for queries
  • Monitor segment distribution
  • Plan for capacity

What Interviewers Look For

OLAP Database Understanding

  1. Pinot Concepts
    • Understanding of columnar storage
    • Real-time ingestion
    • Star-tree indexing
    • Red Flags: No Pinot understanding, wrong model, no indexing
  2. Real-Time Analytics
    • Low-latency queries
    • Stream ingestion
    • Pre-aggregation strategies
    • Red Flags: No latency awareness, poor ingestion, no aggregation
  3. Performance
    • Query optimization
    • Indexing strategies
    • Compression
    • Red Flags: No optimization, poor indexes, no compression

Problem-Solving Approach

  1. Schema Design
    • Dimension and metric design
    • Time partitioning
    • Index selection
    • Red Flags: Poor design, no partitioning, wrong indexes
  2. Query Optimization
    • Time filter usage
    • Dimension filtering
    • Aggregation design
    • Red Flags: No time filters, poor filtering, no aggregation

System Design Skills

  1. Analytics Architecture
    • Pinot cluster design
    • Ingestion pipeline
    • Query optimization
    • Red Flags: No architecture, poor ingestion, no optimization
  2. Scalability
    • Horizontal scaling
    • Segment management
    • Performance tuning
    • Red Flags: No scaling, poor segments, no tuning

Communication Skills

  1. Clear Explanation
    • Explains Pinot concepts
    • Discusses trade-offs
    • Justifies design decisions
    • Red Flags: Unclear explanations, no justification, confusing

Meta-Specific Focus

  1. Real-Time Analytics Expertise
    • Understanding of OLAP systems
    • Pinot mastery
    • Low-latency optimization
    • Key: Demonstrate real-time analytics expertise
  2. System Design Skills
    • Can design analytics systems
    • Understands OLAP challenges
    • Makes informed trade-offs
    • Key: Show practical analytics design skills

Summary

Apache Pinot Key Points:

  • Real-Time OLAP: Low-latency analytical queries
  • Column-Oriented: Optimized for analytics
  • Real-Time Ingestion: Stream data ingestion
  • Star-Tree Index: Pre-aggregated indexes
  • Scalable: Horizontal scaling

Common Use Cases:

  • Real-time dashboards
  • OLAP analytics
  • Time-series analytics
  • Event analytics
  • Business intelligence
  • User analytics

Best Practices:

  • Design dimensions and metrics carefully
  • Use time column for partitioning
  • Configure star-tree for common queries
  • Always use time filters in queries
  • Optimize indexes
  • Monitor query performance
  • Plan for horizontal scaling

Apache Pinot is a powerful real-time OLAP database optimized for sub-second analytical queries on large datasets.