• Product
  • Pricing
  • Docs
  • Using PostHog
  • Community
  • Company
  • Login
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
          • Horizontal scaling (Sharding & replication)
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Architecture
    • Managing hosting costs
    • EU-only hosting
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Libraries
    • Badge
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Browser Extensions
      • Elixir
      • Go
      • Java
      • Node.js
      • PHP
      • Python
      • Ruby
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • Between Cloud and self-hosted
    • Overview
    • Troubleshooting
      • Overview
      • Tutorial
      • TypeScript types
      • Developer reference
        • Amazon Kinesis Import
        • BitBucket Release Tracker
        • Braze Import
        • Event Replicator
        • GitHub Release Tracker
        • GitHub Star Sync
        • GitLab Release Tracker
        • Heartbeat
        • Ingestion Alert
        • Email Scoring
        • n8n Connector
        • Orbit Connector
        • Redshift Import
        • Segment Connector
        • Shopify Connector
        • Twitter Followers Tracker
        • Zendesk Connector
        • Airbyte Exporter
        • Amazon S3 Export
        • BigQuery Export
        • Customer.io Connector
        • Databricks Export
        • Engage Connector
        • GCP Pub/Sub Connector
        • Google Cloud Storage Export
        • Hubspot Connector
        • Intercom Connector
        • Migrator 3000
        • PagerDuty Connector
        • PostgreSQL Export
        • Redshift Export
        • RudderStack Export
        • Salesforce Connector
        • Sendgrid Connector
        • Sentry Connector
        • Snowflake Export
        • Twilio Connector
        • Variance Connector
        • Zapier Connector
        • Downsampler
        • Event Sequence Timer
        • First Time Event Tracker
        • Property Filter
        • Property Flattener
        • Schema Enforcer
        • Taxonomy Standardizer
        • Unduplicator
        • Automatic Cohort Creator
        • Currency Normalizer
        • GeoIP Enricher
        • Timestamp Parser
        • URL Normalizer
        • User Agent Populator
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Docs

  • Overview
    • Quickstart with PostHog Cloud
    • Overview
      • AWS
      • Azure
      • DigitalOcean
      • Google Cloud Platform
      • Hobby
      • Other platforms
      • Instance settings
      • Environment variables
      • Securing PostHog
      • Monitoring with Grafana
      • Running behind a proxy
      • Configuring email
      • Helm chart configuration
      • Deploying ClickHouse using Altinity.Cloud
      • Configuring Slack
      • Overview
        • Overview
        • Upgrade notes
        • Overview
        • 0001-events-sample-by
        • 0002_events_sample_by
        • 0003_fill_person_distinct_id2
        • ClickHouse
          • Backup
          • Kafka Engine
          • Resize disk
          • Restore
          • Vertical scaling
          • Horizontal scaling (Sharding & replication)
        • Kafka
          • Resize disk
          • Log retention
        • PostgreSQL
          • Resize disk
          • Troubleshooting long-running migrations
        • Plugin server
        • MinIO
        • Redis
        • Zookeeper
      • Disaster recovery
    • Troubleshooting and FAQs
    • Architecture
    • Managing hosting costs
    • EU-only hosting
    • Overview
    • Ingest live data
    • Ingest historical data
    • Identify users
    • User properties
    • Deploying a reverse proxy
    • Libraries
    • Badge
      • Snippet installation
      • Android
      • iOS
      • JavaScript
      • Flutter
      • React Native
      • Browser Extensions
      • Elixir
      • Go
      • Java
      • Node.js
      • PHP
      • Python
      • Ruby
      • Docusaurus v2
      • Gatsby
      • Google Tag Manager
      • Next.js
      • Nuxt.js
      • Retool
      • RudderStack
      • Segment
      • Sentry
      • Slack
      • Shopify
      • WordPress
      • Message formatting
      • Microsoft Teams
      • Slack
      • Discord
    • To another self-hosted instance
    • To PostHog from Amplitude
    • Between Cloud and self-hosted
    • Overview
    • Troubleshooting
      • Overview
      • Tutorial
      • TypeScript types
      • Developer reference
        • Amazon Kinesis Import
        • BitBucket Release Tracker
        • Braze Import
        • Event Replicator
        • GitHub Release Tracker
        • GitHub Star Sync
        • GitLab Release Tracker
        • Heartbeat
        • Ingestion Alert
        • Email Scoring
        • n8n Connector
        • Orbit Connector
        • Redshift Import
        • Segment Connector
        • Shopify Connector
        • Twitter Followers Tracker
        • Zendesk Connector
        • Airbyte Exporter
        • Amazon S3 Export
        • BigQuery Export
        • Customer.io Connector
        • Databricks Export
        • Engage Connector
        • GCP Pub/Sub Connector
        • Google Cloud Storage Export
        • Hubspot Connector
        • Intercom Connector
        • Migrator 3000
        • PagerDuty Connector
        • PostgreSQL Export
        • Redshift Export
        • RudderStack Export
        • Salesforce Connector
        • Sendgrid Connector
        • Sentry Connector
        • Snowflake Export
        • Twilio Connector
        • Variance Connector
        • Zapier Connector
        • Downsampler
        • Event Sequence Timer
        • First Time Event Tracker
        • Property Filter
        • Property Flattener
        • Schema Enforcer
        • Taxonomy Standardizer
        • Unduplicator
        • Automatic Cohort Creator
        • Currency Normalizer
        • GeoIP Enricher
        • Timestamp Parser
        • URL Normalizer
        • User Agent Populator
    • Overview
    • POST-only public endpoints
    • Actions
    • Annotations
    • Cohorts
    • Dashboards
    • Event definitions
    • Events
    • Experiments
    • Feature flags
    • Funnels
    • Groups
    • Groups types
    • Insights
    • Invites
    • Members
    • Persons
    • Plugin configs
    • Plugins
    • Projects
    • Property definitions
    • Session recordings
    • Trends
    • Users
    • Data model
    • Overview
    • GDPR guidance
    • HIPAA guidance
    • CCPA guidance
    • Data egress & compliance
    • Overview
    • Code of conduct
    • Recognizing contributions
  • Using PostHog

  • Table of contents
      • Dashboards
      • Funnels
      • Group Analytics
      • Insights
      • Lifecycle
      • Path analysis
      • Retention
      • Stickiness
      • Trends
      • Heatmaps
      • Session Recording
      • Correlation Analysis
      • Experimentation
      • Feature Flags
      • Actions
      • Annotations
      • Cohorts
      • Data Management
      • Events
      • Persons
      • Sessions
      • UTM segmentation
      • Team collaboration
      • Organizations & projects
      • Settings
      • SSO & SAML
      • Toolbar
      • Notifications & alerts
  • Tutorials
    • All tutorials
    • Actions
    • Apps
    • Cohorts
    • Dashboards
    • Feature flags
    • Funnels
    • Heatmaps
    • Path analysis
    • Retention
    • Session recording
    • Trends
  • Support
  • Docs
  • Self-host
  • Runbook
  • Services
  • ClickHouse
  • Horizontal scaling (Sharding & replication)

Horizontal scaling (sharding & replication)

Last updated: Aug 09, 2022

On this page

  • How to set up sharding and replication
  • Using PostHog Helm charts
  • With external ClickHouse
  • Rebalancing data

Sharding is a horizontal cluster scaling strategy that puts parts of one ClickHouse database on different shards. This can help you to:

  • Improve fault tolerance.

    Sharding lets you isolate individual host or replica set malfunctions. If you don't use sharding, then when one host or a set of replicas fails, the entire data they contain may become inaccessible. But if 1 shard out of 5 fails, for example, then 80% of the table data is still available.

  • Improve the query performance.

    Requests compete with each other for the computing resources of cluster hosts, which can reduce the rate of request processing. This drop in the rate usually becomes obvious as the number of read operations or CPU time per query grows. In a sharded cluster where queries to the same table can be executed in parallel, competition for shared resources is eliminated and query processing time is reduced.

How to set up sharding and replication

Sharding PostHog ClickHouse is a new experimental feature only supported from PostHog 1.34.0.

To use sharding, first upgrade to version >= 1.34.0 and run the 0004_replicated_schema async migration

Using PostHog Helm charts

Update values.yaml with the appropriate sharding settings.

Example:

clickhouse:
layout:
shardsCount: 3
replicasCount: 2

PostHog helm charts implement sharding by leveraging clickhouse-operator configuration. Full documentation for this can be found in clickhouse-operator documentation

With external ClickHouse

If you're using an external ClickHouse provider like Altinity.Cloud, you can change sharding and replication settings within that platform.

Note that to propagate all the schemas to the new ClickHouse nodes, you should also do an helm upgrade which creates the right schema on the new nodes.

Rebalancing data

When adding new shards to an existing cluster, data between shards is not automatically rebalanced. This rebalancing is the job of the PostHog operator to do.

Tools like clickhouse-copier or clickhouse-backup can help with this rebalancing.

Questions?

Was this page useful?

Next article

Kafka

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. At PostHog we mainly use it to stream events from our ingestion pipeline to ClickHouse. Dictionary broker : a cluster is built by one or more servers. The servers forming the storage layer are called brokers event : an event records the fact that "something happened" in the world or in…

Read next article

Share

Jump to:

  • How to set up sharding and replication
  • Using PostHog Helm charts
  • With external ClickHouse
  • Rebalancing data
  • Questions?
  • Edit this page
  • Raise an issue
  • Toggle content width
  • Toggle dark mode
  • About
  • Blog
  • Newsletter
  • Careers
  • Support
  • Contact sales

Product OS suite

Product overview

Analytics
  • Funnels
  • Trends
  • Paths

Pricing

Features
  • Session recording
  • Feature flags
  • Experimentation
  • Heatmaps

Customers

Platform
  • Correlation analysis
  • Collaboration
  • Apps

Community

Discussion
  • Questions?
  • Slack
  • Issues
  • Contact sales
Get involved
  • Roadmap
  • Contributors
  • Merch
  • PostHog FM
  • Marketplace

Docs

Getting started
  • PostHog Cloud
  • Self-hosted
  • Compare options
  • Tutorials
  • PostHog on GitHub
Install & integrate
  • Installation
  • Docs
  • API
  • Apps
User guides
  • Cohorts
  • Funnels
  • Sessions
  • Data
  • Events

Company

About
  • Our story
  • Team
  • Handbook
  • Investors
  • Careers
Resources
  • FAQ
  • Ask a question
  • Blog
  • Press
  • Merch
  • YouTube
© 2022 PostHog, Inc.
  • Code of conduct
  • Privacy
  • Terms