ClickHouse Team

Goals

Data Team's Mission at PostHog

Data Team's mission is to provide a storage and query engine that meets these requirements:

  • Meet the needs of the product today and in the future
  • Maintain and optimize our current ClickHouse deployment
  • Elastically scale our capacity with little effort
  • Support multiple query quality-of-service (QoS) guarantees (real-time, batch, etc.)
  • Data is stored once and queryable from the appropriate tool
  • Queries are optimized for cost and performance
  • Tunable execution performance to allow trade-offs between cost and performance
  • Storage is durable

In service of this mission, our goals for Q4 are:

  • Improve elasticity and flexibility of our data store by putting all our data in Iceberg
    • Work effectively with Altinity to ship the read path for Iceberg on ClickHouse - Brett Hoerner
    • Set up infrastructure to ship all of our data to Iceberg on S3 - James Greenhill
    • Ship query logs to Iceberg - James Greenhill
  • Continue improving ClickHouse operational expertise
    • Upgrade to a later version of ClickHouse - James Greenhill
    • Capacity planning - James Greenhill
    • Automation - Daniel Escribano
    • Document the basic mitigation operations in runbooks - Daniel Escribano
  • Schema management
    • Tool for schema migration (coordinator schemas) - Daniel Escribano
    • Tool for long-running mutations - Daniel Escribano
  • Continued investment in performance
    • Tooling for other teams to understand which queries are slow and why - Ted Kaemming
    • Investigate variability of queries - Ted Kaemming
    • Per-team limits on queries/query complexity (needs product work) - Ted Kaemming
