Is PostHog warehouse native?

PostHog and Snowflake's relationship status: It's complicated.

tl;dr

PostHog enables you to connect external warehouses (like Snowflake or BigQuery) as sources to use warehouse tables inside PostHog. This moves selected data into PostHog and runs queries on our compute. You keep your data in your existing warehouse while still benefiting from PostHog tools such as analytics, experiments, and feature flags.

Alternatively, you can use PostHog's integrated data warehouse so that your data never needs to travel. You can store and model data in PostHog's warehouse, then use it across PostHog tools without stitching together multiple vendors or maintaining complex ETL pipelines.

What does "warehouse native" mean anyway?

“Warehouse native” usually means an external tool, such as an analytics platform, that sits outside your warehouse and does not ingest data. Queries and workloads are created in the external platform but run on your data warehouse (Snowflake, BigQuery, Databricks, etc.), so the data stays there and never moves into the external tool. This is what tools like Statsig and Amplitude offer: they run queries directly in your existing warehouse.

PostHog takes a different approach: we give you the flexibility to either move selected data into PostHog from an external warehouse, or to use our integrated data warehouse. Both options enable you to use warehouse data with PostHog tools and run queries on our compute, while giving you complete control over where your data is stored.

What does PostHog support today?

If you're using an external warehouse, such as Snowflake, BigQuery, or Databricks, you can connect it as a source and sync the tables and fields you need into PostHog via our warehouse sources. Queries then run on PostHog compute, enabling you to use warehouse data across PostHog tools. This requires moving data out of the warehouse, but it lets you keep your data in your existing warehouse while still benefiting from PostHog tools.
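As a concrete sketch of this workflow: once a warehouse table is synced, it can be queried with HogQL (PostHog's SQL dialect) through PostHog's query API. The snippet below only builds the request payload; the table name `snowflake_orders` is hypothetical, and the exact endpoint path and payload shape should be confirmed against PostHog's API docs before use.

```python
import json

def build_hogql_query_payload(sql: str) -> dict:
    """Wrap a HogQL string in the payload shape PostHog's query
    endpoint expects (a HogQLQuery node). Sketch only: verify the
    shape against PostHog's API documentation."""
    return {"query": {"kind": "HogQLQuery", "query": sql}}

# Hypothetical synced Snowflake table exposed as `snowflake_orders`.
payload = build_hogql_query_payload(
    "SELECT status, count() FROM snowflake_orders GROUP BY status"
)
print(json.dumps(payload))

# You would then POST this payload (with a personal API key) to the
# project's query endpoint, e.g.:
#   POST https://us.posthog.com/api/projects/<project_id>/query
```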

Alternatively, PostHog offers an integrated data warehouse which works with other PostHog tools such as product analytics, experiments, and feature flags. If you're using PostHog as your data warehouse, your data stays in PostHog and can be accessed by other PostHog tools, eliminating the need to stitch multiple vendors together and maintain complex ETL pipelines.

What does PostHog not support today?

PostHog does not execute queries directly on an external warehouse (e.g. Snowflake, BigQuery, or Databricks). If your requirement is “run my analytics queries inside my existing warehouse,” that isn't how PostHog works today. We run queries on PostHog infrastructure, either in our integrated warehouse or on data synced into PostHog via our warehouse sources.

Why this approach?

The benefit of using an integrated stack like PostHog is that it gives you a single place for tools such as product analytics, experiments, and feature targeting without hopping between your warehouse and a separate product tool. We chose to own the execution environment so we can deliver that experience consistently, while also offering users a way to eliminate point solutions that need to be stitched together into complex data stacks.

Companies like HeadshotPro, Webshare, and ElevenLabs use PostHog's integrated warehouse as their single source of truth. This eliminates the need to maintain multiple systems and complex ETL pipelines.

What does this mean for the future?

PostHog is building a managed warehouse based on DuckDB in addition to the current ClickHouse-based warehouse. The focus is on expanding what integrates with it, making it easier to use PostHog as your primary data platform without stitching together multiple tools. If you're interested in finding out more, we suggest joining the waitlist for the managed DuckDB warehouse.

How PostHog's integrated warehouse works (docs) →
