Sampling (Beta)

Last updated:

|Edit this page|

Results sampling can significantly speed up loading if you have a large amount of data or are running complex queries in your insights.

Processing a lot of data can take some time, so we can offer faster results by sampling a portion of the data and extrapolating the results. The results will be less precise, but remain accurate and statistically significant if your sample pool of events is large enough.

Example

A conversion funnel for $pageview events leading to user_signed_up events over 180 days on a popular website could require aggregating billions of events to compute a result. This could take a while.

If we were to turn on sampling at a 1%, we would aggregate over only 10 million events, and then multiply the result by 100. This results in much faster load times.

Enabling insight sampling

Sampling can be enabled in each insight under Advanced options > Sampling. You can choose between sampling at 0.1%, 1%, 10%, and 25%.

These sampling rates will persist when the insight is saved, meaning if you set an insight to 10% sampling and save it to a dashboard, the results for that insight on the dashboard will be sampled. We flag sampled insights with an icon and a helpful tooltip.

Speed up slow queries

If a certain insight is taking long to load, we display a notice with some recommendations for speeding it up, including a button to enable sampling. The insight will persist your sampling rate on save.

FAQ

Will the sampled results be consistent across calculations?

If you do not send new data, yes. For a given sampling rate, the analysis will always run on the same set of data, so you don't have to worry about sampled results changing.

Does sampling work when calculating conversions?

Yes. Our sampling doesn't take a random set of events, it takes a sample based on a sampling variable. Currently, we use a user's distinct IDs for. All events are attached to distinct IDs, so you don't run the risk of an event at the first step of your funnel being in the sample while the subsequent related events aren't.

What sampling mechanism do you use under the hood?

We use ClickHouse's native sampling feature.

Questions? Ask Max AI.

It's easier than reading through 638 pages of documentation

Community questions

Was this page useful?

Next article

Customizable chart colors

Color themes Important: This feature requires a Teams or an Enterprise plan . PostHog allows you to customize color themes for your insights, making it easier to align visualizations with your brand identity. This section covers how to create custom themes, how to set a default theme, and how to override themes for individual insights. You can find the setting related to color themes under Project settings > Product analytics > Data colors . The color themes are also used when displaying…

Read next article