Optimizing SQL queries

When writing custom queries, the burden of performance falls on you. PostHog handles performance for queries we own (for example, product analytics insights and experiments), but because performance depends on how your queries are structured and written, we can't optimize them for you. Large datasets in particular require careful attention to performance.

Here is some advice for making sure your queries run quickly and don't read more data than necessary (which can increase costs):

1. Use shorter time ranges

You should almost always include a time range in your queries, and the shorter the better. There are a variety of SQL features to help you do this, including now(), INTERVAL, and dateDiff. See more about these in our SQL docs.

SELECT count() FROM events WHERE timestamp >= now() - INTERVAL 7 DAY
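As a sketch of the dateDiff approach mentioned above, the same seven-day window can also be written like this (note that wrapping the timestamp column in a function can prevent the engine from using its index, so the now() - INTERVAL form is usually preferable):

SELECT count() FROM events WHERE dateDiff('day', timestamp, now()) < 7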

2. Materialize a view for the data you need

The data warehouse enables you to save and materialize views of your data. This means that the view is precomputed, which can significantly improve query performance.

To do this, write your query in the SQL editor, click Materialize, then Save and materialize, and give it a name without spaces (for example, mat_event_count). You can also schedule the view to update at a specific interval.

Materialize view

Once done, you can query the view like any other table.

SELECT * FROM mat_event_count

3. Don't scan the same table multiple times

Reading a large table like events or persons more than once in the same query multiplies the work PostHog has to do (more I/O, more CPU, more memory). For example, this query is inefficient:

WITH us_events AS (
    SELECT *
    FROM events
    WHERE properties.$geoip_country_code = 'US'
),
ca_events AS (
    SELECT *
    FROM events
    WHERE properties.$geoip_country_code = 'CA'
)
SELECT *
FROM us_events
UNION ALL
SELECT *
FROM ca_events

Instead, pull the rows you need once and save them as a materialized view. You can then query that materialized view in all the other steps.

Start by saving this materialized view, e.g. as base_events:

SELECT event, properties.$geoip_country_code as country
FROM events
WHERE properties.$geoip_country_code IN ('US', 'CA')

You can then query from base_events in your main query, which avoids scanning the raw events table multiple times:

WITH us_events AS (
    SELECT event
    FROM base_events
    WHERE country = 'US'
),
ca_events AS (
    SELECT event
    FROM base_events
    WHERE country = 'CA'
)
SELECT *
FROM us_events
UNION ALL
SELECT *
FROM ca_events

4. Name your queries for easier debugging

Always provide a meaningful name parameter for your queries. This helps you:

  • Identify slow or problematic queries in the query_log table
  • Analyze query performance patterns over time
  • Debug issues more efficiently
  • Track resource usage by query type

Good query names are descriptive and include the purpose:

  • daily_active_users_last_7_days
  • funnel_signup_to_activation
  • revenue_by_country_monthly

Bad names are generic and vague:

  • query1
  • test
  • data
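Once your queries are named, you can use the query_log table to find the slow ones. A minimal sketch (the exact column names here are assumptions; check the query_log schema in your instance):

SELECT name, query_duration_ms
FROM query_log
WHERE query_duration_ms > 5000
ORDER BY query_duration_ms DESC
LIMIT 20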

5. Use timestamp-based pagination instead of OFFSET

When querying large datasets like events or query_log over multiple batches, avoid using OFFSET for pagination. Instead, use timestamp-based pagination, which is much more efficient and scales better.

❌ Inefficient approach using OFFSET:

-- First batch
SELECT timestamp, event, distinct_id
FROM events
WHERE timestamp >= '2024-01-01'
ORDER BY timestamp
LIMIT 1000;
-- Second batch
SELECT timestamp, event, distinct_id
FROM events
WHERE timestamp >= '2024-01-01'
ORDER BY timestamp
LIMIT 1000 OFFSET 1000; -- This gets slower with each page

✅ Efficient approach using timestamp pagination:

-- First batch
SELECT timestamp, event, distinct_id
FROM events
WHERE timestamp >= '2024-01-01'
ORDER BY timestamp
LIMIT 1000;
-- Second batch (use timestamp of last event from previous batch)
SELECT timestamp, event, distinct_id
FROM events
WHERE timestamp > '2024-01-01 12:34:56.789' -- timestamp from last row of previous batch
ORDER BY timestamp
LIMIT 1000;

This approach is more efficient because:

  • Constant performance: Each query executes in similar time regardless of how many rows you've already retrieved
  • Index-friendly: Uses the timestamp index effectively for filtering
  • Scalable: Performance doesn't degrade as you paginate through millions of rows

For geeks: OFFSET-based pagination gets progressively slower because the database must scan and skip all the offset rows for each query. With timestamp-based pagination, the database uses the timestamp index to directly jump to the right starting point, maintaining consistent performance across all pages.

6. Other SQL optimizations

Options 1-5 make the most difference, but other generic SQL optimizations work too. See our SQL docs for commands, useful functions, and more to help you with this.
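For example, one generic optimization is selecting only the columns you need instead of SELECT *, which reduces the amount of data read per query:

SELECT event, timestamp
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY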
