SQL editor

Last updated:

|Edit this page|

The SQL editor enables you to directly access all your data in PostHog, from PostHog-specific events and persons tables to your external sources, using SQL queries.

SQL commands

You create SQL queries out of a list of commands that shape what data, filters, aggregations, and ordering we want to see.

SQL queries in PostHog don’t require the trailing semicolon (;) of traditional SQL queries.

SELECT

Use SELECT to select data (usually columns, transformations, or aggregations) from one or more tables in the database. You can use arithmetic or functions to modify the selected data.

SQL
SELECT
event,
timestamp,
properties.$current_url,
concat(properties.$lib, ' - ', properties.$lib_version)
FROM events

Common values to select are * (representing all), event, timestamp, properties, and functions like count(). You can access properties using dot notation like person.properties.$initial_browser. These values can be found in the data management properties tab or inside tables in the SQL tab.

Add the DISTINCT clause to SELECT commands to keep only unique rows in query results.

SQL
SELECT DISTINCT person_id
FROM events

FROM

Use FROM to select the database table to run the queries on. In PostHog, examples include events, groups, raw_session_replay_events, and more listed in the data management database tab.

SQL
SELECT session_id, min_first_timestamp, click_count
FROM raw_session_replay_events

FROM also enables you to break down your query into subqueries. This is useful for analyzing multiple groups of data. For example, to get the difference in event count between the last 7 days and the 7 days before that, you can use two subqueries like this:

SQL
SELECT
last_7_days.last_7_days_count - previous_7_days.previous_7_days_count
FROM
(
SELECT COUNT(*) AS last_7_days_count
FROM events
WHERE timestamp > now() - INTERVAL 7 DAY
AND timestamp <= now()
) AS last_7_days,
(
SELECT COUNT(*) AS previous_7_days_count
FROM events
WHERE timestamp > now() - INTERVAL 14 DAY
AND timestamp <= now() - INTERVAL 7 DAY
) AS previous_7_days

JOIN

You can query over multiple tables together using the JOIN command. This combines the tables and returns different records depending on which of the four conditions you use:

  1. INNER JOIN: Records with matching values in both tables.

  2. LEFT JOIN: All records from the left table and the matched records from the right table.

  3. RIGHT JOIN: All records from the right table and the matched records from the left table.

  4. FULL JOIN: All records matching either the left or right tables.

The command then takes one table before it and another after it and combines them based on the join condition using the ON keyword. For example, below, the events table is the left table and the persons table is the right table:

SQL
SELECT events.event, persons.is_identified
FROM events
LEFT JOIN persons ON events.person_id = persons.id

This is especially useful when querying using the data warehouse and querying external sources. For example, once you set up the Stripe connector, you can query for a count of events from your customers like this:

SQL
SELECT events.distinct_id, COUNT() AS event_count
FROM events
INNER JOIN prod_stripe_customer ON events.distinct_id = prod_stripe_customer.email
GROUP BY events.distinct_id
ORDER BY event_count DESC

WHERE

Use WHERE to filter rows based on specified conditions. These conditions can be:

  1. Comparison operators like =, !=, <, or >=
  2. Logical operators like AND, OR, or NOT. These are often used to combine multiple conditions.
  3. Functions like toDate, today()
  4. Clauses like LIKE, IN, IS NULL, BETWEEN
SQL
SELECT *
FROM events
WHERE event = '$pageview'
AND toDate(timestamp) = today()
AND properties.$current_url LIKE '%/blog%'

To have the insight or dashboard date range selector apply to the insight, include {filters} in query like this:

SQL
SELECT *
FROM events
WHERE event = '$pageview'
AND properties.$current_url LIKE '%/blog%'
AND {filters}

WHERE is also useful for querying across multiple tables. For example, if you have the Hubspot connector set up, you can get a count of events for contacts with a query like this:

SQL
SELECT COUNT() AS event_count, distinct_id
FROM events
WHERE distinct_id IN (SELECT email FROM hubspot_contacts)
GROUP BY distinct_id
ORDER BY event_count DESC

GROUP BY

Use GROUP BY to group rows that have the same values in specified columns into summary rows. It is often used in combination with aggregate functions.

SQL
select
properties.$os,
count()
from events
group by
properties.$os

ORDER BY

Use ORDER BY to sort the query results by one or more columns. You can specify order by ascending with ASC or descending with DESC.

SQL
SELECT
properties.$browser,
count()
FROM events
GROUP BY properties.$browser
ORDER BY count() DESC

LIKE

Use LIKE to search for a specified pattern in a column. This is often done in the WHERE command. You can also use ILIKE to make the search case insensitive.

Use the % sign to represent any set of characters (wildcard). Use the _ sign to define a single character wildcard.

For example, to get all the current URLs that contain the string "docs" you can use:

SQL
SELECT properties.$current_url
FROM events
WHERE properties.$current_url LIKE '%docs%'

AS

Use AS to alias columns or tables with different names. This makes the query and results more readable.

SQL
SELECT
properties.$current_url as current_url,
count() as current_url_count
FROM events
GROUP BY current_url
ORDER BY current_url_count DESC

LIMIT

Use LIMIT to restrict the number of rows returned by the query. It specifies the maximum number of rows the query should retrieve. By default, PostHog sets it at 100.

SQL
SELECT
properties.$lib as library,
count() as library_count
FROM events
WHERE properties.$lib != ''
GROUP BY library
ORDER BY library_count DESC
LIMIT 1

To paginate results, you can use the OFFSET command. For example, to get the 101-150th rows, you can use it like this:

SQL
SELECT event, count(*)
FROM events
GROUP BY event
ORDER BY count(*) DESC
LIMIT 50 OFFSET 100

Note: SQL insights default to a LIMIT 100. If you're adding the insight to a dashboard or using longer date ranges, consider explicitly increasing the limit to ensure all relevant data is shown.

HAVING

Use HAVING with the GROUP BY command to filter the results based on aggregate function values. While WHERE filters rows before grouping, HAVING filters grouped results after aggregation.

SQL
SELECT
properties.$os,
count()
FROM events
GROUP BY
properties.$os
HAVING count() > 100

WITH

Use WITH to define a temporary result set that you can reference within a larger query. It helps break down complex queries into smaller parts. You can think of it as a function that returns a temporary table similar to using a subquery in a FROM command. The difference is that we query WITH subqueries each time they are used, potentially leading to slower queries.

SQL
with first_query as (
select
count() as first_count
from events
)
select
first_count
from first_query

WINDOW

Use WINDOW to query data across a set of rows related to the current row without grouping the rows in the output like aggregates do. This is useful for complex queries that need row-level detail while also aggregating data from a set of rows.

A window function contains multiple parts:

  1. PARTITION BY: Divides the rows into partitions which the function then applies to. Each partition is processed separately.

  2. ORDER BY: Sorts rows within each partition.

  3. Frame Specification: Defines the subset of rows to include in the window such as ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING.

It is useful for analysis like running totals, moving averages, ranking, and percentiles. For example, to get a running total of pageviews, you can:

SQL
SELECT
event,
timestamp,
COUNT(*) OVER w AS running_total_pageviews
FROM events
WHERE event = '$pageview'
WINDOW w AS (ORDER BY timestamp)
ORDER BY timestamp DESC

CASE

Use CASE within the SELECT command to implement conditional logic within your SQL queries. It allows you to execute different SQL expressions based on conditions. It is similar to if/else statements in programming.

For example, to group properties.$os values in mobile, desktop, and other, you can use:

SQL
SELECT
CASE
WHEN properties.$os IN ('iOS', 'Android') THEN 'mobile'
WHEN properties.$os IN ('Windows', 'Mac OS X', 'Linux', 'Chrome OS') THEN 'desktop'
ELSE 'other'
END AS os_category,
COUNT(*) AS os_count
FROM events
WHERE properties.$os IS NOT NULL
GROUP BY os_category
ORDER BY os_count DESC

Comments

Use two dashes (--) to write comments.

SQL
SELECT *
FROM events
-- WHERE event = '$pageview'

Questions? Ask Max AI.

It's easier than reading through 675 pages of documentation

Community questions

Was this page useful?

Next article

Accessing data using SQL

Strings and quotes Quotation symbols work the same way they would work with ClickHouse, which inherits from ANSI SQL: S ingle quotes ( ' ) for S trings literals. D ouble quotes ( " ) and B ackticks ( ` ) for D ata B ase identifiers. For example: Types Types (and names) for the accessible data can be found in the database , properties tabs in data management as well as in the SQL tab for external sources. They include: STRING (default) JSON (accessible with dot or bracket notation…

Read next article