Useful SQL functions

Last updated:

|Edit this page|

Our SQL flavor includes many functions to aggregate and manipulate queried data. Below are some examples of some of the most popular SQL functions you can use in your insights.

Aggregate functions

These aggregate results for columns across all rows. They include:

  • avg(): Calculates the average numeric value of a column.
  • sum(): Calculates the total (sum) numeric value of a column.
  • max(), min(): Finds the maximum or minimum value of a column.
SQL
SELECT
avg(properties.$screen_height),
sum(properties.$screen_height),
max(properties.$screen_height),
min(properties.$screen_height)
FROM events
WHERE event = '$pageview' AND properties.$screen_height IS NOT NULL

You can find a full list of these in supported aggregations.

Count

Use count() to count the number of rows in a particular column. count(*) counts all rows, while count(column_name) counts the number of non-null values in a column.

Regular functions

Our SQL flavor provides many functions for accessing, modifying, and calculating data from queries. Along with the ones listed below, many basics include calculation operators (+, -, /, *), type conversions (toInt, toString), conditional statements (if, multiIf), and rounding (floor, round).

You can find a full list of these in supported ClickHouse functions.

Date and time

  • now(), today(), yesterday(): Returns the current time, date, or yesterday’s date respectively.
  • interval: A length of time for use in arithmetic operations with other dates and times.
  • toDayOfWeek, toHour, toMinute: Converts date number of day of week (1-7), hour in 24-hour time (0-23), and minute in hour (0-59).
  • toStartOfYear, toStartOfMonth, toMonday, toStartOfDay, toStartOfMinute: rounds date down to the nearest year, month, Monday, day, hour, or minute respectively
  • dateDiff('unit', startdate, enddate): Returns the count in unit between startdate and enddate.
  • formatDateTime: Formats a time according to a MySQL datetime format string.
SQL
SELECT
formatDateTime(now(), '%a %b %T') AS current_time,
toDayOfWeek(now()) AS current_day_of_week,
dateDiff('day', timestamp, now()) AS days_since_event
FROM events
WHERE timestamp > now() - interval 1 day

Read more examples in How to do time-based breakdowns (hour, minute, real time) and Using SQL for advanced time and date filters.

String

  • extract: Extracts a fragment of a string using a regular expression.
  • concat: Concatenates strings listed without separator.
  • splitByChar, splitByString, splitByRegexp, splitByWhitespace: splits a string into substring separated by a specified character, string, regular expression, or whitespace character respectively.
  • match: Return whether the string matches a regular expression pattern.
  • replaceOne, replaceRegexpOne: Replace the first occurrence of matching a substring or regular expression pattern respectively with a replacement string.
  • trim: Remove specified characters (or whitespace) from the start or end of a string.
  • upper, lower: Converts a string to uppercase or lowercase.
SQL
select
extract(elements_chain, '[:|"]attr__class="(.*?)"') as class_name,
concat(properties.$os, ' version: ', properties.$os_version),
replaceRegexpOne(properties.$current_url, '^/', 'site/') AS modified_current_url
from events
where event = '$autocapture'

Read more in How to analyze autocapture events with SQL.

JSON

You can access nested data in JSON and objects directly.

SQL
select properties.$set.$geoip_country_name
from events

You can parse JSON with JSONExtractRaw() to return a value.

SQL
SELECT
JSONExtractRaw(properties.$set) as set_properties
FROM events
WHERE properties.$set IS NOT NULL

Specialized JSONExtract functions exist for different data types including:

  • JSONExtractFloat
  • JSONExtractArrayRaw
  • JSONExtractString
  • JSONExtractBool

Array

  • arrayElement(arr, n): Retrieves the element with the index of n from the array arr.
  • arrayJoin(arr): Takes a row and generates multiple rows for the number of elements in the array. It copies all the column values, except the column where this function is applied. It replaces the applied column with the corresponding array value.
SQL
SELECT flag, count()
FROM (
SELECT arrayJoin(JSONExtractArrayRaw(assumeNotNull(properties.$active_feature_flags))) as flag
FROM events
WHERE event = '$pageview' and timestamp > '2023-08-01'
)
GROUP BY flag
ORDER BY count() desc

Read more in How to filter and breakdown arrays with SQL.

Sparkline

A sparkline is a tiny graph contained in one cell of your query result. As an argument, it takes an array of integers.

SQL
SELECT sparkline(range(1, 10)) FROM (SELECT 1)

You can use it to visualize queries, such as a 24-hour $pageview count for different $current_url values.

SQL
SELECT
pageview,
sparkline(arrayMap(h -> countEqual(groupArray(hour), h), range(0,23))),
count() as pageview_count
FROM
(
SELECT
properties.$current_url as pageview,
toHour(timestamp) AS hour
FROM
events
WHERE
timestamp > now () - interval 1 day
and event = '$pageview'
) subquery
GROUP BY
pageview
ORDER BY
pageview_count desc

You can also use it for art.

SQL
select
sparkline(arrayMap(a -> cos(toSecond(timestamp) + a/4), range(100 + 5 * toSecond(timestamp))))
from events

Session replays

You can create a button to view the replay for a session by using the recording_button() function with the session_id. For example, to get a list of recent replays, you can use:

SQL
SELECT
person.properties.email,
min_first_timestamp AS start,
recording_button(session_id)
FROM
raw_session_replay_events
WHERE
min_first_timestamp >= now() - INTERVAL 1 DAY
AND min_first_timestamp <= now()
ORDER BY
min_first_timestamp DESC
LIMIT 10

Questions? Ask Max AI.

It's easier than reading through 675 pages of documentation

Community questions

Was this page useful?

Next article

SQL variables

SQL variables enable you to create configurable, reusable queries by defining placeholders that can be dynamically replaced with different values. Using variables in SQL queries You must first create a variable in the variable sidebar panel. Once created, variables in PostHog SQL are defined using the {variables.<variable-name>} syntax. When you write a query with variables, you'll be prompted to provide values for each variable. For example, this query uses a variable to filter events by a…

Read next article