Our SQL flavor includes many functions to aggregate and manipulate queried data. Below are some examples of some of the most popular SQL functions you can use in your insights.
Aggregate functions
These aggregate results for columns across all rows. They include:
avg()
: Calculates the average numeric value of a column.sum()
: Calculates the total (sum) numeric value of a column.max()
,min()
: Finds the maximum or minimum value of a column.
SELECTavg(properties.$screen_height),sum(properties.$screen_height),max(properties.$screen_height),min(properties.$screen_height)FROM eventsWHERE event = '$pageview' AND properties.$screen_height IS NOT NULL
You can find a full list of these in supported aggregations.
Count
Use count()
to count the number of rows in a particular column. count(*)
counts all rows, while count(column_name)
counts the number of non-null values in a column.
Regular functions
Our SQL flavor provides many functions for accessing, modifying, and calculating data from queries. Along with the ones listed below, many basics include calculation operators (+
, -
, /
, *
), type conversions (toInt
, toString
), conditional statements (if
, multiIf
), and rounding (floor
, round
).
You can find a full list of these in supported ClickHouse functions.
Date and time
now()
,today()
,yesterday()
: Returns the current time, date, or yesterday’s date respectively.interval
: A length of time for use in arithmetic operations with other dates and times.toDayOfWeek
,toHour
,toMinute
: Converts date number of day of week (1-7), hour in 24-hour time (0-23), and minute in hour (0-59).toStartOfYear
,toStartOfMonth
,toMonday
,toStartOfDay
,toStartOfMinute
: rounds date down to the nearest year, month, Monday, day, hour, or minute respectivelydateDiff('unit', startdate, enddate)
: Returns the count inunit
betweenstartdate
andenddate
.formatDateTime
: Formats a time according to a MySQL datetime format string.
SELECTformatDateTime(now(), '%a %b %T') AS current_time,toDayOfWeek(now()) AS current_day_of_week,dateDiff('day', timestamp, now()) AS days_since_eventFROM eventsWHERE timestamp > now() - interval 1 day
Read more examples in How to do time-based breakdowns (hour, minute, real time) and Using SQL for advanced time and date filters.
String
extract
: Extracts a fragment of a string using a regular expression.concat
: Concatenates strings listed without separator.splitByChar
,splitByString
,splitByRegexp
,splitByWhitespace
: splits a string into substring separated by a specified character, string, regular expression, or whitespace character respectively.match
: Return whether the string matches a regular expression pattern.replaceOne
,replaceRegexpOne
: Replace the first occurrence of matching a substring or regular expression pattern respectively with a replacement string.trim
: Remove specified characters (or whitespace) from the start or end of a string.upper
,lower
: Converts a string to uppercase or lowercase.
selectextract(elements_chain, '[:|"]attr__class="(.*?)"') as class_name,concat(properties.$os, ' version: ', properties.$os_version),replaceRegexpOne(properties.$current_url, '^/', 'site/') AS modified_current_urlfrom eventswhere event = '$autocapture'
Read more in How to analyze autocapture events with SQL.
JSON
You can access nested data in JSON and objects directly.
select properties.$set.$geoip_country_namefrom events
You can parse JSON with JSONExtractRaw()
to return a value.
SELECTJSONExtractRaw(properties.$set) as set_propertiesFROM eventsWHERE properties.$set IS NOT NULL
Specialized JSONExtract
functions exist for different data types including:
JSONExtractFloat
JSONExtractArrayRaw
JSONExtractString
JSONExtractBool
Array
arrayElement(arr, n)
: Retrieves the element with the index of n from the arrayarr
.arrayJoin(arr)
: Takes a row and generates multiple rows for the number of elements in the array. It copies all the column values, except the column where this function is applied. It replaces the applied column with the corresponding array value.
SELECT flag, count()FROM (SELECT arrayJoin(JSONExtractArrayRaw(assumeNotNull(properties.$active_feature_flags))) as flagFROM eventsWHERE event = '$pageview' and timestamp > '2023-08-01')GROUP BY flagORDER BY count() desc
Read more in How to filter and breakdown arrays with SQL.
Sparkline
A sparkline is a tiny graph contained in one cell of your query result. As an argument, it takes an array of integers.
SELECT sparkline(range(1, 10)) FROM (SELECT 1)
You can use it to visualize queries, such as a 24-hour $pageview
count for different $current_url
values.
SELECTpageview,sparkline(arrayMap(h -> countEqual(groupArray(hour), h), range(0,23))),count() as pageview_countFROM(SELECTproperties.$current_url as pageview,toHour(timestamp) AS hourFROMeventsWHEREtimestamp > now () - interval 1 dayand event = '$pageview') subqueryGROUP BYpageviewORDER BYpageview_count desc
You can also use it for art.
selectsparkline(arrayMap(a -> cos(toSecond(timestamp) + a/4), range(100 + 5 * toSecond(timestamp))))from events
Session replays
You can create a button to view the replay for a session by using the recording_button()
function with the session_id
. For example, to get a list of recent replays, you can use:
SELECTperson.properties.email,min_first_timestamp AS start,recording_button(session_id)FROMraw_session_replay_eventsWHEREmin_first_timestamp >= now() - INTERVAL 1 DAYAND min_first_timestamp <= now()ORDER BYmin_first_timestamp DESCLIMIT 10