Data model

This document provides a high-level overview of the various objects and primitives that make up the PostHog data model. For more information on exactly how data is stored in our database, check out the ClickHouse overview.

The two most basic entities in PostHog are the Event and Person objects, which together form the core of our analytics functionality.

Event

An event is the most important object in PostHog, and represents a single action that a user performed at a specific point in time. These events are sent either from one of our libraries or directly via our API.

Each event contains the following base fields within ClickHouse:

| Column | Type | Description |
| --- | --- | --- |
| uuid | UUID | ID of the event |
| team_id | Int64 | Foreign key which links to the team |
| event | VARCHAR | Name of the event |
| distinct_id | VARCHAR | The unique or anonymous id of the user that triggered the event |
| properties | VARCHAR | Any key: value pairs in a dict. $current_url: we use this in a couple of places (like /paths, /events) as the URL the user was visiting at that time |
| elements_* | Various | Columns used for $autocapture to track which DOM element was clicked on |
| timestamp | DateTime64(6, 'UTC') | Defaults to timezone.now at ingestion time if not set |
| created_at | DateTime64(6, 'UTC') | The timestamp for when the event was ingested |
| person_id | UUID | This is the id of the Person that sent this event |
| person_created_at | DateTime64(3) | The timestamp of the earliest event associated with this person |
| person_properties | VARCHAR | A JSON object with all the properties for a user, which can be altered using the $set, $set_once, and $unset arguments |
| group* | Various | Columns used for group analytics |
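
As a rough illustration of the columns above, the following Python sketch models an event as an immutable record. The `Event` class and the `make_event` helper are hypothetical names invented for this example and are not part of PostHog's codebase; the Python types only approximate the ClickHouse column types, and the `elements_*` and `group*` columns are left out for brevity.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    """Illustrative mirror of the base event columns (not PostHog's own code)."""
    uuid: UUID
    team_id: int
    event: str
    distinct_id: str
    properties: str              # JSON-encoded string, like the VARCHAR column
    timestamp: datetime
    created_at: datetime
    person_id: Optional[UUID] = None
    person_created_at: Optional[datetime] = None
    person_properties: str = "{}"


def make_event(team_id, name, distinct_id, properties, timestamp=None):
    """Hypothetical helper: build an Event, defaulting timestamp to 'now'."""
    now = datetime.now(timezone.utc)
    return Event(
        uuid=uuid4(),
        team_id=team_id,
        event=name,
        distinct_id=distinct_id,
        properties=json.dumps(properties),
        timestamp=timestamp or now,  # falls back to ingestion time when unset
        created_at=now,
    )


pageview = make_event(
    team_id=1,
    name="$pageview",
    distinct_id="user-123",
    properties={"$current_url": "https://example.com/pricing"},
)
```

Here `pageview.timestamp` equals `pageview.created_at` because no explicit timestamp was supplied, which mirrors the default described in the table.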

Events are only stored within ClickHouse, and once they have been written they can't be changed. This limitation comes from a trade-off in the design of ClickHouse: inserting data and running queries on large tables is extremely fast, but updating or deleting specific rows is generally not efficient.

Person

In PostHog, a Person is an entity which sends events, and typically represents a 'User' in most implementations.

Each person contains the following base fields within PostgreSQL:

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Sequential ID for the person |
| team_id | integer | Foreign key which links to the team |
| uuid | UUID | UUID of the person within ClickHouse. This is referenced by the person_id field on events |
| created_at | timestamptz | The timestamp of the earliest event associated with this person |
| properties | jsonb | A JSON object with all the properties for a user, which can be altered using the $set, $set_once, and $unset arguments |
| version | bigint | Incremented every time a person is updated. Helps to keep ClickHouse and PostgreSQL in sync. |

Persons are stored in PostgreSQL, but are additionally replicated into ClickHouse for certain queries. For example, when viewing the global list of Persons from the dashboard, this information is retrieved from ClickHouse.
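
One way to picture how the version column can help keep a replicated copy consistent is to treat the row with the highest version as the current state of each person. The Python sketch below shows that idea only; `PersonRow` and `latest_rows` are hypothetical names, and the "highest version wins" rule is an assumption made for illustration rather than a description of PostHog's actual replication code.

```python
from typing import Dict, Iterable, NamedTuple


class PersonRow(NamedTuple):
    """Hypothetical replicated row: one snapshot of a person at a given version."""
    uuid: str
    version: int
    properties: dict


def latest_rows(rows: Iterable[PersonRow]) -> Dict[str, PersonRow]:
    """Keep only the highest-version row per person (assumed resolution rule)."""
    latest: Dict[str, PersonRow] = {}
    for row in rows:
        current = latest.get(row.uuid)
        if current is None or row.version > current.version:
            latest[row.uuid] = row
    return latest


# Rows may arrive out of order; the older snapshot is ignored.
rows = [
    PersonRow("a1", 2, {"plan": "paid"}),
    PersonRow("a1", 1, {"plan": "free"}),
    PersonRow("b2", 1, {"plan": "free"}),
]
current = latest_rows(rows)
```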

Person properties are also stored directly on each event. Their value is determined during ingestion by looking up the Person who sent the event in PostgreSQL, and combining these values with any updates from the event itself.
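
The combining step described above can be sketched in Python as follows. The function name `merge_person_properties` is hypothetical, and the order in which `$set_once`, `$set`, and `$unset` are applied here is an assumption made for the example, not a statement of how the ingestion pipeline is implemented.

```python
from typing import Any, Dict


def merge_person_properties(
    stored: Dict[str, Any], event_properties: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine a person's stored properties with updates carried by one event.

    Assumed order for this sketch: $set_once, then $set, then $unset.
    """
    merged = dict(stored)  # copy so the stored values are not mutated

    # $set_once only fills in keys that have no value yet.
    for key, value in event_properties.get("$set_once", {}).items():
        merged.setdefault(key, value)

    # $set overwrites existing values.
    merged.update(event_properties.get("$set", {}))

    # $unset removes keys entirely; missing keys are ignored.
    for key in event_properties.get("$unset", []):
        merged.pop(key, None)

    return merged


stored = {"plan": "free", "initial_referrer": "google", "legacy_flag": True}
event_properties = {
    "$set": {"plan": "paid"},
    "$set_once": {"initial_referrer": "twitter", "first_seen": "2024-01-01"},
    "$unset": ["legacy_flag"],
}
person_properties = merge_person_properties(stored, event_properties)
```

In this example `plan` is overwritten, `initial_referrer` keeps its original value because it was already set, `first_seen` is added, and `legacy_flag` is removed.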

The properties field on each Person object can be updated at any time. As a result, the PostgreSQL table is the single source of truth for the most up-to-date values of a Person's properties.

For more information on how person properties are added to events, take a look at this step in the ingestion overview.
