Data model

Last updated:

|Edit this page

On this page

This provides a high-level overview of the various objects and primitives that make up the PostHog data model.

The two most basic entities in PostHog are the event and person objects. They represent the core of our analytics functionalities.

Further reading: How data is stored in ClickHouse

Event

An event is the most important object in PostHog. It represents a single action that a user performed at a specific point in time. These events are sent either from one of our SDKs or directly via our API.

Each event contains the following base fields within ClickHouse:

ColumnTypeDescription
uuidUUIDID of the event
team_idInt64Foreign key which links to the team
eventVARCHARName of the event
distinct_idVARCHARThe unique or anonymous ID of the user that triggered the event.
propertiesVARCHARAny key: value pairs in a dict.
- $current_url - we use this in a couple of places (like /paths, /events) as the URL the user was visiting at that time.
elements_*VariousColumns used for $autocapture to track which DOM element was clicked on
timestampDateTime64(6, 'UTC')Defaults to timezone.now at ingestion time if not set
created_atDateTime64(6, 'UTC')The timestamp for when the event was ingested
person_idUUIDThis is the id of the Person that sent this event
person_created_atDateTime64(3)The timestamp of the earliest event associated with this person
person_propertiesVARCHARA JSON object with all the properties for a user, which can be altered using the $set, $set_once, and $unset arguments
group*VariousColumns used for group analytics

Events are only stored within ClickHouse, and once they have been written they can't be changed. This limitation comes from a trade-off in the design of ClickHouse: inserting data and running queries on large tables is extremely fast, but updating or deleting specific rows is generally not efficient.

Person

In PostHog, a person is an entity which sends events and typically represents a user in most implementations.

Each person contains the following base fields within PostgreSQL:

ColumnTypeDescription
idintegerSequential ID for the person
team_idintegerForeign key which links to the team
uuidUUIDUUID of the person within ClickHouse. This is referenced by the person_id field on events
created_attimestamptzThe timestamp of the earliest event associated with this person
propertiesjsonbA JSON object with all the properties for a user, which can be altered using the $set, $set_once, and $unset arguments
versionbigintIncremented every time a person is updated. Helps to keep ClickHouse and PostgreSQL in sync.

Persons are stored in PostgreSQL but are additionally replicated into ClickHouse for certain queries. For example, when viewing the global list of persons from the dashboard, this information is retrieved from ClickHouse.

Person properties are also stored directly on each event. Their value is determined during ingestion by looking up the person who sent the event in PostgreSQL and combining these values with any updates from the event itself.

The properties field on each person object can be updated at any time. As a result, the PostgreSQL table represents the one source of truth for the most up-to-date values for the properties of a person.

Further reading: How person properties are added to events

Questions?

Was this page useful?

Next article

Ingestion pipeline

In simple terms, the ingestion pipeline is a collection of services which listen for events as they are sent in, processed, and stored for later analysis. Capture API The Capture API represents the user-facing side of the ingestion pipeline and is exposed as API routes where events can be sent. Before an event reaches the ingestion pipeline, there are a couple of preliminary checks and actions that we perform so that we can return a response immediately to the client: Decompressing and…

Read next article