User properties

Users can have properties, just as events do. There are two methods for writing them: set and set_once. The actual function calls look slightly different depending on the integration library, but internally they all work the same way.

When to use set and set_once?

Use set when the value should always be the newest one. For example, if a user updates their email, we always want to use the latest value.

Use set_once when we want to keep the first value and never update it afterwards, e.g. the initial URL, which is the first URL we ever saw this user on.

Sometimes we might want to mix set and set_once. Imagine we have some heuristics that help us determine a value, but users can also specify it themselves. In that case it can be beneficial to use set_once for the heuristically computed value and set for the user-specified value. Note that set_once never overrides an existing value, so this is not a good strategy if the heuristics run on a continuous basis, because their later results would be discarded.
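As a rough illustration of mixing the two methods, here is a minimal sketch that models a person's properties as a plain dictionary. The helpers set_props and set_once_props and the country property are hypothetical names made up for this example; they are not part of any PostHog library, and the sketch ignores timestamps.

```python
def set_props(person: dict, props: dict) -> None:
    """Model of set: always write, replacing any existing value."""
    person.update(props)


def set_once_props(person: dict, props: dict) -> None:
    """Model of set_once: write only properties that do not exist yet."""
    for key, value in props.items():
        person.setdefault(key, value)


person = {}
set_once_props(person, {"country": "IT"})  # first heuristic guess is stored
set_once_props(person, {"country": "US"})  # later heuristic guess is ignored
set_props(person, {"country": "FR"})       # user-specified value overrides
set_once_props(person, {"country": "DE"})  # heuristic can no longer change it
print(person)
```

In this model the second and fourth calls have no effect, which is why set_once is a poor fit for heuristics that keep producing new guesses.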

How do values get overridden?

The section below explains how overrides will work in the future; currently, overrides ignore timestamps. At the moment, set always overrides, set_once only writes when the value doesn't exist, and alias uses the value from person 1 when property names conflict.
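The current alias behavior on a property name conflict can be pictured with a small sketch. It assumes each person's properties are a plain dictionary, and merge_on_alias is a hypothetical helper for illustration, not a PostHog function.

```python
def merge_on_alias(person1: dict, person2: dict) -> dict:
    """Combine two property sets; on a name conflict person 1's value is kept."""
    merged = dict(person2)   # start from person 2's properties
    merged.update(person1)   # person 1 wins wherever both define a property
    return merged


merged = merge_on_alias(
    {"location": "Rome"},
    {"location": "New York", "plan": "free"},
)
print(merged)
```

Here location stays Rome because both persons define it, while plan is carried over from person 2 because only person 2 has it.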

We rely on timestamps to know the order in which events happened. We don't use ingestion order because events can arrive at PostHog in a different order from the one in which they were performed, e.g. when the network was unreliable and some packets were resent later, or when we're importing older events from a different PostHog instance.

This table explains when values get overridden (TS means timestamp):

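| # | value exists | method | previous method | previous TS is ___ call TS | write/override |
|---|---|---|---|---|---|
| 1 | no | N/A | N/A | N/A | yes |
| 2 | yes | set | set | before | yes |
| 3 | yes | set | set_once | before | yes |
| 4 | yes | set | set | equal | no |
| 5 | yes | set | set_once | equal | yes |
| 6 | yes | set | set | after | no |
| 7 | yes | set | set_once | after | yes |
| 8 | yes | set_once | set | before | no |
| 9 | yes | set_once | set_once | before | no |
| 10 | yes | set_once | set | equal | no |
| 11 | yes | set_once | set_once | equal | no |
| 12 | yes | set_once | set | after | no |
| 13 | yes | set_once | set_once | after | yes |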
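The table above can be condensed into a short decision function. This is only a sketch of the planned timestamp-aware rules as the table states them, not PostHog's implementation; should_write, ts_relation, and the use of integer timestamps are assumptions made for illustration.

```python
from typing import Optional


def ts_relation(previous_ts: int, call_ts: int) -> str:
    """Return how the previous timestamp relates to the call timestamp."""
    if previous_ts < call_ts:
        return "before"
    if previous_ts > call_ts:
        return "after"
    return "equal"


def should_write(
    method: str,
    previous_method: Optional[str] = None,
    previous_ts: Optional[int] = None,
    call_ts: Optional[int] = None,
) -> bool:
    """Decide whether a call writes the value; previous_method=None means no value exists."""
    if previous_method is None:
        return True  # row 1: nothing stored yet, always write
    relation = ts_relation(previous_ts, call_ts)
    if method == "set":
        # set always replaces a set_once value (rows 3, 5, 7);
        # it replaces a set value only if that value is older (row 2).
        return previous_method == "set_once" or relation == "before"
    if method == "set_once":
        # set_once never replaces a set value (rows 8, 10, 12);
        # it replaces a set_once value only if this call is earlier (row 13).
        return previous_method == "set_once" and relation == "after"
    raise ValueError(f"unknown method: {method}")
```

For example, should_write("set", "set", previous_ts=1, call_ts=2) corresponds to row 2 and returns True, while should_write("set_once", "set", previous_ts=2, call_ts=1) corresponds to row 12 and returns False.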

When the identify or alias methods are used, all user properties are merged individually based on the same logic.

P1 refers to the property of person (or distinct ID) 1, and P2 to the property of person (or distinct ID) 2. Notice how "yes" from the table above turned into "2" and "no" into "1", except for the ties in rows 4 and 11.

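| # | P1 exists | P2 exists | P2 method | P1 method | P1 TS is ___ P2 TS | value used |
|---|---|---|---|---|---|---|
| 0 | yes | no | N/A | N/A | N/A | 1 |
| 1 | no | yes | N/A | N/A | N/A | 2 |
| 2 | yes | yes | set | set | before | 2 |
| 3 | yes | yes | set | set_once | before | 2 |
| 4 | yes | yes | set | set | equal | 1 or 2 |
| 5 | yes | yes | set | set_once | equal | 2 |
| 6 | yes | yes | set | set | after | 1 |
| 7 | yes | yes | set | set_once | after | 2 |
| 8 | yes | yes | set_once | set | before | 1 |
| 9 | yes | yes | set_once | set_once | before | 1 |
| 10 | yes | yes | set_once | set | equal | 1 |
| 11 | yes | yes | set_once | set_once | equal | 1 or 2 |
| 12 | yes | yes | set_once | set | after | 1 |
| 13 | yes | yes | set_once | set_once | after | 2 |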

Note: With equal timestamps and the current call (set/set_once) equal to the previous one (rows 4 and 11), we could end up using either person's value. We will optimize for efficiency here to minimize database writes (similar to why set and set_once don't override on equal timestamps). The following examples elaborate:

For identify, this series of calls means we end up with location being either Rome or New York.

posthog.set('Alice', {'location': 'Rome'}, timestamp=1)
posthog.identify('Alice', {'$set': {'location': 'New York'}}, timestamp=1)

For alias, this series of calls also means we end up with location being either Rome or New York.

posthog.set('Alice 1', {'location': 'Rome'}, timestamp=1)
posthog.set('Alice 2', {'location': 'New York'}, timestamp=1)
posthog.alias('Alice 1', 'Alice 2', timestamp=1000)
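The merge table can also be read symmetrically: a set value beats a set_once value, two set values resolve to the later timestamp, and two set_once values resolve to the earlier one. Below is a minimal sketch of that reading. Prop, pick, and merge_persons are hypothetical names, and keeping person 1's value on a tie is an assumption of this sketch, since the table allows either value there.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Prop:
    value: Any
    method: str  # "set" or "set_once"
    ts: int      # timestamp of the call that wrote the value


def pick(p1: Optional[Prop], p2: Optional[Prop]) -> Optional[Prop]:
    """Choose which person's property survives a merge."""
    if p1 is None or p2 is None:
        return p1 if p2 is None else p2  # rows 0 and 1: only one side exists
    if p1.method != p2.method:
        return p1 if p1.method == "set" else p2  # set beats set_once
    if p1.ts == p2.ts:
        return p1  # rows 4 and 11: either is allowed; keeping P1 is an assumption
    later, earlier = (p1, p2) if p1.ts > p2.ts else (p2, p1)
    # Two set values keep the newest; two set_once values keep the oldest.
    return later if p1.method == "set" else earlier


def merge_persons(person1: Dict[str, Prop], person2: Dict[str, Prop]) -> Dict[str, Prop]:
    """Merge every property individually, as on identify or alias."""
    keys = person1.keys() | person2.keys()
    return {key: pick(person1.get(key), person2.get(key)) for key in keys}


alice1 = {"location": Prop("Rome", "set", 1)}
alice2 = {"location": Prop("New York", "set", 1)}
merged = merge_persons(alice1, alice2)
print(merged["location"].value)
```

With the tie-breaking assumption above, the alias example resolves to Rome; an implementation that kept New York would be equally consistent with the table.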