Identifying users

By default, every MCP event is attributed to the connection's session id (ses_…). That gives you per-session analytics, but you can't yet say "Alice from Acme is calling this tool 200 times a day" — because the SDK doesn't know who Alice is.

The identify option lets you teach it.

How attribution works

For each event, the SDK picks distinct_id in this order:

  1. The id returned by your identify(request, extra) callback (if it returned a UserIdentity).
  2. The MCP session id (ses_…).
  3. The literal string "anonymous".

That means events emitted before identify returns a user are session-scoped; once it returns a user, subsequent events attribute to that user and PostHog's standard identity merge takes over for prior anonymous activity.

Anonymous sessions don't mint person profiles

Events for sessions with no resolved identity are sent with $process_person_profile: false. This keeps anonymous MCP sessions from each creating a person profile (which would inflate your person count and billing). Once identify resolves an identity for a session, person processing stays on and the events create/update that user's profile as normal.

Wiring identify

identify is an async callback that returns either a UserIdentity or null. The SDK calls it on each request and caches the result per session, so a stable identity isn't re-resolved on every tool call.

TypeScript
import { instrument } from "@posthog/mcp"
instrument(server, posthog, {
identify: async (request, extra) => {
const token = extra?.requestInfo?.headers?.authorization
if (!token) return null
const user = await resolveUserFromToken(token)
if (!user) return null
return {
distinctId: user.id, // becomes distinct_id
properties: { // written to $set
name: user.name,
email: user.email,
plan: user.plan,
signupDate: user.signupDate,
},
groups: { // becomes $groups on every event
organization: user.orgId,
},
}
},
})

This is the same shape as posthog-node's identify({ distinctId, properties }) — just returned from a per-request callback instead of called imperatively. The fields map to PostHog as follows:

  • distinctId → the event's distinct_id.
  • properties → written verbatim to $set (so to set a person's name or email, put them here, e.g. properties: { name, email }).
  • groups (optional Record<string, string> of groupType → groupKey) is stamped onto every event as $groups. You never hand-write $groups yourself.

When this returns a non-null identity, the SDK:

  1. Switches the event's distinct_id to distinctId for that session.
  2. Emits a $identify event the first time the identity is observed (or whenever it changes for that session), with $set populated from properties.
  3. Stamps $groups onto subsequent events from the returned groups map.
  4. Caches the identity in a small per-server LRU keyed by session id, so unchanged identities are silently deduped.

Identity merges

A single MCP session typically emits a handful of events before any auth handshake completes — for example, $mcp_initialize may fire before you've resolved the user. Those early events go out anonymous, attributed to the session id.

When identify eventually returns a user, the SDK emits $identify with $anon_distinct_id set to the prior session id. PostHog's identity-merging logic then attributes the anonymous events to the identified user. From that point on, events for that session go out under distinctId directly.

This is the same merge model the Node SDK uses — if you've configured Person profile mode or have other strong opinions on identity in your PostHog project, the same rules apply.

When not to call identify

  • Internal tools without per-user auth. If your MCP server doesn't authenticate end users (e.g. a single-tenant internal server behind a VPN), leave identify unset. Session-scoped attribution is fine.
  • Bots and crawlers. Returning a junk identity for unauthenticated traffic dilutes your person count. Return null for traffic you can't identify — those events stay session-scoped.

Querying by identified user

Once identification is wired up, anything that filters on person.properties.* or groups by distinct_id works as expected:

SQL
SELECT
person.properties.plan AS plan,
properties.$mcp_tool_name AS tool,
count() AS calls
FROM events
WHERE event = '$mcp_tool_call'
AND timestamp > now() - INTERVAL 30 DAY
GROUP BY plan, tool
ORDER BY calls DESC

Community questions

Was this page useful?

Questions about this page? or post a community question.