Communication templates for incidents
When things go wrong, our priority is simple: keep customers informed, quickly and clearly.
This section covers how we communicate during service disruptions, from small hiccups to major outages. We aim to be transparent, human, and proactive — sharing what we know (and what we don't) in plain English.
For the engineering incident response process, see Handling an incident.
PostHog customers rely on us to power their products, so we provide honest, timely updates through the right channels — usually Slack or email, and occasionally SMS for high‑touch accounts.
Core principles
1. Transparency > Perfection
Share what we know, when we know it, clearly and without “status-speak.”
2. Human-centric
Messages come from people, not “The PostHog Team.” Show empathy and ownership (“I know this might interrupt your work; here’s what we’re doing.”)
3. Consistency
Use a consistent structure and timing so customers know what to expect.
4. Proactive by default
Reach out before customers ask, even if it’s just to say, “We’re aware and investigating.”
Severity levels
| Level | Description | Examples | Channels | Cadence |
|---|---|---|---|---|
| SEV 1 – Critical | Major outage or data loss; widespread impact. | API unavailable, ingestion halted, login failures. | Slack → Email → (DM or SMS if needed) | Every 30–60 min; postmortem within 48 hrs |
| SEV 2 – Major | Partial degradation or downtime; workaround available. | Replay or query delays >30 min, flag evaluation slow. | Slack or Email | Every 1–2 hrs |
| SEV 3 – Minor | Limited impact or slow recovery. | Billing sync delays, isolated org issues. | Slack | Start and close |
| SEV 4 – Informational / Planned | Maintenance or recovered incidents. | DB upgrade, scaling events. | Email or Slack broadcast | Before + after window |
Templates
Critical
Subject: PostHog Outage – We’re investigating
Hey [Name/Team],
We’re investigating a major outage affecting [feature]. You may see [symptom]. Engineers are on it — updates every 30 minutes until resolved.
We know this may disrupt your work — thanks for your patience while we get things back online.
— [Your Name], PostHog
Follow-Up (Resolution): Good news — the issue is resolved. Root cause: [summary]. Duration: [start–end]. Impact: [brief effect].
We’re monitoring and will share a full write-up within 48 hours.
Major
Subject: Performance issues in [Feature]
Hey [Name],
We’re seeing performance issues in [component]. You might notice [impact]. We’re mitigating and will update within the hour.
Thanks for your patience! — [Your Name], PostHog
Minor
Subject: Slower performance in [area]
Hey [Name],
FYI — [area] is running slower than usual. This shouldn't block you, but we're monitoring closely. I'll update once it's stable.
Planned maintenance
Subject: Maintenance – [Service/Region]
Heads up — maintenance on [system] from [time window]. No downtime expected, but queries or replays may be briefly delayed. We’ll confirm once complete.
Tone and voice
| Principle | Example | Avoid |
|---|---|---|
| Direct | “Event ingestion is paused.” | “We are experiencing an issue affecting a subset of users.” |
| Empathetic | “I know this blocks work; it’s our top priority.” | “We apologize for the inconvenience.” |
| Plain English | “Dashboards might not update.” | “You may experience degraded query latency.” |
| Ownership | “We identified a config issue on our side.” | “A third-party dependency caused an issue.” |
Coordination within GTM
Engineering manages detection and resolution (see the engineering incident handbook). GTM ensures clear, consistent customer updates without duplication or coverage gaps.
Goals
- Keep a single source of truth for comms, managed by the CMOC.
- Maintain global coverage so customers always hear from us.
- Enable fast, clear handoffs between teams.
Roles & responsibilities
| Role | Responsibility |
|---|---|
| Communications Manager On-Call (CMOC) | Activated for any incident requiring GTM notification. Drafts all comms using handbook templates. Coordinates with engineering for context and keeps a central log of who’s been notified. Manages regional handoffs if incidents span time zones or owners are offline. |
| AM/AE/CSM | Sends comms to their accounts using CMOC drafts. If the owner is offline (PTO, off-hours, or time zone differences), the CMOC assigns a regional backup. |
| Regional Backup (Americas / EMEA / APAC) | Covers accounts when owners are offline. Takes handoff from CMOC, sends comms, and ensures follow-up continuity. |
| Engineering Incident Lead | Owns technical response and provides updates to CMOC for accurate messaging. |
Workflow
- Incident declared (Engineering).
- CMOC activated, notified of impact.
- CMOC drafts the initial message and shares it with the Account Owner.
- AM/AE/CSM sends to accounts; backup sends if primary is offline.
- Updates drafted by CMOC (every 30–60 min for SEV 1, every 1–2 hrs for SEV 2).
- Regional handoffs coordinated by CMOC.
- Resolution: CMOC drafts closure; AM/AE/CSM (or backup) sends.
- Post-incident: CMOC archives thread; GTM logs feedback and follow-ups.
- Postmortem: Engineering writes technical summary; GTM adds comms learnings.
Example Slack workflow (Critical)
- Incident created: #inc-2025-11-05-posthog-feature-flags-error.
- SRE posts summary; CMOC coordinates comms.
- CMOC drafts the message and shares it with the Account Owner (the person responsible for the affected accounts).
- Account Owner sends the message to their customers. Example outbound: “We’re investigating an outage affecting event ingestion. Updates every 30 minutes.”
- During: “Root cause identified (Redis queue saturation). Fix in progress.”
- Resolution: “Resolved at 11:42 UTC. Write-up soon.”