Feature ownership
  • AI gateway

Goals

Q2 2026 objectives

Goal 1: Rebuild the gateway

Description: Replace today's single-process gateway with a horizontally scalable service that internal PostHog teams can rely on for AI traffic.

What we will ship:

  • Horizontally scalable service - capacity grows by adding replicas
  • True-proxy per provider shape - native OpenAI and Anthropic shapes, no cross-shape translation
  • Core modules - auth, rate limiting, routing, cost, quota, dispatch, and the post-call recording path into AI Observability
  • Load characterization - so we can size the fleet from observed throughput

Goal 2: Spending controls

Description: Make sure a runaway caller can't quietly run up provider cost.

What we will ship:

  • Pre-call spending guardrails - decide whether a request is admitted before it reaches the provider
  • Runaway-cost protection - cap how much damage a misbehaving caller can do
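
A pre-call guardrail of this kind can be sketched roughly as below. This is a minimal illustration, not the gateway's design: `CallerBudget`, `admit`, `record`, and the use of whole cents are all hypothetical names and choices, and the cost estimate is assumed to be available before dispatch.

```python
from dataclasses import dataclass


@dataclass
class CallerBudget:
    """Hypothetical per-caller budget, tracked in whole cents."""
    limit_cents: int
    spent_cents: int = 0


def admit(budget: CallerBudget, estimated_cents: int) -> bool:
    """Decide before the provider call whether the request fits the budget.

    A rejected request never reaches the provider, so it costs nothing.
    """
    return budget.spent_cents + estimated_cents <= budget.limit_cents


def record(budget: CallerBudget, actual_cents: int) -> None:
    """Add the real cost to the running total once the call has finished."""
    budget.spent_cents += actual_cents
```

Checking against the estimate before the call and recording the actual cost afterwards is what bounds the damage: a misbehaving caller can overshoot by at most one request's estimation error.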

Goal 3: Provider coverage and failover

Description: Cover the providers PostHog's internal AI features need, with automatic failover when a primary has an incident.

What we will ship:

  • OpenAI and Anthropic at launch - chat, responses, and messages routes
  • Failover to a hosted equivalent - Anthropic to Bedrock, OpenAI to Azure
  • Best-effort model resolution - so clients we don't control work without custom config
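
The failover pairs above could be wired up along these lines. This is only a sketch under stated assumptions: `ProviderUnavailable`, `dispatch`, and the `send` callable are hypothetical, and a real gateway would also need to decide which errors count as an incident.

```python
from typing import Callable

# Failover pairs named in the list above.
FALLBACKS = {"anthropic": "bedrock", "openai": "azure"}


class ProviderUnavailable(Exception):
    """Hypothetical error raised when a provider is having an incident."""


def dispatch(provider: str, request: dict,
             send: Callable[[str, dict], dict]) -> dict:
    """Try the primary provider, then its hosted equivalent if one exists."""
    try:
        return send(provider, request)
    except ProviderUnavailable:
        fallback = FALLBACKS.get(provider)
        if fallback is None:
            raise  # no hosted equivalent configured, surface the error
        return send(fallback, request)
```

Because the request is forwarded unchanged, this only works when the fallback accepts the same request shape as the primary, which fits the "no cross-shape translation" goal.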

Goal 4: Land gateway calls in AI Observability

Description: Every gateway call lands in the existing $ai_generation pipeline so traces, evals, and cost analytics work on gateway traffic.

What we will ship:

  • No schema change - same event surface, dashboards keep working
  • Normalized cached vs billable input split - one shape across providers
  • Single cost-calculation path - converge the two we have today
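
To illustrate why the split needs normalizing: OpenAI's chat completions usage reports `prompt_tokens` including cached tokens, while Anthropic's messages usage reports `input_tokens` excluding cache reads. The sketch below maps both onto one shape. `InputTokens` and the two helper functions are hypothetical, "billable" is assumed to mean non-cached input tokens, and counting Anthropic cache-write tokens as billable is a simplifying assumption rather than the team's decision.

```python
from dataclasses import dataclass


@dataclass
class InputTokens:
    cached: int    # input tokens served from the provider's prompt cache
    billable: int  # remaining input tokens (assumed meaning of "billable")


def from_openai(usage: dict) -> InputTokens:
    # OpenAI's prompt_tokens includes cached tokens, so subtract them.
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens") or 0
    return InputTokens(cached=cached, billable=usage["prompt_tokens"] - cached)


def from_anthropic(usage: dict) -> InputTokens:
    # Anthropic's input_tokens already excludes cache reads.
    # Assumption: cache-write tokens are counted as billable here.
    cached = usage.get("cache_read_input_tokens") or 0
    writes = usage.get("cache_creation_input_tokens") or 0
    return InputTokens(cached=cached, billable=usage["input_tokens"] + writes)
```

With both providers reduced to the same two numbers, a single cost-calculation path can price them without provider-specific branches.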

Goal 5: Take ownership of the AI Observability Playground

Description: Pick up the AI Observability Playground from the AI Observability Team and keep it moving as gateway capabilities grow.

What we will ship:

  • Clean handover - clear ownership, on-call, and triage path for Playground bugs and requests
  • Gateway-aligned model coverage - Playground keeps pace with the providers and models the gateway supports

Goal 6: Migration off the current gateway

Description: Cut traffic over from the existing gateway gradually, starting with internal callers, so we shake out failure modes before more services depend on it.

What we will ship:

  • Per-team feature flag at ingress - one-flag rollback per team
  • Phased rollout - start with PostHog AI and PostHog Code, then bring more PostHog services on
  • Production readiness - observability, admin surfaces, and graceful shutdown for in-flight streams
  • Retire the original gateway - once the new one carries all traffic
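
The per-team flag at ingress amounts to a routing decision like the one sketched here. The function name, the `"new"` and `"legacy"` labels, and the plain set of enabled teams are illustrative assumptions; in practice the set would come from whatever feature flag system the gateway uses.

```python
def pick_gateway(team: str, enabled_teams: set[str]) -> str:
    """Route a team to the new gateway only if its flag is on."""
    return "new" if team in enabled_teams else "legacy"


# Rolling a team back is a single flag change: drop it from the set.
enabled = {"posthog-ai", "posthog-code"}
enabled.discard("posthog-code")
```

After the `discard`, `pick_gateway("posthog-code", enabled)` returns `"legacy"` while `posthog-ai` stays on the new gateway, which is the one-flag rollback the list describes.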

Handbook

The AI gateway is PostHog's internal routing layer for calling LLM providers. It gives PostHog's AI features a single endpoint to call, with shared handling for things like provider routing, caching, fallbacks, and cost attribution. The team owns the gateway service, its provider integrations, the AI Observability Playground, and the seams where the gateway connects into the rest of PostHog, particularly the AI Observability Team.

Working with other teams

The gateway sits underneath PostHog's AI features, so most internal teams shipping AI functionality will end up calling it instead of providers directly. We work closely with the PostHog AI Team, the PostHog Code Team, the AI Observability Team, and any small team adding AI features to their product.

Come to us if you:

  • Are calling an LLM from inside PostHog and want to go through the gateway instead of a provider SDK
  • Need a new provider, model, or capability added to the gateway
  • Are debugging gateway latency, errors, or cost attribution
  • Want gateway-level features like caching, fallbacks, structured outputs, or rate-limit handling
