Transform data

PostHog provides powerful tools to transform, enrich, and model your data to get it into the exact shape you need.

1. Event transformations

Transform event data as it flows through PostHog using our transformation apps and Hog functions.

Data in: Pipeline transformations (built with Hog)

Apply transformations to incoming event data before it's stored:

  • Data enrichment: Add context like GeoIP location, user agent parsing, or company data
  • Property mapping: Standardize property names and formats across different sources
  • Data validation: Ensure data quality by validating and cleaning incoming events
  • PII scrubbing: Remove or hash sensitive information before storage
  • Event filtering: Drop unwanted events or filter by specific criteria
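
As a minimal sketch of what this can look like, here is a transformation that scrubs a PII property before the event is stored. It assumes the transformation receives the incoming event as `event` and returns the version to store; the `email` property name is illustrative:

```hog
// Minimal sketch: scrub a PII property before storage.
// Assumes the incoming event is available as `event` and the
// returned event is what gets stored; `email` is illustrative.
let returnEvent := event
returnEvent.properties.email := null
return returnEvent
```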

Learn about transformations →

Data out: Destination transformations (with Hog)

Transform data before sending it to external destinations:

  • JSON formatting: Convert PostHog events to match the JSON structure your destination expects
  • Data enrichment: Add context like GeoIP location, user agent parsing, or company data
  • Template syntax: Use {event.properties.value} to build custom payloads
  • Data aggregation: Combine multiple events into summary metrics
  • Custom routing: Send different events to different destinations based on rules
  • Rate limiting: Control the flow of data to external systems
  • Conditional exports: Only send events that meet specific criteria
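
As an illustration, here is a minimal destination sketch that builds a custom JSON payload from the event and posts it with Hog's fetch(). The `webhook_url` input and the `value` property are hypothetical:

```hog
// Sketch: shape the event into the JSON a destination expects, then POST it.
// `inputs.webhook_url` and `event.properties.value` are hypothetical.
let payload := {
    'event': event.event,
    'distinct_id': event.distinct_id,
    'value': event.properties.value
}
fetch(inputs.webhook_url, {
    'method': 'POST',
    'headers': {'Content-Type': 'application/json'},
    'body': jsonStringify(payload)
})
```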

Learn about Hog functions →

2. Warehouse source transformations

Transform and model data from your connected warehouse sources using SQL.

In the SQL editor: Data modeling with saved views

Create reusable data models and transformations using PostHog's SQL editor:

  • Join multiple sources: Combine data from different tables and sources
  • Create calculated fields: Add derived metrics and computed columns
  • Build aggregations: Create summary tables and rollups
  • Save as views: Store complex queries as reusable views
  • Version control: Track changes to your data models over time
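
For instance, a saved view might roll events up into a small per-user summary that other queries can then reference by name (the column choices here are illustrative):

```sql
-- Illustrative saved view: one row per user with activity stats
SELECT
    distinct_id,
    count() AS total_events,
    min(timestamp) AS first_seen,
    max(timestamp) AS last_seen
FROM events
GROUP BY distinct_id
```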

Learn about saved views →

Common transformation use cases

User identity resolution

  • Merge anonymous and identified user sessions
  • Link user IDs across different systems
  • Create a unified customer profile

Revenue attribution

  • Connect product usage to revenue data from Stripe
  • Calculate customer lifetime value
  • Attribute revenue to specific features or campaigns
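
As a sketch, once a Stripe source is connected you can join its tables against PostHog persons in the SQL editor. The table and column names below are illustrative and will depend on how your source is configured:

```sql
-- Illustrative join of Stripe revenue to PostHog persons
-- (stripe_invoice / stripe_customer are placeholder table names)
SELECT
    persons.properties.email AS email,
    sum(stripe_invoice.amount_paid) AS lifetime_revenue
FROM stripe_invoice
JOIN stripe_customer ON stripe_customer.id = stripe_invoice.customer_id
JOIN persons ON persons.properties.email = stripe_customer.email
GROUP BY email
ORDER BY lifetime_revenue DESC
```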

Data standardization

  • Normalize event names across different platforms
  • Standardize timestamp formats and timezones
  • Map custom properties to a consistent schema
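
A pipeline transformation is one place to do this. The sketch below maps platform-specific event names onto a single canonical name; the mapping itself is illustrative:

```hog
// Sketch: normalize event names from different platforms
// (assumes a missing dict key resolves to null)
let aliases := {
    'Signed Up': 'user signed up',
    'sign_up': 'user signed up'
}
let returnEvent := event
if (aliases[event.event] != null) {
    returnEvent.event := aliases[event.event]
}
return returnEvent
```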

Privacy compliance

  • Automatically remove or hash PII
  • Implement data retention policies
  • Apply GDPR/CCPA compliance rules

Best practices

  • Test transformations: Always test with a small sample before applying to all data
  • Document your logic: Add clear descriptions to saved views and transformations
  • Monitor performance: Watch for slow queries and optimize as needed
  • Version control: Keep track of changes to critical transformations
  • Error handling: Build in fallbacks for when transformations fail

FAQ

  • What's the difference between webhooks and batch exports?
    Webhooks send data in real-time as events happen, perfect for alerts and automation. Batch exports send data in scheduled chunks, ideal for data warehouses and large-scale processing.
  • Can I send data to multiple destinations?
    Yes! Create as many webhook destinations as you need. Each can have different filters and transformations. Many teams use separate webhooks for alerts, CRM sync, and marketing automation.
  • How reliable are webhooks?
    PostHog automatically retries failed requests up to 3 times. We monitor destination performance and alert you to issues. For critical data, combine webhooks with batch exports as a backup.
  • Can I customize the webhook payload?
    Absolutely. Use our template syntax to shape data exactly how your destination expects it. For advanced cases, write custom Hog code to transform data however you need.

Hog FAQ

  • How is Hog different from HogQL?
    HogQL is our SQL dialect for querying data. Hog is a full programming language for transforming and routing data in real-time. While HogQL queries your data, Hog processes it as it flows through your pipeline.
  • Can I test Hog code locally?
    Yes! Clone the PostHog repo and use `bin/hog` to run .hog files locally. You can also compile to bytecode with `bin/hoge` for debugging.
  • Why 1-indexed arrays?
    Hog is SQL-compatible, and SQL has always used 1-indexed arrays. While it might feel odd coming from other languages, it ensures consistency with our SQL expressions.
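
A quick illustration:

```hog
// Array indexing in Hog starts at 1, matching SQL
let fruits := ['apple', 'banana', 'cherry']
print(fruits[1]) // prints 'apple'
```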
