Linking Postgres as a source

The Postgres connector can link your database tables to PostHog. You can use it for regular scheduled syncs, or enable change data capture (CDC) for near real-time syncs that capture inserts, updates, and deletes.

To link Postgres:

  1. Go to the Data pipeline page and the sources tab in PostHog
  2. Click New source and select Postgres
  3. Enter your database connection details:
    • Connection string: An optional PostgreSQL connection string.
    • Host: The hostname or IP your database server like db.example.com or 123.132.1.100.
    • Port: The port your database server is listening to. The default is 5432.
    • Database: The name of the database you want like analytics_db.
    • User: The username with the necessary permissions to access the database.
    • Password: The password for the user.
    • Schema: (Optional) The schema for your database. Leave blank to browse and sync tables from all schemas, or specify one to limit the selection. The default is public.
  4. If you need to connect through an SSH tunnel, enable and configure it (optional):
    • Tunnel host: The hostname of your SSH server.
    • Tunnel port: The port your SSH server is listening on.
    • Authentication type:
      • For password authentication, enter your SSH username and password.
      • For key-based authentication, enter your SSH username, private key, and optional passphrase.
    • Require TLS through tunnel? - Whether to require TLS encryption through the SSH tunnel. Enabled by default. Disable if your database does not support TLS.
  5. Optional: enable Change data capture (CDC) if your database is prepared for logical replication.
  6. Click Next, select the tables to sync, choose the sync method for each table, then click Link

The data warehouse then starts syncing your Postgres data. You can see details and progress in the sources tab.

Configuration

OptionTypeRequired
Connection string (optional)textNo
HosttextYes
PortnumberYes
DatabasetextYes
UsertextYes
PasswordpasswordYes
SchematextNo
Use SSH tunnel?ssh-tunnelNo

Selecting columns

By default, PostHog syncs all columns from each table. To sync only specific columns:

  1. During source setup, click Columns next to any table in the table picker.
  2. Uncheck columns you don't want to sync.
  3. Primary key columns and the incremental sync field (if configured) are always synced and cannot be disabled.

You can also change column selection after setup:

  1. Go to the sources tab and click your Postgres source.
  2. Click Configure next to any schema.
  3. Under Columns, select which columns to sync.
Adding columns to existing syncs

When you add columns to a schema using incremental, append, or CDC sync, PostHog prompts you to choose:

  • Sync forward only - New columns are populated only for future data. Existing rows show NULL for the new columns.
  • Full resync - Triggers a complete resync to backfill the new columns for all rows.
SSL/TLS required for new sources

PostgreSQL sources created after February 18, 2026 require SSL/TLS encryption for database connections. Ensure your PostgreSQL database supports SSL/TLS connections before linking. Existing sources created before this date are not affected. If you're connecting through an SSH tunnel, you can optionally disable TLS using the Require TLS through tunnel? toggle since the tunnel already encrypts the connection.

Inbound IP addresses

We use a set of IP addresses to access your instance. To ensure this connector works, add these IPs to your inbound security rules:

USEU
44.205.89.553.75.65.221
44.208.188.17318.197.246.42
52.4.194.1223.120.223.253

Note: We currently don't support connections using IPv6, therefore, you may need to enable IPv4 connections to your database. This is required for Supabase, for example.

Sync methods

Postgres tables support full table, incremental, append-only, and CDC sync methods.

Use CDC when you need PostHog to stay close to your source database and you care about deletes. It is especially useful when:

  • Your table has inserts, updates, and deletes.
  • You don't have a reliable updated_at or other incremental cursor field.
  • Full table refreshes are too expensive for a frequently changing table.
  • You want an optional change-history table for audit or lifecycle analysis.

Use incremental or append-only sync instead when your table is append-heavy, has a reliable non-null cursor field, or you don't want to manage logical replication.

CDC is an advanced feature

CDC relies on PostgreSQL logical replication and a replication slot. If PostHog cannot consume the slot fast enough, PostgreSQL retains WAL files and your source database can run out of disk. Only enable CDC if you can monitor and operate replication slots, or if your database administrator is involved.

How PostHog CDC works

PostHog's Postgres CDC implementation uses PostgreSQL logical replication with the built-in pgoutput plugin.

When you create a CDC source:

  1. PostHog creates a replication slot for the source database.
  2. In PostHog-managed mode, PostHog creates the publication and adds selected CDC tables to it.
  3. In self-managed mode, you or your DBA create and update the publication. PostHog verifies that publication exists, then creates and manages the replication slot itself.
  4. Each CDC table first runs an initial snapshot through the normal Postgres import pipeline.
  5. After the snapshot completes, the table switches to streaming mode.
  6. A source-level CDC extraction schedule reads pending WAL changes, writes them to PostHog's data warehouse storage, then advances the replication slot.

PostHog reads changes with pg_logical_slot_peek_binary_changes(...) and confirms progress with pg_replication_slot_advance(...) only after the batch is safely written. CDC extraction is scheduled from the most frequent active CDC table interval. For newly created CDC schemas, PostHog uses a 5 minute interval by default.

The decoder currently handles Postgres logical replication messages for:

  • Inserts
  • Updates
  • Deletes
  • Truncates

Truncates are not replayed as row-by-row deletes. If PostHog sees a TRUNCATE, it marks the table for a new snapshot so the destination can recover to the new source state.

Output table modes

For each CDC table, choose how PostHog exposes the captured data.

Consolidated table only

PostHog keeps one destination table with the latest row state per primary key. CDC batches are deduplicated by primary key before being merged. Delete events are represented with CDC metadata columns such as _ph_deleted and _ph_deleted_at.

Use this when you mainly want a current-state table for analytics.

CDC history table only

PostHog creates a _cdc-suffixed table, for example users_cdc. It stores change history using SCD type 2-style columns:

  • _ph_cdc_op: I, U, or D
  • _ph_cdc_timestamp: the Postgres commit timestamp
  • _ph_deleted: whether the event is a delete
  • _ph_deleted_at: delete timestamp for delete events
  • valid_from: when this version became valid
  • valid_to: when this version stopped being current

The initial snapshot seeds the _cdc table as the starting point. History before the initial snapshot is not available.

Both

PostHog creates both the consolidated current-state table and the _cdc history table. This costs more storage and processing, but gives you both easy current-state analysis and a row-level history of changes from the CDC starting point.

CDC requirements

PostHog includes a Check CDC compatibility step in the source configuration screen. It validates the main prerequisites against your database before the source is created, including PostgreSQL version, logical replication, selected table primary keys, SELECT permissions, replication permissions, replication slot capacity, and publication existence for self-managed CDC.

You should still prepare the database before enabling CDC, especially if logical replication is not already enabled.

PostgreSQL version

Postgres CDC requires PostgreSQL 13 or later. PostHog creates publications with publish_via_partition_root = true, which requires Postgres 13+.

Check your version:

SQL
SHOW server_version_num;

Logical replication

Your database must have logical replication enabled:

SQL
SHOW wal_level;

The value must be logical.

On self-managed Postgres, this usually means setting the following in postgresql.conf, then restarting PostgreSQL:

conf
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10

On AWS RDS or Aurora PostgreSQL, enable logical replication in the DB parameter group:

text
rds.logical_replication = 1

Changing this setting requires a database restart. Other managed Postgres providers have their own equivalent logical replication setting.

Replication slot capacity

PostHog creates one logical replication slot per Postgres CDC source. The source database must have a free replication slot:

SQL
SHOW max_replication_slots;
SELECT COUNT(*) FROM pg_replication_slots;

Don't reuse an existing replication slot. A slot should have exactly one consumer.

Table primary keys

Every table you sync with CDC must have a primary key. PostHog uses the primary key to merge updates, represent deletes, and build the optional history table.

Tables without a primary key won't be offered as CDC-capable tables in the setup flow.

Database user permissions

The PostHog database user needs schema USAGE and table SELECT permissions for every schema and table you want to sync. You don't need to grant access to public unless you're syncing tables from the public schema.

If all selected tables are in one schema, grant access to that schema:

SQL
GRANT USAGE ON SCHEMA "analytics" TO "posthog";
GRANT SELECT ON ALL TABLES IN SCHEMA "analytics" TO "posthog";
ALTER DEFAULT PRIVILEGES IN SCHEMA "analytics" GRANT SELECT ON TABLES TO "posthog";

If you're syncing tables across multiple schemas, repeat the grants for each selected schema, or grant SELECT on only the specific tables you selected:

SQL
GRANT USAGE ON SCHEMA "analytics" TO "posthog";
GRANT USAGE ON SCHEMA "billing" TO "posthog";
GRANT SELECT ON "analytics"."users", "billing"."invoices" TO "posthog";

It also needs permission to create and read logical replication slots.

For self-managed Postgres or most managed providers:

SQL
ALTER USER "posthog" WITH REPLICATION;

For AWS RDS:

SQL
GRANT rds_replication TO "posthog";

On RDS, membership in rds_superuser also satisfies PostHog's prerequisite check.

Slot and publication management

PostHog supports two CDC management modes.

PostHog-managed

PostHog creates and manages both the replication slot and publication. It generates names like:

  • Slot: posthog_<source id prefix>
  • Publication: posthog_pub_<source id prefix>

PostHog-managed mode is the easiest path if the database user has enough privileges. The user needs REPLICATION, SELECT, and ownership of the synced tables or superuser-equivalent permissions so PostHog can add and remove tables from the publication.

When you disable a CDC table, PostHog removes it from the publication. Changes made while the table is disabled are permanently lost from the CDC stream, so re-enabling the table requires a full resync.

When you delete the source, PostHog soft-deletes the warehouse tables and attempts to drop the PostHog-managed replication slot and publication. Cleanup is best-effort, so you should still verify the slot is gone if the source was failing or unreachable.

Self-managed

Use self-managed mode when you don't want to grant table ownership or superuser-equivalent permissions to the PostHog user.

In this mode:

  • You or your DBA create the publication as the table owner.
  • PostHog verifies the publication exists.
  • PostHog still creates and manages the replication slot itself.
  • PostHog still needs REPLICATION, schema USAGE, and table SELECT.

After you select CDC tables, PostHog shows a setup SQL dialog. It looks like this:

SQL
-- Reading a replication slot requires REPLICATION, or rds_replication on RDS.
ALTER USER "posthog" WITH REPLICATION;
-- GRANT rds_replication TO "posthog";
GRANT USAGE ON SCHEMA "analytics" TO "posthog";
GRANT SELECT ON "analytics"."users", "analytics"."orders" TO "posthog";
CREATE PUBLICATION "posthog_pub" FOR TABLE "analytics"."users", "analytics"."orders"
WITH (publish_via_partition_root = true);

Do not manually create the replication slot for normal self-managed setup. PostHog creates the slot with pg_create_logical_replication_slot(..., 'pgoutput').

If you later add a CDC table in self-managed mode, add it to the publication yourself:

SQL
ALTER PUBLICATION "posthog_pub" ADD TABLE "analytics"."new_table";

Avoid putting unrelated tables in the publication. PostHog filters out changes for tables that are not active CDC schemas, but the database still has to decode and send those changes.

WAL lag protection

Replication slots retain WAL until the consumer advances the slot. If PostHog stops consuming changes, retained WAL can grow.

PostHog stores two lag thresholds on CDC sources:

  • Warning threshold: 1024 MB by default
  • Critical threshold: 10240 MB by default

For PostHog-managed sources, automatic slot protection is enabled by default. If lag exceeds the critical threshold, PostHog attempts to drop the replication slot and publication, then marks the source as errored. This protects your source database from unbounded WAL growth, but it means you must recreate or resync the CDC source after fixing the issue.

For self-managed sources, PostHog does not drop the slot or publication. It marks the source as errored at the critical threshold, but you remain responsible for cleanup.

You can monitor a Postgres slot yourself:

SQL
SELECT
slot_name,
active,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE slot_name = 'posthog_your_slot';

Operational risks and limitations

CDC is powerful, but it changes how your production database retains WAL. Keep these constraints in mind:

  • Enabling logical replication often requires a database restart.
  • A stuck replication slot can fill your database disk with retained WAL.
  • PostHog CDC only supports tables with primary keys.
  • CDC captures row changes, not a complete history before the initial snapshot.
  • PostgreSQL logical replication doesn't send DDL as normal row changes. Schema changes may require a full resync, especially destructive changes like dropping or changing column types.
  • Large TOASTed values can be omitted from some update messages when they are unchanged. Validate CDC output carefully if your tables include very large text, jsonb, or bytea columns.
  • Disabling a table removes it from the publication in PostHog-managed mode. Changes made while disabled are not recoverable from the slot.
  • If a publication does not include a selected table, PostHog won't receive changes for that table.
  • If a table is truncated, PostHog triggers a new snapshot instead of replaying the truncate as individual deletes.

For production databases, we recommend enabling CDC on one or two representative tables first so your team can validate database permissions, WAL settings, monitoring, and internal rollout steps before adding more tables.

Troubleshooting

wal_level must be set to logical

Enable logical replication on the database and restart PostgreSQL. On RDS or Aurora PostgreSQL, set rds.logical_replication = 1 in the parameter group and restart the instance.

No replication slot capacity available

Increase max_replication_slots or remove unused slots. Check existing slots with:

SQL
SELECT slot_name, plugin, active, database
FROM pg_replication_slots;

Publication does not exist

This happens in self-managed mode when PostHog can't find the publication name you entered. Create the publication as the table owner, then click Verify & create source again.

Source is errored because of WAL lag

Check the slot lag query above. If PostHog-managed slot protection dropped the slot, create a new CDC source or run a full resync after fixing the underlying issue. If you're in self-managed mode, your DBA must decide whether to keep, advance, or drop the slot.

Looking for an example of the Postgres source? Check out our tutorial where we connect and query Supabase data.

Community questions

Was this page useful?

Questions about this page? or post a community question.