Migrate from Plausible to PostHog

Prior to starting a historical data migration, ensure you do the following:

  1. Create a project on our US or EU Cloud.
  2. Sign up for a paid product analytics plan on the billing page (historical imports are free, but this unlocks the necessary features).
  3. Set the historical_migration option to true when capturing events in the migration.

⚠️ Experimental: The following migration guide is not comprehensive and generates data that is less accurate than data captured directly by PostHog. Contributions to improve it are welcome; feel free to open a PR.

Compared to other tools, Plausible makes it easy to export data, but that data is difficult to convert to PostHog's schema. This is because Plausible exports aggregated data rather than the individual events that make up each aggregate, which is what PostHog expects.

This means we can try to convert the data, but the result won't be 100% accurate. For example, Plausible data tells you the number of visitors, pageviews, and visits, but not which user did what event at what time. We need to break down this data to generate individual events that produce an equivalent aggregation.

This guide aims to convert aggregate data into events where possible and capture them in PostHog.
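
To make the idea concrete, here is a minimal sketch of expanding one aggregated row into individual events. The `expand_row` helper and the sample row are hypothetical and exist only for illustration; the `date`, `page`, and `pageviews` column names are assumed to match the ones used by the conversion script in this guide.

```python
def expand_row(row):
    """Turn one aggregated row into a list of event dicts, one per pageview."""
    return [
        {
            "event": "$pageview",
            "date": row["date"],
            "properties": {"$pathname": row["page"]},
        }
        for _ in range(int(row["pageviews"]))
    ]


# A made-up aggregated row: three pageviews of /pricing on one day.
row = {"date": "2024-10-01", "page": "/pricing", "pageviews": "3"}
events = expand_row(row)
print(len(events))  # 3
```

The count survives the conversion, but who viewed the page and at what time of day does not, so those details have to be generated.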

Terminology differences: Although pageviews, custom events, and many properties are the same between Plausible and PostHog, there are two major terminology differences:

  1. Plausible uses visitors to represent the users behind pageviews and custom events. In PostHog, these are called persons.
  2. Plausible uses visits to represent a single use of your product or visit to your site. In PostHog, these are called sessions.

1. Export data from Plausible

To export data from Plausible, go to your site settings and click on the Imports & Exports tab. Once there, click Prepare download and, when it is ready, download the collection of CSV files.
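
Before converting anything, it can help to check that the exported files agree with each other. The sketch below compares daily pageview totals between a visitors file and a pages file. The helper names are hypothetical, the inline CSV text stands in for real export files, and the column names (`date`, `visitors`, `pageviews`, `hostname`, `page`) are assumed from the conversion script in this guide, so your export may contain more columns.

```python
import csv
import io
from collections import defaultdict


def daily_pageviews(csv_file):
    """Sum the pageviews column per date for an open CSV file."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["date"]] += int(row["pageviews"])
    return dict(totals)


def find_mismatches(visitors_file, pages_file):
    """Return {date: (visitors_total, pages_total)} for dates that disagree."""
    visitors_totals = daily_pageviews(visitors_file)
    pages_totals = daily_pageviews(pages_file)
    mismatches = {}
    for date in sorted(set(visitors_totals) | set(pages_totals)):
        pair = (visitors_totals.get(date, 0), pages_totals.get(date, 0))
        if pair[0] != pair[1]:
            mismatches[date] = pair
    return mismatches


# Inline sample data standing in for the real export files.
visitors_csv = io.StringIO(
    "date,visitors,pageviews\n"
    "2024-10-01,2,5\n"
    "2024-10-02,1,1\n"
)
pages_csv = io.StringIO(
    "date,hostname,page,pageviews\n"
    "2024-10-01,example.com,/,3\n"
    "2024-10-01,example.com,/docs,2\n"
    "2024-10-02,example.com,/,2\n"
)
mismatches = find_mismatches(visitors_csv, pages_csv)
print(mismatches)  # {'2024-10-02': (1, 2)}
```

With real files, pass the objects returned by `open(...)` instead of the `io.StringIO` samples. Dates that show up here are ones where the generated events cannot match both files exactly.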

2. Convert Plausible data and capture it in PostHog

🛠️ Work in progress: The following script converts and captures visitors and pageviews, but is not comprehensive. For example, it doesn't include:

  1. Person or event properties like UTMs or channels
  2. Session data (generating $session_id and correct timing for session duration)
  3. Bounce rate
  4. Custom events
  5. Entry or exit pages

Once you have your exported data, you can run the following script to convert it to PostHog's schema and capture pageviews. This script randomly generates a distinct_id for each visitor and randomly assigns pageviews to them. It requires the names and locations of your exported files as well as your Plausible Site Timezone, which you can find in your site settings.

Python
import csv
import random
from datetime import datetime

try:
    from zoneinfo import ZoneInfo  # Python 3.9+
except ImportError:
    from backports.zoneinfo import ZoneInfo  # older Python versions

import numpy as np
from posthog import Posthog
from uuid_extensions import uuid7  # provided by the uuid7 package

# Replace with your export file names and location
visitors_filename = 'plausible-export/imported_visitors_20241001_20241008.csv'
pages_filename = 'plausible-export/imported_pages_20241001_20241008.csv'

# Replace with your site timezone
site_timezone = 'US/Pacific'

posthog = Posthog(
    '<ph_client_api_key>',
    host='https://us.i.posthog.com',
    debug=True,
    historical_migration=True
)


def generate_distinct_id(date):
    # Build a UUIDv7 from the timestamp so IDs sort by time
    ns_since_epoch = int(date.timestamp() * 1e9)
    return str(uuid7(ns=ns_since_epoch))


def convert_date_to_timestamp(date):
    # Plausible exports only the date, so pick a random time within
    # the first hour of that day in the site timezone, then convert to UTC
    date_obj = datetime.strptime(date, '%Y-%m-%d')
    local_datetime = date_obj.replace(
        hour=0,
        minute=random.randint(0, 59),
        second=random.randint(0, 59),
        tzinfo=ZoneInfo(site_timezone)
    )
    utc_datetime = local_datetime.astimezone(ZoneInfo("UTC"))
    return utc_datetime


def distribute_pageviews(visitors, total_pageviews):
    # Ensure each visitor has at least 1 pageview
    min_pageviews = np.ones(visitors, dtype=int)
    remaining_pageviews = total_pageviews - visitors
    if remaining_pageviews > 0:
        # Distribute remaining pageviews
        weights = np.random.dirichlet(np.ones(visitors)) * remaining_pageviews
        additional_pageviews = np.round(weights).astype(int)
        pageviews_per_visitor = min_pageviews + additional_pageviews
    else:
        # If total_pageviews <= visitors, each visitor gets 1 pageview
        pageviews_per_visitor = min_pageviews
    # Adjust for any rounding discrepancies
    difference = total_pageviews - pageviews_per_visitor.sum()
    for _ in range(abs(int(difference))):
        if difference > 0:
            pageviews_per_visitor[np.argmin(pageviews_per_visitor)] += 1
        else:
            pageviews_per_visitor[np.argmax(pageviews_per_visitor)] -= 1
    return pageviews_per_visitor.tolist()


with open(visitors_filename, 'r') as visitors_file:
    visitors_reader = csv.DictReader(visitors_file)
    for visitor_row in visitors_reader:
        date = visitor_row['date']
        # Create persons with distinct IDs
        persons = []
        visitors = int(visitor_row['visitors'])
        pageviews_distribution = distribute_pageviews(visitors, int(visitor_row['pageviews']))
        for i in range(visitors):
            timestamp = convert_date_to_timestamp(date)
            persons.append({
                'distinct_id': generate_distinct_id(timestamp),
                'pageviews': pageviews_distribution[i]
            })
        # Generate pageviews from pages and persons
        with open(pages_filename, 'r') as pages_file:
            pages_reader = csv.DictReader(pages_file)
            counter = 0
            for page_row in pages_reader:
                if page_row['date'] != date:
                    continue
                for i in range(int(page_row['pageviews'])):
                    if not persons:
                        # The pages file lists more pageviews for this date than
                        # the visitors file; stop rather than fail on an empty list
                        break
                    person = random.choice(persons)
                    posthog.capture(
                        distinct_id=person['distinct_id'],
                        event='$pageview',
                        properties={
                            '$current_url': f"https://{page_row['hostname']}{page_row['page']}",
                            '$host': page_row['hostname'],
                            '$pathname': page_row['page']
                        },
                        timestamp=convert_date_to_timestamp(date)
                    )
                    # Remove a person once their pageviews are used up
                    if person['pageviews'] == 1:
                        persons.remove(person)
                    else:
                        person['pageviews'] -= 1
                    counter += 1

Once you've run the script successfully, you should see data in PostHog. You can then continue your PostHog setup by installing the snippet or a library.
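
If the totals in PostHog look off, the step that allocates pageviews to visitors is a good place to start checking. The following is a minimal standard-library sketch of that idea, not the script's NumPy implementation: it gives every visitor one pageview when there are enough to go around, then hands out the rest at random. The `allocate_pageviews` name is hypothetical.

```python
import random


def allocate_pageviews(visitors, total_pageviews):
    """Split total_pageviews across visitors, at least one each when possible."""
    if visitors <= 0:
        return []
    # Start every visitor at one pageview, or zero if there are too few.
    start = 1 if total_pageviews >= visitors else 0
    counts = [start] * visitors
    remaining = total_pageviews - sum(counts)
    # Hand out the leftover pageviews one at a time to random visitors.
    for _ in range(remaining):
        counts[random.randrange(visitors)] += 1
    return counts


counts = allocate_pageviews(visitors=4, total_pageviews=10)
# The allocation must preserve the daily total and leave no visitor empty.
assert sum(counts) == 10
assert min(counts) >= 1
```

Because leftovers are assigned one at a time, there is no rounding step to correct afterwards, at the cost of a different random distribution than the script produces.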
