How we use PostHog's built-in data warehouse
Jul 30, 2025
On this page
- Problem #1: Understanding growth and churn
- Top 50 error tracking customers by volume
- Organizations regularly capturing exceptions (monthly)
- Error tracking churn for June
- Using views to create reusable queries
- Problem #2: Tracking revenue
- US/EU revenue split
- Revenue lifecycle
- Revenue per product
- Problem #3: Creating quarterly sales and customer success reports
- Salesforce opportunities by quarter
- Salesforce open pipeline by quarter (annual only)
- Sales and CS managed accounts start of quarter ARR
- Using variables
- Problem #4: Creating support reports (SLA, first response time, and more)
- Breached non-escalated tickets in the last 7 days
- Escalated SLA % last 7 days by team
- Time UTC when tickets are created (last 6 months)
- Problem #5: Tracking startup and YC plan growth, costs, and ROI
- Startup plan customer count (not YC)
- Previous startup & YC plan customers revenue per customer
- Startup & YC plan customers cohorts by starting month
- How should you get started?
PostHog's data warehouse is our most powerful feature. It lets you sync data from the tools you already use like Stripe, Salesforce, and Hubspot, query it alongside your existing product data using SQL, and visualize it natively.
We built it because the modern data stack sucks. What starts as a handful of business-critical tools devolves into dozens, many built specifically to capture, clean, format, load, query, and visualize data.
We knew it didn't have to be this way, so we built the data warehouse to get rid of all this complexity and give you a single source of truth for all your business data.


We've created over 1,600 SQL insights and visualizations using our data warehouse so far. It's our second most-used insight type behind trends, which was around long before we had the data warehouse.


To help you get started, we're sharing how teams at PostHog use our built-in data warehouse and custom SQL insights to answer critical business questions, like:
- Which customers churned and how did it impact revenue?
- Which customers are increasing their spend?
- Who are our biggest customers and what products are they using?
- Are we achieving our customer support goals?
- What are the biggest sales opportunities in our pipeline?
We're also sharing the actual SQL queries we use to answer these questions as well as the insights and dashboards we use to visualize the data (we've faked the data for the screenshots though).
Problem #1: Understanding growth and churn
- Sources:
- Postgres for billing data
- PostHog for usage and activation data
- Salesforce for sales context (e.g. ICP scores)
- Tables: postgres_invoice_with_annual_view, postgres_billing_customer, events, salesforce.contact, postgres.posthog_team, postgres.posthog_organization, postgres_billing_usagereport
Engineers make the product decisions at PostHog, but they'd be lost without the context product managers provide.
One of the ways PMs give them this context is through monthly growth reviews, where they explore:
- Who churned, how much revenue they represented, and why (so we can prevent future churn)
- Product-specific activation and retention rates, using our custom definitions
- Metrics for their growth reviews, like revenue expansion and contraction
This analysis is only possible when you combine product and revenue data, so the data warehouse and SQL insights are their weapon of choice.
Most companies would use our Stripe source to import revenue data, but we don't. Since we have multiple products, each with its own usage-based pricing, we need a custom billing service to handle everything. The data from this service goes into Postgres, which then gets synced into PostHog.


Product managers combine this billing data from Postgres with product analytics event and property data collected by PostHog, and some additional supplementary customer info imported from Salesforce.
From here they create specific SQL insights and combine them into a single dashboard like this one for error tracking:


It includes insights like:
Top 50 error tracking customers by volume


SQL query for top 50 error tracking customers by volume
SELECT
    JSONExtractString(properties, 'organization_name') AS organization_name,
    sum(JSONExtractInt(properties, 'exceptions_captured_in_period')) AS exception_events
FROM events
WHERE event = 'organization usage report'
    AND {filters}
    AND JSONExtractInt(properties, 'exceptions_captured_in_period') > 0
GROUP BY organization_name
ORDER BY exception_events DESC
LIMIT 50
Organizations regularly capturing exceptions (monthly)


SQL query for organizations regularly capturing exceptions (monthly)
WITH daily_exceptions AS (
    SELECT
        JSONExtractString(properties, 'organization_id') AS org_id,
        toStartOfMonth(timestamp) AS month,
        toDate(timestamp) AS report_date,
        JSONExtractInt(properties, 'exceptions_captured_in_period') AS exceptions
    FROM events
    WHERE event = 'organization usage report'
        AND JSONExtractInt(properties, 'exceptions_captured_in_period') > 0
        AND {filters} -- Same filters as daily customers query
        AND timestamp >= toDateTime('2025-03-01')
),
monthly_activity AS (
    SELECT
        org_id,
        month,
        count(DISTINCT report_date) AS days_with_exceptions,
        -- Calculate weeks in the month to get a proper average
        dateDiff('day', min(report_date), max(report_date)) / 7.0 AS weeks_active,
        count(DISTINCT report_date) / GREATEST(dateDiff('day', min(report_date), max(report_date)) / 7.0, 1) AS avg_days_per_week
    FROM daily_exceptions
    GROUP BY org_id, month
),
regular_users AS (
    SELECT
        org_id,
        month
    FROM monthly_activity
    WHERE avg_days_per_week > 4
)
SELECT
    month,
    count(DISTINCT org_id) AS orgs_regularly_capturing_exceptions
FROM regular_users
GROUP BY month
ORDER BY month
Error tracking churn for June


SQL query for error tracking churn for June
WITH customers_per_month AS (
    SELECT
        customer_id,
        dateTrunc('month', toDateTime(period_end, 'UTC')) AS month,
        JSONExtractString(data, 'customer_email') AS customer,
        sum(JSONExtractFloat(mrr_per_product, {variables.product})) AS mrr
    FROM postgres_invoice_with_annual_view
    WHERE dateTrunc('month', toDateTime(period_end, 'UTC')) IN ('2025-04-01', '2025-05-01', '2025-06-01')
        AND JSONExtractFloat(mrr_per_product, {variables.product}) > 0
    GROUP BY customer_id, customer, month
),
april_customers AS (
    SELECT customer_id, customer, mrr AS april_mrr
    FROM customers_per_month
    WHERE month = '2025-04-01'
),
may_customers AS (
    SELECT customer_id, customer, mrr AS may_mrr
    FROM customers_per_month
    WHERE month = '2025-05-01'
),
june_customers AS (
    SELECT customer_id, customer, mrr AS june_mrr
    FROM customers_per_month
    WHERE month = '2025-06-01'
),
exception_events AS (
    SELECT
        customer_id, -- Assuming this field exists to join with invoice data
        sum(toInt(org_usage_summary.exceptions)) AS exception_count,
        count(DISTINCT date) AS days_with_exceptions_june
    FROM postgres.prod.billing_usagereport
    WHERE org_usage_summary.exceptions IS NOT NULL
        AND toInt(org_usage_summary.exceptions) > 0 -- Only count days with actual exceptions
        AND date >= '2025-06-01'
        AND date < '2025-07-01' -- June only
    GROUP BY customer_id
)
SELECT
    m.customer AS _customer,
    m.customer_id,
    round(COALESCE(a.april_mrr, 0), 0) AS april_mrr,
    round(m.may_mrr, 0) AS may_mrr,
    round(COALESCE(j.june_mrr, 0), 0) AS june_mrr,
    round(m.may_mrr - COALESCE(j.june_mrr, 0), 0) AS mrr_change,
    COALESCE(e.exception_count, 0) AS exception_events,
    COALESCE(e.days_with_exceptions_june, 0) AS days_with_exceptions_june,
    'CHURNED IN JUNE' AS churn_status
FROM may_customers m
LEFT JOIN april_customers a ON m.customer_id = a.customer_id
LEFT JOIN june_customers j ON m.customer_id = j.customer_id
LEFT JOIN exception_events e ON m.customer_id = e.customer_id
WHERE j.customer_id IS NULL -- Had May MRR but no June MRR = June churn
    AND m.customer_id IS NOT NULL
ORDER BY m.may_mrr DESC, days_with_exceptions_june DESC, exception_events DESC
Using views to create reusable queries
Because product managers are slicing and dicing the data in multiple similar ways, they end up using a lot of views. These are saved queries that can be reused easily, letting them build one query that returns the data in the shape they want, then build multiple other queries on top of it. Some examples include:
- org_icp_scores: Gets ICP scores for customers from Salesforce where they are calculated.
- feature_flags_activation_base: Builds a funnel of product intent → activation → retention, all centered on feature flags and enriched with an ICP score.
- source_of_truth_for_dp_pricing_model: Data on which customers are using pipelines, what volume they are doing, and more to help improve our data pipeline pricing.
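Once saved, a view behaves like any other table in the SQL editor. As a minimal sketch (the column names here are illustrative assumptions, not the actual schema of the view), querying the org_icp_scores view might look like:

```sql
-- Hypothetical columns: organization_id and icp_score are assumptions for illustration
SELECT
    icp.organization_id,
    icp.icp_score
FROM org_icp_scores AS icp
WHERE icp.icp_score >= 8 -- arbitrary threshold for "good fit" accounts
ORDER BY icp.icp_score DESC
LIMIT 100
```

The benefit is that the cleanup and joins live in one place: fix the view once and every downstream insight picks up the change.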
Problem #2: Tracking revenue
- Sources:
- Postgres for billing data
- Salesforce for sales context (e.g. ICP scores)
- Tables: postgres_invoice_with_annual_view, postgres_billing_customer, postgres_billing_usagereport, salesforce_opportunity
As I hinted at above, tracking revenue at PostHog is complicated. We have multiple products with usage-based, tiered pricing, plus discounts and add-ons, so we can't just query Stripe to get our revenue data. Instead, a dedicated billing service generates invoices and reports, and adds them to Postgres.
For multiple years, we used Metabase to query and visualize all this revenue data from Postgres. All of the cleanup and reformatting needed to make it usable was in there too. Moving all this to PostHog was a lot of work (it had its own project Slack channel), but we've done it now and it looks like this:


Rebuilding (and improving) these insights required us to build views that standardize the data, connect Salesforce to better project revenue, create SQL visualizations we were missing, and more. Overall, the project was a success both in terms of making our revenue data more accessible and improving our data warehouse as a product.
Our PostHog revenue dashboard now includes insights like the following (powered by the data warehouse):
US/EU revenue split


SQL query for US/EU revenue split
WITH
eu_customers AS (SELECT id FROM postgres_billing_customer WHERE license_id = 1),
us_customers AS (SELECT id FROM postgres_billing_customer WHERE license_id = 2)
SELECT
    round(sumIf(mrr, customer_id IN (SELECT id FROM us_customers))) * 12 AS us_ARR,
    round(sumIf(mrr, customer_id IN (SELECT id FROM eu_customers))) * 12 AS eu_ARR,
    coalesce(round(sumIf(mrr, customer_id IS NULL OR (customer_id NOT IN (SELECT id FROM eu_customers) AND customer_id NOT IN (SELECT id FROM us_customers)))) * 12, 0) AS other_ARR,
    dateTrunc('month', toDateTime(period_end, 'UTC')) AS period
FROM postgres_invoice_with_annual_view
WHERE toDateTime(period_end, 'UTC') < dateTrunc('month', now()) + interval 2 month
    AND toDateTime(period_end, 'UTC') > dateTrunc('month', now() - interval 12 month)
GROUP BY period
ORDER BY period ASC
Revenue lifecycle


SQL query for revenue lifecycle
WITH date_range AS (
    SELECT toStartOfMonth(now() - INTERVAL number MONTH) AS month
    FROM numbers(35) -- 33 months back + 2 forward
    WHERE month >= toStartOfMonth(now() - INTERVAL 33 MONTH)
        AND month <= toStartOfMonth(now() + INTERVAL 2 MONTH)
),
all_customers AS (
    SELECT DISTINCT id AS customer_id
    FROM iwa_summary_customer_month
    WHERE month < toStartOfMonth(now() + INTERVAL 2 MONTH)
        AND month > toStartOfMonth(now() - INTERVAL 33 MONTH)
),
customer_month_grid AS (
    SELECT ac.customer_id, dr.month
    FROM all_customers ac
    CROSS JOIN date_range dr
),
customers_per_month AS (
    SELECT
        cmg.customer_id,
        cmg.month,
        COALESCE(sum(iscm.total_mrr), 0) AS mrr
    FROM customer_month_grid cmg
    LEFT JOIN iwa_summary_customer_month iscm
        ON cmg.customer_id = iscm.id AND cmg.month = iscm.month
    WHERE cmg.month < toStartOfMonth(now() + INTERVAL 2 MONTH)
        AND cmg.month > toStartOfMonth(now() - INTERVAL 33 MONTH)
    GROUP BY cmg.customer_id, cmg.month
),
churn_calc AS (
    SELECT
        customer_id,
        month AS churn_accounting_month,
        mrr AS curr_mrr,
        lagInFrame(mrr, 1, 0) OVER (PARTITION BY customer_id ORDER BY month) AS prev_mrr
    FROM customers_per_month
),
churned AS (
    SELECT
        churn_accounting_month,
        sum(prev_mrr - curr_mrr) AS churned_mrr,
        count(*) AS churned_customers
    FROM churn_calc
    WHERE prev_mrr > 0
        AND curr_mrr <= 0
    GROUP BY 1
),
customer_status AS (
    SELECT
        customer_id,
        month,
        mrr,
        CASE WHEN MIN(month) OVER (PARTITION BY customer_id) = month THEN TRUE ELSE FALSE END AS first_month_active,
        CASE WHEN lagInFrame(mrr, 1, 0) OVER (PARTITION BY customer_id ORDER BY month) > 0 THEN TRUE ELSE FALSE END AS active_in_previous_month,
        lagInFrame(mrr, 1, 0) OVER (PARTITION BY customer_id ORDER BY month) AS prev_month_mrr
    FROM customers_per_month
),
lifecycle AS (
    SELECT
        month,
        CASE
            WHEN NOT active_in_previous_month THEN 'NEW'
            WHEN active_in_previous_month AND prev_month_mrr = mrr THEN 'FLAT'
            WHEN active_in_previous_month AND prev_month_mrr > mrr THEN 'SHRINKING'
            WHEN active_in_previous_month AND prev_month_mrr < mrr THEN 'GROWING'
            ELSE 'UNCATEGORIZED'
        END AS growth_lifecycle_stage,
        mrr,
        prev_month_mrr
    FROM customer_status
    WHERE mrr > 0
),
growth_base AS (
    SELECT
        month,
        SUM(CASE WHEN growth_lifecycle_stage = 'NEW' THEN mrr ELSE 0 END) AS new,
        SUM(CASE
            WHEN growth_lifecycle_stage = 'FLAT' THEN mrr
            WHEN growth_lifecycle_stage IN ('GROWING', 'SHRINKING') THEN prev_month_mrr
            ELSE 0 END) AS retained,
        SUM(CASE WHEN growth_lifecycle_stage = 'GROWING' THEN mrr - prev_month_mrr ELSE 0 END) AS grown_from_existing,
        SUM(CASE WHEN growth_lifecycle_stage = 'SHRINKING' THEN mrr - prev_month_mrr ELSE 0 END) AS shrunk_from_existing,
        SUM(CASE WHEN growth_lifecycle_stage = 'UNCATEGORIZED' THEN mrr ELSE 0 END) AS uncategorized,
        -- Calculate the base MRR that was "at risk" (from continuing customers)
        SUM(prev_month_mrr) AS at_risk_mrr
    FROM lifecycle
    GROUP BY month
)
SELECT
    growth_base.month AS month,
    growth_base.new,
    growth_base.retained,
    growth_base.grown_from_existing,
    growth_base.shrunk_from_existing,
    growth_base.uncategorized,
    -churned.churned_mrr AS churned,
    churned.churned_customers,
    -- Add period start and end MRR
    lagInFrame(growth_base.new + growth_base.retained + growth_base.grown_from_existing + growth_base.shrunk_from_existing + growth_base.uncategorized) OVER (ORDER BY growth_base.month) AS period_start_mrr,
    -- Period end MRR = current month's total
    growth_base.new + growth_base.retained + growth_base.grown_from_existing + growth_base.shrunk_from_existing + growth_base.uncategorized AS period_end_mrr
FROM growth_base
LEFT JOIN churned ON growth_base.month = churned.churn_accounting_month
WHERE growth_base.month < today() + interval 1 month
ORDER BY month ASC
Revenue per product


SQL query for revenue per product
SELECT
    dateTrunc('month', toDateTime(period_end, 'UTC')) AS period_month,
    tupleElement(kv, 1) AS product,
    SUM(toFloat(tupleElement(kv, 2))) AS revenue
FROM postgres_invoice_with_annual_view
ARRAY JOIN JSONExtractKeysAndValuesRaw(assumeNotNull(mrr_per_product)) AS kv
WHERE toDateTime(period_end, 'UTC') < dateTrunc('month', now() + interval 1 month)
    AND toDateTime(period_end, 'UTC') >= dateTrunc('month', now() - interval 12 month)
GROUP BY period_month, product
ORDER BY period_month, product
LIMIT 500
Problem #3: Creating quarterly sales and customer success reports
- Sources:
- Salesforce for sales context (e.g. ICP scores)
- Postgres for billing data
- Vitally for account ownership details
- Tables: invoice_with_annual_plans_shifted, vitally_all_managed_accounts, salesforce_opportunity, salesforce_user
Our sales and customer success teams do their reporting through PostHog. They pull mostly from Salesforce, but also use billing data in Postgres and account ownership details from Vitally. These are best combined in the Sales and CS Quarter Tracker dashboard, which covers details like:
- Revenue from customers managed by sales and customer success
- Details on managed customers
- The Salesforce opportunity pipeline and how much it's worth
- Won opportunities and how much they're worth
Some examples include:
Salesforce opportunities by quarter


SQL query for Salesforce opportunities by quarter
SELECT
    u.email AS owner,
    opp.name,
    opp.type,
    opp.close_date AS close_date,
    opp.forecast_category_name,
    opp.stage_name,
    opp.amount,
    opp.discount_rate_c AS discount,
    opp.amount_discounted_c AS discounted_amount
FROM salesforce_opportunity opp
JOIN salesforce_user u ON opp.owner_id = u.id
WHERE toDateTime(opp.close_date) >= {variables.start_of_quarter}
    AND toDateTime(opp.close_date) < {variables.start_of_quarter} + INTERVAL 3 MONTH
    AND opp.is_closed = FALSE
    AND opp.self_serve_no_interaction_c = FALSE
    AND opp.self_serve_post_engagement_c = FALSE
    AND u.email LIKE CONCAT('%', {variables.account_owner})
ORDER BY opp.close_date ASC;
Salesforce open pipeline by quarter (annual only)


SQL query for Salesforce open pipeline by quarter (annual only)
SELECT SUM(opp.amount_discounted_c)
FROM salesforce_opportunity opp
JOIN salesforce_user u ON opp.owner_id = u.id
WHERE toDateTime(opp.close_date) >= {variables.start_of_quarter}
    AND toDateTime(opp.close_date) < {variables.start_of_quarter} + INTERVAL 3 MONTH
    AND opp.forecast_category_name IN ('Commit', 'Best Case', 'Pipeline')
    AND opp.self_serve_no_interaction_c = FALSE
    AND opp.self_serve_post_engagement_c = FALSE
    AND opp.type = 'Annual Contract'
    AND u.email LIKE CONCAT('%', {variables.account_owner});
Sales and CS managed accounts start of quarter ARR


SQL query for Sales and CS managed accounts start of quarter ARR
WITH start_of_quarter AS (
    SELECT
        inv.organization_id AS organization_id,
        acc.organization_name,
        acc.account_executive,
        acc.customer_success_manager,
        ROUND(SUM(inv.mrr * 12), 2) AS arr,
        CASE
            WHEN arrayExists(x -> x LIKE '%annual%', groupArray(type)) THEN 'annual'
            ELSE 'monthly'
        END AS type
    FROM invoice_with_annual_plans_shifted inv
    JOIN vitally_all_managed_accounts acc ON inv.organization_id = acc.organization_id
    WHERE period_end < toDateTime({variables.start_of_quarter})
        AND period_end >= toDateTime({variables.start_of_quarter}) - INTERVAL 1 MONTH
    GROUP BY organization_id, organization_name, account_executive, customer_success_manager
    ORDER BY arr DESC
)
SELECT ROUND(SUM(arr), 0)
FROM start_of_quarter
WHERE (account_executive LIKE CONCAT('%', {variables.account_owner})
        OR customer_success_manager LIKE CONCAT('%', {variables.account_owner}))
    AND CASE
        WHEN {variables.segment} = 'All' THEN 1
        WHEN {variables.segment} = 'AE Managed' AND account_executive IS NOT NULL THEN 1
        WHEN {variables.segment} = 'CSM Managed' AND customer_success_manager IS NOT NULL THEN 1
        ELSE 0
    END = 1;
Using variables
The sales dashboard is the best example of the power of variables. Nearly every insight uses variables to set the account owner, quarter, and whether they are managed by sales or customer success.
This makes the insights far more flexible: the same insight can be reused across quarters or filtered down to an individual's performance.
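As a condensed sketch of this pattern (borrowing table and variable names from the opportunity queries above), a single insight parameterized this way works for any quarter and any owner:

```sql
-- {variables.start_of_quarter} and {variables.account_owner} are set on the dashboard
SELECT count(*) AS open_opportunities, SUM(opp.amount) AS pipeline_amount
FROM salesforce_opportunity opp
JOIN salesforce_user u ON opp.owner_id = u.id
WHERE toDateTime(opp.close_date) >= {variables.start_of_quarter}
    AND toDateTime(opp.close_date) < {variables.start_of_quarter} + INTERVAL 3 MONTH
    AND u.email LIKE CONCAT('%', {variables.account_owner})
```

Change the variable values on the dashboard and every insight recalculates for the new quarter or owner.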


Problem #4: Creating support reports (SLA, first response time, and more)
- Sources:
- Zendesk for ticket data
- Tables: zendesk_ticket_events, zendesk_tickets, zendesk_ticket_metric_events, zendesk_groups
We have high standards for our support experience. Where do those standards get judged? The data warehouse, of course.
Similar to the sales and customer success teams, support does reporting through PostHog. It used to all be done in Zendesk, but since we added the Zendesk source, doing it in PostHog has become easier and better.
Unlike other teams, support relies almost exclusively on Zendesk. This means simpler queries and more opportunities to use trend insights on top of SQL insights.
For example, our SLA Achievement Rate insight looks at the zendesk_ticket_metric_events table to find what percentage of tickets were replied to or updated within the SLA goals we set (no SQL required).


Beyond this, they use both insight types to track:
- Service-level agreement (SLA) and first response time goals.
- Performance and breaches on escalated tickets.
- Support load on our product teams.
- Sources of support requests.
This data is combined with CSAT surveys (both scores and responses) to give a complete picture of the support we're providing at PostHog. Abigail Richardson writes up a summary based on this data and shares it with the exec team weekly.
Some examples of insights on the support dashboard include:
Breached non-escalated tickets in the last 7 days


SQL query for Breached non-escalated tickets in the last 7 days
WITH first_escalated AS (
    SELECT
        ticket_id,
        min(created_at) AS timestamp
    FROM zendesk_ticket_events
    WHERE child_events LIKE '%escalated%'
    GROUP BY ticket_id
),
before_escalated_breaches_in_time_range AS (
    SELECT ztme.ticket_id AS ticket_id
    FROM zendesk_ticket_metric_events ztme
    JOIN first_escalated fe ON fe.ticket_id = ztme.ticket_id
    WHERE ztme.type = 'breach'
        AND ztme.metric IN ('reply_time', 'pausable_update_time')
        AND ztme.time <= fe.timestamp
        AND ztme.time >= {filters.dateRange.from}
        AND ztme.time <= {filters.dateRange.to}
    GROUP BY ztme.ticket_id
),
never_escalated AS (
    SELECT id AS ticket_id
    FROM zendesk_tickets
    WHERE tags NOT LIKE '%escalated%'
    GROUP BY id
),
never_escalated_breaches_in_time_range AS (
    SELECT ztme.ticket_id AS ticket_id
    FROM zendesk_ticket_metric_events ztme
    JOIN never_escalated ne ON ne.ticket_id = ztme.ticket_id
    WHERE ztme.type = 'breach'
        AND ztme.metric IN ('reply_time', 'pausable_update_time')
        AND ztme.time >= {filters.dateRange.from}
        AND ztme.time <= {filters.dateRange.to}
    GROUP BY ztme.ticket_id
)
SELECT
    count(DISTINCT zt.id) AS num_breaches,
    count(DISTINCT IF(zt.id IN (SELECT ticket_id FROM never_escalated_breaches_in_time_range), zt.id, NULL)) AS never_escalated_breaches,
    count(DISTINCT IF(zt.id IN (SELECT ticket_id FROM before_escalated_breaches_in_time_range), zt.id, NULL)) AS before_escalated_breaches,
    zg.name
FROM zendesk_tickets zt
JOIN zendesk_groups zg ON zg.id = zt.group_id
WHERE zt.id IN (SELECT ticket_id FROM before_escalated_breaches_in_time_range)
    OR zt.id IN (SELECT ticket_id FROM never_escalated_breaches_in_time_range)
GROUP BY zg.name
ORDER BY num_breaches DESC
Escalated SLA % last 7 days by team


SQL query for Escalated SLA % last 7 days by team
WITH first_escalated AS (
    SELECT
        ticket_id,
        min(created_at) AS timestamp
    FROM zendesk_ticket_events
    WHERE child_events LIKE '%"added_tags":["escalated"%'
    GROUP BY ticket_id
),
escalated_breaches_in_time_range AS (
    SELECT ztme.ticket_id AS ticket_id
    FROM zendesk_ticket_metric_events ztme
    JOIN first_escalated fe ON fe.ticket_id = ztme.ticket_id
    WHERE ztme.type = 'breach'
        AND ztme.metric IN ('reply_time', 'pausable_update_time')
        AND ztme.time >= fe.timestamp
        AND ztme.time >= {filters.dateRange.from}
        AND ztme.time <= {filters.dateRange.to}
    GROUP BY ztme.ticket_id
),
escalated_fulfill_events_in_time_range AS (
    SELECT ztme.ticket_id AS ticket_id
    FROM zendesk_ticket_metric_events ztme
    JOIN first_escalated fe ON fe.ticket_id = ztme.ticket_id
    WHERE ztme.type = 'fulfill'
        AND ztme.metric IN ('reply_time', 'pausable_update_time')
        AND ztme.time >= fe.timestamp
        AND ztme.time >= {filters.dateRange.from}
        AND ztme.time <= {filters.dateRange.to}
    GROUP BY ztme.ticket_id
),
breaches_by_group AS (
    SELECT
        count(DISTINCT zt.id) AS num_breaches,
        zg.name AS group_name
    FROM zendesk_tickets zt
    JOIN zendesk_groups zg ON zg.id = zt.group_id
    WHERE zt.id IN (SELECT ticket_id FROM escalated_breaches_in_time_range)
    GROUP BY zg.name
    ORDER BY num_breaches DESC
),
fulfills_by_group AS (
    SELECT
        count(DISTINCT zt.id) AS num_fulfills,
        zg.name AS group_name
    FROM zendesk_tickets zt
    JOIN zendesk_groups zg ON zg.id = zt.group_id
    WHERE zt.id IN (SELECT ticket_id FROM escalated_fulfill_events_in_time_range)
    GROUP BY zg.name
    ORDER BY num_fulfills DESC
)
SELECT
    zg.name,
    IF(fbg.num_fulfills = 0, 0, (fbg.num_fulfills - bbg.num_breaches) / fbg.num_fulfills) AS sla_attainment
FROM zendesk_groups zg
LEFT OUTER JOIN breaches_by_group bbg ON bbg.group_name = zg.name
LEFT OUTER JOIN fulfills_by_group fbg ON fbg.group_name = zg.name
WHERE (bbg.num_breaches > 0 OR fbg.num_fulfills > 0)
ORDER BY sla_attainment DESC
Time UTC when tickets are created (last 6 months)


SQL query for time UTC when tickets are created (last 6 months)
WITH all_tickets AS (
    SELECT count() AS total_tickets
    FROM zendesk_tickets
    WHERE created_at >= toStartOfDay(now()) - interval 6 month
)
SELECT
    toHour(toTimeZone(created_at, 'UTC')) AS hour_of_day,
    count() AS count_per_hour,
    count_per_hour / (SELECT total_tickets FROM all_tickets) AS percentage_per_hour
FROM zendesk_tickets
WHERE created_at >= toStartOfDay(now()) - interval 6 month
GROUP BY hour_of_day
Problem #5: Tracking startup and YC plan growth, costs, and ROI
- Sources:
- Stripe for startup plan customer metadata and invoices
- Postgres for billing data
- Tables: stripe_customer, stripe_invoice, postgres_billing_usagereport, postgres_billing_customer
Each quarter, hundreds of startups sign up for our startup plan (which gives them $50k in free credits), and dozens more sign up for our upgraded YC plan. To make sure it's all going well, we have a dashboard with details like:
- The number of organizations on the startup plan
- The cost of the startup plan to us, based on credit usage
- The count of startups that "graduate" and pay us, and how much they pay
Unlike with the revenue, churn, and growth data, we actually do use Stripe for this. Stripe customer metadata tells us whether a customer is on the startup plan and which type of startup plan they're on. This gives us the customer count as well as the "graduate" details. We then combine the Stripe data with our billing data in Postgres to calculate what the startup plan costs us based on credit usage.
This is all put together in a Startup and YC Plan dashboard with insights like these:
Startup plan customer count (not YC)


SQL query for startup plan customer count (not YC)
SELECT COUNT(*)
FROM stripe_customer
WHERE metadata.is_current_startup_plan_customer = 'true'
    AND metadata.startup_plan_label != 'YC'
Previous startup & YC plan customers revenue per customer


SQL query for previous startup & YC plan customers revenue per customer
WITH previous_startup_customers AS (
    SELECT
        sc.id AS customer_id,
        sc.name AS name,
        sc.metadata.startup_plan_end_at
    FROM stripe_customer AS sc
    WHERE sc.metadata.is_previous_startup_plan_customer = 'true'
)
SELECT
    psc.customer_id,
    psc.name,
    concat('$', formatReadableQuantity(SUM(si.amount_paid / 100))) AS total_amount_paid,
    concat('$', formatReadableQuantity(SUM(si.total / 100))) AS total_invoice_amount,
    concat('$', formatReadableQuantity(SUM((si.starting_balance - si.ending_balance) / 100) * -1)) AS total_credits,
    COUNT(si.id) AS invoice_count
FROM stripe_invoice AS si
JOIN previous_startup_customers AS psc ON psc.customer_id = si.customer_id
WHERE si.total > 0
    AND si.status <> 'draft'
GROUP BY psc.customer_id, psc.name
HAVING SUM(si.amount_paid) > 0
ORDER BY SUM(si.amount_paid) DESC
Startup & YC plan customers cohorts by starting month


SQL query for startup & YC plan customers cohorts by starting month
SELECT
    COUNT(*) AS customer_count,
    toStartOfMonth(fromUnixTimestamp(toInt(metadata.startup_plan_start_at))) AS month_start,
    SUM(metadata.startup_plan_label = 'YC') AS yc_customer_count,
    SUM(metadata.startup_plan_label != 'YC') AS non_yc_customer_count
FROM stripe_customer
WHERE metadata.is_current_startup_plan_customer = 'true'
    OR metadata.is_previous_startup_plan_customer = 'true'
GROUP BY month_start
ORDER BY month_start ASC
How should you get started?
Feeling inspired? Here's how you can get started:
- Start by linking a source from a tool you already use.
- Go to the SQL editor and write a basic query like select * from events limit 10.
- Layer in more complexity, like filtering, aggregating, and joining data. Use Max AI to help you with this.
- Use our SQL visualizations to see your data in a new way.
- Build a dashboard of related insights and share it with your team.
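To make the third step concrete, here's one way the basic events query might grow (a sketch, assuming you capture the standard $pageview event):

```sql
-- Daily pageview counts for the last week
SELECT
    toStartOfDay(timestamp) AS day,
    count() AS pageviews
FROM events
WHERE event = '$pageview'
    AND timestamp >= now() - interval 7 day
GROUP BY day
ORDER BY day
```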