Software Engineer — Warehouse Sources
Data Stack Team
Location
Remote
Timezone(s)
GMT +2 to GMT -8
About PostHog
We're shipping every product that companies need to run their business from their first day, to the day they IPO, and beyond. The operating system for folks who build software.
We started with open-source product analytics, launched out of Y Combinator's W20 cohort. We've since shipped more than a dozen products, including:
A built-in data warehouse, so users can query product and customer data together using custom SQL insights.
A customer data platform, so they can send their data wherever they need with ease.
PostHog AI, an AI-powered analyst that answers product questions, helps users find useful session recordings, and writes custom SQL queries.
Next on the roadmap are CRM, Workflow, revenue analytics, and support products. When we say every product that companies need to run their business, we really mean it!
We are:
Product-led. More than 100,000 companies have installed PostHog, mostly driven by word-of-mouth. We have intensely strong product-market fit.
Default alive. Revenue is growing 10% MoM on average, and we're very efficient. We raise money to push ambition and grow faster, not to keep the lights on.
Well-funded. We've raised more than $100m from some of the world's top investors. We're set up for a long, ambitious journey.
We're focused on building an awesome product for end users, hiring exceptional teammates, shipping fast, and being as weird as possible.
Things we care about
Transparency: Everyone can read about our roadmap, how we pay (or even let go of) people, our strategy, and how we work, in our public company handbook. Internally, we share revenue, notes and slides from board meetings, and fundraising plans, so everyone has the context they need to make good decisions.
Autonomy: We don’t tell anyone what to do. Everyone chooses what to work on next based on what's going to have the biggest impact on our customers, and what they find interesting and motivating to work on. Engineers lead product teams and make product decisions. Teams are flexible and easy to change when needed.
Shipping fast: Why not now? We want to build a lot of products; we can't do that shipping at a normal pace. We've built the company around small teams – autonomous, highly-efficient groups of cracked engineers who can outship much larger companies because they own their products end-to-end.
Time for building: Nothing gets shipped in a meeting. We're a natively remote company. We default to async communication – PRs > Issues > Slack. Tuesdays and Thursdays are meeting-free days, and we prioritize heads-down building time over perfect coordination. This will be the most productive job you've ever had.
Ambition: We want to solve big problems. We strongly believe that aiming for the best possible upside, and sometimes missing, is better than never trying. We're optimistic about what's possible and our ability to get there.
Being weird: Weird means redesigning an already world-class website for the 5th time. It means shipping literally every product that relates to customer data. It means building an objectively unnecessary developer toy with dubious shareholder value. Doing weird stuff is a competitive advantage. And it's fun.
What you'll be doing
As a Software Engineer on the Warehouse Sources team, you’ll build and iterate on our data import system, a customer-facing product equivalent to Airbyte and Fivetran.
Our import workers are built in Python. We pull in data from APIs and databases in batches, process and transform it in memory using Apache Arrow, and move it into object storage in open table formats.
You’ll build and maintain our source library, and find creative ways to keep it manageable at scale. You’ll revamp our schema management strategy and build resilient systems (e.g., logging, observability, testing).
You’ll debug stateful data workflows by digging into k8s pod metrics, and schedule jobs using Temporal.io. As you can see, there’s a huge breadth of challenges and opportunities to tackle, and nothing is off-limits.
The PostHog Data Stack is both a core product for our users and a foundational platform for our internal teams. Data is a first-class product at PostHog, not an afterthought.
You will have the chance to push the boundaries of what our Warehouse Sources team can do while ensuring we remain stable and production-ready.
You’ll fit right in if:
You’re a builder. You bring strong skills in building resilient systems, with experience in Kubernetes, Docker, and S3 at scale. We build in Python. Experience with asyncio and Temporal.io is welcome.
You have hands-on experience with batch processing and open table data formats. We use Arrow to stream data. Experience with Iceberg and/or Delta is welcome. We don’t expect you to have experience with all three (although that would be great).
You're more than a connector of things. Building an import platform is more than configuring tools to make them work together; it's about actually building the tooling used in our warehouse import pipeline. We need you to have experience building tools, not just using off-the-shelf ones.
You bring experience with creating and maintaining data pipelines. You are comfortable with debugging stateful, async data workflows by digging into k8s pod metrics.
You bring a mix of skills. It’s not just about the data pipeline work. You’ll need strong backend skills as we run a complex system.
You love getting things done. Engineers at PostHog have an incredible amount of autonomy to decide what to work on, so you’ll need to be proactive and just git it done.
You’re ready to do the best work of your career. We have incredible distribution, a big financial cushion and an amazing team. There’s probably no better place to see how far you can go.
If this sounds like you, we should talk.
We are committed to ensuring a fair and accessible interview process. If you need any accommodations or adjustments, please let us know.