I hate the standup bot (so I built an agent to do it for me)

Andy Maguire

Apr 01, 2026

There it is. Staring at me. Sitting there. Goading me.

The standup bot, taunting me

The day is nearly done. I've reviewed PRs, fixed bugs, hopped on quick calls, commanded my agents, and smashed that accept button all day!

And now I'm expected to fill in some stupid Slack thread with my standup updates?

Every time I manually copy-paste into our standup thread, a little part of me dies inside.

We just passed the Turing test and here I am copy-pasting GitHub issue URLs into a Slack thread like it's 2010. I'm pulling up tabs, hunting for PR links, navigating all over GitHub, trying to remember what I (my agents) actually did today, and formatting it all nicely in Slack.

After a few months of this, I snapped.

I work on AI Observability at PostHog. I spend my days helping people understand what their AI agents are doing. And yet here I am, manually summarizing my GitHub activity like some kind of caveman.

The irony was too much. I decided to build my own agent to beat the standup bot at its own game.

It helps too that this was also the perfect excuse to learn about and play around a little with the OpenAI Agents SDK.

The plan

The idea was simple:

Build an AI agent that reads my GitHub activity and previous recent standups for context
Have it draft a sensible standup summary in the style I want
Chat with it to refine the draft
Post directly to Slack

And because I'm a shameless dogfooder, this was the perfect excuse to:

Learn the OpenAI Agents SDK
Dog-food our OpenAI Agents SDK integration for posthog-python
Create something I can actually use every day
Write this very blog post
But most importantly, use AI even if it actually sometimes takes me longer each day than it did before.

The perfect excuse

Let me show you how it all works.

🌶️ Hot take

Agents are the new scripts. We used to write bash scripts and cron jobs to automate tedious tasks. Now we write agents. Scripts follow rigid rules; agents are more flexible and often that's worth it.

The architecture: one agent to rule them all

I'll be honest - I didn't start here. My first version was a three-agent system: a Coordinator that orchestrated the workflow, a Data Gatherer that pulled from GitHub, and a Summarizer that turned it all into prose. I used the OpenAI Agents SDK's agents-as-tools pattern so the coordinator could delegate to the others.

It was a mess. Debugging handoffs between agents was painful. The coordinator would sometimes call the wrong sub-agent, or pass incomplete context. The summarizer couldn't see the raw tool call results - only what the coordinator decided to forward. And I needed three separate prompt templates, three model configs, and an on_handoff hook just to wire it all together. Way too much complexity for what is fundamentally a simple task.

So I ripped it all out - deleted the coordinator, data gatherer, and summarizer, and collapsed everything into a single agent with all 18 tools directly. The result was 300 fewer lines of code and something that actually worked reliably.

The agent has various tools organized by what they do:

GitHub overview - activity feed and summary stats
GitHub lists - PRs, issues, commits, reviews, comments (all with date filters)
GitHub details - drill into specific PRs or issues
Slack - fetch team standups, publish, confirm before posting
Utilities - clipboard, file save
Feedback - capture quality ratings and comments

Here's the actual agent definition:

Python
def create_standup_agent(
    model: str = DEFAULT_MODEL,
    hooks: AgentHooks[StandupContext] | None = None,
    style_instructions: str | None = None,
) -> Agent[StandupContext]:
    instructions = _make_dynamic_instructions(style_instructions)

    return Agent[StandupContext](
        name="Standup Agent",
        instructions=instructions,
        tools=[
            # GitHub overview tools
            get_activity_feed,
            get_activity_summary,
            # GitHub list tools (with date filters)
            list_prs,
            list_issues,
            list_commits,
            list_reviews,
            list_comments,
            # GitHub assigned items (no date filter)
            list_assigned_items,
            # GitHub detail tools
            get_pr_details,
            get_issue_details,
            # Slack tools
            get_team_slack_standups,
            publish_standup_to_slack,
            confirm_slack_publish,
            set_slack_thread,
            # Utility tools
            copy_to_clipboard,
            save_standup_to_file,
            # Feedback tools
            capture_feedback_rating,
            capture_feedback_text,
        ],
        model=model,
        model_settings=ModelSettings(),
        hooks=hooks,
    )

The agent uses dynamic instructions that inject the current standup draft, so each refinement iteration has full context without blowing the token budget. A StandupContext object tracks collected data across tool calls without stuffing it all into the prompt.

Stealing my own GitHub activity

The data gathering is powered by the gh CLI, which means:

No API tokens to manage (just gh auth login)
Access to everything you can see on GitHub
Real-time data, not some stale cache

This approach works great for a small personal agent running locally on your machine. If you were building something multi-user or running it as a service, you'd want proper GitHub API tokens and rate limit handling - but for a daily standup helper? The gh CLI is perfect.

Here's what the activity feed looks like when the agent queries it:

Python
@function_tool
def get_activity_feed(
    ctx: RunContextWrapper[StandupContext],
    days_back: Annotated[int, "Number of days to look back"] = 1,
) -> str:
    """
    Fetch a chronological feed of all GitHub activity for the user.

    Returns a unified list of events including:
    - Commits pushed
    - Pull requests opened/merged/closed
    - Code reviews submitted
    - Issues opened/closed
    - Comments on issues and PRs
    - Branches/tags created
    """
    # Uses `gh api /users/{username}/events` under the hood
    ...

The agent starts with the activity feed to get an overview, then drills down into specific PRs or issues that need more context. It's smart about not fetching redundant data and caches results so subsequent questions don't hit GitHub again.

Example of what the agent actually sees (from a real run):

Activity Feed: 94 events in last 1 day(s)
Summary: 2 comment, 2 create, 4 issue, 13 pull_request, 26 push, 26 review, 21 review_comment

Events (newest first):
[2026-04-02 12:42] PUSH PostHog/posthog (master)
[2026-04-02 12:42] PR MERGED PostHog/posthog#53107 - feat/llma-cluster-mcp-tools
[2026-04-02 12:25] PR MERGED PostHog/posthog#53189 - feat/mcp-evaluation-test-hog-tool
[2026-04-02 12:18] REVIEW APPROVED PostHog/posthog#53175 - docs/update-convex-otel-examples
[2026-04-02 12:11] REVIEW APPROVED PostHog/posthog-js#3321 - fix/convex-remove-use-node
[2026-04-02 11:58] PR OPENED PostHog/posthog#53190 - fix/llm-prompt-oauth-auth
[2026-04-02 11:46] PR OPENED PostHog/posthog#53189 - feat/mcp-evaluation-test-hog-tool
...

The secret sauce: PostHog AI Observability

Here's where it gets interesting. Building an agent is one thing. Understanding what it's actually doing? That's where PostHog AI Observability comes in.

I instrumented the entire agent with our OpenAI Agents SDK integration. Fun fact — it was built during a one-day hackathon at a team onsite in Barcelona, where I also discovered Calçots (highly recommend).

Me holding a Calçot in Barcelona

That's me holding a Calçot. This is relevant to the blog post, I promise.

The integration was surprisingly quick to build thanks to the excellent tracing support that OpenAI baked into the Agents SDK. We just needed to extend their TracingProcessor to send spans to PostHog instead of their default backend.

Python
from posthog import Posthog
from posthog.ai.openai_agents import instrument

posthog_client = Posthog(
    api_key=os.getenv("POSTHOG_API_KEY"),
    host="https://us.i.posthog.com",
)

# This single line instruments the entire agent workflow
processor = instrument(
    client=posthog_client,
    distinct_id=github_username,
    privacy_mode=False,  # I want to see everything
    properties={"app": "github-standup-agent"},
)

That's it. One function call and suddenly I have full visibility into everything. (If you're handling sensitive data, you can enable privacy mode to exclude prompts and completions from being stored.)

Tracing

Every agent run creates a trace with spans for:

Agent execution - which agent ran and for how long
Tool calls - what tools were called, with what arguments, and what they returned
LLM generations - prompts, completions, token usage, and latency

I can see exactly how the agent decided which tools to call, what data it gathered, and how it turned all of that into prose. Each trace also gets an AI-generated summary so you can quickly understand what happened without digging into every span:

Trace view showing the full agent workflow with AI-generated summary

Prompt management

One thing I quickly realized: hardcoding prompts in source code is a pain when you're iterating on tone and formatting. So I moved all my prompts into PostHog's prompt management. The agent pulls its instructions, style guide, and context templates from PostHog at runtime:

standup-agent-instructions - the main system prompt with workflow, formatting rules, and feedback detection
custom-style - formatting rules that enforce Slack mrkdwn links and match team conventions
current-standup - injected dynamically with the latest draft during refinement
chat-context - contextual template for chat mode sessions
generate-standup - the one-shot generation prompt

This means I can tweak how the agent writes standups without redeploying anything. And because PostHog tracks prompt versions, I can see exactly which version of the instructions produced which standup - useful when something goes weird.

There's also the "prompts are IP" angle. You might have seen the recent Claude Code source map situation — a packaging mishap meant the full system prompts ended up in the npm release. These things happen (shipping is hard!), but it's a good reminder that if your prompts are in the source code, they're one accidental publish away from being public.

For my standup agent it doesn't matter (my half-baked vibecoded prompts are open source anyway), but if yours contain real secret sauce, a prompt store with a local cache is a nice safety net.

Tool usage

The tools tab shows me which tools the agent calls most and how long they take. Turns out get_pr_details and get_activity_feed dominate - you can see get_pr_details right at the top with the most calls. That was actually a surprise - the agent was drilling into way more PRs than it needed to. A quick prompt tweak ("only drill into PRs that look significant") cut unnecessary API calls.

Tools tab showing which tools the standup agent calls most

Clustering

Clustering automatically groups similar traces together. For a personal agent like this, the clusters are pretty simple - mostly "standard standup generation" vs "heavy iteration sessions" vs "quick copy-paste runs." But it's useful for spotting patterns in how I use the agent. If you're curious about how the clustering pipeline actually works under the hood, I wrote a deep dive on how we built it.

Sentiment

The sentiment tab is more fun. PostHog analyzes the tone of user messages, and looking at my negative ones is... humbling. Some highlights from my standup sessions:

"bad bad bad!!!!!"
"ok this is terrible im going to exit and debug - how can this happen!"
"you messed up the links"
"bad you are not following the examples or style properly"

In fairness, most of these are just me giving the agent iteration feedback ("drop all the merged words its a bit repetitive", "remove the ask, dont need it") rather than genuine rage. But sentiment analysis doesn't know the difference, and that's actually an interesting insight — what looks like negative user sentiment might just be a user who's actively engaged and iterating. The positive ones are more telling: "ok great stuff send that to slack" and "great love it lets publish to slack" tend to mean the agent actually nailed it first time.

Sentiment tab showing positive and negative standup agent interactions

Evals

Here's where dogfooding gets real. The agent's biggest formatting headache is Slack's cursed mrkdwn syntax - it uses <url|text> instead of [text](url), and the agent constantly tries to slip back into regular markdown. So I set up an evaluation that automatically checks every generation: "did it use correct Slack-style markdown formatting?"

The eval results summary tells the whole story. Generations that pass tend to use correct Slack mrkdwn links, simple bullets, and plain text headers. The failures? The agent keeps trying to sneak in **bold** headers, [text](url) markdown links, and occasionally dumps raw JSON outputs. My favorite failure pattern: the agent would use <link|text> syntax for headings (correct for links, completely wrong for headers) — Slack doesn't even have heading syntax, so it just needs to use plain text like Did:.

Eval results summary showing passing and failing patterns for Slack markdown formatting

You can also drill into individual runs to scan each generation's result and reasoning:

Eval runs showing individual generation results with pass/fail reasoning

This is the kind of thing that's nearly impossible to catch systematically without automated evals. I'd notice the odd broken link in Slack, fix the prompt, and think I'd solved it — only for a different formatting issue to pop up the next day. The eval catches all of them and shows me the patterns so I can fix the root cause in the prompt, not just whack-a-mole individual failures.

User feedback

This one is really just a regular posthog.capture() call exposed as a tool - plain old product analytics, nothing fancy. The interesting part is that the agent decides when to call it. I don't explicitly say "thumbs up" or "thumbs down" — the agent picks up on signals in my messages and fires the event itself:

Python
@function_tool
def capture_feedback_rating(
    ctx: RunContextWrapper[StandupContext],
    rating: Literal["thumbs_up", "thumbs_down"],
    comment: str = "",
) -> str:
    """Rate the quality of this standup."""
    posthog.capture(
        event="$ai_metric",
        properties={
            "$ai_trace_id": current_trace_id,
            "$ai_metric_name": "standup_quality",
            "$ai_metric_value": 1 if rating == "thumbs_up" else 0,
            "comment": comment,
        }
    )

When I say "ok great stuff send that to slack", the agent infers positive feedback and captures it before publishing. When I say "bad you are not following the examples", it captures negative feedback with my complaint as the comment. I never have to break out of the conversation flow to rate anything — the agent just reads the room.

That feedback is linked to the exact trace with $ai_trace_id, so I can go back and see what went wrong on bad runs. Over time, this builds a dataset for understanding what makes a good standup and which prompt versions produce better results.

The victory lap: posting to Slack

The final piece is actually posting to Slack. Because what's the point of automating standup generation if I still have to copy-paste?

The agent has Slack integration with a two-step confirmation flow (because posting to your team's Slack channel by accident would be... bad):

You: ok great stuff send that to slack

Agent [Standup Agent] starting...
  Tool: capture_feedback_rating(rating='thumbs_up',
        comment='User said: "ok great stuff" before asking to send to Slack.')...
  Tool: publish_standup_to_slack(confirmed=False)...
Agent [Standup Agent] finished


Assistant: Captured the feedback — thanks.

Slack preview is ready (channel #team-llm-analytics, thread 1775052008.434899).
Reply "yes, publish to slack" (or "confirm publish") and I'll post it.

You: yes

Agent [Standup Agent] starting...
  Tool: confirm_slack_publish()...
  Tool: publish_standup_to_slack(confirmed=True)...
Agent [Standup Agent] finished

Assistant: Published to Slack in today's standup thread.

You: ok thanks i love u

Agent [Standup Agent] starting...
  Tool: capture_feedback_rating(rating='thumbs_up',
        comment='User said: "ok thanks i love u" after successful Slack publish.')...
Agent [Standup Agent] finished

The agent automatically finds the latest standup thread in your configured channel and posts as a reply. No more hunting for the thread. No more copy-paste. Just vibes.

The final standup posted to Slack

Did I save any time compared to just writing the standup myself? Debatable.

Might there be some AI slop in my updates that slipped through? Most definitely.

Did I get to use AI instead of doing it manually? Yes.

And isn't that all that really matters?

🌶️ Hot take

Human in the loop still matters. Here's the thing - as soon as I stop manually iterating and approving the standups my agent is generating, it flips into something weirdly anti-social. I'm just sending my teammates auto-generated text. Even if it's 90% accurate and I would have approved it anyway, something feels off about fully automating communication with other humans. The approval step isn't just quality control - it's me taking ownership of my words.

Try it yourself

The agent is open source, go check it out or try it yourself:

Terminal
# Install
pip install github-standup-agent

# Configure
standup config

# Generate!
standup generate

You'll need:

Python 3.11+
GitHub CLI authenticated (gh auth login)
OpenAI API key
Optional: Slack bot token for posting
Optional: PostHog API key for analytics

Build your own instrumented agent

The broader lesson here: if you're building AI features, instrument them from day one.

PostHog's AI Observability gives you:

Tracing - See exactly what your agents are doing
Cost tracking - Know what you're spending on API calls
Evals - Measure quality systematically
Prompt management - Version and iterate on prompts without redeploying
Clustering and sentiment - Spot patterns in how your agent is used
User feedback - Link quality signals to specific traces

The OpenAI Agents SDK integration is available in posthog-python. Check out our other integrations for LangChain, OpenAI, and many other frameworks.

Stick it to your standup bot

I've been using this agent daily for a few weeks now. It's not perfect - sometimes it misses context or formats things weirdly. But that's the beauty of having full observability: I can see exactly what went wrong and iterate.

🌶️ Hot take

Why are we even doing standups? Honestly, summarization as a category feels crazy early. The fact that I have to manually aggregate my work across GitHub, Slack, Linear, and Google Docs is absurd. I should just be able to read the "headlines" of my work life each morning - a personalized digest of what matters, generated automatically from all my tools. No one has solved this yet. Maybe standups are just a symptom of our tools not talking to each other. The standup bot isn't the enemy - it's a coping mechanism for a fragmented toolchain.

The standup bot still messages me at 6 PM. But now I just smile, run standup generate, and let my agent do the talking.

F*$k you, standup bot. I win.

Blog

I hate the standup bot (so I built an agent to do it for me)

Contents

The plan

The architecture: one agent to rule them all

Stealing my own GitHub activity

The secret sauce: PostHog AI Observability

Tracing

Prompt management

Tool usage

Clustering

Sentiment

Evals

User feedback

The victory lap: posting to Slack

Try it yourself

Build your own instrumented agent

Stick it to your standup bot

Further reading

Community questions