
Teaching Sabine to Remember: Fixing Conversation Context

How a subtle bug was causing Sabine to forget our conversations mid-stream, and why the fix required rethinking how we pass state between frontend and backend.

There's a specific kind of frustration that comes from talking to an AI that forgets what you just said. You're three messages into a conversation, making a nuanced point, and suddenly the assistant responds like you're starting fresh. It's jarring. It breaks flow. And until this week, it was happening intermittently in Sabine.

The Problem

Sabine's chat interface maintains conversation history in memory on the frontend. When you send a message, that history gets passed to the Python backend router, which feeds it to the LLM. Simple enough. Except our implementation had a subtle flaw: we only sent that in-memory history as a fallback, when thread retrieval from the database failed.

In normal operation, we'd fetch conversation state from Supabase using a conversation ID. But the in-memory history—the actual messages you'd just exchanged—wasn't being sent along consistently. The result? Sabine would sometimes lose track of what you'd said two messages ago, even though the UI showed the full conversation.
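In sketch form, the buggy flow looked roughly like this. This is a minimal illustration, not Sabine's actual code: the function name, the request shape, and the dict-backed store standing in for the Supabase lookup are all hypothetical.

```python
# Sketch of the fallback-only pattern (hypothetical names).
# The client's in-memory history is consulted ONLY when the thread lookup fails.

def build_llm_messages_buggy(request: dict, store: dict) -> list[dict]:
    conversation_id = request.get("conversation_id")
    thread = store.get(conversation_id)  # stand-in for the Supabase fetch
    if thread is not None:
        return thread["messages"]  # happy path: database state only
    # fallback: use whatever the frontend sent, if anything
    return request.get("conversation_history", [])

# If the stored row is stale (a recent write hasn't landed yet), the happy
# path silently drops the latest turns even though the UI shows them.
store = {"c1": {"messages": [{"role": "user", "content": "hi"}]}}
req = {
    "conversation_id": "c1",
    "conversation_history": [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
        {"role": "user", "content": "a nuanced point"},
    ],
}
print(len(build_llm_messages_buggy(req, store)))  # 1 — two turns lost
```

The failure mode is exactly the intermittency we saw: whenever the database checkpoint lagged behind the live conversation, the LLM got the stale version.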

The Fix

The fix was straightforward once we understood the issue: always send the in-memory conversation history from the frontend. Not as a fallback. Not conditionally. Always. We updated the useChat hook in TypeScript to include conversation_history in every request, and modified the Python router to accept and use it unconditionally.
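On the router side, the change amounts to something like the following sketch. The names are illustrative, not Sabine's actual code; the point is that the live history is now a required part of every request, not an optional fallback.

```python
# Sketch of the fixed pattern (hypothetical names): the client history is
# always present in the request, and the router uses it unconditionally.

def build_llm_messages_fixed(request: dict, store: dict) -> list[dict]:
    # The in-memory history the user actually sees is the runtime truth.
    # It is a required field now, not an optional fallback.
    history = request["conversation_history"]
    # The stored thread remains the persistence checkpoint, but it no
    # longer gates what the LLM sees mid-conversation.
    return history

store = {"c1": {"messages": [{"role": "user", "content": "hi"}]}}  # stale
req = {
    "conversation_id": "c1",
    "conversation_history": [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
        {"role": "user", "content": "a nuanced point"},
    ],
}
print(len(build_llm_messages_fixed(req, store)))  # 3 — nothing dropped
```

A stale database row no longer matters for the current turn; it gets reconciled on write instead of corrupting the read path.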

This sounds obvious in hindsight, but it exposes a deeper question about state management in conversational AI: where does the source of truth live? We were trying to use Supabase as the canonical store while the frontend held ephemeral state. That works for persistence, but for real-time conversation flow, the in-memory state is what matters. The database is a checkpoint, not the conversation itself.

What's Next

This fix improves Sabine's conversational coherence immediately, but it also clarifies our architecture going forward. We're evaluating whether to shift more conversation state management to the frontend, treating Supabase purely as a persistence layer rather than the runtime source of truth. That would simplify the data flow and reduce the surface area for these kinds of state desynchronization bugs.

We've also added test coverage for the turn chain logic to ensure history is always passed through correctly. The goal is to make conversation continuity a guaranteed property of the system, not something that works 'most of the time.' When you're building a Chief of Staff AI, forgetting context isn't just a bug—it's a broken promise.