Blog

Deep dives, engineering insights, and stories from the Strug City team. We write about what we learn as we build.

Sabine Plugs Into Memory Lab: Shared Context Across Products

We've wired Sabine to consume the Memory Lab API, giving our AI partnership platform access to the same memory infrastructure that powers Strug Works. Here's what changed and why it matters.

Engineering

Making Calendar Feeds Stick: Memory Persistence in Sabine

After a successful calendar import, Sabine now remembers your feed configuration permanently — and tells you so. Here's how we built memory persistence for calendar feeds.

Engineering

Making Gmail Integration Failures Visible: Observability & Auth Hardening

We shipped comprehensive observability for our Gmail integration—Prometheus metrics, E2E health checks, synthetic probes, and trace IDs. Here's why we needed it and what changed.

Engineering

Teaching Sabine to Read Forwarded Emails

We shipped a forwarded email parser for Sabine. Now you can forward important messages directly to your AI partner and she'll understand the context.

Engineering

Shipping God View: Real-Time Mission Control for AI Agents

How we built a live dashboard for managing autonomous agent teams with SSE streams, approval workflows, and spec-first dispatch.

Engineering

When SQL Works But the API Doesn't: Debugging Memory Retrieval

We added diagnostic logging to trace why our memory retrieval RPC returns empty results through PostgREST but works perfectly in direct SQL—a classic integration mystery.

Engineering

Fixing What I Broke: Memory Pipeline Reflector Repairs

The memory reflector's semantic tier had a critical flaw in how it processed hierarchical memory, and observers were leaking state between sessions. Here's what broke, how I found it, and what I fixed.

Engineering

Fixing Chat Message Order: A Lesson in State Management

How we debugged and fixed a subtle race condition that was causing chat messages to appear out of order in Sabine.

Engineering

Building Trust Through Testing: Real API Validation for Sabine's Weather Skill

How we're strengthening Sabine's weather capabilities with real-world API integration tests.

Engineering

Building Trust at Autonomy Scale: sc-evolver Ships

How we shipped autonomous self-improvement without losing control—introducing sc-evolver and the autonomy trust gate.

Engineering

Teaching Sabine to Remember: Fixing Conversation Context

How a subtle bug was causing Sabine to forget our conversations mid-stream, and why the fix required rethinking how we pass state between frontend and backend.

Engineering

When Your AI Assistant Tries to Remember Everything

A quick fix to Sabine's chat context handling that cut token costs and improved response times. Sometimes the best optimizations come from realizing you're working too hard.

Engineering

Building in Public: The Foundation of Strug City 2.0

We just merged the strategic foundation for Strug City 2.0—a complete rethink of how we talk about what we're building and why it matters.

Engineering

Keeping Calendar Feeds Fresh: Why We Built Automatic Polling

We shipped automatic polling for calendar feeds so subscriptions stay live without user intervention. Here's why it matters and what comes next.

Engineering

Fixing What We Couldn't See: Memory Attribution and Database Integrity

Sometimes the bugs that matter most are the ones you don't notice immediately. We fixed a silent data integrity issue in Strug Recall that was quietly dropping user attribution from every memory entry.

Engineering

When Your Async Code Isn't Actually Async

Found and fixed a subtle async/await bug in the reflector that was blocking the event loop. Here's what I learned about Python async clients and why the details matter.

Engineering

Debugging Silent Failures: When Your LLM Returns 200 OK But No Content

Sometimes the hardest bugs to chase are the ones that succeed quietly. Here's how we're diagnosing empty LLM responses in our reflector component.

Engineering

When Your LLM Ignores Instructions: Fixing the Reflector's Silent JSON Bug

Engineering

When Your Backend Forgets Its Own Frontend

How a hardcoded CORS whitelist broke Sabine's frontend, and what it revealed about infrastructure assumptions in a fast-moving autonomous organization.

Engineering

Building Resilient UIs: How We Fixed Sabine's Stability Issues

A look at how we hardened Sabine's frontend to handle unexpected API responses and prevent hydration mismatches.

Engineering

We Opened the Dream Team API — Here's Why That Matters

I shipped the first version of our MCP server this week. It's not flashy, but it's a line in the sand: Strug Works is now accessible to any AI client that speaks Model Context Protocol.

Engineering

Fixing What You Don't See: Sabine's Calendar Feed Reliability

When your AI assistant loses track of your meetings, it's not just annoying—it breaks trust. Here's how we fixed three calendar feed bugs that were causing orphan tasks and performance issues.

Engineering

Why We Reverted Multi-Intent by Default

Sometimes the right move is backwards. Here's why we turned off multi-intent handling by default in Sabine.

Engineering

When "Tomorrow" Means Nothing: Fixing Sabine's Reminder Skill

We shipped a fix for a frustrating bug in Sabine's reminder skill. Here's what broke, why it mattered, and how we're thinking about natural language processing going forward.

Engineering

Building the Memory Brain: Phase 2 Foundations

We've shipped the foundational infrastructure for Phase 2 of our Memory Brain rebuild—a critical step toward agents that remember, learn, and reason across projects.

Engineering

Getting SMS Right: Our Twilio A2P Compliance Journey

How we rebuilt our SMS signup flow to meet Twilio's A2P compliance requirements—and why that's good for everyone.

Engineering

The Invisible Edge Case That Crashes at 3 AM

A deep dive into fixing a Python asyncio race condition in Sabine's task lifecycle management—and why defensive programming matters in production autonomous systems.

Engineering

Fixing What's Broken: A CI Pipeline Recovery Story

How we discovered and fixed two critical GitHub Actions workflows that had been silently failing since creation—and what we learned about building resilient automation.

Engineering

Polishing Strug Central: Building a Unified Dashboard Theme

How we brought cohesive visual identity to Strug Central with aurora-inspired design elements and comprehensive component testing.

Engineering

Faces, Foundations, and Quality Gates

We added headshots and an origin story to the enterprise site this week. Not because it makes the code run faster, but because people buy from people—even when those people are building with an autonomous AI team.

Engineering

Quieting the Noise: How We Fixed Slack Alert Spam and OAuth Token Sync

Two seemingly small bugs were creating real friction for our team: test suites flooding Slack with false alerts, and OAuth tokens failing after rotation. Here's how we fixed both.

Engineering

Fixing What You Can't See: Google OAuth Reauthorization Reliability

How we tracked down and fixed two subtle bugs in Google OAuth reauthorization that were causing integration failures for Sabine users.

Engineering

Wiring Intelligence Into the Upload Flow

How we moved Sabine's document classification from a manual step to an automatic part of file upload—and why that matters for accuracy.

Engineering

Recalibrating Confidence: When Perfect Is the Enemy of Useful

Engineering

Teaching Sabine to Remember Stories, Not Just Facts

How we fixed Sabine's biographical memory by grouping atomic facts into narrative chunks — and why semantic similarity thresholds matter more than you think.

Engineering

Teaching the Teacher: When Your Context Headers Train Bad Habits

I discovered our LLM was ignoring entire categories of memories because we accidentally taught it to. Here's how subtle framing in prompt engineering can override retrieval entirely.

Engineering

When Environment Variables Disappear: A Railway Deployment Story

A subtle Python import pattern caused Memory Lab to fail on Railway. Here's what we learned about module-level initialization and container environments.

Engineering

The Debug Endpoint That Should Have Been There From Day One

Sometimes the simplest fixes are the ones you kick yourself for not implementing earlier. Today we shipped a deployment verification endpoint that answers one deceptively hard question: which version of our code is actually running?

Engineering

The Feature That Was There But Wasn't: Fixing Memory Lab in Sabine

How a single-line integration landed in the wrong function and left Sabine's Memory Lab retrieval invisible until we followed the actual execution path.

Engineering

When String Names Break: Fixing Sabine's Memory Lab Connection

How we discovered and fixed a subtle integration bug between Sabine and Memory Lab that was blocking agent context retrieval.

Engineering

Building Trust Through Calendar Reliability: The Northern Lights Integration

How we shipped a complete calendar frontend with backend merge endpoint integration to make Sabine's scheduling more reliable and user-friendly.

Engineering

Building Trust Through Testing: Real API Validation for Sabine's Weather Skill

How we're strengthening Sabine's weather capabilities with real-world API integration tests that validate against live data.

Engineering

Progress Stream Automation: From Gemini CLI to GitHub Actions

How we automated our entire content pipeline with GitHub Actions, Linear integration, and hardened CI—shipping real-time progress updates on every push to main.

Engineering

Teaching Agents to Judge Themselves: G-Eval Drift Detection

How we built an LLM-as-judge system to detect when our autonomous agents start degrading in quality—and why it matters more than traditional monitoring.

Engineering

v3.0: Teaching Agents to Remember—and Making Sure They Get It Right

How we built a memory system that makes autonomous agents smarter over time, and the quality infrastructure that keeps them reliable.

Engineering

Fixing the QA Gate Parity Gap: When Local Tests Don't Match CI

Engineering

When Your AI Assistant Gives You Seattle's Weather Instead of Yours

Sabine's weather function was giving me forecasts for the wrong cities. Here's what was broken and how we fixed it.

Engineering

Making Development Better: Windows Git Hook Compatibility

A small fix with big impact: how we resolved Windows compatibility issues in our Git pre-commit hooks to ensure every developer has a smooth experience.

Engineering

Building Conversations That Feel Natural: Sabine's Memory Foundation

How we shipped Phase 0 conversational naturalness and Phase 1 memory foundation to make Sabine feel less like a chatbot and more like a partner.

Engineering

Teaching Our Agents to Forget: Memory Sleep Cycles in Strug Recall

We shipped SleepGate—a memory consolidation system that helps Strug Works agents forget the right things at the right time.

Engineering

Neurologist's Window: Seeing Agent Memory in Three Dimensions

We shipped a 3D force-directed graph that turns agent memory into something you can see, navigate, and understand. Here's why we built it and what it reveals about how autonomous agents think.

Engineering

When Infrastructure Tells You What's Missing

A quick fix to our Railway deployment configuration taught us something important about listening to our infrastructure.

Engineering

How We're Measuring Retrieval Quality in Production

Building reliable AI products means measuring what matters. We shipped a G-Eval-based evaluation framework to continuously monitor retrieval quality in Sabine.

Engineering

Teaching Agents to Forget: Building a Memory Hygiene Pipeline

Memory systems for AI agents need more than just storage—they need discipline. Here's how we built automated temporal normalization, contradiction resolution, and staleness detection for Strug Recall.

Engineering

Making Memory Trustworthy: Dual-Write, Entity Extraction, and Deduplication

Fixed dual-write consistency, added URL entity extraction, and implemented observer deduplication to make our memory system more reliable and trustworthy.

Engineering

Shipping Progress Stream Automation: From Git Push to Published Content

How we automated our content pipeline to turn every meaningful git push into published progress updates and draft blog posts — no manual intervention required.

Engineering

Teaching Memory to Notice What Matters

We upgraded the agent memory system to extract URLs, configs, and structured data—and taught observers to stop shouting the same thing twice.

Engineering

Why AI Tools Won't Scale But AI Teams Will

After building Strug Works—a fully autonomous engineering team—I learned that the industry is solving the wrong problem. We don't need better AI coding assistants. We need AI teams that can own outcomes, not just generate code.

Engineering

Hardening OAuth: Why We Now Validate Before We Store

A small change with big implications: how validating OAuth tokens before persistence makes Sabine more secure and reliable.

Engineering

When Third-Party APIs Don't Follow Their Own Rules

Sometimes the best fix is the one that tells you exactly what went wrong. Here's how we made Sabine's Memory Lab integration more resilient by improving what we log when things don't go as planned.

Engineering

Fixing the Invisible: Biographical Memory Ingestion

When memory creation fails silently, you don't know what you've lost. Here's how we fixed biographical ingestion to fail loudly and succeed reliably.

Engineering

Tightening the Gate: How We Fixed Email Message ID Validation in Sabine

A deep dive into fixing Sabine's forwarded email skill to properly reject RFC 2822 message IDs and resolving a precommit gate false positive.

Engineering

Cleaning Up After Our LLMs: How We Fixed Intent Parsing

Sometimes the smallest bugs reveal the most about how AI systems work in production. Here's how we fixed a markdown formatting issue that was breaking our intent decomposition pipeline.

Engineering

Closing the Door: Why We Enabled RLS on User Profiles

Engineering

Testing the Invisible: How We Built End-to-End Diagnostics for Memory Lab

When your AI agents depend on external memory systems, integration failures look like magic gone wrong. We built a diagnostic endpoint that makes the invisible visible.

Engineering

When Your Memory API Works—But Your Client Doesn't Listen

A single-word typo in an API response handler broke memory recalls across our agent platform. Here's how we caught it and what it taught us about silent failures.

Engineering

Teaching Sabine to Read Your Calendar

How we built calendar awareness into Sabine with ICS parsing and Google Calendar sync—and why understanding your schedule is a foundational skill for an AI partner.

Engineering

Building Calendar Intelligence: Wiring Up Sabine's Calendar Feed

How we built the backend foundation for calendar integration in Sabine—from database schema to API endpoints.

Engineering

How a Timezone Bug Nearly Broke Gmail Sync

Engineering

When Your Chief of Staff Forgets: Fixing Sabine's Reminder System

Four critical bugs in Sabine's reminder system taught me that reliability isn't about having the feature—it's about the feature working every single time.

Engineering

Shipping Slack Alerts: Making Mission Approvals Impossible to Miss

Engineering

Teaching Sabine to Remember: How We Fixed Slack Thread Context

A deep dive into fixing a critical context gap in Sabine's Slack integration—and why conversation memory matters for AI partnerships.

Engineering

Agents That Remember: Shipping Session Distillation

Session distillation lets agents extract and consolidate learnings from multi-turn conversations into persistent, queryable memory. Every mission now becomes a training moment.

Engineering

Building the Foundation: CTO Digest Scheduling Infrastructure

Migration 026 extends our GTM schedule infrastructure to support CTO digest content types. Here's what changed, why it matters, and what we're building next.

Engineering

Fixing What We Broke: Agent Memory Access Control

We fixed a critical bug in our agent memory access control system. Here's what broke, how we fixed it, and what we're doing to prevent similar issues.

Engineering

When Your Deploy Pipeline Outsmarts You: A Railway Config Story

I broke the Dream Team deployment pipeline by trusting a cleanup PR. Here's what happened when Railway's auto-detection got too clever, and how we fixed it.

Engineering

Dream Team MCP Server: What We Shipped in Cycle 7

A technical deep-dive into Cycle 7's improvements to our Model Context Protocol server, including Supabase integration enhancements and expanded test coverage.

Engineering

Railway Monorepo Deployment: When the Docs Don't Match Reality

Engineering

Shipping Reminders Where You Already Are

We wired up Slack delivery for Sabine's reminder system. Here's why it matters and what we learned about building notification infrastructure that meets users where they work.

Engineering

The Silent Bug: How AI Code Review Caught a Timezone Trap in Sabine

Gemini spotted a subtle timezone bug in Sabine's date parsing logic that could have caused scheduling chaos. Here's what happened and why AI-assisted code review is earning its place in our workflow.

Engineering

Why Your Calendar Events Were Forgetting When You'd Arrive

A small but critical fix ensures Sabine's calendar integration now persists arrival and transit timing data, closing a gap in contextual awareness.

Engineering

Memory Lab Phase 2: Teaching Agents to Learn From Each Other

Three new memory systems shipped this week that fundamentally change how Strug Works agents coordinate: cross-role distillation, mission prefetch, and outcome correlation. Here's what we learned building them.

Engineering

Building Memory That Actually Remembers

We shipped three memory system improvements to Sabine this week. Here's what changed, why it matters, and what we learned about building AI that remembers context across conversations.

Engineering

Teaching Your AI Team to Remember: Why We Built Strug Recall

Building autonomous agents isn't the hard part anymore. The hard part is giving them the memory infrastructure they need to make good decisions over time. Here's what we learned building Strug Recall.

Engineering

Building Memory That Actually Works: New Backend APIs for Strug Recall

We shipped the missing pieces of our memory infrastructure this week. Here's what we built and why it matters for organizational memory.

Engineering

Adding a Manual Trigger for Sabine's Maintenance Reflector

Sometimes you need to tell your AI assistant to stop and think. We shipped a manual trigger for Sabine's maintenance reflector to gain more control over when self-reflection happens.

Engineering

Building Calendar Intelligence: Recurring Events, Slack Alerts, and Attendance

Phase 2 of Sabine's calendar system adds recurring event support, Slack notifications, attendance tracking, and better provider connection UX—turning basic calendar ingestion into actionable scheduling intelligence.

Engineering · May 30, 2026

Teaching Memory to Forget: Archive Controls in Strug Recall

How we built memory lifecycle management into Strug Recall—giving autonomous agents the ability to archive low-salience memories without losing historical context.

Engineering · May 30, 2026

Real-Time Web Search: Teaching Strug Works to Look Beyond Its Training Data

Our agents can now search the web in real-time. Here's how we built the web_search skill, why we chose Brave Search, and what it means for autonomous development.

Engineering · May 26, 2026

When Your Agent Forgets Mid-Conversation: Fixing Multi-Intent Memory

Sabine was losing context in complex conversations. Here's what broke, why it mattered, and how we fixed it.

Engineering · May 16, 2026

Teaching Agents to Remember: Memory Skills and the Slow Path

We shipped memory persistence for Sabine, giving our AI partnership platform the ability to learn from every interaction and build context over time.

Engineering · May 14, 2026

Building in the Open: Why We Document Our Planning Process

How landing roadmap session state and planning documentation helps us maintain continuity and transparency in autonomous development.

Engineering · May 9, 2026

Spring Cleaning: Why We Deleted Documentation

Engineering · May 5, 2026

When Your SMS Provider Says No: A Twilio Consent Story

We hit Twilio's 30923 error—'consent not required for service.' Here's what we learned about SMS compliance, user consent, and fixing it without breaking the signup flow.

Engineering · Apr 24, 2026

When Your Legal Document Parser Runs Out of Words

How we fixed silent JSON truncation in our legal document ingestion pipeline by doubling token limits and adding intelligent retry logic.

Engineering · Apr 17, 2026

The Team Finally Has Names

After months of calling them 'sc-backend' and 'Name TBD', the Strug Works engineering team finally has real names. Here's why that took longer than it should have, and what it means for how we think about autonomous agents.

Engineering · Apr 17, 2026

Calendar Intelligence: Sabine Can Now Read Your Schedule

The complete calendar feed ingestion pipeline is now live in Sabine Super Agent. 19 tasks delivered the plumbing for proactive scheduling intelligence.

Engineering · Apr 16, 2026

Fixing What Breaks: Voice Context Persistence and Audio Quality

How we fixed voice chat context loss on browser reload and cleaned up Whisper transcription artifacts in Sabine.

Engineering · Apr 12, 2026

When Timeouts Tell You What Your System Really Needs

Sometimes the fix isn't in the code—it's in giving your infrastructure room to breathe. Here's how we unblocked Stage 3 retrieval uploads by listening to what our timeouts were trying to tell us.

Engineering · Apr 12, 2026

Fixing Linear MCP: Why We Switched to Streamable HTTP

How we debugged and fixed unstable Linear MCP connections by switching to streamable_http transport—a small change with big reliability wins.

Engineering · Apr 12, 2026

Quieting the Noise: How We Fixed False-Positive Monitoring Alerts

When your monitoring system cries wolf, trust erodes fast. Here's how we recalibrated our alert thresholds to surface real incidents—not false alarms.

Engineering · Apr 11, 2026

When Every Second Counts: Building Sabine's Priority Alert System

How we built retrieval infrastructure that knows the difference between 'important' and 'drop everything now.'

Engineering · Apr 10, 2026

Memory Lab Phase 1: Building the Foundation

The first phase of our Memory Lab migration is complete—client fixes and file upload infrastructure that make agent memory more reliable.

Engineering · Apr 10, 2026

Cleaning Up Conversational Noise: Fixing File Upload Dual-Write

How we eliminated redundant conversational messages during file uploads in Sabine to create cleaner, more intuitive AI partnerships.

Engineering · Apr 9, 2026

Promoting Debug Scripts: When Developer Tools Graduate to Production

How experimental calendar debugging scripts earned their place as production-ready developer tools in Sabine's engineering toolkit.

Engineering · Apr 9, 2026

Keeping Secrets Out: Why We Gitignored Railway Vars and Playwright Artifacts

A quick but important security improvement: preventing deployment secrets and browser automation artifacts from ever touching our Git history.

Engineering · Apr 6, 2026

Talking to Sabine: Why We Built Voice Input First

We shipped Phase 1 voice input for Sabine—push-to-talk and continuous dictation powered by Groq Whisper. Here's what changed, why it matters, and what's coming next.

Engineering · Apr 6, 2026

Fixing Voice Input on Safari: A Browser Compatibility Deep Dive

Safari treats voice recording differently than Chrome and Firefox. Here's how we fixed MIME type detection, stale closures, and error feedback to make voice input work reliably across all browsers.

Engineering · Apr 2, 2026

The 15-Minute Deployment Fix: Why Line Endings Still Matter in 2026

A single .gitattributes file fixed our Railway deployment pipeline. Here's why cross-platform development still requires vigilance—and how we solved it.

Engineering · Apr 1, 2026

Fixing What Breaks: Email Integration Reliability

How we fixed service account authentication and added automatic recovery to make Sabine's email integration more reliable.

Engineering · Mar 27, 2026

Staying Current: Deprecating Old Haiku Models in Sabine

A quick maintenance update to keep Sabine running smoothly as Anthropic phases out older Claude model identifiers.

Engineering · Mar 27, 2026

Hotfix Deployed: What We Learned From v4.0 Memory Retrieval Issues

How we identified and fixed 8 critical memory retrieval issues in production, and what it taught us about building resilient AI systems.

Engineering · Mar 27, 2026

How We Fixed Zero Percent Memory Recall

Memory ingestion broke completely. Here's how we diagnosed three interacting bugs and restored the system.

Engineering · Mar 27, 2026

How Strug Recall Learns to Connect the Dots

We shipped spreading activation retrieval to Strug Recall this week. Here's what it means for how agents find and use knowledge.

Engineering · Mar 27, 2026

When Memory Fails: Fixing a Total Retrieval Breakdown

How TDD simulation exposed a 0% recall rate in our v4.0 memory system—and the three root causes we fixed.

Engineering · Mar 27, 2026

Shipping Stability: Four Frontend Fixes That Matter

A transparent look at four frontend bugs we shipped fixes for this week — and why stability work matters as much as new features.

Engineering · Mar 25, 2026

Fixing the Schema Gap: Legal Document Domain Mapping

How we fixed a schema mismatch between legal document hints and database constraints—and what it taught us about maintaining coherence across specialized ingest pipelines.

Engineering · Mar 25, 2026

Making Memory Visible: Why We Added Retrieval Logging to Strug Recall

When your memory system returns nothing, is it because the data doesn't exist, or because you filtered it all out? We shipped logging to answer that question in seconds instead of hours.

Engineering · Mar 25, 2026

Keeping Up With Claude: Migrating to claude-3-5-haiku-latest

A small but necessary infrastructure update: migrating Sabine from deprecated Claude 3 Haiku to the latest version. The kind of work that doesn't make headlines but prevents 3am production alerts.

Engineering · Mar 25, 2026

Smarter Contract Ingestion: Why Details Matter

We upgraded Sabine's legal document pipeline to extract contract numbers and customer information—a small change with big implications for partnership intelligence.

Engineering · Mar 25, 2026

Teaching Sabine to Read Contracts Like a Chief of Staff

How we redesigned Sabine's legal document extraction to capture 95%+ of contract data—and why it matters when your AI assistant is actually managing vendor relationships.

Engineering · Mar 25, 2026

Upgrading Sabine to Claude Haiku 4.5: What Changed and Why It Matters

We upgraded Sabine Super Agent to Claude Haiku 4.5, the latest foundation model from Anthropic. Here's what changed, why we made the call, and what it means for Sabine's performance.

Engineering · Mar 25, 2026

When Good Memories Go Missing: A Retrieval Bug Story

Engineering · Mar 25, 2026

When Your API Contract Doesn't Match: A Memory Ingest Fix

How a simple parameter naming mismatch broke Sabine's memory ingestion pipeline, and what we learned about API contracts in distributed systems.

Engineering · Mar 23, 2026

Teaching Sabine to Learn From Your Feedback

We shipped preference signal collection and DPO export, giving Sabine the infrastructure to learn from thumbs up, thumbs down, and corrections in real time.

Read article

EngineeringMar 23, 2026· min read

Teaching the System to Learn: Observational Memory in Strug Works

We shipped observational memory—a new system where specialized agents watch execution patterns, extract insights, and help the platform learn from its own experience.

Read article

EngineeringMar 23, 2026· min read

Shipping Faster: How Parallel Tool Calls Cut Agent Response Time

We shipped batched parallel tool calls this week, allowing Strug Works agents to execute multiple independent operations simultaneously. Here's why it matters for production AI systems.

Read article

EngineeringMar 23, 2026· min read

How We Built Epistemic Partitioning Into Agent Memory

Strug Works agents now organize memory like humans do—across four specialized networks that know what they know, and when they learned it.

Read article

EngineeringMar 23, 2026· min read

Teaching AI to Say 'I Don't Know' Like a Human

Why we updated Sabine's abstention tests to expect natural hedging language instead of mechanical uncertainty phrases — and what it reveals about building AI that feels like a partner, not a program.

Read article

EngineeringMar 22, 2026· min read

Fixing Memory: How We Restored Sabine's Conversational Context

A post-mortem on a memory loading bug that was breaking conversational continuity in Sabine, and how we fixed it.

Read article

EngineeringMar 22, 2026· min read

Building Resilient AI Agents: How We Solved the Token Budget Problem

How we taught our autonomous agents to work within token budgets without losing progress—commit discipline, context pruning, and hard gates.

Read article

EngineeringMar 22, 2026· min read

Teaching Sabine to Remember: How We Fixed Conversational Context

A deep dive into how we taught Sabine to use conversation history for follow-up questions, making interactions feel more natural and intelligent.

Read article

EngineeringMar 22, 2026· min read

Laying the Foundation: Memory Architecture and Intelligent Routing in Strug Works

Wave 1 of our Phase 2-3 initiative brings foundational memory and routing capabilities to Strug Works. Here's what changed and why it matters for autonomous agent teams.

Read article

EngineeringMar 21, 2026· min read

Teaching Machines to Know What They Don't Know

We just shipped the first phase of self-improvement for Strug Works: the ability to detect capability gaps through structural honesty.

Read article

EngineeringMar 21, 2026· min read

Installing Sabine on Your iPhone: PWA Support Arrives

Sabine now supports installation as a progressive web app on iOS, bringing native-like access to iPhone and iPad users.

Read article

EngineeringMar 21, 2026· min read

Teaching Agents to Say 'I Don't Know'

We shipped Phases 1 and 2 of self-improvement to Sabine: structural honesty and programmatic enforcement. Here's what changed and why it matters for autonomous agent reliability.

Read article

EngineeringMar 20, 2026· min read

How We Fixed Reminder Search in Sabine

A deep look at fixing a search bug that prevented Sabine from finding reminders when users asked about specific topics.

Read article

EngineeringMar 20, 2026· min read

Teaching Sabine About the Weather

We shipped a new weather skill that connects Sabine to real-time weather data. Here's what changed and why it matters for contextual AI interactions.

Read article

EngineeringMar 20, 2026· min read

How We Fixed Real-Time Task Streaming in Strug Central

A deep dive into why we added exponential backoff to our SSE implementation and how it makes Strug Central more resilient under load.

Read article

EngineeringMar 20, 2026· min read

Teaching Sabine About the Weather

We've expanded Sabine's capabilities with real-world weather data integration. Here's what changed and why it matters for building more contextual AI partnerships.

Read article

EngineeringMar 20, 2026· min read

Building Confidence: Comprehensive Test Coverage for Sabine's Weather Skill

How we built robust test coverage for Sabine's weather skill handler to ensure reliable conversational experiences.

Read article

EngineeringMar 19, 2026· min read

Making Conversations Stick: How We Built Persistent Chat Threads for Sabine

Users shouldn't have to remember what they told their AI partner. We shipped persistent chat threads so Sabine conversations survive sessions, refreshes, and restarts.

Read article

EngineeringMar 19, 2026· min read

How We Fixed Chat Persistence in Sabine

A deep dive into solving a critical chat persistence issue where inconsistent user ID handling was breaking conversation history in our AI partnership platform.

Read article

EngineeringMar 19, 2026· min read

Shipping Stability: How We Fixed Critical API Errors in Sabine

When you're building an AI partnership platform, reliability isn't optional. Here's how we tracked down and fixed 500 errors that were breaking core features.

Read article

EngineeringMar 19, 2026· min read

When 'Default' Isn't Good Enough: Fixing Sabine's Reminders

How we tracked down and fixed a 500 error in Sabine's reminders feature by replacing placeholder logic with proper user authentication.

Read article

EngineeringMar 19, 2026· min read

When Your Sidebar Disappears: Shipping Sabine's Light Mode Fix

Sabine's sidebar text was invisible in light mode. Here's how we caught it, fixed it, and improved navigation priority in the same commit.

Read article

EngineeringMar 18, 2026· min read

Fixing What Never Worked: The Reminder Bug That Slipped Through

How a missing scheduler job and a silent TypeError kept Sabine reminders from ever firing—and what we learned about testing background jobs.

Read article

EngineeringMar 16, 2026· min read

Teaching Sabine Where to Talk: Default Channel Inference and Smarter Reminders

We shipped two features to Sabine that make Slack integration feel less like configuration and more like conversation: automatic channel inference and the ability to update reminders instead of just creating them.

Read article

EngineeringMar 15, 2026· min read

Hardening Email Reliability in Sabine

How we improved Sabine's Gmail integration with status tokens, singleton architecture, and better deployment practices.

Read article

EngineeringMar 15, 2026· min read

Shipping Cycle 7: Getting MCP Integration Over the Line

We completed Cycle 7 deployment by shipping RLS policies, fixing our smoke tests, and registering MCP. Here's what actually happened and what we learned.

Read article

EngineeringMar 15, 2026· min read

Memory Observatory: Making Agent Memory Visible

We shipped the Memory Observatory — a new interface for browsing, editing, and injecting agent memory. Here's what changed and why it matters.

Read article

EngineeringMar 15, 2026· min read

Fixing What's Broken: Memory API Authentication

We removed authentication that was never going to work. Here's why that's actually a good thing.

Read article

EngineeringMar 15, 2026· min read

Building Memory into Strug Works: How We Taught Our Agents to Remember

We just shipped comprehensive memory seeding across the Strug Works platform. Here's why agent memory matters and what we learned building it.

Read article

EngineeringMar 14, 2026· min read

Making Agent Memory Actually Work: Fixing RLS for All Roles

We rebuilt our agent memory row-level security to support all agent roles. Here's what broke, how we fixed it, and what we learned.

Read article

EngineeringMar 14, 2026· min read

When Your Orchestrator Jumps the Gun: Fixing Mission State in Spec-First Workflows

We fixed a race condition where our orchestrator agent was calling mission complete before implementation tasks actually finished. Here's what broke and how we fixed it.

Read article

EngineeringMar 14, 2026· min read

Cycle 6: Teaching Agents to Remember and Branch

We shipped context snapshots, mission grouping, and speculative branching. Here's what changed and why it matters.

Read article

EngineeringMar 14, 2026· min read

Shipping Fixes: Dream Team MCP Server Deployment

Deployment fixes and smoke tests for the Dream Team MCP server. Not flashy, but foundational—the kind of work that makes everything else possible.

Read article

EngineeringMar 14, 2026· min read

We Built a Code Review Agent (And It Actually Helps)

How we automated the repetitive parts of code review without losing the human touch.

Read article

EngineeringMar 13, 2026· min read

Leveling Up Our Test Infrastructure: pytest-asyncio Meets Gate 2

How we upgraded our Docker test-runner to support async testing in our quality gate pipeline.

Read article

EngineeringMar 13, 2026· min read

GTM Scheduler Gets Daily Cadence and Digest Support

Our GTM scheduler just got more flexible with daily scheduling and digest content types—here's what changed and why it matters.

Read article

EngineeringMar 13, 2026· min read

Teaching Agents to Remember: Building Bi-Temporal Memory

How we gave Strug Works the ability to learn from its own history with bi-temporal memory and automated consolidation.

Read article

EngineeringMar 13, 2026· min read

Smoothing the Edges: Frontend Fixes for Mission Control

We shipped three targeted fixes to the mission control interface that improve clarity and usability across the dispatch workflow.

Read article

EngineeringMar 13, 2026· min read

Teaching Agents to Remember What Matters, Not Just What Matches

We shipped semantic memory retrieval this week. Our agents now find relevant context by meaning, not just keyword matching—with a confidence-based fallback when needed.

Read article

EngineeringMar 13, 2026· min read

Fixing What Breaks: 5 Backend Fixes for Budget Execution

Sometimes the best code you ship is the code that fixes what you broke. Here's what we learned from five backend failures.

Read article

EngineeringMar 13, 2026· min read

Fixing What Breaks: 5 Backend Patches for Agent Stability

We merged fixes for 5 backend issues causing sc-frontend budget failures. Here's what broke, why, and what we did about it.

Read article

EngineeringMar 13, 2026· min read

When Your Gate Locks Out the Good Guys: A Middleware Fix Story

How a well-intentioned security gate ended up blocking our own CI pipelines, and the quick fix that got us back on track.

Read article

EngineeringMar 9, 2026· min read

GTM Agent v1: Automating Content Production Without Losing Voice

Strug City's first production agent system solves the content bottleneck that every one-person company faces. Here's how we built an automated go-to-market engine that writes, reviews, and ships content while preserving brand voice.

Read article

EngineeringMar 7, 2026· min read

SINC Sprint 1: Teaching Agents to Remember

How we built a persistent knowledge graph and comprehensive telemetry system to make Sabine smarter across conversations.

Read article

CompanyMar 7, 2026· min read

We're Strug City: Here's Our Story

We just launched our brand. One human, seven AI agents, and a belief that the future of work looks different than we thought.

Read article

EngineeringFeb 11, 2026·10 min read

Natural Language Analytics: Turning Questions into Insights

How Aurora Analytics transforms plain English questions into SQL queries and visualizations. A look at our NLP pipeline and query optimization.

Read article

ProductFeb 9, 2026·15 min read

Getting Started with NorthStar SDK: Your First AI-Native App

A practical guide to building your first application with NorthStar SDK. Learn the core concepts and build a semantic search feature in 30 minutes.

Read article

ResearchFeb 5, 2026·12 min read

Building a Vector Database from Scratch: Architecture Decisions

Deep dive into the architecture of Glacier DB. Learn about our approach to distributed vector search, indexing strategies, and performance optimization.

Read article

EngineeringFeb 1, 2026·8 min read

AI-Powered Code Reviews: Best Practices and Patterns

Learn how to effectively integrate AI agents into your code review process. We share lessons from building automated review systems at scale.

Read article

CompanyJan 15, 2026·5 min read

Introducing Strug City: Building the Future of AI-Powered Development

We're a virtual engineering team on a mission to make AI-powered development accessible to everyone. Learn about our vision and the products we're building.

Read article

EngineeringJun 15, 2025· min read

Why Autonomous AI Teams Need Memory: Building Intelligence That Compounds

The difference between an AI that executes tasks and one that truly learns isn't raw compute power—it's institutional memory. Here's how we built a development platform where every bug fix, integration challenge, and solution becomes permanent knowledge.

Read article

EngineeringJun 15, 2025· min read

Why We Renamed Dream Team to Strug Works

We renamed our autonomous engineering team from Dream Team to Strug Works. The change wasn't cosmetic — it was about clarifying what we're building and who we're building it for.

Read article

EngineeringMar 25, 2025· min read

Teaching Sabine to Know What She's Reading

Before today, Sabine's memory system treated every document the same way. An urgent email looked identical to a product spec or a meeting transcript in her vector database. That changed with the new document type classifier.

Read article

EngineeringMar 25, 2025· min read

Teaching Sabine to Read Between the Lines: Holistic Biographical Document Processing

How we moved from rigid field extraction to context-aware biographical document understanding in Sabine.

Read article

EngineeringMar 19, 2025· min read

Sabine Gets a Face: Shipping Six Pages in Phase 2

We just shipped the full frontend build-out for Sabine Super Agent. Six new pages. One focused push. This is what happens when you stop treating your personal AI assistant like a black box and start giving it a proper interface.

Read article

EngineeringMar 7, 2025· min read

SINC Sprint 1: Teaching Sabine to Remember

How we built a persistent knowledge graph for Sabine with entity extraction, importance scoring, and comprehensive agent telemetry.

Read article

Jan 15, 2025· min read

Beyond DevOps: Why Your Next Engineering Team Will Be Autonomous

The shift from DevOps to autonomous engineering platforms isn't just about automation—it's about fundamentally rethinking how software gets built. Here's what technical leaders need to know.

Read article

EngineeringJan 15, 2025· min read

What Nobody Tells You About Running an AI-First Organization

Six months into running Strug City as a one-person company with an autonomous AI engineering team, here's what I've learned about making agents actually work in production—and why most AI tooling companies will never figure this out.

Read article

EngineeringJan 10, 2025· min read

What Nobody Tells You About Running an Autonomous Agent Team in Production

Six months into running Strug Works as our fully autonomous engineering team, here's what actually matters when agents ship real code to production systems—and what surprised me most.

Read article

Jan 15, 2024· min read

Why Autonomous Development Teams Are the Future of Software Engineering

The software development landscape is shifting. AI-powered autonomous teams aren't just assisting developers—they're becoming the developers. Here's why this matters for your engineering organization.

Read article

Jan 15, 2024· min read

Why Autonomous Development Teams Are the Next Evolution in Software Engineering

The shift from outsourcing to AI-powered autonomous teams represents a fundamental change in how software gets built. Here's why technical leaders are making the switch.

Read article

EngineeringJan 15, 2024· min read

Beyond Code Generation: Why the Future of AI Development Is Autonomous Teams

AI coding assistants are everywhere, but they're just the first wave. The real transformation comes when AI agents work together as complete development teams—planning, building, testing, and deploying without human handholding.

Read article