Most AI agent frameworks execute tool calls one at a time. When an agent needs to read three files, check git status, and query memory, it waits for each operation to complete before starting the next. That's fine for demos. It's not fine for production.
The Problem
Strug Works agents handle complex engineering tasks: reading codebases, writing tests, committing changes, updating Linear issues. A typical task might involve 15-20 tool calls. When those calls are independent—reading multiple files, for example—sequential execution adds unnecessary latency.
If each file read takes 200ms and you need to read five files, that's a full second of wait time before the agent can reason about what it found. Multiply that across dozens of tasks per day, and the delays add up fast.
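The arithmetic is easy to see with a minimal asyncio sketch. Everything here is illustrative: `read_file` simulates a 200ms tool call with a sleep, and the paths are made up.

```python
import asyncio
import time

# Hypothetical read_file: stands in for a tool call with ~200ms of I/O latency.
async def read_file(path: str) -> str:
    await asyncio.sleep(0.2)  # simulate the round trip
    return f"contents of {path}"

async def read_sequentially(paths: list[str]) -> list[str]:
    # One await at a time: each read waits for the previous one to finish.
    return [await read_file(p) for p in paths]

async def read_in_parallel(paths: list[str]) -> list[str]:
    # All reads start at once; gather returns when the last one finishes.
    return await asyncio.gather(*(read_file(p) for p in paths))

paths = [f"src/module_{i}.py" for i in range(5)]

start = time.perf_counter()
seq_results = asyncio.run(read_sequentially(paths))
seq_elapsed = time.perf_counter() - start  # roughly 1.0s: five 200ms reads back to back

start = time.perf_counter()
par_results = asyncio.run(read_in_parallel(paths))
par_elapsed = time.perf_counter() - start  # roughly 0.2s: the reads overlap
```

Both versions return the same results; only the wall-clock time differs.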
The Solution
This week we shipped batched parallel tool calls. When the agent model returns multiple independent function calls in a single response, our executor now runs them concurrently instead of sequentially.
The implementation respects dependencies: if a call depends on the result of another, we maintain order. But when calls are truly independent—like reading from different parts of the codebase or querying multiple memory scopes—we execute them in parallel and return all results together.
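One way to sketch that dependency-respecting behavior is wave-based execution: repeatedly collect every call whose dependencies are satisfied, run that wave concurrently, and feed the results forward. The `ToolCall` shape, `depends_on` field, and `execute_batch` function below are illustrative names, not the actual Strug Works API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable

@dataclass
class ToolCall:
    name: str
    fn: Callable[[dict[str, Any]], Awaitable[Any]]  # receives earlier results
    depends_on: set[str] = field(default_factory=set)

async def execute_batch(calls: list[ToolCall]) -> dict[str, Any]:
    results: dict[str, Any] = {}
    pending = {c.name: c for c in calls}
    while pending:
        # A call is ready once every call it depends on has produced a result.
        ready = [c for c in pending.values() if c.depends_on <= set(results)]
        if not ready:
            raise ValueError("dependency cycle among tool calls")
        # Run the independent wave concurrently; dependent calls
        # wait for a later wave.
        outputs = await asyncio.gather(*(c.fn(results) for c in ready))
        for call, out in zip(ready, outputs):
            results[call.name] = out
            del pending[call.name]
    return results
```

With this shape, two independent reads and a call that consumes both would execute in two waves: the reads together, then the dependent call.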
This isn't just about shaving milliseconds. It's about making agents feel responsive enough for real engineering workflows. When sc-backend needs to audit a Python module, check test coverage, and read the Linear issue context, those operations happen simultaneously now.
Why It Matters
The difference between research prototypes and production AI systems comes down to details like this. Users don't tolerate waiting for agents to slowly read files one by one. Engineers expect their tools to be fast.
Parallel execution also unlocks new patterns. Agents can now confidently read entire sections of a codebase upfront, explore multiple approaches simultaneously, and gather context more aggressively without worrying about time budgets.
What's Next
This is the first step in a broader performance push. We're exploring intelligent prefetching—predicting which files an agent will need based on task context and loading them proactively. We're also looking at persistent agent sessions that maintain warm connections to frequently accessed resources.
The goal isn't just faster agents. It's agents that feel like teammates—responsive, efficient, and ready to work. Parallel tool calls get us closer to that reality.