The Handoff Problem
When one agent finishes work and passes it to another, information gets lost. The receiving agent does not know what the sender tried, what failed, what assumptions were made, or what context shaped the decisions. This is the handoff problem, and it is one of the most common causes of failure in multi-agent systems.
Consider a coder agent that writes a function and hands it to a reviewer. Without a structured handoff, the reviewer sees only the code — not the requirements it was built against, the alternatives that were considered, or the edge cases the coder intentionally deferred. The review becomes shallow because the reviewer lacks context.
The fix is not more tokens or bigger context windows. The fix is a protocol — a defined structure that every handoff must follow.
Structured Handoff Formats
Every agent-to-agent message should follow a consistent schema. At minimum, include these fields:
- task_id: A unique identifier linking all messages about the same task. Without this, agents cannot correlate related work.
- from_agent: Who sent this message. The receiver needs to know the sender's role to interpret the content correctly.
- action: What the sender did, as one value from a fixed set: "completed", "failed", "needs_review", or "escalated". This value tells the receiver what to do next.
- payload: The actual work product — code, a review, test results, a plan. Structured as a typed object, never a raw string.
- context: Relevant background the receiver needs — requirements, constraints, prior decisions. Keep it focused: only what this specific receiver needs.
- metadata: Timestamps, token usage, confidence scores, attempt count. Used for monitoring and debugging, not by the receiving agent directly.
Define this schema once and enforce it everywhere. The coordinator should reject malformed messages rather than silently accept them.
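As a minimal sketch of the fields listed above, the schema and its enforcement might look like the following. The names Handoff, Action, and parse_handoff are illustrative assumptions, not part of any particular framework, and the choice of which fields are required is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Action(str, Enum):
    """The fixed set of actions named in the schema above."""
    COMPLETED = "completed"
    FAILED = "failed"
    NEEDS_REVIEW = "needs_review"
    ESCALATED = "escalated"


@dataclass(frozen=True)
class Handoff:
    task_id: str
    from_agent: str
    action: Action
    payload: dict[str, Any]                                  # typed object, never a raw string
    context: dict[str, Any] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)


# Assumption: these four fields are mandatory; context and metadata are optional.
REQUIRED_FIELDS = ("task_id", "from_agent", "action", "payload")


def parse_handoff(raw: dict[str, Any]) -> Handoff:
    """Validate a raw message. Raises ValueError so a coordinator can reject it."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    for name in ("task_id", "from_agent"):
        if not isinstance(raw[name], str) or not raw[name]:
            raise ValueError(f"{name} must be a non-empty string")
    if not isinstance(raw["payload"], dict):
        raise ValueError("payload must be a structured object, not a raw string")
    try:
        action = Action(raw["action"])
    except ValueError:
        raise ValueError(f"unknown action: {raw['action']!r}") from None
    return Handoff(
        task_id=raw["task_id"],
        from_agent=raw["from_agent"],
        action=action,
        payload=raw["payload"],
        context=raw.get("context", {}),
        metadata=raw.get("metadata", {}),
    )
```

A coordinator would call parse_handoff on every incoming message and treat a ValueError as a rejected handoff rather than passing the message along.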
Shared Context Schemas
Beyond individual handoffs, agents need a shared understanding of the project they are working on. A shared context schema defines what every agent in the system can access:
- Project state: Current files, recent changes, active branch. Every agent reads from the same source of truth.
- Task queue: What tasks are pending, in progress, and completed. Agents check this before starting work to avoid duplication.
- Decision log: Key choices made during the session — "chose REST over GraphQL because X," "deferred pagination to next sprint." Prevents agents from revisiting settled decisions.
- Constraints: Rules that all agents must follow — tech stack, coding style, forbidden patterns. Loaded once at the start of every agent session.
The shared context is read-heavy. Agents read it constantly but write to it rarely and through a controlled process — typically only the coordinator or a dedicated memory agent can update it.
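One way to sketch this read-heavy, write-controlled store is shown below. The ContextStore class, the WRITERS set, and the method names are hypothetical, chosen only to illustrate the idea that every agent can read but only designated roles can write.

```python
from types import MappingProxyType
from typing import Any, Iterable, Mapping

# Assumption: only these two roles are allowed to update shared context.
WRITERS = frozenset({"coordinator", "memory_agent"})
TASK_STATUSES = ("pending", "in_progress", "completed")


class ContextStore:
    def __init__(self, constraints: Iterable[str]) -> None:
        self._project_state: dict[str, Any] = {}
        self._task_queue: dict[str, str] = {}        # task_id -> status
        self._decision_log: list[str] = []
        self._constraints: tuple[str, ...] = tuple(constraints)  # loaded once

    def read(self) -> Mapping[str, Any]:
        """Return a read-only snapshot that any agent may inspect."""
        return MappingProxyType({
            "project_state": MappingProxyType(dict(self._project_state)),
            "task_queue": MappingProxyType(dict(self._task_queue)),
            "decision_log": tuple(self._decision_log),
            "constraints": self._constraints,
        })

    def _require_writer(self, agent: str) -> None:
        if agent not in WRITERS:
            raise PermissionError(f"{agent} may not write to shared context")

    def log_decision(self, agent: str, decision: str) -> None:
        self._require_writer(agent)
        self._decision_log.append(decision)

    def set_task_status(self, agent: str, task_id: str, status: str) -> None:
        self._require_writer(agent)
        if status not in TASK_STATUSES:
            raise ValueError(f"unknown task status: {status!r}")
        self._task_queue[task_id] = status


store = ContextStore(constraints=["TypeScript only", "no global state"])
store.set_task_status("coordinator", "t-1", "in_progress")
store.log_decision("memory_agent", "chose REST over GraphQL because X")
snapshot = store.read()  # safe to hand to any agent
```

Returning a snapshot rather than the live dictionaries means a reader cannot change shared state by accident. Every change has to go through a method that checks the writer's role.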
Context Compression
Raw context is expensive. A full file diff, a complete test suite output, or an entire requirements document can consume thousands of tokens. Agents need compressed context: the essential information without the noise.
Three compression strategies that work in practice:
- Summary extraction: Before passing context, have the sending agent summarize it. "5 files changed, 2 new functions added, 1 breaking change in the API" is far more useful than the raw diff for a reviewer doing a first pass.
- Relevance filtering: Each agent role needs different context. A tester does not need the architectural rationale — it needs the function signatures and expected behavior. Filter context per recipient role.
- Progressive detail: Start with a summary. If the receiving agent needs more, it requests the full version. This lazy-loading approach keeps most handoffs small while allowing depth when needed.
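The three strategies above can be sketched in a few lines each. Everything here is an illustrative assumption: the shape of the change records, the ROLE_FIELDS mapping, and the LazyContext class are not taken from any specific library.

```python
from typing import Any, Callable

# Assumption: which context keys each role needs. Adjust per team.
ROLE_FIELDS = {
    "reviewer": ("summary", "requirements", "deferred_edge_cases"),
    "tester": ("summary", "function_signatures", "expected_behavior"),
}


def summarize_changes(changes: list[dict[str, Any]]) -> str:
    """Summary extraction: condense per-file change records into one line."""
    new_functions = sum(c.get("new_functions", 0) for c in changes)
    breaking = sum(1 for c in changes if c.get("breaking"))
    return f"{len(changes)} files changed, {new_functions} new functions, {breaking} breaking"


def filter_for_role(context: dict[str, Any], role: str) -> dict[str, Any]:
    """Relevance filtering: keep only the keys this recipient role needs."""
    allowed = ROLE_FIELDS.get(role, ("summary",))
    return {key: context[key] for key in allowed if key in context}


class LazyContext:
    """Progressive detail: ship the summary, load full versions on request."""

    def __init__(self, summary: str, loaders: dict[str, Callable[[], Any]]) -> None:
        self.summary = summary
        self._loaders = loaders
        self._cache: dict[str, Any] = {}

    def expand(self, key: str) -> Any:
        # Each loader runs at most once; later requests hit the cache.
        if key not in self._cache:
            self._cache[key] = self._loaders[key]()
        return self._cache[key]


changes = [
    {"path": "api.py", "new_functions": 2, "breaking": True},
    {"path": "util.py"},
]
handoff_context = LazyContext(
    summary=summarize_changes(changes),
    loaders={"diff": lambda: "...full diff text loaded on demand..."},
)
```

The receiver reads handoff_context.summary first and calls expand("diff") only if the summary is not enough, so most handoffs never pay for the full diff.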
Error Propagation
When an agent fails, the system needs to know immediately. Silent failures are the worst outcome — they let bad data flow downstream, corrupting every subsequent agent's work.
Design your error protocol around these principles:
- Fail loudly: Every agent must report failures using the same handoff schema with a "failed" action. Include the error type, what was attempted, and how many retries were made.
- Classify errors: Distinguish between retryable errors (timeout, rate limit) and permanent errors (invalid input, missing dependency). The coordinator handles each type differently.
- Propagate upward: When a pipeline stage fails, the error must reach the coordinator — not just the next stage. The coordinator decides whether to retry, reroute, or abort the entire task.
- Include recovery hints: When possible, the failing agent should suggest what might fix the problem. "Missing type definition for User — check types.ts" is actionable. "Something went wrong" is not.
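A rough sketch of these principles follows, assuming a small fixed vocabulary of error types and a simple coordinator policy. The function names, the max_retries default, and the rule that a recovery hint triggers a reroute are all assumptions made for illustration.

```python
from enum import Enum
from typing import Any, Optional


class ErrorKind(str, Enum):
    RETRYABLE = "retryable"
    PERMANENT = "permanent"


# Error types taken from the examples above.
# Assumption: anything unrecognized is treated as permanent.
RETRYABLE_TYPES = frozenset({"timeout", "rate_limit"})


def classify(error_type: str) -> ErrorKind:
    return ErrorKind.RETRYABLE if error_type in RETRYABLE_TYPES else ErrorKind.PERMANENT


def failure_message(
    task_id: str,
    from_agent: str,
    error_type: str,
    attempted: str,
    retries: int,
    hint: Optional[str] = None,
) -> dict[str, Any]:
    """Fail loudly: report the failure in the same handoff shape as any other message."""
    return {
        "task_id": task_id,
        "from_agent": from_agent,
        "action": "failed",
        "payload": {
            "error_type": error_type,
            "error_kind": classify(error_type).value,
            "attempted": attempted,
            "recovery_hint": hint,
        },
        "context": {},
        "metadata": {"attempt_count": retries},
    }


def decide(message: dict[str, Any], max_retries: int = 3) -> str:
    """Coordinator policy (hypothetical): retry, reroute, or abort."""
    payload = message["payload"]
    attempts = message["metadata"]["attempt_count"]
    if payload["error_kind"] == ErrorKind.RETRYABLE.value and attempts < max_retries:
        return "retry"
    if payload.get("recovery_hint"):
        return "reroute"  # someone else may be able to act on the hint
    return "abort"


timeout = failure_message("t-1", "tester", "timeout", "run test suite", retries=1)
missing = failure_message(
    "t-1", "coder", "missing_dependency", "compile module", retries=0,
    hint="Missing type definition for User; check types.ts",
)
```

With this policy, decide(timeout) yields "retry" and decide(missing) yields "reroute". The decision lives in the coordinator, not in the next pipeline stage, which matches the "propagate upward" principle.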
Practical Exercise
Define a complete handoff protocol for a 3-agent team (coder, reviewer, tester). Write out:
- The JSON schema for agent-to-agent messages (all required fields, types, and enums)
- The shared context schema (what every agent can read)
- Three example handoffs: coder → reviewer, reviewer → coder (with feedback), coder → tester
- Two error scenarios: a retryable failure and a permanent failure, with the exact messages each agent would produce
Test your protocol by walking through a realistic task — implementing a new API endpoint — and tracing every message that flows between agents. If any step feels ambiguous, your protocol has a gap.