Files
kit/internal
space_cowboy 70cd214175 fix(mcp): surface MCP tool failures as soft errors, not critical aborts
The MCP adapter previously wrapped any error returned by MCPToolManager.ExecuteTool
into a Go error returned from the fantasy.AgentTool.Run interface. The fantasy
agent loop treats those as critical errors and aborts the entire turn —
discarding all prior reasoning, tool calls, and results.

In practice that meant a single misbehaved MCP server returning a JSON-RPC
"-32602 Invalid params" (e.g. a Zod schema mismatch on the server's input
validation) would kill an in-progress turn after the model had already done
dozens of seconds of useful work, with no way for the model to see the
validation message and self-correct.

This mismatched the contract that native Kit tools follow: native tools
return errors via kit.ErrorResult(...), which become soft tool-result errors
that the model reads and can act on (retry with corrected args, try a
different tool, give up gracefully).

Make the MCP path behave the same way:

  - JSON-RPC protocol errors, transport failures, and server-side schema
    rejections are now returned as fantasy.NewTextErrorResponse(...) with
    err == nil, so the agent loop continues and the model sees the failure
    in-band as a tool result it can reason about.
  - Context cancellation (ctx.Err() != nil) remains a critical error so
    callers can abort turns deterministically. This is the only case where
    bubbling up is correct — the caller intentionally tore the turn down
    and the agent must not keep spinning.
  - Server-side soft errors (CallToolResult{ isError: true }) and the
    happy path are unchanged.

The agent loop's MaxSteps cap already bounds the worst case for a
permanently broken MCP server, so there is no risk of unbounded retries.

Side effect: extracted a tiny mcpExecutor interface for the one method the
adapter uses (ExecuteTool), purely so the adapter is unit-testable in
isolation without standing up a full MCPToolManager + connection pool.

Behavior change note for downstream consumers: code that relied on
host.PromptResult / Stream returning a Go error containing
"mcp tool execution failed" will no longer see those errors — the
failure information is now in the assistant's final response (or in the
OnAfterToolResult / OnToolResult hooks, where IsError will be true).
Context cancellation continues to surface as an error from those calls
as before.
2026-05-13 19:48:13 +03:00
..