* fix(sdk): stop leaking fantasy types through pkg/kit.AgentConfig (#30)
Replace the alias-based AgentConfig and handler types with SDK-owned
structs and function types. CoreTools / ExtraTools / ToolWrapper now
accept []kit.Tool, and the handler types (ToolCallHandler,
ToolExecutionHandler, ToolResultHandler, ResponseHandler,
StreamingResponseHandler, ToolCallContentHandler) plus SpinnerFunc are
declared in pkg/kit/ with signatures that reference only SDK types.
Consumers no longer need to import charm.land/fantasy to populate an
AgentConfig or assign a handler. go doc pkg/kit AgentConfig output no
longer mentions fantasy.*.
- Add unexported (*AgentConfig).toInternal() to convert at the SDK
boundary; Tool is still an alias for the underlying tool type, so
slice and function fields convert without allocation.
- Add agent_config_internal_test.go covering nil receiver, scalar
fields, tool slices, ToolWrapper invocation, OnMCPServerLoaded, and
auth/token-factory wiring.
- Add types_test.go cases that populate AgentConfig and SpinnerFunc
without importing fantasy -- the file compiling is the regression
proof for the leak.
- Update pkg/kit/README.md Re-exported Types section to record that
AgentConfig and the handler types are now Kit-owned.
Fixes#30
* fix(sdk): add DebugLogger and MCPTaskConfig to kit.AgentConfig (#30)
The first revision of the SDK-owned AgentConfig dropped two fields that
internal/agent.AgentConfig carried: DebugLogger (tools.DebugLogger) and
MCPTaskConfig (tools.MCPTaskConfig). Restore them with SDK-owned
equivalents and wire them through toInternal().
- Add kit.DebugLogger interface (LogDebug / IsDebugEnabled) mirroring
tools.DebugLogger. Interface-to-interface assignment is automatic
because the method sets match.
- Add kit.MCPTaskConfig struct mirroring tools.MCPTaskConfig with SDK
types (MCPTaskMode, MCPTaskProgressHandler) and a toToolsConfig()
helper that converts at the SDK boundary.
- Wire both new fields in (*AgentConfig).toInternal().
- Extend agent_config_internal_test.go with cases for both fields.
- Document the additions in pkg/kit/README.md.
The MCP adapter previously wrapped any error returned by MCPToolManager.ExecuteTool
into a Go error returned from the fantasy.AgentTool.Run interface. The fantasy
agent loop treats those as critical errors and aborts the entire turn —
discarding all prior reasoning, tool calls, and results.
In practice that meant a single misbehaved MCP server returning a JSON-RPC
"-32602 Invalid params" (e.g. a Zod schema mismatch on the server's input
validation) would kill an in-progress turn after the model had already done
dozens of seconds of useful work, with no way for the model to see the
validation message and self-correct.
This mismatched the contract that native Kit tools follow: native tools
return errors via kit.ErrorResult(...), which become soft tool-result errors
that the model reads and can act on (retry with corrected args, try a
different tool, give up gracefully).
Make the MCP path behave the same way:
- JSON-RPC protocol errors, transport failures, and server-side schema
rejections are now returned as fantasy.NewTextErrorResponse(...) with
err == nil, so the agent loop continues and the model sees the failure
in-band as a tool result it can reason about.
- Context cancellation (ctx.Err() != nil) remains a critical error so
callers can abort turns deterministically. This is the only case where
bubbling up is correct — the caller intentionally tore the turn down
and the agent must not keep spinning.
- Server-side soft errors (CallToolResult{ isError: true }) and the
happy path are unchanged.
The agent loop's MaxSteps cap already bounds the worst case for a
permanently broken MCP server, so there is no risk of unbounded retries.
Side effect: extracted a tiny mcpExecutor interface for the one method the
adapter uses (ExecuteTool), purely so the adapter is unit-testable in
isolation without standing up a full MCPToolManager + connection pool.
Behavior change note for downstream consumers: code that relied on
host.PromptResult / Stream returning a Go error containing
"mcp tool execution failed" will no longer see those errors — the
failure information is now in the assistant's final response (or in the
OnAfterToolResult / OnToolResult hooks, where IsError will be true).
Context cancellation continues to surface as an error from those calls
as before.
Co-authored-by: space_cowboy <space_cowboy@mark3labs.com>
- register loaded skills into the input autocomplete under category
"Skills" with HasArgs so Enter populates "/skill:name " instead of
auto-submitting, leaving room for trailing args
- prefix descriptions with [project] or [user] to disambiguate
colliding skill names across sources
- extend refreshSkillItems to prune & re-add Skills entries on
ContentReloadEvent, matching the pattern used for prompt templates
and MCP prompts
- add Description field to ui.SkillItem and populate it from
kit.Skill.Description in both initial build and hot-reload paths
- Add unexported steerDrainFn test seam on App so unit tests can
inject fake steer items without standing up a full *kit.Kit
(Options.Kit is a concrete struct, not an interface).
- releaseBusyAfterCompact now prefers the seam over Kit.DrainSteer
via a small switch; production behaviour is unchanged when the
field is nil.
- Add TestReleaseBusyAfterCompact_splicesSteerAheadOfQueue, which
pre-populates both fake steer items and ordinary queue prompts,
invokes releaseBusyAfterCompact, and asserts the first dispatched
prompt is the steer item — proving steer messages retain 'act now'
priority and that drainQueue is actually launched (the bug from
#27).
- Add releaseBusyAfterCompact() shared deferred tail used by both
CompactConversation and CompactAsync. It drains the SDK steer
channel, splices steer items in front of any queued prompts, and
hands off to drainQueue so messages received during compaction
are dispatched automatically once compaction finishes.
- Previously, busy was simply cleared on completion and the queue
sat idle until the user submitted another prompt, which then
flushed everything together.
- Honor the closed flag so a teardown during compaction discards
pending items instead of spawning drainQueue against a torn-down
App.
- Add regression tests covering the queued-flush, idle-empty, and
closed-during-compact paths.
Fixes#27
SetupCLIForNonInteractive returns nil when --quiet is active, matching
the pre-existing nil checks elsewhere in the same block (e.g. the
buffered debug-message branch). Without this guard the new
'System Prompt loaded' notice panicked on quiet, non-interactive runs.
Discovered via tmux smoke test of the #25 fix.
When system-prompt was a file path (via --system-prompt, config entry,
or SDK Options.SystemPrompt), the path string itself was used as the
base prompt because config.LoadSystemPrompt only ran later in
BuildProviderConfig — by which point viper had been overwritten with
the path-augmented composed text. The LLM received the path instead of
the prompt contents.
- Call config.LoadSystemPrompt on the raw viper value in New() before
PromptBuilder composes runtime context (AGENTS.md / skills / date).
- Add HasCustomSystemPrompt() and GetSystemPromptSource() so SDK callers
can inspect prompt state without reaching into viper.
- Display 'System Prompt loaded: <source>' at startup in CLI and TUI
modes, paralleling the per-server 'MCP server loaded' notice.
- Add regression tests covering both file-path and inline prompt paths.
Fixes#25
Addresses two CodeRabbit feedback items on PR #24:
* Docstring coverage warning (was 57.14%, threshold 80%): adds godoc
comments to the four test functions added or substantially rewritten
in this PR — TestLoadAndSaveManifest, TestAddAndRemoveFromManifest,
TestFindInManifest, TestHighlightFileTokensInjectsANSI.
* Quick-win nitpick: replaces the manual os.Setenv/os.Unsetenv +
defer pattern in TestFindInManifest with t.Setenv, which restores
the env var automatically on cleanup even on panic or t.Fatal.
go test -race ./... still passes.
Per AGENTS.md 'Yaegi function field bug', named function/method
references assigned to extensions.Context fields return zero values
across the interpreter boundary. The two SetContext literals in
runNormalMode (now consolidated in buildInteractiveExtensionContext)
inherited 9 bare references that need to be anonymous closure literals:
PrintBlock, GetChildren, GetAvailableSkills, ParseTemplate,
RenderTemplate, ParseArguments, SimpleParseArguments,
ResolveModelChain, CheckModelAvailable
Each is now wrapped as 'func(args) ret { return <orig>(args) }'.
Behaviour unchanged in regular Go; Yaegi extensions that consume these
fields will now see callable closures instead of zero values.
Verified with go test -race ./...
The TestUserBlockHighlightsFileTokens test was rewritten to call
HighlightFileTokens directly (UserBlock was deleted in the dead-code
sweep). That left testTypography with no callers, so staticcheck U1000
flagged it.
The previous runNormalMode contained two nearly-identical 400-line
extensions.Context literal expressions:
* the startup-time literal (cmd/root.go:853-1307) that buffered
Print* calls into startupExtensionMessages
* the runtime literal (cmd/root.go:1311-1605) that routed Print*
through appInstance.PrintFromExtension
Every other field — Compact, SendMultimodalMessage, the four prompt
factories, all 25+ data-access fields, all four bridge phases — was
duplicated byte-for-byte. Maintainers had to remember to update both
copies whenever an extension Context field was added.
cmd/root.go is now 1463 lines (was 2225). The new helper lives in
cmd/extension_context.go (455 lines, mostly the closures verbatim) and
returns an extensions.Context with every field populated except
Print/PrintInfo/PrintError, which each call site sets afterwards to
match its phase. This preserves AGENTS.md's 'function field bug'
guarantee — all assignments remain anonymous closure literals.
Output of 'kit --version' / 'kit --help' unchanged. Full test suite
passes.
The same ~40-line block — building a kit.SubagentConfig, wrapping
OnEvent through sdkEventToSubagentEvent, calling kitInstance.Subagent,
and translating the SDK result into extensions.SubagentResult — was
copy-pasted three times:
* cmd/root.go (interactive TUI Context, line 1148)
* cmd/root.go (post-SessionStart runtime Context, line 1446)
* internal/acpserver/session.go (ACP server Context, line 154)
A separate sdkEventToSubagentEvent function was duplicated byte-for-byte
between cmd/root.go and internal/acpserver/session.go.
Both are now consolidated in a new internal/extbridge package which is
the only module-internal home that can legitimately import both
pkg/kit/ (the public SDK) and internal/extensions/. cmd/ and
internal/acpserver/ both import it, so SDK-event-to-extension-event
schema changes only have one site to update.
Also fixes pkg/kit/events.go godoc comment that named the underlying
LLM library, per AGENTS.md 'No Dependency Name Leakage' rule for
exported SDK symbols.
go test -race ./... passes.
Removes ~600 lines of unreferenced code surfaced by deadcode + manual
audit (none of it reachable from production code paths or test setup):
- internal/models/pool.go: ProviderPool was never wired into kitsetup
or the agent; the global pool singleton had zero callers.
- internal/ui/debug_logger.go: CLIDebugLogger was unreachable; debug
routing goes through internal/tools/buffered_logger.go instead.
- internal/ui/tool_approval_input.go: tea.Model never instantiated;
approvals are handled inline in model.go.
- internal/ui/cli.go: DisplayAssistantMessage / DisplayCancellation /
GetDebugLogger had zero callers (the *WithModel variant is what
event_handler.go uses).
- internal/ui/style/enhanced.go: Style{Card,Header,Subheader,Muted,
Success,Error,Warning,Info} + Create{Separator,ProgressBar} — none
used. CreateBadge stays (used by model.go).
- internal/ui/style/themes.go: RefreshThemeRegistry — never called.
- internal/ui/block_renderer.go: With{FullWidth,MarginTop,Padding{Left,
Right},Background,Foreground,Width} — option helpers nobody calls.
- internal/ui/render/blocks.go: UserBlock, ToolBlock — replaced by
inline rendering elsewhere; the test for UserBlock was rewritten to
directly exercise HighlightFileTokens (which is what the test really
cared about).
- internal/ui/commands/commands.go: GetAllCommandNames — no callers.
- internal/ui/message_items.go: NewTextMessageItem,
NewSystemMessageItem + the entire SystemMessageItem type — model.go
uses NewStyledMessageItem instead.
- internal/prompts/loader.go: Deduplicate — the loader does dedup
internally; standalone helper was unused.
- internal/models/cache_options.go: mergeProviderOptions + its
test-only consumer.
- internal/extensions/installer.go: Installer.GetInstalledPackages —
intended for a 'kit ext list' command that was never built.
- internal/extensions/manifest.go: saveManifestToScope,
saveManifestToPath, GetGlobalManifest, GetProjectManifest,
addEntryToManifest, removeEntryFromManifest — package-level
duplicates of *Installer methods. Tests rewritten to exercise the
live Installer methods instead, which fixes a latent path-resolution
inconsistency between manifestPathForScope and Installer.manifestPath
(the former hard-coded paths, the latter respects projectGitRoot).
- internal/extensions/subagent.go: SpawnSubagent + helpers
(generateSubagentID, findKitBinary, subagentJSONOutput). The
subprocess-spawn implementation is unreachable; production code
routes through kit.Kit.Subagent (in-process). Types
(SubagentConfig/Result/Handle/etc.) and the SubagentHandle methods
remain because they are exposed to extensions via Yaegi symbols and
the Context.SpawnSubagent field.
- cmd/root.go: LoadConfigWithEnvSubstitution — one-line wrapper around
kit.LoadConfigWithEnvSubstitution with zero callers.
go test -race ./... passes.
- Stale comment showed ~/.kit/sessions/--<cwd-path>--/ which does not
match the actual encoding (no leading/trailing dashes)
- Update to reflect the real format and point to encodeCwdForDir for
full rules
- Encode cwd via new encodeCwdForDir helper that handles both `/`
and `\` separators and strips characters illegal in Windows
directory names (`: < > " | ? *`)
- Fixes session creation on Windows where the drive-letter colon
produced names like `C:--test` and caused mkdir to fail
- Add regression tests covering Unix paths, Windows drive roots,
secondary drives, mixed separators, and other illegal chars
Fixes#18
Address two review findings on the MCP Tasks PR.
- Config.Validate() now rejects unknown tasksMode values with a clear
error naming the server and bad value. Without this a typo (e.g.
"alwasy") was silently downgraded to "auto" by the runtime parser.
- Kit.Subagent() now propagates the parent's six MCP task options
(mode map, timeout, TTL, poll interval, max poll interval, progress
callback) onto the child via a new inheritMCPTaskOptions helper.
Without this, child subagents always saw default polling and no
progress feedback regardless of parent configuration.
The propagation logic lives in a helper so the test exercises the real
code path instead of duplicating it; future task fields only need to be
added in one place.
- README: add tasksMode YAML example and MCP Tasks subsection with
SDK opt-in snippet
- pkg/kit/README: add MCP Tasks subsection covering MCPTaskMode,
progress callbacks, and List/Get/Cancel methods
- www/configuration: document the tasksMode server field plus a
per-mode behaviour table
- www/sdk/options: extend the Compaction & MCP table with the six
new Options fields and add a top-level MCP Tasks section
- www/sdk/overview: add a brief MCP Tasks section between MCP
prompts/resources and Context & compaction
All examples verified against the public symbols in pkg/kit/mcp_tasks.go;
docs site builds cleanly via npx tome build.
Implement Phase 1 of the MCP Tasks spec so long-running tools/call
requests can run asynchronously, survive proxy timeouts, and be
cancelled mid-flight.
- connection pool now advertises mcp.NewTasksCapability() during
initialize and captures the InitializeResult so callers can detect
per-server task support
- new MCPServerConfig.TasksMode (auto|never|always, default auto)
parsed from both new and legacy mcp.json shapes
- ExecuteTool augments tools/call with TaskParams when policy and
capability allow, polls tasks/get / tasks/result until terminal,
and best-effort tasks/cancel on context cancellation
- new MCPToolManager methods: SetTaskConfig, ListServerTasks,
GetServerTask, CancelServerTask
- public SDK surface in pkg/kit: MCPTask, MCPTaskStatus, MCPTaskMode,
MCPTaskProgress, MCPTaskProgressHandler, plus Options fields
(MCPTaskMode, MCPTaskTimeout, MCPTaskTTL, MCPTaskPollInterval,
MCPTaskMaxPollInterval, MCPTaskProgress) and Kit.{List,Get,Cancel}
MCPTask methods
- works around two upstream mcp-go v0.51.0 parser bugs
(ParseCallToolResult rejects task responses; ParseTaskResultResult
looks for content under a non-existent nested key) by decoding the
wire shape directly via the transport
- defaults to MCPTaskModeAuto so servers that don't advertise task
support behave exactly as before
Fixes#21
- Bump fantasy v0.21.0 -> v0.23.0, mcp-go v0.49.0 -> v0.51.0,
acp-go-sdk v0.12.0 -> v0.12.2, chroma v2.23.1 -> v2.24.1,
fsnotify v1.9.0 -> v1.10.1, ultraviolet, AWS SDK, Google API
- Implement CloseSession and ResumeSession on acpserver.Agent to
satisfy the expanded acp.Agent interface in acp-go-sdk v0.12.2
- Add sessionRegistry.remove helper to support session close
Fantasy v0.21.0 natively includes gpt-5.5 and other newer models in
its responsesModelIDs/responsesReasoningModelIDs lists, making our
workaround unnecessary.
- Delete responses_models.go (go:linkname hack + RegisterResponsesModels)
- Delete responses_models_test.go
- Replace isResponsesAPIModel/isResponsesReasoningModel heuristics with
direct openai.IsResponsesModel/openai.IsResponsesReasoningModel calls
- Remove RegisterResponsesModels calls from registry init/reload
- Remove hack documentation from AGENTS.md
- Update all deps (fantasy v0.21.0, smithy-go, ultraviolet, etc.)
Fantasy's hardcoded responsesModelIDs list gates whether a model uses
the Responses API or Chat Completions code path. When a new model
(e.g. gpt-5.5) is added via `kit update-models` but fantasy hasn't
been updated yet, the type mismatch between *ResponsesProviderOptions
and *ProviderOptions causes a crash.
- Add isResponsesAPIModel()/isResponsesReasoningModel() helpers that
supplement fantasy's checks with prefix-based heuristics for modern
OpenAI model families (gpt-4.1+, gpt-5+, o-series, codex, chatgpt)
- Add RegisterResponsesModels() using go:linkname to append missing
model IDs from our database into fantasy's internal slices at init
time and after ReloadGlobalRegistry()
- Replace all direct openai.IsResponsesModel/IsResponsesReasoningModel
calls in providers.go with the new helpers
- Merge embedded + cached model databases instead of cache-only fallback
- Bump fantasy v0.19.0 -> v0.20.0 to match existing import usage
- Document the technique and model-family update process in AGENTS.md
- Remove top-level old_text/new_text params from edit tool schema
- Make edits array the sole interface; single edits pass 1-item array
- Simplify normalizeEditInput, removing dual-mode branching logic
- Update UI renderer to only read from edits array
- Remove old_text/new_text from bodyKeys in message summarizer
- Update web session HTML to iterate edits array
- Convert all single-edit tests to use Edits array
- Replace mixed-mode test with empty-array validation test
Tool blocking via OnToolCall and SetActiveTools returned both a
ToolResponse (IsError=true) and a Go error. Fantasy treats a non-nil
Go error from tool.Run() as a critical failure, aborting the agent
loop without delivering the tool result to the LLM. The model never
saw the block reason and would retry or hallucinate.
- Return nil error for blocked tools (OnToolCall Block=true)
- Return nil error for disabled tools (SetActiveTools)
- Return nil error for extension tool execution failures
- Update tests to assert nil error (IsError response conveys the error)
Fixes#20
- Set context tokens per-step in recordStepUsage instead of waiting
for turn completion; each step re-sends the full conversation so
the reported usage monotonically increases
- Add UsageUpdatedEvent to trigger a TUI re-render after each step
so the status bar reflects updated tokens, cost, and context %
even during gaps between streaming chunks
- Update test to expect per-step context token updates
- Add heightCache map to ScrollList, keyed by item ID, avoiding
repeated Render() calls purely to count lines
- Rewrite GotoBottom() to walk backwards from the end in O(visible)
instead of two full O(N) forward passes over all items
- Replace all height-only Render() calls in clampOffset(), AtBottom(),
ScrollBy(), and ScrollPercent() with cached itemHeight() lookups
- Invalidate cache on width changes (SetWidth) and item mutations
(AppendChunk, AppendStdout/Stderr via InvalidateItemHeight)
- Refresh cache entries in View() from authoritative renders
- Add LLMToolResultOutputContentMedia alias (closes gap in tool result types)
- Add LLMToolResultContentType enum and constants (Text, Error, Media)
- Add LLMToolInfo, LLMProviderOptions, LLMProviderMetadata, LLMPrompt aliases
- Replace all fantasy.* references in hooks.go and hooks_test.go with
SDK-owned aliases, removing the charm.land/fantasy import from both
- Fix gofmt alignment in internal/extensions/symbols.go
- Update SDK skill doc with complete LLM type reference
- Infer Type="image" for image/* MIME types and Type="media" for all
other binary content so the downstream framework creates a media
content block instead of silently discarding Data bytes (#17)
- Extract shared toolOutputToResponse() helper to eliminate duplication
- Add ImageResult() and MediaResult() convenience constructors
- Add LLMToolCall and LLMToolResponse type aliases so SDK consumers
can call Tool.Run() without importing the underlying framework
- Add 6 regression tests covering image, media, and text responses
Closes#17
- Buffer session JSONL writes with bufio.Writer, flush at sync points;
ForkToNewSession and AddLLMMessages now batch N entries into ~1 syscall
- Cache lipgloss styles in style.CachedStyles, lazily built and
invalidated on SetTheme; eliminates ~15 NewStyle() calls per frame in
hot render paths (reasoning blocks, spinner, tool headers, margins)
- Cache git ls-files results for @file suggestions with 3s TTL; typing
@filename no longer spawns 3 subprocesses per keystroke
- Use strings.Builder for StreamingMessageItem.content; eliminates O(n²)
string copying during LLM response streaming
- Set Streaming: true in subagent childOpts to prevent
viper.Set("stream", false) from polluting global state
- Without this, concurrent subagents and the parent could read
stale stream=false from viper, causing provider-level issues
(e.g. Anthropic non-streaming timeouts with extended thinking)
- Update all Go dependencies (bubbletea v2.0.6, fantasy v0.19.0,
acp-go-sdk v0.12.0, mcp-go v0.49.0, and transitive deps)
- Replace SetSessionModel with SetSessionConfigOption to match new
acp-go-sdk Agent interface (union type with ValueId/Boolean variants)
- Add ListSessions stub returning empty list (new required method)
- Refresh embedded_models.json from models.dev/api.json
- Update ACP smoke test: add initialize handshake, session/list,
session/set_config_option, session/cancel, and fix update parsing
- remove "You" label and icon from user messages, use borderless content block
- remove input title bar ("Enter your prompt...") and hint line
- increase textarea from 3 to 4 rows with top/bottom margin
- hide input hints permanently for a cleaner UI
- match separator colors (use theme.Border for both startup and input dividers)
- make startup separator full terminal width instead of hardcoded 80
- add /help for help hint and pipe separators to status bar
- add printCustomMessage/RenderCustomMessage for custom alert labels
- render /help output as markdown with "Help" alert label
- add Ctrl+V (paste image) to help message keys section
- fix reasoning text wrapping using ANSI-aware lipgloss.Style.Width
- export HighlightFileTokens for cross-package use
- Add NoOAuth field to MCPServerConfig with JSON/YAML support
- Guard OAuth error handling and transport setup with the new flag
- Prevents failed dynamic client registration on servers like PubMed
that do not support OAuth
- First ctrl+c clears input and arms quit flag with 3s timeout
- Second ctrl+c within timeout window actually quits
- Show '⚠ Press Ctrl+C again to quit' warning after first press
- Empty input no longer quits immediately on single ctrl+c
- Prompt/overlay states: ctrl+c cancels dialog, re-dispatches to
main handler for double-press tracking instead of quitting
- Update placeholder, help text, and tests to match new behavior
Add charset=utf-8 to Content-Type header and use HTML entity
✓ instead of raw Unicode checkmark to prevent garbled
text display in browsers.
Fixes#9
- wrap thinking text in StreamComponent and render.ReasoningBlock
- plumb width through renderer and streaming item paths
- keeps style consistent with user/assistant blocks and avoids cut-off lines