* cmd: add --no-skills, --skill, and --skills-dir CLI flags
The pkg/kit Options struct already had full backend support for skills
control (NoSkills, Skills []string, SkillsDir) wired into loadSkills()
in pkg/kit/kit.go, but there were no corresponding CLI flags to drive
them. This commit closes that gap.
Changes in cmd/root.go:
- Add three package-level flag variables alongside the existing
noExtensionsFlag/extensionPaths group:
noSkillsFlag bool
skillsPaths []string
skillsDir string
- Register three persistent cobra flags in init():
--no-skills disable skill loading (auto-discovery and explicit)
--skill <path> load a skill file or directory (repeatable)
--skills-dir <dir> override the project-local skills directory
used for auto-discovery
- Wire all three into the kitOpts struct literal in runNormalMode()
so they flow directly into kit.New() -> loadSkills().
No changes to pkg/kit or internal/skills -- the backend was already
complete. No viper binding is needed because kit.go reads these fields
directly from opts rather than from viper (unlike NoExtensions which
uses the viper fallback path).
Example usage:
kit --no-skills "prompt"
kit --skill ./my-skill.md --skill ./other-skill.md "prompt"
kit --skills-dir /path/to/skills "prompt"
Co-authored-by: Claude <claude@anthropic.com>
* docs: document --no-skills, --skill, and --skills-dir CLI flags
Add the three new skills CLI flags to all relevant documentation:
- README.md: add Skills section under Global Flags CLI reference
- www/pages/cli/flags.md: add Skills table (mirrors Extensions section pattern)
- www/pages/cli/commands.md: expand the Skills section with usage examples
and a description of auto-discovery vs explicit loading vs --no-skills
Co-authored-by: Claude <claude@anthropic.com>
* feat: add config file support for skills options
Skills could previously only be controlled via CLI flags or SDK Options
fields. This commit wires all three skills settings into viper so they
can also be set in .kit.yml / .kit.yaml / .kit.json and via KIT_*
environment variables — matching the pattern used by no-extensions,
no-core-tools, and prompt-template.
cmd/root.go:
- Bind --no-skills, --skill, and --skills-dir flags to viper keys
(no-skills, skill, skills-dir) so config file values flow through.
pkg/kit/kit.go:
- At skill-load time, merge opts fields with viper values:
- noSkills = opts.NoSkills || v.GetBool("no-skills")
- skillPaths: opts.Skills if non-empty, else v.GetStringSlice("skill")
- skillsDir: opts.SkillsDir if non-empty, else v.GetString("skills-dir")
- Build a shallow-copied mergedOpts so loadSkills() picks up the
resolved values without mutating the original Options struct.
docs:
- README.md: add skills keys to the Basic Configuration YAML example
- www/pages/configuration.md: add no-skills, skill, skills-dir rows to
the All configuration keys table
Config file example (.kit.yml):
no-skills: false
skill:
- /path/to/skill.md
skills-dir: /path/to/skills/
Co-authored-by: Claude <claude@anthropic.com>
* config: add skills keys to default .kit.yml template
Add no-skills, skill, and skills-dir as commented-out examples in the
default config file generated by EnsureConfigExists(), alongside the
existing application settings block.
Co-authored-by: Claude <claude@anthropic.com>
* test: add test coverage for skills CLI flags and config keys
Four test locations updated:
pkg/kit/export_test.go:
- Add ConfigStringSliceForTest() helper to expose v.GetStringSlice()
from the Kit's isolated viper store, needed to assert skill list values.
pkg/kit/kit_test.go (TestNewWithSkillsOptions):
- NoSkills=true: GetSkills() returns empty slice
- SkillsDir=<empty dir>: kit.New() succeeds with zero skills
- Skills=[file]: single explicit skill file is loaded and name parsed correctly
pkg/kit/viper_isolation_test.go:
- TestSkillsViperKeys: no-API-key struct-level checks for NoSkills, Skills,
and SkillsDir fields on Options
- TestSkillsConfigFileKeys: full kit.New() round-trips via a written .kit.yml
for each of the three config keys:
no-skills: true → GetSkills() returns empty
skill: [path] → named skill loaded from config file path
skills-dir: dir → custom discovery root accepted without error
internal/config/config_test.go (TestEnsureConfigExists):
- Assert generated ~/.kit.yml template contains '# Skills configuration',
'no-skills:', and 'skills-dir:' comment blocks.
Co-authored-by: Claude <claude@anthropic.com>
---------
Co-authored-by: Claude <claude@anthropic.com>
* Remove dead code: 5 unused symbols across internal packages
- internal/models: LoadModelSettingsFromConfig (zero refs)
- internal/prompts: PromptTemplate.ExpandWithArgs (zero refs)
- internal/app: NewMessageStore (tests migrated to NewMessageStoreWithMessages)
- internal/config: HasEnvVars (+ its test)
- internal/core: ContextWithSudoPassword (test migrated to context.WithValue)
* pkg/kit: use TreeManager alias in exported signatures
NewTreeManagerAdapter and InitTreeSession now spell their signatures with
the public kit.TreeManager alias instead of internal/session.TreeManager,
so go doc renders domain types rather than internal paths.
* Consolidate tool-kind classification into internal/extensions
coreToolKinds + toolKindFor were duplicated verbatim in
internal/extensions/wrapper.go and pkg/kit/events.go, risking silent
divergence between extension events and SDK events. Single source of
truth now lives in internal/extensions/toolkinds.go; pkg/kit re-exports
the constants.
* Consolidate Anthropic OAuth detection and usage-tracker refresh
The 'is the active Anthropic credential a stored OAuth token' check was
copy-pasted at 5 sites, all prefix-matching the magic string
'stored OAuth' produced in internal/auth. Now:
- internal/auth: new CredentialSourceOAuth constant + IsAnthropicOAuth()
- internal/ui: new UpdateUsageTrackerForModel(); CreateUsageTracker and
SetupCLI share lookupTrackableModel (SetupCLI no longer re-inlines the
tracker construction)
- cmd/root.go + cmd/extension_context.go: verbatim-duplicated tracker
refresh blocks replaced with ui.UpdateUsageTrackerForModel
- pkg/kit isAnthropicOAuth delegates to auth.IsAnthropicOAuth
- internal/models compares source against the constant
* pkg/kit: consolidate model-path helpers and argument tokenizer
- ExtractModelFromPath mis-parsed model IDs containing '/' (e.g.
'openrouter/meta/llama' -> 'meta'); it now delegates to
RemoveProviderFromModel and is deprecated alongside
ExtractProviderFromPath (-> GetCurrentProvider)
- parseFields delegated to prompts.ParseCommandArgs so extension argument
parsing and builtin prompt-template parsing share one quote/escape
grammar; ParseCommandArgs now also splits on tabs (superset of both
previous tokenizers)
* Unify the two {{variable}} template engines
internal/skills and pkg/kit/template_bridge each had their own grammar:
skills rejected '{{ name }}' (whitespace) but allowed digit-first names;
the bridge was the opposite. A template behaved differently depending on
whether it was loaded as a skill prompt or via the extension API.
internal/skills is now the single engine using the superset grammar
(\{\{\s*(\w+)\s*\}\}); pkg/kit ParseTemplate/RenderTemplate are thin
adapters over it. Expand is now regex-based so whitespace placeholders
expand consistently; missing variables are still left as-is.
* internal/ui: extract switchModel helper for model-switch flow
The model-selector handler (ModelSelectedMsg) and /model slash command
duplicated the full switch sequence (thinking-level fallback, setModel,
display-state update, preference persistence, ModelChange emit) and had
already drifted in ordering. Both now call a single switchModel method.
Display state is still updated directly (no prog.Send from Update).
* extbridge: extract shared BaseContext for extension wiring
cmd/extension_context.go and internal/acpserver/session.go each built a
giant extensions.Context literal, duplicating ~15 delegation closures
(GetContextStats, GetMessages, AppendEntry, options, SetModel core,
Complete, SpawnSubagent, ...) that had to be kept in sync by hand. New
data-access fields had to be wired in both places or ACP-mode extensions
silently got nil function fields.
extbridge.BaseContext now provides the headless half; both call sites
overlay only their UI-specific closures. As a side effect ACP mode gains
previously-missing APIs (state, tree navigation, skills, template
parsing, model resolution) that were nil before. The interactive TUI
keeps its exact SetModel/ReloadExtensions ordering via overrides.
* internal/tools: extract withOAuthRetry and marshalToolResult helpers
ExecuteTool repeated the OAuth-error/re-auth/retry stanza verbatim twice
(sync and task-augmented paths) and the marshal-and-wrap stanza four
times. Both are now single helpers with identical error strings, so a
fix to OAuth retry or error categorization applies everywhere at once.
* internal/ui: extract buildShareFile with defer-based cleanup
handleShareCommand repeated the close/remove/print/return cleanup chain
four times across its temp-file write error paths. File assembly now
lives in buildShareFile with a single deferred cleanup on error.
* cmd: extract flag validation, preference restore, and provider-URL routing from runNormalMode
runNormalMode opened with ~150 lines of policy logic (flag-combination
validation, persisted model/thinking-level preference restoration, and
two subtle --provider-url model-rewrite rules). These are now standalone
functions (validateModeFlags, restorePersistedPreferences,
applyProviderURLRouting) so the routing policy is independently readable
and testable. Behaviour unchanged; ordering preserved.
* fix: address review findings on SDK godoc and nil guard
- pkg/kit: remove internal package paths from exported godoc on
ParseTemplate and the ToolKind* constants (SDK doc surface must not
reference internal packages)
- internal/tools: guard marshalToolResult against a nil CallToolResult
(json.Marshal(nil) succeeds as 'null', then result.IsError panics if
a client returns nil result with nil error)
Skipped the TreeNode Children deep-copy suggestion: the slice already
comes from TreeManager.GetChildren which returns a fresh copy per call
into a throwaway intermediate, so no internal state is exposed.
While the tool list of the main agent could be controlled by several
options, subagent used to be equipped with all available tools (except
for the subagent tool itself).
With this change the list of tools is taken from the parent, the
subagent tool itself is removed and the remaining tool list is added
to the subagent.
Signed-off-by: Egbert Eich <eich@suse.com>
* feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53)
Three additive primitives to the extension API:
- OnLLMUsage event: per-LLM-call token + cost deltas attributed to the
specific model/provider used for each round-trip. Derived from the SDK
StepFinishEvent in the extension bridge. Enables accurate budget
enforcement between calls instead of only at turn boundaries.
- ctx.SetState / GetState / DeleteState / ListState: session-scoped,
last-write-wins key-value store backed by a sidecar file
(<session>.ext-state.json) outside the conversation tree. Reads are
O(1), writes don't grow the JSONL, and the store is not duplicated on
fork. State is preserved across hot-reloads.
- Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token
deltas (input/output/cache-read/cache-write), CostDelta, and
DurationMs populated by a per-turn aggregator. Existing handlers
reading only Response/StopReason are unaffected.
Includes unit tests for the state store, LLMUsage registration,
enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path
derivation. Adds examples/extensions/usage-budget.go demoing all three
primitives together. Documents the additions in README, the docs site
(extensions overview, capabilities, examples), and the kit-extensions
and kit-sdk skill guides.
Fixes#53
* fix(extensions): address review feedback on state store and llmUsageMeta
- Serialize SetState/DeleteState saver invocations through a new saverMu
so overlapping atomic-rename writes can no longer race on the shared
.tmp file and persist an older snapshot after a newer one.
- LoadStateFromFile now clears the in-memory store when the sidecar is
missing or empty, matching the documented "replace … with its
contents" contract. This makes session-switching safe by preventing
keys from a prior session leaking into a new one. Tests updated to
cover both the missing-file and empty-file cases.
- llmUsageMeta now detects Anthropic OAuth credentials and returns
Cost=0, matching the comment and the existing usage_tracker behavior
for OAuth users. Mirrors the OAuth detection already used in
cmd/extension_context.go.
- Document the single-in-flight-turn assumption baked into the
per-turn aggregator with a clear migration path (per-turn ID) for if
concurrent turns ever become a supported use case.
* fix(extensions): release saverMu on panic in state store
Extract a runSaver helper that locks saverMu and defers Unlock before
invoking the persistence callback. Without the deferred Unlock, a panic
inside the saver (e.g. disk full mid-write) would leave saverMu held
forever and deadlock the next SetState/DeleteState. Both SetState and
DeleteState now route through the helper. New TestRunner_State_Saver
PanicReleasesSaverMu reproduces the deadlock window with a 2s deadline
and proves the mutex is released after a panic.
* feat(auth): add Copilot login
Add experimental GitHub Copilot device login and copilot/* provider support for users with Copilot access but no OpenAI account.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): use responses for GPT-5
Route Copilot GPT-5 models through the Responses API because gpt-5.5 is not available on /chat/completions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): honor device flow timing
* docs(copilot): add auth helper docstrings
* fix(auth): address copilot review feedback
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add PopLastUserMessage() on *App: walks the current tree branch back to
the parent of the most recent user message, syncs the in-memory store,
and returns the prompt + image parts for resubmission.
- Register /retry (alias /rt) and wire handleRetryCommand which rebuilds
the visible ScrollList from the truncated branch before resubmitting
via Run/RunWithFiles. Mirrors SubmitMsg display path (badges, pending
prints, stateWorking transition).
- Recovers from transient provider errors (overloaded, timeout) without
duplicating the user message in context — the failed turn's entries
become orphaned off-branch rather than being re-sent to the LLM.
- Update help text, AppController interface, and stub controller.
- Add unit tests covering busy/closed/no-session guards, the happy-path
truncation, and the empty-branch error case.
- Extend PopupList with FullScreen mode, RenderItem callback, and
external-state setters (SetItems/SetCursor/SetSearch) so any popup
can reuse the same chrome (border, title, search, scroll, footer).
- Rewrite TreeSelector and SessionSelector as thin PopupList wrappers,
dropping ~500 lines of duplicated rendering. Selector-specific keys
(filter cycle, scope/named toggles, delete-confirm) are pre-handled;
everything else delegates to PopupList.
- Migrate the / and @ autocomplete popups in InputComponent to render
through PopupList, replacing the bespoke renderer.
- Fix /tree and /fork overflow with deep trees: measure tree-art
prefix width via lipgloss.Width (handles multi-byte box drawing),
truncate the prefix from the left with an ellipsis when it would
push text off the row, and collapse multi-line message content to
a single line so rows never wrap.
- Fix broken selection highlight in /tree, /fork, /sessions: emit a
plain string from RenderItem for the cursor row so the outer row
style paints one continuous fg+bg span instead of being shredded
by mid-row ANSI resets from inner Render calls.
- Center the cursor in the visible window so context is always shown
above and below the selection.
- New /edit (alias /ed) opens $EDITOR on a chosen file via tea.ExecProcess
- Typing '/edit ' activates a fuzzy file popup mirroring the @ trigger:
reuses GetFileSuggestions (git ls-files), supports directory drill-down,
excludes MCP resources
- Selecting a file auto-submits and runs $EDITOR ($VISUAL preferred);
on exit prints 'Edited <path>'
- Manual paths supported (~/, relative, absolute); non-existent paths
pass through so the editor can create them; directories are rejected
- /help updated with the new command
- Add sdkDefaultBaseURL map covering the 14 npm SDKs that ship a
hard-coded baseURL (groq, cerebras, mistral, xai, perplexity,
togetherai, deepinfra, cohere, v0, aihubmix, venice, merge-gateway,
openrouter, vercel gateway), so providers whose models.dev entry
omits the api field still auto-route correctly.
- Extend npmToWireProtocol so these thin OpenAI-compatible wrappers
route through fantasy's openaicompat provider.
- Add resolveTemplatedAPIURL to substitute ${VAR} placeholders for
cloudflare-workers-ai, databricks, snowflake-cortex from the env,
with friendly errors that name the missing vars.
- Wire amazon-bedrock and azure-cognitive-services aliases into the
existing native handlers; add createGoogleVertexProvider for the
google-vertex case.
- Expose kit.ResolveProviderBaseURL in the public SDK so embedders
can introspect the effective endpoint before instantiating a Kit.
- Refresh embedded_models.json from models.dev (5113 -> 5121 models;
139 providers unchanged).
- Remove deprecated GenerateWithLoopAndStreaming and TreeManager
AppendFantasyMessage / AddFantasyMessages / GetFantasyMessages to
close the SDK leakage caused by the kit.TreeManager type alias
- Switch extensionAPI method signatures to local Extension* aliases so
pkg.go.dev signatures no longer expose internal package names
- Bundle runNormalMode dependencies into a runModeDeps struct, shrinking
the runNonInteractive and runInteractive call sites from 40+ positional
args to (ctx, deps)
- Add generic subscribeTyped[E Event] helper and collapse ~30 typed OnXxx
wrappers in pkg/kit/events.go onto it (public signatures unchanged)
- Extract setupBashPipes / interpretBashExit in internal/core/bash.go to
deduplicate the buffered and streaming execution paths
- Extract resolveAutoRouteAPIKey and wrapProviderErr helpers in
internal/models/providers.go and uniformly apply them across every
createXxxProvider site
- Reimplement internal/extensions/watcher.go as a thin wrapper over the
general-purpose internal/watcher.ContentWatcher, eliminating ~130 LOC
of duplicated fsnotify logic while preserving the existing test API
- Add ctx.Err() pre-flight checks in executeRead / Write / Edit / Ls so
cancellation actually short-circuits pure file-IO tools
* fix(ui): show pasted image previews in input and transcript
The half-block thumbnail preview added in #47 rendered but was clipped
off the bottom of the screen, and submitted images showed only a text
badge in the conversation history.
- Mark the layout dirty when clipboardImageMsg / thumbnailReadyMsg reach
the parent, so distributeHeight re-measures the now-taller input region
instead of keeping a stale height that pushed the preview off-screen
- Render thumbnail previews in the transcript after a user message,
appended as a verbatim ScrollList item (raw ANSI half-blocks would be
mangled if folded into the word-wrapped user text block)
- Render transcript previews asynchronously via a tea.Cmd so decode +
resample never blocks the Bubble Tea event loop
- Add regression tests covering the input layout recompute and the
transcript preview flow
* fix(ui): anchor transcript image preview to its user message
- Insert the async thumbnail preview directly after the originating user
message (tracked via anchorID) instead of appending, so a streamed
assistant reply that lands first no longer pushes the preview out of place
- Make the layout regression test deterministic by forcing a truecolor
profile, avoiding flakes on low-color CI terminals where the thumbnail
would render empty
- Add tests for anchored insertion and the unknown-anchor append fallback
* feat(ui): render half-block thumbnails for attached images (#46)
- Add internal/ui/imagepreview package: Render() draws low-res
thumbnails using Unicode half-blocks (▀) + truecolor/256-color SGR,
which survives tmux/zellij (no graphics protocol)
- Cache a rendered thumbnail per pending clipboard image in the input
component; render once at attach time, never per frame
- Fall back to the existing [N image(s) attached] text pill when the
terminal lacks truecolor/256-color support
- Document Ctrl+V paste, Ctrl+U clear, and the preview in the docs
site and README keyboard shortcuts
Fixes#46
* fix(ui): render image thumbnails off the event loop and cap size
- Render thumbnails asynchronously via a tea.Cmd instead of calling
the decode + resample path synchronously inside Update(), which
blocked the Bubble Tea event loop
- Add thumbnailReadyMsg + an imageGen generation counter so async
results land on the correct pendingImages slot and stale renders
after a clear/re-attach are discarded
- Guard imagepreview.Render against decompression bombs by checking
DecodeConfig dimensions against a max before full decode
* fix(ui): skip image preview when input width is too small
- Return 0 from thumbCols when width <= 6 so a full-size thumbnail is
no longer rendered for tiny or uninitialized (width 0) terminals;
the caller falls back to the text pill
- replace npmToLLMProvider map with npmToWireProtocol (openai/anthropic/google)
- add createAutoRoutedGoogleProvider so @ai-sdk/google proxies work
(fixes opencode/gemini-* failing with "no LLM provider mapping")
- strip the genai-injected v1beta segment for proxies whose base URL
already carries a version (e.g. opencode's /zen/v1)
- preserve openai-compat fallback and clearer error for unroutable providers
- document auto-routing in README and providers docs; update CreateProvider godoc
- add regression tests for wire routing and version-path rewriting
Fixes#41
* feat(kit): isolate viper config per Kit instance + add NewAgent (#40)
- Give each kit.New()/NewAgent() call an isolated *viper.Viper store so
multiple Kit instances in one process no longer clobber each other's
config; runtime mutators (SetModel, SetThinkingLevel) touch only the
owning instance, making subagent spawning and multi-Kit embedding
race-free
- Thread the per-instance store through internal/config, internal/models
(ProviderConfig.ConfigStore), internal/kitsetup, and the extension
runner, with a nil -> process-global fallback so the CLI is unaffected
- Share the global store when Options.CLI != nil to preserve cobra flag
bindings (also opted in for internal/acpserver)
- Remove viperInitMu; preserve the tri-state IsSet precedence contract
and sdkDefaultMaxTokens floor
- Add ergonomic NewAgent + functional options (WithModel, WithStreaming,
Ephemeral, etc.); NewAgent defaults streaming on, opt out via
WithStreaming(false). New(ctx, *Options) behavior is unchanged
- Add config-isolation regression test and NewAgent/option coverage;
document NewAgent and per-instance isolation in README
Fixes#40
* docs(sdk): document NewAgent options and per-instance config isolation
- Add "Functional options (NewAgent)" and "Per-instance config isolation"
sections to the docs site SDK overview, with an options table and a
"when to use which" constructor comparison
- Cross-reference NewAgent from the SDK options page and correct the now
per-instance ProviderAPIKey precedence wording
- Document NewAgent + With* helpers and config isolation in pkg/kit/README
and list NewAgent/Option in the API reference
- Show the NewAgent constructor in the SDK examples getting-started snippet
* fix(kit): correct config loading and isolate ACP sessions
- Isolate each ACP session's config store instead of sharing the global
viper, preventing per-session SetModel/SetThinkingLevel races; seed the
root-command flag values (model, thinking-level, provider URL/key) so
`kit acp -m <model>` is still honored
- Run initConfig for isolated SDK stores by gating on opts.CLI instead of
v.GetString("model"), which setSDKDefaults always populates and thus
skipped .kit.yml / KIT_* loading for SDK callers
- Configure KIT_* env overrides unconditionally in initConfig so passing an
explicit config file no longer disables environment variable support
- Wrap config unmarshal/validate errors with %w to preserve the error chain
* fix(kit): make Options.Streaming a *bool to honor unset
- Change Options.Streaming from bool to *bool so a zero-valued Options no
longer forces stream=false; New only sets the key when non-nil, letting
streaming resolve through the precedence chain (env -> config -> default
true). This also fixes the CLI path, which never set the field
- Mirror the existing sampling-parameter pointer pattern instead of adding
a separate StreamingSet sentinel, keeping Options internally consistent
- Update WithStreaming/NewAgent, subagent, and ACP callers to the pointer
form; add regression tests for the nil-default and explicit opt-out paths
- Update SDK docs (README, pkg/kit/README, options page) with the ptrBool
helper and *bool semantics
* fix(kit): inherit parent provider config in subagents
- Copy the parent's effective provider/runtime config (API key, URL,
TLS, thinking level, max-tokens, samplers) onto child Options in
Kit.Subagent. After the per-instance viper isolation, the child's
isolated store only re-loaded .kit.yml / KIT_*, silently dropping
config the parent set via programmatic Options or runtime setters
like SetThinkingLevel
- Preserve the IsSet tri-state for max-tokens and samplers so per-model
defaults still apply on the child when the parent left them unset
- Add TestInheritProviderConfig covering propagation, unset keys, and
nil-safety
* feat(sdk): runtime skills and context-file management (#36)
Let SDK consumers add, remove, and replace skills and AGENTS.md-style
context files after Kit construction. Every mutation recomposes the
system prompt and applies it to the agent so the next turn picks up
the new instructions without restarting Kit.
- AddSkill / LoadAndAddSkill / RemoveSkill / SetSkills on *kit.Kit
- AddContextFile / AddContextFileContent / LoadAndAddContextFile /
RemoveContextFile / SetContextFiles on *kit.Kit
- RefreshSystemPrompt to force a manual recomposition
- agent.SetSystemPrompt / GetSystemPrompt on the internal agent so
the composed prompt rebuilds the fantasy agent on the next call
- Per-instance runtimeMu guards skills/contextFiles; GetSkills and
GetContextFiles return defensive snapshots safe for concurrent use
- Capture the resolved basePrompt during New so recomposition keeps
per-model overrides and --system-prompt file resolution intact
- Skills dedupe by Name; context files dedupe by Path (opaque ID,
not required to be a real filesystem path)
Tests cover add/remove/set/replace semantics, validation errors,
disk loading round-trips, prompt composition, and an 8-goroutine
race-stress sweep (go test -race clean).
Docs: pkg/kit/README, root README Go SDK section, www sdk/overview
"Runtime skills and context files" section, www sdk/options callout
cross-referencing the new API.
Fixes#36
* fix(agent): synchronize SetSystemPrompt against concurrent rebuilds
- add promptMu to Agent guarding systemPrompt writes and the fantasy
agent rebuild, fixing a data race when Kit.applyComposedSystemPrompt
is invoked concurrently
- read systemPrompt under the same lock in GetSystemPrompt
- update the thread-safety stress test to use a non-nil agent so the
SetSystemPrompt path is actually exercised under -race
- Remove unused SetOpenAICredentials/validateOpenAIAPIKey (internal/auth)
- Remove unused SudoPasswordRequiredMetadata/IsSudoPasswordRequiredResult
(internal/core)
- Add Extension* type aliases in pkg/kit/extension_api.go so the public
ExtensionAPI interface no longer exposes internal/extensions types
- Extract bridgeObserve generic helper and llmToContextMessages /
contextMessagesToLLM in pkg/kit/extensions_bridge.go (~150 lines saved)
- Extract parseHeaders and buildOAuthConfig in connection_pool.go to
deduplicate SSE/Streamable client construction (~60 lines saved)
- Eliminate redundant second buildInteractiveExtensionContext call in
cmd/root.go; swap print closures on the same context instead
- Replace 'Fantasy' with 'agent' in internal comment (pkg/kit/kit.go)
Previously GenerateWithCallbacks stored the most recent tool call's args
in a single shared variable, which got clobbered when a provider emitted
multiple tool_use blocks in a single step. Every OnToolResult callback
then received the args of the last OnToolCall, regardless of which call
it was actually resolving — breaking any downstream UI, log, or trace
that derived its description from the toolArgs parameter.
- Replace the shared currentToolArgs with a map keyed by ToolCallID,
guarded by a sync.Mutex in case the streaming layer dispatches
callbacks from multiple goroutines.
- Delete each entry in OnToolResult so the map cannot accumulate
across steps.
- Add a regression test driving the streaming wrapper with a fake
fantasy.Agent that emits two parallel tool calls before either
result, asserting each callback sees its own args.
Fixes#33
Open the /resume session picker faster by extracting per-file metadata
across a GOMAXPROCS-sized worker pool instead of sequentially. Each
extractSessionInfo call is I/O + JSON-parse bound and independent, so
wall time drops roughly proportionally to core count — meaningful for
users with many sessions, where ListSessions + ListAllSessions ran
back-to-back on the UI goroutine before the picker rendered.
- Switch NotifyWidgetUpdate from leading-only to leading+trailing edge
coalescing so a rapid SetWidget→RemoveWidget pair (e.g. emitted by
subagent-monitor on SubagentEnd) is never silently dropped.
- Without the trailing send the TUI keeps the pre-removal widget
height, leaving empty rows below the status bar until some other
event re-renders the layout.
- Match View() and getItemAndLineAtY() row counts for empty items so
streaming-reasoning placeholders no longer offset hit-testing by one
row each (exposed when extension widgets like subagent-monitor shrink
the scrollback).
- Honor IsLineInRange's endCol=-1 'to end of line' sentinel in
HighlightLine and ExtractText so the start row of a multi-line drag
actually renders highlighted and is included in clipboard copies.
- Add regression tests for both invariants in scrolllist and selection.
- Add prompts.GlobalDir() resolving $XDG_CONFIG_HOME/kit/prompts/
(default ~/.config/kit/prompts/) so prompt templates live alongside
extensions and skills under the same XDG-aligned root.
- LoadAll now discovers templates from both the legacy ~/.kit/prompts/
and the XDG location; existing legacy paths keep precedence.
- Include GlobalDir() in the prompts/skills file watcher so edits
under ~/.config/kit/prompts/ hot-reload automatically.
- Surface a visible 'Extensions reloaded.' (or error) message when
the extension watcher fires, matching /reload-ext feedback.
- Restore examples/extensions/subagent-monitor.go alongside its test
and update the test load path; previous move left the test broken.
- Add ExtensionInfo type and Loaded() method to the public ExtensionAPI
so SDK consumers can inspect which extensions are active.
- Introduce ui.ExtensionItem and thread ExtensionItems/GetExtensionItems
through AppModelOptions, mirroring the existing SkillItem pattern.
- Render an [Extensions] row in AddStartupMessageToScrollList showing
the filename of each loaded extension (with a (N tools) suffix when
extensions register tools). Falls back to tool count only when items
are unavailable, and is omitted entirely when no extensions load.
- Refresh the list on /reload-ext via a new refreshExtensionItems hook
so the banner stays accurate across hot-reloads.
- Add buildExtensionItems helper in cmd/root.go that strips .go and
resolves subdirectory extensions to their parent dir name, tagging
each as project or user scope based on cwd.
- Lock viewport scroll while a drag-select is active so highlighted
content stays under the cursor (SetItems, appendStreamingChunk,
MouseWheelDown all now honor IsMouseDown).
- HandleMouseDrag defensively clears autoScroll on every update so a
racy re-enable can't shift the row mid-drag.
- Recompute scrollback yOffset/viewport height on each mouse event
via currentScrollbackBounds() instead of relying on stale values
cached during the previous View() pass.
- Account for canceling/ctrlCPressedOnce warning rows in
distributeHeight and mark layoutDirty when those flags toggle so
the height budget and mouse origin stay in sync.
- Add ScrollList regression tests covering the three invariants.
- Register /copy (alias /cp) in the System command category
- Walk the scrollback to find the last user/assistant/reasoning
message, skipping transient system messages
- Reuse internal/ui/clipboard.CopyToClipboard for OSC 52 + native
clipboard support (works over SSH)
- Document the command in /help
After /compact, BuildContext emitted [summary, post-compact, kept]
which placed an older kept user/assistant turn after the latest
post-compaction turn. This broke user/assistant alternation and caused
the model to respond as if the post-compaction turn never happened on
the next user message.
- Emit kept messages chronologically before post-compaction messages
- Mirror the same order in GetContextEntryIDs so cut-point to entry-ID
mapping stays aligned across repeat compactions
- Update TestCompactionWithNewMessagesAfterCompaction to assert the
correct chronological order
The MCP adapter previously wrapped any error returned by MCPToolManager.ExecuteTool
into a Go error returned from the fantasy.AgentTool.Run interface. The fantasy
agent loop treats those as critical errors and aborts the entire turn —
discarding all prior reasoning, tool calls, and results.
In practice that meant a single misbehaved MCP server returning a JSON-RPC
"-32602 Invalid params" (e.g. a Zod schema mismatch on the server's input
validation) would kill an in-progress turn after the model had already done
dozens of seconds of useful work, with no way for the model to see the
validation message and self-correct.
This mismatched the contract that native Kit tools follow: native tools
return errors via kit.ErrorResult(...), which become soft tool-result errors
that the model reads and can act on (retry with corrected args, try a
different tool, give up gracefully).
Make the MCP path behave the same way:
- JSON-RPC protocol errors, transport failures, and server-side schema
rejections are now returned as fantasy.NewTextErrorResponse(...) with
err == nil, so the agent loop continues and the model sees the failure
in-band as a tool result it can reason about.
- Context cancellation (ctx.Err() != nil) remains a critical error so
callers can abort turns deterministically. This is the only case where
bubbling up is correct — the caller intentionally tore the turn down
and the agent must not keep spinning.
- Server-side soft errors (CallToolResult{ isError: true }) and the
happy path are unchanged.
The agent loop's MaxSteps cap already bounds the worst case for a
permanently broken MCP server, so there is no risk of unbounded retries.
Side effect: extracted a tiny mcpExecutor interface for the one method the
adapter uses (ExecuteTool), purely so the adapter is unit-testable in
isolation without standing up a full MCPToolManager + connection pool.
Behavior change note for downstream consumers: code that relied on
host.PromptResult / Stream returning a Go error containing
"mcp tool execution failed" will no longer see those errors — the
failure information is now in the assistant's final response (or in the
OnAfterToolResult / OnToolResult hooks, where IsError will be true).
Context cancellation continues to surface as an error from those calls
as before.
Co-authored-by: space_cowboy <space_cowboy@mark3labs.com>
- register loaded skills into the input autocomplete under category
"Skills" with HasArgs so Enter populates "/skill:name " instead of
auto-submitting, leaving room for trailing args
- prefix descriptions with [project] or [user] to disambiguate
colliding skill names across sources
- extend refreshSkillItems to prune & re-add Skills entries on
ContentReloadEvent, matching the pattern used for prompt templates
and MCP prompts
- add Description field to ui.SkillItem and populate it from
kit.Skill.Description in both initial build and hot-reload paths
- Add unexported steerDrainFn test seam on App so unit tests can
inject fake steer items without standing up a full *kit.Kit
(Options.Kit is a concrete struct, not an interface).
- releaseBusyAfterCompact now prefers the seam over Kit.DrainSteer
via a small switch; production behaviour is unchanged when the
field is nil.
- Add TestReleaseBusyAfterCompact_splicesSteerAheadOfQueue, which
pre-populates both fake steer items and ordinary queue prompts,
invokes releaseBusyAfterCompact, and asserts the first dispatched
prompt is the steer item — proving steer messages retain 'act now'
priority and that drainQueue is actually launched (the bug from
#27).
- Add releaseBusyAfterCompact() shared deferred tail used by both
CompactConversation and CompactAsync. It drains the SDK steer
channel, splices steer items in front of any queued prompts, and
hands off to drainQueue so messages received during compaction
are dispatched automatically once compaction finishes.
- Previously, busy was simply cleared on completion and the queue
sat idle until the user submitted another prompt, which then
flushed everything together.
- Honor the closed flag so a teardown during compaction discards
pending items instead of spawning drainQueue against a torn-down
App.
- Add regression tests covering the queued-flush, idle-empty, and
closed-during-compact paths.
Fixes#27
Addresses two CodeRabbit feedback items on PR #24:
* Docstring coverage warning (was 57.14%, threshold 80%): adds godoc
comments to the four test functions added or substantially rewritten
in this PR — TestLoadAndSaveManifest, TestAddAndRemoveFromManifest,
TestFindInManifest, TestHighlightFileTokensInjectsANSI.
* Quick-win nitpick: replaces the manual os.Setenv/os.Unsetenv +
defer pattern in TestFindInManifest with t.Setenv, which restores
the env var automatically on cleanup even on panic or t.Fatal.
go test -race ./... still passes.
The TestUserBlockHighlightsFileTokens test was rewritten to call
HighlightFileTokens directly (UserBlock was deleted in the dead-code
sweep). That left testTypography with no callers, so staticcheck U1000
flagged it.
The same ~40-line block — building a kit.SubagentConfig, wrapping
OnEvent through sdkEventToSubagentEvent, calling kitInstance.Subagent,
and translating the SDK result into extensions.SubagentResult — was
copy-pasted three times:
* cmd/root.go (interactive TUI Context, line 1148)
* cmd/root.go (post-SessionStart runtime Context, line 1446)
* internal/acpserver/session.go (ACP server Context, line 154)
A separate sdkEventToSubagentEvent function was duplicated byte-for-byte
between cmd/root.go and internal/acpserver/session.go.
Both are now consolidated in a new internal/extbridge package which is
the only module-internal home that can legitimately import both
pkg/kit/ (the public SDK) and internal/extensions/. cmd/ and
internal/acpserver/ both import it, so SDK-event-to-extension-event
schema changes only have one site to update.
Also fixes pkg/kit/events.go godoc comment that named the underlying
LLM library, per AGENTS.md 'No Dependency Name Leakage' rule for
exported SDK symbols.
go test -race ./... passes.
Removes ~600 lines of unreferenced code surfaced by deadcode + manual
audit (none of it reachable from production code paths or test setup):
- internal/models/pool.go: ProviderPool was never wired into kitsetup
or the agent; the global pool singleton had zero callers.
- internal/ui/debug_logger.go: CLIDebugLogger was unreachable; debug
routing goes through internal/tools/buffered_logger.go instead.
- internal/ui/tool_approval_input.go: tea.Model never instantiated;
approvals are handled inline in model.go.
- internal/ui/cli.go: DisplayAssistantMessage / DisplayCancellation /
GetDebugLogger had zero callers (the *WithModel variant is what
event_handler.go uses).
- internal/ui/style/enhanced.go: Style{Card,Header,Subheader,Muted,
Success,Error,Warning,Info} + Create{Separator,ProgressBar} — none
used. CreateBadge stays (used by model.go).
- internal/ui/style/themes.go: RefreshThemeRegistry — never called.
- internal/ui/block_renderer.go: With{FullWidth,MarginTop,Padding{Left,
Right},Background,Foreground,Width} — option helpers nobody calls.
- internal/ui/render/blocks.go: UserBlock, ToolBlock — replaced by
inline rendering elsewhere; the test for UserBlock was rewritten to
directly exercise HighlightFileTokens (which is what the test really
cared about).
- internal/ui/commands/commands.go: GetAllCommandNames — no callers.
- internal/ui/message_items.go: NewTextMessageItem,
NewSystemMessageItem + the entire SystemMessageItem type — model.go
uses NewStyledMessageItem instead.
- internal/prompts/loader.go: Deduplicate — the loader does dedup
internally; standalone helper was unused.
- internal/models/cache_options.go: mergeProviderOptions + its
test-only consumer.
- internal/extensions/installer.go: Installer.GetInstalledPackages —
intended for a 'kit ext list' command that was never built.
- internal/extensions/manifest.go: saveManifestToScope,
saveManifestToPath, GetGlobalManifest, GetProjectManifest,
addEntryToManifest, removeEntryFromManifest — package-level
duplicates of *Installer methods. Tests rewritten to exercise the
live Installer methods instead, which fixes a latent path-resolution
inconsistency between manifestPathForScope and Installer.manifestPath
(the former hard-coded paths, the latter respects projectGitRoot).
- internal/extensions/subagent.go: SpawnSubagent + helpers
(generateSubagentID, findKitBinary, subagentJSONOutput). The
subprocess-spawn implementation is unreachable; production code
routes through kit.Kit.Subagent (in-process). Types
(SubagentConfig/Result/Handle/etc.) and the SubagentHandle methods
remain because they are exposed to extensions via Yaegi symbols and
the Context.SpawnSubagent field.
- cmd/root.go: LoadConfigWithEnvSubstitution — one-line wrapper around
kit.LoadConfigWithEnvSubstitution with zero callers.
go test -race ./... passes.
- Stale comment showed ~/.kit/sessions/--<cwd-path>--/ which does not
match the actual encoding (no leading/trailing dashes)
- Update to reflect the real format and point to encodeCwdForDir for
full rules
- Encode cwd via new encodeCwdForDir helper that handles both `/`
and `\` separators and strips characters illegal in Windows
directory names (`: < > " | ? *`)
- Fixes session creation on Windows where the drive-letter colon
produced names like `C:--test` and caused mkdir to fail
- Add regression tests covering Unix paths, Windows drive roots,
secondary drives, mixed separators, and other illegal chars
Fixes#18
Address two review findings on the MCP Tasks PR.
- Config.Validate() now rejects unknown tasksMode values with a clear
error naming the server and bad value. Without this a typo (e.g.
"alwasy") was silently downgraded to "auto" by the runtime parser.
- Kit.Subagent() now propagates the parent's six MCP task options
(mode map, timeout, TTL, poll interval, max poll interval, progress
callback) onto the child via a new inheritMCPTaskOptions helper.
Without this, child subagents always saw default polling and no
progress feedback regardless of parent configuration.
The propagation logic lives in a helper so the test exercises the real
code path instead of duplicating it; future task fields only need to be
added in one place.
Implement Phase 1 of the MCP Tasks spec so long-running tools/call
requests can run asynchronously, survive proxy timeouts, and be
cancelled mid-flight.
- connection pool now advertises mcp.NewTasksCapability() during
initialize and captures the InitializeResult so callers can detect
per-server task support
- new MCPServerConfig.TasksMode (auto|never|always, default auto)
parsed from both new and legacy mcp.json shapes
- ExecuteTool augments tools/call with TaskParams when policy and
capability allow, polls tasks/get / tasks/result until terminal,
and best-effort tasks/cancel on context cancellation
- new MCPToolManager methods: SetTaskConfig, ListServerTasks,
GetServerTask, CancelServerTask
- public SDK surface in pkg/kit: MCPTask, MCPTaskStatus, MCPTaskMode,
MCPTaskProgress, MCPTaskProgressHandler, plus Options fields
(MCPTaskMode, MCPTaskTimeout, MCPTaskTTL, MCPTaskPollInterval,
MCPTaskMaxPollInterval, MCPTaskProgress) and Kit.{List,Get,Cancel}
MCPTask methods
- works around two upstream mcp-go v0.51.0 parser bugs
(ParseCallToolResult rejects task responses; ParseTaskResultResult
looks for content under a non-existent nested key) by decoding the
wire shape directly via the transport
- defaults to MCPTaskModeAuto so servers that don't advertise task
support behave exactly as before
Fixes#21
- Bump fantasy v0.21.0 -> v0.23.0, mcp-go v0.49.0 -> v0.51.0,
acp-go-sdk v0.12.0 -> v0.12.2, chroma v2.23.1 -> v2.24.1,
fsnotify v1.9.0 -> v1.10.1, ultraviolet, AWS SDK, Google API
- Implement CloseSession and ResumeSession on acpserver.Agent to
satisfy the expanded acp.Agent interface in acp-go-sdk v0.12.2
- Add sessionRegistry.remove helper to support session close
Fantasy v0.21.0 natively includes gpt-5.5 and other newer models in
its responsesModelIDs/responsesReasoningModelIDs lists, making our
workaround unnecessary.
- Delete responses_models.go (go:linkname hack + RegisterResponsesModels)
- Delete responses_models_test.go
- Replace isResponsesAPIModel/isResponsesReasoningModel heuristics with
direct openai.IsResponsesModel/openai.IsResponsesReasoningModel calls
- Remove RegisterResponsesModels calls from registry init/reload
- Remove hack documentation from AGENTS.md
- Update all deps (fantasy v0.21.0, smithy-go, ultraviolet, etc.)
Fantasy's hardcoded responsesModelIDs list gates whether a model uses
the Responses API or Chat Completions code path. When a new model
(e.g. gpt-5.5) is added via `kit update-models` but fantasy hasn't
been updated yet, the type mismatch between *ResponsesProviderOptions
and *ProviderOptions causes a crash.
- Add isResponsesAPIModel()/isResponsesReasoningModel() helpers that
supplement fantasy's checks with prefix-based heuristics for modern
OpenAI model families (gpt-4.1+, gpt-5+, o-series, codex, chatgpt)
- Add RegisterResponsesModels() using go:linkname to append missing
model IDs from our database into fantasy's internal slices at init
time and after ReloadGlobalRegistry()
- Replace all direct openai.IsResponsesModel/IsResponsesReasoningModel
calls in providers.go with the new helpers
- Merge embedded + cached model databases instead of cache-only fallback
- Bump fantasy v0.19.0 -> v0.20.0 to match existing import usage
- Document the technique and model-family update process in AGENTS.md
- Remove top-level old_text/new_text params from edit tool schema
- Make edits array the sole interface; single edits pass 1-item array
- Simplify normalizeEditInput, removing dual-mode branching logic
- Update UI renderer to only read from edits array
- Remove old_text/new_text from bodyKeys in message summarizer
- Update web session HTML to iterate edits array
- Convert all single-edit tests to use Edits array
- Replace mixed-mode test with empty-array validation test
Tool blocking via OnToolCall and SetActiveTools returned both a
ToolResponse (IsError=true) and a Go error. Fantasy treats a non-nil
Go error from tool.Run() as a critical failure, aborting the agent
loop without delivering the tool result to the LLM. The model never
saw the block reason and would retry or hallucinate.
- Return nil error for blocked tools (OnToolCall Block=true)
- Return nil error for disabled tools (SetActiveTools)
- Return nil error for extension tool execution failures
- Update tests to assert nil error (IsError response conveys the error)
Fixes#20
- Set context tokens per-step in recordStepUsage instead of waiting
for turn completion; each step re-sends the full conversation so
the reported usage monotonically increases
- Add UsageUpdatedEvent to trigger a TUI re-render after each step
so the status bar reflects updated tokens, cost, and context %
even during gaps between streaming chunks
- Update test to expect per-step context token updates
- Add heightCache map to ScrollList, keyed by item ID, avoiding
repeated Render() calls purely to count lines
- Rewrite GotoBottom() to walk backwards from the end in O(visible)
instead of two full O(N) forward passes over all items
- Replace all height-only Render() calls in clampOffset(), AtBottom(),
ScrollBy(), and ScrollPercent() with cached itemHeight() lookups
- Invalidate cache on width changes (SetWidth) and item mutations
(AppendChunk, AppendStdout/Stderr via InvalidateItemHeight)
- Refresh cache entries in View() from authoritative renders