Removes ~600 lines of unreferenced code surfaced by deadcode + manual
audit (none of it reachable from production code paths or test setup):
- internal/models/pool.go: ProviderPool was never wired into kitsetup
or the agent; the global pool singleton had zero callers.
- internal/ui/debug_logger.go: CLIDebugLogger was unreachable; debug
routing goes through internal/tools/buffered_logger.go instead.
- internal/ui/tool_approval_input.go: tea.Model never instantiated;
approvals are handled inline in model.go.
- internal/ui/cli.go: DisplayAssistantMessage / DisplayCancellation /
GetDebugLogger had zero callers (the *WithModel variant is what
event_handler.go uses).
- internal/ui/style/enhanced.go: Style{Card,Header,Subheader,Muted,
Success,Error,Warning,Info} + Create{Separator,ProgressBar} — none
used. CreateBadge stays (used by model.go).
- internal/ui/style/themes.go: RefreshThemeRegistry — never called.
- internal/ui/block_renderer.go: With{FullWidth,MarginTop,Padding{Left,
Right},Background,Foreground,Width} — option helpers nobody calls.
- internal/ui/render/blocks.go: UserBlock, ToolBlock — replaced by
inline rendering elsewhere; the test for UserBlock was rewritten to
directly exercise HighlightFileTokens (which is what the test really
cared about).
- internal/ui/commands/commands.go: GetAllCommandNames — no callers.
- internal/ui/message_items.go: NewTextMessageItem,
NewSystemMessageItem + the entire SystemMessageItem type — model.go
uses NewStyledMessageItem instead.
- internal/prompts/loader.go: Deduplicate — the loader does dedup
internally; standalone helper was unused.
- internal/models/cache_options.go: mergeProviderOptions + its
test-only consumer.
- internal/extensions/installer.go: Installer.GetInstalledPackages —
intended for a 'kit ext list' command that was never built.
- internal/extensions/manifest.go: saveManifestToScope,
saveManifestToPath, GetGlobalManifest, GetProjectManifest,
addEntryToManifest, removeEntryFromManifest — package-level
duplicates of *Installer methods. Tests rewritten to exercise the
live Installer methods instead, which fixes a latent path-resolution
inconsistency between manifestPathForScope and Installer.manifestPath
(the former hard-coded paths, the latter respects projectGitRoot).
- internal/extensions/subagent.go: SpawnSubagent + helpers
(generateSubagentID, findKitBinary, subagentJSONOutput). The
subprocess-spawn implementation is unreachable; production code
routes through kit.Kit.Subagent (in-process). Types
(SubagentConfig/Result/Handle/etc.) and the SubagentHandle methods
remain because they are exposed to extensions via Yaegi symbols and
the Context.SpawnSubagent field.
- cmd/root.go: LoadConfigWithEnvSubstitution — one-line wrapper around
kit.LoadConfigWithEnvSubstitution with zero callers.
go test -race ./... passes.
Fantasy v0.21.0 natively includes gpt-5.5 and other newer models in
its responsesModelIDs/responsesReasoningModelIDs lists, making our
workaround unnecessary.
- Delete responses_models.go (go:linkname hack + RegisterResponsesModels)
- Delete responses_models_test.go
- Replace isResponsesAPIModel/isResponsesReasoningModel heuristics with
direct openai.IsResponsesModel/openai.IsResponsesReasoningModel calls
- Remove RegisterResponsesModels calls from registry init/reload
- Remove hack documentation from AGENTS.md
- Update all deps (fantasy v0.21.0, smithy-go, ultraviolet, etc.)
Fantasy's hardcoded responsesModelIDs list gates whether a model uses
the Responses API or Chat Completions code path. When a new model
(e.g. gpt-5.5) is added via `kit update-models` but fantasy hasn't
been updated yet, the type mismatch between *ResponsesProviderOptions
and *ProviderOptions causes a crash.
- Add isResponsesAPIModel()/isResponsesReasoningModel() helpers that
supplement fantasy's checks with prefix-based heuristics for modern
OpenAI model families (gpt-4.1+, gpt-5+, o-series, codex, chatgpt)
- Add RegisterResponsesModels() using go:linkname to append missing
model IDs from our database into fantasy's internal slices at init
time and after ReloadGlobalRegistry()
- Replace all direct openai.IsResponsesModel/IsResponsesReasoningModel
calls in providers.go with the new helpers
- Merge embedded + cached model databases instead of cache-only fallback
- Bump fantasy v0.19.0 -> v0.20.0 to match existing import usage
- Document the technique and model-family update process in AGENTS.md
- Update all Go dependencies (bubbletea v2.0.6, fantasy v0.19.0,
acp-go-sdk v0.12.0, mcp-go v0.49.0, and transitive deps)
- Replace SetSessionModel with SetSessionConfigOption to match new
acp-go-sdk Agent interface (union type with ValueId/Boolean variants)
- Add ListSessions stub returning empty list (new required method)
- Refresh embedded_models.json from models.dev/api.json
- Update ACP smoke test: add initialize handshake, session/list,
session/set_config_option, session/cancel, and fix update parsing
Adds 'none' thinking level to support OpenAI gpt-5.4 models which use
'reasoning_effort: none' instead of 'minimal'. Includes validation and
auto-adjustment when switching models with incompatible levels.
- Add ThinkingNone constant mapping to ReasoningEffortNone
- Add IsValidThinkingLevelForModel() with gpt-5.4 detection
- Add SuggestThinkingLevelFallback() for level migration
- Auto-adjust thinking level on model switch with user notification
- Update all docs to include 'none' in valid levels
Fixes #11
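A minimal sketch of the validation and fallback described above. The two function names come from this commit; their signatures, the prefix-based gpt-5.4 detection, and the exact minimal/none mapping are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ThinkingLevel values are assumed to be plain strings.
type ThinkingLevel string

const (
	ThinkingNone    ThinkingLevel = "none"
	ThinkingMinimal ThinkingLevel = "minimal"
	ThinkingLow     ThinkingLevel = "low"
	ThinkingMedium  ThinkingLevel = "medium"
	ThinkingHigh    ThinkingLevel = "high"
)

// usesNoneLevel is a hypothetical detector: it assumes gpt-5.4 models can be
// recognised by a name prefix.
func usesNoneLevel(model string) bool {
	return strings.HasPrefix(model, "gpt-5.4")
}

// IsValidThinkingLevelForModel applies the assumed rule that gpt-5.4 accepts
// "none" but not "minimal", and every other model the reverse.
func IsValidThinkingLevelForModel(level ThinkingLevel, model string) bool {
	switch level {
	case ThinkingNone:
		return usesNoneLevel(model)
	case ThinkingMinimal:
		return !usesNoneLevel(model)
	default:
		return true
	}
}

// SuggestThinkingLevelFallback returns the level unchanged when it is valid,
// otherwise swaps "minimal" and "none".
func SuggestThinkingLevelFallback(level ThinkingLevel, model string) ThinkingLevel {
	if IsValidThinkingLevelForModel(level, model) {
		return level
	}
	if level == ThinkingMinimal {
		return ThinkingNone
	}
	return ThinkingMinimal
}

func main() {
	fmt.Println(SuggestThinkingLevelFallback(ThinkingMinimal, "gpt-5.4"))
	fmt.Println(SuggestThinkingLevelFallback(ThinkingNone, "gpt-5"))
}
```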
Add --set-default flag to 'kit auth login' to automatically set the
provider's default model after successful authentication. When no Anthropic
credentials exist but OpenAI credentials are detected, error messages
now suggest using OpenAI with the correct --model flag.
Fixes #9
- Raise --max-tokens default from 4096 to 8192.
- Auto-raise MaxTokens toward the model's catalog Limit.Output (capped at
32768) when the user hasn't set --max-tokens explicitly and no per-model
modelSettings override applied. Prevents silent 4k/8k truncation on
models that support 32k-262k output.
- Surface FinishReasonLength at turn end: the app now subscribes to
TurnEndEvent and renders a system-message banner explaining the current
cap, the model's known ceiling, and how to raise it. Previously the TUI
swallowed 'length' stops, producing 'ghost' truncations.
- Export FinishReason* constants on pkg/kit (Stop, Length, ToolCalls,
ContentFilter, Error, Other, Unknown) and fix stale comments that used
Anthropic-style strings.
- Add Kit.MaxTokens() and Kit.MaxOutputLimit() SDK accessors, backed by
Agent.GetMaxTokens() which correctly returns 0 for providers that
suppress the param (e.g. Codex OAuth).
- Tests: rightSizeMaxTokens covers 7 paths (cap, raise, preserve,
explicit flag, nil info, zero limit); handleTurnEnd covers length/
non-length/nil-sendFn and the fallback message formatter.
- Docs: update configuration.md, cli/flags.md, and kit-extensions skill
to reflect the new default and behavior.
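A rough sketch of the right-sizing rule described above. rightSizeMaxTokens is named in this commit, but the signature, the modelInfo type, and the choice to never lower an existing value are assumptions.

```go
package main

import "fmt"

const (
	defaultMaxTokens = 8192  // new --max-tokens default
	autoRaiseCap     = 32768 // ceiling for the automatic raise
)

// modelInfo is a hypothetical stand-in for the catalog entry; OutputLimit
// plays the role of Limit.Output.
type modelInfo struct {
	OutputLimit int
}

// rightSizeMaxTokens raises current toward the model's output limit, capped
// at autoRaiseCap. It leaves current untouched when the user set the value
// explicitly (flag or modelSettings), or when the limit is unknown.
func rightSizeMaxTokens(current int, explicit bool, info *modelInfo) int {
	if explicit || info == nil || info.OutputLimit <= 0 {
		return current
	}
	target := info.OutputLimit
	if target > autoRaiseCap {
		target = autoRaiseCap
	}
	if target > current {
		return target
	}
	return current // assumed: never lower an existing value
}

func main() {
	fmt.Println(rightSizeMaxTokens(defaultMaxTokens, false, &modelInfo{OutputLimit: 262144}))
	fmt.Println(rightSizeMaxTokens(defaultMaxTokens, false, &modelInfo{OutputLimit: 16384}))
	fmt.Println(rightSizeMaxTokens(defaultMaxTokens, true, &modelInfo{OutputLimit: 262144}))
}
```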
- Rename ExtensionToolsAsFantasy -> ExtensionToolsAsLLMTools
- Rename convertKitMessagesToFantasy -> convertToLLMMessages
- Delete GetFantasyProviders, ToFantasyMessages, FromFantasyMessage
- Replace direct fantasy type usage with kit.LLM* aliases in app tests
- Scrub fantasy references from godoc comments across pkg/kit and internal
- Add systemPrompt field to GenerationParams and config structs
- On init, replace default system prompt with per-model prompt when
user hasn't explicitly set one (via flag, config, or SDK option)
- On model switch, detect per-model prompt and compose it with
AGENTS.md, skills, and date/cwd context
- Fix viper.IsSet bug: BindPFlag causes IsSet to return true for
unset flags, so compare against defaultSystemPrompt instead
- Agent.SetModel now updates stored system prompt from config
- Export LoadModelSettingsFromConfig, LoadSystemPromptValue, and
LookupModelForSettings for use by Kit.SetModel
- Add tests for prompt apply, precedence, file path, and
modelSettings override
- Add modelSettings config section for attaching generation params
(temperature, topP, topK, frequencyPenalty, presencePenalty,
maxTokens, stopSequences, thinkingLevel) to any model by
provider/model key
- Add params field to customModels definitions for inline defaults
- Change BuildProviderConfig and SetModel to use viper.IsSet so
unset params remain nil, allowing model-level defaults to apply
- Wire ApplyModelSettings into CreateProvider with priority order:
CLI flags > global config > modelSettings > customModels params
- Add GenerationParams to ModelInfo in the registry
- Update default config template with modelSettings and customModels
params examples
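The priority order above can be sketched with pointer fields, where nil means "not set". The GenerationParams shape shown here (two fields only) and the resolve helper are illustrative assumptions, not the actual ApplyModelSettings code.

```go
package main

import "fmt"

// GenerationParams is trimmed to two fields for illustration; nil means unset.
type GenerationParams struct {
	Temperature *float64
	MaxTokens   *int
}

// firstSet returns the first non-nil candidate, or nil if none is set.
func firstSet[T any](candidates ...*T) *T {
	for _, c := range candidates {
		if c != nil {
			return c
		}
	}
	return nil
}

// resolve applies the order: CLI flags > global config > modelSettings >
// customModels params.
func resolve(cli, global, modelSettings, customModel GenerationParams) GenerationParams {
	return GenerationParams{
		Temperature: firstSet(cli.Temperature, global.Temperature,
			modelSettings.Temperature, customModel.Temperature),
		MaxTokens: firstSet(cli.MaxTokens, global.MaxTokens,
			modelSettings.MaxTokens, customModel.MaxTokens),
	}
}

func ptr[T any](v T) *T { return &v }

func main() {
	cli := GenerationParams{Temperature: ptr(0.2)}
	modelSettings := GenerationParams{Temperature: ptr(0.7), MaxTokens: ptr(16384)}
	customModel := GenerationParams{MaxTokens: ptr(4096)}

	got := resolve(cli, GenerationParams{}, modelSettings, customModel)
	fmt.Println(*got.Temperature, *got.MaxTokens)
}
```

Keeping unset values as nil is what lets a lower-priority layer fill the gap, which is why the commit switches to viper.IsSet.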
- Add --frequency-penalty and --presence-penalty CLI flags (0.0-2.0)
- Wire through config, viper, ProviderConfig, and fantasy agent options
- Support in config file, env vars (KIT_FREQUENCY_PENALTY), and SDK
- Pass to Ollama via options map (frequency_penalty, presence_penalty)
- Apply on both initial agent creation and runtime model swap
- ctx.Abort(): cancel current agent turn and clear queue without
injecting a new message (App.Abort + App.IsBusy methods)
- ctx.IsIdle(): check whether the agent is currently processing
- ctx.Compact(CompactConfig): trigger async context compaction with
OnComplete/OnError callbacks (App.CompactAsync method)
- ctx.SendMultimodalMessage(text, []FilePart): send text+image messages
to the agent, bridging ext.FilePart to fantasy.FilePart via RunWithFiles
- ctx.GetSessionUsage() SessionUsage: expose aggregated session token
usage and cost from the UsageTracker
New types: CompactConfig, FilePart, SessionUsage
Wired in both context setups in cmd/root.go with nil-guard defaults
in runner.go and Yaegi symbol exports in symbols.go
Move reasoning tag detection from the provider and UI layers into the agent layer. This prevents raw XML tags from leaking into text streams while ensuring structured reasoning events are emitted correctly for all callers.
- Add `baseUrl` and `apiKey` fields to CustomModelConfig (config and models packages)
- Store them on ModelInfo so they travel through the registry
- createCustomProvider resolves URL/key from model definition first,
falling back to global --provider-url / --provider-api-key
- Fix registry initialisation: call ReloadGlobalRegistry() in InitConfig()
so customModels from config are visible on startup (not just at init time)
- Include custom provider in GetLLMProviders() so custom models appear
in the /model selector
- Hide the built-in custom/custom stub from the selector when user-defined
custom models are present
- Add ValidateModelString() to ModelsRegistry for format, provider,
and model name validation with typo suggestions
- Validate model in Kit.Subagent() before expensive Kit.New() setup
- Remove silent fallback to parent model on creation failure
- Error propagates as tool result so calling agent can self-correct
- Add registry_test.go covering format, provider, and suggestion cases
Models like Qwen and DeepSeek wrap reasoning content in <think>...</think> XML-like
tags within the regular content field. This caused the reasoning
text to appear twice: once as a reasoning block and once as regular text.
Changes:
1. Provider hooks (providers.go):
- Extract reasoning from <think> tags and emit proper reasoning events
- Use openai provider directly with custom ExtraContentFunc and
StreamExtraFunc hooks to parse thinking content
2. Stream filtering (stream.go):
- Filter out all text content between <think> and </think> tags at the
streaming level to prevent duplicate rendering
- Track state with inThinkTag flag across stream chunks
3. Message conversion (content.go):
- Strip any remaining <think> tags from text content when converting
from fantasy messages
The regex patterns are assembled via string concatenation inside
regexp.MustCompile, so the tag literals never appear verbatim in the
source and cannot be corrupted.
Fixes duplicate reasoning text when using custom provider with models
that wrap thinking in <think> tags.
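A sketch of stateful tag filtering across stream chunks, in the spirit of the stream.go change. It assumes the tags are <think> and </think>; the thinkFilter type and its methods are hypothetical, and only the inThinkTag flag name is taken from this commit.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	openTag  = "<think>"
	closeTag = "</think>"
)

// thinkFilter splits streamed content into visible text and reasoning.
type thinkFilter struct {
	inThinkTag bool   // true while between the open and close tags
	pending    string // held-back tail that may be the start of a tag
}

// partialSuffix returns the length of the longest proper prefix of tag that
// is also a suffix of s.
func partialSuffix(s, tag string) int {
	limit := len(tag) - 1
	if limit > len(s) {
		limit = len(s)
	}
	for n := limit; n > 0; n-- {
		if strings.HasPrefix(tag, s[len(s)-n:]) {
			return n
		}
	}
	return 0
}

// Write consumes one chunk and returns the text and reasoning parts that
// can be emitted so far. Tags split across chunks are handled by holding
// back a possible partial tag until the next chunk arrives.
func (f *thinkFilter) Write(chunk string) (text, reasoning string) {
	buf := f.pending + chunk
	f.pending = ""
	var textB, reasonB strings.Builder
	for len(buf) > 0 {
		tag, out := openTag, &textB
		if f.inThinkTag {
			tag, out = closeTag, &reasonB
		}
		if i := strings.Index(buf, tag); i >= 0 {
			out.WriteString(buf[:i])
			buf = buf[i+len(tag):]
			f.inThinkTag = !f.inThinkTag
			continue
		}
		keep := partialSuffix(buf, tag)
		out.WriteString(buf[:len(buf)-keep])
		f.pending = buf[len(buf)-keep:]
		break
	}
	return textB.String(), reasonB.String()
}

func main() {
	f := &thinkFilter{}
	for _, chunk := range []string{"Hello <thi", "nk>secret</th", "ink> world"} {
		text, reasoning := f.Write(chunk)
		fmt.Printf("text=%q reasoning=%q\n", text, reasoning)
	}
}
```

A real implementation would also flush any held-back pending text when the stream ends.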
- Upgrade golangci-lint to v2.11.4
- Fix errcheck warnings for os.Setenv/os.Unsetenv in tests
- Use maps.Copy instead of manual loop (modernize lint)
- Add maps import for maps.Copy
Remove internal monologue comments that don't add value for readers:
- Remove lengthy explanations of type conflicts that are now resolved
- Remove 'NOTE:' and 'TODO:' comments documenting implementation history
- Remove obvious test comments that just restate what the code does
- Keep only meaningful comments that explain design intent
The code is now cleaner and easier to read without the self-referential
commentary that was useful during development but not for maintenance.
Implements automatic prompt caching to reduce API costs by 60-90% for
repeated prompts with the same context.
Architecture:
- Provider-level caching for OpenAI (PromptCacheKey)
- Message-level caching for Anthropic (avoids type conflicts)
- Model family detection enables caching regardless of provider
Key Changes:
- Add ModelInfo.Family with SupportsCaching() and CacheType() methods
- Add ProviderConfig.DisableCaching for opt-out
- Implement message-level cache control in agent (like Crush)
- Last system message gets cache control
- Last 2 messages get cache control
- Last tool gets cache control
- Auto-disable caching when thinking is enabled (type conflict avoidance)
- Add KIT_DISABLE_CACHE environment variable for global opt-out
Tested with opencode/claude-sonnet-4-6 showing cacheRead/cacheWrite
tokens in debug output, confirming 60-90% cost savings.
Closes out the cost-optimization work for multi-turn conversations.
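The message-level placement rules listed under Key Changes can be sketched as follows. The Message and Tool types and the applyCacheControl helper are hypothetical; the real agent attaches provider-specific cache-control options rather than a boolean.

```go
package main

import "fmt"

// Message and Tool are illustrative stand-ins; Cached marks a cache breakpoint.
type Message struct {
	Role   string
	Text   string
	Cached bool
}

type Tool struct {
	Name   string
	Cached bool
}

// applyCacheControl marks the last system message, the last two messages,
// and the last tool.
func applyCacheControl(msgs []Message, tools []Tool) {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "system" {
			msgs[i].Cached = true
			break
		}
	}
	for i := len(msgs) - 1; i >= 0 && i >= len(msgs)-2; i-- {
		msgs[i].Cached = true
	}
	if n := len(tools); n > 0 {
		tools[n-1].Cached = true
	}
}

func main() {
	msgs := []Message{
		{Role: "system", Text: "You are helpful."},
		{Role: "user", Text: "first"},
		{Role: "assistant", Text: "reply"},
		{Role: "user", Text: "second"},
	}
	tools := []Tool{{Name: "read"}, {Name: "write"}}
	applyCacheControl(msgs, tools)
	for _, m := range msgs {
		fmt.Println(m.Role, m.Cached)
	}
	for _, t := range tools {
		fmt.Println(t.Name, t.Cached)
	}
}
```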
Rename public SDK symbols to use generic LLM terminology instead of
exposing the internal dependency name (charm.land/fantasy):
Public API renames (with deprecated wrappers for backward compat):
- ConvertToFantasyMessages() → ConvertToLLMMessages()
- ConvertFromFantasyMessage() → ConvertFromLLMMessage()
- GetFantasyProviders() → GetLLMProviders()
New type alias:
- LLMFilePart = fantasy.FilePart (eliminates need for direct fantasy import)
- PromptResultWithFiles() signature now uses LLMFilePart
Internal renames (with deprecated wrappers):
- ModelsRegistry.GetFantasyProviders() → GetLLMProviders()
- TreeManager.GetFantasyMessages() → GetLLMMessages()
- TreeManager.AppendFantasyMessage() → AppendLLMMessage()
- TreeManager.AddFantasyMessages() → AddLLMMessages()
- Message.ToFantasyMessages() → ToLLMMessages()
- FromFantasyMessage() → FromLLMMessage()
- npmToFantasyProvider → npmToLLMProvider
- isProviderFantasySupported() → isProviderLLMSupported()
All internal callers migrated to new names. ~30 comments updated
to remove Fantasy references across pkg/kit/, internal/agent/,
internal/models/, internal/message/, internal/session/.
Documentation updates:
- AGENTS.md: added Public SDK rules section (no dependency leakage,
naming conventions, deprecation pattern)
- README.md: removed Fantasy references
- pkg/kit/README.md: full rewrite with current API surface
- skills/kit-sdk/SKILL.md: updated examples and type references
- www/pages/providers.md, www/pages/cli/commands.md: updated
- Remove unused modelFamily variable in createOpenAICodexProvider
- Remove dead spark handling code (spark is rejected early with error)
- Simplify buildCodexProviderOptions to only handle regular codex models
- Remove redundant comments and simplify code structure
- Net reduction: 31 lines of code
Spark models are not accessible via ChatGPT OAuth and return Cloudflare
'Forbidden' errors. Add early detection and helpful error message directing
users to regular Codex models like 'openai/gpt-5.3-codex' instead.
Different Codex model families use different API formats:
- gpt-codex-spark: uses standard ProviderOptions (not Responses API)
- gpt-codex, gpt-codex-mini: uses ResponsesProviderOptions
- Add detectCodexModelFamily() to determine model family from name
- Use standard ProviderOptions for spark models
- Use ResponsesProviderOptions for regular codex models
- Conditionally use WithUseResponsesAPI() based on model family
Note: gpt-5.3-codex-spark still gets Cloudflare forbidden error,
may need additional headers or different endpoint.
The Codex API doesn't support the max_output_tokens parameter, which was causing
"Unsupported parameter: max_output_tokens" errors.
- Add SkipMaxOutputTokens flag to ProviderResult
- Set flag when creating Codex OAuth provider
- Check flag in agent setup to skip WithMaxOutputTokens option
- This matches pi's behavior of not sending max_tokens to Codex API
- Upgrade charm.land/fantasy from v0.16.0 to v0.17.1
- Add buildCodexProviderOptions() to pass system prompt as 'instructions'
- The Codex API requires instructions as a top-level field, not as system message
- Set Store=false to prevent server-side conversation storage
- Use ResponsesProviderOptions.Instructions for system prompt
- Change base URL to /backend-api/codex for correct endpoint path
- Add browser-like User-Agent to avoid Cloudflare blocking
- Add Accept, Accept-Language, Cache-Control headers
- Match pi client headers more closely
Implements OAuth authentication for OpenAI ChatGPT Plus/Pro (Codex) similar to pi:
- Add OpenAICredentials type with OAuth and API key support
- Add OpenAI OAuth client with correct endpoints (auth.openai.com)
- Implement PKCE-based OAuth flow with local callback server on :1455
- Add login/logout/status commands for openai provider
- Support both ChatGPT/Codex OAuth tokens (chatgpt.com/backend-api) and
regular OpenAI API keys (api.openai.com)
- Extract and store ChatGPT account ID from JWT token
- Add custom HTTP transport with required Codex headers:
- chatgpt-account-id, originator, OpenAI-Beta: responses=experimental
- Update provider selection to use correct endpoint based on auth type
Usage:
kit auth login openai # OAuth with ChatGPT account
kit auth logout openai
kit auth status
The implementation follows the same patterns as the existing Anthropic OAuth
support, with automatic token refresh and secure credential storage in
~/.config/.kit/credentials.json
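For reference, the PKCE half of the flow can be sketched with the standard library alone. This follows the S256 method from RFC 7636; the helper names are hypothetical, and the callback server and token exchange are omitted.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// challengeFor derives the S256 code challenge:
// BASE64URL(SHA256(verifier)) without padding.
func challengeFor(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// newPKCEPair generates a random code verifier (32 bytes, 43 characters once
// encoded) and its matching challenge.
func newPKCEPair() (verifier, challenge string, err error) {
	raw := make([]byte, 32)
	if _, err = rand.Read(raw); err != nil {
		return "", "", err
	}
	verifier = base64.RawURLEncoding.EncodeToString(raw)
	return verifier, challengeFor(verifier), nil
}

func main() {
	verifier, challenge, err := newPKCEPair()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(verifier), len(challenge))
}
```

The challenge goes in the authorization request; the verifier is sent only with the token exchange, so an intercepted authorization code alone is not enough to obtain tokens.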
Models from the opencode provider (like claude-opus-4-6 and gpt-5.3-codex)
have provider overrides in the models database that specify different npm
packages than the provider's default. The code was ignoring these overrides
and routing all models through openaicompat, causing "bad request" errors.
Changes:
- Added Provider field to modelsDBModel to capture model-specific overrides
- Added ProviderNPM field to ModelInfo registry struct
- Updated autoRouteProvider() to check for model-specific provider overrides
- Fixed URL path handling for anthropic provider (strip /v1 suffix to avoid
double /v1/v1 paths when using third-party anthropic-compatible APIs)
Fixes routing for:
- opencode/claude-opus-4-6 -> @ai-sdk/anthropic
- opencode/gpt-5.3-codex -> @ai-sdk/openai
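The /v1 fix amounts to normalising the base URL before handing it to a client that appends /v1 itself. A minimal sketch, with normalizeAnthropicBaseURL as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAnthropicBaseURL drops trailing slashes and a trailing "/v1" so a
// client that appends "/v1" does not produce "/v1/v1".
func normalizeAnthropicBaseURL(u string) string {
	u = strings.TrimRight(u, "/")
	return strings.TrimSuffix(u, "/v1")
}

func main() {
	fmt.Println(normalizeAnthropicBaseURL("https://example.com/api/v1/"))
}
```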
Allow users to define custom models in ~/.kit.yml under the customModels
section. These models are automatically merged into the custom provider.
Example config:
customModels:
my-model:
name: "My Custom Model"
reasoning: true
temperature: true
cost:
input: 0.002
output: 0.004
limit:
context: 128000
output: 32000
Usage:
kit --model custom/my-model "Hello"
kit --provider-url "http://localhost:8080" --model custom/my-model "Hello"
Note: When --provider-url is specified without --model, kit defaults to
custom/custom. When --provider-url is specified WITH a custom model from
config, that model is used.
Bug fixes:
- Fixed kit.New() re-loading config file and overriding CLI-specified config
- Fixed models command to reload registry for custom models
When users pass --provider-url without --model, automatically default
to custom/custom instead of the saved model preference. This lets users
point kit at any OpenAI-compatible endpoint without needing a provider/model
pair from the database.
The custom/custom model has:
- Zero cost (input/output = 0)
- 262K context window, 65K output limit
- Reasoning and temperature support
- Routes through openaicompat fantasy provider
Remove the early ValidateEnvironment gate from CreateProvider that only
checked env vars and --provider-api-key, blocking stored OAuth credentials
from working. Each provider creation function already handles its own auth
resolution with clear error messages.
Update ValidateEnvironment to also check stored Anthropic credentials so
the model selector UI correctly shows Anthropic models for OAuth users.
Add automatic token refresh in oauthTransport so long-lived ACP sessions
survive token renewals. Surface actionable auth error messages in ACP
session creation.
Fix pre-existing staticcheck SA5011 warnings in test files.
Implement 4-phase subagent system enabling LLM and extensions to spawn,
manage, and orchestrate child Kit instances for parallel task execution.
- Phase 1: SDK API with SpawnSubagent() for extensions
- Phase 2: spawn_subagent core tool for LLM usage
- Phase 3: Session hierarchy with ParentSessionID tracking
- Phase 4: Provider pooling for concurrent model access
New files:
- internal/extensions/subagent.go: SpawnSubagent implementation
- internal/core/subagent.go: Core tool definition
- internal/models/pool.go: Provider pool for concurrency
- examples/extensions/subagent-test.go: Test extension
- openspec/subagent-support.md: Design specification
Anthropic rejects requests with both temperature and top_p set.
When both are configured (typically from defaults), clear top_p
so temperature takes precedence.
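In sketch form, with a hypothetical samplingParams type where nil means unset:

```go
package main

import "fmt"

// samplingParams is an illustrative stand-in; nil means "not configured".
type samplingParams struct {
	Temperature *float64
	TopP        *float64
}

// forAnthropic clears TopP when both values are set so that temperature
// takes precedence.
func forAnthropic(p samplingParams) samplingParams {
	if p.Temperature != nil && p.TopP != nil {
		p.TopP = nil
	}
	return p
}

func main() {
	temp, topP := 0.7, 0.95
	got := forAnthropic(samplingParams{Temperature: &temp, TopP: &topP})
	fmt.Println(*got.Temperature, got.TopP == nil)
}
```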
Add extended thinking/reasoning support for Anthropic and OpenAI models:
- ThinkingLevel type (off/minimal/low/medium/high) with token budgets
- Stream reasoning deltas via OnReasoningDelta through SDK→TUI event pipeline
- Render thinking blocks in StreamComponent (muted italic, collapsible)
- ctrl+t toggles thinking visibility, shift+tab cycles thinking level
- /thinking slash command with tab-completion for level names
- --thinking-level CLI flag and config file support
- Map ThinkingLevel to OpenAI ReasoningEffort for Responses API
- Auto-bump Anthropic max_tokens when thinking budget exceeds it
- Fix ResponseCompleteEvent prematurely resetting stream in streaming mode
- Status bar displays current thinking level
Enable fantasy's Responses API path (WithUseResponsesAPI) for the OpenAI
provider so that models like gpt-5.3-codex, codex-mini-latest, o3, o4-mini,
and other Responses-only models work correctly.
- Enable WithUseResponsesAPI on both createOpenAIProvider and
createAutoRoutedOpenAIProvider
- Build provider options for reasoning models (reasoning_summary, encrypted
reasoning content) matching crush's coordinator behaviour
- Thread ProviderOptions from provider creation through to the fantasy agent
in NewAgent, SetModel, and the SDK Complete path
- Pass generation parameters (Temperature, MaxTokens, TopP, TopK) to the
fantasy agent for all providers (previously only Ollama)
- Fix extension tool schema for Responses API: parse Parameters JSON Schema
string into fantasy ToolInfo format, ensure Required is never nil (OpenAI
rejects null, expects empty array)
Replace catwalk dependency with direct models.dev integration (97 providers,
3039 models vs catwalk's 22/679). Auto-route @ai-sdk/openai-compatible
providers through fantasy's openaicompat using the api URL from models.dev,
eliminating the need for --provider-url. Add --all flag to 'mcphost models'
to show all providers vs just fantasy-compatible ones.
Fix all 74 golangci-lint issues: errcheck (53), staticcheck SA4006 (24),
SA9003 (2), ST1005 (5), ineffassign (3). Restructure styles.go color
handling into a colorScheme struct to eliminate SA4006 false positives
from new(x) syntax.
- Make model validation advisory: unknown models pass through to the
provider API with a stderr warning instead of blocking. Catwalk
metadata is used for cost tracking and suggestions when available.
- Add LookupModel() as the primary registry API (returns nil for
unknown models, no error).
- Add 'mcphost update-models' subcommand to refresh the model database
from a catwalk server (defaults to https://catwalk.charm.sh), a local
file, or reset to the embedded version. Supports ETag caching.
- Add disk cache layer at ~/.local/share/mcphost/providers.json;
registry loads cached data first, falls back to embedded.
- Add vercel provider support via fantasy.
- Add io.Closer plumbing to ProviderResult and Agent.Close() for
providers that hold resources.
- Delete dead ESC listener code and bubbletea/time imports from agent
- Remove internal/tokens/ package (empty stubs and trivial estimator)
- Inline token estimation into usage_tracker as unexported helper
- Remove the unused EstimateAndUpdateUsageFromText method
- Remove 9 unsupported provider env var entries from registry
Switch the --model / -m flag format from colon-separated (provider:model)
to slash-separated (provider/model), e.g. anthropic/claude-sonnet-4-5-20250929
or ollama/qwen3:8b. The slash separator is cleaner since model names can
contain colons (ollama tags, bedrock ARNs).
Add centralized ParseModelString() in internal/models/providers.go that all
callers now use. The old colon format is still accepted with a deprecation
warning to stderr for backward compatibility.
Update default model to claude-sonnet-4-5-20250929.
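A sketch of the parsing rule: split on the first slash, and only fall back to the legacy colon form when no slash is present. ParseModelString is named in this commit, but the signature here (including the deprecated return value in place of the stderr warning) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// ParseModelString splits "provider/model". The model part may itself
// contain colons (e.g. ollama tags). The legacy "provider:model" form is
// accepted only when there is no slash, and is reported as deprecated.
func ParseModelString(s string) (provider, model string, deprecated bool, err error) {
	if p, m, ok := strings.Cut(s, "/"); ok {
		if p == "" || m == "" {
			return "", "", false, fmt.Errorf("invalid model %q: expected provider/model", s)
		}
		return p, m, false, nil
	}
	if p, m, ok := strings.Cut(s, ":"); ok && p != "" && m != "" {
		return p, m, true, nil
	}
	return "", "", false, fmt.Errorf("invalid model %q: expected provider/model", s)
}

func main() {
	fmt.Println(ParseModelString("ollama/qwen3:8b"))
	fmt.Println(ParseModelString("anthropic:claude-sonnet-4-5-20250929"))
}
```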
Each spinner created a new tea.NewProgram which sent DECRQM queries for
synchronized output mode 2026. When the program exited and restored
cooked terminal mode, the terminal's DECRPM response leaked as visible
^[[?2026;2$y characters. Replace Bubble Tea spinner with a simple
goroutine animation loop writing directly to stderr via lipgloss.