* Remove dead code: 5 unused symbols across internal packages
- internal/models: LoadModelSettingsFromConfig (zero refs)
- internal/prompts: PromptTemplate.ExpandWithArgs (zero refs)
- internal/app: NewMessageStore (tests migrated to NewMessageStoreWithMessages)
- internal/config: HasEnvVars (+ its test)
- internal/core: ContextWithSudoPassword (test migrated to context.WithValue)
* pkg/kit: use TreeManager alias in exported signatures
NewTreeManagerAdapter and InitTreeSession now spell their signatures with
the public kit.TreeManager alias instead of internal/session.TreeManager,
so go doc renders domain types rather than internal paths.
* Consolidate tool-kind classification into internal/extensions
coreToolKinds + toolKindFor were duplicated verbatim in
internal/extensions/wrapper.go and pkg/kit/events.go, risking silent
divergence between extension events and SDK events. Single source of
truth now lives in internal/extensions/toolkinds.go; pkg/kit re-exports
the constants.
* Consolidate Anthropic OAuth detection and usage-tracker refresh
The 'is the active Anthropic credential a stored OAuth token' check was
copy-pasted at 5 sites, all prefix-matching the magic string
'stored OAuth' produced in internal/auth. Now:
- internal/auth: new CredentialSourceOAuth constant + IsAnthropicOAuth()
- internal/ui: new UpdateUsageTrackerForModel(); CreateUsageTracker and
SetupCLI share lookupTrackableModel (SetupCLI no longer re-inlines the
tracker construction)
- cmd/root.go + cmd/extension_context.go: verbatim-duplicated tracker
refresh blocks replaced with ui.UpdateUsageTrackerForModel
- pkg/kit isAnthropicOAuth delegates to auth.IsAnthropicOAuth
- internal/models compares source against the constant
* pkg/kit: consolidate model-path helpers and argument tokenizer
- ExtractModelFromPath mis-parsed model IDs containing '/' (e.g.
'openrouter/meta/llama' -> 'meta'); it now delegates to
RemoveProviderFromModel and is deprecated alongside
ExtractProviderFromPath (-> GetCurrentProvider)
- parseFields delegated to prompts.ParseCommandArgs so extension argument
parsing and builtin prompt-template parsing share one quote/escape
grammar; ParseCommandArgs now also splits on tabs (superset of both
previous tokenizers)
* Unify the two {{variable}} template engines
internal/skills and pkg/kit/template_bridge each had their own grammar:
skills rejected '{{ name }}' (whitespace) but allowed digit-first names;
the bridge was the opposite. A template behaved differently depending on
whether it was loaded as a skill prompt or via the extension API.
internal/skills is now the single engine using the superset grammar
(\{\{\s*(\w+)\s*\}\}); pkg/kit ParseTemplate/RenderTemplate are thin
adapters over it. Expand is now regex-based so whitespace placeholders
expand consistently; missing variables are still left as-is.
* internal/ui: extract switchModel helper for model-switch flow
The model-selector handler (ModelSelectedMsg) and /model slash command
duplicated the full switch sequence (thinking-level fallback, setModel,
display-state update, preference persistence, ModelChange emit) and had
already drifted in ordering. Both now call a single switchModel method.
Display state is still updated directly (no prog.Send from Update).
* extbridge: extract shared BaseContext for extension wiring
cmd/extension_context.go and internal/acpserver/session.go each built a
giant extensions.Context literal, duplicating ~15 delegation closures
(GetContextStats, GetMessages, AppendEntry, options, SetModel core,
Complete, SpawnSubagent, ...) that had to be kept in sync by hand. New
data-access fields had to be wired in both places or ACP-mode extensions
silently got nil function fields.
extbridge.BaseContext now provides the headless half; both call sites
overlay only their UI-specific closures. As a side effect ACP mode gains
previously-missing APIs (state, tree navigation, skills, template
parsing, model resolution) that were nil before. The interactive TUI
keeps its exact SetModel/ReloadExtensions ordering via overrides.
* internal/tools: extract withOAuthRetry and marshalToolResult helpers
ExecuteTool repeated the OAuth-error/re-auth/retry stanza verbatim twice
(sync and task-augmented paths) and the marshal-and-wrap stanza four
times. Both are now single helpers with identical error strings, so a
fix to OAuth retry or error categorization applies everywhere at once.
* internal/ui: extract buildShareFile with defer-based cleanup
handleShareCommand repeated the close/remove/print/return cleanup chain
four times across its temp-file write error paths. File assembly now
lives in buildShareFile with a single deferred cleanup on error.
* cmd: extract flag validation, preference restore, and provider-URL routing from runNormalMode
runNormalMode opened with ~150 lines of policy logic (flag-combination
validation, persisted model/thinking-level preference restoration, and
two subtle --provider-url model-rewrite rules). These are now standalone
functions (validateModeFlags, restorePersistedPreferences,
applyProviderURLRouting) so the routing policy is independently readable
and testable. Behaviour unchanged; ordering preserved.
* fix: address review findings on SDK godoc and nil guard
- pkg/kit: remove internal package paths from exported godoc on
ParseTemplate and the ToolKind* constants (SDK doc surface must not
reference internal packages)
- internal/tools: guard marshalToolResult against a nil CallToolResult
(json.Marshal(nil) succeeds as 'null', then result.IsError panics if
a client returns nil result with nil error)
Skipped the TreeNode Children deep-copy suggestion: the slice already
comes from TreeManager.GetChildren which returns a fresh copy per call
into a throwaway intermediate, so no internal state is exposed.
* feat(auth): add Copilot login
Add experimental GitHub Copilot device login and copilot/* provider support for users with Copilot access but no OpenAI account.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): use responses for GPT-5
Route Copilot GPT-5 models through the Responses API because gpt-5.5 is not available on /chat/completions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(copilot): honor device flow timing
* docs(copilot): add auth helper docstrings
* fix(auth): address copilot review feedback
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add sdkDefaultBaseURL map covering the 14 npm SDKs that ship a
hard-coded baseURL (groq, cerebras, mistral, xai, perplexity,
togetherai, deepinfra, cohere, v0, aihubmix, venice, merge-gateway,
openrouter, vercel gateway), so providers whose models.dev entry
omits the api field still auto-route correctly.
- Extend npmToWireProtocol so these thin OpenAI-compatible wrappers
route through fantasy's openaicompat provider.
- Add resolveTemplatedAPIURL to substitute ${VAR} placeholders for
cloudflare-workers-ai, databricks, snowflake-cortex from the env,
with friendly errors that name the missing vars.
- Wire amazon-bedrock and azure-cognitive-services aliases into the
existing native handlers; add createGoogleVertexProvider for the
google-vertex case.
- Expose kit.ResolveProviderBaseURL in the public SDK so embedders
can introspect the effective endpoint before instantiating a Kit.
- Refresh embedded_models.json from models.dev (5113 -> 5121 models;
139 providers unchanged).
- Remove deprecated GenerateWithLoopAndStreaming and TreeManager
AppendFantasyMessage / AddFantasyMessages / GetFantasyMessages to
close the SDK leakage caused by the kit.TreeManager type alias
- Switch extensionAPI method signatures to local Extension* aliases so
pkg.go.dev signatures no longer expose internal package names
- Bundle runNormalMode dependencies into a runModeDeps struct, shrinking
the runNonInteractive and runInteractive call sites from 40+ positional
args to (ctx, deps)
- Add generic subscribeTyped[E Event] helper and collapse ~30 typed OnXxx
wrappers in pkg/kit/events.go onto it (public signatures unchanged)
- Extract setupBashPipes / interpretBashExit in internal/core/bash.go to
deduplicate the buffered and streaming execution paths
- Extract resolveAutoRouteAPIKey and wrapProviderErr helpers in
internal/models/providers.go and uniformly apply them across every
createXxxProvider site
- Reimplement internal/extensions/watcher.go as a thin wrapper over the
general-purpose internal/watcher.ContentWatcher, eliminating ~130 LOC
of duplicated fsnotify logic while preserving the existing test API
- Add ctx.Err() pre-flight checks in executeRead / Write / Edit / Ls so
cancellation actually short-circuits pure file-IO tools
- replace npmToLLMProvider map with npmToWireProtocol (openai/anthropic/google)
- add createAutoRoutedGoogleProvider so @ai-sdk/google proxies work
(fixes opencode/gemini-* failing with "no LLM provider mapping")
- strip the genai-injected v1beta segment for proxies whose base URL
already carries a version (e.g. opencode's /zen/v1)
- preserve openai-compat fallback and clearer error for unroutable providers
- document auto-routing in README and providers docs; update CreateProvider godoc
- add regression tests for wire routing and version-path rewriting
Fixes#41
* feat(kit): isolate viper config per Kit instance + add NewAgent (#40)
- Give each kit.New()/NewAgent() call an isolated *viper.Viper store so
multiple Kit instances in one process no longer clobber each other's
config; runtime mutators (SetModel, SetThinkingLevel) touch only the
owning instance, making subagent spawning and multi-Kit embedding
race-free
- Thread the per-instance store through internal/config, internal/models
(ProviderConfig.ConfigStore), internal/kitsetup, and the extension
runner, with a nil -> process-global fallback so the CLI is unaffected
- Share the global store when Options.CLI != nil to preserve cobra flag
bindings (also opted in for internal/acpserver)
- Remove viperInitMu; preserve the tri-state IsSet precedence contract
and sdkDefaultMaxTokens floor
- Add ergonomic NewAgent + functional options (WithModel, WithStreaming,
Ephemeral, etc.); NewAgent defaults streaming on, opt out via
WithStreaming(false). New(ctx, *Options) behavior is unchanged
- Add config-isolation regression test and NewAgent/option coverage;
document NewAgent and per-instance isolation in README
Fixes#40
* docs(sdk): document NewAgent options and per-instance config isolation
- Add "Functional options (NewAgent)" and "Per-instance config isolation"
sections to the docs site SDK overview, with an options table and a
"when to use which" constructor comparison
- Cross-reference NewAgent from the SDK options page and correct the now
per-instance ProviderAPIKey precedence wording
- Document NewAgent + With* helpers and config isolation in pkg/kit/README
and list NewAgent/Option in the API reference
- Show the NewAgent constructor in the SDK examples getting-started snippet
* fix(kit): correct config loading and isolate ACP sessions
- Isolate each ACP session's config store instead of sharing the global
viper, preventing per-session SetModel/SetThinkingLevel races; seed the
root-command flag values (model, thinking-level, provider URL/key) so
`kit acp -m <model>` is still honored
- Run initConfig for isolated SDK stores by gating on opts.CLI instead of
v.GetString("model"), which setSDKDefaults always populates and thus
skipped .kit.yml / KIT_* loading for SDK callers
- Configure KIT_* env overrides unconditionally in initConfig so passing an
explicit config file no longer disables environment variable support
- Wrap config unmarshal/validate errors with %w to preserve the error chain
* fix(kit): make Options.Streaming a *bool to honor unset
- Change Options.Streaming from bool to *bool so a zero-valued Options no
longer forces stream=false; New only sets the key when non-nil, letting
streaming resolve through the precedence chain (env -> config -> default
true). This also fixes the CLI path, which never set the field
- Mirror the existing sampling-parameter pointer pattern instead of adding
a separate StreamingSet sentinel, keeping Options internally consistent
- Update WithStreaming/NewAgent, subagent, and ACP callers to the pointer
form; add regression tests for the nil-default and explicit opt-out paths
- Update SDK docs (README, pkg/kit/README, options page) with the ptrBool
helper and *bool semantics
* fix(kit): inherit parent provider config in subagents
- Copy the parent's effective provider/runtime config (API key, URL,
TLS, thinking level, max-tokens, samplers) onto child Options in
Kit.Subagent. After the per-instance viper isolation, the child's
isolated store only re-loaded .kit.yml / KIT_*, silently dropping
config the parent set via programmatic Options or runtime setters
like SetThinkingLevel
- Preserve the IsSet tri-state for max-tokens and samplers so per-model
defaults still apply on the child when the parent left them unset
- Add TestInheritProviderConfig covering propagation, unset keys, and
nil-safety
Removes ~600 lines of unreferenced code surfaced by deadcode + manual
audit (none of it reachable from production code paths or test setup):
- internal/models/pool.go: ProviderPool was never wired into kitsetup
or the agent; the global pool singleton had zero callers.
- internal/ui/debug_logger.go: CLIDebugLogger was unreachable; debug
routing goes through internal/tools/buffered_logger.go instead.
- internal/ui/tool_approval_input.go: tea.Model never instantiated;
approvals are handled inline in model.go.
- internal/ui/cli.go: DisplayAssistantMessage / DisplayCancellation /
GetDebugLogger had zero callers (the *WithModel variant is what
event_handler.go uses).
- internal/ui/style/enhanced.go: Style{Card,Header,Subheader,Muted,
Success,Error,Warning,Info} + Create{Separator,ProgressBar} — none
used. CreateBadge stays (used by model.go).
- internal/ui/style/themes.go: RefreshThemeRegistry — never called.
- internal/ui/block_renderer.go: With{FullWidth,MarginTop,Padding{Left,
Right},Background,Foreground,Width} — option helpers nobody calls.
- internal/ui/render/blocks.go: UserBlock, ToolBlock — replaced by
inline rendering elsewhere; the test for UserBlock was rewritten to
directly exercise HighlightFileTokens (which is what the test really
cared about).
- internal/ui/commands/commands.go: GetAllCommandNames — no callers.
- internal/ui/message_items.go: NewTextMessageItem,
NewSystemMessageItem + the entire SystemMessageItem type — model.go
uses NewStyledMessageItem instead.
- internal/prompts/loader.go: Deduplicate — the loader does dedup
internally; standalone helper was unused.
- internal/models/cache_options.go: mergeProviderOptions + its
test-only consumer.
- internal/extensions/installer.go: Installer.GetInstalledPackages —
intended for a 'kit ext list' command that was never built.
- internal/extensions/manifest.go: saveManifestToScope,
saveManifestToPath, GetGlobalManifest, GetProjectManifest,
addEntryToManifest, removeEntryFromManifest — package-level
duplicates of *Installer methods. Tests rewritten to exercise the
live Installer methods instead, which fixes a latent path-resolution
inconsistency between manifestPathForScope and Installer.manifestPath
(the former hard-coded paths, the latter respects projectGitRoot).
- internal/extensions/subagent.go: SpawnSubagent + helpers
(generateSubagentID, findKitBinary, subagentJSONOutput). The
subprocess-spawn implementation is unreachable; production code
routes through kit.Kit.Subagent (in-process). Types
(SubagentConfig/Result/Handle/etc.) and the SubagentHandle methods
remain because they are exposed to extensions via Yaegi symbols and
the Context.SpawnSubagent field.
- cmd/root.go: LoadConfigWithEnvSubstitution — one-line wrapper around
kit.LoadConfigWithEnvSubstitution with zero callers.
go test -race ./... passes.
Fantasy v0.21.0 natively includes gpt-5.5 and other newer models in
its responsesModelIDs/responsesReasoningModelIDs lists, making our
workaround unnecessary.
- Delete responses_models.go (go:linkname hack + RegisterResponsesModels)
- Delete responses_models_test.go
- Replace isResponsesAPIModel/isResponsesReasoningModel heuristics with
direct openai.IsResponsesModel/openai.IsResponsesReasoningModel calls
- Remove RegisterResponsesModels calls from registry init/reload
- Remove hack documentation from AGENTS.md
- Update all deps (fantasy v0.21.0, smithy-go, ultraviolet, etc.)
Fantasy's hardcoded responsesModelIDs list gates whether a model uses
the Responses API or Chat Completions code path. When a new model
(e.g. gpt-5.5) is added via `kit update-models` but fantasy hasn't
been updated yet, the type mismatch between *ResponsesProviderOptions
and *ProviderOptions causes a crash.
- Add isResponsesAPIModel()/isResponsesReasoningModel() helpers that
supplement fantasy's checks with prefix-based heuristics for modern
OpenAI model families (gpt-4.1+, gpt-5+, o-series, codex, chatgpt)
- Add RegisterResponsesModels() using go:linkname to append missing
model IDs from our database into fantasy's internal slices at init
time and after ReloadGlobalRegistry()
- Replace all direct openai.IsResponsesModel/IsResponsesReasoningModel
calls in providers.go with the new helpers
- Merge embedded + cached model databases instead of cache-only fallback
- Bump fantasy v0.19.0 -> v0.20.0 to match existing import usage
- Document the technique and model-family update process in AGENTS.md
- Update all Go dependencies (bubbletea v2.0.6, fantasy v0.19.0,
acp-go-sdk v0.12.0, mcp-go v0.49.0, and transitive deps)
- Replace SetSessionModel with SetSessionConfigOption to match new
acp-go-sdk Agent interface (union type with ValueId/Boolean variants)
- Add ListSessions stub returning empty list (new required method)
- Refresh embedded_models.json from models.dev/api.json
- Update ACP smoke test: add initialize handshake, session/list,
session/set_config_option, session/cancel, and fix update parsing
Adds 'none' thinking level to support OpenAI gpt-5.4 models which use
'reasoning_effort: none' instead of 'minimal'. Includes validation and
auto-adjustment when switching models with incompatible levels.
- Add ThinkingNone constant mapping to ReasoningEffortNone
- Add IsValidThinkingLevelForModel() with gpt-5.4 detection
- Add SuggestThinkingLevelFallback() for level migration
- Auto-adjust thinking level on model switch with user notification
- Update all docs to include 'none' in valid levels
Fixes#11
Add --set-default flag to 'kit auth login' to automatically set the
provider's default model after successful authentication. When no Anthropic
credentials exist but OpenAI credentials are detected, error messages
now suggest using OpenAI with the correct --model flag.
Fixes#9
- Raise --max-tokens default from 4096 to 8192.
- Auto-raise MaxTokens toward the model's catalog Limit.Output (capped at
32768) when the user hasn't set --max-tokens explicitly and no per-model
modelSettings override applied. Prevents silent 4k/8k truncation on
models that support 32k-262k output.
- Surface FinishReasonLength at turn end: the app now subscribes to
TurnEndEvent and renders a system-message banner explaining the current
cap, the model's known ceiling, and how to raise it. Previously the TUI
swallowed 'length' stops, producing 'ghost' truncations.
- Export FinishReason* constants on pkg/kit (Stop, Length, ToolCalls,
ContentFilter, Error, Other, Unknown) and fix stale comments that used
Anthropic-style strings.
- Add Kit.MaxTokens() and Kit.MaxOutputLimit() SDK accessors, backed by
Agent.GetMaxTokens() which correctly returns 0 for providers that
suppress the param (e.g. Codex OAuth).
- Tests: rightSizeMaxTokens covers 7 paths (cap, raise, preserve,
explicit flag, nil info, zero limit); handleTurnEnd covers length/
non-length/nil-sendFn and the fallback message formatter.
- Docs: update configuration.md, cli/flags.md, and kit-extensions skill
to reflect the new default and behavior.
- Rename ExtensionToolsAsFantasy -> ExtensionToolsAsLLMTools
- Rename convertKitMessagesToFantasy -> convertToLLMMessages
- Delete GetFantasyProviders, ToFantasyMessages, FromFantasyMessage
- Replace direct fantasy type usage with kit.LLM* aliases in app tests
- Scrub fantasy references from godoc comments across pkg/kit and internal
- Add systemPrompt field to GenerationParams and config structs
- On init, replace default system prompt with per-model prompt when
user hasn't explicitly set one (via flag, config, or SDK option)
- On model switch, detect per-model prompt and compose it with
AGENTS.md, skills, and date/cwd context
- Fix viper.IsSet bug: BindPFlag causes IsSet to return true for
unset flags, so compare against defaultSystemPrompt instead
- Agent.SetModel now updates stored system prompt from config
- Export LoadModelSettingsFromConfig, LoadSystemPromptValue, and
LookupModelForSettings for use by Kit.SetModel
- Add tests for prompt apply, precedence, file path, and
modelSettings override
- Add modelSettings config section for attaching generation params
(temperature, topP, topK, frequencyPenalty, presencePenalty,
maxTokens, stopSequences, thinkingLevel) to any model by
provider/model key
- Add params field to customModels definitions for inline defaults
- Change BuildProviderConfig and SetModel to use viper.IsSet so
unset params remain nil, allowing model-level defaults to apply
- Wire ApplyModelSettings into CreateProvider with priority order:
CLI flags > global config > modelSettings > customModels params
- Add GenerationParams to ModelInfo in the registry
- Update default config template with modelSettings and customModels
params examples
- Add --frequency-penalty and --presence-penalty CLI flags (0.0-2.0)
- Wire through config, viper, ProviderConfig, and fantasy agent options
- Support in config file, env vars (KIT_FREQUENCY_PENALTY), and SDK
- Pass to Ollama via options map (frequency_penalty, presence_penalty)
- Apply on both initial agent creation and runtime model swap
- ctx.Abort(): cancel current agent turn and clear queue without
injecting a new message (App.Abort + App.IsBusy methods)
- ctx.IsIdle(): check whether the agent is currently processing
- ctx.Compact(CompactConfig): trigger async context compaction with
OnComplete/OnError callbacks (App.CompactAsync method)
- ctx.SendMultimodalMessage(text, []FilePart): send text+image messages
to the agent, bridging ext.FilePart to fantasy.FilePart via RunWithFiles
- ctx.GetSessionUsage() SessionUsage: expose aggregated session token
usage and cost from the UsageTracker
New types: CompactConfig, FilePart, SessionUsage
Wired in both context setups in cmd/root.go with nil-guard defaults
in runner.go and Yaegi symbol exports in symbols.go
Move reasoning tag detection from the provider and UI layers into the agent layer. This prevents raw XML tags from leaking into text streams while ensuring structured reasoning events are emitted correctly for all callers.
- Add `baseUrl` and `apiKey` fields to CustomModelConfig (config and models packages)
- Store them on ModelInfo so they travel through the registry
- createCustomProvider resolves URL/key from model definition first,
falling back to global --provider-url / --provider-api-key
- Fix registry initialisation: call ReloadGlobalRegistry() in InitConfig()
so customModels from config are visible on startup (not just at init time)
- Include custom provider in GetLLMProviders() so custom models appear
in the /model selector
- Hide the built-in custom/custom stub from the selector when user-defined
custom models are present
- Add ValidateModelString() to ModelsRegistry for format, provider,
and model name validation with typo suggestions
- Validate model in Kit.Subagent() before expensive Kit.New() setup
- Remove silent fallback to parent model on creation failure
- Error propagates as tool result so calling agent can self-correct
- Add registry_test.go covering format, provider, and suggestion cases
Models like Qwen and DeepSeek wrap reasoning content in ... XML-like
tags within the regular content field. This was causing the reasoning
text to appear twice - once as a reasoning block and once as regular text.
Changes:
1. Provider hooks (providers.go):
- Extract reasoning from tags and emit proper reasoning events
- Use openai provider directly with custom ExtraContentFunc and
StreamExtraFunc hooks to parse thinking content
2. Stream filtering (stream.go):
- Filter out all text content between and tags at the
streaming level to prevent duplicate rendering
- Track state with inThinkTag flag across stream chunks
3. Message conversion (content.go):
- Strip any remaining tags from text content when converting
from fantasy messages
The regex patterns use string concatenation to avoid XML tag corruption:
regexp.MustCompile( + + + + + + + )
Fixes duplicate reasoning text when using custom provider with models
that wrap thinking in tags.
- Upgrade golangci-lint to v2.11.4
- Fix errcheck warnings for os.Setenv/os.Unsetenv in tests
- Use maps.Copy instead of manual loop (modernize lint)
- Add maps import for maps.Copy
Remove internal monologue comments that don't add value for readers:
- Remove lengthy explanations of type conflicts that are now resolved
- Remove 'NOTE:' and 'TODO:' comments documenting implementation history
- Remove obvious test comments that just restate what the code does
- Keep only meaningful comments that explain design intent
The code is now cleaner and easier to read without the self-referential
commentary that was useful during development but not for maintenance.
Implements automatic prompt caching to reduce API costs by 60-90% for
repeated prompts with the same context.
Architecture:
- Provider-level caching for OpenAI (PromptCacheKey)
- Message-level caching for Anthropic (avoids type conflicts)
- Model family detection enables caching regardless of provider
Key Changes:
- Add ModelInfo.Family with SupportsCaching() and CacheType() methods
- Add ProviderConfig.DisableCaching for opt-out
- Implement message-level cache control in agent (like Crush)
- Last system message gets cache control
- Last 2 messages get cache control
- Last tool gets cache control
- Auto-disable caching when thinking is enabled (type conflict avoidance)
- Add KIT_DISABLE_CACHE environment variable for global opt-out
Tested with opencode/claude-sonnet-4-6 showing cacheRead/cacheWrite
tokens in debug output, confirming 60-90% cost savings.
Closes cost optimization for multi-turn conversations.
Rename public SDK symbols to use generic LLM terminology instead of
exposing the internal dependency name (charm.land/fantasy):
Public API renames (with deprecated wrappers for backward compat):
- ConvertToFantasyMessages() → ConvertToLLMMessages()
- ConvertFromFantasyMessage() → ConvertFromLLMMessage()
- GetFantasyProviders() → GetLLMProviders()
New type alias:
- LLMFilePart = fantasy.FilePart (eliminates need for direct fantasy import)
- PromptResultWithFiles() signature now uses LLMFilePart
Internal renames (with deprecated wrappers):
- ModelsRegistry.GetFantasyProviders() → GetLLMProviders()
- TreeManager.GetFantasyMessages() → GetLLMMessages()
- TreeManager.AppendFantasyMessage() → AppendLLMMessage()
- TreeManager.AddFantasyMessages() → AddLLMMessages()
- Message.ToFantasyMessages() → ToLLMMessages()
- FromFantasyMessage() → FromLLMMessage()
- npmToFantasyProvider → npmToLLMProvider
- isProviderFantasySupported() → isProviderLLMSupported()
All internal callers migrated to new names. ~30 comments updated
to remove Fantasy references across pkg/kit/, internal/agent/,
internal/models/, internal/message/, internal/session/.
Documentation updates:
- AGENTS.md: added Public SDK rules section (no dependency leakage,
naming conventions, deprecation pattern)
- README.md: removed Fantasy references
- pkg/kit/README.md: full rewrite with current API surface
- skills/kit-sdk/SKILL.md: updated examples and type references
- www/pages/providers.md, www/pages/cli/commands.md: updated
- Remove unused modelFamily variable in createOpenAICodexProvider
- Remove dead spark handling code (spark is rejected early with error)
- Simplify buildCodexProviderOptions to only handle regular codex models
- Remove redundant comments and simplify code structure
- Net reduction: 31 lines of code
Spark models are not accessible via ChatGPT OAuth and return Cloudflare
'Forbidden' errors. Add early detection and helpful error message directing
users to regular Codex models like 'openai/gpt-5.3-codex' instead.
Different Codex model families use different API formats:
- gpt-codex-spark: uses standard ProviderOptions (not Responses API)
- gpt-codex, gpt-codex-mini: uses ResponsesProviderOptions
- Add detectCodexModelFamily() to determine model family from name
- Use standard ProviderOptions for spark models
- Use ResponsesProviderOptions for regular codex models
- Conditionally use WithUseResponsesAPI() based on model family
Note: gpt-5.3-codex-spark still gets Cloudflare forbidden error,
may need additional headers or different endpoint.
The Codex API doesn't support the max_output_tokens parameter, which was causing
"Unsupported parameter: max_output_tokens" errors.
- Add SkipMaxOutputTokens flag to ProviderResult
- Set flag when creating Codex OAuth provider
- Check flag in agent setup to skip WithMaxOutputTokens option
- This matches pi's behavior of not sending max_tokens to Codex API
- Upgrade charm.land/fantasy from v0.16.0 to v0.17.1
- Add buildCodexProviderOptions() to pass system prompt as 'instructions'
- The Codex API requires instructions as a top-level field, not as system message
- Set Store=false to prevent server-side conversation storage
- Use ResponsesProviderOptions.Instructions for system prompt
- Change base URL to /backend-api/codex for correct endpoint path
- Add browser-like User-Agent to avoid Cloudflare blocking
- Add Accept, Accept-Language, Cache-Control headers
- Match pi client headers more closely
Implements OAuth authentication for OpenAI ChatGPT Plus/Pro (Codex) similar to pi:
- Add OpenAICredentials type with OAuth and API key support
- Add OpenAI OAuth client with correct endpoints (auth.openai.com)
- Implement PKCE-based OAuth flow with local callback server on :1455
- Add login/logout/status commands for openai provider
- Support both ChatGPT/Codex OAuth tokens (chatgpt.com/backend-api) and
regular OpenAI API keys (api.openai.com)
- Extract and store ChatGPT account ID from JWT token
- Add custom HTTP transport with required Codex headers:
- chatgpt-account-id, originator, OpenAI-Beta: responses=experimental
- Update provider selection to use correct endpoint based on auth type
Usage:
kit auth login openai # OAuth with ChatGPT account
kit auth logout openai
kit auth status
The implementation follows the same patterns as the existing Anthropic OAuth
support, with automatic token refresh and secure credential storage in
~/.config/.kit/credentials.json
Models from the opencode provider (like claude-opus-4-6 and gpt-5.3-codex)
have provider overrides in the models database that specify different npm
packages than the provider's default. The code was ignoring these overrides
and routing all models through openaicompat, causing "bad request" errors.
Changes:
- Added Provider field to modelsDBModel to capture model-specific overrides
- Added ProviderNPM field to ModelInfo registry struct
- Updated autoRouteProvider() to check for model-specific provider overrides
- Fixed URL path handling for anthropic provider (strip /v1 suffix to avoid
double /v1/v1 paths when using third-party anthropic-compatible APIs)
Fixes routing for:
- opencode/claude-opus-4-6 -> @ai-sdk/anthropic
- opencode/gpt-5.3-codex -> @ai-sdk/openai
Allow users to define custom models in ~/.kit.yml under the customModels
section. These models are automatically merged into the custom provider.
Example config:
customModels:
my-model:
name: "My Custom Model"
reasoning: true
temperature: true
cost:
input: 0.002
output: 0.004
limit:
context: 128000
output: 32000
Usage:
kit --model custom/my-model "Hello"
kit --provider-url "http://localhost:8080" --model custom/my-model "Hello"
Note: When --provider-url is specified without --model, kit defaults to
custom/custom. When --provider-url is specified WITH a custom model from
config, that model is used.
Bug fixes:
- Fixed kit.New() re-loading config file and overriding CLI-specified config
- Fixed models command to reload registry for custom models
When users pass --provider-url without --model, automatically default
to custom/custom instead of the saved model preference. This lets users
point kit at any OpenAI-compatible endpoint without needing a provider/model
pair from the database.
The custom/custom model has:
- Zero cost (input/output = 0)
- 262K context window, 65K output limit
- Reasoning and temperature support
- Routes through openaicompat fantasy provider
Remove the early ValidateEnvironment gate from CreateProvider that only
checked env vars and --provider-api-key, blocking stored OAuth credentials
from working. Each provider creation function already handles its own auth
resolution with clear error messages.
Update ValidateEnvironment to also check stored Anthropic credentials so
the model selector UI correctly shows Anthropic models for OAuth users.
Add automatic token refresh in oauthTransport so long-lived ACP sessions
survive token renewals. Surface actionable auth error messages in ACP
session creation.
Fix pre-existing staticcheck SA5011 warnings in test files.
Implement 4-phase subagent system enabling LLM and extensions to spawn,
manage, and orchestrate child Kit instances for parallel task execution.
- Phase 1: SDK API with SpawnSubagent() for extensions
- Phase 2: spawn_subagent core tool for LLM usage
- Phase 3: Session hierarchy with ParentSessionID tracking
- Phase 4: Provider pooling for concurrent model access
New files:
- internal/extensions/subagent.go: SpawnSubagent implementation
- internal/core/subagent.go: Core tool definition
- internal/models/pool.go: Provider pool for concurrency
- examples/extensions/subagent-test.go: Test extension
- openspec/subagent-support.md: Design specification
Anthropic rejects requests with both temperature and top_p set.
When both are configured (typically from defaults), clear top_p
so temperature takes precedence.