* feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53)
Three additive primitives to the extension API:
- OnLLMUsage event: per-LLM-call token + cost deltas attributed to the
specific model/provider used for each round-trip. Derived from the SDK
StepFinishEvent in the extension bridge. Enables accurate budget
enforcement between calls instead of only at turn boundaries.
- ctx.SetState / GetState / DeleteState / ListState: session-scoped,
last-write-wins key-value store backed by a sidecar file
(<session>.ext-state.json) outside the conversation tree. Reads are
O(1), writes don't grow the JSONL, and the store is not duplicated on
fork. State is preserved across hot-reloads.
- Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token
deltas (input/output/cache-read/cache-write), CostDelta, and
DurationMs populated by a per-turn aggregator. Existing handlers
reading only Response/StopReason are unaffected.
Includes unit tests for the state store, LLMUsage registration,
enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path
derivation. Adds examples/extensions/usage-budget.go demoing all three
primitives together. Documents the additions in README, the docs site
(extensions overview, capabilities, examples), and the kit-extensions
and kit-sdk skill guides.
Fixes #53
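As a rough illustration of the state store described in this commit, the sketch below shows a last-write-wins key-value store and one plausible sidecar path derivation. The names `sidecarPath`, `stateStore`, and `newStateStore` are hypothetical, and the rule "strip the session file extension, then append `.ext-state.json`" is an assumption; the commit only gives the `<session>.ext-state.json` pattern.

```go
package main

import (
	"path/filepath"
	"sort"
	"strings"
	"sync"
)

// sidecarPath derives "<session>.ext-state.json" from a session file path.
// Assumption: "<session>" means the session path without its extension.
func sidecarPath(sessionPath string) string {
	base := strings.TrimSuffix(sessionPath, filepath.Ext(sessionPath))
	return base + ".ext-state.json"
}

// stateStore is a hypothetical in-memory, last-write-wins key-value store.
type stateStore struct {
	mu   sync.RWMutex
	data map[string]string
}

func newStateStore() *stateStore {
	return &stateStore{data: make(map[string]string)}
}

// Set overwrites any previous value for key (last write wins).
func (s *stateStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// Get is a single map lookup, so reads stay O(1).
func (s *stateStore) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// Delete removes key if present.
func (s *stateStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// List returns all keys in sorted order.
func (s *stateStore) List() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```

Persistence to the sidecar file is deliberately left out here; only the in-memory semantics and the path naming are sketched.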
* fix(extensions): address review feedback on state store and llmUsageMeta
- Serialize SetState/DeleteState saver invocations through a new saverMu
so overlapping atomic-rename writes can no longer race on the shared
.tmp file and persist an older snapshot after a newer one.
- LoadStateFromFile now clears the in-memory store when the sidecar is
missing or empty, matching the documented "replace … with its
contents" contract. This makes session-switching safe by preventing
keys from a prior session leaking into a new one. Tests updated to
cover both the missing-file and empty-file cases.
- llmUsageMeta now detects Anthropic OAuth credentials and returns
Cost=0, matching the comment and the existing usage_tracker behavior
for OAuth users. Mirrors the OAuth detection already used in
cmd/extension_context.go.
- Document the single-in-flight-turn assumption baked into the
per-turn aggregator, with a clear migration path (per-turn ID) in case
concurrent turns ever become a supported use case.
* fix(extensions): release saverMu on panic in state store
Extract a runSaver helper that locks saverMu and defers Unlock before
invoking the persistence callback. Without the deferred Unlock, a panic
inside the saver (e.g. disk full mid-write) would leave saverMu held
forever and deadlock the next SetState/DeleteState. Both SetState and
DeleteState now route through the helper. New
TestRunner_State_SaverPanicReleasesSaverMu reproduces the deadlock window
with a 2s deadline and proves the mutex is released after a panic.
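The lock-then-defer pattern described above can be sketched as follows. This is a minimal illustration, not the actual Kit code: the `store` type, its `saver` field, and the `callRecovering` helper are hypothetical; only the names `runSaver` and `saverMu` come from the commit message.

```go
package main

import "sync"

// store is a hypothetical stand-in for the extension state store.
type store struct {
	saverMu sync.Mutex   // serializes persistence callbacks
	saver   func() error // persistence callback; may panic
}

// runSaver locks saverMu and defers the Unlock before invoking the saver,
// so the mutex is released even if the saver panics.
func (s *store) runSaver() error {
	s.saverMu.Lock()
	defer s.saverMu.Unlock()
	if s.saver == nil {
		return nil
	}
	return s.saver()
}

// callRecovering runs runSaver and reports whether it panicked.
// It exists only so the demo can survive the panic.
func callRecovering(s *store) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = s.runSaver()
	return false
}
```

Without the `defer`, a panic between `Lock` and `Unlock` would leave `saverMu` held, and the next call to `runSaver` would block forever.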
- Add LLMToolResultOutputContentMedia alias (closes gap in tool result types)
- Add LLMToolResultContentType enum and constants (Text, Error, Media)
- Add LLMToolInfo, LLMProviderOptions, LLMProviderMetadata, LLMPrompt aliases
- Replace all fantasy.* references in hooks.go and hooks_test.go with
SDK-owned aliases, removing the charm.land/fantasy import from both
- Fix gofmt alignment in internal/extensions/symbols.go
- Update SDK skill doc with complete LLM type reference
- Infer Type="image" for image/* MIME types and Type="media" for all
other binary content so the downstream framework creates a media
content block instead of silently discarding Data bytes (#17)
- Extract shared toolOutputToResponse() helper to eliminate duplication
- Add ImageResult() and MediaResult() convenience constructors
- Add LLMToolCall and LLMToolResponse type aliases so SDK consumers
can call Tool.Run() without importing the underlying framework
- Add 6 regression tests covering image, media, and text responses
Closes #17
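The type inference rule from this change can be sketched roughly as below. The function name `inferContentType` is hypothetical, and returning "text" when there are no data bytes is an assumption; the commit only states the image and media cases.

```go
package main

import "strings"

// inferContentType picks a content block type for a tool result.
// "image" and "media" follow the rule in the commit message; the
// "text" fallback for empty data is an assumption of this sketch.
func inferContentType(mediaType string, data []byte) string {
	switch {
	case len(data) == 0:
		return "text"
	case strings.HasPrefix(strings.ToLower(mediaType), "image/"):
		return "image"
	default:
		return "media"
	}
}
```

For example, `inferContentType("audio/wav", payload)` with a non-empty payload yields "media", so the bytes end up in a media block rather than being dropped.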
Adds 'none' thinking level to support OpenAI gpt-5.4 models which use
'reasoning_effort: none' instead of 'minimal'. Includes validation and
auto-adjustment when switching models with incompatible levels.
- Add ThinkingNone constant mapping to ReasoningEffortNone
- Add IsValidThinkingLevelForModel() with gpt-5.4 detection
- Add SuggestThinkingLevelFallback() for level migration
- Auto-adjust thinking level on model switch with user notification
- Update all docs to include 'none' in valid levels
Fixes #11
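A minimal sketch of the validation and fallback idea follows. The lowercase function names are hypothetical stand-ins for IsValidThinkingLevelForModel and SuggestThinkingLevelFallback, whose real signatures are not shown here. Two assumptions are baked in: gpt-5.4 detection is a substring match on the model name, and 'none' is rejected on every other model.

```go
package main

import "strings"

const (
	thinkingNone    = "none"
	thinkingMinimal = "minimal"
)

// usesNoneLevel reports whether the model takes 'none' instead of 'minimal'.
// Assumption: a substring match on "gpt-5.4" is enough for detection.
func usesNoneLevel(model string) bool {
	return strings.Contains(strings.ToLower(model), "gpt-5.4")
}

// isValidThinkingLevel checks a level against a model. Levels other than
// 'none' and 'minimal' are treated as valid everywhere in this sketch.
func isValidThinkingLevel(level, model string) bool {
	switch level {
	case thinkingNone:
		return usesNoneLevel(model)
	case thinkingMinimal:
		return !usesNoneLevel(model)
	default:
		return true
	}
}

// suggestFallback returns the level unchanged when valid, otherwise the
// nearest equivalent for the target model.
func suggestFallback(level, model string) string {
	if isValidThinkingLevel(level, model) {
		return level
	}
	if usesNoneLevel(model) {
		return thinkingNone
	}
	return thinkingMinimal
}
```

On a model switch, the caller would compare the result of `suggestFallback` with the current level and notify the user when they differ.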
Reflect the refactor that made MCPAuthHandler an explicit, opt-in
dependency for remote MCP OAuth. Four surfaces updated:
- README.md: new 'MCP OAuth (remote MCP servers)' subsection under the
Go SDK section, outlining the three consumer patterns (nil / CLI /
custom) and linking to the full options docs.
- pkg/kit/README.md: type cheat-sheet now lists MCPAuthHandler,
DefaultMCPAuthHandler, and CLIMCPAuthHandler alongside the existing
MCPTokenStore entries.
- skills/kit-sdk/SKILL.md: Options example annotated with nil-disables-
OAuth semantics; new 'MCP OAuth Authorization' section precedes the
existing token-storage section; re-exported types list expanded.
- www/pages/sdk/options.md: Options fields table gains MCPAuthHandler
row; new top-level 'MCP OAuth Authorization' section with consumer
matrix, CLI/custom/fully-custom code samples, and a warning callout
about the OnAuthURL nil-hang footgun.
The SDK last-resort MaxTokens floor is applied in kit.New() when
Options.MaxTokens, KIT_MAX_TOKENS, .kit.yml, and per-model defaults
are all unset. It was 4096 (inherited from the old setSDKDefaults
viper default) while the CLI --max-tokens cobra default is 8192.
Bump the floor to 8192 so SDK and CLI callers start from the same
base value before rightSizeMaxTokens runs, then update README,
skills/kit-sdk/SKILL.md, and www/pages/{configuration,sdk/options}.md
to match.
Previously setSDKDefaults() registered viper.SetDefault for max-tokens,
temperature, top-p, top-k, frequency/presence-penalty, and thinking-level.
viper.SetDefault makes IsSet() return true, which silently suppressed
per-model defaults (ApplyModelSettings) and automatic right-sizing
(rightSizeMaxTokens) for every SDK-created Kit — and for CLI runs too,
since cmd/root.go routes through kit.New. Effective max-tokens for
claude-sonnet-4-5 was pinned at 4096 instead of 32768.
- Drop SetDefault for all IsSet-sensitive keys; keep only model,
system-prompt, stream, num-gpu-layers, main-gpu.
- Apply a 4096 max-tokens floor directly on the *models.ProviderConfig
struct in kit.New() when nothing else resolved a value. Keeps
viper.IsSet("max-tokens") == false so rightSizeMaxTokens and
per-model maxTokens overrides still fire.
- Update Options.MaxTokens / ThinkingLevel godoc to describe the real
precedence chain.
- Strengthen tests: add Temperature subtest; add
TestNewPreservesIsSetSemantics regression covering all seven keys;
split TestNewWithProviderOptions into three subtests including
Options-beats-viper-state and ProviderURL propagation; add
resetViper helper so subtests don't bleed state.
- Document the new SDK fields (MaxTokens, ThinkingLevel, Temperature,
TopP, TopK, FrequencyPenalty, PresencePenalty, ProviderAPIKey,
ProviderURL, TLSSkipVerify) in README, skills/kit-sdk, and the www
configuration / sdk/options / sdk/overview pages, including a
dedicated precedence table.
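The IsSet interaction described in this commit can be illustrated without the real configuration library. The `config` type below is a hypothetical stand-in whose `IsSet` counts registered defaults as set, mirroring the behavior described above; `effectiveMaxTokens` is likewise an illustrative name, not the actual resolution code.

```go
package main

// config is a hypothetical stand-in for the settings layer.
type config struct {
	explicit map[string]int // values the user actually provided
	defaults map[string]int // values registered as defaults
}

// IsSet treats a registered default as "set", which is the behavior
// that suppressed per-model defaults.
func (c *config) IsSet(key string) bool {
	if _, ok := c.explicit[key]; ok {
		return true
	}
	_, ok := c.defaults[key]
	return ok
}

// effectiveMaxTokens uses the per-model default only when nothing is set,
// then falls back to a last-resort floor.
func effectiveMaxTokens(c *config, perModel, floor int) int {
	if c.IsSet("max-tokens") {
		if v, ok := c.explicit["max-tokens"]; ok {
			return v
		}
		return c.defaults["max-tokens"]
	}
	if perModel > 0 {
		return perModel
	}
	return floor
}
```

With a registered default of 4096 the per-model value of 32768 is never reached; dropping the default and applying the floor outside the settings layer restores it.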
- Raise --max-tokens default from 4096 to 8192.
- Auto-raise MaxTokens toward the model's catalog Limit.Output (capped at
32768) when the user hasn't set --max-tokens explicitly and no per-model
modelSettings override applied. Prevents silent 4k/8k truncation on
models that support 32k-262k output.
- Surface FinishReasonLength at turn end: the app now subscribes to
TurnEndEvent and renders a system-message banner explaining the current
cap, the model's known ceiling, and how to raise it. Previously the TUI
swallowed 'length' stops, producing 'ghost' truncations.
- Export FinishReason* constants on pkg/kit (Stop, Length, ToolCalls,
ContentFilter, Error, Other, Unknown) and fix stale comments that used
Anthropic-style strings.
- Add Kit.MaxTokens() and Kit.MaxOutputLimit() SDK accessors, backed by
Agent.GetMaxTokens() which correctly returns 0 for providers that
suppress the param (e.g. Codex OAuth).
- Tests: rightSizeMaxTokens covers 7 paths (cap, raise, preserve,
explicit flag, nil info, zero limit); handleTurnEnd covers length/
non-length/nil-sendFn and the fallback message formatter.
- Docs: update configuration.md, cli/flags.md, and kit-extensions skill
to reflect the new default and behavior.
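The auto-raise rule can be sketched as a small pure function. The names `rightSize` and `modelInfo` are hypothetical, and the choice never to lower the current value when the catalog limit is smaller is an assumption of this sketch; the commit only describes raising toward Limit.Output with a 32768 cap.

```go
package main

// maxAutoOutput is the cap on automatic raising stated in the commit.
const maxAutoOutput = 32768

// modelInfo is a hypothetical stand-in for a model catalog entry.
type modelInfo struct {
	LimitOutput int // the model's maximum output tokens; 0 means unknown
}

// rightSize returns the max-tokens value to use. It leaves the current
// value alone when the user set it explicitly or when the catalog has no
// usable limit, and otherwise raises it toward the limit, capped at
// maxAutoOutput. Assumption: the value is never lowered.
func rightSize(current int, explicit bool, info *modelInfo) int {
	if explicit || info == nil || info.LimitOutput <= 0 {
		return current
	}
	target := info.LimitOutput
	if target > maxAutoOutput {
		target = maxAutoOutput
	}
	if target > current {
		return target
	}
	return current
}
```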
- Add InProcessServer field to MCPServerConfig (json:"-", never serialized)
- Add "inprocess" transport type to config, validation, and connection pool
- Add createInProcessClient() using mcp-go client.NewInProcessClient()
- Add Kit.AddInProcessMCPServer() convenience method
- Add Options.InProcessMCPServers for init-time registration
- Export MCPServer type alias (= server.MCPServer) in pkg/kit/types.go
- Add 8 tests covering config, pool, tool manager, and edge cases
- Update SDK README, kit-sdk skill, and www docs
- Add ToolOutput struct, TextResult/ErrorResult helpers, and
ToolCallIDFromContext so SDK consumers can create custom tools
without importing charm.land/fantasy
- Add NewTool (sequential) and NewParallelTool (concurrent) generic
constructors with automatic JSON schema generation from struct tags
- Remove dead UpdateUsageFromResponse method and fantasy import from
internal/ui/cli.go
- Update SDK skill, README, and www/ docs with custom tool examples
and corrected hook signatures
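To give a feel for the generic constructor idea, here is a reduced sketch that decodes JSON arguments into a typed input struct before calling the handler. The `tool` type, `newTool`, and `echoInput` are hypothetical and much simpler than the real NewTool; in particular, JSON schema generation from struct tags is not shown.

```go
package main

import "encoding/json"

// tool is a hypothetical, reduced tool representation.
type tool struct {
	Name string
	Run  func(rawArgs []byte) (string, error)
}

// echoInput is an example input type; its json tags name the arguments.
type echoInput struct {
	Text string `json:"text"`
}

// newTool wraps a typed handler so callers can pass raw JSON arguments.
// The type parameter In fixes the argument shape at compile time.
func newTool[In any](name string, fn func(In) (string, error)) tool {
	return tool{
		Name: name,
		Run: func(rawArgs []byte) (string, error) {
			var in In
			if err := json.Unmarshal(rawArgs, &in); err != nil {
				return "", err
			}
			return fn(in)
		},
	}
}
```

A caller would write something like `newTool("echo", func(in echoInput) (string, error) { return in.Text, nil })` and never touch the underlying framework types.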
Add SessionManager interface to allow pluggable session storage backends.
This enables users to implement custom session managers for databases,
cloud storage, or other persistence mechanisms instead of the default
JSONL file-based TreeManager.
Changes:
- Add SessionManager interface with methods for message storage,
tree navigation, compaction, and extension data
- Add treeManagerAdapter to wrap existing TreeManager for backward compatibility
- Update Kit struct to use SessionManager interface instead of concrete type
- Add SessionManager option to Options struct
- Update all session-related methods to use interface
- Add documentation for custom SessionManager usage
The default behavior is preserved: when no SessionManager is provided,
Kit automatically uses the TreeManager via the adapter.
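The adapter-plus-fallback arrangement can be sketched as below. The interface here is deliberately reduced to two hypothetical methods (the real SessionManager covers message storage, tree navigation, compaction, and extension data), and `treeManager`, `memoryManager`, `options`, and `resolveSessionManager` are illustrative stand-ins.

```go
package main

// SessionManager is a reduced, hypothetical version of the interface.
type SessionManager interface {
	AppendMessage(role, content string) error
	Messages() []string
}

// treeManager stands in for the concrete file-backed default.
type treeManager struct{ entries []string }

func (t *treeManager) add(entry string) { t.entries = append(t.entries, entry) }

// treeManagerAdapter wraps the concrete type so it satisfies the interface.
type treeManagerAdapter struct{ inner *treeManager }

func (a *treeManagerAdapter) AppendMessage(role, content string) error {
	a.inner.add(role + ": " + content)
	return nil
}

func (a *treeManagerAdapter) Messages() []string {
	return append([]string(nil), a.inner.entries...)
}

// memoryManager is an example custom backend.
type memoryManager struct{ msgs []string }

func (m *memoryManager) AppendMessage(role, content string) error {
	m.msgs = append(m.msgs, role+": "+content)
	return nil
}

func (m *memoryManager) Messages() []string { return append([]string(nil), m.msgs...) }

// options mirrors the idea of an optional SessionManager field.
type options struct{ SessionManager SessionManager }

// resolveSessionManager falls back to the adapter when none is supplied.
func resolveSessionManager(o options) SessionManager {
	if o.SessionManager != nil {
		return o.SessionManager
	}
	return &treeManagerAdapter{inner: &treeManager{}}
}
```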
Add two new Options fields for programmatic SDK usage:
- SkipConfig: Skip .kit.yml file loading while still using viper defaults
and environment variables. Useful for fully programmatic configuration.
- DisableCoreTools: Allow creating agents with 0 tools (chat-only mode) or
with only custom tools. When true and Tools is empty, no tools are loaded.
When combined with custom Tools, only those tools are loaded.
Updates documentation in README, pkg/kit/README, skills/kit-sdk/SKILL,
and www/pages/sdk/options.
- Refactor go-edit-lint to collect edited .go files during the agent
turn via OnToolResult, then run gopls + golangci-lint once in
OnAgentEnd instead of after every individual edit/write call
- Use ctx.SendMessage() to inject diagnostics as a follow-up prompt
when issues are found, replacing the old tool-result rewriting
- Show a green 'all clean' block when no issues are detected
- Fix StopReason docs in skills/kit-extensions/SKILL.md: the value is
'error' on failure, 'completed' when the LLM returns empty, or the
raw provider value (e.g. 'stop', 'end_turn') passed through — not
the previously documented 'completed'/'cancelled'/'error' enum
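The collect-then-lint-once flow can be sketched with a small deduplicating collector. The `editCollector` type and its `record` and `drain` methods are hypothetical, as are the tool names "edit" and "write" used for filtering; the real extension wires this into OnToolResult and OnAgentEnd.

```go
package main

import (
	"sort"
	"strings"
	"sync"
)

// editCollector gathers edited .go files during one agent turn.
type editCollector struct {
	mu    sync.Mutex
	files map[string]struct{}
}

// record would be called from a tool-result handler. It keeps only .go
// paths touched by edit or write calls (tool names are assumptions).
func (c *editCollector) record(toolName, path string) {
	if toolName != "edit" && toolName != "write" {
		return
	}
	if !strings.HasSuffix(path, ".go") {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.files == nil {
		c.files = make(map[string]struct{})
	}
	c.files[path] = struct{}{}
}

// drain would be called once at agent end. It returns the sorted,
// deduplicated file list and resets the collector for the next turn.
func (c *editCollector) drain() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make([]string, 0, len(c.files))
	for f := range c.files {
		out = append(out, f)
	}
	sort.Strings(out)
	c.files = nil
	return out
}
```

The linters then run a single time over the drained list, instead of once per edit.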
Remove charm.land/fantasy from the public API surface of pkg/kit by
replacing the four type aliases with concrete Kit-owned structs:
- LLMMessage {Role LLMMessageRole, Content string}
- LLMUsage {InputTokens, OutputTokens, TotalTokens, ...}
- LLMResponse {Content, FinishReason, Usage}
- LLMFilePart {Filename, Data []byte, MediaType}
Add LLMMessageRole type with user/assistant/system/tool constants.
Introduce pkg/kit/llm_convert.go as the single boundary layer where
Kit types convert to/from fantasy types internally. All callers in
pkg/kit, pkg/kit/compaction.go, pkg/kit/extensions_bridge.go, and
internal/app/app.go cross through this layer.
ContextPrepareHook.Messages and ContextPrepareResult.Messages change
from []fantasy.Message to []LLMMessage. extensions_bridge.go drops
its fantasy and strings imports entirely.
internal/app/app_test.go switches &fantasy.Usage{} to &kit.LLMUsage{}.
Add seven new tests in types_test.go covering concrete construction,
role constants, JSON snake_case tags, and round-trip conversion.
Rename public SDK symbols to use generic LLM terminology instead of
exposing the internal dependency name (charm.land/fantasy):
Public API renames (with deprecated wrappers for backward compat):
- ConvertToFantasyMessages() → ConvertToLLMMessages()
- ConvertFromFantasyMessage() → ConvertFromLLMMessage()
- GetFantasyProviders() → GetLLMProviders()
New type alias:
- LLMFilePart = fantasy.FilePart (eliminates need for direct fantasy import)
- PromptResultWithFiles() signature now uses LLMFilePart
Internal renames (with deprecated wrappers):
- ModelsRegistry.GetFantasyProviders() → GetLLMProviders()
- TreeManager.GetFantasyMessages() → GetLLMMessages()
- TreeManager.AppendFantasyMessage() → AppendLLMMessage()
- TreeManager.AddFantasyMessages() → AddLLMMessages()
- Message.ToFantasyMessages() → ToLLMMessages()
- FromFantasyMessage() → FromLLMMessage()
- npmToFantasyProvider → npmToLLMProvider
- isProviderFantasySupported() → isProviderLLMSupported()
All internal callers migrated to new names. ~30 comments updated
to remove Fantasy references across pkg/kit/, internal/agent/,
internal/models/, internal/message/, internal/session/.
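The rename-with-deprecated-wrapper pattern looks roughly like this in Go. The function names follow the rename list above, but the `Message` and `LLMMessage` types and both signatures are simplified assumptions for illustration, not the real SDK API.

```go
package main

// Message is a hypothetical stored-message type.
type Message struct {
	Role    string
	Content string
}

// LLMMessage is a hypothetical provider-facing message type.
type LLMMessage struct {
	Role    string
	Content string
}

// ConvertToLLMMessages is the new, generically named entry point.
func ConvertToLLMMessages(msgs []Message) []LLMMessage {
	out := make([]LLMMessage, 0, len(msgs))
	for _, m := range msgs {
		out = append(out, LLMMessage{Role: m.Role, Content: m.Content})
	}
	return out
}

// ConvertToFantasyMessages is kept so existing callers still compile.
//
// Deprecated: Use ConvertToLLMMessages instead.
func ConvertToFantasyMessages(msgs []Message) []LLMMessage {
	return ConvertToLLMMessages(msgs)
}
```

The `Deprecated:` paragraph in the doc comment is the standard Go convention that editors and linters pick up to flag remaining callers.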
Documentation updates:
- AGENTS.md: added Public SDK rules section (no dependency leakage,
naming conventions, deprecation pattern)
- README.md: removed Fantasy references
- pkg/kit/README.md: full rewrite with current API surface
- skills/kit-sdk/SKILL.md: updated examples and type references
- www/pages/providers.md, www/pages/cli/commands.md: updated
- Add StepUsageEvent and SteerConsumedEvent to event types table
- Add new Extension API section documenting kit.Extensions() sub-API
- Add extension_api.go to Key Files reference list
- Fix Close() error handling in README SDK example
These methods have been deprecated since the narrow-accessor and event-
subscriber APIs were introduced. No callers exist in this repository.
- pkg/kit/kit.go: remove GetExtRunner(), GetBufferedLogger(), GetAgent(),
and PromptWithCallbacks(); update Subscribe() doc comment which still
mentioned PromptWithCallbacks; tighten section header comment
- pkg/kit/README.md: replace PromptWithCallbacks example with the
OnToolCall/OnToolResult/OnStreaming subscriber pattern; remove method
from the quick-reference list
- README.md: same example migration in the SDK section
- www/pages/sdk/callbacks.md: remove the PromptWithCallbacks section
entirely; the event-based monitoring section that followed it is now
the lead content
- www/pages/sdk/overview.md: remove PromptWithCallbacks row from the
prompt-variant table
- skills/kit-sdk/SKILL.md: remove the deprecated legacy callback snippet
Move the extension testing package from internal/extensions/test to
pkg/extensions/test to make it publicly importable by external extension
authors.
Changes:
- Moved test package files to pkg/extensions/test/
- Updated all imports from internal/ to pkg/ path:
- README.md
- examples/extensions/tool-logger_test.go
- examples/extensions/extension_test_template.go
- skills/kit-extensions/SKILL.md
- www/pages/extensions/testing.md
- pkg/extensions/test/README.md
- pkg/extensions/test/harness.go
The test package is now available for external import as:
github.com/mark3labs/kit/pkg/extensions/test
All tests pass with race detector.
Add missing PromptMultiSelect example to Interactive Prompts section.
Replace the minimal subprocess pattern with comprehensive SpawnSubagent
documentation including blocking/background modes, all SubagentConfig
fields, SubagentResult fields, SubagentEvent types, and handle methods.
Add Themes section to Context API reference with RegisterTheme,
SetTheme, ListThemes examples and ThemeColorConfig field reference.
Add Custom Theme with Slash Command pattern to Common Patterns.
Remove mistakenly committed .agents/skills copy.
Add comprehensive testing documentation to the kit-extensions skill:
- Add code example showing basic test structure with LoadFile(), Emit(), and assertions
- Document key testing patterns (LoadFile vs LoadString, event emission, assertions)
- List 25+ assertion helpers available in the test package
- Reference tool-logger_test.go as complete example with 14 test cases
- Add link to internal/extensions/test/ in Key Files section
- Maintain existing CLI testing commands section
The skill now provides complete guidance for testing extensions alongside development.