Compare commits

32 Commits

Author SHA1 Message Date
Ed Zynda 4ba9d6fab3 feat(events): mirror Fantasy tool input streaming callbacks as Kit events
- Add ToolCallStartEvent, ToolCallDeltaEvent, ToolCallEndEvent to SDK
- Wire Fantasy OnToolInputStart/Delta/End through agent to EventBus
- Add typed convenience subscribers: OnToolCallStart/Delta/End on Kit
- Bridge new events to TUI via ToolCallInputStart/Delta/End app events
- Extend extension system with OnToolCallInputStart/Delta/End handlers
- Add extension event types, API methods, loader wiring, Yaegi symbols
- Update docs: README, SDK skill, extensions skill, www/sdk, www/extensions

Closes #16
2026-04-21 23:28:13 +03:00
Ed Zynda aec0e7cc01 docs: document noOAuth MCP server config field
- Add noOAuth to MCP server fields table in www/pages/configuration.md
- Add pubmed example with noOAuth in README and www config docs
2026-04-21 22:44:27 +03:00
Ed Zynda bac04636bf feat(config): add noOAuth flag to skip OAuth on public MCP servers
- Add NoOAuth field to MCPServerConfig with JSON/YAML support
- Guard OAuth error handling and transport setup with the new flag
- Prevents failed dynamic client registration on servers like PubMed
  that do not support OAuth
2026-04-21 22:24:10 +03:00
Ed Zynda 5f851fd08e fix(ui): require double ctrl+c to quit, matching double-esc pattern
- First ctrl+c clears input and arms quit flag with 3s timeout
- Second ctrl+c within timeout window actually quits
- Show '⚠ Press Ctrl+C again to quit' warning after first press
- Empty input no longer quits immediately on single ctrl+c
- Prompt/overlay states: ctrl+c cancels dialog, re-dispatches to
  main handler for double-press tracking instead of quitting
- Update placeholder, help text, and tests to match new behavior
2026-04-21 22:05:13 +03:00
Ed Zynda f8371836d8 fix(cmd): fix character encoding in OAuth success page
Add charset=utf-8 to Content-Type header and use HTML entity
&#10003; instead of raw Unicode checkmark to prevent garbled
text display in browsers.

Fixes #9
2026-04-21 21:19:51 +03:00
Ed Zynda 74f00244be fix(ui): wrap reasoning blocks to terminal width to prevent clipping
- wrap thinking text in StreamComponent and render.ReasoningBlock
- plumb width through renderer and streaming item paths
- keeps style consistent with user/assistant blocks and avoids cut-off lines
2026-04-21 20:42:53 +03:00
Ed Zynda b5d7fd4f3e update docs 2026-04-21 20:33:32 +03:00
Ed Zynda 5857d40978 cleanup 2026-04-21 20:27:32 +03:00
Ed Zynda 3ff701054a fix(models): add gpt-5.4 reasoning level support with auto-adjustment
Adds 'none' thinking level to support OpenAI gpt-5.4 models which use
'reasoning_effort: none' instead of 'minimal'. Includes validation and
auto-adjustment when switching models with incompatible levels.

- Add ThinkingNone constant mapping to ReasoningEffortNone
- Add IsValidThinkingLevelForModel() with gpt-5.4 detection
- Add SuggestThinkingLevelFallback() for level migration
- Auto-adjust thinking level on model switch with user notification
- Update all docs to include 'none' in valid levels

Fixes #11
2026-04-21 20:19:00 +03:00
Ed Zynda c1dee3ceba feat(cmd): add --set-default flag and improve auth error messages
Add --set-default flag to 'kit auth login' to automatically set the
provider's default model after successful authentication. When no Anthropic
credentials exist but OpenAI credentials are detected, error messages
now suggest using OpenAI with the correct --model flag.

Fixes #9
2026-04-21 19:52:06 +03:00
Ed Zynda 2d9783a44d fix(ui): make ctrl+c clear input before quitting
Change Ctrl+C behavior to match other terminal AI tools (claude, codex, pi):
- First Ctrl+C clears the current input when text is present
- Second Ctrl+C (within 3 seconds) quits the application
- Ctrl+C on empty input quits immediately
- 3-second auto-reset timer clears the 'pressed once' state
- Flag also resets after message submission

Updates placeholder text and help message to reflect new behavior.

Fixes #13
2026-04-21 19:32:48 +03:00
Ed Zynda 88dd216e15 fix(session): prevent circular parent references in tree session
Add defensive validation to detect and prevent cycles in the session tree
parent chain that could occur after compaction or file corruption.

- Add tree_validation.go with cycle detection and parent chain validation
- Validate parent chain before appending messages (AppendMessage)
- Validate firstKeptEntryID exists in AppendCompaction
- Add depth limit and cycle detection to buildTreeNode to prevent infinite recursion
- Log diagnostics on session open to detect existing cycles
- Add tests for cycle detection and graceful handling
2026-04-21 16:24:38 +03:00
Ed Zynda 9e5806ade8 fix(subagent): remove biased model example from tool schema
- Remove vendor-specific model example that could bias LLM selection
- Add minimum recommended timeout guidance to subagent schema
2026-04-21 11:28:32 +03:00
Ed Zynda 50f586ec8f chore(models): update embedded model database from models.dev
Update internal/models/embedded_models.json with the latest snapshot
from https://models.dev/api.json.

- Providers: 111 → 115 (+4)
- Models: 4,191 → 4,259 (+68)
2026-04-21 10:38:23 +03:00
Ed Zynda 8a8e684dff docs(sdk): document MCPAuthHandler and OAuth opt-in behavior
Reflect the refactor that made MCPAuthHandler an explicit, opt-in
dependency for remote MCP OAuth. Four surfaces updated:

- README.md: new 'MCP OAuth (remote MCP servers)' subsection under the
  Go SDK section, outlining the three consumer patterns (nil / CLI /
  custom) and linking to the full options docs.
- pkg/kit/README.md: type cheat-sheet now lists MCPAuthHandler,
  DefaultMCPAuthHandler, and CLIMCPAuthHandler alongside the existing
  MCPTokenStore entries.
- skills/kit-sdk/SKILL.md: Options example annotated with nil-disables-
  OAuth semantics; new 'MCP OAuth Authorization' section precedes the
  existing token-storage section; re-exported types list expanded.
- www/pages/sdk/options.md: Options fields table gains MCPAuthHandler
  row; new top-level 'MCP OAuth Authorization' section with consumer
  matrix, CLI/custom/fully-custom code samples, and a warning callout
  about the OnAuthURL nil-hang footgun.
2026-04-17 15:30:10 +03:00
Ed Zynda 7ef99ac60f refactor(sdk): remove UX policy from MCP OAuth handler
Strip user-facing I/O out of the SDK's OAuth surface so library, daemon,
and web-app embedders are not surprised by port binds or browser opens.

- DefaultMCPAuthHandler no longer calls openBrowser; it exposes an
  OnAuthURL(serverName, authURL) hook and performs no presentation I/O.
- kit.New no longer auto-constructs a default handler when
  Options.MCPAuthHandler is nil. OAuth is opt-in; remote MCP servers
  requiring authorization fail with a clear error if no handler is set.
- CLIMCPAuthHandler owns the CLI policy (browser open + stderr prints)
  by wiring an OnAuthURL closure on the inner DefaultMCPAuthHandler.
- openBrowser is now unexported and colocated with its sole caller; no
  new exported helper is added to the SDK surface.

BREAKING CHANGE: SDK consumers relying on implicit OAuth with a nil
MCPAuthHandler must now pass kit.NewCLIMCPAuthHandler() (or a custom
implementation) explicitly. The kit CLI is unaffected — cmd/root.go
already constructs the handler explicitly.
2026-04-17 15:26:35 +03:00
Ed Zynda a67f514560 chore(models): refresh embedded models.dev database
- update internal/models/embedded_models.json from https://models.dev/api.json
- 110 → 111 providers, 4172 → 4191 models
2026-04-17 12:19:21 +03:00
Ed Zynda b6bb35cb71 Merge pull request #7 from mark3labs/feat/sdk-options-overrides
feat(sdk): expose generation and provider params on Options
2026-04-17 12:15:47 +03:00
Ed Zynda 4e82fac442 fix(fileutil): decouple TestDetectMediaType from system MIME db
TestDetectMediaType/.go fails on CI images (Ubuntu mime-support) where
/etc/mime.types registers '.go → text/x-go', because mime.TypeByExtension
reads those files at init. The test intended to exercise the 'unknown
extension falls through to text/plain' branch but used a real extension,
making the assertion environment-dependent.

Replace '.go' with '.kitsyntheticext', an invented extension that no
system MIME database registers. The fallback path is now exercised
deterministically on any host.
2026-04-17 12:13:28 +03:00
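The environment dependence described in the commit above comes from `mime.TypeByExtension` consulting the host's MIME tables. A minimal sketch of the fallback branch, assuming a hypothetical `detectMediaType` that is simpler than the real function under test:

```go
package main

import (
	"fmt"
	"mime"
	"path/filepath"
)

// detectMediaType is a hypothetical stand-in for the function under test.
// It falls back to text/plain when the extension is unknown to the MIME tables.
func detectMediaType(path string) string {
	if t := mime.TypeByExtension(filepath.Ext(path)); t != "" {
		return t
	}
	return "text/plain"
}

func main() {
	// An invented extension exercises the fallback on any host.
	fmt.Println(detectMediaType("notes.kitsyntheticext"))

	// A real extension such as .go may or may not be registered,
	// depending on the system MIME database.
	fmt.Println(detectMediaType("main.go"))
}
```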
Ed Zynda 5ec2217b0f docs(sdk): document global viper state leakage in New and Options
The SDK applies Options by calling viper.Set on viper's process-global
store, which means two Kits constructed in the same process are not
isolated from each other: the second New overwrites the first's keys,
and downstream readers (SetModel, GetThinkingLevel, BuildProviderConfig)
observe the most recent value.

- Add a 'Global viper state warning' block to the Options godoc
  explaining the leak, the zero-value-does-not-clear gotcha, and
  pointing at viper.Reset() as the migration workaround.
- Add a matching warning to the New godoc so consumers discover the
  constraint from either entry point.
- Detach the viperInitMu godoc (previously lodged inside New's comment
  block) and clarify that the mutex only guards the construction
  window, not instance isolation.
- Add a TODO noting the proper fix: refactor to a per-call viper.New()
  instance so each Kit owns its own config store.
2026-04-17 12:09:13 +03:00
Ed Zynda 8a851723ba style(sdk): gofmt trailing newlines in kit_test.go 2026-04-17 12:07:54 +03:00
Ed Zynda 53b628c5f8 fix(sdk): map hyphenated config keys to KIT_* env vars
- InitConfig now installs a viper env key replacer so keys like
  "max-tokens" bind to KIT_MAX_TOKENS under AutomaticEnv; previously
  hyphenated keys silently missed their documented env overrides.
- Simplify TestNewPreservesIsSetSemantics: with SkipConfig: true no env
  bindings are registered, so the os.Getenv guard and upper() helper
  were dead weight. Remove both and drop the unused helper.
2026-04-17 12:07:29 +03:00
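The key-to-variable mapping from the commit above can be shown with the standard library alone. In viper this kind of mapping is normally installed with `SetEnvKeyReplacer(strings.NewReplacer("-", "_"))`; whether Kit uses exactly that replacer is an assumption, and `envName` below is a hypothetical helper that only demonstrates the resulting names.

```go
package main

import (
	"fmt"
	"strings"
)

// envName derives the environment variable name for a config key by
// replacing hyphens with underscores, upper-casing, and adding the prefix.
func envName(prefix, key string) string {
	replacer := strings.NewReplacer("-", "_")
	return prefix + "_" + strings.ToUpper(replacer.Replace(key))
}

func main() {
	fmt.Println(envName("KIT", "max-tokens"))     // KIT_MAX_TOKENS
	fmt.Println(envName("KIT", "thinking-level")) // KIT_THINKING_LEVEL
}
```

Without the replacement step, a lookup for `max-tokens` would search for `KIT_MAX-TOKENS`, which is why the hyphenated keys missed their overrides.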
Ed Zynda e1c94cb362 fix(sdk): align SDK max-tokens floor with CLI default (4096 → 8192)
The SDK last-resort MaxTokens floor is applied in kit.New() when
Options.MaxTokens, KIT_MAX_TOKENS, .kit.yml, and per-model defaults
are all unset. It was 4096 (inherited from the old setSDKDefaults
viper default) while the CLI --max-tokens cobra default is 8192.

Bump the floor to 8192 so SDK and CLI callers start from the same
base value before rightSizeMaxTokens runs, then update README,
skills/kit-sdk/SKILL.md, and www/pages/{configuration,sdk/options}.md
to match.
2026-04-17 11:59:49 +03:00
Ed Zynda ecf95b52e1 fix(sdk): preserve IsSet semantics for generation param overrides
Previously setSDKDefaults() registered viper.SetDefault for max-tokens,
temperature, top-p, top-k, frequency/presence-penalty, and thinking-level.
viper.SetDefault makes IsSet() return true, which silently suppressed
per-model defaults (ApplyModelSettings) and automatic right-sizing
(rightSizeMaxTokens) for every SDK-created Kit — and for CLI runs too,
since cmd/root.go routes through kit.New. Effective max-tokens for
claude-sonnet-4-5 was pinned at 4096 instead of 32768.

- Drop SetDefault for all IsSet-sensitive keys; keep only model,
  system-prompt, stream, num-gpu-layers, main-gpu.
- Apply a 4096 max-tokens floor directly on the *models.ProviderConfig
  struct in kit.New() when nothing else resolved a value. Keeps
  viper.IsSet("max-tokens") == false so rightSizeMaxTokens and
  per-model maxTokens overrides still fire.
- Update Options.MaxTokens / ThinkingLevel godoc to describe the real
  precedence chain.
- Strengthen tests: add Temperature subtest; add
  TestNewPreservesIsSetSemantics regression covering all seven keys;
  split TestNewWithProviderOptions into three subtests including
  Options-beats-viper-state and ProviderURL propagation; add
  resetViper helper so subtests don't bleed state.
- Document the new SDK fields (MaxTokens, ThinkingLevel, Temperature,
  TopP, TopK, FrequencyPenalty, PresencePenalty, ProviderAPIKey,
  ProviderURL, TLSSkipVerify) in README, skills/kit-sdk, and the www
  configuration / sdk/options / sdk/overview pages, including a
  dedicated precedence table.
2026-04-17 11:50:45 +03:00
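The suppression described in the commit above can be reproduced with a toy config store. This is not viper; `store`, `IsSet`, and `effectiveMaxTokens` are hypothetical and only model the one property the commit relies on, namely that a registered default makes `IsSet` report true.

```go
package main

import "fmt"

// store is a toy config store with separate default and explicit layers.
type store struct {
	defaults map[string]int
	values   map[string]int
}

// IsSet reports true for explicit values and for registered defaults,
// mirroring the behavior the commit message attributes to viper.
func (s *store) IsSet(key string) bool {
	_, inValues := s.values[key]
	_, inDefaults := s.defaults[key]
	return inValues || inDefaults
}

// effectiveMaxTokens applies the per-model default only when nothing is set.
func effectiveMaxTokens(s *store, perModel int) int {
	if !s.IsSet("max-tokens") {
		return perModel
	}
	if v, ok := s.values["max-tokens"]; ok {
		return v
	}
	return s.defaults["max-tokens"]
}

func main() {
	withDefault := &store{defaults: map[string]int{"max-tokens": 4096}}
	withoutDefault := &store{}

	fmt.Println(effectiveMaxTokens(withDefault, 32768))    // 4096: default hides the per-model value
	fmt.Println(effectiveMaxTokens(withoutDefault, 32768)) // 32768: per-model value applies
}
```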
Ed Zynda 0641c92acc feat(sdk): expose generation and provider params on Options
Adds programmatic overrides on kit.Options for the model/provider knobs
that were previously only reachable through viper.Set() — letting SDK
consumers (web apps, services, embedded agents) configure kit fully
in-code without polluting global viper state or shipping .kit.yml.

Generation parameters:
  - MaxTokens         int      (max output tokens per response)
  - ThinkingLevel     string   (off/low/medium/high)
  - Temperature       *float32
  - TopP              *float32
  - TopK              *int32
  - FrequencyPenalty  *float32
  - PresencePenalty   *float32

Sampling params use pointer types so explicit 0 is distinguishable from
unset; nil leaves provider/per-model defaults in place.

Provider configuration:
  - ProviderAPIKey    string
  - ProviderURL       string
  - TLSSkipVerify     bool

Implementation just pushes Options values into viper inside New(),
so all existing downstream code (BuildProviderConfig, SetModel,
modelSettings lookups, runtime model switching) picks them up
uniformly without any new code paths. Tests added for MaxTokens,
ThinkingLevel, and ProviderAPIKey.
2026-04-17 11:24:00 +03:00
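The pointer-typed sampling fields in the commit above exist so that an explicit zero is not mistaken for "unset". A minimal sketch of that distinction; `ptr` and `resolveTemperature` are hypothetical helpers for illustration, not SDK functions.

```go
package main

import "fmt"

// ptr returns a pointer to v, which is convenient for literal values.
func ptr[T any](v T) *T { return &v }

// resolveTemperature returns the override when one was supplied,
// otherwise the provider default. A nil override means "leave alone".
func resolveTemperature(override *float32, providerDefault float32) float32 {
	if override == nil {
		return providerDefault
	}
	return *override
}

func main() {
	fmt.Println(resolveTemperature(nil, 0.7))             // 0.7: unset, default kept
	fmt.Println(resolveTemperature(ptr(float32(0)), 0.7)) // 0: explicit zero respected
}
```

With a plain `float32` field, both calls would carry the value 0 and the second case could not be told apart from the first.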
Ed Zynda 3bb20f5283 feat(models): surface and prevent silent max-tokens truncation
- Raise --max-tokens default from 4096 to 8192.
- Auto-raise MaxTokens toward the model's catalog Limit.Output (capped at
  32768) when the user hasn't set --max-tokens explicitly and no per-model
  modelSettings override applied. Prevents silent 4k/8k truncation on
  models that support 32k-262k output.
- Surface FinishReasonLength at turn end: the app now subscribes to
  TurnEndEvent and renders a system-message banner explaining the current
  cap, the model's known ceiling, and how to raise it. Previously the TUI
  swallowed 'length' stops, producing 'ghost' truncations.
- Export FinishReason* constants on pkg/kit (Stop, Length, ToolCalls,
  ContentFilter, Error, Other, Unknown) and fix stale comments that used
  Anthropic-style strings.
- Add Kit.MaxTokens() and Kit.MaxOutputLimit() SDK accessors, backed by
  Agent.GetMaxTokens() which correctly returns 0 for providers that
  suppress the param (e.g. Codex OAuth).
- Tests: rightSizeMaxTokens covers 7 paths (cap, raise, preserve,
  explicit flag, nil info, zero limit); handleTurnEnd covers length/
  non-length/nil-sendFn and the fallback message formatter.
- Docs: update configuration.md, cli/flags.md, and kit-extensions skill
  to reflect the new default and behavior.
2026-04-16 23:12:10 +03:00
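The auto-raise rule from the commit above can be sketched as a pure function. This is a hypothetical reconstruction from the commit message, not the real `rightSizeMaxTokens`: the signature is invented, and the choice to never lower the current value is an assumption.

```go
package main

import "fmt"

// autoRaiseCap is the 32768 ceiling named in the commit message.
const autoRaiseCap = 32768

// rightSize raises current toward the model's known output limit, capped at
// autoRaiseCap. It leaves current untouched when the user set the value
// explicitly or when the catalog has no usable limit (limit <= 0).
func rightSize(current, modelOutputLimit int, explicit bool) int {
	if explicit || modelOutputLimit <= 0 {
		return current
	}
	target := modelOutputLimit
	if target > autoRaiseCap {
		target = autoRaiseCap
	}
	if target > current {
		return target
	}
	return current
}

func main() {
	fmt.Println(rightSize(8192, 262144, false)) // 32768: raised, then capped
	fmt.Println(rightSize(8192, 16384, false))  // 16384: raised to the model limit
	fmt.Println(rightSize(8192, 262144, true))  // 8192: explicit value preserved
}
```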
Ed Zynda 633fa38b2b fix(ui): regenerate spinner frames on theme change
- UpdateTheme() only refreshed typography styles, leaving spinner
  frames rendered with the old theme's colors
- Now calls knightRiderFrames() to rebuild frames with the new
  theme's Primary, Muted, VeryMuted, and MutedBorder colors
2026-04-16 12:32:49 +03:00
Ed Zynda f905cee48c fix(ui): dynamically size slash command name column in popup
- Replace hardcoded nameWidth of 15 with dynamic calculation based on
  the longest command name in the filtered list
- Prevents truncation of longer names like /feature-request and
  /release-tagger that were cut off with ellipsis
- Cap name column to leave at least 20 chars for descriptions
- Add 1 char gap between name and description columns
2026-04-16 12:27:56 +03:00
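The column sizing described in the commit above can be sketched as follows. The constants come from the commit message (20 characters reserved for descriptions, 1 character gap); `nameColumnWidth` is a hypothetical function, and measuring names by rune count rather than terminal display width is an assumption.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

const (
	minDescWidth = 20 // characters reserved for the description column
	columnGap    = 1  // gap between the name and description columns
)

// nameColumnWidth returns the width of the longest name, capped so that the
// description column keeps at least minDescWidth characters.
func nameColumnWidth(names []string, totalWidth int) int {
	width := 0
	for _, n := range names {
		if l := utf8.RuneCountInString(n); l > width {
			width = l
		}
	}
	maxName := totalWidth - minDescWidth - columnGap
	if maxName < 0 {
		maxName = 0
	}
	if width > maxName {
		width = maxName
	}
	return width
}

func main() {
	names := []string{"/help", "/feature-request", "/release-tagger"}
	fmt.Println(nameColumnWidth(names, 80)) // 16: fits the longest name
	fmt.Println(nameColumnWidth(names, 30)) // 9: capped to protect descriptions
}
```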
Ed Zynda 182c10ea1a refactor(ui): improve keybinding ergonomics for terminal multiplexers
- Move thinking toggle from ctrl+t to leader chord (ctrl+x t) to avoid
  conflicts with tmux/zellij tab mode and terminal new-tab shortcuts
- Change scrollback jump from alt+home/alt+end to ctrl+home/ctrl+end
  for better compatibility across SSH and older tmux versions
- Remove ctrl+d as submit alias (enter suffices); avoids EOF convention
  confusion and accidental shell disconnects
- Remove ctrl+a from tree selector filter shortcuts to avoid conflict
  with the common tmux prefix remap (ctrl+o cycle still reaches all
  filter modes)
2026-04-16 12:21:37 +03:00
Ed Zynda fcaa52bf1c fix(extensions): serialize handler calls per-extension to prevent data races
- Add per-extension reentrant mutex to Runner that serializes handler
  invocations from concurrent goroutines (e.g. parallel subagent events)
  while allowing re-entrant calls (handler → EmitCustomEvent → handler)
- Fix subagent-monitor slice aliasing bug: submonEntries[:0] reuses the
  backing array, corrupting entries during in-place filtering
- Pass parent's loaded MCPConfig to child subagents in Kit.Subagent(),
  eliminating concurrent viper map access during parallel kit.New() calls
- Add Options.MCPConfig field so SDK consumers can also skip viper reads
- Add tests for concurrent emit, cross-extension concurrency, and
  re-entrant EmitCustomEvent
2026-04-16 12:11:10 +03:00
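The slice aliasing problem named in the commit above is easy to reproduce in isolation. The sketch below uses plain integers instead of the extension's entry type; `filterInPlace` and `filterCopy` are hypothetical names that contrast the `s[:0]` idiom with a freshly allocated slice.

```go
package main

import "fmt"

// filterInPlace reuses the backing array of in, so any other view of that
// array observes the overwritten elements.
func filterInPlace(in []int, drop int) []int {
	out := in[:0]
	for _, v := range in {
		if v != drop {
			out = append(out, v)
		}
	}
	return out
}

// filterCopy builds a new slice and leaves the input untouched.
func filterCopy(in []int, drop int) []int {
	out := make([]int, 0, len(in))
	for _, v := range in {
		if v != drop {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	a := []int{1, 2, 3, 4}
	filterInPlace(a, 1)
	fmt.Println(a) // [2 3 4 4]: the original view was overwritten

	b := []int{1, 2, 3, 4}
	filterCopy(b, 1)
	fmt.Println(b) // [1 2 3 4]: the original view is intact
}
```

The in-place form is safe only when nothing else still reads the old slice, which is not guaranteed for shared package-level state.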
Ed Zynda 7e6455732c docs: update documentation for sudo password prompt feature
- README.md: mention interactive sudo password prompt in features
- skills/kit-sdk/SKILL.md: add PasswordPromptEvent to event types table
- www/pages/index.md: update features list with sudo prompt
- www/pages/development.md: update project structure description
- www/pages/sdk/callbacks.md: add complete event types table
2026-04-15 18:06:11 +03:00
Ed Zynda 71301a9035 feat: add interactive sudo password prompt for bash tool
Add core TUI support for handling sudo password prompts when executing
bash commands that require elevated privileges.

- Detect sudo commands and check if credentials are cached (sudo -n)
- Show modal password prompt with masked input (• characters) when needed
- Pipe password via stdin using sudo -S -p '' (no password in command string)
- Password flows through context callbacks, never stored in session history
- Add PasswordPromptHandler to agent and SDK event system
- Add password prompt overlay to TUI with 🔐 icon and hidden input
- Include tests for sudo command detection and rewriting

The password is never persisted to disk - it only exists in memory
during execution and is piped directly to sudo via stdin.
2026-04-15 17:33:03 +03:00
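The command rewriting mentioned in the commit above (adding `-S -p ''` so sudo reads the password from stdin with an empty prompt) can be sketched as a string transformation. `rewriteSudo` is a hypothetical helper; it only handles a command that begins with `sudo`, whereas the real detection logic is not shown in this diff and may cover more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteSudo turns a leading "sudo " into "sudo -S -p '' " so the password
// can be piped via stdin and never appears in the command string.
// It reports whether a rewrite happened.
func rewriteSudo(command string) (string, bool) {
	const prefix = "sudo "
	trimmed := strings.TrimSpace(command)
	if !strings.HasPrefix(trimmed, prefix) {
		return command, false
	}
	return "sudo -S -p '' " + strings.TrimPrefix(trimmed, prefix), true
}

func main() {
	fmt.Println(rewriteSudo("sudo apt update")) // sudo -S -p '' apt update true
	fmt.Println(rewriteSudo("echo hello"))      // echo hello false
}
```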
64 changed files with 3574 additions and 347 deletions
+6 -3
@@ -13,6 +13,8 @@
// - No channels in maps (Yaegi panics on range over map[string]chan)
// - All ctx.* calls guarded with nil checks
// - Simple data structures only
// - The extension runner serializes handler calls per-extension, so
// concurrent subagent events cannot race on this shared state.
package main
import (
@@ -43,7 +45,8 @@ const (
)
// ---------------------------------------------------------------------------
// Package-level state - all simple types
// Package-level state — safe because the runner serializes all handler
// invocations for the same extension (per-extension reentrant mutex).
// ---------------------------------------------------------------------------
var (
@@ -282,8 +285,8 @@ func Init(api ext.API) {
submonPushWidget()
// Remove the entry immediately (no goroutine to avoid races)
newEntries := submonEntries[:0]
// Remove the entry — build a new slice to avoid aliasing bugs
newEntries := make([]*submonEntry, 0, len(submonEntries))
for _, en := range submonEntries {
if en.callID != e.ToolCallID {
newEntries = append(newEntries, en)
-80
@@ -1,80 +0,0 @@
# Autoscroll Fix - Final Summary
## Root Cause
The autoscroll was failing for streaming assistant messages due to a bug in how `GotoBottom()` calculated item heights.
### The Problem
1. **Reasoning blocks** (`StreamingMessageItem` with `role="reasoning"`) are **never cached** because they have live duration counters that update every render
2. The `Height()` method returns `0` when `cachedRender == ""`
3. `GotoBottom()` was calling:
```go
itemHeight := item.Height() // Returns 0 for reasoning
if itemHeight == 0 {
item.Render(s.width) // Renders but doesn't cache (reasoning)
itemHeight = item.Height() // Still returns 0!
}
```
4. This caused incorrect scroll position calculations, especially during reasoning → assistant transitions
## The Solution
Changed `GotoBottom()` and `AtBottom()` to calculate height **directly from the rendered string** instead of relying on the cached height:
```go
// OLD: item.Height() which checks cached render
itemHeight := item.Height()
if itemHeight == 0 {
item.Render(s.width)
itemHeight = item.Height() // Still might be 0!
}
// NEW: Calculate from rendered string directly
rendered := item.Render(s.width)
itemHeight := strings.Count(rendered, "\n") + 1
```
This works for **all** items regardless of whether they cache their render or not.
## Files Changed
### `internal/ui/scrolllist.go`
- **`GotoBottom()`**: Calculate height from rendered string (2 loops)
- **`AtBottom()`**: Calculate height from rendered string (1 loop)
### `internal/ui/model.go`
- **`appendStreamingChunk()`**: For existing messages, call `GotoBottom()` directly (iteratr pattern)
- **`refreshContent()`**: Simplified to only call `SetItems()` (removed redundant `GotoBottom()`)
- **Bash streaming handler**: Removed redundant `GotoBottom()` after `refreshContent()`
## Testing Results
✅ **Test prompt**: "explore this repo"
**Before fix**:
- Autoscroll stopped after reasoning block completed
- Viewport stuck showing end of reasoning ("Thought for 203ms")
- Assistant response streamed off-screen below
**After fix**:
- Autoscroll works throughout reasoning block
- Autoscroll continues during reasoning → assistant transition
- Viewport stays at bottom showing latest assistant content
- Final position shows end of response (build commands section)
## Behavior Verified
1. ✅ Streaming text auto-scrolls to bottom
2. ✅ Works across reasoning → assistant transition
3. ✅ Manual scroll up (PgUp) disables autoscroll
4. ✅ Scroll to bottom (Alt+End) re-enables autoscroll
5. ✅ Accurate positioning with no offset errors
## Performance Note
The fix calls `Render()` on all items during `GotoBottom()` calculations. This is acceptable because:
- `Render()` is already optimized with caching for non-reasoning items
- `GotoBottom()` is only called during content updates (not every frame)
- Reasoning blocks need to render anyway for live duration updates
- This matches iteratr's approach of ensuring items are rendered before height calculations
+72 -7
@@ -18,7 +18,7 @@ A powerful, extensible AI coding agent CLI with multi-provider support, built-in
## Features
- **Multi-Provider LLM Support**: Anthropic, OpenAI, Google Gemini, Ollama, Azure OpenAI, AWS Bedrock, OpenRouter, and more
- **Built-in Core Tools**: bash, read, write, edit, grep, find, ls, subagent - no MCP overhead
- **Built-in Core Tools**: bash (with interactive sudo password prompt), read, write, edit, grep, find, ls, subagent - no MCP overhead
- **Smart @ Attachments**: Binary files auto-detected via MIME type, MCP resources via `@mcp:server:uri`
- **MCP Integration**: Connect external MCP servers for expanded capabilities
- **Extension System**: Write custom tools, commands, widgets, and UI modifications in Go
@@ -126,8 +126,13 @@ model: anthropic/claude-sonnet-latest
max-tokens: 4096
temperature: 0.7
stream: true
thinking-level: off # off, none, minimal, low, medium, high
```
All of the above keys can also be set programmatically via the SDK
(`kit.Options.MaxTokens`, `Options.Temperature`, `Options.ThinkingLevel`, etc.)
without touching config files — see [SDK options](#with-options).
### Environment Variables
```bash
@@ -152,6 +157,11 @@ mcpServers:
search:
type: remote
url: "https://mcp.example.com/search"
pubmed:
type: remote
url: "https://pubmed.mcp.example.com"
noOAuth: true # skip OAuth for public servers that don't require auth
```
## CLI Reference
@@ -187,14 +197,14 @@ mcpServers:
--no-prompt-templates Disable prompt template loading
# Generation parameters
--max-tokens Maximum tokens in response (default: 4096)
--max-tokens Maximum tokens in response (default: 8192, auto-raised up to 32768 for models with larger known output limits)
--temperature Randomness 0.0-1.0 (default: 0.7)
--top-p Nucleus sampling 0.0-1.0 (default: 0.95)
--top-k Limit top K tokens (default: 40)
--stop-sequences Custom stop sequences (comma-separated)
--frequency-penalty Penalize frequent tokens 0.0-2.0 (default: 0.0)
--presence-penalty Penalize present tokens 0.0-2.0 (default: 0.0)
--thinking-level Extended thinking level: off, minimal, low, medium, high (default: off)
--thinking-level Extended thinking level: off, none, minimal, low, medium, high (default: off)
# System
--config Config file path (default: ~/.kit.yml)
@@ -206,9 +216,10 @@ mcpServers:
```bash
# Authentication (for OAuth-enabled providers)
kit auth login [provider] # Start OAuth flow (e.g., anthropic)
kit auth logout [provider] # Remove credentials for provider
kit auth status # Check authentication status
kit auth login [provider] # Start OAuth flow (e.g., anthropic)
kit auth login [provider] --set-default # Set provider's default model as system default
kit auth logout [provider] # Remove credentials for provider
kit auth status # Check authentication status
# Model database
kit models [provider] # List available models (optionally filter by provider)
@@ -290,7 +301,7 @@ kit -e examples/extensions/minimal.go
### Extension Capabilities
**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnToolCall, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd
**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnToolCall, OnToolCallInputStart, OnToolCallInputDelta, OnToolCallInputEnd, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd
**Custom Components**:
- **Tools**: Add new tools the LLM can invoke
@@ -541,6 +552,20 @@ host, err := kit.New(ctx, &kit.Options{
Streaming: true,
Quiet: true,
// Generation parameters (override env/config/per-model defaults)
MaxTokens: 16384, // 0 = auto-resolve (env → config → per-model → 8192 floor)
ThinkingLevel: "medium", // "off", "none", "minimal", "low", "medium", "high"
Temperature: ptr(float32(0.2)), // pointer so 0.0 != unset; nil = provider default
TopP: nil, // nil = leave provider/per-model default
TopK: nil,
FrequencyPenalty: nil,
PresencePenalty: nil,
// Provider configuration (override env/config without reaching into viper)
ProviderAPIKey: "sk-...", // "" = use config / provider env var
ProviderURL: "https://proxy.internal/v1", // "" = provider default
TLSSkipVerify: false, // only takes effect when true
// Session options
SessionPath: "./session.jsonl", // Open specific session
Continue: true, // Resume most recent session
@@ -561,6 +586,46 @@ host, err := kit.New(ctx, &kit.Options{
})
```
**Generation & provider fields** (added in v0.55+) let SDK consumers configure
Kit entirely in-code without `viper.Set()` workarounds or shipping a `.kit.yml`.
Precedence is `Options` > `KIT_*` env vars > `.kit.yml` > per-model defaults
(`modelSettings` / `customModels`) > provider-level defaults. Sampling params
are pointer types so explicit `0.0` is distinguishable from "leave alone"; a
non-zero `MaxTokens` suppresses automatic right-sizing the same way `--max-tokens`
does on the CLI.
### MCP OAuth (remote MCP servers)
When a remote MCP server returns 401, Kit runs the full OAuth flow (dynamic
client registration → PKCE → token exchange → persistence) but delegates the
user-facing step — showing the authorization URL and receiving the callback —
to an `MCPAuthHandler` that you pass explicitly via `Options.MCPAuthHandler`.
If nil, OAuth is disabled and the authorization-required error surfaces to the
caller; the SDK never auto-opens a browser or binds a localhost port.
```go
// CLI/TUI apps: opens the system browser + prints status to stderr.
authHandler, _ := kit.NewCLIMCPAuthHandler()
defer authHandler.Close()
host, _ := kit.New(ctx, &kit.Options{
MCPAuthHandler: authHandler,
})
// Custom UX: reuse the SDK's port + callback server, supply your own
// presentation via OnAuthURL (TUI modal, QR code, web redirect, etc.).
// h, _ := kit.NewDefaultMCPAuthHandler()
// h.OnAuthURL = func(server, authURL string) { myUI.Show(server, authURL) }
//
// Full control (web apps, daemons): implement kit.MCPAuthHandler yourself —
// no localhost binding, no side effects.
```
Tokens are persisted to `$XDG_CONFIG_HOME/.kit/mcp_tokens.json` by default; swap
in a custom `MCPTokenStoreFactory` for encrypted, DB-backed, or in-memory
storage. See the [SDK options docs](/sdk/options#mcp-oauth-authorization) for
the full matrix.
### Custom Tools
Create custom tools with automatic schema generation — no external dependencies needed:
+64 -4
@@ -11,6 +11,7 @@ import (
"charm.land/huh/v2"
"github.com/mark3labs/kit/internal/auth"
"github.com/mark3labs/kit/internal/ui"
kit "github.com/mark3labs/kit/pkg/kit"
"github.com/spf13/cobra"
)
@@ -54,9 +55,13 @@ Available providers:
- anthropic: Anthropic Claude API (OAuth)
- openai: OpenAI ChatGPT Plus/Pro (Codex OAuth)
Example:
Flags:
--set-default Set this provider's default model as the system default
Examples:
kit auth login anthropic
kit auth login openai`,
kit auth login openai
kit auth login openai --set-default`,
Args: cobra.ExactArgs(1),
RunE: runAuthLogin,
}
@@ -99,10 +104,43 @@ Example:
RunE: runAuthStatus,
}
var (
loginSetDefault bool
)
// defaultModels maps providers to their recommended default models.
// These are used when --set-default flag is passed to auth login.
var defaultModels = map[string]string{
"anthropic": "anthropic/claude-sonnet-4-5-20250929",
"openai": "openai/gpt-5.4",
}
// setDefaultModelIfRequested sets the default model for the given provider
// if the --set-default flag was provided.
func setDefaultModelIfRequested(provider string) error {
if !loginSetDefault {
return nil
}
model, ok := defaultModels[provider]
if !ok {
return fmt.Errorf("no default model configured for provider: %s", provider)
}
if err := ui.SaveModelPreference(model); err != nil {
return fmt.Errorf("failed to save model preference: %w", err)
}
fmt.Printf("\n✓ Set default model to: %s\n", model)
return nil
}
func init() {
authCmd.AddCommand(authLoginCmd)
authCmd.AddCommand(authLogoutCmd)
authCmd.AddCommand(authStatusCmd)
authLoginCmd.Flags().BoolVar(&loginSetDefault, "set-default", false, "Set this provider's default model as the system default after login")
}
func runAuthLogin(cmd *cobra.Command, args []string) error {
@@ -288,6 +326,17 @@ func loginAnthropic() error {
fmt.Println("\n🎉 Your OAuth credentials will now be used for Anthropic API calls.")
fmt.Println("💡 You can check your authentication status with: kit auth status")
// Set default model if requested
if err := setDefaultModelIfRequested("anthropic"); err != nil {
return err
}
// Remind users how to set this as default if they didn't use --set-default
if !loginSetDefault {
fmt.Println("\n💡 To set Anthropic as your default model, run:")
fmt.Println(" kit auth login anthropic --set-default")
}
return nil
}
@@ -454,6 +503,17 @@ func loginOpenAI() error {
fmt.Println("\n🎉 Your OAuth credentials will now be used for OpenAI API calls.")
fmt.Println("💡 You can check your authentication status with: kit auth status")
// Set default model if requested
if err := setDefaultModelIfRequested("openai"); err != nil {
return err
}
// Remind users how to set this as default if they didn't use --set-default
if !loginSetDefault {
fmt.Println("\n💡 To set OpenAI as your default model, run:")
fmt.Println(" kit auth login openai --set-default")
}
return nil
}
@@ -504,13 +564,13 @@ func startOpenAICallbackServer(expectedState string) (*callbackServer, error) {
}
// Return success page
- w.Header().Set("Content-Type", "text/html")
+ w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.WriteHeader(http.StatusOK)
_, _ = fmt.Fprintf(w, `<!DOCTYPE html>
<html>
<head><title>Authentication Successful</title></head>
<body style="font-family: sans-serif; text-align: center; padding: 50px;">
- <h1> Authentication Successful</h1>
+ <h1>&#10003; Authentication Successful</h1>
<p>You can close this window and return to the terminal.</p>
</body>
</html>`)
+2 -2
@@ -297,14 +297,14 @@ func init() {
flags.BoolVar(&noPromptTemplates, "no-prompt-templates", false, "disable prompt template discovery")
// Model generation parameters
- flags.IntVar(&maxTokens, "max-tokens", 4096, "maximum number of tokens in the response")
+ flags.IntVar(&maxTokens, "max-tokens", 8192, "maximum number of output tokens per response (auto-raised up to 32768 for models with higher known output limits; see internal/models/embedded_models.json)")
flags.Float32Var(&temperature, "temperature", 0.7, "controls randomness in responses (0.0-1.0)")
flags.Float32Var(&topP, "top-p", 0.95, "controls diversity via nucleus sampling (0.0-1.0)")
flags.Int32Var(&topK, "top-k", 40, "controls diversity by limiting top K tokens to sample from")
flags.Float32Var(&frequencyPenalty, "frequency-penalty", 0.0, "penalizes tokens based on frequency of appearance (0.0-2.0)")
flags.Float32Var(&presencePenalty, "presence-penalty", 0.0, "penalizes tokens based on whether they have appeared (0.0-2.0)")
flags.StringSliceVar(&stopSequences, "stop-sequences", nil, "custom stop sequences (comma-separated)")
- flags.StringVar(&thinkingLevel, "thinking-level", "off", "extended thinking level: off, minimal, low, medium, high")
+ flags.StringVar(&thinkingLevel, "thinking-level", "off", "extended thinking level: off, none, minimal, low, medium, high")
// Ollama-specific parameters
flags.Int32Var(&numGPU, "num-gpu-layers", -1, "number of model layers to offload to GPU for Ollama models (-1 for auto-detect)")
@@ -130,6 +130,58 @@ func TestSubagentMonitor_MultipleSubagents(t *testing.T) {
time.Sleep(100 * time.Millisecond)
}
// TestSubagentMonitor_ConcurrentSubagents verifies no panics when multiple
// subagents emit events concurrently from different goroutines.
func TestSubagentMonitor_ConcurrentSubagents(t *testing.T) {
harness := test.New(t)
harness.LoadFile("../../.kit/extensions/subagent-monitor.go")
_, err := harness.Emit(extensions.SessionStartEvent{SessionID: "test-session"})
if err != nil {
t.Fatalf("SessionStart should not error: %v", err)
}
// Start 5 subagents concurrently
done := make(chan struct{}, 5)
for i := range 5 {
go func(idx int) {
defer func() { done <- struct{}{} }()
callID := fmt.Sprintf("concurrent-%d", idx)
task := fmt.Sprintf("concurrent task %d", idx)
_, _ = harness.Emit(extensions.SubagentStartEvent{
ToolCallID: callID,
Task: task,
})
// Emit many chunks rapidly
for j := range 20 {
_, _ = harness.Emit(extensions.SubagentChunkEvent{
ToolCallID: callID,
Task: task,
ChunkType: "text",
Content: fmt.Sprintf("agent %d chunk %d", idx, j),
})
}
_, _ = harness.Emit(extensions.SubagentEndEvent{
ToolCallID: callID,
Task: task,
Response: "done",
})
}(i)
}
// Wait for all goroutines
for range 5 {
<-done
}
// Allow any final processing
time.Sleep(200 * time.Millisecond)
}
// TestSubagentMonitor_SessionShutdown verifies shutdown doesn't panic
// even with nil ctx functions.
func TestSubagentMonitor_SessionShutdown(t *testing.T) {
+153
@@ -0,0 +1,153 @@
//go:build ignore
// sudo-handler.go - Extension to handle sudo password prompts securely
//
// This extension intercepts bash tool calls whose command contains "sudo" and:
// 1. Reuses the password cached in memory earlier in the session, if any
// 2. If none is cached, prompts the user for their password (with masking)
// 3. Sets the SUDO_PASSWORD environment variable for the bash tool to read
// 4. The bash tool automatically uses sudo -S -p '' to pipe the password
//
// Usage: kit -e examples/extensions/sudo-handler.go
//
// Security notes:
// - Password is only stored in memory for the duration of the session
// - Password is never logged or displayed
// - Each session requires re-authentication (the cache is cleared on shutdown)
// - The SUDO_PASSWORD env var stays set from the first sudo call until
//   session shutdown, when it is unset
package main
import (
"encoding/json"
"os"
"strings"
"sync"
"kit/ext"
)
var (
// cachedPassword stores the sudo password for the session
cachedPassword string
// hasCachedPassword tracks if we have a valid cached password
hasCachedPassword bool
// mu protects cached password access
mu sync.RWMutex
)
// Init sets up the sudo handler extension
func Init(api ext.API) {
api.OnToolCall(func(tc ext.ToolCallEvent, ctx ext.Context) *ext.ToolCallResult {
if tc.ToolName != "bash" {
return nil
}
// Parse the command from tool input
var input struct {
Command string `json:"command"`
}
if err := json.Unmarshal([]byte(tc.Input), &input); err != nil {
return nil
}
// Check if command contains sudo
if !containsSudo(input.Command) {
return nil
}
// Check if we already have cached credentials
mu.RLock()
password := cachedPassword
hasCached := hasCachedPassword
mu.RUnlock()
if hasCached {
// Use cached password
os.Setenv("SUDO_PASSWORD", password)
return nil
}
// No cached password - prompt user
result := ctx.PromptInput(ext.PromptInputConfig{
Message: "🔐 Sudo password required for:\n " + truncateCommand(input.Command, 60),
Placeholder: "Enter your password",
})
if result.Cancelled {
return &ext.ToolCallResult{
Block: true,
Reason: "Sudo password prompt cancelled by user",
}
}
if result.Value == "" {
return &ext.ToolCallResult{
Block: true,
Reason: "No password provided",
}
}
// Cache the password for this session
mu.Lock()
cachedPassword = result.Value
hasCachedPassword = true
mu.Unlock()
// Set environment variable for the bash tool to use
os.Setenv("SUDO_PASSWORD", result.Value)
// Show confirmation (without revealing password)
ctx.PrintInfo("Sudo password cached for this session")
return nil
})
// Clear cached password when session ends
api.OnSessionShutdown(func(event ext.SessionShutdownEvent, ctx ext.Context) {
mu.Lock()
cachedPassword = ""
hasCachedPassword = false
mu.Unlock()
os.Unsetenv("SUDO_PASSWORD")
})
}
// containsSudo reports whether the command appears to invoke sudo.
// It is a substring heuristic that does not parse quoting, so it also matches
// "sudo " inside a quoted string or at the end of a longer word ("pseudo ").
func containsSudo(command string) bool {
lower := strings.ToLower(command)
// Every separator-prefixed form (";sudo ", "&& sudo ", "| sudo ", "$(sudo ",
// "`sudo ") contains "sudo " itself, so two substring checks cover them all.
return strings.Contains(lower, "sudo ") || strings.Contains(lower, "sudo\t")
}
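// truncateCommand shortens a long command for display, appending "..." when
// it cuts. It counts runes, not bytes, so multi-byte characters stay intact.
// Limits below 4 leave no room for the ellipsis; the command is then returned
// unchanged instead of slicing with a negative bound.
func truncateCommand(cmd string, maxLen int) string {
runes := []rune(cmd)
if maxLen < 4 || len(runes) <= maxLen {
return cmd
}
return string(runes[:maxLen-3]) + "..."
}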
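Because `containsSudo` is a substring check, a command such as `echo pseudo code` also triggers the password prompt. A stricter matcher could tokenize the command first. The following is a minimal sketch under that assumption; `hasSudoToken` and `isShellSeparator` are hypothetical helpers, not part of the extension, and quoted strings are still not parsed.

```go
package main

import (
	"fmt"
	"strings"
)

// isShellSeparator reports whether r separates shell words for this heuristic.
func isShellSeparator(r rune) bool {
	return strings.ContainsRune(" \t\n;&|()`", r)
}

// hasSudoToken splits the command on whitespace and common shell separators
// and looks for a bare "sudo" token, so "pseudo" no longer matches.
func hasSudoToken(command string) bool {
	for _, tok := range strings.FieldsFunc(command, isShellSeparator) {
		if tok == "sudo" {
			return true
		}
	}
	return false
}

func main() {
	commands := []string{
		"sudo apt update",
		"make && sudo make install",
		"echo $(sudo id)",
		"echo pseudo code",
	}
	for _, c := range commands {
		fmt.Printf("%-30q sudo=%v\n", c, hasSudoToken(c))
	}
}
```

The sketch only needs the `strings` package, which the extension already imports.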
+76 -2
@@ -87,6 +87,19 @@ type ReasoningDeltaHandler func(delta string)
// Called when the last reasoning token has been processed, before text streaming starts.
type ReasoningCompleteHandler func()
// ToolCallStartHandler is a function type for handling the moment when the LLM
// begins generating tool call arguments. The tool name is known but the full
// argument JSON is still streaming.
type ToolCallStartHandler func(toolCallID, toolName string)
// ToolCallDeltaHandler is a function type for handling streamed fragments of
// tool call arguments as they arrive from the LLM.
type ToolCallDeltaHandler func(toolCallID, delta string)
// ToolCallEndHandler is a function type for handling the end of tool argument
// streaming, before the tool call is parsed and execution begins.
type ToolCallEndHandler func(toolCallID string)
// ToolOutputHandler is a function type for handling streaming tool output chunks.
// Used by tools like bash to stream output as it arrives rather than waiting
// for the command to complete. The isStderr flag indicates if the chunk
@@ -94,6 +107,12 @@ type ReasoningCompleteHandler func()
// Note: This is an alias for core.ToolOutputCallback to avoid import cycles.
type ToolOutputHandler = core.ToolOutputCallback
// PasswordPromptHandler is a function type for password prompts.
// Used by the bash tool when sudo requires a password. The handler receives
// a prompt message and returns the password and whether it was cancelled.
// Note: This is an alias for core.PasswordPromptCallback.
type PasswordPromptHandler = core.PasswordPromptCallback
// StepMessagesHandler is a function type for persisting messages after each
// complete step in a multi-step agent turn. The handler receives the messages
// produced by the step (typically an assistant message with tool calls followed
@@ -405,7 +424,7 @@ func (a *Agent) GenerateWithLoop(ctx context.Context, messages []fantasy.Message
onResponse ResponseHandler, onToolCallContent ToolCallContentHandler,
) (*GenerateWithLoopResult, error) {
return a.GenerateWithLoopAndStreaming(ctx, messages, onToolCall, onToolExecution, onToolResult,
- onResponse, onToolCallContent, nil, nil, nil, nil, nil, nil)
+ onResponse, onToolCallContent, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil)
}
// GenerateWithLoopAndStreaming processes messages using the agent with streaming and callbacks.
@@ -420,6 +439,10 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
onToolOutput ToolOutputHandler,
onStepMessages StepMessagesHandler,
onStepUsage StepUsageHandler,
onPasswordPrompt PasswordPromptHandler,
onToolCallStart ToolCallStartHandler,
onToolCallDelta ToolCallDeltaHandler,
onToolCallEnd ToolCallEndHandler,
) (*GenerateWithLoopResult, error) {
// Wait for background MCP tool loading to complete and rebuild the
@@ -432,6 +455,11 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
ctx = core.ContextWithToolOutputCallback(ctx, onToolOutput)
}
// Inject password prompt handler into context for use by bash tool.
if onPasswordPrompt != nil {
ctx = core.ContextWithPasswordPrompt(ctx, onPasswordPrompt)
}
// The agent requires the current user input as Prompt, with prior messages as history.
// Extract the last user message text and files as the prompt, and pass everything
// before it as Messages. Files (e.g. clipboard images) are passed via the Files
@@ -450,7 +478,8 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
// Stream is required to observe tool execution in real time. The non-streaming
// Generate path is reserved for the simple case with no callbacks at all.
hasCallbacks := onToolCall != nil || onToolExecution != nil || onToolResult != nil ||
- onToolCallContent != nil || onStreamingResponse != nil || onReasoningDelta != nil
+ onToolCallContent != nil || onStreamingResponse != nil || onReasoningDelta != nil ||
+ onToolCallStart != nil || onToolCallDelta != nil || onToolCallEnd != nil
if a.streamingEnabled || hasCallbacks {
// Track completed step messages so we can return partial results
@@ -469,6 +498,35 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
Files: files,
Messages: history,
// Tool input streaming callbacks — fire during tool argument generation
OnToolInputStart: func(id, toolName string) error {
if ctx.Err() != nil {
return ctx.Err()
}
if onToolCallStart != nil {
onToolCallStart(id, toolName)
}
return nil
},
OnToolInputDelta: func(id, delta string) error {
if ctx.Err() != nil {
return ctx.Err()
}
if onToolCallDelta != nil {
onToolCallDelta(id, delta)
}
return nil
},
OnToolInputEnd: func(id string) error {
if ctx.Err() != nil {
return ctx.Err()
}
if onToolCallEnd != nil {
onToolCallEnd(id)
}
return nil
},
// Reasoning/thinking streaming callback
OnReasoningDelta: func(id, delta string) error {
if ctx.Err() != nil {
@@ -1013,6 +1071,22 @@ func (a *Agent) GetModel() fantasy.LanguageModel {
return a.model
}
// GetMaxTokens returns the effective max output tokens the agent currently
// sends to the LLM provider, after per-model defaults, right-sizing, and any
// Anthropic thinking-budget adjustments. Returns 0 when no ModelConfig is
// attached (e.g. early init) or when the provider suppresses the parameter
// (e.g. Codex OAuth), which allows callers to differentiate "default" from
// "explicitly capped".
func (a *Agent) GetMaxTokens() int {
if a.skipMaxOutputTokens {
return 0
}
if a.modelConfig == nil {
return 0
}
return a.modelConfig.MaxTokens
}
// Close closes the agent and cleans up resources.
// If MCP tools are still loading in the background, Close waits for them
// to finish before closing connections to avoid resource leaks.
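A consumer of the three tool-input callbacks added in this file typically has to join the streamed fragments by tool call ID. The following is a rough sketch of one way to do that. The handler signatures are copied from the hunk above; `argCollector` and its methods are hypothetical and not part of the agent package.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Handler shapes as declared in the diff.
type (
	ToolCallStartHandler func(toolCallID, toolName string)
	ToolCallDeltaHandler func(toolCallID, delta string)
	ToolCallEndHandler   func(toolCallID string)
)

// argCollector (hypothetical) buffers argument fragments per tool call ID
// until the end callback fires.
type argCollector struct {
	mu       sync.Mutex
	names    map[string]string
	pending  map[string]*strings.Builder
	complete map[string]string
}

func newArgCollector() *argCollector {
	return &argCollector{
		names:    make(map[string]string),
		pending:  make(map[string]*strings.Builder),
		complete: make(map[string]string),
	}
}

// handlers returns callbacks that can be passed where the agent expects them.
func (c *argCollector) handlers() (ToolCallStartHandler, ToolCallDeltaHandler, ToolCallEndHandler) {
	onStart := func(id, name string) {
		c.mu.Lock()
		defer c.mu.Unlock()
		c.names[id] = name
		c.pending[id] = &strings.Builder{}
	}
	onDelta := func(id, delta string) {
		c.mu.Lock()
		defer c.mu.Unlock()
		if b, ok := c.pending[id]; ok {
			b.WriteString(delta)
		}
	}
	onEnd := func(id string) {
		c.mu.Lock()
		defer c.mu.Unlock()
		if b, ok := c.pending[id]; ok {
			c.complete[id] = b.String()
			delete(c.pending, id)
		}
	}
	return onStart, onDelta, onEnd
}

// result returns the tool name and full argument JSON once streaming ended.
func (c *argCollector) result(id string) (name, args string, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	args, ok = c.complete[id]
	return c.names[id], args, ok
}

func main() {
	c := newArgCollector()
	onStart, onDelta, onEnd := c.handlers()

	// Simulate the order in which the agent would fire the callbacks.
	onStart("call-1", "bash")
	onDelta("call-1", `{"comm`)
	onDelta("call-1", `and":"ls"}`)
	onEnd("call-1")

	name, args, ok := c.result("call-1")
	fmt.Println(name, args, ok)
}
```

The mutex is there because nothing in the diff states which goroutine invokes the callbacks; if they always run on one goroutine it can be dropped.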
+80
@@ -888,6 +888,12 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
switch ev := e.(type) {
case kit.ToolCallEvent:
sendFn(ToolCallStartedEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolArgs: ev.ToolArgs})
case kit.ToolCallStartEvent:
sendFn(ToolCallInputStartEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolKind: ev.ToolKind})
case kit.ToolCallDeltaEvent:
sendFn(ToolCallInputDeltaEvent{ToolCallID: ev.ToolCallID, Delta: ev.Delta})
case kit.ToolCallEndEvent:
sendFn(ToolCallInputEndEvent{ToolCallID: ev.ToolCallID})
case kit.ToolExecutionStartEvent:
sendFn(ToolExecutionEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolArgs: ev.ToolArgs, IsStarting: true})
case kit.ToolExecutionEndEvent:
@@ -918,6 +924,22 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
sendFn(SteerConsumedEvent{})
case kit.StepUsageEvent:
a.recordStepUsage(ev, stepUsageSeen)
case kit.PasswordPromptEvent:
// Convert SDK PasswordPromptEvent to app PasswordPromptEvent
// The TUI will handle this and send the response back
responseCh := make(chan PasswordPromptResponse, 1)
sendFn(PasswordPromptEvent{
Prompt: ev.Prompt,
ResponseCh: responseCh,
})
// Wait for TUI response and forward to SDK
resp := <-responseCh
ev.ResponseCh <- kit.PasswordPromptResponse{
Password: resp.Password,
Cancelled: resp.Cancelled,
}
case kit.TurnEndEvent:
a.handleTurnEnd(ev, sendFn)
}
}))
@@ -928,6 +950,64 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
}
}
// handleTurnEnd inspects a turn's final StopReason and surfaces actionable
// feedback to the user when the turn ended in a state they can act on.
//
// Today the only surfaced case is FinishReasonLength: the model hit its
// configured max_output_tokens budget and the reply was truncated. Before
// this banner existed, the TUI swallowed the truncation silently, leading to
// "ghost" cut-offs with no indication of why.
//
// Separated from subscribeSDKEvents so tests can exercise it directly via a
// stubbed sendFn without standing up a full Kit.
func (a *App) handleTurnEnd(ev kit.TurnEndEvent, sendFn func(tea.Msg)) {
if sendFn == nil {
return
}
if ev.StopReason != kit.FinishReasonLength {
return
}
sendFn(ExtensionPrintEvent{
Level: "info",
Text: a.formatMaxTokensTruncatedMessage(),
})
}
// formatMaxTokensTruncatedMessage builds the user-facing explanation for a
// truncated turn. It reports the active max_output_tokens budget and, when
// known, the model's catalog output ceiling so the user can judge how much
// headroom is available.
func (a *App) formatMaxTokensTruncatedMessage() string {
k := a.opts.Kit
if k == nil {
// Extremely early / test-stub case: still emit a useful generic hint.
return "⚠ Response truncated: the model hit the configured max_output_tokens limit. " +
"Raise it with --max-tokens N, KIT_MAX_TOKENS=N, or per-model " +
"modelSettings[provider/model].maxTokens in config."
}
current := k.MaxTokens()
ceiling := k.MaxOutputLimit()
model := k.GetModelString()
msg := "⚠ Response truncated: "
if model != "" {
msg += fmt.Sprintf("%s hit the configured max_output_tokens limit", model)
} else {
msg += "the model hit the configured max_output_tokens limit"
}
if current > 0 {
msg += fmt.Sprintf(" (%d)", current)
}
msg += "."
if ceiling > 0 && current > 0 && ceiling > current {
msg += fmt.Sprintf(" This model supports up to %d output tokens.", ceiling)
}
msg += "\n\nRaise it with --max-tokens N, KIT_MAX_TOKENS=N, " +
"or per-model modelSettings[provider/model].maxTokens in your config. " +
"Re-run the last prompt after raising it to get the full response."
return msg
}
// QuitFromExtension triggers a graceful shutdown. In interactive mode it
// sends a tea.QuitMsg to the program so the TUI exits cleanly. In
// non-interactive mode it cancels the root context, stopping any in-flight
+93
@@ -3,10 +3,12 @@ package app
import (
"context"
"errors"
"strings"
"sync"
"testing"
"time"
tea "charm.land/bubbletea/v2"
kit "github.com/mark3labs/kit/pkg/kit"
)
@@ -666,3 +668,94 @@ func TestUpdateUsageFromTurnResult_contextTokensUsesAllCategories(t *testing.T)
expected, usage.contextCalls, usage.lastContextTokens)
}
}
// TestHandleTurnEnd_LengthEmitsWarning verifies that when the SDK reports a
// FinishReasonLength (max_output_tokens hit), the app surfaces a user-visible
// ExtensionPrintEvent with Level="info" so the TUI can render a banner
// instead of silently showing a truncated reply.
func TestHandleTurnEnd_LengthEmitsWarning(t *testing.T) {
app := New(Options{}, nil)
defer app.Close()
var mu sync.Mutex
var received []tea.Msg
sendFn := func(m tea.Msg) {
mu.Lock()
defer mu.Unlock()
received = append(received, m)
}
app.handleTurnEnd(kit.TurnEndEvent{StopReason: kit.FinishReasonLength}, sendFn)
mu.Lock()
defer mu.Unlock()
if len(received) != 1 {
t.Fatalf("expected 1 event on length stop, got %d", len(received))
}
ev, ok := received[0].(ExtensionPrintEvent)
if !ok {
t.Fatalf("expected ExtensionPrintEvent, got %T", received[0])
}
if ev.Level != "info" {
t.Errorf("expected Level=info, got %q", ev.Level)
}
if ev.Text == "" {
t.Error("expected non-empty warning text")
}
if !strings.Contains(ev.Text, "max_output_tokens") {
t.Errorf("warning text should mention max_output_tokens, got: %s", ev.Text)
}
}
// TestHandleTurnEnd_NonLengthIgnored verifies that ordinary stop reasons
// (stop, tool-calls, error, content-filter, other, unknown, "") do not produce
// a warning banner.
func TestHandleTurnEnd_NonLengthIgnored(t *testing.T) {
app := New(Options{}, nil)
defer app.Close()
reasons := []string{
kit.FinishReasonStop,
kit.FinishReasonToolCalls,
kit.FinishReasonError,
kit.FinishReasonContentFilter,
kit.FinishReasonOther,
kit.FinishReasonUnknown,
"",
}
for _, r := range reasons {
var called bool
app.handleTurnEnd(kit.TurnEndEvent{StopReason: r}, func(m tea.Msg) {
called = true
})
if called {
t.Errorf("stop reason %q unexpectedly emitted a warning", r)
}
}
}
// TestHandleTurnEnd_NilSendFn guards against panics when no TUI listener is
// attached (e.g. early init or headless teardown).
func TestHandleTurnEnd_NilSendFn(t *testing.T) {
app := New(Options{}, nil)
defer app.Close()
// Should not panic with a nil sendFn.
app.handleTurnEnd(kit.TurnEndEvent{StopReason: kit.FinishReasonLength}, nil)
}
// TestFormatMaxTokensTruncatedMessage_NoKit verifies the fallback message
// when Options.Kit is nil (test/stub path).
func TestFormatMaxTokensTruncatedMessage_NoKit(t *testing.T) {
app := New(Options{}, nil)
defer app.Close()
msg := app.formatMaxTokensTruncatedMessage()
if msg == "" {
t.Fatal("expected non-empty fallback message")
}
for _, needle := range []string{"max_output_tokens", "--max-tokens", "KIT_MAX_TOKENS", "modelSettings"} {
if !strings.Contains(msg, needle) {
t.Errorf("fallback message missing %q:\n%s", needle, msg)
}
}
}
+48
@@ -32,6 +32,36 @@ type ToolCallStartedEvent struct {
ToolArgs string
}
// ToolCallInputStartEvent is sent when the LLM begins generating tool call
// arguments. The tool name is known but the full argument JSON is still being
// streamed. UIs can use this to show a "running" indicator immediately instead
// of waiting for the full argument JSON to finish streaming.
type ToolCallInputStartEvent struct {
// ToolCallID is the stable identifier for correlating tool lifecycle events.
ToolCallID string
// ToolName is the name of the tool being called.
ToolName string
// ToolKind classifies the tool: "execute", "edit", "read", "search", "agent".
ToolKind string
}
// ToolCallInputDeltaEvent is sent for each streamed fragment of tool call
// arguments as they arrive from the LLM. Useful for live-previewing content
// or showing a progress indicator with byte count.
type ToolCallInputDeltaEvent struct {
// ToolCallID is the stable identifier for correlating tool lifecycle events.
ToolCallID string
// Delta is a JSON fragment of tool call arguments.
Delta string
}
// ToolCallInputEndEvent is sent when tool argument streaming is complete,
// before the tool call is parsed and execution begins.
type ToolCallInputEndEvent struct {
// ToolCallID is the stable identifier for correlating tool lifecycle events.
ToolCallID string
}
// ToolExecutionEvent is sent when a tool starts or finishes executing.
// The IsStarting flag distinguishes between the start and end of execution.
type ToolExecutionEvent struct {
@@ -79,6 +109,24 @@ type ToolCallContentEvent struct {
Content string
}
// PasswordPromptEvent is sent when a sudo command needs a password.
// The TUI should display a password prompt overlay and send the result back.
type PasswordPromptEvent struct {
// Prompt is the message to display to the user.
Prompt string
// ResponseCh receives the password from the TUI.
// The TUI must send exactly one value.
ResponseCh chan<- PasswordPromptResponse
}
// PasswordPromptResponse carries the user's password input.
type PasswordPromptResponse struct {
// Password is the entered password.
Password string
// Cancelled is true if the user cancelled the prompt.
Cancelled bool
}
// ResponseCompleteEvent is sent when the LLM produces a final (non-streaming) response.
// In streaming mode, this may be empty if all content was delivered via StreamChunkEvents.
type ResponseCompleteEvent struct {
+8
@@ -471,5 +471,13 @@ func GetAnthropicAPIKey(flagValue string) (string, string, error) {
return envKey, "ANTHROPIC_API_KEY environment variable", nil
}
// Check if OpenAI credentials exist to provide a helpful suggestion
if cm != nil {
hasOpenAI, _ := cm.HasOpenAICredentials()
if hasOpenAI {
return "", "", fmt.Errorf("no Anthropic API key found. Use 'kit auth login anthropic', set ANTHROPIC_API_KEY environment variable, or use --provider-api-key flag\n\nNote: OpenAI credentials were detected. To use OpenAI, run with --model openai/gpt-5.4 or set it as default:\n kit auth login openai --set-default")
}
}
return "", "", fmt.Errorf("no Anthropic API key found. Use 'kit auth login anthropic', set ANTHROPIC_API_KEY environment variable, or use --provider-api-key flag")
}
+10
@@ -30,6 +30,14 @@ type MCPServerConfig struct {
OAuthClientSecret string `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
OAuthScopes []string `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`
// NoOAuth disables OAuth transport configuration for this server, even
// when the connection pool has an auth handler. Use this for public MCP
// servers (e.g. PubMed) that don't require authentication. Without this
// flag, the pool would attach OAuth transport to every remote server,
// causing proactive dynamic-client-registration attempts that fail on
// servers that don't support it.
NoOAuth bool `json:"noOAuth,omitempty" yaml:"noOAuth,omitempty"`
// InProcessServer holds a live *server.MCPServer for in-process transport.
// When set (and Type is "inprocess"), the connection pool creates an
// in-process client instead of spawning a subprocess or making HTTP calls.
@@ -59,6 +67,7 @@ func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
OAuthClientID string `json:"oauthClientId,omitempty" yaml:"oauthClientId,omitempty"`
OAuthClientSecret string `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
OAuthScopes []string `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`
NoOAuth bool `json:"noOAuth,omitempty" yaml:"noOAuth,omitempty"`
}
// Also try legacy format
@@ -86,6 +95,7 @@ func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
s.OAuthClientID = newConfig.OAuthClientID
s.OAuthClientSecret = newConfig.OAuthClientSecret
s.OAuthScopes = newConfig.OAuthScopes
s.NoOAuth = newConfig.NoOAuth
return nil
}
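How the new `noOAuth` field is read from JSON and consulted before attaching OAuth can be sketched as follows. `serverConfig` is a trimmed, hypothetical stand-in for `MCPServerConfig`: the `url` field, the placeholder URLs, and the `shouldAttachOAuth` helper are assumptions for illustration, not the pool's real guard.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverConfig is a hypothetical, reduced version of MCPServerConfig.
type serverConfig struct {
	URL     string `json:"url,omitempty"` // assumed field name
	NoOAuth bool   `json:"noOAuth,omitempty"`
}

// parseServers decodes a name -> config map from JSON.
func parseServers(data []byte) (map[string]serverConfig, error) {
	var servers map[string]serverConfig
	if err := json.Unmarshal(data, &servers); err != nil {
		return nil, err
	}
	return servers, nil
}

// shouldAttachOAuth mirrors the behavior described in the field comment:
// OAuth is attached only when a handler exists and the server did not opt out.
func shouldAttachOAuth(cfg serverConfig, hasAuthHandler bool) bool {
	return hasAuthHandler && !cfg.NoOAuth
}

func main() {
	raw := []byte(`{
		"pubmed":  {"url": "https://example.invalid/mcp", "noOAuth": true},
		"private": {"url": "https://example.invalid/private"}
	}`)
	servers, err := parseServers(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, name := range []string{"pubmed", "private"} {
		fmt.Printf("%s: attach OAuth = %v\n", name, shouldAttachOAuth(servers[name], true))
	}
}
```

Because the field uses `omitempty` and defaults to `false`, existing configs keep their current OAuth behavior.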
+176 -6
@@ -19,10 +19,18 @@ import (
// It receives tool call ID, tool name, output chunk, and whether it's stderr.
type ToolOutputCallback func(toolCallID, toolName, chunk string, isStderr bool)
// PasswordPromptCallback is the signature for password prompts.
// It receives a prompt message and returns the password and whether it was cancelled.
type PasswordPromptCallback func(prompt string) (password string, cancelled bool)
// contextKey is a custom type for context keys to avoid collisions.
type contextKey string
- const toolOutputCallbackKey contextKey = "toolOutputCallback"
+ const (
+ toolOutputCallbackKey contextKey = "toolOutputCallback"
+ sudoPasswordKey contextKey = "sudoPassword"
+ passwordPromptKey contextKey = "passwordPrompt"
+ )
// ContextWithToolOutputCallback returns a new context with the tool output callback set.
func ContextWithToolOutputCallback(ctx context.Context, callback ToolOutputCallback) context.Context {
@@ -37,6 +45,34 @@ func toolOutputCallbackFromContext(ctx context.Context) ToolOutputCallback {
return nil
}
// ContextWithPasswordPrompt returns a new context with the password prompt callback set.
// This allows the TUI to show a modal password prompt when sudo needs a password.
func ContextWithPasswordPrompt(ctx context.Context, callback PasswordPromptCallback) context.Context {
return context.WithValue(ctx, passwordPromptKey, callback)
}
// passwordPromptFromContext retrieves the password prompt callback from context.
func passwordPromptFromContext(ctx context.Context) PasswordPromptCallback {
if cb, ok := ctx.Value(passwordPromptKey).(PasswordPromptCallback); ok {
return cb
}
return nil
}
// ContextWithSudoPassword returns a new context with the sudo password set.
// When present, the bash tool will use sudo -S to pipe this password to sudo commands.
func ContextWithSudoPassword(ctx context.Context, password string) context.Context {
return context.WithValue(ctx, sudoPasswordKey, password)
}
// sudoPasswordFromContext retrieves the sudo password from context.
func sudoPasswordFromContext(ctx context.Context) string {
if pw, ok := ctx.Value(sudoPasswordKey).(string); ok {
return pw
}
return ""
}
const defaultBashTimeout = 120 * time.Second
const maxBashTimeout = 600 * time.Second
@@ -73,6 +109,66 @@ func NewBashTool(opts ...ToolOption) fantasy.AgentTool {
}
}
// sudoCommandRe matches sudo invocations that need to be rewritten for -S mode.
// It matches "sudo" as a whole word at the start of the command or after a
// shell separator (;, &, |, &&, ||), optionally preceded by an environment
// variable assignment.
var sudoCommandRe = regexp.MustCompile(`(?i)(^|[&|;]|\|\||&&)\s*(\w+=\S+\s+)?\bsudo\b`)
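// truncateCommand shortens a long command for display, appending "..." when
// it cuts. It counts runes, not bytes, so multi-byte characters stay intact.
// Limits below 4 leave no room for the ellipsis; the command is then returned
// unchanged instead of slicing with a negative bound.
func truncateCommand(cmd string, maxLen int) string {
runes := []rune(cmd)
if maxLen < 4 || len(runes) <= maxLen {
return cmd
}
return string(runes[:maxLen-3]) + "..."
}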
// rewriteSudoForStdin rewrites sudo invocations to read the password from
// stdin (-S) with an empty prompt (-p followed by an empty string).
// It transforms:
//
//	sudo cmd → sudo -S -p '' cmd
func rewriteSudoForStdin(command string) string {
// Find all matches and their positions
matches := sudoCommandRe.FindAllStringIndex(command, -1)
if matches == nil {
return command
}
// Build result from end to start to preserve indices
result := command
for i := len(matches) - 1; i >= 0; i-- {
match := matches[i]
start, end := match[0], match[1]
matchedText := result[start:end]
// The pattern always ends with the literal "sudo", so split the match there.
// Searching for the first "sudo" instead would split inside an env
// assignment such as FOO=sudo.
sudoIdx := len(matchedText) - len("sudo")
prefix := matchedText[:sudoIdx]
sudoPart := matchedText[sudoIdx:]
// Check if the text immediately after "sudo" in the result contains -S
afterSudo := result[end:]
if strings.HasPrefix(strings.TrimLeft(afterSudo, " \t"), "-S") {
// Already has -S flag, skip
continue
}
// Insert -S -p '' after "sudo", keeping the token's original casing
// (the pattern is case-insensitive, so a case-sensitive replace could miss it)
result = result[:start] + prefix + sudoPart + " -S -p ''" + result[end:]
}
return result
}
// SudoPasswordRequiredMetadata is a special marker that indicates sudo needs a password.
// It is stored in tool response metadata to signal the TUI to prompt for a password.
const SudoPasswordRequiredMetadata = `{"sudo_password_required":true}`
// IsSudoPasswordRequiredResult checks if a tool response indicates sudo password is needed.
func IsSudoPasswordRequiredResult(resp fantasy.ToolResponse) bool {
return resp.Metadata == SudoPasswordRequiredMetadata
}
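The effect of the sudo rewrite is easiest to see on a few inputs. The following is a condensed sketch, not the function from the diff: it reuses the same idea (find each `sudo` with a pattern adapted from `sudoCommandRe`, walk the matches from the end, skip those already followed by `-S`) but inserts the flags at the end of each match instead of splitting out a prefix.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Pattern adapted from sudoCommandRe in the diff; every match ends with "sudo".
var sudoCommandRe = regexp.MustCompile(`(?i)(^|[&|;]|\|\||&&)\s*(\w+=\S+\s+)?\bsudo\b`)

// rewriteSudoForStdin is a condensed variant for illustration: it appends
// " -S -p ''" right after each matched sudo token.
func rewriteSudoForStdin(command string) string {
	matches := sudoCommandRe.FindAllStringIndex(command, -1)
	result := command
	// Walk backwards so earlier match offsets stay valid after each insert.
	for i := len(matches) - 1; i >= 0; i-- {
		end := matches[i][1]
		if strings.HasPrefix(strings.TrimLeft(result[end:], " \t"), "-S") {
			continue // already reads the password from stdin
		}
		result = result[:end] + " -S -p ''" + result[end:]
	}
	return result
}

func main() {
	commands := []string{
		"sudo apt update",
		"make && sudo make install",
		"FOO=bar sudo ls",
		"sudo -S true",
		"echo pseudo",
	}
	for _, c := range commands {
		fmt.Printf("%q -> %q\n", c, rewriteSudoForStdin(c))
	}
}
```

As in the original, `sudo` inside `$(...)` or backticks is not matched by this pattern, so such commands are left unchanged.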
func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fantasy.ToolResponse, error) {
var args bashArgs
if err := parseArgs(call.Input, &args); err != nil {
@@ -97,7 +193,47 @@ func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fa
cmdCtx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
- cmd := exec.CommandContext(cmdCtx, "bash", "-c", args.Command)
// Check for sudo password in context or environment
sudoPassword := sudoPasswordFromContext(ctx)
if sudoPassword == "" {
sudoPassword = os.Getenv("SUDO_PASSWORD")
}
command := args.Command
// If command contains sudo and we don't have a password, check if sudo needs one
if sudoPassword == "" && sudoCommandRe.MatchString(command) {
// Check if sudo credentials are cached using sudo -n (non-interactive)
testCmd := exec.CommandContext(cmdCtx, "sudo", "-n", "true")
testCmd.Dir = workDir
if err := testCmd.Run(); err != nil {
// Sudo needs a password - try to prompt via callback
if promptCallback := passwordPromptFromContext(ctx); promptCallback != nil {
pw, cancelled := promptCallback("Sudo password required for: " + truncateCommand(args.Command, 60))
if cancelled {
return fantasy.NewTextErrorResponse("sudo password prompt cancelled"), nil
}
if pw == "" {
return fantasy.NewTextErrorResponse("no sudo password provided"), nil
}
sudoPassword = pw
command = rewriteSudoForStdin(command)
} else {
// No callback available - return error with helpful message
return fantasy.NewTextErrorResponse(
"This command requires sudo access. " +
"Please run 'sudo -v' in your terminal first to cache credentials, " +
"or set the SUDO_PASSWORD environment variable."), nil
}
}
// Credentials are cached or password was provided, proceed
}
// If we have a sudo password, rewrite the command to use sudo -S
if sudoPassword != "" && sudoCommandRe.MatchString(command) {
command = rewriteSudoForStdin(command)
}
cmd := exec.CommandContext(cmdCtx, "bash", "-c", command)
if workDir != "" {
cmd.Dir = workDir
}
@@ -115,18 +251,18 @@ func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fa
if outputCallback != nil {
// Streaming mode: use pipes to capture output as it arrives
- return executeBashStreaming(cmdCtx, call, cmd, outputCallback)
+ return executeBashStreaming(cmdCtx, call, cmd, outputCallback, sudoPassword)
}
// Non-streaming mode: collect all output at once (original behavior)
- return executeBashBuffered(cmdCtx, call, cmd)
+ return executeBashBuffered(cmdCtx, call, cmd, sudoPassword)
}
// executeBashBuffered collects all output before returning (original behavior).
// It uses explicit pipes (not cmd.Stdout) so that cmd.WaitDelay can forcibly
// close them when grandchild processes hold pipe handles open after the
// direct child exits.
- func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd) (fantasy.ToolResponse, error) {
+ func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, sudoPassword string) (fantasy.ToolResponse, error) {
stdoutPipe, err := cmd.StdoutPipe()
if err != nil {
return fantasy.NewTextErrorResponse("failed to create stdout pipe"), nil
@@ -136,10 +272,27 @@ func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exe
return fantasy.NewTextErrorResponse("failed to create stderr pipe"), nil
}
// If we have a sudo password, create a stdin pipe and write the password
var stdinPipe io.WriteCloser
if sudoPassword != "" {
stdinPipe, err = cmd.StdinPipe()
if err != nil {
return fantasy.NewTextErrorResponse("failed to create stdin pipe"), nil
}
}
if err := cmd.Start(); err != nil {
return fantasy.NewTextErrorResponse(fmt.Sprintf("failed to start command: %v", err)), nil
}
// Write password to stdin if needed, then close stdin
if sudoPassword != "" && stdinPipe != nil {
go func() {
defer func() { _ = stdinPipe.Close() }()
_, _ = io.WriteString(stdinPipe, sudoPassword+"\n")
}()
}
// Read pipes concurrently
var wg sync.WaitGroup
var stdout, stderr strings.Builder
@@ -181,7 +334,7 @@ func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exe
}
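The stdin hand-off added to `executeBashBuffered` can be tried in isolation. What follows is a minimal sketch, not the code from this change: `feedStdin` is a hypothetical helper, and `cat` stands in for `sudo -S` so the example runs without privileges (it assumes a Unix-like system with `cat` on the PATH).

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
	"strings"
)

// feedStdin is a hypothetical helper: it starts the named program, writes
// secret plus a newline to its stdin from a goroutine, closes stdin, and
// returns whatever the program printed on stdout.
func feedStdin(secret, name string, args ...string) (string, error) {
	cmd := exec.Command(name, args...)

	// The stdin pipe has to be created before Start, as in the diff.
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return "", fmt.Errorf("stdin pipe: %w", err)
	}
	var out strings.Builder
	cmd.Stdout = &out

	if err := cmd.Start(); err != nil {
		return "", fmt.Errorf("start: %w", err)
	}

	// Write from a goroutine and always close the pipe so the child sees EOF.
	go func() {
		defer func() { _ = stdin.Close() }()
		_, _ = io.WriteString(stdin, secret+"\n")
	}()

	if err := cmd.Wait(); err != nil {
		return "", fmt.Errorf("wait: %w", err)
	}
	return out.String(), nil
}

func main() {
	out, err := feedStdin("hunter2", "cat")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("child read %q\n", out)
}
```

Closing the pipe in the goroutine matters: a child that reads stdin until EOF would otherwise never exit.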
// executeBashStreaming streams output as it arrives via the callback.
func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, outputCallback ToolOutputCallback) (fantasy.ToolResponse, error) {
func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, outputCallback ToolOutputCallback, sudoPassword string) (fantasy.ToolResponse, error) {
stdoutPipe, err := cmd.StdoutPipe()
if err != nil {
return fantasy.NewTextErrorResponse("failed to create stdout pipe"), nil
@@ -191,11 +344,28 @@ func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *ex
return fantasy.NewTextErrorResponse("failed to create stderr pipe"), nil
}
// If we have a sudo password, create a stdin pipe
var stdinPipe io.WriteCloser
if sudoPassword != "" {
stdinPipe, err = cmd.StdinPipe()
if err != nil {
return fantasy.NewTextErrorResponse("failed to create stdin pipe"), nil
}
}
// Start command execution
if err := cmd.Start(); err != nil {
return fantasy.NewTextErrorResponse(fmt.Sprintf("failed to start command: %v", err)), nil
}
// Write password to stdin if needed, then close stdin
if sudoPassword != "" && stdinPipe != nil {
go func() {
defer func() { _ = stdinPipe.Close() }()
_, _ = io.WriteString(stdinPipe, sudoPassword+"\n")
}()
}
// Stream stdout and stderr concurrently
var wg sync.WaitGroup
var mu sync.Mutex
+69
@@ -127,3 +127,72 @@ func TestBash_EmptyCommand(t *testing.T) {
t.Fatal("expected error for empty command")
}
}
func TestRewriteSudoForStdin(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{
name: "simple sudo",
input: "sudo apt update",
expected: "sudo -S -p '' apt update",
},
{
name: "sudo with env var",
input: "DEBIAN_FRONTEND=noninteractive sudo apt update",
expected: "DEBIAN_FRONTEND=noninteractive sudo -S -p '' apt update",
},
{
name: "sudo in pipeline",
input: "echo test | sudo tee /etc/test.conf",
expected: "echo test | sudo -S -p '' tee /etc/test.conf",
},
{
name: "sudo after &&",
input: "apt update && sudo apt upgrade",
expected: "apt update && sudo -S -p '' apt upgrade",
},
{
name: "already has -S flag",
input: "sudo -S apt update",
expected: "sudo -S apt update",
},
{
name: "no sudo",
input: "apt update && apt upgrade",
expected: "apt update && apt upgrade",
},
{
name: "sudo in string (should not match)",
input: "echo 'use sudo carefully'",
expected: "echo 'use sudo carefully'",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := rewriteSudoForStdin(tt.input)
if result != tt.expected {
t.Errorf("rewriteSudoForStdin(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
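The body of `rewriteSudoForStdin` is not part of this excerpt. As an illustration only, the sketch below shows one way a rewrite could satisfy the table above. `rewriteSudoSketch` and its regular expression are assumptions, not the project's implementation; the pattern only treats `sudo` as a command when it appears at the start of the string or after `|`, `;`, `&` or `(`, optionally preceded by `VAR=value` assignments.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sudoAtCommandStart is an assumed pattern, not the project's sudoCommandRe.
// It matches "sudo" in command position, with optional env assignments.
var sudoAtCommandStart = regexp.MustCompile(
	`(^|[|;&(]\s*)((?:[A-Za-z_][A-Za-z0-9_]*=\S*\s+)*)sudo\s+`)

// rewriteSudoSketch inserts "-S -p ''" after each sudo in command position,
// unless that sudo is already followed by -S.
func rewriteSudoSketch(command string) string {
	var b strings.Builder
	last := 0
	for _, loc := range sudoAtCommandStart.FindAllStringIndex(command, -1) {
		end := loc[1] // index just past "sudo" and its trailing whitespace
		b.WriteString(command[last:end])
		if !strings.HasPrefix(command[end:], "-S") {
			b.WriteString("-S -p '' ")
		}
		last = end
	}
	b.WriteString(command[last:])
	return b.String()
}

func main() {
	for _, c := range []string{
		"sudo apt update",
		"echo test | sudo tee /etc/test.conf",
		"echo 'use sudo carefully'",
	} {
		fmt.Printf("%q -> %q\n", c, rewriteSudoSketch(c))
	}
}
```

The sketch is not quote-aware: a separator followed by `sudo` inside a quoted string would still be rewritten, so a production version would likely need real shell tokenization.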
func TestSudoPasswordFromContext(t *testing.T) {
// Test with password in context
ctx := ContextWithSudoPassword(context.Background(), "secret123")
pw := sudoPasswordFromContext(ctx)
if pw != "secret123" {
t.Errorf("expected password 'secret123', got %q", pw)
}
// Test without password
ctx = context.Background()
pw = sudoPasswordFromContext(ctx)
if pw != "" {
t.Errorf("expected empty password, got %q", pw)
}
}
+2 -2
@@ -86,7 +86,7 @@ Example use cases:
},
"model": map[string]any{
"type": "string",
"description": "Optional model override (e.g. 'anthropic/claude-haiku-3-5-20241022' for faster/cheaper tasks)",
"description": "Optional model override. Empty string uses the current model.",
},
"system_prompt": map[string]any{
"type": "string",
@@ -94,7 +94,7 @@ Example use cases:
},
"timeout_seconds": map[string]any{
"type": "number",
"description": "Maximum execution time in seconds (default: 300, max: 1800)",
"description": "Maximum execution time in seconds (default: 300, max: 1800, minimum recommended: 240)",
},
},
Required: []string{"task"},
+51
@@ -1063,6 +1063,9 @@ type PrintBlockOpts struct {
type API struct {
// Event-specific registration functions (wired by the loader).
onToolCall func(func(ToolCallEvent, Context) *ToolCallResult)
onToolCallInputStart func(func(ToolCallInputStartEvent, Context))
onToolCallInputDelta func(func(ToolCallInputDeltaEvent, Context))
onToolCallInputEnd func(func(ToolCallInputEndEvent, Context))
onToolExecStart func(func(ToolExecutionStartEvent, Context))
onToolExecEnd func(func(ToolExecutionEndEvent, Context))
onToolOutput func(func(ToolOutputEvent, Context))
@@ -1099,6 +1102,26 @@ func (a *API) OnToolCall(handler func(ToolCallEvent, Context) *ToolCallResult) {
a.onToolCall(handler)
}
// OnToolCallInputStart registers a handler that fires when the LLM begins
// generating tool call arguments. The tool name is known but the full
// argument JSON is still being streamed. Useful for showing a "running"
// indicator immediately without waiting for the full arguments.
func (a *API) OnToolCallInputStart(handler func(ToolCallInputStartEvent, Context)) {
a.onToolCallInputStart(handler)
}
// OnToolCallInputDelta registers a handler that fires for each streamed
// fragment of tool call arguments as they arrive from the LLM.
func (a *API) OnToolCallInputDelta(handler func(ToolCallInputDeltaEvent, Context)) {
a.onToolCallInputDelta(handler)
}
// OnToolCallInputEnd registers a handler that fires when tool argument
// streaming is complete, before the tool call is parsed and execution begins.
func (a *API) OnToolCallInputEnd(handler func(ToolCallInputEndEvent, Context)) {
a.onToolCallInputEnd(handler)
}
// OnToolExecutionStart registers a handler for tool execution start.
func (a *API) OnToolExecutionStart(handler func(ToolExecutionStartEvent, Context)) {
a.onToolExecStart(handler)
@@ -1890,6 +1913,34 @@ type ToolCallResult struct {
func (ToolCallResult) isResult() {}
// ToolCallInputStartEvent fires when the LLM begins generating tool call
// arguments. The tool name is known but the full argument JSON is still
// being streamed.
type ToolCallInputStartEvent struct {
ToolCallID string
ToolName string
ToolKind string // Tool classification: "execute", "edit", "read", "search", "agent"
}
func (e ToolCallInputStartEvent) Type() EventType { return ToolCallInputStart }
// ToolCallInputDeltaEvent fires for each streamed fragment of tool call
// arguments as they arrive from the LLM.
type ToolCallInputDeltaEvent struct {
ToolCallID string
Delta string // JSON fragment of tool arguments
}
func (e ToolCallInputDeltaEvent) Type() EventType { return ToolCallInputDelta }
// ToolCallInputEndEvent fires when tool argument streaming is complete,
// before the tool call is parsed and execution begins.
type ToolCallInputEndEvent struct {
ToolCallID string
}
func (e ToolCallInputEndEvent) Type() EventType { return ToolCallInputEnd }
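The three input events are designed to be correlated by `ToolCallID`. The sketch below shows one way a consumer might reassemble the streamed argument JSON. The event structs are local stand-ins that copy only the fields shown above, and `argAssembler` is a hypothetical type; nothing here is part of the extension API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"sync"
)

// Local stand-ins mirroring the fields of the events in the diff.
type inputStart struct{ ToolCallID, ToolName string }
type inputDelta struct{ ToolCallID, Delta string }
type inputEnd struct{ ToolCallID string }

type pendingCall struct {
	name string
	args strings.Builder
}

// argAssembler (hypothetical) buffers argument fragments per tool call.
type argAssembler struct {
	mu      sync.Mutex
	pending map[string]*pendingCall
}

func newArgAssembler() *argAssembler {
	return &argAssembler{pending: make(map[string]*pendingCall)}
}

func (a *argAssembler) onStart(e inputStart) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.pending[e.ToolCallID] = &pendingCall{name: e.ToolName}
}

func (a *argAssembler) onDelta(e inputDelta) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if p, ok := a.pending[e.ToolCallID]; ok {
		p.args.WriteString(e.Delta)
	}
}

// onEnd returns the tool name, the assembled arguments, and whether the
// arguments form valid JSON. The buffered entry is dropped afterwards.
func (a *argAssembler) onEnd(e inputEnd) (name, args string, valid bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	p, ok := a.pending[e.ToolCallID]
	if !ok {
		return "", "", false
	}
	delete(a.pending, e.ToolCallID)
	args = p.args.String()
	return p.name, args, json.Valid([]byte(args))
}

func main() {
	asm := newArgAssembler()
	asm.onStart(inputStart{ToolCallID: "call-1", ToolName: "bash"})
	asm.onDelta(inputDelta{ToolCallID: "call-1", Delta: `{"command":`})
	asm.onDelta(inputDelta{ToolCallID: "call-1", Delta: `"ls"}`})
	name, args, valid := asm.onEnd(inputEnd{ToolCallID: "call-1"})
	fmt.Println(name, args, valid)
}
```

Individual deltas are fragments and are generally not valid JSON on their own, which is why validation waits for the end event.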
// ToolExecutionStartEvent fires when a tool begins executing.
type ToolExecutionStartEvent struct {
ToolCallID string
+15 -1
@@ -13,6 +13,19 @@ const (
// ToolCall fires before a tool executes. Handlers can block execution.
ToolCall EventType = "tool_call"
// ToolCallInputStart fires when the LLM begins generating tool call
// arguments. The tool name is known but the full argument JSON is still
// being streamed.
ToolCallInputStart EventType = "tool_call_input_start"
// ToolCallInputDelta fires for each streamed fragment of tool call
// arguments as they arrive from the LLM.
ToolCallInputDelta EventType = "tool_call_input_delta"
// ToolCallInputEnd fires when tool argument streaming is complete,
// before the tool call is parsed and execution begins.
ToolCallInputEnd EventType = "tool_call_input_end"
// ToolExecutionStart fires when a tool begins executing.
ToolExecutionStart EventType = "tool_execution_start"
@@ -88,7 +101,8 @@ const (
// AllEventTypes returns every supported event type.
func AllEventTypes() []EventType {
return []EventType{
ToolCall, ToolExecutionStart, ToolExecutionEnd, ToolResult,
ToolCall, ToolCallInputStart, ToolCallInputDelta, ToolCallInputEnd,
ToolExecutionStart, ToolExecutionEnd, ToolResult,
Input, BeforeAgentStart, AgentStart, AgentEnd,
MessageStart, MessageUpdate, MessageEnd,
SessionStart, SessionShutdown,
+5 -2
@@ -4,8 +4,8 @@ import "testing"
func TestAllEventTypes_Count(t *testing.T) {
all := AllEventTypes()
if len(all) != 21 {
t.Fatalf("expected 21 event types, got %d", len(all))
if len(all) != 24 {
t.Fatalf("expected 24 event types, got %d", len(all))
}
}
@@ -38,6 +38,9 @@ func TestEventType_TypeMethod(t *testing.T) {
want EventType
}{
{ToolCallEvent{ToolName: "test"}, ToolCall},
{ToolCallInputStartEvent{ToolCallID: "x", ToolName: "test"}, ToolCallInputStart},
{ToolCallInputDeltaEvent{ToolCallID: "x", Delta: "{"}, ToolCallInputDelta},
{ToolCallInputEndEvent{ToolCallID: "x"}, ToolCallInputEnd},
{ToolExecutionStartEvent{ToolName: "test"}, ToolExecutionStart},
{ToolExecutionEndEvent{ToolName: "test"}, ToolExecutionEnd},
{ToolResultEvent{ToolName: "test"}, ToolResult},
+18
@@ -429,6 +429,24 @@ func loadSingleExtension(path string) (*LoadedExtension, error) {
return *r
})
},
onToolCallInputStart: func(h func(ToolCallInputStartEvent, Context)) {
reg(ToolCallInputStart, func(e Event, c Context) Result {
h(e.(ToolCallInputStartEvent), c)
return nil
})
},
onToolCallInputDelta: func(h func(ToolCallInputDeltaEvent, Context)) {
reg(ToolCallInputDelta, func(e Event, c Context) Result {
h(e.(ToolCallInputDeltaEvent), c)
return nil
})
},
onToolCallInputEnd: func(h func(ToolCallInputEndEvent, Context)) {
reg(ToolCallInputEnd, func(e Event, c Context) Result {
h(e.(ToolCallInputEndEvent), c)
return nil
})
},
onToolExecStart: func(h func(ToolExecutionStartEvent, Context)) {
reg(ToolExecutionStart, func(e Event, c Context) Result {
h(e.(ToolExecutionStartEvent), c)
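Each loader entry above wraps a typed handler in a generic `HandlerFunc` by asserting the concrete event type. Purely as an illustration of that pattern, the sketch below expresses the wrapper once with a type parameter. All types here are simplified local stand-ins, `adapt` is a hypothetical helper, and the comma-ok assertion is a design choice of this sketch (the diff uses an unchecked assertion).

```go
package main

import "fmt"

// Simplified stand-ins for the extension package's types.
type Event interface{ Type() string }
type Context struct{ SessionID string }
type Result any
type HandlerFunc func(Event, Context) Result

type startEvent struct{ ToolName string }

func (startEvent) Type() string { return "tool_call_input_start" }

type endEvent struct{ ToolCallID string }

func (endEvent) Type() string { return "tool_call_input_end" }

// adapt (hypothetical) turns a typed, result-less handler into a
// HandlerFunc. Events of any other concrete type are ignored.
func adapt[E Event](h func(E, Context)) HandlerFunc {
	return func(e Event, c Context) Result {
		typed, ok := e.(E)
		if !ok {
			return nil
		}
		h(typed, c)
		return nil
	}
}

func main() {
	h := adapt(func(e startEvent, _ Context) {
		fmt.Println("tool starting:", e.ToolName)
	})
	h(startEvent{ToolName: "bash"}, Context{})
	h(endEvent{ToolCallID: "x"}, Context{}) // ignored: wrong event type
}
```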
+92 -3
@@ -1,21 +1,93 @@
package extensions
import (
"bytes"
"fmt"
"log"
"os"
"runtime"
"sort"
"strconv"
"strings"
"sync"
"github.com/spf13/viper"
)
// ---------------------------------------------------------------------------
// reentrantMu — a per-extension mutex that allows the same goroutine to
// re-enter (e.g. handler → ctx.EmitCustomEvent → handler in same extension).
// Different goroutines are serialized, preventing concurrent state mutation.
// ---------------------------------------------------------------------------
type reentrantMu struct {
mu sync.Mutex
cond *sync.Cond
owner int64 // goroutine ID that holds the lock, or 0
depth int // re-entrancy depth
}
// init initializes the reentrant mutex in-place. It must be called
// after the struct is at its final memory location (not before copying).
func (r *reentrantMu) init() {
r.cond = sync.NewCond(&r.mu)
}
// lock acquires the mutex. If the calling goroutine already holds it, the
// call succeeds immediately (re-entrant). Every call to lock must be paired
// with a call to unlock.
func (r *reentrantMu) lock() {
gid := goroutineID()
r.mu.Lock()
if r.owner == gid {
// Re-entrant: same goroutine already holds the lock.
r.depth++
r.mu.Unlock()
return
}
// Wait for the current owner to release.
for r.owner != 0 {
r.cond.Wait() // releases mu, blocks, re-acquires mu on wake
}
r.owner = gid
r.depth = 1
r.mu.Unlock()
}
// unlock releases the mutex (or decrements re-entrancy depth).
func (r *reentrantMu) unlock() {
r.mu.Lock()
r.depth--
if r.depth == 0 {
r.owner = 0
r.cond.Signal()
}
r.mu.Unlock()
}
// goroutineID extracts the current goroutine's ID from runtime.Stack output.
// Go exposes no public API for goroutine IDs, so parsing the stack header is
// a common workaround.
func goroutineID() int64 {
var buf [64]byte
n := runtime.Stack(buf[:], false)
// Stack output starts with "goroutine NNN ["
s := buf[:n]
s = s[len("goroutine "):]
s = s[:bytes.IndexByte(s, ' ')]
id, _ := strconv.ParseInt(string(s), 10, 64)
return id
}
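`goroutineID` above assumes the stack header always has the expected shape; `bytes.IndexByte` returns -1 when no space is found, which would make the slice expression panic. A defensive variant could look like the sketch below. `currentGoroutineID` is a hypothetical name, and the "goroutine NNN [" header format is an observed runtime detail rather than a documented guarantee.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"strconv"
)

// currentGoroutineID parses the ID out of the first line of runtime.Stack
// and reports false instead of panicking when the header is unexpected.
func currentGoroutineID() (int64, bool) {
	var buf [64]byte
	n := runtime.Stack(buf[:], false)

	rest, ok := bytes.CutPrefix(buf[:n], []byte("goroutine "))
	if !ok {
		return 0, false
	}
	end := bytes.IndexByte(rest, ' ')
	if end < 0 {
		return 0, false
	}
	id, err := strconv.ParseInt(string(rest[:end]), 10, 64)
	if err != nil {
		return 0, false
	}
	return id, true
}

func main() {
	id, ok := currentGoroutineID()
	fmt.Println("current goroutine:", id, ok)

	done := make(chan int64)
	go func() {
		other, _ := currentGoroutineID()
		done <- other
	}()
	fmt.Println("spawned goroutine:", <-done)
}
```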
// Runner manages loaded extensions and dispatches events to their handlers
// sequentially. Handlers execute in extension
// load order; for cancellable events the first blocking result wins.
//
// Each extension has a dedicated reentrant mutex so that handlers for the
// same extension are serialized (preventing data races on shared package-level
// state), while handlers for different extensions may execute concurrently.
type Runner struct {
extensions []LoadedExtension
extMu []reentrantMu // per-extension reentrant mutex, indexed by extension position
ctx Context
widgets map[string]WidgetConfig // keyed by widget ID
statusEntries map[string]StatusBarEntry // keyed by status key
@@ -52,7 +124,11 @@ type LoadedExtension struct {
// NewRunner creates a Runner from a set of loaded extensions.
func NewRunner(exts []LoadedExtension) *Runner {
return &Runner{extensions: exts}
mus := make([]reentrantMu, len(exts))
for i := range mus {
mus[i].init()
}
return &Runner{extensions: exts, extMu: mus}
}
// SetContext updates the runtime context (session ID, model, etc.) that is
@@ -367,6 +443,11 @@ func (r *Runner) Emit(event Event) (Result, error) {
for i := range r.extensions {
ext := &r.extensions[i]
handlers := ext.Handlers[event.Type()]
if len(handlers) == 0 {
continue
}
r.extMu[i].lock()
for _, handler := range handlers {
result, err := safeCall(handler, event, ctx)
if err != nil {
@@ -379,6 +460,7 @@ func (r *Runner) Emit(event Event) (Result, error) {
// Check for blocking/short-circuit results.
if isBlocking(result) {
r.extMu[i].unlock()
return result, nil
}
@@ -386,6 +468,7 @@ func (r *Runner) Emit(event Event) (Result, error) {
// the caller is responsible for applying the modifications.
accumulated = result
}
r.extMu[i].unlock()
}
return accumulated, nil
}
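`Emit` now has to call `unlock` on every exit path of the per-extension loop. One alternative shape, shown here only as a sketch with simplified stand-in types and a plain `sync.Mutex`, moves the loop into a helper so that a single `defer` covers the early return. `extension`, `dispatch` and `emit` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// handler returns a result and whether it should short-circuit dispatch.
type handler func(event string) (result string, blocking bool)

type extension struct {
	mu       sync.Mutex
	handlers []handler
}

// dispatch runs one extension's handlers under its lock. The deferred
// Unlock covers both the normal and the blocking return.
func (x *extension) dispatch(event string) (result string, blocking bool) {
	x.mu.Lock()
	defer x.mu.Unlock()
	for _, h := range x.handlers {
		res, block := h(event)
		if block {
			return res, true
		}
		if res != "" {
			result = res
		}
	}
	return result, false
}

// emit walks extensions in load order; the first blocking result wins.
func emit(exts []*extension, event string) string {
	accumulated := ""
	for _, x := range exts {
		if len(x.handlers) == 0 {
			continue
		}
		res, block := x.dispatch(event)
		if block {
			return res
		}
		if res != "" {
			accumulated = res
		}
	}
	return accumulated
}

func main() {
	blocker := &extension{handlers: []handler{
		func(string) (string, bool) { return "blocked", true },
	}}
	fmt.Println(emit([]*extension{blocker}, "tool_call"))
	fmt.Println(blocker.mu.TryLock()) // lock was released on the early return
}
```

The trade-off is one extra function call per extension in exchange for not having to audit each return statement for a matching unlock.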
@@ -712,11 +795,17 @@ func (r *Runner) EmitCustomEvent(name, data string) {
// Extension-registered handlers first (in load order).
for i := range r.extensions {
for _, h := range r.extensions[i].CustomEventHandlers[name] {
extHandlers := r.extensions[i].CustomEventHandlers[name]
if len(extHandlers) == 0 {
continue
}
r.extMu[i].lock()
for _, h := range extHandlers {
safeInvoke(h)
}
r.extMu[i].unlock()
}
// Then dynamic subscriptions.
// Then dynamic subscriptions (not extension-scoped, no per-ext lock).
for _, h := range dynamicHandlers {
safeInvoke(h)
}
+140
@@ -1,6 +1,7 @@
package extensions
import (
"sync"
"testing"
)
@@ -571,3 +572,142 @@ func TestRunner_ContextPrintNilSafe(t *testing.T) {
t.Fatalf("unexpected error: %v", err)
}
}
func TestRunner_ConcurrentEmitSameExtension(t *testing.T) {
// Verify that concurrent Emit calls for the same extension are serialized
// and don't cause data races on shared handler state.
var counter int
ext := makeHandlerExt("shared-state.go", map[EventType][]HandlerFunc{
SubagentStart: {
func(e Event, c Context) Result {
// Read-modify-write: racy without serialization.
v := counter
counter = v + 1
return nil
},
},
SubagentChunk: {
func(e Event, c Context) Result {
v := counter
counter = v + 1
return nil
},
},
})
r := makeRunner(ext)
var wg sync.WaitGroup
const goroutines = 20
const iterations = 50
wg.Add(goroutines)
for range goroutines {
go func() {
defer wg.Done()
for range iterations {
_, _ = r.Emit(SubagentStartEvent{ToolCallID: "x"})
_, _ = r.Emit(SubagentChunkEvent{ToolCallID: "x"})
}
}()
}
wg.Wait()
if counter != goroutines*iterations*2 {
t.Errorf("expected counter=%d, got %d (race detected)", goroutines*iterations*2, counter)
}
}
func TestRunner_ConcurrentEmitDifferentExtensions(t *testing.T) {
// Two extensions with independent state should not block each other
// and should both run correctly under concurrent Emit calls.
var counter1, counter2 int
ext1 := makeHandlerExt("ext1.go", map[EventType][]HandlerFunc{
SubagentStart: {
func(e Event, c Context) Result {
v := counter1
counter1 = v + 1
return nil
},
},
})
ext2 := makeHandlerExt("ext2.go", map[EventType][]HandlerFunc{
SubagentStart: {
func(e Event, c Context) Result {
v := counter2
counter2 = v + 1
return nil
},
},
})
r := makeRunner(ext1, ext2)
var wg sync.WaitGroup
const goroutines = 20
const iterations = 50
wg.Add(goroutines)
for range goroutines {
go func() {
defer wg.Done()
for range iterations {
_, _ = r.Emit(SubagentStartEvent{ToolCallID: "x"})
}
}()
}
wg.Wait()
expected := goroutines * iterations
if counter1 != expected {
t.Errorf("ext1 counter: expected %d, got %d", expected, counter1)
}
if counter2 != expected {
t.Errorf("ext2 counter: expected %d, got %d", expected, counter2)
}
}
func TestRunner_ReentrantEmitCustomEvent(t *testing.T) {
// Verify that a handler can call EmitCustomEvent (which dispatches to
// the same extension's custom event handlers) without deadlocking.
var order []string
ext := LoadedExtension{
Path: "reentrant.go",
Handlers: map[EventType][]HandlerFunc{
SessionStart: {
func(e Event, c Context) Result {
order = append(order, "session_start")
// Placeholder: this handler is replaced below, once the runner
// exists, with one that calls r.EmitCustomEvent re-entrantly.
return nil
},
},
},
CustomEventHandlers: map[string][]func(string){
"test-event": {
func(data string) {
order = append(order, "custom:"+data)
},
},
},
}
r := makeRunner(ext)
// Wire up the handler to call EmitCustomEvent re-entrantly.
ext.Handlers[SessionStart] = []HandlerFunc{
func(e Event, c Context) Result {
order = append(order, "session_start")
r.EmitCustomEvent("test-event", "hello")
return nil
},
}
r.extensions[0] = ext
// Rebuild mutexes after modifying extensions slice.
r.extMu = make([]reentrantMu, len(r.extensions))
for i := range r.extMu {
r.extMu[i].init()
}
_, err := r.Emit(SessionStartEvent{})
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if len(order) != 2 || order[0] != "session_start" || order[1] != "custom:hello" {
t.Errorf("expected [session_start, custom:hello], got %v", order)
}
}
+3
@@ -152,6 +152,9 @@ func Symbols() interp.Exports {
// Event structs
"ToolCallEvent": reflect.ValueOf((*ToolCallEvent)(nil)),
"ToolCallResult": reflect.ValueOf((*ToolCallResult)(nil)),
"ToolCallInputStartEvent": reflect.ValueOf((*ToolCallInputStartEvent)(nil)),
"ToolCallInputDeltaEvent": reflect.ValueOf((*ToolCallInputDeltaEvent)(nil)),
"ToolCallInputEndEvent": reflect.ValueOf((*ToolCallInputEndEvent)(nil)),
"ToolExecutionStartEvent": reflect.ValueOf((*ToolExecutionStartEvent)(nil)),
"ToolExecutionEndEvent": reflect.ValueOf((*ToolExecutionEndEvent)(nil)),
"ToolOutputEvent": reflect.ValueOf((*ToolOutputEvent)(nil)),
File diff suppressed because one or more lines are too long
+108 -6
@@ -85,6 +85,7 @@ type ThinkingLevel string
const (
ThinkingOff ThinkingLevel = "off"
ThinkingNone ThinkingLevel = "none"
ThinkingMinimal ThinkingLevel = "minimal"
ThinkingLow ThinkingLevel = "low"
ThinkingMedium ThinkingLevel = "medium"
@@ -93,12 +94,14 @@ const (
// ThinkingLevels returns the ordered list of available thinking levels for cycling.
func ThinkingLevels() []ThinkingLevel {
return []ThinkingLevel{ThinkingOff, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh}
return []ThinkingLevel{ThinkingOff, ThinkingNone, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh}
}
// thinkingBudgetTokens returns the token budget for a thinking level, or 0 for "off".
// thinkingBudgetTokens returns the token budget for a thinking level, or 0 for "off"; "none" shares the 1024-token budget of "minimal".
func thinkingBudgetTokens(level ThinkingLevel) int64 {
switch level {
case ThinkingNone:
return 1024
case ThinkingMinimal:
return 1024
case ThinkingLow:
@@ -117,6 +120,8 @@ func ThinkingLevelDescription(level ThinkingLevel) string {
switch level {
case ThinkingOff:
return "No reasoning"
case ThinkingNone:
return "Minimal reasoning (OpenAI 'none')"
case ThinkingMinimal:
return "Very brief reasoning (~1k tokens)"
case ThinkingLow:
@@ -133,7 +138,7 @@ func ThinkingLevelDescription(level ThinkingLevel) string {
// ParseThinkingLevel converts a string to a ThinkingLevel, defaulting to ThinkingOff.
func ParseThinkingLevel(s string) ThinkingLevel {
switch ThinkingLevel(s) {
case ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh:
case ThinkingNone, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh:
return ThinkingLevel(s)
default:
return ThinkingOff
@@ -251,6 +256,11 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
// via CLI flag or global config.
ApplyModelSettings(config, modelInfo)
// Auto-raise MaxTokens toward the model's known output ceiling when the
// user hasn't explicitly set --max-tokens and no per-model override
// applied. Runs after ApplyModelSettings so explicit modelSettings win.
rightSizeMaxTokens(config, modelInfo)
// Create the base provider
var result *ProviderResult
var createErr error
@@ -295,9 +305,18 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
// Only add cache options for providers that don't already have
// options set, to avoid type conflicts (e.g., Anthropic has
// different types for regular options vs cache control options).
for k, v := range cacheOpts {
if _, exists := result.ProviderOptions[k]; !exists {
result.ProviderOptions[k] = v
//
// For OpenAI Responses API models, we skip merging entirely because
// ResponsesProviderOptions and ProviderOptions are incompatible types.
skipMerge := false
if provider == "openai" && openai.IsResponsesModel(modelName) {
skipMerge = true
}
if !skipMerge {
for k, v := range cacheOpts {
if _, exists := result.ProviderOptions[k]; !exists {
result.ProviderOptions[k] = v
}
}
}
}
@@ -489,6 +508,37 @@ func validateModelConfig(config *ProviderConfig, modelInfo *ModelInfo) {
}
}
// defaultRightSizeCap bounds auto-raised MaxTokens so that we don't silently
// allocate enormous output budgets for models with very high ceilings (e.g.
// Devstral at 262144, Mistral at 128000). Users who genuinely want more can
// pass --max-tokens explicitly or set modelSettings[...].maxTokens in config.
const defaultRightSizeCap = 32768
// rightSizeMaxTokens raises config.MaxTokens toward the model's known output
// ceiling when:
// - the user has not explicitly set --max-tokens (or the KIT_MAX_TOKENS env
// var, or the top-level max-tokens key in config.yaml), AND
// - no per-model override already bumped MaxTokens (ApplyModelSettings runs
// before this function), AND
// - modelInfo.Limit.Output is known and larger than the current MaxTokens.
//
// The raised value is capped at defaultRightSizeCap to keep accidental
// allocations reasonable on very-large-output models. This prevents the
// common "ghost" where the agent's reply is silently truncated at the 8192
// default even though the selected model supports 64k or 262k output tokens.
func rightSizeMaxTokens(config *ProviderConfig, modelInfo *ModelInfo) {
if modelInfo == nil || modelInfo.Limit.Output <= 0 {
return
}
if isExplicitlySet("max-tokens") {
return
}
target := min(modelInfo.Limit.Output, defaultRightSizeCap)
if config.MaxTokens < target {
config.MaxTokens = target
}
}
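The rule implemented by `rightSizeMaxTokens` reduces to a small pure function, which makes the arithmetic easy to check by hand. The sketch below restates it with plain integers; `rightSize` and its `explicit` parameter are illustrative stand-ins for the config, model info and `isExplicitlySet` lookup used in the diff.

```go
package main

import "fmt"

// rightSizeCap mirrors defaultRightSizeCap from the diff.
const rightSizeCap = 32768

// rightSize returns the MaxTokens value to use, given the current value,
// the model's known output limit (0 or less if unknown), and whether the
// user set max-tokens explicitly.
func rightSize(current, modelOutput int, explicit bool) int {
	if modelOutput <= 0 || explicit {
		return current
	}
	target := min(modelOutput, rightSizeCap)
	if current < target {
		return target
	}
	return current
}

func main() {
	fmt.Println(rightSize(8192, 64000, false))    // raised, but capped
	fmt.Println(rightSize(4096, 8192, false))     // raised to the model ceiling
	fmt.Println(rightSize(100000, 262144, false)) // never lowered
	fmt.Println(rightSize(4096, 64000, true))     // explicit value wins
}
```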
// clearConflictingAnthropicSamplingParams ensures that temperature and top_p are
// not both sent to the Anthropic API, which rejects requests containing both.
// When both are set (typically from defaults), top_p is cleared so that
@@ -535,6 +585,8 @@ func buildOpenAIProviderOptions(config *ProviderConfig, modelName string) fantas
// Returns nil for ThinkingOff (use the model's default).
func thinkingLevelToReasoningEffort(level ThinkingLevel) *openai.ReasoningEffort {
switch level {
case ThinkingNone:
return new(openai.ReasoningEffortNone)
case ThinkingMinimal:
return new(openai.ReasoningEffortMinimal)
case ThinkingLow:
@@ -548,6 +600,56 @@ func thinkingLevelToReasoningEffort(level ThinkingLevel) *openai.ReasoningEffort
}
}
// IsValidThinkingLevelForModel checks if a thinking level is valid for the given
// model. Some OpenAI models like gpt-5.4 don't support "minimal" and require
// "none" instead.
func IsValidThinkingLevelForModel(level ThinkingLevel, modelName string) bool {
if level == ThinkingOff {
return true
}
// Check if this is an OpenAI model that doesn't support "minimal"
// gpt-5.4 and newer gpt-5.x models use "none" instead of "minimal"
if level == ThinkingMinimal {
if strings.Contains(modelName, "gpt-5.4") ||
strings.Contains(modelName, "gpt-5-pro") ||
strings.Contains(modelName, "gpt-5-chat") {
return false
}
}
// Check if this is an OpenAI model that doesn't support "none"
// Older gpt-5 models only support "minimal", not "none"
if level == ThinkingNone {
if strings.Contains(modelName, "gpt-5") &&
!strings.Contains(modelName, "gpt-5.4") &&
!strings.Contains(modelName, "gpt-5-pro") &&
!strings.Contains(modelName, "gpt-5-chat") {
// Older gpt-5 models might not support "none"
// They only added "none" support in newer versions
return false
}
}
// All other levels are generally valid for reasoning models
return true
}
// SuggestThinkingLevelFallback returns a recommended fallback level when the
// requested level is not valid for the model. Returns ThinkingOff if no
// suitable fallback exists.
func SuggestThinkingLevelFallback(level ThinkingLevel, modelName string) ThinkingLevel {
if level == ThinkingMinimal && !IsValidThinkingLevelForModel(level, modelName) {
// For models that don't support "minimal", suggest "none" (~same token budget)
return ThinkingNone
}
if level == ThinkingNone && !IsValidThinkingLevelForModel(level, modelName) {
// For models that don't support "none", suggest "minimal" (~same token budget)
return ThinkingMinimal
}
return ThinkingOff
}
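A caller would typically combine the two functions above: keep the requested level when the model accepts it, otherwise take the suggested fallback. The sketch below condenses the model-name checks from the diff into local helpers so that it runs on its own; `supports` and `resolve` are hypothetical names, and the substring checks are copied from the diff rather than taken from any provider documentation.

```go
package main

import (
	"fmt"
	"strings"
)

type ThinkingLevel string

const (
	Off     ThinkingLevel = "off"
	None    ThinkingLevel = "none"
	Minimal ThinkingLevel = "minimal"
	High    ThinkingLevel = "high"
)

// usesNone reports whether the model name matches the diff's list of
// models that take "none" instead of "minimal".
func usesNone(model string) bool {
	return strings.Contains(model, "gpt-5.4") ||
		strings.Contains(model, "gpt-5-pro") ||
		strings.Contains(model, "gpt-5-chat")
}

// supports condenses the checks of IsValidThinkingLevelForModel.
func supports(level ThinkingLevel, model string) bool {
	switch level {
	case Minimal:
		return !usesNone(model)
	case None:
		return !strings.Contains(model, "gpt-5") || usesNone(model)
	default:
		return true
	}
}

// resolve keeps a supported level and otherwise swaps minimal and none,
// falling back to Off when no swap applies.
func resolve(level ThinkingLevel, model string) ThinkingLevel {
	if supports(level, model) {
		return level
	}
	switch level {
	case Minimal:
		return None
	case None:
		return Minimal
	default:
		return Off
	}
}

func main() {
	fmt.Println(resolve(Minimal, "gpt-5.4"))
	fmt.Println(resolve(None, "gpt-5-mini"))
	fmt.Println(resolve(High, "gpt-5.4"))
}
```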
// buildAnthropicProviderOptions returns fantasy.ProviderOptions configured for
// Anthropic models with extended thinking. When thinking is enabled, it sets
// SendReasoning to true and configures the thinking budget. For thinking-off
+148
@@ -0,0 +1,148 @@
package models
import (
"testing"
"github.com/spf13/pflag"
"github.com/spf13/viper"
)
// bindMaxTokensFlag wires a fresh pflag-backed "max-tokens" key into viper so
// isExplicitlySet behaves the same way it does in production. Returns a
// cleanup function that removes the binding so sibling tests see a clean
// state.
func bindMaxTokensFlag(t *testing.T, args []string) func() {
t.Helper()
fs := pflag.NewFlagSet("test", pflag.ContinueOnError)
fs.Int("max-tokens", 8192, "")
if err := viper.BindPFlag("max-tokens", fs.Lookup("max-tokens")); err != nil {
t.Fatalf("BindPFlag: %v", err)
}
if err := fs.Parse(args); err != nil {
t.Fatalf("fs.Parse: %v", err)
}
return func() {
viper.Reset()
}
}
func TestRightSizeMaxTokens_RaisesWhenBelowCeiling(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil) // no args → flag.Changed = false
defer cleanup()
config := &ProviderConfig{MaxTokens: 8192}
modelInfo := &ModelInfo{
ID: "claude-sonnet-4-5",
Limit: Limit{Context: 200000, Output: 64000},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != 32768 {
t.Errorf("expected MaxTokens raised to defaultRightSizeCap (32768), got %d", config.MaxTokens)
}
}
func TestRightSizeMaxTokens_CapsAtDefaultRightSizeCap(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil)
defer cleanup()
config := &ProviderConfig{MaxTokens: 8192}
// Mistral Devstral has 262144 output — we should still cap at 32768.
modelInfo := &ModelInfo{
ID: "devstral-medium-latest",
Limit: Limit{Context: 262144, Output: 262144},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != defaultRightSizeCap {
t.Errorf("expected MaxTokens capped at %d, got %d", defaultRightSizeCap, config.MaxTokens)
}
}
func TestRightSizeMaxTokens_UsesExactOutputWhenBelowCap(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil)
defer cleanup()
config := &ProviderConfig{MaxTokens: 4096}
// Model with output limit smaller than the cap.
modelInfo := &ModelInfo{
ID: "gpt-4",
Limit: Limit{Context: 8192, Output: 8192},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != 8192 {
t.Errorf("expected MaxTokens raised to model output ceiling (8192), got %d", config.MaxTokens)
}
}
func TestRightSizeMaxTokens_DoesNotLowerCurrentValue(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil)
defer cleanup()
// User (via per-model settings, applied earlier) already bumped MaxTokens
// above the cap — we must not clobber their choice.
config := &ProviderConfig{MaxTokens: 100000}
modelInfo := &ModelInfo{
ID: "devstral-medium-latest",
Limit: Limit{Context: 262144, Output: 262144},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != 100000 {
t.Errorf("expected MaxTokens preserved at 100000, got %d", config.MaxTokens)
}
}
func TestRightSizeMaxTokens_RespectsExplicitFlag(t *testing.T) {
// Simulate `--max-tokens 4096` on the command line.
cleanup := bindMaxTokensFlag(t, []string{"--max-tokens", "4096"})
defer cleanup()
config := &ProviderConfig{MaxTokens: 4096}
modelInfo := &ModelInfo{
ID: "claude-sonnet-4-5",
Limit: Limit{Context: 200000, Output: 64000},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != 4096 {
t.Errorf("expected explicit --max-tokens to be preserved (4096), got %d", config.MaxTokens)
}
}
func TestRightSizeMaxTokens_NilModelInfo(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil)
defer cleanup()
config := &ProviderConfig{MaxTokens: 8192}
// Custom model / Ollama / unknown provider → no model info.
rightSizeMaxTokens(config, nil)
if config.MaxTokens != 8192 {
t.Errorf("expected MaxTokens unchanged with nil modelInfo, got %d", config.MaxTokens)
}
}
func TestRightSizeMaxTokens_ZeroOutputLimit(t *testing.T) {
cleanup := bindMaxTokensFlag(t, nil)
defer cleanup()
config := &ProviderConfig{MaxTokens: 8192}
// Model present in catalog but with no known output limit.
modelInfo := &ModelInfo{
ID: "unknown-model",
Limit: Limit{Context: 0, Output: 0},
}
rightSizeMaxTokens(config, modelInfo)
if config.MaxTokens != 8192 {
t.Errorf("expected MaxTokens unchanged with zero output limit, got %d", config.MaxTokens)
}
}
+66
@@ -0,0 +1,66 @@
package session
import (
"testing"
"github.com/mark3labs/kit/internal/message"
)
// TestCompactionParentCycleRegression tests that after multiple compactions,
// newly appended messages always have a valid parent chain and BuildContext
// returns the correct messages.
func TestCompactionParentCycleRegression(t *testing.T) {
tm := InMemoryTreeSession("/test")
// Simulate a long conversation with multiple compactions.
msg1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
msg2, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
// First compaction
comp1, _ := tm.AppendCompaction("Summary 1", msg1, 1000, 500, 1, []string{}, []string{})
msg3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
msg4, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg4"}}})
// Second compaction
comp2, _ := tm.AppendCompaction("Summary 2", msg3, 1000, 500, 1, []string{}, []string{})
msg5, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg5"}}})
msg6, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg6"}}})
// Verify parent chain integrity
for _, id := range []string{msg1, msg2, comp1, msg3, msg4, comp2, msg5, msg6} {
entry := tm.GetEntry(id)
if entry == nil {
t.Fatalf("entry %s not found in index", id)
}
}
// Walk parent chain from msg6 — must reach root without cycles
visited := make(map[string]bool)
current := msg6
for current != "" {
if visited[current] {
t.Fatalf("cycle detected at entry %s", current)
}
visited[current] = true
entry := tm.GetEntry(current)
if entry == nil {
t.Fatalf("entry %s missing from index during parent walk", current)
}
parent := ""
switch e := entry.(type) {
case *MessageEntry:
parent = e.ParentID
case *CompactionEntry:
parent = e.ParentID
}
current = parent
}
// BuildContext should return the second summary plus msg3, msg4, msg5, and msg6: 5 messages
msgs, _, _ := tm.BuildContext()
if len(msgs) != 5 {
t.Fatalf("expected 5 messages, got %d: %+v", len(msgs), msgs)
}
}
+109
@@ -0,0 +1,109 @@
package session
import (
"testing"
"github.com/mark3labs/kit/internal/message"
)
// TestDetectCycleWithCorruptedParentChain tests that cycle detection works
// when a corrupted session has circular parent references.
func TestDetectCycleWithCorruptedParentChain(t *testing.T) {
tm := InMemoryTreeSession("/test")
// Create normal chain: msg1 -> msg2 -> msg3
id1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
_, _ = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
id3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
// Simulate corruption: manually set msg1's parent to msg3, creating cycle
// This simulates the condition seen in the user's session
for _, entry := range tm.entries {
if e, ok := entry.(*MessageEntry); ok && e.ID == id1 {
e.ParentID = id3 // Create cycle: msg1 -> msg3 -> ... -> msg1
break
}
}
// DetectCycle should find the cycle
// The cycle is: id1 -> id3 -> id2 -> id1
// Walking from id3 visits id2 and id1, then revisits id3, so id3 is reported as the repeat
cycle, entry := tm.DetectCycle(id3)
if !cycle {
t.Fatal("expected to detect cycle, but none found")
}
// The cycle entry could be id1 or id3 depending on where we start
if entry != id1 && entry != id3 {
t.Fatalf("expected cycle at %s or %s, got %s", id1, id3, entry)
}
// BuildContext should still work (it has its own cycle detection)
// but will truncate at the cycle point
msgs, _, _ := tm.BuildContext()
if len(msgs) == 0 {
t.Fatal("BuildContext returned no messages")
}
}
// TestAppendMessageRejectsInvalidParent tests that AppendMessage rejects
// appending when the current leaf has a broken parent chain.
func TestAppendMessageRejectsInvalidParent(t *testing.T) {
tm := InMemoryTreeSession("/test")
// Create normal message
id1, err := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
if err != nil {
t.Fatalf("failed to append msg1: %v", err)
}
// Simulate corruption: set leafID to a non-existent ID
tm.leafID = "non-existent-id"
// Next append should fail validation
_, err = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
if err == nil {
t.Fatal("expected error when appending with invalid leafID, got nil")
}
// Restore valid leafID
tm.leafID = id1
// Append should succeed now
_, err = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
if err != nil {
t.Fatalf("failed to append msg3 after restoring leafID: %v", err)
}
}
// TestBuildContextHandlesCycleGracefully tests that BuildContext handles
// cycles gracefully by truncating the branch.
func TestBuildContextHandlesCycleGracefully(t *testing.T) {
tm := InMemoryTreeSession("/test")
// Create messages
id1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
_, _ = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
id3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
// Verify normal case works
msgs, _, _ := tm.BuildContext()
if len(msgs) != 3 {
t.Fatalf("expected 3 messages, got %d", len(msgs))
}
// Simulate cycle: set msg1's parent to msg3
for _, entry := range tm.entries {
if e, ok := entry.(*MessageEntry); ok && e.ID == id1 {
e.ParentID = id3
break
}
}
// BuildContext should handle cycle gracefully (getBranchLocked has cycle detection)
msgs, _, _ = tm.BuildContext()
// Should only include messages from the cycle: msg3, msg2, msg1
// (msg3 is leaf, walks to msg2 -> msg1 -> msg3 (cycle detected, stops))
if len(msgs) != 3 {
t.Fatalf("expected 3 messages in cycle case, got %d: %+v", len(msgs), msgs)
}
}
+37 -1
@@ -365,6 +365,9 @@ func OpenTreeSession(path string) (*TreeManager, error) {
tm.leafID = tm.EntryID(tm.entries[len(tm.entries)-1])
}
// Validate tree integrity and log diagnostics
tm.LogTreeDiagnostics()
// Open file for appending.
f, err := os.OpenFile(path, os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
@@ -410,6 +413,12 @@ func (tm *TreeManager) AppendMessage(msg message.Message) (string, error) {
tm.mu.Lock()
defer tm.mu.Unlock()
// Validate parent chain before appending to detect/prevent cycles
// that could be caused by external file corruption or race conditions.
if err := tm.validateParentChainLocked(tm.leafID, ""); err != nil {
return "", fmt.Errorf("parent chain validation failed: %w", err)
}
entry, err := NewMessageEntry(tm.leafID, msg)
if err != nil {
return "", err
@@ -518,6 +527,13 @@ func (tm *TreeManager) AppendCompaction(summary, firstKeptEntryID string, tokens
tm.mu.Lock()
defer tm.mu.Unlock()
// Validate that firstKeptEntryID exists if provided
if firstKeptEntryID != "" {
if _, ok := tm.index[firstKeptEntryID]; !ok {
return "", fmt.Errorf("first kept entry %q does not exist", firstKeptEntryID)
}
}
// The compaction entry has no parent, making it a new "root" for the
// post-compaction branch. This ensures old compacted messages are not
// traversed when walking from the current leaf.
@@ -1213,12 +1229,32 @@ func (tm *TreeManager) getBranchLocked(fromID string) []any {
}
// buildTreeNode recursively builds a TreeNode from an entry ID.
// It tracks the IDs on the current path and enforces a depth limit so that
// corrupted parent-child relationships cannot cause infinite recursion.
func (tm *TreeManager) buildTreeNode(id string) *TreeNode {
return tm.buildTreeNodeDepth(id, 0, make(map[string]bool))
}
// buildTreeNodeDepth is the internal implementation with depth tracking.
func (tm *TreeManager) buildTreeNodeDepth(id string, depth int, visited map[string]bool) *TreeNode {
const maxDepth = 1000
if depth > maxDepth {
// Cycle or extremely deep tree detected, stop recursing
return nil
}
if visited[id] {
// Cycle detected, stop recursing
return nil
}
entry, ok := tm.index[id]
if !ok {
return nil
}
visited[id] = true
defer delete(visited, id)
node := &TreeNode{
Entry: entry,
ID: id,
@@ -1226,7 +1262,7 @@ func (tm *TreeManager) buildTreeNode(id string) *TreeNode {
}
for _, childID := range tm.childIndex[id] {
child := tm.buildTreeNode(childID)
child := tm.buildTreeNodeDepth(childID, depth+1, visited)
if child != nil {
node.Children = append(node.Children, child)
}
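The visited-set recursion in buildTreeNodeDepth can be exercised outside TreeManager. The following is a minimal sketch, not the project's code: `treeNode`, `buildNode`, `countNodes`, and the `childIndex` map are hypothetical stand-ins, and the index-existence check from the diff is omitted for brevity.

```go
package main

import "fmt"

// treeNode is a hypothetical stand-in for the TreeNode type in the diff.
type treeNode struct {
	ID       string
	Children []*treeNode
}

// buildNode builds the subtree rooted at id from a parent-to-children index.
// onPath holds the IDs on the current root-to-node path; meeting one of them
// again means the index contains a cycle, so that branch is dropped.
func buildNode(childIndex map[string][]string, id string, depth int, onPath map[string]bool) *treeNode {
	const maxDepth = 1000
	if depth > maxDepth || onPath[id] {
		return nil
	}
	onPath[id] = true
	defer delete(onPath, id) // id leaves the path once its subtree is built

	node := &treeNode{ID: id}
	for _, childID := range childIndex[id] {
		if child := buildNode(childIndex, childID, depth+1, onPath); child != nil {
			node.Children = append(node.Children, child)
		}
	}
	return node
}

// countNodes returns the number of nodes in the subtree rooted at n.
func countNodes(n *treeNode) int {
	if n == nil {
		return 0
	}
	total := 1
	for _, c := range n.Children {
		total += countNodes(c)
	}
	return total
}

func main() {
	// Corrupted index in which c lists a as its child: a -> b -> c -> a.
	cyclic := map[string][]string{"a": {"b"}, "b": {"c"}, "c": {"a"}}
	root := buildNode(cyclic, "a", 0, make(map[string]bool))
	fmt.Println(countNodes(root))
}
```

Because the entry is removed from `onPath` on return, the map behaves as a path set rather than a global visited set: only a true ancestor loop is cut, and each of a, b, and c still appears once in the result.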
+143
@@ -0,0 +1,143 @@
package session
import (
"fmt"
"log"
)
// ValidateParentChain checks that the parent ID points to an existing entry
// and that appending this entry would not create a cycle. This should be called
// before appending any entry to the tree.
// Returns an error if the parent is invalid or would create a cycle.
func (tm *TreeManager) ValidateParentChain(parentID string, newEntryID string) error {
if parentID == "" {
// Empty parent is valid (root entry)
return nil
}
// Check that parent exists
if _, ok := tm.index[parentID]; !ok {
return fmt.Errorf("parent entry %q does not exist in index", parentID)
}
// Walk up the parent chain from parentID. The walk fails if it revisits an
// entry (an existing cycle) or reaches newEntryID (the new entry would
// become its own ancestor).
visited := make(map[string]bool)
current := parentID
for current != "" {
if visited[current] {
return fmt.Errorf("existing cycle detected at entry %q", current)
}
visited[current] = true
// Safety check: if somehow we reach the new entry ID, that's a cycle
if current == newEntryID {
return fmt.Errorf("would create cycle: entry %q cannot be its own ancestor", newEntryID)
}
entry, ok := tm.index[current]
if !ok {
return fmt.Errorf("broken parent chain: entry %q not found", current)
}
current = tm.entryParentID(entry)
}
return nil
}
// DetectCycle walks the parent chain from the given entry ID and reports
// whether a cycle exists, along with the first entry visited twice. It is
// used for diagnostics and does not acquire tm.mu itself.
func (tm *TreeManager) DetectCycle(fromID string) (cycleDetected bool, cycleEntry string) {
visited := make(map[string]bool)
current := fromID
for current != "" {
if visited[current] {
return true, current
}
visited[current] = true
entry, ok := tm.index[current]
if !ok {
return false, ""
}
current = tm.entryParentID(entry)
}
return false, ""
}
// LogTreeDiagnostics logs information about the tree structure for debugging.
// Call this after OpenTreeSession or when anomalies are detected.
func (tm *TreeManager) LogTreeDiagnostics() {
tm.mu.RLock()
defer tm.mu.RUnlock()
log.Printf("[TreeManager] Entry count: %d, Leaf ID: %s", len(tm.entries), tm.leafID)
// Check for cycles from leaf
if tm.leafID != "" {
if cycle, entry := tm.detectCycleLocked(tm.leafID); cycle {
log.Printf("[TreeManager] WARNING: Cycle detected in tree at entry %s", entry)
}
}
// Count entries by type
counts := make(map[EntryType]int)
for _, entry := range tm.entries {
var et EntryType
switch e := entry.(type) {
case *MessageEntry:
et = e.Type
case *ModelChangeEntry:
et = e.Type
case *BranchSummaryEntry:
et = e.Type
case *LabelEntry:
et = e.Type
case *SessionInfoEntry:
et = e.Type
case *ExtensionDataEntry:
et = e.Type
case *CompactionEntry:
et = e.Type
default:
et = "unknown"
}
counts[et]++
}
log.Printf("[TreeManager] Entry types: %+v", counts)
}
// detectCycleLocked is the internal version of DetectCycle (must hold read lock)
func (tm *TreeManager) detectCycleLocked(fromID string) (bool, string) {
visited := make(map[string]bool)
current := fromID
for current != "" {
if visited[current] {
return true, current
}
visited[current] = true
entry, ok := tm.index[current]
if !ok {
return false, ""
}
current = tm.entryParentID(entry)
}
return false, ""
}
// validateParentChainLocked is the internal version used by append methods.
// Must be called with the write lock held.
func (tm *TreeManager) validateParentChainLocked(parentID string, newEntryID string) error {
if parentID == "" {
return nil
}
if _, ok := tm.index[parentID]; !ok {
return fmt.Errorf("parent entry %q does not exist", parentID)
}
// Check for existing cycles in the parent chain
if cycle, entry := tm.detectCycleLocked(parentID); cycle {
return fmt.Errorf("existing cycle detected at entry %q in parent chain", entry)
}
return nil
}
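The parent-chain walk shared by DetectCycle and validateParentChainLocked can be tried in isolation. The following is a minimal sketch under stated assumptions, not the TreeManager implementation: a plain map from entry ID to parent ID (the hypothetical `parents` map and `detectCycle` function) stands in for tm.index and entryParentID.

```go
package main

import "fmt"

// detectCycle follows parent links from fromID. It returns true and the first
// ID visited twice if the chain loops, or false if the chain reaches a root
// (empty parent) or an ID that is missing from the map.
func detectCycle(parents map[string]string, fromID string) (bool, string) {
	visited := make(map[string]bool)
	current := fromID
	for current != "" {
		if visited[current] {
			return true, current
		}
		visited[current] = true
		parent, ok := parents[current]
		if !ok {
			return false, ""
		}
		current = parent
	}
	return false, ""
}

func main() {
	// Healthy chain: msg3 -> msg2 -> msg1 -> root.
	healthy := map[string]string{"msg1": "", "msg2": "msg1", "msg3": "msg2"}
	// Corrupted chain: msg1's parent was overwritten with msg3.
	corrupted := map[string]string{"msg1": "msg3", "msg2": "msg1", "msg3": "msg2"}

	cycle, at := detectCycle(healthy, "msg3")
	fmt.Printf("healthy: cycle=%v at=%q\n", cycle, at)
	cycle, at = detectCycle(corrupted, "msg3")
	fmt.Printf("corrupted: cycle=%v at=%q\n", cycle, at)
}
```

With the corrupted map, the walk from msg3 passes msg2 and msg1 and then returns to msg3, so msg3 is the entry reported.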
+12 -12
@@ -243,10 +243,12 @@ func (p *MCPConnectionPool) performHealthCheck(ctx context.Context, conn *MCPCon
// createConnection creates a new connection
func (p *MCPConnectionPool) createConnection(ctx context.Context, serverName string, serverConfig config.MCPServerConfig) (*MCPConnection, error) {
oauthEnabled := p.oauthFlow != nil && !serverConfig.NoOAuth
mcpClient, err := p.createMCPClient(ctx, serverName, serverConfig)
if err != nil {
// SSE transport can return OAuth error during Start()
if p.oauthFlow != nil && IsOAuthError(err) {
if oauthEnabled && IsOAuthError(err) {
if flowErr := p.oauthFlow.RunAuthFlow(ctx, serverName, err); flowErr != nil {
return nil, fmt.Errorf("OAuth authorization failed: %w", flowErr)
}
@@ -262,7 +264,7 @@ func (p *MCPConnectionPool) createConnection(ctx context.Context, serverName str
if err := p.initializeClient(ctx, mcpClient); err != nil {
// Streamable HTTP transport returns OAuth error during Initialize()
if p.oauthFlow != nil && IsOAuthError(err) {
if oauthEnabled && IsOAuthError(err) {
if flowErr := p.oauthFlow.RunAuthFlow(ctx, serverName, err); flowErr != nil {
_ = mcpClient.Close()
return nil, fmt.Errorf("OAuth authorization failed: %w", flowErr)
@@ -363,11 +365,11 @@ func (p *MCPConnectionPool) createSSEClient(ctx context.Context, serverConfig co
}
}
// Enable OAuth for remote transports when an auth handler is configured.
// The OAuthConfig uses PKCE and the handler's redirect URI. If the server
// config provides a pre-registered ClientID (for servers that don't support
// dynamic client registration, e.g. GitHub), it is passed through directly.
if p.oauthFlow != nil {
// Enable OAuth for remote transports when an auth handler is configured
// and the server hasn't opted out via NoOAuth. Public MCP servers (e.g.
// PubMed) set NoOAuth to skip dynamic client registration and token
// exchange, which would otherwise fail with a 404.
if p.oauthFlow != nil && !serverConfig.NoOAuth {
tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
if tsErr != nil {
return nil, fmt.Errorf("failed to create token store: %w", tsErr)
@@ -420,11 +422,9 @@ func (p *MCPConnectionPool) createStreamableClient(ctx context.Context, serverCo
}
}
// Enable OAuth for remote transports when an auth handler is configured.
// The OAuthConfig uses PKCE and the handler's redirect URI. If the server
// config provides a pre-registered ClientID (for servers that don't support
// dynamic client registration, e.g. GitHub), it is passed through directly.
if p.oauthFlow != nil {
// Enable OAuth for remote transports when an auth handler is configured
// and the server hasn't opted out via NoOAuth.
if p.oauthFlow != nil && !serverConfig.NoOAuth {
tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
if tsErr != nil {
return nil, fmt.Errorf("failed to create token store: %w", tsErr)
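The opt-out guard above can be illustrated with a small standalone program. This is a sketch under assumptions rather than the project's configuration code: `serverConfig` is a hypothetical, trimmed-down stand-in for MCPServerConfig, the `url` JSON key and the example URL are invented for illustration, and only the `noOAuth` key name comes from the change description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverConfig is a hypothetical stand-in for MCPServerConfig.
// The "url" key is an assumption; "noOAuth" follows the documented field name.
type serverConfig struct {
	URL     string `json:"url"`
	NoOAuth bool   `json:"noOAuth,omitempty"`
}

// oauthEnabled mirrors the guard in createConnection: OAuth is attempted only
// when a flow is configured and the server has not opted out.
func oauthEnabled(flowConfigured bool, cfg serverConfig) bool {
	return flowConfigured && !cfg.NoOAuth
}

// parseServerConfig decodes a single server entry from JSON.
func parseServerConfig(raw string) (serverConfig, error) {
	var cfg serverConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseServerConfig(`{"url": "https://mcp.example.invalid/sse", "noOAuth": true}`)
	if err != nil {
		panic(err)
	}
	fmt.Println("oauth enabled:", oauthEnabled(true, cfg))
}
```

Leaving `noOAuth` out of the JSON keeps the zero value `false`, so existing configurations would continue to use OAuth whenever a flow is configured.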
-24
@@ -69,30 +69,6 @@ func TestInputComponent_SubmitEmitsSubmitMsg(t *testing.T) {
}
}
// TestInputComponent_CtrlD_SubmitEmitsSubmitMsg verifies that ctrl+d also
// submits the text.
func TestInputComponent_CtrlD_SubmitEmitsSubmitMsg(t *testing.T) {
ctrl := &stubAppController{}
c := newTestInput(ctrl)
c.textarea.SetValue("ctrl+d submit")
c.lastValue = "ctrl+d submit"
_, cmd := sendInputMsg(c, tea.KeyPressMsg{Code: 'd', Mod: tea.ModCtrl})
msg := runCmd(cmd)
if msg == nil {
t.Fatal("expected a cmd from ctrl+d on non-empty input")
}
sm, ok := msg.(core.SubmitMsg)
if !ok {
t.Fatalf("expected submitMsg from ctrl+d, got %T", msg)
}
if sm.Text != "ctrl+d submit" {
t.Fatalf("expected Text='ctrl+d submit', got %q", sm.Text)
}
}
// TestInputComponent_EmptySubmit_NoCmd verifies that submitting an empty or
// whitespace-only string produces no cmd.
func TestInputComponent_EmptySubmit_NoCmd(t *testing.T) {
+1 -1
@@ -84,7 +84,7 @@ var SlashCommands = []SlashCommand{
},
{
Name: "/thinking",
Description: "Set thinking/reasoning level (off, minimal, low, medium, high)",
Description: "Set thinking/reasoning level (off, none, minimal, low, medium, high)",
Category: "System",
Aliases: []string{"/think"},
Complete: func(prefix string) []string {
+5
@@ -25,6 +25,11 @@ type SubmitMsg struct {
// presses ESC a second time, the canceling state is reset to false.
type CancelTimerExpiredMsg struct{}
// CtrlCResetMsg is sent after a short delay when the user presses Ctrl+C to
// clear input. If the user doesn't press Ctrl+C again within the timeout,
// the ctrlCPressedOnce flag is reset so the next Ctrl+C will clear again.
type CtrlCResetMsg struct{}
// --- Tree session events ---
// TreeNodeSelectedMsg is sent when the user selects a node in the tree selector.
+7 -1
@@ -145,7 +145,13 @@ func TestDetectMediaType(t *testing.T) {
content []byte
expected string
}{
{".go", nil, "text/plain"}, // .go falls back to content sniffing → text/plain
// An intentionally-synthetic extension that is not registered
// in any system MIME database. Exercises the "unknown ext +
// no content" branch, which must return the text/plain default.
// Do not use real extensions (e.g. .go) here: CI images often
// ship /etc/mime.types with entries like ".go → text/x-go",
// which would make the assertion environment-dependent.
{".kitsyntheticext", nil, "text/plain"},
{".png", []byte{0x89, 0x50, 0x4E, 0x47}, "image/png"},
{".jpg", []byte{0xFF, 0xD8, 0xFF}, "image/jpeg"},
{".pdf", []byte{0x25, 0x50, 0x44, 0x46}, "application/pdf"},
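The branch this test case targets (unknown extension, no content, text/plain default) can be sketched with the standard library alone. The function below is an assumption about the general shape of such a lookup, not the project's DetectMediaType: `detectMediaType` is a hypothetical name, and the order of extension lookup, content sniffing, then default is inferred from the test comments.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// detectMediaType tries the extension first, then sniffs the content, and
// finally falls back to text/plain. Parameters such as charset are stripped.
func detectMediaType(ext string, content []byte) string {
	if byExt := mime.TypeByExtension(ext); byExt != "" {
		if base, _, err := mime.ParseMediaType(byExt); err == nil {
			return base
		}
	}
	if len(content) > 0 {
		sniffed := http.DetectContentType(content)
		if base, _, err := mime.ParseMediaType(sniffed); err == nil {
			return base
		}
	}
	return "text/plain"
}

func main() {
	// A synthetic extension avoids depending on the host's MIME database.
	fmt.Println(detectMediaType(".kitsyntheticext", nil))
	// Known extension: resolved without looking at the bytes.
	fmt.Println(detectMediaType(".png", []byte{0x89, 0x50, 0x4E, 0x47}))
}
```

`mime.TypeByExtension` consults system files such as /etc/mime.types on Unix, which is why a real extension like .go can resolve differently from one machine to the next.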
+35 -4
@@ -201,7 +201,7 @@ func (s *InputComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case tea.KeyPressMsg:
if !s.showPopup {
switch msg.String() {
case "ctrl+d", "enter":
case "enter":
value := s.textarea.Value()
s.pushHistory(value)
s.textarea.SetValue("")
@@ -708,9 +708,25 @@ func (s *InputComponent) renderPopupWithOptions(centered bool) string {
}
content = indicator + displayName
} else {
nameWidth := 15
if innerWidth < 25 {
nameWidth = max(innerWidth*2/5+1, 8)
// Compute nameWidth from the longest command name in the
// visible slice so we never truncate unnecessarily.
nameWidth := 0
for _, fm := range s.filtered {
if n := len([]rune(fm.Command.Name)); n > nameWidth {
nameWidth = n
}
}
nameWidth += 3 // account for indicator prefix (2) + gap before description (1)
// Ensure descriptions still get at least 20 chars when possible.
maxForName := innerWidth - 20
if maxForName < 8 {
maxForName = innerWidth * 2 / 3
}
if nameWidth > maxForName {
nameWidth = maxForName
}
if nameWidth < 8 {
nameWidth = 8
}
maxNameChars := nameWidth - 2
displayName := sc.Name
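The width calculation in this hunk can be checked as a pure function. The following is a sketch that lifts the arithmetic out of renderPopupWithOptions; `nameColumnWidth` is a hypothetical helper that takes the command names and the inner popup width directly.

```go
package main

import "fmt"

// nameColumnWidth sizes the name column to the longest command name plus
// three cells (two for the indicator prefix, one gap), then clamps it so the
// description keeps about 20 cells where possible and the name column never
// drops below 8.
func nameColumnWidth(names []string, innerWidth int) int {
	nameWidth := 0
	for _, name := range names {
		if n := len([]rune(name)); n > nameWidth {
			nameWidth = n
		}
	}
	nameWidth += 3

	maxForName := innerWidth - 20
	if maxForName < 8 {
		// Narrow popup: give the name column two thirds of the width.
		maxForName = innerWidth * 2 / 3
	}
	if nameWidth > maxForName {
		nameWidth = maxForName
	}
	if nameWidth < 8 {
		nameWidth = 8
	}
	return nameWidth
}

func main() {
	fmt.Println(nameColumnWidth([]string{"/help", "/thinking"}, 60))
	fmt.Println(nameColumnWidth([]string{"/averyveryverylongcommand"}, 40))
}
```

In the first call the longest name has 9 runes, so the column is 12 wide; in the second the 25-rune name would need 28 cells and is clamped to 20 to leave room for the description.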
@@ -843,6 +859,21 @@ func (s *InputComponent) PendingImageCount() int {
return len(s.pendingImages)
}
// Clear clears the textarea content and resets related state. Returns true if
// there was content to clear, false if the input was already empty.
func (s *InputComponent) Clear() bool {
hadContent := s.textarea.Value() != ""
s.textarea.SetValue("")
s.textarea.CursorEnd()
s.lastValue = ""
s.showPopup = false
s.argMode = false
s.fileMode = false
s.browsingHistory = false
s.savedInput = ""
return hadContent
}
// applyFileCompletion replaces the @prefix in the textarea with the selected
// file or MCP resource suggestion. For directories, it keeps the popup open
// for further drilling. For files and resources, it closes the popup and adds
+1 -1
@@ -156,7 +156,7 @@ func (s *StreamingMessageItem) Render(width int) string {
durationMs = time.Since(s.startTime).Milliseconds()
}
ty := createTypography(style.GetTheme())
rendered = render.ReasoningBlock(s.content, durationMs, ty, style.GetTheme())
rendered = render.ReasoningBlock(s.content, durationMs, width, ty, style.GetTheme())
} else {
// Render as assistant message
rendered = render.AssistantBlock(s.content, width, style.GetTheme())
+1 -1
@@ -178,7 +178,7 @@ func (r *MessageRenderer) RenderAssistantMessage(content string, timestamp time.
// as live streaming: muted italic text with margin. This is used when resuming
// sessions to display saved reasoning content.
func (r *MessageRenderer) RenderReasoningBlock(content string, timestamp time.Time) UIMessage {
rendered := render.ReasoningBlock(content, 0, r.ty, style.GetTheme())
rendered := render.ReasoningBlock(content, 0, r.width, r.ty, style.GetTheme())
return UIMessage{
Type: AssistantMessage,
+150 -21
@@ -720,6 +720,10 @@ type AppModel struct {
// disables alt screen to restore the terminal properly.
quitting bool
// ctrlCPressedOnce is set by the first Ctrl+C, which clears the input and
// arms quit. A second Ctrl+C before the flag is reset quits the app.
ctrlCPressedOnce bool
// streamingBashOutput holds the current streaming bash output lines.
// Lines are accumulated as they arrive and displayed in the stream region.
streamingBashOutput []string
@@ -869,7 +873,7 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
m.messages = []MessageItem{}
// Wire up child components now that we have the concrete implementations.
m.input = NewInputComponent(width, "Enter your prompt (Type /help for commands, Ctrl+C to quit)", appCtrl)
m.input = NewInputComponent(width, "Enter your prompt (Type /help for commands, Ctrl+C twice to quit)", appCtrl)
// Wire up cwd for @file autocomplete.
if ic, ok := m.input.(*InputComponent); ok && opts.Cwd != "" {
@@ -1138,6 +1142,31 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.state = stateInput
if m.setModel != nil {
previousModel := m.providerName + "/" + m.modelName
// Check if thinking level needs adjustment for the new model.
// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
if m.thinkingLevel != "" && m.thinkingLevel != "off" {
parts := strings.SplitN(msg.ModelString, "/", 2)
if len(parts) == 2 {
modelName := parts[1]
currentLevel := models.ParseThinkingLevel(m.thinkingLevel)
if !models.IsValidThinkingLevelForModel(currentLevel, modelName) {
fallback := models.SuggestThinkingLevelFallback(currentLevel, modelName)
if fallback != models.ThinkingOff {
m.printSystemMessage(fmt.Sprintf(
"Note: Model %s doesn't support '%s' thinking level. Adjusted to '%s'.",
modelName, currentLevel, fallback,
))
m.thinkingLevel = string(fallback)
if m.setThinkingLevel != nil {
_ = m.setThinkingLevel(string(fallback))
}
go func() { _ = prefs.SaveThinkingLevelPreference(string(fallback)) }()
}
}
}
}
if err := m.setModel(msg.ModelString); err != nil {
m.printSystemMessage(fmt.Sprintf("Failed to switch model: %v", err))
} else {
@@ -1283,10 +1312,22 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.overlayResponseCh = nil
m.overlay = nil
}
// Set quitting flag so View() disables alt screen for clean exit.
m.quitting = true
// Graceful quit: app.Close() is deferred in cmd/root.go.
return m, tea.Quit
// Second Ctrl+C within the timeout window — quit.
if m.ctrlCPressedOnce {
m.quitting = true
return m, tea.Quit
}
// First Ctrl+C — clear input if it has content, then arm the quit flag.
if m.state == stateInput {
if ic, ok := m.input.(*InputComponent); ok {
ic.Clear()
}
}
m.ctrlCPressedOnce = true
// Start reset timer so the flag clears after 3 seconds.
return m, ctrlCResetCmd()
}
// Check extension-registered global keyboard shortcuts. These fire
@@ -1318,11 +1359,11 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.scrollList.autoScroll = true
}
return m, tea.Batch(cmds...)
case "alt+home":
case "ctrl+home":
m.scrollList.GotoTop()
m.scrollList.autoScroll = false
return m, tea.Batch(cmds...)
case "alt+end":
case "ctrl+end":
m.scrollList.GotoBottom()
m.scrollList.autoScroll = true
return m, tea.Batch(cmds...)
@@ -1330,15 +1371,10 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
// Thinking keybindings — only when the model supports reasoning.
// Note: thinking visibility toggle is under leader chord (Ctrl+X t)
// to avoid conflicts with terminal multiplexers.
if m.isReasoningModel {
switch msg.String() {
case "ctrl+t":
// Toggle thinking block visibility.
m.thinkingVisible = !m.thinkingVisible
if m.stream != nil {
m.stream.SetThinkingVisible(m.thinkingVisible)
}
return m, tea.Batch(cmds...)
case "shift+tab":
// Cycle thinking level.
m.cycleThinkingLevel()
@@ -1439,6 +1475,14 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
}
}
case "t":
// Ctrl+X t → Toggle thinking block visibility.
if m.isReasoningModel {
m.thinkingVisible = !m.thinkingVisible
if m.stream != nil {
m.stream.SetThinkingVisible(m.thinkingVisible)
}
}
case "e":
// Ctrl+X e → open $EDITOR to compose/edit the prompt.
editorApp := os.Getenv("VISUAL")
@@ -1561,10 +1605,16 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case uicore.CancelTimerExpiredMsg:
m.canceling = false
// ── Ctrl+C reset timer expired ────────────────────────────────────────────
case uicore.CtrlCResetMsg:
m.ctrlCPressedOnce = false
// ── Input submitted ──────────────────────────────────────────────────────
case uicore.SubmitMsg:
// Re-enable auto-scroll when user submits a new message.
m.scrollList.autoScroll = true
// Reset Ctrl+C flag so next Ctrl+C clears input instead of quitting.
m.ctrlCPressedOnce = false
// Handle slash commands locally — they should never reach app.Run().
// Parse once: split on the first space so argument-bearing commands
@@ -2082,6 +2132,39 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
ic.textarea.CursorEnd()
}
case app.PasswordPromptEvent:
// Sudo password prompt - show a modal input prompt
// If already in prompt state, cancel the new request
if m.state == statePrompt {
if msg.ResponseCh != nil {
msg.ResponseCh <- app.PasswordPromptResponse{Cancelled: true}
}
return m, tea.Batch(cmds...)
}
m.prePromptState = m.state
m.state = statePrompt
// Create a PromptResponse channel; the goroutine below converts its result
// into a PasswordPromptResponse for the original requester.
passwordResponseCh := make(chan app.PromptResponse, 1)
m.promptResponseCh = passwordResponseCh
// Create password input prompt (masked input)
m.prompt = newPasswordPrompt(msg.Prompt, m.width, m.height)
// Handle the response conversion
go func() {
resp := <-passwordResponseCh
if msg.ResponseCh != nil {
msg.ResponseCh <- app.PasswordPromptResponse{
Password: resp.Value,
Cancelled: resp.Cancelled,
}
}
}()
if m.prompt != nil {
cmds = append(cmds, m.prompt.Init())
}
case app.PromptRequestEvent:
// Extension wants to show an interactive prompt. Enter prompt state.
// If already in prompt state (concurrent prompt from another
@@ -2400,6 +2483,14 @@ func (m *AppModel) View() tea.View {
parts = append(parts, warning)
}
if m.ctrlCPressedOnce {
warning := lipgloss.NewStyle().
Foreground(theme.Warning).
Bold(true).
Render(" ⚠ Press Ctrl+C again to quit")
parts = append(parts, warning)
}
if !vis.HideSeparator {
parts = append(parts, m.renderSeparator())
}
@@ -2597,7 +2688,7 @@ func (m *AppModel) renderStatusBar() string {
// cycleThinkingLevel advances to the next thinking level and applies it.
func (m *AppModel) cycleThinkingLevel() {
levels := []string{"off", "minimal", "low", "medium", "high"}
levels := []string{"off", "none", "minimal", "low", "medium", "high"}
current := m.thinkingLevel
if current == "" {
current = "off"
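Only the level list and the empty-string default are visible in this hunk. As a rough sketch of how cycling over that list could work, the hypothetical `nextThinkingLevel` below assumes wrap-around from the last level to the first and a fall back to "off" for an unrecognized value; neither behavior is confirmed by the diff.

```go
package main

import "fmt"

// nextThinkingLevel returns the level that follows current in the list from
// the diff. Wrap-around and the fallback for unknown values are assumptions.
func nextThinkingLevel(current string) string {
	levels := []string{"off", "none", "minimal", "low", "medium", "high"}
	if current == "" {
		current = "off"
	}
	for i, level := range levels {
		if level == current {
			return levels[(i+1)%len(levels)]
		}
	}
	return levels[0]
}

func main() {
	level := ""
	for i := 0; i < 7; i++ {
		level = nextThinkingLevel(level)
		fmt.Println(level)
	}
}
```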
@@ -3386,7 +3477,7 @@ func (m *AppModel) printHelpMessage() {
"- `!command`: Run shell command, output included in LLM context\n" +
"- `!!command`: Run shell command, output excluded from LLM context\n\n" +
"**Keys:**\n" +
"- `Ctrl+C`: Exit at any time\n" +
"- `Ctrl+C`: Clear input and arm quit (press again to exit)\n" +
"- `ESC` (x2): Cancel ongoing LLM generation\n" +
"- `Ctrl+X s`: Steer — redirect the agent mid-turn (injected between tool calls)\n" +
"- `Ctrl+X e`: Open `$EDITOR` to compose/edit your prompt\n" +
@@ -3782,6 +3873,30 @@ func (m *AppModel) handleModelCommand(args string) tea.Cmd {
return nil
}
// Check if thinking level needs adjustment for the new model.
// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
if m.thinkingLevel != "" && m.thinkingLevel != "off" {
parts := strings.SplitN(args, "/", 2)
if len(parts) == 2 {
modelName := parts[1]
currentLevel := models.ParseThinkingLevel(m.thinkingLevel)
if !models.IsValidThinkingLevelForModel(currentLevel, modelName) {
fallback := models.SuggestThinkingLevelFallback(currentLevel, modelName)
if fallback != models.ThinkingOff {
m.printSystemMessage(fmt.Sprintf(
"Note: Model %s doesn't support '%s' thinking level. Adjusted to '%s'.",
modelName, currentLevel, fallback,
))
m.thinkingLevel = string(fallback)
if m.setThinkingLevel != nil {
_ = m.setThinkingLevel(string(fallback))
}
go func() { _ = prefs.SaveThinkingLevelPreference(string(fallback)) }()
}
}
}
}
// Direct model switch with the provided model string.
previousModel := m.providerName + "/" + m.modelName
if err := m.setModel(args); err != nil {
@@ -3886,7 +4001,7 @@ func (m *AppModel) handleThinkingCommand(args string) tea.Cmd {
// Parse and validate the level.
level := models.ParseThinkingLevel(args)
if string(level) != strings.ToLower(args) {
m.printSystemMessage(fmt.Sprintf("Unknown thinking level: %q. Use: off, minimal, low, medium, high", args))
m.printSystemMessage(fmt.Sprintf("Unknown thinking level: %q. Use: off, none, minimal, low, medium, high", args))
return nil
}
@@ -4473,6 +4588,14 @@ func cancelTimerCmd() tea.Cmd {
})
}
// ctrlCResetCmd returns a tea.Cmd that fires CtrlCResetMsg after 3s.
// This resets the ctrlCPressedOnce flag so the next Ctrl+C will clear input again.
func ctrlCResetCmd() tea.Cmd {
return tea.Tick(3*time.Second, func(_ time.Time) tea.Msg {
return uicore.CtrlCResetMsg{}
})
}
// --------------------------------------------------------------------------
// Interactive prompt support
// --------------------------------------------------------------------------
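The double-press rule can be modeled without Bubble Tea. The sketch below swaps the tea.Tick reset message for explicit timestamps, which is a different technique from the one in the diff; `quitGuard`, `newQuitGuard`, and `after` are hypothetical names, and only the 3-second window is taken from the source.

```go
package main

import (
	"fmt"
	"time"
)

// quitGuard models "press Ctrl+C twice to quit" with timestamps instead of a
// timer message. It is an illustration, not the AppModel logic.
type quitGuard struct {
	armed   bool
	armedAt time.Time
	window  time.Duration
}

// newQuitGuard returns a guard with the 3-second window used in the diff.
func newQuitGuard() *quitGuard {
	return &quitGuard{window: 3 * time.Second}
}

// Press records a Ctrl+C at time now and reports whether the app should quit.
// A press outside the window re-arms the guard instead of quitting.
func (g *quitGuard) Press(now time.Time) bool {
	if g.armed && now.Sub(g.armedAt) < g.window {
		return true
	}
	g.armed = true
	g.armedAt = now
	return false
}

// Reset disarms the guard, as submitting a message does in the diff.
func (g *quitGuard) Reset() { g.armed = false }

// after returns a fixed reference time plus the given number of seconds.
func after(seconds int) time.Time {
	epoch := time.Date(2026, 4, 21, 22, 0, 0, 0, time.UTC)
	return epoch.Add(time.Duration(seconds) * time.Second)
}

func main() {
	g := newQuitGuard()
	fmt.Println(g.Press(after(0))) // first press arms
	fmt.Println(g.Press(after(1))) // second press inside the window quits
}
```

Passing the clock in as an argument keeps the rule testable without sleeping, which is the same reason the tests in this change drive the model with CtrlCResetMsg directly rather than waiting for the timer.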
@@ -4544,9 +4667,12 @@ func (m *AppModel) updatePromptState(msg tea.Msg) (tea.Model, tea.Cmd) {
switch msg := msg.(type) {
case tea.KeyPressMsg:
if msg.String() == "ctrl+c" {
// Cancel prompt and quit the application.
// Cancel the prompt but don't quit — let the main handler's
// double-Ctrl+C logic handle quitting.
m.resolvePrompt(app.PromptResponse{Cancelled: true})
return m, tea.Quit
// Don't consume the keypress — re-dispatch so the main
// ctrl+c handler can track the double-press state.
return m.Update(msg)
}
result, cmd := m.prompt.Update(msg)
if cmd != nil {
@@ -4613,9 +4739,12 @@ func (m *AppModel) updateOverlayState(msg tea.Msg) (tea.Model, tea.Cmd) {
switch msg := msg.(type) {
case tea.KeyPressMsg:
if msg.String() == "ctrl+c" {
// Cancel overlay and quit the application.
// Cancel the overlay but don't quit — let the main handler's
// double-Ctrl+C logic handle quitting.
m.resolveOverlay(app.OverlayResponse{Cancelled: true})
return m, tea.Quit
// Don't consume the keypress — re-dispatch so the main
// ctrl+c handler can track the double-press state.
return m.Update(msg)
}
result, cmd := m.overlay.Update(msg)
if cmd != nil {
+148 -6
@@ -853,23 +853,165 @@ func TestSpinnerEvent_hideDoesNotTransitionState(t *testing.T) {
}
// --------------------------------------------------------------------------
// ctrl+c produces tea.Quit
// ctrl+c double-press to quit
// --------------------------------------------------------------------------
// TestCtrlC_producesQuit verifies that ctrl+c always returns a tea.Quit cmd.
// TestCtrlC_producesQuit verifies that double ctrl+c returns a tea.Quit cmd.
func TestCtrlC_producesQuit(t *testing.T) {
ctrl := &stubAppController{}
m, _, _ := newTestAppModel(ctrl)
// First Ctrl+C arms the quit flag.
updated, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
m = updated.(*AppModel)
if cmd == nil {
t.Fatal("expected a command after first ctrl+c, got nil")
}
// Should be a reset timer, not quit.
msg := cmd()
if _, ok := msg.(core.CtrlCResetMsg); !ok {
t.Fatalf("expected CtrlCResetMsg after first ctrl+c, got %T", msg)
}
// Second Ctrl+C should quit.
_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
if cmd == nil {
t.Fatal("expected tea.Quit cmd on second ctrl+c, got nil")
}
msg = cmd()
if _, ok := msg.(tea.QuitMsg); !ok {
t.Fatalf("expected QuitMsg from second ctrl+c, got %T", msg)
}
}
// TestCtrlC_clearsInput_firstPress tests that Ctrl+C clears input on first
// press when there's content, and requires a second press to quit.
func TestCtrlC_clearsInput_firstPress(t *testing.T) {
// Create a real InputComponent to test the clear behavior
ctrl := &stubAppController{}
m, _, _ := newTestAppModel(ctrl)
// Replace with real InputComponent that has content
input := NewInputComponent(80, "test", ctrl)
input.textarea.SetValue("some text content")
m.input = input
// First Ctrl+C should clear input, not quit
_, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
if cmd == nil {
t.Fatal("expected tea.Quit cmd on ctrl+c, got nil")
// Should have cleared the input
if input.textarea.Value() != "" {
t.Fatalf("expected input to be cleared, got %q", input.textarea.Value())
}
// Should have set ctrlCPressedOnce flag
if !m.ctrlCPressedOnce {
t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
}
// The command should be a ctrlCResetCmd (not tea.Quit)
if cmd == nil {
t.Fatal("expected a command after first Ctrl+C, got nil")
}
// We verify it's the reset command (not quit) by running it and checking the message type.
msg := cmd()
if _, ok := msg.(core.CtrlCResetMsg); !ok {
t.Fatalf("expected CtrlCResetMsg, got %T", msg)
}
// Second Ctrl+C should now quit
_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
if cmd == nil {
t.Fatal("expected tea.Quit cmd on second Ctrl+C, got nil")
}
msg = cmd()
if _, ok := msg.(tea.QuitMsg); !ok {
t.Fatalf("expected QuitMsg from ctrl+c cmd, got %T", msg)
t.Fatalf("expected QuitMsg on second Ctrl+C, got %T", msg)
}
}
// TestCtrlC_resetAfterSubmit tests that the Ctrl+C flag is reset after
// submitting a message, so the next Ctrl+C clears input again.
func TestCtrlC_resetAfterSubmit(t *testing.T) {
// Use newTestAppModel but replace the input with a real InputComponent
ctrl := &stubAppController{}
m, _, _ := newTestAppModel(ctrl)
// Replace with real InputComponent
input := NewInputComponent(80, "test", ctrl)
input.textarea.SetValue("content")
m.input = input
// First Ctrl+C clears input
updated, _ := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
m = updated.(*AppModel)
if input.textarea.Value() != "" {
t.Fatal("expected input to be cleared")
}
// Flag should be set
if !m.ctrlCPressedOnce {
t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
}
// Simulate CtrlCResetMsg being processed (timer expired)
updated, _ = m.Update(core.CtrlCResetMsg{})
m = updated.(*AppModel)
// Flag should be reset
if m.ctrlCPressedOnce {
t.Fatal("expected ctrlCPressedOnce to be false after CtrlCResetMsg")
}
// Add new content to input
input.textarea.SetValue("new content")
// Next Ctrl+C should clear again (not quit) because flag was reset
_, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
if input.textarea.Value() != "" {
t.Fatalf("expected input to be cleared again, got %q", input.textarea.Value())
}
if cmd == nil {
t.Fatal("expected a command after Ctrl+C, got nil")
}
msg := cmd()
if _, ok := msg.(core.CtrlCResetMsg); !ok {
t.Fatalf("expected CtrlCResetMsg, got %T", msg)
}
}
// TestCtrlC_emptyInput_armsQuit tests that Ctrl+C on empty input still
// requires a second press to quit (consistent double-press behavior).
func TestCtrlC_emptyInput_armsQuit(t *testing.T) {
ctrl := &stubAppController{}
m, _, _ := newTestAppModel(ctrl)
// Replace with real InputComponent (empty by default)
input := NewInputComponent(80, "test", ctrl)
m.input = input
// First Ctrl+C on empty input should arm the flag, not quit.
updated, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
m = updated.(*AppModel)
if !m.ctrlCPressedOnce {
t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
}
if cmd == nil {
t.Fatal("expected a command (reset timer), got nil")
}
msg := cmd()
if _, ok := msg.(core.CtrlCResetMsg); !ok {
t.Fatalf("expected CtrlCResetMsg, got %T", msg)
}
// Second Ctrl+C should quit.
_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
if cmd == nil {
t.Fatal("expected tea.Quit cmd on second Ctrl+C, got nil")
}
msg = cmd()
if _, ok := msg.(tea.QuitMsg); !ok {
t.Fatalf("expected QuitMsg on second Ctrl+C, got %T", msg)
}
}
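The double-press behavior these tests exercise can be pictured as a small state machine, separate from Bubble Tea. The following is a minimal sketch, not the AppModel implementation: `quitGuard`, `Press`, and `Reset` are hypothetical names standing in for the `ctrlCPressedOnce` flag and the `CtrlCResetMsg` handling, and the 3s timer is left out.

```go
package main

import "fmt"

// quitGuard is a hypothetical stand-in for the ctrlCPressedOnce flag on AppModel.
type quitGuard struct {
	armed bool
}

// Press records one Ctrl+C. It reports true only when the guard was already
// armed, i.e. on the second press before a reset arrives.
func (g *quitGuard) Press() (quit bool) {
	if g.armed {
		return true
	}
	g.armed = true
	return false
}

// Reset disarms the guard, mirroring what the tests expect after CtrlCResetMsg.
func (g *quitGuard) Reset() { g.armed = false }

func main() {
	g := &quitGuard{}
	fmt.Println("first press quits:", g.Press())
	fmt.Println("second press quits:", g.Press())
	g.Reset()
	fmt.Println("press after reset quits:", g.Press())
}
```

In the real model the reset would be driven by a timer command rather than a direct call, which is what `TestCtrlC_resetAfterSubmit` simulates by sending `core.CtrlCResetMsg{}`.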
+78 -9
@@ -19,9 +19,10 @@ import (
type promptMode string
const (
promptModeSelect promptMode = "select"
promptModeConfirm promptMode = "confirm"
promptModeInput promptMode = "input"
promptModeSelect promptMode = "select"
promptModeConfirm promptMode = "confirm"
promptModeInput promptMode = "input"
promptModePassword promptMode = "password"
)
// promptResult carries the synchronous outcome of a prompt overlay update.
@@ -102,10 +103,38 @@ func newInputPrompt(message, placeholder, defaultValue string, width, height int
}
}
// Init returns the initial command for the prompt overlay. For input mode
// this starts the cursor blink animation.
// newPasswordPrompt creates a prompt overlay for password input (masked).
func newPasswordPrompt(message string, width, height int) *promptOverlay {
ta := textarea.New()
ta.Placeholder = "Enter password"
ta.ShowLineNumbers = false
ta.Prompt = ""
ta.CharLimit = 0
ta.SetWidth(width - 12) // account for border + padding
ta.SetHeight(1)
ta.Focus()
// Prevent Enter from inserting a newline — we intercept it for submit.
ta.KeyMap.InsertNewline = key.NewBinding(
key.WithKeys("ctrl+j", "shift+enter"),
)
// Enable password masking - the textarea will show dots instead of characters
// Note: textarea doesn't have built-in password masking, so we handle it in View()
return &promptOverlay{
mode: promptModePassword,
message: message,
inputTA: ta,
width: width,
height: height,
}
}
// Init returns the initial command for the prompt overlay. For input/password
// modes this starts the cursor blink animation.
func (p *promptOverlay) Init() tea.Cmd {
if p.mode == promptModeInput {
if p.mode == promptModeInput || p.mode == promptModePassword {
return textarea.Blink
}
return nil
@@ -113,13 +142,13 @@ func (p *promptOverlay) Init() tea.Cmd {
// Update handles messages for the prompt overlay. It returns a non-nil
// *promptResult when the user completes or cancels the prompt. The returned
// tea.Cmd is for textarea blink ticks (input mode only).
// tea.Cmd is for textarea blink ticks (input/password modes only).
func (p *promptOverlay) Update(msg tea.Msg) (*promptResult, tea.Cmd) {
switch msg := msg.(type) {
case tea.WindowSizeMsg:
p.width = msg.Width
p.height = msg.Height
if p.mode == promptModeInput {
if p.mode == promptModeInput || p.mode == promptModePassword {
p.inputTA.SetWidth(p.width - 12)
}
return nil, nil
@@ -132,11 +161,13 @@ func (p *promptOverlay) Update(msg tea.Msg) (*promptResult, tea.Cmd) {
return p.updateConfirm(msg)
case promptModeInput:
return p.updateInput(msg)
case promptModePassword:
return p.updatePassword(msg)
}
}
// Pass non-key messages to textarea for blink animation.
if p.mode == promptModeInput {
if p.mode == promptModeInput || p.mode == promptModePassword {
var cmd tea.Cmd
p.inputTA, cmd = p.inputTA.Update(msg)
return nil, cmd
@@ -202,6 +233,20 @@ func (p *promptOverlay) updateInput(msg tea.KeyPressMsg) (*promptResult, tea.Cmd
}
}
func (p *promptOverlay) updatePassword(msg tea.KeyPressMsg) (*promptResult, tea.Cmd) {
switch msg.String() {
case "enter":
return &promptResult{completed: true, value: p.inputTA.Value()}, nil
case "esc":
return &promptResult{cancelled: true}, nil
default:
// Delegate character input, backspace, cursor movement, etc.
var cmd tea.Cmd
p.inputTA, cmd = p.inputTA.Update(msg)
return nil, cmd
}
}
// Render returns the prompt as a styled string for inline composition in the
// AppModel layout. The prompt replaces the normal input area (below the
// separator and above the status bar) rather than taking over the full screen.
@@ -216,6 +261,8 @@ func (p *promptOverlay) Render() string {
content = p.viewConfirm(theme)
case promptModeInput:
content = p.viewInput(theme)
case promptModePassword:
content = p.viewPassword(theme)
}
return renderContentBlock(content, p.width,
@@ -286,3 +333,25 @@ func (p *promptOverlay) viewInput(theme style.Theme) string {
return strings.Join(lines, "\n")
}
func (p *promptOverlay) viewPassword(theme style.Theme) string {
var lines []string
// Add 🔐 icon to message for password prompt
lines = append(lines, lipgloss.NewStyle().Bold(true).Foreground(theme.Text).Render("🔐 "+p.message))
lines = append(lines, "")
// Mask the password input with dots
passwordValue := p.inputTA.Value()
masked := strings.Repeat("•", len([]rune(passwordValue)))
// Render the masked password in a style that looks like input
maskedStyle := lipgloss.NewStyle().Foreground(theme.Text)
cursor := lipgloss.NewStyle().Foreground(theme.Accent).Render("█")
lines = append(lines, maskedStyle.Render(masked)+cursor)
lines = append(lines, "")
lines = append(lines, lipgloss.NewStyle().
Foreground(theme.Muted).
Render(" Enter submit Esc cancel (input is hidden)"))
return strings.Join(lines, "\n")
}
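The masking step in `viewPassword` counts runes rather than bytes, so multi-byte characters still produce one dot each. A minimal sketch of that step in isolation, assuming a hypothetical helper name `maskInput` (the diff inlines this logic and uses `len([]rune(...))`, which yields the same count):

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// maskInput is a hypothetical helper: it returns one bullet per rune of value,
// so "é" (two bytes in UTF-8) is masked as a single dot.
func maskInput(value string) string {
	return strings.Repeat("•", utf8.RuneCountInString(value))
}

func main() {
	fmt.Println(maskInput("héllo"))
}
```

Counting bytes with `len(value)` instead would render too many dots for non-ASCII input and leak information about the characters typed.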
+7 -2
@@ -63,14 +63,19 @@ func AssistantBlock(content string, width int, theme style.Theme) string {
// ReasoningBlock renders a reasoning/thinking block with muted italic text.
// If duration > 0, shows "Thought for Xs" label. Otherwise shows just "Thought".
func ReasoningBlock(content string, duration int64, ty *herald.Typography, theme style.Theme) string {
// The width parameter controls soft-wrapping so long reasoning lines don't get cut off.
func ReasoningBlock(content string, duration int64, width int, ty *herald.Typography, theme style.Theme) string {
if strings.TrimSpace(content) == "" {
return ""
}
// Match live streaming styling: muted italic text
// Match live streaming styling: muted italic text. Wrap before styling so
// ANSI sequences from italics don't interfere with width calculations.
lines := strings.Split(strings.TrimRight(content, "\n"), "\n")
contentStr := strings.TrimLeft(strings.Join(lines, "\n"), " \t\n")
if width > 4 { // mirror other blocks (User/Assistant) which subtract 4
contentStr = lipgloss.Wrap(contentStr, width-4, "")
}
mutedStyle := lipgloss.NewStyle().Foreground(theme.Muted)
contentRendered := mutedStyle.Render(ty.Italic(contentStr))
+8 -2
@@ -472,6 +472,10 @@ func (s *StreamComponent) renderReasoningBlock(reasoning string) string {
// Main content using Italic with Muted color for visual distinction.
content := strings.TrimLeft(strings.Join(lines, "\n"), " \t\n")
// Soft-wrap to the available width so long lines don't get cut off.
if s.width > 4 {
content = lipgloss.Wrap(content, s.width-4, "")
}
theme := GetTheme()
mutedStyle := lipgloss.NewStyle().Foreground(theme.Muted)
parts = append(parts, mutedStyle.Render(s.ty.Italic(content)))
@@ -588,8 +592,10 @@ func formatToolExecutionMessage(toolName string) string {
return toolName
}
// UpdateTheme refreshes the component's typography instance with colors from
// the current theme. This is called when the user changes themes via /theme.
// UpdateTheme refreshes the component's typography instance and spinner
// animation frames with colors from the current theme. This is called when
// the user changes themes via /theme.
func (s *StreamComponent) UpdateTheme() {
s.ty = createTypography(GetTheme())
s.spinnerFrames = knightRiderFrames()
}
-4
@@ -200,10 +200,6 @@ func (ts *TreeSelectorComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
case key.Matches(msg, key.NewBinding(key.WithKeys("ctrl+l"))):
ts.filter = TreeFilterLabelOnly
ts.rebuildFlatList()
case key.Matches(msg, key.NewBinding(key.WithKeys("ctrl+a"))):
ts.filter = TreeFilterAll
ts.rebuildFlatList()
default:
// Typing search.
if msg.Text != "" && len(msg.Text) == 1 {
+7 -4
@@ -224,10 +224,13 @@ kit.LLMResponse // {Content, FinishReason, Usage}
kit.LLMFilePart // {Filename, Data []byte, MediaType}
// MCP OAuth types
kit.MCPServer // *server.MCPServer for in-process MCP transport
kit.MCPServerConfig // Configuration for an MCP server (stdio, SSE, or in-process)
kit.MCPTokenStore // Persists OAuth tokens for a single MCP server
kit.MCPToken // OAuth token (access token, refresh token, expiry)
kit.MCPServer // *server.MCPServer for in-process MCP transport
kit.MCPServerConfig // Configuration for an MCP server (stdio, SSE, or in-process)
kit.MCPAuthHandler // Interface: handles user-facing OAuth authorization
kit.DefaultMCPAuthHandler // Port + callback-server mechanics; set OnAuthURL for presentation
kit.CLIMCPAuthHandler // CLI wrapper: opens browser, prints status
kit.MCPTokenStore // Persists OAuth tokens for a single MCP server
kit.MCPToken // OAuth token (access token, refresh token, expiry)
kit.MCPTokenStoreFactory // Creates an MCPTokenStore for a given server URL
// Conversion helpers
+31 -10
@@ -38,20 +38,37 @@ Guidelines:
- Be concise in your responses
- Show file paths clearly when working with files`
// setSDKDefaults registers the same viper defaults that the CLI sets via
// cobra flag bindings. This ensures the SDK behaves identically to the CLI
// even when cobra is not used.
// sdkDefaultMaxTokens is the last-resort ceiling applied when the SDK caller
// has not configured max-tokens via Options, env, config, or a per-model
// default. It matches the CLI's --max-tokens cobra default so SDK and CLI
// callers see the same base value before per-model right-sizing runs.
// It is intentionally applied on the *models.ProviderConfig struct
// (not via viper) so that viper.IsSet("max-tokens") remains false and the
// right-sizing + per-model-default paths continue to work.
const sdkDefaultMaxTokens = 8192
// setSDKDefaults registers viper defaults that match the CLI's cobra flag
// defaults for keys where SetDefault does not interfere with downstream
// viper.IsSet() checks.
//
// Keys that participate in "explicit vs unset" precedence downstream —
// max-tokens, temperature, top-p, top-k, frequency-penalty, presence-penalty,
// thinking-level — are deliberately NOT registered here. viper.SetDefault
// causes viper.IsSet() to return true, which would suppress per-model
// defaults (ApplyModelSettings) and automatic right-sizing (rightSizeMaxTokens)
// for every SDK-created Kit. Those defaults are instead applied:
//
// - max-tokens: as a last-resort struct-level floor (sdkDefaultMaxTokens)
// in kit.New() after BuildProviderConfig returns, when the resolved
// value is still zero.
// - thinking-level: handled implicitly by models.ParseThinkingLevel("")
// which returns models.ThinkingOff.
// - sampling params (temperature, top-p, top-k, frequency/presence-penalty):
// left as nil pointers so provider libraries apply their own defaults.
func setSDKDefaults() {
viper.SetDefault("model", "anthropic/claude-sonnet-4-5-20250929")
viper.SetDefault("system-prompt", defaultSystemPrompt)
viper.SetDefault("max-tokens", 4096)
viper.SetDefault("temperature", 0.7)
viper.SetDefault("top-p", 0.95)
viper.SetDefault("top-k", 40)
viper.SetDefault("frequency-penalty", 0.0)
viper.SetDefault("presence-penalty", 0.0)
viper.SetDefault("stream", true)
viper.SetDefault("thinking-level", "off")
viper.SetDefault("num-gpu-layers", -1)
viper.SetDefault("main-gpu", 0)
}
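The comment above describes max-tokens as a chain of sources with a last-resort floor. A rough sketch of that precedence, under stated assumptions: `resolveMaxTokens` and its parameters are hypothetical, zero means "unset", and automatic right-sizing is deliberately ignored here.

```go
package main

import "fmt"

// sdkFloor mirrors the sdkDefaultMaxTokens constant from the diff.
const sdkFloor = 8192

// resolveMaxTokens is a hypothetical helper. It returns the first non-zero
// candidate in precedence order (option, env, config, per-model default) and
// falls back to the floor only when every source is unset.
func resolveMaxTokens(option, env, config, perModel int) int {
	for _, v := range []int{option, env, config, perModel} {
		if v != 0 {
			return v
		}
	}
	return sdkFloor
}

func main() {
	fmt.Println(resolveMaxTokens(0, 0, 0, 0))
	fmt.Println(resolveMaxTokens(0, 0, 0, 16000))
	fmt.Println(resolveMaxTokens(4096, 0, 0, 16000))
}
```

Registering the floor with `viper.SetDefault` would make the key look explicitly set, which is the failure mode the comment says the struct-level fallback avoids.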
@@ -102,6 +119,10 @@ func InitConfig(configFile string, debug bool) error {
}
viper.SetEnvPrefix("KIT")
// Map hyphenated config keys (e.g. "max-tokens") to underscored env
// var names (e.g. KIT_MAX_TOKENS). Without this, AutomaticEnv looks
// for KIT_MAX-TOKENS and silently misses valid env overrides.
viper.SetEnvKeyReplacer(strings.NewReplacer("-", "_"))
viper.AutomaticEnv()
return nil
}
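The effect of the key replacer can be approximated without viper. This sketch assumes the lookup name is built as upper-cased prefix, underscore, key, with the replacer applied; `envName` is a hypothetical helper and not part of viper's API.

```go
package main

import (
	"fmt"
	"strings"
)

// envName is a hypothetical helper that approximates how a hyphenated config
// key is turned into an environment variable name once a "-" to "_" replacer
// is installed.
func envName(prefix, key string) string {
	replacer := strings.NewReplacer("-", "_")
	return strings.ToUpper(prefix + "_" + replacer.Replace(key))
}

func main() {
	fmt.Println(envName("KIT", "max-tokens"))
	fmt.Println(envName("KIT", "thinking-level"))
}
```

Without the replacer the same key would map to a name containing a hyphen, which most shells cannot export, so the override would never be seen.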
+136 -3
@@ -23,6 +23,14 @@ const (
EventMessageUpdate EventType = "message_update"
// EventMessageEnd fires when the assistant message is complete.
EventMessageEnd EventType = "message_end"
// EventToolCallStart fires when the LLM begins generating tool call arguments.
// The tool name is known but arguments are still streaming.
EventToolCallStart EventType = "tool_call_start"
// EventToolCallDelta fires for each streamed fragment of tool call arguments.
EventToolCallDelta EventType = "tool_call_delta"
// EventToolCallEnd fires when tool argument streaming is complete, before
// the tool call is parsed and execution begins.
EventToolCallEnd EventType = "tool_call_end"
// EventToolCall fires when a tool call has been parsed and is about to execute.
EventToolCall EventType = "tool_call"
// EventToolExecutionStart fires when a tool begins executing.
@@ -45,6 +53,8 @@ const (
// EventToolOutput fires when a tool produces streaming output chunks.
EventToolOutput EventType = "tool_output"
EventStepUsage EventType = "step_usage"
// EventPasswordPrompt fires when a sudo command needs a password.
EventPasswordPrompt EventType = "password_prompt"
// EventSteerConsumed fires when one or more steering messages have been
// injected into the agent turn via PrepareStep.
EventSteerConsumed EventType = "steer_consumed"
@@ -108,6 +118,38 @@ func parseToolArgs(toolArgs string) map[string]any {
return nil
}
// ---------------------------------------------------------------------------
// Finish reason constants
// ---------------------------------------------------------------------------
// Finish reasons reported by the LLM provider on a completed turn. These
// mirror fantasy.FinishReason string values so comparisons against
// TurnEndEvent.StopReason / TurnResult.StopReason are stable across
// providers.
const (
// FinishReasonStop: the model produced a natural stop (e.g. stop sequence
// or end-of-turn signal).
FinishReasonStop = "stop"
// FinishReasonLength: the model hit the configured max_output_tokens
// budget. The response is truncated. Surface this to the user and
// consider raising --max-tokens / KIT_MAX_TOKENS / modelSettings[...]
// .maxTokens.
FinishReasonLength = "length"
// FinishReasonToolCalls: the model stopped to emit tool calls (normal
// mid-turn state during agentic loops).
FinishReasonToolCalls = "tool-calls"
// FinishReasonContentFilter: the provider's safety filter stopped
// generation.
FinishReasonContentFilter = "content-filter"
// FinishReasonError: the model stopped because of an error.
FinishReasonError = "error"
// FinishReasonOther: provider-specific reason that doesn't map to any of
// the above.
FinishReasonOther = "other"
// FinishReasonUnknown: the provider didn't report a finish reason.
FinishReasonUnknown = "unknown"
)
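A caller might use these constants to flag truncated output. The sketch below copies two of the constants locally so it runs on its own; `describeStop` is a hypothetical helper, and the wording of its messages is illustrative only.

```go
package main

import "fmt"

// Local copies of two constants from the diff, so the sketch is self-contained.
const (
	FinishReasonStop   = "stop"
	FinishReasonLength = "length"
)

// describeStop is a hypothetical helper that turns a stop reason into a
// user-facing hint. Only the truncation case gets special treatment.
func describeStop(stopReason string) string {
	if stopReason == FinishReasonLength {
		return "response truncated: consider raising max-tokens"
	}
	return "finished: " + stopReason
}

func main() {
	fmt.Println(describeStop(FinishReasonLength))
	fmt.Println(describeStop(FinishReasonStop))
}
```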
// ---------------------------------------------------------------------------
// Concrete event structs
// ---------------------------------------------------------------------------
@@ -122,9 +164,13 @@ func (e TurnStartEvent) EventType() EventType { return EventTurnStart }
// TurnEndEvent fires after the agent finishes processing.
type TurnEndEvent struct {
Response string
Error error
StopReason string // "end_turn", "max_tokens", "tool_use", "error", etc.
Response string
Error error
// StopReason is the LLM provider's finish reason for the final step of
// the turn. Compare against the FinishReason* constants — in particular,
// FinishReasonLength indicates the response was truncated because the
// agent hit its max_output_tokens budget.
StopReason string
}
// EventType implements Event.
@@ -178,6 +224,40 @@ type MessageEndEvent struct {
// EventType implements Event.
func (e MessageEndEvent) EventType() EventType { return EventMessageEnd }
// ToolCallStartEvent fires when the LLM begins generating tool call arguments.
// The tool name is known at this point but the full arguments are still being
// streamed. UIs can use this to show a "running" indicator immediately instead
// of waiting for the full argument JSON to finish streaming.
type ToolCallStartEvent struct {
ToolCallID string // Stable ID for correlating tool lifecycle events
ToolName string
ToolKind string // Tool classification: "execute", "edit", "read", "search", "agent"
}
// EventType implements Event.
func (e ToolCallStartEvent) EventType() EventType { return EventToolCallStart }
// ToolCallDeltaEvent fires for each streamed fragment of tool call arguments.
// Useful for live-previewing artifact content as it's generated, or showing a
// progress indicator with byte count.
type ToolCallDeltaEvent struct {
ToolCallID string // Stable ID for correlating tool lifecycle events
Delta string // JSON fragment of tool arguments
}
// EventType implements Event.
func (e ToolCallDeltaEvent) EventType() EventType { return EventToolCallDelta }
// ToolCallEndEvent fires when tool argument streaming is complete, before
// the tool call is parsed and execution begins. UIs can use this to
// transition from a "generating args" state to an "executing" state.
type ToolCallEndEvent struct {
ToolCallID string // Stable ID for correlating tool lifecycle events
}
// EventType implements Event.
func (e ToolCallEndEvent) EventType() EventType { return EventToolCallEnd }
// ToolCallEvent fires when a tool call has been parsed.
type ToolCallEvent struct {
ToolCallID string // Stable ID for correlating tool lifecycle events
@@ -299,6 +379,26 @@ type SteerConsumedEvent struct {
// EventType implements Event.
func (e SteerConsumedEvent) EventType() EventType { return EventSteerConsumed }
// PasswordPromptEvent fires when a sudo command needs a password.
// The TUI should display a password prompt and send the result back via ResponseCh.
type PasswordPromptEvent struct {
// Prompt is the message to display to the user.
Prompt string
// ResponseCh receives the password from the TUI.
// The TUI must send exactly one value: (password, false) for submit
// or ("", true) for cancel.
ResponseCh chan<- PasswordPromptResponse
}
// PasswordPromptResponse carries the password prompt result.
type PasswordPromptResponse struct {
Password string
Cancelled bool
}
// EventType implements Event.
func (e PasswordPromptEvent) EventType() EventType { return EventPasswordPrompt }
// ---------------------------------------------------------------------------
// EventBus
// ---------------------------------------------------------------------------
@@ -362,6 +462,39 @@ func (m *Kit) OnToolCall(handler func(ToolCallEvent)) func() {
})
}
// OnToolCallStart registers a handler that fires only for ToolCallStartEvent.
// This fires when the LLM begins generating tool call arguments — before the
// full argument JSON is available. Returns an unsubscribe function.
func (m *Kit) OnToolCallStart(handler func(ToolCallStartEvent)) func() {
return m.Subscribe(func(e Event) {
if tcs, ok := e.(ToolCallStartEvent); ok {
handler(tcs)
}
})
}
// OnToolCallDelta registers a handler that fires only for ToolCallDeltaEvent.
// Each delta contains a JSON fragment of tool call arguments as they stream in.
// Returns an unsubscribe function.
func (m *Kit) OnToolCallDelta(handler func(ToolCallDeltaEvent)) func() {
return m.Subscribe(func(e Event) {
if tcd, ok := e.(ToolCallDeltaEvent); ok {
handler(tcd)
}
})
}
// OnToolCallEnd registers a handler that fires only for ToolCallEndEvent.
// This fires when tool argument streaming is complete, before the tool call
// is parsed and execution begins. Returns an unsubscribe function.
func (m *Kit) OnToolCallEnd(handler func(ToolCallEndEvent)) func() {
return m.Subscribe(func(e Event) {
if tce, ok := e.(ToolCallEndEvent); ok {
handler(tce)
}
})
}
// OnToolResult registers a handler that fires only for ToolResultEvent.
// Returns an unsubscribe function.
func (m *Kit) OnToolResult(handler func(ToolResultEvent)) func() {
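One way a consumer could use the three streaming events is to reassemble the argument JSON per tool call. This is a sketch under assumptions: the event structs are local copies of the shapes shown in the diff, `argCollector` and its methods are hypothetical, and the collector is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the event shapes from the diff, so the sketch runs alone.
type ToolCallStartEvent struct{ ToolCallID, ToolName, ToolKind string }
type ToolCallDeltaEvent struct{ ToolCallID, Delta string }
type ToolCallEndEvent struct{ ToolCallID string }

// argCollector is a hypothetical consumer that buffers argument fragments
// keyed by ToolCallID.
type argCollector struct {
	pending map[string]*strings.Builder
}

func newArgCollector() *argCollector {
	return &argCollector{pending: map[string]*strings.Builder{}}
}

// Start opens a buffer for a new tool call.
func (c *argCollector) Start(e ToolCallStartEvent) {
	c.pending[e.ToolCallID] = &strings.Builder{}
}

// Delta appends one fragment; fragments for unknown IDs are dropped.
func (c *argCollector) Delta(e ToolCallDeltaEvent) {
	if b, ok := c.pending[e.ToolCallID]; ok {
		b.WriteString(e.Delta)
	}
}

// End returns the accumulated arguments and forgets the call.
func (c *argCollector) End(e ToolCallEndEvent) (string, bool) {
	b, ok := c.pending[e.ToolCallID]
	if !ok {
		return "", false
	}
	delete(c.pending, e.ToolCallID)
	return b.String(), true
}

func main() {
	c := newArgCollector()
	c.Start(ToolCallStartEvent{ToolCallID: "call-1", ToolName: "write"})
	c.Delta(ToolCallDeltaEvent{ToolCallID: "call-1", Delta: `{"path":`})
	c.Delta(ToolCallDeltaEvent{ToolCallID: "call-1", Delta: `"a.txt"}`})
	args, _ := c.End(ToolCallEndEvent{ToolCallID: "call-1"})
	fmt.Println(args)
}
```

With the SDK, the three methods would presumably be passed to `OnToolCallStart`, `OnToolCallDelta`, and `OnToolCallEnd`; a mutex would be needed if handlers can run on different goroutines.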
+32
@@ -100,6 +100,38 @@ func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
})
}
// Tool call input streaming events — fire as the LLM generates tool arguments.
if runner.HasHandlers(extensions.ToolCallInputStart) {
m.Subscribe(func(e Event) {
if ev, ok := e.(ToolCallStartEvent); ok {
_, _ = runner.Emit(extensions.ToolCallInputStartEvent{
ToolCallID: ev.ToolCallID,
ToolName: ev.ToolName,
ToolKind: ev.ToolKind,
})
}
})
}
if runner.HasHandlers(extensions.ToolCallInputDelta) {
m.Subscribe(func(e Event) {
if ev, ok := e.(ToolCallDeltaEvent); ok {
_, _ = runner.Emit(extensions.ToolCallInputDeltaEvent{
ToolCallID: ev.ToolCallID,
Delta: ev.Delta,
})
}
})
}
if runner.HasHandlers(extensions.ToolCallInputEnd) {
m.Subscribe(func(e Event) {
if ev, ok := e.(ToolCallEndEvent); ok {
_, _ = runner.Emit(extensions.ToolCallInputEndEvent{
ToolCallID: ev.ToolCallID,
})
}
})
}
if runner.HasHandlers(extensions.AgentEnd) {
m.Subscribe(func(e Event) {
if ev, ok := e.(TurnEndEvent); ok {
+286 -27
@@ -51,6 +51,7 @@ type Kit struct {
bufferedLogger *tools.BufferedDebugLogger
authHandler MCPAuthHandler // OAuth handler for remote MCP servers (may need Close)
opts *Options // stored for reload operations (skills, etc.)
mcpConfig *config.Config // loaded MCP/server config, shared with subagents
// hasCustomSystemPrompt is true when the user explicitly configured a
// system prompt (via --system-prompt flag, config file, or SDK option).
@@ -542,6 +543,23 @@ func (m *Kit) SetModel(ctx context.Context, modelString string) error {
systemPrompt, _ := config.LoadSystemPrompt(viper.GetString("system-prompt"))
thinkingLevel := models.ParseThinkingLevel(viper.GetString("thinking-level"))
// Validate and adjust thinking level for the target model.
// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
if thinkingLevel != models.ThinkingOff {
parts := strings.SplitN(modelString, "/", 2)
if len(parts) == 2 {
modelName := parts[1]
if !models.IsValidThinkingLevelForModel(thinkingLevel, modelName) {
fallback := models.SuggestThinkingLevelFallback(thinkingLevel, modelName)
if fallback != models.ThinkingOff {
// Adjust the thinking level in viper so the change persists.
viper.Set("thinking-level", string(fallback))
thinkingLevel = fallback
}
}
}
}
// With message-level caching, thinking and caching can work together.
// No need to disable caching when thinking is enabled.
cfg := &models.ProviderConfig{
@@ -810,6 +828,29 @@ func (m *Kit) ExecuteCompletion(ctx context.Context, req extensions.CompleteRequ
// Options configures Kit creation with optional overrides for model,
// prompts, configuration, and behavior settings. All fields are optional
// and will use CLI defaults if not specified.
//
// Global viper state warning:
// Options are applied by [New] via [viper.Set] calls against viper's
// process-global store. This store is shared with every downstream reader
// (e.g. [Kit.SetModel], [Kit.GetThinkingLevel], BuildProviderConfig, and
// any other code path that calls viper.Get*). Two consequences:
//
// 1. Kit instances are NOT isolated from each other within a single
// process. Values set by the second New() call overwrite the first,
// and any code that later reads viper will see the most recent Set.
// 2. Fields left at the zero value do NOT clear prior viper state; they
// simply skip the viper.Set. Callers that need a clean slate between
// constructions should invoke viper.Reset() (the test suite uses a
// private resetViper() helper that wraps it) before the next New().
//
// Recommended usage: create one Kit per process, or reset viper between
// constructions. Concurrent calls to New are serialized internally by
// [viperInitMu], but that mutex does not prevent later viper reads (from
// a different Kit) from observing mutated keys.
//
// TODO: refactor New to use a per-instance *viper.Viper (constructed via
// viper.New()) so each Kit owns its own isolated config store and Options
// no longer leak through the global singleton.
type Options struct {
Model string // Override model (e.g., "anthropic/claude-sonnet-4-5-20250929")
SystemPrompt string // Override system prompt
@@ -820,6 +861,76 @@ type Options struct {
Tools []Tool // Custom tool set. If empty, AllTools() is used.
ExtraTools []Tool // Additional tools added alongside core/MCP/extension tools.
// Generation parameters. These override the corresponding values from
// .kit.yml / KIT_* environment variables. Leaving a field at its
// zero/nil value means "use the configured default", which in turn
// falls back to per-model defaults (modelSettings / customModels) and
// finally to a last-resort SDK floor of 8192 for MaxTokens (matching
// the CLI --max-tokens default; sampling params fall through to
// provider-level defaults).
//
// Pointer types are used for sampling parameters so the SDK can
// distinguish "explicitly set to 0" from "leave alone".
// MaxTokens overrides the maximum output tokens per LLM response.
// 0 = let the precedence chain resolve a value (env → config →
// per-model → 8192 SDK floor, matching the CLI default). Setting a
// non-zero value here suppresses automatic right-sizing, matching
// the CLI's --max-tokens flag semantics. Bump this when generating
// long outputs (HTML artifacts, large refactors, etc.) to avoid
// silent truncation mid-tool-call. The cap also applies after
// model switches via [Kit.SetModel].
MaxTokens int
// ThinkingLevel sets the reasoning effort for models that support
// extended thinking. Valid values: "off", "none", "minimal", "low",
// "medium", "high". "" = let the precedence chain resolve a level
// (env → config → per-model → "off"). Use [Kit.SetThinkingLevel]
// to change at runtime.
ThinkingLevel string
// Temperature controls sampling randomness (typically 0.0–2.0).
// nil = leave provider/per-model default in place. Pointer type
// so explicit 0.0 (deterministic) is distinguishable from "unset".
Temperature *float32
// TopP is the nucleus-sampling cutoff (0.0–1.0).
// nil = leave provider/per-model default in place.
TopP *float32
// TopK limits sampling to the top K tokens.
// nil = leave provider/per-model default in place.
TopK *int32
// FrequencyPenalty discourages repeated tokens (OpenAI-family models).
// nil = leave provider/per-model default in place.
FrequencyPenalty *float32
// PresencePenalty discourages repeating topics (OpenAI-family models).
// nil = leave provider/per-model default in place.
PresencePenalty *float32
// Provider configuration. These override values normally read from
// .kit.yml or provider-specific environment variables. Useful when
// loading credentials from a secrets manager, pointing at custom
// OpenAI-compatible endpoints (LiteLLM, vLLM, Azure OpenAI, internal
// proxies), or running against self-hosted infrastructure.
// ProviderAPIKey overrides the API key used to authenticate with the
// model provider. "" = use the value from config or the
// provider-specific environment variable.
ProviderAPIKey string
// ProviderURL overrides the provider endpoint. "" = use the provider's
// default URL.
ProviderURL string
// TLSSkipVerify disables TLS certificate verification on provider
// HTTP clients. Only set this for self-signed certificates in
// development. Once enabled here it cannot be disabled via Options
// (use the config file or env var to opt back out).
TLSSkipVerify bool
// SkipConfig, when true, skips loading .kit.yml configuration files.
// Viper defaults (setSDKDefaults) and environment variables (KIT_*)
// are still applied. Use this for fully programmatic configuration.
@@ -849,6 +960,13 @@ type Options struct {
// (e.g. AGENTS.md) from the working directory.
NoContextFiles bool
// MCPConfig provides a pre-loaded MCP configuration. When set,
// LoadAndValidateConfig is skipped during Kit creation — avoiding
// viper access entirely. This is set automatically for in-process
// subagents (inheriting the parent's loaded config) and can be used
// by SDK consumers who build config programmatically.
MCPConfig *config.Config
// InProcessMCPServers registers mcp-go servers that run in the same
// process. Each key is the server name (used to prefix tool names, e.g.
// "docs__search"). The value must be a *[server.MCPServer].
@@ -879,15 +997,23 @@ type Options struct {
Debug bool
// MCPAuthHandler handles OAuth authorization for remote MCP servers.
// When set, remote transports (streamable HTTP, SSE) are configured with
// OAuth support. If the server returns a 401, the handler is invoked to
// let the user authorize via browser.
// When set, remote transports (streamable HTTP, SSE) are configured
// with OAuth support. If the server returns a 401, the handler is
// invoked to let the user authorize.
//
// If nil, a [DefaultMCPAuthHandler] is created automatically — opening the
// system browser and listening on a local callback server.
// If nil, OAuth is disabled: remote MCP servers requiring authorization
// will fail to connect and the underlying authorization-required error
// is surfaced to the caller. The SDK deliberately does not construct a
// default handler — doing so would bind a local TCP port and trigger
// presentation I/O (browser open, stderr writes) without the consumer
// opting in, which is wrong for library, daemon, or web-app embedders.
//
// Set to a custom implementation to control the authorization UX (e.g.
// display a URL in a custom UI, redirect to a web app, etc.).
// CLI consumers: pass [NewCLIMCPAuthHandler] to get the standard
// "open browser + print status" behavior.
//
// Custom UX: implement [MCPAuthHandler] directly, or use
// [DefaultMCPAuthHandler] and set its OnAuthURL hook to plug in your
// own presentation (TUI modal, QR code, web redirect, etc.).
MCPAuthHandler MCPAuthHandler
// MCPTokenStoreFactory, if non-nil, is called to create a token store for
@@ -971,14 +1097,29 @@ func InitTreeSession(opts *Options) (*session.TreeManager, error) {
return session.CreateTreeSession(sessionDir)
}
// viperInitMu serializes viper writes during [New]. Viper's global state
// is not thread-safe, so concurrent calls (e.g. parallel subagent spawns)
// must not overlap the Set/Get window. Note that this mutex only protects
// the construction window — it does not isolate long-lived Kit instances
// from each other. See the "Global viper state warning" on [Options].
var viperInitMu sync.Mutex
// New creates a Kit instance using the same initialization as the CLI.
// It loads configuration, initializes MCP servers, creates the LLM model, and
// sets up the agent for interaction. Returns an error if initialization fails.
// viperInitMu serializes viper writes during kit.New(). Viper's global state
// is not thread-safe, so concurrent calls (e.g. parallel subagent spawns)
// must not overlap the Set()/Get() window.
var viperInitMu sync.Mutex
//
// Global viper state warning: fields on [Options] are applied by calling
// [viper.Set] on viper's process-global store. As a result, two Kits
// constructed in the same process are NOT isolated: the second New
// overwrites viper keys set by the first, and any downstream reader
// (e.g. [Kit.SetModel], [Kit.GetThinkingLevel]) will observe the most
// recent value. Callers that need multiple independent Kits should call
// viper.Reset() between constructions, or avoid constructing more than
// one Kit per process. Writes during New are serialized by [viperInitMu].
//
// TODO: refactor to use a per-call viper.New() instance so each Kit owns
// its own isolated config store and Options stop leaking through the
// global singleton.
func New(ctx context.Context, opts *Options) (*Kit, error) {
if opts == nil {
opts = &Options{}
@@ -1039,6 +1180,47 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
}
viper.Set("stream", opts.Streaming)
// Generation parameter overrides. Each Options field, when set,
// is pushed into viper here so the existing downstream code
// (BuildProviderConfig, SetModel, modelSettings lookups) picks
// it up uniformly. Pointer-typed sampling params use viper.Set
// only when non-nil so that nil means "leave provider/per-model
// default in place" (BuildProviderConfig keys off viper.IsSet).
if opts.MaxTokens > 0 {
viper.Set("max-tokens", opts.MaxTokens)
}
if opts.ThinkingLevel != "" {
viper.Set("thinking-level", opts.ThinkingLevel)
}
if opts.Temperature != nil {
viper.Set("temperature", *opts.Temperature)
}
if opts.TopP != nil {
viper.Set("top-p", *opts.TopP)
}
if opts.TopK != nil {
viper.Set("top-k", *opts.TopK)
}
if opts.FrequencyPenalty != nil {
viper.Set("frequency-penalty", *opts.FrequencyPenalty)
}
if opts.PresencePenalty != nil {
viper.Set("presence-penalty", *opts.PresencePenalty)
}
// Provider overrides. TLSSkipVerify only takes effect when true —
// callers wanting to force-disable should use the config file or
// env var instead.
if opts.ProviderAPIKey != "" {
viper.Set("provider-api-key", opts.ProviderAPIKey)
}
if opts.ProviderURL != "" {
viper.Set("provider-url", opts.ProviderURL)
}
if opts.TLSSkipVerify {
viper.Set("tls-skip-verify", true)
}
// Resolve working directory for context/skill discovery.
cwd = opts.SessionDir
if cwd == "" {
@@ -1124,6 +1306,17 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
if pcErr != nil {
return fmt.Errorf("failed to build provider config: %w", pcErr)
}
// SDK last-resort max-tokens floor. When nothing (Options, env,
// config, or a per-model default) supplied a value, we land on
// zero here (viper.GetInt returns 0 for unset keys). Apply the
// SDK default directly on the struct rather than via viper so
// viper.IsSet("max-tokens") stays false: downstream right-sizing
// can still raise this toward the model's known output ceiling,
// and per-model modelSettings[...].maxTokens can still win.
if providerConfig.MaxTokens == 0 && opts.MaxTokens == 0 {
providerConfig.MaxTokens = sdkDefaultMaxTokens
}
modelString = viper.GetString("model")
debug = viper.GetBool("debug")
noExtensions = opts.NoExtensions || viper.GetBool("no-extensions")
@@ -1136,8 +1329,11 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
}
// ---- viperInitMu released — heavy I/O below runs concurrently ----
// Load MCP configuration. Use pre-loaded config if provided via CLI options.
if opts.CLI != nil && opts.CLI.MCPConfig != nil {
// Load MCP configuration. Use pre-loaded config if provided directly,
// via CLI options, or load from viper as a last resort.
if opts.MCPConfig != nil {
mcpConfig = opts.MCPConfig
} else if opts.CLI != nil && opts.CLI.MCPConfig != nil {
mcpConfig = opts.CLI.MCPConfig
}
if mcpConfig == nil {
@@ -1191,20 +1387,19 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
OnMCPServerLoaded: opts.OnMCPServerLoaded,
}
// Set up OAuth handler for remote MCP servers.
// Set up OAuth handler for remote MCP servers. The SDK does not create
// a default handler: auto-construction would bind a local TCP port and
// (historically) shell out to a browser without the consumer asking,
// which is a surprise for library/daemon/web-app embedders. Consumers
// that want CLI behavior pass a [CLIMCPAuthHandler] explicitly; other
// consumers implement [MCPAuthHandler] themselves. If nil, remote MCP
// servers requiring OAuth will fail to connect with the underlying
// authorization-required error surfaced to the caller.
//
// The SDK MCPAuthHandler interface is structurally identical to
// tools.MCPAuthHandler, so any implementation satisfies both.
if opts.MCPAuthHandler != nil {
setupOpts.AuthHandler = opts.MCPAuthHandler
} else {
// Create a default handler that opens the system browser.
defaultHandler, authErr := NewDefaultMCPAuthHandler()
if authErr != nil {
// Non-fatal: OAuth just won't be available for remote servers.
log.Printf("WARN Failed to create OAuth handler; remote MCP servers requiring auth will fail: %v", authErr)
} else {
setupOpts.AuthHandler = defaultHandler
}
}
// Set up custom token store factory for MCP OAuth tokens.
@@ -1258,6 +1453,7 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
bufferedLogger: agentResult.BufferedLogger,
authHandler: setupOpts.AuthHandler,
opts: opts,
mcpConfig: mcpConfig,
hasCustomSystemPrompt: hasCustomSystemPrompt,
beforeToolCall: beforeToolCall,
afterToolResult: afterToolResult,
@@ -1439,8 +1635,9 @@ type TurnResult struct {
Response string
// StopReason indicates why the turn ended. Derived from the LLM
// provider's finish reason: "stop", "length" (max tokens), "tool-calls",
// "content-filter", "error", "other", "unknown".
// provider's finish reason: FinishReasonStop, FinishReasonLength (max
// output tokens reached), FinishReasonToolCalls, FinishReasonContentFilter,
// FinishReasonError, FinishReasonOther, FinishReasonUnknown.
StopReason string
// SessionID is the UUID of the session this turn belongs to.
@@ -1582,13 +1779,15 @@ func (m *Kit) Subagent(ctx context.Context, cfg SubagentConfig) (*SubagentResult
tools = SubagentTools()
}
// Create child Kit instance.
// Create child Kit instance. Pass the parent's loaded MCP config to
// avoid re-reading viper (which races with concurrent subagent spawns).
childOpts := &Options{
Model: model,
SystemPrompt: systemPrompt,
Tools: tools,
NoSession: cfg.NoSession,
Quiet: true,
MCPConfig: m.mcpConfig,
}
child, err := New(ctx, childOpts)
if err != nil {
@@ -1809,6 +2008,37 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
CacheWriteTokens: uint64(cacheCreationTokens),
})
},
// Password prompt handler for sudo commands
func(prompt string) (string, bool) {
// Emit event to TUI and wait for response via channel
responseCh := make(chan PasswordPromptResponse, 1)
m.events.emit(PasswordPromptEvent{
Prompt: prompt,
ResponseCh: responseCh,
})
// Wait for response (TUI will send password or cancel)
resp := <-responseCh
return resp.Password, resp.Cancelled
},
// Tool call argument streaming: these callbacks fire as the LLM generates tool arguments
func(toolCallID, toolName string) {
m.events.emit(ToolCallStartEvent{
ToolCallID: toolCallID,
ToolName: toolName,
ToolKind: toolKindFor(toolName),
})
},
func(toolCallID, delta string) {
m.events.emit(ToolCallDeltaEvent{
ToolCallID: toolCallID,
Delta: delta,
})
},
func(toolCallID string) {
m.events.emit(ToolCallEndEvent{
ToolCallID: toolCallID,
})
},
)
}
@@ -2223,6 +2453,35 @@ func (m *Kit) GetTools() []Tool {
return m.agent.GetTools()
}
// MaxTokens returns the effective max output tokens currently configured for
// the agent. This is the value actually sent to the LLM provider on each
// request, after CLI/env/config resolution, per-model overrides, model-aware
// right-sizing, and any Anthropic thinking-budget adjustments.
//
// Returns 0 when the active provider suppresses the max_output_tokens
// parameter (e.g. OpenAI Codex OAuth) or when no model is configured yet.
// A non-zero value is the limit at which generation is cut off with
// FinishReasonLength if the model tries to produce more output.
func (m *Kit) MaxTokens() int {
if m.agent == nil {
return 0
}
return m.agent.GetMaxTokens()
}
// MaxOutputLimit returns the catalog-reported output ceiling for the current
// model in tokens, or 0 when the model isn't in the registry (custom models,
// new releases, Ollama, etc.). Pair with MaxTokens() to detect when the agent
// is configured well below what the model supports and surface a hint to the
// user.
func (m *Kit) MaxOutputLimit() int {
info := m.GetModelInfo()
if info == nil {
return 0
}
return info.Limit.Output
}
// extractFileParts returns all FilePart entries from a message's Content.
// Used to preserve image attachments when replacing user message text.
func extractFileParts(msg fantasy.Message) []fantasy.FilePart {
+225
@@ -5,6 +5,8 @@ import (
"os"
"testing"
"github.com/spf13/viper"
kit "github.com/mark3labs/kit/pkg/kit"
)
@@ -54,6 +56,225 @@ func TestNewWithOptions(t *testing.T) {
}
}
// TestNewWithGenerationOptions verifies that the SDK-only generation
// parameter overrides on Options propagate all the way through to the
// agent without requiring any viper.Set workarounds in caller code.
func TestNewWithGenerationOptions(t *testing.T) {
if os.Getenv("ANTHROPIC_API_KEY") == "" {
t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
}
ctx := context.Background()
// MaxTokens override — keep ThinkingLevel off so Anthropic's thinking
// budget doesn't auto-bump MaxTokens above what we configured.
t.Run("MaxTokens", func(t *testing.T) {
defer resetViper()
const want = 12345
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
MaxTokens: want,
})
if err != nil {
t.Fatalf("Failed to create Kit: %v", err)
}
defer func() { _ = host.Close() }()
if got := host.MaxTokens(); got != want {
t.Errorf("Options.MaxTokens=%d did not propagate; Kit.MaxTokens()=%d", want, got)
}
if !viper.IsSet("max-tokens") {
t.Error("viper.IsSet(\"max-tokens\") should be true after MaxTokens override")
}
})
// ThinkingLevel override — verified via the public getter, which
// reads back the configured (not provider-derived) level.
t.Run("ThinkingLevel", func(t *testing.T) {
defer resetViper()
const want = "high"
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
ThinkingLevel: want,
})
if err != nil {
t.Fatalf("Failed to create Kit: %v", err)
}
defer func() { _ = host.Close() }()
if got := host.GetThinkingLevel(); got != want {
t.Errorf("Options.ThinkingLevel=%q did not propagate; Kit.GetThinkingLevel()=%q", want, got)
}
})
// Temperature override — pointer semantics let callers distinguish
// "explicitly 0.0" from "unset", which we assert by pushing a distinct
// value and reading it back off viper's merged state.
t.Run("Temperature", func(t *testing.T) {
defer resetViper()
want := float32(0.12345)
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
Temperature: &want,
})
if err != nil {
t.Fatalf("Failed to create Kit: %v", err)
}
defer func() { _ = host.Close() }()
if !viper.IsSet("temperature") {
t.Fatal("viper.IsSet(\"temperature\") should be true after Temperature override")
}
if got := float32(viper.GetFloat64("temperature")); got != want {
t.Errorf("Options.Temperature=%v did not propagate; viper=%v", want, got)
}
})
}
// TestNewPreservesIsSetSemantics verifies that creating a Kit WITHOUT
// populating the generation-param Options fields does NOT mark those
// keys as explicitly set in viper. This is the precedence contract
// that per-model defaults (ApplyModelSettings) and right-sizing
// (rightSizeMaxTokens) rely on.
//
// Previously setSDKDefaults() used viper.SetDefault() for every param,
// which caused viper.IsSet() to return true for all of them — silently
// suppressing per-model defaults and pinning max-tokens at 4096 even
// on models with much larger output limits.
func TestNewPreservesIsSetSemantics(t *testing.T) {
if os.Getenv("ANTHROPIC_API_KEY") == "" {
t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
}
defer resetViper()
ctx := context.Background()
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
NoSession: true,
SkipConfig: true, // isolate from any ~/.kit.yml values
})
if err != nil {
t.Fatalf("Failed to create Kit: %v", err)
}
defer func() { _ = host.Close() }()
// These keys must remain "unset" from viper's perspective so the
// downstream isExplicitlySet() checks allow per-model defaults to
// take effect.
checkKeys := []string{
"max-tokens",
"temperature",
"top-p",
"top-k",
"frequency-penalty",
"presence-penalty",
"thinking-level",
}
// With SkipConfig: true, InitConfig() is not invoked, so viper has
// no env-var bindings registered. Any IsSet() here would come purely
// from SDK-side SetDefault/Set calls — which is exactly what this
// test is guarding against.
for _, k := range checkKeys {
if viper.IsSet(k) {
t.Errorf("viper.IsSet(%q) == true when no Options field set it "+
"(SDK defaults must not corrupt IsSet semantics)", k)
}
}
}
// TestNewWithProviderOptions verifies that programmatic provider overrides
// (API key, URL) take effect without env vars or config files, and that
// Options.ProviderAPIKey *wins* over any pre-existing viper state.
func TestNewWithProviderOptions(t *testing.T) {
if os.Getenv("ANTHROPIC_API_KEY") == "" {
t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
}
ctx := context.Background()
t.Run("succeeds with API key from Options", func(t *testing.T) {
defer resetViper()
apiKey := os.Getenv("ANTHROPIC_API_KEY")
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
NoSession: true,
ProviderAPIKey: apiKey,
})
if err != nil {
t.Fatalf("Failed to create Kit with ProviderAPIKey option: %v", err)
}
defer func() { _ = host.Close() }()
if got := viper.GetString("provider-api-key"); got != apiKey {
t.Errorf("Options.ProviderAPIKey did not propagate to viper; got %q (len=%d)", got, len(got))
}
})
// Override precedence: even when viper already holds a different
// provider-api-key value (as it would if a config file or earlier
// Set() call populated one), Options.ProviderAPIKey must win.
t.Run("Options override beats pre-existing viper state", func(t *testing.T) {
defer resetViper()
viper.Set("provider-api-key", "sk-config-file-placeholder")
want := "sk-from-options-override"
// Use an OpenAI-flavored model so the validation path accepts
// the placeholder without attempting a real Anthropic handshake.
host, err := kit.New(ctx, &kit.Options{
Model: "openai/gpt-4o-mini",
Quiet: true,
NoSession: true,
NoExtensions: true,
DisableCoreTools: true,
ProviderAPIKey: want,
})
// Creation may still fail if the model registry is strict, but
// we only care that the override reached viper before any
// provider handshake happened.
if host != nil {
defer func() { _ = host.Close() }()
}
_ = err
if got := viper.GetString("provider-api-key"); got != want {
t.Errorf("Options.ProviderAPIKey did not override pre-existing viper value; got %q, want %q", got, want)
}
})
// ProviderURL override must also reach viper.
t.Run("ProviderURL propagates", func(t *testing.T) {
defer resetViper()
const want = "https://custom.example.com/v1"
host, err := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
Quiet: true,
NoSession: true,
ProviderURL: want,
})
if err != nil {
t.Fatalf("Failed to create Kit with ProviderURL option: %v", err)
}
defer func() { _ = host.Close() }()
if got := viper.GetString("provider-url"); got != want {
t.Errorf("Options.ProviderURL did not propagate; got %q, want %q", got, want)
}
})
}
func TestSessionManagement(t *testing.T) {
if os.Getenv("ANTHROPIC_API_KEY") == "" {
t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
@@ -81,3 +302,7 @@ func TestSessionManagement(t *testing.T) {
t.Error("Expected non-empty session ID")
}
}
// resetViper wipes viper's global state so a test case doesn't leak
// viper.Set() calls into the next one. Used via defer in subtests.
func resetViper() { viper.Reset() }
+45 -45
@@ -5,18 +5,18 @@ import (
"fmt"
"net"
"net/http"
"os/exec"
"runtime"
"sync"
"time"
)
// MCPAuthHandler handles OAuth authorization for MCP servers.
// Implementations control the user experience — opening a browser, showing a
// prompt, displaying a URL, etc.
// prompt, displaying a URL, posting to a message bus, etc.
//
// The default implementation ([DefaultMCPAuthHandler]) opens the system browser
// and starts a local HTTP callback server to receive the authorization code.
// [DefaultMCPAuthHandler] provides the transport mechanics (port reservation
// and callback server) but performs no user-facing I/O on its own; consumers
// wire presentation via [DefaultMCPAuthHandler.OnAuthURL] or implement
// MCPAuthHandler from scratch.
type MCPAuthHandler interface {
// RedirectURI returns the OAuth redirect URI that the callback server
// will listen on. This is called during MCP transport setup — before any
@@ -37,23 +37,44 @@ type MCPAuthHandler interface {
HandleAuth(ctx context.Context, serverName string, authURL string) (callbackURL string, err error)
}
// DefaultMCPAuthHandler opens the system browser and starts a local HTTP
// callback server to receive the OAuth authorization code. It eagerly reserves
// a TCP port on construction so [RedirectURI] is stable for the lifetime of
// the handler.
// DefaultMCPAuthHandler provides the transport mechanics of an OAuth flow —
// reserving a local TCP port and running a one-shot HTTP callback server —
// without making any user-experience decisions. It performs no browser opens,
// no printing, no TUI calls; consumers attach presentation by setting
// [DefaultMCPAuthHandler.OnAuthURL] or by wrapping the handler.
//
// Create instances with [NewDefaultMCPAuthHandler] (random port) or
// [NewDefaultMCPAuthHandlerWithPort] (explicit port).
// The handler eagerly reserves a TCP port on construction so [RedirectURI] is
// stable for the lifetime of the handler. Create instances with
// [NewDefaultMCPAuthHandler] (random port) or [NewDefaultMCPAuthHandlerWithPort]
// (explicit port). Always call [DefaultMCPAuthHandler.Close] when done to
// release the port.
type DefaultMCPAuthHandler struct {
listener net.Listener
port int
mu sync.Mutex // guards listener lifecycle
// OnAuthURL, if set, is invoked exactly once per [HandleAuth] call with
// the authorization URL the user must visit. This is where consumers
// plug in their UX: open a browser, print to stderr, post to a TUI
// stream, render a QR code, etc. The handler performs no I/O on the
// URL itself; if OnAuthURL is nil the URL is silently dropped and the
// user has no way to complete the flow.
//
// OnAuthURL is called synchronously before the handler blocks on the
// callback. It must not block indefinitely — long-running work should
// be dispatched to a goroutine.
OnAuthURL func(serverName, authURL string)
}
// NewDefaultMCPAuthHandler creates a handler that listens on a random
// available port on localhost. The port is reserved immediately so
// [RedirectURI] returns a stable value. Call [DefaultMCPAuthHandler.Close]
// when the handler is no longer needed to release the port.
//
// The returned handler has no OnAuthURL hook configured and will therefore
// appear to hang on HandleAuth until the context deadline (or the default
// 2-minute timeout) fires. Set OnAuthURL before using the handler, or use a
// higher-level wrapper such as [CLIMCPAuthHandler].
func NewDefaultMCPAuthHandler() (*DefaultMCPAuthHandler, error) {
listener, err := net.Listen("tcp", "localhost:0")
if err != nil {
@@ -88,9 +109,9 @@ func (h *DefaultMCPAuthHandler) Port() int {
return h.port
}
// HandleAuth opens the system browser to authURL and waits for the OAuth
// callback on the local server. It returns the full callback URL including
// query parameters (code, state, etc.).
// HandleAuth invokes [OnAuthURL] with the authorization URL (if configured)
// and waits for the OAuth callback on the local server. It returns the full
// callback URL including query parameters (code, state, etc.).
//
// If the context has no deadline, a default 2-minute timeout is applied.
// The callback server is started for each HandleAuth call and shut down
@@ -136,19 +157,13 @@ func (h *DefaultMCPAuthHandler) HandleAuth(ctx context.Context, serverName strin
Handler: mux,
}
// Start serving on the pre-reserved listener. We need to create a new
// listener on the same port because http.Server.Serve takes ownership
// and closes the listener when done. The original listener is kept open
// to reserve the port; we create a second listener via SO_REUSEADDR
// semantics (Go's default on most platforms) or, more reliably, we
// temporarily release and re-acquire.
//
// Strategy: use the held listener directly for Serve. After Serve
// returns (due to Shutdown), re-acquire the listener to keep the port
// reserved for future HandleAuth calls.
// Start serving on the pre-reserved listener. http.Server.Serve takes
// ownership and closes the listener when Shutdown is called, so we
// re-acquire a fresh listener on the same port in the deferred cleanup
// below to keep the port reserved for subsequent HandleAuth calls.
h.mu.Lock()
serveListener := h.listener
h.listener = nil // Serve will close it
h.listener = nil
h.mu.Unlock()
if serveListener == nil {
@@ -184,10 +199,11 @@ func (h *DefaultMCPAuthHandler) HandleAuth(ctx context.Context, serverName strin
}
}()
// Open the system browser.
if err := openBrowser(authURL); err != nil {
// Browser open is best-effort; the user can still navigate manually.
_ = err
// Surface the authorization URL to the consumer. This is the single
// presentation seam: the SDK itself does not open browsers, print,
// or otherwise touch the user's environment.
if h.OnAuthURL != nil {
h.OnAuthURL(serverName, authURL)
}
// Wait for the callback, a server error, or context cancellation.
@@ -214,22 +230,6 @@ func (h *DefaultMCPAuthHandler) Close() error {
return nil
}
// openBrowser opens the default system browser to the given URL. This is a
// best-effort operation — errors are returned but callers typically ignore
// them since the user can navigate manually.
func openBrowser(url string) error {
switch runtime.GOOS {
case "linux":
return exec.Command("xdg-open", url).Start()
case "windows":
return exec.Command("rundll32", "url.dll,FileProtocolHandler", url).Start()
case "darwin":
return exec.Command("open", url).Start()
default:
return fmt.Errorf("unsupported platform: %s", runtime.GOOS)
}
}
// oauthSuccessHTML is the HTML page returned to the browser after a
// successful OAuth callback.
const oauthSuccessHTML = `<!DOCTYPE html>
+47 -15
@@ -5,32 +5,49 @@ import (
"fmt"
"io"
"os"
"os/exec"
"runtime"
)
// CLIMCPAuthHandler wraps a [DefaultMCPAuthHandler] and prints status messages
// to a writer (typically stderr) so the user knows what's happening during
// OAuth authorization. This is the handler used by the CLI/TUI binary.
// CLIMCPAuthHandler is the MCP OAuth handler for CLI/TUI consumers. It wraps
// a [DefaultMCPAuthHandler] and layers standard CLI behavior on top of the
// underlying transport mechanics:
//
// For TUI integration, set NotifyFunc to route messages through the TUI's
// event system instead of (or in addition to) the writer.
// - Opens the authorization URL in the system browser
// - Prints status messages (or routes them to a TUI via [NotifyFunc])
//
// Non-CLI consumers (web apps, daemons, custom TUIs) should not use this
// handler; implement [MCPAuthHandler] directly or configure a
// [DefaultMCPAuthHandler] with a custom OnAuthURL instead.
type CLIMCPAuthHandler struct {
inner *DefaultMCPAuthHandler
w io.Writer
// NotifyFunc, when set, is called with status messages instead of writing
// to the writer. This allows the TUI to display system messages in the
// chat stream. If nil, messages are written to w.
// NotifyFunc, when set, is called with status messages instead of
// writing to the writer. This allows the TUI to display system
// messages in the chat stream. If nil, messages are written to w.
NotifyFunc func(serverName, message string)
}
// NewCLIMCPAuthHandler creates a CLI auth handler that prints status messages
// to stderr and delegates the actual OAuth flow to a [DefaultMCPAuthHandler].
// to stderr, opens the authorization URL in the system browser, and delegates
// the callback-server mechanics to a [DefaultMCPAuthHandler].
func NewCLIMCPAuthHandler() (*CLIMCPAuthHandler, error) {
inner, err := NewDefaultMCPAuthHandler()
if err != nil {
return nil, err
}
return &CLIMCPAuthHandler{inner: inner, w: os.Stderr}, nil
h := &CLIMCPAuthHandler{inner: inner, w: os.Stderr}
// Wire the CLI presentation policy into the inner handler's hook.
// This is the one place in the codebase where OAuth triggers a
// browser open; the SDK core remains I/O-free.
inner.OnAuthURL = func(serverName, authURL string) {
h.notify(serverName, fmt.Sprintf("🔐 MCP server %q requires authentication. Opening browser...", serverName))
h.notify(serverName, fmt.Sprintf(" If the browser doesn't open, visit:\n %s", authURL))
// Browser open is best-effort; the user can still navigate manually.
_ = openBrowser(authURL)
}
return h, nil
}
// RedirectURI returns the OAuth redirect URI from the inner handler.
@@ -38,17 +55,15 @@ func (h *CLIMCPAuthHandler) RedirectURI() string {
return h.inner.RedirectURI()
}
// HandleAuth prints status messages and delegates to the inner handler.
// HandleAuth delegates to the inner handler (which invokes OnAuthURL, runs
// the callback server, and returns the full callback URL) and emits a final
// success or failure notification.
func (h *CLIMCPAuthHandler) HandleAuth(ctx context.Context, serverName string, authURL string) (string, error) {
h.notify(serverName, fmt.Sprintf("🔐 MCP server %q requires authentication. Opening browser...", serverName))
h.notify(serverName, fmt.Sprintf(" If the browser doesn't open, visit:\n %s", authURL))
callbackURL, err := h.inner.HandleAuth(ctx, serverName, authURL)
if err != nil {
h.notify(serverName, fmt.Sprintf("✗ Authentication failed for %q: %v", serverName, err))
return "", err
}
h.notify(serverName, fmt.Sprintf("✓ Authenticated with %q", serverName))
return callbackURL, nil
}
@@ -66,3 +81,20 @@ func (h *CLIMCPAuthHandler) notify(serverName, message string) {
}
_, _ = fmt.Fprintln(h.w, message)
}
// openBrowser opens the system default browser at url. Intentionally
// unexported: browser opening is CLI policy, not SDK surface. Consumers
// that need similar behavior for their own UX should bring their own
// helper (or use a third-party package like github.com/pkg/browser).
func openBrowser(url string) error {
switch runtime.GOOS {
case "linux":
return exec.Command("xdg-open", url).Start()
case "windows":
return exec.Command("rundll32", "url.dll,FileProtocolHandler", url).Start()
case "darwin":
return exec.Command("open", url).Start()
default:
return fmt.Errorf("unsupported platform: %s", runtime.GOOS)
}
}
+33 -2
@@ -55,7 +55,7 @@ The `Init` function receives an `ext.API` object for registering handlers, and e
## Lifecycle Events
Kit provides 18 lifecycle events. Each handler receives an event struct and a `Context`.
Kit provides 21 lifecycle events. Each handler receives an event struct and a `Context`.
### Session Events
@@ -93,7 +93,7 @@ api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
// e.Response string
// e.StopReason string — "error" (on failure), "completed" (when LLM returns
// empty stop reason), or the raw LLM provider value passed through
// (e.g. "stop", "end_turn", "max_tokens", "tool_use").
// (e.g. "stop", "length" (max output tokens hit), "tool-calls", "content-filter").
// To detect errors, check e.StopReason == "error".
// Do NOT compare against "completed" for success — instead check != "error".
})
@@ -136,6 +136,37 @@ api.OnToolResult(func(e ext.ToolResultEvent, ctx ext.Context) *ext.ToolResultRes
})
```
### Tool Call Input Streaming Events
These events fire during the LLM's tool argument generation phase, **before** the tool call is fully parsed and before `OnToolCall` fires. They enable UIs to show tool activity immediately rather than waiting for the full argument JSON to finish streaming.
```go
// Fires when the LLM begins generating tool call arguments.
// The tool name is known but the full argument JSON is still streaming.
api.OnToolCallInputStart(func(e ext.ToolCallInputStartEvent, ctx ext.Context) {
// e.ToolCallID string — stable ID for correlating tool lifecycle events
// e.ToolName string — name of the tool being called
// e.ToolKind string — "execute", "edit", "read", "search", "agent"
ctx.PrintInfo("Tool starting: " + e.ToolName)
})
// Fires for each streamed fragment of tool call arguments.
// Useful for live-previewing artifact content or showing a progress indicator.
api.OnToolCallInputDelta(func(e ext.ToolCallInputDeltaEvent, ctx ext.Context) {
// e.ToolCallID string
// e.Delta string — JSON fragment of tool arguments
})
// Fires when tool argument streaming is complete, before the tool call
// is parsed and execution begins. Transition UI from "generating args"
// to "executing".
api.OnToolCallInputEnd(func(e ext.ToolCallInputEndEvent, ctx ext.Context) {
// e.ToolCallID string
})
```
**Full tool lifecycle order**: `OnToolCallInputStart` → `OnToolCallInputDelta` (repeated) → `OnToolCallInputEnd` → `OnToolCall` → `OnToolExecutionStart` → `OnToolOutput` (optional, repeated) → `OnToolExecutionEnd` → `OnToolResult`
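
As a rough illustration of how an extension or UI might consume the three input-streaming events, the sketch below buffers streamed fragments per `ToolCallID` and hands back the assembled argument JSON once streaming ends. The `argBuffer` type and its methods are hypothetical helpers, not part of Kit. In a real extension, the `start`, `delta`, and `end` calls would sit inside the `OnToolCallInputStart`, `OnToolCallInputDelta`, and `OnToolCallInputEnd` handlers. The sketch assumes the handlers for a given call run sequentially; add a mutex if that does not hold in your setup.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// argBuffer collects streamed tool-argument fragments keyed by tool call ID.
// Hypothetical helper for illustration only; it is not part of the Kit API.
type argBuffer struct {
	names map[string]string
	parts map[string]*strings.Builder
}

func newArgBuffer() *argBuffer {
	return &argBuffer{
		names: make(map[string]string),
		parts: make(map[string]*strings.Builder),
	}
}

// start records a new tool call (would be called from OnToolCallInputStart).
func (b *argBuffer) start(id, name string) {
	b.names[id] = name
	b.parts[id] = &strings.Builder{}
}

// delta appends one fragment (would be called from OnToolCallInputDelta).
// Fragments for unknown IDs are ignored.
func (b *argBuffer) delta(id, fragment string) {
	if sb, ok := b.parts[id]; ok {
		sb.WriteString(fragment)
	}
}

// end returns the tool name and accumulated arguments, then forgets the call
// (would be called from OnToolCallInputEnd).
func (b *argBuffer) end(id string) (name, args string, ok bool) {
	sb, ok := b.parts[id]
	if !ok {
		return "", "", false
	}
	name, args = b.names[id], sb.String()
	delete(b.names, id)
	delete(b.parts, id)
	return name, args, true
}

func main() {
	buf := newArgBuffer()
	buf.start("call_1", "read")
	buf.delta("call_1", `{"path":`)
	buf.delta("call_1", `"README.md"}`)

	name, args, _ := buf.end("call_1")
	// Individual fragments are not valid JSON; only the joined result is checked.
	fmt.Println(name, args, json.Valid([]byte(args)))
}
```

Keying the buffer on the call ID rather than the tool name keeps concurrent calls to the same tool separate, which matches the documented role of `ToolCallID` as the correlation key across the tool lifecycle.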
### Input Events
```go
+154 -3
@@ -80,6 +80,23 @@ host, err := kit.New(ctx, &kit.Options{
Quiet: true, // suppress debug output
Debug: true, // enable debug logging
// Generation parameters — override env/config/per-model defaults.
// Leaving a field at its zero/nil value lets the precedence chain
// resolve a value (KIT_* env → .kit.yml → modelSettings/customModels →
// 8192 floor for MaxTokens, provider defaults for samplers).
MaxTokens: 16384, // 0 = auto-resolve; non-zero suppresses right-sizing
ThinkingLevel: "medium", // "off", "none", "minimal", "low", "medium", "high" ("" = default)
Temperature: ptrFloat32(0.2), // pointer so explicit 0.0 != unset
TopP: nil, // nil = leave provider/per-model default
TopK: nil, // nil = leave provider/per-model default
FrequencyPenalty: nil,
PresencePenalty: nil,
// Provider configuration — override env/config without viper.Set workarounds.
ProviderAPIKey: "sk-...", // "" = use config / provider env var
ProviderURL: "https://proxy.internal/v1", // "" = provider default endpoint
TLSSkipVerify: false, // only true takes effect; false cannot force-disable (use config file or env var)
// Session
SessionDir: "/path/to/project", // base dir for session discovery (default: cwd)
SessionPath: "/path/to/session.jsonl", // open specific session file
@@ -108,7 +125,12 @@ host, err := kit.New(ctx, &kit.Options{
AutoCompact: true, // auto-compact near context limit
CompactionOptions: &kit.CompactionOptions{...}, // nil = defaults
// MCP OAuth
// MCP OAuth — both fields are opt-in. If MCPAuthHandler is nil,
// remote MCP servers that require OAuth will fail to connect with
// an authorization-required error instead of silently opening a
// browser. CLI consumers use NewCLIMCPAuthHandler; other embedders
// implement MCPAuthHandler or configure DefaultMCPAuthHandler.
MCPAuthHandler: mcpAuthHandler, // nil = OAuth disabled
MCPTokenStoreFactory: func(serverURL string) (kit.MCPTokenStore, error) {
return myCustomStore(serverURL), nil // custom OAuth token storage
},
@@ -118,12 +140,34 @@ host, err := kit.New(ctx, &kit.Options{
"docs": mcpSrv, // *server.MCPServer from mcp-go — no subprocess needed
},
})
// Tiny helper to take the address of a literal for pointer fields.
func ptrFloat32(v float32) *float32 { return &v }
```
**Critical distinction**: `Tools` replaces ALL default tools (core + MCP + extension). `ExtraTools` adds tools alongside the defaults. Use `Tools` to restrict the agent's capabilities; use `ExtraTools` to extend them.
**In-process MCP servers** bypass subprocess spawning entirely. Pass `*server.MCPServer` instances from mcp-go via `InProcessMCPServers` or call `AddInProcessMCPServer()` at runtime.
### Generation & provider Options (cheat sheet)
| Field | Type | Empty/nil means | Notes |
|-------|------|-----------------|-------|
| `MaxTokens` | `int` | Auto-resolve (env → config → per-model → 8192 floor) | Non-zero suppresses `rightSizeMaxTokens` |
| `ThinkingLevel` | `string` | Auto-resolve (→ `"off"`) | Valid: `"off"`, `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"` |
| `Temperature` | `*float32` | Leave provider/per-model default | Pointer so explicit `0.0` ≠ unset |
| `TopP` | `*float32` | Leave provider/per-model default | |
| `TopK` | `*int32` | Leave provider/per-model default | |
| `FrequencyPenalty` | `*float32` | Leave provider/per-model default | OpenAI-family |
| `PresencePenalty` | `*float32` | Leave provider/per-model default | OpenAI-family |
| `ProviderAPIKey` | `string` | Use config / provider env var | Overrides pre-existing viper state |
| `ProviderURL` | `string` | Use provider default endpoint | Same base URL flag as `--provider-url` |
| `TLSSkipVerify` | `bool` | — | Only effective when `true`; cannot force-disable via Options |
These fields replace the `viper.Set("max-tokens", 16384)` workaround that many
downstream embedders used before calling `kit.New()`. Every field is
documented in the godoc for `kit.Options`.
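The pointer-typed samplers exist so that an explicit `0.0` can be told apart from "unset". A minimal sketch of that distinction follows; `ptr` is a hypothetical generic alternative to the `ptrFloat32` helper shown above, and `describeTemperature` is an illustrative function, not part of the Kit API.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. Hypothetical generic helper that
// works for *float32, *int32, and any other pointer-typed option field.
func ptr[T any](v T) *T { return &v }

// describeTemperature is illustrative only: it shows how a nil pointer
// (unset) differs from a pointer to 0.0 (explicitly set to zero).
func describeTemperature(t *float32) string {
	if t == nil {
		return "unset (provider/per-model default applies)"
	}
	return fmt.Sprintf("explicit %.1f", *t)
}

func main() {
	fmt.Println(describeTemperature(nil))
	fmt.Println(describeTemperature(ptr(float32(0))))
	fmt.Println(describeTemperature(ptr(float32(0.2))))
}
```

With a plain `float32` field, the first two cases would be indistinguishable, which is why the table above lists pointer types for every sampler.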
---
## Prompt Methods
@@ -208,6 +252,25 @@ unsub := host.OnToolCall(func(e kit.ToolCallEvent) {
})
defer unsub()
host.OnToolCallStart(func(e kit.ToolCallStartEvent) {
// Fires when the LLM begins generating tool call arguments.
// e.ToolCallID, e.ToolName, e.ToolKind
// Use this to show a "running" indicator immediately — before the
// full argument JSON finishes streaming (eliminates "dead air").
})
host.OnToolCallDelta(func(e kit.ToolCallDeltaEvent) {
// Fires for each streamed fragment of tool call arguments.
// e.ToolCallID, e.Delta (JSON fragment)
// Useful for live-previewing artifact content or progress indicators.
})
host.OnToolCallEnd(func(e kit.ToolCallEndEvent) {
// Fires when tool argument streaming is complete, before execution.
// e.ToolCallID
// Transition UI from "generating args" to "executing".
})
host.OnToolResult(func(e kit.ToolResultEvent) {
// e.ToolCallID, e.ToolName, e.ToolKind, e.ToolArgs, e.ParsedArgs
// e.Result, e.IsError, e.Metadata (*ToolResultMetadata)
@@ -259,6 +322,9 @@ unsub := host.Subscribe(func(e kit.Event) {
| `message_start` | `MessageStartEvent` | *(none)* |
| `message_update` | `MessageUpdateEvent` | `Chunk` |
| `message_end` | `MessageEndEvent` | `Content` |
| `tool_call_start` | `ToolCallStartEvent` | `ToolCallID`, `ToolName`, `ToolKind` |
| `tool_call_delta` | `ToolCallDeltaEvent` | `ToolCallID`, `Delta` |
| `tool_call_end` | `ToolCallEndEvent` | `ToolCallID` |
| `tool_call` | `ToolCallEvent` | `ToolCallID`, `ToolName`, `ToolKind`, `ToolArgs`, `ParsedArgs` |
| `tool_execution_start` | `ToolExecutionStartEvent` | `ToolCallID`, `ToolName`, `ToolKind`, `ToolArgs` |
| `tool_execution_end` | `ToolExecutionEndEvent` | `ToolCallID`, `ToolName`, `ToolKind` |
@@ -270,6 +336,29 @@ unsub := host.Subscribe(func(e kit.Event) {
| `reasoning_delta` | `ReasoningDeltaEvent` | `Delta` |
| `step_usage` | `StepUsageEvent` | `InputTokens`, `OutputTokens`, `CacheReadTokens`, `CacheWriteTokens` |
| `steer_consumed` | `SteerConsumedEvent` | `Count` |
| `password_prompt` | `PasswordPromptEvent` | `Prompt`, `ResponseCh` |
**Tool call streaming lifecycle**: `ToolCallStartEvent` → `ToolCallDeltaEvent` (repeated) → `ToolCallEndEvent` → `ToolCallEvent` → `ToolExecutionStartEvent` → `ToolOutputEvent` (optional, repeated) → `ToolExecutionEndEvent` → `ToolResultEvent`
**PasswordPromptEvent** (for sudo password handling):
```go
// PasswordPromptEvent fires when a sudo command needs a password.
// The TUI should display a password prompt and send the result back via ResponseCh.
type PasswordPromptEvent struct {
// Prompt is the message to display to the user.
Prompt string
// ResponseCh receives the password from the TUI.
// The TUI must send exactly one value: (password, false) for submit
// or ("", true) for cancel.
ResponseCh chan<- PasswordPromptResponse
}
// PasswordPromptResponse carries the password prompt result.
type PasswordPromptResponse struct {
Password string
Cancelled bool
}
```
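The "exactly one value" contract on `ResponseCh` can be sketched as follows. The two struct types are redeclared locally so the example compiles on its own; `answerPrompt` and the `readPassword` callback are hypothetical names standing in for whatever UI collects the password. Whether Kit's channel is buffered is not stated here, so a real consumer may want to send from a goroutine that is allowed to block.

```go
package main

import "fmt"

// Local copies of the types documented above, so the sketch is self-contained.
type PasswordPromptResponse struct {
	Password  string
	Cancelled bool
}

type PasswordPromptEvent struct {
	Prompt     string
	ResponseCh chan<- PasswordPromptResponse
}

// answerPrompt sends exactly one response for the event: the password on
// submit, or an empty password with Cancelled set when the user backs out.
// readPassword is a hypothetical stand-in for the UI's input routine.
func answerPrompt(e PasswordPromptEvent, readPassword func(prompt string) (string, bool)) {
	pw, ok := readPassword(e.Prompt)
	if !ok {
		e.ResponseCh <- PasswordPromptResponse{Cancelled: true}
		return
	}
	e.ResponseCh <- PasswordPromptResponse{Password: pw}
}

func main() {
	ch := make(chan PasswordPromptResponse, 1)
	event := PasswordPromptEvent{Prompt: "[sudo] password:", ResponseCh: ch}

	// Simulate the user cancelling the dialog.
	answerPrompt(event, func(string) (string, bool) { return "", false })
	fmt.Printf("%+v\n", <-ch)
}
```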
### Tool kind constants
@@ -760,9 +849,65 @@ err = host.SubscribeMCPResource(ctx, "myserver", "file:///path/to/file")
err = host.UnsubscribeMCPResource(ctx, "myserver", "file:///path/to/file")
```
### MCP OAuth Authorization
When a remote MCP server requires OAuth, Kit runs the full authorization flow
(dynamic client registration → PKCE → user consent → token exchange → token
persistence) but delegates the **user-facing step** — displaying the
authorization URL and receiving the callback — to an `MCPAuthHandler`.
The SDK supports four setups:
| Building block | When to use |
|---|---|
| **No handler** (`Options.MCPAuthHandler = nil`) | Default. OAuth is disabled; 401s from remote MCP servers surface as errors. Correct for library, daemon, and web-app embedders that don't want side effects. |
| **`kit.NewCLIMCPAuthHandler()`** | CLI/TUI apps. Opens the system browser, prints status to stderr (or via `NotifyFunc`), runs a localhost callback server. This is what the `kit` binary uses. |
| **`kit.NewDefaultMCPAuthHandler()` + `OnAuthURL`** | Custom UX. Get the transport mechanics (port reservation + callback server) from the SDK; wire your own presentation in the `OnAuthURL(serverName, authURL)` closure. |
| **Implement `kit.MCPAuthHandler` directly** | Full control. No localhost binding — e.g. return the URL from an HTTP endpoint and have the consumer POST the callback URL back. |
**CLI-style embedder (browser + stderr):**
```go
authHandler, err := kit.NewCLIMCPAuthHandler()
if err != nil {
log.Fatal(err)
}
defer authHandler.Close() // release the reserved port
host, _ := kit.New(ctx, &kit.Options{
MCPAuthHandler: authHandler,
})
```
**Custom UX embedder (TUI modal, QR code, web redirect, etc.):**
```go
authHandler, _ := kit.NewDefaultMCPAuthHandler()
authHandler.OnAuthURL = func(serverName, authURL string) {
// Render the URL however you like — no browser or terminal assumptions.
myUI.ShowAuthPrompt(serverName, authURL)
}
defer authHandler.Close()
host, _ := kit.New(ctx, &kit.Options{
MCPAuthHandler: authHandler,
})
```
**Important:** `DefaultMCPAuthHandler` with no `OnAuthURL` set will silently
drop the authorization URL and block until the 2-minute callback timeout
fires. Always set `OnAuthURL`, or use a higher-level wrapper like
`CLIMCPAuthHandler`.
### MCP OAuth Token Storage
For remote MCP servers that use OAuth, you can provide a custom token store:
Once authorization succeeds, the resulting access/refresh tokens are persisted
by an `MCPTokenStore`. By default tokens are written to
`$XDG_CONFIG_HOME/.kit/mcp_tokens.json` (fallback `~/.config/.kit/mcp_tokens.json`),
keyed by server URL, with `0600` file permissions.
Provide a custom store for encrypted storage, database persistence, or
in-memory-only flows:
```go
host, _ := kit.New(ctx, &kit.Options{
@@ -772,7 +917,7 @@ host, _ := kit.New(ctx, &kit.Options{
})
```
The `MCPTokenStore` interface requires `GetToken`/`SetToken`/`DeleteToken` methods. Return `kit.ErrMCPNoToken` from `GetToken` when no token is stored. When nil (default), tokens are persisted to `$XDG_CONFIG_HOME/.kit/mcp_tokens.json`.
The `MCPTokenStore` interface requires `GetToken`/`SetToken`/`DeleteToken` methods. Return `kit.ErrMCPNoToken` from `GetToken` when no token is stored.
---
@@ -955,6 +1100,12 @@ kit.LLMFilePart // {Filename, Data []byte, MediaType}
kit.CompactionResult, kit.CompactionOptions
// MCP OAuth types
kit.MCPAuthHandler // interface: RedirectURI() + HandleAuth(ctx, server, authURL) for OAuth UX
kit.DefaultMCPAuthHandler // SDK-provided transport mechanics (port + callback server); set OnAuthURL hook
kit.CLIMCPAuthHandler // CLI wrapper around DefaultMCPAuthHandler: opens browser, prints status
kit.NewDefaultMCPAuthHandler() // random port, no UX side effects
kit.NewDefaultMCPAuthHandlerWithPort() // fixed port (useful when registering a stable redirect URI)
kit.NewCLIMCPAuthHandler() // CLI handler: browser + stderr + localhost callback
kit.MCPTokenStore // interface for custom OAuth token storage
kit.MCPToken // OAuth token struct (access, refresh, expiry)
kit.MCPTokenStoreFactory // func(serverURL string) (MCPTokenStore, error)
-9
@@ -1,9 +0,0 @@
1. Hello, world!
2. Testing one, two, three.
3. This is a quick test message.
4. Sample text for verification.
5. All systems operational.
+5 -4
@@ -10,9 +10,10 @@ description: Complete reference for all Kit CLI subcommands.
For OAuth-enabled providers like Anthropic.
```bash
kit auth login [provider] # Start OAuth flow (e.g., anthropic)
kit auth logout [provider] # Remove credentials for provider
kit auth status # Check authentication status
kit auth login [provider] # Start OAuth flow (e.g., anthropic)
kit auth login [provider] --set-default # Set provider's default model as system default
kit auth logout [provider] # Remove credentials for provider
kit auth status # Check authentication status
```
## Model database
@@ -66,7 +67,7 @@ These commands are available inside the Kit TUI during an interactive session:
| `/servers` | Show connected MCP servers |
| `/model [name]` | Switch model or open model selector |
| `/theme [name]` | Switch color theme or list available themes |
| `/thinking [level]` | Set thinking level (off, minimal, low, medium, high) |
| `/thinking [level]` | Set thinking level (off, none, minimal, low, medium, high) |
| `/compact [focus]` | Summarize older messages to free context |
| `/clear` | Clear conversation |
| `/clear-queue` | Clear queued messages |
+2 -2
@@ -52,14 +52,14 @@ These flags control Kit's behavior. When a prompt is passed as a positional argu
| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| `--max-tokens` | — | `4096` | Maximum tokens in response |
| `--max-tokens` | — | `8192` | Base cap for output tokens. Auto-raised per-model up to 32768 when the model's catalog ceiling is higher and no explicit value is set. |
| `--temperature` | — | `0.7` | Randomness 0.0-1.0 |
| `--top-p` | — | `0.95` | Nucleus sampling 0.0-1.0 |
| `--top-k` | — | `40` | Limit top K tokens |
| `--stop-sequences` | — | — | Custom stop sequences (comma-separated) |
| `--frequency-penalty` | — | `0.0` | Penalize frequent tokens (0.0-2.0) |
| `--presence-penalty` | — | `0.0` | Penalize present tokens (0.0-2.0) |
| `--thinking-level` | — | `off` | Extended thinking level: off, minimal, low, medium, high |
| `--thinking-level` | — | `off` | Extended thinking level: off, none, minimal, low, medium, high |
## System
+24 -4
@@ -18,7 +18,7 @@ Create `~/.kit.yml`:
```yaml
model: anthropic/claude-sonnet-latest
max-tokens: 4096
max-tokens: 8192
temperature: 0.7
stream: true
```
@@ -28,7 +28,7 @@ stream: true
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `model` | string | `anthropic/claude-sonnet-latest` | Model to use (provider/model format) |
| `max-tokens` | int | `4096` | Maximum tokens in response |
| `max-tokens` | int | `8192` | Base cap for output tokens. Auto-raised per-model up to 32768 when the model's catalog ceiling is higher and no explicit value is set. Use [`modelSettings[provider/model].maxTokens`](#per-model-settings) to override per-model. |
| `temperature` | float | `0.7` | Randomness 0.0-1.0 |
| `top-p` | float | `0.95` | Nucleus sampling 0.0-1.0 |
| `top-k` | int | `40` | Limit top K tokens |
@@ -37,7 +37,7 @@ stream: true
| `compact` | bool | `false` | Enable compact output mode |
| `system-prompt` | string | — | System prompt text or file path |
| `max-steps` | int | `0` | Maximum agent steps (0 = unlimited) |
| `thinking-level` | string | `off` | Extended thinking: off, minimal, low, medium, high |
| `thinking-level` | string | `off` | Extended thinking: off, none, minimal, low, medium, high |
| `provider-api-key` | string | — | API key for the provider |
| `provider-url` | string | — | Base URL for provider API |
| `tls-skip-verify` | bool | `false` | Skip TLS certificate verification |
@@ -83,6 +83,11 @@ mcpServers:
search:
type: remote
url: "https://mcp.example.com/search"
pubmed:
type: remote
url: "https://pubmed.mcp.example.com"
noOAuth: true # skip OAuth for public servers
```
### MCP server fields
@@ -95,6 +100,7 @@ mcpServers:
| `url` | string | URL for remote servers |
| `allowedTools` | list | Whitelist of tool names to expose |
| `excludedTools` | list | Blacklist of tool names to hide |
| `noOAuth` | bool | Skip OAuth for this server (for public servers that don't require auth) |
A legacy format with `transport`, `args`, `env`, and `headers` fields is also supported.
@@ -175,10 +181,24 @@ modelSettings:
| `thinkingLevel` | string | Thinking level override |
| `systemPrompt` | string | Per-model system prompt (used when no explicit prompt is set) |
Settings from `modelSettings` and `customModels.params` act as model-level defaults — explicit CLI flags and global config values always take precedence.
Settings from `modelSettings` and `customModels.params` act as model-level defaults — explicit CLI flags, `KIT_*` environment variables, global config values, and SDK `Options.*` fields all take precedence over them.
When switching models via `/model` or `SetModel()`, if the new model has a per-model system prompt and no custom global prompt was set, the per-model prompt automatically replaces the previous one.
### Precedence summary
For the generation and provider parameters documented above, the resolved value at runtime comes from the first source that sets it:
1. CLI flag (e.g. `--max-tokens`, `--temperature`, `--provider-api-key`)
2. SDK `Options.X` when embedding Kit as a library (`kit.Options.MaxTokens`, `Temperature`, `ProviderAPIKey`, etc.)
3. `KIT_*` environment variable (`KIT_MAX_TOKENS`, `KIT_TEMPERATURE`, ...)
4. `.kit.yml` / `.kit.yaml` / `.kit.json` (project-local, then global)
5. Per-model defaults (`modelSettings[provider/model]` / `customModels[...].params`)
6. Provider-level defaults (e.g. Anthropic's own temperature default)
7. SDK last-resort default: currently 8192 output tokens (matching the CLI `--max-tokens` default), auto-raised per-model up to 32768 when the model's catalog ceiling is higher
See the [SDK options reference](/sdk/options) for the full list of `kit.Options` fields that map to these keys.
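The "first source that sets it wins" rule for an integer setting such as `max-tokens` can be sketched as below. This is an illustration of the ordering only, not Kit's resolver: `firstSet` is a hypothetical helper, "unset" is modeled as `0`, and the per-model auto-raise described in the last list item is left out.

```go
package main

import "fmt"

// firstSet returns the first non-zero candidate in priority order, or
// fallback when every source left the value unset (modeled here as 0).
func firstSet(fallback int, candidates ...int) int {
	for _, c := range candidates {
		if c != 0 {
			return c
		}
	}
	return fallback
}

func main() {
	// Sources in the order listed above: CLI flag, SDK Options, KIT_* env,
	// config file, per-model default. Values are made up for illustration.
	cli, opts, env, cfg, perModel := 0, 0, 16384, 4096, 0

	// The env var wins because neither the CLI flag nor Options set a value.
	fmt.Println(firstSet(8192, cli, opts, env, cfg, perModel)) // prints 16384
}
```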
## Theme configuration
```yaml
+1 -1
@@ -37,7 +37,7 @@ internal/acpserver/ - ACP (Agent Client Protocol) server
internal/clipboard/ - Cross-platform clipboard operations
internal/compaction/ - Conversation compaction and summarization
internal/config/ - Configuration management
internal/core/ - Built-in tools (bash, read, write, edit, grep, find, ls)
internal/core/ - Built-in tools (bash with sudo password prompt, read, write, edit, grep, find, ls)
internal/extensions/ - Yaegi extension system
internal/kitsetup/ - Initial setup wizard
internal/message/ - Message content types and structured content blocks
+4 -1
@@ -7,7 +7,7 @@ description: All extension capabilities — lifecycle events, tools, commands, w
## Lifecycle events
Extensions can hook into 23 lifecycle events:
Extensions can hook into 26 lifecycle events:
| Event | Description |
|-------|-------------|
@@ -17,6 +17,9 @@ Extensions can hook into 23 lifecycle events:
| `OnAgentStart` | Agent loop started |
| `OnAgentEnd` | Agent loop completed |
| `OnToolCall` | Tool call requested by the model |
| `OnToolCallInputStart` | LLM began generating tool call arguments (tool name known, args streaming) |
| `OnToolCallInputDelta` | Streamed JSON fragment of tool call arguments |
| `OnToolCallInputEnd` | Tool argument streaming complete, before execution begins |
| `OnToolExecutionStart` | Tool execution beginning |
| `OnToolOutput` | Streaming tool output chunk (for long-running tools) |
| `OnToolExecutionEnd` | Tool execution completed |
+1 -1
@@ -13,7 +13,7 @@ A powerful, extensible AI coding agent CLI with multi-provider support, built-in
## Features
- **Multi-Provider LLM Support** — Anthropic, OpenAI, Google Gemini, Ollama, Azure OpenAI, AWS Bedrock, OpenRouter, and more
- **Built-in Core Tools** — bash, read, write, edit, grep, find, ls, subagent with no MCP overhead
- **Built-in Core Tools** — bash (with interactive sudo password prompt), read, write, edit, grep, find, ls, subagent with no MCP overhead
- **Smart @ Attachments** — Binary files auto-detected via MIME type, MCP resources via `@mcp:server:uri`
- **MCP Integration** — Connect external MCP servers for expanded capabilities (tools, prompts, and resources)
- **Extension System** — Write custom tools, commands, widgets, and UI modifications in Go
+42
@@ -41,6 +41,32 @@ unsub6 := host.OnTurnEnd(func(event kit.TurnEndEvent) {
defer unsub6()
```
## Tool call argument streaming
For tools with large arguments (e.g., `write` with a full file body), the `ToolCallEvent` only fires after the full argument JSON finishes streaming — which can take 5-10+ seconds of "dead air." These three events fire during argument generation so UIs can show activity immediately:
```go
host.OnToolCallStart(func(event kit.ToolCallStartEvent) {
// Fires as soon as the LLM begins generating tool arguments.
// event.ToolCallID, event.ToolName, event.ToolKind
fmt.Printf("⏳ %s generating arguments...\n", event.ToolName)
})
host.OnToolCallDelta(func(event kit.ToolCallDeltaEvent) {
// Each streamed JSON fragment of the tool arguments.
// event.ToolCallID, event.Delta
// Useful for live-previewing content or showing byte progress.
})
host.OnToolCallEnd(func(event kit.ToolCallEndEvent) {
// Tool argument streaming complete — execution about to begin.
// event.ToolCallID
fmt.Printf("✓ Arguments ready, executing...\n")
})
```
**Full tool lifecycle**: `ToolCallStartEvent` → `ToolCallDeltaEvent` (repeated) → `ToolCallEndEvent` → `ToolCallEvent` → `ToolExecutionStartEvent` → `ToolOutputEvent` (optional) → `ToolExecutionEndEvent` → `ToolResultEvent`
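One way to consume the three streaming events is to buffer the JSON fragments per `ToolCallID` and parse them once streaming ends. The sketch below is independent of the Kit API: `argBuffer` and its methods are hypothetical names that would be called from the `OnToolCallStart`, `OnToolCallDelta`, and `OnToolCallEnd` callbacks. It is not goroutine-safe, so add a mutex if callbacks may run concurrently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// argBuffer accumulates streamed argument fragments keyed by tool call ID.
type argBuffer struct {
	parts map[string]*strings.Builder
}

func newArgBuffer() *argBuffer {
	return &argBuffer{parts: map[string]*strings.Builder{}}
}

// start registers a new tool call (call from the start callback).
func (b *argBuffer) start(id string) { b.parts[id] = &strings.Builder{} }

// delta appends a fragment and returns the bytes buffered so far, which
// can drive a simple progress indicator.
func (b *argBuffer) delta(id, fragment string) int {
	sb, ok := b.parts[id]
	if !ok { // tolerate a delta arriving without a prior start
		sb = &strings.Builder{}
		b.parts[id] = sb
	}
	sb.WriteString(fragment)
	return sb.Len()
}

// end parses the buffered JSON and releases the buffer.
func (b *argBuffer) end(id string) (map[string]any, error) {
	sb, ok := b.parts[id]
	if !ok {
		return nil, fmt.Errorf("unknown tool call %q", id)
	}
	delete(b.parts, id)
	var args map[string]any
	if err := json.Unmarshal([]byte(sb.String()), &args); err != nil {
		return nil, err
	}
	return args, nil
}

func main() {
	buf := newArgBuffer()
	buf.start("call-1")
	buf.delta("call-1", `{"path":"a.txt",`)
	buf.delta("call-1", `"content":"hi"}`)
	args, err := buf.end("call-1")
	fmt.Println(args["path"], err)
}
```

Individual fragments are not valid JSON on their own, so parsing is deferred until the end event; a live preview would need a tolerant or incremental parser instead.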
## Hook system
Hooks can **modify or cancel** operations. Unlike events (read-only), hooks are read-write interceptors.
@@ -100,6 +126,22 @@ kit.HookPriorityLow = 100 // runs last
Lower values run first. First non-nil result wins.
## All event types
| Event | Description |
|-------|-------------|
| `ToolCallStartEvent` | LLM began generating tool call arguments (tool name known, args streaming) |
| `ToolCallDeltaEvent` | Streamed JSON fragment of tool call arguments |
| `ToolCallEndEvent` | Tool argument streaming complete, before execution begins |
| `ToolCallEvent` | Tool call fully parsed and about to execute |
| `ToolResultEvent` | Tool execution completed with result |
| `ToolOutputEvent` | Streaming output chunk from tool (e.g., bash stdout/stderr) |
| `MessageUpdateEvent` | Streaming text chunk from LLM |
| `ResponseEvent` | Final response received |
| `TurnStartEvent` | Agent turn started |
| `TurnEndEvent` | Agent turn completed |
| `PasswordPromptEvent` | Sudo command needs password (respond via `ResponseCh`) |
## Subagent event monitoring
Monitor real-time events from LLM-initiated subagents (when the model uses the `subagent` tool):
+178 -8
@@ -22,6 +22,20 @@ host, err := kit.New(ctx, &kit.Options{
Quiet: true,
Debug: true,
// Generation parameters (override env/config/per-model defaults)
MaxTokens: 16384, // 0 = auto-resolve; non-zero suppresses right-sizing
ThinkingLevel: "medium", // "off", "none", "minimal", "low", "medium", "high"
Temperature: ptrFloat32(0.2), // pointer so explicit 0.0 != unset
TopP: nil, // nil = provider/per-model default
TopK: nil,
FrequencyPenalty: nil,
PresencePenalty: nil,
// Provider configuration
ProviderAPIKey: "sk-...", // "" = use config / provider env var
ProviderURL: "https://proxy.internal/v1", // "" = provider default endpoint
TLSSkipVerify: false, // only effective when true
// Session
SessionPath: "./session.jsonl",
SessionDir: "/custom/sessions/",
@@ -51,7 +65,11 @@ host, err := kit.New(ctx, &kit.Options{
// Session (advanced)
SessionManager: myCustomSession, // custom SessionManager implementation
// MCP OAuth
// MCP OAuth — both opt-in. Leave MCPAuthHandler nil to disable
// OAuth entirely (remote MCP 401s bubble up as errors). CLI apps
// pass kit.NewCLIMCPAuthHandler(); custom UX embedders implement
// MCPAuthHandler or configure DefaultMCPAuthHandler + OnAuthURL.
MCPAuthHandler: authHandler, // nil = OAuth disabled
MCPTokenStoreFactory: func(serverURL string) (kit.MCPTokenStore, error) {
return myStore(serverURL), nil
},
@@ -65,6 +83,8 @@ host, err := kit.New(ctx, &kit.Options{
## Options fields
### Core
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `Model` | `string` | config default | Model string (provider/model format) |
@@ -74,25 +94,175 @@ host, err := kit.New(ctx, &kit.Options{
| `Streaming` | `bool` | `true` | Enable streaming output |
| `Quiet` | `bool` | `false` | Suppress output |
| `Debug` | `bool` | `false` | Enable debug logging |
### Generation parameters
These fields override the corresponding values from `.kit.yml` / `KIT_*`
environment variables. Leaving a field at its zero/nil value lets the
precedence chain resolve a value (`KIT_*` env → config file → per-model
defaults from `modelSettings`/`customModels` → an 8192 SDK floor for
`MaxTokens` (matching the CLI `--max-tokens` default) and provider-level
defaults for samplers).
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `MaxTokens` | `int` | auto-resolved | Max output tokens per response. `0` = auto-resolve; non-zero suppresses automatic right-sizing (same semantics as `--max-tokens`). |
| `ThinkingLevel` | `string` | auto-resolved | Reasoning effort: `"off"`, `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"`. `""` falls through to config/env/per-model/`"off"`. |
| `Temperature` | `*float32` | — | Sampling randomness. Pointer type so explicit `0.0` is distinguishable from "unset". |
| `TopP` | `*float32` | — | Nucleus sampling cutoff. `nil` leaves provider/per-model default. |
| `TopK` | `*int32` | — | Top-K sampling limit. `nil` leaves provider/per-model default. |
| `FrequencyPenalty` | `*float32` | — | OpenAI-family frequency penalty. `nil` leaves provider default. |
| `PresencePenalty` | `*float32` | — | OpenAI-family presence penalty. `nil` leaves provider default. |
Pointer-typed samplers are populated via a tiny helper:
```go
func ptrFloat32(v float32) *float32 { return &v }
```
These fields eliminate the need for `viper.Set()` calls before `kit.New()`
when embedding Kit as a library.
### Provider configuration
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `ProviderAPIKey` | `string` | — | API key used to authenticate with the provider. `""` falls back to config / provider-specific env var (e.g. `ANTHROPIC_API_KEY`). When set, overrides any pre-existing viper state. |
| `ProviderURL` | `string` | — | Override the provider endpoint (e.g. LiteLLM, vLLM, Azure OpenAI, internal proxy). `""` = provider default. |
| `TLSSkipVerify` | `bool` | `false` | Disable TLS certificate verification on the provider HTTP client. Only effective when `true`; to force-disable, use config file or env var instead. For self-signed dev certs only. |
### Session
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `SessionPath` | `string` | — | Open a specific session file |
| `SessionDir` | `string` | — | Base directory for session discovery |
| `Continue` | `bool` | `false` | Resume most recent session |
| `NoSession` | `bool` | `false` | Ephemeral mode (no persistence) |
| `SessionManager` | `SessionManager` | — | Custom session backend (advanced) |
### Tools & extensions
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `Tools` | `[]Tool` | — | Replace the entire default tool set |
| `ExtraTools` | `[]Tool` | — | Additional tools alongside core/MCP/extension tools |
| `DisableCoreTools` | `bool` | `false` | Use no core tools (0 tools, for chat-only) |
| `SkipConfig` | `bool` | `false` | Skip .kit.yml file loading |
| `AutoCompact` | `bool` | `false` | Auto-compact when near context limit |
| `CompactionOptions` | `*CompactionOptions` | — | Configuration for auto-compaction |
| `NoExtensions` | `bool` | `false` | Disable Yaegi extension loading |
| `NoContextFiles` | `bool` | `false` | Disable automatic AGENTS.md loading |
### Skills & configuration
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `SkipConfig` | `bool` | `false` | Skip `.kit.yml` file loading (viper defaults + env vars still apply) |
| `Skills` | `[]string` | — | Explicit skill files/dirs to load |
| `SkillsDir` | `string` | — | Override default skills directory |
| `NoSkills` | `bool` | `false` | Disable skill loading entirely |
| `NoExtensions` | `bool` | `false` | Disable Yaegi extension loading |
| `NoContextFiles` | `bool` | `false` | Disable automatic AGENTS.md loading |
| `SessionManager` | `SessionManager` | — | Custom session backend (advanced) |
| `MCPTokenStoreFactory` | `func` | — | Custom OAuth token storage for MCP servers |
### Compaction & MCP
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `AutoCompact` | `bool` | `false` | Auto-compact when near context limit |
| `CompactionOptions` | `*CompactionOptions` | — | Configuration for auto-compaction |
| `MCPAuthHandler` | `MCPAuthHandler` | — | OAuth handler for remote MCP servers. `nil` disables OAuth (servers returning 401 fail with the authorization-required error). See [MCP OAuth](#mcp-oauth-authorization) below. |
| `MCPTokenStoreFactory` | `func` | — | Custom OAuth token storage for MCP servers (default: JSON file in `$XDG_CONFIG_HOME/.kit/mcp_tokens.json`). |
| `InProcessMCPServers` | `map[string]*MCPServer` | — | In-process mcp-go servers (no subprocess) |
## MCP OAuth Authorization
When a remote MCP server (SSE or Streamable HTTP) requires OAuth, Kit runs
the full authorization flow (dynamic client registration → PKCE → user
consent → token exchange → token persistence) but delegates the **user-facing
step** — displaying the authorization URL and receiving the callback — to
an `MCPAuthHandler`.
The SDK is deliberately inert when `MCPAuthHandler` is `nil`: it does **not**
auto-construct a default handler, bind a local TCP port, or open a browser.
This keeps library, daemon, and web-app embedders free of surprise I/O.
Consumers opt in by passing a handler explicitly.
| Building block | When to use |
|---|---|
| `MCPAuthHandler = nil` (default) | OAuth disabled. Remote MCP servers requiring auth fail with a clear error. Correct for libraries, daemons, and web apps. |
| `kit.NewCLIMCPAuthHandler()` | CLI/TUI apps. Opens the system browser, prints status to stderr (or via `NotifyFunc`), runs a localhost callback server. Used by the `kit` binary. |
| `kit.NewDefaultMCPAuthHandler()` + `OnAuthURL` | Custom UX. Use the SDK's port reservation and callback server; plug in your own presentation via the `OnAuthURL(serverName, authURL)` closure. |
| Implement `kit.MCPAuthHandler` directly | Full control. No localhost binding — e.g. return the URL from an HTTP endpoint and have the consumer POST the callback URL back. |
**CLI-style embedder:**
```go
authHandler, err := kit.NewCLIMCPAuthHandler()
if err != nil {
log.Fatal(err)
}
defer authHandler.Close() // release the reserved port
host, _ := kit.New(ctx, &kit.Options{
MCPAuthHandler: authHandler,
})
```
**Custom UX embedder (TUI modal, QR code, web redirect, etc.):**
```go
authHandler, _ := kit.NewDefaultMCPAuthHandler()
authHandler.OnAuthURL = func(serverName, authURL string) {
// No browser or terminal assumptions — render however you like.
myUI.ShowAuthPrompt(serverName, authURL)
}
defer authHandler.Close()
host, _ := kit.New(ctx, &kit.Options{
MCPAuthHandler: authHandler,
})
```
**Fully custom handler (no local port binding at all):**
```go
type WebAuthHandler struct {
redirectURI string
callbacks chan string
}
func (h *WebAuthHandler) RedirectURI() string { return h.redirectURI }
func (h *WebAuthHandler) HandleAuth(ctx context.Context, serverName, authURL string) (string, error) {
// Push the URL to the user's existing browser session via your web app,
// then block on the callback that your HTTP handler pushes onto the channel.
h.pushToUserSession(serverName, authURL)
select {
case callbackURL := <-h.callbacks:
return callbackURL, nil
case <-ctx.Done():
return "", ctx.Err()
}
}
```
::: warning
`DefaultMCPAuthHandler` with no `OnAuthURL` set will silently drop the
authorization URL and hang until the 2-minute callback timeout fires. Always
set `OnAuthURL`, or use a higher-level wrapper like `CLIMCPAuthHandler`.
:::
## Precedence
For any given generation or provider field, the effective value is resolved
in this order (highest priority first):
1. `Options.X` (SDK caller)
2. `KIT_X` environment variable
3. `.kit.yml` (project-local then `~/.kit.yml`)
4. Per-model defaults (`modelSettings[provider/model]` or `customModels[...].params`)
5. Provider-level defaults (e.g. Anthropic's own temperature default)
6. SDK last-resort floor (currently: `MaxTokens = 8192`, matching the CLI `--max-tokens` default)
Sampling params that remain `nil` after the SDK resolution step are left out
of the provider call entirely, so the LLM library applies its own default.
## Tool configuration
**`Tools`** replaces ALL default tools (core + MCP + extension). **`ExtraTools`** adds tools alongside the defaults. Use `Tools` to restrict capabilities; use `ExtraTools` to extend them.
+21
@@ -106,6 +106,27 @@ For advanced use, return a `kit.ToolOutput` struct directly with `Data`, `MediaT
Use `kit.NewParallelTool` for tools that are safe to run concurrently. Use `kit.ToolCallIDFromContext(ctx)` to retrieve the LLM-assigned call ID for logging or tracing.
## Generation & provider overrides
SDK consumers can configure generation parameters and provider endpoints
entirely in-code via `Options`, without touching `.kit.yml` or `viper.Set()`:
```go
host, _ := kit.New(ctx, &kit.Options{
Model: "anthropic/claude-sonnet-4-5-20250929",
MaxTokens: 16384, // 0 = auto-resolve (env → config → per-model → floor)
ThinkingLevel: "high", // "off" | "none" | "minimal" | "low" | "medium" | "high"
Temperature: ptrFloat32(0.2), // nil = provider/per-model default
ProviderAPIKey: os.Getenv("MY_SECRET"), // overrides pre-existing viper state
ProviderURL: "https://proxy.internal/v1",
})
func ptrFloat32(v float32) *float32 { return &v }
```
See [Options](/sdk/options#generation-parameters) for the full field reference,
including `TopP`, `TopK`, `FrequencyPenalty`, `PresencePenalty`, and `TLSSkipVerify`.
## Event system
Subscribe to events for monitoring: