feat(prompts): add $+ required variadic, skip code in placeholders

- Add internal/fences package for detecting markdown code regions (fenced blocks and inline code spans) with ReplaceOutside/StripCode
- SubstituteArgs, HasArgPlaceholders, RequiredArgs now skip $ placeholders inside ``` fences and `inline` code spans
- ProcessFileAttachments skips @file tokens inside code regions
- Add $+ placeholder: expands like $@ but requires at least 1 argument
- Add RequiredArgs() method; expandPromptTemplate validates arg count and re-populates input on failure instead of submitting
- Update feature-request, file-issue, new-prompt to use $+
feat(ui): populate input instead of auto-submitting prompts with args
2026-06-14 03:30:26 +00:00 · 2026-04-14 13:22:10 +03:00 · 2026-04-14 12:46:12 +03:00 · 2026-04-14 12:39:29 +03:00 · 2026-04-14 12:28:04 +03:00 · 2026-04-14 12:16:22 +03:00
75 changed files with 7376 additions and 1023 deletions
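The commit message describes two behaviours: `$+` expands like `$@` but fails when no arguments are given, and placeholders inside code regions are left alone. The following is a minimal sketch of that idea, not the code in `internal/fences` or `internal/prompts`. The names `replaceOutsideCode`, `requiresArgs`, and `substitute` are hypothetical, and the code-region regex is a deliberate simplification of Markdown's fence rules.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// fence is three backticks, built at runtime to keep this listing easy to embed.
var fence = strings.Repeat("`", 3)

// codeRegion matches a fenced block or a single-line inline code span.
// Assumption: this ignores tilde fences, indentation, and longer backtick runs.
var codeRegion = regexp.MustCompile("(?s)" + fence + ".*?" + fence + "|`[^`\\n]+`")

// replaceOutsideCode applies fn to every stretch of text outside a code
// region and copies the code regions through unchanged.
func replaceOutsideCode(text string, fn func(string) string) string {
	var b strings.Builder
	last := 0
	for _, loc := range codeRegion.FindAllStringIndex(text, -1) {
		b.WriteString(fn(text[last:loc[0]]))
		b.WriteString(text[loc[0]:loc[1]])
		last = loc[1]
	}
	b.WriteString(fn(text[last:]))
	return b.String()
}

var errNeedsArgs = errors.New("template requires at least one argument")

// requiresArgs reports whether $+ appears anywhere outside a code region.
func requiresArgs(tpl string) bool {
	found := false
	replaceOutsideCode(tpl, func(s string) string {
		if strings.Contains(s, "$+") {
			found = true
		}
		return s
	})
	return found
}

// substitute expands $@ and $+ (outside code) to all args joined by spaces.
// It returns an error instead of expanding when $+ is present and args is empty.
func substitute(tpl string, args []string) (string, error) {
	if requiresArgs(tpl) && len(args) == 0 {
		return "", errNeedsArgs
	}
	all := strings.Join(args, " ")
	r := strings.NewReplacer("$@", all, "$+", all)
	return replaceOutsideCode(tpl, r.Replace), nil
}

func main() {
	out, err := substitute("Request: $+ (literal: `$+`)", []string{"dark", "mode"})
	fmt.Println(out, err) // Request: dark mode (literal: `$+`) <nil>

	_, err = substitute("Request: $+", nil)
	fmt.Println(err) // template requires at least one argument
}
```

Returning an error rather than an empty expansion lets the caller decide what to do on failure, for example putting the text back into the input box as the commit message describes.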
@@ -2,7 +2,7 @@
 description: Create a feature request using the GitHub template
 ---

-Create a feature request for the Kit repository. The user wants to request: $@
+Create a feature request for the Kit repository. The user wants to request: $+

 ## Feature Request Template

@@ -16,7 +16,7 @@ This prompt uses the `feature_request` GitHub template which requires:

 ## Steps

-1. **Understand the request** from `$@`
+1. **Understand the request** from `$+`
   - What capability is missing?
   - What would the ideal behavior look like?

@@ -2,7 +2,7 @@
 description: File a GitHub issue using the appropriate template
 ---

-File a GitHub issue for the Kit repository. The user wants to create an issue about: $@
+File a GitHub issue for the Kit repository. The user wants to create an issue about: $+

 ## Issue Templates Available

@@ -16,7 +16,7 @@ This repository has structured issue templates. You MUST use the appropriate tem

 ## Steps

-1. **Determine the issue type** from `$@`:
+1. **Determine the issue type** from `$+`:
   - Bug → use `--template bug_report`
   - Feature → use `--template feature_request`  
   - Documentation → use `--template documentation`
@@ -2,7 +2,7 @@
 description: Scaffold a new prompt template in .kit/prompts/
 ---

-Create a new kit prompt template. The user wants a prompt that does: $@
+Create a new kit prompt template. The user wants a prompt that does: $+

 ## What a prompt template is

@@ -23,19 +23,21 @@ $1 $2 etc. for positional arguments.
 - **Filename** → slug: `commit-push.md` becomes `/commit-push`
 - **Frontmatter**: only `description` is recognised; keep it under ~80 chars
 - **Body**: plain markdown; the full text is submitted as the user's message when the template fires
- **Arguments**: `$@` expands to everything the user typed after the slash command name;
+- **Arguments**: `$+` expands to everything the user typed after the slash command name
+  (requires at least one argument); `$@` is the same but allows zero arguments;
  `$1`, `$2` for individual positional args; omit entirely if no arguments are needed

 ## Steps

-1. **Understand the workflow** the user described in `$@` — ask a clarifying question if the intent is ambiguous
+1. **Understand the workflow** the user described in `$+` — ask a clarifying question if the intent is ambiguous
 2. **Choose a filename**: short, lowercase, hyphen-separated, descriptive (e.g. `code-review.md`)
 3. **Write the description**: one sentence, imperative, fits in autocomplete
 4. **Draft the body**:
   - Open with a single sentence stating the goal
   - Use `## Steps` for multi-step workflows; use plain prose for simple prompts
   - Be specific: name commands, flags, and file paths where relevant
-   - End with `$@` on its own line if the user might want to pass context or a hint; omit if the prompt is self-contained
+   - End with `$+` on its own line if the user must pass context; use `$@` if arguments
+     are optional; omit if the prompt is self-contained
 5. **Write the file** to `.kit/prompts/<slug>.md`
 6. **Confirm** by showing the final file content and the slash command that activates it

@@ -317,39 +317,39 @@ kit -e examples/extensions/minimal.go

 See the `examples/extensions/` directory:

- `minimal.go` - Clean UI with custom footer
- `auto-commit.go` - Auto-commit on shutdown
- `bookmark.go` - Bookmark conversations
- `branded-output.go` - Branded output rendering
- `compact-notify.go` - Notification on compaction
- `confirm-destructive.go` - Confirm destructive operations
- `context-inject.go` - Inject context into conversations
- `conversation-manager.go` - **NEW** Tree navigation, branch summarization, and fresh context loops
- `custom-editor-demo.go` - Vim-like modal editor
- `dev-reload.go` - Development live-reload
- `header-footer-demo.go` - Custom headers and footers
- `inline-bash.go` - Inline bash execution
- `interactive-shell.go` - Interactive shell integration
- `kit-kit.go` - Kit-in-Kit (sub-agent spawning)
- `lsp-diagnostics.go` - LSP diagnostic integration
- `notify.go` - Desktop notifications
- `overlay-demo.go` - Modal dialogs
- `permission-gate.go` - Permission gating for tools
- `pirate.go` - Pirate-themed personality
- `plan-mode.go` - Read-only planning mode
- `project-rules.go` - Project-specific rules
- `prompt-demo.go` - Interactive prompts (select/confirm/input)
- `prompt-templates.go` - **NEW** Frontmatter-driven templates with model switching and skill injection
- `protected-paths.go` - Path protection for sensitive files
- `subagent-widget.go` - Multi-agent orchestration with status widget
- `subagent-test.go` - Subagent testing utilities
- `summarize.go` - Conversation summarization
- `tool-logger.go` - Log all tool calls
- `neon-theme.go` - Custom theme registration and switching
- `tool-renderer-demo.go` - Custom tool call rendering
- `widget-status.go` - Persistent status widgets
+- [`minimal.go`](examples/extensions/minimal.go) - Clean UI with custom footer
+- [`auto-commit.go`](examples/extensions/auto-commit.go) - Auto-commit on shutdown
+- [`bookmark.go`](examples/extensions/bookmark.go) - Bookmark conversations
+- [`branded-output.go`](examples/extensions/branded-output.go) - Branded output rendering
+- [`compact-notify.go`](examples/extensions/compact-notify.go) - Notification on compaction
+- [`confirm-destructive.go`](examples/extensions/confirm-destructive.go) - Confirm destructive operations
+- [`context-inject.go`](examples/extensions/context-inject.go) - Inject context into conversations
+- [`conversation-manager.go`](examples/extensions/conversation-manager.go) - **NEW** Tree navigation, branch summarization, and fresh context loops
+- [`custom-editor-demo.go`](examples/extensions/custom-editor-demo.go) - Vim-like modal editor
+- [`dev-reload.go`](examples/extensions/dev-reload.go) - Development live-reload
+- [`header-footer-demo.go`](examples/extensions/header-footer-demo.go) - Custom headers and footers
+- [`inline-bash.go`](examples/extensions/inline-bash.go) - Inline bash execution
+- [`interactive-shell.go`](examples/extensions/interactive-shell.go) - Interactive shell integration
+- [`kit-kit.go`](examples/extensions/kit-kit.go) - Kit-in-Kit (sub-agent spawning)
+- [`lsp-diagnostics.go`](examples/extensions/lsp-diagnostics.go) - LSP diagnostic integration
+- [`notify.go`](examples/extensions/notify.go) - Desktop notifications
+- [`overlay-demo.go`](examples/extensions/overlay-demo.go) - Modal dialogs
+- [`permission-gate.go`](examples/extensions/permission-gate.go) - Permission gating for tools
+- [`pirate.go`](examples/extensions/pirate.go) - Pirate-themed personality
+- [`plan-mode.go`](examples/extensions/plan-mode.go) - Read-only planning mode
+- [`project-rules.go`](examples/extensions/project-rules.go) - Project-specific rules
+- [`prompt-demo.go`](examples/extensions/prompt-demo.go) - Interactive prompts (select/confirm/input)
+- [`prompt-templates.go`](examples/extensions/prompt-templates.go) - **NEW** Frontmatter-driven templates with model switching and skill injection
+- [`protected-paths.go`](examples/extensions/protected-paths.go) - Path protection for sensitive files
+- [`subagent-widget.go`](examples/extensions/subagent-widget.go) - Multi-agent orchestration with status widget
+- [`subagent-test.go`](examples/extensions/subagent-test.go) - Subagent testing utilities
+- [`summarize.go`](examples/extensions/summarize.go) - Conversation summarization
+- [`tool-logger.go`](examples/extensions/tool-logger.go) - Log all tool calls
+- [`neon-theme.go`](examples/extensions/neon-theme.go) - Custom theme registration and switching
+- [`tool-renderer-demo.go`](examples/extensions/tool-renderer-demo.go) - Custom tool call rendering
+- [`widget-status.go`](examples/extensions/widget-status.go) - Persistent status widgets

-Also see `.kit/extensions/go-edit-lint.go` (in this repo) for a project-local extension example that runs gopls and golangci-lint on Go file edits.
+Also see [`.kit/extensions/go-edit-lint.go`](.kit/extensions/go-edit-lint.go) (in this repo) for a project-local extension example that runs gopls and golangci-lint on Go file edits.

 ### Loading Extensions

@@ -406,7 +406,7 @@ func TestMyExtension(t *testing.T) {
 - `AssertPrinted()`, `AssertPrintedContains()` — Verify output
 - `AssertToolRegistered()`, `AssertCommandRegistered()` — Verify registration

-See `examples/extensions/tool-logger_test.go` for a complete example with 14 test cases covering tool calls, input handling, and session lifecycle.
+See [`examples/extensions/tool-logger_test.go`](examples/extensions/tool-logger_test.go) for a complete example with 14 test cases covering tool calls, input handling, and session lifecycle.

 ### Prompt Templates

@@ -531,7 +531,12 @@ host, err := kit.New(ctx, &kit.Options{
    NoSession:    true,                // Ephemeral mode

    // Tool options
-    ExtraTools:   []kit.Tool{...},     // Additional tools alongside defaults
+    Tools:            []kit.Tool{...},     // Replace default tool set entirely
+    ExtraTools:       []kit.Tool{...},     // Add tools alongside defaults
+    DisableCoreTools: true,                // Use no core tools (0 tools, for chat-only)
+
+    // Configuration
+    SkipConfig:   true,                   // Skip .kit.yml files (viper defaults + env vars still apply)

    // Compaction
    AutoCompact:  true,                // Auto-compact near context limit
@@ -540,6 +545,28 @@ host, err := kit.New(ctx, &kit.Options{
 })
 ```

+### Custom Tools
+
+Create custom tools with automatic schema generation — no external dependencies needed:
+
+```go
+type SearchInput struct {
+    Query string `json:"query" description:"Search query"`
+}
+
+searchTool := kit.NewTool("search", "Search the codebase",
+    func(ctx context.Context, input SearchInput) (kit.ToolOutput, error) {
+        return kit.TextResult("Found: ..."), nil
+    },
+)
+
+host, _ := kit.New(ctx, &kit.Options{
+    ExtraTools: []kit.Tool{searchTool}, // adds alongside built-in tools
+})
+```
+
+Use `kit.NewParallelTool` for tools safe to run concurrently. See the [SDK docs](/sdk/overview) for full details on struct tags, `ToolOutput` fields, and `ToolCallIDFromContext`.
+
 ### With Callbacks

 ```go
@@ -7,6 +7,7 @@ import (
 	"image/color"
 	"log"
 	"os"
+	"path/filepath"
 	"strings"

 	tea "charm.land/bubbletea/v2"
@@ -18,6 +19,7 @@ import (
 	"github.com/mark3labs/kit/internal/prompts"
 	"github.com/mark3labs/kit/internal/ui"
 	"github.com/mark3labs/kit/internal/ui/commands"
+	"github.com/mark3labs/kit/internal/watcher"
 	kit "github.com/mark3labs/kit/pkg/kit"
 	"github.com/spf13/cobra"
 	"github.com/spf13/viper"
@@ -48,12 +50,14 @@ var (
 	noSessionFlag bool // --no-session: ephemeral mode, no persistence

 	// Model generation parameters
-	maxTokens     int
-	temperature   float32
-	topP          float32
-	topK          int32
-	stopSequences []string
-	thinkingLevel string
+	maxTokens        int
+	temperature      float32
+	topP             float32
+	topK             int32
+	frequencyPenalty float32
+	presencePenalty  float32
+	stopSequences    []string
+	thinkingLevel    string

 	// Ollama-specific parameters
 	numGPU  int32
@@ -291,6 +295,8 @@ func init() {
 	flags.Float32Var(&temperature, "temperature", 0.7, "controls randomness in responses (0.0-1.0)")
 	flags.Float32Var(&topP, "top-p", 0.95, "controls diversity via nucleus sampling (0.0-1.0)")
 	flags.Int32Var(&topK, "top-k", 40, "controls diversity by limiting top K tokens to sample from")
+	flags.Float32Var(&frequencyPenalty, "frequency-penalty", 0.0, "penalizes tokens based on frequency of appearance (0.0-2.0)")
+	flags.Float32Var(&presencePenalty, "presence-penalty", 0.0, "penalizes tokens based on whether they have appeared (0.0-2.0)")
 	flags.StringSliceVar(&stopSequences, "stop-sequences", nil, "custom stop sequences (comma-separated)")
 	flags.StringVar(&thinkingLevel, "thinking-level", "off", "extended thinking level: off, minimal, low, medium, high")

@@ -313,6 +319,8 @@ func init() {
 	_ = viper.BindPFlag("temperature", rootCmd.PersistentFlags().Lookup("temperature"))
 	_ = viper.BindPFlag("top-p", rootCmd.PersistentFlags().Lookup("top-p"))
 	_ = viper.BindPFlag("top-k", rootCmd.PersistentFlags().Lookup("top-k"))
+	_ = viper.BindPFlag("frequency-penalty", rootCmd.PersistentFlags().Lookup("frequency-penalty"))
+	_ = viper.BindPFlag("presence-penalty", rootCmd.PersistentFlags().Lookup("presence-penalty"))
 	_ = viper.BindPFlag("stop-sequences", rootCmd.PersistentFlags().Lookup("stop-sequences"))
 	_ = viper.BindPFlag("thinking-level", rootCmd.PersistentFlags().Lookup("thinking-level"))
 	_ = viper.BindPFlag("num-gpu-layers", rootCmd.PersistentFlags().Lookup("num-gpu-layers"))
@@ -723,6 +731,11 @@ func runNormalMode(ctx context.Context) error {
 		fmt.Fprintf(os.Stderr, "Warning: Failed to create OAuth handler: %v\n", authErr)
 	}

+	// appInstancePtr is used to break the circular dependency between
+	// kit.New (which needs the OnMCPServerLoaded callback) and app.New
+	// (which is needed by the callback to send events to the TUI).
+	var appInstancePtr *app.App
+
 	kitOpts := &kit.Options{
 		Quiet:          quietFlag,
 		Debug:          debugMode,
@@ -731,6 +744,14 @@ func runNormalMode(ctx context.Context) error {
 		SessionPath:    sessionPath,
 		AutoCompact:    autoCompactFlag,
 		MCPAuthHandler: authHandler,
+		// This callback is called when each MCP server finishes loading.
+		// We use a closure that captures appInstancePtr which is set after
+		// app.New() is called below.
+		OnMCPServerLoaded: func(serverName string, toolCount int, err error) {
+			if appInstancePtr != nil {
+				appInstancePtr.NotifyMCPServerLoaded(serverName, toolCount, err)
+			}
+		},
 		CLI: &kit.CLIOptions{
 			MCPConfig:         mcpConfig,
 			ShowSpinner:       true,
@@ -801,6 +822,7 @@ func runNormalMode(ctx context.Context) error {
 	}

 	appInstance := app.New(appOpts, messages)
+	appInstancePtr = appInstance // Wire up the MCP server loaded callback.
 	defer appInstance.Close()

 	// Wire OAuth handler to route messages through the TUI once it's running.
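The hunks above break the `kit.New` / `app.New` cycle with a plain pointer that a closure reads and the main function assigns later. The diff does not show which goroutine invokes `OnMCPServerLoaded`; if it can fire from a background goroutine, a plain pointer read would race with the assignment. A hedged sketch of the same late-binding pattern using `sync/atomic` follows. `App`, `NotifyLoaded`, and `lateBound` are stand-in names for illustration, not types from this repository.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// App stands in for the real application type (hypothetical).
type App struct{ loaded atomic.Int32 }

// NotifyLoaded records that one server finished loading.
func (a *App) NotifyLoaded(name string) { a.loaded.Add(1) }

// lateBound holds an *App that is published after the callback is created.
type lateBound struct{ app atomic.Pointer[App] }

// callback returns a function that is safe to call before or after the
// pointer is stored, from any goroutine. Calls made before the store are dropped.
func (l *lateBound) callback() func(name string) {
	return func(name string) {
		if a := l.app.Load(); a != nil {
			a.NotifyLoaded(name)
		}
	}
}

func main() {
	lb := &lateBound{}
	onLoaded := lb.callback() // handed to the component that is built first

	onLoaded("early") // app not wired yet: silently ignored

	app := &App{}
	lb.app.Store(app) // wire up once the second component exists

	onLoaded("late")
	fmt.Println(app.loaded.Load()) // 1
}
```

As in the original code, events that arrive before the pointer is set are dropped rather than queued.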
@@ -1614,6 +1636,49 @@ func runNormalMode(ctx context.Context) error {
 		})
 	}

+	// Build prompt template and skill item provider callbacks for hot-reload.
+	// These are called by the TUI when ContentReloadEvent fires.
+	getPromptTemplates := func() []*prompts.PromptTemplate {
+		if noPromptTemplates {
+			return nil
+		}
+		homeDir, _ := os.UserHomeDir()
+		cwd, _ := os.Getwd()
+		tpls, _, err := prompts.LoadAll(prompts.LoadOptions{
+			Cwd:             cwd,
+			HomeDir:         homeDir,
+			ExtraPaths:      promptTemplatePaths,
+			ConfigPaths:     viper.GetStringSlice("prompts"),
+			IncludeDefaults: true,
+		})
+		if err != nil {
+			log.Printf("Warning: failed to reload prompt templates: %v", err)
+		}
+		return tpls
+	}
+
+	getSkillItems := func() []ui.SkillItem {
+		// Re-discover skills from disk.
+		if err := kitInstance.ReloadSkills(); err != nil {
+			log.Printf("Warning: failed to reload skills: %v", err)
+			return nil
+		}
+		cwd, _ := os.Getwd()
+		var items []ui.SkillItem
+		for _, s := range kitInstance.GetSkills() {
+			source := "user"
+			if strings.HasPrefix(s.Path, cwd) {
+				source = "project"
+			}
+			items = append(items, ui.SkillItem{
+				Name:   s.Name,
+				Path:   s.Path,
+				Source: source,
+			})
+		}
+		return items
+	}
+
 	// Build extension UI providers once (shared between both modes).
 	getWidgets := widgetProviderForUI(kitInstance)
 	getHeader := headerProviderForUI(kitInstance)
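`getSkillItems` labels a skill as `project` when its path starts with the working directory. A raw string prefix also matches sibling directories that share the prefix (for example `/work/kit-fork` against `/work/kit`). A small sketch of a path-aware check using `filepath.Rel` is shown below; `isWithin` is a hypothetical helper name, and the example paths assume Unix-style separators.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isWithin reports whether path is dir itself or lies underneath it.
// It compares path components rather than raw string prefixes.
func isWithin(dir, path string) bool {
	rel, err := filepath.Rel(dir, path)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	cwd := "/work/kit"
	inside := "/work/kit/.kit/skills/review.md"
	sibling := "/work/kit-fork/.kit/skills/review.md"

	fmt.Println(strings.HasPrefix(sibling, cwd)) // true: prefix check misclassifies
	fmt.Println(isWithin(cwd, inside))           // true
	fmt.Println(isWithin(cwd, sibling))          // false
}
```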
@@ -1629,6 +1694,25 @@ func runNormalMode(ctx context.Context) error {
 		return extensionCommandsForUI(kitInstance)
 	}

+	// Build dynamic tool name and MCP tool count providers. These are called
+	// by the TUI when MCPToolsReadyEvent fires to refresh the /tools list
+	// and startup info bar after background MCP tool loading completes.
+	getToolNames := func() []string {
+		return kitInstance.GetToolNames()
+	}
+	getMCPToolCount := func() int {
+		return kitInstance.GetMCPToolCount()
+	}
+
+	// Start a goroutine that waits for background MCP tool loading to
+	// complete and notifies the TUI so it can refresh tool names and counts.
+	if len(mcpConfig.MCPServers) > 0 {
+		go func() {
+			_ = kitInstance.WaitForMCPTools()
+			appInstance.NotifyMCPToolsReady()
+		}()
+	}
+
 	// Build model switching callbacks for the /model command.
 	setModelForUI := func(modelString string) error {
 		err := kitInstance.SetModel(context.Background(), modelString)
@@ -1709,9 +1793,54 @@ func runNormalMode(ctx context.Context) error {
 		}
 	}

+	// Start file watchers for automatic prompt and skill hot-reload.
+	{
+		homeDir, _ := os.UserHomeDir()
+		cwd, _ := os.Getwd()
+
+		// Collect prompt template directories.
+		promptDirs := watcher.CollectDirs(
+			[]string{
+				filepath.Join(homeDir, ".kit", "prompts"),
+				filepath.Join(cwd, ".kit", "prompts"),
+			},
+			append(promptTemplatePaths, viper.GetStringSlice("prompts")...),
+		)
+
+		// Collect skill directories.
+		skillDirs := watcher.CollectDirs(
+			[]string{
+				filepath.Join(homeDir, ".config", "kit", "skills"),
+				filepath.Join(cwd, ".agents", "skills"),
+				filepath.Join(cwd, ".kit", "skills"),
+			},
+			nil,
+		)
+
+		// Combine all content directories and start a single watcher.
+		allContentDirs := append(promptDirs, skillDirs...)
+		if len(allContentDirs) > 0 {
+			contentWatcher, watchErr := watcher.New(watcher.Options{
+				Dirs:       allContentDirs,
+				Extensions: []string{".md", ".txt"},
+				Label:      "prompts/skills",
+				OnReload: func() {
+					log.Printf("auto-reloading prompts and skills")
+					appInstance.NotifyContentReload()
+				},
+			})
+			if watchErr != nil {
+				log.Printf("content file watcher not started: %v", watchErr)
+			} else {
+				go contentWatcher.Start(ctx)
+				defer func() { _ = contentWatcher.Close() }()
+			}
+		}
+	}
+
 	// Check if running in non-interactive mode
 	if positionalPrompt != "" {
-		return runNonInteractiveModeApp(ctx, appInstance, cli, positionalPrompt, quietFlag, jsonFlag, noExitFlag, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI)
+		return runNonInteractiveModeApp(ctx, appInstance, cli, positionalPrompt, quietFlag, jsonFlag, noExitFlag, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI)
 	}

 	// Quiet mode is not allowed in interactive mode
@@ -1719,7 +1848,7 @@ func runNormalMode(ctx context.Context) error {
 		return fmt.Errorf("--quiet requires a prompt")
 	}

-	return runInteractiveModeBubbleTea(ctx, appInstance, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI, startupExtensionMessages)
+	return runInteractiveModeBubbleTea(ctx, appInstance, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI, startupExtensionMessages)
 }

 // runNonInteractiveModeApp executes a single prompt via the app layer and exits,
@@ -1732,7 +1861,7 @@ func runNormalMode(ctx context.Context) error {
 //
 // When --no-exit is set, after the prompt completes the interactive BubbleTea
 // TUI is started so the user can continue the conversation.
-func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui.CLI, prompt string, quiet, jsonOutput, noExit bool, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error) error {
+func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui.CLI, prompt string, quiet, jsonOutput, noExit bool, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error) error {
 	// Expand @file references in the prompt before sending to the agent.
 	if cwd, err := os.Getwd(); err == nil {
 		prompt = ui.ProcessFileAttachments(prompt, cwd)
@@ -1775,7 +1904,7 @@ func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui

 	// If --no-exit was requested, hand off to the interactive TUI.
 	if noExit {
-		return runInteractiveModeBubbleTea(ctx, appInstance, modelName, providerName, loadingMessage, serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModel, emitModelChange, isReasoningModel, thinkingLevel, setThinkingLevel, switchSession, reloadExtensions, nil)
+		return runInteractiveModeBubbleTea(ctx, appInstance, modelName, providerName, loadingMessage, serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModel, emitModelChange, isReasoningModel, thinkingLevel, setThinkingLevel, switchSession, reloadExtensions, nil)
 	}

 	return nil
@@ -1873,7 +2002,19 @@ func writeJSONError(err error) {
 //  4. Calls program.Run() which blocks until the user quits (Ctrl+C or /quit).
 //
 // SetupCLI is not used for interactive mode; the TUI (AppModel) handles its own rendering.
-func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error, startupExtensionMessages []string) error {
+func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error, startupExtensionMessages []string) error {
+	// Redirect all log output (stdlib and charm) to a file so that log
+	// messages don't write to stderr and corrupt the TUI. Bubble Tea
+	// captures stdout for rendering; any stray stderr output from
+	// background goroutines (watchers, extension handlers, SDK internals)
+	// will visually corrupt the terminal.
+	logDir := filepath.Join(os.TempDir(), "kit")
+	_ = os.MkdirAll(logDir, 0o700)
+	logFile, logErr := tea.LogToFile(filepath.Join(logDir, "kit.log"), "kit")
+	if logErr == nil {
+		defer func() { _ = logFile.Close() }()
+	}
+
 	// Determine terminal size; fall back gracefully.
 	termWidth, termHeight, err := term.GetSize(int(os.Stdout.Fd()))
 	if err != nil || termWidth == 0 {
@@ -1892,13 +2033,17 @@ func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelN
 		Height:                   termHeight,
 		ServerNames:              serverNames,
 		ToolNames:                toolNames,
+		GetToolNames:             getToolNames,
+		GetMCPToolCount:          getMCPToolCount,
 		MCPToolCount:             mcpToolCount,
 		ExtensionToolCount:       extensionToolCount,
 		UsageTracker:             usageTracker,
 		ExtensionCommands:        extCommands,
 		PromptTemplates:          promptTemplates,
+		GetPromptTemplates:       getPromptTemplates,
 		ContextPaths:             contextPaths,
 		SkillItems:               skillItems,
+		GetSkillItems:            getSkillItems,
 		StartupExtensionMessages: startupExtensionMessages,
 		GetWidgets:               getWidgets,
 		GetHeader:                getHeader,
@@ -10,13 +10,21 @@ import (
 	"kit/ext"
 )

+// re matches !{...} with non-greedy content.
+var re = regexp.MustCompile(`!\{([^}]+)\}`)
+
 // Init expands inline bash expressions in user prompts before they reach the
-// LLM. Text like !{git branch --show-current} is replaced with the command's
-// stdout.
+// LLM. Text like !{git rev-parse --abbrev-ref HEAD} is replaced with the
+// command's stdout.
+//
+// In interactive mode the expansion happens at submit time via an editor
+// interceptor, so the expanded text is also visible in the user message
+// block on screen. In non-interactive mode (CLI, script, queue) the
+// expansion happens via OnInput transform.
 //
 // Examples:
 //
-//	"Fix the tests on !{git branch --show-current}"
+//	"Fix the tests on !{git rev-parse --abbrev-ref HEAD}"
 //	  → "Fix the tests on main"
 //
 //	"The current directory is !{pwd}"
@@ -24,29 +32,59 @@ import (
 //
 // Usage: kit -e examples/extensions/inline-bash.go
 func Init(api ext.API) {
-	// Matches !{...} with non-greedy content.
-	re := regexp.MustCompile(`!\{([^}]+)\}`)
+	// ── Interactive mode: editor interceptor ──────────────────────────
+	// Intercept Enter / Ctrl+D so we can expand !{...} BEFORE the
+	// SubmitMsg is created. This ensures the expanded text appears in
+	// the user message block on screen as well as in the LLM prompt.
+	api.OnSessionStart(func(_ ext.SessionStartEvent, ctx ext.Context) {
+		if !ctx.Interactive {
+			return
+		}
+		ctx.SetEditor(ext.EditorConfig{
+			HandleKey: func(key string, currentText string) ext.EditorKeyAction {
+				if (key == "enter" || key == "ctrl+d") && re.MatchString(currentText) {
+					expanded := expand(currentText)
+					// Clear the textarea asynchronously — calling
+					// SetEditorText synchronously from inside Update()
+					// would deadlock the BubbleTea event loop.
+					go ctx.SetEditorText("")
+					return ext.EditorKeyAction{
+						Type:       ext.EditorKeySubmit,
+						SubmitText: expanded,
+					}
+				}
+				return ext.EditorKeyAction{Type: ext.EditorKeyPassthrough}
+			},
+		})
+	})

+	// ── Non-interactive fallback: OnInput transform ──────────────────
+	// For CLI, script, and queue sources the editor interceptor is not
+	// active, so we fall back to OnInput which still rewrites the
+	// prompt text sent to the LLM.
 	api.OnInput(func(ev ext.InputEvent, ctx ext.Context) *ext.InputResult {
-		if !re.MatchString(ev.Text) {
+		if ev.Source == "interactive" || !re.MatchString(ev.Text) {
 			return nil
 		}

-		expanded := re.ReplaceAllStringFunc(ev.Text, func(match string) string {
-			// Extract the command between !{ and }.
-			cmd := re.FindStringSubmatch(match)[1]
-			cmd = strings.TrimSpace(cmd)
-
-			out, err := exec.Command("bash", "-c", cmd).Output()
-			if err != nil {
-				return match // keep original on error
-			}
-			return strings.TrimSpace(string(out))
-		})
-
 		return &ext.InputResult{
 			Action: "transform",
-			Text:   expanded,
+			Text:   expand(ev.Text),
 		}
 	})
 }
+
+// expand replaces every !{cmd} in text with the command's stdout.
+// On error the original !{cmd} token is preserved.
+func expand(text string) string {
+	return re.ReplaceAllStringFunc(text, func(match string) string {
+		cmd := re.FindStringSubmatch(match)[1]
+		cmd = strings.TrimSpace(cmd)
+
+		out, err := exec.Command("bash", "-c", cmd).Output()
+		if err != nil {
+			return match // keep original on error
+		}
+		return strings.TrimSpace(string(out))
+	})
+}
@@ -1,37 +1,38 @@
 module github.com/mark3labs/kit

-go 1.26.1
+go 1.26.2

 require (
 	charm.land/bubbles/v2 v2.1.0
-	charm.land/bubbletea/v2 v2.0.2
-	charm.land/fantasy v0.17.1
+	charm.land/bubbletea/v2 v2.0.5
+	charm.land/fantasy v0.17.2
 	charm.land/huh/v2 v2.0.3
-	charm.land/lipgloss/v2 v2.0.2
+	charm.land/lipgloss/v2 v2.0.3
 	github.com/alecthomas/chroma/v2 v2.23.1
 	github.com/atotto/clipboard v0.1.4
 	github.com/aymanbagabas/go-udiff v0.4.1
 	github.com/charmbracelet/fang v1.0.0
 	github.com/charmbracelet/log v1.0.0
 	github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266
-	github.com/charmbracelet/ultraviolet v0.0.0-20260330092749-0f94982c930b
+	github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e
+	github.com/charmbracelet/x/editor v0.2.0
 	github.com/clipperhouse/displaywidth v0.11.0
 	github.com/clipperhouse/uax29/v2 v2.7.0
 	github.com/coder/acp-go-sdk v0.6.3
 	github.com/fsnotify/fsnotify v1.9.0
 	github.com/indaco/herald v0.13.0
 	github.com/indaco/herald-md v0.3.0
-	github.com/mark3labs/mcp-go v0.47.0
+	github.com/mark3labs/mcp-go v0.48.0
 	github.com/spf13/cobra v1.10.2
 	github.com/spf13/viper v1.21.0
 	github.com/traefik/yaegi v0.16.1
-	golang.org/x/term v0.41.0
+	golang.org/x/term v0.42.0
 	gopkg.in/yaml.v3 v3.0.1
 )

 require (
 	cloud.google.com/go v0.123.0 // indirect
-	cloud.google.com/go/auth v0.19.0 // indirect
+	cloud.google.com/go/auth v0.20.0 // indirect
 	cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
 	cloud.google.com/go/compute/metadata v0.9.0 // indirect
 	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0 // indirect
@@ -58,9 +59,9 @@ require (
 	github.com/charmbracelet/harmonica v0.2.0 // indirect
 	github.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834 // indirect
 	github.com/charmbracelet/x/cellbuf v0.0.15 // indirect
-	github.com/charmbracelet/x/exp/charmtone v0.0.0-20260330094520-2dce04b6f8a4 // indirect
+	github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913 // indirect
 	github.com/charmbracelet/x/exp/ordered v0.1.0 // indirect
-	github.com/charmbracelet/x/exp/slice v0.0.0-20260330094520-2dce04b6f8a4 // indirect
+	github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913 // indirect
 	github.com/charmbracelet/x/exp/strings v0.1.0 // indirect
 	github.com/charmbracelet/x/json v0.2.0 // indirect
 	github.com/charmbracelet/x/termios v0.1.1 // indirect
@@ -81,10 +82,10 @@ require (
 	github.com/googleapis/enterprise-certificate-proxy v0.3.14 // indirect
 	github.com/googleapis/gax-go/v2 v2.21.0 // indirect
 	github.com/gorilla/websocket v1.5.3 // indirect
-	github.com/kaptinlin/go-i18n v0.3.0 // indirect
+	github.com/kaptinlin/go-i18n v0.4.0 // indirect
 	github.com/kaptinlin/jsonpointer v0.4.17 // indirect
 	github.com/kaptinlin/jsonschema v0.7.7 // indirect
-	github.com/kaptinlin/messageformat-go v0.4.19 // indirect
+	github.com/kaptinlin/messageformat-go v0.4.20 // indirect
 	github.com/mitchellh/hashstructure/v2 v2.0.2 // indirect
 	github.com/muesli/mango v0.2.0 // indirect
 	github.com/muesli/mango-cobra v1.3.0 // indirect
@@ -103,20 +104,20 @@ require (
 	github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
 	github.com/yuin/goldmark v1.8.2 // indirect
 	go.opentelemetry.io/auto/sdk v1.2.1 // indirect
-	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.67.0 // indirect
-	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.68.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.68.0 // indirect
 	go.opentelemetry.io/otel v1.43.0 // indirect
 	go.opentelemetry.io/otel/metric v1.43.0 // indirect
 	go.opentelemetry.io/otel/trace v1.43.0 // indirect
 	go.yaml.in/yaml/v3 v3.0.4 // indirect
-	golang.org/x/crypto v0.49.0 // indirect
-	golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90 // indirect
-	golang.org/x/net v0.52.0 // indirect
+	golang.org/x/crypto v0.50.0 // indirect
+	golang.org/x/exp v0.0.0-20260410095643-746e56fc9e2f // indirect
+	golang.org/x/net v0.53.0 // indirect
 	golang.org/x/oauth2 v0.36.0 // indirect
 	golang.org/x/time v0.15.0 // indirect
-	google.golang.org/api v0.274.0 // indirect
-	google.golang.org/genai v1.52.1 // indirect
-	google.golang.org/genproto/googleapis/rpc v0.0.0-20260401024825-9d38bb4040a9 // indirect
+	google.golang.org/api v0.275.0 // indirect
+	google.golang.org/genai v1.54.0 // indirect
+	google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478 // indirect
 	google.golang.org/grpc v1.80.0 // indirect
 	google.golang.org/protobuf v1.36.11 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
@@ -124,17 +125,17 @@ require (

 require (
 	github.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect
-	github.com/charmbracelet/x/ansi v0.11.6
+	github.com/charmbracelet/x/ansi v0.11.7
 	github.com/charmbracelet/x/term v0.2.2 // indirect
 	github.com/inconshreveable/mousetrap v1.1.0 // indirect
 	github.com/lucasb-eyer/go-colorful v1.4.0 // indirect
-	github.com/mattn/go-isatty v0.0.20 // indirect
-	github.com/mattn/go-runewidth v0.0.22 // indirect
+	github.com/mattn/go-isatty v0.0.21 // indirect
+	github.com/mattn/go-runewidth v0.0.23 // indirect
 	github.com/muesli/cancelreader v0.2.2 // indirect
 	github.com/muesli/termenv v0.16.0 // indirect
 	github.com/rivo/uniseg v0.4.7 // indirect
 	github.com/spf13/pflag v1.0.10 // indirect
 	golang.org/x/sync v0.20.0 // indirect
-	golang.org/x/sys v0.42.0 // indirect
-	golang.org/x/text v0.35.0
+	golang.org/x/sys v0.43.0 // indirect
+	golang.org/x/text v0.36.0
 )
@@ -1,17 +1,17 @@
 charm.land/bubbles/v2 v2.1.0 h1:YSnNh5cPYlYjPxRrzs5VEn3vwhtEn3jVGRBT3M7/I0g=
 charm.land/bubbles/v2 v2.1.0/go.mod h1:l97h4hym2hvWBVfmJDtrEHHCtkIKeTEb3TTJ4ZOB3wY=
-charm.land/bubbletea/v2 v2.0.2 h1:4CRtRnuZOdFDTWSff9r8QFt/9+z6Emubz3aDMnf/dx0=
-charm.land/bubbletea/v2 v2.0.2/go.mod h1:3LRff2U4WIYXy7MTxfbAQ+AdfM3D8Xuvz2wbsOD9OHQ=
-charm.land/fantasy v0.17.1 h1:SQzfnyJPDuQWt6e//KKmQmEEXdqHMC0IZz10XwkLcEM=
-charm.land/fantasy v0.17.1/go.mod h1:FF5ALCCHETacHJPBqU42CtwMInYQ0ul52fdzIHQMbQk=
+charm.land/bubbletea/v2 v2.0.5 h1:TQlLFqxo39AAHSVuOhJ5D3nH7O9Nk8JGinsfWQ4y1U4=
+charm.land/bubbletea/v2 v2.0.5/go.mod h1:dvbsYZD+MHkdIZl+Z67D212hEvB+GII2tfH8f9SnoDw=
+charm.land/fantasy v0.17.2 h1:ojTMufMxY/PVH7TzYUxht2SVkvD90iCTJfmPR6c8BR8=
+charm.land/fantasy v0.17.2/go.mod h1:V9cCIUMZB9g3Bq40aKEY8xBNzDd48EdfHp2OMS0uzWs=
 charm.land/huh/v2 v2.0.3 h1:2cJsMqEPwSywGHvdlKsJyQKPtSJLVnFKyFbsYZTlLkU=
 charm.land/huh/v2 v2.0.3/go.mod h1:93eEveeeqn47MwiC3tf+2atZ2l7Is88rAtmZNZ8x9Wc=
-charm.land/lipgloss/v2 v2.0.2 h1:xFolbF8JdpNkM2cEPTfXEcW1p6NRzOWTSamRfYEw8cs=
-charm.land/lipgloss/v2 v2.0.2/go.mod h1:KjPle2Qd3YmvP1KL5OMHiHysGcNwq6u83MUjYkFvEkM=
+charm.land/lipgloss/v2 v2.0.3 h1:yM2zJ4Cf5Y51b7RHIwioil4ApI/aypFXXVHSwlM6RzU=
+charm.land/lipgloss/v2 v2.0.3/go.mod h1:7myLU9iG/3xluAWzpY/fSxYYHCgoKTie7laxk6ATwXA=
 cloud.google.com/go v0.123.0 h1:2NAUJwPR47q+E35uaJeYoNhuNEM9kM8SjgRgdeOJUSE=
 cloud.google.com/go v0.123.0/go.mod h1:xBoMV08QcqUGuPW65Qfm1o9Y4zKZBpGS+7bImXLTAZU=
-cloud.google.com/go/auth v0.19.0 h1:DGYwtbcsGsT1ywuxsIoWi1u/vlks0moIblQHgSDgQkQ=
-cloud.google.com/go/auth v0.19.0/go.mod h1:2Aph7BT2KnaSFOM0JDPyiYgNh6PL9vGMiP8CUIXZ+IY=
+cloud.google.com/go/auth v0.20.0 h1:kXTssoVb4azsVDoUiF8KvxAqrsQcQtB53DcSgta74CA=
+cloud.google.com/go/auth v0.20.0/go.mod h1:942/yi/itH1SsmpyrbnTMDgGfdy2BUqIKyd0cyYLc5Q=
 cloud.google.com/go/auth/oauth2adapt v0.2.8 h1:keo8NaayQZ6wimpNSmW5OPc283g65QNIiLpZnkHRbnc=
 cloud.google.com/go/auth/oauth2adapt v0.2.8/go.mod h1:XQ9y31RkqZCcwJWNSx2Xvric3RrU88hAYYbjDWYDL+c=
 cloud.google.com/go/compute/metadata v0.9.0 h1:pDUj4QMoPejqq20dK0Pg2N4yG9zIkYGdBtwLoEkH9Zs=
@@ -86,24 +86,26 @@ github.com/charmbracelet/log v1.0.0 h1:HVVVMmfOorfj3BA9i8X8UL69Hoz9lI0PYwXfJvOdR
 github.com/charmbracelet/log v1.0.0/go.mod h1:uYgY3SmLpwJWxmlrPwXvzVYujxis1vAKRV/0VQB7yWA=
 github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266 h1:BW/sZtyd1JyYy0h5adMm3tzpNyL857LWjuTRET6OhpY=
 github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266/go.mod h1:1DahUaExbUZx/jD+FNT2PKP4L9rLE5+ZBRuI8mZjd/E=
-github.com/charmbracelet/ultraviolet v0.0.0-20260330092749-0f94982c930b h1:ASDO9RT6SNKTQN87jO2bRfxHFJq8cgeYdFzivY2gCeM=
-github.com/charmbracelet/ultraviolet v0.0.0-20260330092749-0f94982c930b/go.mod h1:Vo8TffMf0q7Uho/n8e6XpBZvOWtd3g39yX+9P5rRutA=
-github.com/charmbracelet/x/ansi v0.11.6 h1:GhV21SiDz/45W9AnV2R61xZMRri5NlLnl6CVF7ihZW8=
-github.com/charmbracelet/x/ansi v0.11.6/go.mod h1:2JNYLgQUsyqaiLovhU2Rv/pb8r6ydXKS3NIttu3VGZQ=
+github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e h1:O5hZFj55wZQWxMiRtQLa3uLKhZGZGS/j8M3OXinQlrw=
+github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e/go.mod h1:bAAz7dh/FTYfC+oiHavL4mX1tOIBZ0ZwYjSi3qE6ivM=
+github.com/charmbracelet/x/ansi v0.11.7 h1:kzv1kJvjg2S3r9KHo8hDdHFQLEqn4RBCb39dAYC84jI=
+github.com/charmbracelet/x/ansi v0.11.7/go.mod h1:9qGpnAVYz+8ACONkZBUWPtL7lulP9No6p1epAihUZwQ=
 github.com/charmbracelet/x/cellbuf v0.0.15 h1:ur3pZy0o6z/R7EylET877CBxaiE1Sp1GMxoFPAIztPI=
 github.com/charmbracelet/x/cellbuf v0.0.15/go.mod h1:J1YVbR7MUuEGIFPCaaZ96KDl5NoS0DAWkskup+mOY+Q=
 github.com/charmbracelet/x/conpty v0.1.1 h1:s1bUxjoi7EpqiXysVtC+a8RrvPPNcNvAjfi4jxsAuEs=
 github.com/charmbracelet/x/conpty v0.1.1/go.mod h1:OmtR77VODEFbiTzGE9G1XiRJAga6011PIm4u5fTNZpk=
+github.com/charmbracelet/x/editor v0.2.0 h1:7XLUKtaRaB8jN7bWU2p2UChiySyaAuIfYiIRg8gGWwk=
+github.com/charmbracelet/x/editor v0.2.0/go.mod h1:p3oQ28TSL3YPd+GKJ1fHWcp+7bVGpedHpXmo0D6t1dY=
 github.com/charmbracelet/x/errors v0.0.0-20240508181413-e8d8b6e2de86 h1:JSt3B+U9iqk37QUU2Rvb6DSBYRLtWqFqfxf8l5hOZUA=
 github.com/charmbracelet/x/errors v0.0.0-20240508181413-e8d8b6e2de86/go.mod h1:2P0UgXMEa6TsToMSuFqKFQR+fZTO9CNGUNokkPatT/0=
-github.com/charmbracelet/x/exp/charmtone v0.0.0-20260330094520-2dce04b6f8a4 h1:pIj18ZCZO4WOVj7jwjLoUb1lC7rS/I8oC3fZWXugNaY=
-github.com/charmbracelet/x/exp/charmtone v0.0.0-20260330094520-2dce04b6f8a4/go.mod h1:nsExn0DGyX0lh9LwLHTn2Gg+hafdzfSXnC+QmEJTZFY=
+github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913 h1:6F/6bu5nBLjodsvaU5xAszTaxtHrDU5UiJarpMPZj48=
+github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913/go.mod h1:nsExn0DGyX0lh9LwLHTn2Gg+hafdzfSXnC+QmEJTZFY=
 github.com/charmbracelet/x/exp/golden v0.0.0-20250806222409-83e3a29d542f h1:pk6gmGpCE7F3FcjaOEKYriCvpmIN4+6OS/RD0vm4uIA=
 github.com/charmbracelet/x/exp/golden v0.0.0-20250806222409-83e3a29d542f/go.mod h1:IfZAMTHB6XkZSeXUqriemErjAWCCzT0LwjKFYCZyw0I=
 github.com/charmbracelet/x/exp/ordered v0.1.0 h1:55/qLwjIh0gL0Vni+QAWk7T/qRVP6sBf+2agPBgnOFE=
 github.com/charmbracelet/x/exp/ordered v0.1.0/go.mod h1:5UHwmG+is5THxMyCJHNPCn2/ecI07aKNrW+LcResjJ8=
-github.com/charmbracelet/x/exp/slice v0.0.0-20260330094520-2dce04b6f8a4 h1:VSd4zShIAf/4FgEDFJpapEcAPrc7h3dyyN7V9JlJpQw=
-github.com/charmbracelet/x/exp/slice v0.0.0-20260330094520-2dce04b6f8a4/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=
+github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913 h1:RiZFY92Ug9iz1CenzxSSQla2Z3WflsR7bIuXq40JlpU=
+github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=
 github.com/charmbracelet/x/exp/strings v0.1.0 h1:i69S2XI7uG1u4NLGeJPSYU++Nmjvpo9nwd6aoEm7gkA=
 github.com/charmbracelet/x/exp/strings v0.1.0/go.mod h1:/ehtMPNh9K4odGFkqYJKpIYyePhdp1hLBRvyY4bWkH8=
 github.com/charmbracelet/x/json v0.2.0 h1:DqB+ZGx2h+Z+1s98HOuOyli+i97wsFQIxP2ZQANTPrQ=
@@ -185,14 +187,14 @@ github.com/indaco/herald v0.13.0 h1:+xVG9Fx5NpuWhwku/9IlRL6I009NnX4VUGKvlZHTRxU=
 github.com/indaco/herald v0.13.0/go.mod h1:T5g1+XLYvpjouhzAGHnAHDCKizhESkoV6+QPZ3DhgWA=
 github.com/indaco/herald-md v0.3.0 h1:hN1cKyrexPPM9PeHBsKuaWvIizSi/iYvM9yzRgtdb8M=
 github.com/indaco/herald-md v0.3.0/go.mod h1:RUHVaDSG45ymJjKyxpDwBocLXrZo93FB4OeYMsw9B9s=
-github.com/kaptinlin/go-i18n v0.3.0 h1:wP76dvYg04bvwTb+8NB+CmdZ2kL7lSSCQ9B/kFv7QHo=
-github.com/kaptinlin/go-i18n v0.3.0/go.mod h1:pVcu9qsW5pOIOoZFJXesRYmLos1vMQrby70JPAoWmJU=
+github.com/kaptinlin/go-i18n v0.4.0 h1:i7L3U2yurg+xhokITtJ0k+mjHnXqkoyz8ju5Wb7W8Oc=
+github.com/kaptinlin/go-i18n v0.4.0/go.mod h1:njA6x0+4MWGcLWT0KLrwekhRPmze1Hnstf2+VJFzwpM=
 github.com/kaptinlin/jsonpointer v0.4.17 h1:mY9k8ciWncxbsECyaxKnR0MdmxamNdp2tLQkAKVrtSk=
 github.com/kaptinlin/jsonpointer v0.4.17/go.mod h1:SsfsjqnHG5zuKo1DTBzk1VknaHlL4osHw+X9kZKukpU=
 github.com/kaptinlin/jsonschema v0.7.7 h1:41BlQJ9dskH0oE5DSzBUrl/w4JQYIr6N6L0B5GNyDoM=
 github.com/kaptinlin/jsonschema v0.7.7/go.mod h1:rKjWfyySHSxAD7Li2ctYkPlOu960igoKBvZ2ADRtd5Q=
-github.com/kaptinlin/messageformat-go v0.4.19 h1:A5kuuZ1ybXDQ7kD1aoEWGAOemX7hLsMY0yolgSbgpRI=
-github.com/kaptinlin/messageformat-go v0.4.19/go.mod h1:utSDTfiXTxl66OC5RIEuObLH7Ue3YjbA2X86SYMBYWg=
+github.com/kaptinlin/messageformat-go v0.4.20 h1:a0ufTd5liiUubIGeGxpSTnNS8ZSrN4DV01/wGFmfzMs=
+github.com/kaptinlin/messageformat-go v0.4.20/go.mod h1:FqdEPfQLkqVBX7OBRMPgYwUPvKYJohFD9Ok1BMzCfIo=
 github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
 github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
 github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
@@ -201,12 +203,12 @@ github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0
 github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
 github.com/lucasb-eyer/go-colorful v1.4.0 h1:UtrWVfLdarDgc44HcS7pYloGHJUjHV/4FwW4TvVgFr4=
 github.com/lucasb-eyer/go-colorful v1.4.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=
-github.com/mark3labs/mcp-go v0.47.0 h1:h44yeM3DduDyQgzImYWu4pt6VRkqP/0p/95AGhWngnA=
-github.com/mark3labs/mcp-go v0.47.0/go.mod h1:JKTC7R2LLVagkEWK7Kwu7DbmA6iIvnNAod6yrHiQMag=
-github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
-github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
-github.com/mattn/go-runewidth v0.0.22 h1:76lXsPn6FyHtTY+jt2fTTvsMUCZq1k0qwRsAMuxzKAk=
-github.com/mattn/go-runewidth v0.0.22/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs=
+github.com/mark3labs/mcp-go v0.48.0 h1:o+MXuGW/HCeR2ny5LcAcZQn2bo6I2xaZMEHnpRG+dtw=
+github.com/mark3labs/mcp-go v0.48.0/go.mod h1:JKTC7R2LLVagkEWK7Kwu7DbmA6iIvnNAod6yrHiQMag=
+github.com/mattn/go-isatty v0.0.21 h1:xYae+lCNBP7QuW4PUnNG61ffM4hVIfm+zUzDuSzYLGs=
+github.com/mattn/go-isatty v0.0.21/go.mod h1:ZXfXG4SQHsB/w3ZeOYbR0PrPwLy+n6xiMrJlRFqopa4=
+github.com/mattn/go-runewidth v0.0.23 h1:7ykA0T0jkPpzSvMS5i9uoNn2Xy3R383f9HDx3RybWcw=
+github.com/mattn/go-runewidth v0.0.23/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs=
 github.com/mitchellh/hashstructure/v2 v2.0.2 h1:vGKWl0YJqUNxE8d+h8f6NJLcCJrgbhC4NcD46KavDd4=
 github.com/mitchellh/hashstructure/v2 v2.0.2/go.mod h1:MG3aRVU/N29oo/V/IhBX8GR/zz4kQkprJgF2EVszyDE=
 github.com/muesli/cancelreader v0.2.2 h1:3I4Kt4BQjOR54NavqnDogx/MIoWBFa0StPA8ELUXHmA=
@@ -272,53 +274,52 @@ github.com/yuin/goldmark v1.8.2 h1:kEGpgqJXdgbkhcOgBxkC0X0PmoPG1ZyoZ117rDVp4zE=
 github.com/yuin/goldmark v1.8.2/go.mod h1:ip/1k0VRfGynBgxOz0yCqHrbZXhcjxyuS66Brc7iBKg=
 go.opentelemetry.io/auto/sdk v1.2.1 h1:jXsnJ4Lmnqd11kwkBV2LgLoFMZKizbCi5fNZ/ipaZ64=
 go.opentelemetry.io/auto/sdk v1.2.1/go.mod h1:KRTj+aOaElaLi+wW1kO/DZRXwkF4C5xPbEe3ZiIhN7Y=
-go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.67.0 h1:yI1/OhfEPy7J9eoa6Sj051C7n5dvpj0QX8g4sRchg04=
-go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.67.0/go.mod h1:NoUCKYWK+3ecatC4HjkRktREheMeEtrXoQxrqYFeHSc=
-go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0 h1:OyrsyzuttWTSur2qN/Lm0m2a8yqyIjUVBZcxFPuXq2o=
-go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0/go.mod h1:C2NGBr+kAB4bk3xtMXfZ94gqFDtg/GkI7e9zqGh5Beg=
+go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.68.0 h1:0Qx7VGBacMm9ZENQ7TnNObTYI4ShC+lHI16seduaxZo=
+go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.68.0/go.mod h1:Sje3i3MjSPKTSPvVWCaL8ugBzJwik3u4smCjUeuupqg=
+go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.68.0 h1:CqXxU8VOmDefoh0+ztfGaymYbhdB/tT3zs79QaZTNGY=
+go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.68.0/go.mod h1:BuhAPThV8PBHBvg8ZzZ/Ok3idOdhWIodywz2xEcRbJo=
 go.opentelemetry.io/otel v1.43.0 h1:mYIM03dnh5zfN7HautFE4ieIig9amkNANT+xcVxAj9I=
 go.opentelemetry.io/otel v1.43.0/go.mod h1:JuG+u74mvjvcm8vj8pI5XiHy1zDeoCS2LB1spIq7Ay0=
 go.opentelemetry.io/otel/metric v1.43.0 h1:d7638QeInOnuwOONPp4JAOGfbCEpYb+K6DVWvdxGzgM=
 go.opentelemetry.io/otel/metric v1.43.0/go.mod h1:RDnPtIxvqlgO8GRW18W6Z/4P462ldprJtfxHxyKd2PY=
-go.opentelemetry.io/otel/sdk v1.42.0 h1:LyC8+jqk6UJwdrI/8VydAq/hvkFKNHZVIWuslJXYsDo=
-go.opentelemetry.io/otel/sdk v1.42.0/go.mod h1:rGHCAxd9DAph0joO4W6OPwxjNTYWghRWmkHuGbayMts=
-go.opentelemetry.io/otel/sdk/metric v1.42.0 h1:D/1QR46Clz6ajyZ3G8SgNlTJKBdGp84q9RKCAZ3YGuA=
-go.opentelemetry.io/otel/sdk/metric v1.42.0/go.mod h1:Ua6AAlDKdZ7tdvaQKfSmnFTdHx37+J4ba8MwVCYM5hc=
+go.opentelemetry.io/otel/sdk v1.43.0 h1:pi5mE86i5rTeLXqoF/hhiBtUNcrAGHLKQdhg4h4V9Dg=
+go.opentelemetry.io/otel/sdk v1.43.0/go.mod h1:P+IkVU3iWukmiit/Yf9AWvpyRDlUeBaRg6Y+C58QHzg=
+go.opentelemetry.io/otel/sdk/metric v1.43.0 h1:S88dyqXjJkuBNLeMcVPRFXpRw2fuwdvfCGLEo89fDkw=
+go.opentelemetry.io/otel/sdk/metric v1.43.0/go.mod h1:C/RJtwSEJ5hzTiUz5pXF1kILHStzb9zFlIEe85bhj6A=
 go.opentelemetry.io/otel/trace v1.43.0 h1:BkNrHpup+4k4w+ZZ86CZoHHEkohws8AY+WTX09nk+3A=
 go.opentelemetry.io/otel/trace v1.43.0/go.mod h1:/QJhyVBUUswCphDVxq+8mld+AvhXZLhe+8WVFxiFff0=
 go.yaml.in/yaml/v3 v3.0.4 h1:tfq32ie2Jv2UxXFdLJdh3jXuOzWiL1fo0bu/FbuKpbc=
 go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg=
-golang.org/x/crypto v0.49.0 h1:+Ng2ULVvLHnJ/ZFEq4KdcDd/cfjrrjjNSXNzxg0Y4U4=
-golang.org/x/crypto v0.49.0/go.mod h1:ErX4dUh2UM+CFYiXZRTcMpEcN8b/1gxEuv3nODoYtCA=
-golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90 h1:jiDhWWeC7jfWqR9c/uplMOqJ0sbNlNWv0UkzE0vX1MA=
-golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90/go.mod h1:xE1HEv6b+1SCZ5/uscMRjUBKtIxworgEcEi+/n9NQDQ=
-golang.org/x/net v0.52.0 h1:He/TN1l0e4mmR3QqHMT2Xab3Aj3L9qjbhRm78/6jrW0=
-golang.org/x/net v0.52.0/go.mod h1:R1MAz7uMZxVMualyPXb+VaqGSa3LIaUqk0eEt3w36Sw=
+golang.org/x/crypto v0.50.0 h1:zO47/JPrL6vsNkINmLoo/PH1gcxpls50DNogFvB5ZGI=
+golang.org/x/crypto v0.50.0/go.mod h1:3muZ7vA7PBCE6xgPX7nkzzjiUq87kRItoJQM1Yo8S+Q=
+golang.org/x/exp v0.0.0-20260410095643-746e56fc9e2f h1:W3F4c+6OLc6H2lb//N1q4WpJkhzJCK5J6kUi1NTVXfM=
+golang.org/x/exp v0.0.0-20260410095643-746e56fc9e2f/go.mod h1:J1xhfL/vlindoeF/aINzNzt2Bket5bjo9sdOYzOsU80=
+golang.org/x/net v0.53.0 h1:d+qAbo5L0orcWAr0a9JweQpjXF19LMXJE8Ey7hwOdUA=
+golang.org/x/net v0.53.0/go.mod h1:JvMuJH7rrdiCfbeHoo3fCQU24Lf5JJwT9W3sJFulfgs=
 golang.org/x/oauth2 v0.36.0 h1:peZ/1z27fi9hUOFCAZaHyrpWG5lwe0RJEEEeH0ThlIs=
 golang.org/x/oauth2 v0.36.0/go.mod h1:YDBUJMTkDnJS+A4BP4eZBjCqtokkg1hODuPjwiGPO7Q=
 golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4=
 golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0=
-golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo=
-golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
-golang.org/x/term v0.41.0 h1:QCgPso/Q3RTJx2Th4bDLqML4W6iJiaXFq2/ftQF13YU=
-golang.org/x/term v0.41.0/go.mod h1:3pfBgksrReYfZ5lvYM0kSO0LIkAl4Yl2bXOkKP7Ec2A=
-golang.org/x/text v0.35.0 h1:JOVx6vVDFokkpaq1AEptVzLTpDe9KGpj5tR4/X+ybL8=
-golang.org/x/text v0.35.0/go.mod h1:khi/HExzZJ2pGnjenulevKNX1W67CUy0AsXcNubPGCA=
+golang.org/x/sys v0.43.0 h1:Rlag2XtaFTxp19wS8MXlJwTvoh8ArU6ezoyFsMyCTNI=
+golang.org/x/sys v0.43.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
+golang.org/x/term v0.42.0 h1:UiKe+zDFmJobeJ5ggPwOshJIVt6/Ft0rcfrXZDLWAWY=
+golang.org/x/term v0.42.0/go.mod h1:Dq/D+snpsbazcBG5+F9Q1n2rXV8Ma+71xEjTRufARgY=
+golang.org/x/text v0.36.0 h1:JfKh3XmcRPqZPKevfXVpI1wXPTqbkE5f7JA92a55Yxg=
+golang.org/x/text v0.36.0/go.mod h1:NIdBknypM8iqVmPiuco0Dh6P5Jcdk8lJL0CUebqK164=
 golang.org/x/time v0.15.0 h1:bbrp8t3bGUeFOx08pvsMYRTCVSMk89u4tKbNOZbp88U=
 golang.org/x/time v0.15.0/go.mod h1:Y4YMaQmXwGQZoFaVFk4YpCt4FLQMYKZe9oeV/f4MSno=
 gonum.org/v1/gonum v0.17.0 h1:VbpOemQlsSMrYmn7T2OUvQ4dqxQXU+ouZFQsZOx50z4=
 gonum.org/v1/gonum v0.17.0/go.mod h1:El3tOrEuMpv2UdMrbNlKEh9vd86bmQ6vqIcDwxEOc1E=
-google.golang.org/api v0.274.0 h1:aYhycS5QQCwxHLwfEHRRLf9yNsfvp1JadKKWBE54RFA=
-google.golang.org/api v0.274.0/go.mod h1:JbAt7mF+XVmWu6xNP8/+CTiGH30ofmCmk9nM8d8fHew=
-google.golang.org/genai v1.52.1 h1:dYoljKtLDXMiBdVaClSJ/ZPwZ7j1N0lGjMhwOKOQUlk=
-google.golang.org/genai v1.52.1/go.mod h1:A3kkl0nyBjyFlNjgxIwKq70julKbIxpSxqKO5gw/gmk=
-google.golang.org/genproto v0.0.0-20260319201613-d00831a3d3e7 h1:XzmzkmB14QhVhgnawEVsOn6OFsnpyxNPRY9QV01dNB0=
-google.golang.org/genproto v0.0.0-20260319201613-d00831a3d3e7/go.mod h1:L43LFes82YgSonw6iTXTxXUX1OlULt4AQtkik4ULL/I=
-google.golang.org/genproto/googleapis/api v0.0.0-20260319201613-d00831a3d3e7 h1:41r6JMbpzBMen0R/4TZeeAmGXSJC7DftGINUodzTkPI=
-google.golang.org/genproto/googleapis/api v0.0.0-20260319201613-d00831a3d3e7/go.mod h1:EIQZ5bFCfRQDV4MhRle7+OgjNtZ6P1PiZBgAKuxXu/Y=
-google.golang.org/genproto/googleapis/rpc v0.0.0-20260401024825-9d38bb4040a9 h1:m8qni9SQFH0tJc1X0vmnpw/0t+AImlSvp30sEupozUg=
-google.golang.org/genproto/googleapis/rpc v0.0.0-20260401024825-9d38bb4040a9/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8=
+google.golang.org/api v0.275.0 h1:vfY5d9vFVJeWEZT65QDd9hbndr7FyZ2+6mIzGAh71NI=
+google.golang.org/api v0.275.0/go.mod h1:Fnag/EWUPIcJXuIkP1pjoTgS5vdxlk3eeemL7Do6bvw=
+google.golang.org/genai v1.54.0 h1:ZQCa70WMTJDI11FdqWCzGvZ5PanpcpfoO6jl/lrSnGU=
+google.golang.org/genai v1.54.0/go.mod h1:A3kkl0nyBjyFlNjgxIwKq70julKbIxpSxqKO5gw/gmk=
+google.golang.org/genproto v0.0.0-20260406210006-6f92a3bedf2d h1:N1Ec54vZnIPd7MnxRiYLW+oY4fDR4BOS/LrssdD9+ek=
+google.golang.org/genproto v0.0.0-20260406210006-6f92a3bedf2d/go.mod h1:c2hJ1grtnH0xUiEKGDGkjGNTJ1Hy2LrblyKOHF0sqRM=
+google.golang.org/genproto/googleapis/api v0.0.0-20260406210006-6f92a3bedf2d h1:/aDRtSZJjyLQzm75d+a1wOJaqyKBMvIAfeQmoa3ORiI=
+google.golang.org/genproto/googleapis/api v0.0.0-20260406210006-6f92a3bedf2d/go.mod h1:etfGUgejTiadZAUaEP14NP97xi1RGeawqkjDARA/UOs=
+google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478 h1:RmoJA1ujG+/lRGNfUnOMfhCy5EipVMyvUE+KNbPbTlw=
+google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8=
 google.golang.org/grpc v1.80.0 h1:Xr6m2WmWZLETvUNvIUmeD5OAagMw3FiKmMlTdViWsHM=
 google.golang.org/grpc v1.80.0/go.mod h1:ho/dLnxwi3EDJA4Zghp7k2Ec1+c2jqup0bFkw07bwF4=
 google.golang.org/protobuf v1.36.11 h1:fV6ZwhNocDyBLK0dj+fg8ektcVegBBuEolpbTQyBNVE=
@@ -23,18 +23,6 @@ import (
 // Version is injected at build time; fallback to "dev".
 var Version = "dev"

-// thinkingTagOpen and thinkingTagClose are the XML-style tags that some models
-// (Qwen, DeepSeek) wrap reasoning content in. We parse these to extract
-// reasoning/thinking content and send it as ACP thought updates.
-// Also support <think> format used by some models.
-const (
-	thinkingTagOpen    = "<thinking>"
-	thinkingTagClose   = "</thinking>"
-	shortThinkTagOpen  = "<think>"
-	shortThinkTagClose = "</think>"
-)
-
-// Agent implements the acp.Agent interface, delegating to Kit for LLM
 // execution, tool calls, and session management.
 type Agent struct {
 	conn     *acp.AgentSideConnection
@@ -42,10 +30,6 @@ type Agent struct {

 	// toolCallCounter provides unique IDs for tool calls within a turn.
 	toolCallCounter atomic.Int64
-
-	// inThinkingTag tracks whether we're currently inside a <thinking> tag
-	// when parsing streaming content from models that wrap reasoning in XML tags.
-	inThinkingTag bool
 }

 // NewAgent creates a new ACP agent backed by Kit.
@@ -144,9 +128,6 @@ func (a *Agent) Prompt(ctx context.Context, params acp.PromptRequest) (acp.Promp

 	log.Debug("acp: prompt", "session", sessionID, "prompt_len", len(promptText), "files", len(files))

-	// Reset thinking tag state for this new prompt turn
-	a.inThinkingTag = false
-
 	// Create a cancellable context for this prompt turn.
 	promptCtx, cancel := context.WithCancel(ctx)
 	sess.setCancel(cancel)
@@ -230,24 +211,8 @@ func (a *Agent) subscribeEvents(ctx context.Context, k *kit.Kit, sessionID acp.S
 		var update *acp.SessionUpdate
 		switch ev := e.(type) {
 		case kit.MessageUpdateEvent:
-			// Handle models that wrap reasoning in <thinking> tags (Qwen, DeepSeek)
-			// Parse the chunk and separate reasoning from regular text
-			reasoning, text := a.parseThinkingTags(ev.Chunk)
-
-			// Send reasoning update if we have reasoning content
-			if reasoning != "" {
-				u := acp.UpdateAgentThoughtText(reasoning)
-				_ = a.conn.SessionUpdate(ctx, acp.SessionNotification{
-					SessionId: sessionID,
-					Update:    u,
-				})
-			}
-
-			// Send text update if we have text content
-			if text != "" {
-				u := acp.UpdateAgentMessageText(text)
-				update = &u
-			}
+			u := acp.UpdateAgentMessageText(ev.Chunk)
+			update = &u

 		case kit.ReasoningDeltaEvent:
 			u := acp.UpdateAgentThoughtText(ev.Delta)
@@ -430,81 +395,6 @@ func extractPromptContent(blocks []acp.ContentBlock) (string, []kit.LLMFilePart)
 	return strings.Join(textParts, "\n"), files
 }

-// parseThinkingTags parses a text chunk for <thinking> or <think> tags and separates
-// reasoning content from regular text. This handles models (Qwen, DeepSeek)
-// that wrap reasoning in XML-style tags instead of using proper reasoning events.
-// Returns (reasoningContent, textContent).
-func (a *Agent) parseThinkingTags(chunk string) (reasoning string, text string) {
-	// Handle empty chunk
-	if chunk == "" {
-		return "", ""
-	}
-
-	// Determine which tag format to use (long or short)
-	openTag := thinkingTagOpen
-	closeTag := thinkingTagClose
-
-	if strings.Contains(chunk, shortThinkTagOpen) || strings.Contains(chunk, shortThinkTagClose) {
-		openTag = shortThinkTagOpen
-		closeTag = shortThinkTagClose
-	} else if !strings.Contains(chunk, thinkingTagOpen) && !strings.Contains(chunk, thinkingTagClose) && !a.inThinkingTag {
-		// No tags at all and not in thinking mode - return as text
-		return "", chunk
-	}
-
-	// Check for opening tag
-	if strings.Contains(chunk, openTag) {
-		parts := strings.SplitN(chunk, openTag, 2)
-
-		// Content before the opening tag is regular text
-		if !a.inThinkingTag && parts[0] != "" {
-			text = parts[0]
-		}
-
-		a.inThinkingTag = true
-
-		// Content after the opening tag is reasoning
-		if len(parts) > 1 {
-			// Check if the same chunk contains the closing tag
-			if strings.Contains(parts[1], closeTag) {
-				innerParts := strings.SplitN(parts[1], closeTag, 2)
-				reasoning = innerParts[0]
-				a.inThinkingTag = false
-
-				// Content after closing tag is regular text
-				if len(innerParts) > 1 && innerParts[1] != "" {
-					text += innerParts[1]
-				}
-			} else if parts[1] != "" {
-				// No closing tag yet, all remaining content is reasoning
-				reasoning = parts[1]
-			}
-		}
-		return reasoning, text
-	}
-
-	// Check for closing tag
-	if strings.Contains(chunk, closeTag) {
-		parts := strings.SplitN(chunk, closeTag, 2)
-		a.inThinkingTag = false
-
-		// Content before closing tag is reasoning
-		reasoning = parts[0]
-
-		// Content after closing tag is regular text
-		if len(parts) > 1 && parts[1] != "" {
-			text = parts[1]
-		}
-		return reasoning, text
-	}
-
-	// No tags found - content goes to current mode
-	if a.inThinkingTag {
-		return chunk, ""
-	}
-	return "", chunk
-}
-
 // isTextMimeType returns true if the MIME type indicates text content.
 func isTextMimeType(mimeType string) bool {
 	return strings.HasPrefix(mimeType, "text/") ||
@@ -30,11 +30,21 @@ type AgentConfig struct {
 	// If nil, remote MCP servers that require OAuth will fail to connect.
 	AuthHandler tools.MCPAuthHandler

+	// TokenStoreFactory, if non-nil, creates a custom token store for each
+	// remote MCP server's OAuth tokens. When nil, the default file-based
+	// token store is used.
+	TokenStoreFactory tools.TokenStoreFactory
+
 	// CoreTools overrides the default core tool set. If empty, core.AllTools()
 	// is used. This allows SDK users to provide a custom tool set (e.g.
 	// CodingTools or tools with a custom WorkDir).
 	CoreTools []fantasy.AgentTool

+	// DisableCoreTools, when true, prevents loading any core tools.
+	// If DisableCoreTools is true and CoreTools is empty, the agent
+	// will have no tools (useful for simple chat completions).
+	DisableCoreTools bool
+
 	// ToolWrapper is an optional function that wraps the combined tool list
 	// before it is passed to the LLM agent. Used by the extensions system
 	// to intercept tool calls/results.
@@ -43,6 +53,11 @@ type AgentConfig struct {
 	// ExtraTools are additional tools to include alongside core and MCP tools.
 	// Used by extensions to register custom tools.
 	ExtraTools []fantasy.AgentTool
+
+	// OnMCPServerLoaded, if non-nil, is called when each MCP server finishes
+	// loading (successfully or with error). The callback receives the server
+	// name, tool count, and any error. Called from the background goroutine.
+	OnMCPServerLoaded func(serverName string, toolCount int, err error)
 }

 // ToolCallHandler is a function type for handling tool calls as they happen.
@@ -79,6 +94,14 @@ type ReasoningCompleteHandler func()
 // Note: This is an alias for core.ToolOutputCallback to avoid import cycles.
 type ToolOutputHandler = core.ToolOutputCallback

+// StepMessagesHandler is a function type for persisting messages after each
+// complete step in a multi-step agent turn. The handler receives the messages
+// produced by the step (typically an assistant message with tool calls followed
+// by a tool-role message with results, or a final assistant message with text).
+// This enables incremental session persistence so that progress is saved as
+// it happens rather than only at the end of the turn.
+type StepMessagesHandler func(stepMessages []fantasy.Message)
+
 // StepUsageHandler is a function type for handling token usage after each
 // complete step in a multi-step agent turn. This enables real-time cost
 // tracking during long-running tool-calling conversations.
@@ -88,6 +111,10 @@ type StepUsageHandler func(inputTokens, outputTokens, cacheReadTokens, cacheCrea
 // Core tools (bash, read, write, edit, grep, find, ls) are registered as direct
 // AgentTool implementations — no MCP layer, no serialization overhead.
 // Additional tools from external MCP servers can be loaded alongside core tools.
+//
+// When MCP servers are configured, tool loading happens in the background so the
+// agent (and UI) can start immediately. The first LLM call automatically waits
+// for MCP tools to finish loading before proceeding.
 type Agent struct {
 	toolManager      *tools.MCPToolManager
 	fantasyAgent     fantasy.Agent
@@ -101,6 +128,24 @@ type Agent struct {
 	coreTools        []fantasy.AgentTool
 	extraTools       []fantasy.AgentTool
 	toolWrapper      func([]fantasy.AgentTool) []fantasy.AgentTool // stored for SetModel rebuild
+
+	// providerOptions, skipMaxOutputTokens, and modelConfig are stored for
+	// rebuilding the fantasy agent when MCP tools arrive asynchronously or
+	// on SetModel.
+	providerOptions     fantasy.ProviderOptions
+	skipMaxOutputTokens bool
+	modelConfig         *models.ProviderConfig
+
+	// authHandler and tokenStoreFactory are stored from AgentConfig so that
+	// AddMCPServer() can propagate them when creating a new MCPToolManager
+	// at runtime (i.e. when no MCP servers were configured at init time).
+	authHandler       tools.MCPAuthHandler
+	tokenStoreFactory tools.TokenStoreFactory
+
+	// mcpReady is closed when background MCP tool loading completes (success
+	// or failure). It is nil when no MCP servers are configured, and
+	// ensureMCPTools resets it to nil once the tools have been integrated.
+	mcpReady chan struct{}
+	// mcpErr holds any error from background MCP loading.
+	mcpErr error
 }

 // GenerateWithLoopResult contains the result and conversation history from an agent interaction.
@@ -115,11 +160,19 @@ type GenerateWithLoopResult struct {
 	TotalUsage fantasy.Usage
 	// StopReason is the LLM provider's finish reason for the final response.
 	StopReason string
+	// PersistedMessageCount is the number of new messages (beyond the original
+	// input) that were already persisted incrementally via OnStepMessages during
+	// generation. The caller should skip these when doing post-generation
+	// persistence to avoid duplicates.
+	PersistedMessageCount int
 }

 // NewAgent creates a new Agent with core tools and optional MCP tool integration.
 // Core tools (bash, read, write, edit, grep, find, ls) are always registered.
-// External MCP tools are loaded from the config if any MCP servers are configured.
+// If MCP servers are configured, their tools are loaded in the background —
+// the agent returns immediately and is usable with core tools only. The first
+// LLM call (GenerateWithLoop) automatically waits for MCP tools to finish
+// loading and rebuilds the agent with the full tool set.
 func NewAgent(ctx context.Context, agentConfig *AgentConfig) (*Agent, error) {
 	// Create the LLM provider
 	providerResult, err := models.CreateProvider(ctx, agentConfig.ModelConfig)
@@ -129,38 +182,22 @@ func NewAgent(ctx context.Context, agentConfig *AgentConfig) (*Agent, error) {

 	// Register core tools (direct AgentTool implementations, no MCP overhead).
 	// Use caller-provided tools if set, otherwise default to all core tools.
-	coreTools := agentConfig.CoreTools
-	if len(coreTools) == 0 {
+	// DisableCoreTools with no custom CoreTools yields zero core tools (chat-only mode).
+	var coreTools []fantasy.AgentTool
+	if agentConfig.DisableCoreTools && len(agentConfig.CoreTools) == 0 {
+		// Explicitly zero tools - chat-only mode
+		coreTools = nil
+	} else if len(agentConfig.CoreTools) > 0 {
+		// Custom tools provided - use them
+		coreTools = agentConfig.CoreTools
+	} else {
+		// Default: load all core tools
 		coreTools = core.AllTools()
 	}

-	// Build the combined tool list: core tools + any external MCP tools
+	// Build the initial tool list: core tools + extension tools (no MCP yet).
 	allTools := make([]fantasy.AgentTool, len(coreTools))
 	copy(allTools, coreTools)
-
-	// Load external MCP tools if configured
-	var toolManager *tools.MCPToolManager
-	if agentConfig.MCPConfig != nil && len(agentConfig.MCPConfig.MCPServers) > 0 {
-		toolManager = tools.NewMCPToolManager()
-		toolManager.SetModel(providerResult.Model)
-
-		if agentConfig.AuthHandler != nil {
-			toolManager.SetAuthHandler(agentConfig.AuthHandler)
-		}
-
-		if agentConfig.DebugLogger != nil {
-			toolManager.SetDebugLogger(agentConfig.DebugLogger)
-		}
-
-		if err := toolManager.LoadTools(ctx, agentConfig.MCPConfig); err != nil {
-			// MCP tool loading failures are non-fatal; core tools still work
-			fmt.Printf("Warning: Failed to load MCP tools: %v\n", err)
-		} else {
-			mcpTools := toolManager.GetTools()
-			allTools = append(allTools, mcpTools...)
-		}
-	}
-
 	// Append any extra tools provided by extensions.
 	if len(agentConfig.ExtraTools) > 0 {
 		allTools = append(allTools, agentConfig.ExtraTools...)
@@ -172,6 +209,149 @@ func NewAgent(ctx context.Context, agentConfig *AgentConfig) (*Agent, error) {
 	}

 	// Build agent options
+	agentOpts := buildAgentOptions(agentConfig, providerResult, allTools)
+
+	// Create the agent
+	fantasyAgent := fantasy.NewAgent(providerResult.Model, agentOpts...)
+
+	// Determine provider type from model string
+	providerType := "default"
+	if agentConfig.ModelConfig != nil && agentConfig.ModelConfig.ModelString != "" {
+		if p, _, err := models.ParseModelString(agentConfig.ModelConfig.ModelString); err == nil {
+			providerType = p
+		}
+	}
+
+	a := &Agent{
+		fantasyAgent:        fantasyAgent,
+		model:               providerResult.Model,
+		providerCloser:      providerResult.Closer,
+		maxSteps:            agentConfig.MaxSteps,
+		systemPrompt:        agentConfig.SystemPrompt,
+		loadingMessage:      providerResult.Message,
+		providerType:        providerType,
+		streamingEnabled:    agentConfig.StreamingEnabled,
+		coreTools:           coreTools,
+		extraTools:          agentConfig.ExtraTools,
+		toolWrapper:         agentConfig.ToolWrapper,
+		providerOptions:     providerResult.ProviderOptions,
+		skipMaxOutputTokens: providerResult.SkipMaxOutputTokens,
+		modelConfig:         agentConfig.ModelConfig,
+		authHandler:         agentConfig.AuthHandler,
+		tokenStoreFactory:   agentConfig.TokenStoreFactory,
+	}
+
+	// Start MCP tool loading in the background if servers are configured.
+	// The mcpReady channel is closed when loading completes (success or failure).
+	if agentConfig.MCPConfig != nil && len(agentConfig.MCPConfig.MCPServers) > 0 {
+		toolManager := tools.NewMCPToolManager()
+		toolManager.SetModel(providerResult.Model)
+		if agentConfig.AuthHandler != nil {
+			toolManager.SetAuthHandler(agentConfig.AuthHandler)
+		}
+		if agentConfig.TokenStoreFactory != nil {
+			toolManager.SetTokenStoreFactory(agentConfig.TokenStoreFactory)
+		}
+		if agentConfig.DebugLogger != nil {
+			toolManager.SetDebugLogger(agentConfig.DebugLogger)
+		}
+		// Set per-server loaded callback if provided.
+		if agentConfig.OnMCPServerLoaded != nil {
+			toolManager.SetOnServerLoaded(agentConfig.OnMCPServerLoaded)
+		}
+		a.toolManager = toolManager
+		a.mcpReady = make(chan struct{})
+
+		go func() {
+			defer close(a.mcpReady)
+			if err := toolManager.LoadTools(ctx, agentConfig.MCPConfig); err != nil {
+				a.mcpErr = err
+				fmt.Printf("Warning: Failed to load MCP tools: %v\n", err)
+			}
+		}()
+	}
+
+	return a, nil
+}
+
+// WaitForMCPTools blocks until background MCP tool loading completes.
+// Returns nil if no MCP servers are configured or if loading succeeded.
+// Returns the loading error if all servers failed. Safe to call multiple
+// times; note that once ensureMCPTools has reset mcpReady to nil, this
+// returns nil even if loading failed.
+func (a *Agent) WaitForMCPTools() error {
+	if a.mcpReady == nil {
+		return nil
+	}
+	<-a.mcpReady
+	return a.mcpErr
+}
+
+// MCPToolsReady returns true if MCP tool loading has completed (or was never
+// started). This is a non-blocking check useful for UI status display.
+func (a *Agent) MCPToolsReady() bool {
+	if a.mcpReady == nil {
+		return true
+	}
+	select {
+	case <-a.mcpReady:
+		return true
+	default:
+		return false
+	}
+}
+
+// ensureMCPTools waits for MCP tools to load and rebuilds the fantasy agent
+// with the full tool set. Called lazily before the first LLM call.
+// This is idempotent — subsequent calls after the first rebuild are no-ops.
+func (a *Agent) ensureMCPTools() {
+	if a.mcpReady == nil {
+		return
+	}
+	<-a.mcpReady
+
+	// If there are MCP tools, rebuild the fantasy agent to include them.
+	if a.toolManager != nil && len(a.toolManager.GetTools()) > 0 {
+		a.rebuildFantasyAgent()
+	}
+
+	// Nil out the channel so future calls are instant no-ops and we
+	// don't rebuild again.
+	a.mcpReady = nil
+}
+
+// rebuildFantasyAgent reconstructs the fantasy agent with the current full
+// tool set (core + MCP + extension tools). Used after MCP tools arrive
+// asynchronously and by SetModel.
+func (a *Agent) rebuildFantasyAgent() {
+	allTools := make([]fantasy.AgentTool, len(a.coreTools))
+	copy(allTools, a.coreTools)
+	if a.toolManager != nil {
+		allTools = append(allTools, a.toolManager.GetTools()...)
+	}
+	if len(a.extraTools) > 0 {
+		allTools = append(allTools, a.extraTools...)
+	}
+	if a.toolWrapper != nil {
+		allTools = a.toolWrapper(allTools)
+	}
+
+	providerResult := &models.ProviderResult{
+		Model:               a.model,
+		ProviderOptions:     a.providerOptions,
+		SkipMaxOutputTokens: a.skipMaxOutputTokens,
+	}
+	agentOpts := buildAgentOptions(&AgentConfig{
+		ModelConfig:  a.modelConfig,
+		SystemPrompt: a.systemPrompt,
+		MaxSteps:     a.maxSteps,
+	}, providerResult, allTools)
+
+	a.fantasyAgent = fantasy.NewAgent(a.model, agentOpts...)
+}
+
+// buildAgentOptions constructs the fantasy.AgentOption slice from config,
+// provider result, and the combined tool list. Shared by NewAgent,
+// rebuildFantasyAgent, and SetModel.
+func buildAgentOptions(agentConfig *AgentConfig, providerResult *models.ProviderResult, allTools []fantasy.AgentTool) []fantasy.AgentOption {
 	var agentOpts []fantasy.AgentOption

 	if agentConfig.SystemPrompt != "" {
@@ -209,33 +389,15 @@ func NewAgent(ctx context.Context, agentConfig *AgentConfig) (*Agent, error) {
 		if agentConfig.ModelConfig.TopK != nil {
 			agentOpts = append(agentOpts, fantasy.WithTopK(int64(*agentConfig.ModelConfig.TopK)))
 		}
-	}
-
-	// Create the agent
-	fantasyAgent := fantasy.NewAgent(providerResult.Model, agentOpts...)
-
-	// Determine provider type from model string
-	providerType := "default"
-	if agentConfig.ModelConfig != nil && agentConfig.ModelConfig.ModelString != "" {
-		if p, _, err := models.ParseModelString(agentConfig.ModelConfig.ModelString); err == nil {
-			providerType = p
+		if agentConfig.ModelConfig.FrequencyPenalty != nil {
+			agentOpts = append(agentOpts, fantasy.WithFrequencyPenalty(float64(*agentConfig.ModelConfig.FrequencyPenalty)))
+		}
+		if agentConfig.ModelConfig.PresencePenalty != nil {
+			agentOpts = append(agentOpts, fantasy.WithPresencePenalty(float64(*agentConfig.ModelConfig.PresencePenalty)))
 		}
 	}

-	return &Agent{
-		toolManager:      toolManager,
-		fantasyAgent:     fantasyAgent,
-		model:            providerResult.Model,
-		providerCloser:   providerResult.Closer,
-		maxSteps:         agentConfig.MaxSteps,
-		systemPrompt:     agentConfig.SystemPrompt,
-		loadingMessage:   providerResult.Message,
-		providerType:     providerType,
-		streamingEnabled: agentConfig.StreamingEnabled,
-		coreTools:        coreTools,
-		extraTools:       agentConfig.ExtraTools,
-		toolWrapper:      agentConfig.ToolWrapper,
-	}, nil
+	return agentOpts
 }

 // GenerateWithLoop processes messages with a custom loop that displays tool calls in real-time.
@@ -244,7 +406,7 @@ func (a *Agent) GenerateWithLoop(ctx context.Context, messages []fantasy.Message
 	onResponse ResponseHandler, onToolCallContent ToolCallContentHandler,
 ) (*GenerateWithLoopResult, error) {
 	return a.GenerateWithLoopAndStreaming(ctx, messages, onToolCall, onToolExecution, onToolResult,
-		onResponse, onToolCallContent, nil, nil, nil, nil, nil)
+		onResponse, onToolCallContent, nil, nil, nil, nil, nil, nil)
 }

 // GenerateWithLoopAndStreaming processes messages using the agent with streaming and callbacks.
@@ -257,9 +419,15 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 	onReasoningDelta ReasoningDeltaHandler,
 	onReasoningComplete ReasoningCompleteHandler,
 	onToolOutput ToolOutputHandler,
+	onStepMessages StepMessagesHandler,
 	onStepUsage StepUsageHandler,
 ) (*GenerateWithLoopResult, error) {

+	// Wait for background MCP tool loading to complete and rebuild the
+	// fantasy agent with the full tool set. This is a no-op when no MCP
+	// servers are configured or tools have already been integrated.
+	a.ensureMCPTools()
+
 	// Inject tool output handler into context for use by core tools (e.g., bash).
 	if onToolOutput != nil {
 		ctx = core.ContextWithToolOutputCallback(ctx, onToolOutput)
@@ -291,6 +459,10 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 		// when it returns an error, but the OnStepFinish callback fires
 		// for every step that completed before the error occurred.
 		var completedStepMessages []fantasy.Message
+		// persistedCount tracks how many new messages (beyond the original
+		// input) were persisted incrementally via onStepMessages, so the
+		// caller can skip them during post-generation persistence.
+		var persistedCount int

 		// Use the streaming agent
 		streamCall := fantasy.AgentStreamCall{
@@ -376,6 +548,13 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				// persisted even if a later step is cancelled.
 				completedStepMessages = append(completedStepMessages, step.Messages...)

+				// Persist step messages incrementally so progress is saved
+				// as it happens rather than only at the end of the turn.
+				if onStepMessages != nil && len(step.Messages) > 0 {
+					onStepMessages(step.Messages)
+					persistedCount += len(step.Messages)
+				}
+
 				if ctx.Err() != nil {
 					return ctx.Err()
 				}
@@ -454,19 +633,25 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				partialMessages = append(partialMessages, messages...)
 				partialMessages = append(partialMessages, completedStepMessages...)
 				return &GenerateWithLoopResult{
-					ConversationMessages: partialMessages,
+					ConversationMessages:  partialMessages,
+					PersistedMessageCount: persistedCount,
 				}, err
 			}
 			return nil, err
 		}

-		// Fire the response callback for callers that use it (e.g. non-streaming
-		// callers that still want the final response notification).
-		if onResponse != nil && result.Response.Content.Text() != "" {
+		// Fire the response callback so callers (e.g. the TUI) can reset
+		// streaming state. This must fire even when the response text is
+		// empty (e.g. reasoning-only responses) so the UI properly resets
+		// the stream component and avoids duplicate content on the next
+		// flush.
+		if onResponse != nil {
 			onResponse(result.Response.Content.Text())
 		}

-		return convertAgentResult(result, messages), nil
+		r := convertAgentResult(result, messages)
+		r.PersistedMessageCount = persistedCount
+		return r, nil
 	}

 	// Non-streaming path with no callbacks — use the simpler Generate call.
@@ -479,8 +664,9 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 		return nil, err
 	}

-	// For non-streaming, fire the response callback with the final text
-	if onResponse != nil && result.Response.Content.Text() != "" {
+	// For non-streaming, fire the response callback so callers can reset
+	// streaming state (see streaming path comment above).
+	if onResponse != nil {
 		onResponse(result.Response.Content.Text())
 	}
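
The skip-what-was-already-saved contract described for `PersistedMessageCount` can be sketched as follows. This is a minimal illustration, not Kit's code: messages are plain strings rather than `fantasy.Message`, the helper name `unpersisted` is hypothetical, and it assumes `ConversationMessages` is the original input followed by the new messages in order.

```go
package main

import "fmt"

// unpersisted returns the tail of conversation that still needs saving.
// inputCount is the number of messages the caller passed in, and
// persistedCount is the number of new messages already saved step by step.
// (Hypothetical helper; assumes conversation = input + new messages, in order.)
func unpersisted(conversation []string, inputCount, persistedCount int) []string {
	start := inputCount + persistedCount
	if start > len(conversation) {
		// Defensive clamp: never slice past the end.
		start = len(conversation)
	}
	return conversation[start:]
}

func main() {
	conversation := []string{
		"user: list files",      // original input
		"assistant: call ls",    // saved during step 1
		"tool: a.go b.go",       // saved during step 1
		"assistant: two files.", // not saved yet in this scenario
	}
	for _, m := range unpersisted(conversation, 1, 2) {
		fmt.Println("persist:", m)
	}
}
```

When every step was saved incrementally, the helper returns an empty slice and the post-generation save becomes a no-op.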

@@ -651,38 +837,68 @@ func (a *Agent) GetExtensionToolCount() int {
 // SetExtraTools replaces the agent's extra tools (e.g. extension-registered
 // tools) and rebuilds the internal agent with the updated tool list. The
 // model, system prompt, and all other configuration are preserved.
-func (a *Agent) SetExtraTools(tools []fantasy.AgentTool) {
-	a.extraTools = tools
+func (a *Agent) SetExtraTools(extraTools []fantasy.AgentTool) {
+	a.extraTools = extraTools
+	a.rebuildFantasyAgent()
+}

-	// Rebuild tool list (same as NewAgent / SetModel).
-	allTools := make([]fantasy.AgentTool, len(a.coreTools))
-	copy(allTools, a.coreTools)
-	if a.toolManager != nil {
-		allTools = append(allTools, a.toolManager.GetTools()...)
-	}
-	if len(a.extraTools) > 0 {
-		allTools = append(allTools, a.extraTools...)
-	}
-	if a.toolWrapper != nil {
-		allTools = a.toolWrapper(allTools)
+// AddMCPServer connects to a new MCP server at runtime and makes its tools
+// available to the agent. Returns the number of tools loaded.
+// If the agent has no tool manager (no MCP servers were configured at init),
+// one is created automatically.
+func (a *Agent) AddMCPServer(ctx context.Context, name string, cfg config.MCPServerConfig) (int, error) {
+	// Ensure MCP tools from initial load are settled first.
+	a.ensureMCPTools()
+
+	if a.toolManager == nil {
+		a.toolManager = tools.NewMCPToolManager()
+		a.toolManager.SetModel(a.model)
+		if a.authHandler != nil {
+			a.toolManager.SetAuthHandler(a.authHandler)
+		}
+		if a.tokenStoreFactory != nil {
+			a.toolManager.SetTokenStoreFactory(a.tokenStoreFactory)
+		}
+		a.toolManager.SetOnToolsChanged(func() {
+			a.rebuildFantasyAgent()
+		})
 	}

-	// Rebuild agent options with the existing model.
-	var agentOpts []fantasy.AgentOption
-	if a.systemPrompt != "" {
-		agentOpts = append(agentOpts, fantasy.WithSystemPrompt(a.systemPrompt))
-	}
-	if len(allTools) > 0 {
-		agentOpts = append(agentOpts, fantasy.WithTools(allTools...))
-	}
-	if a.maxSteps > 0 {
-		agentOpts = append(agentOpts, fantasy.WithStopConditions(
-			fantasy.StepCountIs(a.maxSteps),
-		))
+	count, err := a.toolManager.AddServer(ctx, name, cfg)
+	if err != nil {
+		return 0, err
 	}

-	// Swap the fantasy agent (model and provider are unchanged).
-	a.fantasyAgent = fantasy.NewAgent(a.model, agentOpts...)
+	// The onToolsChanged callback is only wired when the tool manager was
+	// created above, so rebuild explicitly to cover managers set up in NewAgent.
+	a.rebuildFantasyAgent()
+	return count, nil
+}
+
+// RemoveMCPServer disconnects an MCP server and removes its tools from the agent.
+func (a *Agent) RemoveMCPServer(name string) error {
+	if a.toolManager == nil {
+		return fmt.Errorf("no MCP servers loaded")
+	}
+
+	// Ensure MCP tools from initial load are settled first.
+	a.ensureMCPTools()
+
+	err := a.toolManager.RemoveServer(name)
+	if err != nil {
+		return err
+	}
+
+	// RemoveServer's onToolsChanged callback triggers rebuildFantasyAgent,
+	// but ensure rebuild happens regardless.
+	a.rebuildFantasyAgent()
+	return nil
+}
+
+// GetMCPToolManager returns the underlying MCP tool manager.
+// Returns nil if no MCP servers have been configured.
+func (a *Agent) GetMCPToolManager() *tools.MCPToolManager {
+	return a.toolManager
 }

 // GetLoadingMessage returns the loading message from provider creation.
@@ -698,64 +914,20 @@ func (a *Agent) GetLoadedServerNames() []string {
 	return a.toolManager.GetLoadedServerNames()
 }

-// SetModel swaps the agent's LLM provider to a new model. The existing tools,
-// system prompt, and configuration are preserved. The old provider is closed
-// if it has a closer. Returns the previous model string for notification.
+// SetModel swaps the agent's LLM provider to a new model. The existing tools
+// and configuration are preserved. When the new model's ProviderConfig carries
+// a system prompt (from per-model settings), it replaces the agent's stored
+// prompt so the rebuilt fantasy agent uses it. The old provider is closed if
+// it has a closer.
 func (a *Agent) SetModel(ctx context.Context, config *models.ProviderConfig) error {
+	// Ensure MCP tools are loaded before rebuilding (SetModel may be called
+	// before the first LLM call).
+	a.ensureMCPTools()
+
 	providerResult, err := models.CreateProvider(ctx, config)
 	if err != nil {
 		return fmt.Errorf("failed to create model provider: %v", err)
 	}
-
-	// Rebuild tool list (same as NewAgent).
-	allTools := make([]fantasy.AgentTool, len(a.coreTools))
-	copy(allTools, a.coreTools)
-	if a.toolManager != nil {
-		allTools = append(allTools, a.toolManager.GetTools()...)
-	}
-	if len(a.extraTools) > 0 {
-		allTools = append(allTools, a.extraTools...)
-	}
-	if a.toolWrapper != nil {
-		allTools = a.toolWrapper(allTools)
-	}
-
-	// Rebuild agent options.
-	var agentOpts []fantasy.AgentOption
-	if a.systemPrompt != "" {
-		agentOpts = append(agentOpts, fantasy.WithSystemPrompt(a.systemPrompt))
-	}
-	if len(allTools) > 0 {
-		agentOpts = append(agentOpts, fantasy.WithTools(allTools...))
-	}
-	if a.maxSteps > 0 {
-		agentOpts = append(agentOpts, fantasy.WithStopConditions(
-			fantasy.StepCountIs(a.maxSteps),
-		))
-	}
-
-	// Pass provider-specific options (e.g. OpenAI Responses API reasoning settings).
-	if providerResult.ProviderOptions != nil {
-		agentOpts = append(agentOpts, fantasy.WithProviderOptions(providerResult.ProviderOptions))
-	}
-
-	// Pass generation parameters when available.
-	// Skip max_output_tokens for providers that don't support it (e.g., Codex OAuth)
-	if config.MaxTokens > 0 && !providerResult.SkipMaxOutputTokens {
-		agentOpts = append(agentOpts, fantasy.WithMaxOutputTokens(int64(config.MaxTokens)))
-	}
-	if config.Temperature != nil {
-		agentOpts = append(agentOpts, fantasy.WithTemperature(float64(*config.Temperature)))
-	}
-	if config.TopP != nil {
-		agentOpts = append(agentOpts, fantasy.WithTopP(float64(*config.TopP)))
-	}
-	if config.TopK != nil {
-		agentOpts = append(agentOpts, fantasy.WithTopK(int64(*config.TopK)))
-	}
-
-	newFantasyAgent := fantasy.NewAgent(providerResult.Model, agentOpts...)
-
 	// Close old provider.
 	if a.providerCloser != nil {
 		_ = a.providerCloser.Close()
@@ -767,9 +939,18 @@ func (a *Agent) SetModel(ctx context.Context, config *models.ProviderConfig) err
 	}

 	// Swap fields.
-	a.fantasyAgent = newFantasyAgent
 	a.model = providerResult.Model
 	a.providerCloser = providerResult.Closer
+	a.providerOptions = providerResult.ProviderOptions
+	a.skipMaxOutputTokens = providerResult.SkipMaxOutputTokens
+	a.modelConfig = config
+
+	// Update system prompt when the config carries one (from per-model
+	// settings or the global config). This allows model-specific system
+	// prompts to take effect on model switch.
+	if config.SystemPrompt != "" {
+		a.systemPrompt = config.SystemPrompt
+	}

 	// Update provider type.
 	if config.ModelString != "" {
@@ -778,6 +959,9 @@ func (a *Agent) SetModel(ctx context.Context, config *models.ProviderConfig) err
 		}
 	}

+	// Rebuild the fantasy agent with the new model and current tool set.
+	a.rebuildFantasyAgent()
+
 	return nil
 }

@@ -787,7 +971,13 @@ func (a *Agent) GetModel() fantasy.LanguageModel {
 }

 // Close closes the agent and cleans up resources.
+// If MCP tools are still loading in the background, Close waits for them
+// to finish before closing connections to avoid resource leaks.
 func (a *Agent) Close() error {
+	// Wait for background MCP loading to finish before closing connections.
+	if a.mcpReady != nil {
+		<-a.mcpReady
+	}
 	var toolErr error
 	if a.toolManager != nil {
 		toolErr = a.toolManager.Close()
@@ -0,0 +1,302 @@
+package agent
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"runtime"
+	"strings"
+	"testing"
+	"time"
+
+	"charm.land/fantasy"
+
+	"github.com/mark3labs/kit/internal/config"
+)
+
+// mockModel is a minimal LanguageModel that satisfies the interface
+// without making real API calls. Used to test tool management wiring.
+type mockModel struct{}
+
+func (m *mockModel) Generate(_ context.Context, _ fantasy.Call) (*fantasy.Response, error) {
+	return &fantasy.Response{}, nil
+}
+func (m *mockModel) Stream(_ context.Context, _ fantasy.Call) (fantasy.StreamResponse, error) {
+	return nil, nil
+}
+func (m *mockModel) GenerateObject(_ context.Context, _ fantasy.ObjectCall) (*fantasy.ObjectResponse, error) {
+	return &fantasy.ObjectResponse{}, nil
+}
+func (m *mockModel) StreamObject(_ context.Context, _ fantasy.ObjectCall) (fantasy.ObjectStreamResponse, error) {
+	return nil, nil
+}
+func (m *mockModel) Provider() string { return "mock" }
+func (m *mockModel) Model() string    { return "mock-model" }
+
+// testdataDir returns the absolute path to the tools testdata directory.
+func testdataDir(t *testing.T) string {
+	t.Helper()
+	_, file, _, ok := runtime.Caller(0)
+	if !ok {
+		t.Fatal("cannot determine test file path")
+	}
+	return filepath.Join(filepath.Dir(file), "..", "tools", "testdata")
+}
+
+// echoServerConfig returns an MCPServerConfig for the test echo MCP server.
+func echoServerConfig(t *testing.T) config.MCPServerConfig {
+	t.Helper()
+	script := filepath.Join(testdataDir(t), "echo_server.py")
+	if _, err := os.Stat(script); err != nil {
+		t.Skipf("echo_server.py not found: %v", err)
+	}
+	return config.MCPServerConfig{
+		Command: []string{"python3", script},
+	}
+}
+
+// mockAuthHandler is a minimal MCPAuthHandler for testing that auth handler
+// propagation works without requiring a real OAuth server.
+type mockAuthHandler struct {
+	redirectURI string
+}
+
+func (h *mockAuthHandler) RedirectURI() string { return h.redirectURI }
+func (h *mockAuthHandler) HandleAuth(_ context.Context, _ string, _ string) (string, error) {
+	return "", nil
+}
+
+// newTestAgent creates a minimal Agent with a mock model and no core tools,
+// suitable for testing MCP server management without an API key.
+func newTestAgent() *Agent {
+	model := &mockModel{}
+	a := &Agent{
+		model:        model,
+		coreTools:    nil,
+		extraTools:   nil,
+		maxSteps:     10,
+		systemPrompt: "test",
+		fantasyAgent: fantasy.NewAgent(model),
+	}
+	return a
+}
+
+func TestAgent_AddMCPServer(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	a := newTestAgent()
+	defer func() { _ = a.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Initially no MCP tools.
+	if a.GetMCPToolCount() != 0 {
+		t.Fatalf("Expected 0 MCP tools initially, got %d", a.GetMCPToolCount())
+	}
+
+	// Add a server.
+	count, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddMCPServer failed: %v", err)
+	}
+	if count != 2 {
+		t.Errorf("Expected 2 tools, got %d", count)
+	}
+
+	// Verify tools are in the agent's tool list.
+	if a.GetMCPToolCount() != 2 {
+		t.Errorf("Expected 2 MCP tools, got %d", a.GetMCPToolCount())
+	}
+
+	allTools := a.GetTools()
+	toolNames := make(map[string]bool)
+	for _, tool := range allTools {
+		toolNames[tool.Info().Name] = true
+	}
+	if !toolNames["echo__echo"] {
+		t.Error("Expected tool 'echo__echo' in agent tools")
+	}
+	if !toolNames["echo__greet"] {
+		t.Error("Expected tool 'echo__greet' in agent tools")
+	}
+
+	// Verify loaded server names.
+	names := a.GetLoadedServerNames()
+	found := false
+	for _, n := range names {
+		if n == "echo" {
+			found = true
+		}
+	}
+	if !found {
+		t.Errorf("Expected 'echo' in loaded server names: %v", names)
+	}
+}
+
+func TestAgent_RemoveMCPServer(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	a := newTestAgent()
+	defer func() { _ = a.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add then remove.
+	_, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddMCPServer failed: %v", err)
+	}
+
+	err = a.RemoveMCPServer("echo")
+	if err != nil {
+		t.Fatalf("RemoveMCPServer failed: %v", err)
+	}
+
+	// Verify tools removed.
+	if a.GetMCPToolCount() != 0 {
+		t.Errorf("Expected 0 MCP tools after removal, got %d", a.GetMCPToolCount())
+	}
+
+	// Verify agent's tool list has no MCP tools.
+	for _, tool := range a.GetTools() {
+		if strings.Contains(tool.Info().Name, "echo__") {
+			t.Errorf("Found leftover tool after removal: %s", tool.Info().Name)
+		}
+	}
+}
+
+func TestAgent_RemoveMCPServer_NoToolManager(t *testing.T) {
+	a := newTestAgent()
+	defer func() { _ = a.Close() }()
+
+	err := a.RemoveMCPServer("nonexistent")
+	if err == nil {
+		t.Fatal("Expected error when no tool manager exists")
+	}
+	if !strings.Contains(err.Error(), "no MCP servers loaded") {
+		t.Errorf("Expected 'no MCP servers loaded' error, got: %v", err)
+	}
+}
+
+func TestAgent_AddMCPServer_CreatesToolManager(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	a := newTestAgent()
+	defer func() { _ = a.Close() }()
+
+	// Initially no tool manager.
+	if a.GetMCPToolManager() != nil {
+		t.Fatal("Expected nil tool manager initially")
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+	_, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddMCPServer failed: %v", err)
+	}
+
+	// Tool manager should now exist.
+	if a.GetMCPToolManager() == nil {
+		t.Fatal("Expected tool manager to be created by AddMCPServer")
+	}
+}
+
+func TestAgent_AddRemoveAdd_MCP(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	a := newTestAgent()
+	defer func() { _ = a.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add → Remove → Add cycle.
+	_, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("First add failed: %v", err)
+	}
+
+	err = a.RemoveMCPServer("echo")
+	if err != nil {
+		t.Fatalf("Remove failed: %v", err)
+	}
+
+	count, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("Re-add failed: %v", err)
+	}
+	if count != 2 {
+		t.Errorf("Expected 2 tools on re-add, got %d", count)
+	}
+	if a.GetMCPToolCount() != 2 {
+		t.Errorf("Expected 2 MCP tools after re-add, got %d", a.GetMCPToolCount())
+	}
+}
+
+// TestAgent_AddMCPServer_InheritsAuthHandler verifies that AddMCPServer()
+// propagates the agent's authHandler and tokenStoreFactory to a newly created
+// MCPToolManager (fix for issue #3).
+func TestAgent_AddMCPServer_InheritsAuthHandler(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	handler := &mockAuthHandler{redirectURI: "http://localhost:9999/oauth/callback"}
+
+	model := &mockModel{}
+	a := &Agent{
+		model:             model,
+		coreTools:         nil,
+		extraTools:        nil,
+		maxSteps:          10,
+		systemPrompt:      "test",
+		fantasyAgent:      fantasy.NewAgent(model),
+		authHandler:       handler,
+		tokenStoreFactory: nil, // nil is fine; we just test authHandler propagation
+	}
+	defer func() { _ = a.Close() }()
+
+	// Initially no tool manager.
+	if a.GetMCPToolManager() != nil {
+		t.Fatal("Expected nil tool manager initially")
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+	_, err := a.AddMCPServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddMCPServer failed: %v", err)
+	}
+
+	// Tool manager should now exist and have the auth handler set.
+	tm := a.GetMCPToolManager()
+	if tm == nil {
+		t.Fatal("Expected tool manager to be created by AddMCPServer")
+	}
+
+	// Verify the auth handler was propagated via the tool manager's accessor.
+	if tm.GetAuthHandler() == nil {
+		t.Fatal("Expected auth handler to be propagated to tool manager")
+	}
+}
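
The precedence between `CoreTools` and `DisableCoreTools` that `NewAgent` applies can be restated as a small function. This is only a sketch: tools are plain strings instead of `fantasy.AgentTool`, and `selectCoreTools` and the `defaults` callback are hypothetical names standing in for the inline logic and `core.AllTools()`.

```go
package main

import "fmt"

// selectCoreTools mirrors the three-way choice: caller-provided tools win,
// then the disable flag yields no core tools, otherwise defaults are used.
// (Hypothetical helper; the real code performs this inline in NewAgent.)
func selectCoreTools(custom []string, disable bool, defaults func() []string) []string {
	switch {
	case len(custom) > 0:
		return custom // custom tools are used even when disable is true
	case disable:
		return nil // chat-only: no core tools
	default:
		return defaults()
	}
}

func main() {
	defaults := func() []string { return []string{"bash", "read", "write"} }

	fmt.Println(len(selectCoreTools(nil, false, defaults)))             // defaults
	fmt.Println(len(selectCoreTools(nil, true, defaults)))              // none
	fmt.Println(len(selectCoreTools([]string{"grep"}, true, defaults))) // custom
}
```

Note that extension and MCP tools are appended separately, so disabling core tools does not by itself guarantee an empty tool list.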
@@ -38,13 +38,24 @@ type AgentCreationOptions struct {
 	DebugLogger tools.DebugLogger // Optional debug logger
 	// AuthHandler handles OAuth authorization for remote MCP servers
 	AuthHandler tools.MCPAuthHandler
+	// TokenStoreFactory, if non-nil, creates a custom token store for each
+	// remote MCP server's OAuth tokens. When nil, the default file-based
+	// token store is used.
+	TokenStoreFactory tools.TokenStoreFactory
 	// CoreTools overrides the default core tool set. If empty, core.AllTools()
 	// is used.
 	CoreTools []fantasy.AgentTool
+	// DisableCoreTools, when true, prevents loading any core tools.
+	// If DisableCoreTools is true and CoreTools is empty, the agent
+	// will have no tools (useful for simple chat completions).
+	DisableCoreTools bool
 	// ToolWrapper wraps the combined tool list before agent creation.
 	ToolWrapper func([]fantasy.AgentTool) []fantasy.AgentTool
 	// ExtraTools are additional tools to include (e.g. from extensions).
 	ExtraTools []fantasy.AgentTool
+	// OnMCPServerLoaded, if non-nil, is called when each MCP server finishes
+	// loading (successfully or with error). Called from the background goroutine.
+	OnMCPServerLoaded func(serverName string, toolCount int, err error)
 }

 // CreateAgent creates an agent with optional spinner for Ollama models.
@@ -52,16 +63,19 @@ type AgentCreationOptions struct {
 // Returns the created agent or an error if creation fails.
 func CreateAgent(ctx context.Context, opts *AgentCreationOptions) (*Agent, error) {
 	agentConfig := &AgentConfig{
-		ModelConfig:      opts.ModelConfig,
-		MCPConfig:        opts.MCPConfig,
-		SystemPrompt:     opts.SystemPrompt,
-		MaxSteps:         opts.MaxSteps,
-		StreamingEnabled: opts.StreamingEnabled,
-		DebugLogger:      opts.DebugLogger,
-		AuthHandler:      opts.AuthHandler,
-		CoreTools:        opts.CoreTools,
-		ToolWrapper:      opts.ToolWrapper,
-		ExtraTools:       opts.ExtraTools,
+		ModelConfig:       opts.ModelConfig,
+		MCPConfig:         opts.MCPConfig,
+		SystemPrompt:      opts.SystemPrompt,
+		MaxSteps:          opts.MaxSteps,
+		StreamingEnabled:  opts.StreamingEnabled,
+		DebugLogger:       opts.DebugLogger,
+		AuthHandler:       opts.AuthHandler,
+		TokenStoreFactory: opts.TokenStoreFactory,
+		CoreTools:         opts.CoreTools,
+		DisableCoreTools:  opts.DisableCoreTools,
+		ToolWrapper:       opts.ToolWrapper,
+		ExtraTools:        opts.ExtraTools,
+		OnMCPServerLoaded: opts.OnMCPServerLoaded,
 	}

 	var agent *Agent
@@ -930,7 +930,8 @@ func (a *App) QuitFromExtension() {
 // controls styling: "" for plain text, "info" for a system message block,
 // "error" for an error block. In interactive mode it sends an
 // ExtensionPrintEvent through the program so the TUI can render it with the
-// appropriate renderer. In non-interactive mode it falls back to stdout.
+// appropriate renderer. In non-interactive mode, "error" and "info" messages
+// go to stderr with a level prefix; plain text still goes to stdout.
 func (a *App) PrintFromExtension(level, text string) {
 	a.mu.Lock()
 	prog := a.program
@@ -939,8 +940,16 @@ func (a *App) PrintFromExtension(level, text string) {
 		prog.Send(ExtensionPrintEvent{Text: text, Level: level})
 		return
 	}
-	// Non-interactive fallback: write directly to stdout.
-	fmt.Println(text)
+	// Non-interactive fallback: write "error" and "info" messages to stderr
+	// with a level prefix; plain text (any other level) goes to stdout.
+	switch level {
+	case "error":
+		fmt.Fprintf(os.Stderr, "[ERROR] %s\n", text)
+	case "info":
+		fmt.Fprintf(os.Stderr, "[INFO] %s\n", text)
+	default:
+		fmt.Println(text)
+	}
 }

 // SetEditorTextFromExtension sends an EditorTextSetEvent to the TUI to
@@ -997,6 +1006,47 @@ func (a *App) NotifyWidgetUpdate() {
 	}
 }

+// NotifyContentReload sends a ContentReloadEvent to the TUI so it refreshes
+// prompt templates and skills from their provider callbacks. Called by file
+// watchers when .md/.txt files change in prompt or skill directories.
+// In non-interactive mode this is a no-op.
+func (a *App) NotifyContentReload() {
+	a.mu.Lock()
+	prog := a.program
+	a.mu.Unlock()
+	if prog != nil {
+		prog.Send(ContentReloadEvent{})
+	}
+}
+
+// NotifyMCPToolsReady sends an MCPToolsReadyEvent to the TUI so it refreshes
+// tool names and MCP tool count from provider callbacks. Called when background
+// MCP tool loading completes. In non-interactive mode this is a no-op.
+func (a *App) NotifyMCPToolsReady() {
+	a.mu.Lock()
+	prog := a.program
+	a.mu.Unlock()
+	if prog != nil {
+		prog.Send(MCPToolsReadyEvent{})
+	}
+}
+
+// NotifyMCPServerLoaded sends an MCPServerLoadedEvent to the TUI so it can
+// display a system message when a single MCP server finishes loading. Called
+// per server during background MCP loading; a no-op in non-interactive mode.
+func (a *App) NotifyMCPServerLoaded(serverName string, toolCount int, err error) {
+	a.mu.Lock()
+	prog := a.program
+	a.mu.Unlock()
+	if prog != nil {
+		prog.Send(MCPServerLoadedEvent{
+			ServerName: serverName,
+			ToolCount:  toolCount,
+			Error:      err,
+		})
+	}
+}
+
 // SendEvent sends a tea.Msg to the registered program. Safe to call from
 // any goroutine. No-op when no program is registered.
 //
@@ -1081,11 +1131,12 @@ func (a *App) PrintBlockFromExtension(opts extensions.PrintBlockOpts) {
 		})
 		return
 	}
-	// Non-interactive fallback.
+	// Non-interactive fallback: render a simple framed block to stderr so
+	// it is visually distinct from plain stdout output.
 	if opts.Subtitle != "" {
-		fmt.Printf("%s\n  — %s\n", opts.Text, opts.Subtitle)
+		fmt.Fprintf(os.Stderr, "--- %s ---\n%s\n", opts.Subtitle, opts.Text)
 	} else {
-		fmt.Println(opts.Text)
+		fmt.Fprintf(os.Stderr, "---\n%s\n---\n", opts.Text)
 	}
 }

@@ -1114,9 +1165,10 @@ func (a *App) recordStepUsage(ev kit.StepUsageEvent, stepUsageSeen *atomic.Bool)
 		int(ev.CacheWriteTokens),
 	)
 	// NOTE: We do NOT call SetContextTokens here. Context fill is set once
-	// at turn completion via updateUsageFromTurnResult using FinalUsage.InputTokens,
-	// which reflects the full accumulated context. Per-step context tokens would
-	// cause the display to jump around during multi-step tool calls.
+	// at turn completion via updateUsageFromTurnResult, which sums all token
+	// categories (Input + CacheRead + CacheCreate + Output) from FinalUsage.
+	// Per-step context tokens would cause the display to jump around during
+	// multi-step tool calls.
 }

 // updateUsageFromTurnResult records token usage from an SDK TurnResult into the
@@ -1180,15 +1232,30 @@ func (a *App) updateUsageFromTurnResult(result *kit.TurnResult, userPrompt strin
 	}

 	// --- Context window fill (drives the % bar) ---
-	// Use FinalUsage.InputTokens as the context window fill. The API's InputTokens
-	// already includes the full conversation history (system prompt + all previous
-	// messages + current user message). Adding OutputTokens would double-count since
-	// the output becomes part of the input for the next turn.
-	if result.FinalUsage != nil && result.FinalUsage.InputTokens > 0 {
-		if a.opts.Debug {
-			log.Printf("[DEBUG] updateUsageFromTurnResult: calling SetContextTokens=%d (FinalUsage.InputTokens)",
-				result.FinalUsage.InputTokens)
+	// Calculate context fill from the LAST API call's usage. The context
+	// window is filled by everything sent to and received from the model:
+	//
+	//   InputTokens       — non-cached input (may be small with prompt caching)
+	//   CacheReadTokens   — input tokens served from cache
+	//   CacheCreationTokens — input tokens written to cache this call
+	//   OutputTokens      — assistant output (becomes input next turn)
+	//
+	// With Anthropic prompt caching, InputTokens can drop to near-zero while
+	// CacheReadTokens holds the bulk of the context. We must sum all four to
+	// get the true context window utilization.
+	//
+	// We use FinalUsage (last step only), NOT TotalUsage, because TotalUsage
+	// sums across all tool-calling steps — and each step re-sends the full
+	// conversation, so TotalUsage massively overstates the actual window fill.
+	if result.FinalUsage != nil {
+		u := result.FinalUsage
+		contextFill := int(u.InputTokens) + int(u.CacheReadTokens) + int(u.CacheCreationTokens) + int(u.OutputTokens)
+		if contextFill > 0 {
+			if a.opts.Debug {
+				log.Printf("[DEBUG] updateUsageFromTurnResult: SetContextTokens=%d (Input=%d + CacheRead=%d + CacheCreate=%d + Output=%d)",
+					contextFill, u.InputTokens, u.CacheReadTokens, u.CacheCreationTokens, u.OutputTokens)
+			}
+			a.opts.UsageTracker.SetContextTokens(contextFill)
 		}
-		a.opts.UsageTracker.SetContextTokens(int(result.FinalUsage.InputTokens))
 	}
 }
@@ -630,10 +630,12 @@ func TestUpdateUsageFromTurnResult_recordsWhenInputTokensZero(t *testing.T) {
 	}
 }

-// TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly verifies that context
-// window fill uses InputTokens only (not input+output). The API's InputTokens
-// already includes the full conversation history; adding output would double-count.
-func TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly(t *testing.T) {
+// TestUpdateUsageFromTurnResult_contextTokensUsesAllCategories verifies that
+// context window fill uses all token categories from the final API call:
+// InputTokens + CacheReadTokens + CacheCreationTokens + OutputTokens.
+// With Anthropic prompt caching, InputTokens can be near-zero while
+// CacheReadTokens holds the bulk of the context.
+func TestUpdateUsageFromTurnResult_contextTokensUsesAllCategories(t *testing.T) {
 	usage := &usageUpdaterStub{}
 	app := New(Options{UsageTracker: usage}, nil)
 	defer app.Close()
@@ -641,22 +643,26 @@ func TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly(t *testing.T) {
 	app.updateUsageFromTurnResult(&kit.TurnResult{
 		Response: "ok",
 		TotalUsage: &kit.LLMUsage{
-			InputTokens:  1000,
-			OutputTokens: 200,
+			InputTokens:         3,
+			OutputTokens:        5,
+			CacheReadTokens:     0,
+			CacheCreationTokens: 4317,
 		},
 		FinalUsage: &kit.LLMUsage{
-			InputTokens:  1000, // Full context including history
-			OutputTokens: 200,
+			InputTokens:         3,    // Non-cached input (small with caching)
+			OutputTokens:        5,    // Assistant output
+			CacheReadTokens:     0,    // No cache reads on first call
+			CacheCreationTokens: 4317, // System prompt + tools written to cache
 		},
 	}, "prompt", false)

 	usage.mu.Lock()
 	defer usage.mu.Unlock()

-	// Context tokens should be InputTokens only (1000), not input+output (1200)
-	// because InputTokens already includes the full conversation history
-	if usage.contextCalls != 1 || usage.lastContextTokens != 1000 {
-		t.Fatalf("expected context tokens=1000 (InputTokens only), got calls=%d tokens=%d",
-			usage.contextCalls, usage.lastContextTokens)
+	// Context tokens should be Input + CacheRead + CacheCreate + Output = 4325
+	expected := 3 + 0 + 4317 + 5
+	if usage.contextCalls != 1 || usage.lastContextTokens != expected {
+		t.Fatalf("expected context tokens=%d (all categories), got calls=%d tokens=%d",
+			expected, usage.contextCalls, usage.lastContextTokens)
 	}
 }
@@ -167,6 +167,25 @@ type ModelChangedEvent struct {
 // from its WidgetProvider on the next render cycle.
 type WidgetUpdateEvent struct{}

+// ContentReloadEvent is sent when prompt templates or skills are reloaded
+// from disk (e.g. by a file watcher detecting changes). The TUI refreshes
+// its autocomplete entries and internal state from the provider callbacks.
+type ContentReloadEvent struct{}
+
+// MCPToolsReadyEvent is sent when background MCP tool loading completes.
+// The TUI refreshes its tool names and MCP tool count from provider callbacks
+// so that /tools and the startup info bar reflect the loaded MCP tools.
+type MCPToolsReadyEvent struct{}
+
+// MCPServerLoadedEvent is sent when a single MCP server finishes loading
+// (successfully or with error). The TUI displays a system message so users
+// see real-time progress as each server initializes.
+type MCPServerLoadedEvent struct {
+	ServerName string
+	ToolCount  int
+	Error      error // nil on success
+}
+
 // EditorTextSetEvent is sent when an extension calls ctx.SetEditorText to
 // pre-fill the input editor with text. The TUI handles this by setting the
 // textarea content and moving the cursor to the end.
@@ -21,8 +21,10 @@ type UsageUpdater interface {
 	// the provider does not return exact counts.
 	EstimateAndUpdateUsage(inputText, outputText string)
 	// SetContextTokens records the approximate current context window fill
-	// level. This should be the final API call's input+output tokens (from
-	// FinalResponse.Usage), NOT the aggregate TotalUsage.
+	// level. This should be the sum of ALL token categories from the last
+	// API call: InputTokens + CacheReadTokens + CacheCreationTokens +
+	// OutputTokens. With Anthropic prompt caching, InputTokens can be
+	// near-zero while CacheReadTokens holds the bulk of the context.
 	SetContextTokens(tokens int)
 }

@@ -22,6 +22,14 @@ type MCPServerConfig struct {
 	AllowedTools  []string          `json:"allowedTools,omitempty" yaml:"allowedTools,omitempty"`
 	ExcludedTools []string          `json:"excludedTools,omitempty" yaml:"excludedTools,omitempty"`

+	// OAuth configuration for remote servers that don't support dynamic
+	// client registration (e.g. GitHub). When OAuthClientID is set, it is
+	// passed directly to the transport's OAuthConfig instead of relying on
+	// dynamic registration.
+	OAuthClientID     string   `json:"oauthClientId,omitempty" yaml:"oauthClientId,omitempty"`
+	OAuthClientSecret string   `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
+	OAuthScopes       []string `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`
+
 	// Legacy fields for backward compatibility
 	Transport string         `json:"transport,omitempty"`
 	Args      []string       `json:"args,omitempty"`
@@ -35,13 +43,16 @@ type MCPServerConfig struct {
 func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
 	// First try to unmarshal as the new format
 	type newFormat struct {
-		Type          string            `json:"type"`
-		Command       []string          `json:"command,omitempty"`
-		Environment   map[string]string `json:"environment,omitempty"`
-		URL           string            `json:"url,omitempty"`
-		Headers       []string          `json:"headers,omitempty"`
-		AllowedTools  []string          `json:"allowedTools,omitempty" yaml:"allowedTools,omitempty"`
-		ExcludedTools []string          `json:"excludedTools,omitempty" yaml:"excludedTools,omitempty"`
+		Type              string            `json:"type"`
+		Command           []string          `json:"command,omitempty"`
+		Environment       map[string]string `json:"environment,omitempty"`
+		URL               string            `json:"url,omitempty"`
+		Headers           []string          `json:"headers,omitempty"`
+		AllowedTools      []string          `json:"allowedTools,omitempty" yaml:"allowedTools,omitempty"`
+		ExcludedTools     []string          `json:"excludedTools,omitempty" yaml:"excludedTools,omitempty"`
+		OAuthClientID     string            `json:"oauthClientId,omitempty" yaml:"oauthClientId,omitempty"`
+		OAuthClientSecret string            `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
+		OAuthScopes       []string          `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`
 	}

 	// Also try legacy format
@@ -66,6 +77,9 @@ func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
 		s.Headers = newConfig.Headers
 		s.AllowedTools = newConfig.AllowedTools
 		s.ExcludedTools = newConfig.ExcludedTools
+		s.OAuthClientID = newConfig.OAuthClientID
+		s.OAuthClientSecret = newConfig.OAuthClientSecret
+		s.OAuthScopes = newConfig.OAuthScopes
 		return nil
 	}

@@ -157,6 +171,21 @@ type Theme struct {
 	Markdown MarkdownThemeConfig `json:"markdown,omitzero" yaml:"markdown,omitempty"`
 }

+// GenerationParams defines generation parameter defaults that can be attached
+// to individual models. These act as model-level defaults — CLI flags and
+// global config values take precedence when explicitly set.
+type GenerationParams struct {
+	MaxTokens        *int     `json:"maxTokens,omitempty" yaml:"maxTokens,omitempty"`
+	Temperature      *float32 `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	TopP             *float32 `json:"topP,omitempty" yaml:"topP,omitempty"`
+	TopK             *int32   `json:"topK,omitempty" yaml:"topK,omitempty"`
+	FrequencyPenalty *float32 `json:"frequencyPenalty,omitempty" yaml:"frequencyPenalty,omitempty"`
+	PresencePenalty  *float32 `json:"presencePenalty,omitempty" yaml:"presencePenalty,omitempty"`
+	StopSequences    []string `json:"stopSequences,omitempty" yaml:"stopSequences,omitempty"`
+	ThinkingLevel    string   `json:"thinkingLevel,omitempty" yaml:"thinkingLevel,omitempty"`
+	SystemPrompt     string   `json:"systemPrompt,omitempty" yaml:"systemPrompt,omitempty"`
+}
+
 // CustomModelConfig defines a custom model that can be used with custom/custom
 // or other custom/ prefixed models. These models are loaded from the config file
 // and merged into the custom provider in the model registry.
@@ -171,6 +200,11 @@ type CustomModelConfig struct {
 	Knowledge   string      `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
 	Cost        CostConfig  `json:"cost" yaml:"cost"`
 	Limit       LimitConfig `json:"limit" yaml:"limit"`
+
+	// Generation parameter defaults for this model.
+	// These are applied when the user hasn't explicitly set the corresponding
+	// CLI flag or global config value.
+	Params GenerationParams `json:"params,omitzero" yaml:"params,omitempty"`
 }

 // CostConfig defines the pricing for a custom model.
@@ -199,11 +233,13 @@ type Config struct {
 	Stream         *bool                      `json:"stream,omitempty" yaml:"stream,omitempty"`
 	Theme          any                        `json:"theme" yaml:"theme"`
 	// Model generation parameters
-	MaxTokens     int      `json:"max-tokens,omitempty" yaml:"max-tokens,omitempty"`
-	Temperature   *float32 `json:"temperature,omitempty" yaml:"temperature,omitempty"`
-	TopP          *float32 `json:"top-p,omitempty" yaml:"top-p,omitempty"`
-	TopK          *int32   `json:"top-k,omitempty" yaml:"top-k,omitempty"`
-	StopSequences []string `json:"stop-sequences,omitempty" yaml:"stop-sequences,omitempty"`
+	MaxTokens        int      `json:"max-tokens,omitempty" yaml:"max-tokens,omitempty"`
+	Temperature      *float32 `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	TopP             *float32 `json:"top-p,omitempty" yaml:"top-p,omitempty"`
+	TopK             *int32   `json:"top-k,omitempty" yaml:"top-k,omitempty"`
+	FrequencyPenalty *float32 `json:"frequency-penalty,omitempty" yaml:"frequency-penalty,omitempty"`
+	PresencePenalty  *float32 `json:"presence-penalty,omitempty" yaml:"presence-penalty,omitempty"`
+	StopSequences    []string `json:"stop-sequences,omitempty" yaml:"stop-sequences,omitempty"`

 	// Thinking / extended reasoning
 	ThinkingLevel string `json:"thinking-level,omitempty" yaml:"thinking-level,omitempty"`
@@ -217,6 +253,12 @@ type Config struct {

 	// Custom model definitions (under custom/ provider)
 	CustomModels map[string]CustomModelConfig `json:"customModels,omitempty" yaml:"customModels,omitempty"`
+
+	// Per-model generation parameter overrides. Keys are "provider/model" strings
+	// (e.g. "anthropic/claude-sonnet-4-5-20250929", "openai/gpt-4o"). These
+	// settings act as model-level defaults — CLI flags and global config values
+	// take precedence when explicitly set.
+	ModelSettings map[string]GenerationParams `json:"modelSettings,omitempty" yaml:"modelSettings,omitempty"`
 }

 // GetTransportType returns the transport type for the server config, mapping
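The precedence rule stated in the `ModelSettings` comment can be sketched as follows. This is only an illustration under assumptions: `resolveTemperature` and `ptr` are hypothetical helpers, a nil pointer is taken to mean "not explicitly set", and the ordering of CLI flag ahead of global config is assumed rather than confirmed by this diff.

```go
package main

import "fmt"

// resolveTemperature is a hypothetical helper. It returns the first
// explicitly set value in the assumed order: CLI flag, global config,
// per-model default. A nil pointer means "not set".
func resolveTemperature(cli, global, modelDefault *float32) *float32 {
	for _, v := range []*float32{cli, global, modelDefault} {
		if v != nil {
			return v
		}
	}
	return nil // nothing set anywhere; the provider default would apply
}

// ptr returns a pointer to f, for building optional values inline.
func ptr(f float32) *float32 { return &f }

func main() {
	// Stand-in for a modelSettings["provider/model"].Temperature entry.
	modelDefault := ptr(0.3)

	// No CLI flag, no global value: the model-level default is used.
	fmt.Println(*resolveTemperature(nil, nil, modelDefault))

	// An explicit CLI flag takes precedence over the model-level default.
	fmt.Println(*resolveTemperature(ptr(0.9), nil, modelDefault))
}
```

Using pointers, as `GenerationParams` does, is what lets "unset" be told apart from an explicit zero such as `temperature: 0`.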
@@ -365,16 +407,55 @@ mcpServers:
 # debug: false                                 # Enable debug logging
 # system-prompt: "/path/to/system-prompt.txt" # System prompt text file

-# Model generation parameters (all optional)
+# Model generation parameters (all optional, apply globally to all models)
 # max-tokens: 4096                             # Maximum tokens in response
 # temperature: 0.7                             # Randomness (0.0-1.0)
 # top-p: 0.95                                  # Nucleus sampling (0.0-1.0)
 # top-k: 40                                    # Top K sampling
+# frequency-penalty: 0.0                        # Penalize frequent tokens (typically -2.0 to 2.0)
+# presence-penalty: 0.0                         # Penalize present tokens (typically -2.0 to 2.0)
 # stop-sequences: ["Human:", "Assistant:"]     # Custom stop sequences

+# Per-model generation parameter overrides (apply to specific models)
+# These act as model-level defaults — CLI flags and global settings above take precedence.
+# Keys are "provider/model" strings matching the model you use.
+# modelSettings:
+#   anthropic/claude-sonnet-4-5-20250929:
+#     temperature: 0.3
+#     maxTokens: 8192
+#   openai/gpt-4o:
+#     temperature: 0.7
+#     topP: 0.95
+#     topK: 40
+#     frequencyPenalty: 0.1
+#     presencePenalty: 0.1
+#   anthropic/claude-opus-4-6:
+#     thinkingLevel: "high"
+#     maxTokens: 16384
+#     systemPrompt: "You are a deep reasoning assistant."  # or a file path
+
 # API Configuration (can also use environment variables)
 # provider-api-key: "your-api-key"         # API key for OpenAI, Anthropic, or Google
 # provider-url: "https://api.openai.com/v1" # Base URL for OpenAI, Anthropic, or Ollama
+
+# Custom model definitions (under custom/ provider)
+# customModels:
+#   my-local-llama:
+#     name: "Local Llama 3"
+#     baseUrl: "http://localhost:8080/v1"
+#     family: "llama"
+#     temperature: true
+#     cost:
+#       input: 0.0
+#       output: 0.0
+#     limit:
+#       context: 131072
+#       output: 8192
+#     params:                              # Generation parameter defaults for this model
+#       temperature: 0.8
+#       topP: 0.95
+#       topK: 40
+#       systemPrompt: "You are a helpful local assistant."
 `

 	_, err = file.WriteString(content)
@@ -6,6 +6,8 @@ import (
 	"path/filepath"
 	"strings"
 	"testing"
+
+	"gopkg.in/yaml.v3"
 )

 func TestMCPServerConfig_NewFormat(t *testing.T) {
@@ -542,3 +544,86 @@ func TestEnsureConfigExistsWhenFileExists(t *testing.T) {
 		t.Error("Existing config file was modified when it shouldn't have been")
 	}
 }
+
+func TestMCPServerConfig_OAuthFields_JSON(t *testing.T) {
+	jsonData := `{
+		"type": "remote",
+		"url": "https://api.githubcopilot.com/mcp/",
+		"oauthClientId": "Ov23liXXXXXXXXXXXXXX",
+		"oauthClientSecret": "secret123",
+		"oauthScopes": ["read:user", "repo"]
+	}`
+
+	var cfg MCPServerConfig
+	err := json.Unmarshal([]byte(jsonData), &cfg)
+	if err != nil {
+		t.Fatalf("Failed to unmarshal: %v", err)
+	}
+
+	if cfg.Type != "remote" {
+		t.Errorf("Expected type 'remote', got %q", cfg.Type)
+	}
+	if cfg.URL != "https://api.githubcopilot.com/mcp/" {
+		t.Errorf("Expected URL, got %q", cfg.URL)
+	}
+	if cfg.OAuthClientID != "Ov23liXXXXXXXXXXXXXX" {
+		t.Errorf("Expected OAuthClientID 'Ov23liXXXXXXXXXXXXXX', got %q", cfg.OAuthClientID)
+	}
+	if cfg.OAuthClientSecret != "secret123" {
+		t.Errorf("Expected OAuthClientSecret 'secret123', got %q", cfg.OAuthClientSecret)
+	}
+	if len(cfg.OAuthScopes) != 2 || cfg.OAuthScopes[0] != "read:user" || cfg.OAuthScopes[1] != "repo" {
+		t.Errorf("Expected OAuthScopes [read:user, repo], got %v", cfg.OAuthScopes)
+	}
+}
+
+func TestMCPServerConfig_OAuthFields_YAML(t *testing.T) {
+	yamlData := `
+type: remote
+url: https://api.githubcopilot.com/mcp/
+oauthClientId: "Ov23liXXXXXXXXXXXXXX"
+oauthScopes:
+  - read:user
+  - repo
+`
+
+	var cfg MCPServerConfig
+	err := yaml.Unmarshal([]byte(yamlData), &cfg)
+	if err != nil {
+		t.Fatalf("Failed to unmarshal YAML: %v", err)
+	}
+
+	if cfg.Type != "remote" {
+		t.Errorf("Expected type 'remote', got %q", cfg.Type)
+	}
+	if cfg.OAuthClientID != "Ov23liXXXXXXXXXXXXXX" {
+		t.Errorf("Expected OAuthClientID 'Ov23liXXXXXXXXXXXXXX', got %q", cfg.OAuthClientID)
+	}
+	if len(cfg.OAuthScopes) != 2 || cfg.OAuthScopes[0] != "read:user" || cfg.OAuthScopes[1] != "repo" {
+		t.Errorf("Expected OAuthScopes [read:user, repo], got %v", cfg.OAuthScopes)
+	}
+}
+
+func TestMCPServerConfig_OAuthFields_Omitted(t *testing.T) {
+	// Verify that omitting OAuth fields still works (backward compat).
+	jsonData := `{
+		"type": "remote",
+		"url": "https://example.com/mcp"
+	}`
+
+	var cfg MCPServerConfig
+	err := json.Unmarshal([]byte(jsonData), &cfg)
+	if err != nil {
+		t.Fatalf("Failed to unmarshal: %v", err)
+	}
+
+	if cfg.OAuthClientID != "" {
+		t.Errorf("Expected empty OAuthClientID, got %q", cfg.OAuthClientID)
+	}
+	if cfg.OAuthClientSecret != "" {
+		t.Errorf("Expected empty OAuthClientSecret, got %q", cfg.OAuthClientSecret)
+	}
+	if len(cfg.OAuthScopes) != 0 {
+		t.Errorf("Expected empty OAuthScopes, got %v", cfg.OAuthScopes)
+	}
+}
@@ -34,15 +34,10 @@ func LoadExtensions(extraPaths []string) ([]LoadedExtension, error) {
 	for _, p := range paths {
 		ext, err := loadSingleExtension(p)
 		if err != nil {
-			log.Warn("skipping extension", "path", p, "err", err)
 			continue
 		}
 		loaded = append(loaded, *ext)
-		log.Debug("loaded extension", "path", p,
-			"handlers", countHandlers(ext),
-			"tools", len(ext.Tools),
-			"commands", len(ext.Commands),
-			"tool_renderers", len(ext.ToolRenderers))
+		log.Debug("loaded extension", "path", p, "handlers", countHandlers(ext), "tools", len(ext.Tools), "commands", len(ext.Commands), "tool_renderers", len(ext.ToolRenderers))
 	}
 	return loaded, nil
 }
@@ -2,12 +2,12 @@ package extensions

 import (
 	"fmt"
+	"log"
 	"os"
 	"sort"
 	"strings"
 	"sync"

-	"github.com/charmbracelet/log"
 	"github.com/spf13/viper"
 )

@@ -370,10 +370,7 @@ func (r *Runner) Emit(event Event) (Result, error) {
 		for _, handler := range handlers {
 			result, err := safeCall(handler, event, ctx)
 			if err != nil {
-				log.Warn("extension handler error",
-					"path", ext.Path,
-					"event", event.Type(),
-					"err", err)
+				log.Printf("WARN extension handler error: path=%s event=%s err=%v", ext.Path, event.Type(), err)
 				continue
 			}
 			if result == nil {
@@ -707,9 +704,7 @@ func (r *Runner) EmitCustomEvent(name, data string) {
 	safeInvoke := func(h func(string)) {
 		defer func() {
 			if rec := recover(); rec != nil {
-				log.Warn("custom event handler panicked",
-					"event", name,
-					"err", fmt.Sprintf("%v", rec))
+				log.Printf("WARN custom event handler panicked: event=%s err=%v", name, rec)
 			}
 		}()
 		h(data)
@@ -3,13 +3,13 @@ package extensions
 import (
 	"context"
 	"fmt"
+	"log"
 	"os"
 	"path/filepath"
 	"strings"
 	"sync"
 	"time"

-	"github.com/charmbracelet/log"
 	"github.com/fsnotify/fsnotify"
 )

@@ -39,7 +39,7 @@ func NewWatcher(dirs []string, onReload func()) (*Watcher, error) {
 	for _, dir := range dirs {
 		// Watch the directory itself.
 		if err := fsw.Add(dir); err != nil {
-			log.Debug("watcher: skipping directory", "dir", dir, "err", err)
+			log.Printf("DEBUG watcher: skipping directory: dir=%s err=%v", dir, err)
 			continue
 		}

@@ -52,7 +52,7 @@ func NewWatcher(dirs []string, onReload func()) (*Watcher, error) {
 			if entry.IsDir() {
 				subdir := filepath.Join(dir, entry.Name())
 				if err := fsw.Add(subdir); err != nil {
-					log.Debug("watcher: skipping subdirectory", "dir", subdir, "err", err)
+					log.Printf("DEBUG watcher: skipping subdirectory: dir=%s err=%v", subdir, err)
 				}
 			}
 		}
@@ -101,7 +101,7 @@ func (w *Watcher) Start(ctx context.Context) {
 				continue
 			}

-			log.Debug("watcher: file changed", "file", event.Name, "op", event.Op)
+			log.Printf("DEBUG watcher: file changed: file=%s op=%s", event.Name, event.Op)

 			// Debounce: reset timer on each event.
 			if timer != nil {
@@ -113,14 +113,14 @@ func (w *Watcher) Start(ctx context.Context) {
 		case <-timerC:
 			timerC = nil
 			timer = nil
-			log.Debug("watcher: reloading extensions")
+			log.Printf("DEBUG watcher: reloading extensions")
 			w.onReload()

 		case err, ok := <-w.watcher.Errors:
 			if !ok {
 				return
 			}
-			log.Warn("watcher: error", "err", err)
+			log.Printf("WARN watcher: error: %v", err)
 		}
 	}
 }
@@ -0,0 +1,248 @@
+// Package fences provides utilities for detecting markdown code regions
+// (fenced code blocks and inline code spans) and applying transformations
+// only to text outside those regions.
+//
+// This prevents special tokens like $1, $@, or @file from being interpreted
+// when they appear inside ``` fences, ~~~ fences, or `inline` code spans.
+package fences
+
+import "strings"
+
+// Ranges returns byte ranges [start, end) of fenced code blocks in content.
+// Recognises both backtick (```) and tilde (~~~) fences, with optional
+// leading indentation (up to 3 spaces) and optional info strings.
+// An unclosed fence extends to the end of content.
+func Ranges(content string) [][2]int {
+	var result [][2]int
+	var inFence bool
+	var fenceChar byte
+	var fenceCount int
+	var fenceStart int
+
+	pos := 0
+	for pos < len(content) {
+		// Find the end of the current line.
+		lineEnd := strings.IndexByte(content[pos:], '\n')
+		var line string
+		var nextPos int
+		if lineEnd < 0 {
+			line = content[pos:]
+			nextPos = len(content)
+		} else {
+			line = content[pos : pos+lineEnd]
+			nextPos = pos + lineEnd + 1
+		}
+
+		trimmed := strings.TrimLeft(line, " ")
+		indent := len(line) - len(trimmed)
+
+		if !inFence {
+			if indent <= 3 {
+				if ch, n := parseFenceOpen(trimmed); n > 0 {
+					inFence = true
+					fenceChar = ch
+					fenceCount = n
+					fenceStart = pos
+				}
+			}
+		} else {
+			if indent <= 3 && isFenceClose(trimmed, fenceChar, fenceCount) {
+				result = append(result, [2]int{fenceStart, nextPos})
+				inFence = false
+			}
+		}
+
+		pos = nextPos
+	}
+
+	// Unclosed fence extends to end of content.
+	if inFence {
+		result = append(result, [2]int{fenceStart, len(content)})
+	}
+
+	return result
+}
+
+// ReplaceOutside applies fn to each text segment that is outside fenced code
+// blocks and inline code spans, leaving code content unchanged. This is the
+// primary entry point for callers that need to do regex replacement only on
+// non-code text.
+func ReplaceOutside(content string, fn func(string) string) string {
+	ranges := Ranges(content)
+	if len(ranges) == 0 {
+		return replaceOutsideInline(content, fn)
+	}
+
+	var b strings.Builder
+	b.Grow(len(content))
+	pos := 0
+	for _, r := range ranges {
+		if pos < r[0] {
+			// Within non-fenced segments, also skip inline code spans.
+			b.WriteString(replaceOutsideInline(content[pos:r[0]], fn))
+		}
+		// Preserve fenced content verbatim.
+		b.WriteString(content[r[0]:r[1]])
+		pos = r[1]
+	}
+	if pos < len(content) {
+		b.WriteString(replaceOutsideInline(content[pos:], fn))
+	}
+	return b.String()
+}
+
+// StripCode returns content with fenced code blocks and inline code spans
+// removed. Useful for detection/matching where only non-code text matters.
+func StripCode(content string) string {
+	// First strip fenced blocks.
+	stripped := StripFenced(content)
+	// Then strip inline code spans from what remains.
+	return stripInlineCode(stripped)
+}
+
+// StripFenced returns content with fenced code block regions removed.
+// Useful for detection/matching where only non-fenced text matters.
+// NOTE: this does NOT strip inline code spans; use StripCode for both.
+func StripFenced(content string) string {
+	ranges := Ranges(content)
+	if len(ranges) == 0 {
+		return content
+	}
+
+	var b strings.Builder
+	b.Grow(len(content))
+	pos := 0
+	for _, r := range ranges {
+		b.WriteString(content[pos:r[0]])
+		pos = r[1]
+	}
+	b.WriteString(content[pos:])
+	return b.String()
+}
+
+// parseFenceOpen checks whether trimmed (leading spaces already removed)
+// starts a fenced code block. Returns the fence character and count, or
+// (0, 0) if it is not a fence opener.
+func parseFenceOpen(trimmed string) (byte, int) {
+	if len(trimmed) == 0 {
+		return 0, 0
+	}
+	ch := trimmed[0]
+	if ch != '`' && ch != '~' {
+		return 0, 0
+	}
+	count := 0
+	for count < len(trimmed) && trimmed[count] == ch {
+		count++
+	}
+	if count < 3 {
+		return 0, 0
+	}
+	// Per CommonMark: backtick fences cannot have backticks in the info string.
+	if ch == '`' && strings.ContainsRune(trimmed[count:], '`') {
+		return 0, 0
+	}
+	return ch, count
+}
+
+// isFenceClose checks whether trimmed is a closing fence matching fenceChar
+// with at least minCount characters. A closing fence line contains only the
+// fence characters and optional trailing spaces.
+func isFenceClose(trimmed string, fenceChar byte, minCount int) bool {
+	if len(trimmed) == 0 || trimmed[0] != fenceChar {
+		return false
+	}
+	count := 0
+	for count < len(trimmed) && trimmed[count] == fenceChar {
+		count++
+	}
+	if count < minCount {
+		return false
+	}
+	// Closing fence must contain only fence chars (and optional trailing spaces).
+	return strings.TrimRight(trimmed[count:], " ") == ""
+}
+
+// --------------------------------------------------------------------------
+// Inline code span handling
+// --------------------------------------------------------------------------
+
+// inlineCodeRanges returns byte ranges [start, end) of inline code spans
+// in s. Per CommonMark, a code span opens with a run of N backticks and
+// closes with the next run of exactly N backticks.
+func inlineCodeRanges(s string) [][2]int {
+	var result [][2]int
+	i := 0
+	for i < len(s) {
+		if s[i] != '`' {
+			i++
+			continue
+		}
+		// Count opening backticks.
+		start := i
+		n := 0
+		for i < len(s) && s[i] == '`' {
+			n++
+			i++
+		}
+		// Scan for a closing run of exactly n backticks.
+		for j := i; j < len(s); {
+			if s[j] != '`' {
+				j++
+				continue
+			}
+			m := 0
+			for j < len(s) && s[j] == '`' {
+				m++
+				j++
+			}
+			if m == n {
+				result = append(result, [2]int{start, j})
+				i = j
+				break
+			}
+		}
+		// If no closing run was found, i is already past the opening
+		// backticks so the outer loop advances naturally.
+	}
+	return result
+}
+
+// replaceOutsideInline applies fn only to text outside inline code spans.
+func replaceOutsideInline(segment string, fn func(string) string) string {
+	ranges := inlineCodeRanges(segment)
+	if len(ranges) == 0 {
+		return fn(segment)
+	}
+	var b strings.Builder
+	b.Grow(len(segment))
+	pos := 0
+	for _, r := range ranges {
+		if pos < r[0] {
+			b.WriteString(fn(segment[pos:r[0]]))
+		}
+		b.WriteString(segment[r[0]:r[1]])
+		pos = r[1]
+	}
+	if pos < len(segment) {
+		b.WriteString(fn(segment[pos:]))
+	}
+	return b.String()
+}
+
+// stripInlineCode removes inline code spans from s.
+func stripInlineCode(s string) string {
+	ranges := inlineCodeRanges(s)
+	if len(ranges) == 0 {
+		return s
+	}
+	var b strings.Builder
+	b.Grow(len(s))
+	pos := 0
+	for _, r := range ranges {
+		b.WriteString(s[pos:r[0]])
+		pos = r[1]
+	}
+	b.WriteString(s[pos:])
+	return b.String()
+}
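The inline-span handling above can be tried in isolation. The following is a minimal, stdlib-only sketch, not the package's actual API: `inlineSpans`, `outsideInline`, `substituteFirst`, and the `\$1` pattern are hypothetical names chosen for illustration. It only mirrors the "N backticks closed by exactly N backticks" rule, and unlike `replaceOutsideInline` it also calls `fn` on empty segments.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// inlineSpans returns [start, end) byte ranges of inline code spans: a run
// of N backticks closed by the next run of exactly N backticks.
func inlineSpans(s string) [][2]int {
	var spans [][2]int
	for i := 0; i < len(s); {
		if s[i] != '`' {
			i++
			continue
		}
		start := i
		for i < len(s) && s[i] == '`' {
			i++
		}
		n := i - start // length of the opening run
		for j := i; j < len(s); {
			if s[j] != '`' {
				j++
				continue
			}
			runStart := j
			for j < len(s) && s[j] == '`' {
				j++
			}
			if j-runStart == n {
				spans = append(spans, [2]int{start, j})
				i = j
				break
			}
		}
	}
	return spans
}

// outsideInline applies fn to the text between inline code spans and copies
// the spans themselves through unchanged.
func outsideInline(s string, fn func(string) string) string {
	var b strings.Builder
	pos := 0
	for _, r := range inlineSpans(s) {
		b.WriteString(fn(s[pos:r[0]]))
		b.WriteString(s[r[0]:r[1]])
		pos = r[1]
	}
	b.WriteString(fn(s[pos:]))
	return b.String()
}

// firstArg is a hypothetical placeholder pattern used only in this sketch.
var firstArg = regexp.MustCompile(`\$1`)

// substituteFirst replaces $1 with arg everywhere except inside code spans.
func substituteFirst(template, arg string) string {
	return outsideInline(template, func(seg string) string {
		return firstArg.ReplaceAllLiteralString(seg, arg)
	})
}

func main() {
	fmt.Println(substituteFirst("Fix $1; the `$1` syntax is documented.", "bug-42"))
	// prints: Fix bug-42; the `$1` syntax is documented.
}
```

The replacement callback never sees the code span, so the pattern itself does not need to know anything about markdown.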
@@ -0,0 +1,313 @@
+package fences
+
+import (
+	"testing"
+)
+
+func TestRanges(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		want    [][2]int
+	}{
+		{
+			name:    "no fences",
+			content: "hello world\nno code here",
+			want:    nil,
+		},
+		{
+			name:    "single backtick fence",
+			content: "before\n```\ncode\n```\nafter",
+			want:    [][2]int{{7, 20}},
+		},
+		{
+			name:    "single tilde fence",
+			content: "before\n~~~\ncode\n~~~\nafter",
+			want:    [][2]int{{7, 20}},
+		},
+		{
+			name:    "fence with info string",
+			content: "before\n```go\ncode\n```\nafter",
+			want:    [][2]int{{7, 22}},
+		},
+		{
+			name:    "multiple fences",
+			content: "a\n```\nx\n```\nb\n~~~\ny\n~~~\nc",
+			want:    [][2]int{{2, 12}, {14, 24}},
+		},
+		{
+			name:    "unclosed fence",
+			content: "before\n```\ncode\nmore code",
+			want:    [][2]int{{7, 25}},
+		},
+		{
+			name:    "longer closing fence",
+			content: "before\n```\ncode\n`````\nafter",
+			want:    [][2]int{{7, 22}},
+		},
+		{
+			name:    "shorter closing fence ignored",
+			content: "before\n`````\ncode\n```\nmore\n`````\nafter",
+			want:    [][2]int{{7, 33}},
+		},
+		{
+			name:    "indented fence up to 3 spaces",
+			content: "before\n   ```\ncode\n   ```\nafter",
+			want:    [][2]int{{7, 26}},
+		},
+		{
+			name:    "4 space indent is not a fence",
+			content: "before\n    ```\ncode\n    ```\nafter",
+			want:    nil,
+		},
+		{
+			name: "backtick in info string rejects open",
+			// The ```foo`bar line is not a valid opener (backtick in info).
+			// The standalone ``` becomes an opener with no close.
+			content: "before\n```foo`bar\ncode\n```\nafter",
+			want:    [][2]int{{23, 32}},
+		},
+		{
+			name:    "empty content",
+			content: "",
+			want:    nil,
+		},
+		{
+			name:    "fence only",
+			content: "```\ncode\n```",
+			want:    [][2]int{{0, 12}},
+		},
+		{
+			name:    "fence at end without trailing newline",
+			content: "```\ncode\n```",
+			want:    [][2]int{{0, 12}},
+		},
+		{
+			name:    "tilde fence does not close with backticks",
+			content: "~~~\ncode\n```\nmore\n~~~\nafter",
+			want:    [][2]int{{0, 22}},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := Ranges(tt.content)
+			if len(got) != len(tt.want) {
+				t.Fatalf("Ranges() = %v, want %v", got, tt.want)
+			}
+			for i := range got {
+				if got[i] != tt.want[i] {
+					t.Errorf("Ranges()[%d] = %v, want %v", i, got[i], tt.want[i])
+				}
+			}
+		})
+	}
+}
+
+func TestReplaceOutside(t *testing.T) {
+	upper := func(s string) string {
+		b := []byte(s)
+		for i, c := range b {
+			if c >= 'a' && c <= 'z' {
+				b[i] = c - 32
+			}
+		}
+		return string(b)
+	}
+
+	tests := []struct {
+		name    string
+		content string
+		want    string
+	}{
+		{
+			name:    "no fences",
+			content: "hello world",
+			want:    "HELLO WORLD",
+		},
+		{
+			name:    "text around fence",
+			content: "before\n```\ncode\n```\nafter",
+			want:    "BEFORE\n```\ncode\n```\nAFTER",
+		},
+		{
+			name:    "multiple fences",
+			content: "aaa\n```\nxxx\n```\nbbb\n~~~\nyyy\n~~~\nccc",
+			want:    "AAA\n```\nxxx\n```\nBBB\n~~~\nyyy\n~~~\nCCC",
+		},
+		{
+			name:    "unclosed fence preserves code",
+			content: "before\n```\ncode",
+			want:    "BEFORE\n```\ncode",
+		},
+		{
+			name:    "only fenced content",
+			content: "```\ncode\n```",
+			want:    "```\ncode\n```",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := ReplaceOutside(tt.content, upper)
+			if got != tt.want {
+				t.Errorf("ReplaceOutside() =\n%s\nwant:\n%s", got, tt.want)
+			}
+		})
+	}
+}
+
+func TestStripFenced(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		want    string
+	}{
+		{
+			name:    "no fences",
+			content: "hello $1 world",
+			want:    "hello $1 world",
+		},
+		{
+			name:    "strips fenced code",
+			content: "before $1\n```\n$2 inside\n```\nafter $3",
+			want:    "before $1\nafter $3",
+		},
+		{
+			name:    "multiple fences",
+			content: "a\n```\nx\n```\nb\n~~~\ny\n~~~\nc",
+			want:    "a\nb\nc",
+		},
+		{
+			name:    "unclosed fence",
+			content: "before\n```\n$1 inside",
+			want:    "before\n",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := StripFenced(tt.content)
+			if got != tt.want {
+				t.Errorf("StripFenced() = %q, want %q", got, tt.want)
+			}
+		})
+	}
+}
+
+func TestInlineCodeRanges(t *testing.T) {
+	tests := []struct {
+		name string
+		s    string
+		want [][2]int
+	}{
+		{"no backticks", "hello world", nil},
+		{"single backtick span", "use `$1` here", [][2]int{{4, 8}}},
+		{"double backtick span", "use ``$1`` here", [][2]int{{4, 10}}},
+		{"multiple spans", "`$1` and `$2`", [][2]int{{0, 4}, {9, 13}}},
+		{"unmatched backtick", "use `$1 here", nil},
+		{"mismatched backtick counts", "use ``$1` here", nil},
+		{"empty inline content", "use `` `` here", [][2]int{{4, 9}}},
+		{"backticks inside double", "use ``foo`bar`` here", [][2]int{{4, 15}}},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := inlineCodeRanges(tt.s)
+			if len(got) != len(tt.want) {
+				t.Fatalf("inlineCodeRanges() = %v, want %v", got, tt.want)
+			}
+			for i := range got {
+				if got[i] != tt.want[i] {
+					t.Errorf("inlineCodeRanges()[%d] = %v, want %v", i, got[i], tt.want[i])
+				}
+			}
+		})
+	}
+}
+
+func TestReplaceOutside_InlineCode(t *testing.T) {
+	upper := func(s string) string {
+		b := []byte(s)
+		for i, c := range b {
+			if c >= 'a' && c <= 'z' {
+				b[i] = c - 32
+			}
+		}
+		return string(b)
+	}
+
+	tests := []struct {
+		name    string
+		content string
+		want    string
+	}{
+		{
+			name:    "inline code preserved",
+			content: "use `code` here",
+			want:    "USE `code` HERE",
+		},
+		{
+			name:    "double backtick inline code",
+			content: "use ``co`de`` here",
+			want:    "USE ``co`de`` HERE",
+		},
+		{
+			name:    "mixed fenced and inline",
+			content: "before `x` mid\n```\nfenced\n```\nafter `y` end",
+			want:    "BEFORE `x` MID\n```\nfenced\n```\nAFTER `y` END",
+		},
+		{
+			name:    "only inline code",
+			content: "`code`",
+			want:    "`code`",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := ReplaceOutside(tt.content, upper)
+			if got != tt.want {
+				t.Errorf("ReplaceOutside() =\n%s\nwant:\n%s", got, tt.want)
+			}
+		})
+	}
+}
+
+func TestStripCode(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		want    string
+	}{
+		{
+			name:    "no code",
+			content: "hello $1 world",
+			want:    "hello $1 world",
+		},
+		{
+			name:    "strips inline code",
+			content: "use `$1` and `$2` for positional args",
+			want:    "use  and  for positional args",
+		},
+		{
+			name:    "strips fenced and inline",
+			content: "before `$1`\n```\n$2 inside\n```\nafter",
+			want:    "before \nafter",
+		},
+		{
+			name:    "real world prompt template",
+			content: "Use $@ for all args.\n`$1`, `$2` for positional.\n```bash\necho $1\n```\n",
+			want:    "Use $@ for all args.\n,  for positional.\n",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			got := StripCode(tt.content)
+			if got != tt.want {
+				t.Errorf("StripCode() = %q, want %q", got, tt.want)
+			}
+		})
+	}
+}
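The byte offsets expected in `TestRanges` follow from a line-by-line scan. The sketch below is an assumption about how such a scanner could look, not the package's `Ranges` implementation (which is not shown in this part of the diff). `fenceOpen`, `fenceClose`, and `fencedRanges` are hypothetical names. It assumes a range covers the opening fence line through the closing fence line including its newline, and that an unclosed fence runs to the end of the input, as the test cases above suggest.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceOpen returns the fence character and run length if line opens a
// fenced block (at most three leading spaces), or (0, 0) otherwise.
func fenceOpen(line string) (byte, int) {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 || trimmed == "" {
		return 0, 0
	}
	ch := trimmed[0]
	if ch != '`' && ch != '~' {
		return 0, 0
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == ch {
		n++
	}
	// Backtick fences may not contain backticks in the info string.
	if n < 3 || (ch == '`' && strings.ContainsRune(trimmed[n:], '`')) {
		return 0, 0
	}
	return ch, n
}

// fenceClose reports whether line closes a fence opened with minCount ch's.
func fenceClose(line string, ch byte, minCount int) bool {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 {
		return false
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == ch {
		n++
	}
	return n >= minCount && strings.TrimRight(trimmed[n:], " ") == ""
}

// fencedRanges returns [start, end) byte ranges of fenced code blocks.
func fencedRanges(content string) [][2]int {
	var out [][2]int
	var ch byte
	var count, start int
	inFence := false
	for pos := 0; pos < len(content); {
		line, next := content[pos:], len(content)
		if nl := strings.IndexByte(content[pos:], '\n'); nl >= 0 {
			line, next = content[pos:pos+nl], pos+nl+1
		}
		if !inFence {
			if c, n := fenceOpen(line); n > 0 {
				inFence, ch, count, start = true, c, n, pos
			}
		} else if fenceClose(line, ch, count) {
			out = append(out, [2]int{start, next})
			inFence = false
		}
		pos = next
	}
	if inFence { // unclosed fence: treat the rest of the input as code
		out = append(out, [2]int{start, len(content)})
	}
	return out
}

func main() {
	fmt.Println(fencedRanges("before\n```\ncode\n```\nafter"))
	// prints: [[7 20]]
}
```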
@@ -33,6 +33,10 @@ type AgentSetupOptions struct {
 	// CoreTools overrides the default core tool set. If empty, core.AllTools()
 	// is used. Allows SDK users to pass custom tools (e.g. with WithWorkDir).
 	CoreTools []fantasy.AgentTool
+	// DisableCoreTools, when true, prevents loading any core tools.
+	// If DisableCoreTools is true and CoreTools is empty, the agent
+	// will have no tools (useful for simple chat completions).
+	DisableCoreTools bool
 	// ExtraTools are additional tools added alongside core, MCP, and extension
 	// tools. They do not replace the defaults — they extend them.
 	ExtraTools []fantasy.AgentTool
@@ -61,6 +65,13 @@ type AgentSetupOptions struct {
 	// AuthHandler handles OAuth authorization for remote MCP servers.
 	// When set, remote transports are configured with OAuth support.
 	AuthHandler tools.MCPAuthHandler
+	// TokenStoreFactory, if non-nil, creates a custom token store for each
+	// remote MCP server's OAuth tokens. When nil, the default file-based
+	// token store is used.
+	TokenStoreFactory tools.TokenStoreFactory
+	// OnMCPServerLoaded, if non-nil, is called when each MCP server finishes
+	// loading (successfully or with error). Called from the background goroutine.
+	OnMCPServerLoaded func(serverName string, toolCount int, err error)
 }

 // AgentSetupResult bundles the created agent and any debug logger so the caller
@@ -75,15 +86,17 @@ type AgentSetupResult struct {

 // BuildProviderConfig creates a *models.ProviderConfig from the current viper
 // state. All entry points (root, script, SDK) converge through this function.
+//
+// Generation parameter pointers (Temperature, TopP, etc.) are only set when
+// the user has explicitly configured them via CLI flag, environment variable,
+// or global config file. This allows per-model defaults from modelSettings
+// and customModels to fill in unset parameters downstream.
 func BuildProviderConfig() (*models.ProviderConfig, string, error) {
 	systemPrompt, err := config.LoadSystemPrompt(viper.GetString("system-prompt"))
 	if err != nil {
 		return nil, "", fmt.Errorf("failed to load system prompt: %w", err)
 	}

-	temperature := float32(viper.GetFloat64("temperature"))
-	topP := float32(viper.GetFloat64("top-p"))
-	topK := int32(viper.GetInt("top-k"))
 	numGPU := int32(viper.GetInt("num-gpu-layers"))
 	mainGPU := int32(viper.GetInt("main-gpu"))

@@ -93,9 +106,6 @@ func BuildProviderConfig() (*models.ProviderConfig, string, error) {
 		ProviderAPIKey: viper.GetString("provider-api-key"),
 		ProviderURL:    viper.GetString("provider-url"),
 		MaxTokens:      viper.GetInt("max-tokens"),
-		Temperature:    &temperature,
-		TopP:           &topP,
-		TopK:           &topK,
 		StopSequences:  viper.GetStringSlice("stop-sequences"),
 		NumGPU:         &numGPU,
 		MainGPU:        &mainGPU,
@@ -103,6 +113,30 @@ func BuildProviderConfig() (*models.ProviderConfig, string, error) {
 		ThinkingLevel:  models.ParseThinkingLevel(viper.GetString("thinking-level")),
 	}

+	// Only set generation parameter pointers when the user has explicitly
+	// provided a value. This leaves nil pointers for unset params, allowing
+	// per-model defaults (modelSettings / customModels params) to apply.
+	if viper.IsSet("temperature") {
+		v := float32(viper.GetFloat64("temperature"))
+		cfg.Temperature = &v
+	}
+	if viper.IsSet("top-p") {
+		v := float32(viper.GetFloat64("top-p"))
+		cfg.TopP = &v
+	}
+	if viper.IsSet("top-k") {
+		v := int32(viper.GetInt("top-k"))
+		cfg.TopK = &v
+	}
+	if viper.IsSet("frequency-penalty") {
+		v := float32(viper.GetFloat64("frequency-penalty"))
+		cfg.FrequencyPenalty = &v
+	}
+	if viper.IsSet("presence-penalty") {
+		v := float32(viper.GetFloat64("presence-penalty"))
+		cfg.PresencePenalty = &v
+	}
+
 	return cfg, systemPrompt, nil
 }

@@ -179,19 +213,22 @@ func SetupAgent(ctx context.Context, opts AgentSetupOptions) (*AgentSetupResult,
 	}

 	a, err := agent.CreateAgent(ctx, &agent.AgentCreationOptions{
-		ModelConfig:      modelConfig,
-		MCPConfig:        opts.MCPConfig,
-		SystemPrompt:     systemPrompt,
-		MaxSteps:         maxSteps,
-		StreamingEnabled: streamingEnabled,
-		ShowSpinner:      opts.ShowSpinner,
-		Quiet:            opts.Quiet,
-		SpinnerFunc:      opts.SpinnerFunc,
-		DebugLogger:      debugLogger,
-		AuthHandler:      opts.AuthHandler,
-		CoreTools:        opts.CoreTools,
-		ToolWrapper:      toolWrapper,
-		ExtraTools:       extraTools,
+		ModelConfig:       modelConfig,
+		MCPConfig:         opts.MCPConfig,
+		SystemPrompt:      systemPrompt,
+		MaxSteps:          maxSteps,
+		StreamingEnabled:  streamingEnabled,
+		ShowSpinner:       opts.ShowSpinner,
+		Quiet:             opts.Quiet,
+		SpinnerFunc:       opts.SpinnerFunc,
+		DebugLogger:       debugLogger,
+		AuthHandler:       opts.AuthHandler,
+		TokenStoreFactory: opts.TokenStoreFactory,
+		CoreTools:         opts.CoreTools,
+		DisableCoreTools:  opts.DisableCoreTools,
+		ToolWrapper:       toolWrapper,
+		ExtraTools:        extraTools,
+		OnMCPServerLoaded: opts.OnMCPServerLoaded,
 	})
 	if err != nil {
 		return nil, fmt.Errorf("failed to create agent: %w", err)
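The switch from always-populated pointers to "set only when configured" is what lets a nil pointer mean "no user value". A minimal sketch of that pattern follows, with a plain map standing in for viper. The `config` type, the `build` function, and the map-based lookup are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the provider config. A nil pointer
// means "not set by the user", which is distinct from an explicit zero.
type config struct {
	Temperature *float32
	TopK        *int32
}

// build copies only the keys present in explicit. The presence check plays
// the role that viper.IsSet plays in BuildProviderConfig.
func build(explicit map[string]float64) config {
	var cfg config
	if v, ok := explicit["temperature"]; ok {
		t := float32(v)
		cfg.Temperature = &t
	}
	if v, ok := explicit["top-k"]; ok {
		k := int32(v)
		cfg.TopK = &k
	}
	return cfg
}

func main() {
	cfg := build(map[string]float64{"temperature": 0})
	// An explicit 0 survives as a non-nil pointer; top-k stays nil.
	fmt.Println(cfg.Temperature != nil, *cfg.Temperature, cfg.TopK == nil)
	// prints: true 0 true
}
```

With the previous code, an unset temperature and an explicit `0` were indistinguishable downstream, so per-model defaults had no safe way to fill the gap.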
@@ -2,6 +2,8 @@ package models

 import (
 	"log"
+	"os"
+	"strings"

 	"github.com/spf13/viper"
 )
@@ -31,7 +33,7 @@ func loadCustomModelsFromConfig() map[string]ModelInfo {

 // modelConfigToModelInfo converts a CustomModelConfig to a ModelInfo.
 func modelConfigToModelInfo(modelID string, cfg CustomModelConfig) ModelInfo {
-	return ModelInfo{
+	info := ModelInfo{
 		ID:          modelID,
 		Name:        cfg.Name,
 		Attachment:  cfg.Attachment,
@@ -48,21 +50,242 @@ func modelConfigToModelInfo(modelID string, cfg CustomModelConfig) ModelInfo {
 			Output:  cfg.Limit.Output,
 		},
 	}
+
+	// Convert custom model generation params if any are set.
+	if p := convertGenerationParams(cfg.Params); p != nil {
+		info.Params = p
+	}
+
+	return info
+}
+
+// LoadModelSettingsFromConfig loads per-model generation parameter overrides
+// from the config file. Keys are "provider/model" strings. Returns nil if
+// no model settings are configured.
+func LoadModelSettingsFromConfig() map[string]*GenerationParams {
+	if !viper.IsSet("modelSettings") {
+		return nil
+	}
+
+	var settings map[string]GenerationParamsConfig
+	if err := viper.UnmarshalKey("modelSettings", &settings); err != nil {
+		log.Printf("Warning: Failed to parse modelSettings: %v", err)
+		return nil
+	}
+
+	result := make(map[string]*GenerationParams, len(settings))
+	for modelKey, cfg := range settings {
+		if p := convertGenerationParams(cfg); p != nil {
+			result[modelKey] = p
+		}
+	}
+
+	return result
+}
+
+// convertGenerationParams converts a GenerationParamsConfig to a GenerationParams.
+// Returns nil if no parameters are set.
+func convertGenerationParams(cfg GenerationParamsConfig) *GenerationParams {
+	p := &GenerationParams{}
+	hasAny := false
+
+	if cfg.MaxTokens != nil {
+		p.MaxTokens = cfg.MaxTokens
+		hasAny = true
+	}
+	if cfg.Temperature != nil {
+		p.Temperature = cfg.Temperature
+		hasAny = true
+	}
+	if cfg.TopP != nil {
+		p.TopP = cfg.TopP
+		hasAny = true
+	}
+	if cfg.TopK != nil {
+		p.TopK = cfg.TopK
+		hasAny = true
+	}
+	if cfg.FrequencyPenalty != nil {
+		p.FrequencyPenalty = cfg.FrequencyPenalty
+		hasAny = true
+	}
+	if cfg.PresencePenalty != nil {
+		p.PresencePenalty = cfg.PresencePenalty
+		hasAny = true
+	}
+	if len(cfg.StopSequences) > 0 {
+		p.StopSequences = cfg.StopSequences
+		hasAny = true
+	}
+	if cfg.ThinkingLevel != "" {
+		p.ThinkingLevel = ParseThinkingLevel(cfg.ThinkingLevel)
+		hasAny = true
+	}
+	if cfg.SystemPrompt != "" {
+		p.SystemPrompt = cfg.SystemPrompt
+		hasAny = true
+	}
+
+	if !hasAny {
+		return nil
+	}
+	return p
+}
+
+// ApplyModelSettings merges per-model generation parameter defaults from the
+// registry into a ProviderConfig. Model-level params are only applied for
+// fields where the user has not explicitly set a value (i.e., the
+// corresponding viper key is not set via CLI flag or global config).
+//
+// The lookup order is:
+//  1. modelSettings["provider/model"] from config (highest model-level priority)
+//  2. ModelInfo.Params from custom model definitions
+//
+// Both are overridden by explicit CLI flags / global config values.
+func ApplyModelSettings(config *ProviderConfig, modelInfo *ModelInfo) {
+	provider, modelName, err := ParseModelString(config.ModelString)
+	if err != nil {
+		return
+	}
+
+	// Collect model-level params: modelSettings override > custom model params.
+	// modelSettings takes priority because it's the more specific/intentional config.
+	var params *GenerationParams
+
+	// First check modelSettings from config.
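+	if settings := LoadModelSettingsFromConfig(); settings != nil {
+		modelKey := provider + "/" + modelName
+		p, ok := settings[modelKey]
+		if !ok {
+			// viper treats keys case-insensitively and may lowercase map
+			// keys, so retry with a lowercased key.
+			p, ok = settings[strings.ToLower(modelKey)]
+		}
+		if ok {
+			params = p
+		}
+	}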
+
+	// Fall back to ModelInfo.Params only when modelSettings has no entry (no per-field merge).
+	if params == nil && modelInfo != nil && modelInfo.Params != nil {
+		params = modelInfo.Params
+	}
+
+	if params == nil {
+		return
+	}
+
+	// Apply each parameter only when the user hasn't explicitly set it.
+	// isExplicitlySet wraps viper.IsSet, which is true when the key has a value
+	// from a CLI flag, environment variable, config file, or registered default.
+
+	if params.MaxTokens != nil && !isExplicitlySet("max-tokens") {
+		config.MaxTokens = *params.MaxTokens
+	}
+	if params.Temperature != nil && !isExplicitlySet("temperature") {
+		config.Temperature = params.Temperature
+	}
+	if params.TopP != nil && !isExplicitlySet("top-p") {
+		config.TopP = params.TopP
+	}
+	if params.TopK != nil && !isExplicitlySet("top-k") {
+		config.TopK = params.TopK
+	}
+	if params.FrequencyPenalty != nil && !isExplicitlySet("frequency-penalty") {
+		config.FrequencyPenalty = params.FrequencyPenalty
+	}
+	if params.PresencePenalty != nil && !isExplicitlySet("presence-penalty") {
+		config.PresencePenalty = params.PresencePenalty
+	}
+	if len(params.StopSequences) > 0 && !isExplicitlySet("stop-sequences") {
+		config.StopSequences = params.StopSequences
+	}
+	if params.ThinkingLevel != "" && !isExplicitlySet("thinking-level") {
+		config.ThinkingLevel = params.ThinkingLevel
+	}
+	if params.SystemPrompt != "" && config.SystemPrompt == "" {
+		// Resolve file paths: if the value points to an existing file, read it.
+		// We check config.SystemPrompt == "" rather than isExplicitlySet because
+		// viper.BindPFlag causes IsSet to return true even for unset flags.
+		config.SystemPrompt = LoadSystemPromptValue(params.SystemPrompt)
+	}
+}
+
+// LoadSystemPromptValue resolves a system prompt value that may be either
+// inline text or a file path. If the value is a path to an existing file,
+// its contents are read and returned. Otherwise the string is returned as-is.
+// This mirrors config.LoadSystemPrompt but lives in the models package to
+// avoid circular dependencies.
+func LoadSystemPromptValue(input string) string {
+	if input == "" {
+		return ""
+	}
+	if info, err := os.Stat(input); err == nil && !info.IsDir() {
+		content, err := os.ReadFile(input)
+		if err != nil {
+			log.Printf("Warning: failed to read system prompt file %q: %v", input, err)
+			return input
+		}
+		return strings.TrimSpace(string(content))
+	}
+	return input
+}
+
+// isExplicitlySet reports whether the user has set a config key via CLI
+// flag, environment variable, or the global section of the config file.
+// Model-level defaults should not override explicitly set values.
+func isExplicitlySet(key string) bool {
+	// viper.IsSet returns true if the key has a value in any of viper's
+	// data stores (flag, env, config file, default), so a key with a
+	// default registered through viper.SetDefault also counts as set
+	// here. For generation params, the global config keys use
+	// hyphenated names (e.g. "max-tokens", "top-p").
+	//
+	// Since viper merges all sources, IsSet returns true for config
+	// file values too. This means global config file values (e.g.
+	// temperature: 0.7 at the top level) correctly take precedence
+	// over model-level defaults, which is the desired behavior.
+	return viper.IsSet(key)
+}
+
+// GenerationParams holds per-model generation parameter defaults.
+// These are stored on ModelInfo and applied during provider creation.
+// Nil pointer fields mean "no model-level default" — the global config
+// or CLI flag value (if any) will be used instead.
+type GenerationParams struct {
+	MaxTokens        *int
+	Temperature      *float32
+	TopP             *float32
+	TopK             *int32
+	FrequencyPenalty *float32
+	PresencePenalty  *float32
+	StopSequences    []string
+	ThinkingLevel    ThinkingLevel
+	SystemPrompt     string // Per-model system prompt (inline text or file path)
 }

 // CustomModelConfig defines a custom model configuration loaded from the config file.
 // This is a duplicate here to avoid circular dependencies with internal/config.
 type CustomModelConfig struct {
-	Name        string      `json:"name" yaml:"name"`
-	BaseURL     string      `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
-	APIKey      string      `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
-	Family      string      `json:"family,omitempty" yaml:"family,omitempty"`
-	Attachment  bool        `json:"attachment,omitempty" yaml:"attachment,omitempty"`
-	Reasoning   bool        `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
-	Temperature bool        `json:"temperature,omitempty" yaml:"temperature,omitempty"`
-	Knowledge   string      `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
-	Cost        CostConfig  `json:"cost" yaml:"cost"`
-	Limit       LimitConfig `json:"limit" yaml:"limit"`
+	Name        string                 `json:"name" yaml:"name"`
+	BaseURL     string                 `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
+	APIKey      string                 `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
+	Family      string                 `json:"family,omitempty" yaml:"family,omitempty"`
+	Attachment  bool                   `json:"attachment,omitempty" yaml:"attachment,omitempty"`
+	Reasoning   bool                   `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
+	Temperature bool                   `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	Knowledge   string                 `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
+	Cost        CostConfig             `json:"cost" yaml:"cost"`
+	Limit       LimitConfig            `json:"limit" yaml:"limit"`
+	Params      GenerationParamsConfig `json:"params,omitzero" yaml:"params,omitempty"`
+}
+
+// GenerationParamsConfig is the JSON/YAML-serializable form of generation
+// parameter defaults. Used in both customModels[].params and modelSettings[].
+type GenerationParamsConfig struct {
+	MaxTokens        *int     `json:"maxTokens,omitempty" yaml:"maxTokens,omitempty"`
+	Temperature      *float32 `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	TopP             *float32 `json:"topP,omitempty" yaml:"topP,omitempty"`
+	TopK             *int32   `json:"topK,omitempty" yaml:"topK,omitempty"`
+	FrequencyPenalty *float32 `json:"frequencyPenalty,omitempty" yaml:"frequencyPenalty,omitempty"`
+	PresencePenalty  *float32 `json:"presencePenalty,omitempty" yaml:"presencePenalty,omitempty"`
+	StopSequences    []string `json:"stopSequences,omitempty" yaml:"stopSequences,omitempty"`
+	ThinkingLevel    string   `json:"thinkingLevel,omitempty" yaml:"thinkingLevel,omitempty"`
+	SystemPrompt     string   `json:"systemPrompt,omitempty" yaml:"systemPrompt,omitempty"`
 }

 // CostConfig defines the pricing for a custom model.
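The precedence described for `ApplyModelSettings` (explicit user values first, then a `modelSettings` entry, then custom-model params) can be sketched without viper. This is a simplified illustration, not the function itself: `params`, `resolve`, `f32`, and `i32` are hypothetical, and "explicitly set" is modelled here as a non-nil pointer rather than a `viper.IsSet` check.

```go
package main

import "fmt"

// params is a hypothetical, trimmed-down stand-in for GenerationParams.
type params struct {
	Temperature *float32
	TopK        *int32
}

// resolve picks one model-level source: a modelSettings entry wins outright,
// and custom-model params are consulted only when there is no such entry.
// Fields the user set explicitly are never overwritten.
func resolve(explicit params, modelSettings, customModel *params) params {
	src := modelSettings
	if src == nil {
		src = customModel
	}
	out := explicit
	if src == nil {
		return out
	}
	if out.Temperature == nil {
		out.Temperature = src.Temperature
	}
	if out.TopK == nil {
		out.TopK = src.TopK
	}
	return out
}

func f32(v float32) *float32 { return &v }
func i32(v int32) *int32     { return &v }

func main() {
	explicit := params{Temperature: f32(0.3)}
	settings := &params{Temperature: f32(0.5)}
	custom := &params{Temperature: f32(0.8), TopK: i32(50)}

	got := resolve(explicit, settings, custom)
	// The explicit temperature wins. TopK stays nil because the
	// modelSettings entry replaces the custom-model params as a whole.
	fmt.Println(*got.Temperature, got.TopK == nil)
	// prints: 0.3 true
}
```

The whole-entry fallback is the notable design point: a `modelSettings` entry that sets only `temperature` also hides a `topK` defined on the custom model.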
@@ -0,0 +1,422 @@
+package models
+
+import (
+	"os"
+	"testing"
+
+	"github.com/spf13/viper"
+)
+
+func TestConvertGenerationParams(t *testing.T) {
+	t.Run("empty config returns nil", func(t *testing.T) {
+		cfg := GenerationParamsConfig{}
+		p := convertGenerationParams(cfg)
+		if p != nil {
+			t.Errorf("expected nil, got %+v", p)
+		}
+	})
+
+	t.Run("temperature only", func(t *testing.T) {
+		temp := float32(0.7)
+		cfg := GenerationParamsConfig{Temperature: &temp}
+		p := convertGenerationParams(cfg)
+		if p == nil {
+			t.Fatal("expected non-nil")
+		}
+		if p.Temperature == nil || *p.Temperature != 0.7 {
+			t.Errorf("expected temperature 0.7, got %v", p.Temperature)
+		}
+		if p.TopP != nil {
+			t.Errorf("expected nil TopP, got %v", p.TopP)
+		}
+	})
+
+	t.Run("all params set", func(t *testing.T) {
+		maxTokens := 8192
+		temp := float32(0.5)
+		topP := float32(0.9)
+		topK := int32(50)
+		freqPenalty := float32(0.1)
+		presPenalty := float32(0.2)
+		cfg := GenerationParamsConfig{
+			MaxTokens:        &maxTokens,
+			Temperature:      &temp,
+			TopP:             &topP,
+			TopK:             &topK,
+			FrequencyPenalty: &freqPenalty,
+			PresencePenalty:  &presPenalty,
+			StopSequences:    []string{"STOP"},
+			ThinkingLevel:    "high",
+		}
+		p := convertGenerationParams(cfg)
+		if p == nil {
+			t.Fatal("expected non-nil")
+		}
+		if p.MaxTokens == nil || *p.MaxTokens != 8192 {
+			t.Errorf("expected maxTokens 8192, got %v", p.MaxTokens)
+		}
+		if p.Temperature == nil || *p.Temperature != 0.5 {
+			t.Errorf("expected temperature 0.5, got %v", p.Temperature)
+		}
+		if p.TopP == nil || *p.TopP != 0.9 {
+			t.Errorf("expected topP 0.9, got %v", p.TopP)
+		}
+		if p.TopK == nil || *p.TopK != 50 {
+			t.Errorf("expected topK 50, got %v", p.TopK)
+		}
+		if p.FrequencyPenalty == nil || *p.FrequencyPenalty != 0.1 {
+			t.Errorf("expected frequencyPenalty 0.1, got %v", p.FrequencyPenalty)
+		}
+		if p.PresencePenalty == nil || *p.PresencePenalty != 0.2 {
+			t.Errorf("expected presencePenalty 0.2, got %v", p.PresencePenalty)
+		}
+		if len(p.StopSequences) != 1 || p.StopSequences[0] != "STOP" {
+			t.Errorf("expected stop sequences [STOP], got %v", p.StopSequences)
+		}
+		if p.ThinkingLevel != ThinkingHigh {
+			t.Errorf("expected thinking level high, got %v", p.ThinkingLevel)
+		}
+	})
+
+	t.Run("thinking level parsing", func(t *testing.T) {
+		cfg := GenerationParamsConfig{ThinkingLevel: "medium"}
+		p := convertGenerationParams(cfg)
+		if p == nil {
+			t.Fatal("expected non-nil")
+		}
+		if p.ThinkingLevel != ThinkingMedium {
+			t.Errorf("expected thinking level medium, got %v", p.ThinkingLevel)
+		}
+	})
+
+	t.Run("system prompt only", func(t *testing.T) {
+		cfg := GenerationParamsConfig{SystemPrompt: "You are helpful."}
+		p := convertGenerationParams(cfg)
+		if p == nil {
+			t.Fatal("expected non-nil")
+		}
+		if p.SystemPrompt != "You are helpful." {
+			t.Errorf("expected system prompt, got %q", p.SystemPrompt)
+		}
+	})
+}
+
+func TestModelConfigToModelInfoWithParams(t *testing.T) {
+	temp := float32(0.8)
+	topP := float32(0.95)
+	cfg := CustomModelConfig{
+		Name:        "Test Model",
+		BaseURL:     "http://localhost:8080/v1",
+		Temperature: true,
+		Params: GenerationParamsConfig{
+			Temperature: &temp,
+			TopP:        &topP,
+		},
+	}
+
+	info := modelConfigToModelInfo("test-model", cfg)
+
+	if info.Params == nil {
+		t.Fatal("expected non-nil Params")
+	}
+	if info.Params.Temperature == nil || *info.Params.Temperature != 0.8 {
+		t.Errorf("expected temperature 0.8, got %v", info.Params.Temperature)
+	}
+	if info.Params.TopP == nil || *info.Params.TopP != 0.95 {
+		t.Errorf("expected topP 0.95, got %v", info.Params.TopP)
+	}
+}
+
+func TestModelConfigToModelInfoWithoutParams(t *testing.T) {
+	cfg := CustomModelConfig{
+		Name:    "Test Model",
+		BaseURL: "http://localhost:8080/v1",
+	}
+
+	info := modelConfigToModelInfo("test-model", cfg)
+
+	if info.Params != nil {
+		t.Errorf("expected nil Params, got %+v", info.Params)
+	}
+}
+
+func TestApplyModelSettings(t *testing.T) {
+	// Save and restore viper state.
+	originalViper := viper.AllSettings()
+	defer func() {
+		viper.Reset()
+		for k, v := range originalViper {
+			viper.Set(k, v)
+		}
+	}()
+
+	t.Run("applies model params when not explicitly set", func(t *testing.T) {
+		viper.Reset()
+
+		temp := float32(0.8)
+		topK := int32(50)
+		maxTokens := 4096
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				Temperature: &temp,
+				TopK:        &topK,
+				MaxTokens:   &maxTokens,
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.Temperature == nil || *config.Temperature != 0.8 {
+			t.Errorf("expected temperature 0.8, got %v", config.Temperature)
+		}
+		if config.TopK == nil || *config.TopK != 50 {
+			t.Errorf("expected topK 50, got %v", config.TopK)
+		}
+		if config.MaxTokens != 4096 {
+			t.Errorf("expected maxTokens 4096, got %d", config.MaxTokens)
+		}
+	})
+
+	t.Run("explicit viper values take precedence", func(t *testing.T) {
+		viper.Reset()
+		viper.Set("temperature", 0.3)
+
+		temp := float32(0.8)
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				Temperature: &temp,
+			},
+		}
+
+		explicitTemp := float32(0.3)
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+			Temperature: &explicitTemp,
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		// Temperature should NOT be overridden because it's explicitly set in viper
+		if config.Temperature == nil || *config.Temperature != 0.3 {
+			t.Errorf("expected temperature 0.3 (explicit), got %v", config.Temperature)
+		}
+	})
+
+	t.Run("nil model info is safe", func(t *testing.T) {
+		viper.Reset()
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		// Should not panic
+		ApplyModelSettings(config, nil)
+
+		if config.Temperature != nil {
+			t.Errorf("expected nil temperature, got %v", config.Temperature)
+		}
+	})
+
+	t.Run("model info without params is safe", func(t *testing.T) {
+		viper.Reset()
+
+		modelInfo := &ModelInfo{ID: "test-model"}
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.Temperature != nil {
+			t.Errorf("expected nil temperature, got %v", config.Temperature)
+		}
+	})
+
+	t.Run("modelSettings from viper takes priority over ModelInfo.Params", func(t *testing.T) {
+		viper.Reset()
+
+		// Set up modelSettings in viper (simulating config file)
+		viper.Set("modelSettings", map[string]any{
+			"custom/test-model": map[string]any{
+				"temperature": 0.5,
+				"topK":        30,
+			},
+		})
+
+		// ModelInfo has different params
+		temp := float32(0.8)
+		topK := int32(50)
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				Temperature: &temp,
+				TopK:        &topK,
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		// modelSettings should win over ModelInfo.Params
+		if config.Temperature == nil || *config.Temperature != 0.5 {
+			t.Errorf("expected temperature 0.5 (from modelSettings), got %v", config.Temperature)
+		}
+		if config.TopK == nil || *config.TopK != 30 {
+			t.Errorf("expected topK 30 (from modelSettings), got %v", config.TopK)
+		}
+	})
+
+	t.Run("stop sequences applied from model params", func(t *testing.T) {
+		viper.Reset()
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				StopSequences: []string{"STOP", "END"},
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if len(config.StopSequences) != 2 || config.StopSequences[0] != "STOP" {
+			t.Errorf("expected stop sequences [STOP END], got %v", config.StopSequences)
+		}
+	})
+
+	t.Run("thinking level applied from model params", func(t *testing.T) {
+		viper.Reset()
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				ThinkingLevel: ThinkingHigh,
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.ThinkingLevel != ThinkingHigh {
+			t.Errorf("expected thinking level high, got %v", config.ThinkingLevel)
+		}
+	})
+
+	t.Run("system prompt applied from model params", func(t *testing.T) {
+		viper.Reset()
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				SystemPrompt: "You are a coding assistant.",
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.SystemPrompt != "You are a coding assistant." {
+			t.Errorf("expected system prompt to be set, got %q", config.SystemPrompt)
+		}
+	})
+
+	t.Run("explicit system prompt takes precedence", func(t *testing.T) {
+		viper.Reset()
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				SystemPrompt: "Model-specific prompt",
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString:  "custom/test-model",
+			SystemPrompt: "Global prompt",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		// Global system prompt should NOT be overridden because config
+		// already has a non-empty SystemPrompt.
+		if config.SystemPrompt != "Global prompt" {
+			t.Errorf("expected global prompt preserved, got %q", config.SystemPrompt)
+		}
+	})
+
+	t.Run("system prompt from file path", func(t *testing.T) {
+		viper.Reset()
+
+		// Create a temp file with a system prompt
+		tmpFile, err := os.CreateTemp("", "kit-test-prompt-*.txt")
+		if err != nil {
+			t.Fatal(err)
+		}
+		defer func() { _ = os.Remove(tmpFile.Name()) }()
+		if _, err := tmpFile.WriteString("  Prompt from file  "); err != nil {
+			t.Fatal(err)
+		}
+		_ = tmpFile.Close()
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				SystemPrompt: tmpFile.Name(),
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.SystemPrompt != "Prompt from file" {
+			t.Errorf("expected trimmed file content, got %q", config.SystemPrompt)
+		}
+	})
+
+	t.Run("modelSettings system prompt overrides custom model params", func(t *testing.T) {
+		viper.Reset()
+
+		viper.Set("modelSettings", map[string]any{
+			"custom/test-model": map[string]any{
+				"systemPrompt": "From modelSettings",
+			},
+		})
+
+		modelInfo := &ModelInfo{
+			ID: "test-model",
+			Params: &GenerationParams{
+				SystemPrompt: "From custom model",
+			},
+		}
+
+		config := &ProviderConfig{
+			ModelString: "custom/test-model",
+		}
+
+		ApplyModelSettings(config, modelInfo)
+
+		if config.SystemPrompt != "From modelSettings" {
+			t.Errorf("expected modelSettings prompt, got %q", config.SystemPrompt)
+		}
+	})
+}
@@ -143,20 +143,22 @@ func ParseThinkingLevel(s string) ThinkingLevel {

 // ProviderConfig holds configuration for creating LLM providers.
 type ProviderConfig struct {
-	ModelString    string
-	SystemPrompt   string
-	ProviderAPIKey string
-	ProviderURL    string
-	MaxTokens      int
-	Temperature    *float32
-	TopP           *float32
-	TopK           *int32
-	StopSequences  []string
-	NumGPU         *int32
-	MainGPU        *int32
-	TLSSkipVerify  bool
-	ThinkingLevel  ThinkingLevel
-	DisableCaching bool // Opt-out: set to true to disable automatic prompt caching
+	ModelString      string
+	SystemPrompt     string
+	ProviderAPIKey   string
+	ProviderURL      string
+	MaxTokens        int
+	Temperature      *float32
+	TopP             *float32
+	TopK             *int32
+	FrequencyPenalty *float32
+	PresencePenalty  *float32
+	StopSequences    []string
+	NumGPU           *int32
+	MainGPU          *int32
+	TLSSkipVerify    bool
+	ThinkingLevel    ThinkingLevel
+	DisableCaching   bool // Opt-out: set to true to disable automatic prompt caching
 }

 // ProviderResult contains the result of provider creation.
@@ -239,6 +241,11 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
 		validateModelConfig(config, modelInfo)
 	}

+	// Apply per-model generation parameter defaults. Model-level params are
+	// only applied for fields where the user hasn't explicitly set a value
+	// via CLI flag or global config.
+	ApplyModelSettings(config, modelInfo)
+
 	// Create the base provider
 	var result *ProviderResult
 	var createErr error
@@ -1164,6 +1171,12 @@ func buildOllamaOptions(config *ProviderConfig) map[string]any {
 	if config.TopK != nil {
 		options["top_k"] = int(*config.TopK)
 	}
+	if config.FrequencyPenalty != nil {
+		options["frequency_penalty"] = *config.FrequencyPenalty
+	}
+	if config.PresencePenalty != nil {
+		options["presence_penalty"] = *config.PresencePenalty
+	}
 	if len(config.StopSequences) > 0 {
 		options["stop"] = config.StopSequences
 	}
@@ -26,6 +26,11 @@ type ModelInfo struct {
 	ProviderNPM string // Model-specific provider npm override (e.g. "@ai-sdk/anthropic")
 	BaseURL     string // Per-model base URL override (custom models only)
 	APIKey      string // Per-model API key override (custom models only)
+
+	// Params holds per-model generation parameter defaults. These are applied
+	// when the user hasn't explicitly set the corresponding CLI flag or global
+	// config value. Nil pointer fields mean "no model-level default".
+	Params *GenerationParams
 }

 // SupportsCaching returns true if this model family supports prompt caching.
@@ -236,6 +241,18 @@ func (r *ModelsRegistry) LookupModel(provider, modelID string) *ModelInfo {
 	return &modelInfo
 }

+// LookupModelForSettings is a convenience function that parses a
+// "provider/model" string and looks up the ModelInfo in the global registry.
+// Returns nil when the model string is invalid or the model is unknown.
+// Used by Kit.SetModel to pre-apply per-model settings before CreateProvider.
+func LookupModelForSettings(modelString string) *ModelInfo {
+	provider, modelName, err := ParseModelString(modelString)
+	if err != nil {
+		return nil
+	}
+	return GetGlobalRegistry().LookupModel(provider, modelName)
+}
+
 // getRequiredEnvVars returns the required environment variables for a provider.
 func (r *ModelsRegistry) getRequiredEnvVars(provider string) ([]string, error) {
 	providerInfo, exists := r.providers[provider]
@@ -2,11 +2,10 @@ package prompts

 import (
 	"fmt"
+	"log"
 	"os"
 	"path/filepath"
 	"strings"
-
-	"github.com/charmbracelet/log"
 )

 // LoadOptions configures how templates are discovered and loaded.
@@ -74,10 +73,7 @@ func LoadAll(opts LoadOptions) ([]*PromptTemplate, []Diagnostic, error) {
 					DroppedPath: tpl.FilePath,
 					Reason:      fmt.Sprintf("template from %s overridden by %s", source, existing.Source),
 				})
-				log.Debug("template collision",
-					"name", tpl.Name,
-					"dropped", tpl.FilePath,
-					"kept", existing.FilePath)
+				log.Printf("DEBUG template collision: name=%s dropped=%s kept=%s", tpl.Name, tpl.FilePath, existing.FilePath)
 			} else {
 				tpl.Source = source
 				seen[tpl.Name] = tpl
@@ -7,10 +7,12 @@ import (
 	"regexp"
 	"strconv"
 	"strings"
+
+	"github.com/mark3labs/kit/internal/fences"
 )

 // PromptTemplate is a named prompt template with shell-style argument placeholders.
-// It supports Pi-style $1, $2, $@, $ARGUMENTS, ${@:N}, ${@:N:L} syntax.
+// It supports Pi-style $1, $2, $@, $+, $ARGUMENTS, ${@:N}, ${@:N:L} syntax.
 type PromptTemplate struct {
 	// Name is the human-readable identifier for this template.
 	Name string
@@ -120,19 +122,28 @@ func ParseCommandArgs(input string) []string {

 // argPlaceholder matches shell-style argument placeholders:
 //   - $1, $2, etc. - positional arguments
-//   - $@ - all arguments
+//   - $@ - all arguments (zero or more)
+//   - $+ - all arguments (one or more required)
 //   - $ARGUMENTS - all arguments (alias for $@)
 //   - ${@:N} - arguments from N onwards
 //   - ${@:N:L} - L arguments starting from N
-var argPlaceholder = regexp.MustCompile(`\$\{(\d+)\}|\$\{(\d+):(\d+)\}|\$\{ARGUMENTS\}|\$\{@(:\d+)?(:\d+)?\}|\$(\d+)|\$@|\$ARGUMENTS`)
+var argPlaceholder = regexp.MustCompile(`\$\{(\d+)\}|\$\{(\d+):(\d+)\}|\$\{ARGUMENTS\}|\$\{@(:\d+)?(:\d+)?\}|\$(\d+)|\$@|\$\+|\$ARGUMENTS`)

 // SubstituteArgs replaces argument placeholders in content with values from args.
 // Supported placeholders:
 //   - $N, ${N} - the Nth argument (1-indexed)
-//   - $@, $ARGUMENTS, ${ARGUMENTS} - all arguments joined with spaces
+//   - $@, $+, $ARGUMENTS, ${ARGUMENTS} - all arguments joined with spaces
 //   - ${@:N} - arguments from index N onwards (0-indexed)
 //   - ${@:N:L} - L arguments starting from index N (0-indexed)
 func SubstituteArgs(content string, args []string) string {
+	return fences.ReplaceOutside(content, func(segment string) string {
+		return substituteArgsInSegment(segment, args)
+	})
+}
+
+// substituteArgsInSegment performs argument substitution on a single text
+// segment that is known to be outside fenced code blocks and inline code spans.
+func substituteArgsInSegment(content string, args []string) string {
 	return argPlaceholder.ReplaceAllStringFunc(content, func(match string) string {
 		// Check for ${N} or ${N:M} format
 		if strings.HasPrefix(match, "${") && strings.Contains(match, "}") {
@@ -191,8 +202,8 @@ func SubstituteArgs(content string, args []string) string {
 		if strings.HasPrefix(match, "$") && !strings.HasPrefix(match, "${") {
 			suffix := match[1:]

-			// $@ or $ARGUMENTS
-			if suffix == "@" || suffix == "ARGUMENTS" {
+			// $@, $+, or $ARGUMENTS
+			if suffix == "@" || suffix == "+" || suffix == "ARGUMENTS" {
 				return strings.Join(args, " ")
 			}

@@ -266,6 +277,48 @@ func joinArgsRange(args []string, start, length int) string {
 	return strings.Join(args[start:end], " ")
 }

+// HasArgPlaceholders reports whether the template content contains any
+// argument placeholders ($1, $@, $+, $ARGUMENTS, ${@:...}, etc.).
+// Placeholders inside fenced code blocks and inline code spans are ignored.
+func (t *PromptTemplate) HasArgPlaceholders() bool {
+	return argPlaceholder.MatchString(fences.StripCode(t.Content))
+}
+
+// RequiredArgs returns the number of positional arguments the template
+// expects. This is determined by the highest $N, ${N}, or ${N:M} placeholder
+// found in the content (1-indexed, so $2 means 2 args required). The $+
+// placeholder (required variadic) ensures at least 1. Optional wildcards
+// ($@, $ARGUMENTS, ${@:N}) do not contribute to the count. Placeholders
+// inside fenced code blocks and inline code spans are ignored.
+func (t *PromptTemplate) RequiredArgs() int {
+	content := fences.StripCode(t.Content)
+	maxN := 0
+	hasRequiredVariadic := strings.Contains(content, "$+")
+	for _, match := range argPlaceholder.FindAllStringSubmatch(content, -1) {
+		// Group 1: ${N} format — the N value.
+		if match[1] != "" {
+			if n, err := strconv.Atoi(match[1]); err == nil && n > maxN {
+				maxN = n
+			}
+		}
+		// Group 2: ${N:M} format — the N value (start index).
+		if match[2] != "" {
+			if n, err := strconv.Atoi(match[2]); err == nil && n > maxN {
+				maxN = n
+			}
+		}
+		// Group 6: $N format (no braces) — the N value.
+		if match[6] != "" {
+			if n, err := strconv.Atoi(match[6]); err == nil && n > maxN {
+				maxN = n
+			}
+		}
+	}
+	if hasRequiredVariadic && maxN < 1 {
+		maxN = 1
+	}
+	return maxN
+}
+
 // Expand substitutes arguments into the template content and returns the result.
 // It first parses args from the input string, then substitutes them into the template.
 func (t *PromptTemplate) Expand(argsInput string) string {
@@ -129,6 +129,48 @@ func TestSubstituteArgs(t *testing.T) {
 			args:     []string{},
 			expected: "Args: ",
 		},
+		{
+			name:     "$1 inside code block preserved",
+			content:  "Use $1 here\n```bash\necho $1\n```\ndone",
+			args:     []string{"foo"},
+			expected: "Use foo here\n```bash\necho $1\n```\ndone",
+		},
+		{
+			name:     "$@ inside code block preserved",
+			content:  "Run $@\n```\necho $@\n```\n",
+			args:     []string{"a", "b"},
+			expected: "Run a b\n```\necho $@\n```\n",
+		},
+		{
+			name:     "all placeholders inside code block",
+			content:  "Prompt\n```\n$1 $2 $@\n```\n",
+			args:     []string{"x"},
+			expected: "Prompt\n```\n$1 $2 $@\n```\n",
+		},
+		{
+			name:     "$1 inside inline code preserved",
+			content:  "Use `$1` here and $1 outside",
+			args:     []string{"foo"},
+			expected: "Use `$1` here and foo outside",
+		},
+		{
+			name:     "$+ required variadic",
+			content:  "Args: $+",
+			args:     []string{"a", "b", "c"},
+			expected: "Args: a b c",
+		},
+		{
+			name:     "$+ with empty args",
+			content:  "Args: $+",
+			args:     []string{},
+			expected: "Args: ",
+		},
+		{
+			name:     "all placeholders in inline code",
+			content:  "Use `$1` and `$@` for args",
+			args:     []string{"x"},
+			expected: "Use `$1` and `$@` for args",
+		},
 	}

 	for _, tt := range tests {
@@ -213,3 +255,78 @@ func TestPromptTemplateExpand(t *testing.T) {
 		})
 	}
 }
+
+func TestHasArgPlaceholders(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		want    bool
+	}{
+		{"no placeholders", "Just a plain prompt with no args", false},
+		{"$1 placeholder", "Create a $1 component", true},
+		{"$@ placeholder", "Run with args: $@", true},
+		{"$ARGUMENTS placeholder", "Features: $ARGUMENTS", true},
+		{"${1} placeholder", "Name: ${1}", true},
+		{"${ARGUMENTS} placeholder", "All: ${ARGUMENTS}", true},
+		{"${@:1} placeholder", "Rest: ${@:1}", true},
+		{"${@:1:2} placeholder", "Slice: ${@:1:2}", true},
+		{"dollar in text", "Cost is one hundred dollars", false},
+		{"empty content", "", false},
+		{"$1 inside code block only", "Prompt\n```\necho $1\n```\n", false},
+		{"$1 outside and inside code block", "Use $1 here\n```\necho $1\n```\n", true},
+		{"$@ inside code block only", "Prompt\n```bash\necho $@\n```\n", false},
+		{"$+ placeholder", "Run with args: $+", true},
+		{"$+ inside inline code only", "Use `$+` for required args", false},
+		{"$1 inside inline code only", "Use `$1` for positional args", false},
+		{"$1 outside and in inline code", "Create $1 (see `$1` syntax)", true},
+		{"$@ outside $1 in inline code", "Run $@ with `$1` syntax", true},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			tpl := &PromptTemplate{Content: tt.content}
+			if got := tpl.HasArgPlaceholders(); got != tt.want {
+				t.Errorf("HasArgPlaceholders() = %v, want %v", got, tt.want)
+			}
+		})
+	}
+}
+
+func TestRequiredArgs(t *testing.T) {
+	tests := []struct {
+		name    string
+		content string
+		want    int
+	}{
+		{"no placeholders", "Just a plain prompt", 0},
+		{"$1 only", "Create a $1 component", 1},
+		{"$1 and $2", "Create $1 with $2", 2},
+		{"$3 skipping $2", "Use $1 and $3", 3},
+		{"${1} braced", "Name: ${1}", 1},
+		{"${2} braced", "Name: ${1} Desc: ${2}", 2},
+		{"$@ only", "Run with: $@", 0},
+		{"$ARGUMENTS only", "Features: $ARGUMENTS", 0},
+		{"${ARGUMENTS} only", "All: ${ARGUMENTS}", 0},
+		{"$1 and $@", "Create $1 with extras: $@", 1},
+		{"${@:1} slice only", "Rest: ${@:1}", 0},
+		{"${@:1:2} slice only", "Slice: ${@:1:2}", 0},
+		{"mixed $1 $2 and $@", "Create $1 named $2: $@", 2},
+		{"empty content", "", 0},
+		{"$2 inside code block only", "Prompt\n```\n$1 $2\n```\n", 0},
+		{"$1 outside $2 inside code block", "Use $1\n```\n$2 inside\n```\n", 1},
+		{"$+ only", "Run with: $+", 1},
+		{"$+ and $2", "Create $2 with: $+", 2},
+		{"$+ inside inline code only", "Use `$+` for required args", 0},
+		{"$1 and $2 in inline code only", "Use `$1` and `$2` for args", 0},
+		{"$1 outside $2 in inline code", "Create $1 (see `$2`)", 1},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			tpl := &PromptTemplate{Content: tt.content}
+			if got := tpl.RequiredArgs(); got != tt.want {
+				t.Errorf("RequiredArgs() = %d, want %d", got, tt.want)
+			}
+		})
+	}
+}
@@ -0,0 +1,317 @@
+package session
+
+import (
+	"slices"
+	"testing"
+
+	"charm.land/fantasy"
+	"github.com/mark3labs/kit/internal/message"
+)
+
+// TestCompactionCreatesNewLeaf verifies that after compaction, the compaction
+// entry has no parent (creating a new root), and BuildContext returns only
+// the summary and kept messages, not the old compacted messages.
+func TestCompactionCreatesNewLeaf(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Add some messages: M1, M2 (old, will be compacted), M3, M4 (kept)
+	msg1 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Message 1 - old"}}}
+	msg2 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Message 2 - old"}}}
+	msg3 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Message 3 - kept"}}}
+	msg4 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Message 4 - kept"}}}
+
+	_, _ = tm.AppendMessage(msg1)
+	_, _ = tm.AppendMessage(msg2)
+	id3, _ := tm.AppendMessage(msg3)
+	id4, _ := tm.AppendMessage(msg4)
+
+	// Verify initial state - all messages should be in context
+	messages, _, _ := tm.BuildContext()
+	if len(messages) != 4 {
+		t.Fatalf("expected 4 messages before compaction, got %d", len(messages))
+	}
+
+	// Verify entry IDs
+	entryIDs := tm.GetContextEntryIDs()
+	if len(entryIDs) != 4 {
+		t.Fatalf("expected 4 entry IDs before compaction, got %d", len(entryIDs))
+	}
+
+	// Now add a compaction entry, simulating that M3 is the first kept entry
+	summary := "Summary of old messages"
+	compactionID, err := tm.AppendCompaction(summary, id3, 1000, 500, 2, []string{}, []string{})
+	if err != nil {
+		t.Fatalf("failed to append compaction: %v", err)
+	}
+
+	// Verify the compaction entry has no parent (empty ParentID)
+	compactionEntry := tm.GetEntry(compactionID).(*CompactionEntry)
+	if compactionEntry.ParentID != "" {
+		t.Errorf("compaction entry should have no parent, got %q", compactionEntry.ParentID)
+	}
+
+	// Verify the leaf is now the compaction entry
+	if tm.GetLeafID() != compactionID {
+		t.Errorf("leaf should be compaction entry %q, got %q", compactionID, tm.GetLeafID())
+	}
+
+	// Now BuildContext should return: [summary] + [M3, M4]
+	messages, _, _ = tm.BuildContext()
+	if len(messages) != 3 {
+		t.Fatalf("expected 3 messages after compaction (summary + 2 kept), got %d", len(messages))
+	}
+
+	// First message should be the summary
+	if messages[0].Role != fantasy.MessageRoleSystem {
+		t.Errorf("first message should be system summary, got %s", messages[0].Role)
+	}
+	summaryText := messages[0].Content[0].(fantasy.TextPart).Text
+	if summaryText != "[Conversation summary — earlier messages were compacted]\n\n"+summary {
+		t.Errorf("unexpected summary text: %s", summaryText)
+	}
+
+	// Second message should be M3 (kept)
+	if messages[1].Role != fantasy.MessageRoleUser {
+		t.Errorf("second message should be user (M3), got %s", messages[1].Role)
+	}
+	m3Text := messages[1].Content[0].(fantasy.TextPart).Text
+	if m3Text != "Message 3 - kept" {
+		t.Errorf("unexpected M3 text: %s", m3Text)
+	}
+
+	// Third message should be M4 (kept)
+	if messages[2].Role != fantasy.MessageRoleAssistant {
+		t.Errorf("third message should be assistant (M4), got %s", messages[2].Role)
+	}
+	m4Text := messages[2].Content[0].(fantasy.TextPart).Text
+	if m4Text != "Message 4 - kept" {
+		t.Errorf("unexpected M4 text: %s", m4Text)
+	}
+
+	// Verify GetContextEntryIDs returns correct IDs
+	entryIDs = tm.GetContextEntryIDs()
+	if len(entryIDs) != 3 {
+		t.Fatalf("expected 3 entry IDs after compaction (empty for summary + 2 kept), got %d: %v", len(entryIDs), entryIDs)
+	}
+
+	// First entry ID should be empty (summary has no entry)
+	if entryIDs[0] != "" {
+		t.Errorf("first entry ID should be empty (summary), got %q", entryIDs[0])
+	}
+
+	// Second and third should be id3 and id4 (the kept messages)
+	if entryIDs[1] != id3 {
+		t.Errorf("second entry ID should be %q (M3), got %q", id3, entryIDs[1])
+	}
+	if entryIDs[2] != id4 {
+		t.Errorf("third entry ID should be %q (M4), got %q", id4, entryIDs[2])
+	}
+}
+
+// TestCompactionWithNewMessagesAfterCompaction verifies that messages appended
+// after compaction are correctly included in the context.
+func TestCompactionWithNewMessagesAfterCompaction(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Add initial messages
+	msg1 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Message 1"}}}
+	msg2 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Message 2"}}}
+	msg3 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Message 3 - kept"}}}
+
+	_, _ = tm.AppendMessage(msg1)
+	_, _ = tm.AppendMessage(msg2)
+	id3, _ := tm.AppendMessage(msg3)
+
+	// Compact, keeping only M3
+	_, _ = tm.AppendCompaction("Summary", id3, 1000, 500, 2, []string{}, []string{})
+
+	// Add a new message after compaction
+	msg4 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Message 4 - after compaction"}}}
+	_, _ = tm.AppendMessage(msg4)
+
+	// BuildContext should return: [summary] + [M4 (new after compaction)] + [M3 (kept)]
+	messages, _, _ := tm.BuildContext()
+	if len(messages) != 3 {
+		t.Fatalf("expected 3 messages (summary + M4 + M3), got %d: %+v", len(messages), messages)
+	}
+
+	// Verify order: summary, M4 (new), M3 (kept)
+	if messages[0].Role != fantasy.MessageRoleSystem {
+		t.Errorf("first message should be summary, got %s", messages[0].Role)
+	}
+	if messages[1].Role != fantasy.MessageRoleAssistant {
+		t.Errorf("second message should be assistant (M4), got %s", messages[1].Role)
+	}
+	m4Text := messages[1].Content[0].(fantasy.TextPart).Text
+	if m4Text != "Message 4 - after compaction" {
+		t.Errorf("unexpected M4 text: %s", m4Text)
+	}
+	if messages[2].Role != fantasy.MessageRoleUser {
+		t.Errorf("third message should be user (M3), got %s", messages[2].Role)
+	}
+
+	// Verify that M1 is NOT in the context
+	for i, msg := range messages {
+		if msg.Role == fantasy.MessageRoleUser {
+			text := msg.Content[0].(fantasy.TextPart).Text
+			if text == "Message 1" {
+				t.Errorf("Message 1 (compacted) should not be in context at index %d", i)
+			}
+		}
+	}
+}
+
+// TestCompactionWithNoKeptMessages verifies compaction when all messages are compacted.
+func TestCompactionWithNoKeptMessages(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Add messages that will all be compacted
+	msg1 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Message 1"}}}
+	msg2 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Message 2"}}}
+
+	if _, err := tm.AppendMessage(msg1); err != nil {
+		t.Fatalf("failed to append message: %v", err)
+	}
+	if _, err := tm.AppendMessage(msg2); err != nil {
+		t.Fatalf("failed to append message: %v", err)
+	}
+
+	// Compact with no kept messages (empty firstKeptEntryID)
+	summary := "All messages summarized"
+	compactionID, _ := tm.AppendCompaction(summary, "", 1000, 100, 2, []string{}, []string{})
+
+	// Verify the compaction entry has no parent
+	compactionEntry := tm.GetEntry(compactionID).(*CompactionEntry)
+	if compactionEntry.ParentID != "" {
+		t.Errorf("compaction entry should have no parent, got %q", compactionEntry.ParentID)
+	}
+
+	// BuildContext should return only the summary
+	messages, _, _ := tm.BuildContext()
+	if len(messages) != 1 {
+		t.Fatalf("expected 1 message (summary only), got %d: %+v", len(messages), messages)
+	}
+	if messages[0].Role != fantasy.MessageRoleSystem {
+		t.Errorf("message should be system summary, got %s", messages[0].Role)
+	}
+}
+
+// TestMultipleCompactions verifies that multiple compactions work correctly.
+func TestMultipleCompactions(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// First batch of messages
+	msg1 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Batch 1 - User"}}}
+	msg2 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Batch 1 - Assistant"}}}
+	id1, _ := tm.AppendMessage(msg1)
+	id2, _ := tm.AppendMessage(msg2)
+
+	// First compaction
+	_, _ = tm.AppendCompaction("Summary 1", id1, 1000, 500, 1, []string{}, []string{})
+
+	// Second batch
+	msg3 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Batch 2 - User"}}}
+	msg4 := message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "Batch 2 - Assistant"}}}
+	id3, _ := tm.AppendMessage(msg3)
+	id4, _ := tm.AppendMessage(msg4)
+
+	// Second compaction: replaces Summary 1 and batch 1 with Summary 2.
+	// id3 is the first kept entry, so id3 and id4 (batch 2) are preserved.
+	compactionID2, _ := tm.AppendCompaction("Summary 2", id3, 1000, 500, 3, []string{}, []string{})
+
+	// Verify second compaction has no parent
+	compactionEntry2 := tm.GetEntry(compactionID2).(*CompactionEntry)
+	if compactionEntry2.ParentID != "" {
+		t.Errorf("second compaction entry should have no parent, got %q", compactionEntry2.ParentID)
+	}
+
+	// Add final message
+	msg5 := message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "Final message"}}}
+	id5, _ := tm.AppendMessage(msg5)
+
+	// BuildContext should include:
+	// - Summary 2 (from second compaction)
+	// - msg5 (final message)
+	// - msg3, msg4 (kept from second compaction)
+	// But NOT Summary 1 or msg1, msg2 (they're before the first kept entry of compaction 2)
+	messages, _, _ := tm.BuildContext()
+
+	// Should have: Summary 2 + msg5 + msg3 + msg4 = 4 messages
+	if len(messages) != 4 {
+		t.Fatalf("expected 4 messages (Summary 2 + msg5 + msg3 + msg4), got %d: %+v", len(messages), messages)
+	}
+
+	// First should be Summary 2
+	if messages[0].Role != fantasy.MessageRoleSystem {
+		t.Errorf("first message should be system (Summary 2), got %s", messages[0].Role)
+	}
+	summaryText := messages[0].Content[0].(fantasy.TextPart).Text
+	if summaryText != "[Conversation summary — earlier messages were compacted]\n\nSummary 2" {
+		t.Errorf("unexpected summary: %s", summaryText)
+	}
+
+	// Verify msg5 is included
+	foundFinal := false
+	for _, msg := range messages {
+		if msg.Role == fantasy.MessageRoleUser {
+			text := msg.Content[0].(fantasy.TextPart).Text
+			if text == "Final message" {
+				foundFinal = true
+				break
+			}
+		}
+	}
+	if !foundFinal {
+		t.Error("Final message (msg5) should be in context")
+	}
+
+	// Verify msg1, msg2 are NOT included (kept by the first compaction, dropped by the second)
+	for _, msg := range messages {
+		if msg.Role == fantasy.MessageRoleUser || msg.Role == fantasy.MessageRoleAssistant {
+			text := msg.Content[0].(fantasy.TextPart).Text
+			if text == "Batch 1 - User" || text == "Batch 1 - Assistant" {
+				t.Errorf("Batch 1 messages should not be in context, found: %s", text)
+			}
+		}
+	}
+
+	// Verify entry IDs
+	entryIDs := tm.GetContextEntryIDs()
+	if len(entryIDs) != 4 {
+		t.Fatalf("expected 4 entry IDs, got %d: %v", len(entryIDs), entryIDs)
+	}
+
+	// First should be empty (summary)
+	if entryIDs[0] != "" {
+		t.Errorf("first entry ID should be empty (summary), got %q", entryIDs[0])
+	}
+
+	// Check that id5 is in the list
+	if !slices.Contains(entryIDs, id5) {
+		t.Errorf("id5 (final message) should be in entry IDs, got %v", entryIDs)
+	}
+
+	// Verify id3 and id4 ARE in the list (they were kept)
+	foundID3, foundID4 := false, false
+	for _, id := range entryIDs {
+		if id == id3 {
+			foundID3 = true
+		}
+		if id == id4 {
+			foundID4 = true
+		}
+	}
+	if !foundID3 {
+		t.Errorf("id3 (kept message) should be in entry IDs, got %v", entryIDs)
+	}
+	if !foundID4 {
+		t.Errorf("id4 (kept message) should be in entry IDs, got %v", entryIDs)
+	}
+
+	// Verify id1 and id2 are NOT in the list (they were compacted away)
+	for _, id := range entryIDs {
+		if id == id1 || id == id2 {
+			t.Errorf("id1 or id2 (compacted) should not be in entry IDs, found %q in %v", id, entryIDs)
+		}
+	}
+}
@@ -509,11 +509,19 @@ func (tm *TreeManager) AppendExtensionData(extType, data string) (string, error)
 // AppendCompaction adds a compaction entry to the tree. The entry records
 // the summary and the ID of the first entry that should be preserved in the
 // LLM context. Messages before that entry are replaced by the summary.
+//
+// The compaction entry becomes a new "root" for the post-compaction branch
+// with no parent (empty ParentID). This breaks the parent chain so that old
+// compacted messages are no longer traversed when building context. The kept
+// messages are explicitly collected via FirstKeptEntryID in BuildContext.
 func (tm *TreeManager) AppendCompaction(summary, firstKeptEntryID string, tokensBefore, tokensAfter, messagesRemoved int, readFiles, modifiedFiles []string) (string, error) {
 	tm.mu.Lock()
 	defer tm.mu.Unlock()

-	entry := NewCompactionEntry(tm.leafID, summary, firstKeptEntryID, tokensBefore, tokensAfter, messagesRemoved, readFiles, modifiedFiles)
+	// The compaction entry has no parent, making it a new "root" for the
+	// post-compaction branch. This ensures old compacted messages are not
+	// traversed when walking from the current leaf.
+	entry := NewCompactionEntry("", summary, firstKeptEntryID, tokensBefore, tokensAfter, messagesRemoved, readFiles, modifiedFiles)
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
@@ -683,14 +691,18 @@ func (tm *TreeManager) BuildContext() (messages []fantasy.Message, provider stri
 	// Find the last compaction entry on this branch — it determines
 	// which older messages are replaced by the summary.
 	var lastCompaction *CompactionEntry
+	compactionIndex := -1
 	for i := len(branch) - 1; i >= 0; i-- {
 		if c, ok := branch[i].(*CompactionEntry); ok {
 			lastCompaction = c
+			compactionIndex = i
 			break
 		}
 	}

-	// If there is a compaction, inject the summary first.
+	// If there is a compaction, inject the summary first and collect
+	// the kept messages starting from FirstKeptEntryID (since the
+	// compaction entry's parent chain doesn't include them).
 	if lastCompaction != nil {
 		messages = append(messages, fantasy.Message{
 			Role: fantasy.MessageRoleSystem,
@@ -700,21 +712,104 @@ func (tm *TreeManager) BuildContext() (messages []fantasy.Message, provider stri
 				},
 			},
 		})
-	}

-	// Determine whether to skip entries (everything before firstKeptEntryID).
-	skipping := lastCompaction != nil
-	for _, entry := range branch {
-		// Once we reach the first kept entry, stop skipping.
-		if skipping {
-			entryID := tm.EntryID(entry)
-			if entryID == lastCompaction.FirstKeptEntryID {
-				skipping = false
-			} else {
+		// Walk the branch from the compaction entry (at compactionIndex) through
+		// the entries after it (newer messages); the compaction entry is skipped.
+		for i := compactionIndex; i < len(branch); i++ {
+			entry := branch[i]
+			switch e := entry.(type) {
+			case *MessageEntry:
+				msg, err := e.ToMessage()
+				if err != nil {
+					continue // skip malformed entries
+				}
+				msgs := msg.ToLLMMessages()
+				messages = append(messages, msgs...)
+
+			case *BranchSummaryEntry:
+				// Convert branch summary to a user message for context.
+				if e.Summary != "" {
+					messages = append(messages, fantasy.Message{
+						Role: fantasy.MessageRoleUser,
+						Content: []fantasy.MessagePart{
+							fantasy.TextPart{
+								Text: fmt.Sprintf("[Branch context: %s]", e.Summary),
+							},
+						},
+					})
+				}
+
+			case *ModelChangeEntry:
+				provider = e.Provider
+				modelID = e.ModelID
+
+			case *CompactionEntry:
+				// Already handled above (summary injected).
 				continue
 			}
 		}

+		// Now collect the kept messages starting from FirstKeptEntryID.
+		// These are not in the current branch because the compaction entry
+		// has no parent (empty ParentID), so the branch walk stops at it.
+		// We iterate over tm.entries in order (not using getBranchLocked) to
+		// avoid walking back to old compacted messages.
+		// We stop when we reach the compaction entry to avoid double-counting
+		// messages that were added after the compaction.
+		if lastCompaction.FirstKeptEntryID != "" {
+			found := false
+			for _, entry := range tm.entries {
+				entryID := tm.EntryID(entry)
+
+				// Skip entries until we reach the first kept entry.
+				if !found {
+					if entryID == lastCompaction.FirstKeptEntryID {
+						found = true
+					} else {
+						continue
+					}
+				}
+
+				// Stop when we reach the compaction entry itself.
+				// Messages after the compaction are collected from the branch walk above.
+				if entryID == lastCompaction.ID {
+					break
+				}
+
+				// Process this kept entry.
+				switch e := entry.(type) {
+				case *MessageEntry:
+					msg, err := e.ToMessage()
+					if err != nil {
+						continue
+					}
+					msgs := msg.ToLLMMessages()
+					messages = append(messages, msgs...)
+
+				case *BranchSummaryEntry:
+					if e.Summary != "" {
+						messages = append(messages, fantasy.Message{
+							Role: fantasy.MessageRoleUser,
+							Content: []fantasy.MessagePart{
+								fantasy.TextPart{
+									Text: fmt.Sprintf("[Branch context: %s]", e.Summary),
+								},
+							},
+						})
+					}
+
+				case *ModelChangeEntry:
+					provider = e.Provider
+					modelID = e.ModelID
+				}
+			}
+		}
+
+		return messages, provider, modelID
+	}
+
+	// No compaction - process the entire branch normally.
+	for _, entry := range branch {
 		switch e := entry.(type) {
 		case *MessageEntry:
 			msg, err := e.ToMessage()
@@ -740,10 +835,6 @@ func (tm *TreeManager) BuildContext() (messages []fantasy.Message, provider stri
 		case *ModelChangeEntry:
 			provider = e.Provider
 			modelID = e.ModelID
-
-		case *CompactionEntry:
-			// Already handled above (the last one on the branch).
-			continue
 		}
 	}

@@ -853,31 +944,92 @@ func (tm *TreeManager) GetContextEntryIDs() []string {

 	// Find the last compaction entry for skip logic.
 	var lastCompaction *CompactionEntry
+	compactionIndex := -1
 	for i := len(branch) - 1; i >= 0; i-- {
 		if c, ok := branch[i].(*CompactionEntry); ok {
 			lastCompaction = c
+			compactionIndex = i
 			break
 		}
 	}

 	var ids []string

-	// If there's a compaction summary injected, it has no entry ID.
+	// If there's a compaction, we need to collect IDs from:
+	// 1. Entries after the compaction entry in the branch (newer messages)
+	// 2. Entries from FirstKeptEntryID onwards (kept messages)
 	if lastCompaction != nil {
-		ids = append(ids, "") // placeholder for the summary system message
-	}
+		// Placeholder for the summary system message (no entry ID).
+		ids = append(ids, "")

-	skipping := lastCompaction != nil
-	for _, entry := range branch {
-		if skipping {
-			entryID := tm.EntryID(entry)
-			if entryID == lastCompaction.FirstKeptEntryID {
-				skipping = false
-			} else {
-				continue
+		// Collect IDs from entries after the compaction entry (newer messages).
+		for i := compactionIndex + 1; i < len(branch); i++ {
+			entry := branch[i]
+			switch e := entry.(type) {
+			case *MessageEntry:
+				msg, err := e.ToMessage()
+				if err != nil {
+					continue
+				}
+				msgs := msg.ToLLMMessages()
+				for range msgs {
+					ids = append(ids, e.ID)
+				}
+
+			case *BranchSummaryEntry:
+				if e.Summary != "" {
+					ids = append(ids, e.ID)
+				}
 			}
 		}

+		// Collect IDs from the kept messages starting at FirstKeptEntryID.
+		// We iterate through entries in order (not using getBranchLocked) to avoid
+		// walking back to old compacted messages.
+		// We stop when we reach the compaction entry to avoid double-counting.
+		if lastCompaction.FirstKeptEntryID != "" {
+			found := false
+			for _, entry := range tm.entries {
+				entryID := tm.EntryID(entry)
+
+				// Skip entries until we reach the first kept entry.
+				if !found {
+					if entryID == lastCompaction.FirstKeptEntryID {
+						found = true
+					} else {
+						continue
+					}
+				}
+
+				// Stop when we reach the compaction entry itself.
+				if entryID == lastCompaction.ID {
+					break
+				}
+
+				switch e := entry.(type) {
+				case *MessageEntry:
+					msg, err := e.ToMessage()
+					if err != nil {
+						continue
+					}
+					msgs := msg.ToLLMMessages()
+					for range msgs {
+						ids = append(ids, e.ID)
+					}
+
+				case *BranchSummaryEntry:
+					if e.Summary != "" {
+						ids = append(ids, e.ID)
+					}
+				}
+			}
+		}
+
+		return ids
+	}
+
+	// No compaction - collect IDs from the entire branch.
+	for _, entry := range branch {
 		switch e := entry.(type) {
 		case *MessageEntry:
 			msg, err := e.ToMessage()
@@ -893,9 +1045,6 @@ func (tm *TreeManager) GetContextEntryIDs() []string {
 			if e.Summary != "" {
 				ids = append(ids, e.ID)
 			}
-
-		case *CompactionEntry:
-			continue
 		}
 	}

@@ -60,15 +60,16 @@ type MCPConnection struct {
 // creation, health monitoring, and cleanup. The pool runs background health checks
 // to proactively identify and remove unhealthy connections.
 type MCPConnectionPool struct {
-	connections map[string]*MCPConnection
-	config      *ConnectionPoolConfig
-	mu          sync.RWMutex
-	model       fantasy.LanguageModel
-	ctx         context.Context
-	cancel      context.CancelFunc
-	debug       bool
-	debugLogger DebugLogger
-	oauthFlow   *OAuthFlowRunner
+	connections       map[string]*MCPConnection
+	config            *ConnectionPoolConfig
+	mu                sync.RWMutex
+	model             fantasy.LanguageModel
+	ctx               context.Context
+	cancel            context.CancelFunc
+	debug             bool
+	debugLogger       DebugLogger
+	oauthFlow         *OAuthFlowRunner
+	tokenStoreFactory TokenStoreFactory // custom factory for per-server token stores (nil = default FileTokenStore)
 }

 // NewMCPConnectionPool creates a new MCP connection pool with the specified configuration.
@@ -76,19 +77,20 @@ type MCPConnectionPool struct {
 // goroutine for periodic health checks that runs until Close is called.
 // The model parameter is used for MCP servers that require sampling support.
 // Thread-safe for concurrent use immediately after creation.
-func NewMCPConnectionPool(config *ConnectionPoolConfig, model fantasy.LanguageModel, debug bool, authHandler MCPAuthHandler) *MCPConnectionPool {
+func NewMCPConnectionPool(config *ConnectionPoolConfig, model fantasy.LanguageModel, debug bool, authHandler MCPAuthHandler, tokenStoreFactory TokenStoreFactory) *MCPConnectionPool {
 	if config == nil {
 		config = DefaultConnectionPoolConfig()
 	}

 	ctx, cancel := context.WithCancel(context.Background())
 	pool := &MCPConnectionPool{
-		connections: make(map[string]*MCPConnection),
-		config:      config,
-		model:       model,
-		ctx:         ctx,
-		cancel:      cancel,
-		debug:       debug,
+		connections:       make(map[string]*MCPConnection),
+		config:            config,
+		model:             model,
+		ctx:               ctx,
+		cancel:            cancel,
+		debug:             debug,
+		tokenStoreFactory: tokenStoreFactory,
 	}

 	if authHandler != nil {
@@ -363,19 +365,29 @@ func (p *MCPConnectionPool) createSSEClient(ctx context.Context, serverConfig co
 	}

 	// Enable OAuth for remote transports when an auth handler is configured.
-	// The OAuthConfig uses PKCE and the handler's redirect URI. Client ID and
-	// scopes are discovered automatically via dynamic client registration and
-	// server metadata (RFC 9728).
+	// The OAuthConfig uses PKCE and the handler's redirect URI. A pre-registered
+	// client ID, client secret, and scopes from the server config (for servers
+	// without dynamic client registration, e.g. GitHub) are passed through when set.
 	if p.oauthFlow != nil {
-		tokenStore, tsErr := NewFileTokenStore(serverConfig.URL)
+		tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
 		if tsErr != nil {
 			return nil, fmt.Errorf("failed to create token store: %w", tsErr)
 		}
-		options = append(options, transport.WithOAuth(transport.OAuthConfig{
+		oauthCfg := transport.OAuthConfig{
 			RedirectURI: p.oauthFlow.handler.RedirectURI(),
 			PKCEEnabled: true,
 			TokenStore:  tokenStore,
-		}))
+		}
+		if serverConfig.OAuthClientID != "" {
+			oauthCfg.ClientID = serverConfig.OAuthClientID
+		}
+		if serverConfig.OAuthClientSecret != "" {
+			oauthCfg.ClientSecret = serverConfig.OAuthClientSecret
+		}
+		if len(serverConfig.OAuthScopes) > 0 {
+			oauthCfg.Scopes = serverConfig.OAuthScopes
+		}
+		options = append(options, transport.WithOAuth(oauthCfg))
 	}

 	sseClient, err := client.NewSSEMCPClient(serverConfig.URL, options...)
@@ -410,19 +422,29 @@ func (p *MCPConnectionPool) createStreamableClient(ctx context.Context, serverCo
 	}

 	// Enable OAuth for remote transports when an auth handler is configured.
-	// The OAuthConfig uses PKCE and the handler's redirect URI. Client ID and
-	// scopes are discovered automatically via dynamic client registration and
-	// server metadata (RFC 9728).
+	// The OAuthConfig uses PKCE and the handler's redirect URI. A pre-registered
+	// client ID, client secret, and scopes from the server config (for servers
+	// without dynamic client registration, e.g. GitHub) are passed through when set.
 	if p.oauthFlow != nil {
-		tokenStore, tsErr := NewFileTokenStore(serverConfig.URL)
+		tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
 		if tsErr != nil {
 			return nil, fmt.Errorf("failed to create token store: %w", tsErr)
 		}
-		options = append(options, transport.WithHTTPOAuth(transport.OAuthConfig{
+		oauthCfg := transport.OAuthConfig{
 			RedirectURI: p.oauthFlow.handler.RedirectURI(),
 			PKCEEnabled: true,
 			TokenStore:  tokenStore,
-		}))
+		}
+		if serverConfig.OAuthClientID != "" {
+			oauthCfg.ClientID = serverConfig.OAuthClientID
+		}
+		if serverConfig.OAuthClientSecret != "" {
+			oauthCfg.ClientSecret = serverConfig.OAuthClientSecret
+		}
+		if len(serverConfig.OAuthScopes) > 0 {
+			oauthCfg.Scopes = serverConfig.OAuthScopes
+		}
+		options = append(options, transport.WithHTTPOAuth(oauthCfg))
 	}

 	streamableClient, err := client.NewStreamableHttpClient(serverConfig.URL, options...)
@@ -437,6 +459,16 @@ func (p *MCPConnectionPool) createStreamableClient(ctx context.Context, serverCo
 	return streamableClient, nil
 }

+// createTokenStore creates a token store for the given server URL.
+// If a custom TokenStoreFactory is configured, it is used; otherwise the
+// default file-backed token store is created.
+func (p *MCPConnectionPool) createTokenStore(serverURL string) (transport.TokenStore, error) {
+	if p.tokenStoreFactory != nil {
+		return p.tokenStoreFactory(serverURL)
+	}
+	return NewFileTokenStore(serverURL)
+}
+
 // initializeClient initializes the client
 func (p *MCPConnectionPool) initializeClient(ctx context.Context, client client.MCPClient) error {
 	initCtx, cancel := context.WithTimeout(ctx, 5*time.Minute)
@@ -583,6 +615,27 @@ func (p *MCPConnectionPool) GetClients() map[string]client.MCPClient {
 	return clients
 }

+// RemoveConnection closes and removes a single connection from the pool.
+// Returns an error if the connection does not exist or if closing fails.
+// Thread-safe for concurrent use.
+func (p *MCPConnectionPool) RemoveConnection(serverName string) error {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+
+	conn, exists := p.connections[serverName]
+	if !exists {
+		return fmt.Errorf("connection %q not found in pool", serverName)
+	}
+
+	err := conn.client.Close()
+	delete(p.connections, serverName)
+
+	if p.debugLogger != nil && p.debugLogger.IsDebugEnabled() {
+		p.debugLogger.LogDebug(fmt.Sprintf("[POOL] Removed connection %s", serverName))
+	}
+	return err
+}
+
 // Close gracefully shuts down the connection pool, closing all client connections
 // and stopping the background health check goroutine. It attempts to close all
 // connections even if some fail, logging any errors encountered.
@@ -4,8 +4,10 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"maps"
 	"slices"
 	"strings"
+	"sync"

 	"charm.land/fantasy"
 	"github.com/mark3labs/kit/internal/config"
@@ -18,14 +20,25 @@ import (
 // pooling, health checks, tool name prefixing to avoid conflicts, and sampling support for LLM interactions.
 // Thread-safe for concurrent tool invocations.
 type MCPToolManager struct {
-	connectionPool *MCPConnectionPool
-	tools          []fantasy.AgentTool
-	toolMap        map[string]*toolMapping // maps prefixed tool names to their server and original name
-	model          fantasy.LanguageModel   // LLM model for sampling
-	authHandler    MCPAuthHandler          // OAuth handler for remote servers (nil = no OAuth)
-	config         *config.Config
-	debug          bool
-	debugLogger    DebugLogger
+	connectionPool    *MCPConnectionPool
+	tools             []fantasy.AgentTool
+	toolMap           map[string]*toolMapping // maps prefixed tool names to their server and original name
+	mu                sync.Mutex              // protects tools and toolMap during parallel loading
+	model             fantasy.LanguageModel   // LLM model for sampling
+	authHandler       MCPAuthHandler          // OAuth handler for remote servers (nil = no OAuth)
+	tokenStoreFactory TokenStoreFactory       // factory for creating per-server token stores (nil = default FileTokenStore)
+	config            *config.Config
+	debug             bool
+	debugLogger       DebugLogger
+
+	// onServerLoaded, if non-nil, is called when each server finishes loading.
+	// Called with server name, tool count, and error (nil on success).
+	onServerLoaded func(serverName string, toolCount int, err error)
+
+	// onToolsChanged, if non-nil, is called after AddServer or RemoveServer
+	// mutates the tool list. The agent layer uses this to trigger a
+	// rebuildFantasyAgent so the LLM sees the updated tools.
+	onToolsChanged func()
 }

 // toolMapping stores the mapping between prefixed tool names and their original details
@@ -62,6 +75,20 @@ func (m *MCPToolManager) SetAuthHandler(handler MCPAuthHandler) {
 	m.authHandler = handler
 }

+// GetAuthHandler returns the OAuth handler for remote MCP server authentication.
+// Returns nil if no handler is configured.
+func (m *MCPToolManager) GetAuthHandler() MCPAuthHandler {
+	return m.authHandler
+}
+
+// SetTokenStoreFactory sets a custom factory for creating per-server OAuth token
+// stores. When set, the factory is called for each remote MCP server instead of
+// using the default file-based token store. This method should be called before
+// LoadTools.
+func (m *MCPToolManager) SetTokenStoreFactory(factory TokenStoreFactory) {
+	m.tokenStoreFactory = factory
+}
+
 // SetDebugLogger sets the debug logger for the tool manager.
 // The logger will be used to output detailed debugging information about MCP connections,
 // tool loading, and execution. If a connection pool exists, it will also be configured
@@ -73,48 +100,207 @@ func (m *MCPToolManager) SetDebugLogger(logger DebugLogger) {
 	}
 }

+// SetOnServerLoaded sets the callback that's invoked when each MCP server finishes
+// loading. The callback receives the server name, tool count, and any error.
+// Call this before LoadTools to receive per-server notifications.
+func (m *MCPToolManager) SetOnServerLoaded(cb func(serverName string, toolCount int, err error)) {
+	m.onServerLoaded = cb
+}
+
+// SetOnToolsChanged sets the callback that's invoked after AddServer or
+// RemoveServer mutates the tool list. The agent layer uses this to trigger
+// a rebuild of the fantasy agent so the LLM sees the updated tool set.
+func (m *MCPToolManager) SetOnToolsChanged(cb func()) {
+	m.onToolsChanged = cb
+}
+
+// AddServer connects to a new MCP server at runtime and loads its tools.
+// The server's tools are immediately available to the agent after this call.
+// Returns the number of tools loaded from the server.
+//
+// If the connection pool has not been initialised yet (i.e. LoadTools was never
+// called), AddServer creates one automatically using the manager's current
+// configuration.
+//
+// Returns an error if a server with the same name is already loaded, or if
+// the connection or tool loading fails.
+func (m *MCPToolManager) AddServer(ctx context.Context, name string, cfg config.MCPServerConfig) (int, error) {
+	m.mu.Lock()
+	// Reject duplicates. A server counts as loaded when any key in toolMap
+	// carries its "<name>__" prefix (this also covers a key equal to the bare
+	// prefix), so a server that registered zero tools is not detected here.
+	//
+	// The lock is released before connecting, so two concurrent AddServer
+	// calls with the same name can both pass this check.
+	prefix := name + "__"
+	for k := range m.toolMap {
+		if strings.HasPrefix(k, prefix) {
+			m.mu.Unlock()
+			return 0, fmt.Errorf("MCP server %q is already loaded", name)
+		}
+	}
+	m.mu.Unlock()
+
+	// Lazily create the connection pool if LoadTools was never called.
+	m.ensureConnectionPool()
+
+	count, err := m.loadServerTools(ctx, name, cfg)
+	if err != nil {
+		return 0, fmt.Errorf("failed to add MCP server %q: %w", name, err)
+	}
+
+	// Notify listeners.
+	if m.onServerLoaded != nil {
+		m.onServerLoaded(name, count, nil)
+	}
+	if m.onToolsChanged != nil {
+		m.onToolsChanged()
+	}
+
+	return count, nil
+}
+
+// RemoveServer disconnects an MCP server and removes all its tools.
+// After this call the agent will no longer see or be able to call tools from
+// the named server. Returns an error if the server is not loaded.
+func (m *MCPToolManager) RemoveServer(name string) error {
+	prefix := name + "__"
+
+	m.mu.Lock()
+
+	// Check the server actually has tools loaded.
+	found := false
+	for k := range m.toolMap {
+		if strings.HasPrefix(k, prefix) {
+			found = true
+			break
+		}
+	}
+	if !found {
+		m.mu.Unlock()
+		return fmt.Errorf("MCP server %q is not loaded", name)
+	}
+
+	// Remove tools belonging to this server.
+	newTools := make([]fantasy.AgentTool, 0, len(m.tools))
+	for _, t := range m.tools {
+		if !strings.HasPrefix(t.Info().Name, prefix) {
+			newTools = append(newTools, t)
+		}
+	}
+	m.tools = newTools
+
+	// Remove tool mappings.
+	for k := range m.toolMap {
+		if strings.HasPrefix(k, prefix) {
+			delete(m.toolMap, k)
+		}
+	}
+	m.mu.Unlock()
+
+	// Close the connection in the pool (best-effort).
+	if m.connectionPool != nil {
+		_ = m.connectionPool.RemoveConnection(name)
+	}
+
+	if m.onToolsChanged != nil {
+		m.onToolsChanged()
+	}
+
+	return nil
+}
+
+// ensureConnectionPool lazily creates a connection pool if one does not exist.
+// This allows AddServer to work even if LoadTools was never called.
+func (m *MCPToolManager) ensureConnectionPool() {
+	if m.connectionPool != nil {
+		return
+	}
+	debug := false
+	if m.config != nil {
+		debug = m.config.Debug
+	}
+	if m.debugLogger == nil {
+		m.debugLogger = NewSimpleDebugLogger(debug)
+	}
+	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), m.model, debug, m.authHandler, m.tokenStoreFactory)
+	m.connectionPool.SetDebugLogger(m.debugLogger)
+}
+
 // LoadTools loads tools from all configured MCP servers based on the provided configuration.
 // It initializes the connection pool, connects to each configured server, and loads their tools.
 // Tools from different servers are prefixed with the server name to avoid naming conflicts.
 // Returns an error only if all configured servers fail to load; partial failures are logged as warnings.
 // This method is thread-safe and idempotent.
-func (m *MCPToolManager) LoadTools(ctx context.Context, config *config.Config) error {
+func (m *MCPToolManager) LoadTools(ctx context.Context, cfg *config.Config) error {
 	// Initialize connection pool
-	m.config = config
-	m.debug = config.Debug
+	m.config = cfg
+	m.debug = cfg.Debug
 	if m.debugLogger == nil {
-		m.debugLogger = NewSimpleDebugLogger(config.Debug)
+		m.debugLogger = NewSimpleDebugLogger(cfg.Debug)
 	}
-	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), m.model, config.Debug, m.authHandler)
+	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), m.model, cfg.Debug, m.authHandler, m.tokenStoreFactory)
 	m.connectionPool.SetDebugLogger(m.debugLogger)

-	var loadErrors []string
+	// Load all servers in parallel. Each server connection (subprocess
+	// spawn, MCP initialize handshake, ListTools) is independent and
+	// typically dominated by process startup latency. Running them
+	// concurrently reduces total wall-clock time from O(n * avg) to
+	// O(max).
+	type serverResult struct {
+		name string
+		err  error
+	}

-	for serverName, serverConfig := range config.MCPServers {
-		if err := m.loadServerTools(ctx, serverName, serverConfig); err != nil {
-			loadErrors = append(loadErrors, fmt.Sprintf("server %s: %v", serverName, err))
-			fmt.Printf("Warning: Failed to load MCP server '%s': %v\n", serverName, err)
-			continue
+	results := make(chan serverResult, len(cfg.MCPServers))
+	var wg sync.WaitGroup
+
+	for serverName, serverConfig := range cfg.MCPServers {
+		wg.Add(1)
+		go func(name string, sc config.MCPServerConfig) {
+			defer wg.Done()
+			count, err := m.loadServerTools(ctx, name, sc)
+			results <- serverResult{name: name, err: err}
+			// Notify callback if set (for real-time UI updates).
+			if m.onServerLoaded != nil {
+				m.onServerLoaded(name, count, err)
+			}
+		}(serverName, serverConfig)
+	}
+
+	// Close results channel once all goroutines finish.
+	go func() {
+		wg.Wait()
+		close(results)
+	}()
+
+	var loadErrors []string
+	for r := range results {
+		if r.err != nil {
+			loadErrors = append(loadErrors, fmt.Sprintf("server %s: %v", r.name, r.err))
+			fmt.Printf("Warning: Failed to load MCP server '%s': %v\n", r.name, r.err)
 		}
 	}

 	// If all servers failed to load, return an error
-	if len(loadErrors) == len(config.MCPServers) && len(config.MCPServers) > 0 {
+	if len(loadErrors) == len(cfg.MCPServers) && len(cfg.MCPServers) > 0 {
 		return fmt.Errorf("all MCP servers failed to load: %s", strings.Join(loadErrors, "; "))
 	}

 	return nil
 }

-// loadServerTools loads tools from a single MCP server
-func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string, serverConfig config.MCPServerConfig) error {
+// loadServerTools loads tools from a single MCP server.
+// Thread-safe: may be called concurrently for different servers.
+// Returns the number of tools loaded from this server, or -1 on error.
+func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string, serverConfig config.MCPServerConfig) (int, error) {
 	// Add debug logging
 	m.debugLogConnectionInfo(serverName, serverConfig)

 	// Get connection from pool
 	conn, err := m.connectionPool.GetConnection(ctx, serverName, serverConfig)
 	if err != nil {
-		return fmt.Errorf("failed to get connection from pool: %v", err)
+		return -1, fmt.Errorf("failed to get connection from pool: %w", err)
 	}

 	// Get tools from this server
@@ -122,7 +308,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 	if err != nil {
 		// Handle connection error
 		m.connectionPool.HandleConnectionError(serverName, err)
-		return fmt.Errorf("failed to list tools: %v", err)
+		return -1, fmt.Errorf("failed to list tools: %w", err)
 	}

 	// Create name set for allowed tools
@@ -134,6 +320,10 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 		}
 	}

+	// Build tools locally before acquiring the lock.
+	var localTools []fantasy.AgentTool
+	localMap := make(map[string]*toolMapping)
+
 	// Convert MCP tools to fantasy AgentTools with prefixed names
 	for _, mcpTool := range listResults.Tools {
 		// Filter tools based on allowedTools/excludedTools
@@ -151,7 +341,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 		// Convert MCP InputSchema to map[string]any for fantasy ToolInfo
 		marshaledSchema, err := json.Marshal(mcpTool.InputSchema)
 		if err != nil {
-			return fmt.Errorf("conv mcp tool input schema fail(marshal): %w, tool name: %s", err, mcpTool.Name)
+			return -1, fmt.Errorf("conv mcp tool input schema fail(marshal): %w, tool name: %s", err, mcpTool.Name)
 		}

 		// Fix for JSON Schema draft-07 vs draft-04 compatibility
@@ -160,7 +350,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 		// Parse into map[string]any for fantasy's parameters format
 		var schemaMap map[string]any
 		if err := json.Unmarshal(marshaledSchema, &schemaMap); err != nil {
-			return fmt.Errorf("conv mcp tool input schema fail(unmarshal): %w, tool name: %s", err, mcpTool.Name)
+			return -1, fmt.Errorf("conv mcp tool input schema fail(unmarshal): %w, tool name: %s", err, mcpTool.Name)
 		}

 		// Extract properties and required from the schema
@@ -193,7 +383,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 			serverConfig: serverConfig,
 			manager:      m,
 		}
-		m.toolMap[prefixedName] = mapping
+		localMap[prefixedName] = mapping

 		// Create fantasy AgentTool
 		fantasyTool := &mcpFantasyTool{
@@ -206,10 +396,16 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 			mapping: mapping,
 		}

-		m.tools = append(m.tools, fantasyTool)
+		localTools = append(localTools, fantasyTool)
 	}

-	return nil
+	// Merge into the manager under the lock.
+	m.mu.Lock()
+	maps.Copy(m.toolMap, localMap)
+	m.tools = append(m.tools, localTools...)
+	m.mu.Unlock()
+
+	return len(localTools), nil
 }

 // GetTools returns all loaded tools as fantasy AgentTools from all configured MCP servers.
@@ -234,6 +430,9 @@ func (m *MCPToolManager) GetLoadedServerNames() []string {
 // proper cleanup of stdio processes, network connections, and other resources.
 // It is safe to call Close multiple times.
 func (m *MCPToolManager) Close() error {
+	if m.connectionPool == nil {
+		return nil
+	}
 	return m.connectionPool.Close()
 }

@@ -0,0 +1,323 @@
+package tools
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"runtime"
+	"slices"
+	"strings"
+	"sync"
+	"testing"
+	"time"
+
+	"github.com/mark3labs/kit/internal/config"
+)
+
+// testdataDir returns the absolute path to the testdata directory.
+func testdataDir(t *testing.T) string {
+	t.Helper()
+	_, file, _, ok := runtime.Caller(0)
+	if !ok {
+		t.Fatal("cannot determine test file path")
+	}
+	return filepath.Join(filepath.Dir(file), "testdata")
+}
+
+// echoServerConfig returns an MCPServerConfig for the test echo MCP server.
+func echoServerConfig(t *testing.T) config.MCPServerConfig {
+	t.Helper()
+	script := filepath.Join(testdataDir(t), "echo_server.py")
+	if _, err := os.Stat(script); err != nil {
+		t.Skipf("echo_server.py not found: %v", err)
+	}
+	return config.MCPServerConfig{
+		Command: []string{"python3", script},
+	}
+}
+
+// TestMCPToolManager_AddServer_Integration tests adding a real MCP server
+// at runtime and verifying tools are loaded.
+func TestMCPToolManager_AddServer_Integration(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	manager := NewMCPToolManager()
+	defer func() { _ = manager.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Track callbacks.
+	var mu sync.Mutex
+	var loadedServer string
+	var loadedCount int
+	toolsChangedCount := 0
+
+	manager.SetOnServerLoaded(func(name string, count int, err error) {
+		mu.Lock()
+		loadedServer = name
+		loadedCount = count
+		mu.Unlock()
+	})
+	manager.SetOnToolsChanged(func() {
+		mu.Lock()
+		toolsChangedCount++
+		mu.Unlock()
+	})
+
+	// Add the server.
+	count, err := manager.AddServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddServer failed: %v", err)
+	}
+
+	if count != 2 {
+		t.Errorf("Expected 2 tools from echo server, got %d", count)
+	}
+
+	// Verify callbacks fired.
+	mu.Lock()
+	if loadedServer != "echo" {
+		t.Errorf("Expected onServerLoaded for 'echo', got %q", loadedServer)
+	}
+	if loadedCount != 2 {
+		t.Errorf("Expected onServerLoaded count=2, got %d", loadedCount)
+	}
+	if toolsChangedCount != 1 {
+		t.Errorf("Expected onToolsChanged called once, got %d", toolsChangedCount)
+	}
+	mu.Unlock()
+
+	// Verify tools are accessible.
+	tools := manager.GetTools()
+	if len(tools) != 2 {
+		t.Fatalf("Expected 2 tools, got %d", len(tools))
+	}
+
+	// Verify tool names are prefixed.
+	toolNames := make(map[string]bool)
+	for _, tool := range tools {
+		toolNames[tool.Info().Name] = true
+	}
+	if !toolNames["echo__echo"] {
+		t.Error("Expected tool 'echo__echo'")
+	}
+	if !toolNames["echo__greet"] {
+		t.Error("Expected tool 'echo__greet'")
+	}
+
+	// Verify server appears in loaded names.
+	names := manager.GetLoadedServerNames()
+	if !slices.Contains(names, "echo") {
+		t.Errorf("Expected 'echo' in loaded server names, got: %v", names)
+	}
+}
+
+// TestMCPToolManager_RemoveServer_Integration tests removing a real MCP server
+// and verifying tools are cleaned up.
+func TestMCPToolManager_RemoveServer_Integration(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	manager := NewMCPToolManager()
+	defer func() { _ = manager.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add the server first.
+	count, err := manager.AddServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("AddServer failed: %v", err)
+	}
+	if count != 2 {
+		t.Fatalf("Expected 2 tools, got %d", count)
+	}
+
+	var mu sync.Mutex
+	toolsChangedCount := 0
+	manager.SetOnToolsChanged(func() {
+		mu.Lock()
+		toolsChangedCount++
+		mu.Unlock()
+	})
+
+	// Remove the server.
+	err = manager.RemoveServer("echo")
+	if err != nil {
+		t.Fatalf("RemoveServer failed: %v", err)
+	}
+
+	// Verify tools are gone.
+	tools := manager.GetTools()
+	if len(tools) != 0 {
+		t.Errorf("Expected 0 tools after removal, got %d", len(tools))
+	}
+
+	// Verify callback fired.
+	mu.Lock()
+	if toolsChangedCount != 1 {
+		t.Errorf("Expected onToolsChanged called once, got %d", toolsChangedCount)
+	}
+	mu.Unlock()
+
+	// Verify server is gone from loaded names.
+	names := manager.GetLoadedServerNames()
+	for _, n := range names {
+		if n == "echo" {
+			t.Error("Server 'echo' should not appear in loaded names after removal")
+		}
+	}
+
+	// Removing again should error.
+	err = manager.RemoveServer("echo")
+	if err == nil {
+		t.Fatal("Expected error removing already-removed server")
+	}
+	if !strings.Contains(err.Error(), "not loaded") {
+		t.Errorf("Expected 'not loaded' error, got: %v", err)
+	}
+}
+
+// TestMCPToolManager_AddRemoveMultiple_Integration tests adding and removing
+// multiple servers, verifying tool isolation.
+func TestMCPToolManager_AddRemoveMultiple_Integration(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	manager := NewMCPToolManager()
+	defer func() { _ = manager.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add two servers with the same binary but different names.
+	count1, err := manager.AddServer(ctx, "server-a", cfg)
+	if err != nil {
+		t.Fatalf("AddServer server-a failed: %v", err)
+	}
+	count2, err := manager.AddServer(ctx, "server-b", cfg)
+	if err != nil {
+		t.Fatalf("AddServer server-b failed: %v", err)
+	}
+
+	totalTools := count1 + count2
+	if totalTools != 4 {
+		t.Fatalf("Expected 4 total tools (2+2), got %d", totalTools)
+	}
+
+	tools := manager.GetTools()
+	if len(tools) != 4 {
+		t.Fatalf("Expected 4 tools, got %d", len(tools))
+	}
+
+	// Remove server-a, verify server-b tools remain.
+	err = manager.RemoveServer("server-a")
+	if err != nil {
+		t.Fatalf("RemoveServer server-a failed: %v", err)
+	}
+
+	tools = manager.GetTools()
+	if len(tools) != 2 {
+		t.Fatalf("Expected 2 tools after removing server-a, got %d", len(tools))
+	}
+
+	// Remaining tools should all be from server-b.
+	for _, tool := range tools {
+		if !strings.HasPrefix(tool.Info().Name, "server-b__") {
+			t.Errorf("Expected tool from server-b, got: %s", tool.Info().Name)
+		}
+	}
+
+	// Remove server-b.
+	err = manager.RemoveServer("server-b")
+	if err != nil {
+		t.Fatalf("RemoveServer server-b failed: %v", err)
+	}
+
+	tools = manager.GetTools()
+	if len(tools) != 0 {
+		t.Errorf("Expected 0 tools after removing all servers, got %d", len(tools))
+	}
+}
+
+// TestMCPToolManager_AddServer_DuplicateDetection_Integration tests that
+// adding a server with the same name as an already loaded server errors.
+func TestMCPToolManager_AddServer_DuplicateDetection_Integration(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	manager := NewMCPToolManager()
+	defer func() { _ = manager.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add the server.
+	_, err := manager.AddServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("First AddServer failed: %v", err)
+	}
+
+	// Try to add again with the same name.
+	_, err = manager.AddServer(ctx, "echo", cfg)
+	if err == nil {
+		t.Fatal("Expected error adding duplicate server")
+	}
+	if !strings.Contains(err.Error(), "already loaded") {
+		t.Errorf("Expected 'already loaded' error, got: %v", err)
+	}
+}
+
+// TestMCPToolManager_AddAfterRemove_Integration tests that a server can be
+// re-added after being removed.
+func TestMCPToolManager_AddAfterRemove_Integration(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in short mode")
+	}
+
+	manager := NewMCPToolManager()
+	defer func() { _ = manager.Close() }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	cfg := echoServerConfig(t)
+
+	// Add, remove, re-add.
+	_, err := manager.AddServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("First AddServer failed: %v", err)
+	}
+
+	err = manager.RemoveServer("echo")
+	if err != nil {
+		t.Fatalf("RemoveServer failed: %v", err)
+	}
+
+	count, err := manager.AddServer(ctx, "echo", cfg)
+	if err != nil {
+		t.Fatalf("Re-AddServer failed: %v", err)
+	}
+	if count != 2 {
+		t.Errorf("Expected 2 tools on re-add, got %d", count)
+	}
+
+	tools := manager.GetTools()
+	if len(tools) != 2 {
+		t.Errorf("Expected 2 tools after re-add, got %d", len(tools))
+	}
+}
@@ -0,0 +1,155 @@
+package tools
+
+import (
+	"context"
+	"strings"
+	"sync"
+	"testing"
+	"time"
+
+	"github.com/mark3labs/kit/internal/config"
+)
+
+// TestMCPToolManager_AddServer_DuplicateName verifies that a server which
+// failed to load is not reported as a duplicate when it is added again.
+func TestMCPToolManager_AddServer_DuplicateName(t *testing.T) {
+	manager := NewMCPToolManager()
+
+	cfg := config.MCPServerConfig{
+		Command: []string{"non-existent-command"},
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+
+	// Attempt to load "test-server" via LoadTools first. The load fails (bad
+	// command), so the name must not be registered as a loaded server.
+	loadCfg := &config.Config{
+		MCPServers: map[string]config.MCPServerConfig{
+			"test-server": cfg,
+		},
+	}
+	// This will fail to load but creates the connection pool.
+	_ = manager.LoadTools(ctx, loadCfg)
+
+	// Now try to add the same server name — the tools didn't load (bad command),
+	// so AddServer should not find a duplicate and should fail with connection error.
+	_, err := manager.AddServer(ctx, "test-server", cfg)
+	if err == nil {
+		t.Fatal("Expected error when adding server with bad command, got nil")
+	}
+	// It should be a connection error, not a duplicate error.
+	if strings.Contains(err.Error(), "already loaded") {
+		t.Fatalf("Should not report duplicate since server failed to load initially: %v", err)
+	}
+}
+
+// TestMCPToolManager_RemoveServer_NotLoaded verifies that removing a server
+// that doesn't exist returns an appropriate error.
+func TestMCPToolManager_RemoveServer_NotLoaded(t *testing.T) {
+	manager := NewMCPToolManager()
+
+	err := manager.RemoveServer("nonexistent")
+	if err == nil {
+		t.Fatal("Expected error when removing non-existent server, got nil")
+	}
+	if !strings.Contains(err.Error(), "not loaded") {
+		t.Errorf("Expected 'not loaded' error, got: %v", err)
+	}
+}
+
+// TestMCPToolManager_AddServer_CreatesConnectionPool verifies that AddServer
+// lazily creates a connection pool when LoadTools was never called.
+func TestMCPToolManager_AddServer_CreatesConnectionPool(t *testing.T) {
+	manager := NewMCPToolManager()
+
+	// Connection pool should be nil initially.
+	if manager.connectionPool != nil {
+		t.Fatal("Expected nil connection pool before any operation")
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+
+	// AddServer with a bad command — should fail, but the pool should be created.
+	_, err := manager.AddServer(ctx, "lazy-server", config.MCPServerConfig{
+		Command: []string{"non-existent-command"},
+	})
+	if err == nil {
+		t.Fatal("Expected error for bad command")
+	}
+
+	// Connection pool should have been created.
+	if manager.connectionPool == nil {
+		t.Fatal("Expected connection pool to be created lazily by AddServer")
+	}
+}
+
+// TestMCPToolManager_OnToolsChanged_Callback verifies that the onToolsChanged
+// callback does not fire when RemoveServer fails. The success path needs a
+// real MCP server, so it is covered by the integration tests.
+func TestMCPToolManager_OnToolsChanged_Callback(t *testing.T) {
+	manager := NewMCPToolManager()
+
+	var mu sync.Mutex
+	callCount := 0
+	manager.SetOnToolsChanged(func() {
+		mu.Lock()
+		callCount++
+		mu.Unlock()
+	})
+
+	// RemoveServer on non-existent should NOT fire callback.
+	_ = manager.RemoveServer("nonexistent")
+
+	mu.Lock()
+	if callCount != 0 {
+		t.Errorf("Expected 0 callback calls for failed remove, got %d", callCount)
+	}
+	mu.Unlock()
+}
+
+// TestMCPToolManager_Close_NilPool verifies Close is safe when the connection
+// pool was never initialized.
+func TestMCPToolManager_Close_NilPool(t *testing.T) {
+	manager := NewMCPToolManager()
+	err := manager.Close()
+	if err != nil {
+		t.Fatalf("Expected nil error from Close with nil pool, got: %v", err)
+	}
+}
+
+// TestMCPConnectionPool_RemoveConnection_NotFound verifies that removing a
+// non-existent connection returns an error.
+func TestMCPConnectionPool_RemoveConnection_NotFound(t *testing.T) {
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), nil, false, nil, nil)
+	defer func() { _ = pool.Close() }()
+
+	err := pool.RemoveConnection("nonexistent")
+	if err == nil {
+		t.Fatal("Expected error for non-existent connection")
+	}
+	if !strings.Contains(err.Error(), "not found") {
+		t.Errorf("Expected 'not found' error, got: %v", err)
+	}
+}
+
+// TestMCPToolManager_EnsureConnectionPool_Idempotent verifies that
+// ensureConnectionPool doesn't recreate an existing pool.
+func TestMCPToolManager_EnsureConnectionPool_Idempotent(t *testing.T) {
+	manager := NewMCPToolManager()
+
+	// First call creates the pool.
+	manager.ensureConnectionPool()
+	pool1 := manager.connectionPool
+	if pool1 == nil {
+		t.Fatal("Expected pool to be created")
+	}
+
+	// Second call should be a no-op.
+	manager.ensureConnectionPool()
+	pool2 := manager.connectionPool
+	if pool1 != pool2 {
+		t.Fatal("Expected ensureConnectionPool to be idempotent")
+	}
+}
@@ -6,6 +6,7 @@ import (
 	"net/url"

 	"github.com/mark3labs/mcp-go/client"
+	"github.com/mark3labs/mcp-go/client/transport"
 )

 // MCPAuthHandler is the internal interface for handling MCP OAuth flows.
@@ -21,6 +22,12 @@ type MCPAuthHandler interface {
 	HandleAuth(ctx context.Context, serverName string, authURL string) (callbackURL string, err error)
 }

+// TokenStoreFactory creates a transport.TokenStore for a given MCP server URL.
+// When provided to the connection pool, it is called once per remote MCP server
+// instead of using the default file-based token store. Implementations can
+// return any transport.TokenStore — in-memory, database-backed, encrypted, etc.
+type TokenStoreFactory func(serverURL string) (transport.TokenStore, error)
+
 // OAuthFlowRunner handles the OAuth authorization flow when an MCP server
 // returns an OAuthAuthorizationRequiredError. It coordinates dynamic client
 // registration, PKCE generation, user authorization (via MCPAuthHandler),
@@ -0,0 +1,111 @@
+#!/usr/bin/env python3
+"""Minimal MCP server over stdio for testing. Exposes two tools: echo and greet."""
+import json
+import sys
+
+
+def read_message():
+    """Read a JSON-RPC message from stdin."""
+    line = sys.stdin.readline()
+    if not line:
+        return None
+    return json.loads(line.strip())
+
+
+def write_message(msg):
+    """Write a JSON-RPC message to stdout."""
+    sys.stdout.write(json.dumps(msg) + "\n")
+    sys.stdout.flush()
+
+
+def handle(msg):
+    method = msg.get("method", "")
+    mid = msg.get("id")
+
+    if method == "initialize":
+        write_message({
+            "jsonrpc": "2.0",
+            "id": mid,
+            "result": {
+                "protocolVersion": "2024-11-05",
+                "capabilities": {"tools": {}},
+                "serverInfo": {"name": "test-echo", "version": "1.0.0"},
+            },
+        })
+    elif method == "notifications/initialized":
+        pass  # no response needed
+    elif method == "tools/list":
+        write_message({
+            "jsonrpc": "2.0",
+            "id": mid,
+            "result": {
+                "tools": [
+                    {
+                        "name": "echo",
+                        "description": "Echoes the input text back.",
+                        "inputSchema": {
+                            "type": "object",
+                            "properties": {
+                                "text": {"type": "string", "description": "Text to echo"}
+                            },
+                            "required": ["text"],
+                        },
+                    },
+                    {
+                        "name": "greet",
+                        "description": "Returns a greeting.",
+                        "inputSchema": {
+                            "type": "object",
+                            "properties": {
+                                "name": {"type": "string", "description": "Name to greet"}
+                            },
+                            "required": ["name"],
+                        },
+                    },
+                ]
+            },
+        })
+    elif method == "tools/call":
+        tool_name = msg["params"]["name"]
+        args = msg["params"].get("arguments", {})
+        if tool_name == "echo":
+            text = args.get("text", "")
+            write_message({
+                "jsonrpc": "2.0",
+                "id": mid,
+                "result": {
+                    "content": [{"type": "text", "text": text}]
+                },
+            })
+        elif tool_name == "greet":
+            name = args.get("name", "World")
+            write_message({
+                "jsonrpc": "2.0",
+                "id": mid,
+                "result": {
+                    "content": [{"type": "text", "text": f"Hello, {name}!"}]
+                },
+            })
+        else:
+            write_message({
+                "jsonrpc": "2.0",
+                "id": mid,
+                "error": {"code": -32601, "message": f"Unknown tool: {tool_name}"},
+            })
+    elif method == "ping":
+        write_message({"jsonrpc": "2.0", "id": mid, "result": {}})
+    else:
+        if mid is not None:
+            write_message({
+                "jsonrpc": "2.0",
+                "id": mid,
+                "error": {"code": -32601, "message": f"Unknown method: {method}"},
+            })
+
+
+if __name__ == "__main__":
+    while True:
+        msg = read_message()
+        if msg is None:
+            break
+        handle(msg)
@@ -5,7 +5,6 @@ import (
 	"os"
 	"time"

-	"charm.land/fantasy"
 	"charm.land/lipgloss/v2"
 	"golang.org/x/term"

@@ -173,33 +172,6 @@ func (c *CLI) DisplayDebugConfig(config map[string]any) {
 	fmt.Println(c.renderer.RenderDebugConfigMessage(config, time.Now()).Content)
 }

-// UpdateUsageFromResponse records token usage using metadata from the fantasy
-// response. Only actual API-reported tokens are used for cost tracking.
-// If the provider doesn't report token counts, no usage is recorded.
-func (c *CLI) UpdateUsageFromResponse(response *fantasy.Response, inputText string) {
-	if c.usageTracker == nil {
-		return
-	}
-
-	usage := response.Usage
-	inputTokens := int(usage.InputTokens)
-	outputTokens := int(usage.OutputTokens)
-
-	// Only use actual API-reported tokens for cost tracking.
-	// We intentionally do NOT estimate tokens - estimation is inaccurate
-	// and should never be used for cost calculations.
-	if inputTokens > 0 {
-		cacheReadTokens := int(usage.CacheReadTokens)
-		cacheWriteTokens := int(usage.CacheCreationTokens)
-		c.usageTracker.UpdateUsage(inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens)
-		// Per-response usage is a single API call, so it represents the
-		// actual context window fill level.
-		c.usageTracker.SetContextTokens(inputTokens + outputTokens)
-	}
-	// If inputTokens is 0, the provider didn't report usage - we skip recording
-	// rather than estimating, to ensure cost accuracy.
-}
-
 // DisplayUsageAfterResponse renders and displays token usage information immediately
 // following an AI response. This provides real-time feedback about the cost and
 // token consumption of each interaction.
@@ -20,6 +20,7 @@ type SlashCommand struct {
 	Aliases     []string
 	Category    string                       // e.g., "Navigation", "System", "Info"
 	Complete    func(prefix string) []string // optional argument tab-completion
+	HasArgs     bool                         // true when the command expects arguments (e.g. prompt templates with placeholders)
 }

 // SlashCommands provides the global registry of all available slash commands
@@ -139,7 +139,9 @@ func (h *CLIEventHandler) Handle(msg tea.Msg) {
 		case "block":
 			h.cli.DisplayExtensionBlock(e.Text, e.BorderColor, e.Subtitle)
 		default:
-			fmt.Println(e.Text)
+			// Route unstyled extension prints through the system message
+			// renderer so they get consistent formatting and timestamps.
+			h.cli.DisplayInfo(e.Text)
 		}

 	case app.StepCompleteEvent:
@@ -109,9 +109,7 @@ func SetupCLI(opts *CLISetupOptions) (*CLI, error) {
 		}
 	}

-	fmt.Println("")
-
-	// Display model info
+	// Display model info (the system message block provides its own spacing).
 	if provider != "unknown" && model != "unknown" {
 		cli.DisplayInfo(fmt.Sprintf("Model loaded: %s (%s)", provider, model))
 	}
@@ -6,6 +6,8 @@ import (
 	"path/filepath"
 	"regexp"
 	"strings"
+
+	"github.com/mark3labs/kit/internal/fences"
 )

 // fileTokenPattern matches @file references in user text. Supports:
@@ -20,6 +22,14 @@ var fileTokenPattern = regexp.MustCompile(`@"[^"]+"|@[^\s]+`)
 //
 // Returns the original text unchanged if no valid @file references are found.
 func ProcessFileAttachments(text string, cwd string) string {
+	return fences.ReplaceOutside(text, func(segment string) string {
+		return processFileTokens(segment, cwd)
+	})
+}
+
+// processFileTokens handles @file replacement in a single text segment
+// that is known to be outside code regions (fenced blocks and inline spans).
+func processFileTokens(text string, cwd string) string {
 	tokens := fileTokenPattern.FindAllString(text, -1)
 	if len(tokens) == 0 {
 		return text
@@ -69,7 +69,7 @@ type InputComponent struct {
 	hideHint bool

 	// agentBusy indicates the agent is currently working. When true, the
-	// hint text shows steering shortcut (Ctrl+S) instead of submit.
+	// hint text shows steering shortcut (Ctrl+X s) instead of submit.
 	agentBusy bool

 	// pendingImages holds clipboard images attached to the next submission.
@@ -109,7 +109,7 @@ func NewInputComponent(width int, title string, appCtrl AppController) *InputCom
 	ta.Placeholder = "Type your message..."
 	ta.ShowLineNumbers = false
 	ta.Prompt = ""
-	ta.CharLimit = 5000
+	ta.CharLimit = 0
 	ta.SetWidth(width - 8) // Account for container padding, border and internal padding
 	ta.SetHeight(3)        // Default to 3 lines like huh
 	ta.Focus()
@@ -285,16 +285,25 @@ func (s *InputComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 						s.textarea.CursorEnd()
 						return s, nil
 					}
+					selectedCmd := s.filtered[s.selected].Command
 					// Populate textarea with selected item and submit on next tick.
 					if s.argMode {
-						s.textarea.SetValue(s.argCommand + " " + s.filtered[s.selected].Command.Name)
+						s.textarea.SetValue(s.argCommand + " " + selectedCmd.Name)
 					} else {
-						s.textarea.SetValue(s.filtered[s.selected].Command.Name)
+						s.textarea.SetValue(selectedCmd.Name)
 					}
 					s.textarea.CursorEnd()
 					s.showPopup = false
 					s.selected = 0
-					s.submitNext = true
+					// If the selected command expects arguments, populate
+					// the input with the command + trailing space so the
+					// user can type args, instead of auto-submitting.
+					if !s.argMode && selectedCmd.HasArgs {
+						s.textarea.SetValue(selectedCmd.Name + " ")
+						s.textarea.CursorEnd()
+					} else {
+						s.submitNext = true
+					}
 					return s, nil
 				}
 				return s, nil
@@ -514,19 +523,21 @@ func (s *InputComponent) View() tea.View {
 		availableHintWidth := s.width - 3
 		if s.agentBusy {
 			// When the agent is working, show steering shortcut.
-			if availableHintWidth >= 55 {
-				hint = "enter queue • ctrl+s steer • esc esc cancel"
-			} else if availableHintWidth >= 35 {
-				hint = "↵ queue • ^S steer • esc×2 cancel"
+			if availableHintWidth >= 60 {
+				hint = "enter queue • ctrl+x s steer • esc esc cancel"
+			} else if availableHintWidth >= 40 {
+				hint = "↵ queue • ^X s steer • esc×2 cancel"
 			} else {
-				hint = "^S steer"
+				hint = "^X s steer"
 			}
+		} else if availableHintWidth >= 80 {
+			hint = "enter submit • ctrl+j / shift+enter new line • ctrl+x e editor • ctrl+v paste image"
 		} else if availableHintWidth >= 67 {
-			hint = "enter submit • ctrl+j / shift+enter new line • ctrl+v paste image"
+			hint = "enter submit • ctrl+j new line • ctrl+x e editor • ctrl+v image"
 		} else if availableHintWidth >= 40 {
-			hint = "↵ submit • ctrl+j newline • ctrl+v image"
+			hint = "↵ submit • ctrl+j newline • ^X e editor"
 		} else if availableHintWidth >= 20 {
-			hint = "↵ submit • ctrl+j"
+			hint = "↵ submit • ^X e editor"
 		} else {
 			hint = "↵ submit"
 		}
@@ -152,7 +152,7 @@ func (r *MessageRenderer) SetWidth(width int) {

 // RenderUserMessage renders a user's input message using herald Tip alert
 func (r *MessageRenderer) RenderUserMessage(content string, timestamp time.Time) UIMessage {
-	rendered := render.UserBlock(content, r.ty, style.GetTheme())
+	rendered := render.UserBlock(content, r.width, r.ty, style.GetTheme())

 	return UIMessage{
 		Type:      UserMessage,
@@ -12,6 +12,7 @@ import (

 	tea "charm.land/bubbletea/v2"
 	"charm.land/lipgloss/v2"
+	"github.com/charmbracelet/x/editor"
 	"github.com/spf13/viper"

 	"github.com/mark3labs/kit/internal/app"
@@ -281,6 +282,16 @@ type AppModelOptions struct {
 	// ToolNames holds available tool names for the /tools command.
 	ToolNames []string

+	// GetToolNames, if non-nil, returns the current tool names. Called on
+	// MCPToolsReadyEvent to refresh the tool list after background MCP tool
+	// loading completes. May be nil if dynamic tool refresh is not needed.
+	GetToolNames func() []string
+
+	// GetMCPToolCount, if non-nil, returns the current MCP tool count.
+	// Called on MCPToolsReadyEvent to refresh the startup info bar.
+	// May be nil if dynamic tool refresh is not needed.
+	GetMCPToolCount func() int
+
 	// UsageTracker provides token usage statistics for /usage and /reset-usage.
 	// May be nil if usage tracking is unavailable for the current model.
 	UsageTracker *UsageTracker
@@ -294,6 +305,11 @@ type AppModelOptions struct {
 	// and are expanded when submitted (e.g., /review → full prompt text).
 	PromptTemplates []*prompts.PromptTemplate

+	// GetPromptTemplates, if non-nil, returns the current prompt templates.
+	// Called on ContentReloadEvent to refresh the template list after a file
+	// watcher detects changes. May be nil if prompt hot-reload is not needed.
+	GetPromptTemplates func() []*prompts.PromptTemplate
+
 	// ContextPaths lists absolute paths of loaded context files (e.g.
 	// AGENTS.md). Displayed in the [Context] startup section.
 	ContextPaths []string
@@ -301,6 +317,11 @@ type AppModelOptions struct {
 	// SkillItems lists loaded skills for the [Skills] startup section.
 	SkillItems []SkillItem

+	// GetSkillItems, if non-nil, returns the current skill items.
+	// Called on ContentReloadEvent to refresh the skill list after a file
+	// watcher detects changes. May be nil if skill hot-reload is not needed.
+	GetSkillItems func() []SkillItem
+
 	// MCPToolCount is the number of tools loaded from external MCP servers.
 	MCPToolCount int

@@ -457,7 +478,7 @@ type AppModel struct {
 	queuedMessages []string

 	// steeringMessages stores the text of prompts that were sent as steer
-	// messages (injected mid-turn via Ctrl+S). Rendered with a "STEERING"
+	// messages (injected mid-turn via Ctrl+X s). Rendered with a "STEERING"
 	// badge above the input. Cleared when the steer is consumed.
 	steeringMessages []string

@@ -478,6 +499,11 @@ type AppModel struct {
 	// A second ESC within 2 seconds will cancel the current step.
 	canceling bool

+	// leaderKeyActive tracks whether the Ctrl+X leader key prefix has been
+	// pressed. The next keypress is interpreted as a chord suffix (e.g. "s"
+	// for steer). Cleared on any subsequent keypress.
+	leaderKeyActive bool
+
 	// providerName is the LLM provider for the startup message.
 	providerName string

@@ -485,8 +511,12 @@ type AppModel struct {
 	loadingMessage string

 	// serverNames, toolNames are used by /servers and /tools commands.
-	serverNames []string
-	toolNames   []string
+	serverNames  []string
+	toolNames    []string
+	getToolNames func() []string // dynamic tool name provider (for MCP refresh)
+
+	// getMCPToolCount returns the current MCP tool count dynamically.
+	getMCPToolCount func() int

 	// usageTracker provides token usage stats for /usage and /reset-usage.
 	// May be nil when usage tracking is unavailable.
@@ -500,6 +530,10 @@ type AppModel struct {
 	// They appear in autocomplete and are expanded when submitted.
 	promptTemplates []*prompts.PromptTemplate

+	// getPromptTemplates returns the current prompt templates. Used to
+	// refresh the template list after content hot-reload. May be nil.
+	getPromptTemplates func() []*prompts.PromptTemplate
+
 	// treeSelector is the tree navigation overlay, active in stateTreeSelector.
 	treeSelector *TreeSelectorComponent

@@ -508,6 +542,10 @@ type AppModel struct {
 	contextPaths []string
 	skillItems   []SkillItem

+	// getSkillItems returns the current skill items. Used to refresh the
+	// skill list after content hot-reload. May be nil.
+	getSkillItems func() []SkillItem
+
 	// mcpToolCount and extensionToolCount track tool counts by source for
 	// the startup info display.
 	mcpToolCount       int
@@ -704,23 +742,26 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 	rdr := mr

 	m := &AppModel{
-		state:          stateInput,
-		appCtrl:        appCtrl,
-		renderer:       rdr,
-		modelName:      opts.ModelName,
-		providerName:   opts.ProviderName,
-		loadingMessage: opts.LoadingMessage,
-		serverNames:    opts.ServerNames,
-		toolNames:      opts.ToolNames,
-		usageTracker:   opts.UsageTracker,
-		cwd:            opts.Cwd,
-		width:          width,
-		height:         height,
+		state:           stateInput,
+		appCtrl:         appCtrl,
+		renderer:        rdr,
+		modelName:       opts.ModelName,
+		providerName:    opts.ProviderName,
+		loadingMessage:  opts.LoadingMessage,
+		serverNames:     opts.ServerNames,
+		toolNames:       opts.ToolNames,
+		getToolNames:    opts.GetToolNames,
+		getMCPToolCount: opts.GetMCPToolCount,
+		usageTracker:    opts.UsageTracker,
+		cwd:             opts.Cwd,
+		width:           width,
+		height:          height,
 	}

 	// Store extension commands for dispatch.
 	m.extensionCommands = opts.ExtensionCommands
 	m.promptTemplates = opts.PromptTemplates
+	m.getPromptTemplates = opts.GetPromptTemplates
 	m.getWidgets = opts.GetWidgets
 	m.getHeader = opts.GetHeader
 	m.getFooter = opts.GetFooter
@@ -746,6 +787,7 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 	// Store context/skills metadata and tool counts for startup display.
 	m.contextPaths = opts.ContextPaths
 	m.skillItems = opts.SkillItems
+	m.getSkillItems = opts.GetSkillItems
 	m.mcpToolCount = opts.MCPToolCount
 	m.extensionToolCount = opts.ExtensionToolCount
 	m.startupExtensionMessages = opts.StartupExtensionMessages
@@ -785,6 +827,7 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 				Name:        "/" + tpl.Name,
 				Description: tpl.Description,
 				Category:    "Prompts",
+				HasArgs:     tpl.HasArgPlaceholders(),
 			})
 		}
 	}
@@ -1232,6 +1275,110 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			return m, tea.Batch(cmds...)
 		}

+		// ── Leader key chord handling (Ctrl+X prefix) ──────────────
+		// If the leader key was previously pressed, the current key
+		// completes the chord. We consume it regardless of match so
+		// the prefix doesn't leak to child components.
+		if m.leaderKeyActive {
+			m.leaderKeyActive = false
+			switch msg.String() {
+			case "s":
+				// Ctrl+X s → Steer: inject the current input as a steering
+				// message into the running agent turn.
+				if m.state == stateWorking && m.appCtrl != nil {
+					var text string
+					if ic, ok := m.input.(*InputComponent); ok {
+						text = strings.TrimSpace(ic.textarea.Value())
+					}
+					if text != "" {
+						// Clear the input, collect pending images, and push to history.
+						var images []uicore.ImageAttachment
+						if ic, ok := m.input.(*InputComponent); ok {
+							ic.pushHistory(text)
+							ic.textarea.SetValue("")
+							images = ic.ClearPendingImages()
+						}
+
+						// Preprocess @file references.
+						processedText := text
+						if m.cwd != "" {
+							processedText = fileutil.ProcessFileAttachments(text, m.cwd)
+						}
+
+						// Convert image attachments to kit.LLMFilePart for the app layer.
+						var fileParts []kit.LLMFilePart
+						for _, img := range images {
+							fileParts = append(fileParts, kit.LLMFilePart{
+								Data:      img.Data,
+								MediaType: img.MediaType,
+							})
+						}
+
+						// Build display text (include image count if any).
+						displayText := text
+						if len(images) > 0 {
+							displayText = fmt.Sprintf("%s\n[%d image(s) attached]", text, len(images))
+						}
+
+						// Inject the steer message.
+						sLen := m.appCtrl.SteerWithFiles(processedText, fileParts)
+						if sLen > 0 {
+							m.steeringMessages = append(m.steeringMessages, displayText)
+							m.layoutDirty = true
+						} else {
+							// Started immediately (agent was idle).
+							m.pendingUserPrints = append(m.pendingUserPrints, displayText)
+							m.flushStreamAndPendingUserMessages()
+							if m.state != stateWorking {
+								m.state = stateWorking
+							}
+						}
+					}
+				}
+			case "e":
+				// Ctrl+X e → open $VISUAL (or $EDITOR) to compose/edit the prompt.
+				editorApp := os.Getenv("VISUAL")
+				if editorApp == "" {
+					editorApp = os.Getenv("EDITOR")
+				}
+				if editorApp == "" {
+					m.printSystemMessage("Set `$EDITOR` or `$VISUAL` to use external editor")
+				} else {
+					var currentText string
+					if ic, ok := m.input.(*InputComponent); ok {
+						currentText = ic.textarea.Value()
+					}
+					tmpFile, err := os.CreateTemp("", "kit_prompt_*.md")
+					if err == nil {
+						if currentText != "" {
+							_, _ = tmpFile.WriteString(currentText)
+						}
+						_ = tmpFile.Close()
+						editorCmd, cmdErr := editor.Command(editorApp, tmpFile.Name())
+						if cmdErr != nil {
+							_ = os.Remove(tmpFile.Name())
+							m.printSystemMessage(fmt.Sprintf("Failed to open editor: %v", cmdErr))
+						} else {
+							cmds = append(cmds, tea.ExecProcess(editorCmd, func(err error) tea.Msg {
+								if err != nil {
+									_ = os.Remove(tmpFile.Name())
+									return externalEditorMsg{err: err}
+								}
+								content, readErr := os.ReadFile(tmpFile.Name())
+								_ = os.Remove(tmpFile.Name())
+								if readErr != nil {
+									return externalEditorMsg{err: readErr}
+								}
+								return externalEditorMsg{text: string(content)}
+							}))
+						}
+					}
+				}
+			}
+			// Chord consumed — don't propagate to children.
+			return m, tea.Batch(cmds...)
+		}
+
 		switch msg.String() {
 		case "esc":
 			if m.state == stateWorking {
@@ -1250,61 +1397,10 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			}
 			// In other states pass ESC through to children below.

-		case "ctrl+s":
-			// Steer: inject the current input as a steering message into the
-			// running agent turn. Only active during stateWorking — in input
-			// state, Ctrl+S is passed through to children (no-op by default).
-			if m.state == stateWorking && m.appCtrl != nil {
-				var text string
-				if ic, ok := m.input.(*InputComponent); ok {
-					text = strings.TrimSpace(ic.textarea.Value())
-				}
-				if text != "" {
-					// Clear the input, collect pending images, and push to history.
-					var images []uicore.ImageAttachment
-					if ic, ok := m.input.(*InputComponent); ok {
-						ic.pushHistory(text)
-						ic.textarea.SetValue("")
-						images = ic.ClearPendingImages()
-					}
-
-					// Preprocess @file references.
-					processedText := text
-					if m.cwd != "" {
-						processedText = fileutil.ProcessFileAttachments(text, m.cwd)
-					}
-
-					// Convert image attachments to kit.LLMFilePart for the app layer.
-					var fileParts []kit.LLMFilePart
-					for _, img := range images {
-						fileParts = append(fileParts, kit.LLMFilePart{
-							Data:      img.Data,
-							MediaType: img.MediaType,
-						})
-					}
-
-					// Build display text (include image count if any).
-					displayText := text
-					if len(images) > 0 {
-						displayText = fmt.Sprintf("%s\n[%d image(s) attached]", text, len(images))
-					}
-
-					// Inject the steer message.
-					sLen := m.appCtrl.SteerWithFiles(processedText, fileParts)
-					if sLen > 0 {
-						m.steeringMessages = append(m.steeringMessages, displayText)
-						m.layoutDirty = true
-					} else {
-						// Started immediately (agent was idle).
-						m.pendingUserPrints = append(m.pendingUserPrints, displayText)
-						m.flushStreamAndPendingUserMessages()
-						if m.state != stateWorking {
-							m.state = stateWorking
-						}
-					}
-				}
-				return m, tea.Batch(cmds...)
-			}
+		case "ctrl+x":
+			// Activate leader key prefix — the next keypress completes the chord.
+			m.leaderKeyActive = true
+			return m, tea.Batch(cmds...)
 		}

 		// Route key events to the focused child. Check for editor
@@ -1389,7 +1485,15 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {

 		// Expand prompt templates. If the input matches a template name,
 		// substitute arguments and use the expanded content as the prompt.
-		if expanded, ok := m.expandPromptTemplate(msg.Text); ok {
+		if expanded, ok, validationErr := m.expandPromptTemplate(msg.Text); validationErr != "" {
+			// Validation failed — re-populate the input so the user can
+			// append the missing arguments without retyping.
+			if ic, ok := m.input.(*InputComponent); ok {
+				ic.textarea.SetValue(msg.Text + " ")
+				ic.textarea.CursorEnd()
+			}
+			return m, tea.Batch(cmds...)
+		} else if ok {
 			msg.Text = expanded
 		}

@@ -1700,6 +1804,13 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			m.stream, _ = updated.(streamComponentIface)
 			cmds = append(cmds, cmd)
 		}
+		// Mark any trailing StreamingMessageItem as complete so its live
+		// timer freezes and it is not left in a dangling streaming state.
+		if len(m.messages) > 0 {
+			if streamMsg, ok := m.messages[len(m.messages)-1].(*StreamingMessageItem); ok {
+				streamMsg.MarkComplete()
+			}
+		}
 		m.state = stateInput
 		m.canceling = false

@@ -1711,6 +1822,12 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			m.stream, _ = updated.(streamComponentIface)
 			cmds = append(cmds, cmd)
 		}
+		// Mark any trailing StreamingMessageItem as complete (see StepCompleteEvent).
+		if len(m.messages) > 0 {
+			if streamMsg, ok := m.messages[len(m.messages)-1].(*StreamingMessageItem); ok {
+				streamMsg.MarkComplete()
+			}
+		}
 		m.state = stateInput
 		m.canceling = false

@@ -1723,6 +1840,12 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			m.stream, _ = updated.(streamComponentIface)
 			cmds = append(cmds, cmd)
 		}
+		// Mark any trailing StreamingMessageItem as complete (see StepCompleteEvent).
+		if len(m.messages) > 0 {
+			if streamMsg, ok := m.messages[len(m.messages)-1].(*StreamingMessageItem); ok {
+				streamMsg.MarkComplete()
+			}
+		}
 		if msg.Err != nil {
 			m.printErrorResponse(msg)
 		}
@@ -1746,6 +1869,12 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		// Refresh content to show the finalized message.
 		m.refreshContent()

+		// Reset context token display — the pre-compaction count is stale.
+		// The next API call will set the accurate post-compaction value.
+		if m.usageTracker != nil {
+			m.usageTracker.SetContextTokens(0)
+		}
+
 		// Print stats as a separate system message.
 		saved := msg.OriginalTokens - msg.CompactedTokens
 		statsMsg := fmt.Sprintf(
@@ -1798,6 +1927,27 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			}
 		}

+	case app.ContentReloadEvent:
+		// Prompt templates or skills changed on disk — refresh from providers.
+		m.refreshPromptTemplates()
+		m.refreshSkillItems()
+		m.printSystemMessage("Prompts and skills reloaded.")
+
+	case app.MCPToolsReadyEvent:
+		// Background MCP tool loading completed — refresh tool names and count.
+		m.refreshToolNames()
+		m.refreshMCPToolCount()
+
+	case app.MCPServerLoadedEvent:
+		// A single MCP server finished loading — display a system message.
+		if msg.Error != nil {
+			m.printSystemMessage(fmt.Sprintf("MCP server '%s' failed to load: %v", msg.ServerName, msg.Error))
+		} else if msg.ToolCount > 0 {
+			m.printSystemMessage(fmt.Sprintf("MCP server '%s' loaded with %d tools", msg.ServerName, msg.ToolCount))
+		} else {
+			m.printSystemMessage(fmt.Sprintf("MCP server '%s' loaded (no tools)", msg.ServerName))
+		}
+
 	case app.EditorTextSetEvent:
 		// Extension wants to pre-fill the input editor with text.
 		if ic, ok := m.input.(*InputComponent); ok {
@@ -1872,6 +2022,19 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			m.printSystemMessage(msg.output)
 		}

+	case externalEditorMsg:
+		// User returned from $EDITOR. Replace input textarea content with
+		// whatever they saved in the temp file. On error (e.g. :cq in vim)
+		// the original input is silently preserved.
+		if msg.err == nil {
+			if ic, ok := m.input.(*InputComponent); ok {
+				ic.textarea.SetValue(msg.text)
+				// Move cursor to the end of the inserted text.
+				ic.textarea.CursorEnd()
+			}
+			m.layoutDirty = true
+		}
+
 	case extReloadResultMsg:
 		if msg.err != nil {
 			m.printSystemMessage(fmt.Sprintf("Extension reload failed: %v", msg.err))
@@ -2386,22 +2549,34 @@ func (m *AppModel) renderHeaderFooter(getter func() *WidgetData) string {
 	return renderContentBlock(data.Text, m.width, opts...)
 }

+// maxQueuedMessageLines is the maximum number of visible content lines
+// rendered for each queued or steering message block. Messages exceeding
+// this limit are truncated with an ellipsis to prevent large pastes from
+// overflowing the screen and squeezing the stream region to zero.
+const maxQueuedMessageLines = 3
+
 // renderQueuedMessages renders queued and steering prompts as styled content
 // blocks with badges, anchored between the separator and input. Steering
 // messages use a distinct "STEERING" badge to differentiate from queued ones.
+// Long messages are visually truncated to maxQueuedMessageLines.
 func (m *AppModel) renderQueuedMessages() string {
 	if len(m.queuedMessages) == 0 && len(m.steeringMessages) == 0 {
 		return ""
 	}
 	theme := style.GetTheme()

+	// Available content width inside the block: container minus border (1)
+	// minus left padding (2). Used to estimate line wrapping for truncation.
+	contentWidth := max(m.width-3, 10)
+
 	var blocks []string

 	// Render steering messages first (higher priority).
 	if len(m.steeringMessages) > 0 {
 		badge := style.CreateBadge("STEERING", theme.Warning)
 		for _, msg := range m.steeringMessages {
-			content := msg + "\n" + badge
+			display := truncateMessageForBlock(msg, maxQueuedMessageLines, contentWidth)
+			content := display + "\n" + badge
 			rendered := renderContentBlock(
 				content,
 				m.width,
@@ -2416,7 +2591,8 @@ func (m *AppModel) renderQueuedMessages() string {
 	if len(m.queuedMessages) > 0 {
 		badge := style.CreateBadge("QUEUED", theme.Accent)
 		for _, msg := range m.queuedMessages {
-			content := msg + "\n" + badge
+			display := truncateMessageForBlock(msg, maxQueuedMessageLines, contentWidth)
+			content := display + "\n" + badge
 			rendered := renderContentBlock(
 				content,
 				m.width,
@@ -2430,6 +2606,58 @@ func (m *AppModel) renderQueuedMessages() string {
 	return strings.Join(blocks, "\n")
 }

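+// truncateMessageForBlock truncates a message to at most maxLines visible
+// lines, accounting for soft-wrapping at the given width. If the message is
+// truncated, an ellipsis ("…") is appended to the last kept line. Wrapping is
+// measured in display columns, but a partial line is cut by rune count, so
+// double-width runes can still overflow the limit.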
+func truncateMessageForBlock(msg string, maxLines, width int) string {
+	if width <= 0 {
+		width = 1
+	}
+
+	lines := strings.Split(msg, "\n")
+
+	// Count visible lines (each hard line may wrap into multiple visual lines).
+	var kept []string
+	visibleCount := 0
+	truncated := false
+
+	for _, line := range lines {
+		// Calculate how many visual lines this hard line occupies.
+		lineWidth := lipgloss.Width(line)
+		wrapped := 1
+		if lineWidth > width {
+			wrapped = (lineWidth + width - 1) / width // ceil division
+		}
+
+		if visibleCount+wrapped > maxLines {
+			// This line would exceed the limit. Keep a partial if we
+			// still have room for at least one more visual line.
+			remaining := maxLines - visibleCount
+			if remaining > 0 {
+				// Truncate the line to fit the remaining visual lines.
+				runes := []rune(line)
+				maxRunes := remaining * width
+				if maxRunes < len(runes) {
+					kept = append(kept, string(runes[:maxRunes]))
+				} else {
+					kept = append(kept, line)
+				}
+			}
+			truncated = true
+			break
+		}
+
+		kept = append(kept, line)
+		visibleCount += wrapped
+	}
+
+	if !truncated {
+		return msg
+	}
+
+	return strings.Join(kept, "\n") + "…"
+}
+
 // --------------------------------------------------------------------------
 // Print helpers — add content to ScrollList
 // --------------------------------------------------------------------------
@@ -2666,15 +2894,20 @@ func (m *AppModel) handleExtensionCommand(text string) tea.Cmd {

 // expandPromptTemplate checks if the submitted text matches a prompt template
 // and returns the expanded content with arguments substituted.
-// Returns (expanded, true) if a template was found and expanded, (text, false) otherwise.
-func (m *AppModel) expandPromptTemplate(text string) (string, bool) {
+//
+// Return values:
+//   - (expanded, true, "") — template matched and expanded successfully
+//   - (text, false, "")   — no template matched; caller should treat text as-is
+//   - ("", false, reason) — template matched but validation failed; reason
+//     contains a user-facing error message (already printed to ScrollList)
+func (m *AppModel) expandPromptTemplate(text string) (string, bool, string) {
 	if len(m.promptTemplates) == 0 {
-		return text, false
+		return text, false, ""
 	}

 	// Only consider inputs that look like slash commands.
 	if !strings.HasPrefix(text, "/") {
-		return text, false
+		return text, false, ""
 	}

 	// Split: "/templatename arg1 arg2" → name="/templatename", args="arg1 arg2"
@@ -2684,11 +2917,80 @@ func (m *AppModel) expandPromptTemplate(text string) (string, bool) {
 	// Find matching template
 	for _, tpl := range m.promptTemplates {
 		if tpl.Name == name {
-			return tpl.Expand(args), true
+			// Validate that enough positional arguments were provided.
+			required := tpl.RequiredArgs()
+			if required > 0 {
+				provided := len(prompts.ParseCommandArgs(args))
+				if provided < required {
+					reason := fmt.Sprintf(
+						"/%s requires %d argument(s), got %d",
+						name, required, provided,
+					)
+					m.printSystemMessage(reason)
+					return "", false, reason
+				}
+			}
+			return tpl.Expand(args), true, ""
 		}
 	}

-	return text, false
+	return text, false, ""
+}
+
+// refreshPromptTemplates reloads prompt templates from the provider callback
+// and updates the autocomplete entries. Called on ContentReloadEvent.
+func (m *AppModel) refreshPromptTemplates() {
+	if m.getPromptTemplates == nil {
+		return
+	}
+	newTemplates := m.getPromptTemplates()
+	m.promptTemplates = newTemplates
+
+	if ic, ok := m.input.(*InputComponent); ok {
+		// Remove old prompt commands and add fresh ones.
+		var kept []commands.SlashCommand
+		for _, sc := range ic.commands {
+			if sc.Category != "Prompts" {
+				kept = append(kept, sc)
+			}
+		}
+		for _, tpl := range newTemplates {
+			kept = append(kept, commands.SlashCommand{
+				Name:        "/" + tpl.Name,
+				Description: tpl.Description,
+				Category:    "Prompts",
+				HasArgs:     tpl.HasArgPlaceholders(),
+			})
+		}
+		ic.commands = kept
+	}
+}
+
+// refreshSkillItems reloads skill items from the provider callback.
+// Called on ContentReloadEvent.
+func (m *AppModel) refreshSkillItems() {
+	if m.getSkillItems == nil {
+		return
+	}
+	m.skillItems = m.getSkillItems()
+}
+
+// refreshToolNames reloads tool names from the provider callback.
+// Called on MCPToolsReadyEvent when background MCP tool loading completes.
+func (m *AppModel) refreshToolNames() {
+	if m.getToolNames == nil {
+		return
+	}
+	m.toolNames = m.getToolNames()
+}
+
+// refreshMCPToolCount reloads the MCP tool count from the provider callback.
+// Called on MCPToolsReadyEvent when background MCP tool loading completes.
+func (m *AppModel) refreshMCPToolCount() {
+	if m.getMCPToolCount == nil {
+		return
+	}
+	m.mcpToolCount = m.getMCPToolCount()
 }

 // printHelpMessage renders the help text listing all available slash commands.
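Review note on the argument check in expandPromptTemplate: the validation compares the number of parsed arguments against the template's required count and produces a user-facing reason on failure. A minimal sketch of that check in isolation follows. The helper checkArgs is hypothetical, and strings.Fields is only a stand-in for prompts.ParseCommandArgs, whose quoting rules are not visible in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// checkArgs returns a non-empty, user-facing reason when fewer than
// required arguments were supplied, and "" when the count is sufficient.
// Assumption: strings.Fields approximates prompts.ParseCommandArgs.
func checkArgs(name, args string, required int) string {
	provided := len(strings.Fields(args))
	if provided < required {
		return fmt.Sprintf("/%s requires %d argument(s), got %d", name, required, provided)
	}
	return ""
}

func main() {
	// No arguments for a template that needs one: validation fails.
	fmt.Println(checkArgs("file-issue", "", 1))
	// Enough arguments: the reason is empty and the template may expand.
	fmt.Println(checkArgs("file-issue", "crash on start", 1) == "")
}
```

Returning the reason as a string lets the caller distinguish "no template matched" from "template matched but arguments are missing", which is what the three-value return in the diff encodes.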
@@ -2745,7 +3047,8 @@ func (m *AppModel) printHelpMessage() {
 		"**Keys:**\n" +
 		"- `Ctrl+C`: Exit at any time\n" +
 		"- `ESC` (x2): Cancel ongoing LLM generation\n" +
-		"- `Ctrl+S`: Steer — redirect the agent mid-turn (injected between tool calls)\n" +
+		"- `Ctrl+X s`: Steer — redirect the agent mid-turn (injected between tool calls)\n" +
+		"- `Ctrl+X e`: Open `$EDITOR` to compose/edit your prompt\n" +
 		"- `Enter` (while working): Queue message for after the agent finishes\n\n" +
 		"You can also just type your message to chat with the AI assistant."
 	m.printSystemMessage(help)
@@ -2886,12 +3189,27 @@ func (m *AppModel) flushStreamAndPendingUserMessages() {
 		if content := m.stream.GetRenderedContent(); content != "" {
 			m.stream.Reset()

-			// Render styled content using MessageRenderer
-			styledMsg := m.renderer.RenderAssistantMessage(content, time.Now(), m.modelName)
+			// Check whether the content is already in the ScrollList as a
+			// StreamingMessageItem (created by appendStreamingChunk during
+			// ReasoningChunkEvent / StreamChunkEvent). If so, just mark it
+			// complete — creating a second StyledMessageItem would duplicate
+			// the rendered block and shift mouse hit-testing coordinates.
+			alreadyInList := false
+			if len(m.messages) > 0 {
+				if streamMsg, ok := m.messages[len(m.messages)-1].(*StreamingMessageItem); ok {
+					streamMsg.MarkComplete()
+					alreadyInList = true
+				}
+			}

-			// Add to in-memory scrollList with styled content
-			msg := NewStyledMessageItem(generateMessageID(), "assistant", content, styledMsg.Content)
-			m.messages = append(m.messages, msg)
+			if !alreadyInList {
+				// Render styled content using MessageRenderer
+				styledMsg := m.renderer.RenderAssistantMessage(content, time.Now(), m.modelName)
+
+				// Add to in-memory scrollList with styled content
+				msg := NewStyledMessageItem(generateMessageID(), "assistant", content, styledMsg.Content)
+				m.messages = append(m.messages, msg)
+			}
 		}
 	}

@@ -3818,6 +4136,13 @@ func cancelTimerCmd() tea.Cmd {
 // Interactive prompt support
 // --------------------------------------------------------------------------

+// externalEditorMsg is sent when the user returns from $EDITOR after
+// composing a prompt via the Ctrl+X e chord.
+type externalEditorMsg struct {
+	text string
+	err  error
+}
+
 // shareResultMsg carries the result of an async gist upload.
 type shareResultMsg struct {
 	err       error
@@ -2,6 +2,7 @@ package ui

 import (
 	"errors"
+	"strings"
 	"testing"

 	tea "charm.land/bubbletea/v2"
@@ -892,3 +893,107 @@ func TestSubmit_duringWorking_stays(t *testing.T) {
 		t.Fatalf("expected Run('queued prompt') called, got %v", ctrl.runCalls)
 	}
 }
+
+// --------------------------------------------------------------------------
+// truncateMessageForBlock
+// --------------------------------------------------------------------------
+
+// TestTruncateMessageForBlock_shortMessage verifies that short messages are
+// returned unchanged.
+func TestTruncateMessageForBlock_shortMessage(t *testing.T) {
+	msg := "hello world"
+	got := truncateMessageForBlock(msg, 3, 80)
+	if got != msg {
+		t.Fatalf("expected unchanged message, got %q", got)
+	}
+}
+
+// TestTruncateMessageForBlock_exactLines verifies that a message with exactly
+// maxLines hard lines is returned unchanged.
+func TestTruncateMessageForBlock_exactLines(t *testing.T) {
+	msg := "line1\nline2\nline3"
+	got := truncateMessageForBlock(msg, 3, 80)
+	if got != msg {
+		t.Fatalf("expected unchanged message, got %q", got)
+	}
+}
+
+// TestTruncateMessageForBlock_tooManyLines verifies that messages exceeding
+// maxLines are truncated with an ellipsis.
+func TestTruncateMessageForBlock_tooManyLines(t *testing.T) {
+	msg := "line1\nline2\nline3\nline4\nline5"
+	got := truncateMessageForBlock(msg, 3, 80)
+	want := "line1\nline2\nline3…"
+	if got != want {
+		t.Fatalf("expected %q, got %q", want, got)
+	}
+}
+
+// TestTruncateMessageForBlock_longWrappingLine verifies that a single long
+// line that would wrap beyond maxLines is truncated.
+func TestTruncateMessageForBlock_longWrappingLine(t *testing.T) {
+	// 100 chars at width 20 = 5 visual lines, exceeds maxLines=3
+	msg := "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
+	got := truncateMessageForBlock(msg, 3, 20)
+	// Should be truncated to 3*20=60 runes + "…"
+	if len([]rune(got)) != 61 { // 60 runes + "…"
+		t.Fatalf("expected 61 runes (60 + ellipsis), got %d runes: %q", len([]rune(got)), got)
+	}
+	if !strings.HasSuffix(got, "…") {
+		t.Fatal("expected trailing ellipsis")
+	}
+}
+
+// TestTruncateMessageForBlock_emptyMessage verifies that empty messages are
+// returned unchanged.
+func TestTruncateMessageForBlock_emptyMessage(t *testing.T) {
+	got := truncateMessageForBlock("", 3, 80)
+	if got != "" {
+		t.Fatalf("expected empty string, got %q", got)
+	}
+}
+
+// TestTruncateMessageForBlock_mixedWrapAndHardLines verifies truncation when
+// some hard lines wrap and the total exceeds maxLines.
+func TestTruncateMessageForBlock_mixedWrapAndHardLines(t *testing.T) {
+	// First line: 40 chars at width 20 = 2 visual lines
+	// Second line: "short" = 1 visual line (total: 3, exactly at limit)
+	// Third line: would exceed
+	msg := "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\nshort\nextra"
+	got := truncateMessageForBlock(msg, 3, 20)
+	want := "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\nshort…"
+	if got != want {
+		t.Fatalf("expected %q, got %q", want, got)
+	}
+}
+
+// TestRenderQueuedMessages_truncatesLongMessages verifies that the rendered
+// queued message view truncates long messages instead of showing them in full.
+func TestRenderQueuedMessages_truncatesLongMessages(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.width = 80
+
+	// Queue a very long message (20 lines).
+	var b strings.Builder
+	for i := range 20 {
+		if i > 0 {
+			b.WriteByte('\n')
+		}
+		b.WriteString("This is a long line of text for testing purposes")
+	}
+	m.queuedMessages = []string{b.String()}
+
+	rendered := m.renderQueuedMessages()
+	if rendered == "" {
+		t.Fatal("expected non-empty rendered output")
+	}
+
+	// The full message would be ~20+ lines. With truncation to 3 content
+	// lines + badge + padding, it should be much shorter.
+	lines := len(strings.Split(rendered, "\n"))
+	// 3 content lines + 1 badge + 2 padding + border overhead: roughly 7 lines.
+	if lines > 10 {
+		t.Fatalf("expected truncated output to be ≤10 lines, got %d lines", lines)
+	}
+}
@@ -78,7 +78,7 @@ func newInputPrompt(message, placeholder, defaultValue string, width, height int
 	ta.Placeholder = placeholder
 	ta.ShowLineNumbers = false
 	ta.Prompt = ""
-	ta.CharLimit = 1000
+	ta.CharLimit = 0
 	ta.SetWidth(width - 12) // account for border + padding
 	ta.SetHeight(1)
 	ta.Focus()
@@ -5,6 +5,7 @@ package render

 import (
 	"fmt"
+	"regexp"
 	"strings"

 	"charm.land/lipgloss/v2"
@@ -13,16 +14,43 @@ import (
 	"github.com/mark3labs/kit/internal/ui/style"
 )

+// fileTokenPattern matches @file references in user text. Supports:
+//   - @"path with spaces.txt" (quoted)
+//   - @path/to/file.txt      (unquoted, no spaces)
+var fileTokenPattern = regexp.MustCompile(`@"[^"]+"|@[^\s]+`)
+
 // UserBlock renders a user message with herald Tip styling.
-func UserBlock(content string, ty *herald.Typography, theme style.Theme) string {
+// The width parameter controls line wrapping so long messages don't overflow.
+// Any @file tokens in the content are highlighted with the theme accent color.
+func UserBlock(content string, width int, ty *herald.Typography, theme style.Theme) string {
 	if strings.TrimSpace(content) == "" {
 		content = "(empty message)"
 	}

+	// Wrap content before passing to herald Alert so long lines break
+	// inside the alert box. Subtract 4 to account for the alert bar
+	// prefix ("│ ") and a small margin.
+	if width > 4 {
+		content = lipgloss.Wrap(content, width-4, "")
+	}
+
+	// Highlight @file tokens with accent color so file references are
+	// visually distinct from surrounding prompt text.
+	content = highlightFileTokens(content, theme)
+
 	rendered := ty.Tip(content)
 	return styleMarginBottom(theme, rendered)
 }

+// highlightFileTokens wraps @file tokens in the given text with the theme
+// accent color so they stand out visually in rendered user messages.
+func highlightFileTokens(text string, theme style.Theme) string {
+	accentStyle := lipgloss.NewStyle().Foreground(theme.Accent).Bold(true)
+	return fileTokenPattern.ReplaceAllStringFunc(text, func(token string) string {
+		return accentStyle.Render(token)
+	})
+}
+
 // AssistantBlock renders an assistant message with markdown styling.
 func AssistantBlock(content string, width int, theme style.Theme) string {
 	if strings.TrimSpace(content) == "" {
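Review note on fileTokenPattern: the alternation tries the quoted form first and falls back to the unquoted form. A small sketch that exercises the same expression with only the standard library follows. The helper findFileTokens is hypothetical and exists only to show which substrings match.

```go
package main

import (
	"fmt"
	"regexp"
)

// fileToken mirrors the expression used for fileTokenPattern in the diff.
var fileToken = regexp.MustCompile(`@"[^"]+"|@[^\s]+`)

// findFileTokens returns every @file token in text, in order of appearance.
func findFileTokens(text string) []string {
	return fileToken.FindAllString(text, -1)
}

func main() {
	// Quoted paths keep their spaces; unquoted paths stop at whitespace.
	fmt.Printf("%q\n", findFileTokens(`check @"path with spaces/file.txt" and @main.go`))
	// The unquoted branch also matches the tail of an email address.
	fmt.Printf("%q\n", findFileTokens("mail bob@example.com"))
}
```

The second call shows a side effect worth knowing about: any "@" followed by non-space characters is treated as a token, so highlighting is not limited to real file references.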
@@ -0,0 +1,110 @@
+package render
+
+import (
+	"strings"
+	"testing"
+
+	"github.com/indaco/herald"
+
+	"github.com/mark3labs/kit/internal/ui/style"
+)
+
+// testTypography creates a herald Typography for tests.
+func testTypography(theme style.Theme) *herald.Typography {
+	return herald.New(
+		herald.WithPalette(herald.ColorPalette{
+			Primary:   theme.Primary,
+			Secondary: theme.Secondary,
+			Tertiary:  theme.Info,
+			Accent:    theme.Accent,
+			Highlight: theme.Highlight,
+			Muted:     theme.Muted,
+			Text:      theme.Text,
+			Surface:   theme.Background,
+			Base:      theme.CodeBg,
+		}),
+		herald.WithAlertLabel(herald.AlertTip, "You"),
+	)
+}
+
+func TestHighlightFileTokens(t *testing.T) {
+	theme := style.DefaultTheme()
+
+	tests := []struct {
+		name    string
+		input   string
+		wantHas []string // substrings that must be present in the output
+	}{
+		{
+			name:    "no tokens",
+			input:   "hello world",
+			wantHas: []string{"hello world"},
+		},
+		{
+			name:    "single unquoted token",
+			input:   "refactor @main.go please",
+			wantHas: []string{"@main.go", "refactor", "please"},
+		},
+		{
+			name:    "quoted token with spaces",
+			input:   `check @"path with spaces/file.txt" out`,
+			wantHas: []string{`@"path with spaces/file.txt"`, "check", "out"},
+		},
+		{
+			name:    "multiple tokens",
+			input:   "@main.go @utils.go refactor these",
+			wantHas: []string{"@main.go", "@utils.go", "refactor these"},
+		},
+		{
+			name:    "path with directory",
+			input:   "look at @internal/ui/render/blocks.go",
+			wantHas: []string{"@internal/ui/render/blocks.go", "look at"},
+		},
+		{
+			name:    "empty string",
+			input:   "",
+			wantHas: []string{""},
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			result := highlightFileTokens(tt.input, theme)
+
+			for _, want := range tt.wantHas {
+				if !strings.Contains(result, want) {
+					t.Errorf("highlightFileTokens(%q) = %q, want substring %q", tt.input, result, want)
+				}
+			}
+
+			// If there were @tokens, the result should contain ANSI escape
+			// sequences (from lipgloss styling).
+			if fileTokenPattern.MatchString(tt.input) && !strings.Contains(result, "\x1b[") {
+				t.Errorf("highlightFileTokens(%q) should contain ANSI escapes for @tokens but got %q", tt.input, result)
+			}
+		})
+	}
+}
+
+func TestUserBlockHighlightsFileTokens(t *testing.T) {
+	theme := style.DefaultTheme()
+	ty := testTypography(theme)
+
+	// A user message with @file tokens should contain ANSI escapes around the token.
+	content := "refactor @main.go and @utils.go"
+	result := UserBlock(content, 80, ty, theme)
+
+	// The rendered output should contain both file references.
+	if !strings.Contains(result, "@main.go") {
+		t.Errorf("UserBlock output should contain @main.go, got:\n%s", result)
+	}
+	if !strings.Contains(result, "@utils.go") {
+		t.Errorf("UserBlock output should contain @utils.go, got:\n%s", result)
+	}
+
+	// Verify ANSI codes are present (the tokens are styled).
+	if !strings.Contains(result, "\x1b[") {
+		t.Errorf("UserBlock output should contain ANSI escape codes for styled @file tokens")
+	}
+}
@@ -85,11 +85,13 @@ func GetMarkdownTypography() *herald.Typography {
 	return ty
 }

-// ToMarkdown renders markdown content using herald-md.
-// The width parameter is currently unused as herald handles wrapping
-// based on terminal width internally.
+// ToMarkdown renders markdown content using herald-md and wraps the result
+// to the given width so that long lines do not overflow the terminal.
 func ToMarkdown(content string, width int) string {
 	ty := GetMarkdownTypography()
 	rendered := heraldmd.Render(ty, []byte(content))
+	if width > 0 {
+		rendered = lipgloss.Wrap(rendered, width, "")
+	}
 	return rendered
 }
@@ -23,7 +23,7 @@ func NewToolApprovalInput(toolName, toolArgs string, width int) *ToolApprovalInp
 	ta := textarea.New()
 	ta.Placeholder = ""
 	ta.ShowLineNumbers = false
-	ta.CharLimit = 1000
+	ta.CharLimit = 0
 	ta.SetWidth(width - 8) // Account for container padding, border and internal padding
 	ta.SetHeight(4)        // Default to 3 lines like huh
 	ta.Focus()
@@ -134,23 +134,28 @@ func (ut *UsageTracker) EstimateAndUpdateUsage(inputText, outputText string) {
 }

 // SetContextTokens records the approximate current context window utilization.
-// This should be set from FinalUsage.InputTokens, which already includes the
-// full conversation history (system prompt + all previous messages). Do NOT
-// add OutputTokens as that would double-count (output becomes input next turn).
-// Use FinalResponse.Usage rather than aggregate TotalUsage, because TotalUsage
-// sums across all tool-calling steps and overstates the actual window fill level.
+//
+// The value should include ALL token categories from the last API call:
+//
+//	InputTokens + CacheReadTokens + CacheCreationTokens + OutputTokens
+//
+// With Anthropic prompt caching, InputTokens can be near-zero while
+// CacheReadTokens holds the bulk of the context. All four must be summed
+// to get the true context window fill level.
+//
+// OutputTokens is included because the assistant's output becomes part of
+// the context on the next turn.
+//
+// Use FinalResponse.Usage (last step only) rather than aggregate TotalUsage,
+// because TotalUsage sums across all tool-calling steps and overstates the
+// actual window fill level.
+//
+// The value is set unconditionally (not max-only) so that context shrinks
+// correctly after compaction.
 func (ut *UsageTracker) SetContextTokens(tokens int) {
 	ut.mu.Lock()
 	defer ut.mu.Unlock()
-	// Track the maximum context seen so far. In multi-step tool calls,
-	// FinalUsage.InputTokens may reflect only the last step's input, which
-	// can be smaller than previous steps. We want to show the largest context
-	// the model has processed in this session.
-	if tokens > ut.contextTokens {
-		ut.contextTokens = tokens
-	}
-	// If tokens < current, we keep the larger value (no-op)
-	// This prevents the display from dropping during multi-step tool calls.
+	ut.contextTokens = tokens
 }

 // RenderUsageInfo generates a formatted string displaying current usage statistics
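Review note on the new SetContextTokens contract: the caller is expected to pass the sum of all four token categories from the last API call. A minimal sketch of that sum follows. The usage struct and contextFill helper are hypothetical; only the four field names come from the comment above, and the numbers are made up for illustration.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the per-call usage record. Only the
// four field names are taken from the SetContextTokens comment.
type usage struct {
	InputTokens         int
	CacheReadTokens     int
	CacheCreationTokens int
	OutputTokens        int
}

// contextFill sums every category so that cached input and the latest
// output are both counted toward the context window fill level.
func contextFill(u usage) int {
	return u.InputTokens + u.CacheReadTokens + u.CacheCreationTokens + u.OutputTokens
}

func main() {
	// Illustrative values: with prompt caching, InputTokens alone would
	// badly understate how full the window is.
	u := usage{InputTokens: 12, CacheReadTokens: 48000, CacheCreationTokens: 900, OutputTokens: 350}
	fmt.Println(contextFill(u))
}
```

Because the setter now stores the value unconditionally, passing 0 after compaction (as the UI change in this diff does) resets the display until the next call reports a fresh total.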
@@ -0,0 +1,259 @@
+// Package watcher provides a general-purpose file watcher that monitors
+// directories for changes to files matching specified extensions. It uses
+// fsnotify for kernel-level notifications with debouncing to coalesce
+// rapid editor writes.
+package watcher
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"strings"
+	"sync"
+	"time"
+
+	"github.com/fsnotify/fsnotify"
+)
+
+// ContentWatcher monitors directories for file changes matching a set of
+// extensions and triggers a reload callback when changes are detected.
+// It uses fsnotify for kernel-level file notifications (inotify on Linux,
+// kqueue on macOS) with debouncing to coalesce rapid editor writes.
+type ContentWatcher struct {
+	watcher    *fsnotify.Watcher
+	onReload   func()
+	extensions []string // e.g. [".md", ".txt"]
+	label      string   // for logging (e.g. "prompts", "skills")
+	debounce   time.Duration
+	cancel     context.CancelFunc
+	done       chan struct{}
+	mu         sync.Mutex
+}
+
+// Options configures a ContentWatcher.
+type Options struct {
+	// Dirs are the directories to watch.
+	Dirs []string
+	// Extensions are the file extensions to watch for (e.g. ".md", ".txt").
+	// Include the leading dot.
+	Extensions []string
+	// OnReload is called when a matching file changes (after debouncing).
+	OnReload func()
+	// Label is a human-readable name for logging (e.g. "prompts", "skills").
+	Label string
+	// Debounce is the debounce duration. Defaults to 300ms if zero.
+	Debounce time.Duration
+}
+
+// New creates a ContentWatcher that monitors the given directories and their
+// immediate subdirectories for file changes matching the specified extensions.
+// When a change is detected (after debouncing), opts.OnReload is called.
+// Directories that cannot be watched are skipped silently. The watcher must
+// be started with Start() and stopped with Close().
+func New(opts Options) (*ContentWatcher, error) {
+	if len(opts.Dirs) == 0 {
+		return nil, fmt.Errorf("no directories to watch")
+	}
+
+	fsw, err := fsnotify.NewWatcher()
+	if err != nil {
+		return nil, fmt.Errorf("creating file watcher: %w", err)
+	}
+
+	for _, dir := range opts.Dirs {
+		if err := fsw.Add(dir); err != nil {
+			continue
+		}
+
+		// Also watch immediate subdirectories (for skill/SKILL.md pattern).
+		entries, err := os.ReadDir(dir)
+		if err != nil {
+			continue
+		}
+		for _, entry := range entries {
+			if entry.IsDir() {
+				subdir := filepath.Join(dir, entry.Name())
+				_ = fsw.Add(subdir)
+			}
+		}
+	}
+
+	debounce := opts.Debounce
+	if debounce == 0 {
+		debounce = 300 * time.Millisecond
+	}
+
+	return &ContentWatcher{
+		watcher:    fsw,
+		onReload:   opts.OnReload,
+		extensions: opts.Extensions,
+		label:      opts.Label,
+		debounce:   debounce,
+		done:       make(chan struct{}),
+	}, nil
+}
+
+// Start begins watching for file changes. It blocks until the context
+// is cancelled or Close() is called. Typically called in a goroutine.
+func (w *ContentWatcher) Start(ctx context.Context) {
+	w.mu.Lock()
+	ctx, w.cancel = context.WithCancel(ctx)
+	w.mu.Unlock()
+
+	defer close(w.done)
+
+	var timer *time.Timer
+	var timerC <-chan time.Time
+
+	for {
+		select {
+		case <-ctx.Done():
+			if timer != nil {
+				timer.Stop()
+			}
+			return
+
+		case event, ok := <-w.watcher.Events:
+			if !ok {
+				return
+			}
+
+			// When a new subdirectory is created, start watching it so
+			// that files added inside (e.g. new-skill/SKILL.md) trigger
+			// reload events. Also schedule a reload in case the directory
+			// was created with matching files already inside.
+			if event.Op&fsnotify.Create != 0 {
+				if info, err := os.Stat(event.Name); err == nil && info.IsDir() {
+					if addErr := w.watcher.Add(event.Name); addErr == nil {
+						// Check if the new directory already contains matching files.
+						if w.dirContainsMatchingFiles(event.Name) {
+							if timer != nil {
+								timer.Stop()
+							}
+							timer = time.NewTimer(w.debounce)
+							timerC = timer.C
+						}
+					}
+					continue
+				}
+			}
+
+			// Only care about files matching our extensions.
+			if !w.matchesExtension(event.Name) {
+				continue
+			}
+
+			// React to write, create, remove, rename events.
+			if event.Op&(fsnotify.Write|fsnotify.Create|fsnotify.Remove|fsnotify.Rename) == 0 {
+				continue
+			}
+
+			// Debounce: reset timer on each event.
+			if timer != nil {
+				timer.Stop()
+			}
+			timer = time.NewTimer(w.debounce)
+			timerC = timer.C
+
+		case <-timerC:
+			timerC = nil
+			timer = nil
+			w.onReload()
+
+		case err, ok := <-w.watcher.Errors:
+			if !ok {
+				return
+			}
+			// Watcher errors are ignored; the loop keeps running.
+			_ = err
+		}
+	}
+}
+
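+// Close stops the watcher and releases resources. It waits for the Start
+// loop to exit, so it must only be called after Start has been invoked;
+// otherwise it blocks forever.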
+func (w *ContentWatcher) Close() error {
+	w.mu.Lock()
+	cancel := w.cancel
+	w.mu.Unlock()
+
+	if cancel != nil {
+		cancel()
+	}
+
+	// Wait for the event loop to finish.
+	<-w.done
+	return w.watcher.Close()
+}
+
+// matchesExtension returns true if the file name ends with one of the
+// watched extensions.
+func (w *ContentWatcher) matchesExtension(name string) bool {
+	for _, ext := range w.extensions {
+		if strings.HasSuffix(name, ext) {
+			return true
+		}
+	}
+	return false
+}
+
+// dirContainsMatchingFiles returns true if the directory contains at least
+// one file matching the watched extensions. Used to detect cases where a
+// directory is created with files already inside (e.g. cp -r).
+func (w *ContentWatcher) dirContainsMatchingFiles(dir string) bool {
+	entries, err := os.ReadDir(dir)
+	if err != nil {
+		return false
+	}
+	for _, entry := range entries {
+		if !entry.IsDir() && w.matchesExtension(entry.Name()) {
+			return true
+		}
+	}
+	return false
+}
+
+// CollectDirs returns the directories to watch for a given set of standard
+// directories and extra paths. Directories are deduplicated by absolute path
+// and verified to exist. For explicit file paths, the parent directory is
+// watched instead.
+func CollectDirs(standardDirs []string, extraPaths []string) []string {
+	var dirs []string
+	seen := make(map[string]bool)
+
+	add := func(dir string) {
+		abs, err := filepath.Abs(dir)
+		if err != nil {
+			return
+		}
+		if seen[abs] {
+			return
+		}
+
+		// Verify the directory exists.
+		info, err := os.Stat(abs)
+		if err != nil || !info.IsDir() {
+			return
+		}
+
+		seen[abs] = true
+		dirs = append(dirs, abs)
+	}
+
+	for _, d := range standardDirs {
+		add(d)
+	}
+
+	for _, p := range extraPaths {
+		info, err := os.Stat(p)
+		if err != nil {
+			continue
+		}
+		if info.IsDir() {
+			add(p)
+		} else {
+			// For explicit files, watch the parent directory.
+			add(filepath.Dir(p))
+		}
+	}
+
+	return dirs
+}
@@ -0,0 +1,307 @@
+package watcher
+
+import (
+	"os"
+	"path/filepath"
+	"sync/atomic"
+	"testing"
+	"time"
+)
+
+func TestContentWatcher_ReloadsOnMatchingFile(t *testing.T) {
+	dir := t.TempDir()
+
+	// Write an initial file so the directory isn't empty.
+	initial := filepath.Join(dir, "existing.md")
+	if err := os.WriteFile(initial, []byte("# Hello"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	// Wait for watcher to be ready.
+	time.Sleep(100 * time.Millisecond)
+
+	// Modify the file.
+	if err := os.WriteFile(initial, []byte("# Updated"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	// Wait for debounce + processing.
+	time.Sleep(200 * time.Millisecond)
+
+	if got := reloadCount.Load(); got != 1 {
+		t.Errorf("expected 1 reload, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_IgnoresNonMatchingFiles(t *testing.T) {
+	dir := t.TempDir()
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	time.Sleep(100 * time.Millisecond)
+
+	// Write a non-matching file.
+	if err := os.WriteFile(filepath.Join(dir, "readme.txt"), []byte("hello"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	time.Sleep(200 * time.Millisecond)
+
+	if got := reloadCount.Load(); got != 0 {
+		t.Errorf("expected 0 reloads for non-matching file, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_MultipleExtensions(t *testing.T) {
+	dir := t.TempDir()
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md", ".txt"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	time.Sleep(100 * time.Millisecond)
+
+	// Write a .txt file — should trigger.
+	if err := os.WriteFile(filepath.Join(dir, "notes.txt"), []byte("notes"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	time.Sleep(200 * time.Millisecond)
+
+	if got := reloadCount.Load(); got != 1 {
+		t.Errorf("expected 1 reload for .txt file, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_Debounces(t *testing.T) {
+	dir := t.TempDir()
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   100 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	time.Sleep(100 * time.Millisecond)
+
+	// Rapid-fire writes — should debounce into 1 reload.
+	for i := range 5 {
+		if err := os.WriteFile(filepath.Join(dir, "test.md"), []byte("v"+string(rune('0'+i))), 0644); err != nil {
+			t.Fatal(err)
+		}
+		time.Sleep(30 * time.Millisecond)
+	}
+
+	time.Sleep(300 * time.Millisecond)
+
+	if got := reloadCount.Load(); got != 1 {
+		t.Errorf("expected 1 debounced reload, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_WatchesSubdirectories(t *testing.T) {
+	dir := t.TempDir()
+
+	// Create a subdirectory (simulates skill-name/SKILL.md pattern).
+	subdir := filepath.Join(dir, "my-skill")
+	if err := os.MkdirAll(subdir, 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	time.Sleep(100 * time.Millisecond)
+
+	// Write to subdirectory.
+	if err := os.WriteFile(filepath.Join(subdir, "SKILL.md"), []byte("# Skill"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	time.Sleep(200 * time.Millisecond)
+
+	if got := reloadCount.Load(); got != 1 {
+		t.Errorf("expected 1 reload for subdirectory file, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_WatchesNewSubdirectory(t *testing.T) {
+	dir := t.TempDir()
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	// Wait for watcher to be ready.
+	time.Sleep(100 * time.Millisecond)
+
+	// Create a NEW subdirectory after the watcher started (the bug scenario).
+	subdir := filepath.Join(dir, "new-skill")
+	if err := os.MkdirAll(subdir, 0755); err != nil {
+		t.Fatal(err)
+	}
+
+	// Give fsnotify time to pick up the new directory.
+	time.Sleep(100 * time.Millisecond)
+
+	// Write a matching file inside the new subdirectory.
+	if err := os.WriteFile(filepath.Join(subdir, "SKILL.md"), []byte("# New Skill"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	// Wait for debounce + processing.
+	time.Sleep(200 * time.Millisecond)
+
+	if got := reloadCount.Load(); got < 1 {
+		t.Errorf("expected at least 1 reload for file in new subdirectory, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestContentWatcher_WatchesNewSubdirectoryWithExistingFiles(t *testing.T) {
+	dir := t.TempDir()
+
+	var reloadCount atomic.Int32
+	w, err := New(Options{
+		Dirs:       []string{dir},
+		Extensions: []string{".md"},
+		OnReload:   func() { reloadCount.Add(1) },
+		Label:      "test",
+		Debounce:   50 * time.Millisecond,
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+
+	go w.Start(t.Context())
+
+	time.Sleep(100 * time.Millisecond)
+
+	// Create a subdirectory with a matching file already inside (simulates cp -r).
+	subdir := filepath.Join(dir, "copied-skill")
+	if err := os.MkdirAll(subdir, 0755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(subdir, "SKILL.md"), []byte("# Copied"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	// Wait for debounce + processing.
+	time.Sleep(300 * time.Millisecond)
+
+	if got := reloadCount.Load(); got < 1 {
+		t.Errorf("expected at least 1 reload for copied subdirectory with files, got %d", got)
+	}
+
+	_ = w.Close()
+}
+
+func TestCollectDirs_Deduplicates(t *testing.T) {
+	dir := t.TempDir()
+
+	dirs := CollectDirs([]string{dir, dir}, nil)
+	if len(dirs) != 1 {
+		t.Errorf("expected 1 deduplicated dir, got %d", len(dirs))
+	}
+}
+
+func TestCollectDirs_FileParent(t *testing.T) {
+	dir := t.TempDir()
+	file := filepath.Join(dir, "test.md")
+	if err := os.WriteFile(file, []byte("test"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	dirs := CollectDirs(nil, []string{file})
+	if len(dirs) != 1 {
+		t.Fatalf("expected 1 dir, got %d", len(dirs))
+	}
+
+	abs, _ := filepath.Abs(dir)
+	if dirs[0] != abs {
+		t.Errorf("expected %s, got %s", abs, dirs[0])
+	}
+}
+
+func TestCollectDirs_SkipsNonexistent(t *testing.T) {
+	dirs := CollectDirs([]string{"/nonexistent/dir"}, nil)
+	if len(dirs) != 0 {
+		t.Errorf("expected 0 dirs for nonexistent path, got %d", len(dirs))
+	}
+}
@@ -68,8 +68,12 @@ host, err := kit.New(ctx, &kit.Options{
    NoSession:    true,                       // Ephemeral mode

    // Tool options
-    Tools:        []kit.Tool{kit.NewBashTool()}, // Replace default tool set
-    ExtraTools:   []kit.Tool{myTool},            // Add alongside defaults
+    Tools:            []kit.Tool{kit.NewBashTool()}, // Replace default tool set
+    ExtraTools:       []kit.Tool{myTool},            // Add alongside defaults
+    DisableCoreTools: true,                        // Use no core tools (0 tools)
+
+    // Configuration
+    SkipConfig:   true,                        // Skip .kit.yml files (viper defaults + env vars still apply)

    // Compaction
    AutoCompact:  true,                       // Auto-compact near context limit
@@ -172,6 +176,24 @@ msg  := kit.ConvertFromLLMMessage(lMsg)  // LLMMessage  → SDK Message
 - `GetSessionID()` - Get session UUID
 - `Close()` - Clean up resources

+### Options
+
+Key `Options` fields for SDK usage:
+
+| Field | Description |
+|-------|-------------|
+| `Model` | Override model (e.g., "anthropic/claude-sonnet-4-5-20250929") |
+| `SystemPrompt` | Override system prompt |
+| `ConfigFile` | Load specific config file (empty = search defaults) |
+| `SkipConfig` | Skip `.kit.yml` loading (defaults + env vars still apply) |
+| `Tools` | Replace core tools with custom set |
+| `ExtraTools` | Add tools alongside defaults |
+| `DisableCoreTools` | Use no core tools (0 tools, for chat-only) |
+| `NoSession` | Ephemeral mode (no session persistence) |
+| `SessionPath` | Open specific session file |
+| `Continue` | Resume most recent session |
+| `Debug` | Enable debug logging |
+
 ## Environment Variables

 All CLI environment variables work with the SDK:
@@ -0,0 +1,231 @@
+package kit
+
+import (
+	"strings"
+	"time"
+
+	"charm.land/fantasy"
+	"github.com/mark3labs/kit/internal/session"
+)
+
+// treeManagerAdapter adapts a TreeManager to the SessionManager interface.
+// The type is unexported; SDK users do not interact with it directly.
+type treeManagerAdapter struct {
+	inner *session.TreeManager
+}
+
+// NewTreeManagerAdapter wraps a TreeManager in a SessionManager. It is
+// exported for use by New, which calls it when no custom SessionManager is
+// provided.
+func NewTreeManagerAdapter(tm *session.TreeManager) SessionManager {
+	return &treeManagerAdapter{inner: tm}
+}
+
+// AppendMessage implements SessionManager.
+func (a *treeManagerAdapter) AppendMessage(msg LLMMessage) (string, error) {
+	// LLMMessage is just an alias for fantasy.Message, so no conversion needed
+	return a.inner.AppendLLMMessage(msg)
+}
+
+// GetMessages implements SessionManager.
+func (a *treeManagerAdapter) GetMessages() []LLMMessage {
+	// LLMMessage is just an alias for fantasy.Message
+	return a.inner.GetLLMMessages()
+}
+
+// BuildContext implements SessionManager.
+func (a *treeManagerAdapter) BuildContext() ([]LLMMessage, string, string) {
+	msgs, provider, modelID := a.inner.BuildContext()
+	return msgs, provider, modelID
+}
+
+// Branch implements SessionManager.
+func (a *treeManagerAdapter) Branch(entryID string) error {
+	return a.inner.Branch(entryID)
+}
+
+// GetCurrentBranch implements SessionManager.
+func (a *treeManagerAdapter) GetCurrentBranch() []BranchEntry {
+	branch := a.inner.GetBranch("")
+	var result []BranchEntry
+	for _, entry := range branch {
+		be := a.convertEntry(entry)
+		if be != nil {
+			result = append(result, *be)
+		}
+	}
+	return result
+}
+
+// GetChildren implements SessionManager.
+func (a *treeManagerAdapter) GetChildren(parentID string) []string {
+	return a.inner.GetChildren(parentID)
+}
+
+// GetEntry implements SessionManager.
+func (a *treeManagerAdapter) GetEntry(entryID string) *BranchEntry {
+	entry := a.inner.GetEntry(entryID)
+	if entry == nil {
+		return nil
+	}
+	return a.convertEntry(entry)
+}
+
+// GetSessionID implements SessionManager.
+func (a *treeManagerAdapter) GetSessionID() string {
+	return a.inner.GetSessionID()
+}
+
+// GetSessionName implements SessionManager.
+func (a *treeManagerAdapter) GetSessionName() string {
+	return a.inner.GetSessionName()
+}
+
+// SetSessionName implements SessionManager.
+func (a *treeManagerAdapter) SetSessionName(name string) error {
+	_, err := a.inner.AppendSessionInfo(name)
+	return err
+}
+
+// GetCreatedAt implements SessionManager.
+func (a *treeManagerAdapter) GetCreatedAt() time.Time {
+	return a.inner.GetHeader().Timestamp
+}
+
+// IsPersisted implements SessionManager.
+func (a *treeManagerAdapter) IsPersisted() bool {
+	return a.inner.IsPersisted()
+}
+
+// AppendCompaction implements SessionManager.
+func (a *treeManagerAdapter) AppendCompaction(summary string, firstKeptEntryID string,
+	tokensBefore, tokensAfter int, messagesRemoved int, readFiles, modifiedFiles []string) (string, error) {
+
+	return a.inner.AppendCompaction(summary, firstKeptEntryID,
+		tokensBefore, tokensAfter, messagesRemoved, readFiles, modifiedFiles)
+}
+
+// GetLastCompaction implements SessionManager.
+func (a *treeManagerAdapter) GetLastCompaction() *CompactionEntry {
+	c := a.inner.GetLastCompaction()
+	if c == nil {
+		return nil
+	}
+	return &CompactionEntry{
+		ID:               c.ID,
+		Summary:          c.Summary,
+		FirstKeptEntryID: c.FirstKeptEntryID,
+		TokensBefore:     c.TokensBefore,
+		TokensAfter:      c.TokensAfter,
+		MessagesRemoved:  c.MessagesRemoved,
+		ReadFiles:        c.ReadFiles,
+		ModifiedFiles:    c.ModifiedFiles,
+		Timestamp:        c.Timestamp,
+	}
+}
+
+// AppendExtensionData implements SessionManager.
+func (a *treeManagerAdapter) AppendExtensionData(extType, data string) (string, error) {
+	return a.inner.AppendExtensionData(extType, data)
+}
+
+// GetExtensionData implements SessionManager.
+func (a *treeManagerAdapter) GetExtensionData(extType string) []ExtensionDataEntry {
+	entries := a.inner.GetExtensionData(extType)
+	var result []ExtensionDataEntry
+	for _, e := range entries {
+		result = append(result, ExtensionDataEntry{
+			ID:        e.ID,
+			ExtType:   e.ExtType,
+			Data:      e.Data,
+			Timestamp: e.Timestamp,
+		})
+	}
+	return result
+}
+
+// AppendModelChange implements SessionManager.
+func (a *treeManagerAdapter) AppendModelChange(provider, modelID string) (string, error) {
+	return a.inner.AppendModelChange(provider, modelID)
+}
+
+// GetContextEntryIDs implements SessionManager.
+func (a *treeManagerAdapter) GetContextEntryIDs() []string {
+	return a.inner.GetContextEntryIDs()
+}
+
+// Close implements SessionManager.
+func (a *treeManagerAdapter) Close() error {
+	return a.inner.Close()
+}
+
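+// convertEntry converts an internal session entry to a BranchEntry. It
+// returns nil for unknown entry types and for message entries that fail to
+// decode.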
+func (a *treeManagerAdapter) convertEntry(entry any) *BranchEntry {
+	switch e := entry.(type) {
+	case *session.MessageEntry:
+		msg, err := e.ToMessage()
+		if err != nil {
+			return nil
+		}
+		// Build content text from parts
+		var content strings.Builder
+		for _, part := range msg.Parts {
+			if textPart, ok := part.(TextContent); ok {
+				content.WriteString(textPart.Text)
+			}
+		}
+		return &BranchEntry{
+			ID:        e.ID,
+			ParentID:  e.ParentID,
+			Type:      EntryTypeMessage,
+			Role:      string(msg.Role),
+			Content:   content.String(),
+			Model:     e.Model,
+			Provider:  e.Provider,
+			Timestamp: e.Timestamp,
+			RawParts:  msg.Parts,
+		}
+	case *session.BranchSummaryEntry:
+		return &BranchEntry{
+			ID:        e.ID,
+			ParentID:  e.ParentID,
+			Type:      EntryTypeBranchSummary,
+			Content:   e.Summary,
+			Timestamp: e.Timestamp,
+		}
+	case *session.ModelChangeEntry:
+		return &BranchEntry{
+			ID:        e.ID,
+			ParentID:  e.ParentID,
+			Type:      EntryTypeModelChange,
+			Content:   "Model changed to " + e.Provider + "/" + e.ModelID,
+			Model:     e.ModelID,
+			Provider:  e.Provider,
+			Timestamp: e.Timestamp,
+		}
+	case *session.CompactionEntry:
+		return &BranchEntry{
+			ID:        e.ID,
+			ParentID:  e.ParentID,
+			Type:      EntryTypeCompaction,
+			Content:   e.Summary,
+			Timestamp: e.Timestamp,
+		}
+	case *session.ExtensionDataEntry:
+		return &BranchEntry{
+			ID:        e.ID,
+			ParentID:  e.ParentID,
+			Type:      EntryTypeExtensionData,
+			Content:   "Extension data: " + e.ExtType,
+			Timestamp: e.Timestamp,
+		}
+	default:
+		return nil
+	}
+}
+
+// convertKitMessagesToFantasy converts kit LLM messages to fantasy messages.
+// Since LLMMessage is an alias for fantasy.Message, this is a no-op.
+func convertKitMessagesToFantasy(msgs []LLMMessage) []fantasy.Message {
+	// LLMMessage is an alias for fantasy.Message, so the slice is returned unchanged.
+	return msgs
+}
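The no-op conversion above relies on how Go type aliases work: an alias introduces a second name for the same type, so a slice of one is already a slice of the other. The sketch below illustrates this with stand-in types; `Message` and its fields are hypothetical and only play the role of `fantasy.Message`.

```go
package main

import "fmt"

// Message stands in for fantasy.Message (hypothetical fields).
type Message struct {
	Role string
	Text string
}

// LLMMessage is a type alias, not a new defined type: both names denote
// the same type.
type LLMMessage = Message

// toMessages mirrors the no-op conversion: a []LLMMessage already is a
// []Message, so it can be returned directly.
func toMessages(msgs []LLMMessage) []Message {
	return msgs
}

func main() {
	in := []LLMMessage{{Role: "user", Text: "hi"}}
	out := toMessages(in)
	// Both slices share the same backing array: no copy was made.
	fmt.Println(len(out), &in[0] == &out[0])
}
```

Had `LLMMessage` been declared as a defined type (`type LLMMessage Message`), the direct return would not compile and an element-by-element conversion would be needed.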
@@ -21,9 +21,9 @@ type ContextStats struct {
 const defaultReserveTokens = 16384

 // EstimateContextTokens returns the estimated token count of the current
-// conversation based on tree session messages.
+// conversation based on session messages.
 func (m *Kit) EstimateContextTokens() int {
-	messages := m.treeSession.GetLLMMessages()
+	messages := m.session.GetMessages()
 	return compaction.EstimateMessageTokens(messages)
 }

@@ -31,6 +31,11 @@ func (m *Kit) EstimateContextTokens() int {
 // limit and should be compacted.
 // Formula: contextTokens > contextWindow − reserveTokens.
 // Returns false if the model's context limit is unknown.
+//
+// When API-reported token counts are available (after at least one turn),
+// the real count is used instead of the text-based heuristic. This is
+// significantly more accurate because it includes system prompts, tool
+// definitions, and other overhead that the heuristic cannot account for.
 func (m *Kit) ShouldCompact() bool {
 	info := m.GetModelInfo()
 	if info == nil || info.Limit.Context <= 0 {
@@ -42,8 +47,18 @@ func (m *Kit) ShouldCompact() bool {
 		reserveTokens = m.compactionOpts.ReserveTokens
 	}

-	messages := m.treeSession.GetLLMMessages()
-	return compaction.ShouldCompact(messages, info.Limit.Context, reserveTokens)
+	// Prefer the real API-reported token count when available.
+	m.lastInputTokensMu.RLock()
+	realTokens := m.lastInputTokens
+	m.lastInputTokensMu.RUnlock()
+
+	if realTokens > 0 {
+		return realTokens > info.Limit.Context-reserveTokens
+	}
+
+	// Fall back to text-based heuristic before first turn completes.
+	messages := m.session.GetMessages()
+	return compaction.ShouldCompact(convertKitMessagesToFantasy(messages), info.Limit.Context, reserveTokens)
 }

 // GetContextStats returns current context usage statistics including
@@ -55,7 +70,7 @@ func (m *Kit) ShouldCompact() bool {
 // because it includes system prompts, tool definitions, and other overhead
 // that the heuristic cannot account for.
 func (m *Kit) GetContextStats() ContextStats {
-	messages := m.treeSession.GetLLMMessages()
+	messages := m.session.GetMessages()

 	// Prefer the real API-reported input token count when available.
 	m.lastInputTokensMu.RLock()
@@ -114,7 +129,7 @@ func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, cust
 		}
 	}

-	messages := m.treeSession.GetLLMMessages()
+	messages := m.session.GetMessages()
 	if len(messages) < 2 {
 		return nil, fmt.Errorf("cannot compact: need at least 2 messages")
 	}
@@ -145,7 +160,7 @@ func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, cust

 	// Carry forward file tracking from previous compaction.
 	var prev *compaction.PreviousCompaction
-	if lastCompaction := m.treeSession.GetLastCompaction(); lastCompaction != nil {
+	if lastCompaction := m.session.GetLastCompaction(); lastCompaction != nil {
 		prev = &compaction.PreviousCompaction{
 			ReadFiles:     lastCompaction.ReadFiles,
 			ModifiedFiles: lastCompaction.ModifiedFiles,
@@ -171,7 +186,7 @@ func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, cust

 	// Non-destructive: append a CompactionEntry to the session tree instead
 	// of clearing and rewriting messages.
-	entryIDs := m.treeSession.GetContextEntryIDs()
+	entryIDs := m.session.GetContextEntryIDs()
 	firstKeptEntryID := ""
 	if result.CutPoint >= 0 && result.CutPoint < len(entryIDs) {
 		firstKeptEntryID = entryIDs[result.CutPoint]
@@ -188,9 +203,9 @@ func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, cust
 // custom summary. It still determines the cut point and persists a
 // CompactionEntry.
 func (m *Kit) applyCustomCompaction(summary string, messages []LLMMessage, opts *CompactionOptions) (*CompactionResult, error) {
-	originalTokens := compaction.EstimateMessageTokens(messages)
+	originalTokens := compaction.EstimateMessageTokens(convertKitMessagesToFantasy(messages))

-	cutPoint := compaction.FindCutPoint(messages, opts.KeepRecentTokens)
+	cutPoint := compaction.FindCutPoint(convertKitMessagesToFantasy(messages), opts.KeepRecentTokens)
 	if cutPoint == 0 {
 		cutPoint = len(messages) - 1
 		if cutPoint < 1 {
@@ -198,7 +213,7 @@ func (m *Kit) applyCustomCompaction(summary string, messages []LLMMessage, opts
 		}
 	}

-	entryIDs := m.treeSession.GetContextEntryIDs()
+	entryIDs := m.session.GetContextEntryIDs()
 	firstKeptEntryID := ""
 	if cutPoint >= 0 && cutPoint < len(entryIDs) {
 		firstKeptEntryID = entryIDs[cutPoint]
@@ -234,7 +249,7 @@ func (m *Kit) persistAndEmitCompaction(
 	originalTokens, compactedTokens, messagesRemoved int,
 	readFiles, modifiedFiles []string,
 ) error {
-	if _, err := m.treeSession.AppendCompaction(
+	if _, err := m.session.AppendCompaction(
 		summary,
 		firstKeptEntryID,
 		originalTokens,
@@ -245,6 +260,14 @@ func (m *Kit) persistAndEmitCompaction(
 	); err != nil {
 		return fmt.Errorf("failed to persist compaction entry: %w", err)
 	}
+
+	// Reset the API-reported token count so GetContextStats() and
+	// ShouldCompact() don't use stale pre-compaction values. The next
+	// API call will set the accurate post-compaction count.
+	m.lastInputTokensMu.Lock()
+	m.lastInputTokens = 0
+	m.lastInputTokensMu.Unlock()
+
 	m.events.emit(CompactionEvent{
 		Summary:         summary,
 		OriginalTokens:  originalTokens,
@@ -48,6 +48,8 @@ func setSDKDefaults() {
 	viper.SetDefault("temperature", 0.7)
 	viper.SetDefault("top-p", 0.95)
 	viper.SetDefault("top-k", 40)
+	viper.SetDefault("frequency-penalty", 0.0)
+	viper.SetDefault("presence-penalty", 0.0)
 	viper.SetDefault("stream", true)
 	viper.SetDefault("thinking-level", "off")
 	viper.SetDefault("num-gpu-layers", -1)
@@ -227,28 +227,51 @@ func (e *extensionAPI) GetMessageRenderer(name string) *extensions.MessageRender
 // Session data

 func (e *extensionAPI) GetSessionMessages() []extensions.SessionMessage {
-	return iterBranchMessages(e.kit.treeSession, func(me *session.MessageEntry, msg message.Message) extensions.SessionMessage {
-		return extensions.SessionMessage{
-			ID:        me.ID,
-			Role:      string(msg.Role),
-			Content:   msg.Content(),
-			Timestamp: me.Timestamp.Format("2006-01-02T15:04:05Z07:00"),
+	if e.kit.session == nil {
+		return nil
+	}
+
+	// Try to use the legacy iterBranchMessages for backward compatibility
+	// with the default TreeManager adapter
+	if adapter, ok := e.kit.session.(*treeManagerAdapter); ok {
+		return iterBranchMessages(adapter.inner, func(me *session.MessageEntry, msg message.Message) extensions.SessionMessage {
+			return extensions.SessionMessage{
+				ID:        me.ID,
+				Role:      string(msg.Role),
+				Content:   msg.Content(),
+				Timestamp: me.Timestamp.Format("2006-01-02T15:04:05Z07:00"),
+			}
+		})
+	}
+
+	// For custom SessionManagers, use the public interface
+	branch := e.kit.session.GetCurrentBranch()
+	var result []extensions.SessionMessage
+	for _, entry := range branch {
+		if entry.Type == EntryTypeMessage {
+			result = append(result, extensions.SessionMessage{
+				ID:        entry.ID,
+				Role:      entry.Role,
+				Content:   entry.Content,
+				Timestamp: entry.Timestamp.Format("2006-01-02T15:04:05Z07:00"),
+			})
 		}
-	})
+	}
+	return result
 }

 func (e *extensionAPI) AppendEntry(extType, data string) (string, error) {
-	if e.kit.treeSession == nil {
+	if e.kit.session == nil {
 		return "", fmt.Errorf("no session available")
 	}
-	return e.kit.treeSession.AppendExtensionData(extType, data)
+	return e.kit.session.AppendExtensionData(extType, data)
 }

 func (e *extensionAPI) GetEntries(extType string) []extensions.ExtensionEntry {
-	if e.kit.treeSession == nil {
+	if e.kit.session == nil {
 		return nil
 	}
-	entries := e.kit.treeSession.GetExtensionData(extType)
+	entries := e.kit.session.GetExtensionData(extType)
 	result := make([]extensions.ExtensionEntry, 0, len(entries))
 	for _, e := range entries {
 		result = append(result, extensions.ExtensionEntry{
@@ -4,6 +4,7 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"log"
 	"os"
 	"path/filepath"
 	"strings"
@@ -11,7 +12,6 @@ import (
 	"time"

 	"charm.land/fantasy"
-	charmlog "github.com/charmbracelet/log"

 	"github.com/mark3labs/kit/internal/agent"
 	"github.com/mark3labs/kit/internal/config"
@@ -39,7 +39,7 @@ type ContextFile struct {
 // agents, sessions, and model configurations.
 type Kit struct {
 	agent          *agent.Agent
-	treeSession    *session.TreeManager
+	session        SessionManager
 	modelString    string
 	events         *eventBus
 	autoCompact    bool
@@ -49,6 +49,13 @@ type Kit struct {
 	extRunner      *extensions.Runner
 	bufferedLogger *tools.BufferedDebugLogger
 	authHandler    MCPAuthHandler // OAuth handler for remote MCP servers (may need Close)
+	opts           *Options       // stored for reload operations (skills, etc.)
+
+	// hasCustomSystemPrompt is true when the user explicitly configured a
+	// system prompt (via --system-prompt flag, config file, or SDK option).
+	// When false, per-model system prompts from modelSettings/customModels
+	// can replace the default prompt on model switch.
+	hasCustomSystemPrompt bool

 	// Hook registries — interception layer (see hooks.go).
 	beforeToolCall  *hookRegistry[BeforeToolCallHook, BeforeToolCallResult]
@@ -113,15 +120,105 @@ func (m *Kit) GetLoadingMessage() string {
 }

 // GetLoadedServerNames returns the names of successfully loaded MCP servers.
+// If MCP servers are still loading in the background, this returns only the
+// servers that have completed loading so far.
 func (m *Kit) GetLoadedServerNames() []string {
 	return m.agent.GetLoadedServerNames()
 }

 // GetMCPToolCount returns the number of tools loaded from external MCP servers.
+// If MCP servers are still loading in the background, this returns the count
+// of tools loaded so far (may be 0).
 func (m *Kit) GetMCPToolCount() int {
 	return m.agent.GetMCPToolCount()
 }

+// WaitForMCPTools blocks until background MCP tool loading completes.
+// Returns nil if no MCP servers are configured or if loading succeeded.
+// Returns the loading error if all servers failed. Safe to call multiple times.
+func (m *Kit) WaitForMCPTools() error {
+	return m.agent.WaitForMCPTools()
+}
+
+// MCPToolsReady returns true if MCP tool loading has completed (or was never
+// started). This is a non-blocking check useful for UI status display.
+func (m *Kit) MCPToolsReady() bool {
+	return m.agent.MCPToolsReady()
+}
+
+// MCPServerStatus describes the runtime state of a loaded MCP server.
+type MCPServerStatus struct {
+	// Name is the configured server name.
+	Name string
+	// ToolCount is the number of tools loaded from this server.
+	ToolCount int
+}
+
+// AddMCPServer connects to a new MCP server at runtime and makes its tools
+// available to the agent immediately. The server's tools are prefixed with the
+// server name (e.g. "myserver__tool_name") to avoid naming conflicts, matching
+// the behaviour of servers loaded at initialization.
+//
+// Returns the number of tools loaded from the server.
+//
+// AddMCPServer is safe to call while the agent is idle. If a turn is in
+// progress ([Kit.IsGenerating] returns true), the new tools will be visible
+// starting from the next LLM step.
+//
+// Example:
+//
+//	n, err := k.AddMCPServer(ctx, "github", kit.MCPServerConfig{
+//	    Command: []string{"npx", "-y", "@modelcontextprotocol/server-github"},
+//	    Environment: map[string]string{"GITHUB_TOKEN": os.Getenv("GITHUB_TOKEN")},
+//	})
+func (m *Kit) AddMCPServer(ctx context.Context, name string, cfg MCPServerConfig) (int, error) {
+	return m.agent.AddMCPServer(ctx, name, cfg)
+}
+
+// RemoveMCPServer disconnects an MCP server and removes all its tools from
+// the agent. After this call the agent will no longer see or be able to call
+// tools from the named server.
+//
+// RemoveMCPServer is safe to call while the agent is idle. If a turn is in
+// progress, the tools are removed at the next LLM step. Any in-flight tool
+// calls to the removed server will fail gracefully.
+//
+// Returns an error if the named server is not currently loaded.
+func (m *Kit) RemoveMCPServer(name string) error {
+	return m.agent.RemoveMCPServer(name)
+}
+
+// ListMCPServers returns the status of all currently loaded MCP servers.
+// The returned slice is a snapshot; it is safe to read concurrently.
+func (m *Kit) ListMCPServers() []MCPServerStatus {
+	names := m.agent.GetLoadedServerNames()
+	if len(names) == 0 {
+		return nil
+	}
+
+	// Build a tool count per server by scanning tool names for the prefix.
+	toolNames := m.GetToolNames()
+	countByServer := make(map[string]int, len(names))
+	for _, tn := range toolNames {
+		for _, sn := range names {
+			prefix := sn + "__"
+			if len(tn) > len(prefix) && tn[:len(prefix)] == prefix {
+				countByServer[sn]++
+				break
+			}
+		}
+	}
+
+	result := make([]MCPServerStatus, 0, len(names))
+	for _, n := range names {
+		result = append(result, MCPServerStatus{
+			Name:      n,
+			ToolCount: countByServer[n],
+		})
+	}
+	return result
+}
+
 // GetExtensionToolCount returns the number of tools registered by extensions.
 func (m *Kit) GetExtensionToolCount() int {
 	return m.agent.GetExtensionToolCount()
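Review note on ListMCPServers: the per-server tool count is derived purely from the "<server>__" name prefix. The sketch below isolates that logic so it can be exercised without a running agent. The helper name countToolsByServer and the sample tool names are assumptions for illustration, not part of this change.

```go
package main

import (
	"fmt"
	"strings"
)

// countToolsByServer is a hypothetical helper (not in the diff). It attributes
// each tool name to the first server whose "<name>__" prefix it carries.
// Tools without a known prefix (for example core tools) are not counted.
func countToolsByServer(serverNames, toolNames []string) map[string]int {
	counts := make(map[string]int, len(serverNames))
	for _, tool := range toolNames {
		for _, server := range serverNames {
			prefix := server + "__"
			// Require at least one character after the prefix, as the diff does.
			if len(tool) > len(prefix) && strings.HasPrefix(tool, prefix) {
				counts[server]++
				break
			}
		}
	}
	return counts
}

func main() {
	servers := []string{"github", "fs"}
	tools := []string{"github__create_issue", "github__list_prs", "fs__read", "bash"}
	counts := countToolsByServer(servers, tools)
	fmt.Println(counts["github"], counts["fs"]) // prints "2 1"
}
```

Because the first matching prefix wins, two server names where one extends the other across the separator (say "git" and "git__hub") could be attributed differently depending on slice order. Whether such names can occur in practice is not visible from this diff.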
@@ -154,27 +251,39 @@ type StructuredMessage struct {
 // flattens all content to a single text string, this preserves tool calls,
 // tool results, reasoning blocks, and finish markers as distinct typed parts.
 func (m *Kit) GetStructuredMessages() []StructuredMessage {
-	return iterBranchMessages(m.treeSession, func(me *session.MessageEntry, msg message.Message) StructuredMessage {
-		return StructuredMessage{
-			ID:        me.ID,
-			ParentID:  me.ParentID,
-			Role:      msg.Role,
-			Parts:     msg.Parts,
-			Model:     msg.Model,
-			Provider:  msg.Provider,
-			Timestamp: me.Timestamp.Format("2006-01-02T15:04:05Z07:00"),
+	if m.session == nil {
+		return nil
+	}
+
+	branch := m.session.GetCurrentBranch()
+	var results []StructuredMessage
+	for _, entry := range branch {
+		if entry.Type != EntryTypeMessage {
+			continue
 		}
-	})
+		results = append(results, StructuredMessage{
+			ID:        entry.ID,
+			ParentID:  entry.ParentID,
+			Role:      MessageRole(entry.Role),
+			Parts:     entry.RawParts,
+			Model:     entry.Model,
+			Provider:  entry.Provider,
+			Timestamp: entry.Timestamp.Format("2006-01-02T15:04:05Z07:00"),
+		})
+	}
+	return results
 }

 // iterBranchMessages iterates over the current branch's MessageEntry items,
 // converting each to a message.Message and calling fn to build the result.
-// Returns nil if there is no tree session. Skips entries that are not
+// Returns nil if there is no session. Skips entries that are not
 // MessageEntry or that fail conversion.
+//
+// Deprecated: Use SessionManager.GetCurrentBranch directly.
 func iterBranchMessages[T any](tm *session.TreeManager, fn func(*session.MessageEntry, message.Message) T) []T {
 	if tm == nil {
 		return nil
 	}
+
 	branch := tm.GetBranch("")
 	var results []T
 	for _, entry := range branch {
@@ -191,9 +300,12 @@ func iterBranchMessages[T any](tm *session.TreeManager, fn func(*session.Message
 	return results
 }

-// SetModel changes the active model at runtime. The existing tools, system
-// prompt, and session are preserved. The model string should be in
-// "provider/model" format (e.g. "anthropic/claude-sonnet-4-5-20250929").
+// SetModel changes the active model at runtime. The existing tools and
+// session are preserved. When the new model has a per-model system prompt
+// (from modelSettings or customModels params), it is composed with the
+// current AGENTS.md context and skills before being applied.
+// The model string should be in "provider/model" format
+// (e.g. "anthropic/claude-sonnet-4-5-20250929").
 // Returns an error if the model string is invalid or the provider cannot
 // be created.
 func (m *Kit) SetModel(ctx context.Context, modelString string) error {
@@ -209,7 +321,7 @@ func (m *Kit) SetModel(ctx context.Context, modelString string) error {

 	// With message-level caching, thinking and caching can work together.
 	// No need to disable caching when thinking is enabled.
-	config := &models.ProviderConfig{
+	cfg := &models.ProviderConfig{
 		ModelString:    modelString,
 		SystemPrompt:   systemPrompt,
 		ProviderAPIKey: viper.GetString("provider-api-key"),
@@ -219,14 +331,50 @@ func (m *Kit) SetModel(ctx context.Context, modelString string) error {
 		ThinkingLevel:  thinkingLevel,
 		DisableCaching: false, // Caching enabled by default, works with thinking
 	}
-	temperature := float32(viper.GetFloat64("temperature"))
-	config.Temperature = &temperature
-	topP := float32(viper.GetFloat64("top-p"))
-	config.TopP = &topP
-	topK := int32(viper.GetInt("top-k"))
-	config.TopK = &topK

-	if err := m.agent.SetModel(ctx, config); err != nil {
+	// Only set generation parameter pointers when the user has explicitly
+	// provided a value. This leaves nil pointers for unset params, allowing
+	// per-model defaults (modelSettings / customModels params) to apply.
+	if viper.IsSet("temperature") {
+		v := float32(viper.GetFloat64("temperature"))
+		cfg.Temperature = &v
+	}
+	if viper.IsSet("top-p") {
+		v := float32(viper.GetFloat64("top-p"))
+		cfg.TopP = &v
+	}
+	if viper.IsSet("top-k") {
+		v := int32(viper.GetInt("top-k"))
+		cfg.TopK = &v
+	}
+	if viper.IsSet("frequency-penalty") {
+		v := float32(viper.GetFloat64("frequency-penalty"))
+		cfg.FrequencyPenalty = &v
+	}
+	if viper.IsSet("presence-penalty") {
+		v := float32(viper.GetFloat64("presence-penalty"))
+		cfg.PresencePenalty = &v
+	}
+
+	// When the user hasn't set a custom global system prompt, check for a
+	// per-model system prompt. Pre-apply model settings to discover it,
+	// then compose with AGENTS.md context and skills if found.
+	if !m.hasCustomSystemPrompt {
+		// Temporarily clear the system prompt so ApplyModelSettings can
+		// detect that no explicit prompt is set and apply the per-model one.
+		cfg.SystemPrompt = ""
+		models.ApplyModelSettings(cfg, models.LookupModelForSettings(modelString))
+
+		if cfg.SystemPrompt != "" {
+			// Per-model system prompt found — compose with runtime context.
+			cfg.SystemPrompt = m.composeSystemPrompt(cfg.SystemPrompt)
+		} else {
+			// No per-model prompt — restore the global composed prompt.
+			cfg.SystemPrompt = systemPrompt
+		}
+	}
+
+	if err := m.agent.SetModel(ctx, cfg); err != nil {
 		return err
 	}
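Review note on the viper.IsSet guards in SetModel: leaving a pointer nil when the user has not set a value is what lets a per-model default apply later, and it also keeps an explicit zero distinguishable from "unset". A minimal sketch of that pattern follows. The params type, applyDefaults, and ptr are hypothetical stand-ins, not the real models.ProviderConfig or models.ApplyModelSettings.

```go
package main

import "fmt"

// params is a hypothetical stand-in for the generation fields on the provider
// config. A nil pointer means "not set by the user".
type params struct {
	Temperature *float32
	TopK        *int32
}

// applyDefaults fills only the fields the user left unset.
func applyDefaults(user, defaults params) params {
	out := user
	if out.Temperature == nil {
		out.Temperature = defaults.Temperature
	}
	if out.TopK == nil {
		out.TopK = defaults.TopK
	}
	return out
}

// ptr returns a pointer to a copy of v.
func ptr[T any](v T) *T { return &v }

func main() {
	// The user explicitly asked for temperature 0 and said nothing about top-k.
	user := params{Temperature: ptr(float32(0))}
	defaults := params{Temperature: ptr(float32(0.7)), TopK: ptr(int32(40))}

	merged := applyDefaults(user, defaults)
	fmt.Println(*merged.Temperature, *merged.TopK) // prints "0 40"
}
```

With the previous unconditional assignment, an unset temperature would have been sent as a non-nil pointer to 0 and no per-model default could replace it.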

@@ -242,6 +390,32 @@ func (m *Kit) SetModel(ctx context.Context, modelString string) error {
 	return nil
 }

+// composeSystemPrompt takes a base system prompt and composes it with the
+// current runtime context: AGENTS.md content, skills metadata, and date/cwd.
+// This mirrors the composition done by New during initialization.
+func (m *Kit) composeSystemPrompt(basePrompt string) string {
+	cwd, _ := os.Getwd()
+	pb := skills.NewPromptBuilder(basePrompt)
+
+	// Inject AGENTS.md content as project context.
+	for _, cf := range m.contextFiles {
+		pb.WithSection("", fmt.Sprintf("Instructions from: %s\n\n%s", cf.Path, cf.Content))
+	}
+
+	// Inject skills metadata.
+	if len(m.skills) > 0 {
+		pb.WithSkills(m.skills)
+	}
+
+	// Append current date/time and working directory.
+	pb.WithSection("", fmt.Sprintf(
+		"Current date and time: %s\nCurrent working directory: %s",
+		time.Now().Format("Monday, January 2, 2006, 3:04:05 PM MST"), cwd,
+	))
+
+	return pb.Build()
+}
+
 // GetAvailableModels returns a list of known models from the registry. Each
 // entry includes provider, model ID, context limit, and whether the model
 // supports reasoning. This is an advisory list — models not in the registry
@@ -423,6 +597,17 @@ type Options struct {
 	Tools        []Tool // Custom tool set. If empty, AllTools() is used.
 	ExtraTools   []Tool // Additional tools added alongside core/MCP/extension tools.

+	// SkipConfig, when true, skips loading .kit.yml configuration files.
+	// Viper defaults (setSDKDefaults) and environment variables (KIT_*)
+	// are still applied. Use this for fully programmatic configuration.
+	SkipConfig bool
+
+	// DisableCoreTools, when true, prevents loading any core tools.
+	// Use with Tools or ExtraTools to provide only custom tools.
+	// If DisableCoreTools is true and Tools is empty, the agent will have
+	// no tools (useful for simple chat completions).
+	DisableCoreTools bool
+
 	// Session configuration
 	SessionDir  string // Base directory for session discovery (default: cwd)
 	SessionPath string // Open a specific session file by path
@@ -432,6 +617,14 @@ type Options struct {
 	// Skills
 	Skills    []string // Explicit skill files/dirs to load (empty = auto-discover)
 	SkillsDir string   // Override default project-local skills directory
+	NoSkills  bool     // Disable skill loading entirely (auto-discovery and explicit)
+
+	// NoExtensions disables Yaegi extension loading entirely.
+	NoExtensions bool
+
+	// NoContextFiles disables automatic loading of project context files
+	// (e.g. AGENTS.md) from the working directory.
+	NoContextFiles bool

 	// Compaction
 	AutoCompact       bool               // Auto-compact when near context limit
@@ -452,8 +645,31 @@ type Options struct {
 	// display a URL in a custom UI, redirect to a web app, etc.).
 	MCPAuthHandler MCPAuthHandler

+	// MCPTokenStoreFactory, if non-nil, is called to create a token store for
+	// each remote MCP server that requires OAuth. The factory receives the
+	// server's URL and returns a [MCPTokenStore] implementation.
+	//
+	// When nil (default), tokens are persisted to a JSON file at
+	// $XDG_CONFIG_HOME/.kit/mcp_tokens.json (or ~/.config/.kit/mcp_tokens.json).
+	//
+	// Use this to store tokens in a database, encrypt them, keep them
+	// in-memory, or write them to a custom file path.
+	MCPTokenStoreFactory MCPTokenStoreFactory
+
+	// OnMCPServerLoaded, if non-nil, is called when each MCP server finishes
+	// loading during Kit initialization. The callback receives the server name,
+	// tool count, and any error. The callback runs on a background goroutine;
+	// it is safe to call app.NotifyMCPServerLoaded() from within it to
+	// display real-time progress in the TUI.
+	OnMCPServerLoaded func(serverName string, toolCount int, err error)
+
 	// CLI is optional CLI-specific configuration. SDK users leave this nil.
 	CLI *CLIOptions
+
+	// SessionManager allows custom session storage backends.
+	// If nil (default), Kit uses the built-in file-based TreeManager.
+	// When provided, SessionPath, Continue, and NoSession options are ignored.
+	SessionManager SessionManager
 }

 // CLIOptions holds fields only relevant to the CLI binary. SDK users should
@@ -525,16 +741,17 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 	// provider creation, session init) then runs outside the lock, allowing
 	// parallel subagent spawns to proceed concurrently.
 	var (
-		providerConfig *models.ProviderConfig
-		modelString    string
-		cwd            string
-		contextFiles   []*ContextFile
-		loadedSkills   []*Skill
-		mcpConfig      *config.Config
-		debug          bool
-		noExtensions   bool
-		maxSteps       int
-		streaming      bool
+		providerConfig        *models.ProviderConfig
+		modelString           string
+		cwd                   string
+		contextFiles          []*ContextFile
+		loadedSkills          []*Skill
+		mcpConfig             *config.Config
+		debug                 bool
+		noExtensions          bool
+		maxSteps              int
+		streaming             bool
+		hasCustomSystemPrompt bool
 	)

 	if err := func() error {
@@ -548,7 +765,8 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		// Initialize config (loads config files and env vars).
 		// Only initialize if not already done (e.g., by CLI's cobra.OnInitialize).
 		// Check if model is already set, which indicates config was loaded.
-		if viper.GetString("model") == "" {
+		// SkipConfig bypasses .kit.yml file loading (viper defaults and env vars still apply).
+		if !opts.SkipConfig && viper.GetString("model") == "" {
 			if err := InitConfig(opts.ConfigFile, false); err != nil {
 				return fmt.Errorf("failed to initialize config: %w", err)
 			}
@@ -578,19 +796,56 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		}

 		// Load context files (AGENTS.md) from the project root.
-		contextFiles = loadContextFiles(cwd)
+		if !opts.NoContextFiles {
+			contextFiles = loadContextFiles(cwd)
+		}

 		// Load skills — either from explicit paths or via auto-discovery.
-		var err error
-		loadedSkills, err = loadSkills(opts)
-		if err != nil {
-			return fmt.Errorf("failed to load skills: %w", err)
+		if !opts.NoSkills {
+			var err error
+			loadedSkills, err = loadSkills(opts)
+			if err != nil {
+				return fmt.Errorf("failed to load skills: %w", err)
+			}
 		}

 		// Always compose the system prompt with runtime context: base prompt +
 		// AGENTS.md context + skills metadata + date/cwd.
+		//
+		// If the configured model has a per-model system prompt (via
+		// modelSettings or customModels params) and the user hasn't
+		// explicitly set system-prompt, use the per-model prompt as the
+		// base instead of the global default.
 		{
 			basePrompt := viper.GetString("system-prompt")
+
+			// Track whether the user explicitly configured a custom system
+			// prompt. When they haven't (basePrompt is the built-in default
+			// or empty), per-model system prompts can replace it on switch.
+			userSetSystemPrompt := basePrompt != "" && basePrompt != defaultSystemPrompt
+			hasCustomSystemPrompt = userSetSystemPrompt
+
+			// Check for per-model system prompt override when no explicit
+			// global system-prompt was configured by the user.
+			if !userSetSystemPrompt {
+				modelStr := viper.GetString("model")
+				if modelStr != "" {
+					if mi := models.LookupModelForSettings(modelStr); mi != nil {
+						var perModelParams *models.GenerationParams
+						// modelSettings takes priority over custom model params.
+						if ms := models.LoadModelSettingsFromConfig(); ms != nil {
+							perModelParams = ms[modelStr]
+						}
+						if perModelParams == nil && mi.Params != nil {
+							perModelParams = mi.Params
+						}
+						if perModelParams != nil && perModelParams.SystemPrompt != "" {
+							basePrompt = models.LoadSystemPromptValue(perModelParams.SystemPrompt)
+						}
+					}
+				}
+			}
+
 			pb := skills.NewPromptBuilder(basePrompt)

 			// Inject AGENTS.md content as project context.
@@ -621,7 +876,7 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		}
 		modelString = viper.GetString("model")
 		debug = viper.GetBool("debug")
-		noExtensions = viper.GetBool("no-extensions")
+		noExtensions = opts.NoExtensions || viper.GetBool("no-extensions")
 		maxSteps = viper.GetInt("max-steps")
 		streaming = viper.GetBool("stream")

@@ -657,16 +912,18 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 	// Pass the pre-built ProviderConfig and scalar viper snapshots so
 	// SetupAgent doesn't need to re-read viper (which would require the lock).
 	setupOpts := kitsetup.AgentSetupOptions{
-		MCPConfig:        mcpConfig,
-		Quiet:            opts.Quiet,
-		CoreTools:        opts.Tools,
-		ExtraTools:       opts.ExtraTools,
-		ToolWrapper:      hookToolWrapper(beforeToolCall, afterToolResult),
-		ProviderConfig:   providerConfig,
-		Debug:            debug,
-		NoExtensions:     noExtensions,
-		MaxSteps:         maxSteps,
-		StreamingEnabled: streaming,
+		MCPConfig:         mcpConfig,
+		Quiet:             opts.Quiet,
+		CoreTools:         opts.Tools,
+		DisableCoreTools:  opts.DisableCoreTools,
+		ExtraTools:        opts.ExtraTools,
+		ToolWrapper:       hookToolWrapper(beforeToolCall, afterToolResult),
+		ProviderConfig:    providerConfig,
+		Debug:             debug,
+		NoExtensions:      noExtensions,
+		MaxSteps:          maxSteps,
+		StreamingEnabled:  streaming,
+		OnMCPServerLoaded: opts.OnMCPServerLoaded,
 	}

 	// Set up OAuth handler for remote MCP servers.
@@ -679,12 +936,19 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		defaultHandler, authErr := NewDefaultMCPAuthHandler()
 		if authErr != nil {
 			// Non-fatal: OAuth just won't be available for remote servers.
-			charmlog.Warn("Failed to create OAuth handler; remote MCP servers requiring auth will fail", "error", authErr)
+			log.Printf("WARN Failed to create OAuth handler; remote MCP servers requiring auth will fail: %v", authErr)
 		} else {
 			setupOpts.AuthHandler = defaultHandler
 		}
 	}

+	// Set up custom token store factory for MCP OAuth tokens.
+	// The SDK MCPTokenStoreFactory is structurally identical to
+	// tools.TokenStoreFactory, so it can be assigned directly.
+	if opts.MCPTokenStoreFactory != nil {
+		setupOpts.TokenStoreFactory = tools.TokenStoreFactory(opts.MCPTokenStoreFactory)
+	}
+
 	if opts.CLI != nil {
 		setupOpts.ShowSpinner = opts.CLI.ShowSpinner
 		setupOpts.SpinnerFunc = opts.CLI.SpinnerFunc
@@ -697,31 +961,42 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		return nil, err
 	}

-	// Initialize tree session.
-	treeSession, err := InitTreeSession(opts)
-	if err != nil {
-		_ = agentResult.Agent.Close()
-		return nil, fmt.Errorf("failed to initialize session: %w", err)
+	// Initialize session manager.
+	var sessionManager SessionManager
+	if opts.SessionManager != nil {
+		// Use custom session manager provided by user.
+		sessionManager = opts.SessionManager
+	} else {
+		// Default: use the built-in TreeManager (existing behavior).
+		treeSession, err := InitTreeSession(opts)
+		if err != nil {
+			_ = agentResult.Agent.Close()
+			return nil, fmt.Errorf("failed to initialize session: %w", err)
+		}
+		// Wrap TreeManager in adapter to satisfy SessionManager interface.
+		sessionManager = NewTreeManagerAdapter(treeSession)
 	}

 	k := &Kit{
-		agent:           agentResult.Agent,
-		treeSession:     treeSession,
-		modelString:     modelString,
-		events:          newEventBus(),
-		autoCompact:     opts.AutoCompact,
-		compactionOpts:  opts.CompactionOptions,
-		contextFiles:    contextFiles,
-		skills:          loadedSkills,
-		extRunner:       agentResult.ExtRunner,
-		bufferedLogger:  agentResult.BufferedLogger,
-		authHandler:     setupOpts.AuthHandler,
-		beforeToolCall:  beforeToolCall,
-		afterToolResult: afterToolResult,
-		beforeTurn:      beforeTurn,
-		afterTurn:       afterTurn,
-		contextPrepare:  contextPrepare,
-		beforeCompact:   beforeCompact,
+		agent:                 agentResult.Agent,
+		session:               sessionManager,
+		modelString:           modelString,
+		events:                newEventBus(),
+		autoCompact:           opts.AutoCompact,
+		compactionOpts:        opts.CompactionOptions,
+		contextFiles:          contextFiles,
+		skills:                loadedSkills,
+		extRunner:             agentResult.ExtRunner,
+		bufferedLogger:        agentResult.BufferedLogger,
+		authHandler:           setupOpts.AuthHandler,
+		opts:                  opts,
+		hasCustomSystemPrompt: hasCustomSystemPrompt,
+		beforeToolCall:        beforeToolCall,
+		afterToolResult:       afterToolResult,
+		beforeTurn:            beforeTurn,
+		afterTurn:             afterTurn,
+		contextPrepare:        contextPrepare,
+		beforeCompact:         beforeCompact,
 	}

 	// Bridge extension events to SDK hooks.
@@ -908,9 +1183,11 @@ type TurnResult struct {
 	// report usage.
 	TotalUsage *LLMUsage

-	// FinalUsage is the token usage from the last API call only. Use this
-	// for context window fill estimation (InputTokens + OutputTokens ≈
-	// current context size). Nil if unavailable.
+	// FinalUsage is the token usage from the last API call only. For context
+	// window fill, sum all categories: InputTokens + CacheReadTokens +
+	// CacheCreationTokens + OutputTokens. With prompt caching, InputTokens
+	// alone understates the context (cached tokens are reported separately).
+	// Nil if unavailable.
 	FinalUsage *LLMUsage

 	// Messages is the full updated conversation after the turn, including
@@ -1239,14 +1516,22 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 				IsStderr:   isStderr,
 			})
 		},
+		// Persist step messages incrementally so that progress survives
+		// crashes and long-running turns don't lose work. Each step's
+		// messages are persisted as a unit: for tool-calling steps this is
+		// the assistant message (with tool_use parts) + tool-role message
+		// (with tool_result parts) as a pair; for the final step it's the
+		// assistant text/reasoning message alone.
+		func(stepMessages []fantasy.Message) {
+			for _, msg := range stepMessages {
+				_, _ = m.session.AppendMessage(msg)
+			}
+		},
 		func(inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens int64) {
 			// Emit step usage event for real-time cost tracking
 			if viper.GetBool("debug") {
-				charmlog.Debug("Kit.generate emitting StepUsageEvent",
-					"input", inputTokens,
-					"output", outputTokens,
-					"cacheRead", cacheReadTokens,
-					"cacheCreate", cacheCreationTokens,
+				log.Printf("DEBUG Kit.generate emitting StepUsageEvent: input=%d output=%d cacheRead=%d cacheCreate=%d",
+					inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens,
 				)
 			}
 			m.events.emit(StepUsageEvent{
@@ -1264,11 +1549,17 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 //  2. Persist pre-generation messages to the tree session.
 //  3. Build context from the tree (walks leaf-to-root for current branch).
 //  4. Emit turn/message start events.
-//  5. Run generation.
-//  6. Emit turn/message end events.
-//  7. Persist post-generation messages (tool calls, results, assistant).
+//  5. Run generation (messages are persisted incrementally per step).
+//  6. Persist any remaining messages not covered by incremental persistence.
+//  7. Emit turn/message end events.
 //  8. Run AfterTurn hooks.
 //
+// During generation, each completed step's messages are persisted immediately
+// via the onStepMessages callback. Tool calls are always persisted as
+// call/response pairs (assistant + tool messages together). Reasoning and
+// text-only assistant messages are persisted as soon as their step completes.
+// This ensures long-running turns don't lose progress on crash or cancellation.
+//
 // promptLabel is the human-readable label emitted in TurnStartEvent.Prompt.
 // prompt is the raw user text passed to BeforeTurn hooks.
 func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, preMessages []fantasy.Message) (*TurnResult, error) {
@@ -1313,9 +1604,9 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr
 		}
 	}

-	// Persist pre-generation messages to tree session.
+	// Persist pre-generation messages to session.
 	for _, msg := range preMessages {
-		_, _ = m.treeSession.AppendLLMMessage(msg)
+		_, _ = m.session.AppendMessage(msg)
 	}

 	// Auto-compact if enabled and conversation is near the context limit.
@@ -1323,8 +1614,8 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr
 		_, _ = m.compactInternal(ctx, m.compactionOpts, "", true) // best-effort, automatic
 	}

-	// Build context from the tree so only the current branch is sent.
-	messages := m.treeSession.GetLLMMessages()
+	// Build context from the session so only the current branch is sent.
+	messages, _, _ := m.session.BuildContext()

 	// Run ContextPrepare hooks — extensions can filter, reorder, or inject messages.
 	if hookResult := m.contextPrepare.run(ContextPrepareHook{Messages: messages}); hookResult != nil && hookResult.Messages != nil {
@@ -1338,16 +1629,18 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr

 	result, err := m.generate(ctx, messages)
 	if err != nil {
-		// Persist any messages from completed steps (tool call/result
-		// pairs) so partial progress is not lost. The agent layer only
-		// includes fully-paired tool_use + tool_result messages in
-		// completedStepMessages, so there are no orphaned entries that
-		// would break subsequent API requests. The user message and any
-		// completed work remain in the session; only the in-progress
-		// (pending) message or tool call is discarded.
-		if result != nil && len(result.ConversationMessages) > sentCount {
-			for _, msg := range result.ConversationMessages[sentCount:] {
-				_, _ = m.treeSession.AppendLLMMessage(msg)
+		// Persist any messages from completed steps that were NOT already
+		// persisted incrementally by the onStepMessages callback. The agent
+		// layer only includes fully-paired tool_use + tool_result messages
+		// in completedStepMessages, so there are no orphaned entries that
+		// would break subsequent API requests.
+		if result != nil && len(result.ConversationMessages) > sentCount {
+			newMessages := result.ConversationMessages[sentCount:]
+			alreadyPersisted := result.PersistedMessageCount
+			if alreadyPersisted < len(newMessages) {
+				for _, msg := range newMessages[alreadyPersisted:] {
+					_, _ = m.session.AppendMessage(msg)
+				}
 			}
 		}
 		m.events.emit(TurnEndEvent{Error: err})
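Review note on the "persist what the step callback did not" logic: both the error path and the success path slice the conversation twice, once by sentCount and once by PersistedMessageCount. The sketch below shows one way to express that as a single bounds-checked helper. The name unpersistedTail and the use of plain strings in place of fantasy.Message are assumptions for illustration.

```go
package main

import "fmt"

// unpersistedTail is a hypothetical helper. conversation is the full message
// list after generation, sentCount is how many messages were sent to the
// model, and alreadyPersisted is how many of the new messages the per-step
// callback has already stored. It returns the messages still to be persisted,
// or nil when there are none or the counts are inconsistent.
func unpersistedTail(conversation []string, sentCount, alreadyPersisted int) []string {
	if sentCount < 0 || sentCount >= len(conversation) {
		return nil // nothing new; also avoids an out-of-range slice expression
	}
	newMessages := conversation[sentCount:]
	if alreadyPersisted < 0 {
		alreadyPersisted = 0
	}
	if alreadyPersisted >= len(newMessages) {
		return nil // everything was persisted incrementally
	}
	return newMessages[alreadyPersisted:]
}

func main() {
	conversation := []string{"user", "assistant+tool_use", "tool_result", "assistant"}
	// One message was sent; the tool-call pair was persisted by the callback.
	fmt.Println(unpersistedTail(conversation, 1, 2)) // prints "[assistant]"
}
```

Keeping both bounds checks in one place means the two call sites cannot drift apart in how they guard the slice expressions.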
@@ -1358,22 +1651,29 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr

 	responseText := result.FinalResponse.Content.Text()

-	// Persist new messages (tool calls, tool results, assistant response)
-	// BEFORE emitting events so that extension handlers calling
-	// GetContextStats() see up-to-date token counts.
+	// Persist any new messages that were NOT already persisted incrementally
+	// by the onStepMessages callback during generation. This handles the
+	// non-streaming path (where onStepMessages is not called) and any edge
+	// cases where the final response messages weren't covered by step callbacks.
 	if len(result.ConversationMessages) > sentCount {
-		for _, msg := range result.ConversationMessages[sentCount:] {
-			_, _ = m.treeSession.AppendLLMMessage(msg)
+		newMessages := result.ConversationMessages[sentCount:]
+		alreadyPersisted := result.PersistedMessageCount
+		if alreadyPersisted < len(newMessages) {
+			for _, msg := range newMessages[alreadyPersisted:] {
+				_, _ = m.session.AppendMessage(msg)
+			}
 		}
 	}

 	// Store the API-reported token count so GetContextStats() matches the
-	// built-in status bar (which uses input + output tokens). The
-	// text-based heuristic misses system prompts, tool definitions, etc.
+	// built-in status bar. The context window is filled by all token
+	// categories: non-cached input, cache reads, cache writes, and output.
+	// With Anthropic prompt caching, InputTokens can be near-zero while
+	// CacheReadTokens/CacheCreationTokens hold the bulk of the context.
 	if result.FinalResponse != nil {
 		u := result.FinalResponse.Usage
 		m.lastInputTokensMu.Lock()
-		m.lastInputTokens = int(u.InputTokens) + int(u.OutputTokens)
+		m.lastInputTokens = int(u.InputTokens) + int(u.CacheReadTokens) + int(u.CacheCreationTokens) + int(u.OutputTokens)
 		m.lastInputTokensMu.Unlock()
 	}
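Review note on the context-fill change: the sketch below contrasts the old estimate (input + output) with the new four-category sum. The usage struct is a hypothetical mirror of the four fields named in the diff, and the token counts are made-up sample values, not measurements.

```go
package main

import "fmt"

// usage is a hypothetical mirror of the token categories referenced in the diff.
type usage struct {
	InputTokens         int64
	CacheReadTokens     int64
	CacheCreationTokens int64
	OutputTokens        int64
}

// contextFill sums every category that occupies the context window.
func contextFill(u usage) int64 {
	return u.InputTokens + u.CacheReadTokens + u.CacheCreationTokens + u.OutputTokens
}

func main() {
	// Sample values only: with prompt caching, most of the prompt is reported
	// as cache reads rather than as input tokens.
	u := usage{InputTokens: 12, CacheReadTokens: 48000, CacheCreationTokens: 1500, OutputTokens: 800}

	oldEstimate := u.InputTokens + u.OutputTokens
	fmt.Println(contextFill(u), oldEstimate) // prints "50312 812"
}
```

In this made-up case the old formula would have reported under 2% of the actual fill, which is the kind of understatement the updated comment describes.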

@@ -1445,7 +1745,7 @@ func (m *Kit) Steer(ctx context.Context, instruction string) (string, error) {
 // Returns an error if there are no previous messages in the session.
 func (m *Kit) FollowUp(ctx context.Context, text string) (string, error) {
 	// Verify there is conversation history to follow up on.
-	if len(m.treeSession.GetLLMMessages()) == 0 {
+	if len(m.session.GetMessages()) == 0 {
 		return "", fmt.Errorf("cannot follow up: no previous messages")
 	}

@@ -1601,10 +1901,12 @@ func (m *Kit) PromptResultWithMessages(ctx context.Context, messages []string) (
 	return m.runTurn(ctx, promptLabel, messages[len(messages)-1], preMessages)
 }

-// ClearSession resets the tree session's leaf pointer to the root, starting
+// ClearSession resets the session's leaf pointer to the root, starting
 // a fresh conversation branch.
 func (m *Kit) ClearSession() {
-	m.treeSession.ResetLeaf()
+	if m.session != nil {
+		_ = m.session.Branch("")
+	}
 }

 // GetModelString returns the current model string identifier (e.g.,
@@ -1673,8 +1975,8 @@ func (m *Kit) Close() error {
 	if m.extRunner != nil && m.extRunner.HasHandlers(extensions.SessionShutdown) {
 		_, _ = m.extRunner.Emit(extensions.SessionShutdownEvent{})
 	}
-	if m.treeSession != nil {
-		_ = m.treeSession.Close()
+	if m.session != nil {
+		_ = m.session.Close()
 	}
 	// Release the OAuth callback port if we own the handler.
 	if closer, ok := m.authHandler.(interface{ Close() error }); ok {
@@ -1682,3 +1984,5 @@ func (m *Kit) Close() error {
 	}
 	return m.agent.Close()
 }
+
+// Conversion helpers are defined in adapter.go.
@@ -0,0 +1,56 @@
+package kit_test
+
+import (
+	"testing"
+
+	kit "github.com/mark3labs/kit/pkg/kit"
+)
+
+// TestMCPServerStatus_TypeSurface verifies the MCPServerStatus type is
+// accessible and has the expected fields.
+func TestMCPServerStatus_TypeSurface(t *testing.T) {
+	s := kit.MCPServerStatus{
+		Name:      "test-server",
+		ToolCount: 5,
+	}
+	if s.Name != "test-server" {
+		t.Errorf("Expected Name 'test-server', got %q", s.Name)
+	}
+	if s.ToolCount != 5 {
+		t.Errorf("Expected ToolCount 5, got %d", s.ToolCount)
+	}
+}
+
+// TestMCPServerConfig_ForDynamicAdd verifies that MCPServerConfig can be
+// constructed with the expected fields for dynamic server management.
+func TestMCPServerConfig_ForDynamicAdd(t *testing.T) {
+	// Stdio server config.
+	stdio := kit.MCPServerConfig{
+		Command:     []string{"npx", "-y", "@modelcontextprotocol/server-github"},
+		Environment: map[string]string{"GITHUB_TOKEN": "test-token"},
+	}
+	if len(stdio.Command) != 3 {
+		t.Errorf("Expected 3 command parts, got %d", len(stdio.Command))
+	}
+	if stdio.Environment["GITHUB_TOKEN"] != "test-token" {
+		t.Error("Expected GITHUB_TOKEN in environment")
+	}
+
+	// Remote server config.
+	remote := kit.MCPServerConfig{
+		URL:     "https://mcp.example.com/sse",
+		Headers: []string{"Authorization: Bearer test"},
+	}
+	if remote.URL != "https://mcp.example.com/sse" {
+		t.Errorf("Unexpected URL: %s", remote.URL)
+	}
+
+	// Config with tool filtering.
+	filtered := kit.MCPServerConfig{
+		Command:      []string{"some-server"},
+		AllowedTools: []string{"read", "write"},
+	}
+	if len(filtered.AllowedTools) != 2 {
+		t.Errorf("Expected 2 allowed tools, got %d", len(filtered.AllowedTools))
+	}
+}
@@ -0,0 +1,144 @@
+package kit
+
+import (
+	"time"
+)
+
+// SessionManager defines the contract for conversation storage backends.
+// Implementations can use files (default), databases, cloud storage, etc.
+//
+// Implementations must be safe for concurrent use. During generation,
+// AppendMessage is called incrementally from the agent's step-completion
+// callback while read methods (GetMessages, GetCurrentBranch, etc.) may be
+// called concurrently from the UI or extension goroutines.
+type SessionManager interface {
+	// AppendMessage adds a message to the current branch and returns its entry ID.
+	// The entry ID is used for tree navigation and must be unique within the session.
+	//
+	// During generation, AppendMessage is called incrementally after each
+	// completed agent step rather than in a batch at the end of the turn.
+	// For tool-calling steps, the assistant message (containing tool_use parts)
+	// and the tool-role message (containing tool_result parts) are appended
+	// together as a pair. This ensures the session never contains an orphaned
+	// tool call without its result, which would break subsequent LLM requests.
+	AppendMessage(msg LLMMessage) (entryID string, err error)
+
+	// GetMessages returns all messages on the current branch (from root to leaf),
+	// including any compaction summaries at the appropriate positions.
+	GetMessages() []LLMMessage
+
+	// BuildContext returns the message history to send to the LLM, applying
+	// compaction rules and branch summaries as needed.
+	// Returns: messages, currentProvider, currentModelID
+	BuildContext() (messages []LLMMessage, provider string, modelID string)
+
+	// Branch moves the leaf pointer to the given entry ID, creating a branch point.
+	// Subsequent AppendMessage calls extend from this new position.
+	// entryID can be empty to reset to root (new conversation branch).
+	Branch(entryID string) error
+
+	// GetCurrentBranch returns the path from root to current leaf as entry metadata.
+	// Used for UI display and navigation.
+	GetCurrentBranch() []BranchEntry
+
+	// GetChildren returns direct child entry IDs for a given parent entry.
+	// Used to display branch points in the conversation tree.
+	GetChildren(parentID string) []string
+
+	// GetEntry returns a specific entry by ID, or nil if not found.
+	GetEntry(entryID string) *BranchEntry
+
+	// GetSessionID returns the unique session identifier (UUID).
+	GetSessionID() string
+
+	// GetSessionName returns the user-defined display name, or empty.
+	GetSessionName() string
+
+	// SetSessionName sets a display name for the session.
+	SetSessionName(name string) error
+
+	// GetCreatedAt returns when the session was created.
+	GetCreatedAt() time.Time
+
+	// IsPersisted returns true if this session writes to durable storage.
+	IsPersisted() bool
+
+	// AppendCompaction adds a compaction entry that summarizes older messages.
+	// firstKeptEntryID is the ID of the first message to preserve in context.
+	// readFiles and modifiedFiles track file changes for the compaction summary.
+	AppendCompaction(summary string, firstKeptEntryID string,
+		tokensBefore, tokensAfter int, messagesRemoved int, readFiles, modifiedFiles []string) (string, error)
+
+	// GetLastCompaction returns the most recent compaction entry on the current
+	// branch, or nil if none exists.
+	GetLastCompaction() *CompactionEntry
+
+	// AppendExtensionData stores custom extension data in the session tree.
+	// Extensions use this to persist state across restarts.
+	AppendExtensionData(extType, data string) (string, error)
+
+	// GetExtensionData returns all extension data entries of the given type
+	// on the current branch. If extType is empty, returns all extension data.
+	GetExtensionData(extType string) []ExtensionDataEntry
+
+	// AppendModelChange records a provider/model switch in the session.
+	AppendModelChange(provider, modelID string) (string, error)
+
+	// GetContextEntryIDs returns the entry IDs corresponding to the messages
+	// returned by BuildContext, in the same order. Used by compaction to
+	// determine which entries to summarize.
+	GetContextEntryIDs() []string
+
+	// Close releases resources (database connections, file handles, etc.).
+	Close() error
+}
+
+// BranchEntry represents a single node in the conversation tree.
+// This is an SDK-friendly struct (not one of the internal entry types).
+type BranchEntry struct {
+	ID        string
+	ParentID  string
+	Type      EntryType // "message", "branch_summary", "model_change", "compaction", "extension_data"
+	Role      string    // for messages: "user", "assistant", "system", "tool"
+	Content   string    // text content or summary
+	Model     string    // model used (for messages and model_change)
+	Provider  string    // provider used
+	Timestamp time.Time
+	Children  []string // child entry IDs (for tree display)
+
+	// RawParts contains the full typed content parts for structured access.
+	// Only populated for message entries.
+	RawParts []ContentPart
+}
+
+// EntryType identifies the kind of entry in the session tree.
+type EntryType string
+
+const (
+	EntryTypeMessage       EntryType = "message"
+	EntryTypeBranchSummary EntryType = "branch_summary"
+	EntryTypeModelChange   EntryType = "model_change"
+	EntryTypeCompaction    EntryType = "compaction"
+	EntryTypeExtensionData EntryType = "extension_data"
+)
+
+// CompactionEntry represents a context compaction/summarization event.
+type CompactionEntry struct {
+	ID               string
+	Summary          string
+	FirstKeptEntryID string
+	TokensBefore     int
+	TokensAfter      int
+	MessagesRemoved  int
+	ReadFiles        []string
+	ModifiedFiles    []string
+	Timestamp        time.Time
+}
+
+// ExtensionDataEntry represents custom extension data stored in the session.
+type ExtensionDataEntry struct {
+	ID        string
+	ExtType   string
+	Data      string
+	Timestamp time.Time
+}
@@ -8,7 +8,6 @@ import (
 	"time"

 	"github.com/mark3labs/kit/internal/extensions"
-	"github.com/mark3labs/kit/internal/message"
 	"github.com/mark3labs/kit/internal/session"
 )

@@ -47,49 +46,73 @@ func OpenTreeSession(path string) (*TreeManager, error) {

 // --- Instance methods on Kit ---

+// GetSessionManager returns the session manager, or nil if not configured.
+func (m *Kit) GetSessionManager() SessionManager {
+	return m.session
+}
+
 // GetTreeSession returns the tree session manager, or nil if not configured.
+//
+// Deprecated: Use GetSessionManager instead. GetTreeSession returns nil when a
+// custom SessionManager is installed.
 func (m *Kit) GetTreeSession() *TreeManager {
-	return m.treeSession
+	// Try to unwrap the adapter if using default implementation
+	if adapter, ok := m.session.(*treeManagerAdapter); ok {
+		return adapter.inner
+	}
+	return nil
+}
+
+// SetSessionManager replaces the session manager on a Kit instance.
+func (m *Kit) SetSessionManager(sm SessionManager) {
+	m.session = sm
 }

 // SetTreeSession replaces the tree session on a Kit instance. This is used by
 // the CLI when it handles session creation externally (e.g. --resume with a
 // TUI picker) and needs to inject the result into a Kit-like workflow.
+//
+// Deprecated: Use SetSessionManager instead.
 func (m *Kit) SetTreeSession(ts *TreeManager) {
-	m.treeSession = ts
+	m.session = NewTreeManagerAdapter(ts)
 }

-// GetSessionPath returns the file path of the active tree session, or empty
-// for in-memory sessions or when no tree session is configured.
+// GetSessionPath returns the file path of the active session, or empty
+// for in-memory sessions or when no file-based session is configured.
 func (m *Kit) GetSessionPath() string {
-	if m.treeSession != nil {
-		return m.treeSession.GetFilePath()
+	// Only file-based sessions have a path
+	// Try to get it from the underlying TreeManager if using default adapter
+	if m.session == nil {
+		return ""
+	}
+	// Check if it's the default adapter
+	if adapter, ok := m.session.(*treeManagerAdapter); ok {
+		return adapter.inner.GetFilePath()
 	}
 	return ""
 }

-// GetSessionID returns the UUID of the active tree session, or empty when no
-// tree session is configured.
+// GetSessionID returns the UUID of the active session, or empty when no
+// session is configured.
 func (m *Kit) GetSessionID() string {
-	if m.treeSession != nil {
-		return m.treeSession.GetSessionID()
+	if m.session == nil {
+		return ""
 	}
-	return ""
+	return m.session.GetSessionID()
 }

-// Branch moves the tree session's leaf pointer to the given entry ID, creating
+// Branch moves the session's leaf pointer to the given entry ID, creating
 // a branch point. Subsequent Prompt() calls will extend from the new position.
 func (m *Kit) Branch(entryID string) error {
-	return m.treeSession.Branch(entryID)
+	if m.session == nil {
+		return fmt.Errorf("no session available")
+	}
+	return m.session.Branch(entryID)
 }

-// SetSessionName sets a user-defined display name for the active tree session.
+// SetSessionName sets a user-defined display name for the active session.
 func (m *Kit) SetSessionName(name string) error {
-	if m.treeSession == nil {
-		return fmt.Errorf("session naming requires a tree session")
+	if m.session == nil {
+		return fmt.Errorf("session naming requires a session")
 	}
-	_, err := m.treeSession.AppendSessionInfo(name)
-	return err
+	return m.session.SetSessionName(name)
 }

 // ---------------------------------------------------------------------------
@@ -97,27 +120,27 @@ func (m *Kit) SetSessionName(name string) error {
 // ---------------------------------------------------------------------------

 // GetTreeNode returns a node by ID with full metadata and children.
-// Returns nil if entry not found or no tree session.
+// Returns nil if entry not found or no session.
 func (m *Kit) GetTreeNode(entryID string) *TreeNode {
-	if m.treeSession == nil {
+	if m.session == nil {
 		return nil
 	}
-	entry := m.treeSession.GetEntry(entryID)
+	entry := m.session.GetEntry(entryID)
 	if entry == nil {
 		return nil
 	}
-	return m.entryToTreeNode(entry)
+	return m.branchEntryToTreeNode(entry)
 }

 // GetCurrentBranch returns the path from root to current leaf as TreeNodes.
 func (m *Kit) GetCurrentBranch() []TreeNode {
-	if m.treeSession == nil {
+	if m.session == nil {
 		return nil
 	}
-	branch := m.treeSession.GetBranch("")
+	branch := m.session.GetCurrentBranch()
 	var nodes []TreeNode
 	for _, entry := range branch {
-		node := m.entryToTreeNode(entry)
+		node := m.branchEntryToTreeNode(&entry)
 		if node != nil {
 			nodes = append(nodes, *node)
 		}
@@ -127,34 +150,34 @@ func (m *Kit) GetCurrentBranch() []TreeNode {

 // GetChildren returns direct child IDs of an entry.
 func (m *Kit) GetChildren(parentID string) []string {
-	if m.treeSession == nil {
+	if m.session == nil {
 		return nil
 	}
-	return m.treeSession.GetChildren(parentID)
+	return m.session.GetChildren(parentID)
 }

 // NavigateTo branches/forks the session to the specified entry ID.
 // Returns an error if the session is unavailable or the entry ID is not found.
 func (m *Kit) NavigateTo(entryID string) error {
-	if m.treeSession == nil {
-		return fmt.Errorf("no tree session available")
+	if m.session == nil {
+		return fmt.Errorf("no session available")
 	}
-	return m.treeSession.Branch(entryID)
+	return m.session.Branch(entryID)
 }

 // SummarizeBranch uses the LLM to summarize the conversation between two
 // entry IDs. Returns the summary text, or an error if the range is invalid,
 // the session is unavailable, or the LLM call fails.
 func (m *Kit) SummarizeBranch(fromID, toID string) (string, error) {
-	if m.treeSession == nil {
-		return "", fmt.Errorf("no tree session available")
+	if m.session == nil {
+		return "", fmt.Errorf("no session available")
 	}

 	// Get the branch and find the range
-	branch := m.treeSession.GetBranch("")
+	branch := m.session.GetCurrentBranch()
 	var startIdx, endIdx = -1, -1
 	for i, entry := range branch {
-		id := m.treeSession.EntryID(entry)
+		id := entry.ID
 		if id == fromID {
 			startIdx = i
 		}
@@ -170,7 +193,7 @@ func (m *Kit) SummarizeBranch(fromID, toID string) (string, error) {
 	// Build text to summarize
 	var content strings.Builder
 	for i := startIdx; i <= endIdx; i++ {
-		node := m.entryToTreeNode(branch[i])
+		node := m.branchEntryToTreeNode(&branch[i])
 		if node != nil && node.Content != "" {
 			fmt.Fprintf(&content, "[%s] %s\n\n", node.Role, node.Content)
 		}
@@ -195,73 +218,81 @@ func (m *Kit) SummarizeBranch(fromID, toID string) (string, error) {
 // CollapseBranch replaces a branch range with a summary entry.
 // Returns an error if the session is unavailable or the operation fails.
 func (m *Kit) CollapseBranch(fromID, toID, summary string) error {
-	if m.treeSession == nil {
-		return fmt.Errorf("no tree session available")
+	if m.session == nil {
+		return fmt.Errorf("no session available")
 	}
-	_, err := m.treeSession.AppendBranchSummary(fromID, summary)
-	return err
+	// Note: This operation is not directly supported by SessionManager interface
+	// as it requires AppendBranchSummary which is TreeManager-specific.
+	// For custom SessionManagers, this would need to be implemented differently.
+	// For now, we try to use the underlying TreeManager if available.
+	if adapter, ok := m.session.(*treeManagerAdapter); ok {
+		_, err := adapter.inner.AppendBranchSummary(fromID, summary)
+		return err
+	}
+	return fmt.Errorf("CollapseBranch not supported by custom session manager")
 }

-// entryToTreeNode converts a session entry to a TreeNode.
-func (m *Kit) entryToTreeNode(entry any) *TreeNode {
-	switch e := entry.(type) {
-	case *session.MessageEntry:
-		msg, err := e.ToMessage()
-		if err != nil {
-			return nil
-		}
+// branchEntryToTreeNode converts a BranchEntry to a TreeNode.
+func (m *Kit) branchEntryToTreeNode(entry *BranchEntry) *TreeNode {
+	if entry == nil {
+		return nil
+	}
+
+	switch entry.Type {
+	case EntryTypeMessage:
+		// Build content from RawParts
 		var content strings.Builder
-		for _, p := range msg.Parts {
+		for _, p := range entry.RawParts {
 			switch pt := p.(type) {
-			case message.TextContent:
+			case TextContent:
 				content.WriteString(pt.Text)
-			case message.ReasoningContent:
+			case ReasoningContent:
 				content.WriteString(pt.Thinking)
-			case message.ToolCall:
+			case ToolCall:
 				fmt.Fprintf(&content, "[tool_call: %s]", pt.Name)
-			case message.ToolResult:
+			case ToolResult:
 				fmt.Fprintf(&content, "[tool_result: %s]", pt.Content)
 			}
 		}
 		return &TreeNode{
-			ID:        e.ID,
-			ParentID:  e.ParentID,
+			ID:        entry.ID,
+			ParentID:  entry.ParentID,
 			Type:      "message",
-			Role:      string(msg.Role),
+			Role:      entry.Role,
 			Content:   content.String(),
-			Model:     msg.Model,
-			Provider:  msg.Provider,
-			Timestamp: e.Timestamp.Format(time.RFC3339),
-			Children:  m.treeSession.GetChildren(e.ID),
+			Model:     entry.Model,
+			Provider:  entry.Provider,
+			Timestamp: entry.Timestamp.Format(time.RFC3339),
+			Children:  m.session.GetChildren(entry.ID),
 		}
-	case *session.BranchSummaryEntry:
+	case EntryTypeBranchSummary:
 		return &TreeNode{
-			ID:        e.ID,
-			ParentID:  e.ParentID,
+			ID:        entry.ID,
+			ParentID:  entry.ParentID,
 			Type:      "branch_summary",
-			Content:   e.Summary,
-			Timestamp: e.Timestamp.Format(time.RFC3339),
-			Children:  m.treeSession.GetChildren(e.ID),
+			Content:   entry.Content,
+			Timestamp: entry.Timestamp.Format(time.RFC3339),
+			Children:  m.session.GetChildren(entry.ID),
 		}
-	case *session.ModelChangeEntry:
+	case EntryTypeModelChange:
 		return &TreeNode{
-			ID:        e.ID,
-			ParentID:  e.ParentID,
+			ID:        entry.ID,
+			ParentID:  entry.ParentID,
 			Type:      "model_change",
-			Content:   fmt.Sprintf("Model changed to %s/%s", e.Provider, e.ModelID),
-			Model:     e.Provider + "/" + e.ModelID,
-			Provider:  e.Provider,
-			Timestamp: e.Timestamp.Format(time.RFC3339),
-			Children:  m.treeSession.GetChildren(e.ID),
+			Content:   entry.Content,
+			Model:     entry.Model,
+			Provider:  entry.Provider,
+			Timestamp: entry.Timestamp.Format(time.RFC3339),
+			Children:  m.session.GetChildren(entry.ID),
 		}
-	case *session.ExtensionDataEntry:
+	case EntryTypeExtensionData:
 		return &TreeNode{
-			ID:        e.ID,
-			ParentID:  e.ParentID,
+			ID:        entry.ID,
+			ParentID:  entry.ParentID,
 			Type:      "extension_data",
-			Content:   fmt.Sprintf("Extension data: %s", e.ExtType),
-			Timestamp: e.Timestamp.Format(time.RFC3339),
-			Children:  m.treeSession.GetChildren(e.ID),
+			Content:   entry.Content,
+			Timestamp: entry.Timestamp.Format(time.RFC3339),
+			Children:  m.session.GetChildren(entry.ID),
 		}
 	default:
 		return nil
@@ -1,6 +1,7 @@
 package kit

 import (
+	"fmt"
 	"os"

 	"github.com/mark3labs/kit/internal/extensions"
@@ -136,3 +137,15 @@ func (m *Kit) ClearSkillCache() {
 	defer m.skillCache.mu.Unlock()
 	m.skillCache.skills = nil
 }
+
+// ReloadSkills re-discovers skills from disk, replacing the current set.
+// This is called by file watchers when skill files change.
+func (m *Kit) ReloadSkills() error {
+	newSkills, err := loadSkills(m.opts)
+	if err != nil {
+		return fmt.Errorf("reloading skills: %w", err)
+	}
+	m.skills = newSkills
+	m.ClearSkillCache()
+	return nil
+}
@@ -1,6 +1,8 @@
 package kit

 import (
+	"context"
+
 	"charm.land/fantasy"

 	"github.com/mark3labs/kit/internal/core"
@@ -16,6 +18,123 @@ type ToolOption = core.ToolOption
 // If empty, os.Getwd() is used at execution time.
 var WithWorkDir = core.WithWorkDir

+// --- Custom tool creation ---
+
+// ToolOutput is the return value from custom tool handlers created with
+// [NewTool] or [NewParallelTool]. It provides a dependency-free way to
+// return results without importing the underlying LLM framework.
+type ToolOutput struct {
+	// Content is the text content returned to the LLM.
+	Content string
+
+	// IsError, when true, signals to the LLM that the tool call failed.
+	IsError bool
+
+	// Data contains optional binary data (images, audio, etc.).
+	Data []byte
+
+	// MediaType is the MIME type for binary Data (e.g. "image/png").
+	MediaType string
+
+	// Metadata is optional opaque metadata attached to the response.
+	// It is not sent to the LLM but may be consumed by hooks or the UI.
+	Metadata any
+}
+
+// TextResult creates a successful text [ToolOutput].
+func TextResult(content string) ToolOutput {
+	return ToolOutput{Content: content}
+}
+
+// ErrorResult creates an error [ToolOutput]. The LLM will see the content
+// as a tool error, allowing it to retry or adjust its approach.
+func ErrorResult(content string) ToolOutput {
+	return ToolOutput{Content: content, IsError: true}
+}
+
+// toolCallIDKey is the context key for the tool call ID.
+type toolCallIDKey struct{}
+
+// ToolCallIDFromContext extracts the tool call ID from the context.
+// The call ID is set automatically by [NewTool] and [NewParallelTool]
+// before invoking the handler. Returns an empty string if no ID is present.
+func ToolCallIDFromContext(ctx context.Context) string {
+	s, _ := ctx.Value(toolCallIDKey{}).(string)
+	return s
+}
+
+// NewTool creates a custom [Tool] with automatic JSON schema generation from
+// the TInput struct type. The handler receives a typed input (deserialized
+// from the LLM's JSON arguments) and returns a [ToolOutput].
+//
+// Struct tags on TInput control the generated schema:
+//
+//	json:"name"         → parameter name
+//	description:"..."   → parameter description shown to the LLM
+//	enum:"a,b,c"        → restrict valid values
+//	omitempty            → marks the parameter as optional
+//
+// The tool call ID is injected into the context and can be retrieved with
+// [ToolCallIDFromContext].
+//
+// Example:
+//
+//	type WeatherInput struct {
+//	    City string `json:"city" description:"City name"`
+//	}
+//
+//	tool := kit.NewTool("get_weather", "Get weather for a city",
+//	    func(ctx context.Context, input WeatherInput) (kit.ToolOutput, error) {
+//	        return kit.TextResult("72°F, sunny in " + input.City), nil
+//	    },
+//	)
+func NewTool[TInput any](name, description string, fn func(ctx context.Context, input TInput) (ToolOutput, error)) Tool {
+	return fantasy.NewAgentTool(name, description,
+		func(ctx context.Context, input TInput, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+			ctx = context.WithValue(ctx, toolCallIDKey{}, call.ID)
+			result, err := fn(ctx, input)
+			if err != nil {
+				return fantasy.NewTextErrorResponse(err.Error()), nil
+			}
+			resp := fantasy.ToolResponse{
+				Content:   result.Content,
+				IsError:   result.IsError,
+				Data:      result.Data,
+				MediaType: result.MediaType,
+			}
+			if result.Metadata != nil {
+				resp = fantasy.WithResponseMetadata(resp, result.Metadata)
+			}
+			return resp, nil
+		},
+	)
+}
+
+// NewParallelTool is like [NewTool] but marks the tool as safe for concurrent
+// execution alongside other tools. Use this when the tool has no side effects
+// or when concurrent calls are safe.
+func NewParallelTool[TInput any](name, description string, fn func(ctx context.Context, input TInput) (ToolOutput, error)) Tool {
+	return fantasy.NewParallelAgentTool(name, description,
+		func(ctx context.Context, input TInput, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+			ctx = context.WithValue(ctx, toolCallIDKey{}, call.ID)
+			result, err := fn(ctx, input)
+			if err != nil {
+				return fantasy.NewTextErrorResponse(err.Error()), nil
+			}
+			resp := fantasy.ToolResponse{
+				Content:   result.Content,
+				IsError:   result.IsError,
+				Data:      result.Data,
+				MediaType: result.MediaType,
+			}
+			if result.Metadata != nil {
+				resp = fantasy.WithResponseMetadata(resp, result.Metadata)
+			}
+			return resp, nil
+		},
+	)
+}
+
 // --- Individual tool constructors ---

 // NewReadTool creates a file-reading tool.
@@ -0,0 +1,119 @@
+package kit_test
+
+import (
+	"context"
+	"testing"
+
+	kit "github.com/mark3labs/kit/pkg/kit"
+)
+
+// TestNewTool_BasicTextResult verifies that NewTool creates a working tool
+// that returns text content via ToolOutput.
+func TestNewTool_BasicTextResult(t *testing.T) {
+	type Input struct {
+		Name string `json:"name"`
+	}
+
+	tool := kit.NewTool("greet", "Greet someone",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.TextResult("hello " + input.Name), nil
+		},
+	)
+
+	info := tool.Info()
+	if info.Name != "greet" {
+		t.Errorf("Info().Name = %q, want %q", info.Name, "greet")
+	}
+	if info.Description != "Greet someone" {
+		t.Errorf("Info().Description = %q, want %q", info.Description, "Greet someone")
+	}
+	if info.Parallel {
+		t.Error("NewTool should not mark tool as parallel")
+	}
+}
+
+// TestNewParallelTool_MarkedParallel verifies that NewParallelTool marks the
+// tool as safe for concurrent execution.
+func TestNewParallelTool_MarkedParallel(t *testing.T) {
+	type Input struct {
+		Query string `json:"query"`
+	}
+
+	tool := kit.NewParallelTool("search", "Search for things",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.TextResult("found: " + input.Query), nil
+		},
+	)
+
+	info := tool.Info()
+	if info.Name != "search" {
+		t.Errorf("Info().Name = %q, want %q", info.Name, "search")
+	}
+	if !info.Parallel {
+		t.Error("NewParallelTool should mark tool as parallel")
+	}
+}
+
+// TestTextResult verifies the TextResult convenience constructor.
+func TestTextResult(t *testing.T) {
+	r := kit.TextResult("ok")
+	if r.Content != "ok" {
+		t.Errorf("Content = %q, want %q", r.Content, "ok")
+	}
+	if r.IsError {
+		t.Error("TextResult should not set IsError")
+	}
+}
+
+// TestErrorResult verifies the ErrorResult convenience constructor.
+func TestErrorResult(t *testing.T) {
+	r := kit.ErrorResult("bad input")
+	if r.Content != "bad input" {
+		t.Errorf("Content = %q, want %q", r.Content, "bad input")
+	}
+	if !r.IsError {
+		t.Error("ErrorResult should set IsError")
+	}
+}
+
+// TestToolCallIDFromContext verifies that a context carrying no tool call ID
+// yields an empty string.
+func TestToolCallIDFromContext(t *testing.T) {
+	// Empty context returns empty string.
+	if id := kit.ToolCallIDFromContext(context.Background()); id != "" {
+		t.Errorf("expected empty string from bare context, got %q", id)
+	}
+}
+
+// TestToolOutput_Metadata verifies that metadata can be set on ToolOutput.
+func TestToolOutput_Metadata(t *testing.T) {
+	r := kit.ToolOutput{
+		Content:  "data",
+		Metadata: map[string]string{"key": "value"},
+	}
+	if r.Metadata == nil {
+		t.Error("expected non-nil Metadata")
+	}
+	m, ok := r.Metadata.(map[string]string)
+	if !ok {
+		t.Fatalf("expected map[string]string, got %T", r.Metadata)
+	}
+	if m["key"] != "value" {
+		t.Errorf("Metadata[key] = %q, want %q", m["key"], "value")
+	}
+}
+
+// TestToolOutput_BinaryData verifies that binary data fields work correctly.
+func TestToolOutput_BinaryData(t *testing.T) {
+	data := []byte{0x89, 0x50, 0x4E, 0x47}
+	r := kit.ToolOutput{
+		Content:   "image result",
+		Data:      data,
+		MediaType: "image/png",
+	}
+	if len(r.Data) != 4 {
+		t.Errorf("Data len = %d, want 4", len(r.Data))
+	}
+	if r.MediaType != "image/png" {
+		t.Errorf("MediaType = %q, want %q", r.MediaType, "image/png")
+	}
+}
@@ -11,6 +11,7 @@ import (
 	"github.com/mark3labs/kit/internal/message"
 	"github.com/mark3labs/kit/internal/models"
 	"github.com/mark3labs/kit/internal/session"
+	"github.com/mark3labs/mcp-go/client/transport"
 )

 // ==== Message Types (internal/message/content.go) ====
@@ -204,6 +205,29 @@ type CompactionResult = compaction.CompactionResult
 // CompactionOptions configures compaction behaviour.
 type CompactionOptions = compaction.CompactionOptions

+// ==== MCP OAuth Types ====
+
+// MCPTokenStore persists OAuth tokens for a single MCP server. Implementations
+// must be safe for concurrent use.
+//
+// This is a type alias for the mcp-go transport.TokenStore interface. SDK
+// consumers can implement this interface to provide custom storage backends
+// (database, encrypted file, in-memory, etc.).
+type MCPTokenStore = transport.TokenStore
+
+// MCPToken represents an OAuth token for an MCP server, containing access
+// and refresh tokens along with expiration metadata.
+type MCPToken = transport.Token
+
+// MCPTokenStoreFactory creates an [MCPTokenStore] for a given MCP server URL.
+// It is called once per remote MCP server during connection setup.
+type MCPTokenStoreFactory func(serverURL string) (MCPTokenStore, error)
+
+// ErrMCPNoToken is the sentinel error that [MCPTokenStore] implementations
+// should return from GetToken when no token is stored for the server.
+// Callers can check for this with errors.Is.
+var ErrMCPNoToken = transport.ErrNoToken
+
 // ==== Constructor & Helper Functions ====

 // ParseModelString parses a model string in "provider/model" format.
@@ -17,7 +17,7 @@ import time
 import os

 KIT_BIN = os.path.join(os.path.dirname(__file__), "..", "output", "kit")
-MODEL   = "opencode/kimi-k2.5"
+MODEL   = os.environ.get("MODEL", "opencode/kimi-k2.5")
 CWD     = os.path.expanduser("~")
 TIMEOUT = 60  # seconds to wait for the prompt to complete

@@ -85,18 +85,33 @@ host, err := kit.New(ctx, &kit.Options{
    SessionPath: "/path/to/session.jsonl", // open specific session file
    Continue:    true,                // resume most recent session for SessionDir
    NoSession:   true,                // ephemeral in-memory session, no disk persistence
+    SessionManager: myCustomSession,  // custom SessionManager implementation (advanced)

    // Tools
-    Tools:      []kit.Tool{kit.NewBashTool()}, // REPLACES entire default tool set
-    ExtraTools: []kit.Tool{myTool},            // ADDS alongside core/MCP/extension tools
+    Tools:            []kit.Tool{kit.NewBashTool()}, // REPLACES entire default tool set
+    ExtraTools:       []kit.Tool{myTool},            // ADDS alongside core/MCP/extension tools
+    DisableCoreTools: true,                        // Use no core tools (0 tools, for chat-only)
+
+    // Configuration
+    SkipConfig:   true,                        // Skip .kit.yml files (viper defaults + env vars still apply)

    // Skills
    Skills:    []string{"/path/to/skill.md"}, // explicit skill files (empty = auto-discover)
    SkillsDir: "/path/to/skills",             // override project-local skills dir
+    NoSkills:  true,                          // disable skill loading entirely
+
+    // Feature toggles
+    NoExtensions:   true,                     // disable Yaegi extension loading entirely
+    NoContextFiles: true,                     // disable automatic AGENTS.md loading

    // Compaction
    AutoCompact:       true,                        // auto-compact near context limit
    CompactionOptions: &kit.CompactionOptions{...}, // nil = defaults
+
+    // MCP OAuth
+    MCPTokenStoreFactory: func(serverURL string) (kit.MCPTokenStore, error) {
+        return myCustomStore(serverURL), nil  // custom OAuth token storage
+    },
 })
 ```

@@ -120,8 +135,12 @@ result, err := host.PromptResult(ctx, "Analyze this file")
 // result.StopReason   — "stop", "length", "tool-calls", "error", etc.
 // result.SessionID    — session UUID
 // result.TotalUsage   — aggregate tokens across all steps (*kit.LLMUsage)
-//                        LLMUsage{InputTokens, OutputTokens, TotalTokens, ...}
+//                        LLMUsage{InputTokens, OutputTokens, TotalTokens,
+//                                 ReasoningTokens, CacheCreationTokens, CacheReadTokens}
 // result.FinalUsage   — tokens from last API call only (*kit.LLMUsage)
+//                        For context window fill, sum: InputTokens + CacheReadTokens +
+//                        CacheCreationTokens + OutputTokens (with prompt caching,
+//                        InputTokens alone understates the context)
 // result.Messages     — full updated conversation ([]kit.LLMMessage)
 //                        LLMMessage{Role kit.LLMMessageRole, Content string}
 ```
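The context-window sum described in the comment above can be sketched as follows. This is only an illustration: the `usage` struct is a hypothetical stand-in that mirrors the field names listed for `LLMUsage`, and the `int64` field types are an assumption, not something the diff confirms.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the counters named in the docs above.
// It is NOT the real kit.LLMUsage type; field types are assumed.
type usage struct {
	InputTokens         int64
	OutputTokens        int64
	CacheReadTokens     int64
	CacheCreationTokens int64
}

// contextFill adds up every counter that occupies the context window.
// With prompt caching, InputTokens alone would understate the total.
func contextFill(u usage) int64 {
	return u.InputTokens + u.CacheReadTokens + u.CacheCreationTokens + u.OutputTokens
}

func main() {
	u := usage{
		InputTokens:         1200,
		OutputTokens:        300,
		CacheReadTokens:     8000,
		CacheCreationTokens: 500,
	}
	fmt.Println(contextFill(u)) // prints 10000
}
```

In this made-up example, `InputTokens` alone (1200) would report barely a tenth of the real fill, which is the point the comment makes about cached prompts.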
@@ -342,6 +361,77 @@ Lower values run first. Within the same priority, registration order applies. Fi

 ## Tools

+### Creating custom tools
+
+Use `kit.NewTool` to create custom tools. The JSON schema is auto-generated from the input struct — no external dependencies required:
+
+```go
+type WeatherInput struct {
+    City string `json:"city" description:"City name, e.g. 'San Francisco'"`
+}
+
+weatherTool := kit.NewTool("get_weather", "Get current weather for a city",
+    func(ctx context.Context, input WeatherInput) (kit.ToolOutput, error) {
+        // Your logic here (API calls, database lookups, etc.)
+        return kit.TextResult("72°F, sunny in " + input.City), nil
+    },
+)
+
+host, _ := kit.New(ctx, &kit.Options{
+    ExtraTools: []kit.Tool{weatherTool},
+})
+```
+
+**Struct tags** control the generated schema:
+
+| Tag | Purpose | Example |
+|-----|---------|---------|
+| `json:"name"` | Parameter name | `json:"city"` |
+| `description:"..."` | Description shown to the LLM | `description:"City name"` |
+| `enum:"a,b,c"` | Restrict valid values | `enum:"json,text,csv"` |
+| `omitempty` | Marks parameter as optional | `json:"limit,omitempty"` |
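To make the tag table concrete, here is a rough sketch of how such tags could be read with the standard `reflect` package. It is not Kit's actual schema generator: `param`, `describeParams`, and `exportInput` are hypothetical names, and treating every field without `omitempty` as required is an assumption drawn from the table.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// param is a hypothetical, simplified description of one tool parameter.
type param struct {
	Name        string
	Description string
	Enum        []string
	Required    bool
}

// describeParams reads the json, description, and enum tags of a struct value.
// It expects a struct (not a pointer) and does not recurse into nested types.
func describeParams(v any) []param {
	t := reflect.TypeOf(v)
	var out []param
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		parts := strings.Split(f.Tag.Get("json"), ",")
		name := parts[0]
		if name == "" {
			name = f.Name // fall back to the Go field name
		}
		p := param{Name: name, Description: f.Tag.Get("description"), Required: true}
		for _, opt := range parts[1:] {
			if opt == "omitempty" {
				p.Required = false // omitempty marks the parameter optional
			}
		}
		if e := f.Tag.Get("enum"); e != "" {
			p.Enum = strings.Split(e, ",")
		}
		out = append(out, p)
	}
	return out
}

// exportInput is a hypothetical tool input using all four tags from the table.
type exportInput struct {
	Format string `json:"format" description:"Output format" enum:"json,text,csv"`
	Limit  int    `json:"limit,omitempty" description:"Maximum rows"`
}

func main() {
	for _, p := range describeParams(exportInput{}) {
		fmt.Printf("%s required=%v enum=%v\n", p.Name, p.Required, p.Enum)
	}
}
```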
+
+**Return helpers:**
+
+| Function | Description |
+|----------|-------------|
+| `kit.TextResult(content)` | Successful text result |
+| `kit.ErrorResult(content)` | Error result (LLM sees it as a tool error) |
+
+**ToolOutput fields** (for advanced use):
+
+```go
+kit.ToolOutput{
+    Content:   "result text",     // text returned to the LLM
+    IsError:   false,             // true = LLM sees this as an error
+    Data:      pngBytes,          // optional binary data (images, audio)
+    MediaType: "image/png",       // MIME type for binary Data
+    Metadata:  map[string]any{},  // opaque metadata for hooks/UI (not sent to LLM)
+}
+```
+
+**Parallel tools** — mark as safe for concurrent execution:
+
+```go
+searchTool := kit.NewParallelTool("search", "Search the web",
+    func(ctx context.Context, input SearchInput) (kit.ToolOutput, error) {
+        return kit.TextResult("results..."), nil
+    },
+)
+```
+
+**Tool call ID** — available in context for logging/tracing:
+
+```go
+tool := kit.NewTool("my_tool", "...",
+    func(ctx context.Context, input MyInput) (kit.ToolOutput, error) {
+        callID := kit.ToolCallIDFromContext(ctx) // correlation ID from the LLM
+        log.Printf("[%s] my_tool called", callID)
+        return kit.TextResult("ok"), nil
+    },
+)
+```
+
 ### Built-in tool constructors

 ```go
@@ -390,6 +480,7 @@ names := host.GetToolNames()       // []string of all tool names
 tools := host.GetTools()           // []kit.Tool (full tool objects)
 mcpCount := host.GetMCPToolCount() // tools from MCP servers
 extCount := host.GetExtensionToolCount() // tools from extensions
+ready := host.MCPToolsReady()      // true when async MCP tool loading is complete
 ```

 ---
@@ -431,6 +522,72 @@ kit.DeleteSession("/path/to/session.jsonl")
 tm, _ := kit.OpenTreeSession("/path/to/session.jsonl") // open for direct access
 ```

+### Custom Session Manager (Advanced)
+
+You can provide a custom session manager to store conversation history in your own backend (database, cloud storage, etc.) instead of the default JSONL files.
+
+```go
+// Implement the SessionManager interface
+type MyDatabaseSessionManager struct {
+    db *sql.DB
+    // ... other fields
+}
+
+func (s *MyDatabaseSessionManager) AppendMessage(msg kit.LLMMessage) (string, error) {
+    // Store message in your database
+}
+
+func (s *MyDatabaseSessionManager) GetMessages() []kit.LLMMessage {
+    // Retrieve messages from your database
+}
+
+// ... implement all other SessionManager methods
+
+// Use with Kit
+host, _ := kit.New(ctx, &kit.Options{
+    SessionManager: myCustomSession,  // Your custom implementation
+    Model: "anthropic/claude-sonnet-latest",
+})
+```
+
+**SessionManager Interface:**
+
+```go
+type SessionManager interface {
+    AppendMessage(msg kit.LLMMessage) (entryID string, err error)
+    GetMessages() []kit.LLMMessage
+    BuildContext() (messages []kit.LLMMessage, provider string, modelID string)
+    Branch(entryID string) error
+    GetCurrentBranch() []kit.BranchEntry
+    GetChildren(parentID string) []string
+    GetEntry(entryID string) *kit.BranchEntry
+    GetSessionID() string
+    GetSessionName() string
+    SetSessionName(name string) error
+    GetCreatedAt() time.Time
+    IsPersisted() bool
+    AppendCompaction(summary string, firstKeptEntryID string,
+        tokensBefore, tokensAfter int, messagesRemoved int, readFiles, modifiedFiles []string) (string, error)
+    GetLastCompaction() *kit.CompactionEntry
+    AppendExtensionData(extType, data string) (string, error)
+    GetExtensionData(extType string) []kit.ExtensionDataEntry
+    AppendModelChange(provider, modelID string) (string, error)
+    GetContextEntryIDs() []string
+    Close() error
+}
+```
+
+**Use Cases:**
+- **PocketBase integration**: Store sessions as PocketBase records
+- **Cloud storage**: Persist sessions to S3, GCS, or Azure Blob
+- **Multi-user apps**: Store sessions per user in a database
+- **Custom retention**: Implement your own session cleanup policies
+
+**Note:** When using a custom SessionManager, the following Options are ignored:
+- `SessionPath` - your manager handles its own storage
+- `Continue` - your manager handles session selection
+- `NoSession` - use an in-memory implementation instead
+
 ---

 ## Model Management
@@ -476,6 +633,56 @@ Always `"provider/model"`: `"anthropic/claude-sonnet-4-5-20250929"`, `"openai/gp
 provider, modelID, err := kit.ParseModelString("anthropic/claude-sonnet-4-5-20250929")
 ```

+### Per-model system prompts
+
+A model can carry its own system prompt, configured via `modelSettings` or `customModels` in `.kit.yml`. When the user hasn't explicitly set a system prompt (via `--system-prompt`, config, or `Options.SystemPrompt`), the per-model prompt serves as the base and is composed with AGENTS.md context and skills.
+
+On `SetModel()`, if the new model has a per-model system prompt and no custom global prompt was set, the per-model prompt automatically replaces the previous one.
+
+### Per-model generation parameters
+
+Models can define default generation parameters (`temperature`, `top_p`, `top_k`, `frequency_penalty`, `presence_penalty`) via `modelSettings` or `customModels` `params` in `.kit.yml`. These defaults apply when the user hasn't explicitly set the parameter. Explicit CLI flags or config values always take priority.
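The precedence rule above can be sketched as follows. This is a minimal illustration, not Kit's implementation: `GenParams`, `Resolve`, and `ptr` are hypothetical names, and the pointer fields are an assumed way to tell an unset parameter apart from an explicit zero.

```go
package main

import "fmt"

// GenParams holds optional generation parameters. A nil pointer means
// "not set", so an explicit 0 is distinguishable from an absent value.
// Hypothetical type for illustration; not part of the Kit SDK.
type GenParams struct {
	Temperature *float64
	TopP        *float64
	TopK        *int
}

// Resolve starts from the per-model defaults and overwrites every field
// the user set explicitly, so explicit values always take priority.
func Resolve(explicit, modelDefaults GenParams) GenParams {
	out := modelDefaults
	if explicit.Temperature != nil {
		out.Temperature = explicit.Temperature
	}
	if explicit.TopP != nil {
		out.TopP = explicit.TopP
	}
	if explicit.TopK != nil {
		out.TopK = explicit.TopK
	}
	return out
}

// ptr returns a pointer to v; a small helper for building literals.
func ptr[T any](v T) *T { return &v }

func main() {
	defaults := GenParams{Temperature: ptr(0.7), TopK: ptr(40)}
	explicit := GenParams{Temperature: ptr(0.0)} // e.g. set by a CLI flag

	got := Resolve(explicit, defaults)
	fmt.Println(*got.Temperature, got.TopP == nil, *got.TopK)
}
```

Using pointers rather than plain floats matters here: an explicit temperature of 0 is a legitimate setting and must not be mistaken for "unset".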
+
+---
+
+## Dynamic MCP Server Management
+
+Add, remove, and inspect MCP servers at runtime without restarting Kit:
+
+```go
+// Add a new MCP server — tools become available immediately
+n, err := host.AddMCPServer(ctx, "github", kit.MCPServerConfig{
+    Command:     []string{"npx", "-y", "@modelcontextprotocol/server-github"},
+    Environment: map[string]string{"GITHUB_TOKEN": os.Getenv("GITHUB_TOKEN")},
+})
+fmt.Printf("Loaded %d tools from github server\n", n)
+
+// Remove an MCP server — its tools are no longer available
+err = host.RemoveMCPServer("github")
+
+// List all currently loaded MCP servers
+servers := host.ListMCPServers()
+for _, s := range servers {
+    fmt.Printf("Server %s: %d tools\n", s.Name, s.ToolCount)
+}
+```
+
+`AddMCPServer` is safe to call while the agent is idle. If a turn is in progress, new tools are visible starting from the next LLM step. Tool names are prefixed with the server name (e.g. `"github__create_issue"`).
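The naming scheme mentioned above can be illustrated with a small sketch. The helpers `PrefixToolName` and `SplitToolName` are hypothetical and not part of the Kit SDK; the `__` separator is inferred from the `"github__create_issue"` example, and the split assumes server names do not themselves contain `__`.

```go
package main

import (
	"fmt"
	"strings"
)

// sep is the separator seen in the example name "github__create_issue".
const sep = "__"

// PrefixToolName builds the namespaced name for a tool on a given server.
func PrefixToolName(server, tool string) string {
	return server + sep + tool
}

// SplitToolName splits a namespaced tool name at the first separator.
// ok is false when the name carries no server prefix; in that case the
// whole name is returned as the tool.
func SplitToolName(name string) (server, tool string, ok bool) {
	server, tool, ok = strings.Cut(name, sep)
	if !ok {
		return "", name, false
	}
	return server, tool, true
}

func main() {
	full := PrefixToolName("github", "create_issue")
	server, tool, ok := SplitToolName(full)
	fmt.Println(full, server, tool, ok)
}
```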
+
+### MCP OAuth Token Storage
+
+For remote MCP servers that use OAuth, you can provide a custom token store:
+
+```go
+host, _ := kit.New(ctx, &kit.Options{
+    MCPTokenStoreFactory: func(serverURL string) (kit.MCPTokenStore, error) {
+        return &MyDatabaseTokenStore{serverURL: serverURL}, nil
+    },
+})
+```
+
+The `MCPTokenStore` interface requires `GetToken`/`SetToken`/`DeleteToken` methods. Return `kit.ErrMCPNoToken` from `GetToken` when no token is stored. When `MCPTokenStoreFactory` is nil (the default), tokens are persisted to `$XDG_CONFIG_HOME/.kit/mcp_tokens.json`.
+
 ---

 ## Context & Compaction
@@ -483,9 +690,12 @@ provider, modelID, err := kit.ParseModelString("anthropic/claude-sonnet-4-5-2025
 ```go
 tokens := host.EstimateContextTokens()  // heuristic token count
 shouldCompact := host.ShouldCompact()    // true if near context limit
+// ShouldCompact() uses API-reported token counts (including cache tokens)
+// when available, falling back to a text-based heuristic before the first turn.

 stats := host.GetContextStats()
-// stats.EstimatedTokens — uses API-reported count when available (more accurate)
+// stats.EstimatedTokens — uses API-reported count when available (more accurate;
+//                          includes system prompts, tool definitions, cache tokens)
 // stats.ContextLimit    — model's context window size
 // stats.UsagePercent    — fraction used (0.0–1.0)
 // stats.MessageCount    — number of messages
@@ -645,13 +855,21 @@ kit.ProviderConfig, kit.ProviderResult, kit.ModelInfo, kit.ModelCost, kit.ModelL
 // LLM types — concrete Kit-owned structs (no external library dependency)
 kit.LLMMessage      // {Role LLMMessageRole, Content string}
 kit.LLMMessageRole  // "user" | "assistant" | "system" | "tool"
-kit.LLMUsage        // {InputTokens, OutputTokens, TotalTokens, ReasoningTokens, ...}
+kit.LLMUsage        // {InputTokens, OutputTokens, TotalTokens, ReasoningTokens,
+                     //  CacheCreationTokens, CacheReadTokens}
 kit.LLMResponse     // {Content, FinishReason, Usage}
 kit.LLMFilePart     // {Filename, Data []byte, MediaType}

 // Compaction types
 kit.CompactionResult, kit.CompactionOptions

+// MCP OAuth types
+kit.MCPTokenStore        // interface for custom OAuth token storage
+kit.MCPToken             // OAuth token struct (access, refresh, expiry)
+kit.MCPTokenStoreFactory // func(serverURL string) (MCPTokenStore, error)
+kit.ErrMCPNoToken        // sentinel error for "no token stored"
+kit.MCPServerStatus      // {Name string, ToolCount int}
+
 // Conversion helpers
 msgs := kit.ConvertToLLMMessages(&msg)   // SDK Message  → []LLMMessage
 msg  := kit.ConvertFromLLMMessage(lMsg)  // LLMMessage   → SDK Message
@@ -7,17 +7,16 @@ description: Monitor tool calls and streaming output with the Kit Go SDK.

 ## Event-based monitoring

-For more granular control, use the event subscription API:
+Subscribe to events for real-time monitoring. Each method returns an unsubscribe function:

 ```go
-// Subscribe returns an unsubscribe function
 unsub := host.OnToolCall(func(event kit.ToolCallEvent) {
-    fmt.Printf("Tool: %s, Args: %s\n", event.Name, event.Args)
+    fmt.Printf("Tool: %s, Args: %s\n", event.ToolName, event.ToolArgs)
 })
 defer unsub()

 unsub2 := host.OnToolResult(func(event kit.ToolResultEvent) {
-    fmt.Printf("Result: %s (error: %v)\n", event.Name, event.IsError)
+    fmt.Printf("Result: %s (error: %v)\n", event.ToolName, event.IsError)
 })
 defer unsub2()

@@ -44,33 +43,62 @@ defer unsub6()

 ## Hook system

-Hooks allow you to intercept and modify behavior. Unlike events, hooks can modify or cancel operations:
+Hooks can **modify or cancel** operations. Unlike events (read-only), hooks are read-write interceptors.
+
+### BeforeToolCall — block tool execution

 ```go
-// Intercept tool calls before execution
-host.OnBeforeToolCall(0, func(ctx context.Context, name string, args string) (string, error) {
-    if name == "bash" {
-        log.Println("Bash command:", args)
+host.OnBeforeToolCall(kit.HookPriorityNormal, func(h kit.BeforeToolCallHook) *kit.BeforeToolCallResult {
+    // h.ToolCallID, h.ToolName, h.ToolArgs
+    if h.ToolName == "bash" && strings.Contains(h.ToolArgs, "rm -rf") {
+        return &kit.BeforeToolCallResult{Block: true, Reason: "dangerous command"}
    }
-    return args, nil // return modified args or error to cancel
+    return nil // allow
 })
+```

-// Process results after tool execution
-host.OnAfterToolResult(0, func(ctx context.Context, name string, result string) (string, error) {
-    return result, nil
-})
+### AfterToolResult — modify tool output

-// Before/after each agent turn
-host.OnBeforeTurn(0, func(ctx context.Context) error {
-    return nil
-})
-
-host.OnAfterTurn(0, func(ctx context.Context) error {
+```go
+host.OnAfterToolResult(kit.HookPriorityNormal, func(h kit.AfterToolResultHook) *kit.AfterToolResultResult {
+    // h.ToolCallID, h.ToolName, h.ToolArgs, h.Result, h.IsError
+    if h.ToolName == "read" {
+        filtered := redactSecrets(h.Result)
+        return &kit.AfterToolResultResult{Result: &filtered}
+    }
    return nil
 })
 ```

-The first argument is a priority (lower = runs first).
+### BeforeTurn — modify prompt, inject messages
+
+```go
+host.OnBeforeTurn(kit.HookPriorityNormal, func(h kit.BeforeTurnHook) *kit.BeforeTurnResult {
+    // h.Prompt
+    newPrompt := h.Prompt + "\nAlways respond in JSON."
+    return &kit.BeforeTurnResult{Prompt: &newPrompt}
+    // Also available: SystemPrompt *string, InjectText *string
+})
+```
+
+### AfterTurn — observation only
+
+```go
+host.OnAfterTurn(kit.HookPriorityNormal, func(h kit.AfterTurnHook) {
+    // h.Response, h.Error
+    log.Printf("Turn completed: %d chars", len(h.Response))
+})
+```
+
+### Hook priorities
+
+```go
+kit.HookPriorityHigh   = 0   // runs first
+kit.HookPriorityNormal = 50  // default
+kit.HookPriorityLow    = 100 // runs last
+```
+
+Hooks with lower priority values run first, and the first non-nil result wins.
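To make the ordering rule concrete, here is a minimal, self-contained sketch of a priority-ordered hook list. It is not Kit's implementation: `Registry`, `BlockResult`, and the two-string hook signature are hypothetical stand-ins, and the tie-breaking by registration order is an assumption of this sketch only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// BlockResult is a stand-in for a hook's "block this call" decision.
type BlockResult struct {
	Block  bool
	Reason string
}

type hook struct {
	priority int
	fn       func(toolName, toolArgs string) *BlockResult
}

// Registry is a hypothetical hook list, kept sorted by ascending priority.
type Registry struct{ hooks []hook }

// Register adds a hook. The stable sort keeps registration order among
// hooks that share the same priority.
func (r *Registry) Register(priority int, fn func(string, string) *BlockResult) {
	r.hooks = append(r.hooks, hook{priority, fn})
	sort.SliceStable(r.hooks, func(i, j int) bool {
		return r.hooks[i].priority < r.hooks[j].priority
	})
}

// Run calls hooks in priority order and returns the first non-nil result.
// A nil return from a hook means "no opinion", so later hooks still run.
func (r *Registry) Run(toolName, toolArgs string) *BlockResult {
	for _, h := range r.hooks {
		if res := h.fn(toolName, toolArgs); res != nil {
			return res
		}
	}
	return nil
}

func main() {
	var r Registry
	r.Register(100, func(name, args string) *BlockResult { // low priority
		return &BlockResult{Block: true, Reason: "fallback policy"}
	})
	r.Register(0, func(name, args string) *BlockResult { // high priority
		if name == "bash" && strings.Contains(args, "rm -rf") {
			return &BlockResult{Block: true, Reason: "dangerous command"}
		}
		return nil
	})

	fmt.Println(r.Run("bash", "rm -rf /").Reason)
	fmt.Println(r.Run("bash", "ls").Reason)
}
```

The high-priority hook is registered second but still runs first; the low-priority hook only decides when the earlier hook returns nil.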

 ## Subagent event monitoring

@@ -29,8 +29,12 @@ host, err := kit.New(ctx, &kit.Options{
    NoSession:    true,

    // Tools
-    Tools:        []kit.Tool{...},     // Replace default tool set entirely
-    ExtraTools:   []kit.Tool{...},     // Add tools alongside defaults
+    Tools:            []kit.Tool{...},     // Replace default tool set entirely
+    ExtraTools:       []kit.Tool{...},     // Add tools alongside defaults
+    DisableCoreTools: true,                // Use no core tools (0 tools, for chat-only)
+
+    // Configuration
+    SkipConfig:   true,                   // Skip .kit.yml files (viper defaults + env vars still apply)

    // Compaction
    AutoCompact:  true,
@@ -58,7 +62,34 @@ host, err := kit.New(ctx, &kit.Options{
 | `NoSession` | `bool` | `false` | Ephemeral mode (no persistence) |
 | `Tools` | `[]Tool` | — | Replace the entire default tool set |
 | `ExtraTools` | `[]Tool` | — | Additional tools alongside core/MCP/extension tools |
+| `DisableCoreTools` | `bool` | `false` | Disable all core tools (0 tools, for chat-only use) |
+| `SkipConfig` | `bool` | `false` | Skip `.kit.yml` file loading (viper defaults and env vars still apply) |
 | `AutoCompact` | `bool` | `false` | Auto-compact when near context limit |
 | `CompactionOptions` | `*CompactionOptions` | — | Configuration for auto-compaction |
 | `Skills` | `[]string` | — | Explicit skill files/dirs to load |
 | `SkillsDir` | `string` | — | Override default skills directory |
+
+## Tool configuration
+
+**`Tools`** replaces ALL default tools (core + MCP + extension). **`ExtraTools`** adds tools alongside the defaults. Use `Tools` to restrict capabilities; use `ExtraTools` to extend them.
+
+Create custom tools with `kit.NewTool` — no external dependencies needed:
+
+```go
+type LookupInput struct {
+    ID string `json:"id" description:"Record ID to look up"`
+}
+
+lookupTool := kit.NewTool("lookup", "Look up a record by ID",
+    func(ctx context.Context, input LookupInput) (kit.ToolOutput, error) {
+        record := db.Find(input.ID)
+        return kit.TextResult(record.String()), nil
+    },
+)
+
+host, _ := kit.New(ctx, &kit.Options{
+    ExtraTools: []kit.Tool{lookupTool},
+})
+```
+
+See [Overview](/sdk/overview#custom-tools) for full custom tool documentation.
@@ -68,6 +68,44 @@ The SDK provides several prompt variants:
 | `Steer(ctx, instruction)` | System-level steering without user message |
 | `FollowUp(ctx, text)` | Continue without new user input |

+## Custom tools
+
+Create custom tools with `kit.NewTool`. The JSON schema is auto-generated from the input struct — no external dependencies required:
+
+```go
+type WeatherInput struct {
+    City string `json:"city" description:"City name"`
+}
+
+weatherTool := kit.NewTool("get_weather", "Get current weather for a city",
+    func(ctx context.Context, input WeatherInput) (kit.ToolOutput, error) {
+        return kit.TextResult("72°F, sunny in " + input.City), nil
+    },
+)
+
+host, _ := kit.New(ctx, &kit.Options{
+    ExtraTools: []kit.Tool{weatherTool},
+})
+```
+
+Struct tags control the schema:
+
+- `json:"name"` — parameter name
+- `description:"..."` — description shown to the LLM
+- `enum:"a,b,c"` — restrict valid values
+- `omitempty` — marks the parameter as optional
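As a rough picture of how tags like these can drive a schema, the sketch below reads them with the standard `reflect` package. It is not Kit's generator: `Param`, `ParamsFromStruct`, and `SearchInput` are hypothetical, and treating `omitempty` as an option inside the `json` tag is an assumption made for this example.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Param describes one tool parameter derived from a struct field.
type Param struct {
	Name        string
	Description string
	Enum        []string
	Required    bool
}

// SearchInput is an example input struct using the tags listed above.
type SearchInput struct {
	Query string `json:"query" description:"Search text"`
	Sort  string `json:"sort,omitempty" description:"Sort order" enum:"asc,desc"`
}

// ParamsFromStruct reads json, description, and enum tags from each field.
// v must be a struct value; fields without a json name are skipped.
func ParamsFromStruct(v any) []Param {
	t := reflect.TypeOf(v)
	var out []Param
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Split `json:"name,opts"` into the name and its options.
		name, opts, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "" || name == "-" {
			continue
		}
		p := Param{
			Name:        name,
			Description: f.Tag.Get("description"),
			Required:    !strings.Contains(opts, "omitempty"),
		}
		if e := f.Tag.Get("enum"); e != "" {
			p.Enum = strings.Split(e, ",")
		}
		out = append(out, p)
	}
	return out
}

func main() {
	for _, p := range ParamsFromStruct(SearchInput{}) {
		fmt.Printf("%+v\n", p)
	}
}
```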
+
+Return values:
+
+| Helper | Description |
+|--------|-------------|
+| `kit.TextResult(s)` | Successful text result |
+| `kit.ErrorResult(s)` | Error result (LLM sees it as a tool error) |
+
+For advanced use, return a `kit.ToolOutput` struct directly with `Data`, `MediaType`, and `Metadata` fields.
+
+Use `kit.NewParallelTool` for tools that are safe to run concurrently. Use `kit.ToolCallIDFromContext(ctx)` to retrieve the LLM-assigned call ID for logging or tracing.
+
 ## Event system

 Subscribe to events for monitoring:
@@ -901,6 +901,126 @@ a:hover { text-decoration: underline; }
  color: var(--text-muted);
 }

+/* ============================================================
+   Compaction Card
+   ============================================================ */
+.compaction-card {
+  margin: 16px 0;
+  border: 1px solid var(--border);
+  border-radius: var(--radius);
+  background: var(--surface);
+  overflow: hidden;
+}
+
+.compaction-header {
+  display: flex;
+  align-items: center;
+  gap: 10px;
+  padding: 12px 16px;
+  cursor: pointer;
+  user-select: none;
+  transition: background var(--transition);
+  background: var(--surface-raised);
+}
+
+.compaction-header:hover {
+  background: var(--surface-overlay);
+}
+
+.compaction-icon {
+  width: 18px;
+  height: 18px;
+  color: var(--yellow);
+  flex-shrink: 0;
+}
+
+.compaction-title {
+  font-size: 13px;
+  font-weight: 600;
+  color: var(--text-secondary);
+  flex: 1;
+}
+
+.compaction-badge {
+  font-size: 11px;
+  font-weight: 500;
+  color: var(--text-muted);
+  background: var(--surface);
+  padding: 2px 8px;
+  border-radius: 10px;
+  border: 1px solid var(--border);
+}
+
+.compaction-chevron {
+  width: 16px;
+  height: 16px;
+  color: var(--text-faint);
+  transition: transform var(--transition);
+  flex-shrink: 0;
+}
+
+.compaction-card.expanded .compaction-chevron {
+  transform: rotate(180deg);
+}
+
+.compaction-content {
+  max-height: 0;
+  overflow: hidden;
+  transition: max-height var(--transition);
+  border-top: 1px solid transparent;
+}
+
+.compaction-card.expanded .compaction-content {
+  max-height: 2000px;
+  overflow-y: auto;
+  border-top-color: var(--border);
+}
+
+.compaction-summary {
+  padding: 16px;
+  font-size: 13.5px;
+  line-height: 1.7;
+  color: var(--text-secondary);
+}
+
+.compaction-summary .md-content h1,
+.compaction-summary .md-content h2,
+.compaction-summary .md-content h3 {
+  color: var(--text);
+  margin: 16px 0 8px;
+}
+
+.compaction-summary .md-content h1 { font-size: 1.3em; }
+.compaction-summary .md-content h2 { font-size: 1.15em; }
+.compaction-summary .md-content h3 { font-size: 1.05em; }
+
+.compaction-summary .md-content ul,
+.compaction-summary .md-content ol {
+  padding-left: 20px;
+}
+
+.compaction-stats {
+  display: flex;
+  gap: 16px;
+  padding: 12px 16px;
+  background: var(--surface-raised);
+  border-top: 1px solid var(--border-subtle);
+  font-size: 11.5px;
+  color: var(--text-muted);
+  flex-wrap: wrap;
+}
+
+.compaction-stat {
+  display: flex;
+  align-items: center;
+  gap: 4px;
+}
+
+.compaction-stat strong {
+  color: var(--text-secondary);
+  font-weight: 600;
+}
+
 /* ============================================================
   System Prompt Display
   ============================================================ */
@@ -1460,6 +1580,7 @@ a:hover { text-decoration: underline; }
    let userMsgCount = 0;
    let assistantMsgCount = 0;
    let toolCallCount = 0;
+    let compactionCount = 0;

    // Render each entry
    for (const entry of path) {
@@ -1491,6 +1612,9 @@ a:hover { text-decoration: underline; }
        renderSystemNotice('Label', entry.label || '', 'label');
      } else if (entry.type === 'session_info') {
        // Already handled above for header
+      } else if (entry.type === 'compaction') {
+        compactionCount++;
+        renderCompaction(entry);
      }
    }

@@ -1501,6 +1625,7 @@ a:hover { text-decoration: underline; }
          <div class="stat-item"><strong>${userMsgCount}</strong> user message${userMsgCount !== 1 ? 's' : ''}</div>
          <div class="stat-item"><strong>${assistantMsgCount}</strong> assistant message${assistantMsgCount !== 1 ? 's' : ''}</div>
          ${toolCallCount > 0 ? `<div class="stat-item"><strong>${toolCallCount}</strong> tool call${toolCallCount !== 1 ? 's' : ''}</div>` : ''}
+          ${compactionCount > 0 ? `<div class="stat-item"><strong>${compactionCount}</strong> compaction${compactionCount !== 1 ? 's' : ''}</div>` : ''}
          ${header && header.cwd ? `<div class="stat-item">📁 ${escapeHtml(header.cwd)}</div>` : ''}
        </div>`;
      $conversation.insertAdjacentHTML('beforeend', statsHtml);
@@ -2030,6 +2155,60 @@ a:hover { text-decoration: underline; }
    $conversation.appendChild(el);
  }

+  // ============================================================
+  //  Compaction Display
+  // ============================================================
+  function renderCompaction(entry) {
+    const el = document.createElement('div');
+    el.className = 'compaction-card fade-in';
+    
+    const cardId = 'compaction-' + Math.random().toString(36).slice(2, 11);
+    
+    // Build stats
+    const stats = [];
+    if (entry.messages_removed > 0) {
+      stats.push(`<div class="compaction-stat"><strong>${entry.messages_removed}</strong> messages compacted</div>`);
+    }
+    if (entry.tokens_before > 0 && entry.tokens_after > 0) {
+      const saved = entry.tokens_before - entry.tokens_after;
+      stats.push(`<div class="compaction-stat"><strong>${saved}</strong> tokens saved</div>`);
+    }
+    
+    // Format timestamp
+    const timeStr = formatTime(entry.timestamp);
+    
+    el.innerHTML = `
+      <div class="compaction-header" onclick="toggleCompaction('${cardId}')">
+        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" fill="currentColor" class="compaction-icon">
+          <path d="M2.5 3.5a.5.5 0 0 1 .5-.5h10a.5.5 0 0 1 .5.5v1a.5.5 0 0 1-.5.5h-10a.5.5 0 0 1-.5-.5v-1Zm0 4a.5.5 0 0 1 .5-.5h10a.5.5 0 0 1 .5.5v1a.5.5 0 0 1-.5.5h-10a.5.5 0 0 1-.5-.5v-1Zm0 4a.5.5 0 0 1 .5-.5h10a.5.5 0 0 1 .5.5v1a.5.5 0 0 1-.5.5h-10a.5.5 0 0 1-.5-.5v-1Z"/>
+        </svg>
+        <span class="compaction-title">Context Compacted</span>
+        ${timeStr ? `<span class="compaction-badge">${timeStr}</span>` : ''}
+        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" fill="currentColor" class="compaction-chevron" id="${cardId}-chevron">
+          <path d="M4.427 9.427a.25.25 0 0 0 0 .353l3 3a.25.25 0 0 0 .353 0l3-3a.25.25 0 0 0-.353-.353L8 11.646V4.75a.75.75 0 0 0-1.5 0v6.896L4.78 9.427a.25.25 0 0 0-.353 0Z"/>
+        </svg>
+      </div>
+      <div class="compaction-content" id="${cardId}">
+        ${entry.summary ? `<div class="compaction-summary"><div class="md-content">${renderMarkdown(entry.summary)}</div></div>` : ''}
+        ${stats.length > 0 ? `<div class="compaction-stats">${stats.join('')}</div>` : ''}
+      </div>`;
+    
+    $conversation.appendChild(el);
+  }
+
+  // Toggle compaction card expansion
+  window.toggleCompaction = function(cardId) {
+    const card = document.getElementById(cardId).closest('.compaction-card');
+    const chevron = document.getElementById(cardId + '-chevron');
+    const isExpanded = card.classList.contains('expanded');
+    
+    if (isExpanded) {
+      card.classList.remove('expanded');
+    } else {
+      card.classList.add('expanded');
+    }
+  };
+
  // ============================================================
  //  System Prompt Display (collapsible)
  // ============================================================