fix(extensions): return nil error for blocked/disabled tools so LLM sees the reason

Tool blocking via OnToolCall and SetActiveTools returned both a ToolResponse (IsError=true) and a Go error. Fantasy treats a non-nil Go error from tool.Run() as a critical failure, aborting the agent loop without delivering the tool result to the LLM. The model never saw the block reason and would retry or hallucinate.

- Return nil error for blocked tools (OnToolCall Block=true)
- Return nil error for disabled tools (SetActiveTools)
- Return nil error for extension tool execution failures
- Update tests to assert nil error (IsError response conveys the error)

Fixes #20
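
The behavior change described in this commit message can be illustrated with a minimal sketch. Everything below is hypothetical and written only for illustration: `ToolResponse`, `runBlockedOld`, `runBlockedNew`, and `step` are stand-ins, not the actual Kit or Fantasy types, and the loop is a simplified model of "a non-nil Go error aborts before the result is delivered".

```go
package main

import (
	"errors"
	"fmt"
)

// ToolResponse is a hypothetical stand-in for the framework's tool result type.
type ToolResponse struct {
	Content string
	IsError bool
}

// runBlockedOld models the old behavior: an error response AND a Go error.
func runBlockedOld(reason string) (ToolResponse, error) {
	return ToolResponse{Content: reason, IsError: true}, errors.New(reason)
}

// runBlockedNew models the fix: the error response alone carries the reason.
func runBlockedNew(reason string) (ToolResponse, error) {
	return ToolResponse{Content: reason, IsError: true}, nil
}

// step simulates one agent-loop step (an assumption about the loop's shape):
// a non-nil Go error aborts the loop before the result reaches the model.
func step(run func(string) (ToolResponse, error), reason string) (delivered string, aborted bool) {
	resp, err := run(reason)
	if err != nil {
		return "", true
	}
	return resp.Content, false
}

func main() {
	for name, run := range map[string]func(string) (ToolResponse, error){
		"old": runBlockedOld,
		"new": runBlockedNew,
	} {
		delivered, aborted := step(run, "tool blocked by extension")
		fmt.Printf("%s: aborted=%v delivered=%q\n", name, aborted, delivered)
	}
}
```

In this model the old variant aborts with nothing delivered, while the new variant delivers the block reason so the model can react to it instead of retrying blindly.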
feat(app): update token counts and context fill after every step
2026-06-14 03:30:26 +00:00 · 2026-04-23 13:13:28 +03:00 · 2026-04-23 12:56:00 +03:00 · 2026-04-23 12:03:44 +03:00 · 2026-04-22 21:05:04 +03:00 · 2026-04-22 20:25:06 +03:00
105 changed files with 9600 additions and 1137 deletions
@@ -13,6 +13,8 @@
 // - No channels in maps (Yaegi panics on range over map[string]chan)
 // - All ctx.* calls guarded with nil checks
 // - Simple data structures only
+// - The extension runner serializes handler calls per-extension, so
+//   concurrent subagent events cannot race on this shared state.
 package main

 import (
@@ -43,7 +45,8 @@ const (
 )

 // ---------------------------------------------------------------------------
-// Package-level state - all simple types
+// Package-level state — safe because the runner serializes all handler
+// invocations for the same extension (per-extension reentrant mutex).
 // ---------------------------------------------------------------------------

 var (
@@ -282,8 +285,8 @@ func Init(api ext.API) {

 		submonPushWidget()

-		// Remove the entry immediately (no goroutine to avoid races)
-		newEntries := submonEntries[:0]
+		// Remove the entry — build a new slice to avoid aliasing bugs
+		newEntries := make([]*submonEntry, 0, len(submonEntries))
 		for _, en := range submonEntries {
 			if en.callID != e.ToolCallID {
 				newEntries = append(newEntries, en)
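
Review note on the hunk above: the switch from `submonEntries[:0]` to `make(...)` matters because a filtered slice built with `[:0]` shares the original backing array. The sketch below shows the aliasing effect on plain ints; `filterInPlace` and `filterCopy` are hypothetical helpers written for this note, not code from the extension.

```go
package main

import "fmt"

// filterInPlace reuses the input's backing array (the old `s[:0]` idiom).
func filterInPlace(s []int, drop int) []int {
	out := s[:0]
	for _, v := range s {
		if v != drop {
			out = append(out, v) // overwrites elements of s as it goes
		}
	}
	return out
}

// filterCopy allocates a fresh backing array (the approach taken in the diff).
func filterCopy(s []int, drop int) []int {
	out := make([]int, 0, len(s))
	for _, v := range s {
		if v != drop {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	a := []int{1, 2, 3}
	viewA := a // a second reference to the same backing array
	_ = filterInPlace(a, 1)
	fmt.Println(viewA) // prints [2 3 3]: the other reference sees mutated data

	b := []int{1, 2, 3}
	viewB := b
	_ = filterCopy(b, 1)
	fmt.Println(viewB) // prints [1 2 3]: the original is untouched
}
```

The copy costs one allocation per removal, which is a reasonable trade when other code may still hold the old slice.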
@@ -1,8 +0,0 @@
-{
-  "$schema": "https://opencode.ai/config.json",
-  "permission": {
-    "external_directory": {
-      "~/go/**": "deny"
-    }
-  }
-}
@@ -1,80 +0,0 @@
-# Autoscroll Fix - Final Summary
-
-## Root Cause
-
-The autoscroll was failing for streaming assistant messages due to a bug in how `GotoBottom()` calculated item heights.
-
-### The Problem
-
-1. **Reasoning blocks** (`StreamingMessageItem` with `role="reasoning"`) are **never cached** because they have live duration counters that update every render
-2. The `Height()` method returns `0` when `cachedRender == ""`
-3. `GotoBottom()` was calling:
-   ```go
-   itemHeight := item.Height()  // Returns 0 for reasoning
-   if itemHeight == 0 {
-       item.Render(s.width)  // Renders but doesn't cache (reasoning)
-       itemHeight = item.Height()  // Still returns 0!
-   }
-   ```
-4. This caused incorrect scroll position calculations, especially during reasoning → assistant transitions
-
-## The Solution
-
-Changed `GotoBottom()` and `AtBottom()` to calculate height **directly from the rendered string** instead of relying on the cached height:
-
-```go
-// OLD: item.Height() which checks cached render
-itemHeight := item.Height()
-if itemHeight == 0 {
-    item.Render(s.width)
-    itemHeight = item.Height()  // Still might be 0!
-}
-
-// NEW: Calculate from rendered string directly
-rendered := item.Render(s.width)
-itemHeight := strings.Count(rendered, "\n") + 1
-```
-
-This works for **all** items regardless of whether they cache their render or not.
-
-## Files Changed
-
-### `internal/ui/scrolllist.go`
- **`GotoBottom()`**: Calculate height from rendered string (2 loops)
- **`AtBottom()`**: Calculate height from rendered string (1 loop)
-
-### `internal/ui/model.go`
- **`appendStreamingChunk()`**: For existing messages, call `GotoBottom()` directly (iteratr pattern)
- **`refreshContent()`**: Simplified to only call `SetItems()` (removed redundant `GotoBottom()`)
- **Bash streaming handler**: Removed redundant `GotoBottom()` after `refreshContent()`
-
-## Testing Results
-
-✅ **Test prompt**: "explore this repo"
-
-**Before fix**:
- Autoscroll stopped after reasoning block completed
- Viewport stuck showing end of reasoning ("Thought for 203ms")
- Assistant response streamed off-screen below
-
-**After fix**:
- Autoscroll works throughout reasoning block
- Autoscroll continues during reasoning → assistant transition  
- Viewport stays at bottom showing latest assistant content
- Final position shows end of response (build commands section)
-
-## Behavior Verified
-
-1. ✅ Streaming text auto-scrolls to bottom
-2. ✅ Works across reasoning → assistant transition
-3. ✅ Manual scroll up (PgUp) disables autoscroll
-4. ✅ Scroll to bottom (Alt+End) re-enables autoscroll
-5. ✅ Accurate positioning with no offset errors
-
-## Performance Note
-
-The fix calls `Render()` on all items during `GotoBottom()` calculations. This is acceptable because:
- `Render()` is already optimized with caching for non-reasoning items
- `GotoBottom()` is only called during content updates (not every frame)
- Reasoning blocks need to render anyway for live duration updates
- This matches iteratr's approach of ensuring items are rendered before height calculations
@@ -18,7 +18,8 @@ A powerful, extensible AI coding agent CLI with multi-provider support, built-in
 ## Features

 - **Multi-Provider LLM Support**: Anthropic, OpenAI, Google Gemini, Ollama, Azure OpenAI, AWS Bedrock, OpenRouter, and more
- **Built-in Core Tools**: bash, read, write, edit, grep, find, ls, subagent - no MCP overhead
+- **Built-in Core Tools**: bash (with interactive sudo password prompt), read, write, edit, grep, find, ls, subagent - no MCP overhead
+- **Smart @ Attachments**: Binary files auto-detected via MIME type, MCP resources via `@mcp:server:uri`
 - **MCP Integration**: Connect external MCP servers for expanded capabilities
 - **Extension System**: Write custom tools, commands, widgets, and UI modifications in Go
 - **Theming**: 22 built-in color themes (KITT, Catppuccin, Dracula, Nord, etc.) with runtime switching, persistence, and custom theme files
@@ -28,7 +29,7 @@ A powerful, extensible AI coding agent CLI with multi-provider support, built-in
 - **Session Management**: Tree-based conversation history with branching support
 - **Non-Interactive Mode**: Script-friendly positional args with JSON output
 - **ACP Server**: Run Kit as an [Agent Client Protocol](https://agentclientprotocol.com) agent over stdio
- **Go SDK**: Embed Kit in your own applications
+- **Go SDK**: Embed Kit in your own applications with full agent lifecycle events (30+ event types) and behavior-modifying hooks

 ## Installation

@@ -125,8 +126,13 @@ model: anthropic/claude-sonnet-latest
 max-tokens: 4096
 temperature: 0.7
 stream: true
+thinking-level: off       # off, none, minimal, low, medium, high
 ```

+All of the above keys can also be set programmatically via the SDK
+(`kit.Options.MaxTokens`, `Options.Temperature`, `Options.ThinkingLevel`, etc.)
+without touching config files — see [SDK options](#with-options).
+
 ### Environment Variables

 ```bash
@@ -151,6 +157,11 @@ mcpServers:
  search:
    type: remote
    url: "https://mcp.example.com/search"
+
+  pubmed:
+    type: remote
+    url: "https://pubmed.mcp.example.com"
+    noOAuth: true  # skip OAuth for public servers that don't require auth
 ```

 ## CLI Reference
@@ -186,12 +197,14 @@ mcpServers:
 --no-prompt-templates    Disable prompt template loading

 # Generation parameters
--max-tokens             Maximum tokens in response (default: 4096)
+--max-tokens             Maximum tokens in response (default: 8192, auto-raised up to 32768 for models with larger known output limits)
 --temperature            Randomness 0.0-1.0 (default: 0.7)
 --top-p                  Nucleus sampling 0.0-1.0 (default: 0.95)
 --top-k                  Limit top K tokens (default: 40)
 --stop-sequences         Custom stop sequences (comma-separated)
--thinking-level         Extended thinking level: off, minimal, low, medium, high (default: off)
+--frequency-penalty      Penalize frequent tokens 0.0-2.0 (default: 0.0)
+--presence-penalty       Penalize present tokens 0.0-2.0 (default: 0.0)
+--thinking-level         Extended thinking level: off, none, minimal, low, medium, high (default: off)

 # System
 --config                 Config file path (default: ~/.kit.yml)
@@ -203,9 +216,10 @@ mcpServers:

 ```bash
 # Authentication (for OAuth-enabled providers)
-kit auth login [provider]    # Start OAuth flow (e.g., anthropic)
-kit auth logout [provider]   # Remove credentials for provider
-kit auth status              # Check authentication status
+kit auth login [provider]          # Start OAuth flow (e.g., anthropic)
+kit auth login [provider] --set-default  # Set provider's default model as system default
+kit auth logout [provider]         # Remove credentials for provider
+kit auth status                    # Check authentication status

 # Model database
 kit models [provider]        # List available models (optionally filter by provider)
@@ -287,7 +301,7 @@ kit -e examples/extensions/minimal.go

 ### Extension Capabilities

-**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnToolCall, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd
+**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnToolCall, OnToolCallInputStart, OnToolCallInputDelta, OnToolCallInputEnd, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd

 **Custom Components**:
 - **Tools**: Add new tools the LLM can invoke
@@ -321,6 +335,7 @@ See the `examples/extensions/` directory:
 - [`auto-commit.go`](examples/extensions/auto-commit.go) - Auto-commit on shutdown
 - [`bookmark.go`](examples/extensions/bookmark.go) - Bookmark conversations
 - [`branded-output.go`](examples/extensions/branded-output.go) - Branded output rendering
+- [`bridge-demo.go`](examples/extensions/bridge_demo.go) - Bridged SDK API demo (tree navigation, skills, templates, model resolution)
 - [`compact-notify.go`](examples/extensions/compact-notify.go) - Notification on compaction
 - [`confirm-destructive.go`](examples/extensions/confirm-destructive.go) - Confirm destructive operations
 - [`context-inject.go`](examples/extensions/context-inject.go) - Inject context into conversations
@@ -428,10 +443,13 @@ Focus on $1 specifically.

 **Argument placeholders:**
 - `$1`, `$2`, etc. — Individual arguments
- `$@` or `$ARGUMENTS` — All arguments
+- `$@` or `$ARGUMENTS` — All arguments (zero or more)
+- `$+` — All arguments (one or more required; error if none given)
 - `${@:2}` — Arguments from position 2 onwards
 - `${@:1:3}` — 3 arguments starting at position 1

+Placeholders inside fenced code blocks (```) and inline code spans are ignored.
+
 Disable templates with `--no-prompt-templates` or load a specific template with `--prompt-template <name>`.

 ## Session Management
@@ -480,6 +498,15 @@ During an interactive session, use these slash commands:
 | `/fork` | Fork to new session from an earlier message |
 | `/new` | Start a fresh session |

+### Keyboard Shortcuts
+
+| Shortcut | Description |
+|----------|-------------|
+| `Ctrl+X e` | Open `$VISUAL`/`$EDITOR` to compose or edit your prompt |
+| `Ctrl+X s` | Steer — inject a system-level instruction mid-turn |
+| `ESC ESC` | Cancel the current operation (tool call or streaming) |
+| `↑` / `↓` | Navigate prompt history |
+
 ## Go SDK

 Embed Kit in your Go applications:
@@ -525,6 +552,20 @@ host, err := kit.New(ctx, &kit.Options{
    Streaming:    true,
    Quiet:        true,

+    // Generation parameters (override env/config/per-model defaults)
+    MaxTokens:        16384,             // 0 = auto-resolve (env → config → per-model → 8192 floor)
+    ThinkingLevel:    "medium",          // "off", "none", "minimal", "low", "medium", "high"
+    Temperature:      ptr(float32(0.2)), // pointer so 0.0 != unset; nil = provider default
+    TopP:             nil,                // nil = leave provider/per-model default
+    TopK:             nil,
+    FrequencyPenalty: nil,
+    PresencePenalty:  nil,
+
+    // Provider configuration (override env/config without reaching into viper)
+    ProviderAPIKey: "sk-...",                      // "" = use config / provider env var
+    ProviderURL:    "https://proxy.internal/v1",   // "" = provider default
+    TLSSkipVerify:  false,                         // only takes effect when true
+
    // Session options
    SessionPath:  "./session.jsonl",  // Open specific session
    Continue:     true,                // Resume most recent session
@@ -545,6 +586,46 @@ host, err := kit.New(ctx, &kit.Options{
 })
 ```

+**Generation & provider fields** (added in v0.55+) let SDK consumers configure
+Kit entirely in-code without `viper.Set()` workarounds or shipping a `.kit.yml`.
+Precedence is `Options` > `KIT_*` env vars > `.kit.yml` > per-model defaults
+(`modelSettings` / `customModels`) > provider-level defaults. Sampling params
+are pointer types so explicit `0.0` is distinguishable from "leave alone"; a
+non-zero `MaxTokens` suppresses automatic right-sizing the same way `--max-tokens`
+does on the CLI.
+
+### MCP OAuth (remote MCP servers)
+
+When a remote MCP server returns 401, Kit runs the full OAuth flow (dynamic
+client registration → PKCE → token exchange → persistence) but delegates the
+user-facing step — showing the authorization URL and receiving the callback —
+to an `MCPAuthHandler` that you pass explicitly via `Options.MCPAuthHandler`.
+If nil, OAuth is disabled and the authorization-required error surfaces to the
+caller; the SDK never auto-opens a browser or binds a localhost port.
+
+```go
+// CLI/TUI apps: opens the system browser + prints status to stderr.
+authHandler, _ := kit.NewCLIMCPAuthHandler()
+defer authHandler.Close()
+
+host, _ := kit.New(ctx, &kit.Options{
+    MCPAuthHandler: authHandler,
+})
+
+// Custom UX: reuse the SDK's port + callback server, supply your own
+// presentation via OnAuthURL (TUI modal, QR code, web redirect, etc.).
+//   h, _ := kit.NewDefaultMCPAuthHandler()
+//   h.OnAuthURL = func(server, authURL string) { myUI.Show(server, authURL) }
+//
+// Full control (web apps, daemons): implement kit.MCPAuthHandler yourself —
+// no localhost binding, no side effects.
+```
+
+Tokens are persisted to `$XDG_CONFIG_HOME/.kit/mcp_tokens.json` by default; swap
+in a custom `MCPTokenStoreFactory` for encrypted, DB-backed, or in-memory
+storage. See the [SDK options docs](/sdk/options#mcp-oauth-authorization) for
+the full matrix.
+
 ### Custom Tools

 Create custom tools with automatic schema generation — no external dependencies needed:
@@ -565,7 +646,28 @@ host, _ := kit.New(ctx, &kit.Options{
 })
 ```

-Use `kit.NewParallelTool` for tools safe to run concurrently. See the [SDK docs](/sdk/overview) for full details on struct tags, `ToolOutput` fields, and `ToolCallIDFromContext`.
+Use `kit.NewParallelTool` for tools safe to run concurrently. Binary data (images, audio, etc.) in `ToolOutput.Data` is automatically forwarded to the LLM when `MediaType` is set. See the [SDK docs](/sdk/overview) for full details on struct tags, `ToolOutput` fields, and `ToolCallIDFromContext`.
+
+#### Return Helpers
+
+| Helper | Description |
+| --- | --- |
+| `kit.TextResult(content)` | Successful text result |
+| `kit.ErrorResult(content)` | Error result (LLM sees it as a tool error) |
+| `kit.ImageResult(content, data, mediaType)` | Image result with binary data (e.g. `"image/png"`) |
+| `kit.MediaResult(content, data, mediaType)` | Non-image media result (e.g. `"audio/mpeg"`) |
+
+#### ToolOutput Fields
+
+```go
+kit.ToolOutput{
+    Content:   "result text",     // text returned to the LLM
+    IsError:   false,             // true = LLM sees this as an error
+    Data:      pngBytes,          // optional binary data (images, audio)
+    MediaType: "image/png",       // MIME type for binary Data
+    Metadata:  map[string]any{},  // opaque metadata for hooks/UI (not sent to LLM)
+}
+```

 ### With Callbacks

@@ -582,7 +684,7 @@ unsub2 := host.OnToolResult(func(e kit.ToolResultEvent) {
 })
 defer unsub2()

-unsub3 := host.OnStreaming(func(e kit.MessageUpdateEvent) {
+unsub3 := host.OnMessageUpdate(func(e kit.MessageUpdateEvent) {
    print(e.Chunk)
 })
 defer unsub3()
@@ -11,6 +11,7 @@ import (

 	"charm.land/huh/v2"
 	"github.com/mark3labs/kit/internal/auth"
+	"github.com/mark3labs/kit/internal/ui"
 	kit "github.com/mark3labs/kit/pkg/kit"
 	"github.com/spf13/cobra"
 )
@@ -54,9 +55,13 @@ Available providers:
  - anthropic: Anthropic Claude API (OAuth)
  - openai:    OpenAI ChatGPT Plus/Pro (Codex OAuth)

-Example:
+Flags:
+  --set-default   Set this provider's default model as the system default
+
+Examples:
  kit auth login anthropic
-  kit auth login openai`,
+  kit auth login openai
+  kit auth login openai --set-default`,
 	Args: cobra.ExactArgs(1),
 	RunE: runAuthLogin,
 }
@@ -99,10 +104,43 @@ Example:
 	RunE: runAuthStatus,
 }

+var (
+	loginSetDefault bool
+)
+
+// defaultModels maps providers to their recommended default models.
+// These are used when --set-default flag is passed to auth login.
+var defaultModels = map[string]string{
+	"anthropic": "anthropic/claude-sonnet-4-5-20250929",
+	"openai":    "openai/gpt-5.4",
+}
+
+// setDefaultModelIfRequested sets the default model for the given provider
+// if the --set-default flag was provided.
+func setDefaultModelIfRequested(provider string) error {
+	if !loginSetDefault {
+		return nil
+	}
+
+	model, ok := defaultModels[provider]
+	if !ok {
+		return fmt.Errorf("no default model configured for provider: %s", provider)
+	}
+
+	if err := ui.SaveModelPreference(model); err != nil {
+		return fmt.Errorf("failed to save model preference: %w", err)
+	}
+
+	fmt.Printf("\n✓ Set default model to: %s\n", model)
+	return nil
+}
+
 func init() {
 	authCmd.AddCommand(authLoginCmd)
 	authCmd.AddCommand(authLogoutCmd)
 	authCmd.AddCommand(authStatusCmd)
+
+	authLoginCmd.Flags().BoolVar(&loginSetDefault, "set-default", false, "Set this provider's default model as the system default after login")
 }

 func runAuthLogin(cmd *cobra.Command, args []string) error {
@@ -288,6 +326,17 @@ func loginAnthropic() error {
 	fmt.Println("\n🎉 Your OAuth credentials will now be used for Anthropic API calls.")
 	fmt.Println("💡 You can check your authentication status with: kit auth status")

+	// Set default model if requested
+	if err := setDefaultModelIfRequested("anthropic"); err != nil {
+		return err
+	}
+
+	// Remind users how to set this as default if they didn't use --set-default
+	if !loginSetDefault {
+		fmt.Println("\n💡 To set Anthropic as your default model, run:")
+		fmt.Println("   kit auth login anthropic --set-default")
+	}
+
 	return nil
 }

@@ -454,6 +503,17 @@ func loginOpenAI() error {
 	fmt.Println("\n🎉 Your OAuth credentials will now be used for OpenAI API calls.")
 	fmt.Println("💡 You can check your authentication status with: kit auth status")

+	// Set default model if requested
+	if err := setDefaultModelIfRequested("openai"); err != nil {
+		return err
+	}
+
+	// Remind users how to set this as default if they didn't use --set-default
+	if !loginSetDefault {
+		fmt.Println("\n💡 To set OpenAI as your default model, run:")
+		fmt.Println("   kit auth login openai --set-default")
+	}
+
 	return nil
 }

@@ -504,13 +564,13 @@ func startOpenAICallbackServer(expectedState string) (*callbackServer, error) {
 		}

 		// Return success page
-		w.Header().Set("Content-Type", "text/html")
+		w.Header().Set("Content-Type", "text/html; charset=utf-8")
 		w.WriteHeader(http.StatusOK)
 		_, _ = fmt.Fprintf(w, `<!DOCTYPE html>
 <html>
 <head><title>Authentication Successful</title></head>
 <body style="font-family: sans-serif; text-align: center; padding: 50px;">
-<h1>✓ Authentication Successful</h1>
+<h1>&#10003; Authentication Successful</h1>
 <p>You can close this window and return to the terminal.</p>
 </body>
 </html>`)
@@ -19,6 +19,7 @@ import (
 	"github.com/mark3labs/kit/internal/prompts"
 	"github.com/mark3labs/kit/internal/ui"
 	"github.com/mark3labs/kit/internal/ui/commands"
+	"github.com/mark3labs/kit/internal/ui/progress"
 	"github.com/mark3labs/kit/internal/watcher"
 	kit "github.com/mark3labs/kit/pkg/kit"
 	"github.com/spf13/cobra"
@@ -33,13 +34,18 @@ var (
 	providerURL      string
 	providerAPIKey   string
 	debugMode        bool
-	positionalPrompt string // set by processPositionalArgs from CLI positional args
-	quietFlag        bool
-	jsonFlag         bool
-	noExitFlag       bool
-	maxSteps         int
-	streamFlag       bool // Enable streaming output
-	autoCompactFlag  bool // Enable auto-compaction near context limit
+	positionalPrompt string        // set by processPositionalArgs from CLI positional args
+	positionalFiles  []ui.FilePart // binary @file parts from processPositionalArgs
+
+	// MCP resource callbacks, set in runNormalMode, consumed by runInteractiveModeBubbleTea.
+	mcpGetResources   func() []ui.FileSuggestion
+	mcpResourceReader ui.MCPResourceReader
+	quietFlag         bool
+	jsonFlag          bool
+	noExitFlag        bool
+	maxSteps          int
+	streamFlag        bool // Enable streaming output
+	autoCompactFlag   bool // Enable auto-compaction near context limit

 	// Session management
 	sessionPath string
@@ -291,14 +297,14 @@ func init() {
 	flags.BoolVar(&noPromptTemplates, "no-prompt-templates", false, "disable prompt template discovery")

 	// Model generation parameters
-	flags.IntVar(&maxTokens, "max-tokens", 4096, "maximum number of tokens in the response")
+	flags.IntVar(&maxTokens, "max-tokens", 8192, "maximum number of output tokens per response (auto-raised up to 32768 for models with higher known output limits; see internal/models/embedded_models.json)")
 	flags.Float32Var(&temperature, "temperature", 0.7, "controls randomness in responses (0.0-1.0)")
 	flags.Float32Var(&topP, "top-p", 0.95, "controls diversity via nucleus sampling (0.0-1.0)")
 	flags.Int32Var(&topK, "top-k", 40, "controls diversity by limiting top K tokens to sample from")
 	flags.Float32Var(&frequencyPenalty, "frequency-penalty", 0.0, "penalizes tokens based on frequency of appearance (0.0-2.0)")
 	flags.Float32Var(&presencePenalty, "presence-penalty", 0.0, "penalizes tokens based on whether they have appeared (0.0-2.0)")
 	flags.StringSliceVar(&stopSequences, "stop-sequences", nil, "custom stop sequences (comma-separated)")
-	flags.StringVar(&thinkingLevel, "thinking-level", "off", "extended thinking level: off, minimal, low, medium, high")
+	flags.StringVar(&thinkingLevel, "thinking-level", "off", "extended thinking level: off, none, minimal, low, medium, high")

 	// Ollama-specific parameters
 	flags.Int32Var(&numGPU, "num-gpu-layers", -1, "number of model layers to offload to GPU for Ollama models (-1 for auto-detect)")
@@ -338,12 +344,14 @@ func init() {
 }

 // processPositionalArgs separates positional CLI arguments into @file
-// attachments and prompt text. File content is read and prepended to
-// positionalPrompt so the agent receives it. Positional args are the primary
-// way to run non-interactive mode:
+// attachments and prompt text. Text file content is read and prepended to
+// positionalPrompt; binary files (images, audio) are stored in positionalFiles
+// for multimodal submission. Positional args are the primary way to run
+// non-interactive mode:
 //
 //	kit "Explain this codebase"
 //	kit @code.ts @test.ts "Review these files"
+//	kit @screenshot.png "What's in this image?"
 func processPositionalArgs(args []string) {
 	cwd, err := os.Getwd()
 	if err != nil {
@@ -362,14 +370,17 @@ func processPositionalArgs(args []string) {
 	}

 	// Build file content prefix from @file arguments.
+	// Text files are XML-wrapped inline; binary files become multimodal parts.
 	var fileContent strings.Builder
 	for _, token := range fileTokens {
-		expanded := ui.ProcessFileAttachments(token, cwd)
-		if expanded != token {
-			// File was resolved — add it.
-			fileContent.WriteString(expanded)
+		result := ui.ProcessFileAttachments(token, cwd)
+		if result.ProcessedText != token {
+			// Text file was resolved — add it.
+			fileContent.WriteString(result.ProcessedText)
 			fileContent.WriteString("\n\n")
 		}
+		// Collect binary file parts for multimodal submission.
+		positionalFiles = append(positionalFiles, result.FileParts...)
 	}

 	// Combine: positional prompt text is appended to any existing --prompt
@@ -753,10 +764,11 @@ func runNormalMode(ctx context.Context) error {
 			}
 		},
 		CLI: &kit.CLIOptions{
-			MCPConfig:         mcpConfig,
-			ShowSpinner:       true,
-			SpinnerFunc:       spinnerFunc,
-			UseBufferedLogger: true,
+			MCPConfig:          mcpConfig,
+			ShowSpinner:        true,
+			SpinnerFunc:        spinnerFunc,
+			UseBufferedLogger:  true,
+			ProgressReaderFunc: progress.NewProgressReadCloser,
 		},
 	}
 	if resumeFlag {
@@ -1704,6 +1716,81 @@ func runNormalMode(ctx context.Context) error {
 		return kitInstance.GetMCPToolCount()
 	}

+	// Build MCP prompt provider callbacks for the TUI.
+	// Convert kit.MCPPrompt → ui.MCPPromptInfo for the UI layer.
+	convertMCPPromptsForUI := func() []ui.MCPPromptInfo {
+		prompts := kitInstance.ListMCPPrompts()
+		if len(prompts) == 0 {
+			return nil
+		}
+		result := make([]ui.MCPPromptInfo, len(prompts))
+		for i, p := range prompts {
+			args := make([]ui.MCPPromptArgInfo, len(p.Arguments))
+			for j, a := range p.Arguments {
+				args[j] = ui.MCPPromptArgInfo{
+					Name:        a.Name,
+					Description: a.Description,
+					Required:    a.Required,
+				}
+			}
+			result[i] = ui.MCPPromptInfo{
+				Name:        p.Name,
+				Description: p.Description,
+				Arguments:   args,
+				ServerName:  p.ServerName,
+			}
+		}
+		return result
+	}
+	mcpPrompts := convertMCPPromptsForUI()
+	getMCPPrompts := func() []ui.MCPPromptInfo {
+		return convertMCPPromptsForUI()
+	}
+	expandMCPPrompt := func(serverName, promptName string, args map[string]string) (*ui.MCPPromptExpandResult, error) {
+		result, err := kitInstance.GetMCPPrompt(context.Background(), serverName, promptName, args)
+		if err != nil {
+			return nil, err
+		}
+		msgs := make([]ui.MCPPromptMessageInfo, len(result.Messages))
+		for i, m := range result.Messages {
+			msgs[i] = ui.MCPPromptMessageInfo{
+				Role:      m.Role,
+				Content:   m.Content,
+				FileParts: m.FileParts,
+			}
+		}
+		return &ui.MCPPromptExpandResult{Messages: msgs}, nil
+	}
+
+	// MCP resource callbacks for @ autocomplete and submit-time resolution.
+	getMCPResources := func() []ui.FileSuggestion {
+		resources := kitInstance.ListMCPResources()
+		suggestions := make([]ui.FileSuggestion, len(resources))
+		for i, r := range resources {
+			suggestions[i] = ui.FileSuggestion{
+				RelPath:        r.Name,
+				IsMCPResource:  true,
+				MCPServerName:  r.ServerName,
+				MCPResourceURI: r.URI,
+				MCPMIMEType:    r.MIMEType,
+				Score:          100, // default score, filtered later
+			}
+		}
+		return suggestions
+	}
+	mcpResourceReaderFn := func(serverName, uri string) (string, []byte, string, bool, error) {
+		content, err := kitInstance.ReadMCPResource(context.Background(), serverName, uri)
+		if err != nil {
+			return "", nil, "", false, err
+		}
+		return content.Text, content.BlobData, content.MIMEType, content.IsBlob, nil
+	}
+
+	// Store MCP resource callbacks at package level for consumption by
+	// runInteractiveModeBubbleTea and runNonInteractiveModeApp.
+	mcpGetResources = getMCPResources
+	mcpResourceReader = mcpResourceReaderFn
+
 	// Start a goroutine that waits for background MCP tool loading to
 	// complete and notifies the TUI so it can refresh tool names and counts.
 	if len(mcpConfig.MCPServers) > 0 {
@@ -1840,7 +1927,7 @@ func runNormalMode(ctx context.Context) error {

 	// Check if running in non-interactive mode
 	if positionalPrompt != "" {
-		return runNonInteractiveModeApp(ctx, appInstance, cli, positionalPrompt, quietFlag, jsonFlag, noExitFlag, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI)
+		return runNonInteractiveModeApp(ctx, appInstance, cli, positionalPrompt, quietFlag, jsonFlag, noExitFlag, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, mcpPrompts, getMCPPrompts, expandMCPPrompt, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI)
 	}

 	// Quiet mode is not allowed in interactive mode
@@ -1848,7 +1935,7 @@ func runNormalMode(ctx context.Context) error {
 		return fmt.Errorf("--quiet requires a prompt")
 	}

-	return runInteractiveModeBubbleTea(ctx, appInstance, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI, startupExtensionMessages)
+	return runInteractiveModeBubbleTea(ctx, appInstance, modelName, parsedProvider, kitInstance.GetLoadingMessage(), serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, mcpPrompts, getMCPPrompts, expandMCPPrompt, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModelForUI, emitModelChangeForUI, kitInstance.IsReasoningModel(), kitInstance.GetThinkingLevel(), setThinkingLevelForUI, switchSessionForUI, reloadExtensionsForUI, startupExtensionMessages)
 }

 // runNonInteractiveModeApp executes a single prompt via the app layer and exits,
@@ -1861,15 +1948,33 @@ func runNormalMode(ctx context.Context) error {
 //
 // When --no-exit is set, after the prompt completes the interactive BubbleTea
 // TUI is started so the user can continue the conversation.
-func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui.CLI, prompt string, quiet, jsonOutput, noExit bool, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error) error {
+func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui.CLI, prompt string, quiet, jsonOutput, noExit bool, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, mcpPrompts []ui.MCPPromptInfo, getMCPPrompts func() []ui.MCPPromptInfo, expandMCPPrompt func(string, string, map[string]string) (*ui.MCPPromptExpandResult, error), getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error) error {
 	// Expand @file references in the prompt before sending to the agent.
+	// Text files are XML-inlined; binary files are extracted as multimodal parts.
+	var fileParts []kit.LLMFilePart
 	if cwd, err := os.Getwd(); err == nil {
-		prompt = ui.ProcessFileAttachments(prompt, cwd)
+		result := ui.ProcessFileAttachments(prompt, cwd, mcpResourceReader)
+		prompt = result.ProcessedText
+		for _, fp := range result.FileParts {
+			fileParts = append(fileParts, kit.LLMFilePart{
+				Filename:  fp.Filename,
+				Data:      fp.Data,
+				MediaType: fp.MediaType,
+			})
+		}
+	}
+	// Also include binary files from processPositionalArgs (CLI @file args).
+	for _, fp := range positionalFiles {
+		fileParts = append(fileParts, kit.LLMFilePart{
+			Filename:  fp.Filename,
+			Data:      fp.Data,
+			MediaType: fp.MediaType,
+		})
 	}

 	if jsonOutput {
 		// JSON mode: no intermediate display, structured JSON output.
-		result, err := appInstance.RunOnceResult(ctx, prompt)
+		result, err := appInstance.RunOnceResultWithFiles(ctx, prompt, fileParts)
 		if err != nil {
 			writeJSONError(err)
 			return err
@@ -1881,7 +1986,7 @@ func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui
 		fmt.Println(string(data))
 	} else if quiet {
 		// Quiet mode: no intermediate display, just print final response.
-		if err := appInstance.RunOnce(ctx, prompt); err != nil {
+		if err := appInstance.RunOnceWithFiles(ctx, prompt, fileParts); err != nil {
 			return err
 		}
 	} else if cli != nil {
@@ -1890,21 +1995,21 @@ func runNonInteractiveModeApp(ctx context.Context, appInstance *app.App, cli *ui

 		// Route events through the shared CLI event handler.
 		eventHandler := ui.NewCLIEventHandler(cli, modelName)
-		err := appInstance.RunOnceWithDisplay(ctx, prompt, eventHandler.Handle)
+		err := appInstance.RunOnceWithDisplayAndFiles(ctx, prompt, eventHandler.Handle, fileParts)
 		eventHandler.Cleanup()
 		if err != nil {
 			return err
 		}
 	} else {
 		// No CLI available (shouldn't happen in non-quiet mode, but be safe).
-		if err := appInstance.RunOnce(ctx, prompt); err != nil {
+		if err := appInstance.RunOnceWithFiles(ctx, prompt, fileParts); err != nil {
 			return err
 		}
 	}

 	// If --no-exit was requested, hand off to the interactive TUI.
 	if noExit {
-		return runInteractiveModeBubbleTea(ctx, appInstance, modelName, providerName, loadingMessage, serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModel, emitModelChange, isReasoningModel, thinkingLevel, setThinkingLevel, switchSession, reloadExtensions, nil)
+		return runInteractiveModeBubbleTea(ctx, appInstance, modelName, providerName, loadingMessage, serverNames, toolNames, mcpToolCount, extensionToolCount, usageTracker, extCommands, promptTemplates, contextPaths, skillItems, getPromptTemplates, getSkillItems, getToolNames, getMCPToolCount, mcpPrompts, getMCPPrompts, expandMCPPrompt, getWidgets, getHeader, getFooter, getToolRenderer, getEditorInterceptor, getUIVisibility, getStatusBarEntries, emitBeforeFork, emitBeforeSessionSwitch, getGlobalShortcuts, getExtensionCommands, setModel, emitModelChange, isReasoningModel, thinkingLevel, setThinkingLevel, switchSession, reloadExtensions, nil)
 	}

 	return nil
@@ -2002,7 +2107,7 @@ func writeJSONError(err error) {
 //  4. Calls program.Run() which blocks until the user quits (Ctrl+C or /quit).
 //
 // SetupCLI is not used for interactive mode; the TUI (AppModel) handles its own rendering.
-func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error, startupExtensionMessages []string) error {
+func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelName, providerName, loadingMessage string, serverNames, toolNames []string, mcpToolCount, extensionToolCount int, usageTracker *ui.UsageTracker, extCommands []commands.ExtensionCommand, promptTemplates []*prompts.PromptTemplate, contextPaths []string, skillItems []ui.SkillItem, getPromptTemplates func() []*prompts.PromptTemplate, getSkillItems func() []ui.SkillItem, getToolNames func() []string, getMCPToolCount func() int, mcpPrompts []ui.MCPPromptInfo, getMCPPrompts func() []ui.MCPPromptInfo, expandMCPPrompt func(string, string, map[string]string) (*ui.MCPPromptExpandResult, error), getWidgets func(string) []ui.WidgetData, getHeader, getFooter func() *ui.WidgetData, getToolRenderer func(string) *ui.ToolRendererData, getEditorInterceptor func() *ui.EditorInterceptor, getUIVisibility func() *ui.UIVisibility, getStatusBarEntries func() []ui.StatusBarEntryData, emitBeforeFork func(string, bool, string) (bool, string), emitBeforeSessionSwitch func(string) (bool, string), getGlobalShortcuts func() map[string]func(), getExtensionCommands func() []commands.ExtensionCommand, setModel func(string) error, emitModelChange func(string, string, string), isReasoningModel bool, thinkingLevel string, setThinkingLevel func(string) error, switchSession func(string) error, reloadExtensions func() error, startupExtensionMessages []string) error {
 	// Redirect all log output (stdlib and charm) to a file so that log
 	// messages don't write to stderr and corrupt the TUI. Bubble Tea
 	// captures stdout for rendering; any stray stderr output from
@@ -2041,6 +2146,9 @@ func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelN
 		ExtensionCommands:        extCommands,
 		PromptTemplates:          promptTemplates,
 		GetPromptTemplates:       getPromptTemplates,
+		MCPPrompts:               mcpPrompts,
+		GetMCPPrompts:            getMCPPrompts,
+		ExpandMCPPrompt:          expandMCPPrompt,
 		ContextPaths:             contextPaths,
 		SkillItems:               skillItems,
 		GetSkillItems:            getSkillItems,
@@ -2064,6 +2172,8 @@ func runInteractiveModeBubbleTea(_ context.Context, appInstance *app.App, modelN
 		SwitchSession:            switchSession,
 		ReloadExtensions:         reloadExtensions,
 		ShowSessionPicker:        resumeFlag,
+		GetMCPResources:          mcpGetResources,
+		MCPResourceReader:        mcpResourceReader,
 	})

 	program := tea.NewProgram(appModel)
@@ -130,6 +130,58 @@ func TestSubagentMonitor_MultipleSubagents(t *testing.T) {
 	time.Sleep(100 * time.Millisecond)
 }

+// TestSubagentMonitor_ConcurrentSubagents verifies no panics when multiple
+// subagents emit events concurrently from different goroutines.
+func TestSubagentMonitor_ConcurrentSubagents(t *testing.T) {
+	harness := test.New(t)
+	harness.LoadFile("../../.kit/extensions/subagent-monitor.go")
+
+	_, err := harness.Emit(extensions.SessionStartEvent{SessionID: "test-session"})
+	if err != nil {
+		t.Fatalf("SessionStart should not error: %v", err)
+	}
+
+	// Start 5 subagents concurrently
+	done := make(chan struct{}, 5)
+	for i := range 5 {
+		go func(idx int) {
+			defer func() { done <- struct{}{} }()
+
+			callID := fmt.Sprintf("concurrent-%d", idx)
+			task := fmt.Sprintf("concurrent task %d", idx)
+
+			_, _ = harness.Emit(extensions.SubagentStartEvent{
+				ToolCallID: callID,
+				Task:       task,
+			})
+
+			// Emit many chunks rapidly
+			for j := range 20 {
+				_, _ = harness.Emit(extensions.SubagentChunkEvent{
+					ToolCallID: callID,
+					Task:       task,
+					ChunkType:  "text",
+					Content:    fmt.Sprintf("agent %d chunk %d", idx, j),
+				})
+			}
+
+			_, _ = harness.Emit(extensions.SubagentEndEvent{
+				ToolCallID: callID,
+				Task:       task,
+				Response:   "done",
+			})
+		}(i)
+	}
+
+	// Wait for all goroutines
+	for range 5 {
+		<-done
+	}
+
+	// Allow any final processing
+	time.Sleep(200 * time.Millisecond)
+}
+
 // TestSubagentMonitor_SessionShutdown verifies shutdown doesn't panic
 // even with nil ctx functions.
 func TestSubagentMonitor_SessionShutdown(t *testing.T) {
@@ -0,0 +1,153 @@
+//go:build ignore
+
+// sudo-handler.go - Extension to handle sudo password prompts securely
+//
+// This extension intercepts bash commands containing "sudo" and:
+// 1. Checks whether a password is already cached in memory for this session
+// 2. If not cached, prompts the user for their password (with masking)
+// 3. Sets the SUDO_PASSWORD environment variable (left set until session shutdown)
+// 4. The bash tool automatically uses sudo -S -p '' to pipe the password
+//
+// Usage: kit -e examples/extensions/sudo-handler.go
+//
+// Security notes:
+// - Password is held in memory and in the process environment for the session
+// - Password is never logged or displayed
+// - The cache is cleared on session shutdown, so each session re-prompts
+// - SUDO_PASSWORD stays set from the first sudo command until session shutdown
+
+package main
+
+import (
+	"encoding/json"
+	"os"
+	"strings"
+	"sync"
+
+	"kit/ext"
+)
+
+var (
+	// cachedPassword stores the sudo password for the session
+	cachedPassword string
+	// hasCachedPassword tracks if we have a valid cached password
+	hasCachedPassword bool
+	// mu protects cached password access
+	mu sync.RWMutex
+)
+
+// Init sets up the sudo handler extension
+func Init(api ext.API) {
+	api.OnToolCall(func(tc ext.ToolCallEvent, ctx ext.Context) *ext.ToolCallResult {
+		if tc.ToolName != "bash" {
+			return nil
+		}
+
+		// Parse the command from tool input
+		var input struct {
+			Command string `json:"command"`
+		}
+		if err := json.Unmarshal([]byte(tc.Input), &input); err != nil {
+			return nil
+		}
+
+		// Check if command contains sudo
+		if !containsSudo(input.Command) {
+			return nil
+		}
+
+		// Check if we already have cached credentials
+		mu.RLock()
+		password := cachedPassword
+		hasCached := hasCachedPassword
+		mu.RUnlock()
+
+		if hasCached {
+			// Use cached password
+			os.Setenv("SUDO_PASSWORD", password)
+			return nil
+		}
+
+		// No cached password - prompt user
+		result := ctx.PromptInput(ext.PromptInputConfig{
+			Message:     "🔐 Sudo password required for:\n  " + truncateCommand(input.Command, 60),
+			Placeholder: "Enter your password",
+		})
+
+		if result.Cancelled {
+			return &ext.ToolCallResult{
+				Block:  true,
+				Reason: "Sudo password prompt cancelled by user",
+			}
+		}
+
+		if result.Value == "" {
+			return &ext.ToolCallResult{
+				Block:  true,
+				Reason: "No password provided",
+			}
+		}
+
+		// Cache the password for this session
+		mu.Lock()
+		cachedPassword = result.Value
+		hasCachedPassword = true
+		mu.Unlock()
+
+		// Set environment variable for the bash tool to use
+		os.Setenv("SUDO_PASSWORD", result.Value)
+
+		// Show confirmation (without revealing password)
+		ctx.PrintInfo("Sudo password cached for this session")
+
+		return nil
+	})
+
+	// Clear cached password when session ends
+	api.OnSessionShutdown(func(event ext.SessionShutdownEvent, ctx ext.Context) {
+		mu.Lock()
+		cachedPassword = ""
+		hasCachedPassword = false
+		mu.Unlock()
+		os.Unsetenv("SUDO_PASSWORD")
+	})
+}
+
+// containsSudo reports whether the command appears to invoke sudo (substring heuristic)
+func containsSudo(command string) bool {
+	// Substring match only: quoting is not parsed, and words ending in "sudo" (e.g. "visudo ") also match
+	lower := strings.ToLower(command)
+
+	// Check for sudo at start or after separators
+	patterns := []string{
+		"sudo ",
+		"sudo\t",
+		";sudo ",
+		"&& sudo ",
+		"|| sudo ",
+		"| sudo ",
+		"$(sudo ",
+		"`sudo ",
+	}
+
+	for _, pattern := range patterns {
+		if strings.Contains(lower, pattern) {
+			return true
+		}
+	}
+
+	// Explicit start-of-command check (already covered by the "sudo " pattern above)
+	if strings.HasPrefix(lower, "sudo ") {
+		return true
+	}
+
+	return false
+}
+
+// truncateCommand truncates a long command for display (byte-based; assumes maxLen >= 3)
+func truncateCommand(cmd string, maxLen int) string {
+	if len(cmd) <= maxLen {
+		return cmd
+	}
+	return cmd[:maxLen-3] + "..."
+}
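The substring patterns in containsSudo above also match words that merely end in "sudo", for example "visudo -c". A minimal sketch of a word-based alternative follows, assuming it is acceptable to split the command on whitespace and a handful of shell operator characters. The names isShellSeparator and containsSudoWord are hypothetical and not part of the extension. Quoting is still not parsed, and invocations by path such as /usr/bin/sudo are not matched.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isShellSeparator reports whether r ends a word for this heuristic:
// whitespace plus a few common shell operator characters.
// (Hypothetical helper, not part of the extension.)
func isShellSeparator(r rune) bool {
	return unicode.IsSpace(r) || strings.ContainsRune(";|&()`$", r)
}

// containsSudoWord reports whether any separator-delimited word of the
// command equals "sudo" (case-insensitive, mirroring the lowercasing in
// containsSudo). It does not parse quotes, so a quoted "sudo" still matches.
func containsSudoWord(command string) bool {
	for _, word := range strings.FieldsFunc(command, isShellSeparator) {
		if strings.EqualFold(word, "sudo") {
			return true
		}
	}
	return false
}

func main() {
	commands := []string{
		"sudo apt update",
		"ls && sudo rm x",
		"echo $(sudo id)",
		"visudo -c",
	}
	for _, cmd := range commands {
		fmt.Printf("%-20q -> %v\n", cmd, containsSudoWord(cmd))
	}
}
```

Matching whole words trades one class of false positives for a missed case (path-qualified sudo), so whether it is preferable depends on which error is cheaper for the extension: an unnecessary password prompt or an unhandled sudo call.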
@@ -62,7 +62,7 @@ func main() {
 		}
 	})
 	// Subscribe to streaming chunks.
-	host3.OnStreaming(func(e kit.MessageUpdateEvent) {
+	host3.OnMessageUpdate(func(e kit.MessageUpdateEvent) {
 		fmt.Print(e.Chunk)
 	})

@@ -4,8 +4,8 @@ go 1.26.2

 require (
 	charm.land/bubbles/v2 v2.1.0
-	charm.land/bubbletea/v2 v2.0.5
-	charm.land/fantasy v0.17.2
+	charm.land/bubbletea/v2 v2.0.6
+	charm.land/fantasy v0.19.0
 	charm.land/huh/v2 v2.0.3
 	charm.land/lipgloss/v2 v2.0.3
 	github.com/alecthomas/chroma/v2 v2.23.1
@@ -14,15 +14,15 @@ require (
 	github.com/charmbracelet/fang v1.0.0
 	github.com/charmbracelet/log v1.0.0
 	github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266
-	github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e
+	github.com/charmbracelet/ultraviolet v0.0.0-20260420095748-421e4a7fa8d7
 	github.com/charmbracelet/x/editor v0.2.0
 	github.com/clipperhouse/displaywidth v0.11.0
 	github.com/clipperhouse/uax29/v2 v2.7.0
-	github.com/coder/acp-go-sdk v0.6.3
+	github.com/coder/acp-go-sdk v0.12.0
 	github.com/fsnotify/fsnotify v1.9.0
 	github.com/indaco/herald v0.13.0
 	github.com/indaco/herald-md v0.3.0
-	github.com/mark3labs/mcp-go v0.48.0
+	github.com/mark3labs/mcp-go v0.49.0
 	github.com/spf13/cobra v1.10.2
 	github.com/spf13/viper v1.21.0
 	github.com/traefik/yaegi v0.16.1
@@ -35,23 +35,23 @@ require (
 	cloud.google.com/go/auth v0.20.0 // indirect
 	cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
 	cloud.google.com/go/compute/metadata v0.9.0 // indirect
-	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0 // indirect
+	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.1 // indirect
 	github.com/Azure/azure-sdk-for-go/sdk/internal v1.12.0 // indirect
-	github.com/aws/aws-sdk-go-v2 v1.41.5 // indirect
-	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.8 // indirect
-	github.com/aws/aws-sdk-go-v2/config v1.32.14 // indirect
-	github.com/aws/aws-sdk-go-v2/credentials v1.19.14 // indirect
-	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.21 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.21 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.21 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.6 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.7 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.21 // indirect
-	github.com/aws/aws-sdk-go-v2/service/signin v1.0.9 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sso v1.30.15 // indirect
-	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.19 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sts v1.41.10 // indirect
-	github.com/aws/smithy-go v1.24.3 // indirect
+	github.com/aws/aws-sdk-go-v2 v1.41.6 // indirect
+	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.9 // indirect
+	github.com/aws/aws-sdk-go-v2/config v1.32.16 // indirect
+	github.com/aws/aws-sdk-go-v2/credentials v1.19.15 // indirect
+	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.22 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.22 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.22 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.23 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.8 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.22 // indirect
+	github.com/aws/aws-sdk-go-v2/service/signin v1.0.10 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sso v1.30.16 // indirect
+	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.20 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sts v1.42.0 // indirect
+	github.com/aws/smithy-go v1.25.0 // indirect
 	github.com/catppuccin/go v0.3.0 // indirect
 	github.com/cespare/xxhash/v2 v2.3.0 // indirect
 	github.com/charmbracelet/anthropic-sdk-go v0.0.0-20260223140439-63879b0b8dab // indirect
@@ -59,14 +59,14 @@ require (
 	github.com/charmbracelet/harmonica v0.2.0 // indirect
 	github.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834 // indirect
 	github.com/charmbracelet/x/cellbuf v0.0.15 // indirect
-	github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913 // indirect
+	github.com/charmbracelet/x/exp/charmtone v0.0.0-20260420102150-fe550f2efce5 // indirect
 	github.com/charmbracelet/x/exp/ordered v0.1.0 // indirect
-	github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913 // indirect
+	github.com/charmbracelet/x/exp/slice v0.0.0-20260420102150-fe550f2efce5 // indirect
 	github.com/charmbracelet/x/exp/strings v0.1.0 // indirect
 	github.com/charmbracelet/x/json v0.2.0 // indirect
 	github.com/charmbracelet/x/termios v0.1.1 // indirect
 	github.com/charmbracelet/x/windows v0.2.2 // indirect
-	github.com/dlclark/regexp2 v1.11.5 // indirect
+	github.com/dlclark/regexp2 v1.12.0 // indirect
 	github.com/dustin/go-humanize v1.0.1 // indirect
 	github.com/felixge/httpsnoop v1.0.4 // indirect
 	github.com/go-json-experiment/json v0.0.0-20260214004413-d219187c3433 // indirect
@@ -79,13 +79,13 @@ require (
 	github.com/google/jsonschema-go v0.4.2 // indirect
 	github.com/google/s2a-go v0.1.9 // indirect
 	github.com/google/uuid v1.6.0 // indirect
-	github.com/googleapis/enterprise-certificate-proxy v0.3.14 // indirect
-	github.com/googleapis/gax-go/v2 v2.21.0 // indirect
+	github.com/googleapis/enterprise-certificate-proxy v0.3.15 // indirect
+	github.com/googleapis/gax-go/v2 v2.22.0 // indirect
 	github.com/gorilla/websocket v1.5.3 // indirect
-	github.com/kaptinlin/go-i18n v0.4.0 // indirect
-	github.com/kaptinlin/jsonpointer v0.4.17 // indirect
-	github.com/kaptinlin/jsonschema v0.7.7 // indirect
-	github.com/kaptinlin/messageformat-go v0.4.20 // indirect
+	github.com/kaptinlin/go-i18n v0.4.2 // indirect
+	github.com/kaptinlin/jsonpointer v0.4.18 // indirect
+	github.com/kaptinlin/jsonschema v0.7.8 // indirect
+	github.com/kaptinlin/messageformat-go v0.5.2 // indirect
 	github.com/mitchellh/hashstructure/v2 v2.0.2 // indirect
 	github.com/muesli/mango v0.2.0 // indirect
 	github.com/muesli/mango-cobra v1.3.0 // indirect
@@ -115,9 +115,9 @@ require (
 	golang.org/x/net v0.53.0 // indirect
 	golang.org/x/oauth2 v0.36.0 // indirect
 	golang.org/x/time v0.15.0 // indirect
-	google.golang.org/api v0.275.0 // indirect
+	google.golang.org/api v0.276.0 // indirect
 	google.golang.org/genai v1.54.0 // indirect
-	google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478 // indirect
+	google.golang.org/genproto/googleapis/rpc v0.0.0-20260420184626-e10c466a9529 // indirect
 	google.golang.org/grpc v1.80.0 // indirect
 	google.golang.org/protobuf v1.36.11 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
@@ -134,7 +134,7 @@ require (
 	github.com/muesli/cancelreader v0.2.2 // indirect
 	github.com/muesli/termenv v0.16.0 // indirect
 	github.com/rivo/uniseg v0.4.7 // indirect
-	github.com/spf13/pflag v1.0.10 // indirect
+	github.com/spf13/pflag v1.0.10
 	golang.org/x/sync v0.20.0 // indirect
 	golang.org/x/sys v0.43.0 // indirect
 	golang.org/x/text v0.36.0
@@ -1,9 +1,9 @@
 charm.land/bubbles/v2 v2.1.0 h1:YSnNh5cPYlYjPxRrzs5VEn3vwhtEn3jVGRBT3M7/I0g=
 charm.land/bubbles/v2 v2.1.0/go.mod h1:l97h4hym2hvWBVfmJDtrEHHCtkIKeTEb3TTJ4ZOB3wY=
-charm.land/bubbletea/v2 v2.0.5 h1:TQlLFqxo39AAHSVuOhJ5D3nH7O9Nk8JGinsfWQ4y1U4=
-charm.land/bubbletea/v2 v2.0.5/go.mod h1:dvbsYZD+MHkdIZl+Z67D212hEvB+GII2tfH8f9SnoDw=
-charm.land/fantasy v0.17.2 h1:ojTMufMxY/PVH7TzYUxht2SVkvD90iCTJfmPR6c8BR8=
-charm.land/fantasy v0.17.2/go.mod h1:V9cCIUMZB9g3Bq40aKEY8xBNzDd48EdfHp2OMS0uzWs=
+charm.land/bubbletea/v2 v2.0.6 h1:UHN/91OyuhaOFGSrBXQ/hMZD8IO1Uc4BvHlgHXL2WJo=
+charm.land/bubbletea/v2 v2.0.6/go.mod h1:MH/D8ZLlN3op37vQvijKuU29g3rqTp+aQapURFonF9g=
+charm.land/fantasy v0.19.0 h1:fnNXkIJ/xcIW3sdVtWxjtQGpWWe8pDGhBCWSHkgbrd0=
+charm.land/fantasy v0.19.0/go.mod h1:V9cCIUMZB9g3Bq40aKEY8xBNzDd48EdfHp2OMS0uzWs=
 charm.land/huh/v2 v2.0.3 h1:2cJsMqEPwSywGHvdlKsJyQKPtSJLVnFKyFbsYZTlLkU=
 charm.land/huh/v2 v2.0.3/go.mod h1:93eEveeeqn47MwiC3tf+2atZ2l7Is88rAtmZNZ8x9Wc=
 charm.land/lipgloss/v2 v2.0.3 h1:yM2zJ4Cf5Y51b7RHIwioil4ApI/aypFXXVHSwlM6RzU=
@@ -16,8 +16,8 @@ cloud.google.com/go/auth/oauth2adapt v0.2.8 h1:keo8NaayQZ6wimpNSmW5OPc283g65QNIi
 cloud.google.com/go/auth/oauth2adapt v0.2.8/go.mod h1:XQ9y31RkqZCcwJWNSx2Xvric3RrU88hAYYbjDWYDL+c=
 cloud.google.com/go/compute/metadata v0.9.0 h1:pDUj4QMoPejqq20dK0Pg2N4yG9zIkYGdBtwLoEkH9Zs=
 cloud.google.com/go/compute/metadata v0.9.0/go.mod h1:E0bWwX5wTnLPedCKqk3pJmVgCBSM6qQI1yTBdEb3C10=
-github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0 h1:fou+2+WFTib47nS+nz/ozhEBnvU96bKHy6LjRsY4E28=
-github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.0/go.mod h1:t76Ruy8AHvUAC8GfMWJMa0ElSbuIcO03NLpynfbgsPA=
+github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.1 h1:jHb/wfvRikGdxMXYV3QG/SzUOPYN9KEUUuC0Yd0/vC0=
+github.com/Azure/azure-sdk-for-go/sdk/azcore v1.21.1/go.mod h1:pzBXCYn05zvYIrwLgtK8Ap8QcjRg+0i76tMQdWN6wOk=
 github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1 h1:Hk5QBxZQC1jb2Fwj6mpzme37xbCDdNTxU7O9eb5+LB4=
 github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1/go.mod h1:IYus9qsFobWIc2YVwe/WPjcnyCkPKtnHAqUYeebc8z0=
 github.com/Azure/azure-sdk-for-go/sdk/internal v1.12.0 h1:fhqpLE3UEXi9lPaBRpQ6XuRW0nU7hgg4zlmZZa+a9q4=
@@ -34,36 +34,36 @@ github.com/alecthomas/repr v0.5.2 h1:SU73FTI9D1P5UNtvseffFSGmdNci/O6RsqzeXJtP0Qs
 github.com/alecthomas/repr v0.5.2/go.mod h1:Fr0507jx4eOXV7AlPV6AVZLYrLIuIeSOWtW57eE/O/4=
 github.com/atotto/clipboard v0.1.4 h1:EH0zSVneZPSuFR11BlR9YppQTVDbh5+16AmcJi4g1z4=
 github.com/atotto/clipboard v0.1.4/go.mod h1:ZY9tmq7sm5xIbd9bOK4onWV4S6X0u6GY7Vn0Yu86PYI=
-github.com/aws/aws-sdk-go-v2 v1.41.5 h1:dj5kopbwUsVUVFgO4Fi5BIT3t4WyqIDjGKCangnV/yY=
-github.com/aws/aws-sdk-go-v2 v1.41.5/go.mod h1:mwsPRE8ceUUpiTgF7QmQIJ7lgsKUPQOUl3o72QBrE1o=
-github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.8 h1:eBMB84YGghSocM7PsjmmPffTa+1FBUeNvGvFou6V/4o=
-github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.8/go.mod h1:lyw7GFp3qENLh7kwzf7iMzAxDn+NzjXEAGjKS2UOKqI=
-github.com/aws/aws-sdk-go-v2/config v1.32.14 h1:opVIRo/ZbbI8OIqSOKmpFaY7IwfFUOCCXBsUpJOwDdI=
-github.com/aws/aws-sdk-go-v2/config v1.32.14/go.mod h1:U4/V0uKxh0Tl5sxmCBZ3AecYny4UNlVmObYjKuuaiOo=
-github.com/aws/aws-sdk-go-v2/credentials v1.19.14 h1:n+UcGWAIZHkXzYt87uMFBv/l8THYELoX6gVcUvgl6fI=
-github.com/aws/aws-sdk-go-v2/credentials v1.19.14/go.mod h1:cJKuyWB59Mqi0jM3nFYQRmnHVQIcgoxjEMAbLkpr62w=
-github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.21 h1:NUS3K4BTDArQqNu2ih7yeDLaS3bmHD0YndtA6UP884g=
-github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.21/go.mod h1:YWNWJQNjKigKY1RHVJCuupeWDrrHjRqHm0N9rdrWzYI=
-github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.21 h1:Rgg6wvjjtX8bNHcvi9OnXWwcE0a2vGpbwmtICOsvcf4=
-github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.21/go.mod h1:A/kJFst/nm//cyqonihbdpQZwiUhhzpqTsdbhDdRF9c=
-github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.21 h1:PEgGVtPoB6NTpPrBgqSE5hE/o47Ij9qk/SEZFbUOe9A=
-github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.21/go.mod h1:p+hz+PRAYlY3zcpJhPwXlLC4C+kqn70WIHwnzAfs6ps=
-github.com/aws/aws-sdk-go-v2/internal/ini v1.8.6 h1:qYQ4pzQ2Oz6WpQ8T3HvGHnZydA72MnLuFK9tJwmrbHw=
-github.com/aws/aws-sdk-go-v2/internal/ini v1.8.6/go.mod h1:O3h0IK87yXci+kg6flUKzJnWeziQUKciKrLjcatSNcY=
-github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.7 h1:5EniKhLZe4xzL7a+fU3C2tfUN4nWIqlLesfrjkuPFTY=
-github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.7/go.mod h1:x0nZssQ3qZSnIcePWLvcoFisRXJzcTVvYpAAdYX8+GI=
-github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.21 h1:c31//R3xgIJMSC8S6hEVq+38DcvUlgFY0FM6mSI5oto=
-github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.21/go.mod h1:r6+pf23ouCB718FUxaqzZdbpYFyDtehyZcmP5KL9FkA=
-github.com/aws/aws-sdk-go-v2/service/signin v1.0.9 h1:QKZH0S178gCmFEgst8hN0mCX1KxLgHBKKY/CLqwP8lg=
-github.com/aws/aws-sdk-go-v2/service/signin v1.0.9/go.mod h1:7yuQJoT+OoH8aqIxw9vwF+8KpvLZ8AWmvmUWHsGQZvI=
-github.com/aws/aws-sdk-go-v2/service/sso v1.30.15 h1:lFd1+ZSEYJZYvv9d6kXzhkZu07si3f+GQ1AaYwa2LUM=
-github.com/aws/aws-sdk-go-v2/service/sso v1.30.15/go.mod h1:WSvS1NLr7JaPunCXqpJnWk1Bjo7IxzZXrZi1QQCkuqM=
-github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.19 h1:dzztQ1YmfPrxdrOiuZRMF6fuOwWlWpD2StNLTceKpys=
-github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.19/go.mod h1:YO8TrYtFdl5w/4vmjL8zaBSsiNp3w0L1FfKVKenZT7w=
-github.com/aws/aws-sdk-go-v2/service/sts v1.41.10 h1:p8ogvvLugcR/zLBXTXrTkj0RYBUdErbMnAFFp12Lm/U=
-github.com/aws/aws-sdk-go-v2/service/sts v1.41.10/go.mod h1:60dv0eZJfeVXfbT1tFJinbHrDfSJ2GZl4Q//OSSNAVw=
-github.com/aws/smithy-go v1.24.3 h1:XgOAaUgx+HhVBoP4v8n6HCQoTRDhoMghKqw4LNHsDNg=
-github.com/aws/smithy-go v1.24.3/go.mod h1:YE2RhdIuDbA5E5bTdciG9KrW3+TiEONeUWCqxX9i1Fc=
+github.com/aws/aws-sdk-go-v2 v1.41.6 h1:1AX0AthnBQzMx1vbmir3Y4WsnJgiydmnJjiLu+LvXOg=
+github.com/aws/aws-sdk-go-v2 v1.41.6/go.mod h1:dy0UzBIfwSeot4grGvY1AqFWN5zgziMmWGzysDnHFcQ=
+github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.9 h1:adBsCIIpLbLmYnkQU+nAChU5yhVTvu5PerROm+/Kq2A=
+github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.9/go.mod h1:uOYhgfgThm/ZyAuJGNQ5YgNyOlYfqnGpTHXvk3cpykg=
+github.com/aws/aws-sdk-go-v2/config v1.32.16 h1:Q0iQ7quUgJP0F/SCRTieScnaMdXr9h/2+wze1u3cNeM=
+github.com/aws/aws-sdk-go-v2/config v1.32.16/go.mod h1:duCCnJEFqpt2RC6no1iK6q+8HpwOAkiUua0pY507dQc=
+github.com/aws/aws-sdk-go-v2/credentials v1.19.15 h1:fyvgWTszojq8hEnMi8PPBTvZdTtEVmAVyo+NFLHBhH4=
+github.com/aws/aws-sdk-go-v2/credentials v1.19.15/go.mod h1:gJiYyMOjNg8OEdRWOf3CrFQxM2a98qmrtjx1zuiQfB8=
+github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.22 h1:IOGsJ1xVWhsi+ZO7/NW8OuZZBtMJLZbk4P5HDjJO0jQ=
+github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.22/go.mod h1:b+hYdbU+jGKfXE8kKM6g1+h+L/Go3vMvzlxBsiuGsxg=
+github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.22 h1:GmLa5Kw1ESqtFpXsx5MmC84QWa/ZrLZvlJGa2y+4kcQ=
+github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.22/go.mod h1:6sW9iWm9DK9YRpRGga/qzrzNLgKpT2cIxb7Vo2eNOp0=
+github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.22 h1:dY4kWZiSaXIzxnKlj17nHnBcXXBfac6UlsAx2qL6XrU=
+github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.22/go.mod h1:KIpEUx0JuRZLO7U6cbV204cWAEco2iC3l061IxlwLtI=
+github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.23 h1:FPXsW9+gMuIeKmz7j6ENWcWtBGTe1kH8r9thNt5Uxx4=
+github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.23/go.mod h1:7J8iGMdRKk6lw2C+cMIphgAnT8uTwBwNOsGkyOCm80U=
+github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.8 h1:HtOTYcbVcGABLOVuPYaIihj6IlkqubBwFj10K5fxRek=
+github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.8/go.mod h1:VsK9abqQeGlzPgUr+isNWzPlK2vKe9INMLWnY65f5Xs=
+github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.22 h1:PUmZeJU6Y1Lbvt9WFuJ0ugUK2xn6hIWUBBbKuOWF30s=
+github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.22/go.mod h1:nO6egFBoAaoXze24a2C0NjQCvdpk8OueRoYimvEB9jo=
+github.com/aws/aws-sdk-go-v2/service/signin v1.0.10 h1:a1Fq/KXn75wSzoJaPQTgZO0wHGqE9mjFnylnqEPTchA=
+github.com/aws/aws-sdk-go-v2/service/signin v1.0.10/go.mod h1:p6+MXNxW7IA6dMgHfTAzljuwSKD0NCm/4lbS4t6+7vI=
+github.com/aws/aws-sdk-go-v2/service/sso v1.30.16 h1:x6bKbmDhsgSZwv6q19wY/u3rLk/3FGjJWyqKcIRufpE=
+github.com/aws/aws-sdk-go-v2/service/sso v1.30.16/go.mod h1:CudnEVKRtLn0+3uMV0yEXZ+YZOKnAtUJ5DmDhilVnIw=
+github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.20 h1:oK/njaL8GtyEihkWMD4k3VgHCT64RQKkZwh0DG5j8ak=
+github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.20/go.mod h1:JHs8/y1f3zY7U5WcuzoJ/yAYGYtNIVPKLIbp61euvmg=
+github.com/aws/aws-sdk-go-v2/service/sts v1.42.0 h1:ks8KBcZPh3PYISr5dAiXCM5/Thcuxk8l+PG4+A0exds=
+github.com/aws/aws-sdk-go-v2/service/sts v1.42.0/go.mod h1:pFw33T0WLvXU3rw1WBkpMlkgIn54eCB5FYLhjDc9Foo=
+github.com/aws/smithy-go v1.25.0 h1:Sz/XJ64rwuiKtB6j98nDIPyYrV1nVNJ4YU74gttcl5U=
+github.com/aws/smithy-go v1.25.0/go.mod h1:YE2RhdIuDbA5E5bTdciG9KrW3+TiEONeUWCqxX9i1Fc=
 github.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k=
 github.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8=
 github.com/aymanbagabas/go-udiff v0.4.1 h1:OEIrQ8maEeDBXQDoGCbbTTXYJMYRCRO1fnodZ12Gv5o=
@@ -86,8 +86,8 @@ github.com/charmbracelet/log v1.0.0 h1:HVVVMmfOorfj3BA9i8X8UL69Hoz9lI0PYwXfJvOdR
 github.com/charmbracelet/log v1.0.0/go.mod h1:uYgY3SmLpwJWxmlrPwXvzVYujxis1vAKRV/0VQB7yWA=
 github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266 h1:BW/sZtyd1JyYy0h5adMm3tzpNyL857LWjuTRET6OhpY=
 github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266/go.mod h1:1DahUaExbUZx/jD+FNT2PKP4L9rLE5+ZBRuI8mZjd/E=
-github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e h1:O5hZFj55wZQWxMiRtQLa3uLKhZGZGS/j8M3OXinQlrw=
-github.com/charmbracelet/ultraviolet v0.0.0-20260414011438-8c69ec811b1e/go.mod h1:bAAz7dh/FTYfC+oiHavL4mX1tOIBZ0ZwYjSi3qE6ivM=
+github.com/charmbracelet/ultraviolet v0.0.0-20260420095748-421e4a7fa8d7 h1:PbFxahSfyADcQOp+7WxbeqN3wX37KA/Rk+EXOW1xS9Q=
+github.com/charmbracelet/ultraviolet v0.0.0-20260420095748-421e4a7fa8d7/go.mod h1:3YdTxlnV/L0bQ3VN8WOSw8doF7LZV/xawUQ4MuAPDvo=
 github.com/charmbracelet/x/ansi v0.11.7 h1:kzv1kJvjg2S3r9KHo8hDdHFQLEqn4RBCb39dAYC84jI=
 github.com/charmbracelet/x/ansi v0.11.7/go.mod h1:9qGpnAVYz+8ACONkZBUWPtL7lulP9No6p1epAihUZwQ=
 github.com/charmbracelet/x/cellbuf v0.0.15 h1:ur3pZy0o6z/R7EylET877CBxaiE1Sp1GMxoFPAIztPI=
@@ -98,14 +98,14 @@ github.com/charmbracelet/x/editor v0.2.0 h1:7XLUKtaRaB8jN7bWU2p2UChiySyaAuIfYiIR
 github.com/charmbracelet/x/editor v0.2.0/go.mod h1:p3oQ28TSL3YPd+GKJ1fHWcp+7bVGpedHpXmo0D6t1dY=
 github.com/charmbracelet/x/errors v0.0.0-20240508181413-e8d8b6e2de86 h1:JSt3B+U9iqk37QUU2Rvb6DSBYRLtWqFqfxf8l5hOZUA=
 github.com/charmbracelet/x/errors v0.0.0-20240508181413-e8d8b6e2de86/go.mod h1:2P0UgXMEa6TsToMSuFqKFQR+fZTO9CNGUNokkPatT/0=
-github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913 h1:6F/6bu5nBLjodsvaU5xAszTaxtHrDU5UiJarpMPZj48=
-github.com/charmbracelet/x/exp/charmtone v0.0.0-20260413165052-6921c759c913/go.mod h1:nsExn0DGyX0lh9LwLHTn2Gg+hafdzfSXnC+QmEJTZFY=
+github.com/charmbracelet/x/exp/charmtone v0.0.0-20260420102150-fe550f2efce5 h1:3ElWZRQqSRqML2P/r2TmuSkdXPMDI+Jg3f0bGA6Ekg4=
+github.com/charmbracelet/x/exp/charmtone v0.0.0-20260420102150-fe550f2efce5/go.mod h1:nsExn0DGyX0lh9LwLHTn2Gg+hafdzfSXnC+QmEJTZFY=
 github.com/charmbracelet/x/exp/golden v0.0.0-20250806222409-83e3a29d542f h1:pk6gmGpCE7F3FcjaOEKYriCvpmIN4+6OS/RD0vm4uIA=
 github.com/charmbracelet/x/exp/golden v0.0.0-20250806222409-83e3a29d542f/go.mod h1:IfZAMTHB6XkZSeXUqriemErjAWCCzT0LwjKFYCZyw0I=
 github.com/charmbracelet/x/exp/ordered v0.1.0 h1:55/qLwjIh0gL0Vni+QAWk7T/qRVP6sBf+2agPBgnOFE=
 github.com/charmbracelet/x/exp/ordered v0.1.0/go.mod h1:5UHwmG+is5THxMyCJHNPCn2/ecI07aKNrW+LcResjJ8=
-github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913 h1:RiZFY92Ug9iz1CenzxSSQla2Z3WflsR7bIuXq40JlpU=
-github.com/charmbracelet/x/exp/slice v0.0.0-20260413165052-6921c759c913/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=
+github.com/charmbracelet/x/exp/slice v0.0.0-20260420102150-fe550f2efce5 h1:QqpW1CPNAnOpM3Nj0X7IT2IFlR90bLdAkO5+A3Hwbi4=
+github.com/charmbracelet/x/exp/slice v0.0.0-20260420102150-fe550f2efce5/go.mod h1:vqEfX6xzqW1pKKZUUiFOKg0OQ7bCh54Q2vR/tserrRA=
 github.com/charmbracelet/x/exp/strings v0.1.0 h1:i69S2XI7uG1u4NLGeJPSYU++Nmjvpo9nwd6aoEm7gkA=
 github.com/charmbracelet/x/exp/strings v0.1.0/go.mod h1:/ehtMPNh9K4odGFkqYJKpIYyePhdp1hLBRvyY4bWkH8=
 github.com/charmbracelet/x/json v0.2.0 h1:DqB+ZGx2h+Z+1s98HOuOyli+i97wsFQIxP2ZQANTPrQ=
@@ -124,15 +124,15 @@ github.com/clipperhouse/uax29/v2 v2.7.0 h1:+gs4oBZ2gPfVrKPthwbMzWZDaAFPGYK72F0NJ
 github.com/clipperhouse/uax29/v2 v2.7.0/go.mod h1:EFJ2TJMRUaplDxHKj1qAEhCtQPW2tJSwu5BF98AuoVM=
 github.com/cncf/xds/go v0.0.0-20260202195803-dba9d589def2 h1:aBangftG7EVZoUb69Os8IaYg++6uMOdKK83QtkkvJik=
 github.com/cncf/xds/go v0.0.0-20260202195803-dba9d589def2/go.mod h1:qwXFYgsP6T7XnJtbKlf1HP8AjxZZyzxMmc+Lq5GjlU4=
-github.com/coder/acp-go-sdk v0.6.3 h1:LsXQytehdjKIYJnoVWON/nf7mqbiarnyuyE3rrjBsXQ=
-github.com/coder/acp-go-sdk v0.6.3/go.mod h1:yKzM/3R9uELp4+nBAwwtkS0aN1FOFjo11CNPy37yFko=
+github.com/coder/acp-go-sdk v0.12.0 h1:GoIC6RrkPMBIVQ3ckSkl+bO/ERV/IRK6clBdZmx4Uf4=
+github.com/coder/acp-go-sdk v0.12.0/go.mod h1:yKzM/3R9uELp4+nBAwwtkS0aN1FOFjo11CNPy37yFko=
 github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
 github.com/creack/pty v1.1.24 h1:bJrF4RRfyJnbTJqzRLHzcGaZK1NeM5kTC9jGgovnR1s=
 github.com/creack/pty v1.1.24/go.mod h1:08sCNb52WyoAwi2QDyzUCTgcvVFhUzewun7wtTfvcwE=
 github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM=
 github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
-github.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ=
-github.com/dlclark/regexp2 v1.11.5/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
+github.com/dlclark/regexp2 v1.12.0 h1:0j4c5qQmnC6XOWNjP3PIXURXN2gWx76rd3KvgdPkCz8=
+github.com/dlclark/regexp2 v1.12.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
 github.com/dnaeon/go-vcr v1.2.0 h1:zHCHvJYTMh1N7xnV7zf1m1GPBF9Ad0Jk/whtQ1663qI=
 github.com/dnaeon/go-vcr v1.2.0/go.mod h1:R4UdLID7HZT3taECzJs4YgbbH6PIGXB6W/sc5OLb6RQ=
 github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
@@ -173,10 +173,10 @@ github.com/google/s2a-go v0.1.9 h1:LGD7gtMgezd8a/Xak7mEWL0PjoTQFvpRudN895yqKW0=
 github.com/google/s2a-go v0.1.9/go.mod h1:YA0Ei2ZQL3acow2O62kdp9UlnvMmU7kA6Eutn0dXayM=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
-github.com/googleapis/enterprise-certificate-proxy v0.3.14 h1:yh8ncqsbUY4shRD5dA6RlzjJaT4hi3kII+zYw8wmLb8=
-github.com/googleapis/enterprise-certificate-proxy v0.3.14/go.mod h1:vqVt9yG9480NtzREnTlmGSBmFrA+bzb0yl0TxoBQXOg=
-github.com/googleapis/gax-go/v2 v2.21.0 h1:h45NjjzEO3faG9Lg/cFrBh2PgegVVgzqKzuZl/wMbiI=
-github.com/googleapis/gax-go/v2 v2.21.0/go.mod h1:But/NJU6TnZsrLai/xBAQLLz+Hc7fHZJt/hsCz3Fih4=
+github.com/googleapis/enterprise-certificate-proxy v0.3.15 h1:xolVQTEXusUcAA5UgtyRLjelpFFHWlPQ4XfWGc7MBas=
+github.com/googleapis/enterprise-certificate-proxy v0.3.15/go.mod h1:vqVt9yG9480NtzREnTlmGSBmFrA+bzb0yl0TxoBQXOg=
+github.com/googleapis/gax-go/v2 v2.22.0 h1:PjIWBpgGIVKGoCXuiCoP64altEJCj3/Ei+kSU5vlZD4=
+github.com/googleapis/gax-go/v2 v2.22.0/go.mod h1:irWBbALSr0Sk3qlqb9SyJ1h68WjgeFuiOzI4Rqw5+aY=
 github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
 github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
 github.com/hexops/gotextdiff v1.0.3 h1:gitA9+qJrrTCsiCl7+kh75nPqQt1cx4ZkudSTLoUqJM=
@@ -187,14 +187,14 @@ github.com/indaco/herald v0.13.0 h1:+xVG9Fx5NpuWhwku/9IlRL6I009NnX4VUGKvlZHTRxU=
 github.com/indaco/herald v0.13.0/go.mod h1:T5g1+XLYvpjouhzAGHnAHDCKizhESkoV6+QPZ3DhgWA=
 github.com/indaco/herald-md v0.3.0 h1:hN1cKyrexPPM9PeHBsKuaWvIizSi/iYvM9yzRgtdb8M=
 github.com/indaco/herald-md v0.3.0/go.mod h1:RUHVaDSG45ymJjKyxpDwBocLXrZo93FB4OeYMsw9B9s=
-github.com/kaptinlin/go-i18n v0.4.0 h1:i7L3U2yurg+xhokITtJ0k+mjHnXqkoyz8ju5Wb7W8Oc=
-github.com/kaptinlin/go-i18n v0.4.0/go.mod h1:njA6x0+4MWGcLWT0KLrwekhRPmze1Hnstf2+VJFzwpM=
-github.com/kaptinlin/jsonpointer v0.4.17 h1:mY9k8ciWncxbsECyaxKnR0MdmxamNdp2tLQkAKVrtSk=
-github.com/kaptinlin/jsonpointer v0.4.17/go.mod h1:SsfsjqnHG5zuKo1DTBzk1VknaHlL4osHw+X9kZKukpU=
-github.com/kaptinlin/jsonschema v0.7.7 h1:41BlQJ9dskH0oE5DSzBUrl/w4JQYIr6N6L0B5GNyDoM=
-github.com/kaptinlin/jsonschema v0.7.7/go.mod h1:rKjWfyySHSxAD7Li2ctYkPlOu960igoKBvZ2ADRtd5Q=
-github.com/kaptinlin/messageformat-go v0.4.20 h1:a0ufTd5liiUubIGeGxpSTnNS8ZSrN4DV01/wGFmfzMs=
-github.com/kaptinlin/messageformat-go v0.4.20/go.mod h1:FqdEPfQLkqVBX7OBRMPgYwUPvKYJohFD9Ok1BMzCfIo=
+github.com/kaptinlin/go-i18n v0.4.2 h1:52gGOx4ZwbLEiOyDMNA1ax2WktKlrKsmV6Ydf9Tw3/I=
+github.com/kaptinlin/go-i18n v0.4.2/go.mod h1:IACLIi+sHn3pGyryFMiqr2N1CJry4OKFD0MAEneEVQk=
+github.com/kaptinlin/jsonpointer v0.4.18 h1:EDUXT4WKpOKguU7oaFv6VaNatN7uHFe6dEYHX0+OFxs=
+github.com/kaptinlin/jsonpointer v0.4.18/go.mod h1:ndmfvrqrEDSbV3F7yGaOuDvr29WrxYU1aqkvef9L2do=
+github.com/kaptinlin/jsonschema v0.7.8 h1:aHv28bYtfLfUXYI/10Phb1nvVyLXNz1lmu73vtKmlOY=
+github.com/kaptinlin/jsonschema v0.7.8/go.mod h1:cz7SK0jTHdabKdQp+SwBKKmOeZ55txuNo72Jx9Sbb2w=
+github.com/kaptinlin/messageformat-go v0.5.2 h1:E+D5oQVRepHgyMiLWRHnPXYFbqBDI4Sek7/CTIAByj4=
+github.com/kaptinlin/messageformat-go v0.5.2/go.mod h1:NKjwS6e9u7DRhAK+vydjDDwJ7UbdHhYjk/yk2WPuZPs=
 github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
 github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
 github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
@@ -203,8 +203,8 @@ github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0
 github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
 github.com/lucasb-eyer/go-colorful v1.4.0 h1:UtrWVfLdarDgc44HcS7pYloGHJUjHV/4FwW4TvVgFr4=
 github.com/lucasb-eyer/go-colorful v1.4.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=
-github.com/mark3labs/mcp-go v0.48.0 h1:o+MXuGW/HCeR2ny5LcAcZQn2bo6I2xaZMEHnpRG+dtw=
-github.com/mark3labs/mcp-go v0.48.0/go.mod h1:JKTC7R2LLVagkEWK7Kwu7DbmA6iIvnNAod6yrHiQMag=
+github.com/mark3labs/mcp-go v0.49.0 h1:7Ssx4d7/T86qnWoJIdye7wEEvUzv39UIbnZb/FqUZMY=
+github.com/mark3labs/mcp-go v0.49.0/go.mod h1:BflTAZAzXlrTpiO44gmjMu89n2FO56rJ9m31fp4zd5k=
 github.com/mattn/go-isatty v0.0.21 h1:xYae+lCNBP7QuW4PUnNG61ffM4hVIfm+zUzDuSzYLGs=
 github.com/mattn/go-isatty v0.0.21/go.mod h1:ZXfXG4SQHsB/w3ZeOYbR0PrPwLy+n6xiMrJlRFqopa4=
 github.com/mattn/go-runewidth v0.0.23 h1:7ykA0T0jkPpzSvMS5i9uoNn2Xy3R383f9HDx3RybWcw=
@@ -310,16 +310,16 @@ golang.org/x/time v0.15.0 h1:bbrp8t3bGUeFOx08pvsMYRTCVSMk89u4tKbNOZbp88U=
 golang.org/x/time v0.15.0/go.mod h1:Y4YMaQmXwGQZoFaVFk4YpCt4FLQMYKZe9oeV/f4MSno=
 gonum.org/v1/gonum v0.17.0 h1:VbpOemQlsSMrYmn7T2OUvQ4dqxQXU+ouZFQsZOx50z4=
 gonum.org/v1/gonum v0.17.0/go.mod h1:El3tOrEuMpv2UdMrbNlKEh9vd86bmQ6vqIcDwxEOc1E=
-google.golang.org/api v0.275.0 h1:vfY5d9vFVJeWEZT65QDd9hbndr7FyZ2+6mIzGAh71NI=
-google.golang.org/api v0.275.0/go.mod h1:Fnag/EWUPIcJXuIkP1pjoTgS5vdxlk3eeemL7Do6bvw=
+google.golang.org/api v0.276.0 h1:nVArUtfLEihtW+b0DdcqRGK1xoEm2+ltAihyztq7MKY=
+google.golang.org/api v0.276.0/go.mod h1:Fnag/EWUPIcJXuIkP1pjoTgS5vdxlk3eeemL7Do6bvw=
 google.golang.org/genai v1.54.0 h1:ZQCa70WMTJDI11FdqWCzGvZ5PanpcpfoO6jl/lrSnGU=
 google.golang.org/genai v1.54.0/go.mod h1:A3kkl0nyBjyFlNjgxIwKq70julKbIxpSxqKO5gw/gmk=
 google.golang.org/genproto v0.0.0-20260406210006-6f92a3bedf2d h1:N1Ec54vZnIPd7MnxRiYLW+oY4fDR4BOS/LrssdD9+ek=
 google.golang.org/genproto v0.0.0-20260406210006-6f92a3bedf2d/go.mod h1:c2hJ1grtnH0xUiEKGDGkjGNTJ1Hy2LrblyKOHF0sqRM=
 google.golang.org/genproto/googleapis/api v0.0.0-20260406210006-6f92a3bedf2d h1:/aDRtSZJjyLQzm75d+a1wOJaqyKBMvIAfeQmoa3ORiI=
 google.golang.org/genproto/googleapis/api v0.0.0-20260406210006-6f92a3bedf2d/go.mod h1:etfGUgejTiadZAUaEP14NP97xi1RGeawqkjDARA/UOs=
-google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478 h1:RmoJA1ujG+/lRGNfUnOMfhCy5EipVMyvUE+KNbPbTlw=
-google.golang.org/genproto/googleapis/rpc v0.0.0-20260414002931-afd174a4e478/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8=
+google.golang.org/genproto/googleapis/rpc v0.0.0-20260420184626-e10c466a9529 h1:XF8+t6QQiS0o9ArVan/HW8Q7cycNPGsJf6GA2nXxYAg=
+google.golang.org/genproto/googleapis/rpc v0.0.0-20260420184626-e10c466a9529/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8=
 google.golang.org/grpc v1.80.0 h1:Xr6m2WmWZLETvUNvIUmeD5OAagMw3FiKmMlTdViWsHM=
 google.golang.org/grpc v1.80.0/go.mod h1:ho/dLnxwi3EDJA4Zghp7k2Ec1+c2jqup0bFkw07bwF4=
 google.golang.org/protobuf v1.36.11 h1:fV6ZwhNocDyBLK0dj+fg8ektcVegBBuEolpbTQyBNVE=
@@ -177,22 +177,55 @@ func (a *Agent) SetSessionMode(_ context.Context, _ acp.SetSessionModeRequest) (
 	return acp.SetSessionModeResponse{}, nil
 }

-// SetSessionModel changes the active model for a session.
-func (a *Agent) SetSessionModel(ctx context.Context, params acp.SetSessionModelRequest) (acp.SetSessionModelResponse, error) {
-	sessionID := string(params.SessionId)
+// ListSessions returns an empty session list. Kit doesn't persist sessions
+// across restarts in ACP mode, so this is effectively a no-op.
+func (a *Agent) ListSessions(_ context.Context, _ acp.ListSessionsRequest) (acp.ListSessionsResponse, error) {
+	return acp.ListSessionsResponse{
+		Sessions: []acp.SessionInfo{},
+	}, nil
+}
+
+// SetSessionConfigOption handles session configuration changes. Currently
+// supports the "model" config option to change the active model for a session.
+func (a *Agent) SetSessionConfigOption(ctx context.Context, params acp.SetSessionConfigOptionRequest) (acp.SetSessionConfigOptionResponse, error) {
+	// Extract session ID and config ID from whichever variant is present.
+	var sessionID string
+	var configID string
+	var value string
+
+	switch {
+	case params.ValueId != nil:
+		sessionID = string(params.ValueId.SessionId)
+		configID = string(params.ValueId.ConfigId)
+		value = string(params.ValueId.Value)
+	case params.Boolean != nil:
+		sessionID = string(params.Boolean.SessionId)
+		configID = string(params.Boolean.ConfigId)
+		// Boolean config options are not used for model selection.
+		log.Debug("acp: set_session_config_option (boolean)", "session", sessionID, "config", configID, "value", params.Boolean.Value)
+		return acp.SetSessionConfigOptionResponse{}, nil
+	default:
+		return acp.SetSessionConfigOptionResponse{}, acp.NewInvalidParams("unsupported config option variant")
+	}
+
 	sess, ok := a.registry.get(sessionID)
 	if !ok {
-		return acp.SetSessionModelResponse{}, acp.NewInvalidParams(fmt.Sprintf("session not found: %s", sessionID))
+		return acp.SetSessionConfigOptionResponse{}, acp.NewInvalidParams(fmt.Sprintf("session not found: %s", sessionID))
 	}

-	modelID := string(params.ModelId)
-	log.Debug("acp: set_session_model", "session", sessionID, "model", modelID)
+	log.Debug("acp: set_session_config_option", "session", sessionID, "config", configID, "value", value)

-	if err := sess.kit.SetModel(ctx, modelID); err != nil {
-		return acp.SetSessionModelResponse{}, fmt.Errorf("set model: %w", err)
+	// Handle known config options.
+	switch configID {
+	case "model":
+		if err := sess.kit.SetModel(ctx, value); err != nil {
+			return acp.SetSessionConfigOptionResponse{}, fmt.Errorf("set model: %w", err)
+		}
+	default:
+		log.Debug("acp: unknown config option", "config", configID)
 	}

-	return acp.SetSessionModelResponse{}, nil
+	return acp.SetSessionConfigOptionResponse{}, nil
 }

 // ---------------------------------------------------------------------------
@@ -6,6 +6,7 @@ import (
 	"fmt"
 	"io"
 	"strings"
+	"time"

 	"charm.land/fantasy"

@@ -87,6 +88,19 @@ type ReasoningDeltaHandler func(delta string)
 // Called when the last reasoning token has been processed, before text streaming starts.
 type ReasoningCompleteHandler func()

+// ToolCallStartHandler is a function type for handling the moment when the LLM
+// begins generating tool call arguments. The tool name is known but the full
+// argument JSON is still streaming.
+type ToolCallStartHandler func(toolCallID, toolName string)
+
+// ToolCallDeltaHandler is a function type for handling streamed fragments of
+// tool call arguments as they arrive from the LLM.
+type ToolCallDeltaHandler func(toolCallID, delta string)
+
+// ToolCallEndHandler is a function type for handling the end of tool argument
+// streaming, before the tool call is parsed and execution begins.
+type ToolCallEndHandler func(toolCallID string)
+
 // ToolOutputHandler is a function type for handling streaming tool output chunks.
 // Used by tools like bash to stream output as it arrives rather than waiting
 // for the command to complete. The isStderr flag indicates if the chunk
@@ -94,6 +108,12 @@ type ReasoningCompleteHandler func()
 // Note: This is an alias for core.ToolOutputCallback to avoid import cycles.
 type ToolOutputHandler = core.ToolOutputCallback

+// PasswordPromptHandler is a function type for password prompts.
+// Used by the bash tool when sudo requires a password. The handler receives
+// a prompt message and returns the password and whether it was cancelled.
+// Note: This is an alias for core.PasswordPromptCallback.
+type PasswordPromptHandler = core.PasswordPromptCallback
+
 // StepMessagesHandler is a function type for persisting messages after each
 // complete step in a multi-step agent turn. The handler receives the messages
 // produced by the step (typically an assistant message with tool calls followed
@@ -107,6 +127,76 @@ type StepMessagesHandler func(stepMessages []fantasy.Message)
 // tracking during long-running tool-calling conversations.
 type StepUsageHandler func(inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens int64)

+// StepStartHandler is called when a new LLM step begins within a turn.
+type StepStartHandler func(stepNumber int)
+
+// StepFinishHandler is called when a step completes with full context.
+type StepFinishHandler func(stepNumber int, hasToolCalls bool, finishReason string, usage fantasy.Usage)
+
+// TextStartHandler is called when the LLM begins generating text content.
+type TextStartHandler func(id string)
+
+// TextEndHandler is called when the LLM finishes generating text content.
+type TextEndHandler func(id string)
+
+// ReasoningStartHandler is called when the LLM begins reasoning/thinking.
+type ReasoningStartHandler func(id string)
+
+// WarningsHandler is called when the LLM provider returns warnings.
+type WarningsHandler func(warnings []string)
+
+// SourceHandler is called when the LLM references a source.
+type SourceHandler func(sourceType, id, url, title string)
+
+// StreamFinishHandler is called when a per-step LLM stream completes.
+type StreamFinishHandler func(usage fantasy.Usage, finishReason string)
+
+// ErrorHandler is called when an agent-level error occurs.
+type ErrorHandler func(err error)
+
+// RetryHandler is called when the LLM request is retried.
+type RetryHandler func(attempt int, err error)
+
+// PrepareStepHandler is called between steps to allow message modification.
+// It receives the step number and current messages, and returns replacement
+// messages (or nil to keep unchanged).
+type PrepareStepHandler func(stepNumber int, messages []fantasy.Message) []fantasy.Message
+
+// GenerateCallbacks consolidates all callback functions for
+// GenerateWithLoopAndStreaming into a single struct. This replaces the long
+// list of positional callback parameters, making it easier to add new
+// callbacks without breaking existing callers (new fields default to nil).
+type GenerateCallbacks struct {
+	OnToolCall          ToolCallHandler
+	OnToolExecution     ToolExecutionHandler
+	OnToolResult        ToolResultHandler
+	OnResponse          ResponseHandler
+	OnToolCallContent   ToolCallContentHandler
+	OnStreamingResponse StreamingResponseHandler
+	OnReasoningDelta    ReasoningDeltaHandler
+	OnReasoningComplete ReasoningCompleteHandler
+	OnToolOutput        ToolOutputHandler
+	OnStepMessages      StepMessagesHandler
+	OnStepUsage         StepUsageHandler
+	OnPasswordPrompt    PasswordPromptHandler
+	OnToolCallStart     ToolCallStartHandler
+	OnToolCallDelta     ToolCallDeltaHandler
+	OnToolCallEnd       ToolCallEndHandler
+
+	// New callbacks for previously unwired Fantasy lifecycle events.
+	OnStepStart      StepStartHandler
+	OnStepFinish     StepFinishHandler
+	OnTextStart      TextStartHandler
+	OnTextEnd        TextEndHandler
+	OnReasoningStart ReasoningStartHandler
+	OnWarnings       WarningsHandler
+	OnSource         SourceHandler
+	OnStreamFinish   StreamFinishHandler
+	OnError          ErrorHandler
+	OnRetry          RetryHandler
+	OnPrepareStep    PrepareStepHandler
+}
+
 // Agent represents an AI agent with core tool integration using the LLM library.
 // Core tools (bash, read, write, edit, grep, find, ls) are registered as direct
 // AgentTool implementations — no MCP layer, no serialization overhead.
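Note on the hunk above: GenerateCallbacks relies on every field being optional, so callers set only the hooks they need and the loop checks each one for nil before calling it. The sketch below illustrates that pattern in isolation. The `callbacks` struct and `runSteps` function are hypothetical, trimmed-down analogues written for this note; they are not part of the kit or Fantasy APIs.

```go
package main

import "fmt"

// callbacks is a hypothetical, reduced analogue of GenerateCallbacks.
// Every field is optional; a nil field means "not interested".
type callbacks struct {
	OnStepStart  func(stepNumber int)
	OnStepFinish func(stepNumber int, hasToolCalls bool)
}

// runSteps simulates a turn with the given number of steps and invokes
// whichever callbacks are set. It returns how many callback calls were made.
func runSteps(steps int, cb callbacks) int {
	fired := 0
	for step := 0; step < steps; step++ {
		if cb.OnStepStart != nil {
			cb.OnStepStart(step)
			fired++
		}
		if cb.OnStepFinish != nil {
			// Pretend that only the first step produced tool calls.
			cb.OnStepFinish(step, step == 0)
			fired++
		}
	}
	return fired
}

func main() {
	// Only one hook is set; the other stays nil and is skipped.
	n := runSteps(2, callbacks{
		OnStepStart: func(stepNumber int) {
			fmt.Println("step started:", stepNumber)
		},
	})
	fmt.Println("callbacks fired:", n)
}
```

Because unset fields are the zero value, adding a field to such a struct later does not require changes at existing call sites, which is the extensibility argument made in the GenerateCallbacks doc comment.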
@@ -245,7 +335,6 @@ func NewAgent(ctx context.Context, agentConfig *AgentConfig) (*Agent, error) {
 	// The mcpReady channel is closed when loading completes (success or failure).
 	if agentConfig.MCPConfig != nil && len(agentConfig.MCPConfig.MCPServers) > 0 {
 		toolManager := tools.NewMCPToolManager()
-		toolManager.SetModel(providerResult.Model)
 		if agentConfig.AuthHandler != nil {
 			toolManager.SetAuthHandler(agentConfig.AuthHandler)
 		}
@@ -325,7 +414,7 @@ func (a *Agent) rebuildFantasyAgent() {
 	allTools := make([]fantasy.AgentTool, len(a.coreTools))
 	copy(allTools, a.coreTools)
 	if a.toolManager != nil {
-		allTools = append(allTools, a.toolManager.GetTools()...)
+		allTools = append(allTools, mcpToolsToAgentTools(a.toolManager.GetTools(), a.toolManager)...)
 	}
 	if len(a.extraTools) > 0 {
 		allTools = append(allTools, a.extraTools...)
@@ -405,13 +494,20 @@ func (a *Agent) GenerateWithLoop(ctx context.Context, messages []fantasy.Message
 	onToolCall ToolCallHandler, onToolExecution ToolExecutionHandler, onToolResult ToolResultHandler,
 	onResponse ResponseHandler, onToolCallContent ToolCallContentHandler,
 ) (*GenerateWithLoopResult, error) {
-	return a.GenerateWithLoopAndStreaming(ctx, messages, onToolCall, onToolExecution, onToolResult,
-		onResponse, onToolCallContent, nil, nil, nil, nil, nil, nil)
+	return a.GenerateWithCallbacks(ctx, messages, GenerateCallbacks{
+		OnToolCall:        onToolCall,
+		OnToolExecution:   onToolExecution,
+		OnToolResult:      onToolResult,
+		OnResponse:        onResponse,
+		OnToolCallContent: onToolCallContent,
+	})
 }

 // GenerateWithLoopAndStreaming processes messages using the agent with streaming and callbacks.
-// The agent handles the tool call loop internally. We map the rich callback system
-// to kit's existing callback interface for UI integration.
+// The agent handles the tool call loop internally.
+//
+// Deprecated: Use GenerateWithCallbacks instead, which takes a GenerateCallbacks
+// struct and is easier to extend with new callbacks.
 func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fantasy.Message,
 	onToolCall ToolCallHandler, onToolExecution ToolExecutionHandler, onToolResult ToolResultHandler,
 	onResponse ResponseHandler, onToolCallContent ToolCallContentHandler,
@@ -421,6 +517,35 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 	onToolOutput ToolOutputHandler,
 	onStepMessages StepMessagesHandler,
 	onStepUsage StepUsageHandler,
+	onPasswordPrompt PasswordPromptHandler,
+	onToolCallStart ToolCallStartHandler,
+	onToolCallDelta ToolCallDeltaHandler,
+	onToolCallEnd ToolCallEndHandler,
+) (*GenerateWithLoopResult, error) {
+	return a.GenerateWithCallbacks(ctx, messages, GenerateCallbacks{
+		OnToolCall:          onToolCall,
+		OnToolExecution:     onToolExecution,
+		OnToolResult:        onToolResult,
+		OnResponse:          onResponse,
+		OnToolCallContent:   onToolCallContent,
+		OnStreamingResponse: onStreamingResponse,
+		OnReasoningDelta:    onReasoningDelta,
+		OnReasoningComplete: onReasoningComplete,
+		OnToolOutput:        onToolOutput,
+		OnStepMessages:      onStepMessages,
+		OnStepUsage:         onStepUsage,
+		OnPasswordPrompt:    onPasswordPrompt,
+		OnToolCallStart:     onToolCallStart,
+		OnToolCallDelta:     onToolCallDelta,
+		OnToolCallEnd:       onToolCallEnd,
+	})
+}
+
+// GenerateWithCallbacks processes messages using the agent with streaming and callbacks.
+// The agent handles the tool call loop internally. We map the rich callback system
+// to kit's existing callback interface for UI integration.
+func (a *Agent) GenerateWithCallbacks(ctx context.Context, messages []fantasy.Message,
+	cb GenerateCallbacks,
 ) (*GenerateWithLoopResult, error) {

 	// Wait for background MCP tool loading to complete and rebuild the
@@ -429,8 +554,13 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 	a.ensureMCPTools()

 	// Inject tool output handler into context for use by core tools (e.g., bash).
-	if onToolOutput != nil {
-		ctx = core.ContextWithToolOutputCallback(ctx, onToolOutput)
+	if cb.OnToolOutput != nil {
+		ctx = core.ContextWithToolOutputCallback(ctx, cb.OnToolOutput)
+	}
+
+	// Inject password prompt handler into context for use by bash tool.
+	if cb.OnPasswordPrompt != nil {
+		ctx = core.ContextWithPasswordPrompt(ctx, cb.OnPasswordPrompt)
 	}

 	// The agent requires the current user input as Prompt, with prior messages as history.
@@ -450,8 +580,13 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 	// provided. The agent only exposes tool/step callbacks on AgentStreamCall, so
 	// Stream is required to observe tool execution in real time. The non-streaming
 	// Generate path is reserved for the simple case with no callbacks at all.
-	hasCallbacks := onToolCall != nil || onToolExecution != nil || onToolResult != nil ||
-		onToolCallContent != nil || onStreamingResponse != nil || onReasoningDelta != nil
+	hasCallbacks := cb.OnToolCall != nil || cb.OnToolExecution != nil || cb.OnToolResult != nil ||
+		cb.OnToolCallContent != nil || cb.OnStreamingResponse != nil || cb.OnReasoningDelta != nil ||
+		cb.OnToolCallStart != nil || cb.OnToolCallDelta != nil || cb.OnToolCallEnd != nil ||
+		cb.OnStepStart != nil || cb.OnStepFinish != nil || cb.OnTextStart != nil ||
+		cb.OnTextEnd != nil || cb.OnReasoningStart != nil || cb.OnWarnings != nil ||
+		cb.OnSource != nil || cb.OnStreamFinish != nil || cb.OnError != nil ||
+		cb.OnRetry != nil || cb.OnPrepareStep != nil

 	if a.streamingEnabled || hasCallbacks {
 		// Track completed step messages so we can return partial results
@@ -460,9 +595,11 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 		// for every step that completed before the error occurred.
 		var completedStepMessages []fantasy.Message
 		// persistedCount tracks how many new messages (beyond the original
-		// input) were persisted incrementally via onStepMessages, so the
+		// input) were persisted incrementally via cb.OnStepMessages, so the
 		// caller can skip them during post-generation persistence.
 		var persistedCount int
+		// stepCounter tracks the current step number for StepStart/StepFinish events.
+		var stepCounter int

 		// Use the streaming agent
 		streamCall := fantasy.AgentStreamCall{
@@ -470,13 +607,73 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 			Files:    files,
 			Messages: history,

+			// Tool input streaming callbacks — fire during tool argument generation
+			OnToolInputStart: func(id, toolName string) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnToolCallStart != nil {
+					cb.OnToolCallStart(id, toolName)
+				}
+				return nil
+			},
+			OnToolInputDelta: func(id, delta string) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnToolCallDelta != nil {
+					cb.OnToolCallDelta(id, delta)
+				}
+				return nil
+			},
+			OnToolInputEnd: func(id string) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnToolCallEnd != nil {
+					cb.OnToolCallEnd(id)
+				}
+				return nil
+			},
+
+			// Text start/end callbacks
+			OnTextStart: func(id string) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnTextStart != nil {
+					cb.OnTextStart(id)
+				}
+				return nil
+			},
+			OnTextEnd: func(id string) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnTextEnd != nil {
+					cb.OnTextEnd(id)
+				}
+				return nil
+			},
+
+			// Reasoning start callback
+			OnReasoningStart: func(id string, _ fantasy.ReasoningContent) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnReasoningStart != nil {
+					cb.OnReasoningStart(id)
+				}
+				return nil
+			},
+
 			// Reasoning/thinking streaming callback
 			OnReasoningDelta: func(id, delta string) error {
 				if ctx.Err() != nil {
 					return ctx.Err()
 				}
-				if onReasoningDelta != nil {
-					onReasoningDelta(delta)
+				if cb.OnReasoningDelta != nil {
+					cb.OnReasoningDelta(delta)
 				}
 				return nil
 			},
@@ -486,8 +683,8 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				if ctx.Err() != nil {
 					return ctx.Err()
 				}
-				if onReasoningComplete != nil {
-					onReasoningComplete()
+				if cb.OnReasoningComplete != nil {
+					cb.OnReasoningComplete()
 				}
 				return nil
 			},
@@ -497,8 +694,64 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				if ctx.Err() != nil {
 					return ctx.Err()
 				}
-				if onStreamingResponse != nil {
-					onStreamingResponse(text)
+				if cb.OnStreamingResponse != nil {
+					cb.OnStreamingResponse(text)
+				}
+				return nil
+			},
+
+			// Warnings callback
+			OnWarnings: func(warnings []fantasy.CallWarning) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnWarnings != nil {
+					strs := make([]string, len(warnings))
+					for i, w := range warnings {
+						strs[i] = w.Message
+					}
+					cb.OnWarnings(strs)
+				}
+				return nil
+			},
+
+			// Source callback
+			OnSource: func(source fantasy.SourceContent) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnSource != nil {
+					cb.OnSource(string(source.SourceType), source.ID, source.URL, source.Title)
+				}
+				return nil
+			},
+
+			// Stream finish callback (per-step stream completion)
+			OnStreamFinish: func(usage fantasy.Usage, finishReason fantasy.FinishReason, _ fantasy.ProviderMetadata) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				if cb.OnStreamFinish != nil {
+					cb.OnStreamFinish(usage, string(finishReason))
+				}
+				return nil
+			},
+
+			// Error callback
+			OnError: func(err error) {
+				if cb.OnError != nil {
+					cb.OnError(err)
+				}
+			},
+
+			// Step start callback
+			OnStepStart: func(stepNumber int) error {
+				if ctx.Err() != nil {
+					return ctx.Err()
+				}
+				stepCounter = stepNumber
+				if cb.OnStepStart != nil {
+					cb.OnStepStart(stepNumber)
 				}
 				return nil
 			},
@@ -511,13 +764,13 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				currentToolArgs = tc.Input

 				// Notify about the tool call
-				if onToolCall != nil {
-					onToolCall(tc.ToolCallID, tc.ToolName, tc.Input)
+				if cb.OnToolCall != nil {
+					cb.OnToolCall(tc.ToolCallID, tc.ToolName, tc.Input)
 				}

 				// Notify tool execution starting
-				if onToolExecution != nil {
-					onToolExecution(tc.ToolCallID, tc.ToolName, tc.Input, true)
+				if cb.OnToolExecution != nil {
+					cb.OnToolExecution(tc.ToolCallID, tc.ToolName, tc.Input, true)
 				}

 				return nil
@@ -529,14 +782,14 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 					return ctx.Err()
 				}
 				// Notify tool execution finished
-				if onToolExecution != nil {
-					onToolExecution(tr.ToolCallID, tr.ToolName, currentToolArgs, false)
+				if cb.OnToolExecution != nil {
+					cb.OnToolExecution(tr.ToolCallID, tr.ToolName, currentToolArgs, false)
 				}

-				if onToolResult != nil {
+				if cb.OnToolResult != nil {
 					// Extract result text and error status
 					resultText, isError := extractToolResultText(tr)
-					onToolResult(tr.ToolCallID, tr.ToolName, currentToolArgs, resultText, tr.ClientMetadata, isError)
+					cb.OnToolResult(tr.ToolCallID, tr.ToolName, currentToolArgs, resultText, tr.ClientMetadata, isError)
 				}

 				return nil
@@ -550,8 +803,8 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan

 				// Persist step messages incrementally so progress is saved
 				// as it happens rather than only at the end of the turn.
-				if onStepMessages != nil && len(step.Messages) > 0 {
-					onStepMessages(step.Messages)
+				if cb.OnStepMessages != nil && len(step.Messages) > 0 {
+					cb.OnStepMessages(step.Messages)
 					persistedCount += len(step.Messages)
 				}

@@ -561,65 +814,88 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 				// Check if step has text content alongside tool calls
 				text := step.Content.Text()
 				toolCalls := step.Content.ToolCalls()
-				if text != "" && len(toolCalls) > 0 && onToolCallContent != nil {
-					onToolCallContent(text)
+				if text != "" && len(toolCalls) > 0 && cb.OnToolCallContent != nil {
+					cb.OnToolCallContent(text)
 				}
 				// Emit step usage for real-time cost tracking
-				if onStepUsage != nil {
-					onStepUsage(step.Usage.InputTokens, step.Usage.OutputTokens,
+				if cb.OnStepUsage != nil {
+					cb.OnStepUsage(step.Usage.InputTokens, step.Usage.OutputTokens,
 						step.Usage.CacheReadTokens, step.Usage.CacheCreationTokens)
 				}
+				// Emit unified step finish event
+				if cb.OnStepFinish != nil {
+					cb.OnStepFinish(stepCounter, len(toolCalls) > 0, string(step.FinishReason), step.Usage)
+				}
 				return nil
 			},
 		}

-		// If a steer channel is attached to the context, wire up a
-		// PrepareStep function that drains the channel between steps
-		// and injects pending steer messages as user messages before
-		// the next LLM call. This enables graceful mid-turn steering
-		// without cancelling in-progress tool execution.
-		if steerCh := steerChFromContext(ctx); steerCh != nil {
-			onConsumed := steerConsumedFromContext(ctx)
+		// Always wire up PrepareStep to handle both steering and the
+		// OnPrepareStep hook. Steering drains its channel first, then
+		// OnPrepareStep hooks run against the (possibly already steered)
+		// messages.
+		steerCh := steerChFromContext(ctx)
+		onConsumed := steerConsumedFromContext(ctx)
+		hasSteering := steerCh != nil
+		hasPrepareStepHook := cb.OnPrepareStep != nil
+
+		if hasSteering || hasPrepareStepHook {
 			streamCall.PrepareStep = func(
 				stepCtx context.Context,
 				opts fantasy.PrepareStepFunctionOptions,
 			) (context.Context, fantasy.PrepareStepResult, error) {
-				// Drain all pending steer messages (non-blocking).
-				var steered []SteerMessage
-				for {
-					select {
-					case msg := <-steerCh:
-						steered = append(steered, msg)
-					default:
-						goto done
-					}
-				}
-			done:
 				result := fantasy.PrepareStepResult{
 					Model:    opts.Model,
 					Messages: opts.Messages,
 				}
-				if len(steered) > 0 {
-					// Inject each steer message as a user message so the
-					// LLM sees the redirection on the next step.
-					for _, sm := range steered {
-						result.Messages = append(result.Messages,
-							fantasy.NewUserMessage(sm.Text, sm.Files...))
+
+				// Phase 1: Drain steering channel (if present).
+				if hasSteering {
+					var steered []SteerMessage
+					for {
+						select {
+						case msg := <-steerCh:
+							steered = append(steered, msg)
+						default:
+							goto done
+						}
 					}
-					// Notify that steer messages were consumed.
-					if onConsumed != nil {
-						onConsumed(len(steered))
+				done:
+					if len(steered) > 0 {
+						for _, sm := range steered {
+							result.Messages = append(result.Messages,
+								fantasy.NewUserMessage(sm.Text, sm.Files...))
+						}
+						if onConsumed != nil {
+							onConsumed(len(steered))
+						}
+					}
+				}
+
+				// Phase 2: Run OnPrepareStep hook (if registered).
+				if hasPrepareStepHook {
+					if replacement := cb.OnPrepareStep(opts.StepNumber, result.Messages); replacement != nil {
+						result.Messages = replacement
 					}
 				}

 				// Apply message-level cache control for Anthropic models.
-				// This avoids type conflicts with provider-level options.
 				result.Messages = applyCacheControlToMessages(result.Messages)

 				return stepCtx, result, nil
 			}
 		}

+		// Wire OnRetry callback if provided.
+		if cb.OnRetry != nil {
+			streamCall.OnRetry = func(err *fantasy.ProviderError, _ time.Duration) {
+				// Fantasy doesn't pass a retry counter to this callback, so
+				// the attempt number is reported as 0 (unknown) rather than
+				// tracked here.
+				cb.OnRetry(0, err)
+			}
+		}
+
 		result, err := a.fantasyAgent.Stream(ctx, streamCall)
 		if err != nil {
 			// On cancellation (or any error), return a partial result
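
The Phase 1 drain in the PrepareStep hunk above empties the steering channel without blocking by using a `select` with a `default` case. A minimal sketch of that idiom follows. `steerMessage` and `drainPending` are hypothetical names used only for illustration, and the closed-channel check is an addition that the diff itself does not include.

```go
package main

import "fmt"

// steerMessage is a hypothetical stand-in for the SteerMessage type in the diff.
type steerMessage struct {
	Text string
}

// drainPending collects every message currently buffered on ch and returns
// as soon as nothing is ready, so it never blocks the caller.
func drainPending(ch <-chan steerMessage) []steerMessage {
	var pending []steerMessage
	for {
		select {
		case msg, ok := <-ch:
			if !ok {
				// Closed channel: stop instead of spinning on zero values.
				return pending
			}
			pending = append(pending, msg)
		default:
			// Nothing ready right now.
			return pending
		}
	}
}

func main() {
	ch := make(chan steerMessage, 4)
	ch <- steerMessage{Text: "focus on tests"}
	ch <- steerMessage{Text: "skip the docs"}

	for _, m := range drainPending(ch) {
		fmt.Println(m.Text)
	}
	// A second drain finds the channel empty and returns immediately.
	fmt.Println(len(drainPending(ch)))
}
```

Returning from the `default` case plays the same role as the `goto done` in the diff: it leaves the loop once the channel has nothing ready.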
@@ -645,8 +921,8 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan
 		// empty (e.g. reasoning-only responses) so the UI properly resets
 		// the stream component and avoids duplicate content on the next
 		// flush.
-		if onResponse != nil {
-			onResponse(result.Response.Content.Text())
+		if cb.OnResponse != nil {
+			cb.OnResponse(result.Response.Content.Text())
 		}

 		r := convertAgentResult(result, messages)
@@ -666,8 +942,8 @@ func (a *Agent) GenerateWithLoopAndStreaming(ctx context.Context, messages []fan

 	// For non-streaming, fire the response callback so callers can reset
 	// streaming state (see streaming path comment above).
-	if onResponse != nil {
-		onResponse(result.Response.Content.Text())
+	if cb.OnResponse != nil {
+		cb.OnResponse(result.Response.Content.Text())
 	}

 	return convertAgentResult(result, messages), nil
@@ -808,7 +1084,7 @@ func (a *Agent) GetTools() []fantasy.AgentTool {
 	allTools := make([]fantasy.AgentTool, len(a.coreTools))
 	copy(allTools, a.coreTools)
 	if a.toolManager != nil {
-		allTools = append(allTools, a.toolManager.GetTools()...)
+		allTools = append(allTools, mcpToolsToAgentTools(a.toolManager.GetTools(), a.toolManager)...)
 	}
 	if len(a.extraTools) > 0 {
 		allTools = append(allTools, a.extraTools...)
@@ -852,7 +1128,6 @@ func (a *Agent) AddMCPServer(ctx context.Context, name string, cfg config.MCPSer

 	if a.toolManager == nil {
 		a.toolManager = tools.NewMCPToolManager()
-		a.toolManager.SetModel(a.model)
 		if a.authHandler != nil {
 			a.toolManager.SetAuthHandler(a.authHandler)
 		}
@@ -914,6 +1189,56 @@ func (a *Agent) GetLoadedServerNames() []string {
 	return a.toolManager.GetLoadedServerNames()
 }

+// GetMCPPrompts returns all prompts discovered from connected MCP servers.
+// Returns nil if no MCP servers are configured or no prompts were found.
+func (a *Agent) GetMCPPrompts() []tools.MCPPrompt {
+	if a.toolManager == nil {
+		return nil
+	}
+	return a.toolManager.GetPrompts()
+}
+
+// GetMCPPrompt retrieves and expands a specific prompt from an MCP server.
+// This is a lazy call — the server is contacted each time.
+func (a *Agent) GetMCPPrompt(ctx context.Context, serverName, promptName string, args map[string]string) (*tools.MCPPromptResult, error) {
+	if a.toolManager == nil {
+		return nil, fmt.Errorf("no MCP servers configured")
+	}
+	return a.toolManager.GetPrompt(ctx, serverName, promptName, args)
+}
+
+// GetMCPResources returns all resources discovered from connected MCP servers.
+func (a *Agent) GetMCPResources() []tools.MCPResource {
+	if a.toolManager == nil {
+		return nil
+	}
+	return a.toolManager.GetResources()
+}
+
+// ReadMCPResource reads a specific resource from an MCP server by URI.
+func (a *Agent) ReadMCPResource(ctx context.Context, serverName, uri string) (*tools.MCPResourceContent, error) {
+	if a.toolManager == nil {
+		return nil, fmt.Errorf("no MCP servers configured")
+	}
+	return a.toolManager.ReadResource(ctx, serverName, uri)
+}
+
+// SubscribeMCPResource subscribes to change notifications for a resource.
+func (a *Agent) SubscribeMCPResource(ctx context.Context, serverName, uri string) error {
+	if a.toolManager == nil {
+		return fmt.Errorf("no MCP servers configured")
+	}
+	return a.toolManager.SubscribeResource(ctx, serverName, uri)
+}
+
+// UnsubscribeMCPResource cancels change notifications for a resource.
+func (a *Agent) UnsubscribeMCPResource(ctx context.Context, serverName, uri string) error {
+	if a.toolManager == nil {
+		return fmt.Errorf("no MCP servers configured")
+	}
+	return a.toolManager.UnsubscribeResource(ctx, serverName, uri)
+}
+
 // SetModel swaps the agent's LLM provider to a new model. The existing tools
 // and configuration are preserved. When the new model's ProviderConfig carries
 // a system prompt (from per-model settings), it replaces the agent's stored
@@ -933,11 +1258,6 @@ func (a *Agent) SetModel(ctx context.Context, config *models.ProviderConfig) err
 		_ = a.providerCloser.Close()
 	}

-	// Update model info on MCP tool manager.
-	if a.toolManager != nil {
-		a.toolManager.SetModel(providerResult.Model)
-	}
-
 	// Swap fields.
 	a.model = providerResult.Model
 	a.providerCloser = providerResult.Closer
@@ -970,6 +1290,22 @@ func (a *Agent) GetModel() fantasy.LanguageModel {
 	return a.model
 }

+// GetMaxTokens returns the effective max output tokens the agent currently
+// sends to the LLM provider, after per-model defaults, right-sizing, and any
+// Anthropic thinking-budget adjustments. Returns 0 when no ModelConfig is
+// attached (e.g. early init) or when the provider suppresses the parameter
+// (e.g. Codex OAuth), which allows callers to differentiate "default" from
+// "explicitly capped".
+func (a *Agent) GetMaxTokens() int {
+	if a.skipMaxOutputTokens {
+		return 0
+	}
+	if a.modelConfig == nil {
+		return 0
+	}
+	return a.modelConfig.MaxTokens
+}
+
 // Close closes the agent and cleans up resources.
 // If MCP tools are still loading in the background, Close waits for them
 // to finish before closing connections to avoid resource leaks.
@@ -0,0 +1,65 @@
+package agent
+
+import (
+	"context"
+	"fmt"
+
+	"charm.land/fantasy"
+
+	"github.com/mark3labs/kit/internal/tools"
+)
+
+// mcpAgentTool adapts a tools.MCPTool to the fantasy.AgentTool interface.
+// This keeps the fantasy dependency confined to the agent layer — the tools
+// package is a pure MCP client library with no LLM framework dependency.
+type mcpAgentTool struct {
+	tool            tools.MCPTool
+	manager         *tools.MCPToolManager
+	providerOptions fantasy.ProviderOptions
+}
+
+// Info returns the fantasy tool info including name, description, and parameter schema.
+func (t *mcpAgentTool) Info() fantasy.ToolInfo {
+	return fantasy.ToolInfo{
+		Name:        t.tool.Name,
+		Description: t.tool.Description,
+		Parameters:  t.tool.Parameters,
+		Required:    t.tool.Required,
+	}
+}
+
+// Run executes the MCP tool by delegating to the MCPToolManager.
+func (t *mcpAgentTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+	result, err := t.manager.ExecuteTool(ctx, t.tool.Name, call.Input)
+	if err != nil {
+		return fantasy.ToolResponse{}, fmt.Errorf("mcp tool execution failed: %w", err)
+	}
+
+	if result.IsError {
+		return fantasy.NewTextErrorResponse(result.Content), nil
+	}
+	return fantasy.NewTextResponse(result.Content), nil
+}
+
+// ProviderOptions returns provider-specific options for this tool.
+func (t *mcpAgentTool) ProviderOptions() fantasy.ProviderOptions {
+	return t.providerOptions
+}
+
+// SetProviderOptions sets provider-specific options for this tool.
+func (t *mcpAgentTool) SetProviderOptions(opts fantasy.ProviderOptions) {
+	t.providerOptions = opts
+}
+
+// mcpToolsToAgentTools converts a slice of MCPTool to fantasy.AgentTool
+// implementations that route execution through the MCPToolManager.
+func mcpToolsToAgentTools(mcpTools []tools.MCPTool, manager *tools.MCPToolManager) []fantasy.AgentTool {
+	agentTools := make([]fantasy.AgentTool, len(mcpTools))
+	for i, t := range mcpTools {
+		agentTools[i] = &mcpAgentTool{
+			tool:    t,
+			manager: manager,
+		}
+	}
+	return agentTools
+}
@@ -497,6 +497,12 @@ func (a *App) CompactAsync(customInstructions string, onComplete func(), onError
 // response text to stdout. No intermediate events are emitted. Blocks until
 // the step completes or ctx is cancelled.
 func (a *App) RunOnce(ctx context.Context, prompt string) error {
+	return a.RunOnceWithFiles(ctx, prompt, nil)
+}
+
+// RunOnceWithFiles executes a single agent step synchronously with optional
+// multimodal file attachments. Prints the response to stdout and returns.
+func (a *App) RunOnceWithFiles(ctx context.Context, prompt string, files []kit.LLMFilePart) error {
 	stepCtx, cancel := context.WithCancel(ctx)
 	defer cancel()

@@ -504,7 +510,7 @@ func (a *App) RunOnce(ctx context.Context, prompt string) error {
 	a.cancelStep = cancel
 	a.mu.Unlock()

-	result, err := a.executeStep(stepCtx, prompt, nil, nil)
+	result, err := a.executeStep(stepCtx, prompt, nil, files)
 	if err != nil {
 		return err
 	}
@@ -519,6 +525,12 @@ func (a *App) RunOnce(ctx context.Context, prompt string) error {
 // full TurnResult without printing anything. This is used by --json mode to
 // capture structured output for serialization.
 func (a *App) RunOnceResult(ctx context.Context, prompt string) (*kit.TurnResult, error) {
+	return a.RunOnceResultWithFiles(ctx, prompt, nil)
+}
+
+// RunOnceResultWithFiles executes a single agent step synchronously with
+// optional multimodal file attachments and returns the full TurnResult.
+func (a *App) RunOnceResultWithFiles(ctx context.Context, prompt string, files []kit.LLMFilePart) (*kit.TurnResult, error) {
 	stepCtx, cancel := context.WithCancel(ctx)
 	defer cancel()

@@ -526,7 +538,7 @@ func (a *App) RunOnceResult(ctx context.Context, prompt string) (*kit.TurnResult
 	a.cancelStep = cancel
 	a.mu.Unlock()

-	return a.executeStep(stepCtx, prompt, nil, nil)
+	return a.executeStep(stepCtx, prompt, nil, files)
 }

 // RunOnceWithDisplay executes a single agent step synchronously, sending
@@ -540,6 +552,12 @@ func (a *App) RunOnceResult(ctx context.Context, prompt string) (*kit.TurnResult
 //
 // Blocks until the step completes or ctx is cancelled.
 func (a *App) RunOnceWithDisplay(ctx context.Context, prompt string, eventFn func(tea.Msg)) error {
+	return a.RunOnceWithDisplayAndFiles(ctx, prompt, eventFn, nil)
+}
+
+// RunOnceWithDisplayAndFiles executes a single agent step synchronously with
+// optional multimodal file attachments, sending intermediate display events.
+func (a *App) RunOnceWithDisplayAndFiles(ctx context.Context, prompt string, eventFn func(tea.Msg), files []kit.LLMFilePart) error {
 	stepCtx, cancel := context.WithCancel(ctx)
 	defer cancel()

@@ -547,7 +565,7 @@ func (a *App) RunOnceWithDisplay(ctx context.Context, prompt string, eventFn fun
 	a.cancelStep = cancel
 	a.mu.Unlock()

-	result, err := a.executeStep(stepCtx, prompt, eventFn, nil)
+	result, err := a.executeStep(stepCtx, prompt, eventFn, files)
 	if err != nil {
 		return err
 	}
@@ -870,6 +888,12 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
 		switch ev := e.(type) {
 		case kit.ToolCallEvent:
 			sendFn(ToolCallStartedEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolArgs: ev.ToolArgs})
+		case kit.ToolCallStartEvent:
+			sendFn(ToolCallInputStartEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolKind: ev.ToolKind})
+		case kit.ToolCallDeltaEvent:
+			sendFn(ToolCallInputDeltaEvent{ToolCallID: ev.ToolCallID, Delta: ev.Delta})
+		case kit.ToolCallEndEvent:
+			sendFn(ToolCallInputEndEvent{ToolCallID: ev.ToolCallID})
 		case kit.ToolExecutionStartEvent:
 			sendFn(ToolExecutionEvent{ToolCallID: ev.ToolCallID, ToolName: ev.ToolName, ToolArgs: ev.ToolArgs, IsStarting: true})
 		case kit.ToolExecutionEndEvent:
@@ -899,7 +923,23 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
 		case kit.SteerConsumedEvent:
 			sendFn(SteerConsumedEvent{})
 		case kit.StepUsageEvent:
-			a.recordStepUsage(ev, stepUsageSeen)
+			a.recordStepUsage(ev, stepUsageSeen, sendFn)
+		case kit.PasswordPromptEvent:
+			// Convert the SDK PasswordPromptEvent into the app-level
+			// PasswordPromptEvent; the TUI handles it and replies on responseCh.
+			responseCh := make(chan PasswordPromptResponse, 1)
+			sendFn(PasswordPromptEvent{
+				Prompt:     ev.Prompt,
+				ResponseCh: responseCh,
+			})
+			// Wait for TUI response and forward to SDK
+			resp := <-responseCh
+			ev.ResponseCh <- kit.PasswordPromptResponse{
+				Password:  resp.Password,
+				Cancelled: resp.Cancelled,
+			}
+		case kit.TurnEndEvent:
+			a.handleTurnEnd(ev, sendFn)
 		}
 	}))

@@ -910,6 +950,64 @@ func (a *App) subscribeSDKEvents(sendFn func(tea.Msg), stepUsageSeen *atomic.Boo
 	}
 }

+// handleTurnEnd inspects a turn's final StopReason and surfaces actionable
+// feedback to the user when the turn ended in a state they can act on.
+//
+// Today the only surfaced case is FinishReasonLength — the model hit its
+// configured max_output_tokens budget and the reply was truncated. Without
+// this banner the TUI used to swallow the truncation silently, leading to
+// "ghost" cut-offs with no indication of why.
+//
+// Separated from subscribeSDKEvents so tests can exercise it directly via a
+// stubbed sendFn without standing up a full Kit.
+func (a *App) handleTurnEnd(ev kit.TurnEndEvent, sendFn func(tea.Msg)) {
+	if sendFn == nil {
+		return
+	}
+	if ev.StopReason != kit.FinishReasonLength {
+		return
+	}
+	sendFn(ExtensionPrintEvent{
+		Level: "info",
+		Text:  a.formatMaxTokensTruncatedMessage(),
+	})
+}
+
+// formatMaxTokensTruncatedMessage builds the user-facing explanation for a
+// truncated turn. It reports the active max_output_tokens budget and, when
+// known, the model's catalog output ceiling so the user can judge how much
+// headroom is available.
+func (a *App) formatMaxTokensTruncatedMessage() string {
+	k := a.opts.Kit
+	if k == nil {
+		// Extremely early / test-stub case: still emit a useful generic hint.
+		return "⚠ Response truncated: the model hit the configured max_output_tokens limit. " +
+			"Raise it with --max-tokens N, KIT_MAX_TOKENS=N, or per-model " +
+			"modelSettings[provider/model].maxTokens in config."
+	}
+	current := k.MaxTokens()
+	ceiling := k.MaxOutputLimit()
+	model := k.GetModelString()
+
+	msg := "⚠ Response truncated: "
+	if model != "" {
+		msg += fmt.Sprintf("%s hit the configured max_output_tokens limit", model)
+	} else {
+		msg += "the model hit the configured max_output_tokens limit"
+	}
+	if current > 0 {
+		msg += fmt.Sprintf(" (%d)", current)
+	}
+	msg += "."
+	if ceiling > 0 && current > 0 && ceiling > current {
+		msg += fmt.Sprintf(" This model supports up to %d output tokens.", ceiling)
+	}
+	msg += "\n\nRaise it with --max-tokens N, KIT_MAX_TOKENS=N, " +
+		"or per-model modelSettings[provider/model].maxTokens in your config. " +
+		"Re-run the last prompt after raising it to get the full response."
+	return msg
+}
+
 // QuitFromExtension triggers a graceful shutdown. In interactive mode it
 // sends a tea.QuitMsg to the program so the TUI exits cleanly. In
 // non-interactive mode it cancels the root context, stopping any in-flight
@@ -1143,7 +1241,16 @@ func (a *App) PrintBlockFromExtension(opts extensions.PrintBlockOpts) {
 // recordStepUsage applies token/cost usage reported for a completed step.
 // Step usage events arrive even when a turn is later cancelled, so this keeps
 // the usage widget accurate on all stop paths.
-func (a *App) recordStepUsage(ev kit.StepUsageEvent, stepUsageSeen *atomic.Bool) {
+//
+// Both session totals (cost, token counts) and the context window fill level
+// are updated here so the status bar reflects progress after every LLM call,
+// not just at the end of the full turn. Context fill normally grows from one
+// step to the next because each step re-sends the entire conversation plus
+// any new tool results.
+//
+// sendFn is called with a UsageUpdatedEvent to trigger a TUI re-render so
+// the updated values are visible immediately.
+func (a *App) recordStepUsage(ev kit.StepUsageEvent, stepUsageSeen *atomic.Bool, sendFn func(tea.Msg)) {
 	hasUsage := ev.InputTokens > 0 || ev.OutputTokens > 0 || ev.CacheReadTokens > 0 || ev.CacheWriteTokens > 0
 	if a.opts.Debug {
 		log.Printf("[DEBUG] recordStepUsage: hasUsage=%v input=%d output=%d cacheRead=%d cacheWrite=%d",
@@ -1164,11 +1271,21 @@ func (a *App) recordStepUsage(ev kit.StepUsageEvent, stepUsageSeen *atomic.Bool)
 		int(ev.CacheReadTokens),
 		int(ev.CacheWriteTokens),
 	)
-	// NOTE: We do NOT call SetContextTokens here. Context fill is set once
-	// at turn completion via updateUsageFromTurnResult, which sums all token
-	// categories (Input + CacheRead + CacheCreate + Output) from FinalUsage.
-	// Per-step context tokens would cause the display to jump around during
-	// multi-step tool calls.
+	// Update context window fill from this step's usage. Each step sends
+	// the full conversation to the LLM, so the reported token counts
+	// represent the actual context utilization at that point.
+	contextFill := int(ev.InputTokens) + int(ev.CacheReadTokens) + int(ev.CacheWriteTokens) + int(ev.OutputTokens)
+	if contextFill > 0 {
+		if a.opts.Debug {
+			log.Printf("[DEBUG] recordStepUsage: SetContextTokens=%d (Input=%d + CacheRead=%d + CacheWrite=%d + Output=%d)",
+				contextFill, ev.InputTokens, ev.CacheReadTokens, ev.CacheWriteTokens, ev.OutputTokens)
+		}
+		a.opts.UsageTracker.SetContextTokens(contextFill)
+	}
+	// Notify the TUI so it re-renders the status bar with updated values.
+	if sendFn != nil {
+		sendFn(UsageUpdatedEvent{})
+	}
 }

 // updateUsageFromTurnResult records token usage from an SDK TurnResult into the
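
The per-step context fill computed in recordStepUsage is the sum of the four token categories reported for that step. The sketch below reproduces only that arithmetic, using the values from TestRecordStepUsage_updatesTracker. `stepUsage` and `contextFill` are hypothetical names, and the `int64` field type is an assumption about the event's counters.

```go
package main

import "fmt"

// stepUsage is a hypothetical stand-in carrying the four counters that the
// diff reads from the step usage event. The int64 type is an assumption.
type stepUsage struct {
	InputTokens      int64
	OutputTokens     int64
	CacheReadTokens  int64
	CacheWriteTokens int64
}

// contextFill sums all token categories for one step, mirroring the
// expression used in recordStepUsage.
func contextFill(u stepUsage) int {
	return int(u.InputTokens) + int(u.CacheReadTokens) +
		int(u.CacheWriteTokens) + int(u.OutputTokens)
}

func main() {
	u := stepUsage{InputTokens: 120, OutputTokens: 45, CacheReadTokens: 5, CacheWriteTokens: 2}
	fmt.Println(contextFill(u)) // prints 172
}
```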
@@ -3,10 +3,12 @@ package app
 import (
 	"context"
 	"errors"
+	"strings"
 	"sync"
 	"testing"
 	"time"

+	tea "charm.land/bubbletea/v2"
 	kit "github.com/mark3labs/kit/pkg/kit"
 )

@@ -532,9 +534,9 @@ func TestQueueLength_reflects(t *testing.T) {
 }

 // TestRecordStepUsage_updatesTracker verifies that per-step usage updates are
-// recorded immediately for cost tracking. Context tokens are NOT updated here
-// (only via updateUsageFromTurnResult) to avoid display jumps during multi-step
-// tool calls.
+// recorded immediately for cost tracking. Context tokens are also updated so
+// the status bar reflects context fill after every LLM call in a multi-step
+// turn, not just at the end.
 func TestRecordStepUsage_updatesTracker(t *testing.T) {
 	usage := &usageUpdaterStub{}
 	app := New(Options{UsageTracker: usage}, nil)
@@ -545,7 +547,7 @@ func TestRecordStepUsage_updatesTracker(t *testing.T) {
 		OutputTokens:     45,
 		CacheReadTokens:  5,
 		CacheWriteTokens: 2,
-	}, nil)
+	}, nil, nil)

 	usage.mu.Lock()
 	defer usage.mu.Unlock()
@@ -557,9 +559,13 @@ func TestRecordStepUsage_updatesTracker(t *testing.T) {
 		t.Fatalf("unexpected usage update payload: in=%d out=%d cache_read=%d cache_write=%d",
 			usage.lastUpdateInput, usage.lastUpdateOutput, usage.lastUpdateCacheRead, usage.lastUpdateCacheWrite)
 	}
-	// Context tokens should NOT be updated by recordStepUsage (only by updateUsageFromTurnResult)
-	if usage.contextCalls != 0 {
-		t.Fatalf("expected 0 context token updates from recordStepUsage, got %d", usage.contextCalls)
+	// Context tokens should now be updated per-step (Input + CacheRead + CacheWrite + Output).
+	if usage.contextCalls != 1 {
+		t.Fatalf("expected 1 context token update from recordStepUsage, got %d", usage.contextCalls)
+	}
+	expectedContext := 120 + 45 + 5 + 2
+	if usage.lastContextTokens != expectedContext {
+		t.Fatalf("expected context tokens %d, got %d", expectedContext, usage.lastContextTokens)
 	}
 }

@@ -666,3 +672,94 @@ func TestUpdateUsageFromTurnResult_contextTokensUsesAllCategories(t *testing.T)
 			expected, usage.contextCalls, usage.lastContextTokens)
 	}
 }
+
+// TestHandleTurnEnd_LengthEmitsWarning verifies that when the SDK reports a
+// FinishReasonLength (max_output_tokens hit), the app surfaces a user-visible
+// ExtensionPrintEvent with Level="info" so the TUI can render a banner
+// instead of silently showing a truncated reply.
+func TestHandleTurnEnd_LengthEmitsWarning(t *testing.T) {
+	app := New(Options{}, nil)
+	defer app.Close()
+
+	var mu sync.Mutex
+	var received []tea.Msg
+	sendFn := func(m tea.Msg) {
+		mu.Lock()
+		defer mu.Unlock()
+		received = append(received, m)
+	}
+
+	app.handleTurnEnd(kit.TurnEndEvent{StopReason: kit.FinishReasonLength}, sendFn)
+
+	mu.Lock()
+	defer mu.Unlock()
+	if len(received) != 1 {
+		t.Fatalf("expected 1 event on length stop, got %d", len(received))
+	}
+	ev, ok := received[0].(ExtensionPrintEvent)
+	if !ok {
+		t.Fatalf("expected ExtensionPrintEvent, got %T", received[0])
+	}
+	if ev.Level != "info" {
+		t.Errorf("expected Level=info, got %q", ev.Level)
+	}
+	if ev.Text == "" {
+		t.Error("expected non-empty warning text")
+	}
+	if !strings.Contains(ev.Text, "max_output_tokens") {
+		t.Errorf("warning text should mention max_output_tokens, got: %s", ev.Text)
+	}
+}
+
+// TestHandleTurnEnd_NonLengthIgnored verifies that the other stop reasons (stop,
+// tool-calls, error, content filter, other, unknown, "") do not produce a banner.
+func TestHandleTurnEnd_NonLengthIgnored(t *testing.T) {
+	app := New(Options{}, nil)
+	defer app.Close()
+
+	reasons := []string{
+		kit.FinishReasonStop,
+		kit.FinishReasonToolCalls,
+		kit.FinishReasonError,
+		kit.FinishReasonContentFilter,
+		kit.FinishReasonOther,
+		kit.FinishReasonUnknown,
+		"",
+	}
+	for _, r := range reasons {
+		var called bool
+		app.handleTurnEnd(kit.TurnEndEvent{StopReason: r}, func(m tea.Msg) {
+			called = true
+		})
+		if called {
+			t.Errorf("stop reason %q unexpectedly emitted a warning", r)
+		}
+	}
+}
+
+// TestHandleTurnEnd_NilSendFn guards against panics when no TUI listener is
+// attached (e.g. early init or headless teardown).
+func TestHandleTurnEnd_NilSendFn(t *testing.T) {
+	app := New(Options{}, nil)
+	defer app.Close()
+
+	// Should not panic with a nil sendFn.
+	app.handleTurnEnd(kit.TurnEndEvent{StopReason: kit.FinishReasonLength}, nil)
+}
+
+// TestFormatMaxTokensTruncatedMessage_NoKit verifies the fallback message
+// when Options.Kit is nil (test/stub path).
+func TestFormatMaxTokensTruncatedMessage_NoKit(t *testing.T) {
+	app := New(Options{}, nil)
+	defer app.Close()
+
+	msg := app.formatMaxTokensTruncatedMessage()
+	if msg == "" {
+		t.Fatal("expected non-empty fallback message")
+	}
+	for _, needle := range []string{"max_output_tokens", "--max-tokens", "KIT_MAX_TOKENS", "modelSettings"} {
+		if !strings.Contains(msg, needle) {
+			t.Errorf("fallback message missing %q:\n%s", needle, msg)
+		}
+	}
+}
@@ -32,6 +32,36 @@ type ToolCallStartedEvent struct {
 	ToolArgs string
 }

+// ToolCallInputStartEvent is sent when the LLM begins generating tool call
+// arguments. The tool name is known but the full argument JSON is still being
+// streamed. UIs can use this to show a "running" indicator immediately instead
+// of waiting for the full argument JSON to finish streaming.
+type ToolCallInputStartEvent struct {
+	// ToolCallID is the stable identifier for correlating tool lifecycle events.
+	ToolCallID string
+	// ToolName is the name of the tool being called.
+	ToolName string
+	// ToolKind classifies the tool: "execute", "edit", "read", "search", "agent".
+	ToolKind string
+}
+
+// ToolCallInputDeltaEvent is sent for each streamed fragment of tool call
+// arguments as they arrive from the LLM. Useful for live-previewing content
+// or showing a progress indicator with byte count.
+type ToolCallInputDeltaEvent struct {
+	// ToolCallID is the stable identifier for correlating tool lifecycle events.
+	ToolCallID string
+	// Delta is a JSON fragment of tool call arguments.
+	Delta string
+}
+
+// ToolCallInputEndEvent is sent when tool argument streaming is complete,
+// before the tool call is parsed and execution begins.
+type ToolCallInputEndEvent struct {
+	// ToolCallID is the stable identifier for correlating tool lifecycle events.
+	ToolCallID string
+}
+
 // ToolExecutionEvent is sent when a tool starts or finishes executing.
 // The IsStarting flag distinguishes between the start and end of execution.
 type ToolExecutionEvent struct {
@@ -79,6 +109,24 @@ type ToolCallContentEvent struct {
 	Content string
 }

+// PasswordPromptEvent is sent when a sudo command needs a password.
+// The TUI should display a password prompt overlay and send the result back.
+type PasswordPromptEvent struct {
+	// Prompt is the message to display to the user.
+	Prompt string
+	// ResponseCh receives the password from the TUI.
+	// The TUI must send exactly one value.
+	ResponseCh chan<- PasswordPromptResponse
+}
+
+// PasswordPromptResponse carries the user's password input.
+type PasswordPromptResponse struct {
+	// Password is the entered password.
+	Password string
+	// Cancelled is true if the user cancelled the prompt.
+	Cancelled bool
+}
+
 // ResponseCompleteEvent is sent when the LLM produces a final (non-streaming) response.
 // In streaming mode, this may be empty if all content was delivered via StreamChunkEvents.
 type ResponseCompleteEvent struct {
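
The PasswordPromptEvent contract above (the receiver must send exactly one value on ResponseCh) can be illustrated with a small request/response round trip over a buffered channel. This is a sketch under stated assumptions, not the app's actual wiring: `promptEvent`, `promptResponse`, and `askPassword` are hypothetical names.

```go
package main

import "fmt"

// promptResponse and promptEvent are hypothetical stand-ins for the
// PasswordPromptResponse and PasswordPromptEvent types in the diff.
type promptResponse struct {
	Password  string
	Cancelled bool
}

type promptEvent struct {
	Prompt     string
	ResponseCh chan<- promptResponse
}

// askPassword hands a prompt event to the UI through send and blocks until
// the UI replies. The channel has capacity 1 so the UI's single send never
// blocks, even if it replies before the caller starts receiving.
func askPassword(prompt string, send func(promptEvent)) promptResponse {
	ch := make(chan promptResponse, 1)
	send(promptEvent{Prompt: prompt, ResponseCh: ch})
	return <-ch
}

func main() {
	// A fake UI that answers asynchronously, as a TUI event loop would.
	ui := func(ev promptEvent) {
		go func() {
			ev.ResponseCh <- promptResponse{Password: "example"}
		}()
	}
	resp := askPassword("[sudo] password:", ui)
	fmt.Println(resp.Password, resp.Cancelled)
}
```

Typing the event's field as a send-only channel keeps the UI side from reading its own reply, while the caller keeps the bidirectional end for the blocking receive.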
@@ -162,6 +210,12 @@ type ModelChangedEvent struct {
 	ModelName string
 }

+// UsageUpdatedEvent is sent after each completed LLM step to notify the TUI
+// that token counts and costs have changed. The UsageTracker is updated
+// in-place before this event is sent; the TUI just needs to re-render to
+// reflect the new values in the status bar.
+type UsageUpdatedEvent struct{}
+
 // WidgetUpdateEvent is sent when an extension adds, updates, or removes a
 // widget via ctx.SetWidget or ctx.RemoveWidget. The TUI re-reads widget state
 // from its WidgetProvider on the next render cycle.
@@ -3,24 +3,21 @@ package app
 import (
 	"testing"

-	"charm.land/fantasy"
-
 	kit "github.com/mark3labs/kit/pkg/kit"
 )

-// makeTextMsg builds a minimal kit.LLMMessage using fantasy.NewUserMessage
-// or constructing with the given role.
+// makeTextMsg builds a minimal kit.LLMMessage with the given role and text.
 func makeTextMsg(role, text string) kit.LLMMessage {
 	return kit.LLMMessage{
 		Role:    kit.LLMMessageRole(role),
-		Content: []fantasy.MessagePart{fantasy.TextPart{Text: text}},
+		Content: []kit.LLMMessagePart{kit.LLMTextPart{Text: text}},
 	}
 }

 // textOf extracts the plain text from an LLMMessage for assertions.
 func textOf(msg kit.LLMMessage) string {
 	for _, part := range msg.Content {
-		if tp, ok := part.(fantasy.TextPart); ok {
+		if tp, ok := part.(kit.LLMTextPart); ok {
 			return tp.Text
 		}
 	}
@@ -471,5 +471,13 @@ func GetAnthropicAPIKey(flagValue string) (string, string, error) {
 		return envKey, "ANTHROPIC_API_KEY environment variable", nil
 	}

+	// Check if OpenAI credentials exist to provide a helpful suggestion
+	if cm != nil {
+		hasOpenAI, _ := cm.HasOpenAICredentials()
+		if hasOpenAI {
+			return "", "", fmt.Errorf("no Anthropic API key found. Use 'kit auth login anthropic', set ANTHROPIC_API_KEY environment variable, or use --provider-api-key flag\n\nNote: OpenAI credentials were detected. To use OpenAI, run with --model openai/gpt-5.4 or set it as default:\n  kit auth login openai --set-default")
+		}
+	}
+
 	return "", "", fmt.Errorf("no Anthropic API key found. Use 'kit auth login anthropic', set ANTHROPIC_API_KEY environment variable, or use --provider-api-key flag")
 }
@@ -30,6 +30,20 @@ type MCPServerConfig struct {
 	OAuthClientSecret string   `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
 	OAuthScopes       []string `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`

+	// NoOAuth disables OAuth transport configuration for this server, even
+	// when the connection pool has an auth handler. Use this for public MCP
+	// servers (e.g. PubMed) that don't require authentication. Without this
+	// flag, the pool would attach OAuth transport to every remote server,
+	// causing proactive dynamic-client-registration attempts that fail on
+	// servers that don't support it.
+	NoOAuth bool `json:"noOAuth,omitempty" yaml:"noOAuth,omitempty"`
+
+	// InProcessServer holds a live *server.MCPServer for in-process transport.
+	// When set (and Type is "inprocess"), the connection pool creates an
+	// in-process client instead of spawning a subprocess or making HTTP calls.
+	// This field is never serialized — it is only used programmatically via the SDK.
+	InProcessServer any `json:"-" yaml:"-"`
+
 	// Legacy fields for backward compatibility
 	Transport string         `json:"transport,omitempty"`
 	Args      []string       `json:"args,omitempty"`
@@ -53,6 +67,7 @@ func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
 		OAuthClientID     string            `json:"oauthClientId,omitempty" yaml:"oauthClientId,omitempty"`
 		OAuthClientSecret string            `json:"oauthClientSecret,omitempty" yaml:"oauthClientSecret,omitempty"`
 		OAuthScopes       []string          `json:"oauthScopes,omitempty" yaml:"oauthScopes,omitempty"`
+		NoOAuth           bool              `json:"noOAuth,omitempty" yaml:"noOAuth,omitempty"`
 	}

 	// Also try legacy format
@@ -80,6 +95,7 @@ func (s *MCPServerConfig) UnmarshalJSON(data []byte) error {
 		s.OAuthClientID = newConfig.OAuthClientID
 		s.OAuthClientSecret = newConfig.OAuthClientSecret
 		s.OAuthScopes = newConfig.OAuthScopes
+		s.NoOAuth = newConfig.NoOAuth
 		return nil
 	}

@@ -277,11 +293,18 @@ func (s *MCPServerConfig) GetTransportType() string {
 			return "stdio"
 		case "remote":
 			return "streamable"
+		case "inprocess":
+			return "inprocess"
 		default:
 			return s.Type
 		}
 	}

+	// Programmatic in-process server detection.
+	if s.InProcessServer != nil {
+		return "inprocess"
+	}
+
 	// Backward compatibility: infer transport type
 	if len(s.Command) > 0 {
 		return "stdio"
@@ -312,8 +335,12 @@ func (c *Config) Validate() error {
 			if serverConfig.URL == "" {
 				return fmt.Errorf("server %s: url is required for %s transport", serverName, transport)
 			}
+		case "inprocess":
+			if serverConfig.InProcessServer == nil {
+				return fmt.Errorf("server %s: InProcessServer is required for inprocess transport", serverName)
+			}
 		default:
-			return fmt.Errorf("server %s: unsupported transport type '%s'. Supported types: stdio, sse, streamable", serverName, transport)
+			return fmt.Errorf("server %s: unsupported transport type '%s'. Supported types: stdio, sse, streamable, inprocess", serverName, transport)
 		}
 	}
 	return nil
@@ -19,10 +19,18 @@ import (
 // It receives tool call ID, tool name, output chunk, and whether it's stderr.
 type ToolOutputCallback func(toolCallID, toolName, chunk string, isStderr bool)

+// PasswordPromptCallback is the signature for password prompts.
+// It receives a prompt message and returns the password and whether the prompt was cancelled.
+type PasswordPromptCallback func(prompt string) (password string, cancelled bool)
+
 // contextKey is a custom type for context keys to avoid collisions.
 type contextKey string

-const toolOutputCallbackKey contextKey = "toolOutputCallback"
+const (
+	toolOutputCallbackKey contextKey = "toolOutputCallback"
+	sudoPasswordKey       contextKey = "sudoPassword"
+	passwordPromptKey     contextKey = "passwordPrompt"
+)

 // ContextWithToolOutputCallback returns a new context with the tool output callback set.
 func ContextWithToolOutputCallback(ctx context.Context, callback ToolOutputCallback) context.Context {
@@ -37,6 +45,34 @@ func toolOutputCallbackFromContext(ctx context.Context) ToolOutputCallback {
 	return nil
 }

+// ContextWithPasswordPrompt returns a new context with the password prompt callback set.
+// This allows the TUI to show a modal password prompt when sudo needs a password.
+func ContextWithPasswordPrompt(ctx context.Context, callback PasswordPromptCallback) context.Context {
+	return context.WithValue(ctx, passwordPromptKey, callback)
+}
+
+// passwordPromptFromContext retrieves the password prompt callback from context.
+func passwordPromptFromContext(ctx context.Context) PasswordPromptCallback {
+	if cb, ok := ctx.Value(passwordPromptKey).(PasswordPromptCallback); ok {
+		return cb
+	}
+	return nil
+}
+
+// ContextWithSudoPassword returns a new context with the sudo password set.
+// When present, the bash tool will use sudo -S to pipe this password to sudo commands.
+func ContextWithSudoPassword(ctx context.Context, password string) context.Context {
+	return context.WithValue(ctx, sudoPasswordKey, password)
+}
+
+// sudoPasswordFromContext retrieves the sudo password from context.
+func sudoPasswordFromContext(ctx context.Context) string {
+	if pw, ok := ctx.Value(sudoPasswordKey).(string); ok {
+		return pw
+	}
+	return ""
+}
+
 const defaultBashTimeout = 120 * time.Second
 const maxBashTimeout = 600 * time.Second

@@ -73,6 +109,66 @@ func NewBashTool(opts ...ToolOption) fantasy.AgentTool {
 	}
 }

+// sudoCommandRe matches sudo commands that need to be rewritten for -S mode.
+// It matches "sudo" as a whole word at the start or after ;, &, |, && or ||, optionally preceded by one VAR=value assignment.
+var sudoCommandRe = regexp.MustCompile(`(?i)(^|[&|;]|\|\||&&)\s*(\w+=\S+\s+)?\bsudo\b`)
+
+// truncateCommand truncates a long command for display.
+func truncateCommand(cmd string, maxLen int) string {
+	if len(cmd) <= maxLen {
+		return cmd
+	}
+	return cmd[:maxLen-3] + "..."
+}
+
+// rewriteSudoForStdin rewrites sudo commands to use -S -p '' for stdin password input.
+// It transforms: sudo cmd → sudo -S -p '' cmd
+func rewriteSudoForStdin(command string) string {
+	// Find all matches and their positions
+	matches := sudoCommandRe.FindAllStringIndex(command, -1)
+	if matches == nil {
+		return command
+	}
+
+	// Build result from end to start to preserve indices
+	result := command
+	for i := len(matches) - 1; i >= 0; i-- {
+		match := matches[i]
+		start, end := match[0], match[1]
+		matchedText := result[start:end]
+
+		// Extract just the "sudo" part (after any prefix)
+		sudoIdx := strings.Index(strings.ToLower(matchedText), "sudo")
+		if sudoIdx == -1 {
+			continue
+		}
+		prefix := matchedText[:sudoIdx]
+		sudoPart := matchedText[sudoIdx:]
+
+		// Check whether the first flag after "sudo" is already -S
+		afterSudo := result[end:]
+		if strings.HasPrefix(strings.TrimLeft(afterSudo, " \t"), "-S") {
+			// Already has -S flag, skip
+			continue
+		}
+
+		// Insert -S -p '' after "sudo"
+		newSudo := strings.Replace(sudoPart, "sudo", "sudo -S -p ''", 1)
+		result = result[:start] + prefix + newSudo + result[end:]
+	}
+
+	return result
+}
+
+// SudoPasswordRequiredMetadata is a special marker that indicates sudo needs a password.
+// It is stored in tool response metadata to signal the TUI to prompt for a password.
+const SudoPasswordRequiredMetadata = `{"sudo_password_required":true}`
+
+// IsSudoPasswordRequiredResult checks if a tool response indicates sudo password is needed.
+func IsSudoPasswordRequiredResult(resp fantasy.ToolResponse) bool {
+	return resp.Metadata == SudoPasswordRequiredMetadata
+}
+
 func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fantasy.ToolResponse, error) {
 	var args bashArgs
 	if err := parseArgs(call.Input, &args); err != nil {
@@ -97,7 +193,47 @@ func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fa
 	cmdCtx, cancel := context.WithTimeout(ctx, timeout)
 	defer cancel()

-	cmd := exec.CommandContext(cmdCtx, "bash", "-c", args.Command)
+	// Check for sudo password in context or environment
+	sudoPassword := sudoPasswordFromContext(ctx)
+	if sudoPassword == "" {
+		sudoPassword = os.Getenv("SUDO_PASSWORD")
+	}
+	command := args.Command
+
+	// If command contains sudo and we don't have a password, check if sudo needs one
+	if sudoPassword == "" && sudoCommandRe.MatchString(command) {
+		// Check if sudo credentials are cached using sudo -n (non-interactive)
+		testCmd := exec.CommandContext(cmdCtx, "sudo", "-n", "true")
+		testCmd.Dir = workDir
+		if err := testCmd.Run(); err != nil {
+			// Sudo needs a password - try to prompt via callback
+			if promptCallback := passwordPromptFromContext(ctx); promptCallback != nil {
+				pw, cancelled := promptCallback("Sudo password required for: " + truncateCommand(args.Command, 60))
+				if cancelled {
+					return fantasy.NewTextErrorResponse("sudo password prompt cancelled"), nil
+				}
+				if pw == "" {
+					return fantasy.NewTextErrorResponse("no sudo password provided"), nil
+				}
+				sudoPassword = pw
+				command = rewriteSudoForStdin(command)
+			} else {
+				// No callback available - return error with helpful message
+				return fantasy.NewTextErrorResponse(
+					"This command requires sudo access. " +
+						"Please run 'sudo -v' in your terminal first to cache credentials, " +
+						"or set the SUDO_PASSWORD environment variable."), nil
+			}
+		}
+		// Credentials are cached or password was provided, proceed
+	}
+
+	// If we have a sudo password, rewrite the command to use sudo -S
+	if sudoPassword != "" && sudoCommandRe.MatchString(command) {
+		command = rewriteSudoForStdin(command)
+	}
+
+	cmd := exec.CommandContext(cmdCtx, "bash", "-c", command)
 	if workDir != "" {
 		cmd.Dir = workDir
 	}
@@ -115,18 +251,18 @@ func executeBash(ctx context.Context, call fantasy.ToolCall, workDir string) (fa

 	if outputCallback != nil {
 		// Streaming mode: use pipes to capture output as it arrives
-		return executeBashStreaming(cmdCtx, call, cmd, outputCallback)
+		return executeBashStreaming(cmdCtx, call, cmd, outputCallback, sudoPassword)
 	}

 	// Non-streaming mode: collect all output at once (original behavior)
-	return executeBashBuffered(cmdCtx, call, cmd)
+	return executeBashBuffered(cmdCtx, call, cmd, sudoPassword)
 }

 // executeBashBuffered collects all output before returning (original behavior).
 // It uses explicit pipes (not cmd.Stdout) so that cmd.WaitDelay can forcibly
 // close them when grandchild processes hold pipe handles open after the
 // direct child exits.
-func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd) (fantasy.ToolResponse, error) {
+func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, sudoPassword string) (fantasy.ToolResponse, error) {
 	stdoutPipe, err := cmd.StdoutPipe()
 	if err != nil {
 		return fantasy.NewTextErrorResponse("failed to create stdout pipe"), nil
@@ -136,10 +272,27 @@ func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exe
 		return fantasy.NewTextErrorResponse("failed to create stderr pipe"), nil
 	}

+	// If we have a sudo password, create a stdin pipe (written to after Start)
+	var stdinPipe io.WriteCloser
+	if sudoPassword != "" {
+		stdinPipe, err = cmd.StdinPipe()
+		if err != nil {
+			return fantasy.NewTextErrorResponse("failed to create stdin pipe"), nil
+		}
+	}
+
 	if err := cmd.Start(); err != nil {
 		return fantasy.NewTextErrorResponse(fmt.Sprintf("failed to start command: %v", err)), nil
 	}

+	// Write password to stdin if needed, then close stdin
+	if sudoPassword != "" && stdinPipe != nil {
+		go func() {
+			defer func() { _ = stdinPipe.Close() }()
+			_, _ = io.WriteString(stdinPipe, sudoPassword+"\n")
+		}()
+	}
+
 	// Read pipes concurrently
 	var wg sync.WaitGroup
 	var stdout, stderr strings.Builder
@@ -181,7 +334,7 @@ func executeBashBuffered(cmdCtx context.Context, call fantasy.ToolCall, cmd *exe
 }

 // executeBashStreaming streams output as it arrives via the callback.
-func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, outputCallback ToolOutputCallback) (fantasy.ToolResponse, error) {
+func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *exec.Cmd, outputCallback ToolOutputCallback, sudoPassword string) (fantasy.ToolResponse, error) {
 	stdoutPipe, err := cmd.StdoutPipe()
 	if err != nil {
 		return fantasy.NewTextErrorResponse("failed to create stdout pipe"), nil
@@ -191,11 +344,28 @@ func executeBashStreaming(cmdCtx context.Context, call fantasy.ToolCall, cmd *ex
 		return fantasy.NewTextErrorResponse("failed to create stderr pipe"), nil
 	}

+	// If we have a sudo password, create a stdin pipe
+	var stdinPipe io.WriteCloser
+	if sudoPassword != "" {
+		stdinPipe, err = cmd.StdinPipe()
+		if err != nil {
+			return fantasy.NewTextErrorResponse("failed to create stdin pipe"), nil
+		}
+	}
+
 	// Start command execution
 	if err := cmd.Start(); err != nil {
 		return fantasy.NewTextErrorResponse(fmt.Sprintf("failed to start command: %v", err)), nil
 	}

+	// Write password to stdin if needed, then close stdin
+	if sudoPassword != "" && stdinPipe != nil {
+		go func() {
+			defer func() { _ = stdinPipe.Close() }()
+			_, _ = io.WriteString(stdinPipe, sudoPassword+"\n")
+		}()
+	}
+
 	// Stream stdout and stderr concurrently
 	var wg sync.WaitGroup
 	var mu sync.Mutex
@@ -127,3 +127,72 @@ func TestBash_EmptyCommand(t *testing.T) {
 		t.Fatal("expected error for empty command")
 	}
 }
+
+func TestRewriteSudoForStdin(t *testing.T) {
+	tests := []struct {
+		name     string
+		input    string
+		expected string
+	}{
+		{
+			name:     "simple sudo",
+			input:    "sudo apt update",
+			expected: "sudo -S -p '' apt update",
+		},
+		{
+			name:     "sudo with env var",
+			input:    "DEBIAN_FRONTEND=noninteractive sudo apt update",
+			expected: "DEBIAN_FRONTEND=noninteractive sudo -S -p '' apt update",
+		},
+		{
+			name:     "sudo in pipeline",
+			input:    "echo test | sudo tee /etc/test.conf",
+			expected: "echo test | sudo -S -p '' tee /etc/test.conf",
+		},
+		{
+			name:     "sudo after &&",
+			input:    "apt update && sudo apt upgrade",
+			expected: "apt update && sudo -S -p '' apt upgrade",
+		},
+		{
+			name:     "already has -S flag",
+			input:    "sudo -S apt update",
+			expected: "sudo -S apt update",
+		},
+		{
+			name:     "no sudo",
+			input:    "apt update && apt upgrade",
+			expected: "apt update && apt upgrade",
+		},
+		{
+			name:     "sudo in string (should not match)",
+			input:    "echo 'use sudo carefully'",
+			expected: "echo 'use sudo carefully'",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			result := rewriteSudoForStdin(tt.input)
+			if result != tt.expected {
+				t.Errorf("rewriteSudoForStdin(%q) = %q, want %q", tt.input, result, tt.expected)
+			}
+		})
+	}
+}
+
+func TestSudoPasswordFromContext(t *testing.T) {
+	// Test with password in context
+	ctx := ContextWithSudoPassword(context.Background(), "secret123")
+	pw := sudoPasswordFromContext(ctx)
+	if pw != "secret123" {
+		t.Errorf("expected password 'secret123', got %q", pw)
+	}
+
+	// Test without password
+	ctx = context.Background()
+	pw = sudoPasswordFromContext(ctx)
+	if pw != "" {
+		t.Errorf("expected empty password, got %q", pw)
+	}
+}
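Review note: the table test above covers the common shapes. The sketch below probes the same pattern against two extra inputs to show where a regex heuristic stops short of real shell parsing. It is a self-contained illustration only; `matchesSudo` is a hypothetical helper name, and the pattern is copied from the diff with the duplicated `|` in the character class dropped, which does not change what it matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// sudoRe is the detection pattern from the change: "sudo" as a whole word
// at the start of the command or after a shell separator, optionally
// preceded by a single VAR=value assignment.
var sudoRe = regexp.MustCompile(`(?i)(^|[&|;]|\|\||&&)\s*(\w+=\S+\s+)?\bsudo\b`)

// matchesSudo reports whether the heuristic treats command as invoking sudo.
func matchesSudo(command string) bool {
	return sudoRe.MatchString(command)
}

func main() {
	commands := []string{
		"sudo apt update",                     // start of command
		"echo test | sudo tee /etc/test.conf", // after a pipe
		"echo 'use sudo carefully'",           // word inside a quoted string
		"echo $(sudo id)",                     // command substitution
		"echo 'a; sudo b'",                    // separator inside quotes
	}
	for _, c := range commands {
		fmt.Printf("%-40q %v\n", c, matchesSudo(c))
	}
}
```

The last two inputs are the interesting ones: the pattern does not see sudo inside `$(...)`, and it does see a quoted `; sudo`. Whether either case matters in practice depends on the commands the agent tends to generate.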
@@ -86,7 +86,7 @@ Example use cases:
 				},
 				"model": map[string]any{
 					"type":        "string",
-					"description": "Optional model override (e.g. 'anthropic/claude-haiku-3-5-20241022' for faster/cheaper tasks)",
+					"description": "Optional model override. Empty string uses the current model.",
 				},
 				"system_prompt": map[string]any{
 					"type":        "string",
@@ -94,7 +94,7 @@ Example use cases:
 				},
 				"timeout_seconds": map[string]any{
 					"type":        "number",
-					"description": "Maximum execution time in seconds (default: 300, max: 1800)",
+					"description": "Maximum execution time in seconds (default: 300, max: 1800, minimum recommended: 240)",
 				},
 			},
 			Required: []string{"task"},
@@ -918,7 +918,7 @@ type ExtensionEntry struct {
 type ContextMessage struct {
 	// Index is the position of this message in the original context array
 	// (0-based). When returning messages from a ContextPrepareResult,
-	// messages with Index >= 0 reuse the original fantasy.Message at that
+	// messages with Index >= 0 reuse the original LLM message at that
 	// position (preserving tool calls, reasoning, and other complex parts).
 	// Set Index to -1 for newly injected messages (created from Role + Content).
 	Index int
@@ -1063,6 +1063,9 @@ type PrintBlockOpts struct {
 type API struct {
 	// Event-specific registration functions (wired by the loader).
 	onToolCall                func(func(ToolCallEvent, Context) *ToolCallResult)
+	onToolCallInputStart      func(func(ToolCallInputStartEvent, Context))
+	onToolCallInputDelta      func(func(ToolCallInputDeltaEvent, Context))
+	onToolCallInputEnd        func(func(ToolCallInputEndEvent, Context))
 	onToolExecStart           func(func(ToolExecutionStartEvent, Context))
 	onToolExecEnd             func(func(ToolExecutionEndEvent, Context))
 	onToolOutput              func(func(ToolOutputEvent, Context))
@@ -1091,6 +1094,14 @@ type API struct {
 	onSubagentStart           func(func(SubagentStartEvent, Context))
 	onSubagentChunk           func(func(SubagentChunkEvent, Context))
 	onSubagentEnd             func(func(SubagentEndEvent, Context))
+	onStepStart               func(func(StepStartEvent, Context))
+	onStepFinish              func(func(StepFinishEvent, Context))
+	onReasoningStart          func(func(ReasoningStartEvent, Context))
+	onWarnings                func(func(WarningsEvent, Context))
+	onSource                  func(func(SourceEvent, Context))
+	onError                   func(func(ErrorEvent, Context))
+	onRetry                   func(func(RetryEvent, Context))
+	onPrepareStep             func(func(PrepareStepEvent, Context) *PrepareStepResult)
 }

 // OnToolCall registers a handler that fires before a tool executes.
@@ -1099,6 +1110,26 @@ func (a *API) OnToolCall(handler func(ToolCallEvent, Context) *ToolCallResult) {
 	a.onToolCall(handler)
 }

+// OnToolCallInputStart registers a handler that fires when the LLM begins
+// generating tool call arguments. The tool name is known but the full
+// argument JSON is still being streamed. Useful for showing a "running"
+// indicator immediately without waiting for the full arguments.
+func (a *API) OnToolCallInputStart(handler func(ToolCallInputStartEvent, Context)) {
+	a.onToolCallInputStart(handler)
+}
+
+// OnToolCallInputDelta registers a handler that fires for each streamed
+// fragment of tool call arguments as they arrive from the LLM.
+func (a *API) OnToolCallInputDelta(handler func(ToolCallInputDeltaEvent, Context)) {
+	a.onToolCallInputDelta(handler)
+}
+
+// OnToolCallInputEnd registers a handler that fires when tool argument
+// streaming is complete, before the tool call is parsed and execution begins.
+func (a *API) OnToolCallInputEnd(handler func(ToolCallInputEndEvent, Context)) {
+	a.onToolCallInputEnd(handler)
+}
+
 // OnToolExecutionStart registers a handler for tool execution start.
 func (a *API) OnToolExecutionStart(handler func(ToolExecutionStartEvent, Context)) {
 	a.onToolExecStart(handler)
@@ -1278,6 +1309,56 @@ func (a *API) OnBeforeCompact(handler func(BeforeCompactEvent, Context) *BeforeC
 	a.onBeforeCompact(handler)
 }

+// OnStepStart registers a handler that fires when a new LLM call begins
+// within a multi-step agent turn.
+func (a *API) OnStepStart(handler func(StepStartEvent, Context)) {
+	a.onStepStart(handler)
+}
+
+// OnStepFinish registers a handler that fires when a step completes,
+// providing step number, finish reason, and decomposed token usage.
+func (a *API) OnStepFinish(handler func(StepFinishEvent, Context)) {
+	a.onStepFinish(handler)
+}
+
+// OnReasoningStart registers a handler that fires when the LLM begins
+// reasoning/thinking.
+func (a *API) OnReasoningStart(handler func(ReasoningStartEvent, Context)) {
+	a.onReasoningStart(handler)
+}
+
+// OnWarnings registers a handler that fires when the LLM provider returns
+// warnings about the request.
+func (a *API) OnWarnings(handler func(WarningsEvent, Context)) {
+	a.onWarnings(handler)
+}
+
+// OnSource registers a handler that fires when the LLM references a source
+// (e.g. from web search tools).
+func (a *API) OnSource(handler func(SourceEvent, Context)) {
+	a.onSource(handler)
+}
+
+// OnError registers a handler that fires when an agent-level error occurs
+// during streaming.
+func (a *API) OnError(handler func(ErrorEvent, Context)) {
+	a.onError(handler)
+}
+
+// OnRetry registers a handler that fires when the LLM provider request is
+// retried after a transient error.
+func (a *API) OnRetry(handler func(RetryEvent, Context)) {
+	a.onRetry(handler)
+}
+
+// OnPrepareStep registers a handler that fires between steps within a
+// multi-step agent turn, after steering messages are injected and before
+// messages are sent to the LLM. Return a non-nil PrepareStepResult with
+// Messages to replace the context window for this step.
+func (a *API) OnPrepareStep(handler func(PrepareStepEvent, Context) *PrepareStepResult) {
+	a.onPrepareStep(handler)
+}
+
 // RegisterToolRenderer registers a custom renderer for a specific tool's
 // display in the TUI. The renderer controls the header (parameter summary)
 // and/or body (result display) of the tool's output block. If multiple
@@ -1890,6 +1971,34 @@ type ToolCallResult struct {

 func (ToolCallResult) isResult() {}

+// ToolCallInputStartEvent fires when the LLM begins generating tool call
+// arguments. The tool name is known but the full argument JSON is still
+// being streamed.
+type ToolCallInputStartEvent struct {
+	ToolCallID string
+	ToolName   string
+	ToolKind   string // Tool classification: "execute", "edit", "read", "search", "agent"
+}
+
+func (e ToolCallInputStartEvent) Type() EventType { return ToolCallInputStart }
+
+// ToolCallInputDeltaEvent fires for each streamed fragment of tool call
+// arguments as they arrive from the LLM.
+type ToolCallInputDeltaEvent struct {
+	ToolCallID string
+	Delta      string // JSON fragment of tool arguments
+}
+
+func (e ToolCallInputDeltaEvent) Type() EventType { return ToolCallInputDelta }
+
+// ToolCallInputEndEvent fires when tool argument streaming is complete,
+// before the tool call is parsed and execution begins.
+type ToolCallInputEndEvent struct {
+	ToolCallID string
+}
+
+func (e ToolCallInputEndEvent) Type() EventType { return ToolCallInputEnd }
+
 // ToolExecutionStartEvent fires when a tool begins executing.
 type ToolExecutionStartEvent struct {
 	ToolCallID string
@@ -2202,6 +2311,98 @@ type SubagentEndEvent struct {

 func (e SubagentEndEvent) Type() EventType { return SubagentEnd }

+// ---------------------------------------------------------------------------
+// Step lifecycle events (exposed to Yaegi — concrete structs)
+// ---------------------------------------------------------------------------
+
+// StepStartEvent fires when a new LLM call begins within a multi-step agent turn.
+type StepStartEvent struct {
+	StepNumber int
+}
+
+func (e StepStartEvent) Type() EventType { return StepStart }
+
+// StepFinishEvent fires when a step completes, providing step metadata and
+// token usage. Usage fields are plain int64 (not LLMUsage) because Yaegi
+// cannot handle fantasy types across the interpreter boundary.
+type StepFinishEvent struct {
+	StepNumber       int
+	HasToolCalls     bool
+	FinishReason     string
+	InputTokens      int64
+	OutputTokens     int64
+	CacheReadTokens  int64
+	CacheWriteTokens int64
+}
+
+func (e StepFinishEvent) Type() EventType { return StepFinish }
+
+// ReasoningStartEvent fires when the LLM begins reasoning/thinking.
+type ReasoningStartEvent struct {
+	ID string
+}
+
+func (e ReasoningStartEvent) Type() EventType { return ReasoningStart }
+
+// WarningsEvent fires when the LLM provider returns warnings about the request.
+type WarningsEvent struct {
+	Warnings []string
+}
+
+func (e WarningsEvent) Type() EventType { return Warnings }
+
+// SourceEvent fires when the LLM references a source (e.g. from web search).
+type SourceEvent struct {
+	SourceType string
+	ID         string
+	URL        string
+	Title      string
+}
+
+func (e SourceEvent) Type() EventType { return Source }
+
+// ErrorEvent fires when an agent-level error occurs during streaming.
+// Uses string instead of error because Yaegi cannot handle the error
+// interface reliably across the interpreter boundary.
+type ErrorEvent struct {
+	Error string
+}
+
+func (e ErrorEvent) Type() EventType { return Error }
+
+// RetryEvent fires when the LLM provider request is retried after a
+// transient error.
+type RetryEvent struct {
+	Attempt int
+	Error   string
+}
+
+func (e RetryEvent) Type() EventType { return Retry }
+
+// PrepareStepEvent fires between steps within a multi-step agent turn,
+// after steering messages are injected and before messages are sent to
+// the LLM. Handlers can inspect and replace the context window.
+type PrepareStepEvent struct {
+	// StepNumber is the zero-based step index within the current turn.
+	StepNumber int
+	// Messages is the current context window that will be sent to the LLM.
+	Messages []ContextMessage
+}
+
+func (e PrepareStepEvent) Type() EventType { return PrepareStep }
+
+// PrepareStepResult allows extensions to replace the context window between
+// steps. Return nil Messages to leave the context unchanged.
+type PrepareStepResult struct {
+	// Messages replaces the entire context window for this step. If nil,
+	// the original messages are used unchanged. Messages with a non-negative
+	// Index reuse the original message at that position; messages with
+	// Index < 0 are created fresh from Role + Content.
+	Messages []ContextMessage
+}
+
+func (PrepareStepResult) isResult() {}
+
 // ThemeColor is an adaptive color pair with light and dark hex values.
 // Either field may be empty to inherit from the default theme.
 type ThemeColor struct {
@@ -13,6 +13,19 @@ const (
 	// ToolCall fires before a tool executes. Handlers can block execution.
 	ToolCall EventType = "tool_call"

+	// ToolCallInputStart fires when the LLM begins generating tool call
+	// arguments. The tool name is known but the full argument JSON is still
+	// being streamed.
+	ToolCallInputStart EventType = "tool_call_input_start"
+
+	// ToolCallInputDelta fires for each streamed fragment of tool call
+	// arguments as they arrive from the LLM.
+	ToolCallInputDelta EventType = "tool_call_input_delta"
+
+	// ToolCallInputEnd fires when tool argument streaming is complete,
+	// before the tool call is parsed and execution begins.
+	ToolCallInputEnd EventType = "tool_call_input_end"
+
 	// ToolExecutionStart fires when a tool begins executing.
 	ToolExecutionStart EventType = "tool_execution_start"

@@ -83,18 +96,50 @@ const (
 	// SubagentEnd fires when a subagent tool call completes (success
 	// or error). Carries the final response and any error message.
 	SubagentEnd EventType = "subagent_end"
+
+	// StepStart fires when a new LLM call begins within a multi-step
+	// agent turn.
+	StepStart EventType = "step_start"
+
+	// StepFinish fires when a step completes, providing step number,
+	// finish reason, and token usage.
+	StepFinish EventType = "step_finish"
+
+	// ReasoningStart fires when the LLM begins reasoning/thinking.
+	ReasoningStart EventType = "reasoning_start"
+
+	// Warnings fires when the LLM provider returns warnings.
+	Warnings EventType = "warnings"
+
+	// Source fires when the LLM references a source (e.g. web search).
+	Source EventType = "source"
+
+	// Error fires when an agent-level error occurs during streaming.
+	Error EventType = "error"
+
+	// Retry fires when the LLM provider request is retried after a
+	// transient error.
+	Retry EventType = "retry"
+
+	// PrepareStep fires between steps within a multi-step agent turn,
+	// after steering messages are injected and before messages are sent
+	// to the LLM. Handlers can replace the context window for this step.
+	PrepareStep EventType = "prepare_step"
 )

 // AllEventTypes returns every supported event type.
 func AllEventTypes() []EventType {
 	return []EventType{
-		ToolCall, ToolExecutionStart, ToolExecutionEnd, ToolResult,
+		ToolCall, ToolCallInputStart, ToolCallInputDelta, ToolCallInputEnd,
+		ToolExecutionStart, ToolExecutionEnd, ToolResult,
 		Input, BeforeAgentStart, AgentStart, AgentEnd,
 		MessageStart, MessageUpdate, MessageEnd,
 		SessionStart, SessionShutdown,
 		ModelChange, ContextPrepare,
 		BeforeFork, BeforeSessionSwitch, BeforeCompact,
 		SubagentStart, SubagentChunk, SubagentEnd,
+		StepStart, StepFinish, ReasoningStart, Warnings, Source, Error, Retry,
+		PrepareStep,
 	}
 }

@@ -4,8 +4,8 @@ import "testing"

 func TestAllEventTypes_Count(t *testing.T) {
 	all := AllEventTypes()
-	if len(all) != 21 {
-		t.Fatalf("expected 21 event types, got %d", len(all))
+	if len(all) != 32 {
+		t.Fatalf("expected 32 event types, got %d", len(all))
 	}
 }

@@ -38,6 +38,9 @@ func TestEventType_TypeMethod(t *testing.T) {
 		want  EventType
 	}{
 		{ToolCallEvent{ToolName: "test"}, ToolCall},
+		{ToolCallInputStartEvent{ToolCallID: "x", ToolName: "test"}, ToolCallInputStart},
+		{ToolCallInputDeltaEvent{ToolCallID: "x", Delta: "{"}, ToolCallInputDelta},
+		{ToolCallInputEndEvent{ToolCallID: "x"}, ToolCallInputEnd},
 		{ToolExecutionStartEvent{ToolName: "test"}, ToolExecutionStart},
 		{ToolExecutionEndEvent{ToolName: "test"}, ToolExecutionEnd},
 		{ToolResultEvent{ToolName: "test"}, ToolResult},
@@ -429,6 +429,24 @@ func loadSingleExtension(path string) (*LoadedExtension, error) {
 				return *r
 			})
 		},
+		onToolCallInputStart: func(h func(ToolCallInputStartEvent, Context)) {
+			reg(ToolCallInputStart, func(e Event, c Context) Result {
+				h(e.(ToolCallInputStartEvent), c)
+				return nil
+			})
+		},
+		onToolCallInputDelta: func(h func(ToolCallInputDeltaEvent, Context)) {
+			reg(ToolCallInputDelta, func(e Event, c Context) Result {
+				h(e.(ToolCallInputDeltaEvent), c)
+				return nil
+			})
+		},
+		onToolCallInputEnd: func(h func(ToolCallInputEndEvent, Context)) {
+			reg(ToolCallInputEnd, func(e Event, c Context) Result {
+				h(e.(ToolCallInputEndEvent), c)
+				return nil
+			})
+		},
 		onToolExecStart: func(h func(ToolExecutionStartEvent, Context)) {
 			reg(ToolExecutionStart, func(e Event, c Context) Result {
 				h(e.(ToolExecutionStartEvent), c)
@@ -600,6 +618,57 @@ func loadSingleExtension(path string) (*LoadedExtension, error) {
 				return nil
 			})
 		},
+		onStepStart: func(h func(StepStartEvent, Context)) {
+			reg(StepStart, func(e Event, c Context) Result {
+				h(e.(StepStartEvent), c)
+				return nil
+			})
+		},
+		onStepFinish: func(h func(StepFinishEvent, Context)) {
+			reg(StepFinish, func(e Event, c Context) Result {
+				h(e.(StepFinishEvent), c)
+				return nil
+			})
+		},
+		onReasoningStart: func(h func(ReasoningStartEvent, Context)) {
+			reg(ReasoningStart, func(e Event, c Context) Result {
+				h(e.(ReasoningStartEvent), c)
+				return nil
+			})
+		},
+		onWarnings: func(h func(WarningsEvent, Context)) {
+			reg(Warnings, func(e Event, c Context) Result {
+				h(e.(WarningsEvent), c)
+				return nil
+			})
+		},
+		onSource: func(h func(SourceEvent, Context)) {
+			reg(Source, func(e Event, c Context) Result {
+				h(e.(SourceEvent), c)
+				return nil
+			})
+		},
+		onError: func(h func(ErrorEvent, Context)) {
+			reg(Error, func(e Event, c Context) Result {
+				h(e.(ErrorEvent), c)
+				return nil
+			})
+		},
+		onRetry: func(h func(RetryEvent, Context)) {
+			reg(Retry, func(e Event, c Context) Result {
+				h(e.(RetryEvent), c)
+				return nil
+			})
+		},
+		onPrepareStep: func(h func(PrepareStepEvent, Context) *PrepareStepResult) {
+			reg(PrepareStep, func(e Event, c Context) Result {
+				r := h(e.(PrepareStepEvent), c)
+				if r == nil {
+					return nil
+				}
+				return *r
+			})
+		},
 	}

 	// Call Init — the extension registers its handlers, tools, commands.
@@ -1,21 +1,93 @@
 package extensions

 import (
+	"bytes"
 	"fmt"
 	"log"
 	"os"
+	"runtime"
 	"sort"
+	"strconv"
 	"strings"
 	"sync"

 	"github.com/spf13/viper"
 )

+// ---------------------------------------------------------------------------
+// reentrantMu — a per-extension mutex that allows the same goroutine to
+// re-enter (e.g. handler → ctx.EmitCustomEvent → handler in same extension).
+// Different goroutines are serialized, preventing concurrent state mutation.
+// ---------------------------------------------------------------------------
+
+type reentrantMu struct {
+	mu    sync.Mutex
+	cond  *sync.Cond
+	owner int64 // goroutine ID that holds the lock, or 0
+	depth int   // re-entrancy depth
+}
+
+// init initializes the reentrant mutex in place. It must be called
+// after the struct is at its final memory location (not before copying).
+func (r *reentrantMu) init() {
+	r.cond = sync.NewCond(&r.mu)
+}
+
+// lock acquires the mutex. If the calling goroutine already holds it, the
+// call succeeds immediately (re-entrant). Every call to lock must be paired
+// with a call to unlock.
+func (r *reentrantMu) lock() {
+	gid := goroutineID()
+	r.mu.Lock()
+	if r.owner == gid {
+		// Re-entrant: same goroutine already holds the lock.
+		r.depth++
+		r.mu.Unlock()
+		return
+	}
+	// Wait for the current owner to release.
+	for r.owner != 0 {
+		r.cond.Wait() // releases mu, blocks, re-acquires mu on wake
+	}
+	r.owner = gid
+	r.depth = 1
+	r.mu.Unlock()
+}
+
+// unlock releases the mutex (or decrements re-entrancy depth).
+func (r *reentrantMu) unlock() {
+	r.mu.Lock()
+	r.depth--
+	if r.depth == 0 {
+		r.owner = 0
+		r.cond.Signal()
+	}
+	r.mu.Unlock()
+}
+
+// goroutineID extracts the current goroutine's ID from runtime.Stack output.
+// Go exposes no public API for goroutine IDs, so parsing the stack header is a common workaround.
+func goroutineID() int64 {
+	var buf [64]byte
+	n := runtime.Stack(buf[:], false)
+	// Stack output starts with "goroutine NNN ["
+	s := buf[:n]
+	s = s[len("goroutine "):]
+	s = s[:bytes.IndexByte(s, ' ')]
+	id, _ := strconv.ParseInt(string(s), 10, 64)
+	return id
+}
+
 // Runner manages loaded extensions and dispatches events to their handlers
 // sequentially. Handlers execute in extension
 // load order; for cancellable events the first blocking result wins.
+//
+// Each extension has a dedicated reentrant mutex so that handlers for the
+// same extension are serialized (preventing data races on shared package-level
+// state), while handlers for different extensions may execute concurrently.
 type Runner struct {
 	extensions      []LoadedExtension
+	extMu           []reentrantMu // per-extension reentrant mutex, indexed by extension position
 	ctx             Context
 	widgets         map[string]WidgetConfig   // keyed by widget ID
 	statusEntries   map[string]StatusBarEntry // keyed by status key
@@ -52,7 +124,11 @@ type LoadedExtension struct {

 // NewRunner creates a Runner from a set of loaded extensions.
 func NewRunner(exts []LoadedExtension) *Runner {
-	return &Runner{extensions: exts}
+	mus := make([]reentrantMu, len(exts))
+	for i := range mus {
+		mus[i].init()
+	}
+	return &Runner{extensions: exts, extMu: mus}
 }

 // SetContext updates the runtime context (session ID, model, etc.) that is
@@ -367,6 +443,11 @@ func (r *Runner) Emit(event Event) (Result, error) {
 	for i := range r.extensions {
 		ext := &r.extensions[i]
 		handlers := ext.Handlers[event.Type()]
+		if len(handlers) == 0 {
+			continue
+		}
+
+		r.extMu[i].lock()
 		for _, handler := range handlers {
 			result, err := safeCall(handler, event, ctx)
 			if err != nil {
@@ -379,6 +460,7 @@ func (r *Runner) Emit(event Event) (Result, error) {

 			// Check for blocking/short-circuit results.
 			if isBlocking(result) {
+				r.extMu[i].unlock()
 				return result, nil
 			}

@@ -386,6 +468,7 @@ func (r *Runner) Emit(event Event) (Result, error) {
 			// the caller is responsible for applying the modifications.
 			accumulated = result
 		}
+		r.extMu[i].unlock()
 	}
 	return accumulated, nil
 }
@@ -712,11 +795,17 @@ func (r *Runner) EmitCustomEvent(name, data string) {

 	// Extension-registered handlers first (in load order).
 	for i := range r.extensions {
-		for _, h := range r.extensions[i].CustomEventHandlers[name] {
+		extHandlers := r.extensions[i].CustomEventHandlers[name]
+		if len(extHandlers) == 0 {
+			continue
+		}
+		r.extMu[i].lock()
+		for _, h := range extHandlers {
 			safeInvoke(h)
 		}
+		r.extMu[i].unlock()
 	}
-	// Then dynamic subscriptions.
+	// Then dynamic subscriptions (not extension-scoped, no per-ext lock).
 	for _, h := range dynamicHandlers {
 		safeInvoke(h)
 	}
@@ -1,6 +1,7 @@
 package extensions

 import (
+	"sync"
 	"testing"
 )

@@ -571,3 +572,142 @@ func TestRunner_ContextPrintNilSafe(t *testing.T) {
 		t.Fatalf("unexpected error: %v", err)
 	}
 }
+
+func TestRunner_ConcurrentEmitSameExtension(t *testing.T) {
+	// Verify that concurrent Emit calls for the same extension are serialized
+	// and don't cause data races on shared handler state.
+	var counter int
+	ext := makeHandlerExt("shared-state.go", map[EventType][]HandlerFunc{
+		SubagentStart: {
+			func(e Event, c Context) Result {
+				// Read-modify-write: racy without serialization.
+				v := counter
+				counter = v + 1
+				return nil
+			},
+		},
+		SubagentChunk: {
+			func(e Event, c Context) Result {
+				v := counter
+				counter = v + 1
+				return nil
+			},
+		},
+	})
+
+	r := makeRunner(ext)
+	var wg sync.WaitGroup
+	const goroutines = 20
+	const iterations = 50
+	wg.Add(goroutines)
+	for range goroutines {
+		go func() {
+			defer wg.Done()
+			for range iterations {
+				_, _ = r.Emit(SubagentStartEvent{ToolCallID: "x"})
+				_, _ = r.Emit(SubagentChunkEvent{ToolCallID: "x"})
+			}
+		}()
+	}
+	wg.Wait()
+	if counter != goroutines*iterations*2 {
+		t.Errorf("expected counter=%d, got %d (race detected)", goroutines*iterations*2, counter)
+	}
+}
+
+func TestRunner_ConcurrentEmitDifferentExtensions(t *testing.T) {
+	// Two extensions with independent state should not block each other
+	// and should both run correctly under concurrent Emit calls.
+	var counter1, counter2 int
+	ext1 := makeHandlerExt("ext1.go", map[EventType][]HandlerFunc{
+		SubagentStart: {
+			func(e Event, c Context) Result {
+				v := counter1
+				counter1 = v + 1
+				return nil
+			},
+		},
+	})
+	ext2 := makeHandlerExt("ext2.go", map[EventType][]HandlerFunc{
+		SubagentStart: {
+			func(e Event, c Context) Result {
+				v := counter2
+				counter2 = v + 1
+				return nil
+			},
+		},
+	})
+
+	r := makeRunner(ext1, ext2)
+	var wg sync.WaitGroup
+	const goroutines = 20
+	const iterations = 50
+	wg.Add(goroutines)
+	for range goroutines {
+		go func() {
+			defer wg.Done()
+			for range iterations {
+				_, _ = r.Emit(SubagentStartEvent{ToolCallID: "x"})
+			}
+		}()
+	}
+	wg.Wait()
+	expected := goroutines * iterations
+	if counter1 != expected {
+		t.Errorf("ext1 counter: expected %d, got %d", expected, counter1)
+	}
+	if counter2 != expected {
+		t.Errorf("ext2 counter: expected %d, got %d", expected, counter2)
+	}
+}
+
+func TestRunner_ReentrantEmitCustomEvent(t *testing.T) {
+	// Verify that a handler can call EmitCustomEvent (which dispatches to
+	// the same extension's custom event handlers) without deadlocking.
+	var order []string
+	ext := LoadedExtension{
+		Path: "reentrant.go",
+		Handlers: map[EventType][]HandlerFunc{
+			SessionStart: {
+				func(e Event, c Context) Result {
+					order = append(order, "session_start")
+					// Placeholder: replaced below, once the runner exists, with a
+					// handler that calls r.EmitCustomEvent re-entrantly.
+					return nil
+				},
+			},
+		},
+		CustomEventHandlers: map[string][]func(string){
+			"test-event": {
+				func(data string) {
+					order = append(order, "custom:"+data)
+				},
+			},
+		},
+	}
+
+	r := makeRunner(ext)
+
+	// Wire up the handler to call EmitCustomEvent re-entrantly.
+	ext.Handlers[SessionStart] = []HandlerFunc{
+		func(e Event, c Context) Result {
+			order = append(order, "session_start")
+			r.EmitCustomEvent("test-event", "hello")
+			return nil
+		},
+	}
+	r.extensions[0] = ext
+	// Rebuild mutexes after modifying extensions slice.
+	r.extMu = make([]reentrantMu, len(r.extensions))
+	for i := range r.extMu {
+		r.extMu[i].init()
+	}
+
+	_, err := r.Emit(SessionStartEvent{})
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	if len(order) != 2 || order[0] != "session_start" || order[1] != "custom:hello" {
+		t.Errorf("expected [session_start, custom:hello], got %v", order)
+	}
+}
@@ -152,6 +152,9 @@ func Symbols() interp.Exports {
 			// Event structs
 			"ToolCallEvent":           reflect.ValueOf((*ToolCallEvent)(nil)),
 			"ToolCallResult":          reflect.ValueOf((*ToolCallResult)(nil)),
+			"ToolCallInputStartEvent": reflect.ValueOf((*ToolCallInputStartEvent)(nil)),
+			"ToolCallInputDeltaEvent": reflect.ValueOf((*ToolCallInputDeltaEvent)(nil)),
+			"ToolCallInputEndEvent":   reflect.ValueOf((*ToolCallInputEndEvent)(nil)),
 			"ToolExecutionStartEvent": reflect.ValueOf((*ToolExecutionStartEvent)(nil)),
 			"ToolExecutionEndEvent":   reflect.ValueOf((*ToolExecutionEndEvent)(nil)),
 			"ToolOutputEvent":         reflect.ValueOf((*ToolOutputEvent)(nil)),
@@ -169,6 +172,17 @@ func Symbols() interp.Exports {
 			"SessionStartEvent":       reflect.ValueOf((*SessionStartEvent)(nil)),
 			"SessionShutdownEvent":    reflect.ValueOf((*SessionShutdownEvent)(nil)),
 			"ModelChangeEvent":        reflect.ValueOf((*ModelChangeEvent)(nil)),
+
+			// Step lifecycle events
+			"StepStartEvent":      reflect.ValueOf((*StepStartEvent)(nil)),
+			"StepFinishEvent":     reflect.ValueOf((*StepFinishEvent)(nil)),
+			"ReasoningStartEvent": reflect.ValueOf((*ReasoningStartEvent)(nil)),
+			"WarningsEvent":       reflect.ValueOf((*WarningsEvent)(nil)),
+			"SourceEvent":         reflect.ValueOf((*SourceEvent)(nil)),
+			"ErrorEvent":          reflect.ValueOf((*ErrorEvent)(nil)),
+			"RetryEvent":          reflect.ValueOf((*RetryEvent)(nil)),
+			"PrepareStepEvent":    reflect.ValueOf((*PrepareStepEvent)(nil)),
+			"PrepareStepResult":   reflect.ValueOf((*PrepareStepResult)(nil)),
 		},
 	}
 }
@@ -28,11 +28,11 @@ func WrapToolsWithExtensions(tools []fantasy.AgentTool, runner *Runner) []fantas
 	return wrapped
 }

-// ExtensionToolsAsFantasy converts ToolDef values registered by extensions
-// into fantasy.AgentTool implementations so the LLM can invoke them.
+// ExtensionToolsAsLLMTools converts ToolDef values registered by extensions
+// into LLM agent tool implementations so the LLM can invoke them.
 // The runner is optional; if provided, ToolContext.OnProgress routes
 // progress messages through the runner's Print function.
-func ExtensionToolsAsFantasy(defs []ToolDef, runner *Runner) []fantasy.AgentTool {
+func ExtensionToolsAsLLMTools(defs []ToolDef, runner *Runner) []fantasy.AgentTool {
 	tools := make([]fantasy.AgentTool, 0, len(defs))
 	for _, def := range defs {
 		tools = append(tools, &extensionTool{def: def, runner: runner})
@@ -90,8 +90,7 @@ func (w *wrappedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.T
 	// 0. Check if tool is disabled via SetActiveTools.
 	if w.runner.IsToolDisabled(toolName) {
 		return fantasy.NewTextErrorResponse(
-				fmt.Sprintf("Error: tool %q is currently disabled", toolName)),
-			fmt.Errorf("tool %q disabled by extension", toolName)
+			fmt.Sprintf("Error: tool %q is currently disabled", toolName)), nil
 	}

 	kind := toolKindFor(toolName)
@@ -111,8 +110,7 @@ func (w *wrappedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.T
 			if reason == "" {
 				reason = "blocked by extension"
 			}
-			return fantasy.NewTextErrorResponse(fmt.Sprintf("Error: %s", reason)),
-				fmt.Errorf("tool blocked by extension: %s", reason)
+			return fantasy.NewTextErrorResponse(fmt.Sprintf("Error: %s", reason)), nil
 		}
 	}

@@ -154,7 +152,7 @@ func (w *wrappedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.T
 }

 // ---------------------------------------------------------------------------
-// extensionTool — wraps a ToolDef into a fantasy.AgentTool
+// extensionTool — wraps a ToolDef into an LLM agent tool
 // ---------------------------------------------------------------------------

 type extensionTool struct {
@@ -182,7 +180,7 @@ func (t *extensionTool) Info() fantasy.ToolInfo {
 				info.Parameters = props
 			} else {
 				// Schema doesn't have "properties" — use as-is (may be
-				// a flat property map already matching fantasy's format).
+				// a flat property map already matching the expected format).
 				info.Parameters = schema
 			}
 			// Extract required fields if present.
@@ -238,7 +236,7 @@ func (t *extensionTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy
 	}

 	if err != nil {
-		return fantasy.NewTextErrorResponse(err.Error()), err
+		return fantasy.NewTextErrorResponse(err.Error()), nil
 	}
 	return fantasy.NewTextResponse(result), nil
 }
@@ -142,8 +142,8 @@ func TestWrappedTool_BlockExecution(t *testing.T) {
 	if toolRan {
 		t.Error("tool should not have run after block")
 	}
-	if err == nil {
-		t.Error("expected error from blocked tool")
+	if err != nil {
+		t.Error("expected nil error for blocked tool (error is conveyed via IsError response)")
 	}
 	if resp.IsError != true {
 		t.Error("expected IsError=true from blocked response")
@@ -192,7 +192,7 @@ func TestWrappedTool_ExecutionStartEnd(t *testing.T) {
 	}
 }

-func TestExtensionToolsAsFantasy(t *testing.T) {
+func TestExtensionToolsAsLLMTools(t *testing.T) {
 	defs := []ToolDef{
 		{
 			Name:        "greet",
@@ -202,7 +202,7 @@ func TestExtensionToolsAsFantasy(t *testing.T) {
 		},
 	}

-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)
 	if len(tools) != 1 {
 		t.Fatalf("expected 1 tool, got %d", len(tools))
 	}
@@ -232,10 +232,10 @@ func TestExtensionTool_Error(t *testing.T) {
 		},
 	}

-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)
 	resp, err := tools[0].Run(context.Background(), fantasy.ToolCall{Input: "x"})
-	if err == nil {
-		t.Error("expected error")
+	if err != nil {
+		t.Error("expected nil error (error is conveyed via IsError response)")
 	}
 	if !resp.IsError {
 		t.Error("expected IsError=true")
@@ -259,7 +259,7 @@ func TestExtensionTool_ExecuteWithContext(t *testing.T) {
 	}

 	// Without runner, OnProgress is a no-op.
-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)
 	resp, err := tools[0].Run(context.Background(), fantasy.ToolCall{Input: "test"})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
@@ -285,7 +285,7 @@ func TestExtensionTool_ExecuteWithContext(t *testing.T) {
 			},
 		},
 	}
-	tools2 := ExtensionToolsAsFantasy(defs2, runner)
+	tools2 := ExtensionToolsAsLLMTools(defs2, runner)
 	_, err = tools2[0].Run(context.Background(), fantasy.ToolCall{Input: ""})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
@@ -306,7 +306,7 @@ func TestExtensionTool_ExecuteWithContextPriority(t *testing.T) {
 			},
 		},
 	}
-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)
 	resp, err := tools[0].Run(context.Background(), fantasy.ToolCall{Input: ""})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
@@ -330,7 +330,7 @@ func TestExtensionTool_CancelledContext(t *testing.T) {
 			},
 		},
 	}
-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)
 	_, _ = tools[0].Run(ctx, fantasy.ToolCall{Input: ""})
 	if !sawCancelled {
 		t.Error("expected IsCancelled=true for cancelled context")
@@ -339,7 +339,7 @@ func TestExtensionTool_CancelledContext(t *testing.T) {

 func TestExtensionTool_ProviderOptions(t *testing.T) {
 	defs := []ToolDef{{Name: "test", Execute: func(string) (string, error) { return "", nil }}}
-	tools := ExtensionToolsAsFantasy(defs, nil)
+	tools := ExtensionToolsAsLLMTools(defs, nil)

 	// Initially nil.
 	opts := tools[0].ProviderOptions()
@@ -267,7 +267,7 @@ func loadExtensions() (*extensions.Runner, extensionCreationOpts, error) {
 		return extensions.WrapToolsWithExtensions(tools, runner)
 	}

-	extTools := extensions.ExtensionToolsAsFantasy(runner.RegisteredTools(), runner)
+	extTools := extensions.ExtensionToolsAsLLMTools(runner.RegisteredTools(), runner)

 	return runner, extensionCreationOpts{
 		toolWrapper: wrapper,
@@ -325,12 +325,6 @@ func UnmarshalParts(data []byte) ([]ContentPart, error) {
 // mixed TextPart and ToolCallPart content. Tool-role messages produce
 // ToolResultPart entries.
 func (m *Message) ToLLMMessages() []fantasy.Message {
-	return m.ToFantasyMessages()
-}
-
-// Deprecated: Use ToLLMMessages instead.
-// ToFantasyMessages converts a Message to one or more LLM message values.
-func (m *Message) ToFantasyMessages() []fantasy.Message {
 	switch m.Role {
 	case RoleAssistant:
 		var parts []fantasy.MessagePart
@@ -431,13 +425,6 @@ func (m *Message) ToFantasyMessages() []fantasy.Message {
 // FromLLMMessage converts an LLM message into our Message type,
 // extracting all content parts into the appropriate block types.
 func FromLLMMessage(msg fantasy.Message) Message {
-	return FromFantasyMessage(msg)
-}
-
-// Deprecated: Use FromLLMMessage instead.
-// FromFantasyMessage converts an LLM message into our Message type,
-// extracting all content parts into the appropriate block types.
-func FromFantasyMessage(msg fantasy.Message) Message {
 	m := Message{
 		Role:      MessageRole(msg.Role),
 		Parts:     make([]ContentPart, 0),
@@ -25,7 +25,6 @@ import (
 	openaisdk "github.com/charmbracelet/openai-go"

 	"github.com/mark3labs/kit/internal/auth"
-	"github.com/mark3labs/kit/internal/ui/progress"
 )

 const (
@@ -86,6 +85,7 @@ type ThinkingLevel string

 const (
 	ThinkingOff     ThinkingLevel = "off"
+	ThinkingNone    ThinkingLevel = "none"
 	ThinkingMinimal ThinkingLevel = "minimal"
 	ThinkingLow     ThinkingLevel = "low"
 	ThinkingMedium  ThinkingLevel = "medium"
@@ -94,12 +94,14 @@ const (

 // ThinkingLevels returns the ordered list of available thinking levels for cycling.
 func ThinkingLevels() []ThinkingLevel {
-	return []ThinkingLevel{ThinkingOff, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh}
+	return []ThinkingLevel{ThinkingOff, ThinkingNone, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh}
 }

-// thinkingBudgetTokens returns the token budget for a thinking level, or 0 for "off".
+// thinkingBudgetTokens returns the token budget for a thinking level, or 0 for "off". "none" shares the 1024-token budget of "minimal".
 func thinkingBudgetTokens(level ThinkingLevel) int64 {
 	switch level {
+	case ThinkingNone:
+		return 1024
 	case ThinkingMinimal:
 		return 1024
 	case ThinkingLow:
@@ -118,6 +120,8 @@ func ThinkingLevelDescription(level ThinkingLevel) string {
 	switch level {
 	case ThinkingOff:
 		return "No reasoning"
+	case ThinkingNone:
+		return "Minimal reasoning (OpenAI 'none')"
 	case ThinkingMinimal:
 		return "Very brief reasoning (~1k tokens)"
 	case ThinkingLow:
@@ -134,7 +138,7 @@ func ThinkingLevelDescription(level ThinkingLevel) string {
 // ParseThinkingLevel converts a string to a ThinkingLevel, defaulting to ThinkingOff.
 func ParseThinkingLevel(s string) ThinkingLevel {
 	switch ThinkingLevel(s) {
-	case ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh:
+	case ThinkingNone, ThinkingMinimal, ThinkingLow, ThinkingMedium, ThinkingHigh:
 		return ThinkingLevel(s)
 	default:
 		return ThinkingOff
@@ -159,6 +163,12 @@ type ProviderConfig struct {
 	TLSSkipVerify    bool
 	ThinkingLevel    ThinkingLevel
 	DisableCaching   bool // Opt-out: set to true to disable automatic prompt caching
+
+	// ProgressReaderFunc, when set, wraps an io.Reader with progress display
+	// for long operations like Ollama model pulls. The returned io.ReadCloser
+	// must be closed when done. When nil, the raw reader is consumed directly
+	// with no progress UI.
+	ProgressReaderFunc func(io.Reader) io.ReadCloser
 }

 // ProviderResult contains the result of provider creation.
@@ -246,6 +256,11 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
 	// via CLI flag or global config.
 	ApplyModelSettings(config, modelInfo)

+	// Auto-raise MaxTokens toward the model's known output ceiling when the
+	// user hasn't explicitly set --max-tokens. Runs after ApplyModelSettings so
+	// a per-model override at or above the raised value is left untouched.
+	rightSizeMaxTokens(config, modelInfo)
+
 	// Create the base provider
 	var result *ProviderResult
 	var createErr error
@@ -290,9 +305,18 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
 			// Only add cache options for providers that don't already have
 			// options set, to avoid type conflicts (e.g., Anthropic has
 			// different types for regular options vs cache control options).
-			for k, v := range cacheOpts {
-				if _, exists := result.ProviderOptions[k]; !exists {
-					result.ProviderOptions[k] = v
+			//
+			// For OpenAI Responses API models, we skip merging entirely because
+			// ResponsesProviderOptions and ProviderOptions are incompatible types.
+			skipMerge := false
+			if provider == "openai" && openai.IsResponsesModel(modelName) {
+				skipMerge = true
+			}
+			if !skipMerge {
+				for k, v := range cacheOpts {
+					if _, exists := result.ProviderOptions[k]; !exists {
+						result.ProviderOptions[k] = v
+					}
 				}
 			}
 		}
@@ -484,6 +508,37 @@ func validateModelConfig(config *ProviderConfig, modelInfo *ModelInfo) {
 	}
 }

+// defaultRightSizeCap bounds auto-raised MaxTokens so that we don't silently
+// allocate enormous output budgets for models with very high ceilings (e.g.
+// Devstral at 262144, Mistral at 128000). Users who genuinely want more can
+// pass --max-tokens explicitly or set modelSettings[...].maxTokens in config.
+const defaultRightSizeCap = 32768
+
+// rightSizeMaxTokens raises config.MaxTokens toward the model's known output
+// ceiling when:
+//   - the user has not explicitly set --max-tokens (or the KIT_MAX_TOKENS env
+//     var, or the top-level max-tokens key in config.yaml), AND
+//   - the current MaxTokens, including any per-model override applied earlier
+//     by ApplyModelSettings, is below the capped target, AND
+//   - modelInfo.Limit.Output is known (greater than zero).
+//
+// The target is min(modelInfo.Limit.Output, defaultRightSizeCap), which keeps
+// accidental allocations reasonable on very-large-output models. This avoids
+// the common failure where the agent's reply is silently truncated at the 8192
+// default even though the selected model supports 64k or 262k output tokens.
+func rightSizeMaxTokens(config *ProviderConfig, modelInfo *ModelInfo) {
+	if modelInfo == nil || modelInfo.Limit.Output <= 0 {
+		return
+	}
+	if isExplicitlySet("max-tokens") {
+		return
+	}
+	target := min(modelInfo.Limit.Output, defaultRightSizeCap)
+	if config.MaxTokens < target {
+		config.MaxTokens = target
+	}
+}
+
 // clearConflictingAnthropicSamplingParams ensures that temperature and top_p are
 // not both sent to the Anthropic API, which rejects requests containing both.
 // When both are set (typically from defaults), top_p is cleared so that
@@ -530,6 +585,8 @@ func buildOpenAIProviderOptions(config *ProviderConfig, modelName string) fantas
 // Returns nil for ThinkingOff (use the model's default).
 func thinkingLevelToReasoningEffort(level ThinkingLevel) *openai.ReasoningEffort {
 	switch level {
+	case ThinkingNone:
+		return new(openai.ReasoningEffortNone)
 	case ThinkingMinimal:
 		return new(openai.ReasoningEffortMinimal)
 	case ThinkingLow:
@@ -543,6 +600,56 @@ func thinkingLevelToReasoningEffort(level ThinkingLevel) *openai.ReasoningEffort
 	}
 }

+// IsValidThinkingLevelForModel checks if a thinking level is valid for the given
+// model. Some OpenAI models like gpt-5.4 don't support "minimal" and require
+// "none" instead.
+func IsValidThinkingLevelForModel(level ThinkingLevel, modelName string) bool {
+	if level == ThinkingOff {
+		return true
+	}
+
+	// Check if this is an OpenAI model that doesn't support "minimal".
+	// gpt-5.4, gpt-5-pro, and gpt-5-chat use "none" instead of "minimal".
+	if level == ThinkingMinimal {
+		if strings.Contains(modelName, "gpt-5.4") ||
+			strings.Contains(modelName, "gpt-5-pro") ||
+			strings.Contains(modelName, "gpt-5-chat") {
+			return false
+		}
+	}
+
+	// Check if this is an OpenAI model that doesn't support "none"
+	// Older gpt-5 models only support "minimal", not "none"
+	if level == ThinkingNone {
+		if strings.Contains(modelName, "gpt-5") &&
+			!strings.Contains(modelName, "gpt-5.4") &&
+			!strings.Contains(modelName, "gpt-5-pro") &&
+			!strings.Contains(modelName, "gpt-5-chat") {
+			// Assume "none" is unsupported here: it was only added for the
+			// newer gpt-5 variants excluded above.
+			return false
+		}
+	}
+
+	// All other levels are generally valid for reasoning models
+	return true
+}
+
+// SuggestThinkingLevelFallback returns a recommended fallback level when the
+// requested level is not valid for the model. Returns ThinkingOff if no
+// suitable fallback exists.
+func SuggestThinkingLevelFallback(level ThinkingLevel, modelName string) ThinkingLevel {
+	if level == ThinkingMinimal && !IsValidThinkingLevelForModel(level, modelName) {
+		// For models that don't support "minimal", suggest "none" (~same token budget)
+		return ThinkingNone
+	}
+	if level == ThinkingNone && !IsValidThinkingLevelForModel(level, modelName) {
+		// For models that don't support "none", suggest "minimal" (~same token budget)
+		return ThinkingMinimal
+	}
+	return ThinkingOff
+}
+
 // buildAnthropicProviderOptions returns fantasy.ProviderOptions configured for
 // Anthropic models with extended thinking. When thinking is enabled, it sets
 // SendReasoning to true and configures the thinking budget. For thinking-off
@@ -1128,7 +1235,7 @@ func loadOllamaModelWithFallback(ctx context.Context, baseURL, modelName string,
 	// Phase 1: Check if model exists locally
 	if err := checkOllamaModelExists(client, baseURL, modelName); err != nil {
 		// Phase 2: Pull model if not found
-		if err := pullOllamaModel(ctx, client, baseURL, modelName); err != nil {
+		if err := pullOllamaModel(ctx, client, baseURL, modelName, config.ProgressReaderFunc); err != nil {
 			return nil, fmt.Errorf("failed to pull model %s: %v", modelName, err)
 		}
 	}
@@ -1217,11 +1324,7 @@ func checkOllamaModelExists(client *http.Client, baseURL, modelName string) erro
 	return nil
 }

-func pullOllamaModel(ctx context.Context, client *http.Client, baseURL, modelName string) error {
-	return pullOllamaModelWithProgress(ctx, client, baseURL, modelName, true)
-}
-
-func pullOllamaModelWithProgress(ctx context.Context, client *http.Client, baseURL, modelName string, showProgress bool) error {
+func pullOllamaModel(ctx context.Context, client *http.Client, baseURL, modelName string, progressFn func(io.Reader) io.ReadCloser) error {
 	reqBody := map[string]string{"name": modelName}
 	jsonBody, _ := json.Marshal(reqBody)

@@ -1245,10 +1348,10 @@ func pullOllamaModelWithProgress(ctx context.Context, client *http.Client, baseU
 		return fmt.Errorf("failed to pull model (status %d): %s", resp.StatusCode, string(body))
 	}

-	if showProgress {
-		progressReader := progress.NewProgressReader(resp.Body)
-		defer func() { _ = progressReader.Close() }()
-		_, err = io.ReadAll(progressReader)
+	if progressFn != nil {
+		pr := progressFn(resp.Body)
+		defer func() { _ = pr.Close() }()
+		_, err = io.ReadAll(pr)
 	} else {
 		_, err = io.ReadAll(resp.Body)
 	}
@@ -379,11 +379,6 @@ func (r *ModelsRegistry) GetLLMProviders() []string {
 	return providers
 }

-// Deprecated: Use GetLLMProviders instead.
-func (r *ModelsRegistry) GetFantasyProviders() []string {
-	return r.GetLLMProviders()
-}
-
 // isProviderLLMSupported checks if a provider can be used with the LLM layer.
 func isProviderLLMSupported(providerID string, info *ProviderInfo) bool {
 	// Ollama and custom are always supported (model names are user-defined).
@@ -0,0 +1,148 @@
+package models
+
+import (
+	"testing"
+
+	"github.com/spf13/pflag"
+	"github.com/spf13/viper"
+)
+
+// bindMaxTokensFlag wires a fresh pflag-backed "max-tokens" key into viper so
+// isExplicitlySet behaves the same way it does in production. Returns a
+// cleanup function that removes the binding so sibling tests see a clean
+// state.
+func bindMaxTokensFlag(t *testing.T, args []string) func() {
+	t.Helper()
+	fs := pflag.NewFlagSet("test", pflag.ContinueOnError)
+	fs.Int("max-tokens", 8192, "")
+	if err := viper.BindPFlag("max-tokens", fs.Lookup("max-tokens")); err != nil {
+		t.Fatalf("BindPFlag: %v", err)
+	}
+	if err := fs.Parse(args); err != nil {
+		t.Fatalf("fs.Parse: %v", err)
+	}
+	return func() {
+		viper.Reset()
+	}
+}
+
+func TestRightSizeMaxTokens_RaisesWhenBelowCeiling(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil) // no args → flag.Changed = false
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 8192}
+	modelInfo := &ModelInfo{
+		ID:    "claude-sonnet-4-5",
+		Limit: Limit{Context: 200000, Output: 64000},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != 32768 {
+		t.Errorf("expected MaxTokens raised to defaultRightSizeCap (32768), got %d", config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_CapsAtDefaultRightSizeCap(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil)
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 8192}
+	// Mistral Devstral has 262144 output — we should still cap at 32768.
+	modelInfo := &ModelInfo{
+		ID:    "devstral-medium-latest",
+		Limit: Limit{Context: 262144, Output: 262144},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != defaultRightSizeCap {
+		t.Errorf("expected MaxTokens capped at %d, got %d", defaultRightSizeCap, config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_UsesExactOutputWhenBelowCap(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil)
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 4096}
+	// Model with output limit smaller than the cap.
+	modelInfo := &ModelInfo{
+		ID:    "gpt-4",
+		Limit: Limit{Context: 8192, Output: 8192},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != 8192 {
+		t.Errorf("expected MaxTokens raised to model output ceiling (8192), got %d", config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_DoesNotLowerCurrentValue(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil)
+	defer cleanup()
+
+	// User (via per-model settings, applied earlier) already bumped MaxTokens
+	// above the cap — we must not clobber their choice.
+	config := &ProviderConfig{MaxTokens: 100000}
+	modelInfo := &ModelInfo{
+		ID:    "devstral-medium-latest",
+		Limit: Limit{Context: 262144, Output: 262144},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != 100000 {
+		t.Errorf("expected MaxTokens preserved at 100000, got %d", config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_RespectsExplicitFlag(t *testing.T) {
+	// Simulate `--max-tokens 4096` on the command line.
+	cleanup := bindMaxTokensFlag(t, []string{"--max-tokens", "4096"})
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 4096}
+	modelInfo := &ModelInfo{
+		ID:    "claude-sonnet-4-5",
+		Limit: Limit{Context: 200000, Output: 64000},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != 4096 {
+		t.Errorf("expected explicit --max-tokens to be preserved (4096), got %d", config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_NilModelInfo(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil)
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 8192}
+	// Custom model / Ollama / unknown provider → no model info.
+	rightSizeMaxTokens(config, nil)
+
+	if config.MaxTokens != 8192 {
+		t.Errorf("expected MaxTokens unchanged with nil modelInfo, got %d", config.MaxTokens)
+	}
+}
+
+func TestRightSizeMaxTokens_ZeroOutputLimit(t *testing.T) {
+	cleanup := bindMaxTokensFlag(t, nil)
+	defer cleanup()
+
+	config := &ProviderConfig{MaxTokens: 8192}
+	// Model present in catalog but with no known output limit.
+	modelInfo := &ModelInfo{
+		ID:    "unknown-model",
+		Limit: Limit{Context: 0, Output: 0},
+	}
+
+	rightSizeMaxTokens(config, modelInfo)
+
+	if config.MaxTokens != 8192 {
+		t.Errorf("expected MaxTokens unchanged with zero output limit, got %d", config.MaxTokens)
+	}
+}
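
Review note on the tests above: the right-sizing rule can be restated as a pure function, which makes the arithmetic easy to check by hand. This is a sketch, assuming a hypothetical `rightSize` helper that takes the "explicitly set" decision as a boolean instead of calling `isExplicitlySet`. The constant mirrors `defaultRightSizeCap` from the diff, and the inputs are the ones used in the test cases.

```go
package main

import "fmt"

// rightSizeCap mirrors defaultRightSizeCap from the diff.
const rightSizeCap = 32768

// rightSize is a hypothetical pure-function restatement of rightSizeMaxTokens.
// current is the configured MaxTokens, outputLimit is the model's known output
// ceiling (0 or less when unknown), and explicit reports whether the user set
// max-tokens themselves.
func rightSize(current, outputLimit int, explicit bool) int {
	if outputLimit <= 0 || explicit {
		return current // nothing known to raise toward, or the user decided
	}
	target := outputLimit
	if target > rightSizeCap {
		target = rightSizeCap
	}
	if current < target {
		return target
	}
	return current // never lower an existing value
}

func main() {
	fmt.Println(rightSize(8192, 64000, false))    // 32768: raised, then capped
	fmt.Println(rightSize(4096, 8192, false))     // 8192: model ceiling is below the cap
	fmt.Println(rightSize(100000, 262144, false)) // 100000: already above the cap
	fmt.Println(rightSize(4096, 64000, true))     // 4096: explicit setting wins
	fmt.Println(rightSize(8192, 0, false))        // 8192: unknown output limit
}
```

One consequence worth noting: a value below the target is raised whatever its origin, so a per-model override smaller than the target would also be raised unless it is covered by the explicit-setting check.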
@@ -0,0 +1,66 @@
+package session
+
+import (
+	"testing"
+
+	"github.com/mark3labs/kit/internal/message"
+)
+
+// TestCompactionParentCycleRegression tests that after multiple compactions,
+// newly appended messages always have a valid parent chain and BuildContext
+// returns the correct messages.
+func TestCompactionParentCycleRegression(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Simulate a long conversation with multiple compactions.
+	msg1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
+	msg2, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
+
+	// First compaction
+	comp1, _ := tm.AppendCompaction("Summary 1", msg1, 1000, 500, 1, []string{}, []string{})
+
+	msg3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
+	msg4, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg4"}}})
+
+	// Second compaction
+	comp2, _ := tm.AppendCompaction("Summary 2", msg3, 1000, 500, 1, []string{}, []string{})
+
+	msg5, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg5"}}})
+	msg6, _ := tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg6"}}})
+
+	// Verify parent chain integrity
+	for _, id := range []string{msg1, msg2, comp1, msg3, msg4, comp2, msg5, msg6} {
+		entry := tm.GetEntry(id)
+		if entry == nil {
+			t.Fatalf("entry %s not found in index", id)
+		}
+	}
+
+	// Walk parent chain from msg6 — must reach root without cycles
+	visited := make(map[string]bool)
+	current := msg6
+	for current != "" {
+		if visited[current] {
+			t.Fatalf("cycle detected at entry %s", current)
+		}
+		visited[current] = true
+		entry := tm.GetEntry(current)
+		if entry == nil {
+			t.Fatalf("entry %s missing from index during parent walk", current)
+		}
+		parent := ""
+		switch e := entry.(type) {
+		case *MessageEntry:
+			parent = e.ParentID
+		case *CompactionEntry:
+			parent = e.ParentID
+		}
+		current = parent
+	}
+
+	// BuildContext should return 5 messages: Summary 2 plus msg3, msg4, msg5 and msg6
+	msgs, _, _ := tm.BuildContext()
+	if len(msgs) != 5 {
+		t.Fatalf("expected 5 messages, got %d: %+v", len(msgs), msgs)
+	}
+}
@@ -0,0 +1,109 @@
+package session
+
+import (
+	"testing"
+
+	"github.com/mark3labs/kit/internal/message"
+)
+
+// TestDetectCycleWithCorruptedParentChain tests that cycle detection works
+// when a corrupted session has circular parent references.
+func TestDetectCycleWithCorruptedParentChain(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Create normal chain: msg1 -> msg2 -> msg3
+	id1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
+	_, _ = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
+	id3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
+
+	// Simulate corruption: manually set msg1's parent to msg3, creating cycle
+	// This simulates the condition seen in the user's session
+	for _, entry := range tm.entries {
+		if e, ok := entry.(*MessageEntry); ok && e.ID == id1 {
+			e.ParentID = id3 // Create cycle: msg1 -> msg3 -> ... -> msg1
+			break
+		}
+	}
+
+	// DetectCycle should find the cycle.
+	// The parent links form the loop id3 -> id2 -> id1 -> id3, so a walk
+	// that starts at id3 revisits id3 first.
+	cycle, entry := tm.DetectCycle(id3)
+	if !cycle {
+		t.Fatal("expected to detect cycle, but none found")
+	}
+	// The cycle entry could be id1 or id3 depending on where we start
+	if entry != id1 && entry != id3 {
+		t.Fatalf("expected cycle at %s or %s, got %s", id1, id3, entry)
+	}
+
+	// BuildContext should still work (it has its own cycle detection)
+	// but will truncate at the cycle point
+	msgs, _, _ := tm.BuildContext()
+	if len(msgs) == 0 {
+		t.Fatal("BuildContext returned no messages")
+	}
+}
+
+// TestAppendMessageRejectsInvalidParent tests that AppendMessage rejects
+// an append when the current leaf ID does not exist in the index.
+func TestAppendMessageRejectsInvalidParent(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Create normal message
+	id1, err := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
+	if err != nil {
+		t.Fatalf("failed to append msg1: %v", err)
+	}
+
+	// Simulate corruption: set leafID to a non-existent ID
+	tm.leafID = "non-existent-id"
+
+	// Next append should fail validation
+	_, err = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
+	if err == nil {
+		t.Fatal("expected error when appending with invalid leafID, got nil")
+	}
+
+	// Restore valid leafID
+	tm.leafID = id1
+
+	// Append should succeed now
+	_, err = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
+	if err != nil {
+		t.Fatalf("failed to append msg3 after restoring leafID: %v", err)
+	}
+}
+
+// TestBuildContextHandlesCycleGracefully tests that BuildContext handles
+// cycles gracefully by truncating the branch.
+func TestBuildContextHandlesCycleGracefully(t *testing.T) {
+	tm := InMemoryTreeSession("/test")
+
+	// Create messages
+	id1, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg1"}}})
+	_, _ = tm.AppendMessage(message.Message{Role: message.RoleAssistant, Parts: []message.ContentPart{message.TextContent{Text: "msg2"}}})
+	id3, _ := tm.AppendMessage(message.Message{Role: message.RoleUser, Parts: []message.ContentPart{message.TextContent{Text: "msg3"}}})
+
+	// Verify normal case works
+	msgs, _, _ := tm.BuildContext()
+	if len(msgs) != 3 {
+		t.Fatalf("expected 3 messages, got %d", len(msgs))
+	}
+
+	// Simulate cycle: set msg1's parent to msg3
+	for _, entry := range tm.entries {
+		if e, ok := entry.(*MessageEntry); ok && e.ID == id1 {
+			e.ParentID = id3
+			break
+		}
+	}
+
+	// BuildContext should handle cycle gracefully (getBranchLocked has cycle detection)
+	msgs, _, _ = tm.BuildContext()
+	// Should only include messages from the cycle: msg3, msg2, msg1
+	// (msg3 is leaf, walks to msg2 -> msg1 -> msg3 (cycle detected, stops))
+	if len(msgs) != 3 {
+		t.Fatalf("expected 3 messages in cycle case, got %d: %+v", len(msgs), msgs)
+	}
+}
@@ -63,6 +63,11 @@ type TreeManager struct {

 	// file is the open file handle for appending entries. Nil for in-memory.
 	file *os.File
+
+	// writer is a buffered writer wrapping file. Writes go through this
+	// buffer and are flushed to the file at explicit sync points (after each
+	// public Append* call, in Close, etc.) to reduce syscall overhead.
+	writer *bufio.Writer
 }

 // --- Constructors ---
@@ -105,11 +110,16 @@ func CreateTreeSession(cwd string) (*TreeManager, error) {
 		return nil, fmt.Errorf("failed to create session file: %w", err)
 	}
 	tm.file = f
+	tm.writer = bufio.NewWriter(f)

 	if err := tm.writeEntry(&header); err != nil {
 		_ = f.Close()
 		return nil, fmt.Errorf("failed to write session header: %w", err)
 	}
+	if err := tm.flushLocked(); err != nil {
+		_ = f.Close()
+		return nil, fmt.Errorf("failed to flush session header: %w", err)
+	}

 	return tm, nil
 }
@@ -150,6 +160,7 @@ func (tm *TreeManager) ForkToNewSession(cwd string, targetID string) (*TreeManag
 		return nil, fmt.Errorf("failed to recreate session file: %w", err)
 	}
 	newTm.file = f
+	newTm.writer = bufio.NewWriter(f)

 	if err := newTm.writeEntry(&newTm.header); err != nil {
 		_ = f.Close()
@@ -289,6 +300,12 @@ func (tm *TreeManager) ForkToNewSession(cwd string, targetID string) (*TreeManag
 		}
 	}

+	// Flush any writes from the fork that are still buffered.
+	if err := newTm.flushLocked(); err != nil {
+		_ = f.Close()
+		return nil, fmt.Errorf("failed to flush forked session: %w", err)
+	}
+
 	// Set the leaf to the last entry in the new session.
 	newTm.leafID = prevNewID

@@ -365,12 +382,16 @@ func OpenTreeSession(path string) (*TreeManager, error) {
 		tm.leafID = tm.EntryID(tm.entries[len(tm.entries)-1])
 	}

+	// Validate tree integrity and log diagnostics
+	tm.LogTreeDiagnostics()
+
 	// Open file for appending.
 	f, err := os.OpenFile(path, os.O_WRONLY|os.O_APPEND, 0644)
 	if err != nil {
 		return nil, fmt.Errorf("failed to open session file for append: %w", err)
 	}
 	tm.file = f
+	tm.writer = bufio.NewWriter(f)

 	return tm, nil
 }
@@ -410,6 +431,12 @@ func (tm *TreeManager) AppendMessage(msg message.Message) (string, error) {
 	tm.mu.Lock()
 	defer tm.mu.Unlock()

+	// Validate parent chain before appending to detect/prevent cycles
+	// that could be caused by external file corruption or race conditions.
+	if err := tm.validateParentChainLocked(tm.leafID, ""); err != nil {
+		return "", fmt.Errorf("parent chain validation failed: %w", err)
+	}
+
 	entry, err := NewMessageEntry(tm.leafID, msg)
 	if err != nil {
 		return "", err
@@ -418,6 +445,9 @@ func (tm *TreeManager) AppendMessage(msg message.Message) (string, error) {
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush message: %w", err)
+	}

 	tm.leafID = entry.ID
 	return entry.ID, nil
@@ -442,6 +472,9 @@ func (tm *TreeManager) AppendModelChange(provider, modelID string) (string, erro
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush model change: %w", err)
+	}

 	tm.leafID = entry.ID
 	return entry.ID, nil
@@ -456,6 +489,9 @@ func (tm *TreeManager) AppendBranchSummary(fromID, summary string) (string, erro
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush branch summary: %w", err)
+	}

 	tm.leafID = entry.ID
 	return entry.ID, nil
@@ -470,6 +506,9 @@ func (tm *TreeManager) AppendLabel(targetID, label string) (string, error) {
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush label: %w", err)
+	}

 	tm.labels[targetID] = label
 	tm.leafID = entry.ID
@@ -485,6 +524,9 @@ func (tm *TreeManager) AppendSessionInfo(name string) (string, error) {
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush session info: %w", err)
+	}

 	tm.sessionName = name
 	tm.leafID = entry.ID
@@ -501,6 +543,9 @@ func (tm *TreeManager) AppendExtensionData(extType, data string) (string, error)
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush extension data: %w", err)
+	}

 	tm.leafID = entry.ID
 	return entry.ID, nil
@@ -518,6 +563,13 @@ func (tm *TreeManager) AppendCompaction(summary, firstKeptEntryID string, tokens
 	tm.mu.Lock()
 	defer tm.mu.Unlock()

+	// Validate that firstKeptEntryID exists if provided
+	if firstKeptEntryID != "" {
+		if _, ok := tm.index[firstKeptEntryID]; !ok {
+			return "", fmt.Errorf("first kept entry %q does not exist", firstKeptEntryID)
+		}
+	}
+
 	// The compaction entry has no parent, making it a new "root" for the
 	// post-compaction branch. This ensures old compacted messages are not
 	// traversed when walking from the current leaf.
@@ -525,6 +577,9 @@ func (tm *TreeManager) AppendCompaction(summary, firstKeptEntryID string, tokens
 	if err := tm.appendAndPersist(entry); err != nil {
 		return "", err
 	}
+	if err := tm.flushLocked(); err != nil {
+		return "", fmt.Errorf("failed to flush compaction: %w", err)
+	}

 	tm.leafID = entry.ID
 	return entry.ID, nil
@@ -910,11 +965,31 @@ func (tm *TreeManager) IsEmpty() bool {
 	return tm.MessageCount() == 0
 }

-// Close closes the underlying file handle.
+// Flush writes any buffered data to the underlying file.
+func (tm *TreeManager) Flush() error {
+	tm.mu.Lock()
+	defer tm.mu.Unlock()
+	return tm.flushLocked()
+}
+
+// flushLocked writes buffered data to the underlying file. Caller must hold the lock.
+func (tm *TreeManager) flushLocked() error {
+	if tm.writer != nil {
+		return tm.writer.Flush()
+	}
+	return nil
+}
+
+// Close flushes any buffered writes and closes the underlying file handle.
 func (tm *TreeManager) Close() error {
 	tm.mu.Lock()
 	defer tm.mu.Unlock()
 	if tm.file != nil {
+		// Flush buffered data before closing. A flush error is discarded here.
+		if tm.writer != nil {
+			_ = tm.writer.Flush()
+			tm.writer = nil
+		}
 		err := tm.file.Close()
 		tm.file = nil
 		return err
@@ -1074,13 +1149,22 @@ func (tm *TreeManager) GetLastCompaction() *CompactionEntry {

 // AddLLMMessages appends multiple LLM messages as entries. This is
 // used when syncing from the agent's ConversationMessages after a step.
+// All entries go through the write buffer, with one explicit flush at the end.
 func (tm *TreeManager) AddLLMMessages(msgs []fantasy.Message) error {
+	tm.mu.Lock()
+	defer tm.mu.Unlock()
+
 	for _, msg := range msgs {
-		if _, err := tm.AppendLLMMessage(msg); err != nil {
+		entry, err := NewMessageEntry(tm.leafID, message.FromLLMMessage(msg))
+		if err != nil {
 			return err
 		}
+		if err := tm.appendAndPersist(entry); err != nil {
+			return err
+		}
+		tm.leafID = entry.ID
 	}
-	return nil
+	return tm.flushLocked()
 }

 // Deprecated: Use AddLLMMessages instead.
@@ -1132,12 +1216,20 @@ func (tm *TreeManager) appendAndPersist(entry any) error {
 	return nil
 }

-// writeEntry serializes an entry and appends it as a line to the file.
+// writeEntry serializes an entry and appends it to the buffered writer.
+// The data may stay in the buffer until flushLocked is called.
 func (tm *TreeManager) writeEntry(entry any) error {
 	data, err := json.Marshal(entry)
 	if err != nil {
 		return fmt.Errorf("failed to marshal entry: %w", err)
 	}
+	if tm.writer != nil {
+		if _, err := tm.writer.Write(data); err != nil {
+			return err
+		}
+		return tm.writer.WriteByte('\n')
+	}
+	// Fallback for direct file writes (shouldn't happen in normal flow).
 	data = append(data, '\n')
 	_, err = tm.file.Write(data)
 	return err
@@ -1213,12 +1305,32 @@ func (tm *TreeManager) getBranchLocked(fromID string) []any {
 }

 // buildTreeNode recursively builds a TreeNode from an entry ID.
+// It tracks visited IDs and enforces a depth limit to prevent infinite
+// recursion in case of corrupted parent-child relationships.
 func (tm *TreeManager) buildTreeNode(id string) *TreeNode {
+	return tm.buildTreeNodeDepth(id, 0, make(map[string]bool))
+}
+
+// buildTreeNodeDepth is the internal implementation with depth and cycle tracking.
+func (tm *TreeManager) buildTreeNodeDepth(id string, depth int, visited map[string]bool) *TreeNode {
+	const maxDepth = 1000
+	if depth > maxDepth {
+		// Cycle or extremely deep tree detected, stop recursing
+		return nil
+	}
+	if visited[id] {
+		// Cycle detected, stop recursing
+		return nil
+	}
+
 	entry, ok := tm.index[id]
 	if !ok {
 		return nil
 	}

+	visited[id] = true
+	defer delete(visited, id)
+
 	node := &TreeNode{
 		Entry:    entry,
 		ID:       id,
@@ -1226,7 +1338,7 @@ func (tm *TreeManager) buildTreeNode(id string) *TreeNode {
 	}

 	for _, childID := range tm.childIndex[id] {
-		child := tm.buildTreeNode(childID)
+		child := tm.buildTreeNodeDepth(childID, depth+1, visited)
 		if child != nil {
 			node.Children = append(node.Children, child)
 		}
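The buffering change above can be illustrated with a minimal, standalone sketch. It assumes nothing about TreeManager beyond what the diff shows: entryLog, newEntryLog and pendingBytes are hypothetical names, and a bytes.Buffer stands in for the session file so the effect of Flush can be observed directly. Only bufio, bytes, encoding/json and fmt from the standard library are used.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// entryLog is a hypothetical stand-in for the file/writer pair on TreeManager.
type entryLog struct {
	sink   *bytes.Buffer // plays the role of the *os.File
	writer *bufio.Writer // buffers writes before they reach the sink
}

func newEntryLog() *entryLog {
	sink := &bytes.Buffer{}
	return &entryLog{sink: sink, writer: bufio.NewWriter(sink)}
}

// writeEntry marshals v and appends it to the buffer as one JSON line.
// Nothing is guaranteed to reach the sink until Flush is called.
func (l *entryLog) writeEntry(v any) error {
	data, err := json.Marshal(v)
	if err != nil {
		return fmt.Errorf("failed to marshal entry: %w", err)
	}
	if _, err := l.writer.Write(data); err != nil {
		return err
	}
	return l.writer.WriteByte('\n')
}

// pendingBytes reports how many bytes are buffered but not yet in the sink.
func (l *entryLog) pendingBytes() int {
	return l.writer.Buffered()
}

func main() {
	l := newEntryLog()
	for _, id := range []string{"a", "b"} {
		if err := l.writeEntry(map[string]string{"id": id}); err != nil {
			fmt.Println("write failed:", err)
			return
		}
	}
	fmt.Println("before flush: sink bytes =", l.sink.Len(), "buffered =", l.pendingBytes())

	// One explicit flush pushes both entries to the sink.
	if err := l.writer.Flush(); err != nil {
		fmt.Println("flush failed:", err)
		return
	}
	fmt.Println("after flush: sink bytes =", l.sink.Len(), "buffered =", l.pendingBytes())
}
```

Note that bufio.Writer also writes through on its own once its internal buffer fills, so an explicit flush bounds how long data can stay in memory rather than guaranteeing a single write.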
@@ -0,0 +1,143 @@
+package session
+
+import (
+	"fmt"
+	"log"
+)
+
+// ValidateParentChain checks that the parent ID points to an existing entry
+// and that appending this entry would not create a cycle. This should be called
+// before appending any entry to the tree.
+// Returns an error if the parent is invalid or would create a cycle.
+func (tm *TreeManager) ValidateParentChain(parentID string, newEntryID string) error {
+	if parentID == "" {
+		// Empty parent is valid (root entry)
+		return nil
+	}
+
+	// Check that parent exists
+	if _, ok := tm.index[parentID]; !ok {
+		return fmt.Errorf("parent entry %q does not exist in index", parentID)
+	}
+
+	// Walk up the parent chain from parentID. A repeated ID means the chain
+	// already contains a cycle; reaching newEntryID means the new entry would
+	// become its own ancestor. The second case should not occur for a fresh
+	// ID, because parentID must already exist in the index.
+	visited := make(map[string]bool)
+	current := parentID
+	for current != "" {
+		if visited[current] {
+			return fmt.Errorf("existing cycle detected at entry %q", current)
+		}
+		visited[current] = true
+
+		// Safety check: if somehow we reach the new entry ID, that's a cycle
+		if current == newEntryID {
+			return fmt.Errorf("would create cycle: entry %q cannot be its own ancestor", newEntryID)
+		}
+
+		entry, ok := tm.index[current]
+		if !ok {
+			return fmt.Errorf("broken parent chain: entry %q not found", current)
+		}
+		current = tm.entryParentID(entry)
+	}
+
+	return nil
+}
+
+// DetectCycle walks the parent chain from the given entry ID and reports whether
+// a cycle exists, along with the first entry visited twice. Used for diagnostics.
+func (tm *TreeManager) DetectCycle(fromID string) (cycleDetected bool, cycleEntry string) {
+	visited := make(map[string]bool)
+	current := fromID
+	for current != "" {
+		if visited[current] {
+			return true, current
+		}
+		visited[current] = true
+		entry, ok := tm.index[current]
+		if !ok {
+			return false, ""
+		}
+		current = tm.entryParentID(entry)
+	}
+	return false, ""
+}
+
+// LogTreeDiagnostics logs information about the tree structure for debugging.
+// Call this after OpenTreeSession or when anomalies are detected.
+func (tm *TreeManager) LogTreeDiagnostics() {
+	tm.mu.RLock()
+	defer tm.mu.RUnlock()
+
+	log.Printf("[TreeManager] Entry count: %d, Leaf ID: %s", len(tm.entries), tm.leafID)
+
+	// Check for cycles from leaf
+	if tm.leafID != "" {
+		if cycle, entry := tm.detectCycleLocked(tm.leafID); cycle {
+			log.Printf("[TreeManager] WARNING: Cycle detected in tree at entry %s", entry)
+		}
+	}
+
+	// Count entries by type
+	counts := make(map[EntryType]int)
+	for _, entry := range tm.entries {
+		var et EntryType
+		switch e := entry.(type) {
+		case *MessageEntry:
+			et = e.Type
+		case *ModelChangeEntry:
+			et = e.Type
+		case *BranchSummaryEntry:
+			et = e.Type
+		case *LabelEntry:
+			et = e.Type
+		case *SessionInfoEntry:
+			et = e.Type
+		case *ExtensionDataEntry:
+			et = e.Type
+		case *CompactionEntry:
+			et = e.Type
+		default:
+			et = "unknown"
+		}
+		counts[et]++
+	}
+	log.Printf("[TreeManager] Entry types: %+v", counts)
+}
+
+// detectCycleLocked is the internal version of DetectCycle. Caller must hold tm.mu (read or write).
+func (tm *TreeManager) detectCycleLocked(fromID string) (bool, string) {
+	visited := make(map[string]bool)
+	current := fromID
+	for current != "" {
+		if visited[current] {
+			return true, current
+		}
+		visited[current] = true
+		entry, ok := tm.index[current]
+		if !ok {
+			return false, ""
+		}
+		current = tm.entryParentID(entry)
+	}
+	return false, ""
+}
+
+// validateParentChainLocked is the internal version used by append methods.
+// Must be called with the write lock held.
+func (tm *TreeManager) validateParentChainLocked(parentID string, newEntryID string) error {
+	if parentID == "" {
+		return nil
+	}
+	if _, ok := tm.index[parentID]; !ok {
+		return fmt.Errorf("parent entry %q does not exist", parentID)
+	}
+	// Check for existing cycles in the parent chain
+	if cycle, entry := tm.detectCycleLocked(parentID); cycle {
+		return fmt.Errorf("existing cycle detected at entry %q in parent chain", entry)
+	}
+	return nil
+}
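The validation logic in this file can be tried out in isolation with a small sketch. It is not the TreeManager code: the parents map, detectCycle and validateParent below are hypothetical simplifications that keep only the ID-to-parent relation, assuming an empty string marks a root entry as in the diff.

```go
package main

import "fmt"

// parents maps an entry ID to its parent ID; "" marks a root entry.
// This is a simplified, hypothetical stand-in for the TreeManager index.
type parents map[string]string

// detectCycle walks parent links from fromID and returns (true, id) for the
// first ID seen twice. A missing entry or a root ends the walk without a cycle.
func detectCycle(p parents, fromID string) (bool, string) {
	visited := make(map[string]bool)
	current := fromID
	for current != "" {
		if visited[current] {
			return true, current
		}
		visited[current] = true
		parent, ok := p[current]
		if !ok {
			return false, ""
		}
		current = parent
	}
	return false, ""
}

// validateParent mirrors the pre-append check: the parent must exist and its
// ancestor chain must be free of cycles. An empty parent is always valid.
func validateParent(p parents, parentID string) error {
	if parentID == "" {
		return nil
	}
	if _, ok := p[parentID]; !ok {
		return fmt.Errorf("parent entry %q does not exist", parentID)
	}
	if cycle, at := detectCycle(p, parentID); cycle {
		return fmt.Errorf("existing cycle detected at entry %q in parent chain", at)
	}
	return nil
}

func main() {
	healthy := parents{"msg1": "", "msg2": "msg1", "msg3": "msg2"}
	fmt.Println("healthy chain:", validateParent(healthy, "msg3"))

	// Corrupt the chain the same way the tests do: msg1 now points at msg3.
	corrupted := parents{"msg1": "msg3", "msg2": "msg1", "msg3": "msg2"}
	fmt.Println("corrupted chain:", validateParent(corrupted, "msg3"))
	fmt.Println("unknown parent:", validateParent(healthy, "non-existent-id"))
}
```

In the corrupted map, a walk that starts at msg3 visits msg2 and msg1 and then returns to msg3, so msg3 is the entry reported as the repeat.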
@@ -8,11 +8,11 @@ import (
 	"sync"
 	"time"

-	"charm.land/fantasy"
 	"github.com/mark3labs/kit/internal/config"
 	"github.com/mark3labs/mcp-go/client"
 	"github.com/mark3labs/mcp-go/client/transport"
 	"github.com/mark3labs/mcp-go/mcp"
+	"github.com/mark3labs/mcp-go/server"
 )

 // ConnectionPoolConfig defines configuration parameters for the MCP connection pool.
@@ -63,7 +63,6 @@ type MCPConnectionPool struct {
 	connections       map[string]*MCPConnection
 	config            *ConnectionPoolConfig
 	mu                sync.RWMutex
-	model             fantasy.LanguageModel
 	ctx               context.Context
 	cancel            context.CancelFunc
 	debug             bool
@@ -75,9 +74,8 @@ type MCPConnectionPool struct {
 // NewMCPConnectionPool creates a new MCP connection pool with the specified configuration.
 // If config is nil, default configuration values will be used. The pool starts a background
 // goroutine for periodic health checks that runs until Close is called.
-// The model parameter is used for MCP servers that require sampling support.
 // Thread-safe for concurrent use immediately after creation.
-func NewMCPConnectionPool(config *ConnectionPoolConfig, model fantasy.LanguageModel, debug bool, authHandler MCPAuthHandler, tokenStoreFactory TokenStoreFactory) *MCPConnectionPool {
+func NewMCPConnectionPool(config *ConnectionPoolConfig, debug bool, authHandler MCPAuthHandler, tokenStoreFactory TokenStoreFactory) *MCPConnectionPool {
 	if config == nil {
 		config = DefaultConnectionPoolConfig()
 	}
@@ -86,7 +84,6 @@ func NewMCPConnectionPool(config *ConnectionPoolConfig, model fantasy.LanguageMo
 	pool := &MCPConnectionPool{
 		connections:       make(map[string]*MCPConnection),
 		config:            config,
-		model:             model,
 		ctx:               ctx,
 		cancel:            cancel,
 		debug:             debug,
@@ -246,10 +243,12 @@ func (p *MCPConnectionPool) performHealthCheck(ctx context.Context, conn *MCPCon

 // createConnection creates a new connection
 func (p *MCPConnectionPool) createConnection(ctx context.Context, serverName string, serverConfig config.MCPServerConfig) (*MCPConnection, error) {
+	oauthEnabled := p.oauthFlow != nil && !serverConfig.NoOAuth
+
 	mcpClient, err := p.createMCPClient(ctx, serverName, serverConfig)
 	if err != nil {
 		// SSE transport can return OAuth error during Start()
-		if p.oauthFlow != nil && IsOAuthError(err) {
+		if oauthEnabled && IsOAuthError(err) {
 			if flowErr := p.oauthFlow.RunAuthFlow(ctx, serverName, err); flowErr != nil {
 				return nil, fmt.Errorf("OAuth authorization failed: %w", flowErr)
 			}
@@ -265,7 +264,7 @@ func (p *MCPConnectionPool) createConnection(ctx context.Context, serverName str

 	if err := p.initializeClient(ctx, mcpClient); err != nil {
 		// Streamable HTTP transport returns OAuth error during Initialize()
-		if p.oauthFlow != nil && IsOAuthError(err) {
+		if oauthEnabled && IsOAuthError(err) {
 			if flowErr := p.oauthFlow.RunAuthFlow(ctx, serverName, err); flowErr != nil {
 				_ = mcpClient.Close()
 				return nil, fmt.Errorf("OAuth authorization failed: %w", flowErr)
@@ -308,6 +307,8 @@ func (p *MCPConnectionPool) createMCPClient(ctx context.Context, serverName stri
 		return p.createSSEClient(ctx, serverConfig)
 	case "streamable":
 		return p.createStreamableClient(ctx, serverConfig)
+	case "inprocess":
+		return p.createInProcessClient(serverConfig)
 	default:
 		return nil, fmt.Errorf("unsupported transport type '%s' for server %s", transportType, serverName)
 	}
@@ -364,11 +365,11 @@ func (p *MCPConnectionPool) createSSEClient(ctx context.Context, serverConfig co
 		}
 	}

-	// Enable OAuth for remote transports when an auth handler is configured.
-	// The OAuthConfig uses PKCE and the handler's redirect URI. If the server
-	// config provides a pre-registered ClientID (for servers that don't support
-	// dynamic client registration, e.g. GitHub), it is passed through directly.
-	if p.oauthFlow != nil {
+	// Enable OAuth for remote transports when an auth handler is configured
+	// and the server hasn't opted out via NoOAuth. Public MCP servers (e.g.
+	// PubMed) set NoOAuth to skip dynamic client registration and token
+	// exchange, which would otherwise fail with a 404.
+	if p.oauthFlow != nil && !serverConfig.NoOAuth {
 		tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
 		if tsErr != nil {
 			return nil, fmt.Errorf("failed to create token store: %w", tsErr)
@@ -421,11 +422,9 @@ func (p *MCPConnectionPool) createStreamableClient(ctx context.Context, serverCo
 		}
 	}

-	// Enable OAuth for remote transports when an auth handler is configured.
-	// The OAuthConfig uses PKCE and the handler's redirect URI. If the server
-	// config provides a pre-registered ClientID (for servers that don't support
-	// dynamic client registration, e.g. GitHub), it is passed through directly.
-	if p.oauthFlow != nil {
+	// Enable OAuth for remote transports when an auth handler is configured
+	// and the server hasn't opted out via NoOAuth.
+	if p.oauthFlow != nil && !serverConfig.NoOAuth {
 		tokenStore, tsErr := p.createTokenStore(serverConfig.URL)
 		if tsErr != nil {
 			return nil, fmt.Errorf("failed to create token store: %w", tsErr)
@@ -459,6 +458,22 @@ func (p *MCPConnectionPool) createStreamableClient(ctx context.Context, serverCo
 	return streamableClient, nil
 }

+// createInProcessClient creates an in-process MCP client that communicates
+// directly with an *server.MCPServer in the same process. No subprocess is
+// spawned and no network I/O occurs — calls go through JSON marshal →
+// MCPServer.HandleMessage → JSON unmarshal, all in-memory.
+func (p *MCPConnectionPool) createInProcessClient(serverConfig config.MCPServerConfig) (client.MCPClient, error) {
+	srv, ok := serverConfig.InProcessServer.(*server.MCPServer)
+	if !ok {
+		return nil, fmt.Errorf("InProcessServer must be *server.MCPServer, got %T", serverConfig.InProcessServer)
+	}
+	inProcessClient, err := client.NewInProcessClient(srv)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create in-process client: %w", err)
+	}
+	return inProcessClient, nil
+}
+
 // createTokenStore creates a token store for the given server URL.
 // If a custom TokenStoreFactory is configured, it is used; otherwise the
 // default file-backed token store is created.
@@ -1,109 +0,0 @@
-package tools
-
-import (
-	"context"
-	"encoding/json"
-	"fmt"
-
-	"charm.land/fantasy"
-	"github.com/mark3labs/mcp-go/mcp"
-)
-
-// mcpFantasyTool adapts an MCP tool to the fantasy.AgentTool interface.
-// It bridges the MCP tool protocol with fantasy's agent tool system, handling
-// name prefixing, schema conversion, connection pooling, and result marshaling.
-type mcpFantasyTool struct {
-	toolInfo        fantasy.ToolInfo
-	mapping         *toolMapping
-	providerOptions fantasy.ProviderOptions
-}
-
-// Info returns the fantasy tool info including name, description, and parameter schema.
-func (t *mcpFantasyTool) Info() fantasy.ToolInfo {
-	return t.toolInfo
-}
-
-// Run executes the MCP tool by routing through the connection pool.
-// It maps the prefixed tool name back to the original name, retrieves a healthy
-// connection, invokes the tool, and converts the MCP result to a fantasy ToolResponse.
-func (t *mcpFantasyTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
-	// Parse and validate JSON arguments
-	var arguments any
-	input := call.Input
-	if input == "" || input == "{}" {
-		arguments = nil
-	} else {
-		var temp any
-		if err := json.Unmarshal([]byte(input), &temp); err != nil {
-			return fantasy.NewTextErrorResponse(fmt.Sprintf("invalid JSON arguments: %v", err)), nil
-		}
-		arguments = json.RawMessage(input)
-	}
-
-	// Get connection from pool with health check
-	conn, err := t.mapping.manager.connectionPool.GetConnectionWithHealthCheck(
-		ctx, t.mapping.serverName, t.mapping.serverConfig,
-	)
-	if err != nil {
-		return fantasy.ToolResponse{}, fmt.Errorf("failed to get healthy connection from pool: %w", err)
-	}
-
-	// Call the MCP tool using the original (unprefixed) name
-	result, err := conn.client.CallTool(ctx, mcp.CallToolRequest{
-		Request: mcp.Request{
-			Method: "tools/call",
-		},
-		Params: mcp.CallToolParams{
-			Name:      t.mapping.originalName,
-			Arguments: arguments,
-		},
-	})
-	if err != nil {
-		// Handle OAuth re-authorization: token may have expired mid-session.
-		if t.mapping.manager.connectionPool.oauthFlow != nil && IsOAuthError(err) {
-			if flowErr := t.mapping.manager.connectionPool.oauthFlow.RunAuthFlow(ctx, t.mapping.serverName, err); flowErr != nil {
-				return fantasy.ToolResponse{}, fmt.Errorf("OAuth re-authorization failed for tool %s: %w", t.mapping.originalName, flowErr)
-			}
-			// Retry the tool call after successful re-auth.
-			result, err = conn.client.CallTool(ctx, mcp.CallToolRequest{
-				Request: mcp.Request{
-					Method: "tools/call",
-				},
-				Params: mcp.CallToolParams{
-					Name:      t.mapping.originalName,
-					Arguments: arguments,
-				},
-			})
-			if err != nil {
-				t.mapping.manager.connectionPool.HandleConnectionError(t.mapping.serverName, err)
-				return fantasy.ToolResponse{}, fmt.Errorf("failed to call mcp tool after re-auth: %w", err)
-			}
-		} else {
-			// Mark connection as unhealthy for automatic recovery
-			t.mapping.manager.connectionPool.HandleConnectionError(t.mapping.serverName, err)
-			return fantasy.ToolResponse{}, fmt.Errorf("failed to call mcp tool: %w", err)
-		}
-	}
-
-	// Marshal the MCP result to JSON string
-	marshaledResult, err := json.Marshal(result)
-	if err != nil {
-		return fantasy.ToolResponse{}, fmt.Errorf("failed to marshal mcp tool result: %w", err)
-	}
-
-	// Return as text response, preserving error status from MCP
-	if result.IsError {
-		return fantasy.NewTextErrorResponse(string(marshaledResult)), nil
-	}
-	return fantasy.NewTextResponse(string(marshaledResult)), nil
-}
-
-// ProviderOptions returns provider-specific options for this tool.
-func (t *mcpFantasyTool) ProviderOptions() fantasy.ProviderOptions {
-	return t.providerOptions
-}
-
-// SetProviderOptions sets provider-specific options for this tool.
-func (t *mcpFantasyTool) SetProviderOptions(opts fantasy.ProviderOptions) {
-	t.providerOptions = opts
-}
@@ -0,0 +1,244 @@
+package tools
+
+import (
+	"context"
+	"encoding/json"
+	"strings"
+	"testing"
+
+	"github.com/mark3labs/kit/internal/config"
+	"github.com/mark3labs/mcp-go/mcp"
+	"github.com/mark3labs/mcp-go/server"
+)
+
+// newTestInProcessServer creates a simple MCP server with one tool for testing.
+func newTestInProcessServer() *server.MCPServer {
+	srv := server.NewMCPServer("test-server", "1.0.0",
+		server.WithToolCapabilities(true),
+	)
+	srv.AddTool(
+		mcp.NewTool("greet",
+			mcp.WithDescription("Say hello"),
+			mcp.WithString("name", mcp.Required(), mcp.Description("Name to greet")),
+		),
+		func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
+			name, _ := req.GetArguments()["name"].(string)
+			return mcp.NewToolResultText("Hello, " + name + "!"), nil
+		},
+	)
+	return srv
+}
+
+func TestInProcessTransportType(t *testing.T) {
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: newTestInProcessServer(),
+	}
+	if got := cfg.GetTransportType(); got != "inprocess" {
+		t.Errorf("GetTransportType() = %q, want %q", got, "inprocess")
+	}
+}
+
+func TestInProcessTransportTypeInferred(t *testing.T) {
+	// When Type is empty but InProcessServer is set, infer "inprocess".
+	cfg := config.MCPServerConfig{
+		InProcessServer: newTestInProcessServer(),
+	}
+	if got := cfg.GetTransportType(); got != "inprocess" {
+		t.Errorf("GetTransportType() = %q, want %q", got, "inprocess")
+	}
+}
+
+func TestInProcessValidation(t *testing.T) {
+	// Valid: InProcessServer is set.
+	validCfg := &config.Config{
+		MCPServers: map[string]config.MCPServerConfig{
+			"test": {
+				Type:            "inprocess",
+				InProcessServer: newTestInProcessServer(),
+			},
+		},
+	}
+	if err := validCfg.Validate(); err != nil {
+		t.Errorf("expected valid config, got error: %v", err)
+	}
+
+	// Invalid: type is inprocess but InProcessServer is nil.
+	invalidCfg := &config.Config{
+		MCPServers: map[string]config.MCPServerConfig{
+			"test": {
+				Type: "inprocess",
+			},
+		},
+	}
+	if err := invalidCfg.Validate(); err == nil {
+		t.Error("expected validation error for nil InProcessServer, got nil")
+	}
+}
+
+func TestConnectionPoolInProcessClient(t *testing.T) {
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	defer func() { _ = pool.Close() }()
+
+	ctx := context.Background()
+	srv := newTestInProcessServer()
+
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: srv,
+	}
+
+	conn, err := pool.GetConnection(ctx, "test-inproc", cfg)
+	if err != nil {
+		t.Fatalf("GetConnection failed: %v", err)
+	}
+
+	// Verify the connection is healthy and functional.
+	if !conn.isHealthy {
+		t.Error("expected connection to be healthy")
+	}
+
+	// List tools to verify the connection works end-to-end.
+	toolsResp, err := conn.client.ListTools(ctx, mcp.ListToolsRequest{})
+	if err != nil {
+		t.Fatalf("ListTools failed: %v", err)
+	}
+	if len(toolsResp.Tools) != 1 {
+		t.Fatalf("expected 1 tool, got %d", len(toolsResp.Tools))
+	}
+	if toolsResp.Tools[0].Name != "greet" {
+		t.Errorf("expected tool name 'greet', got %q", toolsResp.Tools[0].Name)
+	}
+}
+
+func TestConnectionPoolInProcessToolExecution(t *testing.T) {
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	defer func() { _ = pool.Close() }()
+
+	ctx := context.Background()
+	srv := newTestInProcessServer()
+
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: srv,
+	}
+
+	conn, err := pool.GetConnection(ctx, "test-inproc", cfg)
+	if err != nil {
+		t.Fatalf("GetConnection failed: %v", err)
+	}
+
+	// Call the tool.
+	result, err := conn.client.CallTool(ctx, mcp.CallToolRequest{
+		Request: mcp.Request{Method: "tools/call"},
+		Params: mcp.CallToolParams{
+			Name:      "greet",
+			Arguments: map[string]any{"name": "World"},
+		},
+	})
+	if err != nil {
+		t.Fatalf("CallTool failed: %v", err)
+	}
+	if result.IsError {
+		t.Error("expected non-error result")
+	}
+	if len(result.Content) == 0 {
+		t.Fatal("expected at least one content block")
+	}
+	text, ok := result.Content[0].(mcp.TextContent)
+	if !ok {
+		t.Fatalf("expected TextContent, got %T", result.Content[0])
+	}
+	if text.Text != "Hello, World!" {
+		t.Errorf("expected 'Hello, World!', got %q", text.Text)
+	}
+}
+
+func TestMCPToolManagerInProcess(t *testing.T) {
+	ctx := context.Background()
+	srv := newTestInProcessServer()
+
+	mgr := NewMCPToolManager()
+
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: srv,
+	}
+
+	count, err := mgr.AddServer(ctx, "myserver", cfg)
+	if err != nil {
+		t.Fatalf("AddServer failed: %v", err)
+	}
+	if count != 1 {
+		t.Errorf("expected 1 tool, got %d", count)
+	}
+
+	tools := mgr.GetTools()
+	if len(tools) != 1 {
+		t.Fatalf("expected 1 tool, got %d", len(tools))
+	}
+	if tools[0].Name != "myserver__greet" {
+		t.Errorf("expected tool name 'myserver__greet', got %q", tools[0].Name)
+	}
+
+	// Execute the tool.
+	input, _ := json.Marshal(map[string]any{"name": "SDK"})
+	result, err := mgr.ExecuteTool(ctx, "myserver__greet", string(input))
+	if err != nil {
+		t.Fatalf("ExecuteTool failed: %v", err)
+	}
+	if result.IsError {
+		t.Error("expected non-error result")
+	}
+	if result.Content == "" {
+		t.Error("expected non-empty result content")
+	}
+
+	// Verify result contains our greeting.
+	if !strings.Contains(result.Content, "Hello, SDK!") {
+		t.Errorf("expected 'Hello, SDK!' in result, got %q", result.Content)
+	}
+}
+
+func TestConnectionPoolInProcessInvalidServer(t *testing.T) {
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	defer func() { _ = pool.Close() }()
+
+	ctx := context.Background()
+
+	// Pass a non-*server.MCPServer value.
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: "not a server",
+	}
+
+	_, err := pool.GetConnection(ctx, "bad", cfg)
+	if err == nil {
+		t.Fatal("expected error for invalid InProcessServer type")
+	}
+}
+
+func TestConnectionPoolInProcessReuse(t *testing.T) {
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	defer func() { _ = pool.Close() }()
+
+	ctx := context.Background()
+	srv := newTestInProcessServer()
+	cfg := config.MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: srv,
+	}
+
+	// Get connection twice — should reuse.
+	conn1, err := pool.GetConnection(ctx, "reuse-test", cfg)
+	if err != nil {
+		t.Fatalf("first GetConnection failed: %v", err)
+	}
+	conn2, err := pool.GetConnection(ctx, "reuse-test", cfg)
+	if err != nil {
+		t.Fatalf("second GetConnection failed: %v", err)
+	}
+	if conn1 != conn2 {
+		t.Error("expected same connection object on reuse")
+	}
+}
@@ -2,6 +2,7 @@ package tools

 import (
 	"context"
+	"encoding/base64"
 	"encoding/json"
 	"fmt"
 	"maps"
@@ -9,22 +10,131 @@ import (
 	"strings"
 	"sync"

-	"charm.land/fantasy"
+	log "github.com/charmbracelet/log"
+
 	"github.com/mark3labs/kit/internal/config"
 	"github.com/mark3labs/mcp-go/mcp"
 )

+// MCPTool represents a tool discovered from an MCP server. It contains all
+// the metadata needed to present the tool to an LLM (name, description, JSON
+// schema) plus the server origin information needed to execute it.
+type MCPTool struct {
+	// Name is the prefixed tool name: "serverName__toolName".
+	Name string
+	// Description is the human-readable tool description.
+	Description string
+	// Parameters is the JSON Schema properties for the tool's input.
+	Parameters map[string]any
+	// Required lists the required parameter names.
+	Required []string
+	// ServerName is the MCP server this tool belongs to.
+	ServerName string
+	// OriginalName is the unprefixed tool name on the MCP server.
+	OriginalName string
+}
+
+// MCPToolResult is the result of executing an MCP tool via ExecuteTool.
+type MCPToolResult struct {
+	// Content is the JSON-encoded MCP result, or a plain-text message when the arguments fail to parse.
+	Content string
+	// IsError indicates the MCP server reported a tool-level error.
+	IsError bool
+}
+
+// MCPPrompt represents a prompt discovered from an MCP server.
+type MCPPrompt struct {
+	// Name is the prompt name on the MCP server.
+	Name string
+	// Description is the human-readable prompt description.
+	Description string
+	// Arguments lists the prompt's expected arguments.
+	Arguments []MCPPromptArgument
+	// ServerName is the MCP server this prompt belongs to.
+	ServerName string
+}
+
+// MCPPromptArgument describes an argument that a prompt template can accept.
+type MCPPromptArgument struct {
+	// Name is the argument name.
+	Name string
+	// Description is a human-readable description.
+	Description string
+	// Required indicates whether this argument must be provided.
+	Required bool
+}
+
+// MCPPromptMessage is a single message returned by a prompt expansion.
+type MCPPromptMessage struct {
+	// Role is "user" or "assistant".
+	Role string
+	// Content is the text content of the message.
+	Content string
+	// FileParts contains binary attachments extracted from embedded resources,
+	// images, or audio content blocks. Empty for text-only messages.
+	FileParts []MCPFilePart
+}
+
+// MCPFilePart represents a binary file attachment extracted from an MCP prompt
+// content block (ImageContent, AudioContent, or EmbeddedResource with blob data).
+type MCPFilePart struct {
+	// Filename is a best-effort name derived from the resource URI or content type.
+	Filename string
+	// Data is the raw binary content (already base64-decoded).
+	Data []byte
+	// MediaType is the MIME type (e.g. "image/png", "audio/wav").
+	MediaType string
+}
+
+// MCPPromptResult is the result of expanding an MCP prompt via GetPrompt.
+type MCPPromptResult struct {
+	// Description is an optional description returned by the server.
+	Description string
+	// Messages contains the expanded prompt messages.
+	Messages []MCPPromptMessage
+}
+
+// MCPResource represents a resource discovered from an MCP server.
+type MCPResource struct {
+	// URI is the unique resource identifier (e.g. "file:///path" or custom scheme).
+	URI string
+	// Name is a human-readable name for the resource.
+	Name string
+	// Description is an optional description of the resource.
+	Description string
+	// MIMEType is the MIME type of the resource, if known.
+	MIMEType string
+	// ServerName is the MCP server this resource belongs to.
+	ServerName string
+}
+
+// MCPResourceContent is the result of reading an MCP resource via ReadResource.
+type MCPResourceContent struct {
+	// URI is the resource URI that was read.
+	URI string
+	// MIMEType is the MIME type of the content.
+	MIMEType string
+	// Text is the text content (non-empty for text resources).
+	Text string
+	// BlobData is the decoded binary content (non-empty for blob resources).
+	BlobData []byte
+	// IsBlob is true when the content is binary (BlobData is set).
+	IsBlob bool
+}
+
 // MCPToolManager manages MCP (Model Context Protocol) tools and clients across multiple servers.
 // It provides a unified interface for loading, managing, and executing tools from various MCP servers,
 // including stdio, SSE, streamable HTTP, and built-in server types. The manager handles connection
-// pooling, health checks, tool name prefixing to avoid conflicts, and sampling support for LLM interactions.
+// pooling, health checks, tool name prefixing to avoid conflicts, and OAuth re-authorization.
 // Thread-safe for concurrent tool invocations.
 type MCPToolManager struct {
 	connectionPool    *MCPConnectionPool
-	tools             []fantasy.AgentTool
+	tools             []MCPTool
 	toolMap           map[string]*toolMapping // maps prefixed tool names to their server and original name
-	mu                sync.Mutex              // protects tools and toolMap during parallel loading
-	model             fantasy.LanguageModel   // LLM model for sampling
+	prompts           []MCPPrompt             // prompts discovered from all servers
+	resources         []MCPResource           // resources discovered from all servers
+	subscriptions     map[string]string       // resource URI → server name for active subscriptions
+	mu                sync.Mutex              // protects tools, toolMap, prompts, resources, and subscriptions
 	authHandler       MCPAuthHandler          // OAuth handler for remote servers (nil = no OAuth)
 	tokenStoreFactory TokenStoreFactory       // factory for creating per-server token stores (nil = default FileTokenStore)
 	config            *config.Config
@@ -36,9 +146,13 @@ type MCPToolManager struct {
 	onServerLoaded func(serverName string, toolCount int, err error)

 	// onToolsChanged, if non-nil, is called after AddServer or RemoveServer
-	// mutates the tool list. The agent layer uses this to trigger a
-	// rebuildFantasyAgent so the LLM sees the updated tools.
+	// mutates the tool list. The agent layer uses this to trigger a rebuild
+	// so the LLM sees the updated tools.
 	onToolsChanged func()
+
+	// onResourcesChanged, if non-nil, is called when a subscribed resource
+	// is updated by the server.
+	onResourcesChanged func()
 }

 // toolMapping stores the mapping between prefixed tool names and their original details
@@ -46,27 +160,19 @@ type toolMapping struct {
 	serverName   string
 	originalName string
 	serverConfig config.MCPServerConfig
-	manager      *MCPToolManager
 }

 // NewMCPToolManager creates a new MCP tool manager instance.
 // Returns an initialized manager with empty tool collections ready to load tools from MCP servers.
-// The manager must be configured with SetModel and LoadTools before use.
+// The manager must be configured with LoadTools before use.
 func NewMCPToolManager() *MCPToolManager {
 	return &MCPToolManager{
-		tools:   make([]fantasy.AgentTool, 0),
-		toolMap: make(map[string]*toolMapping),
+		tools:         make([]MCPTool, 0),
+		toolMap:       make(map[string]*toolMapping),
+		subscriptions: make(map[string]string),
 	}
 }

-// SetModel sets the LLM model for sampling support.
-// The model is used when MCP servers request sampling operations, allowing them to
-// leverage the host's LLM capabilities for text generation tasks.
-// This method should be called before LoadTools if any MCP servers require sampling support.
-func (m *MCPToolManager) SetModel(model fantasy.LanguageModel) {
-	m.model = model
-}
-
 // SetAuthHandler sets the OAuth handler for remote MCP server authentication.
 // When set, remote transports (streamable HTTP, SSE) are configured with OAuth
 // support, enabling automatic authorization flows when servers require authentication.
@@ -109,7 +215,7 @@ func (m *MCPToolManager) SetOnServerLoaded(cb func(serverName string, toolCount

 // SetOnToolsChanged sets the callback that's invoked after AddServer or
 // RemoveServer mutates the tool list. The agent layer uses this to trigger
-// a rebuild of the fantasy agent so the LLM sees the updated tool set.
+// a rebuild so the LLM sees the updated tool set.
 func (m *MCPToolManager) SetOnToolsChanged(cb func()) {
 	m.onToolsChanged = cb
 }
@@ -160,7 +266,7 @@ func (m *MCPToolManager) AddServer(ctx context.Context, name string, cfg config.
 	return count, nil
 }

-// RemoveServer disconnects an MCP server and removes all its tools.
+// RemoveServer disconnects an MCP server and removes its tools, prompts, and resources.
 // After this call the agent will no longer see or be able to call tools from
 // the named server. Returns an error if the server is not loaded.
 func (m *MCPToolManager) RemoveServer(name string) error {
@@ -168,7 +274,7 @@ func (m *MCPToolManager) RemoveServer(name string) error {

 	m.mu.Lock()

-	// Check the server actually has tools loaded.
+	// Check the server actually has tools or prompts loaded.
 	found := false
 	for k := range m.toolMap {
 		if len(k) >= len(prefix) && k[:len(prefix)] == prefix {
@@ -176,15 +282,24 @@ func (m *MCPToolManager) RemoveServer(name string) error {
 			break
 		}
 	}
+	if !found {
+		// Also check prompts — a server might expose only prompts.
+		for _, p := range m.prompts {
+			if p.ServerName == name {
+				found = true
+				break
+			}
+		}
+	}
 	if !found {
 		m.mu.Unlock()
 		return fmt.Errorf("MCP server %q is not loaded", name)
 	}

 	// Remove tools belonging to this server.
-	newTools := make([]fantasy.AgentTool, 0, len(m.tools))
+	newTools := make([]MCPTool, 0, len(m.tools))
 	for _, t := range m.tools {
-		if len(t.Info().Name) < len(prefix) || t.Info().Name[:len(prefix)] != prefix {
+		if len(t.Name) < len(prefix) || t.Name[:len(prefix)] != prefix {
 			newTools = append(newTools, t)
 		}
 	}
@@ -196,6 +311,28 @@ func (m *MCPToolManager) RemoveServer(name string) error {
 			delete(m.toolMap, k)
 		}
 	}
+
+	// Remove prompts belonging to this server.
+	newPrompts := make([]MCPPrompt, 0, len(m.prompts))
+	for _, p := range m.prompts {
+		if p.ServerName != name {
+			newPrompts = append(newPrompts, p)
+		}
+	}
+	m.prompts = newPrompts
+
+	// Remove resources belonging to this server.
+	newResources := make([]MCPResource, 0, len(m.resources))
+	for _, r := range m.resources {
+		if r.ServerName != name {
+			newResources = append(newResources, r)
+		} else {
+			// Clean up any active subscription for this resource.
+			delete(m.subscriptions, r.URI)
+		}
+	}
+	m.resources = newResources
+
 	m.mu.Unlock()

 	// Close the connection in the pool (best-effort).
@@ -223,7 +360,7 @@ func (m *MCPToolManager) ensureConnectionPool() {
 	if m.debugLogger == nil {
 		m.debugLogger = NewSimpleDebugLogger(debug)
 	}
-	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), m.model, debug, m.authHandler, m.tokenStoreFactory)
+	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), debug, m.authHandler, m.tokenStoreFactory)
 	m.connectionPool.SetDebugLogger(m.debugLogger)
 }

@@ -239,7 +376,7 @@ func (m *MCPToolManager) LoadTools(ctx context.Context, cfg *config.Config) erro
 	if m.debugLogger == nil {
 		m.debugLogger = NewSimpleDebugLogger(cfg.Debug)
 	}
-	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), m.model, cfg.Debug, m.authHandler, m.tokenStoreFactory)
+	m.connectionPool = NewMCPConnectionPool(DefaultConnectionPoolConfig(), cfg.Debug, m.authHandler, m.tokenStoreFactory)
 	m.connectionPool.SetDebugLogger(m.debugLogger)

 	// Load all servers in parallel. Each server connection (subprocess
@@ -321,10 +458,10 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 	}

 	// Build tools locally before acquiring the lock.
-	var localTools []fantasy.AgentTool
+	var localTools []MCPTool
 	localMap := make(map[string]*toolMapping)

-	// Convert MCP tools to fantasy AgentTools with prefixed names
+	// Convert MCP tools to MCPTool structs with prefixed names
 	for _, mcpTool := range listResults.Tools {
 		// Filter tools based on allowedTools/excludedTools
 		if len(serverConfig.AllowedTools) > 0 {
@@ -338,7 +475,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 			continue
 		}

-		// Convert MCP InputSchema to map[string]any for fantasy ToolInfo
+		// Convert MCP InputSchema to map[string]any
 		marshaledSchema, err := json.Marshal(mcpTool.InputSchema)
 		if err != nil {
 			return -1, fmt.Errorf("conv mcp tool input schema fail(marshal): %w, tool name: %s", err, mcpTool.Name)
@@ -347,7 +484,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 		// Fix for JSON Schema draft-07 vs draft-04 compatibility
 		marshaledSchema = convertExclusiveBoundsToBoolean(marshaledSchema)

-		// Parse into map[string]any for fantasy's parameters format
+		// Parse into map[string]any
 		var schemaMap map[string]any
 		if err := json.Unmarshal(marshaledSchema, &schemaMap); err != nil {
 			return -1, fmt.Errorf("conv mcp tool input schema fail(unmarshal): %w, tool name: %s", err, mcpTool.Name)
@@ -363,7 +500,7 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,

 		// Fix for issue #89: Ensure object schemas have a properties field.
 		// When schema type is "object" with no properties, we keep the
-		// empty parameters map — fantasy handles this fine.
+		// empty parameters map.

 		if req, ok := schemaMap["required"].([]any); ok {
 			for _, r := range req {
@@ -381,22 +518,18 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 			serverName:   serverName,
 			originalName: mcpTool.Name,
 			serverConfig: serverConfig,
-			manager:      m,
 		}
 		localMap[prefixedName] = mapping

-		// Create fantasy AgentTool
-		fantasyTool := &mcpFantasyTool{
-			toolInfo: fantasy.ToolInfo{
-				Name:        prefixedName,
-				Description: mcpTool.Description,
-				Parameters:  parameters,
-				Required:    required,
-			},
-			mapping: mapping,
-		}
-
-		localTools = append(localTools, fantasyTool)
+		// Create MCPTool
+		localTools = append(localTools, MCPTool{
+			Name:         prefixedName,
+			Description:  mcpTool.Description,
+			Parameters:   parameters,
+			Required:     required,
+			ServerName:   serverName,
+			OriginalName: mcpTool.Name,
+		})
 	}

 	// Merge into the manager under the lock.
@@ -405,15 +538,516 @@ func (m *MCPToolManager) loadServerTools(ctx context.Context, serverName string,
 	m.tools = append(m.tools, localTools...)
 	m.mu.Unlock()

+	// Also load prompts from this server (best-effort; errors are ignored).
+	m.loadServerPrompts(ctx, serverName, conn)
+
+	// Also load resources from this server (best-effort; errors are ignored).
+	m.loadServerResources(ctx, serverName, conn)
+
 	return len(localTools), nil
 }

-// GetTools returns all loaded tools as fantasy AgentTools from all configured MCP servers.
+// ExecuteTool calls an MCP tool through the connection pool, handling health
+// checks, OAuth re-authorization, and connection error tracking.
+// The inputJSON parameter is the raw JSON arguments from the LLM.
+// Returns an IsError result (nil error) for malformed arguments or tool-level failures; a non-nil error means the call itself failed.
+func (m *MCPToolManager) ExecuteTool(ctx context.Context, prefixedName, inputJSON string) (*MCPToolResult, error) {
+	m.mu.Lock()
+	mapping, ok := m.toolMap[prefixedName]
+	m.mu.Unlock()
+	if !ok {
+		return nil, fmt.Errorf("tool %q not found", prefixedName)
+	}
+
+	// Parse and validate JSON arguments
+	var arguments any
+	if inputJSON == "" || inputJSON == "{}" {
+		arguments = nil
+	} else {
+		var temp any
+		if err := json.Unmarshal([]byte(inputJSON), &temp); err != nil {
+			return &MCPToolResult{
+				Content: fmt.Sprintf("invalid JSON arguments: %v", err),
+				IsError: true,
+			}, nil
+		}
+		arguments = json.RawMessage(inputJSON)
+	}
+
+	// Get connection from pool with health check
+	conn, err := m.connectionPool.GetConnectionWithHealthCheck(
+		ctx, mapping.serverName, mapping.serverConfig,
+	)
+	if err != nil {
+		return nil, fmt.Errorf("failed to get healthy connection from pool: %w", err)
+	}
+
+	callRequest := mcp.CallToolRequest{
+		Request: mcp.Request{
+			Method: "tools/call",
+		},
+		Params: mcp.CallToolParams{
+			Name:      mapping.originalName,
+			Arguments: arguments,
+		},
+	}
+
+	// Call the MCP tool using the original (unprefixed) name
+	result, err := conn.client.CallTool(ctx, callRequest)
+	if err != nil {
+		// Handle OAuth re-authorization: token may have expired mid-session.
+		if m.connectionPool.oauthFlow != nil && IsOAuthError(err) {
+			if flowErr := m.connectionPool.oauthFlow.RunAuthFlow(ctx, mapping.serverName, err); flowErr != nil {
+				return nil, fmt.Errorf("OAuth re-authorization failed for tool %s: %w", mapping.originalName, flowErr)
+			}
+			// Retry the tool call after successful re-auth.
+			result, err = conn.client.CallTool(ctx, callRequest)
+			if err != nil {
+				m.connectionPool.HandleConnectionError(mapping.serverName, err)
+				return nil, fmt.Errorf("failed to call mcp tool after re-auth: %w", err)
+			}
+		} else {
+			// Mark connection as unhealthy for automatic recovery
+			m.connectionPool.HandleConnectionError(mapping.serverName, err)
+			return nil, fmt.Errorf("failed to call mcp tool: %w", err)
+		}
+	}
+
+	// Marshal the MCP result to JSON string
+	marshaledResult, err := json.Marshal(result)
+	if err != nil {
+		return nil, fmt.Errorf("failed to marshal mcp tool result: %w", err)
+	}
+
+	return &MCPToolResult{
+		Content: string(marshaledResult),
+		IsError: result.IsError,
+	}, nil
+}
+
+// GetTools returns all loaded MCP tools from all configured MCP servers.
 // Tools are returned with their prefixed names (serverName__toolName) to ensure uniqueness.
-func (m *MCPToolManager) GetTools() []fantasy.AgentTool {
+func (m *MCPToolManager) GetTools() []MCPTool {
 	return m.tools
 }

+// GetPrompts returns all prompts discovered from connected MCP servers.
+func (m *MCPToolManager) GetPrompts() []MCPPrompt {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	result := make([]MCPPrompt, len(m.prompts))
+	copy(result, m.prompts)
+	return result
+}
+
+// GetPrompt retrieves and expands a specific prompt from an MCP server.
+// The serverName identifies which server to query, promptName is the prompt's
+// name on that server, and args are the template arguments to substitute.
+// This call is lazy — it contacts the MCP server on each invocation.
+func (m *MCPToolManager) GetPrompt(ctx context.Context, serverName, promptName string, args map[string]string) (*MCPPromptResult, error) {
+	if m.connectionPool == nil {
+		return nil, fmt.Errorf("no connection pool available")
+	}
+
+	clients := m.connectionPool.GetClients()
+	mcpClient, ok := clients[serverName]
+	if !ok {
+		return nil, fmt.Errorf("MCP server %q not found", serverName)
+	}
+
+	req := mcp.GetPromptRequest{}
+	req.Params.Name = promptName
+	if len(args) > 0 {
+		req.Params.Arguments = args
+	}
+
+	result, err := mcpClient.GetPrompt(ctx, req)
+	if err != nil {
+		return nil, fmt.Errorf("failed to get prompt %q from server %q: %w", promptName, serverName, err)
+	}
+
+	// Convert MCP messages to our types, extracting all content types.
+	var messages []MCPPromptMessage
+	for _, msg := range result.Messages {
+		text, fileParts := extractPromptContent(msg.Content)
+		if text != "" || len(fileParts) > 0 {
+			messages = append(messages, MCPPromptMessage{
+				Role:      string(msg.Role),
+				Content:   text,
+				FileParts: fileParts,
+			})
+		}
+	}
+
+	return &MCPPromptResult{
+		Description: result.Description,
+		Messages:    messages,
+	}, nil
+}
+
+// extractPromptContent extracts text and binary attachments from an MCP Content value.
+// Handles all MCP content types: TextContent, ImageContent, AudioContent,
+// EmbeddedResource (text and blob), and ResourceLink.
+func extractPromptContent(content mcp.Content) (string, []MCPFilePart) {
+	switch c := content.(type) {
+	case mcp.TextContent:
+		return c.Text, nil
+	case *mcp.TextContent:
+		if c != nil {
+			return c.Text, nil
+		}
+		return "", nil
+
+	case mcp.ImageContent:
+		return "", decodeBase64FilePart(c.Data, c.MIMEType, "image/png", "image.png")
+	case *mcp.ImageContent:
+		if c != nil {
+			return "", decodeBase64FilePart(c.Data, c.MIMEType, "image/png", "image.png")
+		}
+		return "", nil
+
+	case mcp.AudioContent:
+		return "", decodeBase64FilePart(c.Data, c.MIMEType, "audio/wav", "audio.wav")
+	case *mcp.AudioContent:
+		if c != nil {
+			return "", decodeBase64FilePart(c.Data, c.MIMEType, "audio/wav", "audio.wav")
+		}
+		return "", nil
+
+	case mcp.EmbeddedResource:
+		return extractEmbeddedResourceContent(c.Resource)
+	case *mcp.EmbeddedResource:
+		if c != nil {
+			return extractEmbeddedResourceContent(c.Resource)
+		}
+		return "", nil
+
+	case mcp.ResourceLink:
+		// ResourceLink is a reference without inline content — include as a
+		// text annotation so the LLM knows about it.
+		return fmt.Sprintf("[Referenced resource: %s (%s)]", c.URI, c.Name), nil
+	case *mcp.ResourceLink:
+		if c != nil {
+			return fmt.Sprintf("[Referenced resource: %s (%s)]", c.URI, c.Name), nil
+		}
+		return "", nil
+
+	default:
+		return "", nil
+	}
+}
+
+// extractEmbeddedResourceContent handles the two variants of embedded resource
+// content: text resources are inlined as fenced code blocks, blob resources
+// are base64-decoded into MCPFilePart attachments.
+func extractEmbeddedResourceContent(res mcp.ResourceContents) (string, []MCPFilePart) {
+	switch r := res.(type) {
+	case mcp.TextResourceContents:
+		return fmt.Sprintf("[File: %s]\n```\n%s\n```", r.URI, r.Text), nil
+	case *mcp.TextResourceContents:
+		if r != nil {
+			return fmt.Sprintf("[File: %s]\n```\n%s\n```", r.URI, r.Text), nil
+		}
+		return "", nil
+	case mcp.BlobResourceContents:
+		return "", decodeBase64FilePart(r.Blob, r.MIMEType, "application/octet-stream", filenameFromURI(r.URI))
+	case *mcp.BlobResourceContents:
+		if r != nil {
+			return "", decodeBase64FilePart(r.Blob, r.MIMEType, "application/octet-stream", filenameFromURI(r.URI))
+		}
+		return "", nil
+	default:
+		return "", nil
+	}
+}
+
+// decodeBase64FilePart decodes base64-encoded data into an MCPFilePart.
+// Returns nil on decode failure (logged as a warning).
+func decodeBase64FilePart(data, mimeType, defaultMIME, filename string) []MCPFilePart {
+	decoded, err := base64.StdEncoding.DecodeString(data)
+	if err != nil {
+		log.Warn("mcp prompt: failed to decode base64 content", "filename", filename, "error", err)
+		return nil
+	}
+	if mimeType == "" {
+		mimeType = defaultMIME
+	}
+	return []MCPFilePart{{
+		Filename:  filename,
+		Data:      decoded,
+		MediaType: mimeType,
+	}}
+}
+
+// filenameFromURI extracts a filename from a URI (e.g. "file:///path/to/img.png" → "img.png").
+func filenameFromURI(uri string) string {
+	uri = strings.TrimPrefix(uri, "file://")
+	if idx := strings.LastIndex(uri, "/"); idx >= 0 {
+		uri = uri[idx+1:] // fall through so a trailing slash yields "resource", not ""
+	}
+	if uri == "" {
+		return "resource"
+	}
+	return uri
+}
+
+// loadServerPrompts loads prompts from a single MCP server connection.
+// Called inside loadServerTools after a successful connection is established.
+// Thread-safe: acquires m.mu to merge results.
+func (m *MCPToolManager) loadServerPrompts(ctx context.Context, serverName string, conn *MCPConnection) {
+	listResult, err := conn.client.ListPrompts(ctx, mcp.ListPromptsRequest{})
+	if err != nil {
+		// Prompts are optional — servers may not support them.
+		// Silently skip.
+		return
+	}
+
+	if len(listResult.Prompts) == 0 {
+		return
+	}
+
+	var localPrompts []MCPPrompt
+	for _, p := range listResult.Prompts {
+		var args []MCPPromptArgument
+		for _, a := range p.Arguments {
+			args = append(args, MCPPromptArgument{
+				Name:        a.Name,
+				Description: a.Description,
+				Required:    a.Required,
+			})
+		}
+		localPrompts = append(localPrompts, MCPPrompt{
+			Name:        p.Name,
+			Description: p.Description,
+			Arguments:   args,
+			ServerName:  serverName,
+		})
+	}
+
+	m.mu.Lock()
+	m.prompts = append(m.prompts, localPrompts...)
+	m.mu.Unlock()
+}
+
+// loadServerResources loads resources from a single MCP server connection.
+// Called inside loadServerTools after a successful connection is established.
+// Thread-safe: acquires m.mu to merge results.
+func (m *MCPToolManager) loadServerResources(ctx context.Context, serverName string, conn *MCPConnection) {
+	listResult, err := conn.client.ListResources(ctx, mcp.ListResourcesRequest{})
+	if err != nil {
+		// Resources are optional — servers may not support them.
+		return
+	}
+
+	if len(listResult.Resources) == 0 {
+		return
+	}
+
+	var localResources []MCPResource
+	for _, r := range listResult.Resources {
+		localResources = append(localResources, MCPResource{
+			URI:         r.URI,
+			Name:        r.Name,
+			Description: r.Description,
+			MIMEType:    r.MIMEType,
+			ServerName:  serverName,
+		})
+	}
+
+	m.mu.Lock()
+	m.resources = append(m.resources, localResources...)
+	m.mu.Unlock()
+}
+
+// GetResources returns all resources discovered from connected MCP servers.
+func (m *MCPToolManager) GetResources() []MCPResource {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	result := make([]MCPResource, len(m.resources))
+	copy(result, m.resources)
+	return result
+}
+
+// SetOnResourcesChanged sets the callback invoked when a subscribed resource
+// changes. Used by the UI layer to refresh autocomplete or re-read content.
+func (m *MCPToolManager) SetOnResourcesChanged(cb func()) {
+	m.onResourcesChanged = cb
+}
+
+// ReadResource reads a specific resource from an MCP server by URI.
+// Returns the resource content (text or binary blob).
+func (m *MCPToolManager) ReadResource(ctx context.Context, serverName, uri string) (*MCPResourceContent, error) {
+	if m.connectionPool == nil {
+		return nil, fmt.Errorf("no connection pool available")
+	}
+
+	clients := m.connectionPool.GetClients()
+	mcpClient, ok := clients[serverName]
+	if !ok {
+		return nil, fmt.Errorf("MCP server %q not found", serverName)
+	}
+
+	req := mcp.ReadResourceRequest{}
+	req.Params.URI = uri
+
+	result, err := mcpClient.ReadResource(ctx, req)
+	if err != nil {
+		return nil, fmt.Errorf("failed to read resource %q from server %q: %w", uri, serverName, err)
+	}
+
+	if len(result.Contents) == 0 {
+		return nil, fmt.Errorf("resource %q returned no content", uri)
+	}
+
+	// Process the first content item (most resources return exactly one).
+	content := result.Contents[0]
+	switch c := content.(type) {
+	case mcp.TextResourceContents:
+		return &MCPResourceContent{
+			URI:      c.URI,
+			MIMEType: c.MIMEType,
+			Text:     c.Text,
+			IsBlob:   false,
+		}, nil
+	case *mcp.TextResourceContents:
+		if c == nil {
+			return nil, fmt.Errorf("resource %q returned nil text content", uri)
+		}
+		return &MCPResourceContent{
+			URI:      c.URI,
+			MIMEType: c.MIMEType,
+			Text:     c.Text,
+			IsBlob:   false,
+		}, nil
+	case mcp.BlobResourceContents:
+		decoded, err := base64.StdEncoding.DecodeString(c.Blob)
+		if err != nil {
+			return nil, fmt.Errorf("failed to decode blob resource %q: %w", uri, err)
+		}
+		return &MCPResourceContent{
+			URI:      c.URI,
+			MIMEType: c.MIMEType,
+			BlobData: decoded,
+			IsBlob:   true,
+		}, nil
+	case *mcp.BlobResourceContents:
+		if c == nil {
+			return nil, fmt.Errorf("resource %q returned nil blob content", uri)
+		}
+		decoded, err := base64.StdEncoding.DecodeString(c.Blob)
+		if err != nil {
+			return nil, fmt.Errorf("failed to decode blob resource %q: %w", uri, err)
+		}
+		return &MCPResourceContent{
+			URI:      c.URI,
+			MIMEType: c.MIMEType,
+			BlobData: decoded,
+			IsBlob:   true,
+		}, nil
+	default:
+		return nil, fmt.Errorf("resource %q returned unknown content type %T", uri, content)
+	}
+}
+
+// SubscribeResource subscribes to change notifications for a resource.
+// When the resource changes on the server, the resource list is refreshed
+// automatically and then onResourcesChanged is called.
+func (m *MCPToolManager) SubscribeResource(ctx context.Context, serverName, uri string) error {
+	if m.connectionPool == nil {
+		return fmt.Errorf("no connection pool available")
+	}
+
+	clients := m.connectionPool.GetClients()
+	mcpClient, ok := clients[serverName]
+	if !ok {
+		return fmt.Errorf("MCP server %q not found", serverName)
+	}
+
+	req := mcp.SubscribeRequest{}
+	req.Params.URI = uri
+
+	if err := mcpClient.Subscribe(ctx, req); err != nil {
+		return fmt.Errorf("failed to subscribe to resource %q on server %q: %w", uri, serverName, err)
+	}
+
+	m.mu.Lock()
+	m.subscriptions[uri] = serverName
+	m.mu.Unlock()
+
+	return nil
+}
+
+// UnsubscribeResource cancels change notifications for a resource.
+func (m *MCPToolManager) UnsubscribeResource(ctx context.Context, serverName, uri string) error {
+	if m.connectionPool == nil {
+		return fmt.Errorf("no connection pool available")
+	}
+
+	clients := m.connectionPool.GetClients()
+	mcpClient, ok := clients[serverName]
+	if !ok {
+		return fmt.Errorf("MCP server %q not found", serverName)
+	}
+
+	req := mcp.UnsubscribeRequest{}
+	req.Params.URI = uri
+
+	if err := mcpClient.Unsubscribe(ctx, req); err != nil {
+		return fmt.Errorf("failed to unsubscribe from resource %q on server %q: %w", uri, serverName, err)
+	}
+
+	m.mu.Lock()
+	delete(m.subscriptions, uri)
+	m.mu.Unlock()
+
+	return nil
+}
+
+// RefreshServerResources re-fetches a server's resources after a change
+// notification. On failure it returns silently, leaving the cache unchanged.
+func (m *MCPToolManager) RefreshServerResources(ctx context.Context, serverName string) {
+	if m.connectionPool == nil {
+		return
+	}
+
+	clients := m.connectionPool.GetClients()
+	mcpClient, ok := clients[serverName]
+	if !ok {
+		return
+	}
+
+	listResult, err := mcpClient.ListResources(ctx, mcp.ListResourcesRequest{})
+	if err != nil {
+		return
+	}
+
+	var newResources []MCPResource
+	for _, r := range listResult.Resources {
+		newResources = append(newResources, MCPResource{
+			URI:         r.URI,
+			Name:        r.Name,
+			Description: r.Description,
+			MIMEType:    r.MIMEType,
+			ServerName:  serverName,
+		})
+	}
+
+	m.mu.Lock()
+	// Remove old resources from this server, add new ones.
+	filtered := make([]MCPResource, 0, len(m.resources))
+	for _, r := range m.resources {
+		if r.ServerName != serverName {
+			filtered = append(filtered, r)
+		}
+	}
+	m.resources = append(filtered, newResources...)
+	m.mu.Unlock()
+
+	if m.onResourcesChanged != nil {
+		m.onResourcesChanged()
+	}
+}
+
 // GetLoadedServerNames returns the names of all successfully loaded MCP servers.
 // This includes servers that are currently connected and have had their tools loaded,
 // regardless of their current health status. Useful for debugging and status reporting.
@@ -101,7 +101,7 @@ func TestMCPToolManager_AddServer_Integration(t *testing.T) {
 	// Verify tool names are prefixed.
 	toolNames := make(map[string]bool)
 	for _, tool := range tools {
-		toolNames[tool.Info().Name] = true
+		toolNames[tool.Name] = true
 	}
 	if !toolNames["echo__echo"] {
 		t.Error("Expected tool 'echo__echo'")
@@ -234,8 +234,8 @@ func TestMCPToolManager_AddRemoveMultiple_Integration(t *testing.T) {

 	// Remaining tools should all be from server-b.
 	for _, tool := range tools {
-		if !strings.HasPrefix(tool.Info().Name, "server-b__") {
-			t.Errorf("Expected tool from server-b, got: %s", tool.Info().Name)
+		if !strings.HasPrefix(tool.Name, "server-b__") {
+			t.Errorf("Expected tool from server-b, got: %s", tool.Name)
 		}
 	}

@@ -122,7 +122,7 @@ func TestMCPToolManager_Close_NilPool(t *testing.T) {
 // TestMCPConnectionPool_RemoveConnection_NotFound verifies that removing a
 // non-existent connection returns an error.
 func TestMCPConnectionPool_RemoveConnection_NotFound(t *testing.T) {
-	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), nil, false, nil, nil)
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
 	defer func() { _ = pool.Close() }()

 	err := pool.RemoveConnection("nonexistent")
@@ -0,0 +1,691 @@
+package tools
+
+import (
+	"context"
+	"encoding/base64"
+	"fmt"
+	"strings"
+	"testing"
+
+	mcpclient "github.com/mark3labs/mcp-go/client"
+	"github.com/mark3labs/mcp-go/mcp"
+	"github.com/mark3labs/mcp-go/server"
+)
+
+// newTestPromptServer creates an in-process MCP server with prompt capabilities
+// and the given prompts and handlers. It returns an initialized MCPClient.
+func newTestPromptServer(t *testing.T, prompts ...server.ServerPrompt) mcpclient.MCPClient {
+	t.Helper()
+
+	mcpServer := server.NewMCPServer(
+		"test-prompt-server", "1.0.0",
+		server.WithPromptCapabilities(true),
+		server.WithToolCapabilities(true),
+	)
+
+	if len(prompts) > 0 {
+		mcpServer.AddPrompts(prompts...)
+	}
+
+	// Add a dummy tool so loadServerTools has something to list.
+	mcpServer.AddTool(
+		mcp.NewTool("noop", mcp.WithDescription("no-op tool")),
+		func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
+			return mcp.NewToolResultText("ok"), nil
+		},
+	)
+
+	client, err := mcpclient.NewInProcessClient(mcpServer)
+	if err != nil {
+		t.Fatalf("NewInProcessClient: %v", err)
+	}
+
+	ctx := context.Background()
+	if err := client.Start(ctx); err != nil {
+		t.Fatalf("client.Start: %v", err)
+	}
+
+	initReq := mcp.InitializeRequest{}
+	initReq.Params.ProtocolVersion = mcp.LATEST_PROTOCOL_VERSION
+	initReq.Params.ClientInfo = mcp.Implementation{Name: "test", Version: "1.0"}
+	if _, err := client.Initialize(ctx, initReq); err != nil {
+		t.Fatalf("client.Initialize: %v", err)
+	}
+
+	t.Cleanup(func() { _ = client.Close() })
+	return client
+}
+
+// injectClientIntoManager sets up an MCPToolManager with a pre-connected
+// in-process client, bypassing the normal connection pool flow.
+func injectClientIntoManager(t *testing.T, serverName string, client mcpclient.MCPClient) *MCPToolManager {
+	t.Helper()
+
+	m := NewMCPToolManager()
+
+	// Create a minimal connection pool and inject our client.
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	pool.mu.Lock()
+	pool.connections[serverName] = &MCPConnection{
+		client:     client,
+		serverName: serverName,
+		isHealthy:  true,
+	}
+	pool.mu.Unlock()
+	m.connectionPool = pool
+
+	return m
+}
+
+func TestLoadServerPrompts_Basic(t *testing.T) {
+	ctx := context.Background()
+
+	client := newTestPromptServer(t,
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("review-pr",
+				mcp.WithPromptDescription("Review a pull request"),
+				mcp.WithArgument("pr_number",
+					mcp.ArgumentDescription("The PR number to review"),
+					mcp.RequiredArgument(),
+				),
+				mcp.WithArgument("focus",
+					mcp.ArgumentDescription("Area to focus on"),
+				),
+			),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				prNum := req.Params.Arguments["pr_number"]
+				return &mcp.GetPromptResult{
+					Description: "PR review prompt",
+					Messages: []mcp.PromptMessage{
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.TextContent{
+								Type: "text",
+								Text: fmt.Sprintf("Please review PR #%s", prNum),
+							},
+						},
+					},
+				}, nil
+			},
+		},
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("explain-code",
+				mcp.WithPromptDescription("Explain a piece of code"),
+			),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				return &mcp.GetPromptResult{
+					Messages: []mcp.PromptMessage{
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.TextContent{
+								Type: "text",
+								Text: "Please explain the following code.",
+							},
+						},
+					},
+				}, nil
+			},
+		},
+	)
+
+	m := injectClientIntoManager(t, "github", client)
+
+	conn := &MCPConnection{
+		client:     client,
+		serverName: "github",
+		isHealthy:  true,
+	}
+	m.loadServerPrompts(ctx, "github", conn)
+
+	prompts := m.GetPrompts()
+	if len(prompts) != 2 {
+		t.Fatalf("expected 2 prompts, got %d", len(prompts))
+	}
+
+	// Find review-pr prompt.
+	var reviewPR *MCPPrompt
+	for i := range prompts {
+		if prompts[i].Name == "review-pr" {
+			reviewPR = &prompts[i]
+			break
+		}
+	}
+	if reviewPR == nil {
+		t.Fatal("review-pr prompt not found")
+	}
+	if reviewPR.Description != "Review a pull request" {
+		t.Errorf("unexpected description: %q", reviewPR.Description)
+	}
+	if reviewPR.ServerName != "github" {
+		t.Errorf("unexpected server name: %q", reviewPR.ServerName)
+	}
+	if len(reviewPR.Arguments) != 2 {
+		t.Fatalf("expected 2 arguments, got %d", len(reviewPR.Arguments))
+	}
+
+	// Verify argument metadata.
+	arg0 := reviewPR.Arguments[0]
+	if arg0.Name != "pr_number" {
+		t.Errorf("expected first arg name 'pr_number', got %q", arg0.Name)
+	}
+	if !arg0.Required {
+		t.Error("expected first arg to be required")
+	}
+	arg1 := reviewPR.Arguments[1]
+	if arg1.Name != "focus" {
+		t.Errorf("expected second arg name 'focus', got %q", arg1.Name)
+	}
+	if arg1.Required {
+		t.Error("expected second arg to be optional")
+	}
+}
+
+func TestGetPrompt_ExpandsWithArgs(t *testing.T) {
+	ctx := context.Background()
+
+	client := newTestPromptServer(t,
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("greet",
+				mcp.WithPromptDescription("Greet someone"),
+				mcp.WithArgument("name", mcp.RequiredArgument()),
+			),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				name := req.Params.Arguments["name"]
+				return &mcp.GetPromptResult{
+					Description: "Greeting",
+					Messages: []mcp.PromptMessage{
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.TextContent{
+								Type: "text",
+								Text: fmt.Sprintf("Hello, %s!", name),
+							},
+						},
+					},
+				}, nil
+			},
+		},
+	)
+
+	m := injectClientIntoManager(t, "myserver", client)
+
+	result, err := m.GetPrompt(ctx, "myserver", "greet", map[string]string{"name": "World"})
+	if err != nil {
+		t.Fatalf("GetPrompt error: %v", err)
+	}
+	if result.Description != "Greeting" {
+		t.Errorf("unexpected description: %q", result.Description)
+	}
+	if len(result.Messages) != 1 {
+		t.Fatalf("expected 1 message, got %d", len(result.Messages))
+	}
+	if result.Messages[0].Role != "user" {
+		t.Errorf("unexpected role: %q", result.Messages[0].Role)
+	}
+	if result.Messages[0].Content != "Hello, World!" {
+		t.Errorf("unexpected content: %q", result.Messages[0].Content)
+	}
+}
+
+func TestGetPrompt_MultipleMessages(t *testing.T) {
+	ctx := context.Background()
+
+	client := newTestPromptServer(t,
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("chat-starter"),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				return &mcp.GetPromptResult{
+					Messages: []mcp.PromptMessage{
+						{
+							Role:    mcp.RoleUser,
+							Content: mcp.TextContent{Type: "text", Text: "What is Go?"},
+						},
+						{
+							Role:    mcp.RoleAssistant,
+							Content: mcp.TextContent{Type: "text", Text: "Go is a programming language."},
+						},
+						{
+							Role:    mcp.RoleUser,
+							Content: mcp.TextContent{Type: "text", Text: "Tell me more."},
+						},
+					},
+				}, nil
+			},
+		},
+	)
+
+	m := injectClientIntoManager(t, "server", client)
+
+	result, err := m.GetPrompt(ctx, "server", "chat-starter", nil)
+	if err != nil {
+		t.Fatalf("GetPrompt error: %v", err)
+	}
+	if len(result.Messages) != 3 {
+		t.Fatalf("expected 3 messages, got %d", len(result.Messages))
+	}
+	if result.Messages[0].Role != "user" {
+		t.Errorf("msg[0] role: got %q, want 'user'", result.Messages[0].Role)
+	}
+	if result.Messages[1].Role != "assistant" {
+		t.Errorf("msg[1] role: got %q, want 'assistant'", result.Messages[1].Role)
+	}
+	if result.Messages[2].Content != "Tell me more." {
+		t.Errorf("msg[2] content: got %q, want 'Tell me more.'", result.Messages[2].Content)
+	}
+}
+
+func TestGetPrompt_ServerNotFound(t *testing.T) {
+	m := NewMCPToolManager()
+	pool := NewMCPConnectionPool(DefaultConnectionPoolConfig(), false, nil, nil)
+	m.connectionPool = pool
+
+	_, err := m.GetPrompt(context.Background(), "nonexistent", "foo", nil)
+	if err == nil {
+		t.Fatal("expected error for nonexistent server")
+	}
+}
+
+func TestGetPrompt_NoPool(t *testing.T) {
+	m := NewMCPToolManager()
+
+	_, err := m.GetPrompt(context.Background(), "any", "foo", nil)
+	if err == nil {
+		t.Fatal("expected error with no pool")
+	}
+}
+
+func TestRemoveServer_RemovesPrompts(t *testing.T) {
+	ctx := context.Background()
+
+	client := newTestPromptServer(t,
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("my-prompt",
+				mcp.WithPromptDescription("A test prompt"),
+			),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				return &mcp.GetPromptResult{
+					Messages: []mcp.PromptMessage{
+						{Role: mcp.RoleUser, Content: mcp.TextContent{Type: "text", Text: "hi"}},
+					},
+				}, nil
+			},
+		},
+	)
+
+	m := injectClientIntoManager(t, "testsvr", client)
+
+	// Manually populate tools and prompts as loadServerTools would.
+	conn := m.connectionPool.connections["testsvr"]
+	m.loadServerPrompts(ctx, "testsvr", conn)
+
+	// Also add a fake tool mapping so RemoveServer finds the server.
+	m.toolMap["testsvr__noop"] = &toolMapping{
+		serverName:   "testsvr",
+		originalName: "noop",
+	}
+	m.tools = append(m.tools, MCPTool{
+		Name:       "testsvr__noop",
+		ServerName: "testsvr",
+	})
+
+	// Verify prompts exist before removal.
+	if got := len(m.GetPrompts()); got != 1 {
+		t.Fatalf("expected 1 prompt before removal, got %d", got)
+	}
+
+	// Remove the server.
+	err := m.RemoveServer("testsvr")
+	if err != nil {
+		t.Fatalf("RemoveServer error: %v", err)
+	}
+
+	// Verify prompts are gone.
+	if got := len(m.GetPrompts()); got != 0 {
+		t.Fatalf("expected 0 prompts after removal, got %d", got)
+	}
+}
+
+func TestLoadServerPrompts_NoPromptCapability(t *testing.T) {
+	// Server without prompt capabilities — ListPrompts should fail gracefully.
+	mcpServer := server.NewMCPServer("no-prompts", "1.0.0",
+		server.WithToolCapabilities(true),
+		// No WithPromptCapabilities
+	)
+	mcpServer.AddTool(
+		mcp.NewTool("noop"),
+		func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
+			return mcp.NewToolResultText("ok"), nil
+		},
+	)
+
+	client, err := mcpclient.NewInProcessClient(mcpServer)
+	if err != nil {
+		t.Fatalf("NewInProcessClient: %v", err)
+	}
+	ctx := context.Background()
+	_ = client.Start(ctx)
+	initReq := mcp.InitializeRequest{}
+	initReq.Params.ProtocolVersion = mcp.LATEST_PROTOCOL_VERSION
+	initReq.Params.ClientInfo = mcp.Implementation{Name: "test", Version: "1.0"}
+	_, _ = client.Initialize(ctx, initReq)
+	t.Cleanup(func() { _ = client.Close() })
+
+	m := NewMCPToolManager()
+	conn := &MCPConnection{
+		client:     client,
+		serverName: "no-prompts",
+		isHealthy:  true,
+	}
+
+	// Should not panic or error — just silently skip.
+	m.loadServerPrompts(ctx, "no-prompts", conn)
+
+	if got := len(m.GetPrompts()); got != 0 {
+		t.Fatalf("expected 0 prompts from server without prompt capability, got %d", got)
+	}
+}
+
+func TestExtractPromptContent(t *testing.T) {
+	t.Run("TextContent", func(t *testing.T) {
+		text, parts := extractPromptContent(mcp.TextContent{Type: "text", Text: "hello world"})
+		if text != "hello world" {
+			t.Errorf("text = %q, want %q", text, "hello world")
+		}
+		if len(parts) != 0 {
+			t.Errorf("expected 0 file parts, got %d", len(parts))
+		}
+	})
+
+	t.Run("ImageContent", func(t *testing.T) {
+		// base64 of "fake image"
+		encoded := base64.StdEncoding.EncodeToString([]byte("fake image"))
+		text, parts := extractPromptContent(mcp.ImageContent{
+			Type:     "image",
+			Data:     encoded,
+			MIMEType: "image/png",
+		})
+		if text != "" {
+			t.Errorf("expected empty text, got %q", text)
+		}
+		if len(parts) != 1 {
+			t.Fatalf("expected 1 file part, got %d", len(parts))
+		}
+		if parts[0].MediaType != "image/png" {
+			t.Errorf("media type = %q, want %q", parts[0].MediaType, "image/png")
+		}
+		if parts[0].Filename != "image.png" {
+			t.Errorf("filename = %q, want %q", parts[0].Filename, "image.png")
+		}
+		if string(parts[0].Data) != "fake image" {
+			t.Errorf("data = %q, want %q", string(parts[0].Data), "fake image")
+		}
+	})
+
+	t.Run("ImageContent_DefaultMIME", func(t *testing.T) {
+		encoded := base64.StdEncoding.EncodeToString([]byte("img"))
+		_, parts := extractPromptContent(mcp.ImageContent{
+			Type: "image",
+			Data: encoded,
+			// no MIMEType → should default to image/png
+		})
+		if len(parts) != 1 {
+			t.Fatalf("expected 1 file part, got %d", len(parts))
+		}
+		if parts[0].MediaType != "image/png" {
+			t.Errorf("default MIME = %q, want %q", parts[0].MediaType, "image/png")
+		}
+	})
+
+	t.Run("AudioContent", func(t *testing.T) {
+		encoded := base64.StdEncoding.EncodeToString([]byte("fake audio"))
+		text, parts := extractPromptContent(mcp.AudioContent{
+			Type:     "audio",
+			Data:     encoded,
+			MIMEType: "audio/mp3",
+		})
+		if text != "" {
+			t.Errorf("expected empty text, got %q", text)
+		}
+		if len(parts) != 1 {
+			t.Fatalf("expected 1 file part, got %d", len(parts))
+		}
+		if parts[0].MediaType != "audio/mp3" {
+			t.Errorf("media type = %q, want %q", parts[0].MediaType, "audio/mp3")
+		}
+		if parts[0].Filename != "audio.wav" {
+			t.Errorf("filename = %q, want %q", parts[0].Filename, "audio.wav")
+		}
+	})
+
+	t.Run("EmbeddedResource_Text", func(t *testing.T) {
+		text, parts := extractPromptContent(mcp.EmbeddedResource{
+			Type: "resource",
+			Resource: mcp.TextResourceContents{
+				URI:      "file:///project/main.go",
+				MIMEType: "text/x-go",
+				Text:     "package main",
+			},
+		})
+		if text == "" {
+			t.Fatal("expected non-empty text for text resource")
+		}
+		if !strings.Contains(text, "package main") {
+			t.Errorf("text should contain resource content, got %q", text)
+		}
+		if !strings.Contains(text, "file:///project/main.go") {
+			t.Errorf("text should contain URI, got %q", text)
+		}
+		if len(parts) != 0 {
+			t.Errorf("expected 0 file parts for text resource, got %d", len(parts))
+		}
+	})
+
+	t.Run("EmbeddedResource_Blob", func(t *testing.T) {
+		blobData := []byte("binary content")
+		encoded := base64.StdEncoding.EncodeToString(blobData)
+		text, parts := extractPromptContent(mcp.EmbeddedResource{
+			Type: "resource",
+			Resource: mcp.BlobResourceContents{
+				URI:      "file:///project/data.bin",
+				MIMEType: "application/octet-stream",
+				Blob:     encoded,
+			},
+		})
+		if text != "" {
+			t.Errorf("expected empty text for blob resource, got %q", text)
+		}
+		if len(parts) != 1 {
+			t.Fatalf("expected 1 file part for blob resource, got %d", len(parts))
+		}
+		if parts[0].Filename != "data.bin" {
+			t.Errorf("filename = %q, want %q", parts[0].Filename, "data.bin")
+		}
+		if parts[0].MediaType != "application/octet-stream" {
+			t.Errorf("media type = %q, want %q", parts[0].MediaType, "application/octet-stream")
+		}
+		if string(parts[0].Data) != "binary content" {
+			t.Errorf("data = %q, want %q", string(parts[0].Data), "binary content")
+		}
+	})
+
+	t.Run("ResourceLink", func(t *testing.T) {
+		text, parts := extractPromptContent(mcp.ResourceLink{
+			Type: "resource_link",
+			URI:  "file:///docs/readme.md",
+			Name: "readme.md",
+		})
+		if text == "" {
+			t.Fatal("expected non-empty text for resource link")
+		}
+		if !strings.Contains(text, "file:///docs/readme.md") {
+			t.Errorf("text should contain URI, got %q", text)
+		}
+		if !strings.Contains(text, "readme.md") {
+			t.Errorf("text should contain name, got %q", text)
+		}
+		if len(parts) != 0 {
+			t.Errorf("expected 0 file parts for resource link, got %d", len(parts))
+		}
+	})
+
+	t.Run("InvalidBase64", func(t *testing.T) {
+		_, parts := extractPromptContent(mcp.ImageContent{
+			Type:     "image",
+			Data:     "not-valid-base64!!!",
+			MIMEType: "image/png",
+		})
+		if len(parts) != 0 {
+			t.Errorf("expected 0 file parts for invalid base64, got %d", len(parts))
+		}
+	})
+
+	t.Run("NilContent", func(t *testing.T) {
+		text, parts := extractPromptContent((*mcp.TextContent)(nil))
+		if text != "" {
+			t.Errorf("expected empty text for nil, got %q", text)
+		}
+		if len(parts) != 0 {
+			t.Errorf("expected 0 parts for nil, got %d", len(parts))
+		}
+	})
+}
+
+func TestFilenameFromURI(t *testing.T) {
+	tests := []struct {
+		uri  string
+		want string
+	}{
+		{"file:///path/to/image.png", "image.png"},
+		{"file:///single.txt", "single.txt"},
+		{"resource://server/data.json", "data.json"},
+		{"nopath", "nopath"},
+		{"", "resource"},
+	}
+	for _, tt := range tests {
+		t.Run(tt.uri, func(t *testing.T) {
+			got := filenameFromURI(tt.uri)
+			if got != tt.want {
+				t.Errorf("filenameFromURI(%q) = %q, want %q", tt.uri, got, tt.want)
+			}
+		})
+	}
+}
+
+func TestGetPrompt_EmbeddedResources(t *testing.T) {
+	ctx := context.Background()
+
+	imgData := base64.StdEncoding.EncodeToString([]byte("fake-png"))
+	blobData := base64.StdEncoding.EncodeToString([]byte("binary-blob"))
+
+	client := newTestPromptServer(t,
+		server.ServerPrompt{
+			Prompt: mcp.NewPrompt("review-with-files",
+				mcp.WithPromptDescription("Review with embedded resources"),
+			),
+			Handler: func(ctx context.Context, req mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
+				return &mcp.GetPromptResult{
+					Description: "Review prompt with embedded files",
+					Messages: []mcp.PromptMessage{
+						{
+							Role:    mcp.RoleUser,
+							Content: mcp.TextContent{Type: "text", Text: "Please review these files:"},
+						},
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.EmbeddedResource{
+								Type: "resource",
+								Resource: mcp.TextResourceContents{
+									URI:      "file:///src/main.go",
+									MIMEType: "text/x-go",
+									Text:     "package main\n\nfunc main() {}",
+								},
+							},
+						},
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.ImageContent{
+								Type:     "image",
+								Data:     imgData,
+								MIMEType: "image/png",
+							},
+						},
+						{
+							Role: mcp.RoleUser,
+							Content: mcp.EmbeddedResource{
+								Type: "resource",
+								Resource: mcp.BlobResourceContents{
+									URI:      "file:///data/model.bin",
+									MIMEType: "application/octet-stream",
+									Blob:     blobData,
+								},
+							},
+						},
+					},
+				}, nil
+			},
+		},
+	)
+
+	m := injectClientIntoManager(t, "test", client)
+
+	result, err := m.GetPrompt(ctx, "test", "review-with-files", nil)
+	if err != nil {
+		t.Fatalf("GetPrompt error: %v", err)
+	}
+	if result.Description != "Review prompt with embedded files" {
+		t.Errorf("unexpected description: %q", result.Description)
+	}
+
+	// Should have 4 messages: text, embedded text resource, image, embedded blob
+	if len(result.Messages) != 4 {
+		t.Fatalf("expected 4 messages, got %d", len(result.Messages))
+	}
+
+	// Message 0: plain text
+	msg0 := result.Messages[0]
+	if msg0.Content != "Please review these files:" {
+		t.Errorf("msg[0] content = %q", msg0.Content)
+	}
+	if len(msg0.FileParts) != 0 {
+		t.Errorf("msg[0] expected 0 file parts, got %d", len(msg0.FileParts))
+	}
+
+	// Message 1: embedded text resource → inlined as text
+	msg1 := result.Messages[1]
+	if !strings.Contains(msg1.Content, "package main") {
+		t.Errorf("msg[1] should contain resource text, got %q", msg1.Content)
+	}
+	if len(msg1.FileParts) != 0 {
+		t.Errorf("msg[1] expected 0 file parts (text resource), got %d", len(msg1.FileParts))
+	}
+
+	// Message 2: image → file part
+	msg2 := result.Messages[2]
+	if msg2.Content != "" {
+		t.Errorf("msg[2] expected empty text for image, got %q", msg2.Content)
+	}
+	if len(msg2.FileParts) != 1 {
+		t.Fatalf("msg[2] expected 1 file part, got %d", len(msg2.FileParts))
+	}
+	if msg2.FileParts[0].MediaType != "image/png" {
+		t.Errorf("msg[2] file part MIME = %q", msg2.FileParts[0].MediaType)
+	}
+	if string(msg2.FileParts[0].Data) != "fake-png" {
+		t.Errorf("msg[2] file part data = %q", string(msg2.FileParts[0].Data))
+	}
+
+	// Message 3: embedded blob resource → file part
+	msg3 := result.Messages[3]
+	if msg3.Content != "" {
+		t.Errorf("msg[3] expected empty text for blob resource, got %q", msg3.Content)
+	}
+	if len(msg3.FileParts) != 1 {
+		t.Fatalf("msg[3] expected 1 file part, got %d", len(msg3.FileParts))
+	}
+	if msg3.FileParts[0].Filename != "model.bin" {
+		t.Errorf("msg[3] filename = %q, want %q", msg3.FileParts[0].Filename, "model.bin")
+	}
+	if string(msg3.FileParts[0].Data) != "binary-blob" {
+		t.Errorf("msg[3] file part data = %q", string(msg3.FileParts[0].Data))
+	}
+}
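TestFilenameFromURI in the file above pins down the expected behaviour of `filenameFromURI`, but the helper itself is not part of this hunk. The sketch below is one hypothetical implementation that would satisfy that table, assuming the helper takes only the final path segment and falls back to "resource". The name `filenameFromURISketch` and the trailing-slash behaviour are assumptions, not the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

// filenameFromURISketch is a hypothetical helper: it returns the text after
// the last "/" in uri, the whole string when there is no "/", and the
// fallback "resource" when nothing usable remains.
func filenameFromURISketch(uri string) string {
	if uri == "" {
		return "resource"
	}
	i := strings.LastIndex(uri, "/")
	if i < 0 {
		return uri // no path separator: use the string as-is
	}
	if name := uri[i+1:]; name != "" {
		return name
	}
	// Assumption: a URI ending in "/" has no filename, so use the fallback.
	return "resource"
}

func main() {
	for _, uri := range []string{
		"file:///path/to/image.png",
		"file:///single.txt",
		"resource://server/data.json",
		"nopath",
		"",
	} {
		fmt.Printf("%q -> %q\n", uri, filenameFromURISketch(uri))
	}
}
```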
@@ -103,14 +103,12 @@ func TestMCPToolManager_EmptyConfig(t *testing.T) {

 	// Test that we can get tool info for each tool
 	for _, tool := range tools {
-		info := tool.Info()
-
 		// Check that the tool has a valid name
-		if info.Name == "" {
+		if tool.Name == "" {
 			t.Error("Tool has empty name")
 		}

-		t.Logf("Tool: %s, Description: %s", info.Name, info.Description)
+		t.Logf("Tool: %s, Description: %s", tool.Name, tool.Description)
 	}
 }

@@ -19,7 +19,7 @@ import (

 // newTestInput creates an InputComponent with the given AppController (may be nil).
 func newTestInput(ctrl AppController) *InputComponent {
-	return NewInputComponent(80, "test input", ctrl)
+	return NewInputComponent(80, ctrl)
 }

 // sendInputMsg calls component.Update with the given message, returns the
@@ -69,30 +69,6 @@ func TestInputComponent_SubmitEmitsSubmitMsg(t *testing.T) {
 	}
 }

-// TestInputComponent_CtrlD_SubmitEmitsSubmitMsg verifies that ctrl+d also
-// submits the text.
-func TestInputComponent_CtrlD_SubmitEmitsSubmitMsg(t *testing.T) {
-	ctrl := &stubAppController{}
-	c := newTestInput(ctrl)
-
-	c.textarea.SetValue("ctrl+d submit")
-	c.lastValue = "ctrl+d submit"
-
-	_, cmd := sendInputMsg(c, tea.KeyPressMsg{Code: 'd', Mod: tea.ModCtrl})
-
-	msg := runCmd(cmd)
-	if msg == nil {
-		t.Fatal("expected a cmd from ctrl+d on non-empty input")
-	}
-	sm, ok := msg.(core.SubmitMsg)
-	if !ok {
-		t.Fatalf("expected submitMsg from ctrl+d, got %T", msg)
-	}
-	if sm.Text != "ctrl+d submit" {
-		t.Fatalf("expected Text='ctrl+d submit', got %q", sm.Text)
-	}
-}
-
 // TestInputComponent_EmptySubmit_NoCmd verifies that submitting an empty or
 // whitespace-only string produces no cmd.
 func TestInputComponent_EmptySubmit_NoCmd(t *testing.T) {
@@ -84,7 +84,7 @@ var SlashCommands = []SlashCommand{
 	},
 	{
 		Name:        "/thinking",
-		Description: "Set thinking/reasoning level (off, minimal, low, medium, high)",
+		Description: "Set thinking/reasoning level (off, none, minimal, low, medium, high)",
 		Category:    "System",
 		Aliases:     []string{"/think"},
 		Complete: func(prefix string) []string {
@@ -25,6 +25,11 @@ type SubmitMsg struct {
 // presses ESC a second time, the canceling state is reset to false.
 type CancelTimerExpiredMsg struct{}

+// CtrlCResetMsg is sent after a short delay when the user presses Ctrl+C to
+// clear input. If the user doesn't press Ctrl+C again within the timeout,
+// the ctrlCPressedOnce flag is reset so the next Ctrl+C will clear again.
+type CtrlCResetMsg struct{}
+
 // --- Tree session events ---

 // TreeNodeSelectedMsg is sent when the user selects a node in the tree selector.
@@ -29,9 +29,16 @@ type (
 	ExtensionCommand = commands.ExtensionCommand
 )

-// Re-export functions from fileutil package
+// Re-export functions and types from fileutil package
 var ProcessFileAttachments = fileutil.ProcessFileAttachments

+// Re-export types from fileutil
+type (
+	FileAttachmentResult = fileutil.FileAttachmentResult
+	FilePart             = fileutil.FilePart
+	MCPResourceReader    = fileutil.MCPResourceReader
+)
+
 // Re-export from prefs package
 var (
 	LoadThemePreference         = prefs.LoadThemePreference
@@ -6,22 +6,78 @@ import (
 	"path/filepath"
 	"sort"
 	"strings"
+	"sync"
+	"time"
 )

-// FileSuggestion represents a single file or directory suggestion for the @
-// autocomplete popup.
+// FileSuggestion represents a single file, directory, or MCP resource
+// suggestion for the @ autocomplete popup.
 type FileSuggestion struct {
-	// RelPath is the path relative to the search base (e.g. "cmd/kit/main.go").
+	// RelPath is the path relative to the search base (e.g. "cmd/kit/main.go")
+	// or a display name for MCP resources (e.g. "mcp:server/resource-name").
 	RelPath string
 	// IsDir is true when the entry is a directory.
 	IsDir bool
 	// Score is the fuzzy match score (higher is better).
 	Score int
+	// IsMCPResource is true for MCP resource entries.
+	IsMCPResource bool
+	// MCPServerName is the MCP server name (set when IsMCPResource is true).
+	MCPServerName string
+	// MCPResourceURI is the MCP resource URI (set when IsMCPResource is true).
+	MCPResourceURI string
+	// MCPMIMEType is the MIME type hint from the MCP server.
+	MCPMIMEType string
 }

 // maxFileSuggestions is the maximum number of file suggestions returned.
 const maxFileSuggestions = 20

+// fileListCache holds the latest listFiles() result for one (searchDir, cwd)
+// pair, avoiding git subprocess calls on every keystroke during @file completion.
+var fileListCache struct {
+	mu       sync.Mutex
+	dir      string           // searchDir that produced the cached entries
+	cwd      string           // cwd used for the git query
+	entries  []FileSuggestion // cached file list
+	expireAt time.Time        // when the cache entry expires
+}
+
+// fileListCacheTTL controls how long a cached file list stays valid.
+// During rapid typing the list is reused; after the TTL a fresh git
+// ls-files is executed so newly created files become visible.
+const fileListCacheTTL = 3 * time.Second
+
+// getCachedFileList returns the file list for searchDir, using a short-lived
+// cache to avoid repeated subprocess calls during @file autocompletion.
+func getCachedFileList(searchDir, cwd string) []FileSuggestion {
+	fileListCache.mu.Lock()
+	defer fileListCache.mu.Unlock()
+
+	now := time.Now()
+	if fileListCache.dir == searchDir &&
+		fileListCache.cwd == cwd &&
+		now.Before(fileListCache.expireAt) {
+		// Return a copy so callers can mutate (e.g. prepend baseDir).
+		cp := make([]FileSuggestion, len(fileListCache.entries))
+		copy(cp, fileListCache.entries)
+		return cp
+	}
+
+	// Cache miss or expired — run the real (potentially expensive) lookup.
+	files := listFiles(searchDir, cwd)
+
+	fileListCache.dir = searchDir
+	fileListCache.cwd = cwd
+	fileListCache.entries = files
+	fileListCache.expireAt = now.Add(fileListCacheTTL)
+
+	// Return a copy.
+	cp := make([]FileSuggestion, len(files))
+	copy(cp, files)
+	return cp
+}
+
 // ExtractAtPrefix checks the current line for an @-file trigger at cursorCol.
 // It returns:
 //   - hasAt: true if a valid @ trigger was found
@@ -90,7 +146,7 @@ func GetFileSuggestions(prefix string, cwd string) []FileSuggestion {
 		}
 	}

-	files := listFiles(searchDir, cwd)
+	files := getCachedFileList(searchDir, cwd)
 	if len(files) == 0 {
 		return nil
 	}
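The caching added in the hunks above combines three ideas: a single cached entry, a short TTL, and returning a copy so callers cannot corrupt the cache. The sketch below isolates them with an injectable clock so the expiry can be exercised without sleeping. The type `ttlCache`, its fields, and `demo` are hypothetical names for illustration; the real code uses a package-level struct and `time.Now()` directly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ttlCache is a hypothetical single-entry cache: it remembers the result for
// the most recent key only, and only until expireAt.
type ttlCache struct {
	mu       sync.Mutex
	key      string
	entries  []string
	expireAt time.Time
	ttl      time.Duration
	load     func(key string) []string // the expensive lookup
	now      func() time.Time          // injectable clock (assumption, for testing)
}

// get returns the entries for key, reloading on a key change or after expiry.
func (c *ttlCache) get(key string) []string {
	c.mu.Lock()
	defer c.mu.Unlock()

	now := c.now()
	if c.key != key || !now.Before(c.expireAt) {
		c.entries = c.load(key)
		c.key = key
		c.expireAt = now.Add(c.ttl)
	}
	// Return a copy so callers can mutate the slice without touching the cache.
	return append([]string(nil), c.entries...)
}

// demo returns the first cached entry after a caller mutated an earlier
// result, plus the loader call counts before and after the TTL elapsed.
func demo() (first string, callsWithinTTL, callsAfterExpiry int) {
	clock := time.Unix(0, 0)
	calls := 0
	c := &ttlCache{
		ttl: 3 * time.Second,
		now: func() time.Time { return clock },
		load: func(key string) []string {
			calls++
			return []string{key + "/a.go", key + "/b.go"}
		},
	}

	got := c.get("src")
	got[0] = "mutated" // only the caller's copy changes
	first = c.get("src")[0]
	callsWithinTTL = calls

	clock = clock.Add(4 * time.Second) // move past the TTL
	c.get("src")
	callsAfterExpiry = calls
	return first, callsWithinTTL, callsAfterExpiry
}

func main() {
	first, within, after := demo()
	fmt.Println(first, within, after)
}
```

A single entry is enough for the autocomplete case because consecutive keystrokes usually query the same directory. A map keyed by directory would be the natural extension if several directories were queried in alternation.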
@@ -2,6 +2,8 @@ package fileutil

 import (
 	"fmt"
+	"mime"
+	"net/http"
 	"os"
 	"path/filepath"
 	"regexp"
@@ -10,31 +12,75 @@ import (
 	"github.com/mark3labs/kit/internal/fences"
 )

+// FilePart represents a binary file attachment (image, audio, etc.) extracted
+// from an @file reference. Callers convert this to kit.LLMFilePart before
+// sending to the LLM. Defined here to avoid a circular dependency on pkg/kit.
+type FilePart struct {
+	// Filename is the basename of the file (e.g. "photo.png").
+	Filename string
+	// Data is the raw file bytes.
+	Data []byte
+	// MediaType is the MIME type (e.g. "image/png", "audio/wav").
+	MediaType string
+}
+
+// MCPResourceReader is a callback that reads an MCP resource by server name
+// and URI. It returns text, binary data, MIME type, an isBlob flag, and an
+// error. ProcessFileAttachments uses it to resolve @mcp:server:uri tokens.
+type MCPResourceReader func(serverName, uri string) (text string, blobData []byte, mimeType string, isBlob bool, err error)
+
+// FileAttachmentResult is the result of processing @file references in user
+// input. Text files are inlined as XML in ProcessedText; binary files (images,
+// audio, video, PDFs) are returned as FileParts for multimodal submission.
+type FileAttachmentResult struct {
+	// ProcessedText is the user's text with @file tokens replaced:
+	// text files become XML-wrapped content; binary file tokens are removed.
+	ProcessedText string
+	// FileParts contains binary file attachments extracted from @file
+	// references. Empty when all referenced files are text.
+	FileParts []FilePart
+}
+
 // fileTokenPattern matches @file references in user text. Supports:
 //   - @"path with spaces.txt" (quoted)
 //   - @path/to/file.txt      (unquoted, no spaces)
 var fileTokenPattern = regexp.MustCompile(`@"[^"]+"|@[^\s]+`)

 // ProcessFileAttachments scans the user's input text for @file references,
-// reads each referenced file, and returns the text with @tokens replaced by
-// XML-wrapped file content. Non-file @ tokens (like email addresses) are left
-// unchanged.
+// reads each referenced file, and returns a result containing the processed
+// text and any binary file attachments. Text files are XML-wrapped inline;
+// binary files (images, audio, etc.) are extracted as FileParts for multimodal
+// submission. Non-file @ tokens (like email addresses) are left unchanged.
 //
-// Returns the original text unchanged if no valid @file references are found.
-func ProcessFileAttachments(text string, cwd string) string {
-	return fences.ReplaceOutside(text, func(segment string) string {
-		return processFileTokens(segment, cwd)
+// MCP resources are supported via @mcp:server:uri tokens. The optional
+// mcpReader callback is used to resolve them; pass nil to skip MCP resources.
+func ProcessFileAttachments(text string, cwd string, mcpReader ...MCPResourceReader) FileAttachmentResult {
+	var reader MCPResourceReader
+	if len(mcpReader) > 0 {
+		reader = mcpReader[0]
+	}
+	var allParts []FilePart
+	processed := fences.ReplaceOutside(text, func(segment string) string {
+		result, parts := processFileTokens(segment, cwd, reader)
+		allParts = append(allParts, parts...)
+		return result
 	})
+	return FileAttachmentResult{
+		ProcessedText: processed,
+		FileParts:     allParts,
+	}
 }

 // processFileTokens handles @file replacement in a single text segment
-// that is known to be outside fenced code blocks.
-func processFileTokens(text string, cwd string) string {
+// that is known to be outside fenced code blocks. Returns the processed
+// text and any binary file parts extracted.
+func processFileTokens(text string, cwd string, mcpReader MCPResourceReader) (string, []FilePart) {
 	tokens := fileTokenPattern.FindAllString(text, -1)
 	if len(tokens) == 0 {
-		return text
+		return text, nil
 	}

+	var parts []FilePart
 	result := text
 	for _, token := range tokens {
 		path := tokenToPath(token)
@@ -42,6 +88,43 @@ func processFileTokens(text string, cwd string) string {
 			continue
 		}

+		// Check for MCP resource reference: @mcp:server:uri
+		if strings.HasPrefix(path, "mcp:") {
+			if mcpReader == nil {
+				continue
+			}
+			mcpRef := path[4:] // strip "mcp:"
+			// Split into server:uri (first colon separates server from URI)
+			serverName, uri, ok := strings.Cut(mcpRef, ":")
+			if !ok || serverName == "" || uri == "" {
+				continue // invalid format
+			}
+
+			textContent, blobData, mimeType, isBlob, err := mcpReader(serverName, uri)
+			if err != nil {
+				continue // skip on error, leave token as-is
+			}
+
+			if isBlob {
+				// Binary MCP resource → extract as FilePart.
+				filename := filepath.Base(uri)
+				if filename == "." || filename == "/" {
+					filename = serverName + "_resource"
+				}
+				parts = append(parts, FilePart{
+					Filename:  filename,
+					Data:      blobData,
+					MediaType: mimeType,
+				})
+				result = strings.Replace(result, token, "", 1)
+			} else {
+				// Text MCP resource → inline as XML.
+				wrapped := fmt.Sprintf("<resource uri=\"%s\" server=\"%s\">\n%s\n</resource>", uri, serverName, textContent)
+				result = strings.Replace(result, token, wrapped, 1)
+			}
+			continue
+		}
+
 		absPath, err := resolvePath(path, cwd)
 		if err != nil {
 			// Not a valid file reference — leave the token as-is.
@@ -69,12 +152,28 @@ func processFileTokens(text string, cwd string) string {
 			continue
 		}

-		// Build the XML-wrapped replacement.
-		wrapped := wrapFileContent(absPath, content)
-		result = strings.Replace(result, token, wrapped, 1)
+		mediaType := detectMediaType(absPath, content)
+
+		if isBinaryMediaType(mediaType) {
+			// Binary file → extract as a FilePart for multimodal submission.
+			// Remove the @token from the text.
+			parts = append(parts, FilePart{
+				Filename:  filepath.Base(absPath),
+				Data:      content,
+				MediaType: mediaType,
+			})
+			result = strings.Replace(result, token, "", 1)
+		} else {
+			// Text file → inline as XML-wrapped content.
+			wrapped := wrapFileContent(absPath, content)
+			result = strings.Replace(result, token, wrapped, 1)
+		}
 	}

-	return result
+	// Trim leading/trailing whitespace left behind by removed binary tokens.
+	result = strings.TrimSpace(result)
+
+	return result, parts
 }

 // tokenToPath strips the @ prefix and optional quotes from a token,
@@ -137,3 +236,86 @@ func resolvePath(path string, cwd string) (string, error) {
 func wrapFileContent(absPath string, content []byte) string {
 	return fmt.Sprintf("<file path=\"%s\">\n%s\n</file>", absPath, string(content))
 }
+
+// detectMediaType determines the MIME type of a file. It tries the mime
+// package's extension lookup first, then a small table of extensions that
+// lookup may miss, then content sniffing via net/http.DetectContentType.
+func detectMediaType(path string, content []byte) string {
+	// Extension-based detection is more reliable for well-known types.
+	ext := strings.ToLower(filepath.Ext(path))
+	if mt := mime.TypeByExtension(ext); mt != "" {
+		// mime.TypeByExtension may append parameters, as in
+		// "text/html; charset=utf-8". Strip them.
+		if base, _, ok := strings.Cut(mt, ";"); ok {
+			return strings.TrimSpace(base)
+		}
+		return mt
+	}
+
+	// Known extensions that mime package may miss.
+	switch ext {
+	case ".webp":
+		return "image/webp"
+	case ".avif":
+		return "image/avif"
+	case ".heic", ".heif":
+		return "image/heif"
+	case ".opus":
+		return "audio/opus"
+	case ".flac":
+		return "audio/flac"
+	case ".m4a":
+		return "audio/mp4"
+	case ".wasm":
+		return "application/wasm"
+	}
+
+	// Content sniffing fallback.
+	if len(content) > 0 {
+		detected := http.DetectContentType(content)
+		if detected != "" && detected != "application/octet-stream" {
+			if base, _, ok := strings.Cut(detected, ";"); ok {
+				return strings.TrimSpace(base)
+			}
+			return detected
+		}
+	}
+
+	// Default: treat as plain text so it gets XML-wrapped.
+	return "text/plain"
+}
+
+// isBinaryMediaType returns true if the MIME type represents a binary file
+// that should be sent as a multimodal FilePart rather than XML-wrapped text.
+func isBinaryMediaType(mediaType string) bool {
+	// Image types — always binary.
+	if strings.HasPrefix(mediaType, "image/") {
+		return true
+	}
+	// Audio types — always binary.
+	if strings.HasPrefix(mediaType, "audio/") {
+		return true
+	}
+	// Video types — always binary.
+	if strings.HasPrefix(mediaType, "video/") {
+		return true
+	}
+	// Specific application types that are binary.
+	switch mediaType {
+	case "application/pdf",
+		"application/zip",
+		"application/gzip",
+		"application/x-tar",
+		"application/octet-stream",
+		"application/wasm",
+		"application/x-executable",
+		"application/vnd.ms-excel",
+		"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
+		"application/vnd.ms-powerpoint",
+		"application/vnd.openxmlformats-officedocument.presentationml.presentation",
+		"application/msword",
+		"application/vnd.openxmlformats-officedocument.wordprocessingml.document":
+		return true
+	}
+	return false
+}
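The detection order used by `detectMediaType` and the text/binary split in `isBinaryMediaType` can be sketched as a standalone program. This is an illustration under assumptions rather than the code from the diff: `knownExt`, `stripParams`, `sniffType`, and `isBinary` are hypothetical names, and a small hard-coded extension table stands in for `mime.TypeByExtension` so that the result does not depend on the host's MIME database.

```go
package main

import (
	"fmt"
	"net/http"
	"path/filepath"
	"strings"
)

// knownExt is an illustrative extension table (an assumption of this
// sketch); the real code consults mime.TypeByExtension first.
var knownExt = map[string]string{
	".png":  "image/png",
	".webp": "image/webp",
	".pdf":  "application/pdf",
}

// stripParams drops MIME parameters such as "; charset=utf-8".
func stripParams(mt string) string {
	base, _, _ := strings.Cut(mt, ";")
	return strings.TrimSpace(base)
}

// sniffType tries the extension table, then content sniffing, and falls
// back to text/plain when neither gives a specific answer.
func sniffType(path string, content []byte) string {
	ext := strings.ToLower(filepath.Ext(path))
	if mt, ok := knownExt[ext]; ok {
		return mt
	}
	if len(content) > 0 {
		mt := stripParams(http.DetectContentType(content))
		if mt != "application/octet-stream" {
			return mt
		}
	}
	return "text/plain"
}

// isBinary reports whether a type should travel as an attachment
// instead of being inlined as text. The list is deliberately short.
func isBinary(mt string) bool {
	for _, prefix := range []string{"image/", "audio/", "video/"} {
		if strings.HasPrefix(mt, prefix) {
			return true
		}
	}
	return mt == "application/pdf"
}

func main() {
	for _, tc := range []struct {
		path    string
		content []byte
	}{
		{"logo.PNG", nil},
		{"notes", []byte("hello")},
		{"anim", []byte("GIF89a")},
	} {
		mt := sniffType(tc.path, tc.content)
		fmt.Printf("%s: %s binary=%v\n", tc.path, mt, isBinary(mt))
	}
}
```

Checking the extension before sniffing keeps the result stable for files whose first bytes are ambiguous, at the cost of trusting a possibly wrong extension.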
@@ -0,0 +1,209 @@
+package fileutil
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+)
+
+func TestProcessFileAttachments_TextFile(t *testing.T) {
+	// Create a temp text file
+	dir := t.TempDir()
+	textFile := filepath.Join(dir, "hello.txt")
+	if err := os.WriteFile(textFile, []byte("hello world"), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	text := "@" + textFile + " check this out"
+	result := ProcessFileAttachments(text, dir)
+
+	if len(result.FileParts) != 0 {
+		t.Errorf("expected 0 FileParts for text file, got %d", len(result.FileParts))
+	}
+	if result.ProcessedText == text {
+		t.Error("expected text file to be XML-wrapped, but got original text unchanged")
+	}
+	// Should contain XML wrapping
+	if !contains(result.ProcessedText, "<file path=") {
+		t.Error("expected XML <file> wrapping in processed text")
+	}
+	if !contains(result.ProcessedText, "hello world") {
+		t.Error("expected file content in processed text")
+	}
+}
+
+func TestProcessFileAttachments_BinaryFile(t *testing.T) {
+	// Create a minimal PNG file (binary)
+	dir := t.TempDir()
+	pngFile := filepath.Join(dir, "image.png")
+	// Minimal valid PNG header
+	pngData := []byte{
+		0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, // PNG signature
+		0x00, 0x00, 0x00, 0x0D, 0x49, 0x48, 0x44, 0x52, // IHDR chunk
+		0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, // 1x1
+		0x08, 0x02, 0x00, 0x00, 0x00, 0x90, 0x77, 0x53, 0xDE, // 8bit RGB
+		0x00, 0x00, 0x00, 0x0C, 0x49, 0x44, 0x41, 0x54, // IDAT chunk
+		0x08, 0xD7, 0x63, 0xF8, 0xCF, 0xC0, 0x00, 0x00,
+		0x00, 0x02, 0x00, 0x01, 0xE2, 0x21, 0xBC, 0x33,
+		0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4E, 0x44, // IEND chunk
+		0xAE, 0x42, 0x60, 0x82,
+	}
+	if err := os.WriteFile(pngFile, pngData, 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	text := "@" + pngFile + " what is this image?"
+	result := ProcessFileAttachments(text, dir)
+
+	if len(result.FileParts) != 1 {
+		t.Fatalf("expected 1 FilePart for binary file, got %d", len(result.FileParts))
+	}
+	if result.FileParts[0].MediaType != "image/png" {
+		t.Errorf("expected media type image/png, got %s", result.FileParts[0].MediaType)
+	}
+	if result.FileParts[0].Filename != "image.png" {
+		t.Errorf("expected filename image.png, got %s", result.FileParts[0].Filename)
+	}
+	// The @token should be removed from the text.
+	if contains(result.ProcessedText, "@"+pngFile) {
+		t.Error("expected @token to be removed from processed text for binary file")
+	}
+
+	// The prompt text surrounding the token should remain.
+	if !contains(result.ProcessedText, "what is this image?") {
+		t.Error("expected prompt text to remain in processed text")
+	}
+}
+
+func TestProcessFileAttachments_MCPResource(t *testing.T) {
+	// Test @mcp:server:uri token processing with a mock reader
+	text := "@mcp:test-server:docs://readme tell me about this"
+	reader := func(serverName, uri string) (string, []byte, string, bool, error) {
+		if serverName != "test-server" || uri != "docs://readme" {
+			t.Errorf("unexpected server/uri: %s/%s", serverName, uri)
+		}
+		return "Hello from MCP resource", nil, "text/plain", false, nil
+	}
+
+	result := ProcessFileAttachments(text, "/tmp", reader)
+
+	if len(result.FileParts) != 0 {
+		t.Errorf("expected 0 FileParts for text MCP resource, got %d", len(result.FileParts))
+	}
+	if !contains(result.ProcessedText, "<resource uri=\"docs://readme\" server=\"test-server\">") {
+		t.Error("expected <resource> XML wrapping in processed text")
+	}
+	if !contains(result.ProcessedText, "Hello from MCP resource") {
+		t.Error("expected MCP resource content in processed text")
+	}
+}
+
+func TestProcessFileAttachments_MCPResource_Binary(t *testing.T) {
+	// Test @mcp:server:uri token processing for a binary resource
+	text := "@mcp:test-server:images://logo describe this"
+	reader := func(serverName, uri string) (string, []byte, string, bool, error) {
+		if serverName != "test-server" || uri != "images://logo" {
+			t.Errorf("unexpected server/uri: %s/%s", serverName, uri)
+		}
+		return "", []byte{0x89, 0x50, 0x4E, 0x47}, "image/png", true, nil
+	}
+
+	result := ProcessFileAttachments(text, "/tmp", reader)
+
+	if len(result.FileParts) != 1 {
+		t.Fatalf("expected 1 FilePart for binary MCP resource, got %d", len(result.FileParts))
+	}
+	if result.FileParts[0].MediaType != "image/png" {
+		t.Errorf("expected media type image/png, got %s", result.FileParts[0].MediaType)
+	}
+	if result.FileParts[0].Filename != "logo" {
+		t.Errorf("expected filename 'logo', got %s", result.FileParts[0].Filename)
+	}
+	// The @token should be removed from the text
+	if contains(result.ProcessedText, "@mcp:") {
+		t.Error("expected @mcp: token to be removed from processed text for binary resource")
+	}
+}
+
+func TestProcessFileAttachments_NoReader(t *testing.T) {
+	// Without an MCP reader, @mcp: tokens should be left as-is
+	text := "@mcp:server:resource this is a test"
+	result := ProcessFileAttachments(text, "/tmp")
+
+	if len(result.FileParts) != 0 {
+		t.Errorf("expected 0 FileParts, got %d", len(result.FileParts))
+	}
+	// The @mcp: token should remain unchanged since no reader was provided
+	if result.ProcessedText != text {
+		t.Errorf("expected text unchanged without reader, got: %s", result.ProcessedText)
+	}
+}
+
+func TestDetectMediaType(t *testing.T) {
+	tests := []struct {
+		ext      string
+		content  []byte
+		expected string
+	}{
+		// An intentionally-synthetic extension that is not registered
+		// in any system MIME database. Exercises the "unknown ext +
+		// no content" branch, which must return the text/plain default.
+		// Do not use real extensions (e.g. .go) here: CI images often
+		// ship /etc/mime.types with entries like ".go → text/x-go",
+		// which would make the assertion environment-dependent.
+		{".kitsyntheticext", nil, "text/plain"},
+		{".png", []byte{0x89, 0x50, 0x4E, 0x47}, "image/png"},
+		{".jpg", []byte{0xFF, 0xD8, 0xFF}, "image/jpeg"},
+		{".pdf", []byte{0x25, 0x50, 0x44, 0x46}, "application/pdf"},
+		{".txt", []byte("hello"), "text/plain"},
+		{".wav", nil, "audio/wav"},
+		{".webp", nil, "image/webp"},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.ext, func(t *testing.T) {
+			got := detectMediaType("test"+tt.ext, tt.content)
+			if got != tt.expected {
+				t.Errorf("detectMediaType(%q) = %q, want %q", tt.ext, got, tt.expected)
+			}
+		})
+	}
+}
+
+func TestIsBinaryMediaType(t *testing.T) {
+	tests := []struct {
+		mimeType string
+		expected bool
+	}{
+		{"image/png", true},
+		{"image/jpeg", true},
+		{"audio/wav", true},
+		{"video/mp4", true},
+		{"application/pdf", true},
+		{"text/plain", false},
+		{"text/go", false},
+		{"application/json", false},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.mimeType, func(t *testing.T) {
+			got := isBinaryMediaType(tt.mimeType)
+			if got != tt.expected {
+				t.Errorf("isBinaryMediaType(%q) = %v, want %v", tt.mimeType, got, tt.expected)
+			}
+		})
+	}
+}
+
+func contains(s, substr string) bool {
+	return len(s) >= len(substr) && (s == substr || len(s) > 0 && containsStr(s, substr))
+}
+
+func containsStr(s, substr string) bool {
+	for i := 0; i <= len(s)-len(substr); i++ {
+		if s[i:i+len(substr)] == substr {
+			return true
+		}
+	}
+	return false
+}
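The `@mcp:server:uri` token format exercised by the MCP tests splits on the first colon after the `mcp:` prefix, so URIs that themselves contain colons (such as `docs://readme`) survive intact. A minimal sketch of that parsing step follows. `parseMCPToken` is a hypothetical helper written for illustration; the diff performs the same steps inline, and its quote handling lives in `tokenToPath`, which is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMCPToken splits a token of the form @mcp:server:uri (optionally
// quoted as @"mcp:server:uri") into its server name and URI. It is a
// hypothetical helper, not a function from the project.
func parseMCPToken(token string) (server, uri string, ok bool) {
	ref := strings.TrimPrefix(token, "@")
	ref = strings.Trim(ref, `"`)
	if !strings.HasPrefix(ref, "mcp:") {
		return "", "", false
	}
	// Only the first colon separates server from URI, so the URI may
	// contain further colons.
	server, uri, found := strings.Cut(ref[len("mcp:"):], ":")
	if !found || server == "" || uri == "" {
		return "", "", false
	}
	return server, uri, true
}

func main() {
	for _, tok := range []string{
		"@mcp:test-server:docs://readme",
		`@"mcp:files:my notes.txt"`,
		"@mcp:server",
		"@README.md",
	} {
		server, uri, ok := parseMCPToken(tok)
		fmt.Printf("%q -> server=%q uri=%q ok=%v\n", tok, server, uri, ok)
	}
}
```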
@@ -17,6 +17,7 @@ type Renderer interface {
 	RenderReasoningBlock(content string, timestamp time.Time) UIMessage
 	RenderToolMessage(toolName, toolArgs, toolResult string, isError bool) UIMessage
 	RenderSystemMessage(content string, timestamp time.Time) UIMessage
+	RenderCustomMessage(content, label string, timestamp time.Time) UIMessage
 	RenderErrorMessage(errorMsg string, timestamp time.Time) UIMessage
 	RenderDebugMessage(message string, timestamp time.Time) UIMessage
 	RenderDebugConfigMessage(config map[string]any, timestamp time.Time) UIMessage
@@ -2,6 +2,7 @@ package ui

 import (
 	"fmt"
+	"sort"
 	"strings"

 	"charm.land/bubbles/v2/key"
@@ -39,7 +40,6 @@ type InputComponent struct {
 	width       int
 	lastValue   string
 	popupHeight int
-	title       string
 	submitNext  bool // defer submit one tick so popup dismisses cleanly

 	// Argument completion state. When the user types "/cmd " followed by
@@ -61,6 +61,10 @@ type InputComponent struct {
 	// autocomplete suggestions. Set by the parent via SetCwd.
 	cwd string

+	// mcpResources is a callback that returns available MCP resources for
+	// the @ autocomplete popup. Set by the parent via SetMCPResourceProvider.
+	mcpResources func() []FileSuggestion
+
 	// appCtrl is used for slash commands that mutate app state.
 	// May be nil in tests; nil-safe.
 	appCtrl AppController
@@ -101,17 +105,17 @@ type clipboardImageMsg struct {
 	err   error
 }

-// NewInputComponent creates a new InputComponent with the given width, title,
-// and optional AppController. If appCtrl is nil the component still works but
+// NewInputComponent creates a new InputComponent with the given width and
+// optional AppController. If appCtrl is nil the component still works but
 // /clear and /clear-queue are no-ops.
-func NewInputComponent(width int, title string, appCtrl AppController) *InputComponent {
+func NewInputComponent(width int, appCtrl AppController) *InputComponent {
 	ta := textarea.New()
 	ta.Placeholder = "Type your message..."
 	ta.ShowLineNumbers = false
 	ta.Prompt = ""
 	ta.CharLimit = 0
 	ta.SetWidth(width - 8) // Account for container padding, border and internal padding
-	ta.SetHeight(3)        // Default to 3 lines like huh
+	ta.SetHeight(4)        // 4 lines for comfortable multi-line input
 	ta.Focus()

 	// Override InsertNewline so only ctrl+j and shift+enter insert newlines.
@@ -136,8 +140,8 @@ func NewInputComponent(width int, title string, appCtrl AppController) *InputCom
 		commands:    commands.SlashCommands,
 		width:       width,
 		popupHeight: 7,
-		title:       title,
 		appCtrl:     appCtrl,
+		hideHint:    true,
 	}
 }

@@ -147,6 +151,12 @@ func (s *InputComponent) SetCwd(cwd string) {
 	s.cwd = cwd
 }

+// SetMCPResourceProvider sets a callback that returns MCP resource suggestions
+// for the @ autocomplete popup. Called by the parent after construction.
+func (s *InputComponent) SetMCPResourceProvider(fn func() []FileSuggestion) {
+	s.mcpResources = fn
+}
+
 // Init implements tea.Model. Starts the cursor blink animation.
 func (s *InputComponent) Init() tea.Cmd {
 	return textarea.Blink
@@ -190,7 +200,7 @@ func (s *InputComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 	case tea.KeyPressMsg:
 		if !s.showPopup {
 			switch msg.String() {
-			case "ctrl+d", "enter":
+			case "enter":
 				value := s.textarea.Value()
 				s.pushHistory(value)
 				s.textarea.SetValue("")
@@ -332,9 +342,46 @@ func (s *InputComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {

 			// Check for @file trigger first.
 			cursorCol := len(line) // approximate: cursor is at end after typing
-			if hasAt, prefix, atIdx := ExtractAtPrefix(line, cursorCol); hasAt && s.cwd != "" {
-				suggestions := GetFileSuggestions(prefix, s.cwd)
+			if hasAt, prefix, atIdx := ExtractAtPrefix(line, cursorCol); hasAt {
+				var suggestions []FileSuggestion
+
+				// Local file suggestions (only if cwd is set).
+				if s.cwd != "" {
+					suggestions = GetFileSuggestions(prefix, s.cwd)
+				}
+
+				// MCP resource suggestions — merge with file suggestions.
+				if s.mcpResources != nil {
+					mcpSuggestions := s.mcpResources()
+					if prefix != "" {
+						// Fuzzy-filter MCP resources against the typed prefix.
+						queryLower := strings.ToLower(prefix)
+						var filtered []FileSuggestion
+						for _, r := range mcpSuggestions {
+							score := scoreFilePath(queryLower, r.RelPath)
+							if score <= 0 {
+								// Fall back to matching against "<server>/<RelPath>".
+								score = scoreFilePath(queryLower, r.MCPServerName+"/"+r.RelPath)
+							}
+							if score > 0 {
+								r.Score = score
+								filtered = append(filtered, r)
+							}
+						}
+						mcpSuggestions = filtered
+					}
+					suggestions = append(suggestions, mcpSuggestions...)
+				}
+
 				if len(suggestions) > 0 {
+					// Sort by score descending, cap at maxFileSuggestions.
+					sort.Slice(suggestions, func(i, j int) bool {
+						return suggestions[i].Score > suggestions[j].Score
+					})
+					if len(suggestions) > maxFileSuggestions {
+						suggestions = suggestions[:maxFileSuggestions]
+					}
+
 					s.showPopup = true
 					s.fileMode = true
 					s.argMode = false
@@ -348,6 +395,8 @@ func (s *InputComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 						desc := ""
 						if fs.IsDir {
 							desc = "directory"
+						} else if fs.IsMCPResource {
+							desc = "mcp:" + fs.MCPServerName
 						}
 						s.fileSynthCmds[i] = commands.SlashCommand{Name: name, Description: desc}
 						s.filtered[i] = FuzzyMatch{Command: &s.fileSynthCmds[i], Score: fs.Score}
@@ -470,19 +519,13 @@ func (s *InputComponent) resetHistoryBrowsing() {
 	s.savedInput = ""
 }

-// View implements tea.Model. Renders the title, textarea, autocomplete popup
+// View implements tea.Model. Renders the textarea, autocomplete popup
 // (if visible), and help text.
 func (s *InputComponent) View() tea.View {
 	containerStyle := lipgloss.NewStyle()

 	theme := style.GetTheme()

-	// PaddingLeft(3) aligns with message content: border(1) + paddingLeft(2).
-	titleStyle := lipgloss.NewStyle().
-		Foreground(theme.Text).
-		MarginBottom(1).
-		PaddingLeft(3)
-
 	inputBoxStyle := lipgloss.NewStyle().
 		Border(lipgloss.ThickBorder()).
 		BorderLeft(true).
@@ -490,12 +533,12 @@ func (s *InputComponent) View() tea.View {
 		BorderTop(false).
 		BorderBottom(false).
 		BorderForeground(theme.Primary).
+		MarginTop(1).
+		MarginBottom(1).
 		PaddingLeft(2).    // match message block paddingLeft
 		Width(s.width - 1) // full width minus left border

 	var view strings.Builder
-	view.WriteString(titleStyle.Render(s.title))
-	view.WriteString("\n")
 	view.WriteString(inputBoxStyle.Render(s.textarea.View()))

 	// Popup is now rendered as a centered overlay in AppModel.View()
@@ -658,9 +701,25 @@ func (s *InputComponent) renderPopupWithOptions(centered bool) string {
 				}
 				content = indicator + displayName
 			} else {
-				nameWidth := 15
-				if innerWidth < 25 {
-					nameWidth = max(innerWidth*2/5+1, 8)
+				// Compute nameWidth from the longest command name in the
+				// visible slice so we never truncate unnecessarily.
+				nameWidth := 0
+				for _, fm := range s.filtered {
+					if n := len([]rune(fm.Command.Name)); n > nameWidth {
+						nameWidth = n
+					}
+				}
+				nameWidth += 3 // account for indicator prefix (2) + gap before description (1)
+				// Ensure descriptions still get at least 20 chars when possible.
+				maxForName := innerWidth - 20
+				if maxForName < 8 {
+					maxForName = innerWidth * 2 / 3
+				}
+				if nameWidth > maxForName {
+					nameWidth = maxForName
+				}
+				if nameWidth < 8 {
+					nameWidth = 8
 				}
 				maxNameChars := nameWidth - 2
 				displayName := sc.Name
@@ -793,9 +852,25 @@ func (s *InputComponent) PendingImageCount() int {
 	return len(s.pendingImages)
 }

+// Clear clears the textarea content and resets related state. Returns true if
+// there was content to clear, false if the input was already empty.
+func (s *InputComponent) Clear() bool {
+	hadContent := s.textarea.Value() != ""
+	s.textarea.SetValue("")
+	s.textarea.CursorEnd()
+	s.lastValue = ""
+	s.showPopup = false
+	s.argMode = false
+	s.fileMode = false
+	s.browsingHistory = false
+	s.savedInput = ""
+	return hadContent
+}
+
 // applyFileCompletion replaces the @prefix in the textarea with the selected
-// file suggestion. For directories, it keeps the popup open for further
-// drilling. For files, it closes the popup and adds a trailing space.
+// file or MCP resource suggestion. For directories, it keeps the popup open
+// for further drilling. For files and resources, it closes the popup and adds
+// a trailing space.
 func (s *InputComponent) applyFileCompletion(idx int) {
 	if idx >= len(s.fileSuggestions) {
 		return
@@ -812,19 +887,30 @@ func (s *InputComponent) applyFileCompletion(idx int) {

 	// Reconstruct: everything before the @ on the last line + @<path>
 	beforeAt := lastLine[:s.fileAtStartIdx]
-	needsQuote := strings.Contains(suggestion.RelPath, " ")

 	var replacement string
-	if needsQuote {
-		replacement = `@"` + suggestion.RelPath + `"`
-	} else {
-		replacement = "@" + suggestion.RelPath
-	}
-
-	// For files, add a trailing space. For directories, don't — allow
-	// continued drilling into the directory.
-	if !suggestion.IsDir {
+	if suggestion.IsMCPResource {
+		// MCP resources use @mcp:server:uri format.
+		// Quote if the reference (server name or URI) contains spaces.
+		ref := "mcp:" + suggestion.MCPServerName + ":" + suggestion.MCPResourceURI
+		if strings.Contains(ref, " ") {
+			replacement = `@"` + ref + `"`
+		} else {
+			replacement = "@" + ref
+		}
 		replacement += " "
+	} else {
+		needsQuote := strings.Contains(suggestion.RelPath, " ")
+		if needsQuote {
+			replacement = `@"` + suggestion.RelPath + `"`
+		} else {
+			replacement = "@" + suggestion.RelPath
+		}
+		// For files, add a trailing space. For directories, don't — allow
+		// continued drilling into the directory.
+		if !suggestion.IsDir {
+			replacement += " "
+		}
 	}

 	newLastLine := beforeAt + replacement
@@ -836,7 +922,7 @@ func (s *InputComponent) applyFileCompletion(idx int) {
 	s.textarea.SetValue(newValue)
 	s.textarea.CursorEnd()

-	if suggestion.IsDir {
+	if suggestion.IsDir && !suggestion.IsMCPResource {
 		// Keep popup open — trigger a refresh for the new directory.
 		s.lastValue = "" // force re-evaluation on next update tick
 	} else {
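The replacement rules in `applyFileCompletion` come down to three decisions: which reference string to emit, whether to quote it, and whether to append a trailing space. The sketch below isolates them as a pure function. The `suggestion` struct and `completionText` are hypothetical stand-ins introduced for illustration; the field names only loosely follow `FileSuggestion`.

```go
package main

import (
	"fmt"
	"strings"
)

// suggestion is a hypothetical, reduced version of a completion entry.
type suggestion struct {
	RelPath string // local path, relative to the working directory
	Server  string // MCP server name (MCP resources only)
	URI     string // MCP resource URI (MCP resources only)
	IsDir   bool
	IsMCP   bool
}

// completionText builds the text that replaces the typed @prefix.
// References with spaces are quoted. Files and MCP resources get a
// trailing space; directories do not, so the user can keep drilling.
func completionText(s suggestion) string {
	ref := s.RelPath
	if s.IsMCP {
		ref = "mcp:" + s.Server + ":" + s.URI
	}
	out := "@" + ref
	if strings.Contains(ref, " ") {
		out = `@"` + ref + `"`
	}
	if s.IsMCP || !s.IsDir {
		out += " "
	}
	return out
}

func main() {
	fmt.Printf("%q\n", completionText(suggestion{RelPath: "cmd/main.go"}))
	fmt.Printf("%q\n", completionText(suggestion{RelPath: "internal/", IsDir: true}))
	fmt.Printf("%q\n", completionText(suggestion{RelPath: "my notes.txt"}))
	fmt.Printf("%q\n", completionText(suggestion{Server: "docs", URI: "docs://readme", IsMCP: true}))
}
```

Keeping this logic in a pure function makes the quoting and spacing rules testable without a textarea.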
@@ -109,8 +109,8 @@ func (m *TextMessageItem) renderContent(width int) string {
 // It accumulates content chunks and re-renders on each update for live display.
 type StreamingMessageItem struct {
 	id            string
-	role          string // "assistant" or "reasoning"
-	content       string // Accumulated streaming content
+	role          string          // "assistant" or "reasoning"
+	content       strings.Builder // Accumulated streaming content
 	timestamp     time.Time
 	startTime     time.Time // When streaming started (for live duration counter)
 	modelName     string
@@ -156,10 +156,10 @@ func (s *StreamingMessageItem) Render(width int) string {
 			durationMs = time.Since(s.startTime).Milliseconds()
 		}
 		ty := createTypography(style.GetTheme())
-		rendered = render.ReasoningBlock(s.content, durationMs, ty, style.GetTheme())
+		rendered = render.ReasoningBlock(s.content.String(), durationMs, width, ty, style.GetTheme())
 	} else {
 		// Render as assistant message
-		rendered = render.AssistantBlock(s.content, width, style.GetTheme())
+		rendered = render.AssistantBlock(s.content.String(), width, style.GetTheme())
 	}

 	// Cache and return (but reasoning is never cached due to live duration)
@@ -187,7 +187,7 @@ func (s *StreamingMessageItem) Height() int {

 // AppendChunk adds a content chunk and invalidates the render cache.
 func (s *StreamingMessageItem) AppendChunk(chunk string) {
-	s.content += chunk
+	s.content.WriteString(chunk)
 	s.cachedWidth = 0 // Invalidate cache
 }

@@ -243,9 +243,7 @@ func (m *StreamingBashOutputItem) Render(width int) string {

 	// Header with command
 	if m.command != "" {
-		headerStyle := lipgloss.NewStyle().
-			Foreground(theme.Muted).
-			Italic(true)
+		headerStyle := style.GetCachedStyles().BashHeader
 		parts = append(parts, headerStyle.Render(fmt.Sprintf("▸ %s", m.command)))
 	}
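Switching `content` from `string` to `strings.Builder` changes how chunks accumulate: repeated `+=` copies the whole string on each append, while `WriteString` grows a buffer in place. The sketch below shows the pattern in isolation. `streamItem` and its `Text` method are hypothetical names for illustration, not the project's `StreamingMessageItem`.

```go
package main

import (
	"fmt"
	"strings"
)

// streamItem is a hypothetical, trimmed-down stand-in for a streaming
// message item; only the accumulation logic is shown.
type streamItem struct {
	content     strings.Builder // accumulated chunks
	cachedWidth int             // 0 means "no cached render"
}

// AppendChunk adds a chunk and invalidates the render cache.
func (s *streamItem) AppendChunk(chunk string) {
	s.content.WriteString(chunk)
	s.cachedWidth = 0
}

// Text returns everything accumulated so far.
func (s *streamItem) Text() string {
	return s.content.String()
}

func main() {
	item := &streamItem{cachedWidth: 80}
	for _, chunk := range []string{"Hel", "lo, ", "world"} {
		item.AppendChunk(chunk)
	}
	fmt.Println(item.Text(), item.cachedWidth)
}
```

A non-zero `strings.Builder` must not be copied, so a struct that embeds one should be passed by pointer, as the pointer receivers here do.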

@@ -150,9 +150,26 @@ func (r *MessageRenderer) SetWidth(width int) {
 	r.width = width
 }

-// RenderUserMessage renders a user's input message using herald Tip alert
+// RenderUserMessage renders a user's input message with a colored left border.
 func (r *MessageRenderer) RenderUserMessage(content string, timestamp time.Time) UIMessage {
-	rendered := render.UserBlock(content, r.width, r.ty, style.GetTheme())
+	if strings.TrimSpace(content) == "" {
+		content = "(empty message)"
+	}
+
+	theme := style.GetTheme()
+
+	// Highlight @file tokens with accent color.
+	content = render.HighlightFileTokens(content, theme)
+
+	rendered := renderContentBlock(
+		content,
+		r.width,
+		WithAlign(lipgloss.Left),
+		WithBorderColor(theme.Success),
+		WithPaddingTop(0),
+		WithPaddingBottom(0),
+		WithMarginBottom(1),
+	)

 	return UIMessage{
 		Type:      UserMessage,
@@ -178,7 +195,7 @@ func (r *MessageRenderer) RenderAssistantMessage(content string, timestamp time.
 // as live streaming: muted italic text with margin. This is used when resuming
 // sessions to display saved reasoning content.
 func (r *MessageRenderer) RenderReasoningBlock(content string, timestamp time.Time) UIMessage {
-	rendered := render.ReasoningBlock(content, 0, r.ty, style.GetTheme())
+	rendered := render.ReasoningBlock(content, 0, r.width, r.ty, style.GetTheme())

 	return UIMessage{
 		Type:      AssistantMessage,
@@ -200,6 +217,19 @@ func (r *MessageRenderer) RenderSystemMessage(content string, timestamp time.Tim
 	}
 }

+// RenderCustomMessage renders a message with a custom alert label (e.g. "Help").
+// Content is rendered as markdown.
+func (r *MessageRenderer) RenderCustomMessage(content, label string, timestamp time.Time) UIMessage {
+	rendered := render.CustomBlock(content, label, r.width, style.GetTheme())
+
+	return UIMessage{
+		Type:      SystemMessage,
+		Content:   rendered,
+		Height:    lipgloss.Height(rendered),
+		Timestamp: timestamp,
+	}
+}
+
 // RenderDebugMessage renders diagnostic and debugging information
 func (r *MessageRenderer) RenderDebugMessage(message string, timestamp time.Time) UIMessage {
 	header := r.ty.H6("🔍 Debug Output")
@@ -308,7 +338,7 @@ func (r *MessageRenderer) RenderToolMessage(toolName, toolArgs, toolResult strin
 	// Build the content: icon + name + params on first line, then body
 	headerLine := styledIcon + " " + styledName
 	if params != "" {
-		headerLine += " " + lipgloss.NewStyle().Foreground(theme.Muted).Render(params)
+		headerLine += " " + style.GetCachedStyles().ToolMuted.Render(params)
 	}

 	// Get body content
@@ -399,7 +429,8 @@ func createTypography(theme style.Theme) *herald.Typography {
 		herald.WithCodeLineNumbers(true),
 		// Customize alert labels
 		herald.WithAlertLabel(herald.AlertNote, "Info"),
-		herald.WithAlertLabel(herald.AlertTip, "You"),
+		herald.WithAlertLabel(herald.AlertTip, ""),
+		herald.WithAlertIcon(herald.AlertTip, ""),
 		herald.WithAlertLabel(herald.AlertWarning, "Working"),
 		herald.WithAlertLabel(herald.AlertCaution, "Error"),
 	)
@@ -134,6 +134,34 @@ type SkillItem struct {
 	Source string // "project" or "user" (global).
 }

+// MCPPromptInfo describes an MCP prompt for display in the TUI (autocomplete,
+// help). This is a pure UI type — it carries no MCP client dependencies.
+type MCPPromptInfo struct {
+	Name        string             // Prompt name on the MCP server.
+	Description string             // Human-readable description.
+	Arguments   []MCPPromptArgInfo // Expected arguments.
+	ServerName  string             // Owning MCP server name.
+}
+
+// MCPPromptArgInfo describes an argument for an MCP prompt.
+type MCPPromptArgInfo struct {
+	Name        string
+	Description string
+	Required    bool
+}
+
+// MCPPromptExpandResult is the result of lazily expanding an MCP prompt.
+type MCPPromptExpandResult struct {
+	Messages []MCPPromptMessageInfo
+}
+
+// MCPPromptMessageInfo is a single message from an expanded MCP prompt.
+type MCPPromptMessageInfo struct {
+	Role      string // "user" or "assistant"
+	Content   string
+	FileParts []kit.LLMFilePart
+}
+
 // ToolRendererData holds extension-provided rendering functions for a specific
 // tool. The UI layer uses this to override the default tool header/body
 // rendering without depending on the extensions package directly.
@@ -310,6 +338,19 @@ type AppModelOptions struct {
 	// watcher detects changes. May be nil if prompt hot-reload is not needed.
 	GetPromptTemplates func() []*prompts.PromptTemplate

+	// MCPPrompts are prompts discovered from MCP servers at startup.
+	// They appear in autocomplete as /<server>:<prompt> commands.
+	MCPPrompts []MCPPromptInfo
+
+	// GetMCPPrompts, if non-nil, returns the current MCP prompts.
+	// Called on MCPToolsReadyEvent to refresh after background loading.
+	GetMCPPrompts func() []MCPPromptInfo
+
+	// ExpandMCPPrompt, if non-nil, lazily expands an MCP prompt by
+	// calling the MCP server's GetPrompt. Called asynchronously when the
+	// user invokes an MCP prompt slash command.
+	ExpandMCPPrompt func(serverName, promptName string, args map[string]string) (*MCPPromptExpandResult, error)
+
 	// ContextPaths lists absolute paths of loaded context files (e.g.
 	// AGENTS.md). Displayed in the [Context] startup section.
 	ContextPaths []string
@@ -425,6 +466,15 @@ type AppModelOptions struct {
 	IsReasoningModel bool
 	// SetThinkingLevel changes the thinking level on the agent/provider.
 	SetThinkingLevel func(level string) error
+
+	// GetMCPResources, if non-nil, returns FileSuggestion entries for all
+	// MCP resources available from connected servers. Used by the @
+	// autocomplete popup to merge resource suggestions with local files.
+	GetMCPResources func() []FileSuggestion
+
+	// MCPResourceReader, if non-nil, reads an MCP resource by server name
+	// and URI. Used at submit time to resolve @mcp:server:uri tokens.
+	MCPResourceReader fileutil.MCPResourceReader
 }

 // AppModel is the root Bubble Tea model for the interactive TUI. It owns the
@@ -534,6 +584,17 @@ type AppModel struct {
 	// refresh the template list after content hot-reload. May be nil.
 	getPromptTemplates func() []*prompts.PromptTemplate

+	// mcpPrompts are prompts discovered from MCP servers, shown as
+	// /<server>:<prompt> slash commands.
+	mcpPrompts []MCPPromptInfo
+
+	// getMCPPrompts returns the current MCP prompts. Called on
+	// MCPToolsReadyEvent to refresh after background loading.
+	getMCPPrompts func() []MCPPromptInfo
+
+	// expandMCPPrompt lazily expands an MCP prompt via the server.
+	expandMCPPrompt func(serverName, promptName string, args map[string]string) (*MCPPromptExpandResult, error)
+
 	// treeSelector is the tree navigation overlay, active in stateTreeSelector.
 	treeSelector *TreeSelectorComponent

@@ -647,6 +708,10 @@ type AppModel struct {
 	// cwd is the working directory for @file path resolution.
 	cwd string

+	// mcpResourceReader is an optional callback to read MCP resources when
+	// processing @mcp:server:uri tokens at submit time. Set by the parent.
+	mcpResourceReader fileutil.MCPResourceReader
+
 	// width and height track the terminal dimensions.
 	width  int
 	height int
@@ -655,6 +720,10 @@ type AppModel struct {
 	// disables alt screen to restore the terminal properly.
 	quitting bool

+	// ctrlCPressedOnce tracks whether Ctrl+C was pressed once (clearing input).
+	// A second Ctrl+C before the 3-second reset timer fires quits the app.
+	ctrlCPressedOnce bool
+
 	// streamingBashOutput holds the current streaming bash output lines.
 	// Lines are accumulated as they arrive and displayed in the stream region.
 	streamingBashOutput []string
@@ -762,6 +831,9 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 	m.extensionCommands = opts.ExtensionCommands
 	m.promptTemplates = opts.PromptTemplates
 	m.getPromptTemplates = opts.GetPromptTemplates
+	m.mcpPrompts = opts.MCPPrompts
+	m.getMCPPrompts = opts.GetMCPPrompts
+	m.expandMCPPrompt = opts.ExpandMCPPrompt
 	m.getWidgets = opts.GetWidgets
 	m.getHeader = opts.GetHeader
 	m.getFooter = opts.GetFooter
@@ -801,13 +873,21 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 	m.messages = []MessageItem{}

 	// Wire up child components now that we have the concrete implementations.
-	m.input = NewInputComponent(width, "Enter your prompt (Type /help for commands, Ctrl+C to quit)", appCtrl)
+	m.input = NewInputComponent(width, appCtrl)

 	// Wire up cwd for @file autocomplete.
 	if ic, ok := m.input.(*InputComponent); ok && opts.Cwd != "" {
 		ic.SetCwd(opts.Cwd)
 	}

+	// Wire up MCP resource provider for @ autocomplete.
+	if ic, ok := m.input.(*InputComponent); ok && opts.GetMCPResources != nil {
+		ic.SetMCPResourceProvider(opts.GetMCPResources)
+	}
+
+	// Wire up MCP resource reader for @mcp: token processing at submit time.
+	m.mcpResourceReader = opts.MCPResourceReader
+
 	// Merge extension commands into the InputComponent's autocomplete source.
 	if ic, ok := m.input.(*InputComponent); ok && len(opts.ExtensionCommands) > 0 {
 		for _, ec := range opts.ExtensionCommands {
@@ -832,6 +912,25 @@ func NewAppModel(appCtrl AppController, opts AppModelOptions) *AppModel {
 		}
 	}

+	// Merge MCP prompts into autocomplete as /<server>:<prompt> commands.
+	if ic, ok := m.input.(*InputComponent); ok && len(opts.MCPPrompts) > 0 {
+		for _, p := range opts.MCPPrompts {
+			hasArgs := false
+			for _, a := range p.Arguments {
+				if a.Required {
+					hasArgs = true
+					break
+				}
+			}
+			ic.commands = append(ic.commands, commands.SlashCommand{
+				Name:        fmt.Sprintf("/%s:%s", p.ServerName, p.Name),
+				Description: p.Description,
+				Category:    "MCP Prompts",
+				HasArgs:     hasArgs,
+			})
+		}
+	}
+
 	m.stream = NewStreamComponent(width, opts.ModelName)
 	m.stream.SetThinkingVisible(m.thinkingVisible)

@@ -945,7 +1044,7 @@ func (m *AppModel) AddStartupMessageToScrollList() {
 	// Add a visual separator after startup info: blank line + HR + blank line.
 	// Uses a single pre-rendered item so there are no left borders on the spacing.
 	theme := style.GetTheme()
-	separator := strings.Repeat("─", 80)
+	separator := strings.Repeat("─", m.width)
 	separatorStyled := lipgloss.NewStyle().
 		Foreground(theme.Border).
 		Render(separator)
@@ -1043,6 +1142,31 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		m.state = stateInput
 		if m.setModel != nil {
 			previousModel := m.providerName + "/" + m.modelName
+
+			// Check if thinking level needs adjustment for the new model.
+			// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
+			if m.thinkingLevel != "" && m.thinkingLevel != "off" {
+				parts := strings.SplitN(msg.ModelString, "/", 2)
+				if len(parts) == 2 {
+					modelName := parts[1]
+					currentLevel := models.ParseThinkingLevel(m.thinkingLevel)
+					if !models.IsValidThinkingLevelForModel(currentLevel, modelName) {
+						fallback := models.SuggestThinkingLevelFallback(currentLevel, modelName)
+						if fallback != models.ThinkingOff {
+							m.printSystemMessage(fmt.Sprintf(
+								"Note: Model %s doesn't support '%s' thinking level. Adjusted to '%s'.",
+								modelName, currentLevel, fallback,
+							))
+							m.thinkingLevel = string(fallback)
+							if m.setThinkingLevel != nil {
+								_ = m.setThinkingLevel(string(fallback))
+							}
+							go func() { _ = prefs.SaveThinkingLevelPreference(string(fallback)) }()
+						}
+					}
+				}
+			}
+
 			if err := m.setModel(msg.ModelString); err != nil {
 				m.printSystemMessage(fmt.Sprintf("Failed to switch model: %v", err))
 			} else {
@@ -1188,10 +1312,22 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 				m.overlayResponseCh = nil
 				m.overlay = nil
 			}
-			// Set quitting flag so View() disables alt screen for clean exit.
-			m.quitting = true
-			// Graceful quit: app.Close() is deferred in cmd/root.go.
-			return m, tea.Quit
+
+			// Second Ctrl+C within the timeout window — quit.
+			if m.ctrlCPressedOnce {
+				m.quitting = true
+				return m, tea.Quit
+			}
+
+			// First Ctrl+C: clear the input when in input state, then arm the quit flag.
+			if m.state == stateInput {
+				if ic, ok := m.input.(*InputComponent); ok {
+					ic.Clear()
+				}
+			}
+			m.ctrlCPressedOnce = true
+			// Start reset timer so the flag clears after 3 seconds.
+			return m, ctrlCResetCmd()
 		}

 		// Check extension-registered global keyboard shortcuts. These fire
@@ -1223,11 +1359,11 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 					m.scrollList.autoScroll = true
 				}
 				return m, tea.Batch(cmds...)
-			case "alt+home":
+			case "ctrl+home":
 				m.scrollList.GotoTop()
 				m.scrollList.autoScroll = false
 				return m, tea.Batch(cmds...)
-			case "alt+end":
+			case "ctrl+end":
 				m.scrollList.GotoBottom()
 				m.scrollList.autoScroll = true
 				return m, tea.Batch(cmds...)
@@ -1235,15 +1371,10 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		}

 		// Thinking keybindings — only when the model supports reasoning.
+		// Note: the thinking visibility toggle lives under the leader chord
+		// (Ctrl+X t) to avoid conflicts with terminal multiplexers.
 		if m.isReasoningModel {
 			switch msg.String() {
-			case "ctrl+t":
-				// Toggle thinking block visibility.
-				m.thinkingVisible = !m.thinkingVisible
-				if m.stream != nil {
-					m.stream.SetThinkingVisible(m.thinkingVisible)
-				}
-				return m, tea.Batch(cmds...)
 			case "shift+tab":
 				// Cycle thinking level.
 				m.cycleThinkingLevel()
@@ -1299,14 +1430,23 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 							images = ic.ClearPendingImages()
 						}

-						// Preprocess @file references.
+						// Preprocess @file references (text files are XML-inlined,
+						// binary files are extracted as multimodal parts).
 						processedText := text
+						var fileParts []kit.LLMFilePart
 						if m.cwd != "" {
-							processedText = fileutil.ProcessFileAttachments(text, m.cwd)
+							result := fileutil.ProcessFileAttachments(text, m.cwd, m.mcpResourceReader)
+							processedText = result.ProcessedText
+							for _, fp := range result.FileParts {
+								fileParts = append(fileParts, kit.LLMFilePart{
+									Filename:  fp.Filename,
+									Data:      fp.Data,
+									MediaType: fp.MediaType,
+								})
+							}
 						}

-						// Convert image attachments to kit.LLMFilePart for the app layer.
-						var fileParts []kit.LLMFilePart
+						// Convert clipboard image attachments to kit.LLMFilePart.
 						for _, img := range images {
 							fileParts = append(fileParts, kit.LLMFilePart{
 								Data:      img.Data,
@@ -1335,6 +1475,14 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 						}
 					}
 				}
+			case "t":
+				// Ctrl+X t → Toggle thinking block visibility.
+				if m.isReasoningModel {
+					m.thinkingVisible = !m.thinkingVisible
+					if m.stream != nil {
+						m.stream.SetThinkingVisible(m.thinkingVisible)
+					}
+				}
 			case "e":
 				// Ctrl+X e → open $EDITOR to compose/edit the prompt.
 				editorApp := os.Getenv("VISUAL")
@@ -1457,10 +1605,16 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 	case uicore.CancelTimerExpiredMsg:
 		m.canceling = false

+	// ── Ctrl+C reset timer expired ────────────────────────────────────────────
+	case uicore.CtrlCResetMsg:
+		m.ctrlCPressedOnce = false
+
 	// ── Input submitted ──────────────────────────────────────────────────────
 	case uicore.SubmitMsg:
 		// Re-enable auto-scroll when user submits a new message.
 		m.scrollList.autoScroll = true
+		// Reset Ctrl+C flag so next Ctrl+C clears input instead of quitting.
+		m.ctrlCPressedOnce = false

 		// Handle slash commands locally — they should never reach app.Run().
 		// Parse once: split on the first space so argument-bearing commands
@@ -1483,6 +1637,12 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			return m, tea.Batch(cmds...)
 		}

+		// Check MCP prompt commands (/<server>:<prompt> [args]).
+		if cmd := m.handleMCPPromptCommand(msg.Text); cmd != nil {
+			cmds = append(cmds, cmd)
+			return m, tea.Batch(cmds...)
+		}
+
 		// Expand prompt templates. If the input matches a template name,
 		// substitute arguments and use the expanded content as the prompt.
 		if expanded, ok, validationErr := m.expandPromptTemplate(msg.Text); validationErr != "" {
@@ -1498,16 +1658,25 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		}

 		// Regular prompt — forward to the app layer.
-		// Preprocess @file references: expand them into XML-wrapped file
-		// content before sending to the agent. The display text (shown in
-		// ScrollList) uses the original user text so the UI stays clean.
+		// Preprocess @file references: text files are XML-inlined, binary files
+		// (images, audio, etc.) are extracted as multimodal parts. The display
+		// text (shown in ScrollList) uses the original user text so the UI stays clean.
 		processedText := msg.Text
+		var fileParts []kit.LLMFilePart
 		if m.cwd != "" {
-			processedText = fileutil.ProcessFileAttachments(msg.Text, m.cwd)
+			result := fileutil.ProcessFileAttachments(msg.Text, m.cwd, m.mcpResourceReader)
+			processedText = result.ProcessedText
+			for _, fp := range result.FileParts {
+				fileParts = append(fileParts, kit.LLMFilePart{
+					Filename:  fp.Filename,
+					Data:      fp.Data,
+					MediaType: fp.MediaType,
+				})
+			}
 		}

-		// Convert image attachments to kit.LLMFilePart for the app layer.
-		var fileParts []kit.LLMFilePart
+		// Convert clipboard image attachments to kit.LLMFilePart.
+		fileOnlyCount := len(fileParts) // binary @file parts (before clipboard images)
 		for _, img := range msg.Images {
 			fileParts = append(fileParts, kit.LLMFilePart{
 				Data:      img.Data,
@@ -1515,10 +1684,17 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			})
 		}

-		// Build display text for ScrollList (include image count if any).
+		// Build display text for ScrollList (include attachment counts).
 		displayText := msg.Text
-		if len(msg.Images) > 0 {
-			displayText = fmt.Sprintf("%s\n[%d image(s) attached]", msg.Text, len(msg.Images))
+		if len(msg.Images) > 0 || fileOnlyCount > 0 {
+			var badges []string
+			if len(msg.Images) > 0 {
+				badges = append(badges, fmt.Sprintf("%d image(s) pasted", len(msg.Images)))
+			}
+			if fileOnlyCount > 0 {
+				badges = append(badges, fmt.Sprintf("%d file(s) attached", fileOnlyCount))
+			}
+			displayText = fmt.Sprintf("%s\n[%s]", msg.Text, strings.Join(badges, ", "))
 		}

 		if m.appCtrl != nil {
@@ -1705,6 +1881,10 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		} else {
 			bashItem.AppendStdout(msg.Chunk)
 		}
+		// Invalidate cached height after mutation.
+		if m.scrollList != nil {
+			m.scrollList.InvalidateItemHeight(bashItem.ID())
+		}

 		// Check height and cap if needed - we don't want streaming output to grow forever
 		const maxStreamingBashHeight = 20 // Max lines to show during streaming
@@ -1896,6 +2076,12 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		m.providerName = msg.ProviderName
 		m.modelName = msg.ModelName

+	case app.UsageUpdatedEvent:
+		// Token usage was updated after a completed LLM step. No state
+		// changes needed — the UsageTracker was already mutated in-place.
+		// Returning from Update() triggers View() which re-renders the
+		// status bar with the latest token counts, cost, and context %.
+
 	case app.WidgetUpdateEvent:
 		// Extension widget changed — recalculate height distribution so the
 		// stream region accounts for widget space. View() will read the
@@ -1934,9 +2120,10 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		m.printSystemMessage("Prompts and skills reloaded.")

 	case app.MCPToolsReadyEvent:
-		// Background MCP tool loading completed — refresh tool names and count.
+		// Background MCP tool loading completed — refresh tool names, count, and prompts.
 		m.refreshToolNames()
 		m.refreshMCPToolCount()
+		m.refreshMCPPrompts()

 	case app.MCPServerLoadedEvent:
 		// A single MCP server finished loading — display a system message.
@@ -1955,6 +2142,39 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			ic.textarea.CursorEnd()
 		}

+	case app.PasswordPromptEvent:
+		// Sudo password prompt - show a modal input prompt
+		// If already in prompt state, cancel the new request
+		if m.state == statePrompt {
+			if msg.ResponseCh != nil {
+				msg.ResponseCh <- app.PasswordPromptResponse{Cancelled: true}
+			}
+			return m, tea.Batch(cmds...)
+		}
+		m.prePromptState = m.state
+		m.state = statePrompt
+		// Intermediate channel: the goroutine below converts its PromptResponse to a PasswordPromptResponse.
+		passwordResponseCh := make(chan app.PromptResponse, 1)
+		m.promptResponseCh = passwordResponseCh
+
+		// Create password input prompt (masked input)
+		m.prompt = newPasswordPrompt(msg.Prompt, m.width, m.height)
+
+		// Handle the response conversion
+		go func() {
+			resp := <-passwordResponseCh
+			if msg.ResponseCh != nil {
+				msg.ResponseCh <- app.PasswordPromptResponse{
+					Password:  resp.Value,
+					Cancelled: resp.Cancelled,
+				}
+			}
+		}()
+
+		if m.prompt != nil {
+			cmds = append(cmds, m.prompt.Init())
+		}
+
 	case app.PromptRequestEvent:
 		// Extension wants to show an interactive prompt. Enter prompt state.
 		// If already in prompt state (concurrent prompt from another
@@ -2022,6 +2242,75 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			m.printSystemMessage(msg.output)
 		}

+	case mcpPromptResultMsg:
+		// Async MCP prompt expansion completed. Submit the expanded text
+		// as a user message (same behavior as local prompt templates).
+		if msg.err != nil {
+			m.printSystemMessage(fmt.Sprintf("MCP prompt error: %v", msg.err))
+		} else if msg.text != "" || len(msg.fileParts) > 0 {
+			// Process @file references and submit.
+			processedText := msg.text
+			var fileParts []kit.LLMFilePart
+			if m.cwd != "" {
+				result := fileutil.ProcessFileAttachments(msg.text, m.cwd, m.mcpResourceReader)
+				processedText = result.ProcessedText
+				for _, fp := range result.FileParts {
+					fileParts = append(fileParts, kit.LLMFilePart{
+						Filename:  fp.Filename,
+						Data:      fp.Data,
+						MediaType: fp.MediaType,
+					})
+				}
+			}
+			// Merge file parts from embedded resources (images, audio, blobs)
+			// with any @file/@mcp: file parts extracted from the text.
+			fileParts = append(fileParts, msg.fileParts...)
+
+			// Build display text with attachment badges (matches the
+			// normal submit path so embedded resources look like pasted
+			// images / attached files).
+			displayText := msg.text
+			if len(msg.fileParts) > 0 {
+				var imageCount, fileCount int
+				for _, fp := range msg.fileParts {
+					if strings.HasPrefix(fp.MediaType, "image/") {
+						imageCount++
+					} else {
+						fileCount++
+					}
+				}
+				var badges []string
+				if imageCount > 0 {
+					badges = append(badges, fmt.Sprintf("%d image(s) attached", imageCount))
+				}
+				if fileCount > 0 {
+					badges = append(badges, fmt.Sprintf("%d file(s) attached", fileCount))
+				}
+				if len(badges) > 0 {
+					displayText = fmt.Sprintf("%s\n[%s]", msg.text, strings.Join(badges, ", "))
+				}
+			}
+
+			if m.appCtrl != nil {
+				var qLen int
+				if len(fileParts) > 0 {
+					qLen = m.appCtrl.RunWithFiles(processedText, fileParts)
+				} else {
+					qLen = m.appCtrl.Run(processedText)
+				}
+				if qLen > 0 {
+					m.queuedMessages = append(m.queuedMessages, displayText)
+					m.layoutDirty = true
+				} else {
+					m.pendingUserPrints = append(m.pendingUserPrints, displayText)
+					m.flushStreamAndPendingUserMessages()
+				}
+				if m.state != stateWorking {
+					m.state = stateWorking
+				}
+			}
+		}
+
 	case externalEditorMsg:
 		// User returned from $EDITOR. Replace input textarea content with
 		// whatever they saved in the temp file. On error (e.g. :cq in vim)
@@ -2161,8 +2450,10 @@ func (m *AppModel) View() tea.View {
 	scrollbackView := m.renderScrollback()

 	// Propagate hint visibility to the input component before rendering.
+	// Hints are always hidden for a cleaner UI; extensions cannot
+	// override this.
 	if ic, ok := m.input.(*InputComponent); ok {
-		ic.hideHint = vis.HideInputHint
+		ic.hideHint = true
 		ic.agentBusy = m.state == stateWorking
 	}

@@ -2204,6 +2495,14 @@ func (m *AppModel) View() tea.View {
 		parts = append(parts, warning)
 	}

+	if m.ctrlCPressedOnce {
+		warning := lipgloss.NewStyle().
+			Foreground(theme.Warning).
+			Bold(true).
+			Render("  ⚠ Press Ctrl+C again to quit")
+		parts = append(parts, warning)
+	}
+
 	if !vis.HideSeparator {
 		parts = append(parts, m.renderSeparator())
 	}
@@ -2348,9 +2647,14 @@ func (m *AppModel) renderStatusBar() string {
 		middleSide = "  " + middleSide
 	}

-	// Right side: provider · model + usage stats.
+	// Right side: help hint + provider · model + usage stats.
+	// Order matters for progressive truncation: hint (first) drops before usage (last).
 	var rightParts []string

+	rightParts = append(rightParts, lipgloss.NewStyle().
+		Foreground(theme.VeryMuted).
+		Render("/help for help"))
+
 	var modelLabel string
 	if m.providerName != "" && m.modelName != "" {
 		modelLabel = m.providerName + " · " + m.modelName
@@ -2369,11 +2673,11 @@ func (m *AppModel) renderStatusBar() string {
 		}
 	}

-	rightSide := strings.Join(rightParts, "  ")
+	rightSide := strings.Join(rightParts, "  |  ")

 	// Progressive truncation to keep the status bar on one line.
 	// When content exceeds terminal width, drop sections in order:
-	// middle (extensions/thinking) → usage stats → model label → right side.
+	// middle (extensions/thinking) → help hint → usage → model → all.
 	leftW := lipgloss.Width(leftSide)
 	middleW := lipgloss.Width(middleSide)
 	rightW := lipgloss.Width(rightSide)
@@ -2384,13 +2688,19 @@ func (m *AppModel) renderStatusBar() string {
 		middleSide = ""
 		middleW = 0
 	}
+	if leftW+rightW+1 > m.width && len(rightParts) > 1 {
+		// Drop help hint first (it is always rightParts[0]), even without usage stats.
+		rightParts = rightParts[1:]
+		rightSide = strings.Join(rightParts, "  |  ")
+		rightW = lipgloss.Width(rightSide)
+	}
 	if leftW+rightW+1 > m.width && len(rightParts) > 1 {
-		// Drop usage stats, keep model label.
-		rightSide = rightParts[0]
+		// Drop usage (last) next, keep model label.
+		rightParts = rightParts[:len(rightParts)-1]
+		rightSide = strings.Join(rightParts, "  |  ")
 		rightW = lipgloss.Width(rightSide)
 	}
 	if leftW+rightW+1 > m.width {
-		// Drop right side entirely.
 		rightSide = ""
 		rightW = 0
 	}
@@ -2401,7 +2711,7 @@ func (m *AppModel) renderStatusBar() string {

 // cycleThinkingLevel advances to the next thinking level and applies it.
 func (m *AppModel) cycleThinkingLevel() {
-	levels := []string{"off", "minimal", "low", "medium", "high"}
+	levels := []string{"off", "none", "minimal", "low", "medium", "high"}
 	current := m.thinkingLevel
 	if current == "" {
 		current = "off"
@@ -2434,7 +2744,7 @@ func (m *AppModel) cycleThinkingLevel() {
 // renderSeparator renders the separator line with an optional queue/steer count badge.
 func (m *AppModel) renderSeparator() string {
 	theme := style.GetTheme()
-	lineStyle := lipgloss.NewStyle().Foreground(theme.Muted)
+	lineStyle := lipgloss.NewStyle().Foreground(theme.Border)
 	queueLen := len(m.queuedMessages)
 	steerLen := len(m.steeringMessages)

@@ -2811,6 +3121,16 @@ func (m *AppModel) printSystemMessage(text string) {
 	m.refreshContent()
 }

+// printCustomMessage renders a message with a custom alert label into the ScrollList.
+func (m *AppModel) printCustomMessage(text, label string) {
+	styledMsg := m.renderer.RenderCustomMessage(text, label, time.Now())
+
+	msg := NewStyledMessageItem(generateMessageID(), "system", styledMsg.Content, styledMsg.Content)
+	m.messages = append(m.messages, msg)
+
+	m.refreshContent()
+}
+
 // printExtensionBlock renders a custom styled block from an extension with
 // caller-chosen border color and optional subtitle into the ScrollList.
 func (m *AppModel) printExtensionBlock(evt app.ExtensionPrintEvent) {
@@ -2892,6 +3212,115 @@ func (m *AppModel) handleExtensionCommand(text string) tea.Cmd {
 	return noopCmd
 }

+// handleMCPPromptCommand checks if the submitted text matches an MCP prompt
+// command (/<server>:<prompt> [args]) and returns a tea.Cmd that expands it
+// asynchronously. Returns nil if no MCP prompt matches.
+//
+// Arguments are parsed as key=value pairs. Positional arguments are mapped
+// to prompt argument names by order.
+func (m *AppModel) handleMCPPromptCommand(text string) tea.Cmd {
+	if len(m.mcpPrompts) == 0 || m.expandMCPPrompt == nil || m.appCtrl == nil {
+		return nil
+	}
+
+	if !strings.HasPrefix(text, "/") {
+		return nil
+	}
+
+	// Split: "/<server>:<prompt> key=val ..." → command, args
+	cmdPart, argStr, _ := strings.Cut(text, " ")
+	cmdPart = strings.TrimPrefix(cmdPart, "/")
+
+	// Must contain a colon to be an MCP prompt command.
+	serverName, promptName, ok := strings.Cut(cmdPart, ":")
+	if !ok || serverName == "" || promptName == "" {
+		return nil
+	}
+
+	// Find matching MCP prompt.
+	var matched *MCPPromptInfo
+	for i := range m.mcpPrompts {
+		if m.mcpPrompts[i].ServerName == serverName && m.mcpPrompts[i].Name == promptName {
+			matched = &m.mcpPrompts[i]
+			break
+		}
+	}
+	if matched == nil {
+		return nil
+	}
+
+	// Parse arguments: support key=value pairs, with positional fallback.
+	args := parseMCPPromptArgs(argStr, matched.Arguments)
+
+	// Validate required arguments.
+	for _, a := range matched.Arguments {
+		if a.Required {
+			if _, exists := args[a.Name]; !exists {
+				m.printSystemMessage(fmt.Sprintf(
+					"/%s:%s requires argument '%s'",
+					serverName, promptName, a.Name,
+				))
+				// Re-populate input for the user to add missing args.
+				if ic, ok := m.input.(*InputComponent); ok {
+					ic.textarea.SetValue(text + " ")
+					ic.textarea.CursorEnd()
+				}
+				return noopCmd
+			}
+		}
+	}
+
+	// Expand asynchronously.
+	expand := m.expandMCPPrompt
+	ctrl := m.appCtrl
+	go func() {
+		result, err := expand(serverName, promptName, args)
+		if err != nil {
+			ctrl.SendEvent(mcpPromptResultMsg{err: err})
+			return
+		}
+		// Concatenate user-role messages as the prompt text and collect
+		// any binary attachments from embedded resources.
+		var parts []string
+		var allFileParts []kit.LLMFilePart
+		for _, msg := range result.Messages {
+			if msg.Role == "user" {
+				if msg.Content != "" {
+					parts = append(parts, msg.Content)
+				}
+				allFileParts = append(allFileParts, msg.FileParts...)
+			}
+		}
+		ctrl.SendEvent(mcpPromptResultMsg{
+			text:      strings.Join(parts, "\n\n"),
+			fileParts: allFileParts,
+		})
+	}()
+
+	return noopCmd
+}
+
+// parseMCPPromptArgs parses "key=value" pairs from a space-separated arg
+// string. Tokens without "=" are assigned to prompt arguments positionally.
+func parseMCPPromptArgs(argStr string, argDefs []MCPPromptArgInfo) map[string]string {
+	result := make(map[string]string)
+	if strings.TrimSpace(argStr) == "" {
+		return result
+	}
+
+	tokens := strings.Fields(argStr)
+	positionalIdx := 0
+	for _, tok := range tokens {
+		if k, v, ok := strings.Cut(tok, "="); ok && k != "" {
+			result[k] = v
+		} else if positionalIdx < len(argDefs) {
+			result[argDefs[positionalIdx].Name] = tok
+			positionalIdx++
+		}
+	}
+	return result
+}
+
 // expandPromptTemplate checks if the submitted text matches a prompt template
 // and returns the expanded content with arguments substituted.
 //
@@ -2975,6 +3404,42 @@ func (m *AppModel) refreshSkillItems() {
 	m.skillItems = m.getSkillItems()
 }

+// refreshMCPPrompts reloads MCP prompts from the provider callback and
+// updates the autocomplete entries. Called on MCPToolsReadyEvent.
+func (m *AppModel) refreshMCPPrompts() {
+	if m.getMCPPrompts == nil {
+		return
+	}
+	newPrompts := m.getMCPPrompts()
+	m.mcpPrompts = newPrompts
+
+	if ic, ok := m.input.(*InputComponent); ok {
+		// Remove old MCP Prompts commands and add fresh ones.
+		var kept []commands.SlashCommand
+		for _, sc := range ic.commands {
+			if sc.Category != "MCP Prompts" {
+				kept = append(kept, sc)
+			}
+		}
+		for _, p := range newPrompts {
+			hasArgs := false
+			for _, a := range p.Arguments {
+				if a.Required {
+					hasArgs = true
+					break
+				}
+			}
+			kept = append(kept, commands.SlashCommand{
+				Name:        fmt.Sprintf("/%s:%s", p.ServerName, p.Name),
+				Description: p.Description,
+				Category:    "MCP Prompts",
+				HasArgs:     hasArgs,
+			})
+		}
+		ic.commands = kept
+	}
+}
+
 // refreshToolNames reloads tool names from the provider callback.
 // Called on MCPToolsReadyEvent when background MCP tool loading completes.
 func (m *AppModel) refreshToolNames() {
@@ -3045,13 +3510,14 @@ func (m *AppModel) printHelpMessage() {
 		"- `!command`: Run shell command, output included in LLM context\n" +
 		"- `!!command`: Run shell command, output excluded from LLM context\n\n" +
 		"**Keys:**\n" +
-		"- `Ctrl+C`: Exit at any time\n" +
+		"- `Ctrl+C`: Clear input and arm quit (press again to exit)\n" +
 		"- `ESC` (x2): Cancel ongoing LLM generation\n" +
 		"- `Ctrl+X s`: Steer — redirect the agent mid-turn (injected between tool calls)\n" +
 		"- `Ctrl+X e`: Open `$EDITOR` to compose/edit your prompt\n" +
+		"- `Ctrl+V`: Paste image from clipboard\n" +
 		"- `Enter` (while working): Queue message for after the agent finishes\n\n" +
 		"You can also just type your message to chat with the AI assistant."
-	m.printSystemMessage(help)
+	m.printCustomMessage(help, "Help")
 }

 // printToolsMessage renders the list of available tools.
@@ -3240,6 +3706,10 @@ func (m *AppModel) appendStreamingChunk(role, content string) {
 	// If last message is a StreamingMessageItem with matching role, append to it
 	if streamMsg, ok := lastMsg.(*StreamingMessageItem); ok && streamMsg.role == role {
 		streamMsg.AppendChunk(content)
+		// Invalidate cached height so GotoBottom sees the new size.
+		if m.scrollList != nil {
+			m.scrollList.InvalidateItemHeight(streamMsg.ID())
+		}
 		// Auto-scroll to bottom if enabled (iteratr pattern)
 		// Don't call SetItems() - the slice reference hasn't changed
 		if m.scrollList != nil {
@@ -3304,15 +3774,16 @@ func (m *AppModel) distributeHeight() {
 	}

 	// Propagate hint visibility before measuring input height.
+	// Hints are always hidden for a cleaner UI.
 	if ic, ok := m.input.(*InputComponent); ok {
-		ic.hideHint = vis.HideInputHint
+		ic.hideHint = true
 	}

 	// Measure the actual rendered input (or prompt overlay) height so we
 	// don't rely on a fragile constant that drifts when styling changes.
 	// Use renderInput() which includes the editor interceptor's Render
 	// wrapper so the measured height matches what View() actually renders.
-	inputLines := 9 // fallback: title(1)+margin(1)+nl(1)+textarea(3)+nl(1)+margin(1)+help(1)
+	inputLines := 8 // fallback: marginTop(1)+textarea(4)+border-chrome(2)+marginBottom(1)
 	if m.state == statePrompt && m.prompt != nil {
 		if rendered := m.prompt.Render(); rendered != "" {
 			inputLines = lipgloss.Height(rendered)
@@ -3441,6 +3912,30 @@ func (m *AppModel) handleModelCommand(args string) tea.Cmd {
 		return nil
 	}

+	// Check if thinking level needs adjustment for the new model.
+	// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
+	if m.thinkingLevel != "" && m.thinkingLevel != "off" {
+		parts := strings.SplitN(args, "/", 2)
+		if len(parts) == 2 {
+			modelName := parts[1]
+			currentLevel := models.ParseThinkingLevel(m.thinkingLevel)
+			if !models.IsValidThinkingLevelForModel(currentLevel, modelName) {
+				fallback := models.SuggestThinkingLevelFallback(currentLevel, modelName)
+				if fallback != models.ThinkingOff {
+					m.printSystemMessage(fmt.Sprintf(
+						"Note: Model %s doesn't support '%s' thinking level. Adjusted to '%s'.",
+						modelName, currentLevel, fallback,
+					))
+					m.thinkingLevel = string(fallback)
+					if m.setThinkingLevel != nil {
+						_ = m.setThinkingLevel(string(fallback))
+					}
+					go func() { _ = prefs.SaveThinkingLevelPreference(string(fallback)) }()
+				}
+			}
+		}
+	}
+
 	// Direct model switch with the provided model string.
 	previousModel := m.providerName + "/" + m.modelName
 	if err := m.setModel(args); err != nil {
@@ -3545,7 +4040,7 @@ func (m *AppModel) handleThinkingCommand(args string) tea.Cmd {
 	// Parse and validate the level.
 	level := models.ParseThinkingLevel(args)
 	if string(level) != strings.ToLower(args) {
-		m.printSystemMessage(fmt.Sprintf("Unknown thinking level: %q. Use: off, minimal, low, medium, high", args))
+		m.printSystemMessage(fmt.Sprintf("Unknown thinking level: %q. Use: off, none, minimal, low, medium, high", args))
 		return nil
 	}

@@ -4132,6 +4627,14 @@ func cancelTimerCmd() tea.Cmd {
 	})
 }

+// ctrlCResetCmd returns a tea.Cmd that fires CtrlCResetMsg after 3s.
+// This resets the ctrlCPressedOnce flag so the next Ctrl+C will clear input again.
+func ctrlCResetCmd() tea.Cmd {
+	return tea.Tick(3*time.Second, func(_ time.Time) tea.Msg {
+		return uicore.CtrlCResetMsg{}
+	})
+}
+
 // --------------------------------------------------------------------------
 // Interactive prompt support
 // --------------------------------------------------------------------------
@@ -4166,6 +4669,14 @@ type extensionCmdResultMsg struct {
 	err    error
 }

+// mcpPromptResultMsg carries the result of an asynchronously expanded MCP
+// prompt. The expansion runs in a goroutine since it contacts the MCP server.
+type mcpPromptResultMsg struct {
+	text      string            // concatenated user messages to submit as the prompt
+	fileParts []kit.LLMFilePart // binary attachments from embedded resources
+	err       error             // error from the server
+}
+
 // beforeSessionSwitchResultMsg carries the result of an asynchronously
 // executed before-session-switch hook. The hook runs in a goroutine so that
 // blocking operations like ctx.PromptConfirm() do not deadlock the TUI.
@@ -4195,9 +4706,12 @@ func (m *AppModel) updatePromptState(msg tea.Msg) (tea.Model, tea.Cmd) {
 	switch msg := msg.(type) {
 	case tea.KeyPressMsg:
 		if msg.String() == "ctrl+c" {
-			// Cancel prompt and quit the application.
+			// Cancel the prompt but don't quit — let the main handler's
+			// double-Ctrl+C logic handle quitting.
 			m.resolvePrompt(app.PromptResponse{Cancelled: true})
-			return m, tea.Quit
+			// Don't consume the keypress — re-dispatch so the main
+			// ctrl+c handler can track the double-press state.
+			return m.Update(msg)
 		}
 		result, cmd := m.prompt.Update(msg)
 		if cmd != nil {
@@ -4264,9 +4778,12 @@ func (m *AppModel) updateOverlayState(msg tea.Msg) (tea.Model, tea.Cmd) {
 	switch msg := msg.(type) {
 	case tea.KeyPressMsg:
 		if msg.String() == "ctrl+c" {
-			// Cancel overlay and quit the application.
+			// Cancel the overlay but don't quit — let the main handler's
+			// double-Ctrl+C logic handle quitting.
 			m.resolveOverlay(app.OverlayResponse{Cancelled: true})
-			return m, tea.Quit
+			// Don't consume the keypress — re-dispatch so the main
+			// ctrl+c handler can track the double-press state.
+			return m.Update(msg)
 		}
 		result, cmd := m.overlay.Update(msg)
 		if cmd != nil {
@@ -515,12 +515,12 @@ func TestWindowResize_distributeHeight(t *testing.T) {
 	ctrl := &stubAppController{}
 	m, _, _ := newTestAppModel(ctrl)

-	// With height=30, scroll height = 30 - 1 (separator) - 9 (input) - 1 (statusBar) = 19
+	// With height=30, scroll height = 30 - 1 (separator) - 8 (input) - 1 (statusBar) = 20
 	m = sendMsg(m, tea.WindowSizeMsg{Width: 80, Height: 30})
 	_ = m

-	if m.scrollList.height != 19 {
-		t.Fatalf("expected scroll list height=19, got %d", m.scrollList.height)
+	if m.scrollList.height != 20 {
+		t.Fatalf("expected scroll list height=20, got %d", m.scrollList.height)
 	}
 }

@@ -853,23 +853,165 @@ func TestSpinnerEvent_hideDoesNotTransitionState(t *testing.T) {
 }

 // --------------------------------------------------------------------------
-// ctrl+c produces tea.Quit
+// ctrl+c double-press to quit
 // --------------------------------------------------------------------------

-// TestCtrlC_producesQuit verifies that ctrl+c always returns a tea.Quit cmd.
+// TestCtrlC_producesQuit verifies that double ctrl+c returns a tea.Quit cmd.
 func TestCtrlC_producesQuit(t *testing.T) {
 	ctrl := &stubAppController{}
 	m, _, _ := newTestAppModel(ctrl)

+	// First Ctrl+C arms the quit flag.
+	updated, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	m = updated.(*AppModel)
+	if cmd == nil {
+		t.Fatal("expected a command after first ctrl+c, got nil")
+	}
+	// Should be a reset timer, not quit.
+	msg := cmd()
+	if _, ok := msg.(core.CtrlCResetMsg); !ok {
+		t.Fatalf("expected CtrlCResetMsg after first ctrl+c, got %T", msg)
+	}
+
+	// Second Ctrl+C should quit.
+	_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	if cmd == nil {
+		t.Fatal("expected tea.Quit cmd on second ctrl+c, got nil")
+	}
+	msg = cmd()
+	if _, ok := msg.(tea.QuitMsg); !ok {
+		t.Fatalf("expected QuitMsg from second ctrl+c, got %T", msg)
+	}
+}
+
+// TestCtrlC_clearsInput_firstPress tests that Ctrl+C clears input on first
+// press when there's content, and requires a second press to quit.
+func TestCtrlC_clearsInput_firstPress(t *testing.T) {
+	// Create a real InputComponent to test the clear behavior
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+
+	// Replace with real InputComponent that has content
+	input := NewInputComponent(80, ctrl)
+	input.textarea.SetValue("some text content")
+	m.input = input
+
+	// First Ctrl+C should clear input, not quit
 	_, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})

-	if cmd == nil {
-		t.Fatal("expected tea.Quit cmd on ctrl+c, got nil")
+	// Should have cleared the input
+	if input.textarea.Value() != "" {
+		t.Fatalf("expected input to be cleared, got %q", input.textarea.Value())
+	}
+
+	// Should have set ctrlCPressedOnce flag
+	if !m.ctrlCPressedOnce {
+		t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
+	}
+
+	// The command should be a ctrlCResetCmd (not tea.Quit)
+	if cmd == nil {
+		t.Fatal("expected a command after first Ctrl+C, got nil")
 	}
-	// We verify it's a quit command by running it and checking the message type.
 	msg := cmd()
+	if _, ok := msg.(core.CtrlCResetMsg); !ok {
+		t.Fatalf("expected CtrlCResetMsg, got %T", msg)
+	}
+
+	// Second Ctrl+C should now quit
+	_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	if cmd == nil {
+		t.Fatal("expected tea.Quit cmd on second Ctrl+C, got nil")
+	}
+	msg = cmd()
 	if _, ok := msg.(tea.QuitMsg); !ok {
-		t.Fatalf("expected QuitMsg from ctrl+c cmd, got %T", msg)
+		t.Fatalf("expected QuitMsg on second Ctrl+C, got %T", msg)
+	}
+}
+
+// TestCtrlC_resetAfterSubmit tests that the Ctrl+C flag is cleared when the
+// reset timer fires (CtrlCResetMsg), so the next Ctrl+C clears input again.
+func TestCtrlC_resetAfterSubmit(t *testing.T) {
+	// Use newTestAppModel but replace the input with a real InputComponent
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+
+	// Replace with real InputComponent
+	input := NewInputComponent(80, ctrl)
+	input.textarea.SetValue("content")
+	m.input = input
+
+	// First Ctrl+C clears input
+	updated, _ := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	m = updated.(*AppModel)
+	if input.textarea.Value() != "" {
+		t.Fatal("expected input to be cleared")
+	}
+
+	// Flag should be set
+	if !m.ctrlCPressedOnce {
+		t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
+	}
+
+	// Simulate CtrlCResetMsg being processed (timer expired)
+	updated, _ = m.Update(core.CtrlCResetMsg{})
+	m = updated.(*AppModel)
+
+	// Flag should be reset
+	if m.ctrlCPressedOnce {
+		t.Fatal("expected ctrlCPressedOnce to be false after CtrlCResetMsg")
+	}
+
+	// Add new content to input
+	input.textarea.SetValue("new content")
+
+	// Next Ctrl+C should clear again (not quit) because flag was reset
+	_, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	if input.textarea.Value() != "" {
+		t.Fatalf("expected input to be cleared again, got %q", input.textarea.Value())
+	}
+	if cmd == nil {
+		t.Fatal("expected a command after Ctrl+C, got nil")
+	}
+	msg := cmd()
+	if _, ok := msg.(core.CtrlCResetMsg); !ok {
+		t.Fatalf("expected CtrlCResetMsg, got %T", msg)
+	}
+}
+
+// TestCtrlC_emptyInput_armsQuit tests that Ctrl+C on empty input still
+// requires a second press to quit (consistent double-press behavior).
+func TestCtrlC_emptyInput_armsQuit(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+
+	// Replace with real InputComponent (empty by default)
+	input := NewInputComponent(80, ctrl)
+	m.input = input
+
+	// First Ctrl+C on empty input should arm the flag, not quit.
+	updated, cmd := m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	m = updated.(*AppModel)
+
+	if !m.ctrlCPressedOnce {
+		t.Fatal("expected ctrlCPressedOnce to be true after first Ctrl+C")
+	}
+	if cmd == nil {
+		t.Fatal("expected a command (reset timer), got nil")
+	}
+	msg := cmd()
+	if _, ok := msg.(core.CtrlCResetMsg); !ok {
+		t.Fatalf("expected CtrlCResetMsg, got %T", msg)
+	}
+
+	// Second Ctrl+C should quit.
+	_, cmd = m.Update(tea.KeyPressMsg{Code: 'c', Mod: tea.ModCtrl})
+	if cmd == nil {
+		t.Fatal("expected tea.Quit cmd on second Ctrl+C, got nil")
+	}
+	msg = cmd()
+	if _, ok := msg.(tea.QuitMsg); !ok {
+		t.Fatalf("expected QuitMsg on second Ctrl+C, got %T", msg)
 	}
 }
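The double-press behavior exercised by the tests above can be sketched as a small state machine. This is a minimal illustration, not the AppModel code: `ctrlCState`, `press`, `reset`, and the `action` constants are hypothetical names, and the 3s timer is represented only by an explicit `reset` call.

```go
package main

import "fmt"

// action is what a single Ctrl+C press leads to in this sketch.
type action int

const (
	actionArm  action = iota // first press: clear input and arm the quit flag
	actionQuit               // second press while armed: quit
)

// ctrlCState is a hypothetical stand-in for the ctrlCPressedOnce flag
// plus the input buffer it clears.
type ctrlCState struct {
	pressedOnce bool
	input       string
}

// press handles one Ctrl+C keypress.
func (s *ctrlCState) press() action {
	if s.pressedOnce {
		return actionQuit
	}
	s.pressedOnce = true
	s.input = ""
	return actionArm
}

// reset models the reset message arriving after the timer expires.
func (s *ctrlCState) reset() { s.pressedOnce = false }

func main() {
	s := &ctrlCState{input: "draft"}
	fmt.Println(s.press() == actionArm, s.input == "")
	fmt.Println(s.press() == actionQuit)
	s.reset()
	fmt.Println(s.press() == actionArm)
}
```

Keeping the flag separate from the input contents is what makes an empty input behave the same as a non-empty one: the first press always arms, the second always quits.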

@@ -288,3 +288,9 @@ func (pr *ProgressReader) Close() error {

 	return nil
 }
+
+// NewProgressReadCloser is a convenience wrapper around NewProgressReader that
+// returns an io.ReadCloser, suitable for use as a ProgressReaderFunc callback.
+func NewProgressReadCloser(r io.Reader) io.ReadCloser {
+	return NewProgressReader(r)
+}
@@ -19,9 +19,10 @@ import (
 type promptMode string

 const (
-	promptModeSelect  promptMode = "select"
-	promptModeConfirm promptMode = "confirm"
-	promptModeInput   promptMode = "input"
+	promptModeSelect   promptMode = "select"
+	promptModeConfirm  promptMode = "confirm"
+	promptModeInput    promptMode = "input"
+	promptModePassword promptMode = "password"
 )

 // promptResult carries the synchronous outcome of a prompt overlay update.
@@ -102,10 +103,38 @@ func newInputPrompt(message, placeholder, defaultValue string, width, height int
 	}
 }

-// Init returns the initial command for the prompt overlay. For input mode
-// this starts the cursor blink animation.
+// newPasswordPrompt creates a prompt overlay for password input (masked).
+func newPasswordPrompt(message string, width, height int) *promptOverlay {
+	ta := textarea.New()
+	ta.Placeholder = "Enter password"
+	ta.ShowLineNumbers = false
+	ta.Prompt = ""
+	ta.CharLimit = 0
+	ta.SetWidth(width - 12) // account for border + padding
+	ta.SetHeight(1)
+	ta.Focus()
+
+	// Prevent Enter from inserting a newline — we intercept it for submit.
+	ta.KeyMap.InsertNewline = key.NewBinding(
+		key.WithKeys("ctrl+j", "shift+enter"),
+	)
+
+	// The textarea has no built-in password masking, so the value is kept in
+	// plain text here and rendered as dots in viewPassword().
+
+	return &promptOverlay{
+		mode:    promptModePassword,
+		message: message,
+		inputTA: ta,
+		width:   width,
+		height:  height,
+	}
+}
+
+// Init returns the initial command for the prompt overlay. For input/password
+// modes this starts the cursor blink animation.
 func (p *promptOverlay) Init() tea.Cmd {
-	if p.mode == promptModeInput {
+	if p.mode == promptModeInput || p.mode == promptModePassword {
 		return textarea.Blink
 	}
 	return nil
@@ -113,13 +142,13 @@ func (p *promptOverlay) Init() tea.Cmd {

 // Update handles messages for the prompt overlay. It returns a non-nil
 // *promptResult when the user completes or cancels the prompt. The returned
-// tea.Cmd is for textarea blink ticks (input mode only).
+// tea.Cmd is for textarea blink ticks (input/password modes only).
 func (p *promptOverlay) Update(msg tea.Msg) (*promptResult, tea.Cmd) {
 	switch msg := msg.(type) {
 	case tea.WindowSizeMsg:
 		p.width = msg.Width
 		p.height = msg.Height
-		if p.mode == promptModeInput {
+		if p.mode == promptModeInput || p.mode == promptModePassword {
 			p.inputTA.SetWidth(p.width - 12)
 		}
 		return nil, nil
@@ -132,11 +161,13 @@ func (p *promptOverlay) Update(msg tea.Msg) (*promptResult, tea.Cmd) {
 			return p.updateConfirm(msg)
 		case promptModeInput:
 			return p.updateInput(msg)
+		case promptModePassword:
+			return p.updatePassword(msg)
 		}
 	}

 	// Pass non-key messages to textarea for blink animation.
-	if p.mode == promptModeInput {
+	if p.mode == promptModeInput || p.mode == promptModePassword {
 		var cmd tea.Cmd
 		p.inputTA, cmd = p.inputTA.Update(msg)
 		return nil, cmd
@@ -202,6 +233,20 @@ func (p *promptOverlay) updateInput(msg tea.KeyPressMsg) (*promptResult, tea.Cmd
 	}
 }

+func (p *promptOverlay) updatePassword(msg tea.KeyPressMsg) (*promptResult, tea.Cmd) {
+	switch msg.String() {
+	case "enter":
+		return &promptResult{completed: true, value: p.inputTA.Value()}, nil
+	case "esc":
+		return &promptResult{cancelled: true}, nil
+	default:
+		// Delegate character input, backspace, cursor movement, etc.
+		var cmd tea.Cmd
+		p.inputTA, cmd = p.inputTA.Update(msg)
+		return nil, cmd
+	}
+}
+
 // Render returns the prompt as a styled string for inline composition in the
 // AppModel layout. The prompt replaces the normal input area (below the
 // separator and above the status bar) rather than taking over the full screen.
@@ -216,6 +261,8 @@ func (p *promptOverlay) Render() string {
 		content = p.viewConfirm(theme)
 	case promptModeInput:
 		content = p.viewInput(theme)
+	case promptModePassword:
+		content = p.viewPassword(theme)
 	}

 	return renderContentBlock(content, p.width,
@@ -286,3 +333,25 @@ func (p *promptOverlay) viewInput(theme style.Theme) string {

 	return strings.Join(lines, "\n")
 }
+
+func (p *promptOverlay) viewPassword(theme style.Theme) string {
+	var lines []string
+	// Add 🔐 icon to message for password prompt
+	lines = append(lines, lipgloss.NewStyle().Bold(true).Foreground(theme.Text).Render("🔐 "+p.message))
+	lines = append(lines, "")
+
+	// Mask the password input with dots
+	passwordValue := p.inputTA.Value()
+	masked := strings.Repeat("•", len([]rune(passwordValue)))
+	// Render the masked password in a style that looks like input
+	maskedStyle := lipgloss.NewStyle().Foreground(theme.Text)
+	cursor := lipgloss.NewStyle().Foreground(theme.Accent).Render("█")
+	lines = append(lines, maskedStyle.Render(masked)+cursor)
+
+	lines = append(lines, "")
+	lines = append(lines, lipgloss.NewStyle().
+		Foreground(theme.Muted).
+		Render("  Enter submit  Esc cancel  (input is hidden)"))
+
+	return strings.Join(lines, "\n")
+}
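The masking step in viewPassword counts runes rather than bytes, so multi-byte characters still map to one dot each. A minimal standalone sketch of that idea follows; `maskPassword` is a hypothetical helper name, not part of the diff.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// maskPassword returns one bullet per rune of value, so a multi-byte
// character such as "é" is masked by a single dot.
func maskPassword(value string) string {
	return strings.Repeat("•", utf8.RuneCountInString(value))
}

func main() {
	fmt.Println(maskPassword("h\u00e9llo"))
}
```

Using `len(value)` instead would over-count for non-ASCII input, since it measures bytes.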
@@ -36,16 +36,16 @@ func UserBlock(content string, width int, ty *herald.Typography, theme style.The

 	// Highlight @file tokens with accent color so file references are
 	// visually distinct from surrounding prompt text.
-	content = highlightFileTokens(content, theme)
+	content = HighlightFileTokens(content, theme)

 	rendered := ty.Tip(content)
 	return styleMarginBottom(theme, rendered)
 }

-// highlightFileTokens wraps @file tokens in the given text with the theme
+// HighlightFileTokens wraps @file tokens in the given text with the theme
 // accent color so they stand out visually in rendered user messages.
-func highlightFileTokens(text string, theme style.Theme) string {
-	accentStyle := lipgloss.NewStyle().Foreground(theme.Accent).Bold(true)
+func HighlightFileTokens(text string, theme style.Theme) string {
+	accentStyle := style.GetCachedStyles().FileTokenAccent
 	return fileTokenPattern.ReplaceAllStringFunc(text, func(token string) string {
 		return accentStyle.Render(token)
 	})
@@ -63,16 +63,20 @@ func AssistantBlock(content string, width int, theme style.Theme) string {

 // ReasoningBlock renders a reasoning/thinking block with muted italic text.
 // If duration > 0, shows "Thought for Xs" label. Otherwise shows just "Thought".
-func ReasoningBlock(content string, duration int64, ty *herald.Typography, theme style.Theme) string {
+// The width parameter controls soft-wrapping so long reasoning lines don't get cut off.
+func ReasoningBlock(content string, duration int64, width int, ty *herald.Typography, theme style.Theme) string {
 	if strings.TrimSpace(content) == "" {
 		return ""
 	}

-	// Match live streaming styling: muted italic text
+	// Match live streaming styling: muted italic text.
 	lines := strings.Split(strings.TrimRight(content, "\n"), "\n")
 	contentStr := strings.TrimLeft(strings.Join(lines, "\n"), " \t\n")
-	mutedStyle := lipgloss.NewStyle().Foreground(theme.Muted)
-	contentRendered := mutedStyle.Render(ty.Italic(contentStr))
+	if width > 4 {
+		contentStr = wrapText(contentStr, width-4)
+	}
+	cs := style.GetCachedStyles()
+	contentRendered := cs.Muted.Render(ty.Italic(contentStr))

 	// Build label based on duration
 	if duration > 0 {
@@ -82,14 +86,14 @@ func ReasoningBlock(content string, duration int64, ty *herald.Typography, theme
 		} else {
 			durationStr = fmt.Sprintf("%.1fs", float64(duration)/1000)
 		}
-		labelPart := lipgloss.NewStyle().Foreground(theme.VeryMuted).Render("Thought for ")
-		durationPart := lipgloss.NewStyle().Foreground(theme.Accent).Render(durationStr)
+		labelPart := cs.VeryMuted.Render("Thought for ")
+		durationPart := cs.Accent.Render(durationStr)
 		label := labelPart + durationPart
 		rendered := contentRendered + "\n" + label
 		return styleMarginBottom(theme, rendered)
 	}

-	label := lipgloss.NewStyle().Foreground(theme.VeryMuted).Render("Thought")
+	label := cs.VeryMuted.Render("Thought")
 	rendered := contentRendered + "\n" + label

 	return styleMarginBottom(theme, rendered)
@@ -105,6 +109,45 @@ func SystemBlock(content string, ty *herald.Typography, theme style.Theme) strin
 	return styleMarginBottom(theme, rendered)
 }

+// CustomBlock renders a message with herald Note styling and a custom label.
+// Content is rendered as markdown before being wrapped in the alert. This
+// creates a one-off Typography instance with the given label so callers
+// can use any title (e.g. "Help", "Warning") without changing the shared
+// typography's default "Info" label.
+func CustomBlock(content, label string, width int, theme style.Theme) string {
+	if strings.TrimSpace(content) == "" {
+		content = "No content available"
+	}
+
+	// Render markdown first — subtract 4 for the alert bar prefix ("│ ").
+	mdWidth := max(width-4, 10)
+	rendered := style.ToMarkdown(content, mdWidth)
+
+	ty := herald.New(
+		herald.WithPalette(herald.ColorPalette{
+			Primary:   theme.Primary,
+			Secondary: theme.Secondary,
+			Tertiary:  theme.Info,
+			Accent:    theme.Accent,
+			Highlight: theme.Highlight,
+			Muted:     theme.Muted,
+			Text:      theme.Text,
+			Surface:   theme.Background,
+			Base:      theme.CodeBg,
+		}),
+		herald.WithAlertPalette(herald.AlertPalette{
+			Note:      theme.Info,
+			Tip:       theme.Success,
+			Important: theme.Accent,
+			Warning:   theme.Warning,
+			Caution:   theme.Error,
+		}),
+		herald.WithAlertLabel(herald.AlertNote, label),
+	)
+	alertRendered := ty.Note(rendered)
+	return styleMarginBottom(theme, alertRendered)
+}
+
 // ErrorBlock renders an error message with herald Caution styling.
 func ErrorBlock(errorMsg string, ty *herald.Typography, theme style.Theme) string {
 	rendered := ty.Caution(errorMsg)
@@ -151,5 +194,11 @@ func ToolBlock(displayName, params, body string, isError bool, width int, ty *he

 // styleMarginBottom applies a 1-line margin bottom using the theme.
 func styleMarginBottom(theme style.Theme, content string) string {
-	return lipgloss.NewStyle().MarginBottom(1).Render(content)
+	return style.GetCachedStyles().MarginBottom1.Render(content)
+}
+
+// wrapText soft-wraps a string to the given width using lipgloss, which is
+// ANSI-aware and preserves escape sequences across line breaks.
+func wrapText(s string, width int) string {
+	return lipgloss.NewStyle().Width(width).Render(s)
 }
@@ -23,7 +23,8 @@ func testTypography(theme style.Theme) *herald.Typography {
 			Surface:   theme.Background,
 			Base:      theme.CodeBg,
 		}),
-		herald.WithAlertLabel(herald.AlertTip, "You"),
+		herald.WithAlertLabel(herald.AlertTip, ""),
+		herald.WithAlertIcon(herald.AlertTip, ""),
 	)
 }

@@ -70,18 +71,18 @@ func TestHighlightFileTokens(t *testing.T) {

 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
-			result := highlightFileTokens(tt.input, theme)
+			result := HighlightFileTokens(tt.input, theme)

 			for _, want := range tt.wantHas {
 				if !strings.Contains(result, want) {
-					t.Errorf("highlightFileTokens(%q) = %q, want substring %q", tt.input, result, want)
+					t.Errorf("HighlightFileTokens(%q) = %q, want substring %q", tt.input, result, want)
 				}
 			}

 			// If there were @tokens, the result should contain ANSI escape
 			// sequences (from lipgloss styling).
 			if fileTokenPattern.MatchString(tt.input) && !strings.Contains(result, "\x1b[") {
-				t.Errorf("highlightFileTokens(%q) should contain ANSI escapes for @tokens but got %q", tt.input, result)
+				t.Errorf("HighlightFileTokens(%q) should contain ANSI escapes for @tokens but got %q", tt.input, result)
 			}
 		})
 	}
@@ -35,6 +35,12 @@ type ScrollList struct {
 	autoScroll bool // Whether to auto-scroll to bottom on new content
 	itemGap    int  // Number of blank lines between items (0 = no gap)

+	// heightCache maps item ID → rendered line count at current width.
+	// Avoids redundant Render() calls in GotoBottom/clampOffset/AtBottom.
+	// Invalidated on width change; individual entries are refreshed in
+	// View() when an item is actually rendered.
+	heightCache map[string]int
+
 	// Character-level text selection (crush-style).
 	sel selection.State
 }
@@ -42,13 +48,14 @@ type ScrollList struct {
 // NewScrollList creates a new ScrollList with the given dimensions.
 func NewScrollList(width, height int) *ScrollList {
 	return &ScrollList{
-		items:      []MessageItem{},
-		offsetIdx:  0,
-		offsetLine: 0,
-		width:      width,
-		height:     height,
-		autoScroll: true,
-		sel:        selection.NewState(),
+		items:       []MessageItem{},
+		offsetIdx:   0,
+		offsetLine:  0,
+		width:       width,
+		height:      height,
+		autoScroll:  true,
+		heightCache: make(map[string]int, 64),
+		sel:         selection.NewState(),
 	}
 }

@@ -61,6 +68,13 @@ func (s *ScrollList) SetItems(items []MessageItem) {
 	}
 }

+// InvalidateItemHeight removes the cached height for the given item ID,
+// forcing a re-render on the next height query. Call this after mutating
+// an item's content (e.g. AppendChunk on a streaming message).
+func (s *ScrollList) InvalidateItemHeight(id string) {
+	delete(s.heightCache, id)
+}
+
 // SetHeight updates the viewport height. Called when the terminal is resized.
 func (s *ScrollList) SetHeight(height int) {
 	s.height = height
@@ -68,9 +82,11 @@ func (s *ScrollList) SetHeight(height int) {
 }

 // SetWidth updates the viewport width. Called when the terminal is resized.
-// This may invalidate cached renders in MessageItems.
+// This invalidates the height cache since rendered heights are width-dependent.
 func (s *ScrollList) SetWidth(width int) {
 	s.width = width
+	// Width change invalidates all cached heights.
+	clear(s.heightCache)
 	s.clampOffset()
 }
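The caching scheme introduced in these hunks (memoize per item ID, drop one entry on content change, drop everything on width change) can be sketched in isolation. This is an illustrative sketch under assumptions: `item`, `heightCache`, and the toy `measure` function are hypothetical stand-ins for MessageItem, ScrollList, and Render().

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical stand-in for MessageItem.
type item struct {
	id   string
	text string
}

// heightCache memoizes rendered line counts keyed by item ID.
type heightCache struct {
	width   int
	heights map[string]int
	renders int // number of measure calls, tracked for illustration only
}

func newHeightCache(width int) *heightCache {
	return &heightCache{width: width, heights: make(map[string]int)}
}

// measure is a toy "render": it hard-wraps each line at width runes
// and counts the resulting lines.
func (c *heightCache) measure(it item) int {
	c.renders++
	lines := 0
	for _, l := range strings.Split(it.text, "\n") {
		n := len([]rune(l))
		lines += max(1, (n+c.width-1)/c.width)
	}
	return lines
}

// height returns the cached height, measuring only on a cache miss.
func (c *heightCache) height(it item) int {
	if h, ok := c.heights[it.id]; ok {
		return h
	}
	h := c.measure(it)
	c.heights[it.id] = h
	return h
}

// invalidate drops one entry, e.g. after the item's content changed.
func (c *heightCache) invalidate(id string) { delete(c.heights, id) }

// setWidth drops every entry, since heights depend on the width.
func (c *heightCache) setWidth(w int) {
	c.width = w
	clear(c.heights)
}

func main() {
	c := newHeightCache(5)
	it := item{id: "a", text: "0123456789"}
	fmt.Println(c.height(it), c.height(it), c.renders)
	c.setWidth(2)
	fmt.Println(c.height(it), c.renders)
}
```

The sketch relies on the `max` and `clear` builtins, which need Go 1.21 or later, the same as the code in the diff.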

@@ -338,9 +354,8 @@ func (s *ScrollList) ScrollBy(lines int) {
 			if s.offsetIdx >= len(s.items) {
 				break
 			}
-			currentItem := s.items[s.offsetIdx]
-			itemHeight := currentItem.Height()
-			remainingLines := itemHeight - s.offsetLine
+			ih := s.itemHeight(s.items[s.offsetIdx])
+			remainingLines := ih - s.offsetLine

 			if lines >= remainingLines {
 				// Move to next item
@@ -387,14 +402,13 @@ func (s *ScrollList) ScrollBy(lines int) {
 				// Move to previous item
 				s.offsetIdx--
 				if s.offsetIdx < len(s.items) {
-					currentItem := s.items[s.offsetIdx]
-					itemHeight := currentItem.Height()
+					ih := s.itemHeight(s.items[s.offsetIdx])

-					if lines >= itemHeight {
-						lines -= itemHeight
+					if lines >= ih {
+						lines -= ih
 						s.offsetLine = 0
 					} else {
-						s.offsetLine = itemHeight - lines
+						s.offsetLine = ih - lines
 						lines = 0
 					}
 				}
@@ -405,6 +419,8 @@ func (s *ScrollList) ScrollBy(lines int) {
 }

 // GotoBottom scrolls to the end of the list.
+// Uses cached heights and walks backwards from the end to avoid rendering
+// every item in the list.
 func (s *ScrollList) GotoBottom() {
 	if len(s.items) == 0 {
 		s.offsetIdx = 0
@@ -412,42 +428,31 @@ func (s *ScrollList) GotoBottom() {
 		return
 	}

-	// Calculate total height including gaps
-	totalHeight := 0
-	for i, item := range s.items {
-		rendered := item.Render(s.width)
-		itemHeight := strings.Count(rendered, "\n") + 1
-		totalHeight += itemHeight
-		if s.itemGap > 0 && i < len(s.items)-1 {
-			totalHeight += s.itemGap
+	// Walk backwards from the last item, accumulating height until we
+	// exceed the viewport. This is O(visible) instead of O(all items).
+	budget := s.height
+	for idx := len(s.items) - 1; idx >= 0; idx-- {
+		ih := s.itemHeight(s.items[idx])
+
+		// Account for the gap *below* this item (between idx and idx+1).
+		gap := 0
+		if s.itemGap > 0 && idx < len(s.items)-1 {
+			gap = s.itemGap
 		}
-	}

-	// If content fits in viewport, start at top
-	if totalHeight <= s.height {
-		s.offsetIdx = 0
-		s.offsetLine = 0
-		return
-	}
-
-	// Otherwise, position viewport at bottom
-	remaining := totalHeight - s.height
-	for idx := 0; idx < len(s.items); idx++ {
-		rendered := s.items[idx].Render(s.width)
-		itemHeight := strings.Count(rendered, "\n") + 1
-		if remaining < itemHeight {
+		if ih+gap >= budget {
+			// This item (partially) fills the remaining budget. Charge the
+			// gap below it against the budget, and cap the offset at the
+			// item's last line in case the gap alone exceeds what is left.
 			s.offsetIdx = idx
-			s.offsetLine = remaining
+			s.offsetLine = min(ih+gap-budget, max(0, ih-1))
 			return
 		}
-		remaining -= itemHeight
-		if s.itemGap > 0 && idx < len(s.items)-1 {
-			remaining -= s.itemGap
-		}
+		budget -= ih + gap
 	}

-	// Fallback: show last item
-	s.offsetIdx = max(0, len(s.items)-1)
+	// All content fits in viewport — start at top.
+	s.offsetIdx = 0
 	s.offsetLine = 0
 }
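The backward walk in GotoBottom can be illustrated on a plain slice of item heights. The sketch below is a simplification that ignores item gaps entirely; `bottomOffset` is a hypothetical function name, and its return values correspond to offsetIdx and offsetLine.

```go
package main

import "fmt"

// bottomOffset walks item heights from the end and returns the index of
// the first visible item plus the number of its lines to skip so that
// the last line of content sits at the bottom of the viewport.
// Item gaps are ignored in this sketch.
func bottomOffset(heights []int, viewport int) (idx, line int) {
	budget := viewport
	for i := len(heights) - 1; i >= 0; i-- {
		if heights[i] >= budget {
			// This item fills the remaining budget; skip its excess lines.
			return i, heights[i] - budget
		}
		budget -= heights[i]
	}
	// Everything fits in the viewport: start at the top.
	return 0, 0
}

func main() {
	fmt.Println(bottomOffset([]int{3, 3, 3}, 5))
	fmt.Println(bottomOffset([]int{1, 1}, 5))
	fmt.Println(bottomOffset([]int{10}, 4))
}
```

Only the items that end up visible are measured, which is why the walk is cheaper than summing the height of the whole list first.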

@@ -465,14 +470,12 @@ func (s *ScrollList) AtBottom() bool {

 	visibleHeight := 0
 	for idx := s.offsetIdx; idx < len(s.items); idx++ {
-		item := s.items[idx]
-		rendered := item.Render(s.width)
-		itemHeight := strings.Count(rendered, "\n") + 1
+		ih := s.itemHeight(s.items[idx])

 		if idx == s.offsetIdx {
-			visibleHeight += itemHeight - s.offsetLine
+			visibleHeight += ih - s.offsetLine
 		} else {
-			visibleHeight += itemHeight
+			visibleHeight += ih
 		}

 		if s.itemGap > 0 && idx < len(s.items)-1 {
@@ -520,6 +523,9 @@ func (s *ScrollList) View() string {
 			content := item.Render(s.width)
 			contentLines := strings.Split(content, "\n")

+			// Refresh height cache from the actual render (authoritative).
+			s.heightCache[item.ID()] = len(contentLines)
+
 			startLine := 0
 			if idx == s.offsetIdx {
 				startLine = s.offsetLine
@@ -568,7 +574,7 @@ func (s *ScrollList) ScrollPercent() float64 {

 	totalHeight := 0
 	for _, item := range s.items {
-		totalHeight += item.Height()
+		totalHeight += s.itemHeight(item)
 	}

 	if totalHeight <= s.height {
@@ -577,7 +583,7 @@ func (s *ScrollList) ScrollPercent() float64 {

 	linesAbove := 0
 	for i := 0; i < s.offsetIdx && i < len(s.items); i++ {
-		linesAbove += s.items[i].Height()
+		linesAbove += s.itemHeight(s.items[i])
 	}
 	linesAbove += s.offsetLine

@@ -597,7 +603,8 @@ func (s *ScrollList) ScrollPercent() float64 {
 }

 // clampOffset ensures the offset values are within valid bounds after
-// resizing or scrolling operations.
+// resizing or scrolling operations. Uses cached heights to avoid
+// redundant Render() calls.
 func (s *ScrollList) clampOffset() {
 	if len(s.items) == 0 {
 		s.offsetIdx = 0
@@ -605,6 +612,7 @@ func (s *ScrollList) clampOffset() {
 		return
 	}

+	// Clamp offsetIdx to valid item range.
 	if s.offsetIdx >= len(s.items) {
 		s.offsetIdx = len(s.items) - 1
 	}
@@ -612,37 +620,38 @@ func (s *ScrollList) clampOffset() {
 		s.offsetIdx = 0
 	}

+	// Clamp offsetLine within current item.
 	if s.offsetIdx < len(s.items) {
-		rendered := s.items[s.offsetIdx].Render(s.width)
-		itemHeight := strings.Count(rendered, "\n") + 1
-		if s.offsetLine >= itemHeight {
-			s.offsetLine = max(0, itemHeight-1)
+		ih := s.itemHeight(s.items[s.offsetIdx])
+		if s.offsetLine >= ih {
+			s.offsetLine = max(0, ih-1)
 		}
 	}
 	if s.offsetLine < 0 {
 		s.offsetLine = 0
 	}

-	// Prevent scrolling past the bottom
+	// Prevent scrolling past the bottom — compute total height and check
+	// whether remaining content from the current offset fills the viewport.
 	totalHeight := 0
 	for i, item := range s.items {
-		rendered := item.Render(s.width)
-		totalHeight += strings.Count(rendered, "\n") + 1
+		totalHeight += s.itemHeight(item)
 		if s.itemGap > 0 && i < len(s.items)-1 {
 			totalHeight += s.itemGap
 		}
 	}

+	// If content fits in viewport, force start at top.
 	if totalHeight <= s.height {
 		s.offsetIdx = 0
 		s.offsetLine = 0
 		return
 	}

+	// Compute lines above the viewport.
 	linesAbove := 0
 	for i := 0; i < s.offsetIdx; i++ {
-		rendered := s.items[i].Render(s.width)
-		linesAbove += strings.Count(rendered, "\n") + 1
+		linesAbove += s.itemHeight(s.items[i])
 		if s.itemGap > 0 && i < len(s.items)-1 {
 			linesAbove += s.itemGap
 		}
@@ -651,20 +660,21 @@ func (s *ScrollList) clampOffset() {

 	linesFromCurrentToEnd := totalHeight - linesAbove
 	if linesFromCurrentToEnd < s.height {
+		// We've scrolled past the bottom — reposition so the last line
+		// of content sits at the bottom of the viewport.
 		targetLine := totalHeight - s.height
 		currentLine := 0

 		for idx := 0; idx < len(s.items); idx++ {
-			rendered := s.items[idx].Render(s.width)
-			itemHeight := strings.Count(rendered, "\n") + 1
+			ih := s.itemHeight(s.items[idx])

-			if currentLine+itemHeight > targetLine {
+			if currentLine+ih > targetLine {
 				s.offsetIdx = idx
 				s.offsetLine = targetLine - currentLine
 				return
 			}

-			currentLine += itemHeight
+			currentLine += ih
 			if s.itemGap > 0 && idx < len(s.items)-1 {
 				currentLine += s.itemGap
 			}
@@ -672,6 +682,26 @@ func (s *ScrollList) clampOffset() {
 	}
 }

+// itemHeight returns the cached rendered height for an item, computing and
+// caching it on first access. This avoids calling Render() purely to
+// count lines — the most common source of redundant work in the scroll
+// list (GotoBottom, clampOffset, AtBottom, ScrollBy all need heights but
+// never use the rendered content).
+//
+// The cache is invalidated wholesale on width changes (SetWidth) and
+// individual entries are refreshed in View() after an item is actually
+// rendered, so stale entries are self-correcting within one frame.
+func (s *ScrollList) itemHeight(item MessageItem) int {
+	id := item.ID()
+	if h, ok := s.heightCache[id]; ok {
+		return h
+	}
+	// Cache miss — render to measure.
+	h := s.renderedHeight(item)
+	s.heightCache[id] = h
+	return h
+}
+
 // renderedHeight returns the height of a message item in lines by actually
 // rendering it. This is the single source of truth for item height — it
 // matches exactly what View() produces, unlike item.Height() which may
@@ -21,12 +21,11 @@ func knightRiderFrames() []string {
 	const numDots = 8
 	const dot = "▪"

-	theme := style.GetTheme()
-
-	bright := lipgloss.NewStyle().Foreground(theme.Primary)
-	med := lipgloss.NewStyle().Foreground(theme.Muted)
-	dim := lipgloss.NewStyle().Foreground(theme.VeryMuted)
-	off := lipgloss.NewStyle().Foreground(theme.MutedBorder)
+	cs := style.GetCachedStyles()
+	bright := cs.SpinnerBright
+	med := cs.SpinnerMed
+	dim := cs.SpinnerDim
+	off := cs.SpinnerOff

 	// Scanner bounces: 0→7→0
 	positions := make([]int, 0, 2*numDots-2)
@@ -472,9 +471,12 @@ func (s *StreamComponent) renderReasoningBlock(reasoning string) string {

 	// Main content using Italic with Muted color for visual distinction.
 	content := strings.TrimLeft(strings.Join(lines, "\n"), " \t\n")
-	theme := GetTheme()
-	mutedStyle := lipgloss.NewStyle().Foreground(theme.Muted)
-	parts = append(parts, mutedStyle.Render(s.ty.Italic(content)))
+	// Soft-wrap to the available width so long lines don't get cut off.
+	if s.width > 4 {
+		content = lipgloss.NewStyle().Width(s.width - 4).Render(content)
+	}
+	cs := style.GetCachedStyles()
+	parts = append(parts, cs.Muted.Render(s.ty.Italic(content)))

 	// Duration footer with VeryMuted label and Accent duration.
 	var duration time.Duration
@@ -490,8 +492,8 @@ func (s *StreamComponent) renderReasoningBlock(reasoning string) string {
 		} else {
 			durationStr = fmt.Sprintf("%.1fs", duration.Seconds())
 		}
-		label := lipgloss.NewStyle().Foreground(theme.VeryMuted).Render("Thought for ")
-		durationStyled := lipgloss.NewStyle().Foreground(theme.Accent).Render(durationStr)
+		label := cs.VeryMuted.Render("Thought for ")
+		durationStyled := cs.Accent.Render(durationStr)
 		parts = append(parts, label+durationStyled)
 	}

@@ -588,8 +590,10 @@ func formatToolExecutionMessage(toolName string) string {
 	return toolName
 }

-// UpdateTheme refreshes the component's typography instance with colors from
-// the current theme. This is called when the user changes themes via /theme.
+// UpdateTheme refreshes the component's typography instance and spinner
+// animation frames with colors from the current theme. This is called when
+// the user changes themes via /theme.
 func (s *StreamComponent) UpdateTheme() {
 	s.ty = createTypography(GetTheme())
+	s.spinnerFrames = knightRiderFrames()
 }
@@ -40,6 +40,70 @@ func GetTheme() Theme {
 func SetTheme(theme Theme) {
 	currentTheme = theme
 	markdownTypographyCache = nil // invalidate cached renderer; colors may have changed
+	styleCache = nil              // invalidate cached styles; colors may have changed
+}
+
+// CachedStyles holds pre-built lipgloss styles that are reused across
+// render frames. Invalidated by SetTheme, lazily rebuilt on next access.
+// Only accessed from BubbleTea's single-threaded Update/View cycle.
+type CachedStyles struct {
+	// render/blocks.go
+	FileTokenAccent lipgloss.Style // Foreground(Accent).Bold(true)
+	Muted           lipgloss.Style // Foreground(Muted)
+	VeryMuted       lipgloss.Style // Foreground(VeryMuted)
+	Accent          lipgloss.Style // Foreground(Accent)
+	MarginBottom1   lipgloss.Style // MarginBottom(1)
+
+	// stream.go - spinner phases
+	SpinnerBright lipgloss.Style // Foreground(Primary)
+	SpinnerMed    lipgloss.Style // Foreground(Muted)
+	SpinnerDim    lipgloss.Style // Foreground(VeryMuted)
+	SpinnerOff    lipgloss.Style // Foreground(MutedBorder)
+
+	// message_items.go - bash output
+	BashHeader lipgloss.Style // Foreground(Muted).Italic(true)
+	BashStderr lipgloss.Style // Foreground(Error)
+
+	// render/blocks.go - tool block
+	ToolSuccess lipgloss.Style // Foreground(Success)
+	ToolError   lipgloss.Style // Foreground(Error)
+	ToolInfo    lipgloss.Style // Foreground(Info).Bold(true)
+	ToolMuted   lipgloss.Style // Foreground(Muted)
+
+	// common
+	ErrorFg  lipgloss.Style // Foreground(Error)
+	TextBold lipgloss.Style // Foreground(Text).Bold(true)
+}
+
+var styleCache *CachedStyles
+
+// GetCachedStyles returns the pre-built style cache, creating it lazily
+// from the current theme. Invalidated by SetTheme.
+func GetCachedStyles() *CachedStyles {
+	if styleCache != nil {
+		return styleCache
+	}
+	theme := GetTheme()
+	styleCache = &CachedStyles{
+		FileTokenAccent: lipgloss.NewStyle().Foreground(theme.Accent).Bold(true),
+		Muted:           lipgloss.NewStyle().Foreground(theme.Muted),
+		VeryMuted:       lipgloss.NewStyle().Foreground(theme.VeryMuted),
+		Accent:          lipgloss.NewStyle().Foreground(theme.Accent),
+		MarginBottom1:   lipgloss.NewStyle().MarginBottom(1),
+		SpinnerBright:   lipgloss.NewStyle().Foreground(theme.Primary),
+		SpinnerMed:      lipgloss.NewStyle().Foreground(theme.Muted),
+		SpinnerDim:      lipgloss.NewStyle().Foreground(theme.VeryMuted),
+		SpinnerOff:      lipgloss.NewStyle().Foreground(theme.MutedBorder),
+		BashHeader:      lipgloss.NewStyle().Foreground(theme.Muted).Italic(true),
+		BashStderr:      lipgloss.NewStyle().Foreground(theme.Error),
+		ToolSuccess:     lipgloss.NewStyle().Foreground(theme.Success),
+		ToolError:       lipgloss.NewStyle().Foreground(theme.Error),
+		ToolInfo:        lipgloss.NewStyle().Foreground(theme.Info).Bold(true),
+		ToolMuted:       lipgloss.NewStyle().Foreground(theme.Muted),
+		ErrorFg:         lipgloss.NewStyle().Foreground(theme.Error),
+		TextBold:        lipgloss.NewStyle().Foreground(theme.Text).Bold(true),
+	}
+	return styleCache
 }

 // MarkdownThemeColors defines colors for markdown rendering and syntax highlighting.
@@ -200,10 +200,6 @@ func (ts *TreeSelectorComponent) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		case key.Matches(msg, key.NewBinding(key.WithKeys("ctrl+l"))):
 			ts.filter = TreeFilterLabelOnly
 			ts.rebuildFlatList()
-		case key.Matches(msg, key.NewBinding(key.WithKeys("ctrl+a"))):
-			ts.filter = TreeFilterAll
-			ts.rebuildFlatList()
-
 		default:
 			// Typing search.
 			if msg.Text != "" && len(msg.Text) == 1 {
@@ -77,6 +77,11 @@ host, err := kit.New(ctx, &kit.Options{

    // Compaction
    AutoCompact:  true,                       // Auto-compact near context limit
+
+    // In-process MCP servers (map name → *kit.MCPServer)
+    InProcessMCPServers: map[string]*kit.MCPServer{
+        "docs": mcpSrv,
+    },
 })
 ```

@@ -101,7 +106,7 @@ unsub2 := host.OnToolResult(func(e kit.ToolResultEvent) {
 })
 defer unsub2()

-unsub3 := host.OnStreaming(func(e kit.MessageUpdateEvent) {
+unsub3 := host.OnMessageUpdate(func(e kit.MessageUpdateEvent) {
    fmt.Print(e.Chunk)
 })
 defer unsub3()
@@ -112,6 +117,79 @@ response, err := host.Prompt(
 )
 ```

+### Dynamic MCP Server Management
+
+Add, remove, and list MCP servers at runtime:
+
+```go
+// Add an MCP server at runtime
+n, err := host.AddMCPServer(ctx, "github", kit.MCPServerConfig{
+    Command: "npx",
+    Args:    []string{"-y", "@modelcontextprotocol/server-github"},
+})
+fmt.Printf("Loaded %d tools from MCP server\n", n)
+
+// List connected MCP servers
+for _, s := range host.ListMCPServers() {
+    fmt.Printf("%s: %d tools\n", s.Name, s.ToolCount)
+}
+
+// Disconnect a server and remove its tools
+host.RemoveMCPServer("github")
+```
+
+### In-Process MCP Servers
+
+Register mcp-go servers that run in the same process — no subprocess spawning,
+no network I/O. This is ideal for custom tool servers implemented in Go:
+
+```go
+import (
+    "github.com/mark3labs/mcp-go/mcp"
+    "github.com/mark3labs/mcp-go/server"
+)
+
+// Create an mcp-go server with tools
+mcpSrv := server.NewMCPServer("my-tools", "1.0.0",
+    server.WithToolCapabilities(true),
+)
+mcpSrv.AddTool(mcp.NewTool("search_docs",
+    mcp.WithDescription("Search documentation"),
+    mcp.WithString("query", mcp.Required()),
+), searchHandler)
+
+// Option 1: At init time via Options
+host, _ := kit.New(ctx, &kit.Options{
+    InProcessMCPServers: map[string]*kit.MCPServer{
+        "docs": mcpSrv,
+    },
+})
+
+// Option 2: At runtime
+n, err := host.AddInProcessMCPServer(ctx, "docs", mcpSrv)
+fmt.Printf("Loaded %d tools from in-process server\n", n)
+```
+
+Kit does not take ownership of the server's lifecycle — the caller is responsible for any cleanup. In-process server tools are prefixed the same way as external MCP servers (e.g. `"docs__search_docs"`).
+
+### MCP Prompts
+
+MCP servers can expose prompt templates via the MCP prompts capability.
+Kit exposes these through the SDK:
+
+```go
+// List prompts from all connected MCP servers
+prompts := host.ListMCPPrompts()
+for _, p := range prompts {
+    fmt.Printf("%s/%s: %s\n", p.Server, p.Name, p.Description)
+}
+
+// Get a specific prompt with arguments
+msg, err := host.GetMCPPrompt(ctx, "server-name", "prompt-name", map[string]string{
+    "topic": "concurrency",
+})
+```
+
 ### Session Management

 Maintain conversation context:
@@ -145,6 +223,16 @@ kit.LLMUsage        // {InputTokens, OutputTokens, TotalTokens, ...}
 kit.LLMResponse     // {Content, FinishReason, Usage}
 kit.LLMFilePart     // {Filename, Data []byte, MediaType}

+// MCP server and OAuth types
+kit.MCPServer            // *server.MCPServer for in-process MCP transport
+kit.MCPServerConfig      // Configuration for an MCP server (stdio, SSE, or in-process)
+kit.MCPAuthHandler       // Interface: handles user-facing OAuth authorization
+kit.DefaultMCPAuthHandler // Port + callback-server mechanics; set OnAuthURL for presentation
+kit.CLIMCPAuthHandler    // CLI wrapper: opens browser, prints status
+kit.MCPTokenStore        // Persists OAuth tokens for a single MCP server
+kit.MCPToken             // OAuth token (access token, refresh token, expiry)
+kit.MCPTokenStoreFactory // Creates an MCPTokenStore for a given server URL
+
 // Conversion helpers
 msgs := kit.ConvertToLLMMessages(&msg)   // SDK Message → []LLMMessage
 msg  := kit.ConvertFromLLMMessage(lMsg)  // LLMMessage  → SDK Message
@@ -192,6 +280,7 @@ Key `Options` fields for SDK usage:
 | `NoSession` | Ephemeral mode (no session persistence) |
 | `SessionPath` | Open specific session file |
 | `Continue` | Resume most recent session |
+| `InProcessMCPServers` | Map of name → `*kit.MCPServer` for in-process MCP servers |
 | `Debug` | Enable debug logging |

 ## Environment Variables
@@ -22,13 +22,13 @@ func NewTreeManagerAdapter(tm *session.TreeManager) SessionManager {

 // AppendMessage implements SessionManager.
 func (a *treeManagerAdapter) AppendMessage(msg LLMMessage) (string, error) {
-	// LLMMessage is just an alias for fantasy.Message, so no conversion needed
+	// LLMMessage is a type alias, so no conversion needed.
 	return a.inner.AppendLLMMessage(msg)
 }

 // GetMessages implements SessionManager.
 func (a *treeManagerAdapter) GetMessages() []LLMMessage {
-	// LLMMessage is just an alias for fantasy.Message
+	// LLMMessage is a type alias, so no conversion needed.
 	return a.inner.GetLLMMessages()
 }

@@ -223,9 +223,8 @@ func (a *treeManagerAdapter) convertEntry(entry any) *BranchEntry {
 	}
 }

-// convertKitMessagesToFantasy converts kit LLM messages to fantasy messages.
-// Since LLMMessage is an alias for fantasy.Message, this is a no-op.
-func convertKitMessagesToFantasy(msgs []LLMMessage) []fantasy.Message {
-	// LLMMessage is just an alias for fantasy.Message, so we can type convert
+// convertToLLMMessages converts kit LLM messages to the underlying provider
+// message type. Since LLMMessage is a type alias, this is a no-op.
+func convertToLLMMessages(msgs []LLMMessage) []fantasy.Message {
 	return msgs
 }
@@ -58,7 +58,7 @@ func (m *Kit) ShouldCompact() bool {

 	// Fall back to text-based heuristic before first turn completes.
 	messages := m.session.GetMessages()
-	return compaction.ShouldCompact(convertKitMessagesToFantasy(messages), info.Limit.Context, reserveTokens)
+	return compaction.ShouldCompact(convertToLLMMessages(messages), info.Limit.Context, reserveTokens)
 }

 // GetContextStats returns current context usage statistics including
@@ -203,9 +203,9 @@ func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, cust
 // custom summary. It still determines the cut point and persists a
 // CompactionEntry.
 func (m *Kit) applyCustomCompaction(summary string, messages []LLMMessage, opts *CompactionOptions) (*CompactionResult, error) {
-	originalTokens := compaction.EstimateMessageTokens(convertKitMessagesToFantasy(messages))
+	originalTokens := compaction.EstimateMessageTokens(convertToLLMMessages(messages))

-	cutPoint := compaction.FindCutPoint(convertKitMessagesToFantasy(messages), opts.KeepRecentTokens)
+	cutPoint := compaction.FindCutPoint(convertToLLMMessages(messages), opts.KeepRecentTokens)
 	if cutPoint == 0 {
 		cutPoint = len(messages) - 1
 		if cutPoint < 1 {
@@ -38,20 +38,37 @@ Guidelines:
 - Be concise in your responses
 - Show file paths clearly when working with files`

-// setSDKDefaults registers the same viper defaults that the CLI sets via
-// cobra flag bindings. This ensures the SDK behaves identically to the CLI
-// even when cobra is not used.
+// sdkDefaultMaxTokens is the last-resort ceiling applied when the SDK caller
+// has not configured max-tokens via Options, env, config, or a per-model
+// default. It matches the CLI's --max-tokens cobra default so SDK and CLI
+// callers see the same base value before per-model right-sizing runs.
+// It is intentionally applied on the *models.ProviderConfig struct
+// (not via viper) so that viper.IsSet("max-tokens") remains false and the
+// right-sizing + per-model-default paths continue to work.
+const sdkDefaultMaxTokens = 8192
+
+// setSDKDefaults registers viper defaults that match the CLI's cobra flag
+// defaults for keys where SetDefault does not interfere with downstream
+// viper.IsSet() checks.
+//
+// Keys that participate in "explicit vs unset" precedence downstream —
+// max-tokens, temperature, top-p, top-k, frequency-penalty, presence-penalty,
+// thinking-level — are deliberately NOT registered here. viper.SetDefault
+// causes viper.IsSet() to return true, which would suppress per-model
+// defaults (ApplyModelSettings) and automatic right-sizing (rightSizeMaxTokens)
+// for every SDK-created Kit. Those defaults are instead applied:
+//
+//   - max-tokens: as a last-resort struct-level fallback (sdkDefaultMaxTokens)
+//     in kit.New() after BuildProviderConfig returns, when the resolved
+//     value is still zero.
+//   - thinking-level: handled implicitly by models.ParseThinkingLevel("")
+//     which returns models.ThinkingOff.
+//   - sampling params (temperature, top-p, top-k, frequency/presence-penalty):
+//     left as nil pointers so provider libraries apply their own defaults.
 func setSDKDefaults() {
 	viper.SetDefault("model", "anthropic/claude-sonnet-4-5-20250929")
 	viper.SetDefault("system-prompt", defaultSystemPrompt)
-	viper.SetDefault("max-tokens", 4096)
-	viper.SetDefault("temperature", 0.7)
-	viper.SetDefault("top-p", 0.95)
-	viper.SetDefault("top-k", 40)
-	viper.SetDefault("frequency-penalty", 0.0)
-	viper.SetDefault("presence-penalty", 0.0)
 	viper.SetDefault("stream", true)
-	viper.SetDefault("thinking-level", "off")
 	viper.SetDefault("num-gpu-layers", -1)
 	viper.SetDefault("main-gpu", 0)
 }
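Reviewer note: the comment above relies on being able to tell "explicitly set" apart from "unset". A minimal sketch of that precedence follows, assuming a nil pointer models the unset case. `resolveMaxTokens` and `fallbackMaxTokens` are hypothetical names, and the real code path (`viper.IsSet`, `ApplyModelSettings`, `rightSizeMaxTokens`) is more involved than this.

```go
package main

import "fmt"

// fallbackMaxTokens mirrors the value of sdkDefaultMaxTokens in the diff.
const fallbackMaxTokens = 8192

// resolveMaxTokens is a hypothetical helper: an explicit value wins, then a
// per-model default, then the last-resort fallback.
func resolveMaxTokens(explicit *int, modelDefault int) int {
	if explicit != nil {
		return *explicit
	}
	if modelDefault > 0 {
		return modelDefault
	}
	return fallbackMaxTokens
}

func main() {
	userValue := 2048
	fmt.Println(resolveMaxTokens(&userValue, 16000)) // explicit value wins
	fmt.Println(resolveMaxTokens(nil, 16000))        // per-model default applies
	fmt.Println(resolveMaxTokens(nil, 0))            // last-resort fallback
}
```

If a default were registered up front, the first branch would always be taken and the per-model default would never apply. That is the failure mode the comment attributes to `viper.SetDefault`.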
@@ -102,6 +119,10 @@ func InitConfig(configFile string, debug bool) error {
 	}

 	viper.SetEnvPrefix("KIT")
+	// Map hyphenated config keys (e.g. "max-tokens") to underscored env
+	// var names (e.g. KIT_MAX_TOKENS). Without this, AutomaticEnv looks
+	// for KIT_MAX-TOKENS and silently misses valid env overrides.
+	viper.SetEnvKeyReplacer(strings.NewReplacer("-", "_"))
 	viper.AutomaticEnv()
 	return nil
 }
@@ -23,6 +23,14 @@ const (
 	EventMessageUpdate EventType = "message_update"
 	// EventMessageEnd fires when the assistant message is complete.
 	EventMessageEnd EventType = "message_end"
+	// EventToolCallStart fires when the LLM begins generating tool call arguments.
+	// The tool name is known but arguments are still streaming.
+	EventToolCallStart EventType = "tool_call_start"
+	// EventToolCallDelta fires for each streamed fragment of tool call arguments.
+	EventToolCallDelta EventType = "tool_call_delta"
+	// EventToolCallEnd fires when tool argument streaming is complete, before
+	// the tool call is parsed and execution begins.
+	EventToolCallEnd EventType = "tool_call_end"
 	// EventToolCall fires when a tool call has been parsed and is about to execute.
 	EventToolCall EventType = "tool_call"
 	// EventToolExecutionStart fires when a tool begins executing.
@@ -45,9 +53,36 @@ const (
 	// EventToolOutput fires when a tool produces streaming output chunks.
 	EventToolOutput EventType = "tool_output"
 	EventStepUsage  EventType = "step_usage"
+	// EventPasswordPrompt fires when a sudo command needs a password.
+	EventPasswordPrompt EventType = "password_prompt"
 	// EventSteerConsumed fires when one or more steering messages have been
 	// injected into the agent turn via PrepareStep.
 	EventSteerConsumed EventType = "steer_consumed"
+	// EventStepStart fires when a new LLM call begins within a turn.
+	EventStepStart EventType = "step_start"
+	// EventStepFinish fires when a step completes, providing full step context
+	// including whether tool calls were made, the finish reason, and usage stats.
+	EventStepFinish EventType = "step_finish"
+	// EventTextStart fires when the LLM begins generating text content.
+	EventTextStart EventType = "text_start"
+	// EventTextEnd fires when the LLM finishes generating text content.
+	EventTextEnd EventType = "text_end"
+	// EventReasoningStart fires when the LLM begins reasoning/thinking.
+	EventReasoningStart EventType = "reasoning_start"
+	// EventWarnings fires when the LLM provider returns warnings.
+	EventWarnings EventType = "warnings"
+	// EventSource fires when the LLM references a source (e.g. from web search).
+	EventSource EventType = "source"
+	// EventStreamFinish fires when a per-step LLM stream completes with
+	// usage stats and a finish reason.
+	EventStreamFinish EventType = "stream_finish"
+	// EventError fires when an agent-level error occurs during streaming.
+	// This is distinct from TurnEndEvent.Error — it fires at the point of
+	// failure, before the turn ends.
+	EventError EventType = "error"
+	// EventRetry fires when the LLM provider request is retried after a
+	// transient error.
+	EventRetry EventType = "retry"
 )

 // ---------------------------------------------------------------------------
@@ -108,6 +143,38 @@ func parseToolArgs(toolArgs string) map[string]any {
 	return nil
 }

+// ---------------------------------------------------------------------------
+// Finish reason constants
+// ---------------------------------------------------------------------------
+
+// Finish reasons reported by the LLM provider on a completed turn. These
+// mirror fantasy.FinishReason string values so comparisons against
+// TurnEndEvent.StopReason / TurnResult.StopReason are stable across
+// providers.
+const (
+	// FinishReasonStop: the model produced a natural stop (e.g. stop sequence
+	// or end-of-turn signal).
+	FinishReasonStop = "stop"
+	// FinishReasonLength: the model hit the configured max_output_tokens
+	// budget. The response is truncated. Surface this to the user and
+	// consider raising --max-tokens / KIT_MAX_TOKENS / modelSettings[...]
+	// .maxTokens.
+	FinishReasonLength = "length"
+	// FinishReasonToolCalls: the model stopped to emit tool calls (normal
+	// mid-turn state during agentic loops).
+	FinishReasonToolCalls = "tool-calls"
+	// FinishReasonContentFilter: the provider's safety filter stopped
+	// generation.
+	FinishReasonContentFilter = "content-filter"
+	// FinishReasonError: the model stopped because of an error.
+	FinishReasonError = "error"
+	// FinishReasonOther: provider-specific reason that doesn't map to any of
+	// the above.
+	FinishReasonOther = "other"
+	// FinishReasonUnknown: the provider didn't report a finish reason.
+	FinishReasonUnknown = "unknown"
+)
+
 // ---------------------------------------------------------------------------
 // Concrete event structs
 // ---------------------------------------------------------------------------
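Reviewer note: a caller would typically branch on `StopReason` using these constants. The sketch below copies the string values from the hunk into local constants so that it compiles on its own. `describeStop` and its messages are hypothetical and not part of the SDK.

```go
package main

import "fmt"

// Local copies of string values shown in the diff; real code would use the
// exported kit.FinishReason* constants instead.
const (
	finishReasonStop          = "stop"
	finishReasonLength        = "length"
	finishReasonToolCalls     = "tool-calls"
	finishReasonContentFilter = "content-filter"
	finishReasonError         = "error"
)

// describeStop is a hypothetical helper that maps a stop reason to a
// user-facing hint. It returns an empty string when nothing needs surfacing.
func describeStop(reason string) string {
	switch reason {
	case finishReasonLength:
		return "response truncated: consider raising max tokens"
	case finishReasonContentFilter:
		return "generation stopped by the provider's safety filter"
	case finishReasonError:
		return "the model stopped because of an error"
	case finishReasonStop, finishReasonToolCalls:
		return ""
	default:
		return "unrecognized finish reason: " + reason
	}
}

func main() {
	for _, r := range []string{finishReasonStop, finishReasonLength, "other"} {
		fmt.Printf("%q -> %q\n", r, describeStop(r))
	}
}
```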
@@ -122,9 +189,13 @@ func (e TurnStartEvent) EventType() EventType { return EventTurnStart }

 // TurnEndEvent fires after the agent finishes processing.
 type TurnEndEvent struct {
-	Response   string
-	Error      error
-	StopReason string // "end_turn", "max_tokens", "tool_use", "error", etc.
+	Response string
+	Error    error
+	// StopReason is the LLM provider's finish reason for the final step of
+	// the turn. Compare against the FinishReason* constants — in particular,
+	// FinishReasonLength indicates the response was truncated because the
+	// agent hit its max_output_tokens budget.
+	StopReason string
 }

 // EventType implements Event.
@@ -178,6 +249,40 @@ type MessageEndEvent struct {
 // EventType implements Event.
 func (e MessageEndEvent) EventType() EventType { return EventMessageEnd }

+// ToolCallStartEvent fires when the LLM begins generating tool call arguments.
+// The tool name is known at this point but the full arguments are still being
+// streamed. UIs can use this to show a "running" indicator immediately instead
+// of waiting for the full argument JSON to finish streaming.
+type ToolCallStartEvent struct {
+	ToolCallID string // Stable ID for correlating tool lifecycle events
+	ToolName   string
+	ToolKind   string // Tool classification: "execute", "edit", "read", "search", "agent"
+}
+
+// EventType implements Event.
+func (e ToolCallStartEvent) EventType() EventType { return EventToolCallStart }
+
+// ToolCallDeltaEvent fires for each streamed fragment of tool call arguments.
+// Useful for live-previewing artifact content as it's generated, or showing a
+// progress indicator with byte count.
+type ToolCallDeltaEvent struct {
+	ToolCallID string // Stable ID for correlating tool lifecycle events
+	Delta      string // JSON fragment of tool arguments
+}
+
+// EventType implements Event.
+func (e ToolCallDeltaEvent) EventType() EventType { return EventToolCallDelta }
+
+// ToolCallEndEvent fires when tool argument streaming is complete, before
+// the tool call is parsed and execution begins. UIs can use this to
+// transition from a "generating args" state to an "executing" state.
+type ToolCallEndEvent struct {
+	ToolCallID string // Stable ID for correlating tool lifecycle events
+}
+
+// EventType implements Event.
+func (e ToolCallEndEvent) EventType() EventType { return EventToolCallEnd }
+
 // ToolCallEvent fires when a tool call has been parsed.
 type ToolCallEvent struct {
 	ToolCallID string // Stable ID for correlating tool lifecycle events
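Reviewer note: the start, delta, and end events share a `ToolCallID`, so a consumer can reassemble the streamed argument JSON per call. The sketch below shows one way to do that. `argBuffer` and its methods are hypothetical, and in real code they would be driven from `OnToolCallStart`, `OnToolCallDelta`, and `OnToolCallEnd` handlers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// argBuffer is a hypothetical accumulator for streamed tool-call arguments,
// keyed by tool call ID.
type argBuffer struct {
	parts map[string]*strings.Builder
}

func newArgBuffer() *argBuffer {
	return &argBuffer{parts: make(map[string]*strings.Builder)}
}

// start begins a fresh buffer for the given call.
func (b *argBuffer) start(id string) {
	b.parts[id] = &strings.Builder{}
}

// delta appends one JSON fragment, creating the buffer if start was missed.
func (b *argBuffer) delta(id, fragment string) {
	sb, ok := b.parts[id]
	if !ok {
		sb = &strings.Builder{}
		b.parts[id] = sb
	}
	sb.WriteString(fragment)
}

// end parses the accumulated JSON and releases the buffer.
func (b *argBuffer) end(id string) (map[string]any, error) {
	sb, ok := b.parts[id]
	if !ok {
		return nil, fmt.Errorf("no buffered arguments for call %q", id)
	}
	delete(b.parts, id)
	var args map[string]any
	if err := json.Unmarshal([]byte(sb.String()), &args); err != nil {
		return nil, err
	}
	return args, nil
}

func main() {
	buf := newArgBuffer()
	buf.start("call-1")
	buf.delta("call-1", `{"path":`)
	buf.delta("call-1", `"README.md"}`)
	args, err := buf.end("call-1")
	fmt.Println(args, err)
}
```

This sketch is not safe for concurrent use. Add a mutex if handlers may run on more than one goroutine.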
@@ -299,6 +404,120 @@ type SteerConsumedEvent struct {
 // EventType implements Event.
 func (e SteerConsumedEvent) EventType() EventType { return EventSteerConsumed }

+// StepStartEvent fires when a new LLM call begins within a multi-step agent turn.
+type StepStartEvent struct {
+	StepNumber int
+}
+
+// EventType implements Event.
+func (e StepStartEvent) EventType() EventType { return EventStepStart }
+
+// StepFinishEvent fires when a step completes, providing full step context.
+// This is a unified event that carries the same data as the existing
+// ToolCallContentEvent and StepUsageEvent, plus additional step metadata.
+type StepFinishEvent struct {
+	StepNumber   int
+	HasToolCalls bool
+	FinishReason string
+	Usage        LLMUsage
+}
+
+// EventType implements Event.
+func (e StepFinishEvent) EventType() EventType { return EventStepFinish }
+
+// TextStartEvent fires when the LLM begins generating text content.
+// Paired with MessageUpdateEvent (deltas) and TextEndEvent.
+type TextStartEvent struct {
+	ID string
+}
+
+// EventType implements Event.
+func (e TextStartEvent) EventType() EventType { return EventTextStart }
+
+// TextEndEvent fires when the LLM finishes generating text content.
+type TextEndEvent struct {
+	ID string
+}
+
+// EventType implements Event.
+func (e TextEndEvent) EventType() EventType { return EventTextEnd }
+
+// ReasoningStartEvent fires when the LLM begins reasoning/thinking.
+// Paired with ReasoningDeltaEvent (deltas) and ReasoningCompleteEvent.
+type ReasoningStartEvent struct {
+	ID string
+}
+
+// EventType implements Event.
+func (e ReasoningStartEvent) EventType() EventType { return EventReasoningStart }
+
+// WarningsEvent fires when the LLM provider returns warnings about the request.
+type WarningsEvent struct {
+	Warnings []string
+}
+
+// EventType implements Event.
+func (e WarningsEvent) EventType() EventType { return EventWarnings }
+
+// SourceEvent fires when the LLM references a source (e.g. from web search tools).
+type SourceEvent struct {
+	SourceType string
+	ID         string
+	URL        string
+	Title      string
+}
+
+// EventType implements Event.
+func (e SourceEvent) EventType() EventType { return EventSource }
+
+// StreamFinishEvent fires when a per-step LLM stream completes.
+// Provides per-stream usage stats and finish reason.
+type StreamFinishEvent struct {
+	Usage        LLMUsage
+	FinishReason string
+}
+
+// EventType implements Event.
+func (e StreamFinishEvent) EventType() EventType { return EventStreamFinish }
+
+// ErrorEvent fires when an agent-level error occurs during streaming.
+// This is distinct from TurnEndEvent.Error — it fires at the point of failure.
+type ErrorEvent struct {
+	Error error
+}
+
+// EventType implements Event.
+func (e ErrorEvent) EventType() EventType { return EventError }
+
+// RetryEvent fires when the LLM provider request is retried after a transient error.
+type RetryEvent struct {
+	Attempt int
+	Error   error
+}
+
+// EventType implements Event.
+func (e RetryEvent) EventType() EventType { return EventRetry }
+
+// PasswordPromptEvent fires when a sudo command needs a password.
+// The TUI should display a password prompt and send the result back via ResponseCh.
+type PasswordPromptEvent struct {
+	// Prompt is the message to display to the user.
+	Prompt string
+	// ResponseCh carries the user's answer from the TUI back to the requester.
+	// The TUI must send exactly one PasswordPromptResponse: Password set and
+	// Cancelled false on submit, or an empty Password with Cancelled true on cancel.
+	ResponseCh chan<- PasswordPromptResponse
+}
+
+// PasswordPromptResponse carries the password prompt result.
+type PasswordPromptResponse struct {
+	Password  string
+	Cancelled bool
+}
+
+// EventType implements Event.
+func (e PasswordPromptEvent) EventType() EventType { return EventPasswordPrompt }
+
 // ---------------------------------------------------------------------------
 // EventBus
 // ---------------------------------------------------------------------------
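Reviewer note: the "send exactly one value" contract on `ResponseCh` is easiest to see in a small round trip. The sketch below uses lowercase local copies of the two types. The `answer` helper and the buffered channel of size one are assumptions, because the diff does not show how Kit creates the channel.

```go
package main

import "fmt"

// Local copies of the types in the diff, kept unexported for the sketch.
type passwordPromptResponse struct {
	Password  string
	Cancelled bool
}

type passwordPromptEvent struct {
	Prompt     string
	ResponseCh chan<- passwordPromptResponse
}

// answer is a hypothetical UI-side helper that sends exactly one response.
func answer(ev passwordPromptEvent, password string, cancelled bool) {
	if cancelled {
		ev.ResponseCh <- passwordPromptResponse{Cancelled: true}
		return
	}
	ev.ResponseCh <- passwordPromptResponse{Password: password}
}

func main() {
	// Assumption: a buffer of one lets the UI send without blocking.
	ch := make(chan passwordPromptResponse, 1)
	ev := passwordPromptEvent{Prompt: "[sudo] password:", ResponseCh: ch}

	answer(ev, "example-password", false)
	resp := <-ch
	fmt.Println(resp.Cancelled, len(resp.Password))
}
```

The event field is send-only (`chan<-`), so the UI side cannot accidentally read from the channel.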
@@ -362,6 +581,39 @@ func (m *Kit) OnToolCall(handler func(ToolCallEvent)) func() {
 	})
 }

+// OnToolCallStart registers a handler that fires only for ToolCallStartEvent.
+// This fires when the LLM begins generating tool call arguments — before the
+// full argument JSON is available. Returns an unsubscribe function.
+func (m *Kit) OnToolCallStart(handler func(ToolCallStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tcs, ok := e.(ToolCallStartEvent); ok {
+			handler(tcs)
+		}
+	})
+}
+
+// OnToolCallDelta registers a handler that fires only for ToolCallDeltaEvent.
+// Each delta contains a JSON fragment of tool call arguments as they stream in.
+// Returns an unsubscribe function.
+func (m *Kit) OnToolCallDelta(handler func(ToolCallDeltaEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tcd, ok := e.(ToolCallDeltaEvent); ok {
+			handler(tcd)
+		}
+	})
+}
+
+// OnToolCallEnd registers a handler that fires only for ToolCallEndEvent.
+// This fires when tool argument streaming is complete, before the tool call
+// is parsed and execution begins. Returns an unsubscribe function.
+func (m *Kit) OnToolCallEnd(handler func(ToolCallEndEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tce, ok := e.(ToolCallEndEvent); ok {
+			handler(tce)
+		}
+	})
+}
+
 // OnToolResult registers a handler that fires only for ToolResultEvent.
 // Returns an unsubscribe function.
 func (m *Kit) OnToolResult(handler func(ToolResultEvent)) func() {
@@ -384,7 +636,16 @@ func (m *Kit) OnToolOutput(handler func(ToolOutputEvent)) func() {

 // OnStreaming registers a handler that fires only for MessageUpdateEvent
 // (streaming text chunks). Returns an unsubscribe function.
+//
+// Deprecated: Use OnMessageUpdate instead. OnStreaming will be removed in a
+// future release.
 func (m *Kit) OnStreaming(handler func(MessageUpdateEvent)) func() {
+	return m.OnMessageUpdate(handler)
+}
+
+// OnMessageUpdate registers a handler that fires only for MessageUpdateEvent
+// (streaming text chunks). Returns an unsubscribe function.
+func (m *Kit) OnMessageUpdate(handler func(MessageUpdateEvent)) func() {
 	return m.Subscribe(func(e Event) {
 		if mu, ok := e.(MessageUpdateEvent); ok {
 			handler(mu)
@@ -422,6 +683,214 @@ func (m *Kit) OnTurnEnd(handler func(TurnEndEvent)) func() {
 	})
 }

+// ---------------------------------------------------------------------------
+// Typed subscribers for existing event types that lacked one
+// ---------------------------------------------------------------------------
+
+// OnMessageStart registers a handler that fires only for MessageStartEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnMessageStart(handler func(MessageStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if ms, ok := e.(MessageStartEvent); ok {
+			handler(ms)
+		}
+	})
+}
+
+// OnMessageEnd registers a handler that fires only for MessageEndEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnMessageEnd(handler func(MessageEndEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if me, ok := e.(MessageEndEvent); ok {
+			handler(me)
+		}
+	})
+}
+
+// OnReasoningDelta registers a handler that fires only for ReasoningDeltaEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnReasoningDelta(handler func(ReasoningDeltaEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if rd, ok := e.(ReasoningDeltaEvent); ok {
+			handler(rd)
+		}
+	})
+}
+
+// OnReasoningComplete registers a handler that fires only for ReasoningCompleteEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnReasoningComplete(handler func(ReasoningCompleteEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if rc, ok := e.(ReasoningCompleteEvent); ok {
+			handler(rc)
+		}
+	})
+}
+
+// OnToolExecutionStart registers a handler that fires only for ToolExecutionStartEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnToolExecutionStart(handler func(ToolExecutionStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tes, ok := e.(ToolExecutionStartEvent); ok {
+			handler(tes)
+		}
+	})
+}
+
+// OnToolExecutionEnd registers a handler that fires only for ToolExecutionEndEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnToolExecutionEnd(handler func(ToolExecutionEndEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tee, ok := e.(ToolExecutionEndEvent); ok {
+			handler(tee)
+		}
+	})
+}
+
+// OnToolCallContent registers a handler that fires only for ToolCallContentEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnToolCallContent(handler func(ToolCallContentEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if tcc, ok := e.(ToolCallContentEvent); ok {
+			handler(tcc)
+		}
+	})
+}
+
+// OnStepUsage registers a handler that fires only for StepUsageEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnStepUsage(handler func(StepUsageEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if su, ok := e.(StepUsageEvent); ok {
+			handler(su)
+		}
+	})
+}
+
+// OnCompaction registers a handler that fires only for CompactionEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnCompaction(handler func(CompactionEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if ce, ok := e.(CompactionEvent); ok {
+			handler(ce)
+		}
+	})
+}
+
+// OnSteerConsumed registers a handler that fires only for SteerConsumedEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnSteerConsumed(handler func(SteerConsumedEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if sc, ok := e.(SteerConsumedEvent); ok {
+			handler(sc)
+		}
+	})
+}
+
+// ---------------------------------------------------------------------------
+// Typed subscribers for new event types
+// ---------------------------------------------------------------------------
+
+// OnStepStart registers a handler that fires only for StepStartEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnStepStart(handler func(StepStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if ss, ok := e.(StepStartEvent); ok {
+			handler(ss)
+		}
+	})
+}
+
+// OnStepFinish registers a handler that fires only for StepFinishEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnStepFinish(handler func(StepFinishEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if sf, ok := e.(StepFinishEvent); ok {
+			handler(sf)
+		}
+	})
+}
+
+// OnTextStart registers a handler that fires only for TextStartEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnTextStart(handler func(TextStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if ts, ok := e.(TextStartEvent); ok {
+			handler(ts)
+		}
+	})
+}
+
+// OnTextEnd registers a handler that fires only for TextEndEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnTextEnd(handler func(TextEndEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if te, ok := e.(TextEndEvent); ok {
+			handler(te)
+		}
+	})
+}
+
+// OnReasoningStart registers a handler that fires only for ReasoningStartEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnReasoningStart(handler func(ReasoningStartEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if rs, ok := e.(ReasoningStartEvent); ok {
+			handler(rs)
+		}
+	})
+}
+
+// OnWarnings registers a handler that fires only for WarningsEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnWarnings(handler func(WarningsEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if w, ok := e.(WarningsEvent); ok {
+			handler(w)
+		}
+	})
+}
+
+// OnSource registers a handler that fires only for SourceEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnSource(handler func(SourceEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if s, ok := e.(SourceEvent); ok {
+			handler(s)
+		}
+	})
+}
+
+// OnStreamFinish registers a handler that fires only for StreamFinishEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnStreamFinish(handler func(StreamFinishEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if sf, ok := e.(StreamFinishEvent); ok {
+			handler(sf)
+		}
+	})
+}
+
+// OnError registers a handler that fires only for ErrorEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnError(handler func(ErrorEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if ee, ok := e.(ErrorEvent); ok {
+			handler(ee)
+		}
+	})
+}
+
+// OnRetry registers a handler that fires only for RetryEvent.
+// Returns an unsubscribe function.
+func (m *Kit) OnRetry(handler func(RetryEvent)) func() {
+	return m.Subscribe(func(e Event) {
+		if r, ok := e.(RetryEvent); ok {
+			handler(r)
+		}
+	})
+}
+
 // ---------------------------------------------------------------------------
 // Subagent event subscriptions
 // ---------------------------------------------------------------------------
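The typed subscribers added in this hunk all share one shape: subscribe to every event, type-assert, and forward on a match. A minimal sketch of that shape written once with generics follows. The `bus` type, `onTyped`, and `countStepStarts` are hypothetical names for illustration only, not part of the SDK, and the bus here is not thread-safe.

```go
package main

import "fmt"

// Event is a stand-in for the SDK's event interface (hypothetical here).
type Event interface{ EventType() string }

type StepStartEvent struct{ StepNumber int }

func (StepStartEvent) EventType() string { return "step_start" }

type TextStartEvent struct{ ID string }

func (TextStartEvent) EventType() string { return "text_start" }

// bus is a minimal, non-thread-safe event bus used only for illustration.
type bus struct {
	nextID    int
	listeners map[int]func(Event)
}

func newBus() *bus { return &bus{listeners: map[int]func(Event){}} }

// Subscribe registers a listener and returns an unsubscribe function.
func (b *bus) Subscribe(fn func(Event)) func() {
	id := b.nextID
	b.nextID++
	b.listeners[id] = fn
	return func() { delete(b.listeners, id) }
}

// Emit delivers e to every registered listener.
func (b *bus) Emit(e Event) {
	for _, fn := range b.listeners {
		fn(e)
	}
}

// onTyped forwards only events whose concrete type is T, mirroring the
// type-assert-and-forward body of each On* method in the hunk.
func onTyped[T Event](b *bus, handler func(T)) func() {
	return b.Subscribe(func(e Event) {
		if ev, ok := e.(T); ok {
			handler(ev)
		}
	})
}

// countStepStarts emits a mix of events and reports how many reached
// the typed handler.
func countStepStarts() int {
	b := newBus()
	n := 0
	unsub := onTyped(b, func(StepStartEvent) { n++ })
	b.Emit(StepStartEvent{StepNumber: 0}) // delivered
	b.Emit(TextStartEvent{ID: "text-1"})  // filtered out by the type assertion
	unsub()
	b.Emit(StepStartEvent{StepNumber: 1}) // not delivered: already unsubscribed
	return n
}

func main() {
	fmt.Println(countStepStarts()) // prints 1
}
```

Go methods cannot declare their own type parameters, so a generic helper like this would have to be a package-level function rather than a method on `Kit`. That may be one reason to keep the explicit per-type methods.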
@@ -1,6 +1,7 @@
 package kit

 import (
+	"fmt"
 	"sync"
 	"sync/atomic"
 	"testing"
@@ -190,6 +191,74 @@ func TestEventTypes(t *testing.T) {
 	}
 }

+// TestNewEventTypes verifies that each new event struct returns the correct EventType.
+func TestNewEventTypes(t *testing.T) {
+	tests := []struct {
+		event    Event
+		expected EventType
+	}{
+		{StepStartEvent{StepNumber: 0}, EventStepStart},
+		{StepFinishEvent{StepNumber: 1, HasToolCalls: true}, EventStepFinish},
+		{TextStartEvent{ID: "text-1"}, EventTextStart},
+		{TextEndEvent{ID: "text-1"}, EventTextEnd},
+		{ReasoningStartEvent{ID: "reason-1"}, EventReasoningStart},
+		{WarningsEvent{Warnings: []string{"test"}}, EventWarnings},
+		{SourceEvent{URL: "https://example.com", Title: "Example"}, EventSource},
+		{StreamFinishEvent{FinishReason: "stop"}, EventStreamFinish},
+		{ErrorEvent{Error: fmt.Errorf("test error")}, EventError},
+		{RetryEvent{Attempt: 1, Error: fmt.Errorf("retry error")}, EventRetry},
+		{ToolCallStartEvent{}, EventToolCallStart},
+		{ToolCallDeltaEvent{}, EventToolCallDelta},
+		{ToolCallEndEvent{}, EventToolCallEnd},
+		{PasswordPromptEvent{}, EventPasswordPrompt},
+	}
+
+	for _, tt := range tests {
+		if got := tt.event.EventType(); got != tt.expected {
+			t.Errorf("%T.EventType() = %q, want %q", tt.event, got, tt.expected)
+		}
+	}
+}
+
+// TestNewEventEmission verifies that new event types are properly emitted and received.
+func TestNewEventEmission(t *testing.T) {
+	bus := newEventBus()
+	var received []Event
+
+	bus.subscribe(func(e Event) {
+		received = append(received, e)
+	})
+
+	bus.emit(StepStartEvent{StepNumber: 0})
+	bus.emit(TextStartEvent{ID: "text-1"})
+	bus.emit(TextEndEvent{ID: "text-1"})
+	bus.emit(ReasoningStartEvent{ID: "reason-1"})
+	bus.emit(WarningsEvent{Warnings: []string{"low confidence"}})
+	bus.emit(SourceEvent{URL: "https://example.com", Title: "Example"})
+	bus.emit(StreamFinishEvent{FinishReason: "stop"})
+	bus.emit(StepFinishEvent{StepNumber: 0, HasToolCalls: false, FinishReason: "stop"})
+	bus.emit(ErrorEvent{Error: fmt.Errorf("test error")})
+	bus.emit(RetryEvent{Attempt: 1, Error: fmt.Errorf("retry")})
+
+	if len(received) != 10 {
+		t.Fatalf("expected 10 events, got %d", len(received))
+	}
+
+	// Verify specific event fields
+	if ss, ok := received[0].(StepStartEvent); !ok || ss.StepNumber != 0 {
+		t.Errorf("event 0: expected StepStartEvent{StepNumber:0}, got %T %+v", received[0], received[0])
+	}
+	if ts, ok := received[1].(TextStartEvent); !ok || ts.ID != "text-1" {
+		t.Errorf("event 1: expected TextStartEvent{ID:text-1}, got %T %+v", received[1], received[1])
+	}
+	if w, ok := received[4].(WarningsEvent); !ok || len(w.Warnings) != 1 || w.Warnings[0] != "low confidence" {
+		t.Errorf("event 4: expected WarningsEvent with 1 warning, got %T %+v", received[4], received[4])
+	}
+	if sf, ok := received[7].(StepFinishEvent); !ok || sf.StepNumber != 0 || sf.HasToolCalls {
+		t.Errorf("event 7: expected StepFinishEvent{StepNumber:0, HasToolCalls:false}, got %T %+v", received[7], received[7])
+	}
+}
+
 // TestEventBusListenerCanUnsubscribeInCallback verifies that a listener can
 // safely call its own unsubscribe function from within the callback.
 func TestEventBusListenerCanUnsubscribeInCallback(t *testing.T) {
@@ -100,6 +100,38 @@ func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
 		})
 	}

+	// Tool call input streaming events — fire as the LLM generates tool arguments.
+	if runner.HasHandlers(extensions.ToolCallInputStart) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(ToolCallStartEvent); ok {
+				_, _ = runner.Emit(extensions.ToolCallInputStartEvent{
+					ToolCallID: ev.ToolCallID,
+					ToolName:   ev.ToolName,
+					ToolKind:   ev.ToolKind,
+				})
+			}
+		})
+	}
+	if runner.HasHandlers(extensions.ToolCallInputDelta) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(ToolCallDeltaEvent); ok {
+				_, _ = runner.Emit(extensions.ToolCallInputDeltaEvent{
+					ToolCallID: ev.ToolCallID,
+					Delta:      ev.Delta,
+				})
+			}
+		})
+	}
+	if runner.HasHandlers(extensions.ToolCallInputEnd) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(ToolCallEndEvent); ok {
+				_, _ = runner.Emit(extensions.ToolCallInputEndEvent{
+					ToolCallID: ev.ToolCallID,
+				})
+			}
+		})
+	}
+
 	if runner.HasHandlers(extensions.AgentEnd) {
 		m.Subscribe(func(e Event) {
 			if ev, ok := e.(TurnEndEvent); ok {
@@ -324,4 +356,134 @@ func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
 			return nil
 		})
 	}
+
+	// --- Step lifecycle observation events ---
+
+	if runner.HasHandlers(extensions.StepStart) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(StepStartEvent); ok {
+				_, _ = runner.Emit(extensions.StepStartEvent{StepNumber: ev.StepNumber})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.StepFinish) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(StepFinishEvent); ok {
+				_, _ = runner.Emit(extensions.StepFinishEvent{
+					StepNumber:       ev.StepNumber,
+					HasToolCalls:     ev.HasToolCalls,
+					FinishReason:     ev.FinishReason,
+					InputTokens:      ev.Usage.InputTokens,
+					OutputTokens:     ev.Usage.OutputTokens,
+					CacheReadTokens:  ev.Usage.CacheReadTokens,
+					CacheWriteTokens: ev.Usage.CacheCreationTokens,
+				})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.ReasoningStart) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(ReasoningStartEvent); ok {
+				_, _ = runner.Emit(extensions.ReasoningStartEvent{ID: ev.ID})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.Warnings) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(WarningsEvent); ok {
+				_, _ = runner.Emit(extensions.WarningsEvent{Warnings: ev.Warnings})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.Source) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(SourceEvent); ok {
+				_, _ = runner.Emit(extensions.SourceEvent{
+					SourceType: ev.SourceType,
+					ID:         ev.ID,
+					URL:        ev.URL,
+					Title:      ev.Title,
+				})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.Error) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(ErrorEvent); ok {
+				_, _ = runner.Emit(extensions.ErrorEvent{Error: ev.Error.Error()})
+			}
+		})
+	}
+
+	if runner.HasHandlers(extensions.Retry) {
+		m.Subscribe(func(e Event) {
+			if ev, ok := e.(RetryEvent); ok {
+				_, _ = runner.Emit(extensions.RetryEvent{
+					Attempt: ev.Attempt,
+					Error:   ev.Error.Error(),
+				})
+			}
+		})
+	}
+
+	// --- PrepareStep hook ---
+	// Extension PrepareStep → SDK PrepareStep hook.
+	// Same pattern as ContextPrepare: convert LLMMessage ↔ ContextMessage.
+	if runner.HasHandlers(extensions.PrepareStep) {
+		m.OnPrepareStep(HookPriorityNormal, func(h PrepareStepHook) *PrepareStepResult {
+			// Convert LLM message slice to extension ContextMessage slice.
+			extMsgs := make([]extensions.ContextMessage, len(h.Messages))
+			for i, msg := range h.Messages {
+				var sb strings.Builder
+				for _, part := range msg.Content {
+					if tp, ok := part.(LLMTextPart); ok {
+						sb.WriteString(tp.Text)
+					}
+				}
+				extMsgs[i] = extensions.ContextMessage{
+					Index:   i,
+					Role:    string(msg.Role),
+					Content: sb.String(),
+				}
+			}
+
+			result, _ := runner.Emit(extensions.PrepareStepEvent{
+				StepNumber: h.StepNumber,
+				Messages:   extMsgs,
+			})
+			r, ok := result.(extensions.PrepareStepResult)
+			if !ok || r.Messages == nil {
+				return nil
+			}
+
+			// Rebuild the LLM message slice: an in-range Index reuses the original message as-is; anything else becomes a new text message.
+			rebuilt := make([]LLMMessage, 0, len(r.Messages))
+			for _, cm := range r.Messages {
+				if cm.Index >= 0 && cm.Index < len(h.Messages) {
+					rebuilt = append(rebuilt, h.Messages[cm.Index])
+				} else {
+					role := LLMRoleUser
+					switch cm.Role {
+					case "assistant":
+						role = LLMRoleAssistant
+					case "system":
+						role = LLMRoleSystem
+					case "tool":
+						role = LLMRoleTool
+					}
+					rebuilt = append(rebuilt, LLMMessage{
+						Role:    role,
+						Content: []LLMMessagePart{LLMTextPart{Text: cm.Content}},
+					})
+				}
+			}
+
+			return &PrepareStepResult{Messages: rebuilt}
+		})
+	}
 }
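The rebuild step in the PrepareStep bridge above can be sketched in isolation. The types `message` and `contextMessage` and the function `rebuild` are simplified, hypothetical stand-ins for `LLMMessage`, `extensions.ContextMessage`, and the inline loop. Using `-1` as the index for a new message is an assumption of this sketch.

```go
package main

import "fmt"

// message is a simplified stand-in for LLMMessage (hypothetical).
type message struct{ Role, Content string }

// contextMessage is a simplified stand-in for extensions.ContextMessage.
type contextMessage struct {
	Index   int
	Role    string
	Content string
}

// rebuild follows the same rule as the hunk: an in-range Index reuses the
// original message untouched; any other Index creates a new text message,
// with unknown roles falling back to "user".
func rebuild(orig []message, edited []contextMessage) []message {
	out := make([]message, 0, len(edited))
	for _, cm := range edited {
		if cm.Index >= 0 && cm.Index < len(orig) {
			out = append(out, orig[cm.Index])
			continue
		}
		role := "user"
		switch cm.Role {
		case "assistant", "system", "tool":
			role = cm.Role
		}
		out = append(out, message{Role: role, Content: cm.Content})
	}
	return out
}

func main() {
	orig := []message{{"user", "hi"}, {"assistant", "hello"}}
	edited := []contextMessage{
		{Index: -1, Role: "system", Content: "note"},      // new message
		{Index: 1, Role: "assistant", Content: "changed"}, // reuses orig[1]
	}
	fmt.Printf("%+v\n", rebuild(orig, edited))
}
```

One consequence worth keeping in mind: because an in-range `Index` selects the original message, changes an extension makes to that entry's `Content` are dropped. To change text, the extension would have to return an entry with an out-of-range index.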
@@ -5,8 +5,6 @@ import (
 	"fmt"
 	"sort"
 	"sync"
-
-	"charm.land/fantasy"
 )

 // ---------------------------------------------------------------------------
@@ -121,6 +119,32 @@ type BeforeCompactResult struct {
 	Summary string
 }

+// PrepareStepHook is the input for hooks that fire between steps within a
+// multi-step agent turn, with full message replacement capability. This is
+// the most powerful interception point — it fires after the existing steering
+// logic (if any) and before the messages are sent to the LLM.
+//
+// Use cases:
+//   - Transforming tool results (e.g. converting image tool results to FilePart
+//     user messages for vision models that don't support media in tool results)
+//   - Dynamic tool filtering per step
+//   - Mid-turn context injection beyond simple steering
+//   - Custom stop conditions that inspect message history
+type PrepareStepHook struct {
+	// StepNumber is the zero-based step index within the current turn.
+	StepNumber int
+	// Messages is the current context window that will be sent to the LLM.
+	// This includes any steering messages already injected in this step.
+	Messages []LLMMessage
+}
+
+// PrepareStepResult can replace the context window between steps.
+type PrepareStepResult struct {
+	// Messages replaces the entire context window for this step. If nil,
+	// the original messages (including any steering) are used unchanged.
+	Messages []LLMMessage
+}
+
 // ---------------------------------------------------------------------------
 // Generic hook registry with priority ordering
 // ---------------------------------------------------------------------------
@@ -248,6 +272,19 @@ func (m *Kit) OnBeforeCompact(p HookPriority, h func(BeforeCompactHook) *BeforeC
 	return m.beforeCompact.register(p, h)
 }

+// OnPrepareStep registers a hook that fires between steps within a multi-step
+// agent turn, after steering messages are injected and before the messages are
+// sent to the LLM. Return a non-nil PrepareStepResult with Messages to replace
+// the entire context window for this step. Hooks execute in priority order;
+// the first non-nil result wins. Returns an unregister function.
+//
+// This is the most powerful interception point in the agent lifecycle. It
+// enables patterns like transforming tool results, dynamic tool filtering,
+// and mid-turn context injection.
+func (m *Kit) OnPrepareStep(p HookPriority, h func(PrepareStepHook) *PrepareStepResult) func() {
+	return m.prepareStep.register(p, h)
+}
+
 // ---------------------------------------------------------------------------
 // Tool wrapping via hooks
 // ---------------------------------------------------------------------------
@@ -256,16 +293,16 @@ func (m *Kit) OnBeforeCompact(p HookPriority, h func(BeforeCompactHook) *BeforeC
 // AfterToolResult hooks around each execution. The registries are referenced
 // by pointer so hooks added after agent creation are still invoked.
 type hookedTool struct {
-	inner           fantasy.AgentTool
+	inner           Tool
 	beforeToolCall  *hookRegistry[BeforeToolCallHook, BeforeToolCallResult]
 	afterToolResult *hookRegistry[AfterToolResultHook, AfterToolResultResult]
 }

-func (h *hookedTool) Info() fantasy.ToolInfo                       { return h.inner.Info() }
-func (h *hookedTool) ProviderOptions() fantasy.ProviderOptions     { return h.inner.ProviderOptions() }
-func (h *hookedTool) SetProviderOptions(o fantasy.ProviderOptions) { h.inner.SetProviderOptions(o) }
+func (h *hookedTool) Info() LLMToolInfo                       { return h.inner.Info() }
+func (h *hookedTool) ProviderOptions() LLMProviderOptions     { return h.inner.ProviderOptions() }
+func (h *hookedTool) SetProviderOptions(o LLMProviderOptions) { h.inner.SetProviderOptions(o) }

-func (h *hookedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+func (h *hookedTool) Run(ctx context.Context, call LLMToolCall) (LLMToolResponse, error) {
 	toolName := h.inner.Info().Name

 	// 1. BeforeToolCall — can block execution.
@@ -279,7 +316,7 @@ func (h *hookedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.To
 			if reason == "" {
 				reason = "blocked by hook"
 			}
-			return fantasy.NewTextErrorResponse(fmt.Sprintf("Error: %s", reason)),
+			return newLLMTextErrorResponse(fmt.Sprintf("Error: %s", reason)),
 				fmt.Errorf("tool blocked by hook: %s", reason)
 		}
 	}
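The commit message at the top of this page describes returning a nil Go error for blocked tools so that the block reason reaches the model, whereas the hunk above still pairs the error response with a non-nil error. A minimal sketch of the difference follows. `toolResponse`, `runBlockedWithErr`, `runBlockedNilErr`, and `deliver` are hypothetical names, and `deliver` only models the "abort on non-nil error" behaviour the commit message attributes to the agent loop.

```go
package main

import "fmt"

// toolResponse is a simplified stand-in for the tool response type.
type toolResponse struct {
	Content string
	IsError bool
}

// runBlockedWithErr returns both an error response and a Go error,
// the shape shown in the hunk above.
func runBlockedWithErr(reason string) (toolResponse, error) {
	return toolResponse{Content: "Error: " + reason, IsError: true},
		fmt.Errorf("tool blocked by hook: %s", reason)
}

// runBlockedNilErr conveys the failure only through IsError, the shape
// the commit message describes.
func runBlockedNilErr(reason string) (toolResponse, error) {
	return toolResponse{Content: "Error: " + reason, IsError: true}, nil
}

// deliver models an agent loop that treats a non-nil error as fatal
// (an assumption based on the commit message) and otherwise forwards
// the response content to the model.
func deliver(run func(string) (toolResponse, error)) (string, error) {
	resp, err := run("writes outside the workspace are not allowed")
	if err != nil {
		return "", err // the model never sees resp.Content
	}
	return resp.Content, nil
}

func main() {
	seen, err := deliver(runBlockedWithErr)
	fmt.Printf("with error: seen=%q err=%v\n", seen, err)

	seen, err = deliver(runBlockedNilErr)
	fmt.Printf("nil error:  seen=%q err=%v\n", seen, err)
}
```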
@@ -314,9 +351,9 @@ func (h *hookedTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.To
 func hookToolWrapper(
 	beforeToolCall *hookRegistry[BeforeToolCallHook, BeforeToolCallResult],
 	afterToolResult *hookRegistry[AfterToolResultHook, AfterToolResultResult],
-) func([]fantasy.AgentTool) []fantasy.AgentTool {
-	return func(tools []fantasy.AgentTool) []fantasy.AgentTool {
-		wrapped := make([]fantasy.AgentTool, len(tools))
+) func([]Tool) []Tool {
+	return func(tools []Tool) []Tool {
+		wrapped := make([]Tool, len(tools))
 		for i, tool := range tools {
 			wrapped[i] = &hookedTool{
 				inner:           tool,
@@ -5,8 +5,6 @@ import (
 	"fmt"
 	"sync"
 	"testing"
-
-	"charm.land/fantasy"
 )

 // ---------------------------------------------------------------------------
@@ -177,20 +175,20 @@ func TestHookRegistry_ConcurrentAccess(t *testing.T) {
 // mockAgentTool implements the AgentTool interface for testing.
 type mockAgentTool struct {
 	name  string
-	runFn func(ctx context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error)
-	popts fantasy.ProviderOptions
+	runFn func(ctx context.Context, call LLMToolCall) (LLMToolResponse, error)
+	popts LLMProviderOptions
 }

-func (m *mockAgentTool) Info() fantasy.ToolInfo {
-	return fantasy.ToolInfo{Name: m.name, Description: "mock tool"}
+func (m *mockAgentTool) Info() LLMToolInfo {
+	return LLMToolInfo{Name: m.name, Description: "mock tool"}
 }
-func (m *mockAgentTool) ProviderOptions() fantasy.ProviderOptions     { return m.popts }
-func (m *mockAgentTool) SetProviderOptions(o fantasy.ProviderOptions) { m.popts = o }
-func (m *mockAgentTool) Run(ctx context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+func (m *mockAgentTool) ProviderOptions() LLMProviderOptions     { return m.popts }
+func (m *mockAgentTool) SetProviderOptions(o LLMProviderOptions) { m.popts = o }
+func (m *mockAgentTool) Run(ctx context.Context, call LLMToolCall) (LLMToolResponse, error) {
 	if m.runFn != nil {
 		return m.runFn(ctx, call)
 	}
-	return fantasy.NewTextResponse("default output"), nil
+	return newLLMTextResponse("default output"), nil
 }

 // newEmptyHookedTool creates a hookedTool with empty hook registries and the given mock tool.
@@ -203,14 +201,14 @@ func newEmptyHookedTool(mock *mockAgentTool) *hookedTool {
 func TestHookedTool_Passthrough(t *testing.T) {
 	mock := &mockAgentTool{
 		name: "test_tool",
-		runFn: func(_ context.Context, _ fantasy.ToolCall) (fantasy.ToolResponse, error) {
-			return fantasy.NewTextResponse("hello world"), nil
+		runFn: func(_ context.Context, _ LLMToolCall) (LLMToolResponse, error) {
+			return newLLMTextResponse("hello world"), nil
 		},
 	}

 	ht := newEmptyHookedTool(mock)

-	resp, err := ht.Run(context.Background(), fantasy.ToolCall{Input: "{}"})
+	resp, err := ht.Run(context.Background(), LLMToolCall{Input: "{}"})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
 	}
@@ -226,9 +224,9 @@ func TestHookedTool_BeforeToolCallBlock(t *testing.T) {
 	toolRan := false
 	mock := &mockAgentTool{
 		name: "dangerous_tool",
-		runFn: func(_ context.Context, _ fantasy.ToolCall) (fantasy.ToolResponse, error) {
+		runFn: func(_ context.Context, _ LLMToolCall) (LLMToolResponse, error) {
 			toolRan = true
-			return fantasy.NewTextResponse("should not run"), nil
+			return newLLMTextResponse("should not run"), nil
 		},
 	}

@@ -241,7 +239,7 @@ func TestHookedTool_BeforeToolCallBlock(t *testing.T) {

 	ht := &hookedTool{inner: mock, beforeToolCall: before, afterToolResult: after}

-	resp, err := ht.Run(context.Background(), fantasy.ToolCall{Input: "{}"})
+	resp, err := ht.Run(context.Background(), LLMToolCall{Input: "{}"})
 	if err == nil {
 		t.Fatal("expected error from blocked tool")
 	}
@@ -263,7 +261,7 @@ func TestHookedTool_BeforeToolCallBlockDefaultReason(t *testing.T) {
 	})

 	ht := &hookedTool{inner: mock, beforeToolCall: before, afterToolResult: after}
-	resp, _ := ht.Run(context.Background(), fantasy.ToolCall{})
+	resp, _ := ht.Run(context.Background(), LLMToolCall{})
 	if resp.Content != "Error: blocked by hook" {
 		t.Errorf("expected default block reason, got %q", resp.Content)
 	}
@@ -275,8 +273,8 @@ func TestHookedTool_AfterToolResultModify(t *testing.T) {

 	mock := &mockAgentTool{
 		name: "tool",
-		runFn: func(_ context.Context, _ fantasy.ToolCall) (fantasy.ToolResponse, error) {
-			return fantasy.NewTextResponse("secret data"), nil
+		runFn: func(_ context.Context, _ LLMToolCall) (LLMToolResponse, error) {
+			return newLLMTextResponse("secret data"), nil
 		},
 	}

@@ -286,7 +284,7 @@ func TestHookedTool_AfterToolResultModify(t *testing.T) {
 	})

 	ht := &hookedTool{inner: mock, beforeToolCall: before, afterToolResult: after}
-	resp, err := ht.Run(context.Background(), fantasy.ToolCall{Input: "{}"})
+	resp, err := ht.Run(context.Background(), LLMToolCall{Input: "{}"})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
 	}
@@ -301,8 +299,8 @@ func TestHookedTool_AfterToolResultModifyIsError(t *testing.T) {

 	mock := &mockAgentTool{
 		name: "tool",
-		runFn: func(_ context.Context, _ fantasy.ToolCall) (fantasy.ToolResponse, error) {
-			return fantasy.NewTextResponse("ok"), nil
+		runFn: func(_ context.Context, _ LLMToolCall) (LLMToolResponse, error) {
+			return newLLMTextResponse("ok"), nil
 		},
 	}

@@ -312,7 +310,7 @@ func TestHookedTool_AfterToolResultModifyIsError(t *testing.T) {
 	})

 	ht := &hookedTool{inner: mock, beforeToolCall: before, afterToolResult: after}
-	resp, err := ht.Run(context.Background(), fantasy.ToolCall{})
+	resp, err := ht.Run(context.Background(), LLMToolCall{})
 	if err != nil {
 		t.Fatalf("unexpected error: %v", err)
 	}
@@ -327,8 +325,8 @@ func TestHookedTool_HookReceivesToolInfo(t *testing.T) {

 	mock := &mockAgentTool{
 		name: "my_tool",
-		runFn: func(_ context.Context, _ fantasy.ToolCall) (fantasy.ToolResponse, error) {
-			return fantasy.NewTextResponse("result"), nil
+		runFn: func(_ context.Context, _ LLMToolCall) (LLMToolResponse, error) {
+			return newLLMTextResponse("result"), nil
 		},
 	}

@@ -345,7 +343,7 @@ func TestHookedTool_HookReceivesToolInfo(t *testing.T) {
 	})

 	ht := &hookedTool{inner: mock, beforeToolCall: before, afterToolResult: after}
-	_, _ = ht.Run(context.Background(), fantasy.ToolCall{Input: `{"key":"value"}`})
+	_, _ = ht.Run(context.Background(), LLMToolCall{Input: `{"key":"value"}`})

 	if capturedBefore.ToolName != "my_tool" {
 		t.Errorf("BeforeToolCall: expected tool name 'my_tool', got %q", capturedBefore.ToolName)
@@ -380,7 +378,7 @@ func TestHookToolWrapper(t *testing.T) {

 	wrapper := hookToolWrapper(before, after)

-	tools := []fantasy.AgentTool{
+	tools := []Tool{
 		&mockAgentTool{name: "tool_a"},
 		&mockAgentTool{name: "tool_b"},
 	}
@@ -407,7 +405,7 @@ func TestHookToolWrapper(t *testing.T) {
 		return &BeforeToolCallResult{Block: true, Reason: "late hook"}
 	})

-	_, err := wrapped[0].Run(context.Background(), fantasy.ToolCall{})
+	_, err := wrapped[0].Run(context.Background(), LLMToolCall{})
 	if err == nil {
 		t.Error("expected error from late-registered blocking hook")
 	}
@@ -538,3 +536,75 @@ func TestKit_HookMethodsExist(t *testing.T) {
 	u3()
 	u4()
 }
+
+// TestPrepareStepHookRegistry verifies registration and execution of PrepareStep hooks.
+func TestPrepareStepHookRegistry(t *testing.T) {
+	hr := newHookRegistry[PrepareStepHook, PrepareStepResult]()
+
+	// Register a hook that prepends a system message on step 0.
+	hr.register(HookPriorityNormal, func(h PrepareStepHook) *PrepareStepResult {
+		if h.StepNumber == 0 {
+			// On step 0, prepend a system message.
+			newMsgs := make([]LLMMessage, 0, len(h.Messages)+1)
+			newMsgs = append(newMsgs, NewLLMSystemMessage("injected"))
+			newMsgs = append(newMsgs, h.Messages...)
+			return &PrepareStepResult{Messages: newMsgs}
+		}
+		return nil // No modification for other steps.
+	})
+
+	// Test step 0 — should modify messages.
+	input := PrepareStepHook{
+		StepNumber: 0,
+		Messages:   []LLMMessage{NewLLMUserMessage("hello")},
+	}
+	result := hr.run(input)
+	if result == nil {
+		t.Fatal("expected non-nil result for step 0")
+	}
+	if len(result.Messages) != 2 {
+		t.Fatalf("expected 2 messages, got %d", len(result.Messages))
+	}
+	if result.Messages[0].Role != LLMRoleSystem {
+		t.Errorf("expected system message first, got role %q", result.Messages[0].Role)
+	}
+
+	// Test step 1 — should return nil (no modification).
+	input.StepNumber = 1
+	result = hr.run(input)
+	if result != nil {
+		t.Errorf("expected nil result for step 1, got %+v", result)
+	}
+}
+
+// TestPrepareStepHookPriority verifies that PrepareStep hooks respect priority ordering.
+func TestPrepareStepHookPriority(t *testing.T) {
+	hr := newHookRegistry[PrepareStepHook, PrepareStepResult]()
+
+	var order []string
+
+	// Low priority: ordered after the high-priority hook, so it is skipped once that hook returns non-nil.
+	hr.register(HookPriorityLow, func(_ PrepareStepHook) *PrepareStepResult {
+		order = append(order, "low")
+		return nil
+	})
+
+	// High priority — should run first and win.
+	hr.register(HookPriorityHigh, func(h PrepareStepHook) *PrepareStepResult {
+		order = append(order, "high")
+		return &PrepareStepResult{Messages: h.Messages}
+	})
+
+	input := PrepareStepHook{
+		StepNumber: 0,
+		Messages:   []LLMMessage{NewLLMUserMessage("test")},
+	}
+	result := hr.run(input)
+
+	if result == nil {
+		t.Fatal("expected non-nil result")
+	}
+	if len(order) != 1 || order[0] != "high" {
+		t.Errorf("expected [high] (first non-nil wins), got %v", order)
+	}
+}
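The two tests above rely on "priority order, first non-nil result wins". A minimal sketch of a registry with that behaviour follows. The `registry` type, the numeric priority values, and the rule that lower values run first are assumptions of this sketch, and it omits the locking and unregister function that the real `hookRegistry` provides.

```go
package main

import (
	"fmt"
	"sort"
)

// priority orders hooks; in this sketch lower values run first (assumption).
type priority int

const (
	priorityHigh   priority = 0
	priorityNormal priority = 50
	priorityLow    priority = 100
)

type entry[In, Out any] struct {
	prio priority
	fn   func(In) *Out
}

// registry is a simplified, single-goroutine stand-in for hookRegistry.
type registry[In, Out any] struct{ entries []entry[In, Out] }

// register adds a hook and keeps entries sorted by priority; the stable
// sort preserves registration order among equal priorities.
func (r *registry[In, Out]) register(p priority, fn func(In) *Out) {
	r.entries = append(r.entries, entry[In, Out]{p, fn})
	sort.SliceStable(r.entries, func(i, j int) bool {
		return r.entries[i].prio < r.entries[j].prio
	})
}

// run calls hooks in priority order and stops at the first non-nil result.
func (r *registry[In, Out]) run(in In) *Out {
	for _, e := range r.entries {
		if out := e.fn(in); out != nil {
			return out
		}
	}
	return nil
}

// runOrder registers a low- and a high-priority hook and records which
// ones actually execute.
func runOrder() []string {
	var r registry[int, string]
	var order []string
	r.register(priorityLow, func(int) *string {
		order = append(order, "low")
		return nil
	})
	r.register(priorityHigh, func(int) *string {
		order = append(order, "high")
		s := "done"
		return &s
	})
	r.run(0)
	return order
}

func main() {
	fmt.Println(runOrder()) // prints [high]
}
```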
@@ -4,6 +4,7 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"io"
 	"log"
 	"os"
 	"path/filepath"
@@ -50,6 +51,7 @@ type Kit struct {
 	bufferedLogger *tools.BufferedDebugLogger
 	authHandler    MCPAuthHandler // OAuth handler for remote MCP servers (may need Close)
 	opts           *Options       // stored for reload operations (skills, etc.)
+	mcpConfig      *config.Config // loaded MCP/server config, shared with subagents

 	// hasCustomSystemPrompt is true when the user explicitly configured a
 	// system prompt (via --system-prompt flag, config file, or SDK option).
@@ -64,6 +66,7 @@ type Kit struct {
 	afterTurn       *hookRegistry[AfterTurnHook, AfterTurnResult]
 	contextPrepare  *hookRegistry[ContextPrepareHook, ContextPrepareResult]
 	beforeCompact   *hookRegistry[BeforeCompactHook, BeforeCompactResult]
+	prepareStep     *hookRegistry[PrepareStepHook, PrepareStepResult]

 	// lastInputTokens stores the API-reported input token count from the
 	// most recent turn. Used by GetContextStats() to return accurate usage
@@ -175,6 +178,41 @@ func (m *Kit) AddMCPServer(ctx context.Context, name string, cfg MCPServerConfig
 	return m.agent.AddMCPServer(ctx, name, cfg)
 }

+// AddInProcessMCPServer connects an in-process mcp-go server and makes its
+// tools available to the agent immediately. Unlike [AddMCPServer] with a
+// command/URL config, this uses mcp-go's in-process transport — no subprocess
+// is spawned and no network I/O occurs.
+//
+// The server must be a *[server.MCPServer] from github.com/mark3labs/mcp-go/server.
+// Kit does not take ownership of the server's lifecycle; the caller is responsible
+// for any cleanup when the server is no longer needed.
+//
+// Returns the number of tools loaded from the server.
+//
+// Example:
+//
+//	import (
+//	    "github.com/mark3labs/mcp-go/mcp"
+//	    "github.com/mark3labs/mcp-go/server"
+//	)
+//
+//	mcpSrv := server.NewMCPServer("my-tools", "1.0.0",
+//	    server.WithToolCapabilities(true),
+//	)
+//	mcpSrv.AddTool(mcp.NewTool("search_docs",
+//	    mcp.WithDescription("Search documentation"),
+//	    mcp.WithString("query", mcp.Required()),
+//	), searchHandler)
+//
+//	n, err := k.AddInProcessMCPServer(ctx, "docs", mcpSrv)
+func (m *Kit) AddInProcessMCPServer(ctx context.Context, name string, srv *MCPServer) (int, error) {
+	cfg := MCPServerConfig{
+		Type:            "inprocess",
+		InProcessServer: srv,
+	}
+	return m.agent.AddMCPServer(ctx, name, cfg)
+}
+
 // RemoveMCPServer disconnects an MCP server and removes all its tools from
 // the agent. After this call the agent will no longer see or be able to call
 // tools from the named server.
@@ -224,6 +262,193 @@ func (m *Kit) GetExtensionToolCount() int {
 	return m.agent.GetExtensionToolCount()
 }

+// --------------------------------------------------------------------------
+// MCP Prompts
+// --------------------------------------------------------------------------
+
+// MCPPrompt describes a prompt exposed by an MCP server.
+type MCPPrompt struct {
+	// Name is the prompt name on the MCP server.
+	Name string
+	// Description is a human-readable description.
+	Description string
+	// Arguments lists the prompt's expected arguments.
+	Arguments []MCPPromptArgument
+	// ServerName is the MCP server that provides this prompt.
+	ServerName string
+}
+
+// MCPPromptArgument describes a single argument for an MCP prompt.
+type MCPPromptArgument struct {
+	// Name is the argument name.
+	Name string
+	// Description is a human-readable description.
+	Description string
+	// Required indicates whether this argument must be provided.
+	Required bool
+}
+
+// MCPPromptMessage is a single message returned by a prompt expansion.
+type MCPPromptMessage struct {
+	// Role is "user" or "assistant".
+	Role string
+	// Content is the text content of the message.
+	Content string
+	// FileParts contains binary attachments extracted from embedded resources,
+	// images, or audio content blocks within the prompt message. Empty for
+	// text-only messages.
+	FileParts []LLMFilePart
+}
+
+// MCPPromptResult is the result of expanding an MCP prompt.
+type MCPPromptResult struct {
+	// Description is an optional description returned by the server.
+	Description string
+	// Messages contains the expanded prompt messages.
+	Messages []MCPPromptMessage
+}
+
+// ListMCPPrompts returns all prompts discovered from connected MCP servers.
+// If MCP servers are still loading in the background, this returns only the
+// prompts discovered so far. Returns nil if no prompts are available.
+func (m *Kit) ListMCPPrompts() []MCPPrompt {
+	internal := m.agent.GetMCPPrompts()
+	if len(internal) == 0 {
+		return nil
+	}
+	result := make([]MCPPrompt, len(internal))
+	for i, p := range internal {
+		args := make([]MCPPromptArgument, len(p.Arguments))
+		for j, a := range p.Arguments {
+			args[j] = MCPPromptArgument{
+				Name:        a.Name,
+				Description: a.Description,
+				Required:    a.Required,
+			}
+		}
+		result[i] = MCPPrompt{
+			Name:        p.Name,
+			Description: p.Description,
+			Arguments:   args,
+			ServerName:  p.ServerName,
+		}
+	}
+	return result
+}
+
+// GetMCPPrompt retrieves and expands a specific prompt from an MCP server.
+// This is a lazy call — the server is contacted each time to get the latest
+// prompt content. Arguments are passed to the server as a name-to-value map
+// for template substitution.
+//
+// Returns an error if the server is not found or the prompt expansion fails.
+func (m *Kit) GetMCPPrompt(ctx context.Context, serverName, promptName string, args map[string]string) (*MCPPromptResult, error) {
+	internal, err := m.agent.GetMCPPrompt(ctx, serverName, promptName, args)
+	if err != nil {
+		return nil, err
+	}
+	msgs := make([]MCPPromptMessage, len(internal.Messages))
+	for i, msg := range internal.Messages {
+		var fileParts []LLMFilePart
+		for _, fp := range msg.FileParts {
+			fileParts = append(fileParts, LLMFilePart{
+				Filename:  fp.Filename,
+				Data:      fp.Data,
+				MediaType: fp.MediaType,
+			})
+		}
+		msgs[i] = MCPPromptMessage{
+			Role:      msg.Role,
+			Content:   msg.Content,
+			FileParts: fileParts,
+		}
+	}
+	return &MCPPromptResult{
+		Description: internal.Description,
+		Messages:    msgs,
+	}, nil
+}
+
+// --------------------------------------------------------------------------
+// MCP Resources
+// --------------------------------------------------------------------------
+
+// MCPResource describes a resource exposed by an MCP server.
+type MCPResource struct {
+	// URI is the unique resource identifier (e.g. "file:///path" or custom scheme).
+	URI string
+	// Name is a human-readable name for the resource.
+	Name string
+	// Description is an optional description of the resource.
+	Description string
+	// MIMEType is the MIME type of the resource, if known.
+	MIMEType string
+	// ServerName is the MCP server that provides this resource.
+	ServerName string
+}
+
+// MCPResourceContent is the result of reading an MCP resource.
+type MCPResourceContent struct {
+	// URI is the resource URI that was read.
+	URI string
+	// MIMEType is the MIME type of the content.
+	MIMEType string
+	// Text is the text content (non-empty for text resources).
+	Text string
+	// BlobData is the decoded binary content (non-empty for blob resources).
+	BlobData []byte
+	// IsBlob is true when the content is binary (BlobData is set).
+	IsBlob bool
+}
+
+// ListMCPResources returns all resources discovered from connected MCP servers.
+// If MCP servers are still loading in the background, this returns only the
+// resources discovered so far. Returns nil if no resources are available.
+func (m *Kit) ListMCPResources() []MCPResource {
+	internal := m.agent.GetMCPResources()
+	if len(internal) == 0 {
+		return nil
+	}
+	result := make([]MCPResource, len(internal))
+	for i, r := range internal {
+		result[i] = MCPResource{
+			URI:         r.URI,
+			Name:        r.Name,
+			Description: r.Description,
+			MIMEType:    r.MIMEType,
+			ServerName:  r.ServerName,
+		}
+	}
+	return result
+}
+
+// ReadMCPResource reads a specific resource from an MCP server by URI.
+// Returns the resource content (text or binary blob).
+func (m *Kit) ReadMCPResource(ctx context.Context, serverName, uri string) (*MCPResourceContent, error) {
+	internal, err := m.agent.ReadMCPResource(ctx, serverName, uri)
+	if err != nil {
+		return nil, err
+	}
+	return &MCPResourceContent{
+		URI:      internal.URI,
+		MIMEType: internal.MIMEType,
+		Text:     internal.Text,
+		BlobData: internal.BlobData,
+		IsBlob:   internal.IsBlob,
+	}, nil
+}
+
+// SubscribeMCPResource subscribes to change notifications for a resource.
+// When the resource changes on the server, the resource list is refreshed.
+func (m *Kit) SubscribeMCPResource(ctx context.Context, serverName, uri string) error {
+	return m.agent.SubscribeMCPResource(ctx, serverName, uri)
+}
+
+// UnsubscribeMCPResource cancels change notifications for a resource.
+func (m *Kit) UnsubscribeMCPResource(ctx context.Context, serverName, uri string) error {
+	return m.agent.UnsubscribeMCPResource(ctx, serverName, uri)
+}
+
 // GetBufferedDebugMessages returns any debug messages that were buffered
 // during initialization, then clears the buffer. Returns nil if no messages
 // were buffered or if buffered logging was not configured.
@@ -319,6 +544,23 @@ func (m *Kit) SetModel(ctx context.Context, modelString string) error {
 	systemPrompt, _ := config.LoadSystemPrompt(viper.GetString("system-prompt"))
 	thinkingLevel := models.ParseThinkingLevel(viper.GetString("thinking-level"))

+	// Validate and adjust thinking level for the target model.
+	// Some models (e.g., OpenAI gpt-5.4) don't support "minimal" and require "none".
+	if thinkingLevel != models.ThinkingOff {
+		parts := strings.SplitN(modelString, "/", 2)
+		if len(parts) == 2 {
+			modelName := parts[1]
+			if !models.IsValidThinkingLevelForModel(thinkingLevel, modelName) {
+				fallback := models.SuggestThinkingLevelFallback(thinkingLevel, modelName)
+				if fallback != models.ThinkingOff {
+					// Adjust the thinking level in viper so the change persists.
+					viper.Set("thinking-level", string(fallback))
+					thinkingLevel = fallback
+				}
+			}
+		}
+	}
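The validation step above can be illustrated with a minimal, standalone sketch. The capability table and the "minimal" to "none" fallback rule below are assumptions made for illustration only; the actual code delegates to models.IsValidThinkingLevelForModel and models.SuggestThinkingLevelFallback, whose rules are not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedLevels is a hypothetical capability table keyed by model name.
// The real project consults the models package instead.
var supportedLevels = map[string][]string{
	"example-model": {"none", "low", "medium", "high"},
}

// adjustThinkingLevel returns level unchanged when it is "off", when the
// model string has no "provider/" prefix, when the model is unknown, or
// when the model accepts the level. Otherwise it applies the assumed
// fallback rule ("minimal" becomes "none").
func adjustThinkingLevel(modelString, level string) string {
	if level == "off" {
		return level
	}
	parts := strings.SplitN(modelString, "/", 2)
	if len(parts) != 2 {
		return level // no provider prefix: leave untouched, as in the diff
	}
	levels, known := supportedLevels[parts[1]]
	if !known {
		return level
	}
	for _, l := range levels {
		if l == level {
			return level
		}
	}
	if level == "minimal" {
		return "none" // assumed fallback, for illustration only
	}
	return level
}

func main() {
	fmt.Println(adjustThinkingLevel("openai/example-model", "minimal")) // none
	fmt.Println(adjustThinkingLevel("openai/example-model", "high"))    // high
	fmt.Println(adjustThinkingLevel("example-model", "minimal"))        // minimal
}
```

As in the diff, a model string without a provider prefix is passed through without validation.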
+
 	// With message-level caching, thinking and caching can work together.
 	// No need to disable caching when thinking is enabled.
 	cfg := &models.ProviderConfig{
@@ -467,7 +709,7 @@ func (m *Kit) ReloadExtensions() error {

 	// Update extension tools on the agent so the LLM sees changes.
 	if m.agent != nil {
-		extTools := extensions.ExtensionToolsAsFantasy(m.extRunner.RegisteredTools(), m.extRunner)
+		extTools := extensions.ExtensionToolsAsLLMTools(m.extRunner.RegisteredTools(), m.extRunner)
 		m.agent.SetExtraTools(extTools)
 	}

@@ -490,7 +732,7 @@ func (m *Kit) ExecuteCompletion(ctx context.Context, req extensions.CompleteRequ
 		llmModel    fantasy.LanguageModel
 		closer      func()
 		usedModel   string
-		providerOps fantasy.ProviderOptions
+		providerOps LLMProviderOptions
 	)

 	if req.Model == "" {
@@ -587,6 +829,29 @@ func (m *Kit) ExecuteCompletion(ctx context.Context, req extensions.CompleteRequ
 // Options configures Kit creation with optional overrides for model,
 // prompts, configuration, and behavior settings. All fields are optional
 // and will use CLI defaults if not specified.
+//
+// Global viper state warning:
+// Options are applied by [New] via [viper.Set] calls against viper's
+// process-global store. This store is shared with every downstream reader
+// (e.g. [Kit.SetModel], [Kit.GetThinkingLevel], BuildProviderConfig, and
+// any other code path that calls viper.Get*). Two consequences:
+//
+//  1. Kit instances are NOT isolated from each other within a single
+//     process. Values set by the second New() call overwrite the first,
+//     and any code that later reads viper will see the most recent Set.
+//  2. Fields left at the zero value do NOT clear prior viper state; they
+//     simply skip the viper.Set. Callers that need a clean slate between
+//     constructions should invoke viper.Reset() (the test suite uses a
+//     private resetViper() helper that wraps it) before the next New().
+//
+// Recommended usage: create one Kit per process, or reset viper between
+// constructions. Concurrent calls to New are serialized internally by
+// [viperInitMu], but that mutex does not prevent later viper reads (from
+// a different Kit) from observing mutated keys.
+//
+// TODO: refactor New to use a per-instance *viper.Viper (constructed via
+// viper.New()) so each Kit owns its own isolated config store and Options
+// no longer leak through the global singleton.
 type Options struct {
 	Model        string // Override model (e.g., "anthropic/claude-sonnet-4-5-20250929")
 	SystemPrompt string // Override system prompt
@@ -597,6 +862,76 @@ type Options struct {
 	Tools        []Tool // Custom tool set. If empty, AllTools() is used.
 	ExtraTools   []Tool // Additional tools added alongside core/MCP/extension tools.

+	// Generation parameters. These override the corresponding values from
+	// .kit.yml / KIT_* environment variables. Leaving a field at its
+	// zero/nil value means "use the configured default", which in turn
+	// falls back to per-model defaults (modelSettings / customModels) and
+	// finally to a last-resort SDK floor of 8192 for MaxTokens (matching
+	// the CLI --max-tokens default; sampling params fall through to
+	// provider-level defaults).
+	//
+	// Pointer types are used for sampling parameters so the SDK can
+	// distinguish "explicitly set to 0" from "leave alone".
+
+	// MaxTokens overrides the maximum output tokens per LLM response.
+	// 0 = let the precedence chain resolve a value (env → config →
+	// per-model → 8192 SDK floor, matching the CLI default). Setting a
+	// non-zero value here suppresses automatic right-sizing, matching
+	// the CLI's --max-tokens flag semantics. Bump this when generating
+	// long outputs (HTML artifacts, large refactors, etc.) to avoid
+	// silent truncation mid-tool-call. The cap also applies after
+	// model switches via [Kit.SetModel].
+	MaxTokens int
+
+	// ThinkingLevel sets the reasoning effort for models that support
+	// extended thinking. Valid values: "off", "none", "minimal", "low",
+	// "medium", "high". "" = let the precedence chain resolve a level
+	// (env → config → per-model → "off"). Use [Kit.SetThinkingLevel]
+	// to change at runtime.
+	ThinkingLevel string
+
+	// Temperature controls sampling randomness (typically 0.0–2.0).
+	// nil = leave provider/per-model default in place. Pointer type
+	// so explicit 0.0 (deterministic) is distinguishable from "unset".
+	Temperature *float32
+
+	// TopP is the nucleus-sampling cutoff (0.0–1.0).
+	// nil = leave provider/per-model default in place.
+	TopP *float32
+
+	// TopK limits sampling to the top K tokens.
+	// nil = leave provider/per-model default in place.
+	TopK *int32
+
+	// FrequencyPenalty discourages repeated tokens (OpenAI-family models).
+	// nil = leave provider/per-model default in place.
+	FrequencyPenalty *float32
+
+	// PresencePenalty discourages repeating topics (OpenAI-family models).
+	// nil = leave provider/per-model default in place.
+	PresencePenalty *float32
+
+	// Provider configuration. These override values normally read from
+	// .kit.yml or provider-specific environment variables. Useful when
+	// loading credentials from a secrets manager, pointing at custom
+	// OpenAI-compatible endpoints (LiteLLM, vLLM, Azure OpenAI, internal
+	// proxies), or running against self-hosted infrastructure.
+
+	// ProviderAPIKey overrides the API key used to authenticate with the
+	// model provider. "" = use the value from config or the
+	// provider-specific environment variable.
+	ProviderAPIKey string
+
+	// ProviderURL overrides the provider endpoint. "" = use the provider's
+	// default URL.
+	ProviderURL string
+
+	// TLSSkipVerify disables TLS certificate verification on provider
+	// HTTP clients. Only set this for self-signed certificates in
+	// development. Once enabled here it cannot be disabled via Options
+	// (use the config file or env var to opt back out).
+	TLSSkipVerify bool
+
 	// SkipConfig, when true, skips loading .kit.yml configuration files.
 	// Viper defaults (setSDKDefaults) and environment variables (KIT_*)
 	// are still applied. Use this for fully programmatic configuration.
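The pointer-typed sampling fields above exist so that an explicit 0.0 can be told apart from "unset". A minimal standalone sketch of that pattern follows; samplingOptions, ptr, and applyTemperature are hypothetical names for illustration, and a plain map stands in for the config store.

```go
package main

import "fmt"

// samplingOptions mirrors the shape of a pointer-typed Options field.
type samplingOptions struct {
	Temperature *float32
}

// ptr returns a pointer to v (hypothetical convenience helper).
func ptr[T any](v T) *T { return &v }

// applyTemperature writes the value into settings only when the caller
// set it, so a nil pointer leaves any existing default untouched.
func applyTemperature(settings map[string]any, opts samplingOptions) {
	if opts.Temperature != nil {
		settings["temperature"] = *opts.Temperature
	}
}

func main() {
	unset := map[string]any{}
	applyTemperature(unset, samplingOptions{})
	_, ok := unset["temperature"]
	fmt.Println(ok) // false: nothing was written

	zero := map[string]any{}
	applyTemperature(zero, samplingOptions{Temperature: ptr(float32(0))})
	v, ok := zero["temperature"]
	fmt.Println(v, ok) // 0 true: explicit zero is preserved
}
```

With a plain float32 field, both cases would carry the value 0 and could not be distinguished.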
@@ -626,6 +961,35 @@ type Options struct {
 	// (e.g. AGENTS.md) from the working directory.
 	NoContextFiles bool

+	// MCPConfig provides a pre-loaded MCP configuration. When set,
+	// LoadAndValidateConfig is skipped during Kit creation — avoiding
+	// viper access entirely. This is set automatically for in-process
+	// subagents (inheriting the parent's loaded config) and can be used
+	// by SDK consumers who build config programmatically.
+	MCPConfig *config.Config
+
+	// InProcessMCPServers registers mcp-go servers that run in the same
+	// process. Each key is the server name (used to prefix tool names, e.g.
+	// "docs__search"). The value must be a *[server.MCPServer].
+	//
+	// In-process servers bypass subprocess spawning and network I/O entirely.
+	// Kit does not take ownership of the servers — the caller is responsible
+	// for any cleanup after [Kit.Close].
+	//
+	// Example:
+	//
+	//	mcpSrv := server.NewMCPServer("my-tools", "1.0.0",
+	//	    server.WithToolCapabilities(true),
+	//	)
+	//	mcpSrv.AddTool(mcp.NewTool("search", ...), handler)
+	//
+	//	host, _ := kit.New(ctx, &kit.Options{
+	//	    InProcessMCPServers: map[string]*kit.MCPServer{
+	//	        "docs": mcpSrv,
+	//	    },
+	//	})
+	InProcessMCPServers map[string]*MCPServer
+
 	// Compaction
 	AutoCompact       bool               // Auto-compact when near context limit
 	CompactionOptions *CompactionOptions // Config for auto-compaction (nil = defaults)
@@ -634,15 +998,23 @@ type Options struct {
 	Debug bool

 	// MCPAuthHandler handles OAuth authorization for remote MCP servers.
-	// When set, remote transports (streamable HTTP, SSE) are configured with
-	// OAuth support. If the server returns a 401, the handler is invoked to
-	// let the user authorize via browser.
+	// When set, remote transports (streamable HTTP, SSE) are configured
+	// with OAuth support. If the server returns a 401, the handler is
+	// invoked to let the user authorize.
 	//
-	// If nil, a [DefaultMCPAuthHandler] is created automatically — opening the
-	// system browser and listening on a local callback server.
+	// If nil, OAuth is disabled: remote MCP servers requiring authorization
+	// will fail to connect and the underlying authorization-required error
+	// is surfaced to the caller. The SDK deliberately does not construct a
+	// default handler — doing so would bind a local TCP port and trigger
+	// presentation I/O (browser open, stderr writes) without the consumer
+	// opting in, which is wrong for library, daemon, or web-app embedders.
 	//
-	// Set to a custom implementation to control the authorization UX (e.g.
-	// display a URL in a custom UI, redirect to a web app, etc.).
+	// CLI consumers: pass [NewCLIMCPAuthHandler] to get the standard
+	// "open browser + print status" behavior.
+	//
+	// Custom UX: implement [MCPAuthHandler] directly, or use
+	// [DefaultMCPAuthHandler] and set its OnAuthURL hook to plug in your
+	// own presentation (TUI modal, QR code, web redirect, etc.).
 	MCPAuthHandler MCPAuthHandler

 	// MCPTokenStoreFactory, if non-nil, is called to create a token store for
@@ -684,6 +1056,11 @@ type CLIOptions struct {
 	SpinnerFunc SpinnerFunc
 	// UseBufferedLogger buffers debug messages for later display.
 	UseBufferedLogger bool
+	// ProgressReaderFunc wraps an io.Reader with a progress display for
+	// long-running operations such as Ollama model pulls. The returned
+	// io.ReadCloser must be closed when done. When nil, progress is not
+	// displayed.
+	ProgressReaderFunc func(io.Reader) io.ReadCloser
 }

 // InitTreeSession creates or opens a tree session based on the given options.
@@ -721,14 +1098,29 @@ func InitTreeSession(opts *Options) (*session.TreeManager, error) {
 	return session.CreateTreeSession(sessionDir)
 }

+// viperInitMu serializes viper writes during [New]. Viper's global state
+// is not thread-safe, so concurrent calls (e.g. parallel subagent spawns)
+// must not overlap the Set/Get window. Note that this mutex only protects
+// the construction window — it does not isolate long-lived Kit instances
+// from each other. See the "Global viper state warning" on [Options].
+var viperInitMu sync.Mutex
+
 // New creates a Kit instance using the same initialization as the CLI.
 // It loads configuration, initializes MCP servers, creates the LLM model, and
 // sets up the agent for interaction. Returns an error if initialization fails.
-// viperInitMu serializes viper writes during kit.New(). Viper's global state
-// is not thread-safe, so concurrent calls (e.g. parallel subagent spawns)
-// must not overlap the Set()/Get() window.
-var viperInitMu sync.Mutex
-
+//
+// Global viper state warning: fields on [Options] are applied by calling
+// [viper.Set] on viper's process-global store. As a result, two Kits
+// constructed in the same process are NOT isolated: the second New
+// overwrites viper keys set by the first, and any downstream reader
+// (e.g. [Kit.SetModel], [Kit.GetThinkingLevel]) will observe the most
+// recent value. Callers that need multiple independent Kits should call
+// viper.Reset() between constructions, or avoid constructing more than
+// one Kit per process. Writes during New are serialized by [viperInitMu].
+//
+// TODO: refactor to use a per-call viper.New() instance so each Kit owns
+// its own isolated config store and Options stop leaking through the
+// global singleton.
 func New(ctx context.Context, opts *Options) (*Kit, error) {
 	if opts == nil {
 		opts = &Options{}
@@ -789,6 +1181,47 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		}
 		viper.Set("stream", opts.Streaming)

+		// Generation parameter overrides. Each Options field, when set,
+		// is pushed into viper here so the existing downstream code
+		// (BuildProviderConfig, SetModel, modelSettings lookups) picks
+		// it up uniformly. Pointer-typed sampling params use viper.Set
+		// only when non-nil so that nil means "leave provider/per-model
+		// default in place" (BuildProviderConfig keys off viper.IsSet).
+		if opts.MaxTokens > 0 {
+			viper.Set("max-tokens", opts.MaxTokens)
+		}
+		if opts.ThinkingLevel != "" {
+			viper.Set("thinking-level", opts.ThinkingLevel)
+		}
+		if opts.Temperature != nil {
+			viper.Set("temperature", *opts.Temperature)
+		}
+		if opts.TopP != nil {
+			viper.Set("top-p", *opts.TopP)
+		}
+		if opts.TopK != nil {
+			viper.Set("top-k", *opts.TopK)
+		}
+		if opts.FrequencyPenalty != nil {
+			viper.Set("frequency-penalty", *opts.FrequencyPenalty)
+		}
+		if opts.PresencePenalty != nil {
+			viper.Set("presence-penalty", *opts.PresencePenalty)
+		}
+
+		// Provider overrides. TLSSkipVerify only takes effect when true —
+		// callers wanting to force-disable should use the config file or
+		// env var instead.
+		if opts.ProviderAPIKey != "" {
+			viper.Set("provider-api-key", opts.ProviderAPIKey)
+		}
+		if opts.ProviderURL != "" {
+			viper.Set("provider-url", opts.ProviderURL)
+		}
+		if opts.TLSSkipVerify {
+			viper.Set("tls-skip-verify", true)
+		}
+
 		// Resolve working directory for context/skill discovery.
 		cwd = opts.SessionDir
 		if cwd == "" {
@@ -874,6 +1307,17 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		if pcErr != nil {
 			return fmt.Errorf("failed to build provider config: %w", pcErr)
 		}
+
+		// SDK last-resort max-tokens floor. When none of Options, env,
+		// config, or a per-model default supplied a value, we land on
+		// zero here (viper.GetInt returns 0 for unset keys). Apply the
+		// SDK default directly on the struct rather than via viper so
+		// viper.IsSet("max-tokens") stays false: downstream right-sizing
+		// can still raise this toward the model's known output ceiling,
+		// and per-model modelSettings[...].maxTokens can still win.
+		if providerConfig.MaxTokens == 0 && opts.MaxTokens == 0 {
+			providerConfig.MaxTokens = sdkDefaultMaxTokens
+		}
 		modelString = viper.GetString("model")
 		debug = viper.GetBool("debug")
 		noExtensions = opts.NoExtensions || viper.GetBool("no-extensions")
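The precedence chain described in the Options docs (Options, then env, then config, then per-model default, then the 8192 SDK floor) can be sketched as a "first non-zero wins" lookup. This is a simplified illustration, not the project's implementation: resolveMaxTokens is a hypothetical function, and the real code resolves env and config through viper and BuildProviderConfig.

```go
package main

import "fmt"

// sdkDefaultMaxTokens uses the 8192 value stated in the Options docs.
const sdkDefaultMaxTokens = 8192

// resolveMaxTokens returns the first positive value in precedence order:
// Options, env, config, per-model default. If none is set it falls back
// to the SDK default.
func resolveMaxTokens(fromOptions, fromEnv, fromConfig, perModel int) int {
	for _, v := range []int{fromOptions, fromEnv, fromConfig, perModel} {
		if v > 0 {
			return v
		}
	}
	return sdkDefaultMaxTokens
}

func main() {
	fmt.Println(resolveMaxTokens(12345, 4096, 0, 16000)) // 12345
	fmt.Println(resolveMaxTokens(0, 0, 0, 16000))        // 16000
	fmt.Println(resolveMaxTokens(0, 0, 0, 0))            // 8192
}
```

Note that the diff applies the final fallback on the provider config struct rather than through viper, so that viper.IsSet("max-tokens") remains false; the sketch does not model that distinction.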
@@ -886,8 +1330,11 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 	}
 	// ---- viperInitMu released — heavy I/O below runs concurrently ----

-	// Load MCP configuration. Use pre-loaded config if provided via CLI options.
-	if opts.CLI != nil && opts.CLI.MCPConfig != nil {
+	// Load MCP configuration. Prefer a config passed directly on Options,
+	// then one from CLI options; load from viper only as a last resort.
+	if opts.MCPConfig != nil {
+		mcpConfig = opts.MCPConfig
+	} else if opts.CLI != nil && opts.CLI.MCPConfig != nil {
 		mcpConfig = opts.CLI.MCPConfig
 	}
 	if mcpConfig == nil {
@@ -898,6 +1345,21 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		}
 	}

+	// Merge in-process MCP servers from Options into the MCP config.
+	// These are programmatically-provided *server.MCPServer instances that
+	// bypass subprocess spawning and network I/O.
+	if len(opts.InProcessMCPServers) > 0 {
+		if mcpConfig.MCPServers == nil {
+			mcpConfig.MCPServers = make(map[string]config.MCPServerConfig, len(opts.InProcessMCPServers))
+		}
+		for name, srv := range opts.InProcessMCPServers {
+			mcpConfig.MCPServers[name] = config.MCPServerConfig{
+				Type:            "inprocess",
+				InProcessServer: srv,
+			}
+		}
+	}
+
 	// Pre-create hook registries so the tool wrapper can reference them.
 	// Hooks registered after New() returns are still invoked because the
 	// wrapper captures the registries by pointer.
@@ -907,6 +1369,7 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 	afterTurn := newHookRegistry[AfterTurnHook, AfterTurnResult]()
 	contextPrepare := newHookRegistry[ContextPrepareHook, ContextPrepareResult]()
 	beforeCompact := newHookRegistry[BeforeCompactHook, BeforeCompactResult]()
+	prepareStep := newHookRegistry[PrepareStepHook, PrepareStepResult]()

 	// Build agent setup options, pulling CLI-specific fields when available.
 	// Pass the pre-built ProviderConfig and scalar viper snapshots so
@@ -926,20 +1389,19 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		OnMCPServerLoaded: opts.OnMCPServerLoaded,
 	}

-	// Set up OAuth handler for remote MCP servers.
+	// Set up OAuth handler for remote MCP servers. The SDK does not create
+	// a default handler: auto-construction would bind a local TCP port and
+	// (historically) shell out to a browser without the consumer asking,
+	// which is a surprise for library/daemon/web-app embedders. Consumers
+	// that want CLI behavior pass a [CLIMCPAuthHandler] explicitly; other
+	// consumers implement [MCPAuthHandler] themselves. If nil, remote MCP
+	// servers requiring OAuth will fail to connect with the underlying
+	// authorization-required error surfaced to the caller.
+	//
 	// The SDK MCPAuthHandler interface is structurally identical to
 	// tools.MCPAuthHandler, so any implementation satisfies both.
 	if opts.MCPAuthHandler != nil {
 		setupOpts.AuthHandler = opts.MCPAuthHandler
-	} else {
-		// Create a default handler that opens the system browser.
-		defaultHandler, authErr := NewDefaultMCPAuthHandler()
-		if authErr != nil {
-			// Non-fatal: OAuth just won't be available for remote servers.
-			log.Printf("WARN Failed to create OAuth handler; remote MCP servers requiring auth will fail: %v", authErr)
-		} else {
-			setupOpts.AuthHandler = defaultHandler
-		}
 	}

 	// Set up custom token store factory for MCP OAuth tokens.
@@ -953,6 +1415,9 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		setupOpts.ShowSpinner = opts.CLI.ShowSpinner
 		setupOpts.SpinnerFunc = opts.CLI.SpinnerFunc
 		setupOpts.UseBufferedLogger = opts.CLI.UseBufferedLogger
+		if opts.CLI.ProgressReaderFunc != nil {
+			providerConfig.ProgressReaderFunc = opts.CLI.ProgressReaderFunc
+		}
 	}

 	// Create agent using shared setup with the hook tool wrapper.
@@ -990,6 +1455,7 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		bufferedLogger:        agentResult.BufferedLogger,
 		authHandler:           setupOpts.AuthHandler,
 		opts:                  opts,
+		mcpConfig:             mcpConfig,
 		hasCustomSystemPrompt: hasCustomSystemPrompt,
 		beforeToolCall:        beforeToolCall,
 		afterToolResult:       afterToolResult,
@@ -997,6 +1463,7 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		afterTurn:             afterTurn,
 		contextPrepare:        contextPrepare,
 		beforeCompact:         beforeCompact,
+		prepareStep:           prepareStep,
 	}

 	// Bridge extension events to SDK hooks.
@@ -1171,8 +1638,9 @@ type TurnResult struct {
 	Response string

 	// StopReason indicates why the turn ended. Derived from the LLM
-	// provider's finish reason: "stop", "length" (max tokens), "tool-calls",
-	// "content-filter", "error", "other", "unknown".
+	// provider's finish reason: FinishReasonStop, FinishReasonLength (max
+	// output tokens reached), FinishReasonToolCalls, FinishReasonContentFilter,
+	// FinishReasonError, FinishReasonOther, FinishReasonUnknown.
 	StopReason string

 	// SessionID is the UUID of the session this turn belongs to.
@@ -1314,13 +1782,22 @@ func (m *Kit) Subagent(ctx context.Context, cfg SubagentConfig) (*SubagentResult
 		tools = SubagentTools()
 	}

-	// Create child Kit instance.
+	// Create child Kit instance. Pass the parent's loaded MCP config to
+	// avoid re-reading viper (which races with concurrent subagent spawns).
+	// Streaming must be explicitly enabled — Options.Streaming defaults to
+	// false, and New() unconditionally writes viper.Set("stream", opts.Streaming).
+	// Without this, the subagent would (a) pollute viper global state for
+	// other concurrent callers and (b) potentially hit provider-level
+	// differences (e.g. Anthropic non-streaming timeouts with extended
+	// thinking).
 	childOpts := &Options{
 		Model:        model,
 		SystemPrompt: systemPrompt,
 		Tools:        tools,
 		NoSession:    cfg.NoSession,
 		Quiet:        true,
+		Streaming:    true,
+		MCPConfig:    m.mcpConfig,
 	}
 	child, err := New(ctx, childOpts)
 	if err != nil {
@@ -1427,21 +1904,21 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 		return sr, err
 	})

-	return m.agent.GenerateWithLoopAndStreaming(ctx, messages,
-		func(toolCallID, toolName, toolArgs string) {
+	return m.agent.GenerateWithCallbacks(ctx, messages, agent.GenerateCallbacks{
+		OnToolCall: func(toolCallID, toolName, toolArgs string) {
 			m.events.emit(ToolCallEvent{
 				ToolCallID: toolCallID, ToolName: toolName, ToolKind: toolKindFor(toolName),
 				ToolArgs: toolArgs, ParsedArgs: parseToolArgs(toolArgs),
 			})
 		},
-		func(toolCallID, toolName, toolArgs string, isStarting bool) {
+		OnToolExecution: func(toolCallID, toolName, toolArgs string, isStarting bool) {
 			if isStarting {
 				m.events.emit(ToolExecutionStartEvent{ToolCallID: toolCallID, ToolName: toolName, ToolKind: toolKindFor(toolName), ToolArgs: toolArgs})
 			} else {
 				m.events.emit(ToolExecutionEndEvent{ToolCallID: toolCallID, ToolName: toolName, ToolKind: toolKindFor(toolName)})
 			}
 		},
-		func(toolCallID, toolName, toolArgs, resultText, metadata string, isError bool) {
+		OnToolResult: func(toolCallID, toolName, toolArgs, resultText, metadata string, isError bool) {
 			evt := ToolResultEvent{
 				ToolCallID: toolCallID, ToolName: toolName, ToolKind: toolKindFor(toolName),
 				ToolArgs: toolArgs, ParsedArgs: parseToolArgs(toolArgs),
@@ -1455,17 +1932,17 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 			}
 			m.events.emit(evt)
 		},
-		func(content string) {
+		OnResponse: func(content string) {
 			m.events.emit(ResponseEvent{Content: content})
 		},
-		func(content string) {
+		OnToolCallContent: func(content string) {
 			m.events.emit(ToolCallContentEvent{Content: content})
 		},
 		// <think> tag filtering: models like Qwen/DeepSeek wrap reasoning inside
 		// <think>...</think> tags in the regular text stream. We intercept those
 		// spans here and re-route them as ReasoningDeltaEvent/ReasoningCompleteEvent
 		// so callers always receive clean, tag-free text and structured reasoning.
-		func() func(chunk string) {
+		OnStreamingResponse: func() func(chunk string) {
 			const (
 				thinkOpen  = "<think>"
 				thinkClose = "</think>"
@@ -1501,14 +1978,13 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 				}
 			}
 		}(),
-		func(delta string) {
+		OnReasoningDelta: func(delta string) {
 			m.events.emit(ReasoningDeltaEvent{Delta: delta})
 		},
-		func() {
+		OnReasoningComplete: func() {
 			m.events.emit(ReasoningCompleteEvent{})
 		},
-		func(toolCallID, toolName, chunk string, isStderr bool) {
-			// Emit tool output chunk event for streaming bash output
+		OnToolOutput: func(toolCallID, toolName, chunk string, isStderr bool) {
 			m.events.emit(ToolOutputEvent{
 				ToolCallID: toolCallID,
 				ToolName:   toolName,
@@ -1517,18 +1993,13 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 			})
 		},
 		// Persist step messages incrementally so that progress survives
-		// crashes and long-running turns don't lose work. Each step's
-		// messages are persisted as a unit: for tool-calling steps this is
-		// the assistant message (with tool_use parts) + tool-role message
-		// (with tool_result parts) as a pair; for the final step it's the
-		// assistant text/reasoning message alone.
-		func(stepMessages []fantasy.Message) {
+		// crashes and long-running turns don't lose work.
+		OnStepMessages: func(stepMessages []fantasy.Message) {
 			for _, msg := range stepMessages {
 				_, _ = m.session.AppendMessage(msg)
 			}
 		},
-		func(inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens int64) {
-			// Emit step usage event for real-time cost tracking
+		OnStepUsage: func(inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens int64) {
 			if viper.GetBool("debug") {
 				log.Printf("DEBUG Kit.generate emitting StepUsageEvent: input=%d output=%d cacheRead=%d cacheCreate=%d",
 					inputTokens, outputTokens, cacheReadTokens, cacheCreationTokens,
@@ -1541,7 +2012,98 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 				CacheWriteTokens: uint64(cacheCreationTokens),
 			})
 		},
-	)
+		// Password prompt handler for sudo commands
+		OnPasswordPrompt: func(prompt string) (string, bool) {
+			responseCh := make(chan PasswordPromptResponse, 1)
+			m.events.emit(PasswordPromptEvent{
+				Prompt:     prompt,
+				ResponseCh: responseCh,
+			})
+			resp := <-responseCh
+			return resp.Password, resp.Cancelled
+		},
+		// Tool call argument streaming
+		OnToolCallStart: func(toolCallID, toolName string) {
+			m.events.emit(ToolCallStartEvent{
+				ToolCallID: toolCallID,
+				ToolName:   toolName,
+				ToolKind:   toolKindFor(toolName),
+			})
+		},
+		OnToolCallDelta: func(toolCallID, delta string) {
+			m.events.emit(ToolCallDeltaEvent{
+				ToolCallID: toolCallID,
+				Delta:      delta,
+			})
+		},
+		OnToolCallEnd: func(toolCallID string) {
+			m.events.emit(ToolCallEndEvent{
+				ToolCallID: toolCallID,
+			})
+		},
+
+		// New callbacks for previously unwired Fantasy lifecycle events.
+		OnStepStart: func(stepNumber int) {
+			m.events.emit(StepStartEvent{StepNumber: stepNumber})
+		},
+		OnStepFinish: func(stepNumber int, hasToolCalls bool, finishReason string, usage fantasy.Usage) {
+			m.events.emit(StepFinishEvent{
+				StepNumber:   stepNumber,
+				HasToolCalls: hasToolCalls,
+				FinishReason: finishReason,
+				Usage:        usage,
+			})
+		},
+		OnTextStart: func(id string) {
+			m.events.emit(TextStartEvent{ID: id})
+		},
+		OnTextEnd: func(id string) {
+			m.events.emit(TextEndEvent{ID: id})
+		},
+		OnReasoningStart: func(id string) {
+			m.events.emit(ReasoningStartEvent{ID: id})
+		},
+		OnWarnings: func(warnings []string) {
+			m.events.emit(WarningsEvent{Warnings: warnings})
+		},
+		OnSource: func(sourceType, id, url, title string) {
+			m.events.emit(SourceEvent{
+				SourceType: sourceType,
+				ID:         id,
+				URL:        url,
+				Title:      title,
+			})
+		},
+		OnStreamFinish: func(usage fantasy.Usage, finishReason string) {
+			m.events.emit(StreamFinishEvent{
+				Usage:        usage,
+				FinishReason: finishReason,
+			})
+		},
+		OnError: func(err error) {
+			m.events.emit(ErrorEvent{Error: err})
+		},
+		OnRetry: func(attempt int, err error) {
+			m.events.emit(RetryEvent{Attempt: attempt, Error: err})
+		},
+		// PrepareStep hook — compose with steering (handled in agent layer)
+		// and then run SDK consumer hooks.
+		OnPrepareStep: func() agent.PrepareStepHandler {
+			if !m.prepareStep.hasHooks() {
+				return nil
+			}
+			return func(stepNumber int, messages []fantasy.Message) []fantasy.Message {
+				hookResult := m.prepareStep.run(PrepareStepHook{
+					StepNumber: stepNumber,
+					Messages:   messages,
+				})
+				if hookResult != nil && hookResult.Messages != nil {
+					return hookResult.Messages
+				}
+				return nil
+			}
+		}(),
+	})
 }

 // runTurn is the shared lifecycle for every prompt mode:
@@ -1955,6 +2517,35 @@ func (m *Kit) GetTools() []Tool {
 	return m.agent.GetTools()
 }

+// MaxTokens returns the effective max output tokens currently configured for
+// the agent. This is the value actually sent to the LLM provider on each
+// request, after CLI/env/config resolution, per-model overrides, model-aware
+// right-sizing, and any Anthropic thinking-budget adjustments.
+//
+// Returns 0 when the active provider suppresses the max_output_tokens
+// parameter (e.g. OpenAI Codex OAuth) or when no model is configured yet.
+// A non-zero value is the limit at which generation is cut off with
+// FinishReasonLength if the model tries to produce more output.
+func (m *Kit) MaxTokens() int {
+	if m.agent == nil {
+		return 0
+	}
+	return m.agent.GetMaxTokens()
+}
+
+// MaxOutputLimit returns the catalog-reported output ceiling for the current
+// model in tokens, or 0 when the model isn't in the registry (custom models,
+// new releases, Ollama, etc.). Pair with MaxTokens() to detect when the agent
+// is configured well below what the model supports and surface a hint to the
+// user.
+func (m *Kit) MaxOutputLimit() int {
+	info := m.GetModelInfo()
+	if info == nil {
+		return 0
+	}
+	return info.Limit.Output
+}
+
 // extractFileParts returns all FilePart entries from a message's Content.
 // Used to preserve image attachments when replacing user message text.
 func extractFileParts(msg fantasy.Message) []fantasy.FilePart {
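The MaxOutputLimit comment in the hunk above suggests pairing it with MaxTokens() to surface a hint when the agent is configured well below the model's ceiling. A minimal sketch of such a check follows. It is not part of the SDK: tokenHint is a hypothetical helper, its two parameters stand in for the values returned by Kit.MaxTokens() and Kit.MaxOutputLimit(), and the "less than half" threshold is an arbitrary assumption.

```go
package main

import "fmt"

// tokenHint is a hypothetical helper, not an SDK function. maxTokens and
// outputLimit stand in for Kit.MaxTokens() and Kit.MaxOutputLimit().
// The "less than half the ceiling" threshold is an assumption made for
// illustration only.
func tokenHint(maxTokens, outputLimit int) string {
	// Either value may be 0 (parameter suppressed by the provider, or
	// model missing from the registry); no comparison is possible then.
	if maxTokens == 0 || outputLimit == 0 {
		return ""
	}
	if maxTokens < outputLimit/2 {
		return fmt.Sprintf("max tokens is %d but the model supports up to %d",
			maxTokens, outputLimit)
	}
	return ""
}

func main() {
	fmt.Println(tokenHint(4096, 64000))
	fmt.Println(tokenHint(0, 64000) == "")
}
```

Treating 0 as "unknown" on both sides keeps the hint from firing for custom models or for providers that suppress the parameter.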
@@ -5,6 +5,8 @@ import (
 	"os"
 	"testing"

+	"github.com/spf13/viper"
+
 	kit "github.com/mark3labs/kit/pkg/kit"
 )

@@ -54,6 +56,225 @@ func TestNewWithOptions(t *testing.T) {
 	}
 }

+// TestNewWithGenerationOptions verifies that the SDK-only generation
+// parameter overrides on Options propagate all the way through to the
+// agent without requiring any viper.Set workarounds in caller code.
+func TestNewWithGenerationOptions(t *testing.T) {
+	if os.Getenv("ANTHROPIC_API_KEY") == "" {
+		t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
+	}
+
+	ctx := context.Background()
+
+	// MaxTokens override — keep ThinkingLevel off so Anthropic's thinking
+	// budget doesn't auto-bump MaxTokens above what we configured.
+	t.Run("MaxTokens", func(t *testing.T) {
+		defer resetViper()
+
+		const want = 12345
+		host, err := kit.New(ctx, &kit.Options{
+			Model:     "anthropic/claude-sonnet-4-5-20250929",
+			Quiet:     true,
+			MaxTokens: want,
+		})
+		if err != nil {
+			t.Fatalf("Failed to create Kit: %v", err)
+		}
+		defer func() { _ = host.Close() }()
+
+		if got := host.MaxTokens(); got != want {
+			t.Errorf("Options.MaxTokens=%d did not propagate; Kit.MaxTokens()=%d", want, got)
+		}
+		if !viper.IsSet("max-tokens") {
+			t.Error("viper.IsSet(\"max-tokens\") should be true after MaxTokens override")
+		}
+	})
+
+	// ThinkingLevel override — verified via the public getter, which
+	// reads back the configured (not provider-derived) level.
+	t.Run("ThinkingLevel", func(t *testing.T) {
+		defer resetViper()
+
+		const want = "high"
+		host, err := kit.New(ctx, &kit.Options{
+			Model:         "anthropic/claude-sonnet-4-5-20250929",
+			Quiet:         true,
+			ThinkingLevel: want,
+		})
+		if err != nil {
+			t.Fatalf("Failed to create Kit: %v", err)
+		}
+		defer func() { _ = host.Close() }()
+
+		if got := host.GetThinkingLevel(); got != want {
+			t.Errorf("Options.ThinkingLevel=%q did not propagate; Kit.GetThinkingLevel()=%q", want, got)
+		}
+	})
+
+	// Temperature override: pointer semantics let callers distinguish
+	// "explicitly 0.0" from "unset". The test checks propagation by setting
+	// a distinctive value and reading it back from viper's merged state.
+	t.Run("Temperature", func(t *testing.T) {
+		defer resetViper()
+
+		want := float32(0.12345)
+		host, err := kit.New(ctx, &kit.Options{
+			Model:       "anthropic/claude-sonnet-4-5-20250929",
+			Quiet:       true,
+			Temperature: &want,
+		})
+		if err != nil {
+			t.Fatalf("Failed to create Kit: %v", err)
+		}
+		defer func() { _ = host.Close() }()
+
+		if !viper.IsSet("temperature") {
+			t.Fatal("viper.IsSet(\"temperature\") should be true after Temperature override")
+		}
+		if got := float32(viper.GetFloat64("temperature")); got != want {
+			t.Errorf("Options.Temperature=%v did not propagate; viper=%v", want, got)
+		}
+	})
+}
+
+// TestNewPreservesIsSetSemantics verifies that creating a Kit WITHOUT
+// populating the generation-param Options fields does NOT mark those
+// keys as explicitly set in viper. This is the precedence contract
+// that per-model defaults (ApplyModelSettings) and right-sizing
+// (rightSizeMaxTokens) rely on.
+//
+// Previously setSDKDefaults() used viper.SetDefault() for every param,
+// which caused viper.IsSet() to return true for all of them — silently
+// suppressing per-model defaults and pinning max-tokens at 4096 even
+// on models with much larger output limits.
+func TestNewPreservesIsSetSemantics(t *testing.T) {
+	if os.Getenv("ANTHROPIC_API_KEY") == "" {
+		t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
+	}
+
+	defer resetViper()
+
+	ctx := context.Background()
+	host, err := kit.New(ctx, &kit.Options{
+		Model:      "anthropic/claude-sonnet-4-5-20250929",
+		Quiet:      true,
+		NoSession:  true,
+		SkipConfig: true, // isolate from any ~/.kit.yml values
+	})
+	if err != nil {
+		t.Fatalf("Failed to create Kit: %v", err)
+	}
+	defer func() { _ = host.Close() }()
+
+	// These keys must remain "unset" from viper's perspective so the
+	// downstream isExplicitlySet() checks allow per-model defaults to
+	// take effect.
+	checkKeys := []string{
+		"max-tokens",
+		"temperature",
+		"top-p",
+		"top-k",
+		"frequency-penalty",
+		"presence-penalty",
+		"thinking-level",
+	}
+
+	// With SkipConfig: true, InitConfig() is not invoked, so viper has
+	// no env-var bindings registered. Any IsSet() here would come purely
+	// from SDK-side SetDefault/Set calls — which is exactly what this
+	// test is guarding against.
+	for _, k := range checkKeys {
+		if viper.IsSet(k) {
+			t.Errorf("viper.IsSet(%q) == true when no Options field set it "+
+				"(SDK defaults must not corrupt IsSet semantics)", k)
+		}
+	}
+}
+
+// TestNewWithProviderOptions verifies that programmatic provider overrides
+// (API key, URL) take effect without env vars or config files, and that
+// Options.ProviderAPIKey *wins* over any pre-existing viper state.
+func TestNewWithProviderOptions(t *testing.T) {
+	if os.Getenv("ANTHROPIC_API_KEY") == "" {
+		t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
+	}
+
+	ctx := context.Background()
+
+	t.Run("succeeds with API key from Options", func(t *testing.T) {
+		defer resetViper()
+
+		apiKey := os.Getenv("ANTHROPIC_API_KEY")
+		host, err := kit.New(ctx, &kit.Options{
+			Model:          "anthropic/claude-sonnet-4-5-20250929",
+			Quiet:          true,
+			NoSession:      true,
+			ProviderAPIKey: apiKey,
+		})
+		if err != nil {
+			t.Fatalf("Failed to create Kit with ProviderAPIKey option: %v", err)
+		}
+		defer func() { _ = host.Close() }()
+
+		if got := viper.GetString("provider-api-key"); got != apiKey {
+			t.Errorf("Options.ProviderAPIKey did not propagate to viper; got len=%d, want len=%d", len(got), len(apiKey))
+		}
+	})
+
+	// Override precedence: even when viper already holds a different
+	// provider-api-key value (as it would if a config file or earlier
+	// Set() call populated one), Options.ProviderAPIKey must win.
+	t.Run("Options override beats pre-existing viper state", func(t *testing.T) {
+		defer resetViper()
+
+		viper.Set("provider-api-key", "sk-config-file-placeholder")
+
+		want := "sk-from-options-override"
+		// Use an OpenAI-flavored model so the validation path accepts
+		// the placeholder without attempting a real Anthropic handshake.
+		host, err := kit.New(ctx, &kit.Options{
+			Model:            "openai/gpt-4o-mini",
+			Quiet:            true,
+			NoSession:        true,
+			NoExtensions:     true,
+			DisableCoreTools: true,
+			ProviderAPIKey:   want,
+		})
+		// Creation may still fail if the model registry is strict, but
+		// we only care that the override reached viper before any
+		// provider handshake happened.
+		if host != nil {
+			defer func() { _ = host.Close() }()
+		}
+		_ = err
+
+		if got := viper.GetString("provider-api-key"); got != want {
+			t.Errorf("Options.ProviderAPIKey did not override pre-existing viper value; got %q, want %q", got, want)
+		}
+	})
+
+	// ProviderURL override must also reach viper.
+	t.Run("ProviderURL propagates", func(t *testing.T) {
+		defer resetViper()
+
+		const want = "https://custom.example.com/v1"
+		host, err := kit.New(ctx, &kit.Options{
+			Model:       "anthropic/claude-sonnet-4-5-20250929",
+			Quiet:       true,
+			NoSession:   true,
+			ProviderURL: want,
+		})
+		if err != nil {
+			t.Fatalf("Failed to create Kit with ProviderURL option: %v", err)
+		}
+		defer func() { _ = host.Close() }()
+
+		if got := viper.GetString("provider-url"); got != want {
+			t.Errorf("Options.ProviderURL did not propagate; got %q, want %q", got, want)
+		}
+	})
+}
+
 func TestSessionManagement(t *testing.T) {
 	if os.Getenv("ANTHROPIC_API_KEY") == "" {
 		t.Skip("Skipping test: ANTHROPIC_API_KEY not set")
@@ -81,3 +302,7 @@ func TestSessionManagement(t *testing.T) {
 		t.Error("Expected non-empty session ID")
 	}
 }
+
+// resetViper wipes viper's global state so a test case doesn't leak
+// viper.Set() calls into the next one. Used via defer in subtests.
+func resetViper() { viper.Reset() }
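The Temperature subtest in this file relies on pointer semantics to tell an explicit 0.0 apart from an unset field, and TestNewPreservesIsSetSemantics relies on unset fields leaving no trace. The idea can be sketched without viper or the kit package. Everything below is hypothetical: genOptions is a trimmed-down stand-in for kit.Options, explicitKeys is an illustrative helper, and treating a zero MaxTokens as "unset" is an assumption.

```go
package main

import "fmt"

// genOptions is a hypothetical stand-in for kit.Options, reduced to two
// generation parameters.
type genOptions struct {
	MaxTokens   int      // assumption: 0 means "not set"
	Temperature *float32 // nil means "not set"; a pointer to 0 is explicit
}

// explicitKeys returns the config keys the caller set explicitly. Only
// these would be written to the config store; the rest stay unset so
// that per-model defaults can still apply.
func explicitKeys(o genOptions) []string {
	var keys []string
	if o.MaxTokens > 0 {
		keys = append(keys, "max-tokens")
	}
	if o.Temperature != nil {
		keys = append(keys, "temperature")
	}
	return keys
}

func main() {
	zero := float32(0)
	fmt.Println(explicitKeys(genOptions{}))                   // nothing set
	fmt.Println(explicitKeys(genOptions{Temperature: &zero})) // explicit 0.0
}
```

A plain float32 field could not express the second case, because its zero value is indistinguishable from "caller did not set it".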
@@ -22,11 +22,6 @@ func GetLLMProviders() []string {
 	return models.GetGlobalRegistry().GetLLMProviders()
 }

-// Deprecated: Use GetLLMProviders instead.
-func GetFantasyProviders() []string {
-	return GetLLMProviders()
-}
-
 // GetModelsForProvider returns all known models for a provider.
 func GetModelsForProvider(provider string) (map[string]ModelInfo, error) {
 	return models.GetGlobalRegistry().GetModelsForProvider(provider)
@@ -5,18 +5,18 @@ import (
 	"fmt"
 	"net"
 	"net/http"
-	"os/exec"
-	"runtime"
 	"sync"
 	"time"
 )

 // MCPAuthHandler handles OAuth authorization for MCP servers.
 // Implementations control the user experience — opening a browser, showing a
-// prompt, displaying a URL, etc.
+// prompt, displaying a URL, posting to a message bus, etc.
 //
-// The default implementation ([DefaultMCPAuthHandler]) opens the system browser
-// and starts a local HTTP callback server to receive the authorization code.
+// [DefaultMCPAuthHandler] provides the transport mechanics (port reservation
+// and callback server) but performs no user-facing I/O on its own; consumers
+// wire presentation via [DefaultMCPAuthHandler.OnAuthURL] or implement
+// MCPAuthHandler from scratch.
 type MCPAuthHandler interface {
 	// RedirectURI returns the OAuth redirect URI that the callback server
 	// will listen on. This is called during MCP transport setup — before any
@@ -37,23 +37,44 @@ type MCPAuthHandler interface {
 	HandleAuth(ctx context.Context, serverName string, authURL string) (callbackURL string, err error)
 }

-// DefaultMCPAuthHandler opens the system browser and starts a local HTTP
-// callback server to receive the OAuth authorization code. It eagerly reserves
-// a TCP port on construction so [RedirectURI] is stable for the lifetime of
-// the handler.
+// DefaultMCPAuthHandler provides the transport mechanics of an OAuth flow —
+// reserving a local TCP port and running a one-shot HTTP callback server —
+// without making any user-experience decisions. It performs no browser opens,
+// no printing, no TUI calls; consumers attach presentation by setting
+// [DefaultMCPAuthHandler.OnAuthURL] or by wrapping the handler.
 //
-// Create instances with [NewDefaultMCPAuthHandler] (random port) or
-// [NewDefaultMCPAuthHandlerWithPort] (explicit port).
+// The handler eagerly reserves a TCP port on construction so
+// [DefaultMCPAuthHandler.RedirectURI] is stable for the lifetime of the
+// handler. Create instances with [NewDefaultMCPAuthHandler] (random port) or
+// [NewDefaultMCPAuthHandlerWithPort] (explicit port). Always call
+// [DefaultMCPAuthHandler.Close] when done to release the port.
 type DefaultMCPAuthHandler struct {
 	listener net.Listener
 	port     int
 	mu       sync.Mutex // guards listener lifecycle
+
+	// OnAuthURL, if set, is invoked exactly once per
+	// [DefaultMCPAuthHandler.HandleAuth] call with the authorization URL
+	// the user must visit. This is where consumers plug in their UX: open
+	// a browser, print to stderr, post to a TUI stream, render a QR code,
+	// etc. The handler performs no I/O on the URL itself; if OnAuthURL is
+	// nil the URL is silently dropped and the user cannot complete the flow.
+	//
+	// OnAuthURL is called synchronously before the handler blocks on the
+	// callback. It must not block indefinitely — long-running work should
+	// be dispatched to a goroutine.
+	OnAuthURL func(serverName, authURL string)
 }

 // NewDefaultMCPAuthHandler creates a handler that listens on a random
 // available port on localhost. The port is reserved immediately so
 // [RedirectURI] returns a stable value. Call [DefaultMCPAuthHandler.Close]
 // when the handler is no longer needed to release the port.
+//
+// The returned handler has no OnAuthURL hook configured, so HandleAuth
+// drops the authorization URL and blocks until the context is done (or the
+// default 2-minute timeout fires). Set OnAuthURL before using the handler,
+// or use a higher-level wrapper such as [CLIMCPAuthHandler].
 func NewDefaultMCPAuthHandler() (*DefaultMCPAuthHandler, error) {
 	listener, err := net.Listen("tcp", "localhost:0")
 	if err != nil {
@@ -88,9 +109,9 @@ func (h *DefaultMCPAuthHandler) Port() int {
 	return h.port
 }

-// HandleAuth opens the system browser to authURL and waits for the OAuth
-// callback on the local server. It returns the full callback URL including
-// query parameters (code, state, etc.).
+// HandleAuth passes the authorization URL to [DefaultMCPAuthHandler.OnAuthURL]
+// (if set) and waits for the OAuth callback on the local server. It returns
+// the full callback URL including query parameters (code, state, etc.).
 //
 // If the context has no deadline, a default 2-minute timeout is applied.
 // The callback server is started for each HandleAuth call and shut down
@@ -136,19 +157,13 @@ func (h *DefaultMCPAuthHandler) HandleAuth(ctx context.Context, serverName strin
 		Handler: mux,
 	}

-	// Start serving on the pre-reserved listener. We need to create a new
-	// listener on the same port because http.Server.Serve takes ownership
-	// and closes the listener when done. The original listener is kept open
-	// to reserve the port; we create a second listener via SO_REUSEADDR
-	// semantics (Go's default on most platforms) or, more reliably, we
-	// temporarily release and re-acquire.
-	//
-	// Strategy: use the held listener directly for Serve. After Serve
-	// returns (due to Shutdown), re-acquire the listener to keep the port
-	// reserved for future HandleAuth calls.
+	// Start serving on the pre-reserved listener. http.Server.Serve takes
+	// ownership and closes the listener when Shutdown is called, so we
+	// re-acquire a fresh listener on the same port in the deferred cleanup
+	// below to keep the port reserved for subsequent HandleAuth calls.
 	h.mu.Lock()
 	serveListener := h.listener
-	h.listener = nil // Serve will close it
+	h.listener = nil
 	h.mu.Unlock()

 	if serveListener == nil {
@@ -184,10 +199,11 @@ func (h *DefaultMCPAuthHandler) HandleAuth(ctx context.Context, serverName strin
 		}
 	}()

-	// Open the system browser.
-	if err := openBrowser(authURL); err != nil {
-		// Browser open is best-effort; the user can still navigate manually.
-		_ = err
+	// Surface the authorization URL to the consumer. This is the single
+	// presentation seam: the SDK itself does not open browsers, print,
+	// or otherwise touch the user's environment.
+	if h.OnAuthURL != nil {
+		h.OnAuthURL(serverName, authURL)
 	}

 	// Wait for the callback, a server error, or context cancellation.
@@ -214,22 +230,6 @@ func (h *DefaultMCPAuthHandler) Close() error {
 	return nil
 }

-// openBrowser opens the default system browser to the given URL. This is a
-// best-effort operation — errors are returned but callers typically ignore
-// them since the user can navigate manually.
-func openBrowser(url string) error {
-	switch runtime.GOOS {
-	case "linux":
-		return exec.Command("xdg-open", url).Start()
-	case "windows":
-		return exec.Command("rundll32", "url.dll,FileProtocolHandler", url).Start()
-	case "darwin":
-		return exec.Command("open", url).Start()
-	default:
-		return fmt.Errorf("unsupported platform: %s", runtime.GOOS)
-	}
-}
-
 // oauthSuccessHTML is the HTML page returned to the browser after a
 // successful OAuth callback.
 const oauthSuccessHTML = `<!DOCTYPE html>
@@ -5,32 +5,49 @@ import (
 	"fmt"
 	"io"
 	"os"
+	"os/exec"
+	"runtime"
 )

-// CLIMCPAuthHandler wraps a [DefaultMCPAuthHandler] and prints status messages
-// to a writer (typically stderr) so the user knows what's happening during
-// OAuth authorization. This is the handler used by the CLI/TUI binary.
+// CLIMCPAuthHandler is the MCP OAuth handler for CLI/TUI consumers. It wraps
+// a [DefaultMCPAuthHandler] and layers standard CLI behavior on top of the
+// underlying transport mechanics:
 //
-// For TUI integration, set NotifyFunc to route messages through the TUI's
-// event system instead of (or in addition to) the writer.
+//   - Opens the authorization URL in the system browser
+//   - Prints status messages (or routes them to a TUI via [CLIMCPAuthHandler.NotifyFunc])
+//
+// Non-CLI consumers (web apps, daemons, custom TUIs) should not use this
+// handler; implement [MCPAuthHandler] directly or configure a
+// [DefaultMCPAuthHandler] with a custom OnAuthURL instead.
 type CLIMCPAuthHandler struct {
 	inner *DefaultMCPAuthHandler
 	w     io.Writer

-	// NotifyFunc, when set, is called with status messages instead of writing
-	// to the writer. This allows the TUI to display system messages in the
-	// chat stream. If nil, messages are written to w.
+	// NotifyFunc, when set, is called with status messages instead of
+	// writing to the writer. This allows the TUI to display system
+	// messages in the chat stream. If nil, messages are written to w.
 	NotifyFunc func(serverName, message string)
 }

 // NewCLIMCPAuthHandler creates a CLI auth handler that prints status messages
-// to stderr and delegates the actual OAuth flow to a [DefaultMCPAuthHandler].
+// to stderr, opens the authorization URL in the system browser, and delegates
+// the callback-server mechanics to a [DefaultMCPAuthHandler].
 func NewCLIMCPAuthHandler() (*CLIMCPAuthHandler, error) {
 	inner, err := NewDefaultMCPAuthHandler()
 	if err != nil {
 		return nil, err
 	}
-	return &CLIMCPAuthHandler{inner: inner, w: os.Stderr}, nil
+	h := &CLIMCPAuthHandler{inner: inner, w: os.Stderr}
+	// Wire the CLI presentation policy into the inner handler's hook.
+	// This is the one place in the codebase where OAuth triggers a
+	// browser open; the SDK core remains I/O-free.
+	inner.OnAuthURL = func(serverName, authURL string) {
+		h.notify(serverName, fmt.Sprintf("🔐 MCP server %q requires authentication. Opening browser...", serverName))
+		h.notify(serverName, fmt.Sprintf("   If the browser doesn't open, visit:\n   %s", authURL))
+		// Browser open is best-effort; the user can still navigate manually.
+		_ = openBrowser(authURL)
+	}
+	return h, nil
 }

 // RedirectURI returns the OAuth redirect URI from the inner handler.
@@ -38,17 +55,15 @@ func (h *CLIMCPAuthHandler) RedirectURI() string {
 	return h.inner.RedirectURI()
 }

-// HandleAuth prints status messages and delegates to the inner handler.
+// HandleAuth delegates to the inner handler (which invokes OnAuthURL, runs
+// the callback server, and returns the full callback URL) and emits a final
+// success or failure notification.
 func (h *CLIMCPAuthHandler) HandleAuth(ctx context.Context, serverName string, authURL string) (string, error) {
-	h.notify(serverName, fmt.Sprintf("🔐 MCP server %q requires authentication. Opening browser...", serverName))
-	h.notify(serverName, fmt.Sprintf("   If the browser doesn't open, visit:\n   %s", authURL))
-
 	callbackURL, err := h.inner.HandleAuth(ctx, serverName, authURL)
 	if err != nil {
 		h.notify(serverName, fmt.Sprintf("✗ Authentication failed for %q: %v", serverName, err))
 		return "", err
 	}
-
 	h.notify(serverName, fmt.Sprintf("✓ Authenticated with %q", serverName))
 	return callbackURL, nil
 }
@@ -66,3 +81,20 @@ func (h *CLIMCPAuthHandler) notify(serverName, message string) {
 	}
 	_, _ = fmt.Fprintln(h.w, message)
 }
+
+// openBrowser opens the system default browser at url. Intentionally
+// unexported: browser opening is CLI policy, not SDK surface. Consumers
+// that need similar behavior for their own UX should bring their own
+// helper (or use a third-party package like github.com/pkg/browser).
+func openBrowser(url string) error {
+	switch runtime.GOOS {
+	case "linux":
+		return exec.Command("xdg-open", url).Start()
+	case "windows":
+		return exec.Command("rundll32", "url.dll,FileProtocolHandler", url).Start()
+	case "darwin":
+		return exec.Command("open", url).Start()
+	default:
+		return fmt.Errorf("unsupported platform: %s", runtime.GOOS)
+	}
+}
@@ -2,6 +2,7 @@ package kit

 import (
 	"context"
+	"strings"

 	"charm.land/fantasy"

@@ -52,6 +53,22 @@ func ErrorResult(content string) ToolOutput {
 	return ToolOutput{Content: content, IsError: true}
 }

+// ImageResult creates a [ToolOutput] that returns an image to the LLM.
+// The data is the raw image bytes and mediaType is the MIME type
+// (e.g. "image/png", "image/jpeg"). The optional text content accompanies
+// the image and is visible to the LLM alongside it.
+func ImageResult(content string, data []byte, mediaType string) ToolOutput {
+	return ToolOutput{Content: content, Data: data, MediaType: mediaType}
+}
+
+// MediaResult creates a [ToolOutput] that returns non-image binary media
+// (e.g. audio, video) to the LLM. The data is the raw bytes and mediaType
+// is the MIME type (e.g. "audio/wav", "video/mp4"). The optional text
+// content accompanies the media.
+func MediaResult(content string, data []byte, mediaType string) ToolOutput {
+	return ToolOutput{Content: content, Data: data, MediaType: mediaType}
+}
+
 // toolCallIDKey is the context key for the tool call ID.
 type toolCallIDKey struct{}

@@ -63,9 +80,35 @@ func ToolCallIDFromContext(ctx context.Context) string {
 	return s
 }

+// toolOutputToResponse converts a [ToolOutput] into the underlying
+// framework's ToolResponse, inferring the response Type from Data/MediaType
+// so that binary content (images, audio, etc.) is forwarded to the LLM
+// instead of being silently dropped.
+func toolOutputToResponse(result ToolOutput) fantasy.ToolResponse {
+	resp := fantasy.ToolResponse{
+		Content:   result.Content,
+		IsError:   result.IsError,
+		Data:      result.Data,
+		MediaType: result.MediaType,
+	}
+	// Infer response type from binary data so the downstream framework
+	// creates a media content block instead of a plain-text one.
+	if len(result.Data) > 0 && result.MediaType != "" {
+		if strings.HasPrefix(result.MediaType, "image/") {
+			resp.Type = "image"
+		} else {
+			resp.Type = "media"
+		}
+	}
+	if result.Metadata != nil {
+		resp = fantasy.WithResponseMetadata(resp, result.Metadata)
+	}
+	return resp
+}
+
 // NewTool creates a custom [Tool] with automatic JSON schema generation from
 // the TInput struct type. The handler receives a typed input (deserialized
-// from the LLM's JSON arguments) and returns a [ToolResult].
+// from the LLM's JSON arguments) and returns a [ToolOutput].
 //
 // Struct tags on TInput control the generated schema:
 //
@@ -77,6 +120,11 @@ func ToolCallIDFromContext(ctx context.Context) string {
 // The tool call ID is injected into the context and can be retrieved with
 // [ToolCallIDFromContext].
 //
+// Binary results: When [ToolOutput.Data] and [ToolOutput.MediaType] are set,
+// the response type is automatically inferred so the LLM receives the binary
+// content (e.g. an image) instead of only the text. Use [ImageResult] or
+// [MediaResult] for convenience.
+//
 // Example:
 //
 //	type WeatherInput struct {
@@ -84,7 +132,7 @@ func ToolCallIDFromContext(ctx context.Context) string {
 //	}
 //
 //	tool := kit.NewTool("get_weather", "Get weather for a city",
-//	    func(ctx context.Context, input WeatherInput) (kit.ToolResult, error) {
+//	    func(ctx context.Context, input WeatherInput) (kit.ToolOutput, error) {
 //	        return kit.TextResult("72°F, sunny in " + input.City), nil
 //	    },
 //	)
@@ -96,16 +144,7 @@ func NewTool[TInput any](name, description string, fn func(ctx context.Context,
 			if err != nil {
 				return fantasy.NewTextErrorResponse(err.Error()), nil
 			}
-			resp := fantasy.ToolResponse{
-				Content:   result.Content,
-				IsError:   result.IsError,
-				Data:      result.Data,
-				MediaType: result.MediaType,
-			}
-			if result.Metadata != nil {
-				resp = fantasy.WithResponseMetadata(resp, result.Metadata)
-			}
-			return resp, nil
+			return toolOutputToResponse(result), nil
 		},
 	)
 }
@@ -121,16 +160,7 @@ func NewParallelTool[TInput any](name, description string, fn func(ctx context.C
 			if err != nil {
 				return fantasy.NewTextErrorResponse(err.Error()), nil
 			}
-			resp := fantasy.ToolResponse{
-				Content:   result.Content,
-				IsError:   result.IsError,
-				Data:      result.Data,
-				MediaType: result.MediaType,
-			}
-			if result.Metadata != nil {
-				resp = fantasy.WithResponseMetadata(resp, result.Metadata)
-			}
-			return resp, nil
+			return toolOutputToResponse(result), nil
 		},
 	)
 }
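The inference rule that toolOutputToResponse applies in the hunks above can be exercised in isolation. The following is a minimal sketch rather than the SDK's code: output and inferType are hypothetical stand-ins for kit.ToolOutput and the type-inference branch, so the example compiles without the kit or fantasy packages.

```go
package main

import (
	"fmt"
	"strings"
)

// output is a hypothetical stand-in for kit.ToolOutput, reduced to the
// fields the inference rule inspects.
type output struct {
	Content   string
	Data      []byte
	MediaType string
}

// inferType mirrors the rule shown in the diff: binary data plus a MIME
// type yields "image" for image/* types and "media" for everything else.
// Without both, the type stays "" and the result is treated as text.
func inferType(o output) string {
	if len(o.Data) == 0 || o.MediaType == "" {
		return ""
	}
	if strings.HasPrefix(o.MediaType, "image/") {
		return "image"
	}
	return "media"
}

func main() {
	fmt.Printf("%q\n", inferType(output{Data: []byte{0x89}, MediaType: "image/png"}))
	fmt.Printf("%q\n", inferType(output{Data: []byte{0xFF}, MediaType: "audio/mpeg"}))
	fmt.Printf("%q\n", inferType(output{Content: "hello"}))
}
```

Because the prefix check is case-sensitive, a MIME type written as "IMAGE/PNG" would be classified as "media" under this rule, and data supplied without a MIME type falls back to text.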
@@ -117,3 +117,149 @@ func TestToolOutput_BinaryData(t *testing.T) {
 		t.Errorf("MediaType = %q, want %q", r.MediaType, "image/png")
 	}
 }
+
+// TestImageResult verifies the ImageResult convenience constructor.
+func TestImageResult(t *testing.T) {
+	data := []byte{0x89, 0x50, 0x4E, 0x47}
+	r := kit.ImageResult("here is the image", data, "image/png")
+	if r.Content != "here is the image" {
+		t.Errorf("Content = %q, want %q", r.Content, "here is the image")
+	}
+	if len(r.Data) != 4 {
+		t.Errorf("Data len = %d, want 4", len(r.Data))
+	}
+	if r.MediaType != "image/png" {
+		t.Errorf("MediaType = %q, want %q", r.MediaType, "image/png")
+	}
+	if r.IsError {
+		t.Error("ImageResult should not set IsError")
+	}
+}
+
+// TestMediaResult verifies the MediaResult convenience constructor.
+func TestMediaResult(t *testing.T) {
+	data := []byte{0xFF, 0xFB, 0x90, 0x00}
+	r := kit.MediaResult("audio clip", data, "audio/mpeg")
+	if r.Content != "audio clip" {
+		t.Errorf("Content = %q, want %q", r.Content, "audio clip")
+	}
+	if len(r.Data) != 4 {
+		t.Errorf("Data len = %d, want 4", len(r.Data))
+	}
+	if r.MediaType != "audio/mpeg" {
+		t.Errorf("MediaType = %q, want %q", r.MediaType, "audio/mpeg")
+	}
+	if r.IsError {
+		t.Error("MediaResult should not set IsError")
+	}
+}
+
+// TestNewTool_BinaryImageResponse verifies that NewTool correctly infers the
+// response type for image data so binary content is forwarded to the LLM
+// (issue #17).
+func TestNewTool_BinaryImageResponse(t *testing.T) {
+	type Input struct {
+		Path string `json:"path"`
+	}
+
+	imgData := []byte{0x89, 0x50, 0x4E, 0x47} // PNG magic bytes
+
+	tool := kit.NewTool("read_image", "Read an image file",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.ImageResult("Here is the image", imgData, "image/png"), nil
+		},
+	)
+
+	// Run the tool and inspect the raw ToolResponse via the AgentTool interface.
+	resp, err := tool.Run(context.Background(), kit.LLMToolCall{
+		ID:    "call_1",
+		Name:  "read_image",
+		Input: `{"path": "test.png"}`,
+	})
+	if err != nil {
+		t.Fatalf("Run() error: %v", err)
+	}
+
+	// The Type field must be "image" so the downstream framework creates a
+	// media content block instead of discarding the binary data.
+	if resp.Type != "image" {
+		t.Errorf("ToolResponse.Type = %q, want %q", resp.Type, "image")
+	}
+	if len(resp.Data) != 4 {
+		t.Errorf("ToolResponse.Data len = %d, want 4", len(resp.Data))
+	}
+	if resp.MediaType != "image/png" {
+		t.Errorf("ToolResponse.MediaType = %q, want %q", resp.MediaType, "image/png")
+	}
+	if resp.Content != "Here is the image" {
+		t.Errorf("ToolResponse.Content = %q, want %q", resp.Content, "Here is the image")
+	}
+}
+
+// TestNewTool_BinaryMediaResponse verifies type inference for non-image media.
+func TestNewTool_BinaryMediaResponse(t *testing.T) {
+	type Input struct{}
+
+	tool := kit.NewTool("get_audio", "Get audio",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.MediaResult("audio clip", []byte{0xFF, 0xFB}, "audio/mpeg"), nil
+		},
+	)
+
+	resp, err := tool.Run(context.Background(), kit.LLMToolCall{
+		ID:    "call_2",
+		Name:  "get_audio",
+		Input: `{}`,
+	})
+	if err != nil {
+		t.Fatalf("Run() error: %v", err)
+	}
+	if resp.Type != "media" {
+		t.Errorf("ToolResponse.Type = %q, want %q", resp.Type, "media")
+	}
+}
+
+// TestNewTool_TextResponseTypeNotSet verifies that text-only responses do NOT
+// get an inferred type (preserving existing behavior).
+func TestNewTool_TextResponseTypeNotSet(t *testing.T) {
+	type Input struct{}
+
+	tool := kit.NewTool("echo", "Echo",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.TextResult("hello"), nil
+		},
+	)
+
+	resp, err := tool.Run(context.Background(), kit.LLMToolCall{
+		ID: "call_3", Name: "echo", Input: `{}`,
+	})
+	if err != nil {
+		t.Fatalf("Run() error: %v", err)
+	}
+	// Text responses should not have Type set (the framework treats "" as text).
+	if resp.Type != "" {
+		t.Errorf("ToolResponse.Type = %q, want empty string for text responses", resp.Type)
+	}
+}
+
+// TestNewParallelTool_BinaryImageResponse mirrors the NewTool binary test for
+// NewParallelTool.
+func TestNewParallelTool_BinaryImageResponse(t *testing.T) {
+	type Input struct{}
+
+	tool := kit.NewParallelTool("snap", "Take a snapshot",
+		func(ctx context.Context, input Input) (kit.ToolOutput, error) {
+			return kit.ImageResult("snapshot", []byte{0xFF, 0xD8}, "image/jpeg"), nil
+		},
+	)
+
+	resp, err := tool.Run(context.Background(), kit.LLMToolCall{
+		ID: "call_4", Name: "snap", Input: `{}`,
+	})
+	if err != nil {
+		t.Fatalf("Run() error: %v", err)
+	}
+	if resp.Type != "image" {
+		t.Errorf("ToolResponse.Type = %q, want %q", resp.Type, "image")
+	}
+}
@@ -12,6 +12,7 @@ import (
 	"github.com/mark3labs/kit/internal/models"
 	"github.com/mark3labs/kit/internal/session"
 	"github.com/mark3labs/mcp-go/client/transport"
+	"github.com/mark3labs/mcp-go/server"
 )

 // ==== Message Types (internal/message/content.go) ====
@@ -129,9 +130,9 @@ type SpinnerFunc = agent.SpinnerFunc

 // ==== LLM Types ====
 //
-// These are type aliases for the corresponding charm.land/fantasy types,
-// giving them clean LLM-prefixed names without leaking the dependency name.
-// SDK consumers can use these types without importing charm.land/fantasy directly.
+// These are type aliases for the underlying LLM provider types, giving them
+// clean LLM-prefixed names without leaking the dependency name. SDK consumers
+// can use these types without importing the provider package directly.

 // LLMMessage represents a message in an LLM conversation, carrying a role
 // and a slice of typed content parts (text, tool calls, reasoning, etc.).
@@ -156,6 +157,18 @@ type LLMTextPart = fantasy.TextPart
 // LLMReasoningPart is a reasoning/chain-of-thought content part.
 type LLMReasoningPart = fantasy.ReasoningPart

+// LLMToolCall represents the raw tool invocation passed to a [Tool]'s Run
+// method. It carries the call ID, tool name, and the JSON-encoded input
+// arguments from the LLM. This is the execution-layer call object — distinct
+// from [ToolCall] (a message content part).
+type LLMToolCall = fantasy.ToolCall
+
+// LLMToolResponse represents the raw response returned from a [Tool]'s Run
+// method. Most SDK consumers should use [ToolOutput] with [NewTool] /
+// [NewParallelTool] instead — this alias is provided for advanced use cases
+// that need to call Tool.Run() directly (e.g. testing).
+type LLMToolResponse = fantasy.ToolResponse
+
 // LLMToolCallPart represents an LLM-initiated tool invocation within a message.
 type LLMToolCallPart = fantasy.ToolCallPart

@@ -171,13 +184,47 @@ type LLMToolResultOutputContentText = fantasy.ToolResultOutputContentText
 // LLMToolResultOutputContentError is an error-valued tool result output.
 type LLMToolResultOutputContentError = fantasy.ToolResultOutputContentError

+// LLMToolResultOutputContentMedia is a media-valued tool result output
+// (images, audio, etc.) carrying base64-encoded data and a MIME type.
+type LLMToolResultOutputContentMedia = fantasy.ToolResultOutputContentMedia
+
+// LLMToolResultContentType classifies the kind of a tool result output
+// ("text", "error", or "media").
+type LLMToolResultContentType = fantasy.ToolResultContentType
+
+// Tool result content type constants.
+const (
+	// LLMToolResultContentTypeText represents text output.
+	LLMToolResultContentTypeText = fantasy.ToolResultContentTypeText
+	// LLMToolResultContentTypeError represents error text output.
+	LLMToolResultContentTypeError = fantasy.ToolResultContentTypeError
+	// LLMToolResultContentTypeMedia represents media (binary) output.
+	LLMToolResultContentTypeMedia = fantasy.ToolResultContentTypeMedia
+)
+
+// LLMToolInfo describes a tool's name, description, and JSON-Schema parameters.
+type LLMToolInfo = fantasy.ToolInfo
+
+// LLMProviderOptions carries provider-specific key/value option maps, keyed
+// by provider name (e.g. "anthropic"). Use this when configuring or
+// inspecting provider-specific tool behaviour.
+type LLMProviderOptions = fantasy.ProviderOptions
+
+// LLMProviderMetadata carries provider-specific metadata returned alongside
+// LLM responses, keyed by provider name.
+type LLMProviderMetadata = fantasy.ProviderMetadata
+
+// LLMPrompt is an ordered sequence of [LLMMessage] values forming a complete
+// prompt for the LLM.
+type LLMPrompt = fantasy.Prompt
+
 // LLMMessageRole identifies the participant role in an LLM conversation.
 type LLMMessageRole = fantasy.MessageRole

 // LLMFinishReason indicates why the LLM stopped generating.
 type LLMFinishReason = fantasy.FinishReason

-// LLM role constants mirror fantasy.MessageRole* values under clean LLM-prefixed names.
+// LLM role constants mirror the provider's role values under clean LLM-prefixed names.
 const (
 	// LLMRoleUser identifies a user message.
 	LLMRoleUser = fantasy.MessageRoleUser
@@ -190,13 +237,19 @@ const (
 )

 // NewLLMUserMessage constructs a user-role LLMMessage with optional file
-// attachments. It is equivalent to fantasy.NewUserMessage.
+// attachments.
 var NewLLMUserMessage = fantasy.NewUserMessage

 // NewLLMSystemMessage constructs a system-role LLMMessage from one or more
-// prompt strings. It is equivalent to fantasy.NewSystemMessage.
+// prompt strings.
 var NewLLMSystemMessage = fantasy.NewSystemMessage

+// newLLMTextErrorResponse creates a tool-error response (internal helper).
+var newLLMTextErrorResponse = fantasy.NewTextErrorResponse
+
+// newLLMTextResponse creates a plain-text tool response (internal helper).
+var newLLMTextResponse = fantasy.NewTextResponse
+
 // ==== Compaction Types (internal/compaction/) ====

 // CompactionResult contains statistics from a compaction operation.
@@ -207,6 +260,12 @@ type CompactionOptions = compaction.CompactionOptions

 // ==== MCP OAuth Types ====

+// MCPServer is an in-process MCP server from the mcp-go library.
+// Pass an instance to [Kit.AddInProcessMCPServer] or
+// [Options.InProcessMCPServers] to register tools without spawning a
+// subprocess or making network calls.
+type MCPServer = server.MCPServer
+
 // MCPTokenStore persists OAuth tokens for a single MCP server. Implementations
 // must be safe for concurrent use.
 //
@@ -77,8 +77,8 @@ func TestLLMRoleConstants(t *testing.T) {
 	}
 }

-// TestLLMMessageAlias verifies LLMMessage is a type alias for fantasy.Message
-// and can be used interchangeably.
+// TestLLMMessageAlias verifies LLMMessage is a type alias for the underlying
+// LLM provider message type and can be used interchangeably.
 func TestLLMMessageAlias(t *testing.T) {
 	// Construct an LLMMessage using alias types.
 	msg := kit.LLMMessage{
@@ -132,8 +132,8 @@ func TestNewLLMSystemMessage(t *testing.T) {
 	}
 }

-// TestLLMUsageAlias verifies LLMUsage is a type alias for fantasy.Usage
-// and carries the correct fields.
+// TestLLMUsageAlias verifies LLMUsage is a type alias for the underlying
+// LLM provider usage type and carries the correct fields.
 func TestLLMUsageAlias(t *testing.T) {
 	u := kit.LLMUsage{
 		InputTokens:         100,
@@ -150,7 +150,7 @@ func TestLLMUsageAlias(t *testing.T) {
 		t.Errorf("LLMUsage.TotalTokens = %d, want 150", u.TotalTokens)
 	}

-	// Verify JSON marshaling uses snake_case (inherited from fantasy.Usage tags).
+	// Verify JSON marshaling uses snake_case (inherited from the provider's tags).
 	data, err := json.Marshal(u)
 	if err != nil {
 		t.Fatalf("LLMUsage.MarshalJSON: %v", err)
@@ -165,7 +165,8 @@ func TestLLMUsageAlias(t *testing.T) {
 	}
 }

-// TestLLMFilePartAlias verifies LLMFilePart is a type alias for fantasy.FilePart.
+// TestLLMFilePartAlias verifies LLMFilePart is a type alias for the underlying
+// LLM provider file part type.
 func TestLLMFilePartAlias(t *testing.T) {
 	fp := kit.LLMFilePart{
 		Filename:  "screenshot.png",
@@ -3,10 +3,13 @@
 ACP smoke test — drives `kit acp` over JSON-RPC 2.0 stdio.

 Protocol flow:
-  1. session/new  → get sessionId
-  2. session/set_model → set opencode/kimi-k2.5
-  3. session/prompt → "What is 2+2? Answer in one sentence."
-  4. Collect session updates until done
+  1. initialize        → negotiate capabilities
+  2. session/new       → get sessionId
+  3. session/list      → verify session listing works
+  4. session/set_config_option → set model
+  5. session/prompt    → "What is 2+2? Answer in one sentence."
+  6. Collect session/update notifications until prompt response
+  7. session/cancel    → verify cancel is accepted (no-op since prompt is done)
 """

 import json
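
Review note: the protocol flow in the docstring above mixes requests, which carry an `id` and expect a response, with notifications such as `session/cancel`, which carry no `id`. The following is a minimal sketch of how a client might tell the message kinds apart, assuming plain newline-delimited JSON-RPC 2.0 framing. `classify` is a hypothetical helper and is not part of the smoke test.

```python
import json


def classify(line):
    """Classify one newline-delimited JSON-RPC 2.0 message (hypothetical helper)."""
    msg = json.loads(line)
    if "method" in msg:
        # A method with an id is a request; without an id it is a notification.
        return "request" if "id" in msg else "notification"
    if "error" in msg:
        return "error"
    if "result" in msg:
        return "response"
    return "unknown"


# Illustrative messages only; the payloads are placeholders.
request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
cancel = json.dumps({"jsonrpc": "2.0", "method": "session/cancel", "params": {"sessionId": "s1"}})
reply = json.dumps({"jsonrpc": "2.0", "id": 1, "result": {"protocolVersion": 1}})

for line in (request, cancel, reply):
    print(classify(line))
```

Checking for `method` first keeps a peer-initiated request from being mistaken for a response, since both carry an `id`.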
@@ -21,9 +24,24 @@ MODEL   = os.environ.get("MODEL", "opencode/kimi-k2.5")
 CWD     = os.path.expanduser("~")
 TIMEOUT = 60  # seconds to wait for the prompt to complete

+# Request ID counter — initialize=1, session/new=2, etc.
+_next_id = 0

-def rpc(method, params, req_id):
-    return json.dumps({"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}) + "\n"
+
+def next_id():
+    global _next_id
+    _next_id += 1
+    return _next_id
+
+
+def rpc_request(method, params):
+    """Build a JSON-RPC 2.0 request with auto-incrementing ID."""
+    return json.dumps({"jsonrpc": "2.0", "id": next_id(), "method": method, "params": params}) + "\n"
+
+
+def rpc_notification(method, params):
+    """Build a JSON-RPC 2.0 notification (no id)."""
+    return json.dumps({"jsonrpc": "2.0", "method": method, "params": params}) + "\n"


 def send(proc, line):
@@ -32,7 +50,7 @@ def send(proc, line):
    proc.stdin.flush()


-def read_responses(proc, collected, done_event):
+def read_responses(proc, collected, done_event, prompt_id):
    """Read newline-delimited JSON from stdout until process exits."""
    for raw in proc.stdout:
        raw = raw.strip()
@@ -50,32 +68,49 @@ def read_responses(proc, collected, done_event):
        if "result" in msg:
            result = msg["result"]
            print(f"← RESP  id={msg.get('id')}  result={json.dumps(result)[:200]}", flush=True)
-            # Prompt complete when we get a stopReason on id=3
-            if msg.get("id") == 3 and "stopReason" in result:
+            # Prompt complete when we get a stopReason on the prompt request ID
+            if msg.get("id") == prompt_id and "stopReason" in result:
                done_event.set()
        elif "error" in msg:
            print(f"← ERROR id={msg.get('id')}  {json.dumps(msg['error'])}", flush=True)
            # If it's the prompt call that errored, unblock
-            if msg.get("id") == 3:
+            if msg.get("id") == prompt_id:
                done_event.set()
        elif "method" in msg:
            # Notification / session update
            m = msg.get("method", "")
            p = msg.get("params", {})
-            if m in ("session/update", "session/updated"):
+            if m == "session/update":
                update = p.get("update", {})
-                stype = update.get("sessionUpdate") or update.get("type", "?")
+                stype = update.get("sessionUpdate", "?")
                content = update.get("content", {})
+                text = content.get("text", "")
                if stype == "agent_thought_chunk":
-                    print(f"  [thinking] {content.get('text','')}", end="", flush=True)
+                    print(f"  [thinking] {text}", end="", flush=True)
                elif stype == "agent_message_chunk":
-                    print(f"  [response] {content.get('text','')}", end="", flush=True)
+                    print(f"  [response] {text}", end="", flush=True)
+                elif stype in ("tool_call", "tool_call_update"):
+                    title = update.get("title", update.get("toolCallId", "?"))
+                    status = update.get("status", "?")
+                    print(f"\n  [{stype}] {title} ({status})", flush=True)
                else:
                    print(f"\n  [update/{stype}] {json.dumps(update)[:200]}", flush=True)
            else:
                print(f"\n← NOTIF {m}  {json.dumps(p)[:200]}", flush=True)


+def wait_for_response(collected, req_id, timeout=5.0, label="response"):
+    """Block until we have a response for the given request ID."""
+    deadline = time.time() + timeout
+    while time.time() < deadline:
+        for msg in collected:
+            if msg.get("id") == req_id and ("result" in msg or "error" in msg):
+                return msg
+        time.sleep(0.1)
+    print(f"\n✗ FAIL: timed out waiting for {label} (id={req_id})", flush=True)
+    return None
+
+
 def main():
    print(f"Starting: {KIT_BIN} acp -m {MODEL}", flush=True)
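
Review note: `wait_for_response` polls `collected` every 100 ms. One possible alternative, sketched here under the assumption that a single reader thread sees every incoming message, is a table of per-request `threading.Event` objects so that a waiter wakes as soon as its response arrives. `PendingResponses` and its method names are hypothetical, and the `stopReason` value is only illustrative.

```python
import threading


class PendingResponses:
    """Map request IDs to events so waiters block instead of polling (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}
        self._responses = {}

    def expect(self, req_id):
        """Register interest in req_id before the request is sent."""
        with self._lock:
            self._events[req_id] = threading.Event()

    def resolve(self, msg):
        """Called by the reader thread for each message; True if a waiter was woken."""
        with self._lock:
            event = self._events.get(msg.get("id"))
            if event is None:
                return False  # notification, or an id nobody registered
            self._responses[msg["id"]] = msg
        event.set()
        return True

    def wait(self, req_id, timeout):
        """Return the response for req_id, or None on timeout."""
        with self._lock:
            event = self._events[req_id]
        if not event.wait(timeout):
            return None
        with self._lock:
            return self._responses[req_id]


pending = PendingResponses()
pending.expect(5)
# Simulate the reader thread delivering the response shortly afterwards.
threading.Timer(0.05, pending.resolve, args=({"id": 5, "result": {"stopReason": "end_turn"}},)).start()
print(pending.wait(5, timeout=2.0))
```

Registering the ID before sending the request avoids a race in which the response arrives before anyone is waiting for it.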

@@ -91,8 +126,13 @@ def main():
    collected = []
    done_event = threading.Event()

-    reader = threading.Thread(target=read_responses, args=(proc, collected, done_event), daemon=True)
-    reader.start()
+    # Holds the prompt request ID once it is known (set in Step 5).
+    prompt_id_holder = [None]
+
+    # NOTE: ReaderThread is defined here but never instantiated; the reader that actually runs is started after initialize.
+    class ReaderThread(threading.Thread):
+        def run(self):
+            read_responses(proc, collected, done_event, prompt_id_holder[0])

    stderr_lines = []
    def read_stderr():
@@ -105,16 +145,55 @@ def main():

    time.sleep(0.3)  # let the process initialise

-    # 1. session/new
-    send(proc, rpc("session/new", {"cwd": CWD, "mcpServers": []}, 1))
+    # ── Step 1: initialize ──────────────────────────────────────────────
+    init_id = next_id()
+    send(proc, json.dumps({
+        "jsonrpc": "2.0",
+        "id": init_id,
+        "method": "initialize",
+        "params": {
+            "protocolVersion": 1,
+            "clientCapabilities": {
+                "fs": {"readTextFile": False, "writeTextFile": False},
+            },
+            "clientInfo": {"name": "acp-smoke-test", "version": "1.0.0"},
+        },
+    }) + "\n")
+
+    # Start the reader thread; it collects every response and notification for the rest of the run.
+    reader = threading.Thread(target=read_responses, args=(proc, collected, done_event, None), daemon=True)
+    reader.start()
+
    time.sleep(1.0)

-    session_id = None
-    for msg in collected:
-        if msg.get("id") == 1 and "result" in msg:
-            session_id = msg["result"].get("sessionId")
-            break
+    init_resp = wait_for_response(collected, init_id, timeout=5, label="initialize")
+    if not init_resp or "error" in init_resp:
+        print(f"\n✗ FAIL: initialize failed: {init_resp}", flush=True)
+        proc.terminate()
+        sys.exit(1)

+    result = init_resp["result"]
+    proto_ver = result.get("protocolVersion")
+    agent_info = result.get("agentInfo", {})
+    print(f"\n✓ Initialized: protocol_version={proto_ver} agent={agent_info.get('name', '?')} v{agent_info.get('version', '?')}", flush=True)
+
+    # ── Step 2: session/new ─────────────────────────────────────────────
+    new_session_id = next_id()
+    send(proc, json.dumps({
+        "jsonrpc": "2.0",
+        "id": new_session_id,
+        "method": "session/new",
+        "params": {"cwd": CWD, "mcpServers": []},
+    }) + "\n")
+    time.sleep(1.0)
+
+    session_resp = wait_for_response(collected, new_session_id, timeout=10, label="session/new")
+    if not session_resp or "error" in session_resp:
+        print(f"\n✗ FAIL: session/new failed: {session_resp}", flush=True)
+        proc.terminate()
+        sys.exit(1)
+
+    session_id = session_resp["result"].get("sessionId")
    if not session_id:
        print("\n✗ FAIL: did not get sessionId from session/new", flush=True)
        proc.terminate()
@@ -122,31 +201,102 @@ def main():

    print(f"\n✓ Got sessionId: {session_id}", flush=True)

-    # 2. session/set_model (model already set via -m flag, but exercise the RPC)
-    send(proc, rpc("session/set_model", {"sessionId": session_id, "modelId": MODEL}, 2))
+    # ── Step 3: session/list ────────────────────────────────────────────
+    list_id = next_id()
+    send(proc, json.dumps({
+        "jsonrpc": "2.0",
+        "id": list_id,
+        "method": "session/list",
+        "params": {},
+    }) + "\n")
    time.sleep(0.5)

-    # 3. session/prompt
+    list_resp = wait_for_response(collected, list_id, timeout=5, label="session/list")
+    if not list_resp:
+        print("\n⚠ WARN: session/list timed out (non-fatal)", flush=True)
+    elif "error" in list_resp:
+        print(f"\n⚠ WARN: session/list returned error: {list_resp['error']} (non-fatal)", flush=True)
+    else:
+        sessions = list_resp["result"].get("sessions", [])
+        print(f"\n✓ session/list returned {len(sessions)} session(s)", flush=True)
+
+    # ── Step 4: session/set_config_option (model) ───────────────────────
+    # Uses the new session/set_config_option method (replaces the old session/set_model).
+    # The model is already set via -m flag, but we exercise the RPC to verify it works.
+    config_id = next_id()
+    send(proc, json.dumps({
+        "jsonrpc": "2.0",
+        "id": config_id,
+        "method": "session/set_config_option",
+        "params": {
+            "sessionId": session_id,
+            "configId": "model",
+            "value": MODEL,
+        },
+    }) + "\n")
+    time.sleep(0.5)
+
+    config_resp = wait_for_response(collected, config_id, timeout=5, label="session/set_config_option")
+    if not config_resp:
+        print("\n⚠ WARN: session/set_config_option timed out (non-fatal)", flush=True)
+    elif "error" in config_resp:
+        print(f"\n⚠ WARN: session/set_config_option returned error: {config_resp['error']} (non-fatal)", flush=True)
+    else:
+        print("\n✓ session/set_config_option accepted", flush=True)
+
+    # ── Step 5: session/prompt ──────────────────────────────────────────
+    prompt_id = next_id()
+    prompt_id_holder[0] = prompt_id
+
+    # The running reader thread was started with prompt_id=None, so it never sets
+    # done_event for this request. Poll the collected list for the response instead.
+
    prompt_params = {
        "sessionId": session_id,
        "prompt": [{"type": "text", "text": "What is 2+2? Answer in one sentence."}],
    }
-    send(proc, rpc("session/prompt", prompt_params, 3))
+    send(proc, json.dumps({
+        "jsonrpc": "2.0",
+        "id": prompt_id,
+        "method": "session/prompt",
+        "params": prompt_params,
+    }) + "\n")

-    # Wait for finished update or timeout
-    if not done_event.wait(timeout=TIMEOUT):
-        print(f"\n✗ FAIL: timed out after {TIMEOUT}s waiting for finished update", flush=True)
+    # Wait for the prompt response or timeout by polling the collected list
+    deadline = time.time() + TIMEOUT
+    prompt_resp = None
+    while time.time() < deadline:
+        for msg in collected:
+            if msg.get("id") == prompt_id and ("result" in msg or "error" in msg):
+                prompt_resp = msg
+                break
+        if prompt_resp:
+            break
+        time.sleep(0.2)
+
+    if not prompt_resp:
+        print(f"\n✗ FAIL: timed out after {TIMEOUT}s waiting for prompt response", flush=True)
        proc.terminate()
        sys.exit(1)

-    # Check we got a successful prompt response
-    prompt_resp = next((m for m in collected if m.get("id") == 3), None)
-    if prompt_resp and "error" in prompt_resp:
+    if "error" in prompt_resp:
        print(f"\n✗ FAIL: prompt returned error: {prompt_resp['error']}", flush=True)
        proc.terminate()
        sys.exit(1)

-    print("\n✓ SMOKE TEST PASSED", flush=True)
+    stop_reason = prompt_resp["result"].get("stopReason", "?")
+    print(f"\n✓ Prompt completed: stopReason={stop_reason}", flush=True)
+
+    # ── Step 7: session/cancel (no-op, prompt already done) ─────────────
+    # This is a notification (no id), so no response expected.
+    send(proc, rpc_notification("session/cancel", {"sessionId": session_id}))
+    time.sleep(0.3)
+    print("✓ session/cancel sent (no-op)", flush=True)
+
+    # ── Summary ─────────────────────────────────────────────────────────
+    # Count session updates received
+    update_count = sum(1 for m in collected if m.get("method") == "session/update")
+    print(f"\n✓ SMOKE TEST PASSED  ({update_count} session updates received)", flush=True)
    proc.terminate()
    proc.wait(timeout=5)
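
Review note: the script creates `prompt_id_holder` but starts the reader with a literal `None`, so the reader never learns the prompt ID. The sketch below illustrates the general pitfall under simplified assumptions: a value passed in `args` is copied when the thread is created, while a shared holder is read at use time. The function names are hypothetical and the ID `5` is arbitrary.

```python
import threading


def reader_by_value(prompt_id, seen):
    # prompt_id was evaluated when the thread was created; later updates are invisible.
    seen.append(prompt_id)


def reader_by_holder(holder, ready, seen):
    # Wait until the main thread has published the ID, then read it at use time.
    ready.wait(timeout=1.0)
    seen.append(holder[0])


holder = [None]
ready = threading.Event()
by_value, by_holder = [], []

t1 = threading.Thread(target=reader_by_value, args=(holder[0], by_value))
t2 = threading.Thread(target=reader_by_holder, args=(holder, ready, by_holder))
t1.start()
t2.start()

holder[0] = 5  # the prompt request ID becomes known only now
ready.set()
t1.join()
t2.join()

print(by_value, by_holder)  # prints: [None] [5]
```

Passing the holder itself, rather than `holder[0]`, is what lets the reader observe an ID that is assigned after the thread starts.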

@@ -55,7 +55,7 @@ The `Init` function receives an `ext.API` object for registering handlers, and e

 ## Lifecycle Events

-Kit provides 18 lifecycle events. Each handler receives an event struct and a `Context`.
+Kit provides 21 lifecycle events. Each handler receives an event struct and a `Context`.

 ### Session Events

@@ -93,7 +93,7 @@ api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
    // e.Response string
    // e.StopReason string — "error" (on failure), "completed" (when LLM returns
    //   empty stop reason), or the raw LLM provider value passed through
-    //   (e.g. "stop", "end_turn", "max_tokens", "tool_use").
+    //   (e.g. "stop", "length" (max output tokens hit), "tool-calls", "content-filter").
    //   To detect errors, check e.StopReason == "error".
    //   Do NOT compare against "completed" for success — instead check != "error".
 })
@@ -136,6 +136,37 @@ api.OnToolResult(func(e ext.ToolResultEvent, ctx ext.Context) *ext.ToolResultRes
 })
 ```

+### Tool Call Input Streaming Events
+
+These events fire during the LLM's tool argument generation phase, **before** the tool call is fully parsed and before `OnToolCall` fires. They enable UIs to show tool activity immediately rather than waiting for the full argument JSON to finish streaming.
+
+```go
+// Fires when the LLM begins generating tool call arguments.
+// The tool name is known but the full argument JSON is still streaming.
+api.OnToolCallInputStart(func(e ext.ToolCallInputStartEvent, ctx ext.Context) {
+    // e.ToolCallID string — stable ID for correlating tool lifecycle events
+    // e.ToolName string — name of the tool being called
+    // e.ToolKind string — "execute", "edit", "read", "search", "agent"
+    ctx.PrintInfo("Tool starting: " + e.ToolName)
+})
+
+// Fires for each streamed fragment of tool call arguments.
+// Useful for live-previewing artifact content or showing a progress indicator.
+api.OnToolCallInputDelta(func(e ext.ToolCallInputDeltaEvent, ctx ext.Context) {
+    // e.ToolCallID string
+    // e.Delta string — JSON fragment of tool arguments
+})
+
+// Fires when tool argument streaming is complete, before the tool call
+// is parsed and execution begins. Transition UI from "generating args"
+// to "executing".
+api.OnToolCallInputEnd(func(e ext.ToolCallInputEndEvent, ctx ext.Context) {
+    // e.ToolCallID string
+})
+```
+
+**Full tool lifecycle order**: `OnToolCallInputStart` → `OnToolCallInputDelta` (repeated) → `OnToolCallInputEnd` → `OnToolCall` → `OnToolExecutionStart` → `OnToolOutput` (optional, repeated) → `OnToolExecutionEnd` → `OnToolResult`
+
 ### Input Events

 ```go
@@ -80,6 +80,23 @@ host, err := kit.New(ctx, &kit.Options{
    Quiet:     true, // suppress debug output
    Debug:     true, // enable debug logging

+    // Generation parameters — override env/config/per-model defaults.
+    // Leaving a field at its zero/nil value lets the precedence chain
+    // resolve a value (KIT_* env → .kit.yml → modelSettings/customModels →
+    // 8192 floor for MaxTokens, provider defaults for samplers).
+    MaxTokens:        16384,             // 0 = auto-resolve; non-zero suppresses right-sizing
+    ThinkingLevel:    "medium",          // "off", "none", "minimal", "low", "medium", "high" ("" = default)
+    Temperature:      ptrFloat32(0.2),   // pointer so explicit 0.0 != unset
+    TopP:             nil,                // nil = leave provider/per-model default
+    TopK:             nil,                // nil = leave provider/per-model default
+    FrequencyPenalty: nil,
+    PresencePenalty:  nil,
+
+    // Provider configuration — override env/config without viper.Set workarounds.
+    ProviderAPIKey: "sk-...",                    // "" = use config / provider env var
+    ProviderURL:    "https://proxy.internal/v1", // "" = provider default endpoint
+    TLSSkipVerify:  false,                       // only true has an effect; false cannot turn off a skip set elsewhere
+
    // Session
    SessionDir:  "/path/to/project",  // base dir for session discovery (default: cwd)
    SessionPath: "/path/to/session.jsonl", // open specific session file
@@ -108,15 +125,49 @@ host, err := kit.New(ctx, &kit.Options{
    AutoCompact:       true,                        // auto-compact near context limit
    CompactionOptions: &kit.CompactionOptions{...}, // nil = defaults

-    // MCP OAuth
+    // MCP OAuth — both fields are opt-in. If MCPAuthHandler is nil,
+    // remote MCP servers that require OAuth will fail to connect with
+    // an authorization-required error instead of silently opening a
+    // browser. CLI consumers use NewCLIMCPAuthHandler; other embedders
+    // implement MCPAuthHandler or configure DefaultMCPAuthHandler.
+    MCPAuthHandler: mcpAuthHandler,             // nil = OAuth disabled
    MCPTokenStoreFactory: func(serverURL string) (kit.MCPTokenStore, error) {
        return myCustomStore(serverURL), nil  // custom OAuth token storage
    },
+
+    // In-Process MCP Servers
+    InProcessMCPServers: map[string]*kit.MCPServer{
+        "docs": mcpSrv,  // *server.MCPServer from mcp-go — no subprocess needed
+    },
 })
+
+// Tiny helper to take the address of a literal for pointer fields.
+func ptrFloat32(v float32) *float32 { return &v }
 ```

 **Critical distinction**: `Tools` replaces ALL default tools (core + MCP + extension). `ExtraTools` adds tools alongside the defaults. Use `Tools` to restrict the agent's capabilities; use `ExtraTools` to extend them.

+**In-process MCP servers** bypass subprocess spawning entirely. Pass `*server.MCPServer` instances from mcp-go via `InProcessMCPServers` or call `AddInProcessMCPServer()` at runtime.
+
+### Generation & Provider Options (cheat sheet)
+
+| Field | Type | Empty/nil means | Notes |
+|-------|------|-----------------|-------|
+| `MaxTokens` | `int` | Auto-resolve (env → config → per-model → 8192 floor) | Non-zero suppresses `rightSizeMaxTokens` |
+| `ThinkingLevel` | `string` | Auto-resolve (→ `"off"`) | Valid: `"off"`, `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"` |
+| `Temperature` | `*float32` | Leave provider/per-model default | Pointer so explicit `0.0` ≠ unset |
+| `TopP` | `*float32` | Leave provider/per-model default | |
+| `TopK` | `*int32` | Leave provider/per-model default | |
+| `FrequencyPenalty` | `*float32` | Leave provider/per-model default | OpenAI-family |
+| `PresencePenalty` | `*float32` | Leave provider/per-model default | OpenAI-family |
+| `ProviderAPIKey` | `string` | Use config / provider env var | Overrides pre-existing viper state |
+| `ProviderURL` | `string` | Use provider default endpoint | Same base URL as the `--provider-url` flag |
+| `TLSSkipVerify` | `bool` | — | Only effective when `true`; cannot force-disable via Options |
+
+These fields replace the `viper.Set("max-tokens", 16384)` workaround that many
+downstream embedders applied before calling `kit.New()`. Each field is
+documented in godoc on `kit.Options`.
+
 ---

 ## Prompt Methods
@@ -201,6 +252,25 @@ unsub := host.OnToolCall(func(e kit.ToolCallEvent) {
 })
 defer unsub()

+host.OnToolCallStart(func(e kit.ToolCallStartEvent) {
+    // Fires when the LLM begins generating tool call arguments.
+    // e.ToolCallID, e.ToolName, e.ToolKind
+    // Use this to show a "running" indicator immediately — before the
+    // full argument JSON finishes streaming (eliminates "dead air").
+})
+
+host.OnToolCallDelta(func(e kit.ToolCallDeltaEvent) {
+    // Fires for each streamed fragment of tool call arguments.
+    // e.ToolCallID, e.Delta (JSON fragment)
+    // Useful for live-previewing artifact content or progress indicators.
+})
+
+host.OnToolCallEnd(func(e kit.ToolCallEndEvent) {
+    // Fires when tool argument streaming is complete, before execution.
+    // e.ToolCallID
+    // Transition UI from "generating args" to "executing".
+})
+
 host.OnToolResult(func(e kit.ToolResultEvent) {
    // e.ToolCallID, e.ToolName, e.ToolKind, e.ToolArgs, e.ParsedArgs
    // e.Result, e.IsError, e.Metadata (*ToolResultMetadata)
@@ -211,7 +281,7 @@ host.OnToolOutput(func(e kit.ToolOutputEvent) {
    // Streaming bash output chunks
 })

-host.OnStreaming(func(e kit.MessageUpdateEvent) {
+host.OnMessageUpdate(func(e kit.MessageUpdateEvent) {
    fmt.Print(e.Chunk) // real-time text streaming
 })

@@ -226,8 +296,64 @@ host.OnTurnStart(func(e kit.TurnStartEvent) {
 host.OnTurnEnd(func(e kit.TurnEndEvent) {
    // e.Response, e.Error, e.StopReason
 })
+
+host.OnStepStart(func(e kit.StepStartEvent) {
+    // e.StepNumber — which LLM call step (1-based)
+})
+
+host.OnStepFinish(func(e kit.StepFinishEvent) {
+    // e.StepNumber, e.HasToolCalls, e.FinishReason, e.Usage (LLMUsage)
+})
+
+host.OnWarnings(func(e kit.WarningsEvent) {
+    for _, w := range e.Warnings {
+        log.Printf("warning: %s", w)
+    }
+})
+
+host.OnError(func(e kit.ErrorEvent) {
+    log.Printf("agent error: %v", e.Error)
+})
+
+host.OnRetry(func(e kit.RetryEvent) {
+    log.Printf("retrying (attempt %d): %v", e.Attempt, e.Error)
+})
+
+host.OnTextStart(func(e kit.TextStartEvent) {
+    // e.ID — content block ID
+})
+
+host.OnTextEnd(func(e kit.TextEndEvent) {
+    // e.ID — content block ID
+})
+
+host.OnReasoningStart(func(e kit.ReasoningStartEvent) {
+    // e.ID — reasoning block ID
+})
+
+host.OnSource(func(e kit.SourceEvent) {
+    // e.SourceType, e.ID, e.URL, e.Title
+})
+
+host.OnStreamFinish(func(e kit.StreamFinishEvent) {
+    // e.Usage (LLMUsage), e.FinishReason
+})
+
+// Additional typed subscribers for previously generic-only events:
+host.OnMessageStart(func(e kit.MessageStartEvent) {})
+host.OnMessageEnd(func(e kit.MessageEndEvent) { /* e.Content */ })
+host.OnReasoningDelta(func(e kit.ReasoningDeltaEvent) { /* e.Delta */ })
+host.OnReasoningComplete(func(e kit.ReasoningCompleteEvent) {})
+host.OnToolExecutionStart(func(e kit.ToolExecutionStartEvent) { /* e.ToolCallID, e.ToolName, e.ToolKind, e.ToolArgs */ })
+host.OnToolExecutionEnd(func(e kit.ToolExecutionEndEvent) { /* e.ToolCallID, e.ToolName, e.ToolKind */ })
+host.OnToolCallContent(func(e kit.ToolCallContentEvent) { /* e.Content */ })
+host.OnStepUsage(func(e kit.StepUsageEvent) { /* e.InputTokens, e.OutputTokens, e.CacheReadTokens, e.CacheWriteTokens */ })
+host.OnCompaction(func(e kit.CompactionEvent) { /* e.Summary, e.OriginalTokens, e.CompactedTokens, ... */ })
+host.OnSteerConsumed(func(e kit.SteerConsumedEvent) { /* e.Count */ })
 ```

+> **Rename note:** `OnStreaming` has been renamed to `OnMessageUpdate`. The old `OnStreaming` name is kept as a deprecated alias for one release cycle.
+
 ### Generic subscriber (receives all events)

 ```go
@@ -252,6 +378,9 @@ unsub := host.Subscribe(func(e kit.Event) {
 | `message_start` | `MessageStartEvent` | *(none)* |
 | `message_update` | `MessageUpdateEvent` | `Chunk` |
 | `message_end` | `MessageEndEvent` | `Content` |
+| `tool_call_start` | `ToolCallStartEvent` | `ToolCallID`, `ToolName`, `ToolKind` |
+| `tool_call_delta` | `ToolCallDeltaEvent` | `ToolCallID`, `Delta` |
+| `tool_call_end` | `ToolCallEndEvent` | `ToolCallID` |
 | `tool_call` | `ToolCallEvent` | `ToolCallID`, `ToolName`, `ToolKind`, `ToolArgs`, `ParsedArgs` |
 | `tool_execution_start` | `ToolExecutionStartEvent` | `ToolCallID`, `ToolName`, `ToolKind`, `ToolArgs` |
 | `tool_execution_end` | `ToolExecutionEndEvent` | `ToolCallID`, `ToolName`, `ToolKind` |
@@ -263,6 +392,39 @@ unsub := host.Subscribe(func(e kit.Event) {
 | `reasoning_delta` | `ReasoningDeltaEvent` | `Delta` |
 | `step_usage` | `StepUsageEvent` | `InputTokens`, `OutputTokens`, `CacheReadTokens`, `CacheWriteTokens` |
 | `steer_consumed` | `SteerConsumedEvent` | `Count` |
+| `step_start` | `StepStartEvent` | `StepNumber` |
+| `step_finish` | `StepFinishEvent` | `StepNumber`, `HasToolCalls`, `FinishReason`, `Usage` |
+| `text_start` | `TextStartEvent` | `ID` |
+| `text_end` | `TextEndEvent` | `ID` |
+| `reasoning_start` | `ReasoningStartEvent` | `ID` |
+| `warnings` | `WarningsEvent` | `Warnings` |
+| `source` | `SourceEvent` | `SourceType`, `ID`, `URL`, `Title` |
+| `stream_finish` | `StreamFinishEvent` | `Usage`, `FinishReason` |
+| `error` | `ErrorEvent` | `Error` |
+| `retry` | `RetryEvent` | `Attempt`, `Error` |
+| `password_prompt` | `PasswordPromptEvent` | `Prompt`, `ResponseCh` |
+
+**Tool call streaming lifecycle**: `ToolCallStartEvent` → `ToolCallDeltaEvent` (repeated) → `ToolCallEndEvent` → `ToolCallEvent` → `ToolExecutionStartEvent` → `ToolOutputEvent` (optional, repeated) → `ToolExecutionEndEvent` → `ToolResultEvent`
+
+**PasswordPromptEvent** (for sudo password handling):
+```go
+// PasswordPromptEvent fires when a sudo command needs a password.
+// The TUI should display a password prompt and send the result back via ResponseCh.
+type PasswordPromptEvent struct {
+    // Prompt is the message to display to the user.
+    Prompt string
+    // ResponseCh receives the password from the TUI.
+    // The TUI must send exactly one value: (password, false) for submit
+    // or ("", true) for cancel.
+    ResponseCh chan<- PasswordPromptResponse
+}
+
+// PasswordPromptResponse carries the password prompt result.
+type PasswordPromptResponse struct {
+    Password  string
+    Cancelled bool
+}
+```

 ### Tool kind constants

@@ -325,6 +487,20 @@ host.OnAfterTurn(kit.HookPriorityNormal, func(h kit.AfterTurnHook) {
 })
 ```

+### PrepareStep — intercept/replace messages before each LLM call
+
+```go
+host.OnPrepareStep(kit.HookPriorityNormal, func(h kit.PrepareStepHook) *kit.PrepareStepResult {
+    // h.StepNumber  — which step in the current turn (1-based)
+    // h.Messages    — []kit.LLMMessage being sent to the LLM
+    // Return nil to pass through unchanged, or replace messages:
+    modified := filterSensitiveMessages(h.Messages)
+    return &kit.PrepareStepResult{Messages: modified}
+})
+```
+
+`PrepareStep` fires before every LLM API call within a turn (including tool-call loop iterations). Unlike `ContextPrepare` (which operates on the full context window once per turn), `PrepareStep` runs per-step and sees the messages that include the latest tool results.
+
 ### ContextPrepare — filter/inject context window

 ```go
@@ -397,6 +573,8 @@ host, _ := kit.New(ctx, &kit.Options{
 |----------|-------------|
 | `kit.TextResult(content)` | Successful text result |
 | `kit.ErrorResult(content)` | Error result (LLM sees it as a tool error) |
+| `kit.ImageResult(content, data, mediaType)` | Image result with binary data (e.g. `"image/png"`) |
+| `kit.MediaResult(content, data, mediaType)` | Non-image media result (e.g. `"audio/mpeg"`) |

 **ToolOutput fields** (for advanced use):

@@ -669,9 +847,149 @@ for _, s := range servers {

 `AddMCPServer` is safe to call while the agent is idle. If a turn is in progress, new tools are visible starting from the next LLM step. Tool names are prefixed with the server name (e.g. `"github__create_issue"`).

+### In-Process MCP Servers
+
+Register mcp-go servers that run in the same process — no subprocess spawning,
+no network I/O:
+
+```go
+import (
+    "github.com/mark3labs/mcp-go/mcp"
+    "github.com/mark3labs/mcp-go/server"
+)
+
+mcpSrv := server.NewMCPServer("my-tools", "1.0.0",
+    server.WithToolCapabilities(true),
+)
+mcpSrv.AddTool(mcp.NewTool("search_docs",
+    mcp.WithDescription("Search documentation"),
+    mcp.WithString("query", mcp.Required()),
+), searchHandler)
+
+// At init time
+host, _ := kit.New(ctx, &kit.Options{
+    InProcessMCPServers: map[string]*kit.MCPServer{
+        "docs": mcpSrv,
+    },
+})
+
+// Or at runtime
+n, err := host.AddInProcessMCPServer(ctx, "docs", mcpSrv)
+```
+
+Kit does not own the server lifecycle — the caller handles cleanup. Tools are prefixed as usual (e.g. `"docs__search_docs"`).
+
+### MCP Prompts
+
+Query and expand prompts defined by connected MCP servers:
+
+```go
+// List all prompts from all connected MCP servers
+prompts := host.ListMCPPrompts()
+for _, p := range prompts {
+    fmt.Printf("%s/%s: %s\n", p.ServerName, p.Name, p.Description)
+    for _, arg := range p.Arguments {
+        fmt.Printf("  arg: %s (required: %v)\n", arg.Name, arg.Required)
+    }
+}
+
+// Expand a specific prompt with arguments
+result, err := host.GetMCPPrompt(ctx, "myserver", "code-review", map[string]string{
+    "language": "go",
+    "style":    "thorough",
+})
+if err != nil {
+    log.Fatal(err)
+}
+// result.Description — optional server description
+// result.Messages — []MCPPromptMessage with Role, Content, and FileParts
+for _, msg := range result.Messages {
+    fmt.Printf("[%s] %s\n", msg.Role, msg.Content)
+    // msg.FileParts contains binary attachments (images, embedded resources)
+}
+```
+
+### MCP Resources
+
+Read and subscribe to resources exposed by MCP servers:
+
+```go
+// List all resources from connected servers
+resources := host.ListMCPResources()
+for _, r := range resources {
+    fmt.Printf("%s: %s (%s)\n", r.URI, r.Name, r.MIMEType)
+}
+
+// Read a specific resource
+content, err := host.ReadMCPResource(ctx, "myserver", "file:///path/to/file")
+if err != nil {
+    log.Fatal(err)
+}
+if content.IsBlob {
+    // Binary content in content.BlobData
+} else {
+    // Text content in content.Text
+}
+
+// Subscribe to resource change notifications
+err = host.SubscribeMCPResource(ctx, "myserver", "file:///path/to/file")
+// Unsubscribe later
+err = host.UnsubscribeMCPResource(ctx, "myserver", "file:///path/to/file")
+```
+
+### MCP OAuth Authorization
+
+When a remote MCP server requires OAuth, Kit runs the full authorization flow
+(dynamic client registration → PKCE → user consent → token exchange → token
+persistence) but delegates the **user-facing step** — displaying the
+authorization URL and receiving the callback — to an `MCPAuthHandler`.
+
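For readers unfamiliar with the PKCE step named in the flow above, the following is a minimal standalone sketch of how a code verifier and its S256 challenge are derived under RFC 7636. It illustrates the protocol only and is not Kit's internal code; `newVerifier` and `challengeS256` are hypothetical helper names.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// newVerifier returns a random code verifier: 32 random bytes encoded as
// unpadded base64url, which yields 43 characters.
func newVerifier() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// challengeS256 derives the S256 code challenge:
// BASE64URL(SHA256(ASCII(verifier))), without padding.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	// Test vector from RFC 7636, Appendix B.
	fmt.Println(challengeS256("dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"))
	// prints E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM

	v, err := newVerifier()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(v)) // prints 43
}
```

The client sends the challenge with the authorization request and the verifier with the token exchange, so an intercepted authorization code is useless without the verifier.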
+The SDK offers four options:
+
+| Building block | When to use |
+|---|---|
+| **No handler** (`Options.MCPAuthHandler = nil`) | Default. OAuth is disabled; 401s from remote MCP servers surface as errors. Correct for library, daemon, and web-app embedders that don't want side effects. |
+| **`kit.NewCLIMCPAuthHandler()`** | CLI/TUI apps. Opens the system browser, prints status to stderr (or via `NotifyFunc`), runs a localhost callback server. This is what the `kit` binary uses. |
+| **`kit.NewDefaultMCPAuthHandler()` + `OnAuthURL`** | Custom UX. Get the transport mechanics (port reservation + callback server) from the SDK; wire your own presentation in the `OnAuthURL(serverName, authURL)` closure. |
+| **Implement `kit.MCPAuthHandler` directly** | Full control. No localhost binding — e.g. return the URL from an HTTP endpoint and have the consumer POST the callback URL back. |
+
+**CLI-style embedder (browser + stderr):**
+
+```go
+authHandler, err := kit.NewCLIMCPAuthHandler()
+if err != nil {
+    log.Fatal(err)
+}
+defer authHandler.Close() // release the reserved port
+
+host, _ := kit.New(ctx, &kit.Options{
+    MCPAuthHandler: authHandler,
+})
+```
+
+**Custom UX embedder (TUI modal, QR code, web redirect, etc.):**
+
+```go
+authHandler, _ := kit.NewDefaultMCPAuthHandler()
+authHandler.OnAuthURL = func(serverName, authURL string) {
+    // Render the URL however you like — no browser or terminal assumptions.
+    myUI.ShowAuthPrompt(serverName, authURL)
+}
+defer authHandler.Close()
+
+host, _ := kit.New(ctx, &kit.Options{
+    MCPAuthHandler: authHandler,
+})
+```
+
+**Important:** `DefaultMCPAuthHandler` with no `OnAuthURL` set will silently
+drop the authorization URL and block until the 2-minute callback timeout
+fires. Always set `OnAuthURL`, or use a higher-level wrapper like
+`CLIMCPAuthHandler`.
+
 ### MCP OAuth Token Storage

-For remote MCP servers that use OAuth, you can provide a custom token store:
+Once authorization succeeds, the resulting access/refresh tokens are persisted
+by an `MCPTokenStore`. By default tokens are written to
+`$XDG_CONFIG_HOME/.kit/mcp_tokens.json` (fallback `~/.config/.kit/mcp_tokens.json`),
+keyed by server URL, with `0600` file permissions.
+
+Provide a custom store for encrypted storage, database persistence, or
+in-memory-only flows:

 ```go
 host, _ := kit.New(ctx, &kit.Options{
@@ -681,7 +999,7 @@ host, _ := kit.New(ctx, &kit.Options{
 })
 ```

-The `MCPTokenStore` interface requires `GetToken`/`SetToken`/`DeleteToken` methods. Return `kit.ErrMCPNoToken` from `GetToken` when no token is stored. When nil (default), tokens are persisted to `$XDG_CONFIG_HOME/.kit/mcp_tokens.json`.
+The `MCPTokenStore` interface requires `GetToken`/`SetToken`/`DeleteToken` methods. Return `kit.ErrMCPNoToken` from `GetToken` when no token is stored.

 ---

@@ -852,23 +1170,53 @@ kit.Config, kit.MCPServerConfig
 // Provider types
 kit.ProviderConfig, kit.ProviderResult, kit.ModelInfo, kit.ModelCost, kit.ModelLimit

-// LLM types — concrete Kit-owned structs (no external library dependency)
+// LLM types — clean aliases (no external library dependency in consumer code)
 kit.LLMMessage      // {Role LLMMessageRole, Content string}
+kit.LLMMessagePart  // interface for message content parts
 kit.LLMMessageRole  // "user" | "assistant" | "system" | "tool"
 kit.LLMUsage        // {InputTokens, OutputTokens, TotalTokens, ReasoningTokens,
                     //  CacheCreationTokens, CacheReadTokens}
 kit.LLMResponse     // {Content, FinishReason, Usage}
 kit.LLMFilePart     // {Filename, Data []byte, MediaType}
+kit.LLMTextPart     // plain-text content part
+kit.LLMReasoningPart // reasoning/chain-of-thought content part
+kit.LLMToolCall     // {ID, Name, Input string} — execution-layer tool call (for Tool.Run)
+kit.LLMToolResponse // {Type, Content, Data, MediaType, IsError, ...} — raw tool response
+kit.LLMToolCallPart    // LLM-initiated tool invocation within a message
+kit.LLMToolResultPart  // tool result within a message
+kit.LLMToolResultOutputContent      // interface for tool result output
+kit.LLMToolResultOutputContentText  // text tool result
+kit.LLMToolResultOutputContentError // error tool result
+kit.LLMToolResultOutputContentMedia // media tool result {Data, MediaType, Text}
+kit.LLMToolResultContentType        // "text" | "error" | "media"
+kit.LLMToolInfo          // {Name, Description, Parameters, Required, Parallel}
+kit.LLMProviderOptions   // provider-specific option maps (keyed by provider name)
+kit.LLMProviderMetadata  // provider-specific response metadata
+kit.LLMPrompt            // []LLMMessage — ordered prompt sequence
+kit.LLMFinishReason      // "stop" | "length" | "tool-calls" | ...

 // Compaction types
 kit.CompactionResult, kit.CompactionOptions

 // MCP OAuth types
+kit.MCPAuthHandler         // interface: RedirectURI() + HandleAuth(ctx, server, authURL) for OAuth UX
+kit.DefaultMCPAuthHandler  // SDK-provided transport mechanics (port + callback server); set OnAuthURL hook
+kit.CLIMCPAuthHandler      // CLI wrapper around DefaultMCPAuthHandler: opens browser, prints status
+kit.NewDefaultMCPAuthHandler()         // random port, no UX side effects
+kit.NewDefaultMCPAuthHandlerWithPort() // fixed port (useful when registering a stable redirect URI)
+kit.NewCLIMCPAuthHandler()             // CLI handler: browser + stderr + localhost callback
 kit.MCPTokenStore        // interface for custom OAuth token storage
 kit.MCPToken             // OAuth token struct (access, refresh, expiry)
 kit.MCPTokenStoreFactory // func(serverURL string) (MCPTokenStore, error)
 kit.ErrMCPNoToken        // sentinel error for "no token stored"
+kit.MCPServer            // *server.MCPServer for in-process MCP transport
 kit.MCPServerStatus      // {Name string, ToolCount int}
+kit.MCPPrompt            // {Name, Description, Arguments []MCPPromptArgument, ServerName}
+kit.MCPPromptArgument    // {Name, Description string, Required bool}
+kit.MCPPromptMessage     // {Role, Content string, FileParts []LLMFilePart}
+kit.MCPPromptResult      // {Description string, Messages []MCPPromptMessage}
+kit.MCPResource          // {URI, Name, Description, MIMEType, ServerName}
+kit.MCPResourceContent   // {URI, MIMEType, Text string, BlobData []byte, IsBlob bool}

 // Conversion helpers
 msgs := kit.ConvertToLLMMessages(&msg)   // SDK Message  → []LLMMessage
@@ -919,7 +1267,7 @@ for {
 ### Pattern: Streaming output to terminal

 ```go
-host.OnStreaming(func(e kit.MessageUpdateEvent) {
+host.OnMessageUpdate(func(e kit.MessageUpdateEvent) {
    fmt.Print(e.Chunk)
 })
 response, _ := host.Prompt(ctx, "Write a poem")
@@ -1,9 +0,0 @@
-1. Hello, world!
-
-2. Testing one, two, three.
-
-3. This is a quick test message.
-
-4. Sample text for verification.
-
-5. All systems operational.
@@ -10,9 +10,10 @@ description: Complete reference for all Kit CLI subcommands.
 For OAuth-enabled providers like Anthropic.

 ```bash
-kit auth login [provider]    # Start OAuth flow (e.g., anthropic)
-kit auth logout [provider]   # Remove credentials for provider
-kit auth status              # Check authentication status
+kit auth login [provider]                # Start OAuth flow (e.g., anthropic)
+kit auth login [provider] --set-default  # Set provider's default model as system default
+kit auth logout [provider]               # Remove credentials for provider
+kit auth status                          # Check authentication status
 ```

 ## Model database
@@ -66,7 +67,7 @@ These commands are available inside the Kit TUI during an interactive session:
 | `/servers` | Show connected MCP servers |
 | `/model [name]` | Switch model or open model selector |
 | `/theme [name]` | Switch color theme or list available themes |
-| `/thinking [level]` | Set thinking level (off, minimal, low, medium, high) |
+| `/thinking [level]` | Set thinking level (off, none, minimal, low, medium, high) |
 | `/compact [focus]` | Summarize older messages to free context |
 | `/clear` | Clear conversation |
 | `/clear-queue` | Clear queued messages |
@@ -95,15 +96,19 @@ Press **ESC twice** to cancel the current operation:

 This ensures that `tool_use` and `tool_result` messages are always sent to the API as matched pairs, avoiding errors from orphaned tool calls.

+### External editor
+
+Press **Ctrl+X e** to open your `$VISUAL` or `$EDITOR` in a temporary file pre-populated with the current input text. On save and quit, the edited content replaces the input textarea. On error exit (e.g., `:cq` in Vim), the original input is preserved.
+
 ### Mid-turn steering

-Press **Ctrl+S** during streaming to inject a system-level instruction mid-turn. This allows you to steer the conversation direction without waiting for the model to finish:
+Press **Ctrl+X s** during streaming to inject a system-level instruction mid-turn. This allows you to steer the conversation direction without waiting for the model to finish:

 - Works during streaming output
 - Sends a steering instruction as a system message
 - Model continues from the interruption point with the new guidance

-Example: While the model is writing code, press Ctrl+S and type "Use async/await instead" to change the implementation approach.
+Example: While the model is writing code, press Ctrl+X s and type "Use async/await instead" to change the implementation approach.

 ## Prompt templates

@@ -134,10 +139,13 @@ Templates appear as slash commands:
 | Placeholder | Description |
 |-------------|-------------|
 | `$1`, `$2`, etc. | Individual arguments by position |
-| `$@`, `$ARGUMENTS` | All arguments joined with spaces |
+| `$@`, `$ARGUMENTS` | All arguments joined with spaces (zero or more) |
+| `$+` | All arguments joined with spaces (one or more required) |
 | `${@:N}` | Arguments from position N onwards |
 | `${@:N:L}` | L arguments starting at position N |

+Placeholders inside fenced code blocks (`` ``` ``) and inline code spans are ignored, so documentation examples won't be substituted.
+
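The placeholder rules in the table above can be sketched roughly as follows. This is not Kit's implementation: `expand` is a hypothetical helper, positions are assumed to be 1-based, a missing positional argument is assumed to expand to an empty string, and the code-block exemption is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// placeholder matches ${@:N:L}, ${@:N}, $ARGUMENTS, $@, $+ and $1, $2, ...
var placeholder = regexp.MustCompile(`\$\{@:(\d+)(?::(\d+))?\}|\$ARGUMENTS|\$@|\$\+|\$(\d+)`)

// expand substitutes placeholders in tmpl with args. It reports an error
// when $+ is used without any arguments.
func expand(tmpl string, args []string) (string, error) {
	var err error
	out := placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		sub := placeholder.FindStringSubmatch(m)
		switch {
		case m == "$@" || m == "$ARGUMENTS":
			return strings.Join(args, " ")
		case m == "$+":
			if len(args) == 0 {
				err = errors.New("$+ requires at least one argument")
			}
			return strings.Join(args, " ")
		case sub[3] != "": // $N: single argument by position
			n, _ := strconv.Atoi(sub[3])
			if n >= 1 && n <= len(args) {
				return args[n-1]
			}
			return ""
		default: // ${@:N} or ${@:N:L}: slice of arguments
			start, _ := strconv.Atoi(sub[1])
			if start < 1 {
				start = 1
			}
			if start > len(args) {
				return ""
			}
			end := len(args)
			if sub[2] != "" {
				l, _ := strconv.Atoi(sub[2])
				if l < end-(start-1) {
					end = start - 1 + l
				}
			}
			return strings.Join(args[start-1:end], " ")
		}
	})
	return out, err
}

func main() {
	out, err := expand("review $1 using ${@:2}", []string{"main.go", "strict", "fast"})
	fmt.Println(out, err) // prints: review main.go using strict fast <nil>
}
```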
 ### CLI flags

 ```bash
@@ -52,12 +52,14 @@ These flags control Kit's behavior. When a prompt is passed as a positional argu

 | Flag | Short | Default | Description |
 |------|-------|---------|-------------|
-| `--max-tokens` | — | `4096` | Maximum tokens in response |
+| `--max-tokens` | — | `8192` | Base cap for output tokens. Auto-raised per-model up to 32768 when the model's catalog ceiling is higher and no explicit value is set. |
 | `--temperature` | — | `0.7` | Randomness 0.0–1.0 |
 | `--top-p` | — | `0.95` | Nucleus sampling 0.0–1.0 |
 | `--top-k` | — | `40` | Limit top K tokens |
 | `--stop-sequences` | — | — | Custom stop sequences (comma-separated) |
-| `--thinking-level` | — | `off` | Extended thinking level: off, minimal, low, medium, high |
+| `--frequency-penalty` | — | `0.0` | Penalize frequent tokens (0.0–2.0) |
+| `--presence-penalty` | — | `0.0` | Penalize present tokens (0.0–2.0) |
+| `--thinking-level` | — | `off` | Extended thinking level: off, none, minimal, low, medium, high |

 ## System

@@ -18,7 +18,7 @@ Create `~/.kit.yml`:

 ```yaml
 model: anthropic/claude-sonnet-latest
-max-tokens: 4096
+max-tokens: 8192
 temperature: 0.7
 stream: true
 ```
@@ -28,7 +28,7 @@ stream: true
 | Key | Type | Default | Description |
 |-----|------|---------|-------------|
 | `model` | string | `anthropic/claude-sonnet-latest` | Model to use (provider/model format) |
-| `max-tokens` | int | `4096` | Maximum tokens in response |
+| `max-tokens` | int | `8192` | Base cap for output tokens. Auto-raised per-model up to 32768 when the model's catalog ceiling is higher and no explicit value is set. Use [`modelSettings[provider/model].maxTokens`](#per-model-settings) to override per-model. |
 | `temperature` | float | `0.7` | Randomness 0.0–1.0 |
 | `top-p` | float | `0.95` | Nucleus sampling 0.0–1.0 |
 | `top-k` | int | `40` | Limit top K tokens |
@@ -37,10 +37,12 @@ stream: true
 | `compact` | bool | `false` | Enable compact output mode |
 | `system-prompt` | string | — | System prompt text or file path |
 | `max-steps` | int | `0` | Maximum agent steps (0 = unlimited) |
-| `thinking-level` | string | `off` | Extended thinking: off, minimal, low, medium, high |
+| `thinking-level` | string | `off` | Extended thinking: off, none, minimal, low, medium, high |
 | `provider-api-key` | string | — | API key for the provider |
 | `provider-url` | string | — | Base URL for provider API |
 | `tls-skip-verify` | bool | `false` | Skip TLS certificate verification |
+| `frequency-penalty` | float | `0.0` | Penalize frequent tokens (0.0–2.0) |
+| `presence-penalty` | float | `0.0` | Penalize present tokens (0.0–2.0) |
 | `stop-sequences` | list | — | Custom stop sequences |
 | `theme` | object or string | — | UI theme ([inline overrides or file path](/themes)) |
 | `prompt-templates` | bool | `true` | Enable prompt template loading |
@@ -81,6 +83,11 @@ mcpServers:
  search:
    type: remote
    url: "https://mcp.example.com/search"
+
+  pubmed:
+    type: remote
+    url: "https://pubmed.mcp.example.com"
+    noOAuth: true  # skip OAuth for public servers
 ```

 ### MCP server fields
@@ -93,6 +100,7 @@ mcpServers:
 | `url` | string | URL for remote servers |
 | `allowedTools` | list | Whitelist of tool names to expose |
 | `excludedTools` | list | Blacklist of tool names to hide |
+| `noOAuth` | bool | Skip OAuth for this server (for public servers that don't require auth) |

 A legacy format with `transport`, `args`, `env`, and `headers` fields is also supported.

@@ -144,6 +152,53 @@ kit --provider-url "http://localhost:8080/v1" --model custom/my-model "Hello"

 When `--provider-url` is specified without `--model`, Kit defaults to `custom/custom` which has zero cost tracking and a 262K context window.

+## Per-model settings
+
+Override generation parameters and system prompt on a per-model basis using `modelSettings`:
+
+```yaml
+modelSettings:
+  anthropic/claude-sonnet-4-5-20250929:
+    temperature: 0.3
+    maxTokens: 8192
+    systemPrompt: "You are a concise coding assistant."
+  openai/gpt-4o:
+    temperature: 0.7
+    frequencyPenalty: 0.5
+```
+
+### Per-model fields
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `temperature` | float | Temperature override for this model |
+| `maxTokens` | int | Max output tokens override |
+| `topP` | float | Top-p override |
+| `topK` | int | Top-k override |
+| `frequencyPenalty` | float | Frequency penalty override |
+| `presencePenalty` | float | Presence penalty override |
+| `stopSequences` | list | Stop sequences override |
+| `thinkingLevel` | string | Thinking level override |
+| `systemPrompt` | string | Per-model system prompt (used when no explicit prompt is set) |
+
+Settings from `modelSettings` and `customModels.params` act as model-level defaults — explicit CLI flags, `KIT_*` environment variables, global config values, and SDK `Options.*` fields all take precedence over them.
+
+When switching models via `/model` or `SetModel()`, if the new model has a per-model system prompt and no custom global prompt was set, the per-model prompt automatically replaces the previous one.
+
+### Precedence summary
+
+For the generation and provider parameters documented above, the resolved value at runtime comes from the first source that sets it:
+
+1. CLI flag (e.g. `--max-tokens`, `--temperature`, `--provider-api-key`)
+2. SDK `Options.X` when embedding Kit as a library (`kit.Options.MaxTokens`, `Temperature`, `ProviderAPIKey`, etc.)
+3. `KIT_*` environment variable (`KIT_MAX_TOKENS`, `KIT_TEMPERATURE`, ...)
+4. `.kit.yml` / `.kit.yaml` / `.kit.json` (project-local, then global)
+5. Per-model defaults (`modelSettings[provider/model]` / `customModels[...].params`)
+6. Provider-level defaults (e.g. Anthropic's own temperature default)
+7. SDK last-resort default: currently an 8192 output-token cap matching the CLI `--max-tokens` default, auto-raised per-model up to 32768 when the model's catalog ceiling is higher
+
+See the [SDK options reference](/sdk/options) for the full list of `kit.Options` fields that map to these keys.
+
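The "first source that sets it wins" rule, together with the last-resort output-token default, can be sketched as below. This is an illustrative model only, not Kit's resolver: `source`, `resolve`, and `fallbackMaxTokens` are hypothetical names, and keeping the 8192 base when a model's catalog ceiling is lower is an assumption the docs do not spell out.

```go
package main

import "fmt"

const (
	baseMaxTokens = 8192  // documented base cap
	autoRaiseCap  = 32768 // documented upper bound for the auto-raise
)

// source is one layer of the precedence chain; value is nil when that
// layer does not set the parameter.
type source struct {
	name  string
	value *int
}

// resolve returns the value and name of the first source that sets the
// parameter, or def when none of them does.
func resolve(def int, sources ...source) (int, string) {
	for _, s := range sources {
		if s.value != nil {
			return *s.value, s.name
		}
	}
	return def, "default"
}

// fallbackMaxTokens models the last-resort default: the base cap, raised
// to the model's catalog ceiling (at most autoRaiseCap) when that is higher.
func fallbackMaxTokens(catalogCeiling int) int {
	if catalogCeiling <= baseMaxTokens {
		return baseMaxTokens
	}
	if catalogCeiling > autoRaiseCap {
		return autoRaiseCap
	}
	return catalogCeiling
}

func intPtr(v int) *int { return &v }

func main() {
	// Sources are listed highest precedence first, mirroring the list above.
	v, from := resolve(fallbackMaxTokens(64000),
		source{"cli flag", nil},
		source{"sdk option", nil},
		source{"env", intPtr(16000)},
		source{"config file", intPtr(4096)},
		source{"per-model", intPtr(2048)},
	)
	fmt.Println(v, from) // prints: 16000 env

	v, from = resolve(fallbackMaxTokens(64000))
	fmt.Println(v, from) // prints: 32768 default
}
```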
 ## Theme configuration

 ```yaml
@@ -37,7 +37,7 @@ internal/acpserver/  - ACP (Agent Client Protocol) server
 internal/clipboard/  - Cross-platform clipboard operations
 internal/compaction/ - Conversation compaction and summarization
 internal/config/     - Configuration management
-internal/core/       - Built-in tools (bash, read, write, edit, grep, find, ls)
+internal/core/       - Built-in tools (bash with sudo password prompt, read, write, edit, grep, find, ls)
 internal/extensions/ - Yaegi extension system
 internal/kitsetup/   - Initial setup wizard
 internal/message/    - Message content types and structured content blocks
@@ -7,7 +7,7 @@ description: All extension capabilities — lifecycle events, tools, commands, w

 ## Lifecycle events

-Extensions can hook into 23 lifecycle events:
+Extensions can hook into 26 lifecycle events:

 | Event | Description |
 |-------|-------------|
@@ -17,6 +17,9 @@ Extensions can hook into 23 lifecycle events:
 | `OnAgentStart` | Agent loop started |
 | `OnAgentEnd` | Agent loop completed |
 | `OnToolCall` | Tool call requested by the model |
+| `OnToolCallInputStart` | LLM began generating tool call arguments (tool name known, args streaming) |
+| `OnToolCallInputDelta` | Streamed JSON fragment of tool call arguments |
+| `OnToolCallInputEnd` | Tool argument streaming complete, before execution begins |
 | `OnToolExecutionStart` | Tool execution beginning |
 | `OnToolOutput` | Streaming tool output chunk (for long-running tools) |
 | `OnToolExecutionEnd` | Tool execution completed |
@@ -57,8 +57,9 @@ These examples demonstrate the new bridged SDK APIs that give extensions access

 | Extension | Description |
 |-----------|-------------|
-| [`conversation-manager.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/conversation-manager.go) | **NEW** Tree navigation (`GetTreeNode`, `GetCurrentBranch`, `NavigateTo`), branch summarization (`SummarizeBranch`), and fresh context loops (`CollapseBranch`) |
-| [`prompt-templates.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/prompt-templates.go) | **NEW** Frontmatter-driven templates with model fallback chains (`ResolveModelChain`), skill injection (`InjectSkillAsContext`), and template parsing (`ParseTemplate`, `RenderTemplate`) |
+| [`bridge-demo.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/bridge_demo.go) | Comprehensive demo of all bridged APIs — tree navigation, skill loading, template parsing, and model resolution |
+| [`conversation-manager.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/conversation-manager.go) | Tree navigation (`GetTreeNode`, `GetCurrentBranch`, `NavigateTo`), branch summarization (`SummarizeBranch`), and fresh context loops (`CollapseBranch`) |
+| [`prompt-templates.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/prompt-templates.go) | Frontmatter-driven templates with model fallback chains (`ResolveModelChain`), skill injection (`InjectSkillAsContext`), and template parsing (`ParseTemplate`, `RenderTemplate`) |

 ## Themes

@@ -13,8 +13,9 @@ A powerful, extensible AI coding agent CLI with multi-provider support, built-in
 ## Features

 - **Multi-Provider LLM Support** — Anthropic, OpenAI, Google Gemini, Ollama, Azure OpenAI, AWS Bedrock, OpenRouter, and more
-- **Built-in Core Tools** — bash, read, write, edit, grep, find, ls, subagent with no MCP overhead
-- **MCP Integration** — Connect external MCP servers for expanded capabilities
+- **Built-in Core Tools** — bash (with interactive sudo password prompt), read, write, edit, grep, find, ls, subagent with no MCP overhead
+- **Smart @ Attachments** — Binary files auto-detected via MIME type, MCP resources via `@mcp:server:uri`
+- **MCP Integration** — Connect external MCP servers for expanded capabilities (tools, prompts, and resources)
 - **Extension System** — Write custom tools, commands, widgets, and UI modifications in Go
 - **Interactive TUI** — Rich terminal interface powered by Bubble Tea with streaming, syntax highlighting, and custom rendering
 - **Session Management** — Tree-based conversation history with branching support