Make subagent inherit tools from parent (#51)

While the main agent's tool list could be controlled by several options, subagents were previously equipped with all available tools (except the subagent tool itself). With this change, the subagent takes its tool list from the parent, minus the subagent tool itself.

Signed-off-by: Egbert Eich <eich@suse.com>
feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53) (#54)
2026-06-14 03:30:26 +00:00 · 2026-06-09 16:28:01 +03:00 · 2026-06-09 16:18:10 +03:00 · 2026-06-08 00:21:20 +03:00 · 2026-06-07 22:03:51 +03:00
34 changed files with 2587 additions and 38 deletions
@@ -228,6 +228,10 @@ kit auth login [provider] --set-default  # Set provider's default model as syste
 kit auth logout [provider]         # Remove credentials for provider
 kit auth status                    # Check authentication status

+# GitHub Copilot login (experimental; requires active Copilot subscription)
+kit auth login copilot
+kit --model copilot/gpt-5.5 "Hello"
+
 # Model database
 kit models [provider]        # List available models (optionally filter by provider)
 kit models --all             # Show all providers (not just LLM-compatible)
@@ -308,12 +312,15 @@ kit -e examples/extensions/minimal.go

 ### Extension Capabilities

-**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnToolCall, OnToolCallInputStart, OnToolCallInputDelta, OnToolCallInputEnd, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd
+**Lifecycle Events**: OnSessionStart, OnSessionShutdown, OnBeforeAgentStart, OnAgentStart, OnAgentEnd, OnLLMUsage, OnToolCall, OnToolCallInputStart, OnToolCallInputDelta, OnToolCallInputEnd, OnToolExecutionStart, OnToolOutput, OnToolExecutionEnd, OnToolResult, OnInput, OnMessageStart, OnMessageUpdate, OnMessageEnd, OnModelChange, OnContextPrepare, OnBeforeFork, OnBeforeSessionSwitch, OnBeforeCompact, OnCustomEvent, OnSubagentStart, OnSubagentChunk, OnSubagentEnd
+
+`OnAgentEnd` carries per-turn aggregates (`ToolCallCount`, `ToolNames`, `LLMCallCount`, `InputTokensDelta`, `OutputTokensDelta`, `CostDelta`, `DurationMs`) so observers don't need to maintain parallel bookkeeping. `OnLLMUsage` fires after each LLM provider call with token + cost deltas attributed to that specific call/model — use it for accurate budget enforcement *between* calls instead of waiting for the turn to finish.

 **Custom Components**:
 - **Tools**: Add new tools the LLM can invoke
 - **Commands**: Register slash commands (e.g., `/mycommand`)
 - **Options**: Register configurable extension options
+- **Session State**: Last-write-wins key-value store via `ctx.SetState` / `GetState` / `DeleteState` / `ListState`, persisted to a per-session sidecar file outside the conversation tree
 - **Widgets**: Persistent status displays above/below input
 - **Headers/Footers**: Persistent content above/below the conversation
 - **Status Bar**: Custom status bar entries
@@ -369,6 +376,7 @@ See the `examples/extensions/` directory:
 - [`tool-logger.go`](examples/extensions/tool-logger.go) - Log all tool calls
 - [`neon-theme.go`](examples/extensions/neon-theme.go) - Custom theme registration and switching
 - [`tool-renderer-demo.go`](examples/extensions/tool-renderer-demo.go) - Custom tool call rendering
+- [`usage-budget.go`](examples/extensions/usage-budget.go) - Per-call usage callback (`OnLLMUsage`), session state, and enriched `OnAgentEnd` per-turn report
 - [`widget-status.go`](examples/extensions/widget-status.go) - Persistent status widgets

 Also see [`.kit/extensions/go-edit-lint.go`](.kit/extensions/go-edit-lint.go) (in this repo) for a project-local extension example that runs gopls and golangci-lint on Go file edits.
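The README text above recommends `OnLLMUsage` for budget enforcement between calls rather than at the end of a turn. The sketch below illustrates that idea in isolation. It does not use the kit extension API: `usageEvent` and `budget` are hypothetical names invented for this example, and the cap and costs are made-up values.

```go
package main

import "fmt"

// usageEvent is a hypothetical stand-in for a per-call usage payload.
// The real event type in the extension API may carry different fields.
type usageEvent struct {
	Model string
	Cost  float64 // cost delta of this single call, in USD
}

// budget accumulates per-call cost deltas against a fixed cap.
type budget struct {
	capUSD float64
	spent  float64
}

// record adds one call's cost and reports whether spending now exceeds the cap.
func (b *budget) record(e usageEvent) bool {
	b.spent += e.Cost
	return b.spent > b.capUSD
}

func main() {
	b := &budget{capUSD: 0.05}
	calls := []usageEvent{
		{Model: "model-a", Cost: 0.02},
		{Model: "model-a", Cost: 0.02},
		{Model: "model-b", Cost: 0.02},
	}
	// Checking after every call lets the observer react mid-turn,
	// instead of discovering the overrun only once the turn has ended.
	for i, e := range calls {
		if b.record(e) {
			fmt.Printf("call %d (%s): over budget at $%.2f\n", i+1, e.Model, b.spent)
			break
		}
	}
}
```

An observer that only looked at per-turn aggregates would see the same total, but only after all three calls had already been paid for.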
@@ -949,6 +957,7 @@ npm/                 - NPM package wrapper for distribution

 - **Anthropic** - Claude models (native, prompt caching, OAuth)
 - **OpenAI** - GPT models
+- **Copilot** - GitHub Copilot models (`copilot`, requires active Copilot subscription)
 - **Google** - Gemini models
 - **Ollama** - Local models
 - **Azure OpenAI** - Azure-hosted OpenAI
@@ -31,10 +31,12 @@ using OAuth flows. Stored credentials take precedence over environment variables
 Available providers:
  - anthropic: Anthropic Claude API (OAuth)
  - openai:    OpenAI API (OAuth and API key)
+  - copilot:   GitHub Copilot (GitHub device login)

 Examples:
  kit auth login anthropic
  kit auth login openai
+  kit auth login copilot
  kit auth logout anthropic
  kit auth status`,
 }
@@ -54,6 +56,7 @@ environment variables when making API calls.
 Available providers:
  - anthropic: Anthropic Claude API (OAuth)
  - openai:    OpenAI ChatGPT Plus/Pro (Codex OAuth)
+  - copilot:   GitHub Copilot (GitHub device login, experimental)

 Flags:
  --set-default   Set this provider's default model as the system default
@@ -61,7 +64,8 @@ Flags:
 Examples:
  kit auth login anthropic
  kit auth login openai
-  kit auth login openai --set-default`,
+  kit auth login copilot
+  kit auth login copilot --set-default`,
 	Args: cobra.ExactArgs(1),
 	RunE: runAuthLogin,
 }
@@ -80,10 +84,12 @@ You will need to use environment variables or command-line flags for authenticat
 Available providers:
  - anthropic: Anthropic Claude API
  - openai:    OpenAI API
+  - copilot:   GitHub Copilot

 Example:
  kit auth logout anthropic
-  kit auth logout openai`,
+  kit auth logout openai
+  kit auth logout copilot`,
 	Args: cobra.ExactArgs(1),
 	RunE: runAuthLogout,
 }
@@ -113,6 +119,7 @@ var (
 var defaultModels = map[string]string{
 	"anthropic": "anthropic/claude-sonnet-4-5-20250929",
 	"openai":    "openai/gpt-5.4",
+	"copilot":   "copilot/gpt-5.5",
 }

 // setDefaultModelIfRequested sets the default model for the given provider
@@ -143,6 +150,7 @@ func init() {
 	authLoginCmd.Flags().BoolVar(&loginSetDefault, "set-default", false, "Set this provider's default model as the system default after login")
 }

+// runAuthLogin dispatches OAuth login to the selected provider.
 func runAuthLogin(cmd *cobra.Command, args []string) error {
 	provider := strings.ToLower(args[0])

@@ -151,8 +159,10 @@ func runAuthLogin(cmd *cobra.Command, args []string) error {
 		return loginAnthropic()
 	case "openai":
 		return loginOpenAI()
+	case "copilot":
+		return loginCopilot(cmd.Context())
 	default:
-		return fmt.Errorf("unsupported provider: %s. Available providers: anthropic, openai", provider)
+		return fmt.Errorf("unsupported provider: %s. Available providers: anthropic, openai, copilot", provider)
 	}
 }

@@ -164,8 +174,10 @@ func runAuthLogout(cmd *cobra.Command, args []string) error {
 		return logoutAnthropic()
 	case "openai":
 		return logoutOpenAI()
+	case "copilot":
+		return logoutCopilot()
 	default:
-		return fmt.Errorf("unsupported provider: %s. Available providers: anthropic, openai", provider)
+		return fmt.Errorf("unsupported provider: %s. Available providers: anthropic, openai, copilot", provider)
 	}
 }

@@ -244,9 +256,31 @@ func runAuthStatus(cmd *cobra.Command, args []string) error {
 		}
 	}

+	// Check GitHub Copilot credentials
+	fmt.Print("\nGitHub Copilot: ")
+	if hasCopilotCreds, err := cm.HasCopilotCredentials(); err != nil {
+		fmt.Printf("Error checking credentials: %v\n", err)
+	} else if hasCopilotCreds {
+		if creds, err := cm.GetCopilotCredentials(); err != nil {
+			fmt.Printf("Error reading credentials: %v\n", err)
+		} else {
+			status := "✓ Authenticated"
+			if creds.IsExpired() {
+				status = "⚠️  Token expired (will refresh automatically)"
+			} else if creds.NeedsRefresh() {
+				status = "⚠️  Token expires soon (will refresh automatically)"
+			}
+
+			fmt.Printf("%s (GitHub OAuth, stored %s)\n", status, creds.CreatedAt.Format("2006-01-02 15:04:05"))
+		}
+	} else {
+		fmt.Println("✗ Not authenticated")
+	}
+
 	fmt.Println("\nTo authenticate with a provider:")
 	fmt.Println("  kit auth login anthropic")
 	fmt.Println("  kit auth login openai")
+	fmt.Println("  kit auth login copilot")

 	return nil
 }
@@ -517,6 +551,85 @@ func loginOpenAI() error {
 	return nil
 }

+// loginCopilot authenticates GitHub Copilot using GitHub device flow.
+func loginCopilot(ctx context.Context) error {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+
+	cm, err := kit.NewCredentialManager()
+	if err != nil {
+		return fmt.Errorf("failed to initialize credential manager: %w", err)
+	}
+
+	if hasAuth, err := cm.HasCopilotCredentials(); err == nil && hasAuth {
+		var reauth bool
+		err := huh.NewConfirm().
+			Title("You are already authenticated with GitHub Copilot").
+			Description("Do you want to re-authenticate?").
+			Affirmative("Yes").
+			Negative("No").
+			Value(&reauth).
+			Run()
+		if err != nil {
+			return fmt.Errorf("failed to prompt for re-authentication: %w", err)
+		}
+		if !reauth {
+			fmt.Println("Authentication cancelled.")
+			return nil
+		}
+	}
+
+	client := auth.NewCopilotOAuthClient()
+
+	fmt.Println("🔐 Starting GitHub Copilot authentication...")
+	fmt.Println("This uses GitHub device login and requires an active GitHub Copilot subscription.")
+	fmt.Println("Experimental: this uses VS Code Copilot Chat client identifiers.")
+	fmt.Println()
+
+	deviceCode, err := client.StartDeviceFlow(ctx)
+	if err != nil {
+		return fmt.Errorf("failed to start GitHub device login: %w", err)
+	}
+
+	fmt.Println("📱 Open this page and enter the code:")
+	fmt.Printf("\n%s\n\n", deviceCode.VerificationURI)
+	fmt.Printf("Code: %s\n\n", deviceCode.UserCode)
+	auth.TryOpenBrowser(deviceCode.VerificationURI)
+
+	fmt.Println("Waiting for GitHub authorization...")
+	githubToken, err := client.PollDeviceToken(ctx, deviceCode)
+	if err != nil {
+		return fmt.Errorf("failed to complete GitHub device login: %w", err)
+	}
+
+	fmt.Println("\n🔄 Exchanging GitHub token for Copilot access token...")
+	creds, err := client.ExchangeGitHubToken(ctx, githubToken)
+	if err != nil {
+		return fmt.Errorf("failed to get GitHub Copilot token: %w", err)
+	}
+
+	if err := cm.SetCopilotOAuthCredentials(creds); err != nil {
+		return fmt.Errorf("failed to store credentials: %w", err)
+	}
+
+	fmt.Println("✅ Successfully authenticated with GitHub Copilot!")
+	fmt.Printf("📁 Credentials stored in: %s\n", cm.GetCredentialsPath())
+	fmt.Println("\n🎉 Your GitHub Copilot credentials will now be used for copilot/* models.")
+	fmt.Println("💡 You can check your authentication status with: kit auth status")
+
+	if err := setDefaultModelIfRequested("copilot"); err != nil {
+		return err
+	}
+
+	if !loginSetDefault {
+		fmt.Println("\n💡 To set Copilot as your default model, run:")
+		fmt.Println("   kit auth login copilot --set-default")
+	}
+
+	return nil
+}
+
 // callbackServer holds the HTTP server and channel for receiving the OAuth callback
 type callbackServer struct {
 	Server   *http.Server
@@ -635,3 +748,43 @@ func logoutOpenAI() error {

 	return nil
 }
+
+func logoutCopilot() error {
+	cm, err := kit.NewCredentialManager()
+	if err != nil {
+		return fmt.Errorf("failed to initialize credential manager: %w", err)
+	}
+
+	hasAuth, err := cm.HasCopilotCredentials()
+	if err != nil {
+		return fmt.Errorf("failed to check authentication status: %w", err)
+	}
+
+	if !hasAuth {
+		fmt.Println("You are not currently authenticated with GitHub Copilot.")
+		return nil
+	}
+
+	var confirm bool
+	err = huh.NewConfirm().
+		Title("Remove GitHub Copilot credentials").
+		Description("Are you sure you want to remove your stored credentials?").
+		Affirmative("Yes").
+		Negative("No").
+		Value(&confirm).
+		Run()
+	if err != nil || !confirm {
+		fmt.Println("Logout cancelled.")
+		return nil
+	}
+
+	if err := cm.RemoveCopilotCredentials(); err != nil {
+		return fmt.Errorf("failed to remove credentials: %w", err)
+	}
+
+	fmt.Println("✓ Successfully logged out from GitHub Copilot!")
+	fmt.Println("You will need to authenticate again with 'kit auth login copilot'.")
+	fmt.Println("Tip: this removes local credentials only. Revoke the GitHub OAuth grant at https://github.com/settings/applications")
+
+	return nil
+}
@@ -190,6 +190,18 @@ func buildInteractiveExtensionContext(deps extensionContextDeps) extensions.Cont
 		GetEntries: func(entryType string) []extensions.ExtensionEntry {
 			return kitInstance.Extensions().GetEntries(entryType)
 		},
+		SetState: func(key string, value string) {
+			kitInstance.Extensions().SetState(key, value)
+		},
+		GetState: func(key string) (string, bool) {
+			return kitInstance.Extensions().GetState(key)
+		},
+		DeleteState: func(key string) {
+			kitInstance.Extensions().DeleteState(key)
+		},
+		ListState: func() []string {
+			return kitInstance.Extensions().ListState()
+		},
 		SetEditorText: func(text string) {
 			appInstance.SetEditorTextFromExtension(text)
 		},
@@ -735,12 +735,27 @@ func runNormalMode(ctx context.Context) error {
 		viper.Set("model", "custom/custom")
 	}

-	// When --provider-url is set with an explicit --model that lacks a provider
-	// prefix (no "/"), auto-prefix with "custom/" for OpenAI-compatible endpoints.
+	// When --provider-url is set with an explicit --model, route through the
+	// "custom" provider (OpenAI-compatible wire). This honors the user's
+	// intent: passing a custom URL means "use THIS endpoint", not "speak
+	// the Google/Anthropic/etc. wire protocol against this endpoint".
+	//
+	// Any provider prefix on the model is stripped so a model name that
+	// happens to collide with a known provider (e.g. `google/gemma-4-12b`
+	// served by LM Studio) still resolves correctly. A model already given
+	// as `custom/<name>` is left unchanged. If you genuinely need to point a
+	// non-OpenAI wire (Anthropic, Google, ...) at a proxy URL, configure the
+	// proxy as that provider in your config file instead.
 	if viper.GetString("provider-url") != "" && modelFlagChanged {
 		model := viper.GetString("model")
-		if model != "" && !strings.Contains(model, "/") {
-			viper.Set("model", "custom/"+model)
+		if model != "" {
+			name := model
+			if _, after, ok := strings.Cut(model, "/"); ok {
+				name = after
+			}
+			if !strings.HasPrefix(model, "custom/") {
+				viper.Set("model", "custom/"+name)
+			}
 		}
 	}
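The model rewrite in this hunk can be exercised on its own. The sketch below extracts the same logic into a pure function so the mapping is easy to check. The function name `rewriteForCustomURL` is hypothetical, and the viper lookups and the `modelFlagChanged` guard from the real code are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteForCustomURL applies the same rule as the hunk above: an empty model
// or one already prefixed with "custom/" is returned unchanged; otherwise the
// first "/"-separated segment (if any) is dropped and "custom/" is prepended.
func rewriteForCustomURL(model string) string {
	if model == "" || strings.HasPrefix(model, "custom/") {
		return model
	}
	name := model
	if _, after, ok := strings.Cut(model, "/"); ok {
		name = after
	}
	return "custom/" + name
}

func main() {
	for _, m := range []string{"gemma-4-12b", "google/gemma-4-12b", "custom/local", "org/sub/model"} {
		fmt.Printf("%q -> %q\n", m, rewriteForCustomURL(m))
	}
}
```

Note that only the first segment is stripped, so a name with several slashes keeps everything after the first one.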

@@ -916,6 +931,9 @@ func runNormalMode(ctx context.Context) error {
 			startupExtensionMessages = append(startupExtensionMessages, text)
 		}
 		kitInstance.Extensions().SetContext(extCtx)
+		if err := kitInstance.Extensions().InitStatePersistence(); err != nil {
+			log.Printf("WARN extension state init failed: %v", err)
+		}
 		kitInstance.Extensions().EmitSessionStart()

 		// Restore normal print functions for runtime use.
@@ -58,6 +58,7 @@ kit install github.com/mark3labs/kit/examples/extensions --local
 | `project-rules.go` | Project-specific rules | Session data, file reading |
 | `protected-paths.go` | Block dangerous operations | `OnToolCall` with blocking |
 | `permission-gate.go` | Confirm destructive actions | `OnToolCall` with confirmation |
+| `usage-budget.go` | Soft cost cap + per-turn report | `OnLLMUsage`, `SetState`/`GetState`, enriched `AgentEndEvent` |

 ### Tools & Commands

@@ -0,0 +1,87 @@
+//go:build ignore
+
+package main
+
+import (
+	"fmt"
+	"strconv"
+
+	"kit/ext"
+)
+
+// Init demonstrates the three primitives added in issue #53:
+//
+//  1. api.OnLLMUsage(...) — per-LLM-call usage callback with token + cost
+//     deltas. Use this for budget enforcement that reacts between calls
+//     within a single agent turn, rather than only at turn boundaries.
+//
+//  2. ctx.SetState / ctx.GetState / ctx.DeleteState / ctx.ListState —
+//     last-write-wins, session-scoped key-value store backed by a sidecar
+//     file. Use this for snapshot state (current value of X) instead of
+//     ctx.AppendEntry, which is append-only and bloats branch reads.
+//
+//  3. ext.AgentEndEvent.ToolCallCount / .ToolNames / .LLMCallCount /
+//     .InputTokensDelta / .OutputTokensDelta / .CostDelta / .DurationMs —
+//     per-turn aggregates so observer extensions don't need to maintain
+//     parallel bookkeeping.
+//
+// Together these support a simple soft-budget cap: warn when the
+// cumulative cost in this session exceeds a threshold, and print a
+// per-turn report on AgentEnd.
+//
+// Usage: kit -e examples/extensions/usage-budget.go
+func Init(api ext.API) {
+	const warnAtKey = "usage-budget:warn-at-usd"
+
+	// 1. Print per-LLM-call usage with provider, model, and cost.
+	api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
+		ctx.Print(fmt.Sprintf(
+			"[usage] step=%d %s/%s tokens=↑%d ↓%d cache=↑%d/↓%d cost=$%.4f (%s)",
+			e.StepNumber, e.Provider, e.Model,
+			e.InputTokens, e.OutputTokens,
+			e.CacheWriteTokens, e.CacheReadTokens,
+			e.Cost, e.FinishReason,
+		))
+
+		// 2. Persist running total in last-write-wins state.
+		current := 0.0
+		if raw, ok := ctx.GetState("usage-budget:total-cost"); ok {
+			current, _ = strconv.ParseFloat(raw, 64)
+		}
+		current += e.Cost
+		ctx.SetState("usage-budget:total-cost", strconv.FormatFloat(current, 'f', 6, 64))
+
+		// Soft warn-at threshold (configurable via state).
+		warnAt := 0.50
+		if raw, ok := ctx.GetState(warnAtKey); ok {
+			if v, err := strconv.ParseFloat(raw, 64); err == nil {
+				warnAt = v
+			}
+		}
+		if current > warnAt {
+			ctx.PrintError(fmt.Sprintf(
+				"[usage] session cost $%.4f exceeds soft cap $%.2f",
+				current, warnAt,
+			))
+		}
+	})
+
+	// 3. Print a per-turn summary using the enriched AgentEndEvent.
+	api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
+		ctx.Print(fmt.Sprintf(
+			"[turn] stop=%s tools=%d llm-calls=%d tokens=↑%d ↓%d cost=$%.4f duration=%dms",
+			e.StopReason, e.ToolCallCount, e.LLMCallCount,
+			e.InputTokensDelta, e.OutputTokensDelta, e.CostDelta, e.DurationMs,
+		))
+		if len(e.ToolNames) > 0 {
+			ctx.Print(fmt.Sprintf("[turn] tool order: %v", e.ToolNames))
+		}
+	})
+
+	// Bootstrap default soft cap once per session.
+	api.OnSessionStart(func(e ext.SessionStartEvent, ctx ext.Context) {
+		if _, ok := ctx.GetState(warnAtKey); !ok {
+			ctx.SetState(warnAtKey, "0.50")
+		}
+	})
+}
@@ -1,6 +1,7 @@
 package auth

 import (
+	"context"
 	"encoding/json"
 	"fmt"
 	"os"
@@ -9,11 +10,11 @@ import (
 	"time"
 )

-// CredentialStore holds all stored credentials for various providers.
-// Currently supports Anthropic and OpenAI credentials with both OAuth and API key authentication methods.
+// CredentialStore holds stored credentials for Anthropic, OpenAI, and GitHub Copilot.
 type CredentialStore struct {
 	Anthropic *AnthropicCredentials `json:"anthropic,omitempty"`
 	OpenAI    *OpenAICredentials    `json:"openai,omitempty"`
+	Copilot   *CopilotCredentials   `json:"copilot,omitempty"`
 }

 // AnthropicCredentials holds Anthropic API credentials supporting both OAuth
@@ -43,6 +44,16 @@ type OpenAICredentials struct {
 	CreatedAt    time.Time `json:"created_at"`
 }

+// CopilotCredentials holds GitHub OAuth credentials and the short-lived
+// GitHub Copilot API token derived from them.
+type CopilotCredentials struct {
+	Type               string    `json:"type"`                           // "oauth"
+	GitHubToken        string    `json:"github_token,omitempty"`         // GitHub device-flow OAuth token
+	CopilotAccessToken string    `json:"copilot_access_token,omitempty"` // Short-lived Copilot API token
+	ExpiresAt          int64     `json:"expires_at,omitempty"`           // Copilot token expiry
+	CreatedAt          time.Time `json:"created_at"`
+}
+
 // oauthTokenExpired reports whether an OAuth token with the given type and
 // expiry unix timestamp is past its expiry. Returns false for API key
 // credentials or when no expiry is set.
@@ -91,6 +102,16 @@ func (c *OpenAICredentials) NeedsRefresh() bool {
 	return oauthTokenNeedsRefresh(c.Type, c.ExpiresAt)
 }

+// IsExpired checks if the Copilot API token is expired.
+func (c *CopilotCredentials) IsExpired() bool {
+	return oauthTokenExpired(c.Type, c.ExpiresAt)
+}
+
+// NeedsRefresh reports whether the Copilot API token should be renewed.
+func (c *CopilotCredentials) NeedsRefresh() bool {
+	return oauthTokenNeedsRefresh(c.Type, c.ExpiresAt)
+}
+
 // CredentialManager handles secure storage and retrieval of authentication credentials.
 // It manages a JSON file stored in the user's config directory with appropriate
 // file permissions for security.
@@ -222,7 +243,7 @@ func (cm *CredentialManager) RemoveAnthropicCredentials() error {
 	store.Anthropic = nil

 	// If store is empty, remove the file entirely
-	if store.Anthropic == nil {
+	if store.Anthropic == nil && store.OpenAI == nil && store.Copilot == nil {
 		if err := os.Remove(cm.credentialsPath); err != nil && !os.IsNotExist(err) {
 			return fmt.Errorf("failed to remove credentials file: %w", err)
 		}
@@ -279,7 +300,7 @@ func (cm *CredentialManager) RemoveOpenAICredentials() error {
 	store.OpenAI = nil

 	// If store is empty, remove the file entirely
-	if store.Anthropic == nil && store.OpenAI == nil {
+	if store.Anthropic == nil && store.OpenAI == nil && store.Copilot == nil {
 		if err := os.Remove(cm.credentialsPath); err != nil && !os.IsNotExist(err) {
 			return fmt.Errorf("failed to remove credentials file: %w", err)
 		}
@@ -289,6 +310,104 @@ func (cm *CredentialManager) RemoveOpenAICredentials() error {
 	return cm.SaveCredentials(store)
 }

+// GetCopilotCredentials retrieves stored GitHub Copilot credentials.
+func (cm *CredentialManager) GetCopilotCredentials() (*CopilotCredentials, error) {
+	store, err := cm.LoadCredentials()
+	if err != nil {
+		return nil, err
+	}
+
+	return store.Copilot, nil
+}
+
+// RemoveCopilotCredentials removes stored GitHub Copilot credentials.
+func (cm *CredentialManager) RemoveCopilotCredentials() error {
+	store, err := cm.LoadCredentials()
+	if err != nil {
+		return err
+	}
+
+	store.Copilot = nil
+
+	if store.Anthropic == nil && store.OpenAI == nil && store.Copilot == nil {
+		if err := os.Remove(cm.credentialsPath); err != nil && !os.IsNotExist(err) {
+			return fmt.Errorf("failed to remove credentials file: %w", err)
+		}
+		return nil
+	}
+
+	return cm.SaveCredentials(store)
+}
+
+// HasCopilotCredentials checks if valid GitHub Copilot credentials are stored.
+func (cm *CredentialManager) HasCopilotCredentials() (bool, error) {
+	creds, err := cm.GetCopilotCredentials()
+	if err != nil {
+		return false, err
+	}
+	if creds == nil {
+		return false, nil
+	}
+
+	return creds.Type == "oauth" && creds.GitHubToken != "", nil
+}
+
+// SetCopilotOAuthCredentials stores GitHub Copilot OAuth credentials.
+func (cm *CredentialManager) SetCopilotOAuthCredentials(creds *CopilotCredentials) error {
+	store, err := cm.LoadCredentials()
+	if err != nil {
+		return err
+	}
+
+	store.Copilot = creds
+	return cm.SaveCredentials(store)
+}
+
+// GetValidCopilotAccessToken returns a fresh Copilot API token, renewing it
+// with the stored GitHub OAuth token when needed.
+func (cm *CredentialManager) GetValidCopilotAccessToken() (string, error) {
+	return cm.GetValidCopilotAccessTokenContext(context.Background())
+}
+
+// GetValidCopilotAccessTokenContext is like GetValidCopilotAccessToken but
+// uses ctx for the refresh request; a nil ctx falls back to context.Background().
+func (cm *CredentialManager) GetValidCopilotAccessTokenContext(ctx context.Context) (string, error) {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+
+	creds, err := cm.GetCopilotCredentials()
+	if err != nil {
+		return "", err
+	}
+	if creds == nil {
+		return "", fmt.Errorf("no Copilot credentials found")
+	}
+	if creds.Type != "oauth" {
+		return "", fmt.Errorf("unknown credential type: %s", creds.Type)
+	}
+	if creds.GitHubToken == "" {
+		return "", fmt.Errorf("GitHub OAuth token missing from Copilot credentials")
+	}
+
+	if creds.CopilotAccessToken == "" || creds.NeedsRefresh() {
+		client := NewCopilotOAuthClient()
+		newCreds, err := client.RefreshCopilotToken(ctx, creds.GitHubToken)
+		if err != nil {
+			return "", fmt.Errorf("failed to refresh Copilot token: %w", err)
+		}
+		newCreds.CreatedAt = creds.CreatedAt
+
+		if err := cm.SetCopilotOAuthCredentials(newCreds); err != nil {
+			return "", fmt.Errorf("failed to save refreshed Copilot token: %w", err)
+		}
+
+		return newCreds.CopilotAccessToken, nil
+	}
+
+	return creds.CopilotAccessToken, nil
+}
+
 // HasOpenAICredentials checks if valid OpenAI credentials are stored.
 // Returns true if either a non-empty OAuth access token or API key is present,
 // false otherwise. Returns an error if credentials cannot be loaded.
@@ -4,6 +4,7 @@ import (
 	"os"
 	"path/filepath"
 	"testing"
+	"time"
 )

 func TestCredentialManager(t *testing.T) {
@@ -215,6 +216,7 @@ func TestCredentialStorePersistence(t *testing.T) {
 	if err != nil {
 		t.Fatalf("Failed to create temp dir: %v", err)
 	}
+
 	defer func() { _ = os.RemoveAll(tempDir) }()

 	credentialsPath := filepath.Join(tempDir, "credentials.json")
@@ -252,3 +254,98 @@ func TestCredentialStorePersistence(t *testing.T) {
 		t.Errorf("Expected file permissions 0600, got %v", info.Mode().Perm())
 	}
 }
+
+func TestCopilotCredentials(t *testing.T) {
+	tempDir, err := os.MkdirTemp("", "kit-auth-test")
+	if err != nil {
+		t.Fatalf("Failed to create temp dir: %v", err)
+	}
+	defer func() { _ = os.RemoveAll(tempDir) }()
+
+	cm := &CredentialManager{
+		credentialsPath: filepath.Join(tempDir, "credentials.json"),
+	}
+
+	creds := &CopilotCredentials{
+		Type:               "oauth",
+		GitHubToken:        "github-token",
+		CopilotAccessToken: "copilot-token",
+		ExpiresAt:          time.Now().Add(time.Hour).Unix(),
+		CreatedAt:          time.Now(),
+	}
+
+	if err := cm.SetCopilotOAuthCredentials(creds); err != nil {
+		t.Fatalf("SetCopilotOAuthCredentials failed: %v", err)
+	}
+
+	hasAuth, err := cm.HasCopilotCredentials()
+	if err != nil {
+		t.Fatalf("HasCopilotCredentials failed: %v", err)
+	}
+	if !hasAuth {
+		t.Fatal("Expected Copilot credentials")
+	}
+
+	token, err := cm.GetValidCopilotAccessToken()
+	if err != nil {
+		t.Fatalf("GetValidCopilotAccessToken failed: %v", err)
+	}
+	if token != creds.CopilotAccessToken {
+		t.Fatalf("Expected Copilot token %q, got %q", creds.CopilotAccessToken, token)
+	}
+
+	if err := cm.RemoveCopilotCredentials(); err != nil {
+		t.Fatalf("RemoveCopilotCredentials failed: %v", err)
+	}
+	hasAuth, err = cm.HasCopilotCredentials()
+	if err != nil {
+		t.Fatalf("HasCopilotCredentials after removal failed: %v", err)
+	}
+	if hasAuth {
+		t.Fatal("Expected no Copilot credentials after removal")
+	}
+}
+
+func TestRemoveCredentialsPreservesOtherProviders(t *testing.T) {
+	tempDir, err := os.MkdirTemp("", "kit-auth-test")
+	if err != nil {
+		t.Fatalf("Failed to create temp dir: %v", err)
+	}
+	defer func() { _ = os.RemoveAll(tempDir) }()
+
+	cm := &CredentialManager{
+		credentialsPath: filepath.Join(tempDir, "credentials.json"),
+	}
+
+	if err := cm.SetOpenAIOAuthCredentials(&OpenAICredentials{
+		Type:         "oauth",
+		AccessToken:  "openai-token",
+		RefreshToken: "refresh-token",
+		ExpiresAt:    time.Now().Add(time.Hour).Unix(),
+		AccountID:    "account",
+		CreatedAt:    time.Now(),
+	}); err != nil {
+		t.Fatalf("SetOpenAIOAuthCredentials failed: %v", err)
+	}
+	if err := cm.SetCopilotOAuthCredentials(&CopilotCredentials{
+		Type:               "oauth",
+		GitHubToken:        "github-token",
+		CopilotAccessToken: "copilot-token",
+		ExpiresAt:          time.Now().Add(time.Hour).Unix(),
+		CreatedAt:          time.Now(),
+	}); err != nil {
+		t.Fatalf("SetCopilotOAuthCredentials failed: %v", err)
+	}
+
+	if err := cm.RemoveCopilotCredentials(); err != nil {
+		t.Fatalf("RemoveCopilotCredentials failed: %v", err)
+	}
+
+	hasOpenAI, err := cm.HasOpenAICredentials()
+	if err != nil {
+		t.Fatalf("HasOpenAICredentials failed: %v", err)
+	}
+	if !hasOpenAI {
+		t.Fatal("Expected OpenAI credentials to remain after removing Copilot credentials")
+	}
+}
@@ -10,6 +10,7 @@ import (
 	"io"
 	"net/http"
 	"net/url"
+	"strconv"
 	"strings"
 	"time"
 )
@@ -211,6 +212,262 @@ type OpenAIOAuthClient struct {
 	Scopes       string
 }

+// CopilotOAuthClient handles GitHub device-flow OAuth and exchanges the
+// GitHub token for a short-lived GitHub Copilot API token.
+//
+// The GitHub token comes from GitHub's OAuth device flow. It is then presented
+// to GitHub's internal Copilot token endpoint, which returns the bearer token
+// used by api.githubcopilot.com.
+type CopilotOAuthClient struct {
+	ClientID      string
+	DeviceURL     string
+	TokenURL      string
+	CopilotURL    string
+	Scopes        string
+	PollTimeout   time.Duration
+	ClientTimeout time.Duration
+}
+
+// CopilotDeviceCode contains data returned by GitHub's device-code endpoint.
+type CopilotDeviceCode struct {
+	DeviceCode      string `json:"device_code"`
+	UserCode        string `json:"user_code"`
+	VerificationURI string `json:"verification_uri"`
+	ExpiresIn       int    `json:"expires_in"`
+	Interval        int    `json:"interval"`
+}
+
+// NewCopilotOAuthClient creates a GitHub Copilot OAuth client.
+func NewCopilotOAuthClient() *CopilotOAuthClient {
+	return &CopilotOAuthClient{
+		ClientID:      "Iv1.b507a08c87ecfe98",
+		DeviceURL:     "https://github.com/login/device/code",
+		TokenURL:      "https://github.com/login/oauth/access_token",
+		CopilotURL:    "https://api.github.com/copilot_internal/v2/token",
+		Scopes:        "read:user",
+		PollTimeout:   15 * time.Minute,
+		ClientTimeout: 30 * time.Second,
+	}
+}
+
+// StartDeviceFlow requests a GitHub device code for browser login.
+//
+// The returned user code and verification URI are displayed by loginCopilot.
+// GitHub's response may omit interval, so this method normalizes it to the
+// documented five-second default.
+func (c *CopilotOAuthClient) StartDeviceFlow(ctx context.Context) (*CopilotDeviceCode, error) {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+
+	data := url.Values{
+		"client_id": {c.ClientID},
+		"scope":     {c.Scopes},
+	}
+
+	req, err := http.NewRequestWithContext(ctx, "POST", c.DeviceURL, strings.NewReader(data.Encode()))
+	if err != nil {
+		return nil, fmt.Errorf("failed to create device-code request: %w", err)
+	}
+	req.Header.Set("Accept", "application/json")
+	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
+
+	resp, err := (&http.Client{Timeout: c.ClientTimeout}).Do(req)
+	if err != nil {
+		return nil, fmt.Errorf("failed to request device code: %w", err)
+	}
+	defer func() { _ = resp.Body.Close() }()
+
+	if resp.StatusCode != http.StatusOK {
+		body, _ := io.ReadAll(resp.Body)
+		return nil, fmt.Errorf("device-code request failed with status %d: %s", resp.StatusCode, string(body))
+	}
+
+	var code CopilotDeviceCode
+	if err := json.NewDecoder(resp.Body).Decode(&code); err != nil {
+		return nil, fmt.Errorf("failed to decode device-code response: %w", err)
+	}
+	if code.DeviceCode == "" || code.UserCode == "" || code.VerificationURI == "" {
+		return nil, fmt.Errorf("device-code response missing required fields")
+	}
+	if code.Interval <= 0 {
+		code.Interval = 5
+	}
+	return &code, nil
+}
+
+// PollDeviceToken waits until the user authorizes the device code and returns
+// the resulting GitHub OAuth token.
+//
+// It follows GitHub's device-flow polling contract: authorization_pending keeps
+// polling, slow_down increases the interval, and polling stops at the earlier of
+// the client timeout or the device-code expiry.
+func (c *CopilotOAuthClient) PollDeviceToken(ctx context.Context, deviceCode *CopilotDeviceCode) (string, error) {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+
+	if deviceCode == nil || deviceCode.DeviceCode == "" {
+		return "", fmt.Errorf("device code missing")
+	}
+
+	deadline := time.Now().Add(c.PollTimeout)
+	if deviceCode.ExpiresIn > 0 {
+		expiresAt := time.Now().Add(time.Duration(deviceCode.ExpiresIn) * time.Second)
+		if expiresAt.Before(deadline) {
+			deadline = expiresAt
+		}
+	}
+
+	interval := time.Duration(deviceCode.Interval) * time.Second
+	if interval <= 0 {
+		interval = 5 * time.Second
+	}
+
+	for time.Now().Before(deadline) {
+		wait := interval
+		if remaining := time.Until(deadline); remaining < wait {
+			wait = remaining
+		}
+		select {
+		case <-ctx.Done():
+			return "", ctx.Err()
+		case <-time.After(wait):
+		}
+
+		data := url.Values{
+			"client_id":   {c.ClientID},
+			"device_code": {deviceCode.DeviceCode},
+			"grant_type":  {"urn:ietf:params:oauth:grant-type:device_code"},
+		}
+
+		req, err := http.NewRequestWithContext(ctx, "POST", c.TokenURL, strings.NewReader(data.Encode()))
+		if err != nil {
+			return "", fmt.Errorf("failed to create device-token request: %w", err)
+		}
+		req.Header.Set("Accept", "application/json")
+		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
+
+		resp, err := (&http.Client{Timeout: c.ClientTimeout}).Do(req)
+		if err != nil {
+			return "", fmt.Errorf("failed to poll device token: %w", err)
+		}
+
+		var tokenResp struct {
+			AccessToken string `json:"access_token"`
+			Error       string `json:"error"`
+			Description string `json:"error_description"`
+		}
+		decodeErr := json.NewDecoder(resp.Body).Decode(&tokenResp)
+		_ = resp.Body.Close()
+		if decodeErr != nil {
+			return "", fmt.Errorf("failed to decode device-token response: %w", decodeErr)
+		}
+
+		if tokenResp.AccessToken != "" {
+			return tokenResp.AccessToken, nil
+		}
+
+		switch tokenResp.Error {
+		case "authorization_pending":
+			continue
+		case "slow_down":
+			interval += 5 * time.Second
+			continue
+		case "expired_token":
+			return "", fmt.Errorf("device code expired; restart login")
+		case "access_denied":
+			return "", fmt.Errorf("github login denied")
+		case "":
+			return "", fmt.Errorf("device-token request failed with status %d", resp.StatusCode)
+		default:
+			if tokenResp.Description != "" {
+				return "", fmt.Errorf("device-token request failed: %s: %s", tokenResp.Error, tokenResp.Description)
+			}
+			return "", fmt.Errorf("device-token request failed: %s", tokenResp.Error)
+		}
+	}
+
+	return "", fmt.Errorf("timed out waiting for github device authorization")
+}
+
+// ExchangeGitHubToken converts a GitHub OAuth token into a Copilot API token.
+// It is a semantic wrapper over RefreshCopilotToken used by the login flow.
+func (c *CopilotOAuthClient) ExchangeGitHubToken(ctx context.Context, githubToken string) (*CopilotCredentials, error) {
+	return c.RefreshCopilotToken(ctx, githubToken)
+}
+
+// RefreshCopilotToken obtains a fresh short-lived Copilot token from GitHub.
+//
+// GitHub may return expires_at as either a Unix timestamp or RFC3339 string.
+// parseCopilotExpiry handles both forms and falls back to a conservative
+// 20-minute lifetime when the field is absent or unrecognized.
+func (c *CopilotOAuthClient) RefreshCopilotToken(ctx context.Context, githubToken string) (*CopilotCredentials, error) {
+	if ctx == nil {
+		ctx = context.Background()
+	}
+
+	req, err := http.NewRequestWithContext(ctx, "GET", c.CopilotURL, nil)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create copilot token request: %w", err)
+	}
+	req.Header.Set("Authorization", "token "+githubToken)
+	req.Header.Set("Accept", "application/json")
+	req.Header.Set("User-Agent", "kit")
+	req.Header.Set("X-GitHub-Api-Version", "2022-11-28")
+
+	resp, err := (&http.Client{Timeout: c.ClientTimeout}).Do(req)
+	if err != nil {
+		return nil, fmt.Errorf("failed to request copilot token: %w", err)
+	}
+	defer func() { _ = resp.Body.Close() }()
+
+	if resp.StatusCode != http.StatusOK {
+		body, _ := io.ReadAll(resp.Body)
+		return nil, fmt.Errorf("copilot token request failed with status %d: %s", resp.StatusCode, string(body))
+	}
+
+	var tokenResp struct {
+		Token     string `json:"token"`
+		ExpiresAt any    `json:"expires_at"`
+	}
+	if err := json.NewDecoder(resp.Body).Decode(&tokenResp); err != nil {
+		return nil, fmt.Errorf("failed to decode copilot token response: %w", err)
+	}
+	if tokenResp.Token == "" {
+		return nil, fmt.Errorf("copilot token response missing token")
+	}
+
+	expiresAt := parseCopilotExpiry(tokenResp.ExpiresAt)
+	if expiresAt == 0 {
+		expiresAt = time.Now().Add(20 * time.Minute).Unix()
+	}
+
+	return &CopilotCredentials{
+		Type:               "oauth",
+		GitHubToken:        githubToken,
+		CopilotAccessToken: tokenResp.Token,
+		ExpiresAt:          expiresAt,
+		CreatedAt:          time.Now(),
+	}, nil
+}
+
+// parseCopilotExpiry normalizes GitHub's expires_at variants to Unix seconds; it returns 0 when the value is absent or unrecognized.
+func parseCopilotExpiry(value any) int64 {
+	switch v := value.(type) {
+	case float64:
+		return int64(v)
+	case string:
+		if parsed, err := strconv.ParseInt(v, 10, 64); err == nil {
+			return parsed
+		}
+		if parsed, err := time.Parse(time.RFC3339, v); err == nil {
+			return parsed.Unix()
+		}
+	}
+	return 0
+}
+
 // NewOpenAIOAuthClient creates a new OAuth client configured for OpenAI Codex OAuth.
 // This uses the public client ID for CLI applications with PKCE for security.
 func NewOpenAIOAuthClient() *OpenAIOAuthClient {
@@ -0,0 +1,124 @@
+package auth
+
+import (
+	"context"
+	"encoding/json"
+	"net/http"
+	"net/http/httptest"
+	"testing"
+	"time"
+)
+
+func TestCopilotStartDeviceFlow(t *testing.T) {
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		if r.Method != http.MethodPost {
+			t.Errorf("expected POST, got %s", r.Method)
+		}
+		if err := r.ParseForm(); err != nil {
+			t.Errorf("ParseForm failed: %v", err)
+		}
+		if r.Form.Get("client_id") != "client-id" {
+			t.Errorf("expected client id, got %q", r.Form.Get("client_id"))
+		}
+		if r.Form.Get("scope") != "read:user" {
+			t.Errorf("expected scope, got %q", r.Form.Get("scope"))
+		}
+		_ = json.NewEncoder(w).Encode(map[string]any{
+			"device_code":      "device-code",
+			"user_code":        "USER-CODE",
+			"verification_uri": "https://github.com/login/device",
+			"expires_in":       600,
+			"interval":         1,
+		})
+	}))
+	defer server.Close()
+
+	client := NewCopilotOAuthClient()
+	client.ClientID = "client-id"
+	client.DeviceURL = server.URL
+
+	code, err := client.StartDeviceFlow(context.Background())
+	if err != nil {
+		t.Fatalf("StartDeviceFlow failed: %v", err)
+	}
+	if code.DeviceCode != "device-code" || code.UserCode != "USER-CODE" || code.Interval != 1 {
+		t.Fatalf("unexpected device code: %#v", code)
+	}
+}
+
+func TestCopilotPollDeviceToken(t *testing.T) {
+	polls := 0
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		polls++
+		if r.Method != http.MethodPost {
+			t.Errorf("expected POST, got %s", r.Method)
+		}
+		if err := r.ParseForm(); err != nil {
+			t.Errorf("ParseForm failed: %v", err)
+		}
+		if r.Form.Get("grant_type") != "urn:ietf:params:oauth:grant-type:device_code" {
+			t.Errorf("unexpected grant type: %q", r.Form.Get("grant_type"))
+		}
+		if polls == 1 {
+			_ = json.NewEncoder(w).Encode(map[string]any{"error": "authorization_pending"})
+			return
+		}
+		_ = json.NewEncoder(w).Encode(map[string]any{"access_token": "github-token"})
+	}))
+	defer server.Close()
+
+	client := NewCopilotOAuthClient()
+	client.ClientID = "client-id"
+	client.TokenURL = server.URL
+	client.PollTimeout = 5 * time.Second
+	client.ClientTimeout = time.Second
+
+	token, err := client.PollDeviceToken(context.Background(), &CopilotDeviceCode{
+		DeviceCode: "device-code",
+		ExpiresIn:  10,
+		Interval:   1,
+	})
+	if err != nil {
+		t.Fatalf("PollDeviceToken failed: %v", err)
+	}
+	if token != "github-token" {
+		t.Fatalf("expected github-token, got %q", token)
+	}
+	if polls != 2 {
+		t.Fatalf("expected 2 polls, got %d", polls)
+	}
+}
+
+func TestCopilotRefreshToken(t *testing.T) {
+	expiresAt := time.Now().Add(time.Hour).Unix()
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		if r.Method != http.MethodGet {
+			t.Errorf("expected GET, got %s", r.Method)
+		}
+		if r.Header.Get("Authorization") != "token github-token" {
+			t.Errorf("unexpected authorization header: %q", r.Header.Get("Authorization"))
+		}
+		if r.Header.Get("User-Agent") != "kit" {
+			t.Errorf("unexpected user agent: %q", r.Header.Get("User-Agent"))
+		}
+		_ = json.NewEncoder(w).Encode(map[string]any{
+			"token":      "copilot-token",
+			"expires_at": expiresAt,
+		})
+	}))
+	defer server.Close()
+
+	client := NewCopilotOAuthClient()
+	client.CopilotURL = server.URL
+
+	creds, err := client.RefreshCopilotToken(context.Background(), "github-token")
+	if err != nil {
+		t.Fatalf("RefreshCopilotToken failed: %v", err)
+	}
+	if creds.GitHubToken != "github-token" || creds.CopilotAccessToken != "copilot-token" {
+		t.Fatalf("unexpected credentials: %#v", creds)
+	}
+	if creds.ExpiresAt != expiresAt {
+		t.Fatalf("expected expires_at %d, got %d", expiresAt, creds.ExpiresAt)
+	}
+}
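The expiry handling exercised by the refresh test above can be sketched in isolation. This is a minimal, standalone illustration, not the code in this change: `expiryFromJSON` is a hypothetical helper that decodes `expires_at` into an `any` field and normalizes it the way `parseCopilotExpiry` is described. It shows why the numeric branch is `float64`: `encoding/json` decodes every JSON number into `float64` when the target is an interface value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// expiryFromJSON is a hypothetical helper (not part of the change above).
// It returns the expiry as Unix seconds, or 0 when the field is absent or
// in an unrecognized form.
func expiryFromJSON(raw string) (int64, error) {
	var payload struct {
		ExpiresAt any `json:"expires_at"`
	}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		return 0, err
	}
	switch v := payload.ExpiresAt.(type) {
	case float64:
		// JSON numbers arrive as float64 when decoded into an interface.
		return int64(v), nil
	case string:
		// Try a numeric string first, then an RFC3339 timestamp.
		if n, err := strconv.ParseInt(v, 10, 64); err == nil {
			return n, nil
		}
		if t, err := time.Parse(time.RFC3339, v); err == nil {
			return t.Unix(), nil
		}
	}
	return 0, nil
}

func main() {
	inputs := []string{
		`{"expires_at": 1700000000}`,
		`{"expires_at": "1700000000"}`,
		`{"expires_at": "2023-11-14T22:13:20Z"}`,
		`{}`,
	}
	for _, raw := range inputs {
		got, err := expiryFromJSON(raw)
		fmt.Println(raw, "=>", got, err)
	}
}
```

A caller would treat a zero result as "unknown" and substitute its own default lifetime, as RefreshCopilotToken does with its 20-minute fallback.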
@@ -66,6 +66,7 @@ func SpawnSubagent(ctx context.Context, k *kit.Kit, cfg extensions.SubagentConfi
 		SystemPrompt: cfg.SystemPrompt,
 		Timeout:      cfg.Timeout,
 		NoSession:    cfg.NoSession,
+		Tools:        k.GetToolsForSubagent(),
 	}
 	if cfg.OnEvent != nil {
 		sdkCfg.OnEvent = func(e kit.Event) {
@@ -341,6 +341,13 @@ type Context struct {
 	// The data survives across session restarts and can be retrieved via
 	// GetEntries. Use entryType to namespace your data (e.g. "myext:state").
 	//
+	// AppendEntry is append-only and lives in the conversation tree, which
+	// makes it the right tool for audit logs and event histories. For
+	// last-write-wins snapshot state — "what's the current value of X?" —
+	// prefer SetState / GetState instead. Those primitives store data in a
+	// sidecar file outside the conversation tree, are O(1) to read/write,
+	// and do not bloat branch reads or duplicate on fork.
+	//
 	// Example:
 	//
 	//   data, _ := json.Marshal(myState)
@@ -360,6 +367,45 @@ type Context struct {
 	//   }
 	GetEntries func(entryType string) []ExtensionEntry

+	// SetState stores a key-value pair in session-scoped, last-write-wins
+	// extension state. Unlike AppendEntry the value is kept in a sidecar
+	// file outside the conversation tree, so:
+	//   - reads are O(1) (no branch walk)
+	//   - writes don't bloat the session JSONL
+	//   - state is not duplicated on fork (branches share the sidecar)
+	//   - state is invisible to the LLM
+	//
+	// Use SetState for snapshot state ("current value of X"); use
+	// AppendEntry for audit logs and event histories. Namespace keys with
+	// your extension name to avoid collisions (e.g. "myext:budget-cap").
+	//
+	// State persists for the lifetime of the session. For ephemeral or
+	// in-memory sessions the state lives only in memory.
+	//
+	// Example:
+	//
+	//   ctx.SetState("myext:budget-cap", "10.00")
+	SetState func(key string, value string)
+
+	// GetState returns the value previously stored via SetState. The bool
+	// is false when the key was never written or has since been deleted.
+	// Returns ("", false) when state is unavailable.
+	//
+	// Example:
+	//
+	//   if cap, ok := ctx.GetState("myext:budget-cap"); ok {
+	//       fmt.Println("current cap:", cap)
+	//   }
+	GetState func(key string) (string, bool)
+
+	// DeleteState removes a key from session-scoped extension state.
+	// No-op when the key is missing.
+	DeleteState func(key string)
+
+	// ListState returns all keys currently stored in session-scoped
+	// extension state, in unspecified order.
+	ListState func() []string
+
 	// SetEditorText sets the text content of the input editor. This can
 	// be used to pre-fill the editor with suggested text (e.g. extracted
 	// questions, handoff prompts). The cursor is moved to the end.
@@ -1102,6 +1148,7 @@ type API struct {
 	onError                   func(func(ErrorEvent, Context))
 	onRetry                   func(func(RetryEvent, Context))
 	onPrepareStep             func(func(PrepareStepEvent, Context) *PrepareStepResult)
+	onLLMUsage                func(func(LLMUsageEvent, Context))
 }

 // OnToolCall registers a handler that fires before a tool executes.
@@ -1359,6 +1406,19 @@ func (a *API) OnPrepareStep(handler func(PrepareStepEvent, Context) *PrepareStep
 	a.onPrepareStep(handler)
 }

+// OnLLMUsage registers a handler that fires after each LLM provider call
+// with the token and cost deltas for that single call. Use this for
+// per-call usage attribution, real-time budget enforcement, and cost
+// dashboards that need to react between calls within a single agent turn.
+//
+// Handlers receive an LLMUsageEvent describing the call's input/output
+// tokens, cache tokens, computed cost, model, and provider. A single agent
+// turn typically fires multiple LLMUsageEvents (one per tool-loop
+// iteration).
+func (a *API) OnLLMUsage(handler func(LLMUsageEvent, Context)) {
+	a.onLLMUsage(handler)
+}
+
 // RegisterToolRenderer registers a custom renderer for a specific tool's
 // display in the TUI. The renderer controls the header (parameter summary)
 // and/or body (result display) of the tool's output block. If multiple
@@ -2091,10 +2151,47 @@ type AgentStartEvent struct {

 func (e AgentStartEvent) Type() EventType { return AgentStart }

-// AgentEndEvent fires when the agent finishes responding.
+// AgentEndEvent fires when the agent finishes responding. In addition to the
+// final response and stop reason, the event carries per-turn aggregates so
+// observer-style extensions don't have to maintain parallel bookkeeping in
+// OnToolResult / OnStepFinish handlers.
 type AgentEndEvent struct {
 	Response   string
 	StopReason string // "completed", "cancelled", "error"
+
+	// ToolCallCount is the total number of tool invocations observed during
+	// this turn (sum across all steps).
+	ToolCallCount int
+
+	// ToolNames lists the tool names invoked during this turn, in call order.
+	// Duplicates are preserved (e.g. two bash calls produce ["bash", "bash"]).
+	ToolNames []string
+
+	// LLMCallCount is the number of LLM round-trips (tool-loop iterations)
+	// performed during this turn. Always >= 1 for a successful turn.
+	LLMCallCount int
+
+	// InputTokensDelta is the sum of input tokens consumed during this turn
+	// across every LLM call (including cache-hit input tokens).
+	InputTokensDelta int
+
+	// OutputTokensDelta is the sum of output tokens generated during this turn.
+	OutputTokensDelta int
+
+	// CacheReadTokensDelta is the sum of cache-read tokens during this turn.
+	CacheReadTokensDelta int
+
+	// CacheWriteTokensDelta is the sum of cache-write tokens during this turn.
+	CacheWriteTokensDelta int
+
+	// CostDelta is the total cost in USD attributable to this turn. Computed
+	// from per-step usage and current model pricing. Zero when pricing is
+	// unknown or OAuth credentials are in use.
+	CostDelta float64
+
+	// DurationMs is the elapsed wall-clock time from AgentStart to AgentEnd,
+	// in milliseconds.
+	DurationMs int64
 }

 func (e AgentEndEvent) Type() EventType { return AgentEnd }
@@ -2403,6 +2500,43 @@ type PrepareStepResult struct {

 func (PrepareStepResult) isResult() {}

+// LLMUsageEvent fires after each LLM provider call with the per-call token
+// and cost deltas. Use this for accurate budget tracking, cost dashboards,
+// and any logic that needs to react between LLM calls within a single agent
+// turn (rather than only at turn boundaries).
+//
+// A single agent turn typically produces multiple LLMUsageEvents (one per
+// tool-loop iteration). The Model and Provider fields reflect the model used
+// for that specific call, which may differ from earlier calls if the
+// extension switched models mid-turn via ctx.SetModel().
+type LLMUsageEvent struct {
+	// InputTokens is the number of input tokens for this call.
+	InputTokens int
+	// OutputTokens is the number of output tokens generated by this call.
+	OutputTokens int
+	// CacheReadTokens is the number of cache-hit input tokens (provider-specific).
+	CacheReadTokens int
+	// CacheWriteTokens is the number of cache-write tokens.
+	CacheWriteTokens int
+	// Cost is the USD cost of this call computed from the model's per-token
+	// pricing. Zero when pricing is unknown or OAuth credentials are in use.
+	Cost float64
+	// Model is the model identifier used for this call (e.g. "claude-sonnet-4-5-20250929").
+	Model string
+	// Provider is the provider identifier (e.g. "anthropic", "openai").
+	Provider string
+	// RequestID is an optional correlation id for the underlying provider
+	// call. May be empty when the provider does not surface one.
+	RequestID string
+	// StepNumber is the zero-based step index within the current agent turn.
+	StepNumber int
+	// FinishReason mirrors the provider's finish reason for this call
+	// (e.g. "stop", "tool_calls", "length"). May be empty.
+	FinishReason string
+}
+
+func (e LLMUsageEvent) Type() EventType { return LLMUsage }
+
 // ThemeColor is an adaptive color pair with light and dark hex values.
 // Either field may be empty to inherit from the default theme.
 type ThemeColor struct {
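The per-call deltas carried by LLMUsageEvent lend themselves to budget enforcement between calls. The following is a rough, standalone sketch of that idea under stated assumptions: `usageEvent` is a local stand-in that mirrors only the `Model` and `Cost` fields, and `budget`, `newBudget`, and `record` are hypothetical names, not part of the extension API. In a real extension the `record` call would sit inside a handler registered with `api.OnLLMUsage`.

```go
package main

import "fmt"

// usageEvent is a local stand-in for the Model and Cost fields of
// LLMUsageEvent, so the sketch compiles without the extensions package.
type usageEvent struct {
	Model string
	Cost  float64
}

// budget accumulates per-call cost and attributes it to models.
type budget struct {
	capUSD   float64
	spentUSD float64
	perModel map[string]float64
}

func newBudget(capUSD float64) *budget {
	return &budget{capUSD: capUSD, perModel: make(map[string]float64)}
}

// record adds one call's cost and reports whether total spend is still
// within the cap after that call.
func (b *budget) record(e usageEvent) bool {
	b.spentUSD += e.Cost
	b.perModel[e.Model] += e.Cost
	return b.spentUSD <= b.capUSD
}

func main() {
	b := newBudget(1.0)
	// One agent turn typically produces several events, one per tool-loop step.
	calls := []usageEvent{
		{Model: "model-a", Cost: 0.5},
		{Model: "model-b", Cost: 0.25},
		{Model: "model-a", Cost: 0.5},
	}
	for step, e := range calls {
		within := b.record(e)
		fmt.Printf("step=%d model=%s spent=%.2f within=%v\n", step, e.Model, b.spentUSD, within)
	}
}
```

Because Cost is documented as zero when pricing is unknown or OAuth credentials are in use, a tracker like this would undercount in those cases; token counts may be a better signal there.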
@@ -125,6 +125,11 @@ const (
 	// after steering messages are injected and before messages are sent
 	// to the LLM. Handlers can replace the context window for this step.
 	PrepareStep EventType = "prepare_step"
+
+	// LLMUsage fires after each LLM provider call with the token and cost
+	// deltas for that single call. Extensions use it to attribute usage to
+	// specific calls/models and to drive budget enforcement between calls.
+	LLMUsage EventType = "llm_usage"
 )

 // AllEventTypes returns every supported event type.
@@ -139,7 +144,7 @@ func AllEventTypes() []EventType {
 		BeforeFork, BeforeSessionSwitch, BeforeCompact,
 		SubagentStart, SubagentChunk, SubagentEnd,
 		StepStart, StepFinish, ReasoningStart, Warnings, Source, Error, Retry,
-		PrepareStep,
+		PrepareStep, LLMUsage,
 	}
 }

@@ -4,8 +4,8 @@ import "testing"

 func TestAllEventTypes_Count(t *testing.T) {
 	all := AllEventTypes()
-	if len(all) != 32 {
-		t.Fatalf("expected 32 event types, got %d", len(all))
+	if len(all) != 33 {
+		t.Fatalf("expected 33 event types, got %d", len(all))
 	}
 }

@@ -0,0 +1,119 @@
+package extensions
+
+import "testing"
+
+func TestRunner_EmitLLMUsage(t *testing.T) {
+	var got LLMUsageEvent
+	var called bool
+	ext := makeHandlerExt("llmusage.go", map[EventType][]HandlerFunc{
+		LLMUsage: {
+			func(e Event, c Context) Result {
+				got = e.(LLMUsageEvent)
+				called = true
+				return nil
+			},
+		},
+	})
+
+	r := makeRunner(ext)
+	_, err := r.Emit(LLMUsageEvent{
+		InputTokens:  100,
+		OutputTokens: 50,
+		Cost:         0.0012,
+		Model:        "claude-sonnet-4-5-20250929",
+		Provider:     "anthropic",
+		StepNumber:   2,
+		FinishReason: "tool_calls",
+	})
+	if err != nil {
+		t.Fatalf("emit: %v", err)
+	}
+	if !called {
+		t.Fatal("expected LLMUsage handler to be called")
+	}
+	if got.InputTokens != 100 || got.OutputTokens != 50 {
+		t.Errorf("token fields not propagated: %+v", got)
+	}
+	if got.Cost != 0.0012 {
+		t.Errorf("cost not propagated, got %v", got.Cost)
+	}
+	if got.Model != "claude-sonnet-4-5-20250929" || got.Provider != "anthropic" {
+		t.Errorf("model/provider not propagated: %+v", got)
+	}
+	if got.StepNumber != 2 || got.FinishReason != "tool_calls" {
+		t.Errorf("step/finish reason not propagated: %+v", got)
+	}
+}
+
+func TestRunner_LLMUsageRegisteredViaTestAPI(t *testing.T) {
+	// Verify NewTestAPI wires up onLLMUsage so the extension can call
+	// api.OnLLMUsage during Init.
+	ext := &LoadedExtension{Handlers: make(map[EventType][]HandlerFunc)}
+	api := NewTestAPI(ext)
+
+	var calls int
+	api.OnLLMUsage(func(e LLMUsageEvent, c Context) {
+		calls++
+	})
+
+	if len(ext.Handlers[LLMUsage]) != 1 {
+		t.Fatalf("expected 1 LLMUsage handler registered, got %d", len(ext.Handlers[LLMUsage]))
+	}
+
+	r := makeRunner(*ext)
+	_, _ = r.Emit(LLMUsageEvent{InputTokens: 1})
+	if calls != 1 {
+		t.Errorf("expected handler called once, got %d", calls)
+	}
+}
+
+func TestAgentEndEvent_EnrichedFields(t *testing.T) {
+	// Verify the enriched event carries through Emit without mangling.
+	var got AgentEndEvent
+	ext := makeHandlerExt("end.go", map[EventType][]HandlerFunc{
+		AgentEnd: {
+			func(e Event, c Context) Result {
+				got = e.(AgentEndEvent)
+				return nil
+			},
+		},
+	})
+	r := makeRunner(ext)
+	_, err := r.Emit(AgentEndEvent{
+		Response:              "done",
+		StopReason:            "completed",
+		ToolCallCount:         3,
+		ToolNames:             []string{"bash", "read", "bash"},
+		LLMCallCount:          4,
+		InputTokensDelta:      1500,
+		OutputTokensDelta:     400,
+		CacheReadTokensDelta:  200,
+		CacheWriteTokensDelta: 100,
+		CostDelta:             0.0123,
+		DurationMs:            2500,
+	})
+	if err != nil {
+		t.Fatalf("emit: %v", err)
+	}
+	if got.ToolCallCount != 3 {
+		t.Errorf("ToolCallCount: got %d want 3", got.ToolCallCount)
+	}
+	if len(got.ToolNames) != 3 || got.ToolNames[0] != "bash" || got.ToolNames[2] != "bash" {
+		t.Errorf("ToolNames: %v", got.ToolNames)
+	}
+	if got.LLMCallCount != 4 {
+		t.Errorf("LLMCallCount: got %d want 4", got.LLMCallCount)
+	}
+	if got.InputTokensDelta != 1500 || got.OutputTokensDelta != 400 {
+		t.Errorf("token deltas: %+v", got)
+	}
+	if got.CacheReadTokensDelta != 200 || got.CacheWriteTokensDelta != 100 {
+		t.Errorf("cache deltas: %+v", got)
+	}
+	if got.CostDelta != 0.0123 {
+		t.Errorf("CostDelta: got %v", got.CostDelta)
+	}
+	if got.DurationMs != 2500 {
+		t.Errorf("DurationMs: got %d", got.DurationMs)
+	}
+}
@@ -669,6 +669,12 @@ func loadSingleExtension(path string) (*LoadedExtension, error) {
 				return *r
 			})
 		},
+		onLLMUsage: func(h func(LLMUsageEvent, Context)) {
+			reg(LLMUsage, func(e Event, c Context) Result {
+				h(e.(LLMUsageEvent), c)
+				return nil
+			})
+		},
 	}

 	// Call Init — the extension registers its handlers, tools, commands.
@@ -2,9 +2,12 @@ package extensions

 import (
 	"bytes"
+	"encoding/json"
 	"fmt"
 	"log"
+	"maps"
 	"os"
+	"path/filepath"
 	"runtime"
 	"sort"
 	"strconv"
@@ -99,6 +102,10 @@ type Runner struct {
 	customEventSubs map[string][]func(string) // inter-extension event bus
 	optionOverrides map[string]string         // runtime option overrides
 	configStore     *viper.Viper              // per-instance config store (nil = global)
+	state           map[string]string         // session-scoped extension state (last-write-wins)
+	stateMu         sync.RWMutex              // guards state independently of mu
+	saverMu         sync.Mutex                // serializes stateSaver invocations so atomic-rename writes don't interleave
+	stateSaver      func()                    // optional persistence hook invoked after each state mutation
 	mu              sync.RWMutex
 }

@@ -264,6 +271,18 @@ func normalizeContext(ctx Context) Context {
 	if ctx.GetEntries == nil {
 		ctx.GetEntries = func(string) []ExtensionEntry { return nil }
 	}
+	if ctx.SetState == nil {
+		ctx.SetState = func(string, string) {}
+	}
+	if ctx.GetState == nil {
+		ctx.GetState = func(string) (string, bool) { return "", false }
+	}
+	if ctx.DeleteState == nil {
+		ctx.DeleteState = func(string) {}
+	}
+	if ctx.ListState == nil {
+		ctx.ListState = func() []string { return nil }
+	}
 	if ctx.GetOption == nil {
 		ctx.GetOption = func(string) string { return "" }
 	}
@@ -745,6 +764,168 @@ func (r *Runner) GetMessageRenderer(name string) *MessageRendererConfig {
 	return nil
 }

+// ---------------------------------------------------------------------------
+// Extension state store (session-scoped, last-write-wins)
+// ---------------------------------------------------------------------------
+
+// SetState records a key-value pair in the runner's session-scoped extension
+// state store. The store is in-memory; callers wire SetStateSaver to persist
+// changes to a sidecar file. Thread-safe.
+//
+// When a saver is installed, saver calls from concurrent SetState/DeleteState
+// invocations are serialized through saverMu so that overlapping
+// snapshot-and-rename writes cannot interleave (which would otherwise race on
+// the shared tmp file and risk persisting an older snapshot after a newer one).
+func (r *Runner) SetState(key, value string) {
+	r.stateMu.Lock()
+	if r.state == nil {
+		r.state = make(map[string]string)
+	}
+	r.state[key] = value
+	saver := r.stateSaver
+	r.stateMu.Unlock()
+	r.runSaver(saver)
+}
+
+// GetState returns the value previously stored via SetState, plus a bool
+// indicating whether the key was present. Thread-safe.
+func (r *Runner) GetState(key string) (string, bool) {
+	r.stateMu.RLock()
+	defer r.stateMu.RUnlock()
+	v, ok := r.state[key]
+	return v, ok
+}
+
+// DeleteState removes a key from the state store. No-op if the key is
+// missing. Thread-safe. Saver invocations are serialized via saverMu — see
+// SetState for the rationale.
+func (r *Runner) DeleteState(key string) {
+	r.stateMu.Lock()
+	_, existed := r.state[key]
+	if existed {
+		delete(r.state, key)
+	}
+	saver := r.stateSaver
+	r.stateMu.Unlock()
+	if !existed {
+		return
+	}
+	r.runSaver(saver)
+}
+
+// runSaver invokes the optional persistence callback under saverMu so
+// concurrent SetState/DeleteState writers cannot race on the shared tmp
+// file used by SaveStateToFile's atomic rename. The deferred Unlock
+// guarantees saverMu is released even if the saver panics.
+func (r *Runner) runSaver(saver func()) {
+	if saver == nil {
+		return
+	}
+	r.saverMu.Lock()
+	defer r.saverMu.Unlock()
+	saver()
+}
+
+// ListState returns all keys currently in the state store, in unspecified
+// order, or nil when the store is empty. Thread-safe.
+func (r *Runner) ListState() []string {
+	r.stateMu.RLock()
+	defer r.stateMu.RUnlock()
+	if len(r.state) == 0 {
+		return nil
+	}
+	keys := make([]string, 0, len(r.state))
+	for k := range r.state {
+		keys = append(keys, k)
+	}
+	return keys
+}
+
+// SetStateSaver installs an optional persistence hook invoked after each
+// mutation made through SetState or DeleteState. LoadStateFromFile does not
+// invoke it. Pass nil to disable persistence. Thread-safe.
+func (r *Runner) SetStateSaver(saver func()) {
+	r.stateMu.Lock()
+	defer r.stateMu.Unlock()
+	r.stateSaver = saver
+}
+
+// SnapshotState returns a copy of the current state store as a fresh map,
+// or nil when the store is empty. Useful for persisting to disk without
+// holding the lock. Thread-safe.
+func (r *Runner) SnapshotState() map[string]string {
+	r.stateMu.RLock()
+	defer r.stateMu.RUnlock()
+	if len(r.state) == 0 {
+		return nil
+	}
+	copyMap := make(map[string]string, len(r.state))
+	maps.Copy(copyMap, r.state)
+	return copyMap
+}
+
+// LoadStateFromFile reads a JSON map from path and replaces the in-memory
+// state store with its contents. Missing or empty files are treated as
+// "no prior state": the in-memory store is replaced with an empty map so
+// callers can safely switch sessions without leaking keys from a prior
+// session into a new one. Malformed JSON returns the parse error without
+// touching the existing store. Thread-safe.
+func (r *Runner) LoadStateFromFile(path string) error {
+	data, err := os.ReadFile(path)
+	if err != nil {
+		if os.IsNotExist(err) {
+			r.stateMu.Lock()
+			r.state = map[string]string{}
+			r.stateMu.Unlock()
+			return nil
+		}
+		return fmt.Errorf("reading extension state: %w", err)
+	}
+	if len(data) == 0 {
+		r.stateMu.Lock()
+		r.state = map[string]string{}
+		r.stateMu.Unlock()
+		return nil
+	}
+	var loaded map[string]string
+	if err := json.Unmarshal(data, &loaded); err != nil {
+		return fmt.Errorf("parsing extension state: %w", err)
+	}
+	r.stateMu.Lock()
+	r.state = loaded
+	r.stateMu.Unlock()
+	return nil
+}
+
+// SaveStateToFile writes the current state store to path as JSON, creating
+// parent directories as needed. An empty store writes an empty object so
+// that consumers can distinguish "loaded but empty" from "never saved".
+// Writes are atomic via a tmp-file-and-rename sequence. Thread-safe.
+func (r *Runner) SaveStateToFile(path string) error {
+	snap := r.SnapshotState()
+	if snap == nil {
+		snap = map[string]string{}
+	}
+	data, err := json.MarshalIndent(snap, "", "  ")
+	if err != nil {
+		return fmt.Errorf("marshalling extension state: %w", err)
+	}
+	if dir := filepath.Dir(path); dir != "." && dir != "" {
+		if err := os.MkdirAll(dir, 0o755); err != nil {
+			return fmt.Errorf("creating state directory: %w", err)
+		}
+	}
+	tmp := path + ".tmp"
+	if err := os.WriteFile(tmp, data, 0o644); err != nil {
+		return fmt.Errorf("writing extension state: %w", err)
+	}
+	if err := os.Rename(tmp, path); err != nil {
+		_ = os.Remove(tmp)
+		return fmt.Errorf("renaming extension state: %w", err)
+	}
+	return nil
+}
+
 // ---------------------------------------------------------------------------
 // Hot-reload
 // ---------------------------------------------------------------------------
@@ -768,7 +949,9 @@ func (r *Runner) Reload(exts []LoadedExtension) {
 	r.uiVisibility = nil
 	r.disabledTools = nil
 	r.customEventSubs = nil
-	// optionOverrides are intentionally preserved.
+	// optionOverrides and state are intentionally preserved across reloads:
+	// they represent user/session intent (not extension code) and would be
+	// surprising to lose on a hot-reload.
 }

 // ---------------------------------------------------------------------------
@@ -0,0 +1,262 @@
+package extensions
+
+import (
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"sync"
+	"testing"
+	"time"
+)
+
+func TestRunner_State_BasicSetGetDelete(t *testing.T) {
+	r := NewRunner(nil)
+
+	if _, ok := r.GetState("missing"); ok {
+		t.Fatal("expected GetState to return ok=false for missing key")
+	}
+
+	r.SetState("a", "1")
+	r.SetState("b", "2")
+	r.SetState("a", "3") // last-write-wins
+
+	if v, ok := r.GetState("a"); !ok || v != "3" {
+		t.Errorf("expected GetState(a)=(3,true), got (%q,%v)", v, ok)
+	}
+	if v, ok := r.GetState("b"); !ok || v != "2" {
+		t.Errorf("expected GetState(b)=(2,true), got (%q,%v)", v, ok)
+	}
+
+	keys := r.ListState()
+	if len(keys) != 2 {
+		t.Errorf("expected 2 keys, got %d (%v)", len(keys), keys)
+	}
+
+	r.DeleteState("a")
+	if _, ok := r.GetState("a"); ok {
+		t.Error("expected key a to be gone after DeleteState")
+	}
+	if len(r.ListState()) != 1 {
+		t.Errorf("expected 1 key after delete, got %v", r.ListState())
+	}
+
+	// Deleting missing key is a no-op.
+	r.DeleteState("never-there")
+}
+
+func TestRunner_State_SaverFires(t *testing.T) {
+	r := NewRunner(nil)
+	var calls int
+	var mu sync.Mutex
+	r.SetStateSaver(func() {
+		mu.Lock()
+		calls++
+		mu.Unlock()
+	})
+
+	r.SetState("a", "1")
+	r.SetState("a", "2")
+	r.DeleteState("a")
+	r.DeleteState("a") // missing → no save
+
+	mu.Lock()
+	defer mu.Unlock()
+	if calls != 3 {
+		t.Errorf("expected saver to fire 3 times (2 sets + 1 delete), got %d", calls)
+	}
+}
+
+func TestRunner_State_SaveAndLoadRoundTrip(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "ext-state.json")
+
+	r1 := NewRunner(nil)
+	r1.SetState("k1", "v1")
+	r1.SetState("k2", `{"json":"value"}`)
+	if err := r1.SaveStateToFile(path); err != nil {
+		t.Fatalf("SaveStateToFile: %v", err)
+	}
+
+	// Verify file contains JSON map.
+	data, err := os.ReadFile(path)
+	if err != nil {
+		t.Fatalf("reading saved file: %v", err)
+	}
+	var parsed map[string]string
+	if err := json.Unmarshal(data, &parsed); err != nil {
+		t.Fatalf("unmarshalling: %v", err)
+	}
+	if parsed["k1"] != "v1" || parsed["k2"] != `{"json":"value"}` {
+		t.Errorf("unexpected file contents: %v", parsed)
+	}
+
+	r2 := NewRunner(nil)
+	if err := r2.LoadStateFromFile(path); err != nil {
+		t.Fatalf("LoadStateFromFile: %v", err)
+	}
+	if v, ok := r2.GetState("k1"); !ok || v != "v1" {
+		t.Errorf("expected k1=v1 after load, got (%q,%v)", v, ok)
+	}
+	if v, ok := r2.GetState("k2"); !ok || v != `{"json":"value"}` {
+		t.Errorf("expected k2 to round-trip, got %q", v)
+	}
+}
+
+func TestRunner_State_LoadMissingFileClearsState(t *testing.T) {
+	// LoadStateFromFile is documented to "replace the in-memory state store
+	// with its contents"; for a missing file that means clearing the store.
+	// This is what makes session-switching safe: a new session that has not
+	// yet written a sidecar must not inherit keys from a prior session.
+	r := NewRunner(nil)
+	r.SetState("a", "1")
+	if err := r.LoadStateFromFile(filepath.Join(t.TempDir(), "does-not-exist.json")); err != nil {
+		t.Errorf("expected nil error for missing file, got %v", err)
+	}
+	if _, ok := r.GetState("a"); ok {
+		t.Error("expected pre-existing state to be cleared when target file is missing")
+	}
+	if keys := r.ListState(); keys != nil {
+		t.Errorf("expected ListState() to be nil after clearing, got %v", keys)
+	}
+}
+
+func TestRunner_State_LoadEmptyFileClearsState(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "empty.json")
+	if err := os.WriteFile(path, nil, 0o644); err != nil {
+		t.Fatal(err)
+	}
+	r := NewRunner(nil)
+	r.SetState("a", "1")
+	if err := r.LoadStateFromFile(path); err != nil {
+		t.Errorf("expected nil error for empty file, got %v", err)
+	}
+	if _, ok := r.GetState("a"); ok {
+		t.Error("expected pre-existing state to be cleared when target file is empty")
+	}
+}
+
+func TestRunner_State_LoadMalformedFileError(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "bad.json")
+	if err := os.WriteFile(path, []byte("{not json"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	r := NewRunner(nil)
+	if err := r.LoadStateFromFile(path); err == nil {
+		t.Error("expected error loading malformed JSON, got nil")
+	}
+}
+
+func TestRunner_State_PersistenceViaSaver(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "ext-state.json")
+
+	r := NewRunner(nil)
+	r.SetStateSaver(func() {
+		_ = r.SaveStateToFile(path)
+	})
+	r.SetState("hello", "world")
+
+	// File should exist with the value already.
+	data, err := os.ReadFile(path)
+	if err != nil {
+		t.Fatalf("reading saved file: %v", err)
+	}
+	var parsed map[string]string
+	if err := json.Unmarshal(data, &parsed); err != nil {
+		t.Fatalf("unmarshalling: %v", err)
+	}
+	if parsed["hello"] != "world" {
+		t.Errorf("expected file to contain hello=world, got %v", parsed)
+	}
+}
+
+func TestRunner_State_ConcurrentSet(t *testing.T) {
+	r := NewRunner(nil)
+	var wg sync.WaitGroup
+	const goroutines = 16
+	const iterations = 100
+	wg.Add(goroutines)
+	for range goroutines {
+		go func() {
+			defer wg.Done()
+			for range iterations {
+				r.SetState("k", "v")
+				_, _ = r.GetState("k")
+			}
+		}()
+	}
+	wg.Wait()
+	if v, ok := r.GetState("k"); !ok || v != "v" {
+		t.Errorf("expected k=v after concurrent writes, got (%q,%v)", v, ok)
+	}
+}
+
+func TestRunner_State_ContextNoOpsWhenUnset(t *testing.T) {
+	// Verify normalizeContext installs safe no-ops for SetState/GetState/etc.
+	// when not provided by the caller.
+	ext := makeHandlerExt("state.go", map[EventType][]HandlerFunc{
+		SessionStart: {
+			func(e Event, c Context) Result {
+				// All four state functions should be non-nil and safe to call.
+				c.SetState("a", "b")
+				if v, ok := c.GetState("a"); ok || v != "" {
+					t.Errorf("no-op GetState should return (\"\", false); got (%q,%v)", v, ok)
+				}
+				c.DeleteState("a")
+				if keys := c.ListState(); keys != nil {
+					t.Errorf("no-op ListState should return nil; got %v", keys)
+				}
+				return nil
+			},
+		},
+	})
+	r := makeRunner(ext)
+	// SetContext with empty Context to exercise normalizeContext defaults.
+	r.SetContext(Context{})
+	_, err := r.Emit(SessionStartEvent{})
+	if err != nil {
+		t.Fatalf("emit: %v", err)
+	}
+}
+
+func TestRunner_State_SaverPanicReleasesSaverMu(t *testing.T) {
+	// If the saver callback panics (e.g. disk full mid-write), runSaver
+	// must still release saverMu so subsequent SetState/DeleteState calls
+	// can make progress. Without `defer Unlock()` the lock would be
+	// permanently held and the next write would deadlock.
+	r := NewRunner(nil)
+	var calls int
+	r.SetStateSaver(func() {
+		calls++
+		if calls == 1 {
+			panic("simulated disk-write failure")
+		}
+	})
+
+	// First call panics. Recover, then verify a follow-up call still works
+	// without blocking (proving saverMu was released).
+	func() {
+		defer func() {
+			if rec := recover(); rec == nil {
+				t.Fatal("expected panic from first saver invocation")
+			}
+		}()
+		r.SetState("a", "1")
+	}()
+
+	done := make(chan struct{})
+	go func() {
+		r.SetState("b", "2") // would deadlock if saverMu were still held
+		close(done)
+	}()
+	select {
+	case <-done:
+	case <-time.After(2 * time.Second):
+		t.Fatal("SetState after saver panic blocked — saverMu was not released")
+	}
+	if calls != 2 {
+		t.Errorf("expected saver to fire twice (panic + recovery write), got %d", calls)
+	}
+}
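The tests above pin down the saver contract: every `SetState` triggers the saver, `DeleteState` triggers it only when the key existed, `ListState` returns nil for an empty store, and the saver lock is released even if the callback panics. The following is a minimal sketch of a store with those semantics. The `stateStore` type and its field names are hypothetical and are not taken from the Runner implementation, which this diff does not show in full.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// stateStore is a hypothetical stand-in for the Runner's state methods.
type stateStore struct {
	mu      sync.RWMutex // guards data and saver
	data    map[string]string
	saver   func()
	saverMu sync.Mutex // serializes saver invocations
}

func newStateStore() *stateStore {
	return &stateStore{data: make(map[string]string)}
}

func (s *stateStore) SetStateSaver(fn func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.saver = fn
}

func (s *stateStore) SetState(key, value string) {
	s.mu.Lock()
	s.data[key] = value
	s.mu.Unlock()
	s.runSaver()
}

func (s *stateStore) GetState(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// DeleteState saves only when the key was actually present.
func (s *stateStore) DeleteState(key string) {
	s.mu.Lock()
	_, existed := s.data[key]
	delete(s.data, key)
	s.mu.Unlock()
	if existed {
		s.runSaver()
	}
}

// ListState returns the sorted keys, or nil when the store is empty.
func (s *stateStore) ListState() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if len(s.data) == 0 {
		return nil
	}
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// runSaver calls the saver without holding mu, so the saver may read the
// store. The deferred Unlock releases saverMu even if the saver panics.
func (s *stateStore) runSaver() {
	s.mu.RLock()
	fn := s.saver
	s.mu.RUnlock()
	if fn == nil {
		return
	}
	s.saverMu.Lock()
	defer s.saverMu.Unlock()
	fn()
}

func main() {
	s := newStateStore()
	calls := 0
	s.SetStateSaver(func() { calls++ })
	s.SetState("a", "1")
	s.SetState("a", "2")
	s.DeleteState("a")
	s.DeleteState("a") // key already gone: no save
	fmt.Println(calls, s.ListState() == nil) // prints "3 true"
}
```

Releasing `mu` before invoking the saver is a design choice of this sketch: it lets a saver that serializes the map take the read lock without deadlocking.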
@@ -183,6 +183,7 @@ func Symbols() interp.Exports {
 			"RetryEvent":          reflect.ValueOf((*RetryEvent)(nil)),
 			"PrepareStepEvent":    reflect.ValueOf((*PrepareStepEvent)(nil)),
 			"PrepareStepResult":   reflect.ValueOf((*PrepareStepResult)(nil)),
+			"LLMUsageEvent":       reflect.ValueOf((*LLMUsageEvent)(nil)),
 		},
 	}
 }
@@ -189,5 +189,11 @@ func NewTestAPI(ext *LoadedExtension) API {
 				return nil
 			})
 		},
+		onLLMUsage: func(h func(LLMUsageEvent, Context)) {
+			reg(LLMUsage, func(e Event, c Context) Result {
+				h(e.(LLMUsageEvent), c)
+				return nil
+			})
+		},
 	}
 }
@@ -0,0 +1,84 @@
+package models
+
+import (
+	"net/http"
+	"testing"
+	"time"
+)
+
+func TestCopilotProviderAliasUsesCatalog(t *testing.T) {
+	registry := NewModelsRegistry()
+
+	models, err := registry.GetModelsForProvider("copilot")
+	if err != nil {
+		t.Fatalf("GetModelsForProvider(copilot) failed: %v", err)
+	}
+	if len(models) == 0 {
+		t.Fatal("expected copilot alias to return github-copilot catalog models")
+	}
+	if registry.LookupModel("copilot", "gpt-5.5") == nil {
+		t.Fatal("expected copilot/gpt-5.5 to resolve through github-copilot catalog")
+	}
+	if registry.GetProviderInfo("copilot") == nil {
+		t.Fatal("expected copilot alias to return github-copilot provider info")
+	}
+}
+
+func TestCopilotRejectsNonGPTModels(t *testing.T) {
+	_, err := CreateProvider(t.Context(), &ProviderConfig{ModelString: "copilot/claude-sonnet-4.6"})
+	if err == nil {
+		t.Fatal("expected non-GPT Copilot model to be rejected")
+	}
+}
+
+func TestCopilotHTTPClientCachesToken(t *testing.T) {
+	client := createCopilotHTTPClient("cached-token", time.Now().Add(time.Hour).Unix(), false)
+	transport, ok := client.Transport.(*copilotTransport)
+	if !ok {
+		t.Fatal("expected *copilotTransport")
+	}
+
+	token := transport.cachedToken(t.Context())
+	if token != "cached-token" {
+		t.Fatalf("expected cached token, got %q", token)
+	}
+}
+
+func TestCopilotTransportHeaders(t *testing.T) {
+	req, err := http.NewRequest(http.MethodGet, "https://example.com", nil)
+	if err != nil {
+		t.Fatalf("NewRequest failed: %v", err)
+	}
+
+	transport := &copilotTransport{
+		base: roundTripFunc(func(req *http.Request) (*http.Response, error) {
+			if req.Header.Get("Authorization") != "Bearer cached-token" {
+				t.Fatalf("unexpected Authorization header: %q", req.Header.Get("Authorization"))
+			}
+			if req.Header.Get("Copilot-Integration-Id") != copilotIntegrationID {
+				t.Fatalf("unexpected Copilot-Integration-Id header: %q", req.Header.Get("Copilot-Integration-Id"))
+			}
+			if req.Header.Get("Editor-Version") != copilotEditorVersion {
+				t.Fatalf("unexpected Editor-Version header: %q", req.Header.Get("Editor-Version"))
+			}
+			if req.Header.Get("User-Agent") != copilotUserAgent {
+				t.Fatalf("unexpected User-Agent header: %q", req.Header.Get("User-Agent"))
+			}
+			return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody}, nil
+		}),
+		token:     "cached-token",
+		expiresAt: time.Now().Add(time.Hour).Unix(),
+	}
+
+	resp, err := transport.RoundTrip(req)
+	if err != nil {
+		t.Fatalf("RoundTrip failed: %v", err)
+	}
+	_ = resp.Body.Close()
+}
+
+type roundTripFunc func(*http.Request) (*http.Response, error)
+
+func (f roundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) {
+	return f(req)
+}
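The header test above relies on two small patterns: a function type that satisfies `http.RoundTripper`, and a transport that decorates a cloned request before delegating. A self-contained sketch of both follows. `headerTransport` and `captureHeader` are hypothetical names used only for illustration; the real `copilotTransport` sets a fixed list of headers rather than taking a map.

```go
package main

import (
	"fmt"
	"net/http"
)

// roundTripFunc adapts a plain function to http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) {
	return f(req)
}

// headerTransport (hypothetical) sets fixed headers on a clone of each
// request, leaving the caller's request untouched.
type headerTransport struct {
	base    http.RoundTripper
	headers map[string]string
}

func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context())
	for name, value := range t.headers {
		clone.Header.Set(name, value)
	}
	return t.base.RoundTrip(clone)
}

// captureHeader sends one request through a headerTransport backed by a
// stub and returns the value of the named header as the stub saw it.
func captureHeader(headers map[string]string, name string) (string, error) {
	var seen string
	stub := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		seen = r.Header.Get(name)
		return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody}, nil
	})
	req, err := http.NewRequest(http.MethodGet, "https://example.com", nil)
	if err != nil {
		return "", err
	}
	transport := &headerTransport{base: stub, headers: headers}
	resp, err := transport.RoundTrip(req)
	if err != nil {
		return "", err
	}
	_ = resp.Body.Close()
	return seen, nil
}

func main() {
	got, err := captureHeader(
		map[string]string{"Authorization": "Bearer example-token"},
		"Authorization",
	)
	fmt.Println(got, err) // prints "Bearer example-token <nil>"
}
```

No network traffic occurs here because the stub answers the request directly, which is the same reason the test above can assert on headers without contacting the Copilot API.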
@@ -13,6 +13,7 @@ import (
 	"os"
 	"regexp"
 	"strings"
+	"sync"
 	"time"

 	"charm.land/fantasy"
@@ -33,6 +34,24 @@ import (
 const (
 	// ClaudeCodePrompt is the required system prompt for OAuth authentication.
 	ClaudeCodePrompt = "You are Claude Code, Anthropic's official CLI for Claude."
+
+	// copilotProviderID is the canonical models.dev provider key. The CLI also
+	// accepts the shorter "copilot" alias for user-facing model strings.
+	copilotProviderID = "github-copilot"
+	// copilotAliasProviderID is the short provider prefix accepted by kit.
+	copilotAliasProviderID = "copilot"
+	// copilotBaseURL is the fallback API URL if the model catalog has no API URL.
+	copilotBaseURL = "https://api.githubcopilot.com"
+
+	// GitHub Copilot currently expects VS Code Copilot Chat client identifiers.
+	// Keep these centralized so they are easy to audit and update when GitHub
+	// changes accepted client metadata.
+	copilotIntegrationID       = "vscode-chat"
+	copilotEditorVersion       = "vscode/1.104.1"
+	copilotEditorPluginVersion = "copilot-chat/0.31.0"
+	copilotUserAgent           = "GitHubCopilotChat/0.31.0"
+	copilotOpenAIIntent        = "conversation-agent"
+	copilotGitHubAPIVersion    = "2026-01-09"
 )

 // resolveModelAlias resolves model aliases to their full names using the registry
@@ -215,6 +234,20 @@ func ParseModelString(modelString string) (provider, model string, err error) {
 	return "", "", fmt.Errorf("invalid model format %q: expected provider/model (e.g. anthropic/claude-sonnet-4-5)", modelString)
 }

+// isCopilotProvider reports whether provider is the canonical catalog key or
+// the user-facing shorthand alias.
+func isCopilotProvider(provider string) bool {
+	return provider == copilotAliasProviderID || provider == copilotProviderID
+}
+
+// catalogProviderID maps supported provider aliases to their models.dev keys.
+func catalogProviderID(provider string) string {
+	if isCopilotProvider(provider) {
+		return copilotProviderID
+	}
+	return provider
+}
+
 // CreateProvider creates a fantasy LanguageModel based on the provider configuration.
 // Model metadata is looked up from the models.dev database for cost tracking and
 // capability detection, but unknown models are passed through to the provider
@@ -238,17 +271,30 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
 	}

 	registry := GetGlobalRegistry()
+	lookupProvider := catalogProviderID(provider)

-	// Look up model metadata (advisory, not blocking).
+	// Look up model metadata (advisory for most providers, strict for Copilot).
 	// When the model is known we validate config limits and print
 	// suggestions on likely typos; when unknown we let the provider
-	// API be the authority.
-	modelInfo := registry.LookupModel(provider, modelName)
-	if modelInfo == nil && provider != "ollama" && config.ProviderURL == "" {
+	// API be the authority except for Copilot, whose non-GPT catalog entries
+	// require unsupported wire protocols.
+	modelInfo := registry.LookupModel(lookupProvider, modelName)
+	if isCopilotProvider(provider) {
+		providerInfo := registry.GetProviderInfo(copilotProviderID)
+		if providerInfo == nil {
+			return nil, fmt.Errorf("unsupported provider: %s (not found in model database)", copilotProviderID)
+		}
+		if modelInfo == nil {
+			if suggestions := registry.SuggestModels(copilotProviderID, modelName); len(suggestions) > 0 {
+				return nil, fmt.Errorf("model %q not found for provider %s. Did you mean one of: %s", modelName, copilotProviderID, strings.Join(suggestions, ", "))
+			}
+			return nil, fmt.Errorf("model %q not found for provider %s", modelName, copilotProviderID)
+		}
+	} else if modelInfo == nil && provider != "ollama" && config.ProviderURL == "" {
 		// Model not in database — warn with suggestions but don't block.
-		if suggestions := registry.SuggestModels(provider, modelName); len(suggestions) > 0 {
+		if suggestions := registry.SuggestModels(lookupProvider, modelName); len(suggestions) > 0 {
 			fmt.Fprintf(os.Stderr, "Warning: model %q not found in model database for provider %s. Similar models: %s\n",
-				modelName, provider, strings.Join(suggestions, ", "))
+				modelName, lookupProvider, strings.Join(suggestions, ", "))
 		}
 	}

@@ -282,6 +328,8 @@ func CreateProvider(ctx context.Context, config *ProviderConfig) (*ProviderResul
 		result, createErr = createAnthropicProvider(ctx, config, modelName)
 	case "openai":
 		result, createErr = createOpenAIProvider(ctx, config, modelName)
+	case "copilot", "github-copilot":
+		result, createErr = createCopilotProvider(ctx, config, modelName)
 	case "google", "gemini":
 		result, createErr = createGoogleProvider(ctx, config, modelName)
 	case "ollama":
@@ -1023,6 +1071,72 @@ func createOpenAIProvider(ctx context.Context, config *ProviderConfig, modelName
 	return &ProviderResult{Model: model, ProviderOptions: providerOpts}, nil
 }

+// createCopilotProvider builds a GitHub Copilot provider through fantasy's
+// OpenAI-compatible provider. The catalog key is github-copilot, but the public
+// model prefix may be either copilot/ or github-copilot/.
+//
+// Only gpt-* Copilot models are enabled here. The catalog also lists Claude and
+// Gemini Copilot models, but those require different wire protocols and must be
+// routed explicitly before they can be safely accepted.
+func createCopilotProvider(ctx context.Context, config *ProviderConfig, modelName string) (*ProviderResult, error) {
+	if !strings.HasPrefix(modelName, "gpt-") {
+		return nil, fmt.Errorf("GitHub Copilot model %q is not supported yet: only gpt-* models use the OpenAI-compatible protocol", modelName)
+	}
+
+	cm, err := auth.NewCredentialManager()
+	if err != nil {
+		return nil, fmt.Errorf("failed to initialize credential manager: %w", err)
+	}
+
+	token, err := cm.GetValidCopilotAccessTokenContext(ctx)
+	if err != nil {
+		return nil, fmt.Errorf("GitHub Copilot credentials not available. Use 'kit auth login copilot': %w", err)
+	}
+
+	expiresAt := int64(0)
+	if creds, err := cm.GetCopilotCredentials(); err == nil && creds != nil && creds.CopilotAccessToken == token {
+		expiresAt = creds.ExpiresAt
+	}
+
+	baseURL := copilotBaseURL
+	if providerInfo := GetGlobalRegistry().GetProviderInfo(copilotProviderID); providerInfo != nil && providerInfo.API != "" {
+		baseURL = providerInfo.API
+	}
+	if config.ProviderURL != "" {
+		baseURL = config.ProviderURL
+	}
+
+	opts := []openai.Option{
+		openai.WithName(copilotAliasProviderID),
+		openai.WithBaseURL(baseURL),
+		openai.WithAPIKey(token),
+		openai.WithHTTPClient(createCopilotHTTPClient(token, expiresAt, config.TLSSkipVerify)),
+		openai.WithUseResponsesAPI(),
+		openai.WithResponsesAPIFunc(copilotUsesResponsesAPI),
+		openai.WithObjectMode(fantasy.ObjectModeTool),
+	}
+
+	provider, err := openai.New(opts...)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create GitHub Copilot provider: %w", err)
+	}
+
+	model, err := provider.LanguageModel(ctx, modelName)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create GitHub Copilot model: %w", err)
+	}
+
+	providerOpts := buildOpenAIProviderOptions(config, modelName)
+
+	return &ProviderResult{Model: model, ProviderOptions: providerOpts}, nil
+}
+
+// copilotUsesResponsesAPI selects the OpenAI Responses API for Copilot models
+// known to support it. Non-gpt models are rejected before provider creation.
+func copilotUsesResponsesAPI(modelID string) bool {
+	return strings.HasPrefix(modelID, "gpt-5")
+}
+
 // createOpenAICodexProvider creates a provider for ChatGPT/Codex OAuth tokens.
 // Uses the chatgpt.com/backend-api/codex endpoint with special headers.
 func createOpenAICodexProvider(ctx context.Context, config *ProviderConfig, modelName, token, accountID string) (*ProviderResult, error) {
@@ -1152,6 +1266,87 @@ func (t *codexTransport) RoundTrip(req *http.Request) (*http.Response, error) {
 	return t.base.RoundTrip(newReq)
 }

+// createCopilotHTTPClient returns an HTTP client that injects Copilot-specific
+// authorization and client metadata headers. The token and expiry are cached in
+// the transport so streaming requests do not hit credentials.json on every
+// RoundTrip; the credential manager is consulted only near expiry.
+func createCopilotHTTPClient(token string, expiresAt int64, skipVerify bool) *http.Client {
+	var base http.RoundTripper
+	if skipVerify {
+		base = &http.Transport{
+			TLSClientConfig: &tls.Config{
+				InsecureSkipVerify: true,
+			},
+		}
+	} else {
+		base = http.DefaultTransport
+	}
+
+	return &http.Client{
+		Transport: &copilotTransport{
+			base:      base,
+			token:     token,
+			expiresAt: expiresAt,
+		},
+		Timeout: 120 * time.Second,
+	}
+}
+
+// copilotTransport decorates requests for api.githubcopilot.com.
+//
+// It owns a cached Copilot access token. When the token is still valid, the hot
+// path is in-memory only. Near expiry it refreshes through CredentialManager,
+// which updates both the cache here and credentials.json.
+type copilotTransport struct {
+	base      http.RoundTripper
+	token     string
+	expiresAt int64
+	mu        sync.Mutex
+}
+
+func (t *copilotTransport) RoundTrip(req *http.Request) (*http.Response, error) {
+	token := t.cachedToken(req.Context())
+
+	newReq := req.Clone(req.Context())
+	newReq.Header.Set("Authorization", "Bearer "+token)
+	newReq.Header.Set("Copilot-Integration-Id", copilotIntegrationID)
+	newReq.Header.Set("Editor-Version", copilotEditorVersion)
+	newReq.Header.Set("Editor-Plugin-Version", copilotEditorPluginVersion)
+	newReq.Header.Set("Openai-Intent", copilotOpenAIIntent)
+	newReq.Header.Set("User-Agent", copilotUserAgent)
+	newReq.Header.Set("X-GitHub-Api-Version", copilotGitHubAPIVersion)
+
+	return t.base.RoundTrip(newReq)
+}
+
+// cachedToken returns the cached token unless it is inside the five-minute
+// refresh window; an unknown expiry (0) never refreshes. Refresh errors fall
+// back to the last token so the Copilot API can report any real auth failure.
+func (t *copilotTransport) cachedToken(ctx context.Context) string {
+	t.mu.Lock()
+	defer t.mu.Unlock()
+
+	if t.expiresAt == 0 || time.Now().Unix() < t.expiresAt-300 {
+		return t.token
+	}
+
+	cm, err := auth.NewCredentialManager()
+	if err != nil {
+		return t.token
+	}
+
+	fresh, err := cm.GetValidCopilotAccessTokenContext(ctx)
+	if err != nil || fresh == "" {
+		return t.token
+	}
+
+	t.token = fresh
+	if creds, err := cm.GetCopilotCredentials(); err == nil && creds != nil && creds.CopilotAccessToken == fresh {
+		t.expiresAt = creds.ExpiresAt
+	}
+	return t.token
+}
+
 func createGoogleProvider(ctx context.Context, config *ProviderConfig, modelName string) (*ProviderResult, error) {
 	apiKey := firstNonEmpty(
 		config.ProviderAPIKey,
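The refresh-window logic in `cachedToken` can be exercised in isolation by injecting the clock and the refresh function. The sketch below is a simplified, hypothetical `tokenCache`; the `now` and `refresh` fields are assumptions added for testability and do not exist on `copilotTransport`, which calls `time.Now` and the credential manager directly.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// tokenCache (hypothetical) caches a token and refreshes it when the
// current time is within `window` seconds of expiry.
type tokenCache struct {
	mu        sync.Mutex
	token     string
	expiresAt int64 // Unix seconds; 0 means unknown, never refresh
	window    int64 // seconds before expiry at which to refresh
	now       func() time.Time
	refresh   func() (token string, expiresAt int64, err error)
}

func (c *tokenCache) get() string {
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.expiresAt == 0 || c.now().Unix() < c.expiresAt-c.window {
		return c.token
	}
	fresh, exp, err := c.refresh()
	if err != nil || fresh == "" {
		// Fall back to the last token; the server decides if it is valid.
		return c.token
	}
	c.token, c.expiresAt = fresh, exp
	return c.token
}

// demo walks one cache through three moments in time and records the
// token returned at each.
func demo() []string {
	base := time.Unix(1000000, 0)
	clock := base
	c := &tokenCache{
		token:     "old",
		expiresAt: base.Unix() + 3600,
		window:    300,
		now:       func() time.Time { return clock },
		refresh: func() (string, int64, error) {
			return "new", base.Unix() + 7200, nil
		},
	}

	var seen []string
	seen = append(seen, c.get()) // 60 minutes left: cached token

	clock = base.Add(56 * time.Minute) // 4 minutes left: refresh succeeds
	seen = append(seen, c.get())

	clock = base.Add(2 * time.Hour) // expired, and refresh now fails
	c.refresh = func() (string, int64, error) {
		return "", 0, errors.New("refresh unavailable")
	}
	seen = append(seen, c.get())
	return seen
}

func main() {
	fmt.Println(demo()) // prints "[old new new]"
}
```

The third call shows the fallback path: a failed refresh returns the last known token instead of an error.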
@@ -246,6 +246,7 @@ func loadEmbeddedProviders() map[string]modelsDBProvider {
 // doesn't track yet. Callers should treat a nil return as "unknown model"
 // and continue with sensible defaults.
 func (r *ModelsRegistry) LookupModel(provider, modelID string) *ModelInfo {
+	provider = catalogProviderID(provider)
 	providerInfo, exists := r.providers[provider]
 	if !exists {
 		return nil
@@ -273,6 +274,7 @@ func LookupModelForSettings(modelString string) *ModelInfo {

 // getRequiredEnvVars returns the required environment variables for a provider.
 func (r *ModelsRegistry) getRequiredEnvVars(provider string) ([]string, error) {
+	provider = catalogProviderID(provider)
 	providerInfo, exists := r.providers[provider]
 	if !exists {
 		return nil, fmt.Errorf("unsupported provider: %s", provider)
@@ -287,6 +289,7 @@ func (r *ModelsRegistry) getRequiredEnvVars(provider string) ([]string, error) {
 // variables. Returns nil for providers not in the registry (unknown
 // providers are assumed to handle auth themselves or via --provider-api-key).
 func (r *ModelsRegistry) ValidateEnvironment(provider string, apiKey string) error {
+	provider = catalogProviderID(provider)
 	if apiKey != "" {
 		return nil
 	}
@@ -311,6 +314,15 @@ func (r *ModelsRegistry) ValidateEnvironment(provider string, apiKey string) err
 		}
 	}

+	// For GitHub Copilot, check stored GitHub OAuth credentials.
+	if provider == copilotProviderID {
+		if cm, err := auth.NewCredentialManager(); err == nil {
+			if has, _ := cm.HasCopilotCredentials(); has {
+				return nil
+			}
+		}
+	}
+
 	envVars, err := r.getRequiredEnvVars(provider)
 	if err != nil {
 		// Unknown provider — nothing to validate
@@ -350,6 +362,7 @@ func (r *ModelsRegistry) ValidateEnvironment(provider string, apiKey string) err

 // SuggestModels returns similar model names when an invalid model is provided.
 func (r *ModelsRegistry) SuggestModels(provider, invalidModel string) []string {
+	provider = catalogProviderID(provider)
 	providerInfo, exists := r.providers[provider]
 	if !exists {
 		return nil
@@ -415,6 +428,7 @@ func isProviderLLMSupported(providerID string, info *ProviderInfo) bool {

 // GetModelsForProvider returns all models for a specific provider.
 func (r *ModelsRegistry) GetModelsForProvider(provider string) (map[string]ModelInfo, error) {
+	provider = catalogProviderID(provider)
 	providerInfo, exists := r.providers[provider]
 	if !exists {
 		return nil, fmt.Errorf("unsupported provider: %s", provider)
@@ -425,6 +439,7 @@ func (r *ModelsRegistry) GetModelsForProvider(provider string) (map[string]Model

 // GetProviderInfo returns the full provider info, or nil if not found.
 func (r *ModelsRegistry) GetProviderInfo(provider string) *ProviderInfo {
+	provider = catalogProviderID(provider)
 	info, exists := r.providers[provider]
 	if !exists {
 		return nil
@@ -17,6 +17,9 @@ type AnthropicCredentials = auth.AnthropicCredentials
 // and API key authentication methods.
 type OpenAICredentials = auth.OpenAICredentials

+// CopilotCredentials holds GitHub OAuth and Copilot API credentials.
+type CopilotCredentials = auth.CopilotCredentials
+
 // CredentialStore holds all stored credentials for various providers.
 type CredentialStore = auth.CredentialStore

@@ -65,6 +68,37 @@ func HasOpenAICredentials() bool {
 	return has
 }

+// HasCopilotCredentials checks if valid GitHub Copilot credentials are stored.
+func HasCopilotCredentials() bool {
+	cm, err := auth.NewCredentialManager()
+	if err != nil {
+		return false
+	}
+	has, err := cm.HasCopilotCredentials()
+	if err != nil {
+		return false
+	}
+	return has
+}
+
+// GetCopilotCredentials retrieves stored GitHub Copilot credentials.
+func GetCopilotCredentials() (*CopilotCredentials, error) {
+	cm, err := auth.NewCredentialManager()
+	if err != nil {
+		return nil, err
+	}
+	return cm.GetCopilotCredentials()
+}
+
+// GetValidCopilotAccessToken returns a fresh GitHub Copilot access token.
+func GetValidCopilotAccessToken() (string, error) {
+	cm, err := auth.NewCredentialManager()
+	if err != nil {
+		return "", err
+	}
+	return cm.GetValidCopilotAccessToken()
+}
+
 // GetOpenAIAPIKey resolves the OpenAI API key using the standard
 // resolution order: stored credentials -> OPENAI_API_KEY env var.
 // Returns an empty string if no key is found.
@@ -2,6 +2,8 @@ package kit

 import (
 	"fmt"
+	"log"
+	"strings"

 	"github.com/mark3labs/kit/internal/extensions"
 	"github.com/mark3labs/kit/internal/message"
@@ -96,6 +98,23 @@ type ExtensionAPI interface {
 	AppendEntry(extType, data string) (string, error)
 	GetEntries(extType string) []ExtensionEntry

+	// Session-scoped extension state (last-write-wins key-value store).
+	// Backed by an in-memory map and (optionally) a sidecar file per session;
+	// state lives outside the conversation tree and is not visible to the LLM.
+	SetState(key, value string)
+	GetState(key string) (string, bool)
+	DeleteState(key string)
+	ListState() []string
+
+	// InitStatePersistence loads any existing state from the per-session
+	// sidecar file and installs a saver hook so that subsequent SetState /
+	// DeleteState mutations are flushed to disk. Safe to call multiple times;
+	// repeat calls simply reload and reinstall the saver.
+	//
+	// For ephemeral or in-memory sessions (no session file path), the call
+	// is a no-op and state remains in memory for the lifetime of the runner.
+	InitStatePersistence() error
+
 	// Status bar
 	SetStatus(entry ExtensionStatusBarEntry)
 	RemoveStatus(key string)
@@ -332,6 +351,67 @@ func (e *extensionAPI) AppendEntry(extType, data string) (string, error) {
 	return e.kit.session.AppendExtensionData(extType, data)
 }

+func (e *extensionAPI) SetState(key, value string) {
+	if e.kit.extRunner != nil {
+		e.kit.extRunner.SetState(key, value)
+	}
+}
+
+func (e *extensionAPI) GetState(key string) (string, bool) {
+	if e.kit.extRunner == nil {
+		return "", false
+	}
+	return e.kit.extRunner.GetState(key)
+}
+
+func (e *extensionAPI) DeleteState(key string) {
+	if e.kit.extRunner != nil {
+		e.kit.extRunner.DeleteState(key)
+	}
+}
+
+func (e *extensionAPI) ListState() []string {
+	if e.kit.extRunner == nil {
+		return nil
+	}
+	return e.kit.extRunner.ListState()
+}
+
+func (e *extensionAPI) InitStatePersistence() error {
+	if e.kit.extRunner == nil {
+		return nil
+	}
+	path := extStateSidecarPath(e.kit.GetSessionPath())
+	if path == "" {
+		// Ephemeral or in-memory session; no on-disk state.
+		e.kit.extRunner.SetStateSaver(nil)
+		return nil
+	}
+	if err := e.kit.extRunner.LoadStateFromFile(path); err != nil {
+		return err
+	}
+	runner := e.kit.extRunner
+	runner.SetStateSaver(func() {
+		if err := runner.SaveStateToFile(path); err != nil {
+			log.Printf("WARN extension state save failed: path=%s err=%v", path, err)
+		}
+	})
+	return nil
+}
+
+// extStateSidecarPath returns the path to the per-session extension state
+// sidecar file derived from the session's JSONL path. Returns empty for
+// ephemeral / in-memory sessions where no JSONL is being written.
+func extStateSidecarPath(sessionPath string) string {
+	if sessionPath == "" {
+		return ""
+	}
+	if trimmed, ok := strings.CutSuffix(sessionPath, ".jsonl"); ok {
+		return trimmed + ".ext-state.json"
+	}
+	return sessionPath + ".ext-state.json"
+}
+
 func (e *extensionAPI) GetEntries(extType string) []ExtensionEntry {
 	if e.kit.session == nil {
 		return nil
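A few worked inputs make the sidecar naming rule in `extStateSidecarPath` concrete. The sketch below reimplements the rule as a hypothetical `sidecarPath` helper, using `strings.HasSuffix` and `strings.TrimSuffix` in place of `strings.CutSuffix` so that it also builds on older Go toolchains; the session paths are made-up examples.

```go
package main

import (
	"fmt"
	"strings"
)

// sidecarPath (hypothetical) mirrors the naming rule: replace a trailing
// ".jsonl" with ".ext-state.json", otherwise append ".ext-state.json".
// An empty session path means no sidecar.
func sidecarPath(sessionPath string) string {
	if sessionPath == "" {
		return ""
	}
	if strings.HasSuffix(sessionPath, ".jsonl") {
		return strings.TrimSuffix(sessionPath, ".jsonl") + ".ext-state.json"
	}
	return sessionPath + ".ext-state.json"
}

func main() {
	for _, p := range []string{"", "/tmp/sessions/abc.jsonl", "/tmp/sessions/abc"} {
		fmt.Printf("%q -> %q\n", p, sidecarPath(p))
	}
	// prints:
	// "" -> ""
	// "/tmp/sessions/abc.jsonl" -> "/tmp/sessions/abc.ext-state.json"
	// "/tmp/sessions/abc" -> "/tmp/sessions/abc.ext-state.json"
}
```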
@@ -3,8 +3,11 @@ package kit
 import (
 	"strings"
 	"sync"
+	"time"

+	"github.com/mark3labs/kit/internal/auth"
 	"github.com/mark3labs/kit/internal/extensions"
+	"github.com/mark3labs/kit/internal/models"
 )

 // bridgeExtensions registers extension event handlers as SDK hooks and
@@ -19,6 +22,30 @@ import (
 // wrapper (internal/extensions/wrapper.go) which composes underneath the SDK
 // hook wrapper.
 func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
+	// Per-turn aggregator: collects tool/LLM/usage signals between AgentStart
+	// and AgentEnd so the enriched AgentEndEvent can be populated without
+	// requiring extensions to maintain parallel bookkeeping.
+	//
+	// NOTE: this aggregator assumes a single in-flight turn per *Kit instance,
+	// which is the current contract. runTurn does not serialize callers and
+	// the SDK's TurnStartEvent/TurnEndEvent do not carry a turn ID, so two
+	// concurrent Prompt() calls on the same *Kit would clobber the counters.
+	// All current callers (TUI app layer, CLI runner, SDK examples) serialize
+	// turns above this layer. If concurrent turns become a supported use case,
+	// extend TurnStartEvent/TurnEndEvent with a turn ID and keep one
+	// aggregator per turn ID instead.
+	turnAgg := &turnAggregator{kit: m}
+	m.Subscribe(func(e Event) {
+		switch ev := e.(type) {
+		case TurnStartEvent:
+			turnAgg.start()
+		case ToolResultEvent:
+			turnAgg.recordTool(ev.ToolName)
+		case StepFinishEvent:
+			turnAgg.recordStep(ev.Usage)
+		}
+	})
+
 	// --- Interception hooks ---

 	// Extension Input → BeforeTurn hook (high priority, runs first).
@@ -109,9 +136,19 @@ func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
 				} else if stopReason == "" {
 					stopReason = "completed"
 				}
+				agg := turnAgg.consume()
 				_, _ = runner.Emit(extensions.AgentEndEvent{
-					Response:   response,
-					StopReason: stopReason,
+					Response:              response,
+					StopReason:            stopReason,
+					ToolCallCount:         agg.toolCallCount,
+					ToolNames:             agg.toolNames,
+					LLMCallCount:          agg.llmCallCount,
+					InputTokensDelta:      agg.inputTokens,
+					OutputTokensDelta:     agg.outputTokens,
+					CacheReadTokensDelta:  agg.cacheReadTokens,
+					CacheWriteTokensDelta: agg.cacheWriteTokens,
+					CostDelta:             agg.cost,
+					DurationMs:            agg.durationMs(),
 				})
 			}
 		})
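The per-turn bookkeeping that feeds the enriched `AgentEndEvent` follows a reset, record, snapshot cycle. A reduced sketch of that cycle is shown below, assuming a single in-flight turn as the note in this hunk states. `turnCounter`, `turnTotals`, and `stepUsage` are hypothetical names and carry only a subset of the real fields.

```go
package main

import (
	"fmt"
	"sync"
)

// stepUsage (hypothetical) is the token usage of one LLM call.
type stepUsage struct {
	inputTokens, outputTokens int
}

// turnTotals (hypothetical) is what an end-of-turn event would carry.
type turnTotals struct {
	toolCalls    int
	toolNames    []string
	llmCalls     int
	inputTokens  int
	outputTokens int
}

// turnCounter accumulates totals for one turn at a time.
type turnCounter struct {
	mu  sync.Mutex
	cur turnTotals
}

// start clears the totals at the beginning of a turn.
func (c *turnCounter) start() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cur = turnTotals{}
}

func (c *turnCounter) recordTool(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cur.toolCalls++
	if name != "" {
		c.cur.toolNames = append(c.cur.toolNames, name)
	}
}

func (c *turnCounter) recordStep(u stepUsage) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cur.llmCalls++
	c.cur.inputTokens += u.inputTokens
	c.cur.outputTokens += u.outputTokens
}

// snapshot returns a copy of the totals. The name slice is copied so a
// later start() or recordTool() cannot change what the caller holds.
func (c *turnCounter) snapshot() turnTotals {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := c.cur
	out.toolNames = append([]string(nil), c.cur.toolNames...)
	return out
}

func main() {
	var c turnCounter
	c.start()
	c.recordStep(stepUsage{inputTokens: 100, outputTokens: 20})
	c.recordTool("read")
	c.recordTool("bash")
	c.recordStep(stepUsage{inputTokens: 150, outputTokens: 30})

	s := c.snapshot()
	c.start() // the next turn does not affect the snapshot

	fmt.Println(s.toolCalls, s.toolNames, s.llmCalls, s.inputTokens, s.outputTokens)
	// prints "2 [read bash] 2 250 50"
}
```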
@@ -302,6 +339,32 @@ func (m *Kit) bridgeExtensions(runner *extensions.Runner) {
 		}
 	})

+	// LLMUsage: derive per-call usage from StepFinish. Each step corresponds
+	// to one LLM provider call, so the step's usage is the per-call delta.
+	// Cost is computed from the current model's pricing (zero when unknown
+	// or OAuth credentials are in use). RequestID is left empty until the
+	// SDK surfaces a correlation id from the underlying provider.
+	if runner.HasHandlers(extensions.LLMUsage) {
+		m.Subscribe(func(e Event) {
+			ev, ok := e.(StepFinishEvent)
+			if !ok {
+				return
+			}
+			provider, modelID, cost := llmUsageMeta(m, ev.Usage)
+			_, _ = runner.Emit(extensions.LLMUsageEvent{
+				InputTokens:      int(ev.Usage.InputTokens),
+				OutputTokens:     int(ev.Usage.OutputTokens),
+				CacheReadTokens:  int(ev.Usage.CacheReadTokens),
+				CacheWriteTokens: int(ev.Usage.CacheCreationTokens),
+				Cost:             cost,
+				Model:            modelID,
+				Provider:         provider,
+				StepNumber:       ev.StepNumber,
+				FinishReason:     ev.FinishReason,
+			})
+		})
+	}
+
 	bridgeObserve(m, runner, extensions.ReasoningStart, func(ev ReasoningStartEvent) extensions.Event {
 		return extensions.ReasoningStartEvent{ID: ev.ID}
 	})
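The LLMUsage bridge reports a per-call cost derived from the model's pricing, and zero when pricing is unknown. The pricing lookup itself is outside this hunk, so the following is only a sketch of the arithmetic under stated assumptions: prices are expressed in USD per one million tokens, and the four token counts are disjoint. The `pricing` and `usage` types and the `stepCost` function are hypothetical.

```go
package main

import "fmt"

// pricing (hypothetical) holds USD prices per one million tokens.
type pricing struct {
	inputPerM      float64
	outputPerM     float64
	cacheReadPerM  float64
	cacheWritePerM float64
}

// usage (hypothetical) holds the token counts of a single LLM call.
// The counts are assumed not to overlap.
type usage struct {
	inputTokens      int64
	outputTokens     int64
	cacheReadTokens  int64
	cacheWriteTokens int64
}

// stepCost returns the cost of one call. A nil pricing yields zero,
// matching the "zero when unknown" behavior described above.
func stepCost(p *pricing, u usage) float64 {
	if p == nil {
		return 0
	}
	const perMillion = 1000000.0
	return float64(u.inputTokens)/perMillion*p.inputPerM +
		float64(u.outputTokens)/perMillion*p.outputPerM +
		float64(u.cacheReadTokens)/perMillion*p.cacheReadPerM +
		float64(u.cacheWriteTokens)/perMillion*p.cacheWritePerM
}

func main() {
	p := &pricing{inputPerM: 2, outputPerM: 8, cacheReadPerM: 0.5, cacheWritePerM: 2.5}
	u := usage{inputTokens: 500000, outputTokens: 250000}

	fmt.Printf("%.2f\n", stepCost(p, u))   // prints "3.00" (1.00 input + 2.00 output)
	fmt.Printf("%.2f\n", stepCost(nil, u)) // prints "0.00"
}
```

Because each step corresponds to one provider call, summing `stepCost` over the steps of a turn would give the turn-level cost delta.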
@@ -363,6 +426,172 @@ func bridgeObserve[In Event](m *Kit, runner *extensions.Runner, kind extensions.
 	})
 }

+// turnAggregator collects per-turn signals (tool calls, LLM round-trips, token
+// usage, wall-clock duration) so that the enriched AgentEndEvent can be
+// populated without requiring extensions to maintain parallel bookkeeping.
+//
+// The aggregator resets on each TurnStartEvent and is consumed (snapshotted,
+// not reset) on TurnEndEvent. All access is serialized via a mutex because the
+// underlying event bus may fan handlers across goroutines in the future.
+type turnAggregator struct {
+	mu               sync.Mutex
+	started          time.Time
+	ended            time.Time
+	toolCallCount    int
+	toolNames        []string
+	llmCallCount     int
+	inputTokens      int
+	outputTokens     int
+	cacheReadTokens  int
+	cacheWriteTokens int
+	cost             float64
+	kit              *Kit
+}
+
+type turnSnapshot struct {
+	started          time.Time
+	ended            time.Time
+	toolCallCount    int
+	toolNames        []string
+	llmCallCount     int
+	inputTokens      int
+	outputTokens     int
+	cacheReadTokens  int
+	cacheWriteTokens int
+	cost             float64
+}
+
+func (s turnSnapshot) durationMs() int64 {
+	if s.started.IsZero() {
+		return 0
+	}
+	end := s.ended
+	if end.IsZero() {
+		end = time.Now()
+	}
+	return end.Sub(s.started).Milliseconds()
+}
+
+// start resets all counters and records the turn's start time. Called from
+// the TurnStartEvent subscriber.
+func (a *turnAggregator) start() {
+	a.mu.Lock()
+	defer a.mu.Unlock()
+	a.started = time.Now()
+	a.ended = time.Time{}
+	a.toolCallCount = 0
+	a.toolNames = nil
+	a.llmCallCount = 0
+	a.inputTokens = 0
+	a.outputTokens = 0
+	a.cacheReadTokens = 0
+	a.cacheWriteTokens = 0
+	a.cost = 0
+}
+
+func (a *turnAggregator) recordTool(name string) {
+	a.mu.Lock()
+	defer a.mu.Unlock()
+	a.toolCallCount++
+	if name != "" {
+		a.toolNames = append(a.toolNames, name)
+	}
+}
+
+func (a *turnAggregator) recordStep(usage LLMUsage) {
+	a.mu.Lock()
+	defer a.mu.Unlock()
+	a.llmCallCount++
+	a.inputTokens += int(usage.InputTokens)
+	a.outputTokens += int(usage.OutputTokens)
+	a.cacheReadTokens += int(usage.CacheReadTokens)
+	a.cacheWriteTokens += int(usage.CacheCreationTokens)
+	if a.kit != nil {
+		_, _, c := llmUsageMeta(a.kit, usage)
+		a.cost += c
+	}
+}
+
+// consume marks the turn ended and returns a snapshot of its counters. The
+// snapshot is an independent copy; the aggregator's own counters stay in
+// place until the next start() call resets them.
+func (a *turnAggregator) consume() turnSnapshot {
+	a.mu.Lock()
+	defer a.mu.Unlock()
+	a.ended = time.Now()
+	names := a.toolNames
+	if len(names) > 0 {
+		copied := make([]string, len(names))
+		copy(copied, names)
+		names = copied
+	}
+	return turnSnapshot{
+		started:          a.started,
+		ended:            a.ended,
+		toolCallCount:    a.toolCallCount,
+		toolNames:        names,
+		llmCallCount:     a.llmCallCount,
+		inputTokens:      a.inputTokens,
+		outputTokens:     a.outputTokens,
+		cacheReadTokens:  a.cacheReadTokens,
+		cacheWriteTokens: a.cacheWriteTokens,
+		cost:             a.cost,
+	}
+}
+
+// llmUsageMeta returns the current provider, model id, and computed cost for
+// the given usage values using the Kit instance's active model. Cost is zero
+// in any of the following cases:
+//   - the *Kit pointer is nil or has no active model;
+//   - the model is not in the registry (custom fine-tunes, unknown providers);
+//   - the model has no pricing fields set;
+//   - the active credential is an Anthropic OAuth token (matches the
+//     existing usage_tracker behavior of suppressing cost for OAuth users).
+func llmUsageMeta(m *Kit, usage LLMUsage) (provider, modelID string, cost float64) {
+	if m == nil {
+		return "", "", 0
+	}
+	modelString := m.GetModelString()
+	if modelString == "" {
+		return "", "", 0
+	}
+	p, id, err := models.ParseModelString(modelString)
+	if err != nil {
+		return "", "", 0
+	}
+	provider, modelID = p, id
+	info := models.GetGlobalRegistry().LookupModel(provider, modelID)
+	if info == nil {
+		return provider, modelID, 0
+	}
+	if isAnthropicOAuth(m, provider) {
+		return provider, modelID, 0
+	}
+	cost = float64(usage.InputTokens) * info.Cost.Input / 1_000_000
+	cost += float64(usage.OutputTokens) * info.Cost.Output / 1_000_000
+	if info.Cost.CacheRead != nil {
+		cost += float64(usage.CacheReadTokens) * (*info.Cost.CacheRead) / 1_000_000
+	}
+	if info.Cost.CacheWrite != nil {
+		cost += float64(usage.CacheCreationTokens) * (*info.Cost.CacheWrite) / 1_000_000
+	}
+	return provider, modelID, cost
+}
+
+// isAnthropicOAuth reports whether the current Anthropic credential resolves
+// to a stored OAuth token (in which case the user is not billed per-token).
+// Mirrors the OAuth detection in cmd/extension_context.go's usage tracker
+// update path so OnLLMUsage cost reporting agrees with ctx.GetSessionUsage().
+func isAnthropicOAuth(m *Kit, provider string) bool {
+	if m == nil || provider != "anthropic" {
+		return false
+	}
+	_, source, err := auth.GetAnthropicAPIKey(m.v.GetString("provider-api-key"))
+	if err != nil {
+		return false
+	}
+	return strings.HasPrefix(source, "stored OAuth")
+}
+
 // llmToContextMessages converts a slice of LLM messages to extension
 // ContextMessage values, extracting plain text from each message.
 func llmToContextMessages(msgs []LLMMessage) []extensions.ContextMessage {
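The cost arithmetic in `llmUsageMeta` above can be tried in isolation. The following is a minimal sketch, not Kit's code: `pricing`, `usage`, and `callCost` are hypothetical stand-ins, and the rates are made-up numbers chosen only to make the arithmetic easy to follow. It assumes, as the hunk does, that rates are quoted in USD per million tokens and that the cache rates are optional.

```go
package main

import "fmt"

// pricing holds hypothetical per-million-token USD rates. The cache rates are
// pointers so "not priced" can be told apart from "priced at zero".
type pricing struct {
	Input, Output float64
	CacheRead     *float64
	CacheWrite    *float64
}

// usage is a local stand-in for the per-call token counts.
type usage struct {
	InputTokens, OutputTokens, CacheReadTokens, CacheCreationTokens int64
}

const perMillion = 1_000_000

// callCost computes tokens * rate / 1e6 for each token class and adds the
// cache terms only when the corresponding rate is set.
func callCost(u usage, p pricing) float64 {
	cost := float64(u.InputTokens) * p.Input / perMillion
	cost += float64(u.OutputTokens) * p.Output / perMillion
	if p.CacheRead != nil {
		cost += float64(u.CacheReadTokens) * *p.CacheRead / perMillion
	}
	if p.CacheWrite != nil {
		cost += float64(u.CacheCreationTokens) * *p.CacheWrite / perMillion
	}
	return cost
}

func main() {
	cacheRead := 0.30 // illustrative rate, not real pricing
	p := pricing{Input: 3.00, Output: 15.00, CacheRead: &cacheRead}
	u := usage{
		InputTokens:         1_000_000,
		OutputTokens:        200_000,
		CacheReadTokens:     500_000,
		CacheCreationTokens: 100_000, // ignored: no cache-write rate is set
	}
	fmt.Printf("$%.4f\n", callCost(u, p))
}
```

With these illustrative rates the input and output terms contribute 3.00 each and the cache-read term 0.15; the cache-write tokens contribute nothing because no rate is set, which is the same silent under-count the hunk accepts for models without cache pricing.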
@@ -0,0 +1,140 @@
+package kit
+
+import (
+	"testing"
+	"time"
+)
+
+// TestTurnAggregator_BasicLifecycle exercises the per-turn aggregator:
+// start → record several tools and steps → consume. The snapshot should
+// reflect the accumulated counts.
+func TestTurnAggregator_BasicLifecycle(t *testing.T) {
+	agg := &turnAggregator{}
+
+	agg.start()
+	agg.recordTool("bash")
+	agg.recordTool("read")
+	agg.recordTool("bash")
+	agg.recordStep(LLMUsage{
+		InputTokens:         100,
+		OutputTokens:        50,
+		CacheReadTokens:     10,
+		CacheCreationTokens: 5,
+	})
+	agg.recordStep(LLMUsage{
+		InputTokens:  200,
+		OutputTokens: 75,
+	})
+
+	snap := agg.consume()
+	if snap.toolCallCount != 3 {
+		t.Errorf("toolCallCount: got %d want 3", snap.toolCallCount)
+	}
+	wantNames := []string{"bash", "read", "bash"}
+	if len(snap.toolNames) != len(wantNames) {
+		t.Fatalf("toolNames length: got %d want %d", len(snap.toolNames), len(wantNames))
+	}
+	for i, n := range wantNames {
+		if snap.toolNames[i] != n {
+			t.Errorf("toolNames[%d]: got %q want %q", i, snap.toolNames[i], n)
+		}
+	}
+	if snap.llmCallCount != 2 {
+		t.Errorf("llmCallCount: got %d want 2", snap.llmCallCount)
+	}
+	if snap.inputTokens != 300 {
+		t.Errorf("inputTokens: got %d want 300", snap.inputTokens)
+	}
+	if snap.outputTokens != 125 {
+		t.Errorf("outputTokens: got %d want 125", snap.outputTokens)
+	}
+	if snap.cacheReadTokens != 10 {
+		t.Errorf("cacheReadTokens: got %d want 10", snap.cacheReadTokens)
+	}
+	if snap.cacheWriteTokens != 5 {
+		t.Errorf("cacheWriteTokens: got %d want 5", snap.cacheWriteTokens)
+	}
+	if snap.durationMs() < 0 {
+		t.Errorf("durationMs should not be negative, got %d", snap.durationMs())
+	}
+}
+
+func TestTurnAggregator_StartResetsCounters(t *testing.T) {
+	agg := &turnAggregator{}
+	agg.start()
+	agg.recordTool("bash")
+	agg.recordStep(LLMUsage{InputTokens: 50})
+
+	// Begin a new turn — previous counters should be cleared.
+	agg.start()
+	snap := agg.consume()
+
+	if snap.toolCallCount != 0 || snap.llmCallCount != 0 || snap.inputTokens != 0 {
+		t.Errorf("expected counters zeroed after start(), got %+v", snap)
+	}
+	if snap.toolNames != nil {
+		t.Errorf("expected toolNames=nil after start(), got %v", snap.toolNames)
+	}
+}
+
+// TestTurnAggregator_DurationMs verifies the snapshot computes a positive
+// duration when consume() runs after start().
+func TestTurnAggregator_DurationMs(t *testing.T) {
+	agg := &turnAggregator{}
+	agg.start()
+	time.Sleep(5 * time.Millisecond)
+	snap := agg.consume()
+	if snap.durationMs() < 1 {
+		t.Errorf("expected positive duration, got %d", snap.durationMs())
+	}
+}
+
+// TestTurnAggregator_ZeroStartSafe ensures a snapshot taken without a prior
+// start() doesn't crash and reports zero duration.
+func TestTurnAggregator_ZeroStartSafe(t *testing.T) {
+	agg := &turnAggregator{}
+	snap := agg.consume()
+	if snap.durationMs() != 0 {
+		t.Errorf("expected zero duration for unstarted aggregator, got %d", snap.durationMs())
+	}
+}
+
+// TestLLMUsageMeta_NilKit verifies the helper degrades gracefully when given
+// a nil Kit instance (zero values, no panic).
+func TestLLMUsageMeta_NilKit(t *testing.T) {
+	provider, modelID, cost := llmUsageMeta(nil, LLMUsage{InputTokens: 100})
+	if provider != "" || modelID != "" || cost != 0 {
+		t.Errorf("expected zero values for nil kit, got (%q,%q,%v)", provider, modelID, cost)
+	}
+}
+
+// TestIsAnthropicOAuth_NonAnthropic verifies the helper short-circuits for any
+// provider other than "anthropic" without touching the credential store.
+func TestIsAnthropicOAuth_NonAnthropic(t *testing.T) {
+	for _, provider := range []string{"openai", "google", "openrouter", ""} {
+		if isAnthropicOAuth(nil, provider) {
+			t.Errorf("isAnthropicOAuth(nil, %q) = true, want false", provider)
+		}
+	}
+}
+
+func TestExtStateSidecarPath(t *testing.T) {
+	tests := []struct {
+		name string
+		in   string
+		want string
+	}{
+		{"empty", "", ""},
+		{"jsonl", "/tmp/sessions/abc.jsonl", "/tmp/sessions/abc.ext-state.json"},
+		{"jsonl with subdir", "/a/b/c.jsonl", "/a/b/c.ext-state.json"},
+		{"no extension", "/tmp/session-blob", "/tmp/session-blob.ext-state.json"},
+	}
+	for _, tc := range tests {
+		t.Run(tc.name, func(t *testing.T) {
+			got := extStateSidecarPath(tc.in)
+			if got != tc.want {
+				t.Errorf("extStateSidecarPath(%q): got %q want %q", tc.in, got, tc.want)
+			}
+		})
+	}
+}
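The lifecycle these tests exercise (reset, record under a mutex, hand out a copied snapshot) can be sketched on its own. This is a simplified illustration under stated assumptions: `turnStats` and its methods are hypothetical names, it tracks only three of the aggregator's fields, and it is not the implementation in the hunk above.

```go
package main

import (
	"fmt"
	"sync"
)

// turnStats is a reduced, hypothetical version of a per-turn aggregator.
type turnStats struct {
	mu        sync.Mutex
	toolNames []string
	llmCalls  int
	inTokens  int
}

// start clears everything recorded for the previous turn.
func (s *turnStats) start() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.toolNames, s.llmCalls, s.inTokens = nil, 0, 0
}

// recordTool appends one tool name, preserving call order and duplicates.
func (s *turnStats) recordTool(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.toolNames = append(s.toolNames, name)
}

// recordStep counts one LLM round-trip and adds its input tokens.
func (s *turnStats) recordStep(inputTokens int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.llmCalls++
	s.inTokens += inputTokens
}

// snapshot returns the current totals. The name slice is copied so the caller
// never shares a backing array with the aggregator; it stays nil when no
// tools were recorded.
func (s *turnStats) snapshot() (names []string, llmCalls, inTokens int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.toolNames) > 0 {
		names = append([]string(nil), s.toolNames...)
	}
	return names, s.llmCalls, s.inTokens
}

func main() {
	s := &turnStats{}
	s.start()
	s.recordTool("bash")
	s.recordTool("read")
	s.recordStep(100)
	s.recordStep(200)
	names, calls, tokens := s.snapshot()
	fmt.Println(names, calls, tokens)
}
```

Copying the slice inside the lock is what makes the snapshot safe to pass to handlers: a handler that mutates the returned names cannot affect what a later snapshot reports.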
@@ -138,6 +138,19 @@ func (m *Kit) GetToolNames() []string {
 	return names
 }

+// GetToolsForSubagent is like GetTools but omits the subagent tool to
+// avoid infinite recursion.
+func (m *Kit) GetToolsForSubagent() []Tool {
+	var tools []Tool
+	for _, t := range m.agent.GetTools() {
+		if t.Info().Name == "subagent" {
+			continue
+		}
+		tools = append(tools, t)
+	}
+	return tools
+}
+
 // GetLoadingMessage returns the agent's startup info message (e.g. GPU
 // fallback info), or empty string if none.
 func (m *Kit) GetLoadingMessage() string {
@@ -1814,8 +1827,14 @@ type SubagentConfig struct {
 	// Empty string uses a minimal default prompt.
 	SystemPrompt string

-	// Tools overrides the tool set. If nil, SubagentTools() is used (all
-	// core tools except subagent, preventing infinite recursion).
+	// Tools overrides the tool set available to the subagent.
+	// If nil and the subagent is created via the SDK (Kit.Subagent()), the
+	// static SubagentTools() set (all core tools except "subagent") is used.
+	// When spawned internally by the agent loop, the parent's active tools
+	// minus "subagent" are used instead (see GetToolsForSubagent()).
+	// Pass m.GetToolsForSubagent() explicitly to opt into inheritance from
+	// SDK call sites.
+	// (The subagent tool is dropped to prevent infinite recursion.)
 	Tools []Tool

 	// NoSession, when true, uses an in-memory ephemeral session. When false
@@ -2076,6 +2095,7 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 			SystemPrompt: systemPrompt,
 			Timeout:      timeout,
 			OnEvent:      onEvent,
+			Tools:        m.GetToolsForSubagent(),
 		})
 		m.cleanupSubagentListeners(toolCallID)
 		if result == nil {
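The inheritance rule in this change (take the parent's tools, drop `subagent`) is a plain filter. Below is a minimal sketch with hypothetical types: `tool`, `namedTool`, and `withoutTool` are illustrative names, not Kit's `Tool` interface or `GetToolsForSubagent`.

```go
package main

import "fmt"

// tool is a hypothetical minimal interface; the real Tool type exposes its
// name differently (via Info().Name in the hunk above).
type tool interface{ Name() string }

// namedTool is a trivial tool implementation for the demo.
type namedTool string

func (n namedTool) Name() string { return string(n) }

// withoutTool returns the parent's tools minus the excluded name. It always
// returns a non-nil slice, even when nothing is left.
func withoutTool(parent []tool, excluded string) []tool {
	out := make([]tool, 0, len(parent))
	for _, t := range parent {
		if t.Name() == excluded {
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	parent := []tool{namedTool("bash"), namedTool("subagent"), namedTool("read")}
	for _, t := range withoutTool(parent, "subagent") {
		fmt.Println(t.Name())
	}
}
```

One design point may be worth checking against the real code: the `SubagentConfig.Tools` comment says a nil slice selects the static `SubagentTools()` set, while `GetToolsForSubagent` as written returns nil when no tools remain after filtering. Whether Kit treats nil and empty differently on this path is not visible in the diff; the sketch sidesteps the question by returning an empty, non-nil slice.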
@@ -88,7 +88,8 @@ api.OnAgentStart(func(e ext.AgentStartEvent, ctx ext.Context) {
    // e.Prompt string
 })

-// Agent finished responding.
+// Agent finished responding. Carries per-turn aggregates so observer-style
+// extensions don't need to maintain parallel bookkeeping.
 api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
    // e.Response string
    // e.StopReason string — "error" (on failure), "completed" (when LLM returns
@@ -96,6 +97,33 @@ api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
    //   (e.g. "stop", "length" (max output tokens hit), "tool-calls", "content-filter").
    //   To detect errors, check e.StopReason == "error".
    //   Do NOT compare against "completed" for success — instead check != "error".
+    //
+    // Per-turn aggregates (computed by Kit's runtime):
+    // e.ToolCallCount          int       — total tool invocations this turn
+    // e.ToolNames              []string  — tool names in call order (duplicates preserved)
+    // e.LLMCallCount           int       — LLM round-trips / tool-loop iterations
+    // e.InputTokensDelta       int       — sum of input tokens across LLM calls this turn
+    // e.OutputTokensDelta      int
+    // e.CacheReadTokensDelta   int
+    // e.CacheWriteTokensDelta  int
+    // e.CostDelta              float64   — USD cost (zero when pricing unknown / OAuth)
+    // e.DurationMs             int64     — wall-clock duration AgentStart→AgentEnd
+})
+
+// Per-LLM-call usage — fires after each provider round-trip with token + cost
+// deltas attributed to that specific call. A single turn typically produces
+// multiple LLMUsageEvents (one per tool-loop iteration). Use this for accurate
+// budget enforcement that needs to react between calls instead of waiting
+// for the turn to finish.
+api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
+    // e.InputTokens, e.OutputTokens             int
+    // e.CacheReadTokens, e.CacheWriteTokens     int
+    // e.Cost                                    float64  — USD; zero when pricing unknown / OAuth
+    // e.Model, e.Provider                       string   — model used for THIS call
+    //                                                      (may differ across calls if SetModel was called)
+    // e.StepNumber                              int      — zero-based step index in this turn
+    // e.FinishReason                            string   — "stop" / "tool_calls" / "length" / ...
+    // e.RequestID                               string   — optional provider correlation id (may be empty)
 })
 ```
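The budget-enforcement use case mentioned for `OnLLMUsage` boils down to a running total checked after every call. The sketch below shows only that accounting, under stated assumptions: `llmUsageEvent` is a local stand-in carrying two of the documented fields, and `budget` and `observe` are hypothetical names. It does not call the extension API, and it is not safe for concurrent use as written.

```go
package main

import "fmt"

// llmUsageEvent is a local stand-in with only the fields this sketch needs.
type llmUsageEvent struct {
	Cost       float64 // USD for this call; zero when pricing is unknown
	StepNumber int     // zero-based step index within the turn
}

// budget tracks spend against a cap. A real handler that may run on several
// goroutines would need a mutex around spent.
type budget struct {
	capUSD float64
	spent  float64
}

// observe adds one call's cost and reports whether the cap is now exceeded.
func (b *budget) observe(e llmUsageEvent) (exceeded bool) {
	b.spent += e.Cost
	return b.spent > b.capUSD
}

func main() {
	b := &budget{capUSD: 0.10}
	// Three simulated round-trips within one turn.
	for step, cost := range []float64{0.04, 0.05, 0.03} {
		if b.observe(llmUsageEvent{Cost: cost, StepNumber: step}) {
			fmt.Printf("budget exceeded at step %d (spent $%.2f)\n", step, b.spent)
			break
		}
	}
}
```

Because `Cost` is zero when pricing is unknown or OAuth credentials are in use, a cap based on cost alone would never trip in those cases; a token-based cap on `InputTokens` and `OutputTokens` is one possible fallback.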

@@ -528,11 +556,38 @@ stats := ctx.GetContextStats()     // .EstimatedTokens, .ContextLimit, .UsagePer
 msgs := ctx.GetMessages()          // []ext.SessionMessage on current branch
 path := ctx.GetSessionPath()       // file path of session JSONL

-// Persist custom data in the session tree:
+// Append-only log in the session tree (fork-aware, walked on every branch read):
 id, err := ctx.AppendEntry("my-type", "data string")
 entries := ctx.GetEntries("my-type")  // []ext.ExtensionEntry{ID, EntryType, Data, Timestamp}
 ```

+### Session State (last-write-wins)
+
+Key-value store scoped to the session, persisted to a sidecar file
+(`<session>.ext-state.json`) outside the conversation tree. Reads are O(1)
+(no branch walk), writes don't grow the JSONL, and the store is not
+duplicated on fork. State is invisible to the LLM and survives session
+resume. For ephemeral / in-memory sessions, state lives only in memory.
+
+```go
+ctx.SetState("myext:budget-cap", "10.00")          // last write wins
+val, ok := ctx.GetState("myext:budget-cap")        // (string, bool)
+ctx.DeleteState("myext:budget-cap")                // no-op if missing
+keys := ctx.ListState()                            // []string, unspecified order
+```
+
+**When to use which:**
+
+| Need | Use |
+|------|-----|
+| Snapshot state ("current value of X") | `SetState` / `GetState` |
+| Audit log / event history | `AppendEntry` / `GetEntries` |
+| One-shot per-turn signal | enriched `AgentEndEvent` fields |
+| Per-LLM-call observation | `OnLLMUsage` event |
+
+Namespace keys with your extension name (e.g. `"myext:budget-cap"`) to avoid
+collisions across extensions.
+
 ### Model Management

 ```go
@@ -1104,6 +1104,19 @@ if extAPI.HasExtensions() {
    tools := extAPI.GetToolInfos()
    extAPI.SetActiveTools([]string{"bash", "read"})

+    // Session-scoped extension state (last-write-wins key-value store).
+    // Backed by an in-memory map and a per-session sidecar file
+    // (<session>.ext-state.json) outside the conversation tree.
+    extAPI.SetState("myext:budget-cap", "10.00")
+    val, ok := extAPI.GetState("myext:budget-cap")
+    extAPI.DeleteState("myext:budget-cap")
+    keys := extAPI.ListState()
+
+    // Load any existing state from the sidecar and install a saver hook so
+    // subsequent SetState/DeleteState mutations are flushed atomically.
+    // No-op for ephemeral / in-memory sessions. Safe to call multiple times.
+    _ = extAPI.InitStatePersistence()
+
    // Events
    extAPI.EmitSessionStart()
    extAPI.EmitModelChange("new/model", "old/model", "extension")
@@ -7,7 +7,7 @@ description: All extension capabilities — lifecycle events, tools, commands, w

 ## Lifecycle events

-Extensions can hook into 26 lifecycle events:
+Extensions can hook into 27 lifecycle events:

 | Event | Description |
 |-------|-------------|
@@ -15,7 +15,8 @@ Extensions can hook into 26 lifecycle events:
 | `OnSessionShutdown` | Session ending |
 | `OnBeforeAgentStart` | Before the agent loop begins |
 | `OnAgentStart` | Agent loop started |
-| `OnAgentEnd` | Agent loop completed |
+| `OnAgentEnd` | Agent loop completed (carries per-turn aggregates: tool counts, token deltas, cost, duration) |
+| `OnLLMUsage` | Per-LLM-call token + cost delta (fires once per provider round-trip) |
 | `OnToolCall` | Tool call requested by the model |
 | `OnToolCallInputStart` | LLM began generating tool call arguments (tool name known, args streaming) |
 | `OnToolCallInputDelta` | Streamed JSON fragment of tool call arguments |
@@ -45,11 +46,52 @@ api.OnToolCall(func(event ext.ToolCallEvent, ctx ext.Context) {
    ctx.PrintInfo("Calling tool: " + event.Name)
 })

-api.OnAgentEnd(func(_ ext.AgentEndEvent, ctx ext.Context) {
-    ctx.PrintInfo("Agent finished")
+api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
+    // Per-turn aggregates populated by Kit's runtime — no parallel
+    // bookkeeping required in the handler.
+    ctx.PrintInfo(fmt.Sprintf(
+        "Turn finished: %d tool calls (%v), %d LLM round-trips, $%.4f, %dms",
+        e.ToolCallCount, e.ToolNames, e.LLMCallCount, e.CostDelta, e.DurationMs,
+    ))
+})
+
+// Per-LLM-call usage — fires multiple times per turn (once per round-trip).
+// Use for accurate budget enforcement between calls.
+api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
+    ctx.PrintInfo(fmt.Sprintf(
+        "%s/%s step=%d tokens=↑%d ↓%d cost=$%.4f (%s)",
+        e.Provider, e.Model, e.StepNumber,
+        e.InputTokens, e.OutputTokens, e.Cost, e.FinishReason,
+    ))
 })
 ```

+**`AgentEndEvent` fields** (in addition to `Response` and `StopReason`):
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `ToolCallCount` | `int` | Total tool invocations during the turn |
+| `ToolNames` | `[]string` | Tool names in call order (duplicates preserved) |
+| `LLMCallCount` | `int` | LLM round-trips / tool-loop iterations |
+| `InputTokensDelta` | `int` | Sum of input tokens across all LLM calls this turn |
+| `OutputTokensDelta` | `int` | Sum of output tokens across all LLM calls this turn |
+| `CacheReadTokensDelta` | `int` | Sum of cache-read tokens this turn |
+| `CacheWriteTokensDelta` | `int` | Sum of cache-write tokens this turn |
+| `CostDelta` | `float64` | Cost in USD (zero when pricing is unknown or Anthropic OAuth credentials are in use) |
+| `DurationMs` | `int64` | Wall-clock time from `AgentStart` to `AgentEnd` |
+
+**`LLMUsageEvent` fields**:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `InputTokens` / `OutputTokens` | `int` | Per-call token deltas |
+| `CacheReadTokens` / `CacheWriteTokens` | `int` | Per-call cache token deltas |
+| `Cost` | `float64` | Per-call USD cost (zero when pricing is unknown or Anthropic OAuth credentials are in use) |
+| `Model` / `Provider` | `string` | Model used for this specific call — may differ from earlier calls if `ctx.SetModel` was called mid-turn |
+| `StepNumber` | `int` | Zero-based step index within the turn |
+| `FinishReason` | `string` | Provider finish reason for this call (`"stop"`, `"tool_calls"`, `"length"`, ...) |
+| `RequestID` | `string` | Optional provider correlation id (may be empty) |
+
 ## Tools

 Register custom tools that the LLM can invoke:
@@ -338,6 +380,36 @@ api.OnCustomEvent("my-extension:data-ready", func(data any, ctx ext.Context) {
 })
 ```

+## Session state
+
+Last-write-wins key-value store, scoped to the current session and persisted to a sidecar file (`<session>.ext-state.json`) outside the conversation tree:
+
+```go
+ctx.SetState("myext:budget-cap", "10.00")
+
+if cap, ok := ctx.GetState("myext:budget-cap"); ok {
+    // ...
+}
+
+ctx.DeleteState("myext:budget-cap")
+keys := ctx.ListState()  // []string, unspecified order
+```
+
+Reads are O(1) (no branch walk), writes don't grow the session JSONL, and the store is not duplicated when the conversation forks. State is invisible to the LLM and survives session resume.
+
+### When to use which persistence primitive
+
+| Need | Use | Why |
+|------|-----|-----|
+| Snapshot state ("current value of X") | `SetState` / `GetState` | O(1) reads, sidecar file, last-write-wins |
+| Audit log / event history | `AppendEntry` / `GetEntries` | Append-only, lives in conversation tree, fork-aware |
+| One-shot per-turn signal | Enriched `AgentEndEvent` fields | No persistence needed; runtime tracks it for you |
+| Per-LLM-call observation | `OnLLMUsage` event | Already attributed to model/provider/step |
+
+Using `AppendEntry` for snapshot state has a cost: it's O(branch_length) to read, fsyncs into the JSONL on every write, and the entry list duplicates on every fork. Prefer `SetState` for "what's the current value of X?"-style data.
+
+For ephemeral / in-memory sessions (no JSONL path) the state lives only in memory for the lifetime of the runner.
+
 ## Bridged SDK APIs

 Extensions can access powerful internal SDK capabilities that enable advanced features like conversation tree navigation, dynamic skill loading, template parsing, and model resolution.
@@ -50,6 +50,7 @@ Kit ships with a rich set of example extensions in the `examples/extensions/` di
 | [`context-inject.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/context-inject.go) | Inject context into conversations |
 | [`summarize.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/summarize.go) | Conversation summarization |
 | [`lsp-diagnostics.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/lsp-diagnostics.go) | LSP diagnostic integration |
+| [`usage-budget.go`](https://github.com/mark3labs/kit/blob/master/examples/extensions/usage-budget.go) | Per-call usage callback (`OnLLMUsage`), session state (`SetState`/`GetState`), and enriched `OnAgentEnd` per-turn report |

 ## Bridged SDK APIs

@@ -65,7 +65,8 @@ Passed to event handlers, the `Context` object provides runtime access to Kit's
 - **Model** — `ctx.SetModel(...)`, `ctx.GetAvailableModels()`
 - **Tools** — `ctx.GetAllTools()`, `ctx.SetActiveTools(...)`
 - **Context stats** — `ctx.GetContextStats()`
- **Session data** — `ctx.AppendEntry(...)`, `ctx.GetEntries(...)`
+- **Session data** — `ctx.AppendEntry(...)`, `ctx.GetEntries(...)` (append-only, in conversation tree)
+- **Session state** — `ctx.SetState(...)`, `ctx.GetState(...)`, `ctx.DeleteState(...)`, `ctx.ListState()` (last-write-wins, sidecar file)
 - **Subagents** — `ctx.SpawnSubagent(...)`
 - **LLM completion** — `ctx.Complete(...)`
 - **Custom events** — `ctx.EmitCustomEvent(...)`
@@ -13,6 +13,7 @@ Kit supports a wide range of LLM providers through a unified `provider/model` st
 |----------|--------|-------------|
 | **Anthropic** | `anthropic/` | Claude models (native, prompt caching, OAuth) |
 | **OpenAI** | `openai/` | GPT models |
+| **GitHub Copilot** | `copilot/` | Copilot models through GitHub device login (experimental) |
 | **Google** | `google/` or `gemini/` | Gemini models |
 | **Ollama** | `ollama/` | Local models |
 | **Azure OpenAI** | `azure/` | Azure-hosted OpenAI |
@@ -29,6 +30,7 @@ Kit supports a wide range of LLM providers through a unified `provider/model` st
 provider/model            # Standard format
 anthropic/claude-sonnet-latest
 openai/gpt-4o
+copilot/gpt-5.5
 ollama/llama3
 google/gemini-2.5-flash
 ```
@@ -117,14 +119,19 @@ kit --provider-api-key "sk-..." --model openai/gpt-4o

 ### OAuth

-For providers that support OAuth (e.g., Anthropic):
+For providers that support OAuth:

 ```bash
-kit auth login anthropic     # Start OAuth flow
+kit auth login anthropic     # Anthropic OAuth
+kit auth login openai        # ChatGPT/Codex OAuth
+kit auth login copilot       # GitHub Copilot device login (experimental)
 kit auth status              # Check authentication status
-kit auth logout anthropic    # Remove credentials
+kit auth logout copilot      # Remove credentials
 ```

+The experimental `copilot/` provider requires an active GitHub Copilot subscription
+and uses GitHub device login; no OpenAI account or OpenAI API key is required.
+
 ### Custom provider URL

 For self-hosted or proxy endpoints:
@@ -133,6 +140,15 @@ For self-hosted or proxy endpoints:
 kit --provider-url "https://my-proxy.example.com/v1" --model openai/gpt-4o
 ```

+When `--provider-url` is set with an explicit `--model`, Kit routes through the
+`custom` (OpenAI-compatible) wire and strips any provider prefix from the model
+name. So `openai/gpt-4o`, `google/gemma-4-12b`, and bare `gpt-4o` all resolve
+to the same endpoint — Kit treats `--provider-url` as authoritative about *where*
+to send the request, and the model string as just the upstream model id.
+
+This avoids name collisions when a local server (LM Studio, Ollama, vLLM, ...)
+happens to expose a model whose name matches a known cloud provider.
+
 When `--provider-url` is provided without `--model`, Kit automatically defaults to `custom/custom`:

 ```bash
Author	SHA1	Message	Date
Egbert Eich	ef072f6e59	Make subagent inherit tools from parent (#51 ) While the tool list of the main agent could be controlled by several options, subagent used to be equipped with all available tools (except for the subagent tool itself). With this change the list of tools is taken from the parent, the subagent tool itself is removed and the remaining tool list is added to the subagent. Signed-off-by: Egbert Eich <eich@suse.com>	2026-06-09 16:28:01 +03:00
Ed Zynda	49f8b485be	feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53 ) (#54 ) * feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53) Three additive primitives to the extension API: - OnLLMUsage event: per-LLM-call token + cost deltas attributed to the specific model/provider used for each round-trip. Derived from the SDK StepFinishEvent in the extension bridge. Enables accurate budget enforcement between calls instead of only at turn boundaries. - ctx.SetState / GetState / DeleteState / ListState: session-scoped, last-write-wins key-value store backed by a sidecar file (<session>.ext-state.json) outside the conversation tree. Reads are O(1), writes don't grow the JSONL, and the store is not duplicated on fork. State is preserved across hot-reloads. - Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token deltas (input/output/cache-read/cache-write), CostDelta, and DurationMs populated by a per-turn aggregator. Existing handlers reading only Response/StopReason are unaffected. Includes unit tests for the state store, LLMUsage registration, enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path derivation. Adds examples/extensions/usage-budget.go demoing all three primitives together. Documents the additions in README, the docs site (extensions overview, capabilities, examples), and the kit-extensions and kit-sdk skill guides. Fixes #53 * fix(extensions): address review feedback on state store and llmUsageMeta - Serialize SetState/DeleteState saver invocations through a new saverMu so overlapping atomic-rename writes can no longer race on the shared .tmp file and persist an older snapshot after a newer one. - LoadStateFromFile now clears the in-memory store when the sidecar is missing or empty, matching the documented "replace … with its contents" contract. This makes session-switching safe by preventing keys from a prior session leaking into a new one. 
Tests updated to cover both the missing-file and empty-file cases. - llmUsageMeta now detects Anthropic OAuth credentials and returns Cost=0, matching the comment and the existing usage_tracker behavior for OAuth users. Mirrors the OAuth detection already used in cmd/extension_context.go. - Document the single-in-flight-turn assumption baked into the per-turn aggregator with a clear migration path (per-turn ID) for if concurrent turns ever become a supported use case. * fix(extensions): release saverMu on panic in state store Extract a runSaver helper that locks saverMu and defers Unlock before invoking the persistence callback. Without the deferred Unlock, a panic inside the saver (e.g. disk full mid-write) would leave saverMu held forever and deadlock the next SetState/DeleteState. Both SetState and DeleteState now route through the helper. New TestRunner_State_Saver PanicReleasesSaverMu reproduces the deadlock window with a 2s deadline and proves the mutex is released after a panic.	2026-06-09 16:18:10 +03:00
Nuno do Carmo	febdc530e1	Feat/copilot login (#49 ) * feat(auth): add Copilot login Add experimental GitHub Copilot device login and copilot/* provider support for users with Copilot access but no OpenAI account. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot): use responses for GPT-5 Route Copilot GPT-5 models through the Responses API because gpt-5.5 is not available on /chat/completions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot): honor device flow timing * docs(copilot): add auth helper docstrings * fix(auth): address copilot review feedback --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-08 00:21:20 +03:00
Ed Zynda	e610bdd2d0	fix(cmd): route prefixed models through custom wire when --provider-url is set When --provider-url was set with an explicit --model that already carried a provider prefix (e.g. google/gemma-4-12b served by LM Studio), Kit honored the prefix and routed through the Google wire protocol instead of the user-supplied endpoint, producing confusing upstream errors. - Strip any non-custom provider prefix from --model when --provider-url is set, so the request always lands on the OpenAI-compatible custom wire pointed at the user's URL. - Leave behavior unchanged when --provider-url is absent. - Document the rewrite in www/pages/providers.md.	2026-06-07 22:03:51 +03:00