feat(skills): full agentskills.io spec compliance (#71)

* feat(skills): full agentskills.io spec compliance

  - escape catalog XML and drop file:// prefix on <location>
  - skip skills missing a required description; add Skill.Validate
  - add license/compatibility/metadata/allowed-tools/disable-model-invocation
    frontmatter fields plus a malformed-YAML (unquoted colon) fallback
  - scan ~/.agents/skills and dedupe by name with project>user precedence
  - treat --skills-dir as a direct directory; add --skill-disable +
    DisableSkill/EnableSkill SDK methods
  - enumerate bundled resources via <skill_resources> on activation
  - add activate_skill MCP tool with enum-constrained name and session dedup
  - protect activated skill content from compaction pruning
  - gate project-local skills on a persisted trust allowlist via
    SkillTrustPrompt and an interactive CLI prompt
  - document new fields, flags, and SDK surface across README and docs site

  Fixes #65

* fix(skills): address skill loading and activation review findings

  - log (instead of discard) genuine errors from skill directory loads so
    permission/read failures no longer yield a silently partial catalog
  - make activate_skill dedup atomic by holding the lock across check and
    mark, preventing concurrent double-activation
  - reject activation of disable-model-invocation skills in the tool's
    runtime lookup, mirroring their catalog/enum exclusion
  - add regression test for disabled-skill activation
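
The "dedup atomic by holding the lock across check and mark" item in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the PR: `activationSet`, `tryActivate`, and `countFirstActivations` are hypothetical names chosen for illustration, and the real session state may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// activationSet is a hypothetical stand-in for per-session activation state.
type activationSet struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newActivationSet() *activationSet {
	return &activationSet{seen: make(map[string]bool)}
}

// tryActivate reports whether the caller is the first to activate name.
// The lock is held across both the check and the mark, so two concurrent
// callers can never both observe "not yet activated".
func (s *activationSet) tryActivate(name string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.seen[name] {
		return false
	}
	s.seen[name] = true
	return true
}

// countFirstActivations races n goroutines on a single skill name and
// returns how many of them were told they activated it first.
func countFirstActivations(n int) int {
	set := newActivationSet()
	var wg sync.WaitGroup
	var mu sync.Mutex
	wins := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if set.tryActivate("refund-policy") {
				mu.Lock()
				wins++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return wins
}

func main() {
	fmt.Println("first activations:", countFirstActivations(50))
}
```

If the lock were released between the check and the mark, more than one goroutine could win the race; with a single critical section, exactly one caller sees `true`.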
feat(sdk): harden pkg/kit embedder surface with scoped additions (#69)
2026-06-19 05:45:40 +00:00 · 2026-06-18 19:37:53 +03:00 · 2026-06-18 18:18:54 +03:00 · 2026-06-18 14:46:03 +03:00 · 2026-06-18 12:42:11 +03:00 · 2026-06-18 12:37:37 +03:00
67 changed files with 4132 additions and 665 deletions
@@ -130,11 +130,13 @@ stream: true
 thinking-level: off       # off, none, minimal, low, medium, high
 no-core-tools: false      # set to true to disable all built-in core tools

-# Skills — all three keys are optional
+# Skills — all keys are optional
 no-skills: false          # set to true to disable all skill loading
 skill:                    # explicit skill files/dirs (disables auto-discovery)
  - /path/to/skill.md
-skills-dir: ""            # override project-local directory for auto-discovery
+skills-dir: ""            # scan this directory directly for skills (overrides auto-discovery)
+skill-disable:            # hide skills from the model catalog by name (still usable via /skill:)
+  - some-skill
 ```

 All of the above keys can also be set programmatically via the SDK
@@ -212,7 +214,8 @@ mcpServers:

 # Skills
 --skill                  Load skill file or directory (repeatable)
---skills-dir             Override the project-local skills directory for auto-discovery
+--skills-dir             Scan this directory directly for skills (overrides auto-discovery)
+--skill-disable          Hide a skill from the model catalog by name (repeatable); still usable via /skill:
 --no-skills              Disable skill loading (auto-discovery and explicit)

 # Generation parameters
@@ -691,10 +694,10 @@ host, err := kit.NewAgent(ctx,

 Available options: `WithModel`, `WithSystemPrompt`, `WithStreaming`,
 `WithMaxTokens`, `WithThinkingLevel`, `WithTools`, `WithExtraTools`,
-`WithProviderAPIKey`, `WithProviderURL`, `WithConfigFile`, `WithDebug`, and
-`Ephemeral`. For advanced configuration not covered by the helpers (custom MCP
-config, in-process MCP servers, session backends, MCP task tuning) construct an
-`Options` value explicitly and call `kit.New`.
+`WithProviderAPIKey`, `WithProviderURL`, `WithConfigFile`, `WithDebug`,
+`WithDebugLogger`, and `Ephemeral`. For advanced configuration not covered by
+the helpers (custom MCP config, in-process MCP servers, session backends, MCP
+task tuning) construct an `Options` value explicitly and call `kit.New`.

 ### Per-instance config isolation

@@ -890,6 +893,11 @@ host.AddContextFileContent(
 host.RemoveSkill("polite-french")
 host.RemoveContextFile(fmt.Sprintf("session://%s/AGENTS.md", userID))

+// Hide a skill from the model catalog without unloading it (still usable
+// via /skill:); EnableSkill reverses it.
+host.DisableSkill("refund-policy")
+host.EnableSkill("refund-policy")
+
 // Or replace the whole set atomically.
 host.SetSkills(activeSkillsForUser)
 host.SetContextFiles(activeContextForUser)
@@ -69,6 +69,9 @@ func buildInteractiveExtensionContext(deps extensionContextDeps) extensions.Cont
 		}
 		appInstance.RunWithFiles(text, parts)
 	}
+	ec.NewSession = func(prompt string) error {
+		return appInstance.RequestNewSessionFromExtension(prompt)
+	}
 	ec.GetSessionUsage = func() extensions.SessionUsage {
 		if usageTracker == nil {
 			return extensions.SessionUsage{}
@@ -74,9 +74,10 @@ var (
 	extensionPaths   []string

 	// Skills control
-	noSkillsFlag bool
-	skillsPaths  []string
-	skillsDir    string
+	noSkillsFlag  bool
+	skillsPaths   []string
+	skillsDir     string
+	skillsDisable []string

 	// TLS configuration
 	tlsSkipVerify bool
@@ -294,7 +295,9 @@ func init() {
 	rootCmd.PersistentFlags().
 		StringSliceVar(&skillsPaths, "skill", nil, "load skill file or directory (repeatable)")
 	rootCmd.PersistentFlags().
-		StringVar(&skillsDir, "skills-dir", "", "override the project-local skills directory for auto-discovery")
+		StringVar(&skillsDir, "skills-dir", "", "scan this directory directly for skills (overrides auto-discovery)")
+	rootCmd.PersistentFlags().
+		StringSliceVar(&skillsDisable, "skill-disable", nil, "hide a skill from the model catalog by name (repeatable); still usable via /skill:")

 	flags := rootCmd.PersistentFlags()
 	flags.StringVar(&providerURL, "provider-url", "", "base URL for the provider API (applies to OpenAI, Anthropic, Ollama, and Google)")
@@ -349,6 +352,7 @@ func init() {
 	_ = viper.BindPFlag("no-skills", rootCmd.PersistentFlags().Lookup("no-skills"))
 	_ = viper.BindPFlag("skill", rootCmd.PersistentFlags().Lookup("skill"))
 	_ = viper.BindPFlag("skills-dir", rootCmd.PersistentFlags().Lookup("skills-dir"))
+	_ = viper.BindPFlag("skill-disable", rootCmd.PersistentFlags().Lookup("skill-disable"))

 	// Defaults are already set in flag definitions, no need to duplicate in viper

@@ -670,13 +674,16 @@ func beforeForkProviderForUI(k *kit.Kit) func(string, bool, string) (bool, strin

 // beforeSessionSwitchProviderForUI returns a callback that emits a
 // BeforeSessionSwitch event and returns (cancelled, reason). Returns nil
-// if extensions are disabled — the UI treats nil as "no hook".
-func beforeSessionSwitchProviderForUI(k *kit.Kit) func(string) (bool, string) {
+// if extensions are disabled — the UI treats nil as "no hook". The
+// initialPrompt argument is forwarded to the event so extensions can
+// inspect the prompt that will be submitted as the first turn of the
+// new session.
+func beforeSessionSwitchProviderForUI(k *kit.Kit) func(switchReason, initialPrompt string) (bool, string) {
 	if !k.Extensions().HasExtensions() {
 		return nil
 	}
-	return func(switchReason string) (bool, string) {
-		return k.Extensions().EmitBeforeSessionSwitch(switchReason)
+	return func(switchReason, initialPrompt string) (bool, string) {
+		return k.Extensions().EmitBeforeSessionSwitchWithPrompt(switchReason, initialPrompt)
 	}
 }

@@ -839,6 +846,8 @@ func runNormalMode(ctx context.Context) error {
 		NoSkills:         noSkillsFlag,
 		Skills:           skillsPaths,
 		SkillsDir:        skillsDir,
+		SkillsDisable:    skillsDisable,
+		SkillTrustPrompt: skillTrustPrompt(),
 		// This callback is called when each MCP server finishes loading.
 		// We use a closure that captures appInstancePtr which is set after
 		// app.New() is called below.
@@ -1487,7 +1496,7 @@ type runModeDeps struct {
 	getUIVisibility          func() *ui.UIVisibility
 	getStatusBarEntries      func() []ui.StatusBarEntryData
 	emitBeforeFork           func(string, bool, string) (bool, string)
-	emitBeforeSessionSwitch  func(string) (bool, string)
+	emitBeforeSessionSwitch  func(string, string) (bool, string)
 	getGlobalShortcuts       func() map[string]func()
 	getExtensionCommands     func() []commands.ExtensionCommand
 	setModel                 func(string) error
@@ -0,0 +1,52 @@
+package cmd
+
+import (
+	"bufio"
+	"fmt"
+	"os"
+	"strings"
+
+	"golang.org/x/term"
+
+	"github.com/mark3labs/kit/pkg/kit"
+)
+
+// skillTrustPrompt returns a callback that gates project-local skill loading
+// on an interactive trust decision (issue #65, gap #8). Project-local skills
+// are injected into the system prompt, so a freshly cloned untrusted repo
+// could smuggle instructions into the agent. The prompt asks the user whether
+// to trust the directory before any project skill is loaded.
+//
+// It returns nil — meaning "load without prompting" — when Kit is not running
+// interactively (a non-TTY stdin, --quiet, or a non-interactive one-shot
+// prompt), so scripted and piped invocations keep their existing behaviour.
+func skillTrustPrompt() func(projectDir string, skillCount int) kit.TrustDecision {
+	// Only prompt for interactive terminal sessions.
+	if quietFlag || positionalPrompt != "" {
+		return nil
+	}
+	if !term.IsTerminal(int(os.Stdin.Fd())) {
+		return nil
+	}
+
+	return func(projectDir string, skillCount int) kit.TrustDecision {
+		noun := "skills"
+		if skillCount == 1 {
+			noun = "skill"
+		}
+		fmt.Printf("\nThis project provides %d %s under .agents/skills or .kit/skills:\n  %s\n",
+			skillCount, noun, projectDir)
+		fmt.Print("Load them into the agent? [t]rust always / [o]nce / [s]kip (default skip): ")
+
+		reader := bufio.NewReader(os.Stdin)
+		line, _ := reader.ReadString('\n')
+		switch strings.ToLower(strings.TrimSpace(line)) {
+		case "t", "trust", "a", "always":
+			return kit.TrustProject
+		case "o", "once", "y", "yes":
+			return kit.TrustProjectOnce
+		default:
+			return kit.SkipProjectSkills
+		}
+	}
+}
@@ -0,0 +1,110 @@
+//go:build ignore
+
+// phase-handoff.go demonstrates ctx.NewSession by automating the multi-phase
+// workflow pattern: the agent works through a spec, writes a HANDOFF.md at
+// the end of each phase, then a fresh session picks up where the last one
+// left off.
+//
+// Two trigger modes are provided:
+//
+//  1. Automatic — when an assistant message ends with the sentinel
+//     "<HANDOFF_READY>", the extension starts a new session and pre-loads
+//     HANDOFF.md as the first prompt. Use this when you want the agent to
+//     hand off control to itself with no user intervention.
+//
+//  2. Manual — the /handoff slash command starts a new session immediately
+//     with the same handoff prompt. Useful when you finish a phase by hand
+//     and want to clear the context window before the next one starts.
+//
+// Usage:
+//
+//	kit -e examples/extensions/phase-handoff.go
+//
+// Have your spec-driving agent write a HANDOFF.md at the end of each phase
+// and finish its message with the literal string `<HANDOFF_READY>`. The
+// next session boots automatically and reads HANDOFF.md as @file context.
+
+package main
+
+import (
+	"strings"
+
+	"kit/ext"
+)
+
+// HANDOFFSentinel is the marker the agent appends to its last message to
+// request an automatic session switch. Change this to whatever fits your
+// workflow.
+const HANDOFFSentinel = "<HANDOFF_READY>"
+
+// HANDOFFPrompt is the first prompt the new session receives. The
+// "@HANDOFF.md" token triggers Kit's @file expansion, inlining the handoff
+// file's contents as XML-wrapped context.
+const HANDOFFPrompt = "Read @HANDOFF.md and continue with the next phase."
+
+func Init(api ext.API) {
+	// Automatic trigger: detect the sentinel at the end of an agent turn.
+	api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
+		msgs := ctx.GetMessages()
+		if len(msgs) == 0 {
+			return
+		}
+		last := msgs[len(msgs)-1]
+		if last.Role != "assistant" || !strings.Contains(last.Content, HANDOFFSentinel) {
+			return
+		}
+
+		// NewSession blocks while the agent finishes settling and then while
+		// the TUI completes the switch; run it in a goroutine so the agent's
+		// turn-end pipeline isn't stalled. The internal wait-for-idle (added
+		// in response to issue #63) makes this reliable even when post-turn
+		// tooling (formatters, on-save hooks, hidden tool calls) extends the
+		// busy window past AgentEnd.
+		go func() {
+			if err := ctx.NewSession(HANDOFFPrompt); err != nil {
+				ctx.PrintError("phase-handoff: " + err.Error())
+				return
+			}
+			ctx.PrintInfo("phase-handoff: started a fresh session from HANDOFF.md")
+		}()
+	})
+
+	// Manual trigger: /handoff [optional override prompt]
+	api.RegisterCommand(ext.CommandDef{
+		Name:        "handoff",
+		Description: "Start a new session, optionally with a custom prompt",
+		Execute: func(args string, ctx ext.Context) (string, error) {
+			prompt := strings.TrimSpace(args)
+			if prompt == "" {
+				prompt = HANDOFFPrompt
+			}
+			if err := ctx.NewSession(prompt); err != nil {
+				return "", err
+			}
+			return "", nil
+		},
+	})
+
+	// Optional safeguard: surface the next prompt so the user can confirm
+	// before the auto-handoff proceeds. Set kit option "handoff.confirm=1"
+	// to enable.
+	api.OnBeforeSessionSwitch(func(e ext.BeforeSessionSwitchEvent, ctx ext.Context) *ext.BeforeSessionSwitchResult {
+		if ctx.GetOption("handoff.confirm") != "1" {
+			return nil
+		}
+		if e.InitialPrompt == "" {
+			return nil
+		}
+		resp := ctx.PromptConfirm(ext.PromptConfirmConfig{
+			Message:      "Start a new session with prompt:\n  " + e.InitialPrompt + "\n\nProceed?",
+			DefaultValue: true,
+		})
+		if resp.Cancelled || !resp.Value {
+			return &ext.BeforeSessionSwitchResult{
+				Cancel: true,
+				Reason: "handoff cancelled by user",
+			}
+		}
+		return nil
+	})
+}
@@ -5,16 +5,16 @@ go 1.26.4
 require (
 	charm.land/bubbles/v2 v2.1.0
 	charm.land/bubbletea/v2 v2.0.7
-	charm.land/fantasy v0.31.0
+	charm.land/fantasy v0.32.0
 	charm.land/huh/v2 v2.0.3
 	charm.land/lipgloss/v2 v2.0.4
-	github.com/alecthomas/chroma/v2 v2.26.1
+	github.com/alecthomas/chroma/v2 v2.27.0
 	github.com/atotto/clipboard v0.1.4
 	github.com/aymanbagabas/go-udiff v0.4.1
 	github.com/charmbracelet/colorprofile v0.4.3
 	github.com/charmbracelet/fang v1.0.0
 	github.com/charmbracelet/log v1.0.0
-	github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266
+	github.com/charmbracelet/openai-go v0.0.0-20260617131321-5e4b9c18c4be
 	github.com/charmbracelet/ultraviolet v0.0.0-20260615092913-2399af76d5b1
 	github.com/charmbracelet/x/editor v0.2.0
 	github.com/clipperhouse/displaywidth v0.11.0
@@ -23,7 +23,7 @@ require (
 	github.com/fsnotify/fsnotify v1.10.1
 	github.com/indaco/herald v0.13.0
 	github.com/indaco/herald-md v0.3.0
-	github.com/mark3labs/mcp-go v0.54.1
+	github.com/mark3labs/mcp-go v0.55.0
 	github.com/spf13/cobra v1.10.2
 	github.com/spf13/viper v1.21.0
 	github.com/traefik/yaegi v0.16.1
@@ -85,13 +85,13 @@ require (
 	github.com/googleapis/gax-go/v2 v2.22.0 // indirect
 	github.com/gorilla/websocket v1.5.3 // indirect
 	github.com/kaptinlin/jsonpointer v0.4.26 // indirect
-	github.com/kaptinlin/jsonschema v0.8.0 // indirect
+	github.com/kaptinlin/jsonschema v0.8.1 // indirect
 	github.com/mitchellh/hashstructure/v2 v2.0.2 // indirect
 	github.com/muesli/mango v0.2.0 // indirect
 	github.com/muesli/mango-cobra v1.3.0 // indirect
 	github.com/muesli/mango-pflag v0.2.0 // indirect
 	github.com/muesli/roff v0.1.0 // indirect
-	github.com/pelletier/go-toml/v2 v2.3.1 // indirect
+	github.com/pelletier/go-toml/v2 v2.4.0 // indirect
 	github.com/sagikazarmark/locafero v0.12.0 // indirect
 	github.com/santhosh-tekuri/jsonschema/v6 v6.0.2 // indirect
 	github.com/spf13/afero v1.15.0 // indirect
@@ -116,8 +116,8 @@ require (
 	golang.org/x/net v0.56.0 // indirect
 	golang.org/x/oauth2 v0.36.0 // indirect
 	golang.org/x/time v0.15.0 // indirect
-	google.golang.org/api v0.284.0 // indirect
-	google.golang.org/genai v1.60.0 // indirect
+	google.golang.org/api v0.285.0 // indirect
+	google.golang.org/genai v1.61.0 // indirect
 	google.golang.org/genproto/googleapis/rpc v0.0.0-20260615183401-62b3387ff324 // indirect
 	google.golang.org/grpc v1.81.1 // indirect
 	google.golang.org/protobuf v1.36.11 // indirect
@@ -2,8 +2,8 @@ charm.land/bubbles/v2 v2.1.0 h1:YSnNh5cPYlYjPxRrzs5VEn3vwhtEn3jVGRBT3M7/I0g=
 charm.land/bubbles/v2 v2.1.0/go.mod h1:l97h4hym2hvWBVfmJDtrEHHCtkIKeTEb3TTJ4ZOB3wY=
 charm.land/bubbletea/v2 v2.0.7 h1:7qw2tTAVar7m7klOPBYfTB0mniv/RuexsYwMRNxSeL0=
 charm.land/bubbletea/v2 v2.0.7/go.mod h1:DGW2q8gvzHnOpMpZTORs0aySVHCox5C+2Svk0fci1qs=
-charm.land/fantasy v0.31.0 h1:ioLVRi7A8lZXR8mrCIeseuCcq0KqAak46revmGumnpc=
-charm.land/fantasy v0.31.0/go.mod h1:lAE2gO68SrB1S5TrW5g0TRoxz9V+qJcg0Elx/uPWsDI=
+charm.land/fantasy v0.32.0 h1:tlC1qlOdXi2CkF6KB0x8YAAm3hiarI2/69u6pZmOZk8=
+charm.land/fantasy v0.32.0/go.mod h1:CWAFEOB21guhmt4qWN9sOnAHkZzVWjKbhxbPHG+oRs8=
 charm.land/huh/v2 v2.0.3 h1:2cJsMqEPwSywGHvdlKsJyQKPtSJLVnFKyFbsYZTlLkU=
 charm.land/huh/v2 v2.0.3/go.mod h1:93eEveeeqn47MwiC3tf+2atZ2l7Is88rAtmZNZ8x9Wc=
 charm.land/lipgloss/v2 v2.0.4 h1:lcPeVtcp23SNra7lHy8iYE4UC2aIipVQ47sbGyyxR5Q=
@@ -28,8 +28,8 @@ github.com/MakeNowJust/heredoc v1.0.0 h1:cXCdzVdstXyiTqTvfqk9SDHpKNjxuom+DOlyEeQ
 github.com/MakeNowJust/heredoc v1.0.0/go.mod h1:mG5amYoWBHf8vpLOuehzbGGw0EHxpZZ6lCpQ4fNJ8LE=
 github.com/alecthomas/assert/v2 v2.11.0 h1:2Q9r3ki8+JYXvGsDyBXwH3LcJ+WK5D0gc5E8vS6K3D0=
 github.com/alecthomas/assert/v2 v2.11.0/go.mod h1:Bze95FyfUr7x34QZrjL+XP+0qgp/zg8yS+TtBj1WA3k=
-github.com/alecthomas/chroma/v2 v2.26.1 h1:2X21EdxGZNv5GF9mG5u+uzc02GCFyGxbcBm3Grd9A78=
-github.com/alecthomas/chroma/v2 v2.26.1/go.mod h1:lxhRRa9H4hPmRLOOdYga4zkQIQjq3dtrrdwQeCfu78Y=
+github.com/alecthomas/chroma/v2 v2.27.0 h1:FodwmyOBgJULFYmDqibcp9pvfDLWdtPRh9v/r5BXYZs=
+github.com/alecthomas/chroma/v2 v2.27.0/go.mod h1:NjJ3ciIgrqBNeIkWZ4e46nseoLDslxU1LmfCoL+wcY8=
 github.com/alecthomas/repr v0.5.2 h1:SU73FTI9D1P5UNtvseffFSGmdNci/O6RsqzeXJtP0Qs=
 github.com/alecthomas/repr v0.5.2/go.mod h1:Fr0507jx4eOXV7AlPV6AVZLYrLIuIeSOWtW57eE/O/4=
 github.com/atotto/clipboard v0.1.4 h1:EH0zSVneZPSuFR11BlR9YppQTVDbh5+16AmcJi4g1z4=
@@ -84,8 +84,8 @@ github.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834 h1:ZR7e0r
 github.com/charmbracelet/lipgloss v1.1.1-0.20250404203927-76690c660834/go.mod h1:aKC/t2arECF6rNOnaKaVU6y4t4ZeHQzqfxedE/VkVhA=
 github.com/charmbracelet/log v1.0.0 h1:HVVVMmfOorfj3BA9i8X8UL69Hoz9lI0PYwXfJvOdRc4=
 github.com/charmbracelet/log v1.0.0/go.mod h1:uYgY3SmLpwJWxmlrPwXvzVYujxis1vAKRV/0VQB7yWA=
-github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266 h1:BW/sZtyd1JyYy0h5adMm3tzpNyL857LWjuTRET6OhpY=
-github.com/charmbracelet/openai-go v0.0.0-20260319145158-d0740cc34266/go.mod h1:1DahUaExbUZx/jD+FNT2PKP4L9rLE5+ZBRuI8mZjd/E=
+github.com/charmbracelet/openai-go v0.0.0-20260617131321-5e4b9c18c4be h1:pg+OWlIkk9HOe/8P5J95aKe2wGDzFUiiyFOUpwR30B4=
+github.com/charmbracelet/openai-go v0.0.0-20260617131321-5e4b9c18c4be/go.mod h1:1DahUaExbUZx/jD+FNT2PKP4L9rLE5+ZBRuI8mZjd/E=
 github.com/charmbracelet/ultraviolet v0.0.0-20260615092913-2399af76d5b1 h1:4+r3uOJ69ueRBt4okgEfWZeXs3BD36HcDBmOIAUlETk=
 github.com/charmbracelet/ultraviolet v0.0.0-20260615092913-2399af76d5b1/go.mod h1:f/jRa757WUmaOZrbPspXymbg/GnbF+rwe4OLsG7aXYo=
 github.com/charmbracelet/x/ansi v0.11.7 h1:kzv1kJvjg2S3r9KHo8hDdHFQLEqn4RBCb39dAYC84jI=
@@ -191,8 +191,8 @@ github.com/indaco/herald-md v0.3.0 h1:hN1cKyrexPPM9PeHBsKuaWvIizSi/iYvM9yzRgtdb8
 github.com/indaco/herald-md v0.3.0/go.mod h1:RUHVaDSG45ymJjKyxpDwBocLXrZo93FB4OeYMsw9B9s=
 github.com/kaptinlin/jsonpointer v0.4.26 h1:tw616yszHek+B3/GtDSia+uzBa3sLXGpmo4tYeMhBZw=
 github.com/kaptinlin/jsonpointer v0.4.26/go.mod h1:wVOBaXGGnP42YsMb6zev/3W5POTvspdNfh8DXzf8XS8=
-github.com/kaptinlin/jsonschema v0.8.0 h1:GhY966O2q3ZQsg1zkQj988KF2MADJ6EA7pKBMpGmb9A=
-github.com/kaptinlin/jsonschema v0.8.0/go.mod h1:dxt7s98W5NEuWEwCnAwGrhYGQdaRLqXZImR28DuxcMU=
+github.com/kaptinlin/jsonschema v0.8.1 h1:Krhuq1HpE+olHoPfcxkohqKKCnXfixUPv+aUYRegBBQ=
+github.com/kaptinlin/jsonschema v0.8.1/go.mod h1:mCH2W5lXd29tdDjvoFfY32nedPORnlk7pCVrrcs/NkQ=
 github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
 github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
 github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
@@ -201,8 +201,8 @@ github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0
 github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
 github.com/lucasb-eyer/go-colorful v1.4.0 h1:UtrWVfLdarDgc44HcS7pYloGHJUjHV/4FwW4TvVgFr4=
 github.com/lucasb-eyer/go-colorful v1.4.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=
-github.com/mark3labs/mcp-go v0.54.1 h1:Ap/ptEB9FtWzFKM8NDsTA7QDxerQOC06eZigrTldVj0=
-github.com/mark3labs/mcp-go v0.54.1/go.mod h1:+8WclSK1ZUweCP3hvktSji8n8ABG/95QaEkeVE/Uwas=
+github.com/mark3labs/mcp-go v0.55.0 h1:lJfz2aoctiwK+sI991+uIYwmKNIBciI+O7zsyDsa4U8=
+github.com/mark3labs/mcp-go v0.55.0/go.mod h1:+8WclSK1ZUweCP3hvktSji8n8ABG/95QaEkeVE/Uwas=
 github.com/mattn/go-isatty v0.0.22 h1:j8l17JJ9i6VGPUFUYoTUKPSgKe/83EYU2zBC7YNKMw4=
 github.com/mattn/go-isatty v0.0.22/go.mod h1:ZXfXG4SQHsB/w3ZeOYbR0PrPwLy+n6xiMrJlRFqopa4=
 github.com/mattn/go-runewidth v0.0.24 h1:cpokDiIn0MGnhdHwuWnJBITySJ20QyNGnY2kR/ay2DU=
@@ -221,8 +221,8 @@ github.com/muesli/roff v0.1.0 h1:YD0lalCotmYuF5HhZliKWlIx7IEhiXeSfq7hNjFqGF8=
 github.com/muesli/roff v0.1.0/go.mod h1:pjAHQM9hdUUwm/krAfrLGgJkXJ+YuhtsfZ42kieB2Ig=
 github.com/muesli/termenv v0.16.0 h1:S5AlUN9dENB57rsbnkPyfdGuWIlkmzJjbFf0Tf5FWUc=
 github.com/muesli/termenv v0.16.0/go.mod h1:ZRfOIKPFDYQoDFF4Olj7/QJbW60Ol/kL1pU3VfY/Cnk=
-github.com/pelletier/go-toml/v2 v2.3.1 h1:MYEvvGnQjeNkRF1qUuGolNtNExTDwct51yp7olPtrEc=
-github.com/pelletier/go-toml/v2 v2.3.1/go.mod h1:2gIqNv+qfxSVS7cM2xJQKtLSTLUE9V8t9Stt+h56mCY=
+github.com/pelletier/go-toml/v2 v2.4.0 h1:Mwu0mAkUKbittDs3/ADDWXqMmq3EOK2VHiuCkV00Row=
+github.com/pelletier/go-toml/v2 v2.4.0/go.mod h1:2gIqNv+qfxSVS7cM2xJQKtLSTLUE9V8t9Stt+h56mCY=
 github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c h1:+mdjkGKdHQG3305AYmdv1U2eRNDiU2ErMBj1gwrq8eQ=
 github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c/go.mod h1:7rwL4CYBLnjLxUqIJNnCWiEdr3bn6IUYi15bNlnbCCU=
 github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 h1:GFCKgmp0tecUJ0sJuv4pzYCqS9+RGSn52M3FUwPs+uo=
@@ -312,14 +312,14 @@ golang.org/x/time v0.15.0 h1:bbrp8t3bGUeFOx08pvsMYRTCVSMk89u4tKbNOZbp88U=
 golang.org/x/time v0.15.0/go.mod h1:Y4YMaQmXwGQZoFaVFk4YpCt4FLQMYKZe9oeV/f4MSno=
 gonum.org/v1/gonum v0.17.0 h1:VbpOemQlsSMrYmn7T2OUvQ4dqxQXU+ouZFQsZOx50z4=
 gonum.org/v1/gonum v0.17.0/go.mod h1:El3tOrEuMpv2UdMrbNlKEh9vd86bmQ6vqIcDwxEOc1E=
-google.golang.org/api v0.284.0 h1:i+cKTgeQRcRySkP7QTl5PDO7/pAm8EcMFIUMlNbk4Vc=
-google.golang.org/api v0.284.0/go.mod h1:AU44fU+XVZOCcd8uLaBIa/ZgzgPf/0qqY3+m7lQaado=
-google.golang.org/genai v1.60.0 h1:uAkea4tYhCz1LlUmxdiOFAmlrLFaLs8PbXucgZHqHVo=
-google.golang.org/genai v1.60.0/go.mod h1:mDdPDFXo1Ats7f1WXVyZgWb/CkMzFWTWJruIMy7hGIU=
-google.golang.org/genproto v0.0.0-20260526163538-3dc84a4a5aaa h1:mfj8IS4EA4VAR9a6QDVxTQkLY64iBybb5QI1B4pXrpE=
-google.golang.org/genproto v0.0.0-20260526163538-3dc84a4a5aaa/go.mod h1:fuT7yonGw1Iq2oa+YC0fyqPPQJkgo/54gPNC6VitOkI=
-google.golang.org/genproto/googleapis/api v0.0.0-20260526163538-3dc84a4a5aaa h1:Kjn0N0tCrDgiAFW+lGO4JZ3ck44CehvJQMAwj9QF0G8=
-google.golang.org/genproto/googleapis/api v0.0.0-20260526163538-3dc84a4a5aaa/go.mod h1:q4lMZS6kskjT5HvCPrnnypcDPVJqT/f4nfxmkE7gryY=
+google.golang.org/api v0.285.0 h1:B7eHHoKGAX/LrPkQvhQqnGwjgWxofbdGwCTQvpm8FkM=
+google.golang.org/api v0.285.0/go.mod h1:NlOlUIr8MPoIhT9Bb/oUnRuHbJOLwxb6JSYJM8Yz+jQ=
+google.golang.org/genai v1.61.0 h1:wCyNGiaC9q5A59B80zuEtNBhq3ypEvICFkZYOfK7IO0=
+google.golang.org/genai v1.61.0/go.mod h1:mDdPDFXo1Ats7f1WXVyZgWb/CkMzFWTWJruIMy7hGIU=
+google.golang.org/genproto v0.0.0-20260610212136-7ab31c22f7ad h1:cYL1DPJAQr4JMvhfGao0PDXoaf03ifMljAuDyrbMBd0=
+google.golang.org/genproto v0.0.0-20260610212136-7ab31c22f7ad/go.mod h1:cVHIikDNAdx8ISZeW+2rYkEMf3xn0GSaBYmVnWXQBUo=
+google.golang.org/genproto/googleapis/api v0.0.0-20260610212136-7ab31c22f7ad h1:3iLyITS/sySRwbUKoC7ogfj2Yr1Cjs0pfaRKj5U5HEw=
+google.golang.org/genproto/googleapis/api v0.0.0-20260610212136-7ab31c22f7ad/go.mod h1:KdNqO+rCIWgFumrNBSEDlDNrkrQnpkax7Tv1WxNY8V4=
 google.golang.org/genproto/googleapis/rpc v0.0.0-20260615183401-62b3387ff324 h1:9HZDLIdYBJXAnaFOr9WHrKVycfpY+75s9HGadC0305A=
 google.golang.org/genproto/googleapis/rpc v0.0.0-20260615183401-62b3387ff324/go.mod h1:4Hqkh8ycfw05ld/3BWL7rJOSfebL2Q+DVDeRgYgxUU8=
 google.golang.org/grpc v1.81.1 h1:VnnIIZ88UzOOKLukQi+ImGz8O1Wdp8nAGGnvOfEIWQQ=
@@ -96,6 +96,9 @@ func (r *sessionRegistry) create(ctx context.Context, cwd string) (*acpSession,
 		// Message injection — no-ops for now; ACP clients drive prompts.
 		ec.SendMessage = func(string) {}
 		ec.CancelAndSend = func(string) {}
+		ec.NewSession = func(string) error {
+			return fmt.Errorf("new session not available in ACP mode")
+		}
 		ec.Exit = func() {}

 		// TUI widgets/chrome — silent no-ops (no TUI in ACP).
@@ -1105,6 +1105,18 @@ func (a *Agent) GetExtensionToolCount() int {
 	return len(a.extraTools)
 }

+// GetExtraTools returns the agent's current extra tools (e.g.
+// extension-registered tools). The returned slice is a copy so callers can
+// snapshot and later restore it via SetExtraTools.
+func (a *Agent) GetExtraTools() []fantasy.AgentTool {
+	if len(a.extraTools) == 0 {
+		return nil
+	}
+	out := make([]fantasy.AgentTool, len(a.extraTools))
+	copy(out, a.extraTools)
+	return out
+}
+
 // SetExtraTools replaces the agent's extra tools (e.g. extension-registered
 // tools) and rebuilds the internal agent with the updated tool list. The
 // model, system prompt, and all other configuration are preserved.
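
Review note: the copy in `GetExtraTools` is what makes snapshot-and-restore safe. A minimal sketch with plain strings in place of `fantasy.AgentTool` shows the idea. The `toolbox` type and its methods are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// toolbox is a hypothetical stand-in for the agent's extra-tools state.
type toolbox struct {
	extra []string
}

// snapshot returns a copy of the current tools, or nil when there are none,
// so later mutation of the internal slice cannot alter the snapshot.
func (t *toolbox) snapshot() []string {
	if len(t.extra) == 0 {
		return nil
	}
	out := make([]string, len(t.extra))
	copy(out, t.extra)
	return out
}

// set replaces the current tools.
func (t *toolbox) set(tools []string) {
	t.extra = tools
}

func main() {
	tb := &toolbox{extra: []string{"a", "b"}}

	saved := tb.snapshot()        // take a snapshot before changing anything
	tb.extra[0] = "changed"       // mutate the internal slice in place
	tb.set(append(tb.extra, "c")) // and swap in a different tool list

	tb.set(saved) // restore the snapshot
	fmt.Println(tb.extra)
}
```

Had `snapshot` returned the internal slice itself, the in-place write would have leaked into `saved` and the restore would not bring back the original list.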
@@ -2,6 +2,7 @@ package app

 import (
 	"context"
+	"errors"
 	"fmt"
 	"log"
 	"os"
@@ -24,6 +25,26 @@ type queueItem struct {
 	Files  []kit.LLMFilePart
 }

+// ErrAgentBusy is returned when an operation cannot proceed because the agent
+// is still processing a turn (including any post-turn extension hooks) and did
+// not become idle before the operation's deadline.
+//
+// This is an alias for extensions.ErrAgentBusy so the extension API and the
+// app layer share a single sentinel value — callers can detect the condition
+// with errors.Is(err, app.ErrAgentBusy) without substring-matching the error
+// message.
+var ErrAgentBusy = extensions.ErrAgentBusy
+
+// DefaultNewSessionIdleWait bounds how long RequestNewSessionFromExtension
+// will block waiting for the agent to settle. It needs to be generous enough
+// to cover real-world post-turn tooling (project formatters, on-save linters,
+// hidden tool calls) which routinely hold the busy flag for seconds and
+// occasionally minutes — yet still short enough to surface a wedged agent.
+//
+// Issue #63 reported workloads where the busy window regularly exceeded
+// 6 seconds; ten minutes is the same bound the workaround in that issue used.
+const DefaultNewSessionIdleWait = 10 * time.Minute
+
 // App is the application-layer orchestrator. It owns the agentic loop,
 // conversation history (via MessageStore), and queue management. It is
 // designed to be created once per session and reused across multiple prompts.
@@ -55,11 +76,25 @@ type App struct {
 	// each new step and called by CancelCurrentStep().
 	cancelStep context.CancelFunc

-	// mu protects busy, queue, and cancelStep.
+	// mu protects busy, queue, cancelStep, and idleCh.
 	mu    sync.Mutex
 	busy  bool
 	queue []queueItem

+	// idleCh is closed when the agent transitions from busy back to idle.
+	// While the agent is idle the channel is already closed (recv returns
+	// immediately). When busy transitions to true a fresh open channel is
+	// allocated so new waiters block until the next idle transition. All
+	// transitions are funnelled through setBusyLocked to keep the channel
+	// pointer in sync with the busy flag.
+	//
+	// This is the underlying primitive WaitForIdle and
+	// RequestNewSessionFromExtension wait on to fix the AgentEnd→NewSession
+	// race described in issue #63: AgentEnd is emitted from inside the agent
+	// loop, before drainQueue clears busy, so any extension hook that calls
+	// ctx.NewSession synchronously would otherwise observe busy==true.
+	idleCh chan struct{}
+
 	// wg tracks in-flight goroutines; Close() waits on it.
 	wg sync.WaitGroup

@@ -95,6 +130,10 @@ type App struct {
 // initialMessages may be nil or empty for a fresh session.
 func New(opts Options, initialMessages []kit.LLMMessage) *App {
 	rootCtx, rootCancel := context.WithCancel(context.Background())
+	// idleCh starts already closed: the freshly constructed App is idle, so
+	// any caller blocking on it via WaitForIdle should be released immediately.
+	idleCh := make(chan struct{})
+	close(idleCh)
 	return &App{
 		opts:       opts,
 		store:      NewMessageStoreWithMessages(initialMessages),
@@ -102,6 +141,90 @@ func New(opts Options, initialMessages []kit.LLMMessage) *App {
 		rootCancel: rootCancel,
 		// cancelStep starts as a no-op so CancelCurrentStep() is always safe.
 		cancelStep: func() {},
+		idleCh:     idleCh,
+	}
+}
+
+// setBusyLocked is the single chokepoint for mutating a.busy. It keeps the
+// idleCh signalling channel in sync with the busy flag:
+//
+//   - false → true: allocate a fresh open channel so future WaitForIdle
+//     callers block until the next idle transition.
+//   - true  → false: close the current channel so any waiters wake up.
+//
+// No-op when the requested state already matches. The caller must hold a.mu.
+func (a *App) setBusyLocked(busy bool) {
+	if a.busy == busy {
+		return
+	}
+	a.busy = busy
+	if busy {
+		a.idleCh = make(chan struct{})
+	} else {
+		close(a.idleCh)
+	}
+}
+
+// idleSnapshot returns the current busy state and the channel that will be
+// closed on the next idle transition. The snapshot is taken under a.mu so the
+// pair is consistent (busy==true ⇒ ch is the open channel for *this* busy
+// cycle, not a stale one).
+func (a *App) idleSnapshot() (busy bool, ch chan struct{}) {
+	a.mu.Lock()
+	defer a.mu.Unlock()
+	return a.busy, a.idleCh
+}
+
+// WaitForIdle blocks until the agent is idle, the given timeout elapses, or
+// the app shuts down. Returns nil on idle, ErrAgentBusy on timeout, or the
+// rootCtx error if the app is closing.
+//
+// A non-positive timeout disables the deadline and waits indefinitely (until
+// idle or app shutdown). Safe to call from any goroutine, but never from
+// inside the Bubble Tea Update() loop — it blocks.
+//
+// Idiomatic use from extensions:
+//
+//	if err := app.WaitForIdle(0); err != nil { /* shutdown */ }
+//
+// The loop guards against the agent re-arming itself between wakeups: if
+// another prompt is queued (or a steer message lands) while we're waiting,
+// setBusyLocked allocates a fresh idleCh and we wait again.
+func (a *App) WaitForIdle(timeout time.Duration) error {
+	var deadline time.Time
+	if timeout > 0 {
+		deadline = time.Now().Add(timeout)
+	}
+	for {
+		busy, ch := a.idleSnapshot()
+		if !busy {
+			return nil
+		}
+		var timer *time.Timer
+		var timerCh <-chan time.Time
+		if timeout > 0 {
+			remaining := time.Until(deadline)
+			if remaining <= 0 {
+				return ErrAgentBusy
+			}
+			timer = time.NewTimer(remaining)
+			timerCh = timer.C
+		}
+		select {
+		case <-ch:
+			// Idle transition observed — loop and re-check under the
+			// mutex in case a new busy cycle started immediately after.
+		case <-timerCh:
+			return ErrAgentBusy
+		case <-a.rootCtx.Done():
+			if timer != nil {
+				timer.Stop()
+			}
+			return a.rootCtx.Err()
+		}
+		if timer != nil {
+			timer.Stop()
+		}
 	}
 }

@@ -155,7 +278,7 @@ func (a *App) RunWithFiles(prompt string, files []kit.LLMFilePart) int {
 		return qLen
 	}

-	a.busy = true
+	a.setBusyLocked(true)
 	a.wg.Add(1)
 	a.mu.Unlock()
 	go a.drainQueue(item)
@@ -235,7 +358,7 @@ func (a *App) SteerWithFiles(prompt string, files []kit.LLMFilePart) int {
 	if !a.busy {
 		// Not busy — start immediately, same as RunWithFiles().
 		item := queueItem{Prompt: prompt, Files: files}
-		a.busy = true
+		a.setBusyLocked(true)
 		a.wg.Add(1)
 		a.mu.Unlock()
 		go a.drainQueue(item)
@@ -271,7 +394,7 @@ func (a *App) InterruptAndSend(prompt string) {

 	if !a.busy {
 		// Not busy — start immediately, same as Run().
-		a.busy = true
+		a.setBusyLocked(true)
 		a.wg.Add(1)
 		a.mu.Unlock()
 		go a.drainQueue(item)
@@ -470,7 +593,7 @@ func (a *App) CompactConversation(customInstructions string) error {
 		a.mu.Unlock()
 		return fmt.Errorf("SDK instance not available")
 	}
-	a.busy = true
+	a.setBusyLocked(true)
 	a.wg.Add(1)
 	a.mu.Unlock()

@@ -532,7 +655,7 @@ func (a *App) CompactAsync(customInstructions string, onComplete func(), onError
 		a.mu.Unlock()
 		return fmt.Errorf("SDK instance not available")
 	}
-	a.busy = true
+	a.setBusyLocked(true)
 	a.wg.Add(1)
 	a.mu.Unlock()

@@ -621,7 +744,7 @@ func (a *App) releaseBusyAfterCompact() {
 	// in just before closed was set.
 	if a.closed {
 		a.queue = a.queue[:0]
-		a.busy = false
+		a.setBusyLocked(false)
 		a.mu.Unlock()
 		return
 	}
@@ -633,7 +756,7 @@ func (a *App) releaseBusyAfterCompact() {
 	a.queue = a.queue[:0]

 	if len(pending) == 0 {
-		a.busy = false
+		a.setBusyLocked(false)
 		a.mu.Unlock()
 		return
 	}
@@ -850,7 +973,7 @@ func (a *App) drainQueue(first queueItem) {

 	// Mark as no longer busy
 	a.mu.Lock()
-	a.busy = false
+	a.setBusyLocked(false)
 	a.mu.Unlock()
 }

@@ -1230,6 +1353,42 @@ func (a *App) SetEditorTextFromExtension(text string) {
 	}
 }

+// RequestNewSessionFromExtension sends a NewSessionRequestEvent to the TUI
+// to end the current session and start a fresh one. If initialPrompt is
+// non-empty it is submitted as the first user turn of the new session.
+//
+// If the agent is currently busy (e.g. the caller is an OnAgentEnd hook that
+// fires before drainQueue clears the busy flag, or there are queued prompts
+// still being processed) the call blocks until the agent becomes idle, up to
+// DefaultNewSessionIdleWait. If that deadline elapses, ErrAgentBusy is
+// returned and callers can detect it with errors.Is. This wait-then-send
+// behavior fixes the v0.79.0 phase-handoff race documented in issue #63.
+//
+// Returns an error when running headless (no TUI attached), when the wait
+// for idle times out (ErrAgentBusy), when the app is shutting down, or when
+// a BeforeSessionSwitch extension hook cancels the switch.
+//
+// This is the implementation behind ctx.NewSession(prompt) for the
+// interactive TUI. It blocks the caller until the TUI processes the
+// switch, so it must be invoked from a goroutine outside Update().
+func (a *App) RequestNewSessionFromExtension(initialPrompt string) error {
+	a.mu.Lock()
+	prog := a.program
+	a.mu.Unlock()
+	if prog == nil {
+		return fmt.Errorf("new session unavailable: no interactive TUI attached")
+	}
+	if err := a.WaitForIdle(DefaultNewSessionIdleWait); err != nil {
+		if errors.Is(err, ErrAgentBusy) {
+			return fmt.Errorf("cannot start new session: %w", err)
+		}
+		return err
+	}
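+	ch := make(chan error, 1)
+	prog.Send(NewSessionRequestEvent{InitialPrompt: initialPrompt, ResponseCh: ch})
+	// Guard the receive with rootCtx: Send is a no-op once the program has
+	// exited, so a bare receive could block forever during shutdown.
+	select {
+	case err := <-ch:
+		return err
+	case <-a.rootCtx.Done():
+		return a.rootCtx.Err()
+	}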
+}
+
 // NotifyModelChanged sends a ModelChangedEvent to the TUI so it updates
 // the model name in the status bar and message attribution.
 func (a *App) NotifyModelChanged(provider, model string) {
@@ -794,7 +794,7 @@ func TestReleaseBusyAfterCompact_flushesQueuedMessages(t *testing.T) {
 	// summarising. (Run() would have appended them and returned a queue
 	// length > 0 to the caller.)
 	app.mu.Lock()
-	app.busy = true
+	app.setBusyLocked(true)
 	app.queue = append(app.queue,
 		queueItem{Prompt: "queued during compact #1"},
 		queueItem{Prompt: "queued during compact #2"},
@@ -834,7 +834,7 @@ func TestReleaseBusyAfterCompact_idleWhenQueueEmpty(t *testing.T) {
 	defer app.Close()

 	app.mu.Lock()
-	app.busy = true
+	app.setBusyLocked(true)
 	app.mu.Unlock()

 	app.releaseBusyAfterCompact()
@@ -901,7 +901,7 @@ func TestReleaseBusyAfterCompact_splicesSteerAheadOfQueue(t *testing.T) {
 	// Simulate the state at the end of compaction: busy is set and a couple
 	// of regular Run() prompts have piled up after the steer messages.
 	app.mu.Lock()
-	app.busy = true
+	app.setBusyLocked(true)
 	app.queue = append(app.queue,
 		queueItem{Prompt: "queued-1"},
 		queueItem{Prompt: "queued-2"},
@@ -950,7 +950,7 @@ func TestReleaseBusyAfterCompact_dropsQueueWhenClosed(t *testing.T) {
 	app := newTestApp(stub)

 	app.mu.Lock()
-	app.busy = true
+	app.setBusyLocked(true)
 	app.queue = append(app.queue, queueItem{Prompt: "would have run"})
 	app.closed = true
 	app.mu.Unlock()
@@ -999,7 +999,7 @@ func TestPopLastUserMessage_WhileBusy(t *testing.T) {
 	defer app.Close()

 	app.mu.Lock()
-	app.busy = true
+	app.setBusyLocked(true)
 	app.mu.Unlock()

 	_, _, err := app.PopLastUserMessage()
@@ -1115,3 +1115,281 @@ func TestPopLastUserMessage_NoUserOnBranch(t *testing.T) {
 		t.Fatalf("expected error mentioning missing user message, got %q", err.Error())
 	}
 }
+
+// --------------------------------------------------------------------------
+// WaitForIdle / RequestNewSessionFromExtension (issue #63)
+// --------------------------------------------------------------------------
+
+// TestWaitForIdle_AlreadyIdle verifies the fast path: a freshly constructed
+// App is idle and WaitForIdle returns immediately without consulting the
+// timeout.
+func TestWaitForIdle_AlreadyIdle(t *testing.T) {
+	app := newTestApp(newStub())
+	defer app.Close()
+
+	start := time.Now()
+	if err := app.WaitForIdle(2 * time.Second); err != nil {
+		t.Fatalf("WaitForIdle on idle app: %v", err)
+	}
+	if elapsed := time.Since(start); elapsed > 100*time.Millisecond {
+		t.Fatalf("WaitForIdle blocked for %s on already-idle app", elapsed)
+	}
+}
+
+// TestWaitForIdle_BlocksUntilDrain reproduces the issue #63 race: while
+// drainQueue holds busy==true the call should block, then return nil as soon
+// as the drain completes.
+func TestWaitForIdle_BlocksUntilDrain(t *testing.T) {
+	gate := make(chan struct{})
+	var gateOnce sync.Once
+	closeGate := func() { gateOnce.Do(func() { close(gate) }) }
+	stub := newStubWithFuncs(
+		func(ctx context.Context) (*kit.TurnResult, error) {
+			select {
+			case <-gate:
+			case <-ctx.Done():
+				return nil, ctx.Err()
+			}
+			return turnResult("done"), nil
+		},
+	)
+	app := newTestApp(stub)
+	t.Cleanup(func() {
+		closeGate()
+		app.Close()
+	})
+
+	app.Run("hello")
+
+	// Confirm the agent is busy before we start waiting.
+	if !waitForCondition(2*time.Second, func() bool { return app.IsBusy() }) {
+		t.Fatal("app never became busy after Run()")
+	}
+
+	errCh := make(chan error, 1)
+	go func() {
+		errCh <- app.WaitForIdle(5 * time.Second)
+	}()
+
+	// Should not return while the stub is blocked.
+	select {
+	case err := <-errCh:
+		t.Fatalf("WaitForIdle returned early (err=%v) while agent still busy", err)
+	case <-time.After(150 * time.Millisecond):
+	}
+
+	closeGate()
+
+	select {
+	case err := <-errCh:
+		if err != nil {
+			t.Fatalf("WaitForIdle: %v", err)
+		}
+	case <-time.After(3 * time.Second):
+		t.Fatal("WaitForIdle did not return after drain completed")
+	}
+
+	if app.IsBusy() {
+		t.Fatal("app still reports busy after WaitForIdle returned")
+	}
+}
+
+// TestWaitForIdle_TimeoutReturnsErrAgentBusy verifies that a slow turn yields
+// ErrAgentBusy (detectable via errors.Is) when the deadline elapses.
+func TestWaitForIdle_TimeoutReturnsErrAgentBusy(t *testing.T) {
+	gate := make(chan struct{})
+	stub := newStubWithFuncs(
+		func(ctx context.Context) (*kit.TurnResult, error) {
+			select {
+			case <-gate:
+			case <-ctx.Done():
+				return nil, ctx.Err()
+			}
+			return turnResult("done"), nil
+		},
+	)
+	app := newTestApp(stub)
+	// Release the stub before Close so wg.Wait() can return.
+	t.Cleanup(func() {
+		close(gate)
+		app.Close()
+	})
+
+	app.Run("hello")
+	if !waitForCondition(2*time.Second, func() bool { return app.IsBusy() }) {
+		t.Fatal("app never became busy after Run()")
+	}
+
+	err := app.WaitForIdle(50 * time.Millisecond)
+	if !errors.Is(err, ErrAgentBusy) {
+		t.Fatalf("expected ErrAgentBusy on timeout, got %v", err)
+	}
+}
+
+// TestWaitForIdle_ZeroTimeoutWaitsIndefinitely verifies that a non-positive
+// timeout still blocks until idle (or shutdown) — not an instant ErrAgentBusy.
+func TestWaitForIdle_ZeroTimeoutWaitsIndefinitely(t *testing.T) {
+	gate := make(chan struct{})
+	var gateOnce sync.Once
+	closeGate := func() { gateOnce.Do(func() { close(gate) }) }
+	stub := newStubWithFuncs(
+		func(ctx context.Context) (*kit.TurnResult, error) {
+			select {
+			case <-gate:
+			case <-ctx.Done():
+				return nil, ctx.Err()
+			}
+			return turnResult("done"), nil
+		},
+	)
+	app := newTestApp(stub)
+	t.Cleanup(func() {
+		closeGate()
+		app.Close()
+	})
+
+	app.Run("hello")
+	if !waitForCondition(2*time.Second, func() bool { return app.IsBusy() }) {
+		t.Fatal("app never became busy after Run()")
+	}
+
+	errCh := make(chan error, 1)
+	go func() { errCh <- app.WaitForIdle(0) }()
+
+	select {
+	case err := <-errCh:
+		t.Fatalf("WaitForIdle(0) returned early with %v while agent was busy", err)
+	case <-time.After(150 * time.Millisecond):
+	}
+
+	closeGate()
+
+	select {
+	case err := <-errCh:
+		if err != nil {
+			t.Fatalf("WaitForIdle(0) returned %v after idle", err)
+		}
+	case <-time.After(3 * time.Second):
+		t.Fatal("WaitForIdle(0) did not return after drain completed")
+	}
+}
+
+// TestWaitForIdle_AppClose verifies that shutting down the app while a
+// caller is blocked in WaitForIdle releases the wait.
+func TestWaitForIdle_AppClose(t *testing.T) {
+	gate := make(chan struct{})
+	stub := newStubWithFuncs(
+		func(ctx context.Context) (*kit.TurnResult, error) {
+			select {
+			case <-gate:
+			case <-ctx.Done():
+				return nil, ctx.Err()
+			}
+			return turnResult("done"), nil
+		},
+	)
+	app := newTestApp(stub)
+
+	app.Run("hello")
+	if !waitForCondition(2*time.Second, func() bool { return app.IsBusy() }) {
+		t.Fatal("app never became busy after Run()")
+	}
+
+	errCh := make(chan error, 1)
+	go func() { errCh <- app.WaitForIdle(5 * time.Second) }()
+
+	// Give the goroutine a moment to enter the wait.
+	time.Sleep(50 * time.Millisecond)
+
+	// rootCancel is called by Close, which should release the waiter
+	// before drainQueue itself observes the cancellation and clears busy.
+	go func() {
+		// Unblock the stub so Close() can proceed past wg.Wait().
+		close(gate)
+	}()
+	app.Close()
+
+	select {
+	case err := <-errCh:
+		// Either rootCtx cancellation propagated first (err is context.Canceled)
+		// or the drain finished cleanly first (err == nil); both are
+		// acceptable terminations. The key invariant is that WaitForIdle
+		// does not hang past Close.
+		if err != nil && !errors.Is(err, context.Canceled) {
+			t.Fatalf("WaitForIdle returned unexpected error: %v", err)
+		}
+	case <-time.After(3 * time.Second):
+		t.Fatal("WaitForIdle did not return after Close()")
+	}
+}
+
+// TestRequestNewSessionFromExtension_NoTUI verifies the headless guard: with
+// no Bubble Tea program registered the call fails fast, without first
+// waiting for the agent to become idle.
+func TestRequestNewSessionFromExtension_NoTUI(t *testing.T) {
+	app := newTestApp(newStub())
+	defer app.Close()
+
+	err := app.RequestNewSessionFromExtension("hello")
+	if err == nil {
+		t.Fatal("expected error in headless mode")
+	}
+	if !strings.Contains(err.Error(), "no interactive TUI") {
+		t.Fatalf("expected 'no interactive TUI' error, got %q", err.Error())
+	}
+}
+
+// TestBusyTransitionsSignalIdleCh exercises the setBusyLocked invariants
+// directly: a fresh App is idle (closed channel); Run() opens a new channel
+// that is then closed when drainQueue exits.
+func TestBusyTransitionsSignalIdleCh(t *testing.T) {
+	app := newTestApp(newStub("ok"))
+	defer app.Close()
+
+	// Initial state: closed channel, busy==false.
+	busy, ch := app.idleSnapshot()
+	if busy {
+		t.Fatal("freshly constructed App should not be busy")
+	}
+	select {
+	case <-ch:
+	default:
+		t.Fatal("initial idleCh should already be closed")
+	}
+
+	gate := make(chan struct{})
+	var gateOnce sync.Once
+	closeGate := func() { gateOnce.Do(func() { close(gate) }) }
+	stub := newStubWithFuncs(func(ctx context.Context) (*kit.TurnResult, error) {
+		select {
+		case <-gate:
+		case <-ctx.Done():
+			return nil, ctx.Err()
+		}
+		return turnResult("ok"), nil
+	})
+	app2 := newTestApp(stub)
+	t.Cleanup(func() {
+		closeGate()
+		app2.Close()
+	})
+
+	app2.Run("hello")
+	if !waitForCondition(2*time.Second, func() bool { return app2.IsBusy() }) {
+		t.Fatal("app2 never became busy")
+	}
+
+	_, ch2 := app2.idleSnapshot()
+	select {
+	case <-ch2:
+		t.Fatal("idleCh should be open while busy")
+	default:
+	}
+
+	closeGate()
+
+	select {
+	case <-ch2:
+	case <-time.After(3 * time.Second):
+		t.Fatal("idleCh was never closed after drain completed")
+	}
+}
@@ -247,6 +247,21 @@ type EditorTextSetEvent struct {
 	Text string
 }

+// NewSessionRequestEvent is sent when an extension calls ctx.NewSession to
+// end the current session and start a fresh one. The TUI routes this into
+// the same /new code path (including the BeforeSessionSwitch hook and any
+// @file expansion in InitialPrompt). ResponseCh, when non-nil, receives a
+// single result so the extension goroutine can observe success or failure.
+type NewSessionRequestEvent struct {
+	// InitialPrompt, when non-empty, is the first user turn to submit
+	// after the session switch. @file references are expanded.
+	InitialPrompt string
+	// ResponseCh receives the outcome (nil error on success). Must be
+	// buffered (cap >= 1) so the TUI never blocks. May be nil if the
+	// caller does not need the result.
+	ResponseCh chan<- error
+}
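The requirement that ResponseCh be buffered can be illustrated with a small sketch. The names here (`request`, `handle`) are hypothetical stand-ins for NewSessionRequestEvent and the TUI handler; the actual handler is not shown in this diff. With capacity 1, the handler's single send completes even if the caller has not started receiving yet, and a nil channel is simply skipped.

```go
package main

import "fmt"

// request is a hypothetical stand-in for NewSessionRequestEvent.
type request struct {
	Prompt     string
	ResponseCh chan<- error
}

// handle plays the TUI's role: do the work, then report the outcome if
// the caller asked for one. The send cannot block when the channel has
// spare capacity.
func handle(req request) {
	var err error
	if req.Prompt == "fail" {
		err = fmt.Errorf("switch cancelled")
	}
	if req.ResponseCh != nil {
		req.ResponseCh <- err
	}
}

func main() {
	ch := make(chan error, 1)
	handle(request{Prompt: "hello", ResponseCh: ch}) // returns before anyone receives
	fmt.Println(<-ch)

	handle(request{Prompt: "fire and forget"}) // nil channel: no result reported
}
```

An unbuffered channel in the same position would make `handle` wait for a receiver, which is the stall the field comment warns about.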
+
 // ExtensionPrintEvent is sent when an extension calls ctx.Print, ctx.PrintInfo,
 // ctx.PrintError, or ctx.PrintBlock. The TUI renders it via the appropriate
 // renderer and tea.Println (scrollback); the CLI handler uses
@@ -389,6 +389,30 @@ func roleLabel(role fantasy.MessageRole) string {
 	}
 }

+// skillContentMarkers are substrings that identify a message carrying
+// explicitly-activated skill content. Such messages are exempt from
+// compaction pruning per the agentskills.io spec (issue #65, gap #7): an
+// activated skill must remain in context verbatim instead of being folded
+// into a lossy summary.
+var skillContentMarkers = []string{"<skill ", "<skill>", "<skill_content"}
+
+// isProtectedMessage reports whether msg carries explicitly-activated skill
+// content that must survive compaction unchanged.
+func isProtectedMessage(msg fantasy.Message) bool {
+	for _, part := range msg.Content {
+		tp, ok := part.(fantasy.TextPart)
+		if !ok {
+			continue
+		}
+		for _, marker := range skillContentMarkers {
+			if strings.Contains(tp.Text, marker) {
+				return true
+			}
+		}
+	}
+	return false
+}
+
 // serializeMessages converts a slice of fantasy messages into a plain-text
 // representation suitable for sending to the summarisation LLM. Tool result
 // text is truncated to maxToolResultChars to keep the summarisation request
@@ -518,6 +542,14 @@ func Compact(

 	newMessages := make([]fantasy.Message, 0, 1+len(recentMessages))
 	newMessages = append(newMessages, summaryMessage)
+	// Carry forward any explicitly-activated skill content from the
+	// summarised range verbatim — skill instructions must not be lost to
+	// compaction (issue #65, gap #7).
+	for _, msg := range oldMessages {
+		if isProtectedMessage(msg) {
+			newMessages = append(newMessages, msg)
+		}
+	}
 	newMessages = append(newMessages, recentMessages...)

 	compactedTokens := EstimateMessageTokens(newMessages)
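The carry-forward step can be sketched with plain strings in place of fantasy.Message values. `isProtected` and `compact` below are simplified, hypothetical stand-ins for isProtectedMessage and the relevant part of Compact; only the marker list is copied from the hunk above.

```go
package main

import (
	"fmt"
	"strings"
)

// markers mirrors skillContentMarkers from the diff.
var markers = []string{"<skill ", "<skill>", "<skill_content"}

// isProtected reports whether text contains any skill marker.
func isProtected(text string) bool {
	for _, m := range markers {
		if strings.Contains(text, m) {
			return true
		}
	}
	return false
}

// compact builds the post-compaction history: the summary first, then
// protected messages from the summarised range in their original order,
// then the recent messages untouched.
func compact(summary string, old, recent []string) []string {
	out := make([]string, 0, 1+len(recent))
	out = append(out, summary)
	for _, msg := range old {
		if isProtected(msg) {
			out = append(out, msg)
		}
	}
	return append(out, recent...)
}

func main() {
	old := []string{
		"hi",
		`<skill name="foo" location="/x">body</skill>`,
		"talking about skills in general",
	}
	recent := []string{"latest question"}
	for _, m := range compact("[summary]", old, recent) {
		fmt.Println(m)
	}
}
```

Because the check is a substring match, a message that merely quotes one of the markers in prose is also kept. That errs on the side of retaining too much rather than dropping skill instructions.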
@@ -439,3 +439,25 @@ func TestSortedKeys_Empty(t *testing.T) {
 		t.Errorf("sortedKeys(nil) = %v, want nil", got)
 	}
 }
+
+// ---------------------------------------------------------------------------
+// Skill-content protection (issue #65, gap #7)
+// ---------------------------------------------------------------------------
+
+func TestIsProtectedMessage(t *testing.T) {
+	cases := []struct {
+		text string
+		want bool
+	}{
+		{`<skill name="foo" location="/x">body</skill>`, true},
+		{`<skill_content name="foo">body</skill_content>`, true},
+		{"just a normal message", false},
+		{"talking about skills in general", false},
+	}
+	for _, c := range cases {
+		msg := makeTextMessage(fantasy.MessageRoleUser, c.text)
+		if got := isProtectedMessage(msg); got != c.want {
+			t.Errorf("isProtectedMessage(%q) = %v, want %v", c.text, got, c.want)
+		}
+	}
+}
@@ -6,6 +6,7 @@ import (
 	"os"
 	"path/filepath"
 	"strings"
+	"sync"

 	"github.com/spf13/viper"
 	"gopkg.in/yaml.v3"
@@ -227,16 +228,17 @@ type GenerationParams struct {
 // or other custom/ prefixed models. These models are loaded from the config file
 // and merged into the custom provider in the model registry.
 type CustomModelConfig struct {
-	Name        string      `json:"name" yaml:"name"`
-	BaseURL     string      `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
-	APIKey      string      `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
-	Family      string      `json:"family,omitempty" yaml:"family,omitempty"`
-	Attachment  bool        `json:"attachment,omitempty" yaml:"attachment,omitempty"`
-	Reasoning   bool        `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
-	Temperature bool        `json:"temperature,omitempty" yaml:"temperature,omitempty"`
-	Knowledge   string      `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
-	Cost        CostConfig  `json:"cost" yaml:"cost"`
-	Limit       LimitConfig `json:"limit" yaml:"limit"`
+	Name         string      `json:"name" yaml:"name"`
+	BaseURL      string      `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
+	APIKey       string      `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
+	APIModelName string      `json:"apiModelName,omitempty" yaml:"apiModelName,omitempty"`
+	Family       string      `json:"family,omitempty" yaml:"family,omitempty"`
+	Attachment   bool        `json:"attachment,omitempty" yaml:"attachment,omitempty"`
+	Reasoning    bool        `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
+	Temperature  bool        `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	Knowledge    string      `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
+	Cost         CostConfig  `json:"cost" yaml:"cost"`
+	Limit        LimitConfig `json:"limit" yaml:"limit"`

 	// Generation parameter defaults for this model.
 	// These are applied when the user hasn't explicitly set the corresponding
@@ -497,7 +499,22 @@ mcpServers:
 # no-skills: false                          # Set to true to disable all skill loading
 # skill:                                    # Explicit skill files/dirs (disables auto-discovery)
 #   - "/path/to/skill.md"
-# skills-dir: "/path/to/skills"            # Override project-local directory for auto-discovery
+# skills-dir: "/path/to/skills"            # Scan this directory directly for skills (overrides auto-discovery)
+# skill-disable:                            # Hide skills from the model catalog by name (still usable via /skill:)
+#   - "some-skill"
+#
+# Skill files follow the agentskills.io spec. A SKILL.md frontmatter block
+# supports these fields:
+#   name: my-skill                          # required
+#   description: Use when ...               # required (basis for model discovery)
+#   license: MIT                            # optional SPDX identifier
+#   compatibility: claude-code, cursor      # optional targeted-environment note
+#   allowed-tools: read, bash               # optional (experimental) tool restriction
+#   disable-model-invocation: false         # optional; true hides from the catalog
+#   metadata:                               # optional arbitrary key/value pairs
+#     author: you
+#   tags: [example]                         # Kit extension
+#   when: on-demand                         # Kit extension

 # API Configuration (can also use environment variables)
 # provider-api-key: "your-api-key"         # API key for OpenAI, Anthropic, or Google
@@ -553,7 +570,7 @@ func FilepathOr[T any](key string, value *T) error {
 				absPath = filepath.Join(home, absPath[2:])
 			}
 			if !filepath.IsAbs(absPath) {
-				base := configPath
+				base := GetConfigPath()
 				if base == "" {
 					fmt.Fprintf(os.Stderr, "unable to build relative path to config.")
 					os.Exit(1)
@@ -580,11 +597,24 @@ func FilepathOr[T any](key string, value *T) error {
 	return nil
 }

-var configPath string
+var (
+	configPathMu sync.RWMutex
+	configPath   string
+)

 // SetConfigPath sets the configuration file path for resolving relative paths
 // in configuration values. This should be called when the configuration file
-// location is known.
+// location is known. It is safe for concurrent use.
 func SetConfigPath(path string) {
+	configPathMu.Lock()
+	defer configPathMu.Unlock()
 	configPath = path
 }
+
+// GetConfigPath returns the configuration file path previously set via
+// SetConfigPath. It is safe for concurrent use.
+func GetConfigPath() string {
+	configPathMu.RLock()
+	defer configPathMu.RUnlock()
+	return configPath
+}
@@ -0,0 +1,33 @@
+package config
+
+import (
+	"sync"
+	"testing"
+)
+
+// TestConfigPathConcurrentAccess exercises the mutex guarding the package-level
+// configPath global. Run with -race to detect the data race that motivated the
+// guard (concurrent kit.New() calls discovering a .kit.yml).
+func TestConfigPathConcurrentAccess(t *testing.T) {
+	t.Cleanup(func() { SetConfigPath("") })
+
+	const goroutines = 32
+	var wg sync.WaitGroup
+	wg.Add(goroutines * 2)
+	for range goroutines {
+		go func() {
+			defer wg.Done()
+			SetConfigPath("/tmp/kit.yml")
+		}()
+		go func() {
+			defer wg.Done()
+			_ = GetConfigPath()
+		}()
+	}
+	wg.Wait()
+
+	SetConfigPath("/tmp/final.yml")
+	if got := GetConfigPath(); got != "/tmp/final.yml" {
+		t.Fatalf("GetConfigPath() = %q, want /tmp/final.yml", got)
+	}
+}
@@ -1,5 +1,24 @@
 package extensions

+import (
+	"errors"
+)
+
+// ErrAgentBusy is returned (wrapped) when an extension API call that requires
+// the agent to be idle cannot proceed because the agent is still processing a
+// turn or post-turn hooks. Most notably, ctx.NewSession waits for idle
+// internally; if its wait deadline elapses it returns an error that wraps
+// this sentinel.
+//
+// Extensions can detect the condition with errors.Is:
+//
+//	if err := ctx.NewSession(prompt); err != nil {
+//	    if errors.Is(err, ext.ErrAgentBusy) {
+//	        // agent never settled — fall back to a queued message instead
+//	    }
+//	}
+var ErrAgentBusy = errors.New("agent is busy")
+
 // ---------------------------------------------------------------------------
 // Internal types (used by runner, NOT exposed to Yaegi)
 // ---------------------------------------------------------------------------
@@ -124,6 +143,48 @@ type Context struct {
 	//   })
 	SendMultimodalMessage func(text string, files []FilePart)

+	// NewSession ends the current session and starts a fresh one (matching
+	// the /new slash command). When prompt is non-empty it is submitted as
+	// the first user turn of the new session, with @file references
+	// expanded the same way they are for normal user input. Pass an empty
+	// string to start an empty session.
+	//
+	// If the agent is currently busy when NewSession is called (for example,
+	// from an OnAgentEnd hook that fires before the agent fully settles, or
+	// while post-turn formatters/linters are still running), the call blocks
+	// until the agent transitions to idle. This avoids the v0.79.0
+	// phase-handoff race where NewSession from OnAgentEnd would fail with
+	// "agent is busy" because TurnEnd fires before the busy flag clears.
+	// The wait has a generous internal timeout; if it elapses the returned
+	// error wraps ErrAgentBusy (detectable with errors.Is).
+	//
+	// Returns an error if the agent does not become idle within the wait
+	// window, if a registered BeforeSessionSwitch handler cancels the
+	// switch, or if the new session file cannot be created. In
+	// non-interactive (ACP / headless) mode this is a no-op that returns
+	// an error.
+	//
+	// Because NewSession may block, call it from a goroutine — not
+	// directly from inside an event handler that the agent loop is waiting
+	// on.
+	//
+	// Typical pattern: at the end of a phase, start a fresh session whose
+	// first prompt points at a handoff file:
+	//
+	//   api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
+	//       msgs := ctx.GetMessages()
+	//       if len(msgs) == 0 {
+	//           return
+	//       }
+	//       last := msgs[len(msgs)-1].Content
+	//       if strings.Contains(last, "<HANDOFF_READY>") {
+	//           go func() {
+	//               _ = ctx.NewSession("Read @HANDOFF.md and continue the next phase.")
+	//           }()
+	//       }
+	//   })
+	NewSession func(prompt string) error
+
 	// GetSessionUsage returns aggregated token usage and cost statistics
 	// for the current session. This includes total input/output tokens,
 	// cache read/write tokens, total cost, and request count.
@@ -716,7 +777,8 @@ type Context struct {
 	LoadSkillsFromDir func(dir string) SkillLoadResult

 	// DiscoverSkills finds skills in standard locations.
-	// Checks ~/.config/kit/skills/, .kit/skills/, .agents/skills/
+	// Checks ~/.agents/skills/, ~/.config/kit/skills/, <project>/.agents/skills/,
+	// and <project>/.kit/skills/.
 	DiscoverSkills func() SkillLoadResult

 	// InjectSkillAsContext sends a skill's content as a system message.
@@ -848,9 +910,24 @@ type Skill struct {
 	Content string
 	// Path is the absolute filesystem path.
 	Path string
-	// Tags are optional labels for categorization.
+	// License is an optional SPDX license identifier (agentskills.io field).
+	License string
+	// Compatibility is an optional note describing targeted environments
+	// (agentskills.io field).
+	Compatibility string
+	// Metadata is an optional bag of arbitrary string key/value pairs
+	// (agentskills.io field).
+	Metadata map[string]string
+	// AllowedTools optionally restricts which tools the skill may use
+	// (experimental agentskills.io field).
+	AllowedTools string
+	// DisableModelInvocation hides the skill from the model-facing catalog
+	// while keeping it available via explicit activation (agentskills.io field).
+	DisableModelInvocation bool
+	// Tags are optional labels for categorization. Kit extension.
 	Tags []string
 	// When controls automatic inclusion: "always", "on-demand", or file-glob.
+	// Kit extension.
 	When string
 }

@@ -2296,6 +2373,12 @@ type BeforeSessionSwitchEvent struct {
 	// Reason describes why the switch is happening: "new" for /new command,
 	// "clear" for /clear command.
 	Reason string
+	// InitialPrompt, when non-empty, is the prompt that will be submitted
+	// as the first user turn of the new session. Set when /new is invoked
+	// with an argument (e.g. "/new continue from HANDOFF.md") or when an
+	// extension calls ctx.NewSession(prompt). Extensions may inspect this
+	// to decide whether to allow the switch.
+	InitialPrompt string
 }

 func (e BeforeSessionSwitchEvent) Type() EventType { return BeforeSessionSwitch }
@@ -192,6 +192,9 @@ func normalizeContext(ctx Context) Context {
 	if ctx.SendMultimodalMessage == nil {
 		ctx.SendMultimodalMessage = func(string, []FilePart) {}
 	}
+	if ctx.NewSession == nil {
+		ctx.NewSession = func(string) error { return fmt.Errorf("new session not available") }
+	}
 	if ctx.GetSessionUsage == nil {
 		ctx.GetSessionUsage = func() SessionUsage { return SessionUsage{} }
 	}
@@ -28,6 +28,11 @@ func Symbols() interp.Exports {
 			"CommandDef":     reflect.ValueOf((*CommandDef)(nil)),
 			"PrintBlockOpts": reflect.ValueOf((*PrintBlockOpts)(nil)),

+			// Sentinel errors. Extensions detect them with errors.Is:
+			//
+			//   if errors.Is(err, ext.ErrAgentBusy) { ... }
+			"ErrAgentBusy": reflect.ValueOf(&ErrAgentBusy).Elem(),
+
 			// Session types
 			"SessionMessage": reflect.ValueOf((*SessionMessage)(nil)),
 			"ExtensionEntry": reflect.ValueOf((*ExtensionEntry)(nil)),
@@ -53,6 +53,11 @@ type AgentSetupOptions struct {
 	// Debug enables debug logging. When zero-value, viper is consulted.
 	// Only meaningful when ProviderConfig is also set.
 	Debug bool
+	// DebugLogger, if non-nil, is used directly as the engine/MCP debug
+	// logger — overriding the built-in SimpleDebugLogger / BufferedDebugLogger
+	// selected by Debug + UseBufferedLogger. Callers supply this when they
+	// want to route debug output into their own logging system.
+	DebugLogger tools.DebugLogger
 	// NoExtensions skips extension loading. When false, viper is consulted.
 	// Only meaningful when ProviderConfig is also set.
 	NoExtensions bool
@@ -192,7 +197,12 @@ func SetupAgent(ctx context.Context, opts AgentSetupOptions) (*AgentSetupResult,
 	// Create the appropriate debug logger.
 	var debugLogger tools.DebugLogger
 	var bufferedLogger *tools.BufferedDebugLogger
-	if debugEnabled {
+	switch {
+	case opts.DebugLogger != nil:
+		// Caller-supplied logger wins unconditionally. Its IsDebugEnabled()
+		// is the source of truth for whether downstream code emits messages.
+		debugLogger = opts.DebugLogger
+	case debugEnabled:
 		if opts.UseBufferedLogger {
 			bufferedLogger = tools.NewBufferedDebugLogger(true)
 			debugLogger = bufferedLogger
@@ -44,13 +44,14 @@ func loadCustomModelsFrom(v *viper.Viper) map[string]ModelInfo {
 // modelConfigToModelInfo converts a CustomModelConfig to a ModelInfo.
 func modelConfigToModelInfo(modelID string, cfg CustomModelConfig) ModelInfo {
 	info := ModelInfo{
-		ID:          modelID,
-		Name:        cfg.Name,
-		Attachment:  cfg.Attachment,
-		Reasoning:   cfg.Reasoning,
-		Temperature: cfg.Temperature,
-		BaseURL:     cfg.BaseURL,
-		APIKey:      cfg.APIKey,
+		ID:           modelID,
+		Name:         cfg.Name,
+		Attachment:   cfg.Attachment,
+		Reasoning:    cfg.Reasoning,
+		Temperature:  cfg.Temperature,
+		BaseURL:      cfg.BaseURL,
+		APIKey:       cfg.APIKey,
+		APIModelName: cfg.APIModelName,
 		Cost: Cost{
 			Input:  cfg.Cost.Input,
 			Output: cfg.Cost.Output,
@@ -287,17 +288,18 @@ type GenerationParams struct {
 // CustomModelConfig defines a custom model configuration loaded from the config file.
 // This is a duplicate here to avoid circular dependencies with internal/config.
 type CustomModelConfig struct {
-	Name        string                 `json:"name" yaml:"name"`
-	BaseURL     string                 `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
-	APIKey      string                 `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
-	Family      string                 `json:"family,omitempty" yaml:"family,omitempty"`
-	Attachment  bool                   `json:"attachment,omitempty" yaml:"attachment,omitempty"`
-	Reasoning   bool                   `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
-	Temperature bool                   `json:"temperature,omitempty" yaml:"temperature,omitempty"`
-	Knowledge   string                 `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
-	Cost        CostConfig             `json:"cost" yaml:"cost"`
-	Limit       LimitConfig            `json:"limit" yaml:"limit"`
-	Params      GenerationParamsConfig `json:"params,omitzero" yaml:"params,omitempty"`
+	Name         string                 `json:"name" yaml:"name"`
+	BaseURL      string                 `json:"baseUrl,omitempty" yaml:"baseUrl,omitempty"`
+	APIKey       string                 `json:"apiKey,omitempty" yaml:"apiKey,omitempty"`
+	APIModelName string                 `json:"apiModelName,omitempty" yaml:"apiModelName,omitempty"`
+	Family       string                 `json:"family,omitempty" yaml:"family,omitempty"`
+	Attachment   bool                   `json:"attachment,omitempty" yaml:"attachment,omitempty"`
+	Reasoning    bool                   `json:"reasoning,omitempty" yaml:"reasoning,omitempty"`
+	Temperature  bool                   `json:"temperature,omitempty" yaml:"temperature,omitempty"`
+	Knowledge    string                 `json:"knowledge,omitempty" yaml:"knowledge,omitempty"`
+	Cost         CostConfig             `json:"cost" yaml:"cost"`
+	Limit        LimitConfig            `json:"limit" yaml:"limit"`
+	Params       GenerationParamsConfig `json:"params,omitzero" yaml:"params,omitempty"`
 }

 // GenerationParamsConfig is the JSON/YAML-serializable form of generation
@@ -1533,7 +1533,12 @@ func createCustomProvider(ctx context.Context, config *ProviderConfig, modelName
 		return nil, wrapProviderErr("custom", "provider", err)
 	}

-	model, err := p.LanguageModel(ctx, modelName)
+	apiModelName := modelName
+	if modelInfo != nil && modelInfo.APIModelName != "" {
+		apiModelName = modelInfo.APIModelName
+	}
+
+	model, err := p.LanguageModel(ctx, apiModelName)
 	if err != nil {
 		return nil, wrapProviderErr("custom", "model", err)
 	}
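The override in this hunk reduces to a small fallback rule: send the custom model's `APIModelName` to the provider when it is set, otherwise send the configured model name unchanged. A minimal standalone sketch of that rule, assuming a trimmed-down `ModelInfo` and a hypothetical helper `resolveAPIModelName` (neither the helper nor the sample model names come from the patch):

```go
package main

import "fmt"

// ModelInfo is a trimmed-down stand-in for the real struct; only the field
// relevant to this sketch is kept.
type ModelInfo struct {
	APIModelName string
}

// resolveAPIModelName is a hypothetical helper, not part of the patch. It
// returns the per-model API name override when one is set and falls back to
// the name the user configured otherwise.
func resolveAPIModelName(modelName string, info *ModelInfo) string {
	if info != nil && info.APIModelName != "" {
		return info.APIModelName
	}
	return modelName
}

func main() {
	// No model info at all: the configured name is used as-is.
	fmt.Println(resolveAPIModelName("my-alias", nil))
	// Model info without an override: still the configured name.
	fmt.Println(resolveAPIModelName("my-alias", &ModelInfo{}))
	// Override present: the provider receives the override.
	fmt.Println(resolveAPIModelName("my-alias", &ModelInfo{APIModelName: "vendor/model-v1"}))
}
```

Keeping the fallback in one place would let the local alias and the name sent over the wire diverge without touching callers.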
@@ -16,17 +16,18 @@ var embeddedModelsJSON []byte

 // ModelInfo represents information about a specific model.
 type ModelInfo struct {
-	ID          string
-	Name        string
-	Family      string // Model family (e.g., "claude", "gpt", "gemini")
-	Attachment  bool
-	Reasoning   bool
-	Temperature bool
-	Cost        Cost
-	Limit       Limit
-	ProviderNPM string // Model-specific provider npm override (e.g. "@ai-sdk/anthropic")
-	BaseURL     string // Per-model base URL override (custom models only)
-	APIKey      string // Per-model API key override (custom models only)
+	ID           string
+	Name         string
+	Family       string // Model family (e.g., "claude", "gpt", "gemini")
+	Attachment   bool
+	Reasoning    bool
+	Temperature  bool
+	Cost         Cost
+	Limit        Limit
+	ProviderNPM  string // Model-specific provider npm override (e.g. "@ai-sdk/anthropic")
+	BaseURL      string // Per-model base URL override (custom models only)
+	APIKey       string // Per-model API key override (custom models only)
+	APIModelName string // Per-model API model name override (custom models only)

 	// Params holds per-model generation parameter defaults. These are applied
 	// when the user hasn't explicitly set the corresponding CLI flag or global
@@ -50,7 +50,7 @@ func TestPromptBuilder_WithSkills(t *testing.T) {
 	if !strings.Contains(result, "<description>Write code</description>") {
 		t.Error("missing skill description in XML")
 	}
-	if !strings.Contains(result, "<location>file:///tmp/coding/SKILL.md</location>") {
+	if !strings.Contains(result, "<location>/tmp/coding/SKILL.md</location>") {
 		t.Error("missing skill location")
 	}
 }
@@ -2,40 +2,173 @@
 //
 // Skills are markdown instruction files with optional YAML frontmatter that
 // provide domain-specific context, instructions, and workflows to the agent.
-// They follow a hierarchical discovery pattern similar to extensions:
+// They follow the cross-client agentskills.io discovery convention plus a
+// Kit-native location:
 //
-//	~/.config/kit/skills/           global skills directory
-//	.kit/skills/                    project-local skills directory
+//	~/.agents/skills/               user-level cross-client skills
+//	~/.config/kit/skills/           user-level Kit skills ($XDG_CONFIG_HOME aware)
+//	<project>/.agents/skills/       project-local cross-client skills
+//	<project>/.kit/skills/          project-local Kit skills
 //
-// Skills can be single .md/.txt files or subdirectories containing a SKILL.md file.
+// Skills can be single .md/.txt files or subdirectories containing a SKILL.md
+// file. Project-level skills take precedence over user-level skills when two
+// skills share the same name.
 package skills

 import (
 	"bytes"
+	"encoding/xml"
+	"errors"
 	"fmt"
+	"io/fs"
 	"os"
+	"path"
 	"path/filepath"
+	"regexp"
+	"sort"
 	"strings"

+	"github.com/charmbracelet/log"
 	"gopkg.in/yaml.v3"
 )

 // Skill represents a markdown-based instruction file that provides
 // domain-specific context and workflows to the agent.
+//
+// The Name and Description fields are required by the agentskills.io
+// specification. License, Compatibility, Metadata, and AllowedTools are
+// optional spec fields. Tags and When are Kit-specific extensions that other
+// clients ignore.
 type Skill struct {
-	// Name is the human-readable identifier for this skill.
+	// Name is the human-readable identifier for this skill. Required.
 	Name string `yaml:"name" json:"name"`
-	// Description summarises what this skill provides.
+	// Description summarises what this skill provides and when to use it.
+	// Required by the spec — it is the sole basis on which the model decides
+	// whether a skill is relevant, so a skill without one is omitted from the
+	// catalog.
 	Description string `yaml:"description" json:"description"`
 	// Content is the full markdown body (after frontmatter).
 	Content string `yaml:"-" json:"content"`
 	// Path is the absolute filesystem path the skill was loaded from.
 	Path string `yaml:"-" json:"path"`
-	// Tags are optional labels for categorisation.
+
+	// License is an optional SPDX license identifier (spec field).
+	License string `yaml:"license,omitempty" json:"license,omitempty"`
+	// Compatibility is an optional free-form note describing the environments
+	// or clients the skill targets (spec field). The model can use it to adapt
+	// execution.
+	Compatibility string `yaml:"compatibility,omitempty" json:"compatibility,omitempty"`
+	// Metadata is an optional bag of arbitrary string key/value pairs (spec
+	// field) for client-specific annotations.
+	Metadata map[string]string `yaml:"metadata,omitempty" json:"metadata,omitempty"`
+	// AllowedTools optionally restricts which tools the skill may use. This is
+	// an experimental spec field carried for portability; Kit does not yet
+	// enforce it.
+	AllowedTools string `yaml:"allowed-tools,omitempty" json:"allowed_tools,omitempty"`
+	// DisableModelInvocation, when true, hides the skill from the
+	// model-facing catalog (spec field). The skill can still be activated
+	// explicitly via the /skill: slash command.
+	DisableModelInvocation bool `yaml:"disable-model-invocation,omitempty" json:"disable_model_invocation,omitempty"`
+
+	// Tags are optional labels for categorisation. Kit extension.
 	Tags []string `yaml:"tags,omitempty" json:"tags,omitempty"`
 	// When controls automatic inclusion: "always", "on-demand", or a
-	// file-glob like "file:*.go".  Empty defaults to "on-demand".
+	// file-glob like "file:*.go". Empty defaults to "on-demand". Kit extension.
 	When string `yaml:"when,omitempty" json:"when,omitempty"`
+
+	// project records whether the skill was discovered in a project-local
+	// scope. Used internally for name-collision precedence (project > user).
+	project bool `yaml:"-" json:"-"`
+}
+
+// Diagnostic describes a validation problem with a skill. Severity is either
+// "error" (the skill cannot be used) or "warning" (the skill is usable but
+// non-compliant).
+type Diagnostic struct {
+	// Severity is "error" or "warning".
+	Severity string `json:"severity"`
+	// Field names the frontmatter field the diagnostic relates to, if any.
+	Field string `json:"field,omitempty"`
+	// Message is a human-readable description of the problem.
+	Message string `json:"message"`
+}
+
+// Validate checks the skill's required agentskills.io fields (name and
+// description) and returns a list of diagnostics. A nil result means no
+// problems were found. A missing description is reported as an error because
+// the spec makes it required for discovery.
+func (s *Skill) Validate() []Diagnostic {
+	var diags []Diagnostic
+	if strings.TrimSpace(s.Name) == "" {
+		diags = append(diags, Diagnostic{Severity: "error", Field: "name", Message: "name is required"})
+	}
+	if strings.TrimSpace(s.Description) == "" {
+		diags = append(diags, Diagnostic{
+			Severity: "error",
+			Field:    "description",
+			Message:  "description is required for skill discovery",
+		})
+	}
+	return diags
+}
+
+// hasError reports whether diags contains a diagnostic with "error" severity.
+func hasError(diags []Diagnostic) bool {
+	for _, d := range diags {
+		if d.Severity == "error" {
+			return true
+		}
+	}
+	return false
+}
+
+// BaseDir returns the directory the skill was loaded from. Relative resources
+// referenced by a skill (scripts/, references/, assets/) resolve against this
+// directory.
+func (s *Skill) BaseDir() string {
+	if s.Path == "" {
+		return ""
+	}
+	return filepath.Dir(s.Path)
+}
+
+// resourceDirs are the conventional subdirectories a skill may bundle.
+var resourceDirs = []string{"scripts", "references", "assets"}
+
+// maxResources caps how many bundled resources are enumerated to avoid
+// flooding the prompt for skills with large asset trees.
+const maxResources = 50
+
+// Resources walks one level into the skill's scripts/, references/, and
+// assets/ subdirectories and returns the relative paths of any files found
+// (slash-separated, relative to BaseDir). The result is capped at 50 entries.
+// It returns nil when the skill has no bundled resources or its Path is not a
+// real on-disk file.
+func (s *Skill) Resources() []string {
+	base := s.BaseDir()
+	if base == "" {
+		return nil
+	}
+	var out []string
+	for _, sub := range resourceDirs {
+		dir := filepath.Join(base, sub)
+		entries, err := os.ReadDir(dir)
+		if err != nil {
+			continue
+		}
+		for _, e := range entries {
+			if e.IsDir() {
+				continue
+			}
+			out = append(out, sub+"/"+e.Name())
+			if len(out) >= maxResources {
+				sort.Strings(out)
+				return out
+			}
+		}
+	}
+	sort.Strings(out)
+	return out
 }

 // frontmatterSep is the YAML frontmatter delimiter.
@@ -55,7 +188,14 @@ func LoadSkill(path string) (*Skill, error) {
 		abs = path
 	}

-	skill := &Skill{Path: abs}
+	return parseSkill(data, path, abs)
+}
+
+// parseSkill parses skill bytes that originated from srcPath (used for error
+// messages and name derivation) and records storePath as the skill's Path.
+// It is shared by the os-backed and fs.FS-backed loaders.
+func parseSkill(data []byte, srcPath, storePath string) (*Skill, error) {
+	skill := &Skill{Path: storePath}

 	content := string(data)

@@ -69,8 +209,8 @@ func LoadSkill(path string) (*Skill, error) {
 			// Strip an optional trailing newline right after the closing ---.
 			body = strings.TrimPrefix(body, "\n")

-			if err := yaml.Unmarshal([]byte(frontmatter), skill); err != nil {
-				return nil, fmt.Errorf("parsing frontmatter in %s: %w", path, err)
+			if err := unmarshalFrontmatter([]byte(frontmatter), skill); err != nil {
+				return nil, fmt.Errorf("parsing frontmatter in %s: %w", srcPath, err)
 			}
 			skill.Content = strings.TrimSpace(body)
 		} else {
@@ -83,18 +223,69 @@ func LoadSkill(path string) (*Skill, error) {

 	// Fallback: derive name from filename if frontmatter didn't set one.
 	if skill.Name == "" {
-		base := filepath.Base(path)
+		base := filepath.Base(srcPath)
 		ext := filepath.Ext(base)
 		skill.Name = strings.TrimSuffix(base, ext)
 		// Convert SKILL → directory name for SKILL.md files.
 		if strings.EqualFold(skill.Name, "SKILL") || strings.EqualFold(skill.Name, "skill") {
-			skill.Name = filepath.Base(filepath.Dir(path))
+			skill.Name = filepath.Base(filepath.Dir(srcPath))
 		}
 	}

 	return skill, nil
 }

+// unquotedColonRe matches a YAML scalar line whose value contains an unquoted
+// colon, e.g. `description: Use when: extracting tables`. This is the most
+// common frontmatter authoring mistake in cross-client skills and breaks
+// strict YAML parsing.
+var unquotedColonRe = regexp.MustCompile(`^(\s*[A-Za-z0-9_-]+):[ \t]+([^'"\n].*:.*)$`)
+
+// unmarshalFrontmatter unmarshals YAML frontmatter into skill, tolerating the
+// common "unquoted colon in a scalar value" mistake (e.g.
+// `description: Use when: …`). On a parse failure it quotes offending scalar
+// values and retries once before giving up.
+func unmarshalFrontmatter(frontmatter []byte, skill *Skill) error {
+	err := yaml.Unmarshal(frontmatter, skill)
+	if err == nil {
+		return nil
+	}
+
+	// Attempt a single recovery pass: quote scalar values that contain an
+	// unquoted colon, which is the dominant cross-client failure mode.
+	repaired, changed := repairUnquotedColons(string(frontmatter))
+	if !changed {
+		return err
+	}
+	if retryErr := yaml.Unmarshal([]byte(repaired), skill); retryErr != nil {
+		// The original error is more useful to the author.
+		return err
+	}
+	return nil
+}
+
+// repairUnquotedColons quotes scalar values containing an unquoted colon and
+// reports whether any line was changed.
+func repairUnquotedColons(frontmatter string) (string, bool) {
+	lines := strings.Split(frontmatter, "\n")
+	changed := false
+	for i, line := range lines {
+		m := unquotedColonRe.FindStringSubmatch(line)
+		if m == nil {
+			continue
+		}
+		key, value := m[1], strings.TrimRight(m[2], " \t")
+		// Escape backslashes and embedded double quotes before wrapping.
+		value = strings.NewReplacer(`\`, `\\`, `"`, `\"`).Replace(value)
+		lines[i] = fmt.Sprintf(`%s: "%s"`, key, value)
+		changed = true
+	}
+	if !changed {
+		return frontmatter, false
+	}
+	return strings.Join(lines, "\n"), true
+}
+
 // LoadSkillsFromDir loads all skills from a single directory. It looks for:
 //   - *.md and *.txt files directly in dir
 //   - SKILL.md (case-insensitive) in immediate subdirectories
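The unquoted-colon recovery pass can be tried in isolation. The sketch below is a simplified standalone re-implementation, not the patch's code verbatim: it reuses the same regular expression, but the function name `quoteUnquotedColons` and the sample frontmatter are illustrative assumptions, and it does not call a YAML parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unquotedColon matches "key: value" lines whose unquoted value contains a
// further colon, which strict YAML parsers reject.
var unquotedColon = regexp.MustCompile(`^(\s*[A-Za-z0-9_-]+):[ \t]+([^'"\n].*:.*)$`)

// quoteUnquotedColons (hypothetical name) wraps offending values in double
// quotes and reports whether any line was rewritten.
func quoteUnquotedColons(frontmatter string) (string, bool) {
	lines := strings.Split(frontmatter, "\n")
	changed := false
	for i, line := range lines {
		m := unquotedColon.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		value := strings.TrimRight(m[2], " \t")
		// Escape backslashes and quotes so the result is a valid
		// double-quoted YAML scalar.
		value = strings.NewReplacer(`\`, `\\`, `"`, `\"`).Replace(value)
		lines[i] = fmt.Sprintf(`%s: "%s"`, m[1], value)
		changed = true
	}
	return strings.Join(lines, "\n"), changed
}

func main() {
	in := "name: pdf-tables\ndescription: Use when: extracting tables"
	out, changed := quoteUnquotedColons(in)
	fmt.Println(changed)
	fmt.Println(out)
}
```

Because the pattern also matches values that YAML accepts as-is (a URL, for example), a pass like this is best run only after a strict parse has already failed, which is how the patch uses it.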
@@ -113,7 +304,7 @@ func LoadSkillsFromDir(dir string) ([]*Skill, error) {
 	}

 	var skills []*Skill
-	var errs []string
+	var errs []error

 	for _, entry := range entries {
 		full := filepath.Join(dir, entry.Name())
@@ -123,7 +314,7 @@ func LoadSkillsFromDir(dir string) ([]*Skill, error) {
 			if ext == ".md" || ext == ".txt" {
 				s, err := LoadSkill(full)
 				if err != nil {
-					errs = append(errs, err.Error())
+					errs = append(errs, err)
 					continue
 				}
 				skills = append(skills, s)
@@ -140,7 +331,7 @@ func LoadSkillsFromDir(dir string) ([]*Skill, error) {
 			if !se.IsDir() && strings.EqualFold(se.Name(), "SKILL.md") {
 				s, err := LoadSkill(filepath.Join(full, se.Name()))
 				if err != nil {
-					errs = append(errs, err.Error())
+					errs = append(errs, err)
 					continue
 				}
 				skills = append(skills, s)
@@ -150,59 +341,204 @@ func LoadSkillsFromDir(dir string) ([]*Skill, error) {
 	}

 	if len(errs) > 0 {
-		return skills, fmt.Errorf("some skills failed to load: %s", strings.Join(errs, "; "))
+		return skills, fmt.Errorf("some skills failed to load: %w", errors.Join(errs...))
 	}
 	return skills, nil
 }

-// LoadSkills auto-discovers skills from standard directories:
-//  1. Global: $XDG_CONFIG_HOME/kit/skills/ (default ~/.config/kit/skills/)
-//  2. Project-local: <cwd>/.kit/skills/
+// LoadSkillsFromFS is the fs.FS-typed counterpart of LoadSkillsFromDir. It
+// walks fsys starting at root (which may be "." or a subdirectory), finds
+// top-level *.md and *.txt files plus SKILL.md files at any depth, parses
+// YAML frontmatter + markdown body, and returns the loaded skills.
 //
-// Skills from project-local directories take precedence (appended last).
-// cwd is the working directory for project-local discovery; if empty the
-// current working directory is used.
-func LoadSkills(cwd string) ([]*Skill, error) {
+// Because fs.FS has no notion of an absolute on-disk path, each loaded skill's
+// Path is set to its slash-separated path within fsys. Files that fail to
+// parse are skipped and reported via the returned error.
+func LoadSkillsFromFS(fsys fs.FS, root string) ([]*Skill, error) {
+	if fsys == nil {
+		return nil, nil
+	}
+	if root == "" {
+		root = "."
+	}
+
+	var skills []*Skill
+	var errs []error
+
+	walkErr := fs.WalkDir(fsys, root, func(p string, d fs.DirEntry, err error) error {
+		if err != nil {
+			return nil // skip unreadable entries rather than aborting the walk
+		}
+		if d.IsDir() {
+			return nil
+		}
+		name := d.Name()
+		ext := strings.ToLower(path.Ext(name))
+		if ext != ".md" && ext != ".txt" {
+			return nil
+		}
+		// Top-level .md/.txt files, or SKILL.md anywhere.
+		isTopLevel := path.Dir(p) == root
+		if !isTopLevel && !strings.EqualFold(name, "SKILL.md") {
+			return nil
+		}
+		data, readErr := fs.ReadFile(fsys, p)
+		if readErr != nil {
+			errs = append(errs, fmt.Errorf("reading skill %s: %w", p, readErr))
+			return nil
+		}
+		s, parseErr := parseSkill(data, p, p)
+		if parseErr != nil {
+			errs = append(errs, parseErr)
+			return nil
+		}
+		skills = append(skills, s)
+		return nil
+	})
+	if walkErr != nil {
+		return skills, fmt.Errorf("walking skills fs at %s: %w", root, walkErr)
+	}
+	if len(errs) > 0 {
+		return skills, fmt.Errorf("some skills failed to load: %w", errors.Join(errs...))
+	}
+	return skills, nil
+}
+
+// LoadUserSkills discovers skills from the user-level scopes only:
+//
+//  1. ~/.agents/skills/ (cross-client convention)
+//  2. $XDG_CONFIG_HOME/kit/skills/ (default ~/.config/kit/skills/)
+//
+// The returned skills are not yet validated or deduplicated; pass them through
+// Combine together with project skills to produce the final catalog set.
+func LoadUserSkills() []*Skill {
+	var loaded []*Skill
+	if home, err := os.UserHomeDir(); err == nil && home != "" {
+		dir := filepath.Join(home, ".agents", "skills")
+		ss, loadErr := LoadSkillsFromDir(dir)
+		if loadErr != nil {
+			// Missing directories are already swallowed by LoadSkillsFromDir,
+			// so a non-nil error here is genuine (permission denied, read
+			// failure, or a malformed skill file) and would otherwise yield a
+			// silently partial catalog.
+			log.Warn("failed to load some user skills", "dir", dir, "err", loadErr)
+		}
+		loaded = append(loaded, ss...)
+	}
+	if g := globalSkillsDir(); g != "" {
+		ss, loadErr := LoadSkillsFromDir(g)
+		if loadErr != nil {
+			log.Warn("failed to load some user skills", "dir", g, "err", loadErr)
+		}
+		loaded = append(loaded, ss...)
+	}
+	for _, s := range loaded {
+		s.project = false
+	}
+	return loaded
+}
+
+// LoadProjectSkills discovers skills from the project-local scopes only:
+//
+//  1. <cwd>/.agents/skills/ (cross-client convention)
+//  2. <cwd>/.kit/skills/ (Kit-specific)
+//
+// Because project-local skills are injected into the system prompt, callers
+// may wish to gate this on a trust check before including the result. The
+// returned skills are not yet validated or deduplicated; pass them through
+// Combine.
+func LoadProjectSkills(cwd string) []*Skill {
 	if cwd == "" {
 		cwd, _ = os.Getwd()
 	}
-
-	seen := make(map[string]bool)
-	var all []*Skill
-
-	addUnique := func(skills []*Skill) {
-		for _, s := range skills {
-			if !seen[s.Path] {
-				seen[s.Path] = true
-				all = append(all, s)
-			}
+	var loaded []*Skill
+	for _, dir := range []string{
+		filepath.Join(cwd, ".agents", "skills"),
+		filepath.Join(cwd, ".kit", "skills"),
+	} {
+		ss, loadErr := LoadSkillsFromDir(dir)
+		if loadErr != nil {
+			log.Warn("failed to load some project skills", "dir", dir, "err", loadErr)
 		}
+		loaded = append(loaded, ss...)
+	}
+	for _, s := range loaded {
+		s.project = true
+	}
+	return loaded
+}
+
+// Combine validates and deduplicates the union of user-level and project-level
+// skills. Skills missing a required description are skipped with a logged
+// warning; when two skills share a Name the project-level one wins (also
+// logged). User skills are considered before project skills so first-seen
+// ordering is stable.
+func Combine(user, project []*Skill) []*Skill {
+	combined := make([]*Skill, 0, len(user)+len(project))
+	combined = append(combined, user...)
+	combined = append(combined, project...)
+	return finalizeSkills(combined)
+}
+
+// LoadSkills auto-discovers skills from the standard agentskills.io scopes:
+//
+//  1. User-level: ~/.agents/skills/ (cross-client convention)
+//  2. User-level: $XDG_CONFIG_HOME/kit/skills/ (default ~/.config/kit/skills/)
+//  3. Project-local: <cwd>/.agents/skills/ (cross-client convention)
+//  4. Project-local: <cwd>/.kit/skills/ (Kit-specific)
+//
+// When two skills share the same Name, the project-level skill takes
+// precedence over a user-level one and a warning is logged. cwd is the working
+// directory for project-local discovery; if empty the current working
+// directory is used.
+func LoadSkills(cwd string) ([]*Skill, error) {
+	return Combine(LoadUserSkills(), LoadProjectSkills(cwd)), nil
+}
+
+// finalizeSkills applies validation (skipping skills missing a required
+// description) and name-collision precedence (project overrides user). It
+// preserves first-seen ordering for stable catalog output.
+func finalizeSkills(loaded []*Skill) []*Skill {
+	byName := make(map[string]int) // name → index in result
+	var result []*Skill
+
+	for _, s := range loaded {
+		if diags := s.Validate(); hasError(diags) {
+			for _, d := range diags {
+				if d.Severity == "error" {
+					log.Warn("skipping skill: validation failed", "path", s.Path, "field", d.Field, "reason", d.Message)
+				}
+			}
+			continue
+		}
+
+		if idx, ok := byName[s.Name]; ok {
+			existing := result[idx]
+			// Project-level skills override user-level skills.
+			if s.project && !existing.project {
+				log.Warn("skill name collision: project skill overrides user skill",
+					"name", s.Name, "project", s.Path, "user", existing.Path)
+				result[idx] = s
+			} else {
+				log.Warn("skill name collision: keeping earlier skill, ignoring duplicate",
+					"name", s.Name, "kept", existing.Path, "ignored", s.Path)
+			}
+			continue
+		}
+
+		byName[s.Name] = len(result)
+		result = append(result, s)
 	}

-	// Global skills.
-	globalDir := globalSkillsDir()
-	if globalDir != "" {
-		global, _ := LoadSkillsFromDir(globalDir)
-		addUnique(global)
-	}
-
-	// Project-local skills: .agents/skills/ (standardized cross-tool convention).
-	agentsDir := filepath.Join(cwd, ".agents", "skills")
-	agentsSkills, _ := LoadSkillsFromDir(agentsDir)
-	addUnique(agentsSkills)
-
-	// Project-local skills: .kit/skills/ (kit-specific).
-	localDir := filepath.Join(cwd, ".kit", "skills")
-	local, _ := LoadSkillsFromDir(localDir)
-	addUnique(local)
-
-	return all, nil
+	return result
 }

 // FormatForPrompt formats skills as metadata-only XML for inclusion in a
 // system prompt. Only the name, description, and file location are included;
-// the agent reads the full skill file on demand using the read tool. This
-
+// the agent reads the full skill file on demand using the read tool. Skill
+// fields are XML-escaped so that descriptions containing <, >, & or quotes
+// produce valid markup. Skills with DisableModelInvocation set are omitted
+// from the catalog (they remain available via the /skill: slash command).
 func FormatForPrompt(skills []*Skill) string {
 	if len(skills) == 0 {
 		return ""
@@ -214,17 +550,63 @@ func FormatForPrompt(skills []*Skill) string {
 	buf.WriteString("When a skill file references a relative path, resolve it against the skill directory (parent of SKILL.md) and use that absolute path in tool commands.\n")
 	buf.WriteString("\n<available_skills>\n")

+	emitted := 0
 	for _, s := range skills {
-		buf.WriteString("  <skill>\n")
-		fmt.Fprintf(&buf, "    <name>%s</name>\n", s.Name)
-		if s.Description != "" {
-			fmt.Fprintf(&buf, "    <description>%s</description>\n", s.Description)
+		if s.DisableModelInvocation {
+			continue
 		}
-		fmt.Fprintf(&buf, "    <location>file://%s</location>\n", s.Path)
+		buf.WriteString("  <skill>\n")
+		fmt.Fprintf(&buf, "    <name>%s</name>\n", escapeXML(s.Name))
+		if s.Description != "" {
+			fmt.Fprintf(&buf, "    <description>%s</description>\n", escapeXML(s.Description))
+		}
+		if s.Compatibility != "" {
+			fmt.Fprintf(&buf, "    <compatibility>%s</compatibility>\n", escapeXML(s.Compatibility))
+		}
+		fmt.Fprintf(&buf, "    <location>%s</location>\n", escapeXML(s.Path))
 		buf.WriteString("  </skill>\n")
+		emitted++
 	}

 	buf.WriteString("</available_skills>")
+	if emitted == 0 {
+		return ""
+	}
+	return buf.String()
+}
+
+// escapeXML escapes a string for safe inclusion as XML text content.
+func escapeXML(s string) string {
+	var buf bytes.Buffer
+	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
+		return s
+	}
+	return buf.String()
+}
+
+// FormatResources renders a skill's bundled resources as a <skill_resources>
+// block, or returns the empty string when the skill bundles no resources. It
+// is used when a skill is explicitly activated so the model knows which files
+// it can read without enumerating them itself.
+func FormatResources(resources []string) string {
+	if len(resources) == 0 {
+		return ""
+	}
+	var buf bytes.Buffer
+	buf.WriteString("<skill_resources>\n")
+	limit := len(resources)
+	truncated := false
+	if limit > maxResources {
+		limit = maxResources
+		truncated = true
+	}
+	for _, r := range resources[:limit] {
+		fmt.Fprintf(&buf, "  <file>%s</file>\n", escapeXML(r))
+	}
+	if truncated {
+		buf.WriteString("  <!-- (truncated) -->\n")
+	}
+	buf.WriteString("</skill_resources>")
 	return buf.String()
 }
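The catalog escaping relies on `xml.EscapeText` from the standard library. A minimal sketch of what that call does to a description containing markup characters; the helper name `escapeText` and the sample string are illustrative, not taken from the patch:

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeText escapes s for use as XML character data. On the (unlikely)
// write error it returns the input unchanged, mirroring a best-effort policy.
func escapeText(s string) string {
	var buf bytes.Buffer
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return s
	}
	return buf.String()
}

func main() {
	desc := `use when <tag> & "quoted"`
	// Angle brackets, ampersands, and quotes are replaced with entities, so
	// the surrounding element stays well-formed.
	fmt.Printf("<description>%s</description>\n", escapeText(desc))
}
```

Note that `xml.EscapeText` emits numeric character references for quotes (`&#34;`, `&#39;`) rather than `&quot;`, so tests should assert on those forms.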

@@ -0,0 +1,70 @@
+package skills
+
+import (
+	"testing"
+	"testing/fstest"
+)
+
+func TestLoadSkillsFromFS(t *testing.T) {
+	fsys := fstest.MapFS{
+		"top.md":        {Data: []byte("---\nname: top-skill\ndescription: a top level skill\n---\nbody here")},
+		"notes.txt":     {Data: []byte("plain text skill")},
+		"deep/SKILL.md": {Data: []byte("---\nname: deep-skill\n---\ndeep body")},
+		"deep/other.md": {Data: []byte("ignored non-SKILL nested md")},
+		"ignore.json":   {Data: []byte("{}")},
+	}
+
+	got, err := LoadSkillsFromFS(fsys, ".")
+	if err != nil {
+		t.Fatalf("LoadSkillsFromFS error: %v", err)
+	}
+
+	byName := map[string]*Skill{}
+	for _, s := range got {
+		byName[s.Name] = s
+	}
+
+	if _, ok := byName["top-skill"]; !ok {
+		t.Errorf("top-skill not loaded; got %v", names(got))
+	}
+	if _, ok := byName["notes"]; !ok {
+		t.Errorf("notes (txt) not loaded; got %v", names(got))
+	}
+	if _, ok := byName["deep-skill"]; !ok {
+		t.Errorf("deep SKILL.md not loaded; got %v", names(got))
+	}
+	if _, ok := byName["other"]; ok {
+		t.Errorf("nested non-SKILL .md should be ignored; got %v", names(got))
+	}
+	if len(got) != 3 {
+		t.Errorf("expected 3 skills, got %d: %v", len(got), names(got))
+	}
+
+	// Content/description parsed from frontmatter.
+	if s := byName["top-skill"]; s != nil {
+		if s.Description != "a top level skill" {
+			t.Errorf("description = %q", s.Description)
+		}
+		if s.Content != "body here" {
+			t.Errorf("content = %q", s.Content)
+		}
+		if s.Path != "top.md" {
+			t.Errorf("path = %q, want top.md", s.Path)
+		}
+	}
+}
+
+func TestLoadSkillsFromFSNil(t *testing.T) {
+	got, err := LoadSkillsFromFS(nil, ".")
+	if err != nil || got != nil {
+		t.Fatalf("nil fs should yield (nil, nil), got (%v, %v)", got, err)
+	}
+}
+
+func names(skills []*Skill) []string {
+	out := make([]string, 0, len(skills))
+	for _, s := range skills {
+		out = append(out, s.Name)
+	}
+	return out
+}
@@ -243,15 +243,24 @@ func TestLoadSkillsFromDir_CaseInsensitiveSKILLmd(t *testing.T) {
 // LoadSkills (auto-discovery)
 // ---------------------------------------------------------------------------

+func writeSkill(t *testing.T, path, name, desc, body string) {
+	t.Helper()
+	content := "---\nname: " + name + "\ndescription: " + desc + "\n---\n" + body
+	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
+		t.Fatal(err)
+	}
+}
+
 func TestLoadSkills_ProjectLocal(t *testing.T) {
 	dir := t.TempDir()
+	// Isolate user-level scopes so the host machine's skills don't leak in.
+	t.Setenv("XDG_CONFIG_HOME", filepath.Join(dir, "xdg"))
+	t.Setenv("HOME", filepath.Join(dir, "home"))
 	skillsDir := filepath.Join(dir, ".kit", "skills")
 	if err := os.MkdirAll(skillsDir, 0o755); err != nil {
 		t.Fatal(err)
 	}
-	if err := os.WriteFile(filepath.Join(skillsDir, "local.md"), []byte("Local skill"), 0o644); err != nil {
-		t.Fatal(err)
-	}
+	writeSkill(t, filepath.Join(skillsDir, "local.md"), "local", "A local skill", "Local skill")

 	skills, err := LoadSkills(dir)
 	if err != nil {
@@ -265,37 +274,64 @@ func TestLoadSkills_ProjectLocal(t *testing.T) {
 	}
 }

-func TestLoadSkills_Deduplication(t *testing.T) {
+// TestLoadSkills_SkipsMissingDescription verifies that a skill without a
+// required description is skipped during auto-discovery (gap #2).
+func TestLoadSkills_SkipsMissingDescription(t *testing.T) {
 	dir := t.TempDir()
-
-	// Set XDG_CONFIG_HOME to our temp dir so global and local overlap.
-	t.Setenv("XDG_CONFIG_HOME", dir)
-
-	globalDir := filepath.Join(dir, "kit", "skills")
-	localDir := filepath.Join(dir, ".kit", "skills")
-
-	if err := os.MkdirAll(globalDir, 0o755); err != nil {
+	t.Setenv("XDG_CONFIG_HOME", filepath.Join(dir, "xdg"))
+	t.Setenv("HOME", filepath.Join(dir, "home"))
+	skillsDir := filepath.Join(dir, ".kit", "skills")
+	if err := os.MkdirAll(skillsDir, 0o755); err != nil {
 		t.Fatal(err)
 	}
-	if err := os.MkdirAll(localDir, 0o755); err != nil {
-		t.Fatal(err)
-	}
-
-	// Same content in both directories but different paths — both should load.
-	if err := os.WriteFile(filepath.Join(globalDir, "shared.md"), []byte("Global version"), 0o644); err != nil {
-		t.Fatal(err)
-	}
-	if err := os.WriteFile(filepath.Join(localDir, "shared.md"), []byte("Local version"), 0o644); err != nil {
+	// No description — should be skipped.
+	if err := os.WriteFile(filepath.Join(skillsDir, "nodesc.md"), []byte("Just a body"), 0o644); err != nil {
 		t.Fatal(err)
 	}
+	writeSkill(t, filepath.Join(skillsDir, "good.md"), "good", "Has a description", "Body")

 	skills, err := LoadSkills(dir)
 	if err != nil {
 		t.Fatal(err)
 	}
-	// Different absolute paths = both loaded.
-	if len(skills) != 2 {
-		t.Fatalf("expected 2 skills (different paths), got %d", len(skills))
+	if len(skills) != 1 {
+		t.Fatalf("expected 1 skill (missing-description skipped), got %d", len(skills))
+	}
+	if skills[0].Name != "good" {
+		t.Errorf("Name = %q, want %q", skills[0].Name, "good")
+	}
+}
+
+// TestLoadSkills_NameCollisionPrecedence verifies project-level skills override
+// user-level skills with the same name (gap #5).
+func TestLoadSkills_NameCollisionPrecedence(t *testing.T) {
+	dir := t.TempDir()
+
+	// Set XDG_CONFIG_HOME so the user-level skill lives under our temp dir.
+	t.Setenv("XDG_CONFIG_HOME", dir)
+	t.Setenv("HOME", filepath.Join(dir, "home"))
+
+	userDir := filepath.Join(dir, "kit", "skills")
+	projectDir := filepath.Join(dir, ".kit", "skills")
+	if err := os.MkdirAll(userDir, 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.MkdirAll(projectDir, 0o755); err != nil {
+		t.Fatal(err)
+	}
+
+	writeSkill(t, filepath.Join(userDir, "shared.md"), "shared", "User version", "USER")
+	writeSkill(t, filepath.Join(projectDir, "shared.md"), "shared", "Project version", "PROJECT")
+
+	skills, err := LoadSkills(dir)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(skills) != 1 {
+		t.Fatalf("expected 1 skill (deduped by name), got %d", len(skills))
+	}
+	if !strings.Contains(skills[0].Content, "PROJECT") {
+		t.Errorf("expected project version to win, got content %q", skills[0].Content)
 	}
 }
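The precedence rule this test checks (a project skill replaces a same-named user skill, while the first-seen position is kept) can be illustrated without touching the filesystem. A minimal sketch, assuming a reduced `skill` type and a hypothetical `dedupeByName` function that omits the validation and logging done by the real code:

```go
package main

import "fmt"

// skill is a reduced stand-in carrying only what the precedence rule needs.
type skill struct {
	name    string
	path    string
	project bool
}

// dedupeByName (hypothetical) keeps one skill per name. A project skill
// replaces an earlier user skill in place; any other duplicate is dropped.
func dedupeByName(loaded []skill) []skill {
	byName := make(map[string]int) // name -> index in result
	var result []skill
	for _, s := range loaded {
		if idx, ok := byName[s.name]; ok {
			if s.project && !result[idx].project {
				result[idx] = s
			}
			continue
		}
		byName[s.name] = len(result)
		result = append(result, s)
	}
	return result
}

func main() {
	// User skills are listed first, then project skills.
	loaded := []skill{
		{name: "shared", path: "~/.config/kit/skills/shared.md"},
		{name: "other", path: "~/.config/kit/skills/other.md"},
		{name: "shared", path: ".kit/skills/shared.md", project: true},
	}
	for _, s := range dedupeByName(loaded) {
		fmt.Println(s.name, s.path)
	}
}
```

Replacing in place rather than appending keeps the catalog order independent of which scope eventually wins.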

@@ -321,8 +357,8 @@ func TestFormatForPrompt_SingleSkill(t *testing.T) {
 	if !strings.Contains(result, "<description>A test</description>") {
 		t.Errorf("result should contain description in XML")
 	}
-	if !strings.Contains(result, "<location>file:///tmp/test-skill/SKILL.md</location>") {
-		t.Errorf("result should contain file location")
+	if !strings.Contains(result, "<location>/tmp/test-skill/SKILL.md</location>") {
+		t.Errorf("result should contain bare file location (no file:// prefix)")
 	}
 	if !strings.Contains(result, "<available_skills>") {
 		t.Errorf("result should contain available_skills root element")
@@ -352,3 +388,149 @@ func TestFormatForPrompt_MultipleSkills(t *testing.T) {
 		t.Error("missing preamble instructions")
 	}
 }
+
+// ---------------------------------------------------------------------------
+// agentskills.io spec compliance (issue #65)
+// ---------------------------------------------------------------------------
+
+// TestFormatForPrompt_XMLEscaping verifies special characters in name and
+// description are escaped so the catalog is valid XML (gap #1).
+func TestFormatForPrompt_XMLEscaping(t *testing.T) {
+	skills := []*Skill{
+		{Name: "a&b", Description: "use when <tag> & \"quoted\"", Path: "/tmp/x"},
+	}
+	result := FormatForPrompt(skills)
+	if strings.Contains(result, "<tag>") {
+		t.Errorf("raw < should have been escaped, got: %q", result)
+	}
+	if !strings.Contains(result, "&lt;tag&gt;") {
+		t.Errorf("expected escaped <tag>, got: %q", result)
+	}
+	if !strings.Contains(result, "a&amp;b") {
+		t.Errorf("expected escaped ampersand in name, got: %q", result)
+	}
+}
+
+// TestFormatForPrompt_DisableModelInvocation verifies that a skill flagged
+// disable-model-invocation is omitted from the catalog (gap #10).
+func TestFormatForPrompt_DisableModelInvocation(t *testing.T) {
+	skills := []*Skill{
+		{Name: "visible", Description: "shown", Path: "/tmp/a"},
+		{Name: "hidden", Description: "not shown", Path: "/tmp/b", DisableModelInvocation: true},
+	}
+	result := FormatForPrompt(skills)
+	if !strings.Contains(result, "<name>visible</name>") {
+		t.Error("visible skill should be in catalog")
+	}
+	if strings.Contains(result, "<name>hidden</name>") {
+		t.Error("disable-model-invocation skill should be omitted from catalog")
+	}
+}
+
+// TestLoadSkill_NewSpecFields verifies the new frontmatter fields parse (gap #6).
+func TestLoadSkill_NewSpecFields(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "spec.md")
+	content := `---
+name: spec-skill
+description: A spec-compliant skill
+license: MIT
+compatibility: claude-code, cursor
+allowed-tools: read, bash
+disable-model-invocation: true
+metadata:
+  author: jane
+  version: "1.2"
+---
+Body.`
+	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	s, err := LoadSkill(path)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if s.License != "MIT" {
+		t.Errorf("License = %q, want MIT", s.License)
+	}
+	if s.Compatibility != "claude-code, cursor" {
+		t.Errorf("Compatibility = %q", s.Compatibility)
+	}
+	if s.AllowedTools != "read, bash" {
+		t.Errorf("AllowedTools = %q", s.AllowedTools)
+	}
+	if !s.DisableModelInvocation {
+		t.Error("DisableModelInvocation should be true")
+	}
+	if s.Metadata["author"] != "jane" || s.Metadata["version"] != "1.2" {
+		t.Errorf("Metadata = %v", s.Metadata)
+	}
+}
+
+// TestLoadSkill_UnquotedColonFallback verifies the YAML repair fallback for
+// the common `description: Use when: ...` mistake (gap #9).
+func TestLoadSkill_UnquotedColonFallback(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "colon.md")
+	content := "---\nname: colon-skill\ndescription: Use when: extracting tables from PDFs\n---\nBody."
+	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	s, err := LoadSkill(path)
+	if err != nil {
+		t.Fatalf("expected unquoted-colon fallback to succeed, got error: %v", err)
+	}
+	if s.Name != "colon-skill" {
+		t.Errorf("Name = %q", s.Name)
+	}
+	if !strings.Contains(s.Description, "extracting tables") {
+		t.Errorf("Description = %q", s.Description)
+	}
+}
+
+// TestValidate verifies the Validate diagnostics (gaps #2, #15).
+func TestValidate(t *testing.T) {
+	missing := &Skill{Name: "x"}
+	diags := missing.Validate()
+	if !hasError(diags) {
+		t.Error("expected an error diagnostic for missing description")
+	}
+
+	ok := &Skill{Name: "x", Description: "y"}
+	if len(ok.Validate()) != 0 {
+		t.Error("expected no diagnostics for a complete skill")
+	}
+}
+
+// TestSkillResources verifies bundled-resource enumeration (gaps #11, #15).
+func TestSkillResources(t *testing.T) {
+	dir := t.TempDir()
+	skillDir := filepath.Join(dir, "my-skill")
+	if err := os.MkdirAll(filepath.Join(skillDir, "scripts"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.MkdirAll(filepath.Join(skillDir, "references"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(skillDir, "scripts", "run.py"), []byte("x"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(skillDir, "references", "REF.md"), []byte("x"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	s := &Skill{Name: "my-skill", Path: filepath.Join(skillDir, "SKILL.md")}
+	res := s.Resources()
+	if len(res) != 2 {
+		t.Fatalf("expected 2 resources, got %d: %v", len(res), res)
+	}
+	if s.BaseDir() != skillDir {
+		t.Errorf("BaseDir = %q, want %q", s.BaseDir(), skillDir)
+	}
+	formatted := FormatResources(res)
+	if !strings.Contains(formatted, "<file>references/REF.md</file>") {
+		t.Errorf("FormatResources output missing reference: %q", formatted)
+	}
+	if !strings.Contains(formatted, "<file>scripts/run.py</file>") {
+		t.Errorf("FormatResources output missing script: %q", formatted)
+	}
+}
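The escaping behaviour that TestFormatForPrompt_XMLEscaping checks can be approximated with the standard library. This is a sketch under assumptions: `escapeXML` and `catalogEntry` are hypothetical helpers, and the diff does not show whether FormatForPrompt itself uses `encoding/xml`.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeXML returns s with XML special characters replaced by entities.
func escapeXML(s string) string {
	var buf bytes.Buffer
	// EscapeText only fails when the writer fails; bytes.Buffer writes do not.
	_ = xml.EscapeText(&buf, []byte(s))
	return buf.String()
}

// catalogEntry renders a hypothetical name/description pair with both
// fields escaped, so the surrounding catalog stays well-formed.
func catalogEntry(name, description string) string {
	return fmt.Sprintf("<name>%s</name><description>%s</description>",
		escapeXML(name), escapeXML(description))
}

func main() {
	fmt.Println(catalogEntry("a&b", "use when <tag> applies"))
}
```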
@@ -0,0 +1,142 @@
+// Package skilltool provides the built-in activate_skill tool, a dedicated
+// activation entry point for agentskills.io skills (issue #65, gaps #13/#14).
+//
+// While a skill can always be activated by reading its SKILL.md with the
+// generic read tool, a dedicated tool offers an enum-constrained skill name
+// (preventing hallucinated names), bundled-resource enumeration, and
+// per-session deduplication so the same skill is not re-injected repeatedly.
+package skilltool
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"sort"
+	"strings"
+	"sync"
+
+	"charm.land/fantasy"
+
+	"github.com/mark3labs/kit/internal/skills"
+)
+
+// SkillProvider returns the skills currently available for activation. It is
+// queried on every call so runtime skill mutations are reflected.
+type SkillProvider func() []*skills.Skill
+
+type activateArgs struct {
+	Name string `json:"name"`
+}
+
+// activateSkillTool implements fantasy.AgentTool.
+type activateSkillTool struct {
+	info            fantasy.ToolInfo
+	provider        SkillProvider
+	providerOptions fantasy.ProviderOptions
+
+	mu        sync.Mutex
+	activated map[string]bool // session-level dedup tracking
+}
+
+func (t *activateSkillTool) Info() fantasy.ToolInfo                   { return t.info }
+func (t *activateSkillTool) ProviderOptions() fantasy.ProviderOptions { return t.providerOptions }
+func (t *activateSkillTool) SetProviderOptions(opts fantasy.ProviderOptions) {
+	t.providerOptions = opts
+}
+
+// New builds the activate_skill tool. names is the initial set of skill names
+// used to populate the enum constraint on the name parameter; provider is
+// queried at call time to resolve the skill by name (so runtime additions
+// resolve even if absent from the enum). Returns nil when names is empty or
+// provider is nil.
+func New(names []string, provider SkillProvider) fantasy.AgentTool {
+	if len(names) == 0 || provider == nil {
+		return nil
+	}
+	sorted := append([]string(nil), names...)
+	sort.Strings(sorted)
+	enum := make([]any, len(sorted))
+	for i, n := range sorted {
+		enum[i] = n
+	}
+
+	return &activateSkillTool{
+		info: fantasy.ToolInfo{
+			Name: "activate_skill",
+			Description: "Activate a skill by name to load its full instructions into context. " +
+				"Use this when a task matches a skill listed in <available_skills>. " +
+				"The skill body and a list of its bundled resources are returned.",
+			Parameters: map[string]any{
+				"name": map[string]any{
+					"type":        "string",
+					"description": "The exact name of the skill to activate.",
+					"enum":        enum,
+				},
+			},
+			Required: []string{"name"},
+			Parallel: false,
+		},
+		provider:  provider,
+		activated: map[string]bool{},
+	}
+}
+
+func (t *activateSkillTool) Run(_ context.Context, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+	var args activateArgs
+	if call.Input != "" && call.Input != "{}" {
+		if err := json.Unmarshal([]byte(call.Input), &args); err != nil {
+			return fantasy.NewTextErrorResponse(fmt.Sprintf("invalid arguments: %v", err)), nil
+		}
+	}
+	name := strings.TrimSpace(args.Name)
+	if name == "" {
+		return fantasy.NewTextErrorResponse("name is required"), nil
+	}
+
+	// Hold the lock across the whole activation so the dedup check and the
+	// subsequent mark are atomic — two concurrent calls cannot both pass the
+	// check and double-activate the same skill (gap #14). The skill is only
+	// marked activated on success, so a failed load can be retried.
+	t.mu.Lock()
+	defer t.mu.Unlock()
+
+	if t.activated[name] {
+		return fantasy.NewTextResponse(
+			fmt.Sprintf("Skill %q was already loaded earlier in this session.", name)), nil
+	}
+
+	// Resolve the skill path from the current provider snapshot. Skills with
+	// disable-model-invocation set are not activatable by the model (they
+	// remain available via the /skill: command), mirroring their exclusion
+	// from the catalog and the tool's name enum.
+	var path string
+	for _, s := range t.provider() {
+		if s.Name == name && !s.DisableModelInvocation {
+			path = s.Path
+			break
+		}
+	}
+	if path == "" {
+		return fantasy.NewTextErrorResponse(fmt.Sprintf("unknown skill %q", name)), nil
+	}
+
+	// Re-read the file so edits since discovery are picked up; LoadSkill strips frontmatter.
+	loaded, err := skills.LoadSkill(path)
+	if err != nil {
+		return fantasy.NewTextErrorResponse(fmt.Sprintf("failed to load skill %q: %v", name, err)), nil
+	}
+
+	var buf strings.Builder
+	fmt.Fprintf(&buf, "<skill_content name=%q location=%q>\n", loaded.Name, loaded.Path)
+	fmt.Fprintf(&buf, "References are relative to %s.\n\n", loaded.BaseDir())
+	buf.WriteString(loaded.Content)
+	if res := skills.FormatResources(loaded.Resources()); res != "" {
+		buf.WriteString("\n\n")
+		buf.WriteString(res)
+	}
+	buf.WriteString("\n</skill_content>")
+
+	t.activated[name] = true
+
+	return fantasy.NewTextResponse(buf.String()), nil
+}
@@ -0,0 +1,107 @@
+package skilltool
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"charm.land/fantasy"
+
+	"github.com/mark3labs/kit/internal/skills"
+)
+
+func writeSkillFile(t *testing.T, dir, name string) *skills.Skill {
+	t.Helper()
+	skillDir := filepath.Join(dir, name)
+	if err := os.MkdirAll(filepath.Join(skillDir, "scripts"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(skillDir, "scripts", "run.py"), []byte("x"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	path := filepath.Join(skillDir, "SKILL.md")
+	content := "---\nname: " + name + "\ndescription: A test skill\n---\nDo the thing."
+	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	return &skills.Skill{Name: name, Description: "A test skill", Path: path}
+}
+
+func TestNew_NilWhenNoSkills(t *testing.T) {
+	if tool := New(nil, func() []*skills.Skill { return nil }); tool != nil {
+		t.Error("expected nil tool when no skill names provided")
+	}
+}
+
+func TestActivateSkill_LoadsAndDedups(t *testing.T) {
+	dir := t.TempDir()
+	s := writeSkillFile(t, dir, "extract")
+	provider := func() []*skills.Skill { return []*skills.Skill{s} }
+
+	tool := New([]string{"extract"}, provider)
+	if tool == nil {
+		t.Fatal("expected a tool")
+	}
+
+	// First activation loads content + resources.
+	resp, err := tool.Run(context.Background(), fantasy.ToolCall{Input: `{"name":"extract"}`})
+	if err != nil {
+		t.Fatal(err)
+	}
+	out := responseText(resp)
+	if !strings.Contains(out, "Do the thing.") {
+		t.Errorf("expected skill body, got: %q", out)
+	}
+	if !strings.Contains(out, "<skill_content name=\"extract\"") {
+		t.Errorf("expected skill_content wrapper, got: %q", out)
+	}
+	if !strings.Contains(out, "scripts/run.py") {
+		t.Errorf("expected enumerated resources, got: %q", out)
+	}
+
+	// Second activation is deduplicated.
+	resp2, err := tool.Run(context.Background(), fantasy.ToolCall{Input: `{"name":"extract"}`})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !strings.Contains(responseText(resp2), "already loaded") {
+		t.Errorf("expected dedup message, got: %q", responseText(resp2))
+	}
+}
+
+func TestActivateSkill_UnknownSkill(t *testing.T) {
+	provider := func() []*skills.Skill { return nil }
+	tool := New([]string{"extract"}, provider)
+	resp, err := tool.Run(context.Background(), fantasy.ToolCall{Input: `{"name":"nope"}`})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !strings.Contains(responseText(resp), "unknown skill") {
+		t.Errorf("expected unknown-skill error, got: %q", responseText(resp))
+	}
+}
+
+// TestActivateSkill_DisabledNotActivatable verifies a skill flagged
+// disable-model-invocation cannot be activated through the model-facing tool.
+func TestActivateSkill_DisabledNotActivatable(t *testing.T) {
+	dir := t.TempDir()
+	s := writeSkillFile(t, dir, "extract")
+	s.DisableModelInvocation = true
+	provider := func() []*skills.Skill { return []*skills.Skill{s} }
+
+	tool := New([]string{"extract"}, provider)
+	resp, err := tool.Run(context.Background(), fantasy.ToolCall{Input: `{"name":"extract"}`})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !strings.Contains(responseText(resp), "unknown skill") {
+		t.Errorf("disabled skill should not be activatable, got: %q", responseText(resp))
+	}
+}
+
+// responseText extracts the text from a tool response.
+func responseText(resp fantasy.ToolResponse) string {
+	return resp.Content
+}
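For reference, the enum constraint that New attaches to the name parameter can be sketched as plain JSON. `nameSchema` is a hypothetical helper; the actual wire format depends on how the fantasy library serializes ToolInfo, which this diff does not show.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// nameSchema builds a JSON-schema-like fragment that restricts a string
// parameter to the given skill names. The input slice is not modified.
func nameSchema(names []string) (string, error) {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted) // stable ordering keeps the schema deterministic
	schema := map[string]any{
		"type": "string",
		"enum": sorted,
	}
	b, err := json.Marshal(schema)
	return string(b), err
}

func main() {
	s, err := nameSchema([]string{"review", "extract"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"enum":["extract","review"],"type":"string"}
}
```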
@@ -0,0 +1,153 @@
+// Package trust manages a persisted allowlist of project directories that the
+// user has marked as trusted for loading project-local skills.
+//
+// Project-local skills (discovered under <project>/.agents/skills/ and
+// <project>/.kit/skills/) are injected into the system prompt. A freshly
+// cloned, untrusted repository could therefore smuggle instructions into the
+// agent the moment the user runs Kit inside it. To mitigate this prompt-
+// injection vector, project-level skill loading can be gated on an explicit
+// trust decision recorded here.
+//
+// The allowlist is stored as JSON at $XDG_CONFIG_HOME/kit/trusted-projects.json
+// (default ~/.config/kit/trusted-projects.json).
+package trust
+
+import (
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"sync"
+)
+
+// Decision is the outcome of a trust prompt.
+type Decision int
+
+const (
+	// Skip declines to load project skills this session and records nothing.
+	Skip Decision = iota
+	// Trust loads project skills this session and persists the directory.
+	Trust
+	// TrustOnce loads project skills this session without persisting.
+	TrustOnce
+)
+
+// storeFileName is the basename of the persisted allowlist.
+const storeFileName = "trusted-projects.json"
+
+// Store is a persisted set of trusted project directories. The zero value is
+// not usable — construct one with Load.
+type Store struct {
+	mu      sync.Mutex
+	path    string
+	trusted map[string]bool
+}
+
+// store mirrors the on-disk JSON layout.
+type store struct {
+	Projects []string `json:"projects"`
+}
+
+// DefaultPath returns the path the trust allowlist is persisted to, respecting
+// $XDG_CONFIG_HOME. Returns the empty string when no home directory can be
+// determined.
+func DefaultPath() string {
+	base := os.Getenv("XDG_CONFIG_HOME")
+	if base == "" {
+		home, err := os.UserHomeDir()
+		if err != nil {
+			return ""
+		}
+		base = filepath.Join(home, ".config")
+	}
+	return filepath.Join(base, "kit", storeFileName)
+}
+
+// Load reads the trust allowlist from path, or DefaultPath when path is empty.
+// A missing file yields an empty store; other failures also return an error.
+func Load(path string) (*Store, error) {
+	if path == "" {
+		path = DefaultPath()
+	}
+	s := &Store{path: path, trusted: map[string]bool{}}
+	if path == "" {
+		return s, nil
+	}
+
+	data, err := os.ReadFile(path)
+	if err != nil {
+		if os.IsNotExist(err) {
+			return s, nil
+		}
+		return s, err
+	}
+
+	var raw store
+	if err := json.Unmarshal(data, &raw); err != nil {
+		return s, err
+	}
+	for _, p := range raw.Projects {
+		s.trusted[normalize(p)] = true
+	}
+	return s, nil
+}
+
+// IsTrusted reports whether dir has been marked trusted.
+func (s *Store) IsTrusted(dir string) bool {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+	return s.trusted[normalize(dir)]
+}
+
+// Trust records dir as trusted and persists the allowlist to disk.
+func (s *Store) Trust(dir string) error {
+	s.mu.Lock()
+	s.trusted[normalize(dir)] = true
+	s.mu.Unlock()
+	return s.save()
+}
+
+// Untrust removes dir from the allowlist and persists the change.
+func (s *Store) Untrust(dir string) error {
+	s.mu.Lock()
+	delete(s.trusted, normalize(dir))
+	s.mu.Unlock()
+	return s.save()
+}
+
+// save writes the allowlist to disk, creating parent directories as needed.
+func (s *Store) save() error {
+	if s.path == "" {
+		return nil
+	}
+	s.mu.Lock()
+	projects := make([]string, 0, len(s.trusted))
+	for p := range s.trusted {
+		projects = append(projects, p)
+	}
+	s.mu.Unlock()
+
+	data, err := json.MarshalIndent(store{Projects: projects}, "", "  ")
+	if err != nil {
+		return err
+	}
+	if err := os.MkdirAll(filepath.Dir(s.path), 0o755); err != nil {
+		return err
+	}
+	return os.WriteFile(s.path, data, 0o644)
+}
+
+// normalize resolves dir to an absolute, symlink-evaluated path for stable
+// comparison. It falls back to the cleaned input when resolution fails.
+func normalize(dir string) string {
+	if dir == "" {
+		return ""
+	}
+	abs, err := filepath.Abs(dir)
+	if err != nil {
+		return filepath.Clean(dir)
+	}
+	if resolved, err := filepath.EvalSymlinks(abs); err == nil {
+		return resolved
+	}
+	return abs
+}
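How a caller might combine the allowlist with a prompt Decision can be sketched as follows. The `gate` function, its plain-map allowlist, and the prompt callback are assumptions for illustration; the real wiring (SkillTrustPrompt and the CLI prompt) is not shown in this part of the diff.

```go
package main

import "fmt"

// Decision mirrors the three outcomes documented in the trust package.
type Decision int

const (
	Skip Decision = iota
	Trust
	TrustOnce
)

// gate reports whether project skills should load for dir and whether the
// decision should be persisted. trusted and prompt are hypothetical inputs.
func gate(trusted map[string]bool, dir string, prompt func(dir string) Decision) (load, persist bool) {
	if trusted[dir] {
		return true, false // already on the allowlist: no prompt needed
	}
	switch prompt(dir) {
	case Trust:
		return true, true
	case TrustOnce:
		return true, false
	default:
		return false, false
	}
}

func main() {
	trusted := map[string]bool{"/work/known": true}
	ask := func(string) Decision { return TrustOnce }
	fmt.Println(gate(trusted, "/work/known", ask))
	fmt.Println(gate(trusted, "/work/fresh-clone", ask))
}
```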
@@ -0,0 +1,60 @@
+package trust
+
+import (
+	"path/filepath"
+	"testing"
+)
+
+func TestStore_TrustAndPersist(t *testing.T) {
+	dir := t.TempDir()
+	storePath := filepath.Join(dir, "trusted-projects.json")
+	project := filepath.Join(dir, "repo")
+
+	s, err := Load(storePath)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if s.IsTrusted(project) {
+		t.Fatal("project should not be trusted initially")
+	}
+	if err := s.Trust(project); err != nil {
+		t.Fatal(err)
+	}
+	if !s.IsTrusted(project) {
+		t.Fatal("project should be trusted after Trust")
+	}
+
+	// Reload from disk to confirm persistence.
+	s2, err := Load(storePath)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !s2.IsTrusted(project) {
+		t.Fatal("trust decision should persist across reloads")
+	}
+}
+
+func TestStore_Untrust(t *testing.T) {
+	dir := t.TempDir()
+	storePath := filepath.Join(dir, "trusted-projects.json")
+	project := filepath.Join(dir, "repo")
+
+	s, _ := Load(storePath)
+	_ = s.Trust(project)
+	if err := s.Untrust(project); err != nil {
+		t.Fatal(err)
+	}
+	if s.IsTrusted(project) {
+		t.Fatal("project should not be trusted after Untrust")
+	}
+}
+
+func TestStore_MissingFileIsEmpty(t *testing.T) {
+	s, err := Load(filepath.Join(t.TempDir(), "does-not-exist.json"))
+	if err != nil {
+		t.Fatalf("missing file should not error: %v", err)
+	}
+	if s.IsTrusted("/anything") {
+		t.Fatal("empty store should trust nothing")
+	}
+}
@@ -146,9 +146,10 @@ var SlashCommands = []SlashCommand{
 	},
 	{
 		Name:        "/new",
-		Description: "Start a new session",
+		Description: "Start a new session (optionally with an initial prompt)",
 		Category:    "Navigation",
 		Aliases:     []string{"/n"},
+		HasArgs:     true,
 	},
 	{
 		Name:        "/name",
@@ -445,9 +445,12 @@ type AppModelOptions struct {
 	EmitBeforeFork func(targetID string, isUserMsg bool, userText string) (bool, string)

 	// EmitBeforeSessionSwitch, if non-nil, is called before switching
-	// to a new session branch (e.g. /new, /clear). Returns (cancelled,
-	// reason). May be nil if no extensions are loaded.
-	EmitBeforeSessionSwitch func(reason string) (bool, string)
+	// to a new session branch (e.g. /new, /clear). reason is the trigger
+	// ("new", "clear", "extension"); initialPrompt is the user prompt
+	// that will run as the first turn of the new session (empty when
+	// /new is called without arguments). Returns (cancelled, reason).
+	// May be nil if no extensions are loaded.
+	EmitBeforeSessionSwitch func(reason, initialPrompt string) (bool, string)

 	// GetGlobalShortcuts, if non-nil, returns extension-registered global
 	// keyboard shortcuts. Keys are binding strings (e.g., "ctrl+p").
@@ -575,6 +578,13 @@ type AppModel struct {
 	// flushed first, preserving chronological order.
 	pendingUserPrints []string

+	// newSessionResultCh, when non-nil, receives the outcome of an
+	// in-flight extension-triggered NewSession request. Set when an
+	// app.NewSessionRequestEvent arrives; cleared (with a result sent)
+	// in performNewSession success/failure paths or in the
+	// beforeSessionSwitchResultMsg cancellation path.
+	newSessionResultCh chan<- error
+
 	// canceling tracks whether the user has pressed ESC once during stateWorking.
 	// A second ESC within 2 seconds will cancel the current step.
 	canceling bool
@@ -677,7 +687,7 @@ type AppModel struct {

 	// emitBeforeSessionSwitch emits a before-session-switch event to extensions.
 	// Returns (cancelled, reason). May be nil if no extensions are loaded.
-	emitBeforeSessionSwitch func(reason string) (bool, string)
+	emitBeforeSessionSwitch func(reason, initialPrompt string) (bool, string)

 	// thinkingLevel is the current extended thinking level.
 	thinkingLevel string
@@ -2192,6 +2202,25 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 			ic.textarea.CursorEnd()
 		}

+	case app.NewSessionRequestEvent:
+		// Extension wants to end the current session and start a fresh
+		// one (with an optional initial prompt). Stash the response
+		// channel so performNewSession (or the before-hook cancellation
+		// path) can signal completion, then run the same /new pipeline
+		// the user would trigger.
+		if msg.ResponseCh != nil {
+			// Only one new-session request in flight at a time. If a
+			// previous response channel is still pending, fail it before
+			// replacing it so the prior extension goroutine unblocks.
+			if m.newSessionResultCh != nil {
+				m.newSessionResultCh <- fmt.Errorf("superseded by a newer NewSession request")
+			}
+			m.newSessionResultCh = msg.ResponseCh
+		}
+		if cmd := m.handleNewCommand(msg.InitialPrompt); cmd != nil {
+			cmds = append(cmds, cmd)
+		}
+
 	case app.PasswordPromptEvent:
 		// Sudo password prompt - show a modal input prompt
 		// If already in prompt state, cancel the new request
@@ -2397,8 +2426,9 @@ func (m *AppModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		// session reset if the hook did not cancel.
 		if msg.cancelled {
 			m.printSystemMessage(msg.reason)
+			m.signalNewSessionResult(fmt.Errorf("session switch cancelled: %s", msg.reason))
 		} else {
-			cmds = append(cmds, m.performNewSession())
+			cmds = append(cmds, m.performNewSession(msg.initialPrompt))
 		}

 	case beforeForkResultMsg:
@@ -3241,7 +3271,7 @@ func (m *AppModel) handleSlashCommand(sc *commands.SlashCommand, args string) te
 	case "/fork":
 		return m.handleForkCommand()
 	case "/new":
-		return m.handleNewCommand()
+		return m.handleNewCommand(args)
 	case "/name":
 		return m.handleNameCommand(args)
 	case "/resume":
@@ -3672,7 +3702,7 @@ func (m *AppModel) printHelpMessage() {
 		"**Navigation:**\n" +
 		"- `/tree`: Navigate session tree (switch branches)\n" +
 		"- `/fork`: Branch from an earlier message\n" +
-		"- `/new`: Start a new session (discards context, saves old session)\n" +
+		"- `/new [prompt]`: Start a new session (discards context, saves old session). With a prompt, runs it as the first message; supports `@file` attachments.\n" +
 		"- `/resume`: Open session picker to switch sessions\n" +
 		"- `/name <name>`: Set a display name for this session\n\n" +
 		"**System:**\n" +
@@ -4368,7 +4398,12 @@ func (m *AppModel) handleForkCommand() tea.Cmd {

 // handleNewCommand starts a completely new session (Pi-style /new behavior).
 // Creates a new session file, discarding all context from the previous conversation.
-func (m *AppModel) handleNewCommand() tea.Cmd {
+// If initialPrompt is non-empty it is submitted as the first user turn of the
+// new session, with @file references expanded the same way they are for
+// regular user input.
+func (m *AppModel) handleNewCommand(initialPrompt string) tea.Cmd {
+	initialPrompt = strings.TrimSpace(initialPrompt)
+
 	// Emit before-session-switch event in a goroutine so that extension
 	// handlers can call blocking operations (e.g. ctx.PromptConfirm) without
 	// deadlocking the BubbleTea event loop.
@@ -4376,23 +4411,25 @@ func (m *AppModel) handleNewCommand() tea.Cmd {
 		emit := m.emitBeforeSessionSwitch
 		ctrl := m.appCtrl
 		go func() {
-			cancelled, reason := emit("new")
+			cancelled, reason := emit("new", initialPrompt)
 			ctrl.SendEvent(beforeSessionSwitchResultMsg{
-				cancelled: cancelled,
-				reason:    reason,
+				cancelled:     cancelled,
+				reason:        reason,
+				initialPrompt: initialPrompt,
 			})
 		}()
 		return noopCmd
 	}

-	return m.performNewSession()
+	return m.performNewSession(initialPrompt)
 }

 // performNewSession performs the actual session reset. Called either directly
 // (when no before-hook exists) or after the async hook completes.
 // Matches Pi behavior: creates a completely new session file, discarding all
-// context from the previous conversation.
-func (m *AppModel) performNewSession() tea.Cmd {
+// context from the previous conversation. If initialPrompt is non-empty it
+// is submitted as the first user turn (with @file expansion).
+func (m *AppModel) performNewSession(initialPrompt string) tea.Cmd {
 	ts := m.appCtrl.GetTreeSession()
 	if ts == nil {
 		// No tree session — just clear messages.
@@ -4406,13 +4443,16 @@ func (m *AppModel) performNewSession() tea.Cmd {
 		// Clear the ScrollList so the new session starts fresh.
 		m.messages = []MessageItem{}
 		m.printSystemMessage("Conversation cleared. Starting fresh.")
-		return nil
+		cmd := m.submitInitialPrompt(initialPrompt)
+		m.signalNewSessionResult(nil)
+		return cmd
 	}

 	// Create a brand new session file (Pi-style /new behavior)
 	newTs, err := session.CreateTreeSession(m.cwd)
 	if err != nil {
 		m.printSystemMessage(fmt.Sprintf("Failed to create new session: %v", err))
+		m.signalNewSessionResult(fmt.Errorf("create new session: %w", err))
 		return nil
 	}

@@ -4425,6 +4465,67 @@ func (m *AppModel) performNewSession() tea.Cmd {
 	// Clear the ScrollList so the new session starts fresh.
 	m.messages = []MessageItem{}
 	m.printSystemMessage("New session started. Previous conversation saved.")
+	cmd := m.submitInitialPrompt(initialPrompt)
+	m.signalNewSessionResult(nil)
+	return cmd
+}
+
+// signalNewSessionResult delivers the outcome of an extension-triggered
+// NewSession request (if one is in flight) and clears the response channel.
+// Safe to call when no request is pending.
+func (m *AppModel) signalNewSessionResult(err error) {
+	if m.newSessionResultCh == nil {
+		return
+	}
+	ch := m.newSessionResultCh
+	m.newSessionResultCh = nil
+	// Channel is buffered (cap >= 1) by contract — send is non-blocking.
+	ch <- err
+}
+
+// submitInitialPrompt is the shared submission path used by /new <prompt>
+// and ctx.NewSession(prompt). It mirrors the SubmitMsg handler: @file
+// references are expanded via fileutil.ProcessFileAttachments and the
+// resulting prompt is forwarded to AppController.Run / RunWithFiles.
+// Returns nil when prompt is empty.
+func (m *AppModel) submitInitialPrompt(prompt string) tea.Cmd {
+	prompt = strings.TrimSpace(prompt)
+	if prompt == "" || m.appCtrl == nil {
+		return nil
+	}
+
+	processedText := prompt
+	var fileParts []kit.LLMFilePart
+	if m.cwd != "" {
+		result := fileutil.ProcessFileAttachments(prompt, m.cwd, m.mcpResourceReader)
+		processedText = result.ProcessedText
+		for _, fp := range result.FileParts {
+			fileParts = append(fileParts, kit.LLMFilePart{
+				Filename:  fp.Filename,
+				Data:      fp.Data,
+				MediaType: fp.MediaType,
+			})
+		}
+	}
+
+	displayText := prompt
+	if len(fileParts) > 0 {
+		displayText = fmt.Sprintf("%s\n[%d file(s) attached]", prompt, len(fileParts))
+	}
+
+	var qLen int
+	if len(fileParts) > 0 {
+		qLen = m.appCtrl.RunWithFiles(processedText, fileParts)
+	} else {
+		qLen = m.appCtrl.Run(processedText)
+	}
+	if qLen > 0 {
+		m.queuedMessages = append(m.queuedMessages, displayText)
+		m.layoutDirty = true
+	} else {
+		m.pendingUserPrints = append(m.pendingUserPrints, displayText)
+		m.flushStreamAndPendingUserMessages()
+	}
 	return nil
 }

@@ -5133,8 +5234,9 @@ type mcpPromptResultMsg struct {
 // executed before-session-switch hook. The hook runs in a goroutine so that
 // blocking operations like ctx.PromptConfirm() do not deadlock the TUI.
 type beforeSessionSwitchResultMsg struct {
-	cancelled bool
-	reason    string
+	cancelled     bool
+	reason        string
+	initialPrompt string
 }

 // beforeForkResultMsg carries the result of an asynchronously executed
@@ -1144,3 +1144,128 @@ func TestRenderQueuedMessages_truncatesLongMessages(t *testing.T) {
 		t.Fatalf("expected truncated output to be ≤10 lines, got %d lines", lines)
 	}
 }
+
+// --------------------------------------------------------------------------
+// /new <prompt> and ctx.NewSession
+// --------------------------------------------------------------------------
+
+// TestNewCommand_noPrompt verifies that /new without an argument resets the
+// session (clears messages, prints the system message) and does NOT submit
+// any prompt to the controller.
+func TestNewCommand_noPrompt(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.cwd = t.TempDir()
+
+	_ = m.handleNewCommand("")
+
+	if len(ctrl.runCalls) != 0 {
+		t.Fatalf("expected no Run calls for empty prompt, got %v", ctrl.runCalls)
+	}
+	if ctrl.clearMsgCalled == 0 {
+		t.Fatal("expected ClearMessages to be called when no tree session is active")
+	}
+}
+
+// TestNewCommand_withPrompt verifies that /new <prompt> submits the prompt
+// to AppController.Run after clearing the session.
+func TestNewCommand_withPrompt(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.cwd = t.TempDir()
+
+	_ = m.handleNewCommand("continue from where we left off")
+
+	if len(ctrl.runCalls) != 1 {
+		t.Fatalf("expected exactly 1 Run call, got %d (%v)", len(ctrl.runCalls), ctrl.runCalls)
+	}
+	if ctrl.runCalls[0] != "continue from where we left off" {
+		t.Fatalf("unexpected prompt submitted: %q", ctrl.runCalls[0])
+	}
+}
+
+// TestNewCommand_whitespacePromptIsEmpty verifies that an all-whitespace
+// prompt is treated as empty (no Run call).
+func TestNewCommand_whitespacePromptIsEmpty(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.cwd = t.TempDir()
+
+	_ = m.handleNewCommand("   \n\t  ")
+
+	if len(ctrl.runCalls) != 0 {
+		t.Fatalf("expected no Run calls for whitespace-only prompt, got %v", ctrl.runCalls)
+	}
+}
+
+// TestNewSessionRequestEvent_signalsResponseCh verifies that
+// app.NewSessionRequestEvent runs the same /new pipeline and delivers a
+// nil error to the response channel on success.
+func TestNewSessionRequestEvent_signalsResponseCh(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.cwd = t.TempDir()
+
+	ch := make(chan error, 1)
+	m = sendMsg(m, app.NewSessionRequestEvent{
+		InitialPrompt: "hello from extension",
+		ResponseCh:    ch,
+	})
+
+	select {
+	case err := <-ch:
+		if err != nil {
+			t.Fatalf("expected nil error on success, got %v", err)
+		}
+	default:
+		t.Fatal("expected ResponseCh to receive a value")
+	}
+	if len(ctrl.runCalls) != 1 || ctrl.runCalls[0] != "hello from extension" {
+		t.Fatalf("expected prompt to be submitted to Run, got %v", ctrl.runCalls)
+	}
+	if m.newSessionResultCh != nil {
+		t.Fatal("expected newSessionResultCh to be cleared after signaling")
+	}
+}
+
+// TestNewSessionRequestEvent_cancelledByExtension verifies that when the
+// before-session-switch hook cancels, the response channel receives an
+// error.
+func TestNewSessionRequestEvent_cancelledByExtension(t *testing.T) {
+	ctrl := &stubAppController{}
+	m, _, _ := newTestAppModel(ctrl)
+	m.cwd = t.TempDir()
+	m.emitBeforeSessionSwitch = func(reason, prompt string) (bool, string) {
+		return true, "vetoed by test"
+	}
+
+	ch := make(chan error, 1)
+	m = sendMsg(m, app.NewSessionRequestEvent{
+		InitialPrompt: "should be cancelled",
+		ResponseCh:    ch,
+	})
+	// The before-hook runs in a goroutine that reports back with a
+	// beforeSessionSwitchResultMsg. SendEvent on the stub is a no-op, so
+	// that message never arrives on its own; dispatch it manually here to
+	// simulate the round trip.
+	sendMsg(m, beforeSessionSwitchResultMsg{
+		cancelled:     true,
+		reason:        "vetoed by test",
+		initialPrompt: "should be cancelled",
+	})
+
+	select {
+	case err := <-ch:
+		if err == nil {
+			t.Fatal("expected non-nil error on cancellation")
+		}
+		if !strings.Contains(err.Error(), "vetoed by test") {
+			t.Fatalf("expected error to mention the veto reason, got %v", err)
+		}
+	default:
+		t.Fatal("expected ResponseCh to receive a value")
+	}
+	if len(ctrl.runCalls) != 0 {
+		t.Fatalf("expected no Run calls when cancelled, got %v", ctrl.runCalls)
+	}
+}
@@ -74,7 +74,8 @@ host, err := kit.NewAgent(ctx,

 Helpers: `WithModel`, `WithSystemPrompt`, `WithStreaming`, `WithMaxTokens`,
 `WithThinkingLevel`, `WithTools`, `WithExtraTools`, `WithProviderAPIKey`,
-`WithProviderURL`, `WithConfigFile`, `WithDebug`, and `Ephemeral`. `Option` is
+`WithProviderURL`, `WithConfigFile`, `WithDebug`, `WithDebugLogger`, and
+`Ephemeral`. `Option` is
 a plain `func(*Options)`, so you can define your own. For fields without a
 `With*` helper (`MCPConfig`, `InProcessMCPServers`, `SessionManager`, MCP task
 tuning) construct an `Options` value and call `kit.New`.
@@ -329,7 +330,6 @@ kit.LLMFilePart     // {Filename, Data []byte, MediaType}
 // Agent configuration — concrete Kit-owned structs and function types.
 // All fields use SDK types (e.g. `[]kit.Tool`), so consumers can construct
 // these without importing any LLM-provider package.
-kit.AgentConfig              // Lower-level agent config — prefer Options unless you need direct control
 kit.DebugLogger              // Interface: LogDebug(string) / IsDebugEnabled() bool
 kit.MCPTaskConfig            // Task-aware MCP tools/call config (modes, polling, progress)
 kit.ToolCallHandler          // func(toolCallID, toolName, toolArgs string)
@@ -364,15 +364,28 @@ msg  := kit.ConvertFromLLMMessage(lMsg)  // LLMMessage  → SDK Message
 - `Option` - Functional option (`func(*Options)`) for `NewAgent`
 - `Message` - Conversation message with typed content parts
 - `Tool` - Agent tool interface
-- `TurnResult` - Full result from a prompt including usage stats
+- `TurnResult` - Full result from a prompt including usage stats, captured
+  stream deltas (`Stream`), and any tool-driven halt (`FinalValue` /
+  `HaltedByTool`)
+- `StreamEvent` / `StreamEventKind` - Ordered delta events captured in
+  `TurnResult.Stream`
+- `ToolOutput` - Custom tool return value; set `Halt`/`FinalValue` to end the
+  agent loop and surface a typed result
+- Provider-error sentinels - `ErrContextOverflow`, `ErrRateLimit`, `ErrAuth`,
+  `ErrProviderUnavailable`, `ErrInvalidRequest`; classify with
+  `ClassifyProviderError(err)` and match via `errors.Is`

 ### Key Methods

 - `New(ctx, opts)` - Create new Kit instance
 - `NewAgent(ctx, ...Option)` - Create a Kit via functional options (streaming on by default)
 - `Prompt(ctx, message)` - Send message and get response string
-- `PromptResult(ctx, message)` - Send message and get full TurnResult
+- `PromptResult(ctx, message)` - Send message and get full TurnResult (blocks
+  until end-of-turn; populates `TurnResult.Stream` in streaming mode)
 - `PromptWithOptions(ctx, message, opts)` - Prompt with per-call options
+  (system message, model, thinking level, provider credentials, extra tools)
+- `PromptResultWithOptions(ctx, message, opts)` - Per-call options variant that
+  returns the full TurnResult
 - `Steer(ctx, instruction)` - System-level steering
 - `FollowUp(ctx, text)` - Continue without new user input
 - `SetModel(ctx, model)` - Switch model at runtime
@@ -384,7 +397,15 @@ msg  := kit.ConvertFromLLMMessage(lMsg)  // LLMMessage  → SDK Message
 - `AddSkill(*Skill)` / `LoadAndAddSkill(path)` / `RemoveSkill(name)` / `SetSkills([])` - Manage skills at runtime
 - `AddContextFile(*ContextFile)` / `AddContextFileContent(path, content)` / `LoadAndAddContextFile(path)` / `RemoveContextFile(path)` / `SetContextFiles([])` - Manage AGENTS.md-style context files at runtime
 - `RefreshSystemPrompt()` - Re-apply the composed system prompt to the agent
+- `NewTool[T]` / `NewParallelTool[T]` - Create a typed custom tool
+- `NewRawTool(name, desc, schema, fn)` - Create a schema-driven tool when the
+  input shape isn't known at compile time (skill/MCP catalogs)
+- `LoadSkillsFromFS(fsys, root)` - `fs.FS`-typed skill loader (embed.FS,
+  fstest.MapFS, per-tenant virtual filesystems)
+- `CollapseBranch(fromID, toID, summary)` - Collapse a branch range into a
+  summary (works with any `SessionManager` via `AppendBranchSummary`)
 - `Close()` - Clean up resources
+- `CloseContext(ctx)` - Clean up resources with a shutdown deadline

 ### Options

@@ -403,7 +424,8 @@ Key `Options` fields for SDK usage:
 | `SessionPath` | Open specific session file |
 | `Continue` | Resume most recent session |
 | `InProcessMCPServers` | Map of name → `*kit.MCPServer` for in-process MCP servers |
-| `Debug` | Enable debug logging |
+| `Debug` | Enable debug logging via the built-in console logger (ignored when `DebugLogger` is set) |
+| `DebugLogger` | Custom `DebugLogger` implementation — routes engine + MCP debug output into your own logging system |

 ## Environment Variables

@@ -153,6 +153,11 @@ func (a *treeManagerAdapter) GetContextEntryIDs() []string {
 	return a.inner.GetContextEntryIDs()
 }

+// AppendBranchSummary implements SessionManager.
+func (a *treeManagerAdapter) AppendBranchSummary(fromID, summary string) (string, error) {
+	return a.inner.AppendBranchSummary(fromID, summary)
+}
+
 // Close implements SessionManager.
 func (a *treeManagerAdapter) Close() error {
 	return a.inner.Close()
@@ -1,208 +0,0 @@
-package kit
-
-import (
-	"context"
-	"errors"
-	"testing"
-	"time"
-
-	"github.com/mark3labs/kit/internal/agent"
-)
-
-// TestAgentConfigToInternal verifies that the SDK-side AgentConfig converts
-// faithfully to the internal agent.AgentConfig representation, preserving
-// every field consumed by the internal agent layer.
-//
-// Regression test for https://github.com/mark3labs/kit/issues/30.
-func TestAgentConfigToInternal(t *testing.T) {
-	t.Run("nil receiver returns nil", func(t *testing.T) {
-		var c *AgentConfig
-		if got := c.toInternal(); got != nil {
-			t.Errorf("nil.toInternal() = %v, want nil", got)
-		}
-	})
-
-	t.Run("scalar fields round-trip", func(t *testing.T) {
-		c := &AgentConfig{
-			SystemPrompt:     "sys",
-			MaxSteps:         7,
-			StreamingEnabled: true,
-			DisableCoreTools: true,
-		}
-		got := c.toInternal()
-		if got == nil {
-			t.Fatal("toInternal() = nil")
-		}
-		if got.SystemPrompt != "sys" {
-			t.Errorf("SystemPrompt = %q, want %q", got.SystemPrompt, "sys")
-		}
-		if got.MaxSteps != 7 {
-			t.Errorf("MaxSteps = %d, want 7", got.MaxSteps)
-		}
-		if !got.StreamingEnabled {
-			t.Error("StreamingEnabled = false, want true")
-		}
-		if !got.DisableCoreTools {
-			t.Error("DisableCoreTools = false, want true")
-		}
-	})
-
-	t.Run("tool slices propagate without conversion", func(t *testing.T) {
-		// Tool is a type alias for the underlying LLM-tool type, so the
-		// SDK []Tool and internal []fantasy.AgentTool slices share the
-		// same backing array after conversion.
-		tool := NewTool[struct{}]("noop", "noop", nil)
-		c := &AgentConfig{
-			CoreTools:  []Tool{tool},
-			ExtraTools: []Tool{tool, tool},
-		}
-		got := c.toInternal()
-		if len(got.CoreTools) != 1 {
-			t.Errorf("CoreTools len = %d, want 1", len(got.CoreTools))
-		}
-		if len(got.ExtraTools) != 2 {
-			t.Errorf("ExtraTools len = %d, want 2", len(got.ExtraTools))
-		}
-	})
-
-	t.Run("tool wrapper is invoked through internal config", func(t *testing.T) {
-		called := false
-		c := &AgentConfig{
-			ToolWrapper: func(in []Tool) []Tool {
-				called = true
-				return in
-			},
-		}
-		got := c.toInternal()
-		if got.ToolWrapper == nil {
-			t.Fatal("internal ToolWrapper is nil")
-		}
-		_ = got.ToolWrapper(nil)
-		if !called {
-			t.Error("SDK ToolWrapper was not invoked through the internal config")
-		}
-	})
-
-	t.Run("OnMCPServerLoaded propagates", func(t *testing.T) {
-		var captured string
-		wantErr := errors.New("boom")
-		c := &AgentConfig{
-			OnMCPServerLoaded: func(name string, _ int, _ error) {
-				captured = name
-			},
-		}
-		got := c.toInternal()
-		got.OnMCPServerLoaded("svr", 3, wantErr)
-		if captured != "svr" {
-			t.Errorf("OnMCPServerLoaded captured = %q, want %q", captured, "svr")
-		}
-	})
-
-	t.Run("DebugLogger propagates", func(t *testing.T) {
-		dl := &fakeDebugLogger{enabled: true}
-		c := &AgentConfig{DebugLogger: dl}
-		got := c.toInternal()
-		if got.DebugLogger == nil {
-			t.Fatal("internal DebugLogger is nil")
-		}
-		if !got.DebugLogger.IsDebugEnabled() {
-			t.Error("IsDebugEnabled = false, want true")
-		}
-		got.DebugLogger.LogDebug("hello")
-		if len(dl.messages) != 1 || dl.messages[0] != "hello" {
-			t.Errorf("messages = %v, want [hello]", dl.messages)
-		}
-	})
-
-	t.Run("MCPTaskConfig propagates with mode + progress", func(t *testing.T) {
-		c := &AgentConfig{
-			MCPTaskConfig: MCPTaskConfig{
-				PerServerMode: map[string]MCPTaskMode{
-					"build-svr": MCPTaskModeAlways,
-				},
-				DefaultTTL:      30 * time.Second,
-				PollInterval:    250 * time.Millisecond,
-				MaxPollInterval: 2 * time.Second,
-				Timeout:         5 * time.Minute,
-				Progress:        func(_ MCPTaskProgress) {},
-			},
-		}
-		got := c.toInternal()
-		if got.MCPTaskConfig.DefaultTTL != 30*time.Second {
-			t.Errorf("DefaultTTL = %v, want 30s", got.MCPTaskConfig.DefaultTTL)
-		}
-		if got.MCPTaskConfig.PollInterval != 250*time.Millisecond {
-			t.Errorf("PollInterval = %v, want 250ms", got.MCPTaskConfig.PollInterval)
-		}
-		if got.MCPTaskConfig.MaxPollInterval != 2*time.Second {
-			t.Errorf("MaxPollInterval = %v, want 2s", got.MCPTaskConfig.MaxPollInterval)
-		}
-		if got.MCPTaskConfig.Timeout != 5*time.Minute {
-			t.Errorf("Timeout = %v, want 5m", got.MCPTaskConfig.Timeout)
-		}
-		mode, ok := got.MCPTaskConfig.PerServerMode["build-svr"]
-		if !ok {
-			t.Fatal("PerServerMode missing 'build-svr'")
-		}
-		if string(mode) != string(MCPTaskModeAlways) {
-			t.Errorf("mode = %q, want %q", mode, MCPTaskModeAlways)
-		}
-		if got.MCPTaskConfig.Progress == nil {
-			t.Fatal("internal Progress handler is nil")
-		}
-	})
-
-	t.Run("auth and token store factories are wired", func(t *testing.T) {
-		auth := &fakeAuthHandler{}
-		tokenCalls := 0
-		var tokenServer string
-		factory := MCPTokenStoreFactory(func(server string) (MCPTokenStore, error) {
-			tokenCalls++
-			tokenServer = server
-			return nil, nil
-		})
-		c := &AgentConfig{
-			AuthHandler:       auth,
-			TokenStoreFactory: factory,
-		}
-		got := c.toInternal()
-		if got.AuthHandler == nil {
-			t.Fatal("internal AuthHandler is nil")
-		}
-		if got.TokenStoreFactory == nil {
-			t.Fatal("internal TokenStoreFactory is nil")
-		}
-		_, _ = got.TokenStoreFactory("https://example.test")
-		if tokenCalls != 1 {
-			t.Errorf("token factory call count = %d, want 1", tokenCalls)
-		}
-		if tokenServer != "https://example.test" {
-			t.Errorf("token factory server arg = %q", tokenServer)
-		}
-		if got.AuthHandler.RedirectURI() != "redirect" {
-			t.Errorf("RedirectURI = %q, want %q", got.AuthHandler.RedirectURI(), "redirect")
-		}
-	})
-
-	// Compile-time check that the internal type is what we expect.
-	//nolint:staticcheck // QF1011: explicit type asserts the conversion target.
-	var _ *agent.AgentConfig = (&AgentConfig{}).toInternal()
-}
-
-// fakeAuthHandler implements both kit.MCPAuthHandler and the structurally
-// identical tools.MCPAuthHandler used by the internal layer.
-type fakeAuthHandler struct{}
-
-func (f *fakeAuthHandler) RedirectURI() string { return "redirect" }
-func (f *fakeAuthHandler) HandleAuth(_ context.Context, _ string, _ string) (string, error) {
-	return "", nil
-}
-
-// fakeDebugLogger implements kit.DebugLogger for tests.
-type fakeDebugLogger struct {
-	enabled  bool
-	messages []string
-}
-
-func (f *fakeDebugLogger) LogDebug(m string)    { f.messages = append(f.messages, m) }
-func (f *fakeDebugLogger) IsDebugEnabled() bool { return f.enabled }
@@ -112,8 +112,20 @@ func (m *Kit) Compact(ctx context.Context, opts *CompactionOptions, customInstru
 }

 // compactInternal is the shared compaction implementation. The isAutomatic
-// flag distinguishes auto-triggered compaction from manual /compact.
+// flag distinguishes user-triggered compaction from auto-compaction for hooks/events.
+// On failure it emits a CompactionEvent carrying the error so embedders can
+// observe the failure path symmetrically with the success path.
 func (m *Kit) compactInternal(ctx context.Context, opts *CompactionOptions, customInstructions string, isAutomatic bool) (*CompactionResult, error) {
+	result, err := m.compactImpl(ctx, opts, customInstructions, isAutomatic)
+	if err != nil {
+		m.events.emit(CompactionEvent{Err: err})
+	}
+	return result, err
+}
+
+// compactImpl performs the actual compaction work. On success it emits a
+// CompactionEvent via persistAndEmitCompaction.
+func (m *Kit) compactImpl(ctx context.Context, opts *CompactionOptions, customInstructions string, isAutomatic bool) (*CompactionResult, error) {
 	if opts == nil {
 		if m.compactionOpts != nil {
 			opts = m.compactionOpts
@@ -0,0 +1,113 @@
+package kit
+
+import (
+	"errors"
+	"strings"
+)
+
+// Provider-error sentinels. Provider and turn execution paths wrap these via
+// fmt.Errorf("%w: %s", …) so embedders can classify failures with errors.Is
+// instead of brittle string matching. Use [ClassifyProviderError] to map an
+// arbitrary provider error to one of these sentinels.
+var (
+	// ErrContextOverflow indicates the request exceeded the model's maximum
+	// context window. Embedders typically respond by compacting and retrying.
+	ErrContextOverflow = errors.New("context window exceeded")
+
+	// ErrRateLimit indicates the provider throttled the request. Embedders
+	// typically respond by backing off and retrying.
+	ErrRateLimit = errors.New("rate limited by provider")
+
+	// ErrAuth indicates a credential / authorization failure.
+	ErrAuth = errors.New("provider authentication failed")
+
+	// ErrProviderUnavailable indicates a transient provider/upstream failure
+	// (5xx, network error, timeout).
+	ErrProviderUnavailable = errors.New("provider unavailable")
+
+	// ErrInvalidRequest indicates the request was structurally invalid and
+	// retrying will not help.
+	ErrInvalidRequest = errors.New("invalid request to provider")
+)
+
+// ClassifyProviderError inspects err and returns it wrapped with the matching
+// provider-error sentinel ([ErrContextOverflow], [ErrRateLimit], [ErrAuth],
+// [ErrProviderUnavailable], or [ErrInvalidRequest]) when the underlying cause
+// can be recognized. The returned error satisfies errors.Is against both the
+// sentinel and the original cause, so the full chain stays inspectable.
+//
+// When err is nil it returns nil. When the cause cannot be classified the
+// original err is returned unchanged so callers never lose information.
+//
+// Classification is heuristic: it first honors any sentinel already present in
+// the chain (so double-classification is idempotent), then falls back to
+// matching common provider status codes and phrases in the error text.
+func ClassifyProviderError(err error) error {
+	if err == nil {
+		return nil
+	}
+	// Already classified — keep as-is so the call is idempotent.
+	for _, sentinel := range []error{
+		ErrContextOverflow, ErrRateLimit, ErrAuth,
+		ErrProviderUnavailable, ErrInvalidRequest,
+	} {
+		if errors.Is(err, sentinel) {
+			return err
+		}
+	}
+
+	if sentinel := classifyProviderErrorText(err.Error()); sentinel != nil {
+		return wrapSentinel(sentinel, err)
+	}
+	return err
+}
+
+// wrapSentinel returns an error that satisfies errors.Is(_, sentinel) while
+// keeping the original cause inspectable via errors.Is.
+func wrapSentinel(sentinel, cause error) error {
+	return &sentinelError{sentinel: sentinel, cause: cause}
+}
+
+type sentinelError struct {
+	sentinel error
+	cause    error
+}
+
+func (e *sentinelError) Error() string {
+	return e.sentinel.Error() + ": " + e.cause.Error()
+}
+
+// Unwrap returns both the sentinel and the cause so errors.Is matches the
+// sentinel and the underlying error chain stays reachable.
+func (e *sentinelError) Unwrap() []error {
+	return []error{e.sentinel, e.cause}
+}
+
+// classifyProviderErrorText returns the sentinel matching common provider
+// error phrasings, or nil if none match.
+func classifyProviderErrorText(msg string) error {
+	m := strings.ToLower(msg)
+	switch {
+	case containsAny(m, "context_length_exceeded", "context window", "maximum context length", "too many tokens", "prompt is too long"):
+		return ErrContextOverflow
+	case containsAny(m, "rate limit", "rate_limit", "too many requests", "status 429", "429"):
+		return ErrRateLimit
+	case containsAny(m, "unauthorized", "authentication", "invalid api key", "invalid_api_key", "permission denied", "status 401", "status 403", "401", "403"):
+		return ErrAuth
+	case containsAny(m, "status 500", "status 502", "status 503", "status 504", "internal server error", "bad gateway", "service unavailable", "gateway timeout", "timeout", "connection refused", "no such host", "eof"):
+		return ErrProviderUnavailable
+	case containsAny(m, "status 400", "invalid request", "bad request", "unprocessable"):
+		return ErrInvalidRequest
+	default:
+		return nil
+	}
+}
+
+func containsAny(s string, subs ...string) bool {
+	for _, sub := range subs {
+		if strings.Contains(s, sub) {
+			return true
+		}
+	}
+	return false
+}
@@ -0,0 +1,64 @@
+package kit_test
+
+import (
+	"errors"
+	"fmt"
+	"testing"
+
+	"github.com/mark3labs/kit/pkg/kit"
+)
+
+func TestClassifyProviderError(t *testing.T) {
+	cases := []struct {
+		name string
+		in   error
+		want error
+	}{
+		{"nil", nil, nil},
+		{"context overflow", errors.New("error: context_length_exceeded for this model"), kit.ErrContextOverflow},
+		{"context window phrase", errors.New("the prompt is too long for the context window"), kit.ErrContextOverflow},
+		{"rate limit", errors.New("HTTP status 429: rate limit exceeded"), kit.ErrRateLimit},
+		{"auth 401", errors.New("status 401 unauthorized"), kit.ErrAuth},
+		{"auth invalid key", errors.New("invalid api key provided"), kit.ErrAuth},
+		{"unavailable 503", errors.New("status 503 service unavailable"), kit.ErrProviderUnavailable},
+		{"invalid request", errors.New("status 400 bad request: malformed body"), kit.ErrInvalidRequest},
+		{"unclassified", errors.New("something totally unexpected"), nil},
+	}
+
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			got := kit.ClassifyProviderError(tc.in)
+			if tc.in == nil {
+				if got != nil {
+					t.Fatalf("expected nil, got %v", got)
+				}
+				return
+			}
+			if tc.want == nil {
+				// Unclassified errors are returned unchanged.
+				if got.Error() != tc.in.Error() {
+					t.Fatalf("expected unchanged error, got %v", got)
+				}
+				return
+			}
+			if !errors.Is(got, tc.want) {
+				t.Fatalf("errors.Is(%v, %v) = false", got, tc.want)
+			}
+			// Original cause must remain reachable.
+			if !errors.Is(got, tc.in) {
+				t.Fatalf("original cause not preserved in %v", got)
+			}
+		})
+	}
+}
+
+func TestClassifyProviderErrorIdempotent(t *testing.T) {
+	wrapped := fmt.Errorf("%w: upstream detail", kit.ErrRateLimit)
+	got := kit.ClassifyProviderError(wrapped)
+	if got != wrapped {
+		t.Fatalf("already-classified error should be returned unchanged")
+	}
+	if !errors.Is(got, kit.ErrRateLimit) {
+		t.Fatalf("expected ErrRateLimit to remain")
+	}
+}
@@ -370,7 +370,10 @@ type StepUsageEvent struct {
 // EventType implements Event.
 func (e StepUsageEvent) EventType() EventType { return EventStepUsage }

-// CompactionEvent fires after a successful compaction.
+// CompactionEvent fires after a compaction attempt. On success Err is nil and
+// the summary/token/file fields are populated. On failure Err is non-nil and
+// the remaining fields are zero-valued, so embedders can wire symmetric
+// start/end lifecycle telemetry around both outcomes.
 type CompactionEvent struct {
 	Summary         string
 	OriginalTokens  int
@@ -378,6 +381,10 @@ type CompactionEvent struct {
 	MessagesRemoved int
 	ReadFiles       []string
 	ModifiedFiles   []string
+
+	// Err is non-nil when compaction failed. On the failure path the other
+	// fields are zero-valued.
+	Err error
 }

 // EventType implements Event.
@@ -137,6 +137,7 @@ type ExtensionAPI interface {
 	EmitCustomEvent(name, data string)
 	EmitBeforeFork(targetID string, isUserMsg bool, userText string) (cancelled bool, reason string)
 	EmitBeforeSessionSwitch(switchReason string) (cancelled bool, reason string)
+	EmitBeforeSessionSwitchWithPrompt(switchReason, initialPrompt string) (cancelled bool, reason string)

 	// Commands
 	Commands() []ExtensionCommandDef
@@ -567,11 +568,20 @@ func (e *extensionAPI) EmitBeforeFork(targetID string, isUserMsg bool, userText
 }

 func (e *extensionAPI) EmitBeforeSessionSwitch(switchReason string) (cancelled bool, reason string) {
+	return e.EmitBeforeSessionSwitchWithPrompt(switchReason, "")
+}
+
+// EmitBeforeSessionSwitchWithPrompt is like EmitBeforeSessionSwitch but also
+// supplies the initial user prompt (if any) that will be submitted as the
+// first turn of the new session. Extensions inspecting BeforeSessionSwitchEvent
+// see this value in the event's InitialPrompt field.
+func (e *extensionAPI) EmitBeforeSessionSwitchWithPrompt(switchReason, initialPrompt string) (cancelled bool, reason string) {
 	if e.kit.extRunner == nil || !e.kit.extRunner.HasHandlers(extensions.BeforeSessionSwitch) {
 		return false, ""
 	}
 	result, _ := e.kit.extRunner.Emit(extensions.BeforeSessionSwitchEvent{
-		Reason: switchReason,
+		Reason:        switchReason,
+		InitialPrompt: initialPrompt,
 	})
 	if r, ok := result.(extensions.BeforeSessionSwitchResult); ok && r.Cancel {
 		reason := r.Reason
@@ -23,6 +23,7 @@ import (
 	"github.com/mark3labs/kit/internal/models"
 	"github.com/mark3labs/kit/internal/session"
 	"github.com/mark3labs/kit/internal/skills"
+	"github.com/mark3labs/kit/internal/skilltool"
 	"github.com/mark3labs/kit/internal/tools"

 	"github.com/spf13/viper"
@@ -115,6 +116,11 @@ type Kit struct {
 	steerMu       sync.Mutex
 	steerCh       chan agent.SteerMessage
 	leftoverSteer []agent.SteerMessage // unconsumed steer messages from the last turn
+
+	// promptOptsMu serializes per-call PromptOptions overrides that mutate
+	// shared agent state (model, thinking level, provider creds, extra tools)
+	// so the apply/restore window of one call never races another.
+	promptOptsMu sync.Mutex
 }

 // Subscribe registers an EventListener that will be called for every lifecycle
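
The `promptOptsMu` comment describes an apply/restore window that must not interleave between calls. The sketch below shows that pattern in isolation. It is an illustration under stated assumptions, not Kit's implementation: `agentState` and `withModel` are hypothetical, and only a single overridden field is modelled:

```go
package main

import (
	"fmt"
	"sync"
)

// agentState is a hypothetical stand-in for shared agent settings.
type agentState struct {
	mu    sync.Mutex // plays the role of promptOptsMu
	model string
}

// withModel overrides the model for the duration of call and restores the
// previous value afterwards. The lock is held across apply, call and restore,
// so two overrides can never observe each other's temporary state.
func (s *agentState) withModel(model string, call func(active string)) {
	s.mu.Lock()
	defer s.mu.Unlock()

	prev := s.model
	s.model = model
	defer func() { s.model = prev }() // runs before the unlock (LIFO defers)

	call(s.model)
}

func main() {
	s := &agentState{model: "default"}

	var wg sync.WaitGroup
	for _, m := range []string{"fast", "smart"} {
		wg.Add(1)
		go func(model string) {
			defer wg.Done()
			s.withModel(model, func(active string) {
				fmt.Println("running with", active)
			})
		}(m)
	}
	wg.Wait()

	fmt.Println("restored:", s.model)
}
```

The trade-off is that calls carrying overrides run one at a time, which is the price of keeping the shared state consistent.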
@@ -1004,9 +1010,24 @@ type Options struct {

 	// Skills
 	Skills    []string // Explicit skill files/dirs to load (empty = auto-discover)
-	SkillsDir string   // Override default project-local skills directory
+	SkillsDir string   // Direct skills directory to scan (overrides auto-discovery; scanned as-is)
 	NoSkills  bool     // Disable skill loading entirely (auto-discovery and explicit)

+	// SkillsDisable names skills (by Name) to exclude from the model-facing
+	// catalog. Disabled skills remain available via the /skill: slash command.
+	SkillsDisable []string
+
+	// SkillTrustPrompt is an optional callback invoked the first time Kit
+	// auto-discovers project-local skills (under <project>/.agents/skills or
+	// <project>/.kit/skills) in a directory that is not yet on the trust
+	// allowlist. It receives the project directory and the number of skills
+	// found, and returns a TrustDecision controlling whether the skills load.
+	//
+	// When nil, project-local skills are loaded without prompting (historical
+	// behaviour). Directories trusted via TrustProject are persisted to
+	// ~/.config/kit/trusted-projects.json and not prompted again.
+	SkillTrustPrompt func(projectDir string, skillCount int) TrustDecision
+
 	// NoExtensions disables Yaegi extension loading entirely.
 	NoExtensions bool

@@ -1047,9 +1068,25 @@ type Options struct {
 	AutoCompact       bool               // Auto-compact when near context limit
 	CompactionOptions *CompactionOptions // Config for auto-compaction (nil = defaults)

-	// Debug enables debug logging for the SDK.
+	// Debug enables debug logging for the SDK. When DebugLogger is nil this
+	// flag selects between the default no-op SimpleDebugLogger (Debug=false)
+	// and the built-in console/buffered logger (Debug=true). When DebugLogger
+	// is non-nil this flag is ignored — the supplied logger's
+	// IsDebugEnabled() controls whether downstream code emits messages.
 	Debug bool

+	// DebugLogger, if non-nil, routes low-level debug output from the engine
+	// and the MCP tool plumbing to a caller-supplied implementation. This is
+	// the SDK escape hatch for embedders that want to forward debug output
+	// into their own logging system (zap, slog, log/charm, an in-app TUI
+	// panel, etc.) instead of the built-in console logger.
+	//
+	// When nil (default) the Debug bool controls whether the built-in logger
+	// is installed. When non-nil this logger is used unconditionally and the
+	// Debug bool is ignored; the supplied logger's IsDebugEnabled() reports
+	// whether downstream code should bother formatting messages.
+	DebugLogger DebugLogger
+
 	// MCPAuthHandler handles OAuth authorization for remote MCP servers.
 	// When set, remote transports (streamable HTTP, SSE) are configured
 	// with OAuth support. If the server returns a 401, the handler is
@@ -1352,6 +1389,15 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 			if err != nil {
 				return fmt.Errorf("failed to load skills: %w", err)
 			}
+
+			// Apply per-skill disable list (--skill-disable / skill-disable
+			// config key). Disabled skills stay loaded (so /skill: still
+			// works) but are hidden from the model-facing catalog.
+			disable := opts.SkillsDisable
+			if len(disable) == 0 {
+				disable = v.GetStringSlice("skill-disable")
+			}
+			applySkillDisableList(loadedSkills, disable)
 		}

 		// Always compose the system prompt with runtime context: base prompt +
@@ -1505,15 +1551,41 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 	// Build agent setup options, pulling CLI-specific fields when available.
 	// Pass the pre-built ProviderConfig and scalar viper snapshots so
 	// SetupAgent doesn't need to re-read viper (which would require the lock).
+
+	// Register the dedicated activate_skill tool when at least one skill is
+	// loaded (issue #65, gaps #13/#14). The provider closure reads the live
+	// skill set from the Kit instance once it exists so runtime additions
+	// resolve; skillToolKit is assigned after construction below.
+	var skillToolKit *Kit
+	extraTools := opts.ExtraTools
+	if len(loadedSkills) > 0 {
+		names := make([]string, 0, len(loadedSkills))
+		for _, s := range loadedSkills {
+			if !s.DisableModelInvocation {
+				names = append(names, s.Name)
+			}
+		}
+		provider := func() []*skills.Skill {
+			if skillToolKit == nil {
+				return loadedSkills
+			}
+			return skillToolKit.GetSkills()
+		}
+		if t := skilltool.New(names, provider); t != nil {
+			extraTools = append(extraTools, t)
+		}
+	}
+
 	setupOpts := kitsetup.AgentSetupOptions{
 		MCPConfig:         mcpConfig,
 		Quiet:             opts.Quiet,
 		CoreTools:         opts.Tools,
 		DisableCoreTools:  disableCoreTools,
-		ExtraTools:        opts.ExtraTools,
+		ExtraTools:        extraTools,
 		ToolWrapper:       hookToolWrapper(beforeToolCall, afterToolResult),
 		ProviderConfig:    providerConfig,
 		Debug:             debug,
+		DebugLogger:       opts.DebugLogger,
 		NoExtensions:      noExtensions,
 		MaxSteps:          maxSteps,
 		StreamingEnabled:  streaming,
@@ -1609,6 +1681,10 @@ func New(ctx context.Context, opts *Options) (*Kit, error) {
 		prepareStep:           prepareStep,
 	}

+	// Point the activate_skill provider closure at the live Kit instance so it
+	// resolves skills mutated after construction.
+	skillToolKit = k
+
 	// Bridge extension events to SDK hooks.
 	if agentResult.ExtRunner != nil {
 		k.bridgeExtensions(agentResult.ExtRunner)
@@ -1719,6 +1795,14 @@ func (m *Kit) expandSkillCommand(prompt string) string {
 	fmt.Fprintf(&buf, "<skill name=%q location=%q>\n", loaded.Name, loaded.Path)
 	fmt.Fprintf(&buf, "References are relative to %s.\n\n", baseDir)
 	buf.WriteString(loaded.Content)
+
+	// Enumerate bundled resources (scripts/, references/, assets/) so the model
+	// knows what it can read without listing the directory itself.
+	if res := skills.FormatResources(loaded.Resources()); res != "" {
+		buf.WriteString("\n\n")
+		buf.WriteString(res)
+	}
+
 	buf.WriteString("\n</skill>")

 	args = strings.TrimSpace(args)
@@ -1735,18 +1819,33 @@ func (m *Kit) expandSkillCommand(prompt string) string {
 // ---------------------------------------------------------------------------

 // loadSkills loads skills based on Options. If explicit paths are provided
-// they are loaded directly; otherwise auto-discovery runs.
+// they are loaded directly. If SkillsDir is set it is treated as a direct
+// skills directory (scanned as-is, not as a parent of .agents/.kit). Otherwise
+// auto-discovery runs against the standard scopes rooted at SessionDir.
 func loadSkills(opts *Options) ([]*skills.Skill, error) {
 	if len(opts.Skills) > 0 {
 		return loadExplicitSkills(opts.Skills)
 	}

-	// Auto-discover from standard directories.
-	cwd := opts.SkillsDir
-	if cwd == "" {
-		cwd = opts.SessionDir
+	// An explicit --skills-dir is a direct skills directory: scan it as-is
+	// rather than appending .agents/skills and .kit/skills beneath it.
+	if opts.SkillsDir != "" {
+		return skills.LoadSkillsFromDir(opts.SkillsDir)
 	}
-	return skills.LoadSkills(cwd)
+
+	// Auto-discover from the standard scopes rooted at the session directory.
+	// Project-local skills are injected into the system prompt, so they are
+	// gated on a trust decision when a SkillTrustPrompt is configured.
+	cwd := opts.SessionDir
+	if cwd == "" {
+		cwd, _ = os.Getwd()
+	}
+	user := skills.LoadUserSkills()
+	project := skills.LoadProjectSkills(cwd)
+	if len(project) > 0 && !projectSkillsTrusted(opts, cwd, len(project)) {
+		project = nil
+	}
+	return skills.Combine(user, project), nil
 }

 // loadExplicitSkills loads skills from a list of explicit paths. Each path
@@ -1824,6 +1923,58 @@ type TurnResult struct {
 	// any tool call/result messages added during the agent loop.
 	// Each message carries role and plain-text content.
 	Messages []LLMMessage
+
+	// FinalValue is set when a tool returned a [ToolOutput] with Halt=true
+	// during the turn. The dynamic type is whatever the tool handler placed
+	// in [ToolOutput.FinalValue]. Nil when no tool halted the turn.
+	FinalValue any
+
+	// HaltedByTool is the name of the tool that returned Halt=true, or empty
+	// if the turn ended for any other reason.
+	HaltedByTool string
+
+	// Stream contains every delta event observed during the turn in emit
+	// order. It is populated regardless of streaming mode (in non-streaming
+	// mode it carries the coarse-grained events the provider reported).
+	// PromptResult and the other turn-returning entry points always block
+	// until end-of-turn, so Stream is complete when they return.
+	Stream []StreamEvent
+}
+
+// StreamEventKind classifies a [StreamEvent] captured during a turn.
+type StreamEventKind string
+
+// Stream event kinds captured in [TurnResult.Stream].
+const (
+	StreamEventTextDelta      StreamEventKind = "text_delta"
+	StreamEventReasoningStart StreamEventKind = "reasoning_start"
+	StreamEventReasoningDelta StreamEventKind = "reasoning_delta"
+	StreamEventReasoningEnd   StreamEventKind = "reasoning_end"
+	StreamEventToolCallChunk  StreamEventKind = "tool_call_chunk"
+)
+
+// StreamEvent is a single delta observed during a turn, captured in
+// [TurnResult.Stream]. It lets embedders assert streamed ordering
+// deterministically without re-implementing an OnMessageUpdate collector.
+type StreamEvent struct {
+	// Kind classifies the event.
+	Kind StreamEventKind
+
+	// Text carries the assistant text for StreamEventTextDelta.
+	Text string
+
+	// Reasoning carries the reasoning text for StreamEventReasoningDelta.
+	Reasoning string
+
+	// ToolName is the tool name for StreamEventToolCallChunk. It is set on
+	// the chunk that announces the call; argument chunks leave it empty.
+	ToolName string
+
+	// ToolID is the tool call ID for StreamEventToolCallChunk.
+	ToolID string
+
+	// Args carries the tool-call argument JSON delta for
+	// StreamEventToolCallChunk. It is empty on the announcing chunk.
+	Args string
 }

 // ---------------------------------------------------------------------------
@@ -2064,6 +2215,9 @@ func (m *Kit) Subagent(ctx context.Context, cfg SubagentConfig) (*SubagentResult
 // All prompt modes (Prompt, Steer, FollowUp, PromptWithOptions) share this
 // single code path so callback wiring is never duplicated.
 func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.GenerateWithLoopResult, error) {
+	// Capture the per-turn stream collector (set by runTurn) so streamed
+	// deltas are recorded into TurnResult.Stream in emit order.
+	collector := streamCollectorFromContext(ctx)
 	// Create a per-turn steer channel and attach it to the context so the
 	// agent's PrepareStep can inject steering messages between steps.
 	steerCh := make(chan agent.SteerMessage, 16)
@@ -2181,24 +2335,30 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 						i := strings.Index(remaining, thinkClose)
 						if i == -1 {
 							m.events.emit(ReasoningDeltaEvent{Delta: remaining})
+							collector.add(StreamEvent{Kind: StreamEventReasoningDelta, Reasoning: remaining})
 							return
 						}
 						if i > 0 {
 							m.events.emit(ReasoningDeltaEvent{Delta: remaining[:i]})
+							collector.add(StreamEvent{Kind: StreamEventReasoningDelta, Reasoning: remaining[:i]})
 						}
 						inThinkTag = false
 						m.events.emit(ReasoningCompleteEvent{})
+						collector.add(StreamEvent{Kind: StreamEventReasoningEnd})
 						remaining = remaining[i+len(thinkClose):]
 					} else {
 						i := strings.Index(remaining, thinkOpen)
 						if i == -1 {
 							m.events.emit(MessageUpdateEvent{Chunk: remaining})
+							collector.add(StreamEvent{Kind: StreamEventTextDelta, Text: remaining})
 							return
 						}
 						if i > 0 {
 							m.events.emit(MessageUpdateEvent{Chunk: remaining[:i]})
+							collector.add(StreamEvent{Kind: StreamEventTextDelta, Text: remaining[:i]})
 						}
 						inThinkTag = true
+						collector.add(StreamEvent{Kind: StreamEventReasoningStart})
 						remaining = remaining[i+len(thinkOpen):]
 					}
 				}
@@ -2206,9 +2366,11 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 		}(),
 		OnReasoningDelta: func(delta string) {
 			m.events.emit(ReasoningDeltaEvent{Delta: delta})
+			collector.add(StreamEvent{Kind: StreamEventReasoningDelta, Reasoning: delta})
 		},
 		OnReasoningComplete: func() {
 			m.events.emit(ReasoningCompleteEvent{})
+			collector.add(StreamEvent{Kind: StreamEventReasoningEnd})
 		},
 		OnToolOutput: func(toolCallID, toolName, chunk string, isStderr bool) {
 			m.events.emit(ToolOutputEvent{
@@ -2255,12 +2417,14 @@ func (m *Kit) generate(ctx context.Context, messages []fantasy.Message) (*agent.
 				ToolName:   toolName,
 				ToolKind:   toolKindFor(toolName),
 			})
+			collector.add(StreamEvent{Kind: StreamEventToolCallChunk, ToolID: toolCallID, ToolName: toolName})
 		},
 		OnToolCallDelta: func(toolCallID, delta string) {
 			m.events.emit(ToolCallDeltaEvent{
 				ToolCallID: toolCallID,
 				Delta:      delta,
 			})
+			collector.add(StreamEvent{Kind: StreamEventToolCallChunk, ToolID: toolCallID, Args: delta})
 		},
 		OnToolCallEnd: func(toolCallID string) {
 			m.events.emit(ToolCallEndEvent{
@@ -2412,6 +2576,14 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr

 	sentCount := len(messages)

+	// Attach a per-turn stream collector and halt holder so generate's
+	// callbacks can capture delta events (TurnResult.Stream) and tools can
+	// signal loop termination (TurnResult.FinalValue / HaltedByTool).
+	collector := &streamCollector{}
+	holder := &haltHolder{}
+	ctx = context.WithValue(ctx, streamCollectorKey{}, collector)
+	ctx = context.WithValue(ctx, haltHolderKey{}, holder)
+
 	m.events.emit(TurnStartEvent{Prompt: promptLabel})
 	m.events.emit(MessageStartEvent{})

@@ -2434,7 +2606,7 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr
 		m.events.emit(TurnEndEvent{Error: err})
 		// Run AfterTurn hooks even on error.
 		m.afterTurn.run(AfterTurnHook{Error: err})
-		return nil, err
+		return nil, ClassifyProviderError(err)
 	}

 	responseText := result.FinalResponse.Content.Text()
@@ -2487,6 +2659,13 @@ func (m *Kit) runTurn(ctx context.Context, promptLabel string, prompt string, pr
 		turnResult.FinalUsage = &finalUsage
 	}

+	// Surface captured stream deltas and any tool-driven halt signal.
+	turnResult.Stream = collector.drain()
+	if halted, toolName, value := holder.snapshot(); halted {
+		turnResult.HaltedByTool = toolName
+		turnResult.FinalValue = value
+	}
+
 	return turnResult, nil
 }

@@ -2628,28 +2807,158 @@ type PromptOptions struct {
 	// Use it to inject per-call instructions or context without permanently
 	// modifying the agent's system prompt.
 	SystemMessage string
+
+	// Model overrides the agent's configured model for this call only. Empty
+	// string means "use the agent's default". The previous model is restored
+	// after the call returns.
+	Model string
+
+	// ThinkingLevel overrides the agent's reasoning level for this call only
+	// (e.g. "off", "low", "medium", "high"). Empty string means "use the
+	// agent's default". The previous level is restored after the call.
+	ThinkingLevel string
+
+	// ExtraTools are added to the effective tool set for this call only and
+	// removed afterwards.
+	ExtraTools []Tool
+
+	// ProviderURL overrides the provider base URL for this call only. Useful
+	// for multi-tenant embedders that resolve endpoints per request. The
+	// previous value is restored after the call.
+	ProviderURL string
+
+	// ProviderAPIKey overrides the provider credential for this call only.
+	// The previous value is restored after the call.
+	ProviderAPIKey string
+}
+
+// applyPromptOptions applies the per-call overrides in opts to the shared
+// agent state and returns a restore function that reverts every change. It
+// holds promptOptsMu for the lifetime of the override window (the returned
+// restore releases it), so concurrent option-driven prompts are serialized.
+// On error nothing is changed and the lock is released.
+func (m *Kit) applyPromptOptions(ctx context.Context, opts PromptOptions) (func(), error) {
+	needsModelRebuild := opts.Model != "" || opts.ThinkingLevel != "" ||
+		opts.ProviderURL != "" || opts.ProviderAPIKey != ""
+	if !needsModelRebuild && len(opts.ExtraTools) == 0 {
+		return func() {}, nil
+	}
+
+	m.promptOptsMu.Lock()
+	var restores []func()
+	restore := func() {
+		for i := len(restores) - 1; i >= 0; i-- {
+			restores[i]()
+		}
+		m.promptOptsMu.Unlock()
+	}
+
+	// Extra tools (additive) — restored by re-setting the prior slice.
+	if len(opts.ExtraTools) > 0 {
+		prev := m.agent.GetExtraTools()
+		merged := make([]Tool, 0, len(prev)+len(opts.ExtraTools))
+		merged = append(merged, prev...)
+		merged = append(merged, opts.ExtraTools...)
+		m.agent.SetExtraTools(merged)
+		restores = append(restores, func() { m.agent.SetExtraTools(prev) })
+	}
+
+	if needsModelRebuild {
+		prevModel := m.modelString
+		prevThinkingSet := m.v.IsSet("thinking-level")
+		prevThinking := m.v.GetString("thinking-level")
+		prevURLSet := m.v.IsSet("provider-url")
+		prevURL := m.v.GetString("provider-url")
+		prevKeySet := m.v.IsSet("provider-api-key")
+		prevKey := m.v.GetString("provider-api-key")
+
+		if opts.ThinkingLevel != "" {
+			m.v.Set("thinking-level", opts.ThinkingLevel)
+		}
+		if opts.ProviderURL != "" {
+			m.v.Set("provider-url", opts.ProviderURL)
+		}
+		if opts.ProviderAPIKey != "" {
+			m.v.Set("provider-api-key", opts.ProviderAPIKey)
+		}
+
+		targetModel := opts.Model
+		if targetModel == "" {
+			targetModel = prevModel
+		}
+		if err := m.SetModel(ctx, targetModel); err != nil {
+			// Revert config keys we may have set, then unwind prior restores.
+			restoreViperString(m.v, "thinking-level", prevThinking, prevThinkingSet)
+			restoreViperString(m.v, "provider-url", prevURL, prevURLSet)
+			restoreViperString(m.v, "provider-api-key", prevKey, prevKeySet)
+			restore()
+			return nil, err
+		}
+		restores = append(restores, func() {
+			restoreViperString(m.v, "thinking-level", prevThinking, prevThinkingSet)
+			restoreViperString(m.v, "provider-url", prevURL, prevURLSet)
+			restoreViperString(m.v, "provider-api-key", prevKey, prevKeySet)
+			// Use a fresh context: the rollback must complete even if the
+			// caller's ctx was canceled or expired during the call, otherwise
+			// the per-call model override would leak into subsequent calls.
+			_ = m.SetModel(context.Background(), prevModel)
+		})
+	}
+
+	return restore, nil
+}
+
+// restoreViperString restores a config key to its prior value. A key that was
+// not explicitly set before is reset to the empty string; the override itself
+// is not removed.
+func restoreViperString(v *viper.Viper, key, prev string, wasSet bool) {
+	if wasSet {
+		v.Set(key, prev)
+		return
+	}
+	v.Set(key, "")
 }

 // PromptWithOptions sends a message with per-call configuration. It behaves
-// like Prompt but allows injecting an additional system message before the
-// user prompt. Both messages are persisted to the session.
+// like Prompt but applies the overrides in opts (system message, model,
+// thinking level, provider credentials, extra tools) for this call only,
+// restoring the agent's prior state afterwards.
 func (m *Kit) PromptWithOptions(ctx context.Context, msg string, opts PromptOptions) (string, error) {
-	var preMessages []fantasy.Message
-	if opts.SystemMessage != "" {
-		preMessages = append(preMessages, fantasy.NewSystemMessage(opts.SystemMessage))
-	}
-	preMessages = append(preMessages, fantasy.NewUserMessage(msg))
-
-	result, err := m.runTurn(ctx, msg, msg, preMessages)
+	result, err := m.PromptResultWithOptions(ctx, msg, opts)
 	if err != nil {
 		return "", err
 	}
 	return result.Response, nil
 }

+// PromptResultWithOptions is the [TurnResult]-returning counterpart of
+// PromptWithOptions. Like all turn-returning entry points it blocks until
+// end-of-turn, so the returned TurnResult (including TurnResult.Stream) is
+// complete when it returns. Per-call overrides in opts are applied for this
+// call only and the agent's prior state is restored before returning.
+func (m *Kit) PromptResultWithOptions(ctx context.Context, msg string, opts PromptOptions) (*TurnResult, error) {
+	restore, err := m.applyPromptOptions(ctx, opts)
+	if err != nil {
+		return nil, err
+	}
+	defer restore()
+
+	var preMessages []fantasy.Message
+	if opts.SystemMessage != "" {
+		preMessages = append(preMessages, fantasy.NewSystemMessage(opts.SystemMessage))
+	}
+	preMessages = append(preMessages, fantasy.NewUserMessage(msg))
+
+	return m.runTurn(ctx, msg, msg, preMessages)
+}
+
 // PromptResult sends a message and returns the full turn result including
 // usage statistics and conversation messages. Use this instead of Prompt()
 // when you need more than just the response text.
+//
+// PromptResult blocks until end-of-turn regardless of whether streaming is
+// enabled. Every delta observed during the turn is also captured in emit
+// order in [TurnResult.Stream] (coarse-grained events in non-streaming mode),
+// so callers can assert streamed ordering deterministically without an
+// OnMessageUpdate collector.
 func (m *Kit) PromptResult(ctx context.Context, message string) (*TurnResult, error) {
 	return m.runTurn(ctx, message, message, []fantasy.Message{
 		fantasy.NewUserMessage(message),
@@ -2787,7 +3096,18 @@ func extractFileParts(msg fantasy.Message) []fantasy.FilePart {
 // Close cleans up resources including MCP server connections, model resources,
 // and the tree session file handle. Should be called when the Kit instance is
 // no longer needed. Returns an error if cleanup fails.
+//
+// Close is equivalent to CloseContext(context.Background()). Use
+// [Kit.CloseContext] when shutdown must be bounded by a deadline.
 func (m *Kit) Close() error {
+	return m.CloseContext(context.Background())
+}
+
+// CloseContext is like [Kit.Close] but accepts a context so callers can
+// observe a shutdown deadline or cancellation. The context is honored on a
+// best-effort basis: cleanup always runs to completion, and if the context is
+// done once cleanup finishes and cleanup itself returned no error, the
+// context error is returned.
+func (m *Kit) CloseContext(ctx context.Context) error {
 	// Emit SessionShutdown for extensions.
 	if m.extRunner != nil && m.extRunner.HasHandlers(extensions.SessionShutdown) {
 		_, _ = m.extRunner.Emit(extensions.SessionShutdownEvent{})
@@ -2799,7 +3119,11 @@ func (m *Kit) Close() error {
 	if closer, ok := m.authHandler.(interface{ Close() error }); ok {
 		_ = closer.Close()
 	}
-	return m.agent.Close()
+	err := m.agent.Close()
+	if ctxErr := ctx.Err(); ctxErr != nil && err == nil {
+		return ctxErr
+	}
+	return err
 }

 // Conversion helpers are defined in adapter.go.
@@ -102,10 +102,11 @@ type MCPTaskProgressHandler func(MCPTaskProgress)
 // are optional; the zero value disables progress callbacks and applies
 // sensible polling defaults inside the engine.
 //
-// For most consumers, the flat [Options] fields (`MCPTaskMode`,
-// `MCPTaskTTL`, `MCPTaskPollInterval`, `MCPTaskMaxPollInterval`,
-// `MCPTaskTimeout`, `MCPTaskProgress`) are the preferred entry point.
-// MCPTaskConfig is exposed for the low-level [AgentConfig] path.
+// Most consumers configure these via the flat [Options] fields
+// (`MCPTaskMode`, `MCPTaskTTL`, `MCPTaskPollInterval`,
+// `MCPTaskMaxPollInterval`, `MCPTaskTimeout`, `MCPTaskProgress`). The
+// MCPTaskConfig type itself is retained for downstream consumers that
+// receive it on engine-facing call sites.
 type MCPTaskConfig struct {
 	// PerServerMode overrides the per-server task mode resolved from
 	// [MCPServerConfig]. Keys are server names. Missing entries fall back
@@ -133,35 +134,6 @@ type MCPTaskConfig struct {
 	Progress MCPTaskProgressHandler
 }

-// toToolsConfig converts the SDK-level [MCPTaskConfig] to the internal
-// tools-package representation. Keeps the dependency arrow internal-only.
-func (c MCPTaskConfig) toToolsConfig() tools.MCPTaskConfig {
-	cfg := tools.MCPTaskConfig{
-		DefaultTTL:      c.DefaultTTL,
-		PollInterval:    c.PollInterval,
-		MaxPollInterval: c.MaxPollInterval,
-		Timeout:         c.Timeout,
-	}
-	if len(c.PerServerMode) > 0 {
-		cfg.PerServerMode = make(map[string]tools.MCPTaskMode, len(c.PerServerMode))
-		for k, v := range c.PerServerMode {
-			cfg.PerServerMode[k] = tools.MCPTaskMode(v)
-		}
-	}
-	if c.Progress != nil {
-		h := c.Progress
-		cfg.Progress = func(p tools.MCPTaskProgress) {
-			h(MCPTaskProgress{
-				Server:  p.Server,
-				TaskID:  p.TaskID,
-				Status:  MCPTaskStatus(p.Status),
-				Message: p.Message,
-			})
-		}
-	}
-	return cfg
-}
-
 // mcpTaskOptions carries SDK consumer configuration into the agent setup.
 // Stored on Options as a single value so the public surface stays compact;
 // individual fields are exposed via WithMCP* builder functions.
@@ -83,6 +83,17 @@ func WithConfigFile(path string) Option { return func(o *Options) { o.ConfigFile
 // WithDebug enables SDK debug logging.
 func WithDebug() Option { return func(o *Options) { o.Debug = true } }

+// WithDebugLogger installs a caller-supplied [DebugLogger] for low-level
+// engine and MCP tool plumbing output. When set, this overrides the built-in
+// logger selected by [WithDebug]: messages flow into the supplied logger
+// unconditionally, and the logger's IsDebugEnabled reports whether downstream
+// code should bother formatting them. Use this to forward Kit's debug output
+// into your application's logging system (slog, zap, charm/log, an in-app
+// panel, etc.).
+func WithDebugLogger(l DebugLogger) Option {
+	return func(o *Options) { o.DebugLogger = l }
+}
+
 // Ephemeral configures an in-memory session with no persistence (equivalent to
 // Options.NoSession = true).
 func Ephemeral() Option { return func(o *Options) { o.NoSession = true } }
@@ -0,0 +1,102 @@
+package kit_test
+
+import (
+	"context"
+	"testing"
+
+	"charm.land/fantasy"
+
+	"github.com/mark3labs/kit/pkg/kit"
+)
+
+func TestNewRawTool(t *testing.T) {
+	schema := map[string]any{
+		"type": "object",
+		"properties": map[string]any{
+			"city": map[string]any{"type": "string", "description": "City name"},
+		},
+		"required": []any{"city"},
+	}
+
+	var gotArgs map[string]any
+	tool := kit.NewRawTool("get_weather", "Get weather", schema,
+		func(ctx context.Context, args map[string]any) (kit.ToolOutput, error) {
+			gotArgs = args
+			return kit.TextResult("72F in " + args["city"].(string)), nil
+		},
+	)
+
+	info := tool.Info()
+	if info.Name != "get_weather" {
+		t.Fatalf("name = %q", info.Name)
+	}
+	if info.Parameters["type"] != "object" {
+		t.Fatalf("schema not propagated: %#v", info.Parameters)
+	}
+	if len(info.Required) != 1 || info.Required[0] != "city" {
+		t.Fatalf("required not propagated: %#v", info.Required)
+	}
+
+	resp, err := tool.Run(context.Background(), fantasy.ToolCall{
+		ID:    "call_1",
+		Input: `{"city":"Boston"}`,
+	})
+	if err != nil {
+		t.Fatalf("Run error: %v", err)
+	}
+	if resp.IsError {
+		t.Fatalf("unexpected error response: %q", resp.Content)
+	}
+	if resp.Content != "72F in Boston" {
+		t.Fatalf("content = %q", resp.Content)
+	}
+	if gotArgs["city"] != "Boston" {
+		t.Fatalf("args not decoded: %#v", gotArgs)
+	}
+}
+
+func TestNewRawToolInvalidArgs(t *testing.T) {
+	tool := kit.NewRawTool("t", "d", nil,
+		func(ctx context.Context, args map[string]any) (kit.ToolOutput, error) {
+			t.Fatal("handler should not be called for invalid args")
+			return kit.ToolOutput{}, nil
+		},
+	)
+	resp, err := tool.Run(context.Background(), fantasy.ToolCall{ID: "x", Input: `not json`})
+	if err != nil {
+		t.Fatalf("Run error: %v", err)
+	}
+	if !resp.IsError {
+		t.Fatalf("expected error response for invalid args")
+	}
+}
+
+// Contract: null / whitespace-padded-null inputs must hand the handler a
+// non-nil empty map (not a nil map), so handlers can read or write keys
+// without a nil-map panic. Inputs are normalised before reaching the handler.
+func TestNewRawToolNullArgs(t *testing.T) {
+	for _, input := range []string{"null", " null ", "\tnull\n"} {
+		called := false
+		var gotNil bool
+		tool := kit.NewRawTool("t", "d", nil,
+			func(ctx context.Context, args map[string]any) (kit.ToolOutput, error) {
+				called = true
+				gotNil = args == nil
+				return kit.TextResult("ok"), nil
+			},
+		)
+		resp, err := tool.Run(context.Background(), fantasy.ToolCall{ID: "x", Input: input})
+		if err != nil {
+			t.Fatalf("input %q: Run error: %v", input, err)
+		}
+		if resp.IsError {
+			t.Fatalf("input %q: unexpected error response: %q", input, resp.Content)
+		}
+		if !called {
+			t.Fatalf("input %q: handler not called", input)
+		}
+		if gotNil {
+			t.Fatalf("input %q: args was nil, want non-nil empty map", input)
+		}
+	}
+}
@@ -1,9 +1,14 @@
 package kit

 import (
+	"errors"
 	"time"
 )

+// ErrBranchSummaryNotSupported is returned by SessionManager implementations
+// that do not support collapsing a branch range into a summary entry.
+var ErrBranchSummaryNotSupported = errors.New("session manager does not support branch summaries")
+
 // SessionManager defines the contract for conversation storage backends.
 // Implementations can use files (default), databases, cloud storage, etc.
 //
@@ -89,6 +94,12 @@ type SessionManager interface {
 	// determine which entries to summarize.
 	GetContextEntryIDs() []string

+	// AppendBranchSummary collapses the range from fromID to the current leaf
+	// on the active branch into a single summary entry and returns the new
+	// entry ID. It backs [Kit.CollapseBranch]. Managers that do not track
+	// branch summaries should return [ErrBranchSummaryNotSupported].
+	AppendBranchSummary(fromID, summary string) (entryID string, err error)
+
 	// Close releases resources (database connections, file handles, etc.).
 	Close() error
 }
@@ -217,19 +217,19 @@ func (m *Kit) SummarizeBranch(fromID, toID string) (string, error) {

 // CollapseBranch replaces a branch range with a summary entry.
 // Returns an error if the session is unavailable or the operation fails.
+// Custom SessionManagers that do not support branch summaries surface
+// [ErrBranchSummaryNotSupported].
+//
+// The branch is always collapsed from fromID to the current leaf; the toID
+// parameter is currently unused (the underlying AppendBranchSummary primitive
+// only supports collapsing to the leaf) and is retained for forward
+// compatibility.
 func (m *Kit) CollapseBranch(fromID, toID, summary string) error {
 	if m.session == nil {
 		return fmt.Errorf("no session available")
 	}
-	// Note: This operation is not directly supported by SessionManager interface
-	// as it requires AppendBranchSummary which is TreeManager-specific.
-	// For custom SessionManagers, this would need to be implemented differently.
-	// For now, we try to use the underlying TreeManager if available.
-	if adapter, ok := m.session.(*treeManagerAdapter); ok {
-		_, err := adapter.inner.AppendBranchSummary(fromID, summary)
-		return err
-	}
-	return fmt.Errorf("CollapseBranch not supported by custom session manager")
+	_, err := m.session.AppendBranchSummary(fromID, summary)
+	return err
 }

 // branchEntryToTreeNode converts a BranchEntry to a TreeNode.
@@ -2,10 +2,12 @@ package kit

 import (
 	"fmt"
+	"io/fs"
 	"os"

 	"github.com/mark3labs/kit/internal/extensions"
 	"github.com/mark3labs/kit/internal/skills"
+	"github.com/mark3labs/kit/internal/trust"
 )

 // ==== Skills Types ====
@@ -36,12 +38,28 @@ func LoadSkillsFromDir(dir string) ([]*Skill, error) {
 	return skills.LoadSkillsFromDir(dir)
 }

-// LoadSkills auto-discovers skills from standard directories:
-//   - Global: $XDG_CONFIG_HOME/kit/skills/ (default ~/.config/kit/skills/)
-//   - Project-local: <cwd>/.kit/skills/
+// LoadSkillsFromFS is the [fs.FS]-typed counterpart of [LoadSkillsFromDir].
+// It walks fsys starting at root (which may be "." or a subdirectory), finds
+// *.md/*.txt files and SKILL.md files in subdirectories, parses YAML
+// frontmatter + markdown body, and returns the loaded skills. Use it when
+// skill discovery is wrapped in an fs.FS abstraction (embed.FS distribution,
+// fstest.MapFS tests, or per-tenant virtual filesystems).
 //
-// cwd is the working directory for project-local discovery; if empty the
-// current working directory is used.
+// Each loaded skill's Path is its slash-separated path within fsys, since
+// fs.FS has no notion of an absolute on-disk path.
+func LoadSkillsFromFS(fsys fs.FS, root string) ([]*Skill, error) {
+	return skills.LoadSkillsFromFS(fsys, root)
+}
+
+// LoadSkills auto-discovers skills from standard directories:
+//   - User-level: ~/.agents/skills/ (cross-client convention)
+//   - User-level: $XDG_CONFIG_HOME/kit/skills/ (default ~/.config/kit/skills/)
+//   - Project-local: <cwd>/.agents/skills/ (cross-client convention)
+//   - Project-local: <cwd>/.kit/skills/ (Kit-specific)
+//
+// Project-level skills take precedence over user-level skills with the same
+// name. cwd is the working directory for project-local discovery; if empty
+// the current working directory is used.
 func LoadSkills(cwd string) ([]*Skill, error) {
 	return skills.LoadSkills(cwd)
 }
@@ -113,12 +131,17 @@ func (m *Kit) LoadSkillsFromDirForExtension(dir string) extensions.SkillLoadResu
 // convertSkill converts internal skill to extension-facing format.
 func (m *Kit) convertSkill(s *skills.Skill) *extensions.Skill {
 	return &extensions.Skill{
-		Name:        s.Name,
-		Description: s.Description,
-		Content:     s.Content,
-		Path:        s.Path,
-		Tags:        s.Tags,
-		When:        s.When,
+		Name:                   s.Name,
+		Description:            s.Description,
+		Content:                s.Content,
+		Path:                   s.Path,
+		License:                s.License,
+		Compatibility:          s.Compatibility,
+		Metadata:               s.Metadata,
+		AllowedTools:           s.AllowedTools,
+		DisableModelInvocation: s.DisableModelInvocation,
+		Tags:                   s.Tags,
+		When:                   s.When,
 	}
 }

@@ -286,3 +309,107 @@ func (m *Kit) applyComposedSystemPrompt() {
 func (m *Kit) RefreshSystemPrompt() {
 	m.applyComposedSystemPrompt()
 }
+
+// ---------------------------------------------------------------------------
+// Per-skill disable (Issue #65, gap #10)
+// ---------------------------------------------------------------------------
+
+// applySkillDisableList sets DisableModelInvocation on every skill whose Name
+// appears in names. Disabled skills remain loaded (so explicit /skill:
+// activation still works) but are hidden from the model-facing catalog.
+func applySkillDisableList(skillList []*skills.Skill, names []string) {
+	if len(names) == 0 {
+		return
+	}
+	disabled := make(map[string]bool, len(names))
+	for _, n := range names {
+		disabled[n] = true
+	}
+	for _, s := range skillList {
+		if disabled[s.Name] {
+			s.DisableModelInvocation = true
+		}
+	}
+}
+
+// DisableSkill hides the named skill from the model-facing catalog while
+// keeping it loaded (so it can still be activated explicitly via /skill:).
+// The system prompt is recomposed and applied. Returns true when a skill with
+// that name was found.
+func (m *Kit) DisableSkill(name string) bool {
+	return m.setSkillModelInvocation(name, true)
+}
+
+// EnableSkill re-exposes a previously disabled skill in the model-facing
+// catalog. The system prompt is recomposed and applied. Returns true when a
+// skill with that name was found.
+func (m *Kit) EnableSkill(name string) bool {
+	return m.setSkillModelInvocation(name, false)
+}
+
+// setSkillModelInvocation toggles DisableModelInvocation on the named skill
+// and refreshes the system prompt. Returns true when the skill was found.
+func (m *Kit) setSkillModelInvocation(name string, disabled bool) bool {
+	m.runtimeMu.Lock()
+	found := false
+	for _, s := range m.skills {
+		if s.Name == name {
+			s.DisableModelInvocation = disabled
+			found = true
+			break
+		}
+	}
+	m.runtimeMu.Unlock()
+
+	if !found {
+		return false
+	}
+	m.ClearSkillCache()
+	m.applyComposedSystemPrompt()
+	return true
+}
+
+// ---------------------------------------------------------------------------
+// Project-skill trust gate (Issue #65, gap #8)
+// ---------------------------------------------------------------------------
+
+// TrustDecision is the outcome of a project-skill trust prompt.
+type TrustDecision = trust.Decision
+
+// Trust-prompt outcomes. They mirror the trust package decisions.
+const (
+	// SkipProjectSkills declines to load project skills this session.
+	SkipProjectSkills = trust.Skip
+	// TrustProject loads project skills and persists the directory as trusted.
+	TrustProject = trust.Trust
+	// TrustProjectOnce loads project skills this session without persisting.
+	TrustProjectOnce = trust.TrustOnce
+)
+
+// projectSkillsTrusted decides whether project-local skills discovered in dir
+// should be loaded. When no SkillTrustPrompt is configured the directory is
+// trusted by default (preserving historical behaviour). Otherwise a persisted
+// allowlist is consulted first, then the prompt is invoked for an unknown
+// directory and the decision is persisted when the user chooses TrustProject.
+func projectSkillsTrusted(opts *Options, dir string, count int) bool {
+	if opts.SkillTrustPrompt == nil {
+		return true
+	}
+
+	store, err := trust.Load("")
+	if err == nil && store.IsTrusted(dir) {
+		return true
+	}
+
+	switch opts.SkillTrustPrompt(dir, count) {
+	case TrustProject:
+		if store != nil {
+			_ = store.Trust(dir)
+		}
+		return true
+	case TrustProjectOnce:
+		return true
+	default:
+		return false
+	}
+}
@@ -0,0 +1,120 @@
+package kit
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+)
+
+func writeSkillFile(t *testing.T, path, name, desc string) {
+	t.Helper()
+	content := "---\nname: " + name + "\ndescription: " + desc + "\n---\nBody."
+	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
+		t.Fatal(err)
+	}
+}
+
+// TestLoadSkills_SkillsDirIsDirect verifies --skills-dir scans the directory
+// directly rather than appending .agents/.kit beneath it (issue #65, gap #3).
+func TestLoadSkills_SkillsDirIsDirect(t *testing.T) {
+	dir := t.TempDir()
+	writeSkillFile(t, filepath.Join(dir, "direct.md"), "direct", "A direct skill")
+
+	got, err := loadSkills(&Options{SkillsDir: dir})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(got) != 1 || got[0].Name != "direct" {
+		t.Fatalf("expected 1 skill named 'direct', got %+v", got)
+	}
+}
+
+// TestApplySkillDisableList verifies the disable list sets
+// DisableModelInvocation on the named skill only, which hides it from the
+// catalog (issue #65, gap #10).
+func TestApplySkillDisableList(t *testing.T) {
+	dir := t.TempDir()
+	writeSkillFile(t, filepath.Join(dir, "a.md"), "a", "skill a")
+	writeSkillFile(t, filepath.Join(dir, "b.md"), "b", "skill b")
+
+	got, err := loadSkills(&Options{SkillsDir: dir})
+	if err != nil {
+		t.Fatal(err)
+	}
+	applySkillDisableList(got, []string{"a"})
+
+	var aDisabled, bDisabled bool
+	for _, s := range got {
+		switch s.Name {
+		case "a":
+			aDisabled = s.DisableModelInvocation
+		case "b":
+			bDisabled = s.DisableModelInvocation
+		}
+	}
+	if !aDisabled {
+		t.Error("skill 'a' should be disabled")
+	}
+	if bDisabled {
+		t.Error("skill 'b' should not be disabled")
+	}
+}
+
+// TestProjectSkillsTrust verifies the trust gate drops untrusted project
+// skills and honours a Trust decision (issue #65, gap #8).
+func TestProjectSkillsTrust(t *testing.T) {
+	dir := t.TempDir()
+	t.Setenv("XDG_CONFIG_HOME", filepath.Join(dir, "xdg"))
+	t.Setenv("HOME", filepath.Join(dir, "home"))
+
+	projectDir := filepath.Join(dir, "repo")
+	skillsDir := filepath.Join(projectDir, ".agents", "skills")
+	if err := os.MkdirAll(skillsDir, 0o755); err != nil {
+		t.Fatal(err)
+	}
+	writeSkillFile(t, filepath.Join(skillsDir, "proj.md"), "proj", "project skill")
+
+	// Skip decision → no project skills loaded.
+	skipped, err := loadSkills(&Options{
+		SessionDir: projectDir,
+		SkillTrustPrompt: func(_ string, _ int) TrustDecision {
+			return SkipProjectSkills
+		},
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(skipped) != 0 {
+		t.Fatalf("expected 0 skills when trust is skipped, got %d", len(skipped))
+	}
+
+	// Trust decision → project skills loaded and directory persisted.
+	prompted := 0
+	trusted, err := loadSkills(&Options{
+		SessionDir: projectDir,
+		SkillTrustPrompt: func(_ string, _ int) TrustDecision {
+			prompted++
+			return TrustProject
+		},
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(trusted) != 1 {
+		t.Fatalf("expected 1 skill when trusted, got %d", len(trusted))
+	}
+
+	// A subsequent load should not prompt again (persisted trust).
+	again, err := loadSkills(&Options{
+		SessionDir: projectDir,
+		SkillTrustPrompt: func(_ string, _ int) TrustDecision {
+			t.Error("should not prompt for an already-trusted directory")
+			return SkipProjectSkills
+		},
+	})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if len(again) != 1 {
+		t.Fatalf("expected 1 skill on trusted reload, got %d", len(again))
+	}
+}
@@ -0,0 +1,15 @@
+//go:build testing
+
+package kit
+
+import "github.com/mark3labs/kit/internal/config"
+
+// ResetForTesting clears package-global state that survives across tests in
+// the same binary. It is intended for test-binary teardown / between-test
+// cleanup. Call it only when no kit.New() calls are in flight.
+//
+// This function is only compiled under the "testing" build tag so it never
+// ships in production binaries.
+func ResetForTesting() {
+	config.SetConfigPath("")
+}
@@ -1,8 +1,12 @@
 package kit

 import (
+	"bytes"
 	"context"
+	"encoding/json"
+	"fmt"
 	"strings"
+	"sync"

 	"charm.land/fantasy"

@@ -40,6 +44,20 @@ type ToolOutput struct {
 	// Metadata is optional opaque metadata attached to the response.
 	// It is not sent to the LLM but may be consumed by hooks or the UI.
 	Metadata any
+
+	// FinalValue, when Halt is true, is propagated to the turn's
+	// [TurnResult.FinalValue] so the caller can recover a typed result
+	// produced by the tool (e.g. a structured "finish" tool). The dynamic
+	// type is whatever the tool handler stored. Ignored when Halt is false.
+	FinalValue any
+
+	// Halt, when true, signals that the agent loop should terminate after
+	// this tool call. Content is still returned to the model for the current
+	// step, but [TurnResult.FinalValue] and [TurnResult.HaltedByTool] are
+	// populated so embedders building structured-result extraction patterns
+	// (model calls a finish(...) tool, the loop ends, the typed value is
+	// returned) no longer need a side-channel.
+	Halt bool
 }

 // TextResult creates a successful text [ToolOutput].
@@ -72,6 +90,49 @@ func MediaResult(content string, data []byte, mediaType string) ToolOutput {
 // toolCallIDKey is the context key for the tool call ID.
 type toolCallIDKey struct{}

+// haltHolderKey is the context key for the per-turn halt holder. It is
+// injected by runTurn so tool handlers created with [NewTool],
+// [NewParallelTool], and [NewRawTool] can signal loop termination and carry a
+// final value out to the [TurnResult] without an embedder-side side-channel.
+type haltHolderKey struct{}
+
+// haltHolder captures a Halt signal raised by a tool handler during a turn.
+type haltHolder struct {
+	mu       sync.Mutex
+	halted   bool
+	toolName string
+	value    any
+}
+
+func (h *haltHolder) set(toolName string, value any) {
+	h.mu.Lock()
+	defer h.mu.Unlock()
+	// First halt wins so the earliest finishing tool determines the result.
+	if h.halted {
+		return
+	}
+	h.halted = true
+	h.toolName = toolName
+	h.value = value
+}
+
+func (h *haltHolder) snapshot() (bool, string, any) {
+	h.mu.Lock()
+	defer h.mu.Unlock()
+	return h.halted, h.toolName, h.value
+}
+
+// recordHalt records a Halt signal from a tool handler onto the per-turn halt
+// holder, if one is present in the context.
+func recordHalt(ctx context.Context, toolName string, result ToolOutput) {
+	if !result.Halt {
+		return
+	}
+	if holder, ok := ctx.Value(haltHolderKey{}).(*haltHolder); ok && holder != nil {
+		holder.set(toolName, result.FinalValue)
+	}
+}
+
 // ToolCallIDFromContext extracts the tool call ID from the context.
 // The call ID is set automatically by [NewTool] and [NewParallelTool]
 // before invoking the handler. Returns an empty string if no ID is present.
@@ -144,6 +205,7 @@ func NewTool[TInput any](name, description string, fn func(ctx context.Context,
 			if err != nil {
 				return fantasy.NewTextErrorResponse(err.Error()), nil
 			}
+			recordHalt(ctx, name, result)
 			return toolOutputToResponse(result), nil
 		},
 	)
@@ -160,11 +222,104 @@ func NewParallelTool[TInput any](name, description string, fn func(ctx context.C
 			if err != nil {
 				return fantasy.NewTextErrorResponse(err.Error()), nil
 			}
+			recordHalt(ctx, name, result)
 			return toolOutputToResponse(result), nil
 		},
 	)
 }

+// rawToolInput is the undecoded input carrier used by [NewRawTool]. Using
+// json.RawMessage lets the typed-tool machinery in fantasy generate a
+// permissive object schema; the handler wrapper then decodes the raw
+// arguments into a map before calling fn.
+type rawToolInput = json.RawMessage
+
+// NewRawTool is the schema-driven counterpart to [NewTool]. Use it when the
+// tool's input shape isn't known at compile time — for example tools loaded
+// from JSON Schema definitions in skill files or MCP server catalogs.
+//
+// schema must be a valid JSON Schema describing the tool's input object; it is
+// advertised to the model as the tool's parameter schema. fn receives the
+// decoded JSON arguments as a map and returns a [ToolOutput]. Like [NewTool],
+// the tool call ID is injected into the context and can be retrieved with
+// [ToolCallIDFromContext], and [ToolOutput.Halt] is honored.
+//
+// If the model sends arguments that are not a valid JSON object the call
+// short-circuits with an error [ToolResponse] before fn is invoked.
+func NewRawTool(
+	name, description string,
+	schema map[string]any,
+	fn func(ctx context.Context, args map[string]any) (ToolOutput, error),
+) Tool {
+	tool := fantasy.NewAgentTool(name, description,
+		func(ctx context.Context, input rawToolInput, call fantasy.ToolCall) (fantasy.ToolResponse, error) {
+			ctx = context.WithValue(ctx, toolCallIDKey{}, call.ID)
+			args := map[string]any{}
+			// Normalise whitespace before the null/empty guard so values like
+			// " null " or "\tnull\n" take the same skip-unmarshal path as the
+			// bare "null" and the handler always receives a non-nil empty map.
+			// (fantasy currently trims via its RawMessage decode, but this keeps
+			// the guard correct independent of that upstream behaviour.)
+			trimmed := bytes.TrimSpace(input)
+			if len(trimmed) > 0 && !bytes.Equal(trimmed, []byte("null")) {
+				if err := json.Unmarshal(trimmed, &args); err != nil {
+					return fantasy.NewTextErrorResponse(fmt.Sprintf("invalid arguments for tool %q: %v", name, err)), nil
+				}
+			}
+			result, err := fn(ctx, args)
+			if err != nil {
+				return fantasy.NewTextErrorResponse(err.Error()), nil
+			}
+			recordHalt(ctx, name, result)
+			return toolOutputToResponse(result), nil
+		},
+	)
+	// Override the auto-generated schema with the caller-supplied one so the
+	// model sees the real input shape instead of the permissive raw-message
+	// schema.
+	if len(schema) > 0 {
+		info := tool.Info()
+		info.Parameters = schema
+		info.Required = requiredFromSchema(schema)
+		tool = &schemaOverrideTool{AgentTool: tool, info: info}
+	}
+	return tool
+}
+
+// schemaOverrideTool wraps an [fantasy.AgentTool] to advertise a
+// caller-supplied JSON Schema instead of the auto-generated one. Used by
+// [NewRawTool].
+type schemaOverrideTool struct {
+	fantasy.AgentTool
+	info fantasy.ToolInfo
+}
+
+// Info returns the tool info carrying the overridden parameter schema.
+func (t *schemaOverrideTool) Info() fantasy.ToolInfo { return t.info }
+
+// requiredFromSchema extracts the top-level "required" array from a JSON
+// Schema object, tolerating both []string and []any element types.
+func requiredFromSchema(schema map[string]any) []string {
+	raw, ok := schema["required"]
+	if !ok {
+		return nil
+	}
+	switch v := raw.(type) {
+	case []string:
+		return v
+	case []any:
+		out := make([]string, 0, len(v))
+		for _, e := range v {
+			if s, ok := e.(string); ok {
+				out = append(out, s)
+			}
+		}
+		return out
+	default:
+		return nil
+	}
+}
+
 // --- Individual tool constructors ---

 // NewReadTool creates a file-reading tool.
@@ -0,0 +1,48 @@
+package kit
+
+import (
+	"context"
+	"sync"
+)
+
+// streamCollectorKey is the context key carrying the per-turn stream collector
+// so the agent callbacks in generate can capture delta events into
+// TurnResult.Stream without re-implementing an OnMessageUpdate handler.
+type streamCollectorKey struct{}
+
+// streamCollector accumulates StreamEvents observed during a single turn in
+// emit order. It is safe for concurrent use because tool-call deltas and text
+// deltas may be emitted from different goroutines.
+type streamCollector struct {
+	mu     sync.Mutex
+	events []StreamEvent
+}
+
+func (c *streamCollector) add(e StreamEvent) {
+	if c == nil {
+		return
+	}
+	c.mu.Lock()
+	c.events = append(c.events, e)
+	c.mu.Unlock()
+}
+
+func (c *streamCollector) drain() []StreamEvent {
+	if c == nil {
+		return nil
+	}
+	c.mu.Lock()
+	defer c.mu.Unlock()
+	if len(c.events) == 0 {
+		return nil
+	}
+	out := make([]StreamEvent, len(c.events))
+	copy(out, c.events)
+	return out
+}
+
+// streamCollectorFromContext returns the per-turn stream collector if present.
+func streamCollectorFromContext(ctx context.Context) *streamCollector {
+	c, _ := ctx.Value(streamCollectorKey{}).(*streamCollector)
+	return c
+}
@@ -0,0 +1,86 @@
+package kit
+
+import (
+	"context"
+	"testing"
+)
+
+func TestHaltHolderFirstWins(t *testing.T) {
+	h := &haltHolder{}
+	if halted, _, _ := h.snapshot(); halted {
+		t.Fatal("new holder should not be halted")
+	}
+	h.set("finish", 42)
+	h.set("other", 99) // ignored — first halt wins
+	halted, name, val := h.snapshot()
+	if !halted {
+		t.Fatal("holder should be halted")
+	}
+	if name != "finish" {
+		t.Fatalf("toolName = %q, want finish", name)
+	}
+	if v, ok := val.(int); !ok || v != 42 {
+		t.Fatalf("value = %#v, want 42", val)
+	}
+}
+
+func TestRecordHalt(t *testing.T) {
+	holder := &haltHolder{}
+	ctx := context.WithValue(context.Background(), haltHolderKey{}, holder)
+
+	// Non-halting output records nothing.
+	recordHalt(ctx, "noop", ToolOutput{Content: "ok"})
+	if halted, _, _ := holder.snapshot(); halted {
+		t.Fatal("non-halting output should not halt")
+	}
+
+	recordHalt(ctx, "finish", ToolOutput{Halt: true, FinalValue: "done"})
+	halted, name, val := holder.snapshot()
+	if !halted || name != "finish" || val != "done" {
+		t.Fatalf("halt not recorded: halted=%v name=%q val=%v", halted, name, val)
+	}
+
+	// Missing holder in context is a safe no-op.
+	recordHalt(context.Background(), "finish", ToolOutput{Halt: true})
+}
+
+func TestStreamCollector(t *testing.T) {
+	c := &streamCollector{}
+	if c.drain() != nil {
+		t.Fatal("empty collector should drain to nil")
+	}
+	c.add(StreamEvent{Kind: StreamEventTextDelta, Text: "A"})
+	c.add(StreamEvent{Kind: StreamEventTextDelta, Text: "B"})
+	c.add(StreamEvent{Kind: StreamEventToolCallChunk, ToolName: "x"})
+
+	out := c.drain()
+	if len(out) != 3 {
+		t.Fatalf("len = %d, want 3", len(out))
+	}
+	if out[0].Text != "A" || out[1].Text != "B" {
+		t.Fatalf("order not preserved: %#v", out)
+	}
+	if out[2].Kind != StreamEventToolCallChunk || out[2].ToolName != "x" {
+		t.Fatalf("tool chunk wrong: %#v", out[2])
+	}
+}
+
+// nil receiver collector (no per-turn collector attached) must be safe.
+func TestStreamCollectorNil(t *testing.T) {
+	var c *streamCollector
+	c.add(StreamEvent{Kind: StreamEventTextDelta, Text: "x"}) // no panic
+	if c.drain() != nil {
+		t.Fatal("nil collector should drain to nil")
+	}
+}
+
+func TestStreamCollectorFromContext(t *testing.T) {
+	if streamCollectorFromContext(context.Background()) != nil {
+		t.Fatal("expected nil collector for bare context")
+	}
+	c := &streamCollector{}
+	ctx := context.WithValue(context.Background(), streamCollectorKey{}, c)
+	if streamCollectorFromContext(ctx) != c {
+		t.Fatal("collector not retrieved from context")
+	}
+}
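Reviewer sketch: the tests above cover the holder itself, but not the loop side of the `Halt` / `FinalValue` contract described on `ToolOutput`. The snippet below is a simplified, standalone model of that contract, assuming a sequential loop over scripted tool calls. All types here (`toolOutput`, `turnResult`, `scriptedCall`) and this `runTurn` are illustrative stand-ins; the real `runTurn` is not part of this diff.

```go
package main

import "fmt"

// toolOutput is a stand-in for the Halt-related fields of kit.ToolOutput.
type toolOutput struct {
	Content    string
	Halt       bool
	FinalValue any
}

// turnResult is a stand-in for the fields a halting tool populates.
type turnResult struct {
	Steps        int
	HaltedByTool string
	FinalValue   any
}

// scriptedCall simulates one tool call the model would make.
type scriptedCall struct {
	name string
	run  func() toolOutput
}

// runTurn executes calls in order and stops after the first halting tool,
// carrying its FinalValue out to the result.
func runTurn(calls []scriptedCall) turnResult {
	var res turnResult
	for _, c := range calls {
		res.Steps++
		out := c.run()
		if out.Halt {
			res.HaltedByTool = c.name
			res.FinalValue = out.FinalValue
			break
		}
	}
	return res
}

// answer is the typed value a hypothetical "finish" tool returns.
type answer struct{ Files int }

func main() {
	res := runTurn([]scriptedCall{
		{name: "ls", run: func() toolOutput { return toolOutput{Content: "a b c"} }},
		{name: "finish", run: func() toolOutput {
			return toolOutput{Content: "done", Halt: true, FinalValue: answer{Files: 3}}
		}},
		{name: "never", run: func() toolOutput { return toolOutput{Content: "unreachable"} }},
	})
	// The caller recovers the typed value with a type assertion.
	if a, ok := res.FinalValue.(answer); ok {
		fmt.Println(res.HaltedByTool, a.Files, res.Steps)
	}
}
```

Because `FinalValue` is typed `any`, the caller is responsible for the type assertion. Keeping the payload type next to the tool definition makes that assertion easy to get right.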
@@ -5,13 +5,11 @@ import (

 	"charm.land/fantasy"

-	"github.com/mark3labs/kit/internal/agent"
 	"github.com/mark3labs/kit/internal/compaction"
 	"github.com/mark3labs/kit/internal/config"
 	"github.com/mark3labs/kit/internal/message"
 	"github.com/mark3labs/kit/internal/models"
 	"github.com/mark3labs/kit/internal/session"
-	"github.com/mark3labs/kit/internal/tools"
 	"github.com/mark3labs/mcp-go/client/transport"
 	"github.com/mark3labs/mcp-go/server"
 )
@@ -83,9 +81,10 @@ type MCPServerConfig = config.MCPServerConfig
 // concurrent use.
 //
 // Most consumers do not need to provide one; pass [Options.Debug] = true
-// to use the default logger. DebugLogger is exposed for the low-level
-// [AgentConfig] path and for embedders that want to route debug output
-// into their own logging system.
+// (or use [WithDebug]) to install the built-in console logger. DebugLogger
+// is the escape hatch for embedders that want to route debug output into
+// their own logging system — install one via [Options.DebugLogger] or
+// [WithDebugLogger].
 type DebugLogger interface {
 	// LogDebug records a single debug message. Implementations may drop,
 	// buffer, or render the message however they choose.
@@ -95,109 +94,6 @@ type DebugLogger interface {
 	IsDebugEnabled() bool
 }

-// AgentConfig holds configuration options for constructing an agent at the
-// SDK boundary. All fields use SDK-owned types, so consumers can populate
-// this struct without importing any underlying LLM-provider package.
-//
-// For most use cases, prefer the high-level [New] entry point with
-// [Options]. AgentConfig is exposed for advanced consumers that need
-// direct access to the lower-level agent configuration shape.
-type AgentConfig struct {
-	// ModelConfig holds the LLM provider configuration. A nil value means
-	// that the default provider/model resolution will be used.
-	ModelConfig *ProviderConfig
-
-	// MCPConfig describes any MCP servers whose tools should be loaded
-	// alongside core tools.
-	MCPConfig *Config
-
-	// SystemPrompt is the system prompt sent to the LLM.
-	SystemPrompt string
-
-	// MaxSteps caps the number of LLM iterations per turn. A value of
-	// zero means no cap is applied at this layer.
-	MaxSteps int
-
-	// StreamingEnabled controls whether the agent streams responses.
-	StreamingEnabled bool
-
-	// AuthHandler handles OAuth authorization for remote MCP servers.
-	// When nil, remote MCP servers requiring OAuth will fail to connect.
-	AuthHandler MCPAuthHandler
-
-	// TokenStoreFactory, if non-nil, creates a custom token store for each
-	// remote MCP server's OAuth tokens. When nil, the default file-based
-	// token store is used.
-	TokenStoreFactory MCPTokenStoreFactory
-
-	// CoreTools overrides the default core tool set. If empty, [AllTools]
-	// is used. Provide a custom tool set (e.g. [CodingTools] or tools
-	// built with a custom WorkDir) to scope agent capabilities.
-	CoreTools []Tool
-
-	// DisableCoreTools, when true, prevents loading any core tools.
-	// Combined with empty CoreTools this yields a chat-only agent with
-	// no built-in tools.
-	DisableCoreTools bool
-
-	// ExtraTools are additional tools loaded alongside core and MCP tools.
-	ExtraTools []Tool
-
-	// ToolWrapper, if non-nil, wraps the combined tool list before it is
-	// handed to the LLM. Used to intercept tool calls or results.
-	ToolWrapper func([]Tool) []Tool
-
-	// OnMCPServerLoaded, if non-nil, is invoked once for each MCP server
-	// when its tools have finished loading (or failed). Called from a
-	// background goroutine.
-	OnMCPServerLoaded func(serverName string, toolCount int, err error)
-
-	// DebugLogger receives low-level debug output from the engine and the
-	// MCP tool plumbing. Nil means no debug output is emitted at this
-	// layer (regardless of [Options.Debug], which feeds the higher-level
-	// [New] entry point). Pass an implementation here when wiring a custom
-	// logger through the lower-level AgentConfig path.
-	DebugLogger DebugLogger
-
-	// MCPTaskConfig configures task-aware MCP tools/call execution — mode
-	// overrides, polling intervals, timeouts, and the progress handler.
-	// The zero value preserves historical synchronous-only behaviour for
-	// any server that didn't advertise task support during initialize.
-	MCPTaskConfig MCPTaskConfig
-}
-
-// toInternal converts an AgentConfig to its internal representation.
-// Slice and function fields convert without allocation because [Tool]
-// is a type alias for the underlying LLM-tool type.
-func (c *AgentConfig) toInternal() *agent.AgentConfig {
-	if c == nil {
-		return nil
-	}
-	out := &agent.AgentConfig{
-		ModelConfig:       c.ModelConfig,
-		MCPConfig:         c.MCPConfig,
-		SystemPrompt:      c.SystemPrompt,
-		MaxSteps:          c.MaxSteps,
-		StreamingEnabled:  c.StreamingEnabled,
-		CoreTools:         c.CoreTools,
-		DisableCoreTools:  c.DisableCoreTools,
-		ExtraTools:        c.ExtraTools,
-		ToolWrapper:       c.ToolWrapper,
-		OnMCPServerLoaded: c.OnMCPServerLoaded,
-	}
-	if c.AuthHandler != nil {
-		out.AuthHandler = c.AuthHandler
-	}
-	if c.TokenStoreFactory != nil {
-		out.TokenStoreFactory = tools.TokenStoreFactory(c.TokenStoreFactory)
-	}
-	if c.DebugLogger != nil {
-		out.DebugLogger = c.DebugLogger
-	}
-	out.MCPTaskConfig = c.MCPTaskConfig.toToolsConfig()
-	return out
-}
-
 // ToolCallHandler is invoked when the LLM produces a tool call. It receives
 // the call ID, tool name, and the JSON-encoded input arguments.
 type ToolCallHandler func(toolCallID, toolName, toolArgs string)
@@ -264,30 +264,31 @@ func TestConvertFromLLMMessage(t *testing.T) {
 	}
 }

-// TestAgentConfigNoFantasyImport verifies AgentConfig can be populated with
-// every field — including CoreTools, ExtraTools, and ToolWrapper — using
-// only SDK-owned types. This test deliberately does not import
-// "charm.land/fantasy"; the package compiling at all is the proof that the
-// SDK no longer leaks the dependency name through AgentConfig.
+// TestOptionsNoFantasyImport verifies Options can be populated with the
+// tool-related fields — Tools and ExtraTools — using only SDK-owned types.
+// This test deliberately does not import "charm.land/fantasy"; the package
+// compiling at all is the proof that the SDK no longer leaks the dependency
+// name through the Options surface.
+//
+// Tool-call interception (formerly the AgentConfig.ToolWrapper escape hatch)
+// is covered by the hook system — [Kit.OnBeforeToolCall] /
+// [Kit.OnAfterToolResult] — whose hook payload types also use only
+// SDK-owned identifiers; see hooks_test.go.
 //
 // Regression test for https://github.com/mark3labs/kit/issues/30.
-func TestAgentConfigNoFantasyImport(t *testing.T) {
+func TestOptionsNoFantasyImport(t *testing.T) {
 	myTool := kit.NewTool[struct{}]("noop", "does nothing", func(_ context.Context, _ struct{}) (kit.ToolOutput, error) {
 		return kit.TextResult("ok"), nil
 	})

-	wrapperCalled := false
-	cfg := kit.AgentConfig{
-		SystemPrompt:     "you are a tester",
-		MaxSteps:         5,
-		StreamingEnabled: true,
-		CoreTools:        []kit.Tool{myTool},
-		ExtraTools:       []kit.Tool{myTool},
-		DisableCoreTools: false,
-		ToolWrapper: func(in []kit.Tool) []kit.Tool {
-			wrapperCalled = true
-			return in
-		},
+	streaming := true
+	cfg := kit.Options{
+		SystemPrompt:      "you are a tester",
+		MaxSteps:          5,
+		Streaming:         &streaming,
+		Tools:             []kit.Tool{myTool},
+		ExtraTools:        []kit.Tool{myTool},
+		DisableCoreTools:  false,
 		OnMCPServerLoaded: func(_ string, _ int, _ error) {},
 	}

@@ -297,36 +298,29 @@ func TestAgentConfigNoFantasyImport(t *testing.T) {
 	if cfg.MaxSteps != 5 {
 		t.Errorf("MaxSteps = %d, want 5", cfg.MaxSteps)
 	}
-	if !cfg.StreamingEnabled {
-		t.Error("StreamingEnabled = false, want true")
+	if cfg.Streaming == nil || !*cfg.Streaming {
+		t.Error("Streaming = false/nil, want true")
 	}
-	if len(cfg.CoreTools) != 1 {
-		t.Errorf("CoreTools len = %d, want 1", len(cfg.CoreTools))
+	if len(cfg.Tools) != 1 {
+		t.Errorf("Tools len = %d, want 1", len(cfg.Tools))
 	}
 	if len(cfg.ExtraTools) != 1 {
 		t.Errorf("ExtraTools len = %d, want 1", len(cfg.ExtraTools))
 	}
-
-	// Exercise the wrapper to confirm the func type is usable.
-	out := cfg.ToolWrapper(cfg.CoreTools)
-	if !wrapperCalled {
-		t.Error("ToolWrapper was not invoked")
-	}
-	if len(out) != 1 {
-		t.Errorf("wrapped tool list len = %d, want 1", len(out))
-	}
 }

-// TestAgentConfigToolWrapperSignature documents that AgentConfig.ToolWrapper
-// uses kit.Tool (not the underlying provider type) in its signature.
-func TestAgentConfigToolWrapperSignature(t *testing.T) {
-	//nolint:staticcheck // QF1011: explicit type asserts the SDK-side func signature.
-	var _ func([]kit.Tool) []kit.Tool = func(in []kit.Tool) []kit.Tool { return in }
-	cfg := kit.AgentConfig{
-		ToolWrapper: func(in []kit.Tool) []kit.Tool { return in },
-	}
-	if cfg.ToolWrapper == nil {
-		t.Fatal("ToolWrapper assignment failed")
+// TestToolSliceSignature documents that the kit.Tool alias — used by every
+// SDK tool-related surface (Options.Tools, Options.ExtraTools, WithTools,
+// WithExtraTools, hook payloads) — is referenced under its SDK-owned name
+// in user code, without any fantasy import.
+func TestToolSliceSignature(t *testing.T) {
+	var tools []kit.Tool
+	tools = append(tools, kit.NewTool[struct{}]("noop", "",
+		func(_ context.Context, _ struct{}) (kit.ToolOutput, error) {
+			return kit.TextResult("ok"), nil
+		}))
+	if len(tools) != 1 {
+		t.Fatalf("unexpected tool slice length: %d", len(tools))
 	}
 }

@@ -63,6 +63,52 @@ func TestOptionFunctionsPlumbing(t *testing.T) {
 	}
 }

+// recordingDebugLogger is a kit.DebugLogger used to verify WithDebugLogger
+// plumbs the supplied logger into Options. It records each LogDebug call.
+type recordingDebugLogger struct {
+	enabled  bool
+	messages []string
+}
+
+func (l *recordingDebugLogger) LogDebug(m string)    { l.messages = append(l.messages, m) }
+func (l *recordingDebugLogger) IsDebugEnabled() bool { return l.enabled }
+
+// TestWithDebugLoggerPlumbing verifies that kit.WithDebugLogger assigns the
+// supplied logger to Options.DebugLogger. End-to-end propagation into the
+// engine is covered indirectly by the existing kitsetup tests; this test
+// pins the SDK-surface contract.
+func TestWithDebugLoggerPlumbing(t *testing.T) {
+	l := &recordingDebugLogger{enabled: true}
+	o := &kit.Options{}
+	kit.WithDebugLogger(l)(o)
+	if o.DebugLogger == nil {
+		t.Fatal("WithDebugLogger: expected Options.DebugLogger to be set")
+	}
+	if o.DebugLogger != l {
+		t.Error("WithDebugLogger: expected the supplied logger to be installed verbatim")
+	}
+	// Sanity: the installed logger satisfies the SDK interface contract.
+	if !o.DebugLogger.IsDebugEnabled() {
+		t.Error("installed logger IsDebugEnabled() returned false")
+	}
+	o.DebugLogger.LogDebug("hello")
+	if len(l.messages) != 1 || l.messages[0] != "hello" {
+		t.Errorf("LogDebug not forwarded; got %v", l.messages)
+	}
+}
+
+// TestWithDebugLoggerNilClears verifies that passing a nil logger to
+// WithDebugLogger clears any previously-installed logger. This lets later
+// options override earlier ones the same way WithModel / WithStreaming do.
+func TestWithDebugLoggerNilClears(t *testing.T) {
+	o := &kit.Options{}
+	kit.WithDebugLogger(&recordingDebugLogger{enabled: true})(o)
+	kit.WithDebugLogger(nil)(o)
+	if o.DebugLogger != nil {
+		t.Errorf("WithDebugLogger(nil): expected DebugLogger to be cleared; got %#v", o.DebugLogger)
+	}
+}
+
 // TestOptionOrderingOverrides verifies later options override earlier ones.
 func TestOptionOrderingOverrides(t *testing.T) {
 	o := &kit.Options{}
@@ -93,3 +93,21 @@ result, err := host.PromptResult(ctx, "Count files")
 fmt.Println(result.Response)
 fmt.Println(result.Usage.TotalTokens)
 ```
+
+`PromptResult` blocks until end-of-turn regardless of streaming mode. When
+streaming is enabled, every delta observed during the turn is also captured in
+order in `result.Stream` (`[]kit.StreamEvent`), so you can assert streamed
+ordering deterministically without wiring an `OnMessageUpdate` collector:
+
+```go
+for _, ev := range result.Stream {
+    switch ev.Kind {
+    case kit.StreamEventTextDelta:
+        fmt.Print(ev.Text)
+    case kit.StreamEventReasoningDelta:
+        fmt.Print(ev.Reasoning)
+    case kit.StreamEventToolCallChunk:
+        fmt.Printf("[%s %s]", ev.ToolName, ev.Args)
+    }
+}
+```
@@ -67,14 +67,61 @@ kit --skill path/to/skill.md "prompt"
 # Load multiple skill files or directories (flag is repeatable)
 kit --skill ./skill1.md --skill ./skill2.md "prompt"

-# Load all skills from a custom directory instead of the default locations
+# Scan a directory directly for skills (overrides auto-discovery)
 kit --skills-dir /path/to/skills "prompt"

+# Hide a skill from the model catalog by name (still usable via /skill:)
+kit --skill-disable noisy-skill "prompt"
+
 # Disable all skill loading (auto-discovery and explicit)
 kit --no-skills "prompt"
 ```

-Skills are auto-discovered from `~/.config/kit/skills/`, `.kit/skills/`, and `.agents/skills/` by default. Use `--skills-dir` to override the project-local search root, or `--skill` to load files explicitly (which disables auto-discovery). `--no-skills` suppresses all skill loading regardless of other flags.
+Skills follow the [agentskills.io](https://agentskills.io/specification) convention. They are auto-discovered from four canonical scopes:
+
+| Scope | Location |
+|-------|----------|
+| User-level (cross-client) | `~/.agents/skills/` |
+| User-level (Kit) | `~/.config/kit/skills/` (honors `$XDG_CONFIG_HOME`) |
+| Project-local (cross-client) | `<project>/.agents/skills/` |
+| Project-local (Kit) | `<project>/.kit/skills/` |
+
+When two skills share the same `name`, the project-level one takes precedence over the user-level one. Use `--skills-dir` to scan one directory directly instead (it is **not** treated as a parent of `.agents`/`.kit` — the directory itself is scanned). `--skill` loads files explicitly (which disables auto-discovery), and `--no-skills` suppresses all skill loading regardless of other flags.
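Reviewer sketch: the precedence rule above (project-level beats user-level on a `name` collision) can be modelled as a single pass over the discovered skills. This is a standalone illustration. The `skill` and `scope` types and the tie-break within one scope (first seen wins) are assumptions, not Kit's actual loader.

```go
package main

import (
	"fmt"
	"sort"
)

// scope ranks where a skill came from; higher values take precedence.
type scope int

const (
	userScope scope = iota
	projectScope
)

// skill is a hypothetical, trimmed-down stand-in for a loaded skill.
type skill struct {
	Name  string
	Path  string
	Scope scope
}

// dedupe keeps one skill per name, preferring the higher-ranked scope.
// Within the same scope the first one seen is kept (an assumption).
func dedupe(skills []skill) []skill {
	byName := make(map[string]skill, len(skills))
	for _, s := range skills {
		prev, seen := byName[s.Name]
		if !seen || s.Scope > prev.Scope {
			byName[s.Name] = s
		}
	}
	out := make([]skill, 0, len(byName))
	for _, s := range byName {
		out = append(out, s)
	}
	// Sort for a stable catalog order regardless of map iteration.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	got := dedupe([]skill{
		{Name: "lint", Path: "~/.agents/skills/lint.md", Scope: userScope},
		{Name: "lint", Path: "repo/.kit/skills/lint.md", Scope: projectScope},
		{Name: "pdf", Path: "~/.config/kit/skills/pdf.md", Scope: userScope},
	})
	for _, s := range got {
		fmt.Println(s.Name, s.Path)
	}
}
```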
+
+Disabled skills (`--skill-disable`, the `skill-disable` config key, or `disable-model-invocation: true` in a skill's frontmatter) are hidden from the model-facing `<available_skills>` catalog but remain available for explicit activation via the `/skill:<name>` command.
+
+### Skill frontmatter
+
+A skill is a markdown file (`SKILL.md` in a directory, or a standalone `.md`/`.txt` file) with optional YAML frontmatter. Kit reads the full [agentskills.io](https://agentskills.io/specification) field set plus two Kit-specific extensions:
+
+```yaml
+---
+name: pdf-extractor                 # required
+description: Use when extracting tables from PDFs   # required (drives model discovery)
+license: MIT                        # optional, SPDX identifier
+compatibility: claude-code, cursor  # optional, targeted environments
+allowed-tools: read, bash           # optional (experimental) tool restriction
+disable-model-invocation: false     # optional; true hides from the catalog
+metadata:                           # optional arbitrary key/value pairs
+  author: you
+tags: [pdf, data]                   # Kit extension
+when: on-demand                     # Kit extension
+---
+```
+
+`name` and `description` are required — a skill missing its description is skipped with a logged warning, since the description is the sole basis on which the model decides relevance. Descriptions are XML-escaped before they enter the catalog, so characters like `<`, `>`, and `&` are safe. A skill directory may bundle `scripts/`, `references/`, and `assets/` subdirectories; when a skill is activated those files are enumerated in a `<skill_resources>` block so the model knows what it can read.
+
+### Project trust prompt
+
+Because project-local skills are injected into the system prompt, entering a repository that ships `.agents/skills/` or `.kit/skills/` for the first time prompts you to trust it before any project skill loads — a safeguard against a freshly cloned, untrusted repo smuggling instructions into the agent:
+
+```
+This project provides 2 skills under .agents/skills or .kit/skills:
+  /path/to/repo
+Load them into the agent? [t]rust always / [o]nce / [s]kip (default skip):
+```
+
+Choosing **trust always** persists the directory to `~/.config/kit/trusted-projects.json` so you are not asked again. The prompt is skipped (skills load silently) in non-interactive runs — when a prompt is passed positionally, `--quiet` is set, or stdin is not a TTY.

 ## GitHub integration

@@ -53,7 +53,8 @@ These flags control Kit's behavior. When a prompt is passed as a positional argu
 | Flag | Short | Default | Description |
 |------|-------|---------|-------------|
 | `--skill` | — | — | Load skill file or directory (repeatable) |
-| `--skills-dir` | — | — | Override the project-local skills directory for auto-discovery |
+| `--skills-dir` | — | — | Scan this directory directly for skills (overrides auto-discovery) |
+| `--skill-disable` | — | — | Hide a skill from the model catalog by name (repeatable); still usable via `/skill:` |
 | `--no-skills` | — | `false` | Disable skill loading (auto-discovery and explicit) |

 ## Generation parameters
@@ -49,7 +49,8 @@ stream: true
 | `prompt-template` | string | — | Specific template to load by name |
 | `no-skills` | bool | `false` | Disable skill loading (auto-discovery and explicit) |
 | `skill` | list | — | Explicit skill files or directories to load (disables auto-discovery) |
-| `skills-dir` | string | — | Override the project-local directory used for skill auto-discovery |
+| `skills-dir` | string | — | Scan this directory directly for skills (overrides auto-discovery; not treated as a parent of `.agents`/`.kit`) |
+| `skill-disable` | list | — | Skill names to hide from the model catalog (still usable via `/skill:`) |

 ## Environment variables

@@ -151,6 +152,7 @@ customModels:
    name: "My Custom Model"
    baseUrl: "http://localhost:8080/v1"
    apiKey: "my-secret-key"
+    apiModelName: "gpt-4-turbo"
    reasoning: true
    temperature: true
    cost:
@@ -168,6 +170,7 @@ customModels:
 | `name` | string | Yes | Display name for the model |
 | `baseUrl` | string | No | Per-model base URL override; when set, `--provider-url` is not required |
 | `apiKey` | string | No | Per-model API key override |
+| `apiModelName` | string | No | Overrides the model identifier sent in API requests; defaults to the config key |
 | `reasoning` | bool | No | Whether the model supports reasoning/thinking |
 | `temperature` | bool | No | Whether the model supports temperature adjustment |
 | `cost.input` | float | No | Cost per 1K input tokens |
@@ -447,11 +447,14 @@ Load and inject skills dynamically at runtime:
 ```go
 // Discover skills from standard locations
 result := ctx.DiscoverSkills()  // ext.SkillLoadResult{Skills, Error}
-// Standard locations: ~/.config/kit/skills/, .kit/skills/, .agents/skills/
+// Standard locations: ~/.agents/skills/, ~/.config/kit/skills/,
+//                     <project>/.agents/skills/, <project>/.kit/skills/

 // Load a specific skill file
 skill, err := ctx.LoadSkill("/path/to/skill.md")  // (*ext.Skill, error string)
-// skill.Name, skill.Description, skill.Content, skill.Tags, skill.When
+// Spec fields: skill.Name, skill.Description, skill.License, skill.Compatibility,
+//              skill.Metadata, skill.AllowedTools, skill.DisableModelInvocation
+// Plus content/path and Kit extensions: skill.Content, skill.Path, skill.Tags, skill.When

 // Load all skills from a directory
 result := ctx.LoadSkillsFromDir("/path/to/skills")  // ext.SkillLoadResult
@@ -176,12 +176,30 @@ Lower values run first. First non-nil result wins.
 | `SourceEvent` | `OnSource` | LLM referenced a source (e.g., web search) |
 | `ErrorEvent` | `OnError` | Agent-level error during streaming |
 | `RetryEvent` | `OnRetry` | LLM request retried after transient error |
-| `CompactionEvent` | `OnCompaction` | Conversation compacted |
+| `CompactionEvent` | `OnCompaction` | Compaction attempt finished (fires on success **and** failure; check `Err`) |
 | `SteerConsumedEvent` | `OnSteerConsumed` | Steering messages injected into turn |
 | `PasswordPromptEvent` | — | Sudo command needs password (respond via `ResponseCh`) |

 > **Note:** `OnStreaming` is a deprecated alias for `OnMessageUpdate` and will be removed in a future release.

+### Compaction telemetry
+
+`CompactionEvent` fires after every compaction attempt. On success `Err` is
+`nil` and the summary/token/file fields are populated; on failure `Err` is
+non-nil and the rest are zero-valued. This lets you wire symmetric
+start/end lifecycle telemetry without hand-rolling the failure path:
+
+```go
+host.OnCompaction(func(e kit.CompactionEvent) {
+    if e.Err != nil {
+        log.Printf("compaction failed: %v", e.Err)
+        return
+    }
+    log.Printf("compacted %d → %d tokens (%d messages removed)",
+        e.OriginalTokens, e.CompactedTokens, e.MessagesRemoved)
+})
+```
+
 ## Subagent event monitoring

 Monitor real-time events from LLM-initiated subagents (when the model uses the `subagent` tool):
@@ -31,6 +31,7 @@ host, err := kit.New(ctx, &kit.Options{
    Streaming:    ptrBool(true), // *bool: nil = unset (default true), &false = off
    Quiet:        true,
    Debug:        true,
+    DebugLogger:  myLogger,       // optional; overrides Debug + built-in logger when non-nil

    // Generation parameters (override env/config/per-model defaults)
    MaxTokens:        16384,              // 0 = auto-resolve; non-zero suppresses right-sizing
@@ -64,9 +65,10 @@ host, err := kit.New(ctx, &kit.Options{
    AutoCompact:  true,

    // Skills
-    Skills:       []string{"/path/to/skill.md"},
-    SkillsDir:    "/path/to/skills/",
-    NoSkills:     true,
+    Skills:        []string{"/path/to/skill.md"},
+    SkillsDir:     "/path/to/skills/",
+    SkillsDisable: []string{"noisy-skill"},
+    NoSkills:      true,

    // Feature toggles
    NoExtensions:   true,               // disable Yaegi extension loading
@@ -103,7 +105,8 @@ host, err := kit.New(ctx, &kit.Options{
 | `MaxSteps` | `int` | `0` | Max agent steps (0 = unlimited) |
 | `Streaming` | `*bool` | `nil` | Enable streaming output. `nil` leaves it to the precedence chain (env → config → default `true`); `&true`/`&false` forces it. Pointer so unset is distinct from explicit `false`. |
 | `Quiet` | `bool` | `false` | Suppress output |
-| `Debug` | `bool` | `false` | Enable debug logging |
+| `Debug` | `bool` | `false` | Enable debug logging via the built-in console / buffered logger. Ignored when `DebugLogger` is non-nil. |
+| `DebugLogger` | `DebugLogger` | `nil` | Caller-supplied logger that receives low-level debug output from the engine and MCP tool plumbing. When non-nil it overrides `Debug`, and the supplied logger's `IsDebugEnabled()` controls downstream emission. See [Custom debug logger](#custom-debug-logger). |

 ### Generation parameters

@@ -168,13 +171,48 @@ when embedding Kit as a library.
 |-------|------|---------|-------------|
 | `SkipConfig` | `bool` | `false` | Skip `.kit.yml` file loading (viper defaults + env vars still apply) |
 | `Skills` | `[]string` | — | Explicit skill files/dirs to load |
-| `SkillsDir` | `string` | — | Override default skills directory |
+| `SkillsDir` | `string` | — | Scan exactly this directory for skills; replaces auto-discovery of the standard locations |
+| `SkillsDisable` | `[]string` | — | Skill names to hide from the model catalog (still usable via `/skill:`) |
+| `SkillTrustPrompt` | `func(projectDir string, skillCount int) TrustDecision` | `nil` | Callback gating project-local skill loading on a trust decision (see below) |
 | `NoSkills` | `bool` | `false` | Disable skill loading entirely |

+#### Project-skill trust gate
+
+Project-local skills (under `<project>/.agents/skills/` or `<project>/.kit/skills/`)
+are injected into the system prompt, so loading them from an untrusted, freshly
+cloned repository is a prompt-injection vector. Set `SkillTrustPrompt` to gate
+that first load on an explicit decision. When `nil` (the default), project
+skills load without prompting — preserving historical behaviour.
+
+```go
+opts := &kit.Options{
+    SkillTrustPrompt: func(projectDir string, skillCount int) kit.TrustDecision {
+        // Consult your own UI / policy here.
+        if userApproves(projectDir, skillCount) {
+            return kit.TrustProject     // load and persist the directory as trusted
+        }
+        return kit.SkipProjectSkills    // do not load project skills
+    },
+}
+```
+
+The callback returns one of three `TrustDecision` values:
+
+| Decision | Effect |
+|----------|--------|
+| `kit.TrustProject` | Load project skills and persist `projectDir` to `~/.config/kit/trusted-projects.json` (not prompted again) |
+| `kit.TrustProjectOnce` | Load project skills for this run only, without persisting |
+| `kit.SkipProjectSkills` | Do not load project skills |
+
+A directory already on the persisted allowlist is trusted without invoking the
+callback. The Kit CLI wires this to an interactive terminal prompt automatically
+for TTY sessions.
+
 These fields only control the **initial** skill and context-file set picked
 up by `New()`. To add, remove, or replace skills and `AGENTS.md`-style
 context files at runtime (e.g. per user or per session), use the
-`AddSkill` / `LoadAndAddSkill` / `RemoveSkill` / `SetSkills` and
+`AddSkill` / `LoadAndAddSkill` / `RemoveSkill` / `SetSkills` /
+`DisableSkill` / `EnableSkill` and
 `AddContextFile` / `AddContextFileContent` / `RemoveContextFile` /
 `SetContextFiles` methods on `*kit.Kit`. See
 [Runtime skills and context files](/sdk/overview#runtime-skills-and-context-files).
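
The trust gate described above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not Kit's implementation: `TrustDecision` and its constants are local stand-ins for the `kit` values, `gate` is a hypothetical helper, and the `allowlist` map stands in for the persisted `trusted-projects.json` file.

```go
package main

import "fmt"

// TrustDecision mirrors the three documented outcomes. These are local
// stand-ins so the sketch compiles without importing Kit.
type TrustDecision int

const (
	SkipProjectSkills TrustDecision = iota
	TrustProjectOnce
	TrustProject
)

// gate reports whether project skills should load. allowlist is a
// hypothetical in-memory substitute for the persisted allowlist.
func gate(allowlist map[string]bool, dir string, count int,
	prompt func(dir string, count int) TrustDecision) bool {
	if prompt == nil || allowlist[dir] {
		// No gate configured, or the directory is already trusted:
		// the callback is not invoked.
		return true
	}
	switch prompt(dir, count) {
	case TrustProject:
		allowlist[dir] = true // remember the decision for later runs
		return true
	case TrustProjectOnce:
		return true // load now, persist nothing
	default:
		return false
	}
}

func main() {
	allow := map[string]bool{}
	calls := 0
	prompt := func(string, int) TrustDecision { calls++; return TrustProject }

	fmt.Println(gate(allow, "/repo", 3, prompt), calls) // first run prompts
	fmt.Println(gate(allow, "/repo", 3, prompt), calls) // second run hits the allowlist
}
```

In this sketch the second call returns without incrementing `calls`, which is the "trusted without invoking the callback" behaviour the section describes.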
@@ -346,6 +384,45 @@ loaded MCP server that advertises the corresponding capability.
 Context cancellation also works end-to-end: cancelling the `ctx` passed to a
 tool execution triggers a best-effort `tasks/cancel` before the call returns.

+## Custom debug logger
+
+Kit's engine and MCP tool plumbing emit low-level debug output through a
+`DebugLogger` interface. By default, setting `Debug: true` (or calling
+`WithDebug()`) installs the built-in console logger. To route the same output
+into your application's logging system instead, provide a custom
+implementation via `Options.DebugLogger` or `WithDebugLogger`.
+
+```go
+type DebugLogger interface {
+    LogDebug(message string)
+    IsDebugEnabled() bool
+}
+```
+
+When `DebugLogger` is non-nil it takes precedence over `Debug` — the
+supplied logger's `IsDebugEnabled()` reports whether downstream code should
+bother formatting messages.
+
+**Example: forward to `log/slog`:**
+
+```go
+import "log/slog"
+
+type slogDebugLogger struct{ l *slog.Logger }
+
+func (s *slogDebugLogger) LogDebug(m string)    { s.l.Debug(m) }
+func (s *slogDebugLogger) IsDebugEnabled() bool { return true }
+
+host, _ := kit.NewAgent(ctx,
+    kit.WithModel("anthropic/claude-sonnet-4-5-20250929"),
+    kit.WithDebugLogger(&slogDebugLogger{l: slog.Default()}),
+)
+```
+
+Implementations must be safe for concurrent use — messages can arrive
+from the engine goroutine, MCP connection pool, and tool execution paths
+simultaneously.
+
 ## Precedence

 For any given generation or provider field, the effective value is resolved
@@ -80,6 +80,7 @@ Available options:
 | `WithProviderURL(string)` | `Options.ProviderURL` |
 | `WithConfigFile(string)` | `Options.ConfigFile` |
 | `WithDebug()` | `Options.Debug = true` |
+| `WithDebugLogger(DebugLogger)` | `Options.DebugLogger` (route engine + MCP debug output into a custom logger; overrides `WithDebug` when set) |
 | `Ephemeral()` | `Options.NoSession = true` |

 Options are applied in order, so later options override earlier ones. `Option`
@@ -129,12 +130,36 @@ The SDK provides several prompt variants:
 | Method | Description |
 |--------|-------------|
 | `Prompt(ctx, message)` | Simple prompt, returns response string |
-| `PromptWithOptions(ctx, message, opts)` | With per-call options |
+| `PromptWithOptions(ctx, message, opts)` | With per-call options (model, tools, thinking level, provider creds) |
 | `PromptResult(ctx, message)` | Returns full `TurnResult` with usage stats |
+| `PromptResultWithOptions(ctx, message, opts)` | Per-call options variant that returns the full `TurnResult` |
 | `PromptResultWithFiles(ctx, message, files)` | Multimodal with file attachments |
 | `Steer(ctx, instruction)` | System-level steering without user message |
 | `FollowUp(ctx, text)` | Continue without new user input |

+### Per-call overrides
+
+`PromptOptions` scopes configuration to a **single call** and restores the
+agent's prior state afterwards — no need to rebuild a `*Kit` per request. This
+suits multi-tenant hosts that resolve the model, credentials, or tool set per
+request:
+
+```go
+result, err := host.PromptResultWithOptions(ctx, "Summarise this ticket", kit.PromptOptions{
+    SystemMessage:  "You are a concise triage assistant.", // prepended for this call
+    Model:          "anthropic/claude-3-5-haiku-20241022", // overrides the default model
+    ThinkingLevel:  "low",                                 // "off" | "low" | "medium" | "high"
+    ExtraTools:     []kit.Tool{lookupTool},                // added on top of the core set
+    ProviderURL:    "https://proxy.tenant-a/v1",           // per-tenant endpoint
+    ProviderAPIKey: tenantKey,                             // per-tenant credential
+})
+```
+
+Every field is optional; a zero value means "use the agent's default." The
+prior model, thinking level, provider credentials, and tool set are all
+restored before the call returns, and concurrent option-driven prompts are
+serialized so the apply/restore window of one call never races another.
+
 ## Custom tools

 Create custom tools with `kit.NewTool`. The JSON schema is auto-generated from the input struct — no external dependencies required:
@@ -175,6 +200,64 @@ Binary data (images, audio, etc.) in `ToolOutput.Data` is automatically forwarde

 Use `kit.NewParallelTool` for tools that are safe to run concurrently. Use `kit.ToolCallIDFromContext(ctx)` to retrieve the LLM-assigned call ID for logging or tracing.

+### Schema-driven tools
+
+When the tool's input shape isn't known at compile time — tools sourced from
+JSON Schema definitions in skill files, MCP server catalogs, or user-supplied
+definitions — use `kit.NewRawTool`. It takes a JSON Schema and a handler that
+receives the decoded arguments as a `map[string]any`, so no Go input type is
+required:
+
+```go
+schema := map[string]any{
+    "type": "object",
+    "properties": map[string]any{
+        "city": map[string]any{"type": "string", "description": "City name"},
+    },
+    "required": []any{"city"},
+}
+
+weatherTool := kit.NewRawTool("get_weather", "Get current weather for a city", schema,
+    func(ctx context.Context, args map[string]any) (kit.ToolOutput, error) {
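+        city, ok := args["city"].(string) // comma-ok form: a missing or non-string value must not panic
+        if !ok {
+            return kit.ToolOutput{}, errors.New("city must be a string")
+        }
+        return kit.TextResult("72°F, sunny in " + city), nil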
+    },
+)
+```
+
+The `schema` is advertised to the model as the tool's parameter schema. If the
+model sends arguments that aren't a valid JSON object, the call short-circuits
+with an error result before your handler runs.
+
+### Halting the agent loop
+
+For structured-result patterns — the model calls a `finish(...)` tool with a
+typed argument and the loop should terminate, returning that value to the
+caller — set `Halt` and `FinalValue` on the returned `ToolOutput` instead of
+smuggling the value out through a side-channel:
+
+```go
+finishTool := kit.NewTool("finish", "Return the final structured answer",
+    func(ctx context.Context, input AnswerInput) (kit.ToolOutput, error) {
+        return kit.ToolOutput{
+            Content:    "done",
+            Halt:       true,       // terminate the agent loop after this call
+            FinalValue: input,      // surfaced to the caller
+        }, nil
+    },
+)
+
+result, err := host.PromptResult(ctx, "Extract the order details")
+if err == nil && result.HaltedByTool == "finish" {
+    answer := result.FinalValue.(AnswerInput) // the typed value your handler stored
+    _ = answer
+}
+```
+
+`TurnResult.HaltedByTool` names the tool that halted the turn (empty if the
+turn ended for any other reason), and `TurnResult.FinalValue` carries whatever
+your handler placed in `ToolOutput.FinalValue`. `Halt`/`FinalValue` work with
+`NewTool`, `NewParallelTool`, and `NewRawTool` alike.
+
 ## Generation & provider overrides

 SDK consumers can configure generation parameters and provider endpoints
@@ -299,6 +382,11 @@ host.LoadAndAddContextFile("/etc/agents/tenant-acme.md")
 host.RemoveSkill("polite-french")
 host.RemoveContextFile(fmt.Sprintf("session://%s/AGENTS.md", userID))

+// Hide a skill from the model-facing catalog without unloading it — it stays
+// available for explicit /skill: activation. EnableSkill reverses this.
+host.DisableSkill("refund-policy")
+host.EnableSkill("refund-policy")
+
 // Or replace the whole set in one call.
 host.SetSkills(activeSkillsForUser)
 host.SetContextFiles(activeContextForUser)
@@ -324,9 +412,35 @@ Key points:
  from multiple goroutines; the underlying state is guarded by an internal
  `RWMutex`.
 - **Init-time options still apply.** `Options.Skills`, `Options.SkillsDir`,
-  `Options.NoSkills`, and `Options.NoContextFiles` continue to control the
-  startup set; the runtime API mutates from whatever state `New()` produced.
+  `Options.SkillsDisable`, `Options.SkillTrustPrompt`, `Options.NoSkills`, and
+  `Options.NoContextFiles` continue to control the startup set; the runtime API
+  mutates from whatever state `New()` produced.
  See [SDK options](/sdk/options#skills--configuration).
+- **Auto-discovery scopes.** When no explicit `Skills`/`SkillsDir` are given,
+  `New()` scans four [agentskills.io](https://agentskills.io/specification)
+  locations: `~/.agents/skills/`, `~/.config/kit/skills/`,
+  `<project>/.agents/skills/`, and `<project>/.kit/skills/`. Project-level
+  skills override user-level skills of the same `name`. Skills missing a
+  `description` are skipped with a logged warning, and a skill's
+  `disable-model-invocation: true` (or `Options.SkillsDisable`) hides it from
+  the catalog while keeping it available for explicit activation.
+- **Skill helpers.** A `kit.Skill` exposes `BaseDir()` (its directory) and
+  `Resources()` (the files bundled under `scripts/`, `references/`, and
+  `assets/`), which power the `<skill_resources>` enumeration shown when a skill
+  is activated.
+- **`fs.FS`-backed discovery.** The package-level loaders `kit.LoadSkill`,
+  `kit.LoadSkillsFromDir`, and `kit.LoadSkills` are path-string based;
+  `kit.LoadSkillsFromFS(fsys, root)` is the `fs.FS`-typed counterpart for
+  `embed.FS` distribution, `fstest.MapFS` tests, or per-tenant virtual
+  filesystems. Feed the result into `host.SetSkills(...)`:
+
+  ```go
+  //go:embed skills
+  var skillsFS embed.FS
+
+  loaded, _ := kit.LoadSkillsFromFS(skillsFS, "skills")
+  host.SetSkills(loaded)
+  ```
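
The precedence and skip rules in the auto-discovery bullet above can be sketched as a merge over scopes. This is a hedged illustration only: the `skill` struct, its `Scope` field, and the `merge` helper are hypothetical and much smaller than Kit's real `Skill` type and loader.

```go
package main

import "fmt"

// skill is a hypothetical, trimmed-down record used only for this sketch.
type skill struct {
	Name, Description, Scope string // Scope is "user" or "project" here
}

// merge keeps one skill per name. Later scopes win, so passing user skills
// first and project skills second yields project-over-user precedence.
func merge(scopes ...[]skill) map[string]skill {
	out := map[string]skill{}
	for _, scope := range scopes {
		for _, s := range scope {
			if s.Description == "" {
				// A description is required; skip with a warning.
				fmt.Printf("warning: skipping %q: missing description\n", s.Name)
				continue
			}
			out[s.Name] = s
		}
	}
	return out
}

func main() {
	user := []skill{
		{"deploy", "User-wide deploy steps", "user"},
		{"notes", "", "user"}, // no description: skipped
	}
	project := []skill{{"deploy", "Repo-specific deploy steps", "project"}}

	got := merge(user, project)
	fmt.Println(len(got), got["deploy"].Scope)
}
```

Ordering the arguments from lowest to highest precedence keeps the rule in one place: whichever scope is passed last wins a name collision.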

 ## MCP prompts and resources

@@ -382,6 +496,55 @@ if host.ShouldCompact() {
 }
 ```

+## Provider error classification
+
+Provider failures are wrapped with exported sentinels so you can branch on the
+failure category with `errors.Is` instead of string-matching the underlying
+HTTP error. `PromptResult` / `Prompt` already return classified errors; you can
+also classify any provider error yourself with `kit.ClassifyProviderError`:
+
+```go
+_, err := host.PromptResult(ctx, prompt)
+switch {
+case errors.Is(err, kit.ErrContextOverflow):
+    host.Compact(ctx, nil, "") // compact and retry
+case errors.Is(err, kit.ErrRateLimit):
+    backoffAndRetry()
+case errors.Is(err, kit.ErrAuth):
+    rePromptForKey()
+case errors.Is(err, kit.ErrProviderUnavailable):
+    retryLater()
+case errors.Is(err, kit.ErrInvalidRequest):
+    log.Printf("non-retryable: %v", err)
+}
+```
+
+| Sentinel | Meaning |
+|----------|---------|
+| `kit.ErrContextOverflow` | Request exceeded the model's context window |
+| `kit.ErrRateLimit` | Provider throttled the request |
+| `kit.ErrAuth` | Credential / authorization failure |
+| `kit.ErrProviderUnavailable` | Transient upstream failure (5xx, network, timeout) |
+| `kit.ErrInvalidRequest` | Structurally invalid request — retrying won't help |
+
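+The original error stays in the chain, so `errors.Is` and `errors.As` still
+match it and the provider's detail message is not lost. `errors.Unwrap` does
+not traverse the wrapper, so avoid relying on it here.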
+
+## Graceful shutdown
+
+`Close()` releases MCP connections, model resources, and the session file
+handle. When shutdown must be bounded by a deadline, use `CloseContext`:
+
+```go
+shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+defer cancel()
+if err := host.CloseContext(shutdownCtx); err != nil {
+    log.Printf("shutdown: %v", err)
+}
+```
+
+`Close()` is equivalent to `CloseContext(context.Background())`.
+
 ## In-process subagents

 Spawn child Kit instances without subprocess overhead:
@@ -99,6 +99,12 @@ host, _ := kit.New(ctx, &kit.Options{
 })
 ```

-The interface requires methods for message storage, branching, compaction, extension data, and lifecycle management. See the [SDK skill reference](https://github.com/mark3labs/kit) for the complete interface definition.
+The interface requires methods for message storage, branching, compaction, branch summaries, extension data, and lifecycle management. See the [`SessionManager` interface definition](https://pkg.go.dev/github.com/mark3labs/kit/pkg/kit#SessionManager) for the complete method set.
+
+The `AppendBranchSummary(fromID, summary)` method backs `host.CollapseBranch`,
+which collapses a branch range into a single summary entry. Custom managers
+that don't track branch summaries can return `kit.ErrBranchSummaryNotSupported`
+from that method; `host.CollapseBranch` then surfaces the same sentinel so
+callers can detect it with `errors.Is`.

 When using a custom `SessionManager`, the `SessionPath`, `Continue`, and `NoSession` options are ignored — your manager handles its own storage and session selection.
Author	SHA1	Message	Date
Ed Zynda	cea82ea2d0	feat(skills): full agentskills.io spec compliance (#71 ) * feat(skills): full agentskills.io spec compliance - escape catalog XML and drop file:// prefix on <location> - skip skills missing a required description; add Skill.Validate - add license/compatibility/metadata/allowed-tools/disable-model-invocation frontmatter fields plus a malformed-YAML (unquoted colon) fallback - scan ~/.agents/skills and dedupe by name with project>user precedence - treat --skills-dir as a direct directory; add --skill-disable + DisableSkill/EnableSkill SDK methods - enumerate bundled resources via <skill_resources> on activation - add activate_skill MCP tool with enum-constrained name and session dedup - protect activated skill content from compaction pruning - gate project-local skills on a persisted trust allowlist via SkillTrustPrompt and an interactive CLI prompt - document new fields, flags, and SDK surface across README and docs site Fixes #65 * fix(skills): address skill loading and activation review findings - log (instead of discard) genuine errors from skill directory loads so permission/read failures no longer yield a silently partial catalog - make activate_skill dedup atomic by holding the lock across check and mark, preventing concurrent double-activation - reject activation of disable-model-invocation skills in the tool's runtime lookup, mirroring their catalog/enum exclusion - add regression test for disabled-skill activation	2026-06-18 19:37:53 +03:00
Ed Zynda	dd7ae41a58	feat(sdk): harden pkg/kit embedder surface with scoped additions (#69 ) * feat(sdk): harden pkg/kit embedder surface with scoped additions Implements 11 of the 14 items from the embedder SDK hardening request, all additive (no existing signature changes): - guard config.configPath global with a mutex; add GetConfigPath - extend PromptOptions with per-call model/thinking/tools/provider overrides and add PromptResultWithOptions - capture per-turn stream deltas in TurnResult.Stream ([]StreamEvent) - add ToolOutput.Halt/FinalValue, surfaced as TurnResult.HaltedByTool/ FinalValue for structured-result extraction - add provider-error sentinels + ClassifyProviderError, wired into the turn error path - add NewRawTool schema-driven untyped tool constructor - add CompactionEvent.Err and emit it on the compaction failure path - promote AppendBranchSummary onto SessionManager with ErrBranchSummaryNotSupported; drop the type-assertion fallback - add CloseContext for deadline-bounded shutdown - add ResetForTesting behind the testing build tag - add LoadSkillsFromFS (fs.FS-typed skill loader) Includes unit tests and docs-site coverage for each addition. 
Refs #68 * fix(sdk): address review feedback on embedder hardening - skills: aggregate load errors as []error with errors.Join + %w so the chain stays inspectable via errors.Is/As (both LoadSkillsFromDir and LoadSkillsFromFS) - errors: correct ClassifyProviderError godoc — sentinelError implements Unwrap() []error, which errors.Unwrap does not traverse; reference errors.Is instead (also fixed in the docs site) - kit: use restoreViperString on the applyPromptOptions error path so previously-unset config keys are cleared, not forced to "" - kit: roll back the per-call model override with context.Background() so a canceled caller ctx can't leak the override into later calls - sessions: document that CollapseBranch's toID is unused and the branch always collapses to the current leaf - docs: retarget the SessionManager link to its pkg.go.dev definition Refs #68 * fix(sdk): harden NewRawTool null-arg guard against whitespace Normalise the raw input with bytes.TrimSpace before the null/empty check so " null ", "\tnull\n", etc. follow the same skip-unmarshal path as the bare "null" and the handler always receives a non-nil empty map. Removes an implicit dependency on fantasy trimming the value during its RawMessage decode. Adds a contract test for the null-variant inputs. Refs #68	2026-06-18 18:18:54 +03:00
Ed Zynda	bd56f4a089	refactor(sdk): drop unreachable kit.AgentConfig surface (#67 ) * feat(sdk): add Options.DebugLogger and WithDebugLogger option Today the SDK exposes a DebugLogger interface (pkg/kit/types.go) but no public path to install one — the only consumer of the field is the unexported kit.AgentConfig.toInternal() method, which itself is not reachable from outside the package. As a result, embedders that want to forward Kit's low-level engine + MCP tool plumbing debug output into their own logging system (slog, zap, charm/log, an in-app TUI panel, etc.) have no option but the on/off Debug bool, which always installs the built-in SimpleDebugLogger / BufferedDebugLogger. This change closes that gap on the supported Options / functional-option construction path: - pkg/kit/kit.go: add Options.DebugLogger DebugLogger. When non-nil it is used directly and the Debug bool is ignored; the supplied logger's IsDebugEnabled() controls whether downstream code emits messages. - pkg/kit/options.go: add WithDebugLogger(l DebugLogger) Option. - internal/kitsetup/setup.go: add AgentSetupOptions.DebugLogger and switch SetupAgent's logger selection so the caller-supplied logger wins unconditionally; otherwise the existing Debug + UseBufferedLogger branch picks the built-in implementation. No behaviour change when DebugLogger is nil. - pkg/kit/kit.go: wire opts.DebugLogger into setupOpts so the New() path threads it through. - pkg/kit/viper_isolation_test.go: add TestWithDebugLoggerPlumbing and TestWithDebugLoggerNilClears covering the option-to-field contract and later-options-override semantics consistent with the other With* helpers. - pkg/kit/README.md: list WithDebugLogger in the helper inventory. Notes: - kit.DebugLogger and tools.DebugLogger are structurally identical (LogDebug(string) / IsDebugEnabled() bool), so the SDK value flows into the internal field without a conversion. 
- This is purely additive on the SDK surface and does not touch kit.AgentConfig — that field already carried a DebugLogger, but the AgentConfig path is unreachable from outside the package today. * refactor(sdk): drop unreachable kit.AgentConfig surface kit.AgentConfig (pkg/kit/types.go) and its toInternal converter were exposed as the documented "low-level / advanced consumer" path for agent construction, but the converter was unexported and not wired into any public constructor — neither New(Options) nor NewAgent(...Option) accept an AgentConfig. The only call sites were the dedicated agent_config_internal_test.go (same-package internal test) and two fantasy-import regression tests in types_test.go. Net effect today: no SDK consumer outside pkg/kit can populate or use kit.AgentConfig in any way. The type, the converter, the dedicated test file, and a chain of godoc cross-references all exist purely for their own sake — they don't enlarge what consumers can do, but they do enlarge the SDK's stability contract (every field becomes a public shape the internal agent layer can't refactor freely). The companion PR added Options.DebugLogger + WithDebugLogger so the last functional capability AgentConfig was documented to enable — installing a custom debug logger — is reachable through the supported construction path. With that wired, AgentConfig has no remaining purpose. Changes: - pkg/kit/types.go: remove the AgentConfig struct and its toInternal() method. Drop the now-unused internal/agent and internal/tools imports. Update the DebugLogger godoc to point at Options.DebugLogger and WithDebugLogger instead of AgentConfig. - pkg/kit/agent_config_internal_test.go: deleted (208 LOC). It exercised the unexported toInternal() method directly; with the method gone the test has no subject. 
- pkg/kit/types_test.go: rename TestAgentConfigNoFantasyImport to TestOptionsNoFantasyImport and rewrite it against Options (SystemPrompt, MaxSteps, Streaming, Tools, ExtraTools, DisableCoreTools, OnMCPServerLoaded). The original test also asserted ToolWrapper field semantics; that capability migrates to the hook system (OnBeforeToolCall / OnAfterToolResult), already covered by hooks_test.go, so the assertion is dropped with a pointer in the godoc. TestAgentConfigToolWrapperSignature replaced by TestToolSliceSignature, which still pins that []kit.Tool is the user-visible slice type for every tool-related SDK surface — the no-fantasy-import contract the original test guarded. - pkg/kit/mcp_tasks.go: update the MCPTaskConfig godoc to stop referencing AgentConfig. MCPTaskConfig stays — it is still emitted through Options.MCPTask fields and used as the engine-facing config type. - pkg/kit/README.md: drop the kit.AgentConfig line from the type inventory. internal/agent.AgentConfig is untouched and remains the internal construction shape. With the public type gone the internal one can evolve freely without breaking the SDK contract. Verification: - go build ./pkg/... ./internal/... ./cmd/... — clean - go vet ./pkg/... ./internal/... ./cmd/... — clean - go test -race -timeout 300s ./... — all packages pass * docs(sdk): document Options.DebugLogger and WithDebugLogger - README.md: add WithDebugLogger to the functional-options helper list - pkg/kit/README.md: expand the Debug row and add a DebugLogger row in the Options field summary - www/pages/sdk/overview.md: add WithDebugLogger to the helpers table with a note that it overrides WithDebug when set - www/pages/sdk/options.md: surface DebugLogger in the example, expand the Debug field description, add a DebugLogger row to the Core fields table, and add a "Custom debug logger" section with the interface signature and a log/slog adapter example --------- Co-authored-by: kit-agent <agent@local>	2026-06-18 14:46:03 +03:00
Ed Zynda	888c6c7953	chore(models): refresh embedded models database from models.dev - add GLM-5.2 across 9 providers (alibaba-token-plan-cn, baseten, cloudflare-workers-ai, fireworks-ai, neuralwatt, opencode-go, openrouter, venice, vercel) - add moonshotai/Kimi-K2.7-Code on baseten - drop deprecated neuralwatt models (MiniMax-M2.5, Devstral-Small-2-24B-Instruct-2512, gpt-oss-20b) - pick up new reasoning_options metadata on several models	2026-06-18 12:42:11 +03:00
Ed Zynda	a9d808eb9f	build(deps): bump go module dependencies - charm.land/fantasy v0.31.0 -> v0.32.0 - alecthomas/chroma/v2 v2.26.1 -> v2.27.0 - charmbracelet/openai-go to 20260617131321 - mark3labs/mcp-go v0.54.1 -> v0.55.0 - kaptinlin/jsonschema v0.8.0 -> v0.8.1 - pelletier/go-toml/v2 v2.3.1 -> v2.4.0 - google.golang.org/api v0.284.0 -> v0.285.0 - google.golang.org/genai v1.60.0 -> v1.61.0	2026-06-18 12:37:37 +03:00
Ed Zynda	d7948a64f3	fix(app): make ctx.NewSession wait for agent idle (#63 ) (#64 ) - Add ErrAgentBusy sentinel (shared between internal/app and internal/extensions) so callers can detect the busy condition with errors.Is instead of substring-matching the error message. - Add App.WaitForIdle(timeout) backed by a per-busy-cycle idleCh closed by a new setBusyLocked chokepoint; all busy transitions now route through it to keep the channel in sync with the busy flag. - Have RequestNewSessionFromExtension wait for idle (up to DefaultNewSessionIdleWait = 10m) instead of failing fast on IsBusy. This fixes the v0.79.0 phase-handoff race where OnAgentEnd fires from inside the agent loop, before drainQueue clears busy, so ctx.NewSession reliably failed with 'agent is busy'. - Expose ext.ErrAgentBusy to Yaegi via symbols.go. - Update NewSession godoc and phase-handoff example to document the new wait-then-send behavior. - Add regression tests covering already-idle, blocks-until-drain, timeout, zero-timeout, app-close, headless guard, and idleCh transitions. Fixes #63	2026-06-18 12:33:54 +03:00
Michal Hrušecký	d2e2e5e9b3	feat(models): add apiModelName field to custom model config (#59 ) * feat(models): add apiModelName field to custom model config Allows custom models to specify an alternative model name to send in API requests, distinct from the config key. Useful when a local or custom endpoint expects a different model identifier. Configures createCustomProvider to use modelInfo.APIModelName when calling p.LanguageModel(), falling back to the config key. * docs: document apiModelName field in custom model config	2026-06-17 17:17:50 +03:00
Ed Zynda	2c05280150	feat(ui): support /new <prompt> and ctx.NewSession for phase handoffs - /new now accepts an optional initial prompt that is submitted as the first user turn of the new session, with @file expansion mirroring normal input submission - Add ctx.NewSession(prompt) extension API for ending the current session and starting a fresh one from an extension (e.g. on AgentEnd) - Plumb the prompt through BeforeSessionSwitchEvent.InitialPrompt so extensions can inspect or veto the switch - Bridge extension calls into the TUI via app.NewSessionRequestEvent with a response channel so the caller observes success or failure - Add pkg/kit EmitBeforeSessionSwitchWithPrompt; keep the old method as a thin compatibility wrapper - Ship examples/extensions/phase-handoff.go demonstrating automatic session handoff on a <HANDOFF_READY> sentinel plus a /handoff command - Tests cover the new /new prompt path, the extension request event, and the before-hook cancellation flow	2026-06-17 17:16:24 +03:00