feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53) (#54)

* feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53)

Three additive primitives to the extension API:

- OnLLMUsage event: per-LLM-call token + cost deltas attributed to the
  specific model/provider used for each round-trip. Derived from the SDK
  StepFinishEvent in the extension bridge. Enables accurate budget
  enforcement between calls instead of only at turn boundaries.

- ctx.SetState / GetState / DeleteState / ListState: session-scoped,
  last-write-wins key-value store backed by a sidecar file
  (<session>.ext-state.json) outside the conversation tree. Reads are
  O(1), writes don't grow the JSONL, and the store is not duplicated on
  fork. State is preserved across hot-reloads.

- Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token
  deltas (input/output/cache-read/cache-write), CostDelta, and
  DurationMs populated by a per-turn aggregator. Existing handlers
  reading only Response/StopReason are unaffected.
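
As a rough illustration of the per-turn aggregator idea above, the sketch
below folds per-call usage into per-turn totals. The type and field names
(llmUsage, turnAggregator, recordLLMCall, recordToolCall) are hypothetical
stand-ins chosen for this example, not Kit's actual types, and it assumes a
single in-flight turn.

```go
package main

import (
	"fmt"
	"time"
)

// llmUsage is a hypothetical stand-in for one per-call usage payload.
type llmUsage struct {
	InputTokens  int
	OutputTokens int
	Cost         float64
}

// turnAggregator accumulates deltas for a single in-flight turn.
type turnAggregator struct {
	start        time.Time
	llmCalls     int
	toolNames    []string
	inputTokens  int
	outputTokens int
	cost         float64
}

func newTurnAggregator(now time.Time) *turnAggregator {
	return &turnAggregator{start: now}
}

// recordLLMCall adds one round-trip's token and cost deltas to the totals.
func (a *turnAggregator) recordLLMCall(u llmUsage) {
	a.llmCalls++
	a.inputTokens += u.InputTokens
	a.outputTokens += u.OutputTokens
	a.cost += u.Cost
}

// recordToolCall keeps tool names in call order, duplicates included.
func (a *turnAggregator) recordToolCall(name string) {
	a.toolNames = append(a.toolNames, name)
}

// durationMs reports elapsed wall-clock time since the turn started.
func (a *turnAggregator) durationMs(now time.Time) int64 {
	return now.Sub(a.start).Milliseconds()
}

func main() {
	start := time.Now()
	agg := newTurnAggregator(start)
	agg.recordLLMCall(llmUsage{InputTokens: 1200, OutputTokens: 80, Cost: 0.25})
	agg.recordToolCall("bash")
	agg.recordLLMCall(llmUsage{InputTokens: 1500, OutputTokens: 40, Cost: 0.5})
	fmt.Println(agg.llmCalls, agg.toolNames, agg.inputTokens, agg.outputTokens, agg.cost)
	fmt.Println(agg.durationMs(start.Add(1500 * time.Millisecond)))
}
```

An observer that only needs end-of-turn totals could read these values once,
while a budget enforcer would react inside recordLLMCall instead.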

Includes unit tests for the state store, LLMUsage registration,
enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path
derivation. Adds examples/extensions/usage-budget.go demoing all three
primitives together. Documents the additions in README, the docs site
(extensions overview, capabilities, examples), and the kit-extensions
and kit-sdk skill guides.

Fixes #53

* fix(extensions): address review feedback on state store and llmUsageMeta

- Serialize SetState/DeleteState saver invocations through a new saverMu
  so overlapping atomic-rename writes can no longer race on the shared
  .tmp file and persist an older snapshot after a newer one.
- LoadStateFromFile now clears the in-memory store when the sidecar is
  missing or empty, matching the documented "replace … with its
  contents" contract. This makes session-switching safe by preventing
  keys from a prior session leaking into a new one. Tests updated to
  cover both the missing-file and empty-file cases.
- llmUsageMeta now detects Anthropic OAuth credentials and returns
  Cost=0, matching the comment and the existing usage_tracker behavior
  for OAuth users. Mirrors the OAuth detection already used in
  cmd/extension_context.go.
- Document the single-in-flight-turn assumption baked into the
  per-turn aggregator, with a clear migration path (per-turn ID) should
  concurrent turns ever become a supported use case.
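
The load-replaces-contents behavior described in the second bullet might look
roughly like the sketch below. stateStore, loadFromFile, and size are
hypothetical names for illustration, and treating a whitespace-only file as
empty is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// stateStore is a hypothetical in-memory key-value store.
type stateStore struct {
	mu   sync.RWMutex
	data map[string]string
}

// loadFromFile replaces the in-memory contents with the file's contents.
// A missing or empty file yields an empty store, so stale keys cannot leak.
func (s *stateStore) loadFromFile(path string) error {
	raw, err := os.ReadFile(path)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	fresh := make(map[string]string)
	if len(bytes.TrimSpace(raw)) > 0 {
		if err := json.Unmarshal(raw, &fresh); err != nil {
			return err // corrupt file: leave the current contents untouched
		}
		if fresh == nil { // a literal JSON null decodes to a nil map
			fresh = make(map[string]string)
		}
	}
	s.mu.Lock()
	s.data = fresh
	s.mu.Unlock()
	return nil
}

// size returns the number of keys currently held.
func (s *stateStore) size() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.data)
}

func main() {
	dir, err := os.MkdirTemp("", "ext-state-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Simulate switching from a session with state to one with no sidecar.
	s := &stateStore{data: map[string]string{"myext:budget-cap": "10.00"}}
	if err := s.loadFromFile(filepath.Join(dir, "missing.ext-state.json")); err != nil {
		panic(err)
	}
	fmt.Println("keys after load:", s.size())
}
```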

* fix(extensions): release saverMu on panic in state store

Extract a runSaver helper that locks saverMu and defers Unlock before
invoking the persistence callback. Without the deferred Unlock, a panic
inside the saver (e.g. disk full mid-write) would leave saverMu held
forever and deadlock the next SetState/DeleteState. Both SetState and
DeleteState now route through the helper. New
TestRunner_State_SaverPanicReleasesSaverMu reproduces the deadlock window
with a 2s deadline and proves the mutex is released after a panic.
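
A minimal sketch of the lock-then-defer-Unlock pattern described above. The
names saverMu and runSaver come from this commit message; the runner struct
layout, the saver signature, and the callRecovering helper are assumptions
made for the example. sync.Mutex.TryLock requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
)

// runner is a hypothetical holder for the persistence callback.
type runner struct {
	saverMu sync.Mutex
	saver   func() error
}

// runSaver serializes saver invocations. The deferred Unlock runs during
// panic unwinding, so a panicking saver cannot leave saverMu held.
func (r *runner) runSaver() error {
	if r.saver == nil {
		return nil
	}
	r.saverMu.Lock()
	defer r.saverMu.Unlock()
	return r.saver()
}

// callRecovering invokes runSaver and reports whether the saver panicked.
func callRecovering(r *runner) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = r.runSaver()
	return false
}

func main() {
	r := &runner{saver: func() error { panic("disk full") }}
	fmt.Println("panicked:", callRecovering(r))

	// TryLock succeeds only if the mutex was released after the panic.
	free := r.saverMu.TryLock()
	fmt.Println("mutex free:", free)
	if free {
		r.saverMu.Unlock()
	}
}
```

Without the defer, the Unlock after r.saver() would be skipped on panic and
the next caller would block on saverMu.Lock indefinitely.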
Ed Zynda
2026-06-09 16:18:10 +03:00
committed by GitHub
parent febdc530e1
commit 49f8b485be
22 changed files with 1429 additions and 15 deletions
+57 -2
@@ -88,7 +88,8 @@ api.OnAgentStart(func(e ext.AgentStartEvent, ctx ext.Context) {
// e.Prompt string
})
// Agent finished responding.
// Agent finished responding. Carries per-turn aggregates so observer-style
// extensions don't need to maintain parallel bookkeeping.
api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
// e.Response string
// e.StopReason string — "error" (on failure), "completed" (when LLM returns
@@ -96,6 +97,33 @@ api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
// (e.g. "stop", "length" (max output tokens hit), "tool-calls", "content-filter").
// To detect errors, check e.StopReason == "error".
// Do NOT compare against "completed" for success — instead check != "error".
//
// Per-turn aggregates (computed by Kit's runtime):
// e.ToolCallCount int — total tool invocations this turn
// e.ToolNames []string — tool names in call order (duplicates preserved)
// e.LLMCallCount int — LLM round-trips / tool-loop iterations
// e.InputTokensDelta int — sum of input tokens across LLM calls this turn
// e.OutputTokensDelta int
// e.CacheReadTokensDelta int
// e.CacheWriteTokensDelta int
// e.CostDelta float64 — USD cost (zero when pricing unknown / OAuth)
// e.DurationMs int64 — wall-clock duration AgentStart→AgentEnd
})
// Per-LLM-call usage — fires after each provider round-trip with token + cost
// deltas attributed to that specific call. A single turn typically produces
// multiple LLMUsageEvents (one per tool-loop iteration). Use this for accurate
// budget enforcement that needs to react between calls instead of waiting
// for the turn to finish.
api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
// e.InputTokens, e.OutputTokens int
// e.CacheReadTokens, e.CacheWriteTokens int
// e.Cost float64 — USD; zero when pricing unknown / OAuth
// e.Model, e.Provider string — model used for THIS call
// (may differ across calls if SetModel was called)
// e.StepNumber int — zero-based step index in this turn
// e.FinishReason string — "stop" / "tool_calls" / "length" / ...
// e.RequestID string — optional provider correlation id (may be empty)
})
```
@@ -528,11 +556,38 @@ stats := ctx.GetContextStats() // .EstimatedTokens, .ContextLimit, .UsagePer
msgs := ctx.GetMessages() // []ext.SessionMessage on current branch
path := ctx.GetSessionPath() // file path of session JSONL
// Persist custom data in the session tree:
// Append-only log in the session tree (fork-aware, walked on every branch read):
id, err := ctx.AppendEntry("my-type", "data string")
entries := ctx.GetEntries("my-type") // []ext.ExtensionEntry{ID, EntryType, Data, Timestamp}
```
### Session State (last-write-wins)
Key-value store scoped to the session, persisted to a sidecar file
(`<session>.ext-state.json`) outside the conversation tree. Reads are O(1)
(no branch walk), writes don't grow the JSONL, and the store is not
duplicated on fork. State is invisible to the LLM and survives session
resume. For ephemeral / in-memory sessions, state lives only in memory.
```go
ctx.SetState("myext:budget-cap", "10.00") // last write wins
val, ok := ctx.GetState("myext:budget-cap") // (string, bool)
ctx.DeleteState("myext:budget-cap") // no-op if missing
keys := ctx.ListState() // []string, unspecified order
```
**When to use which:**
| Need | Use |
|------|-----|
| Snapshot state ("current value of X") | `SetState` / `GetState` |
| Audit log / event history | `AppendEntry` / `GetEntries` |
| One-shot per-turn signal | enriched `AgentEndEvent` fields |
| Per-LLM-call observation | `OnLLMUsage` event |
Namespace keys with your extension name (e.g. `"myext:budget-cap"`) to avoid
collisions across extensions.
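
As an illustration of the snapshot-state row in the table, the sketch below
keeps a running spend total under a namespaced key and compares it to a cap.
It is self-contained and does not import Kit: kvState and memState are
hypothetical stand-ins that mirror only the `GetState` / `SetState` shapes
shown above (assuming `SetState` returns nothing), and `myext:spent` is an
invented key.

```go
package main

import (
	"fmt"
	"strconv"
)

// kvState is a hypothetical interface covering the two calls used here.
type kvState interface {
	GetState(key string) (string, bool)
	SetState(key, value string)
}

// memState is an in-memory fake for demonstration.
type memState map[string]string

func (m memState) GetState(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m memState) SetState(k, v string)             { m[k] = v }

const (
	capKey   = "myext:budget-cap"
	spentKey = "myext:spent" // invented key for this sketch
)

// readFloat returns the parsed value, or false if missing or unparseable.
func readFloat(s kvState, key string) (float64, bool) {
	raw, ok := s.GetState(key)
	if !ok {
		return 0, false
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// addCost records one call's cost and reports whether the cap is exceeded.
// With no valid cap stored, it never reports an overrun.
func addCost(s kvState, cost float64) (spent float64, over bool) {
	prev, _ := readFloat(s, spentKey)
	spent = prev + cost
	s.SetState(spentKey, strconv.FormatFloat(spent, 'f', -1, 64))
	limit, ok := readFloat(s, capKey)
	return spent, ok && spent > limit
}

func main() {
	st := memState{}
	st.SetState(capKey, "1.00")
	for _, c := range []float64{0.5, 0.75} {
		spent, over := addCost(st, c)
		fmt.Printf("spent=%.2f over=%v\n", spent, over)
	}
}
```

In a real extension, a function like addCost would presumably be called from
an `OnLLMUsage` handler with `e.Cost`, which is zero when pricing is unknown.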
### Model Management
```go
+13
@@ -1104,6 +1104,19 @@ if extAPI.HasExtensions() {
tools := extAPI.GetToolInfos()
extAPI.SetActiveTools([]string{"bash", "read"})
// Session-scoped extension state (last-write-wins key-value store).
// Backed by an in-memory map and a per-session sidecar file
// (<session>.ext-state.json) outside the conversation tree.
extAPI.SetState("myext:budget-cap", "10.00")
val, ok := extAPI.GetState("myext:budget-cap")
extAPI.DeleteState("myext:budget-cap")
keys := extAPI.ListState()
// Load any existing state from the sidecar and install a saver hook so
// subsequent SetState/DeleteState mutations are flushed atomically.
// No-op for ephemeral / in-memory sessions. Safe to call multiple times.
_ = extAPI.InitStatePersistence()
// Events
extAPI.EmitSessionStart()
extAPI.EmitModelChange("new/model", "old/model", "extension")