Ed Zynda 49f8b485be feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53) (#54)
* feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53)

Three additive primitives to the extension API:

- OnLLMUsage event: per-LLM-call token + cost deltas attributed to the
  specific model/provider used for each round-trip. Derived from the SDK
  StepFinishEvent in the extension bridge. Enables accurate budget
  enforcement between calls instead of only at turn boundaries.

- ctx.SetState / GetState / DeleteState / ListState: session-scoped,
  last-write-wins key-value store backed by a sidecar file
  (<session>.ext-state.json) outside the conversation tree. Reads are
  O(1), writes don't grow the JSONL, and the store is not duplicated on
  fork. State is preserved across hot-reloads.

- Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token
  deltas (input/output/cache-read/cache-write), CostDelta, and
  DurationMs populated by a per-turn aggregator. Existing handlers
  reading only Response/StopReason are unaffected.
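
The per-turn aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the actual bridge code: `llmUsage`, `turnSummary`, `turnAggregator`, and their methods are hypothetical names, and timestamps are passed in as milliseconds so the sketch stays deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// llmUsage is a hypothetical stand-in for one per-call usage delta.
type llmUsage struct {
	InputTokens  int
	OutputTokens int
	Cost         float64
}

// turnSummary holds the kind of per-turn totals the enriched
// AgentEndEvent is described as carrying (field names assumed).
type turnSummary struct {
	ToolCallCount     int
	ToolNames         []string
	LLMCallCount      int
	InputTokensDelta  int
	OutputTokensDelta int
	CostDelta         float64
	DurationMs        int64
}

// turnAggregator accumulates deltas for a single in-flight turn.
type turnAggregator struct {
	mu        sync.Mutex
	startedMs int64
	sum       turnSummary
}

// begin resets the totals and records the turn start time.
func (a *turnAggregator) begin(nowMs int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.startedMs = nowMs
	a.sum = turnSummary{}
}

// addLLMCall folds one LLM round-trip into the running totals.
func (a *turnAggregator) addLLMCall(u llmUsage) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.sum.LLMCallCount++
	a.sum.InputTokensDelta += u.InputTokens
	a.sum.OutputTokensDelta += u.OutputTokens
	a.sum.CostDelta += u.Cost
}

// addToolCall records a tool invocation in call order.
func (a *turnAggregator) addToolCall(name string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.sum.ToolCallCount++
	a.sum.ToolNames = append(a.sum.ToolNames, name)
}

// end returns the totals with the elapsed duration filled in.
func (a *turnAggregator) end(nowMs int64) turnSummary {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := a.sum
	out.DurationMs = nowMs - a.startedMs
	return out
}

func main() {
	var agg turnAggregator
	start := time.Now().UnixMilli()
	agg.begin(start)
	agg.addLLMCall(llmUsage{InputTokens: 1200, OutputTokens: 80, Cost: 0.004})
	agg.addToolCall("read_file")
	agg.addLLMCall(llmUsage{InputTokens: 1500, OutputTokens: 210, Cost: 0.006})
	s := agg.end(start + 1500)
	fmt.Printf("llm-calls=%d tools=%v in=%d out=%d cost=$%.4f duration=%dms\n",
		s.LLMCallCount, s.ToolNames, s.InputTokensDelta, s.OutputTokensDelta,
		s.CostDelta, s.DurationMs)
}
```

Because the sketch keeps one set of totals per aggregator, it only gives correct numbers while a single turn is in flight.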

Includes unit tests for the state store, LLMUsage registration,
enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path
derivation. Adds examples/extensions/usage-budget.go demoing all three
primitives together. Documents the additions in README, the docs site
(extensions overview, capabilities, examples), and the kit-extensions
and kit-sdk skill guides.

Fixes #53

* fix(extensions): address review feedback on state store and llmUsageMeta

- Serialize SetState/DeleteState saver invocations through a new saverMu
  so overlapping atomic-rename writes can no longer race on the shared
  .tmp file and persist an older snapshot after a newer one.
- LoadStateFromFile now clears the in-memory store when the sidecar is
  missing or empty, matching the documented "replace … with its
  contents" contract. This makes session-switching safe by preventing
  keys from a prior session leaking into a new one. Tests updated to
  cover both the missing-file and empty-file cases.
- llmUsageMeta now detects Anthropic OAuth credentials and returns
  Cost=0, matching the comment and the existing usage_tracker behavior
  for OAuth users. Mirrors the OAuth detection already used in
  cmd/extension_context.go.
- Document the single-in-flight-turn assumption baked into the
  per-turn aggregator, with a clear migration path (per-turn ID) in case
  concurrent turns ever become a supported use case.
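
The first two fixes above can be illustrated with a small sketch. It assumes a JSON sidecar and an in-memory map; `stateStore`, `set`, `save`, `load`, and `runDemo` are hypothetical names and are not the actual kit implementation. The snapshot is taken while `saverMu` is held, so a later save always writes a snapshot at least as new as an earlier one, and `load` replaces the map even when the file is missing or empty.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// stateStore is a hypothetical last-write-wins store.
type stateStore struct {
	mu      sync.RWMutex // guards data and path
	saverMu sync.Mutex   // serializes snapshot + write + rename
	data    map[string]string
	path    string
}

func (s *stateStore) get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

// set assumes load has been called first so that data is non-nil.
func (s *stateStore) set(key, value string) error {
	s.mu.Lock()
	s.data[key] = value
	s.mu.Unlock()
	return s.save()
}

// save writes a snapshot to a temp file and renames it into place.
func (s *stateStore) save() error {
	s.saverMu.Lock()
	defer s.saverMu.Unlock()
	s.mu.RLock()
	path := s.path
	buf, err := json.Marshal(s.data)
	s.mu.RUnlock()
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, buf, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// load replaces the in-memory map with the file contents. A missing
// or empty file yields an empty map rather than keeping old keys.
func (s *stateStore) load(path string) error {
	buf, err := os.ReadFile(path)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	fresh := map[string]string{}
	if len(buf) > 0 {
		if err := json.Unmarshal(buf, &fresh); err != nil {
			return err
		}
	}
	if fresh == nil { // a literal JSON null unmarshals to a nil map
		fresh = map[string]string{}
	}
	s.mu.Lock()
	s.data, s.path = fresh, path
	s.mu.Unlock()
	return nil
}

// runDemo writes a key for session "a", then switches to session "b"
// (no sidecar yet) and reports whether the key leaked across.
func runDemo() (stored string, leaked bool, err error) {
	dir, err := os.MkdirTemp("", "ext-state")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)
	s := &stateStore{}
	if err = s.load(filepath.Join(dir, "a.ext-state.json")); err != nil {
		return "", false, err
	}
	if err = s.set("warn-at", "0.50"); err != nil {
		return "", false, err
	}
	stored, _ = s.get("warn-at")
	if err = s.load(filepath.Join(dir, "b.ext-state.json")); err != nil {
		return "", false, err
	}
	_, leaked = s.get("warn-at")
	return stored, leaked, nil
}

func main() {
	stored, leaked, err := runDemo()
	fmt.Println(stored, leaked, err) // prints "0.50 false <nil>"
}
```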

* fix(extensions): release saverMu on panic in state store

Extract a runSaver helper that locks saverMu and defers Unlock before
invoking the persistence callback. Without the deferred Unlock, a panic
inside the saver (e.g. disk full mid-write) would leave saverMu held
forever and deadlock the next SetState/DeleteState. Both SetState and
DeleteState now route through the helper. New
TestRunner_State_SaverPanicReleasesSaverMu reproduces the deadlock
window with a 2s deadline and proves the mutex is released after a panic.
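
The deferred-unlock pattern described here can be reproduced in isolation. The sketch below is an assumption-laden miniature: `store`, `runSaver`, and `callRecovering` are illustrative names, and `sync.Mutex.TryLock` (Go 1.18+) stands in for the test's deadline check.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical holder for the saver mutex and callback.
type store struct {
	saverMu sync.Mutex
	saver   func() // persistence callback; may panic
}

// runSaver holds saverMu for the duration of the callback. The
// deferred Unlock runs during panic unwinding, so the mutex is
// released even when the saver panics.
func (s *store) runSaver() {
	s.saverMu.Lock()
	defer s.saverMu.Unlock()
	if s.saver != nil {
		s.saver()
	}
}

// callRecovering runs f and reports whether it panicked.
func callRecovering(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	s := &store{saver: func() { panic("disk full") }}
	fmt.Println("panicked:", callRecovering(s.runSaver)) // panicked: true

	// TryLock succeeds only if the mutex was released while unwinding.
	free := s.saverMu.TryLock()
	fmt.Println("mutex free after panic:", free) // mutex free after panic: true
	if free {
		s.saverMu.Unlock()
	}
}
```

With a plain `Unlock()` after the callback instead of `defer`, the same program would report the mutex as still held, and the next caller of `runSaver` would block forever.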
2026-06-09 16:18:10 +03:00


//go:build ignore

package main

import (
	"fmt"
	"strconv"

	"kit/ext"
)

// Init demonstrates the three primitives added in issue #53:
//
//  1. api.OnLLMUsage(...): per-LLM-call usage callback with token + cost
//     deltas. Use this for budget enforcement that reacts between calls
//     within a single agent turn, rather than only at turn boundaries.
//
//  2. ctx.SetState / ctx.GetState / ctx.DeleteState / ctx.ListState:
//     last-write-wins, session-scoped key-value store backed by a sidecar
//     file. Use this for snapshot state (current value of X) instead of
//     ctx.AppendEntry, which is append-only and bloats branch reads.
//
//  3. ext.AgentEndEvent.ToolCallCount / .ToolNames / .LLMCallCount /
//     .InputTokensDelta / .OutputTokensDelta / .CostDelta / .DurationMs:
//     per-turn aggregates so observer extensions don't need to maintain
//     parallel bookkeeping.
//
// Together these support a simple soft-budget cap: warn when the
// cumulative cost in this session exceeds a threshold, and print a
// per-turn report on AgentEnd.
//
// Usage: kit -e examples/extensions/usage-budget.go
func Init(api ext.API) {
	const (
		warnAtKey = "usage-budget:warn-at-usd"
		totalKey  = "usage-budget:total-cost"
	)

	// 1. Print per-LLM-call usage with provider, model, and cost.
	api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
		ctx.Print(fmt.Sprintf(
			"[usage] step=%d %s/%s tokens=↑%d ↓%d cache=↑%d/↓%d cost=$%.4f (%s)",
			e.StepNumber, e.Provider, e.Model,
			e.InputTokens, e.OutputTokens,
			e.CacheWriteTokens, e.CacheReadTokens,
			e.Cost, e.FinishReason,
		))

		// 2. Persist running total in last-write-wins state.
		current := 0.0
		if raw, ok := ctx.GetState(totalKey); ok {
			// An unparsable stored value is ignored and the total restarts at zero.
			if v, err := strconv.ParseFloat(raw, 64); err == nil {
				current = v
			}
		}
		current += e.Cost
		ctx.SetState(totalKey, strconv.FormatFloat(current, 'f', 6, 64))

		// Soft warn-at threshold (configurable via state).
		warnAt := 0.50
		if raw, ok := ctx.GetState(warnAtKey); ok {
			if v, err := strconv.ParseFloat(raw, 64); err == nil {
				warnAt = v
			}
		}
		if current > warnAt {
			ctx.PrintError(fmt.Sprintf(
				"[usage] session cost $%.4f exceeds soft cap $%.2f",
				current, warnAt,
			))
		}
	})

	// 3. Print a per-turn summary using the enriched AgentEndEvent.
	api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
		ctx.Print(fmt.Sprintf(
			"[turn] stop=%s tools=%d llm-calls=%d tokens=↑%d ↓%d cost=$%.4f duration=%dms",
			e.StopReason, e.ToolCallCount, e.LLMCallCount,
			e.InputTokensDelta, e.OutputTokensDelta, e.CostDelta, e.DurationMs,
		))
		if len(e.ToolNames) > 0 {
			ctx.Print(fmt.Sprintf("[turn] tool order: %v", e.ToolNames))
		}
	})

	// Bootstrap default soft cap once per session.
	api.OnSessionStart(func(e ext.SessionStartEvent, ctx ext.Context) {
		if _, ok := ctx.GetState(warnAtKey); !ok {
			ctx.SetState(warnAtKey, "0.50")
		}
	})
}