mirror of
https://github.com/mark3labs/kit.git
synced 2026-06-14 03:30:26 +00:00
* feat(extensions): add OnLLMUsage, SetState, enriched AgentEndEvent (#53)

  Three additive primitives to the extension API:

  - OnLLMUsage event: per-LLM-call token + cost deltas attributed to the specific model/provider used for each round-trip. Derived from the SDK StepFinishEvent in the extension bridge. Enables accurate budget enforcement between calls instead of only at turn boundaries.
  - ctx.SetState / GetState / DeleteState / ListState: session-scoped, last-write-wins key-value store backed by a sidecar file (<session>.ext-state.json) outside the conversation tree. Reads are O(1), writes don't grow the JSONL, and the store is not duplicated on fork. State is preserved across hot-reloads.
  - Enriched AgentEndEvent: ToolCallCount, ToolNames, LLMCallCount, token deltas (input/output/cache-read/cache-write), CostDelta, and DurationMs populated by a per-turn aggregator. Existing handlers reading only Response/StopReason are unaffected.

  Includes unit tests for the state store, LLMUsage registration, enriched AgentEndEvent, turn aggregator, llmUsageMeta, and sidecar path derivation.

  Adds examples/extensions/usage-budget.go demoing all three primitives together.

  Documents the additions in README, the docs site (extensions overview, capabilities, examples), and the kit-extensions and kit-sdk skill guides.

  Fixes #53

* fix(extensions): address review feedback on state store and llmUsageMeta

  - Serialize SetState/DeleteState saver invocations through a new saverMu so overlapping atomic-rename writes can no longer race on the shared .tmp file and persist an older snapshot after a newer one.
  - LoadStateFromFile now clears the in-memory store when the sidecar is missing or empty, matching the documented "replace … with its contents" contract. This makes session-switching safe by preventing keys from a prior session leaking into a new one. Tests updated to cover both the missing-file and empty-file cases.
  - llmUsageMeta now detects Anthropic OAuth credentials and returns Cost=0, matching the comment and the existing usage_tracker behavior for OAuth users. Mirrors the OAuth detection already used in cmd/extension_context.go.
  - Document the single-in-flight-turn assumption baked into the per-turn aggregator with a clear migration path (per-turn ID) for if concurrent turns ever become a supported use case.

* fix(extensions): release saverMu on panic in state store

  Extract a runSaver helper that locks saverMu and defers Unlock before invoking the persistence callback. Without the deferred Unlock, a panic inside the saver (e.g. disk full mid-write) would leave saverMu held forever and deadlock the next SetState/DeleteState. Both SetState and DeleteState now route through the helper.

  New TestRunner_State_SaverPanicReleasesSaverMu reproduces the deadlock window with a 2s deadline and proves the mutex is released after a panic.
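The locking pattern described in the last commit can be sketched in isolation. This is a minimal illustration, not the code from this change: `stateStore`, `newStateStore`, `snapshot`, and the saver signature are hypothetical names; only `saverMu`, `runSaver`, `SetState`, and `DeleteState` are taken from the commit message, and their real signatures may differ.

```go
package main

import "sync"

// stateStore is a hypothetical stand-in for the extension state store.
type stateStore struct {
	mu      sync.RWMutex // guards data
	saverMu sync.Mutex   // serializes saver invocations
	data    map[string]string
	saver   func(snapshot map[string]string) error
}

func newStateStore(saver func(map[string]string) error) *stateStore {
	return &stateStore{data: make(map[string]string), saver: saver}
}

// snapshot copies the current contents so the saver never sees later writes.
func (s *stateStore) snapshot() map[string]string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[string]string, len(s.data))
	for k, v := range s.data {
		out[k] = v
	}
	return out
}

// runSaver holds saverMu for the whole persistence call. The deferred Unlock
// runs even if the saver panics, so the next mutation cannot deadlock.
// The snapshot is taken under saverMu, so snapshots are persisted in the
// order they were taken and an older one cannot overwrite a newer one.
func (s *stateStore) runSaver() error {
	if s.saver == nil {
		return nil
	}
	s.saverMu.Lock()
	defer s.saverMu.Unlock()
	return s.saver(s.snapshot())
}

// SetState stores the value (last write wins) and persists a snapshot.
func (s *stateStore) SetState(key, value string) error {
	s.mu.Lock()
	s.data[key] = value
	s.mu.Unlock()
	return s.runSaver()
}

// DeleteState removes the key (no-op if missing) and persists a snapshot.
func (s *stateStore) DeleteState(key string) error {
	s.mu.Lock()
	delete(s.data, key)
	s.mu.Unlock()
	return s.runSaver()
}
```

In this sketch a panic from the saver still propagates to the caller; the deferred Unlock only guarantees that a later `SetState` or `DeleteState` can acquire `saverMu` again.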
@@ -88,7 +88,8 @@ api.OnAgentStart(func(e ext.AgentStartEvent, ctx ext.Context) {
     // e.Prompt string
 })

-// Agent finished responding.
+// Agent finished responding. Carries per-turn aggregates so observer-style
+// extensions don't need to maintain parallel bookkeeping.
 api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
     // e.Response string
     // e.StopReason string — "error" (on failure), "completed" (when LLM returns
@@ -96,6 +97,33 @@ api.OnAgentEnd(func(e ext.AgentEndEvent, ctx ext.Context) {
     // (e.g. "stop", "length" (max output tokens hit), "tool-calls", "content-filter").
     // To detect errors, check e.StopReason == "error".
     // Do NOT compare against "completed" for success — instead check != "error".
+    //
+    // Per-turn aggregates (computed by Kit's runtime):
+    // e.ToolCallCount int — total tool invocations this turn
+    // e.ToolNames []string — tool names in call order (duplicates preserved)
+    // e.LLMCallCount int — LLM round-trips / tool-loop iterations
+    // e.InputTokensDelta int — sum of input tokens across LLM calls this turn
+    // e.OutputTokensDelta int
+    // e.CacheReadTokensDelta int
+    // e.CacheWriteTokensDelta int
+    // e.CostDelta float64 — USD cost (zero when pricing unknown / OAuth)
+    // e.DurationMs int64 — wall-clock duration AgentStart→AgentEnd
+})
+
+// Per-LLM-call usage — fires after each provider round-trip with token + cost
+// deltas attributed to that specific call. A single turn typically produces
+// multiple LLMUsageEvents (one per tool-loop iteration). Use this for accurate
+// budget enforcement that needs to react between calls instead of waiting
+// for the turn to finish.
+api.OnLLMUsage(func(e ext.LLMUsageEvent, ctx ext.Context) {
+    // e.InputTokens, e.OutputTokens int
+    // e.CacheReadTokens, e.CacheWriteTokens int
+    // e.Cost float64 — USD; zero when pricing unknown / OAuth
+    // e.Model, e.Provider string — model used for THIS call
+    // (may differ across calls if SetModel was called)
+    // e.StepNumber int — zero-based step index in this turn
+    // e.FinishReason string — "stop" / "tool_calls" / "length" / ...
+    // e.RequestID string — optional provider correlation id (may be empty)
 })
 ```

@@ -528,11 +556,38 @@ stats := ctx.GetContextStats() // .EstimatedTokens, .ContextLimit, .UsagePer
 msgs := ctx.GetMessages() // []ext.SessionMessage on current branch
 path := ctx.GetSessionPath() // file path of session JSONL

-// Persist custom data in the session tree:
+// Append-only log in the session tree (fork-aware, walked on every branch read):
 id, err := ctx.AppendEntry("my-type", "data string")
 entries := ctx.GetEntries("my-type") // []ext.ExtensionEntry{ID, EntryType, Data, Timestamp}
 ```

+### Session State (last-write-wins)
+
+Key-value store scoped to the session, persisted to a sidecar file
+(`<session>.ext-state.json`) outside the conversation tree. Reads are O(1)
+(no branch walk), writes don't grow the JSONL, and the store is not
+duplicated on fork. State is invisible to the LLM and survives session
+resume. For ephemeral / in-memory sessions, state lives only in memory.
+
+```go
+ctx.SetState("myext:budget-cap", "10.00") // last write wins
+val, ok := ctx.GetState("myext:budget-cap") // (string, bool)
+ctx.DeleteState("myext:budget-cap") // no-op if missing
+keys := ctx.ListState() // []string, unspecified order
+```
+
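To make the sidecar behavior described above concrete, here is a minimal sketch of a last-write-wins store that persists to a single JSON file using a write-then-rename step, and whose load replaces the in-memory contents (so a missing or empty file yields an empty store). Everything here is an assumption for illustration: `kvStore`, its methods, the `.tmp` suffix handling, and the `tempSidecar` demo helper are hypothetical and are not Kit's implementation. Locking is omitted, and `List` sorts its result only to keep the demo deterministic.

```go
package main

import (
	"encoding/json"
	"errors"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// kvStore is a hypothetical last-write-wins store backed by one JSON file.
// It is not safe for concurrent use; a real store would add locking.
type kvStore struct {
	path string
	data map[string]string
}

func newKVStore(path string) *kvStore {
	return &kvStore{path: path, data: map[string]string{}}
}

// Load replaces the in-memory contents with the file's contents. A missing
// or empty file results in an empty store, so old keys cannot leak through.
func (s *kvStore) Load() error {
	raw, err := os.ReadFile(s.path)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	fresh := map[string]string{}
	if len(raw) > 0 {
		if err := json.Unmarshal(raw, &fresh); err != nil {
			return err
		}
	}
	if fresh == nil { // the file contained JSON null
		fresh = map[string]string{}
	}
	s.data = fresh
	return nil
}

// save writes a temp file next to the sidecar, then renames it into place,
// so readers never observe a partially written file.
func (s *kvStore) save() error {
	raw, err := json.Marshal(s.data)
	if err != nil {
		return err
	}
	tmp := s.path + ".tmp"
	if err := os.WriteFile(tmp, raw, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, s.path)
}

func (s *kvStore) Set(key, value string) error   { s.data[key] = value; return s.save() }
func (s *kvStore) Delete(key string) error       { delete(s.data, key); return s.save() }
func (s *kvStore) Get(key string) (string, bool) { v, ok := s.data[key]; return v, ok }

// List returns the keys, sorted here only for deterministic demo output.
func (s *kvStore) List() []string {
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// tempSidecar is a demo helper: it returns a sidecar path inside a fresh
// temporary directory plus a cleanup function that removes the directory.
func tempSidecar() (path string, cleanup func(), err error) {
	dir, err := os.MkdirTemp("", "ext-state-demo")
	if err != nil {
		return "", nil, err
	}
	return filepath.Join(dir, "session.ext-state.json"), func() { _ = os.RemoveAll(dir) }, nil
}
```

Because the whole map is rewritten on every mutation, the file size tracks the number of live keys rather than the number of writes, which is the property that distinguishes this kind of store from an append-only entry log.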
+**When to use which:**
+
+| Need | Use |
+|------|-----|
+| Snapshot state ("current value of X") | `SetState` / `GetState` |
+| Audit log / event history | `AppendEntry` / `GetEntries` |
+| One-shot per-turn signal | enriched `AgentEndEvent` fields |
+| Per-LLM-call observation | `OnLLMUsage` event |
+
+Namespace keys with your extension name (e.g. `"myext:budget-cap"`) to avoid
+collisions across extensions.
+
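As an illustration of how snapshot state and per-call usage could fit together, the sketch below keeps a running spend under a namespaced key and checks it against a cap after each call. It uses local stand-in types rather than the real `ext` package: `llmUsage`, `stateAPI`, `memState`, `recordUsage`, and the `myext:budget-spent` key are hypothetical, and the assumption that `SetState` returns nothing is taken only from how it is called in the snippet above.

```go
package main

import "strconv"

// llmUsage mirrors a few LLMUsageEvent fields; it is a stand-in type.
type llmUsage struct {
	InputTokens, OutputTokens int
	Cost                      float64 // USD; zero when pricing is unknown
	Model                     string
}

// stateAPI is the hypothetical subset of the state calls this sketch needs.
type stateAPI interface {
	SetState(key, value string)
	GetState(key string) (string, bool)
}

// memState is an in-memory implementation used for the demo.
type memState map[string]string

func (m memState) SetState(k, v string)             { m[k] = v }
func (m memState) GetState(k string) (string, bool) { v, ok := m[k]; return v, ok }

const (
	capKey   = "myext:budget-cap"   // key from the docs above
	spentKey = "myext:budget-spent" // hypothetical key for the running total
)

// readFloat parses a stored value, treating missing or malformed data as 0.
func readFloat(st stateAPI, key string) (float64, bool) {
	raw, ok := st.GetState(key)
	if !ok {
		return 0, false
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// recordUsage adds one call's cost to the running total kept in state and
// reports whether the configured cap, if any, has been exceeded.
func recordUsage(st stateAPI, u llmUsage) (spent float64, over bool) {
	prev, _ := readFloat(st, spentKey)
	spent = prev + u.Cost
	// Stored with 6 decimals; repeated rounding can accumulate small error.
	st.SetState(spentKey, strconv.FormatFloat(spent, 'f', 6, 64))
	limit, hasCap := readFloat(st, capKey)
	return spent, hasCap && spent > limit
}
```

Under these assumptions, an `OnLLMUsage` handler would build an `llmUsage` from the event and call `recordUsage` with something that satisfies `stateAPI`, reacting between calls as soon as `over` is true instead of waiting for the turn to end.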
 ### Model Management

 ```go
@@ -1104,6 +1104,19 @@ if extAPI.HasExtensions() {
     tools := extAPI.GetToolInfos()
     extAPI.SetActiveTools([]string{"bash", "read"})

+    // Session-scoped extension state (last-write-wins key-value store).
+    // Backed by an in-memory map and a per-session sidecar file
+    // (<session>.ext-state.json) outside the conversation tree.
+    extAPI.SetState("myext:budget-cap", "10.00")
+    val, ok := extAPI.GetState("myext:budget-cap")
+    extAPI.DeleteState("myext:budget-cap")
+    keys := extAPI.ListState()
+
+    // Load any existing state from the sidecar and install a saver hook so
+    // subsequent SetState/DeleteState mutations are flushed atomically.
+    // No-op for ephemeral / in-memory sessions. Safe to call multiple times.
+    _ = extAPI.InitStatePersistence()
+
     // Events
     extAPI.EmitSessionStart()
     extAPI.EmitModelChange("new/model", "old/model", "extension")