This commit fixes several issues with token usage tracking:
1. Fix InputTokens-only validation bug - now checks any token field > 0
to handle OpenAI-compatible providers where cached prompts result in
InputTokens=0 while OutputTokens>0
2. Remove per-step context token updates from recordStepUsage() - context
fill is now set once at turn completion via updateUsageFromTurnResult
using FinalUsage.InputTokens, preventing display jumps during multi-step
tool calls
3. Track maximum context seen in SetContextTokens() - prevents the status
bar from showing decreasing token counts when FinalUsage.InputTokens
reflects only the last step's input
4. Add comprehensive debug logging for token tracking at key points:
- StepUsageEvent emission
- recordStepUsage processing
- updateUsageFromTurnResult processing
5. Update tests to reflect new behavior:
- TestRecordStepUsage_updatesTracker: no longer expects context updates
- TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly: verifies
InputTokens-only tracking
All tests pass. Token tracking now correctly accumulates costs and shows
monotonically increasing context size.
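The validation and max-tracking changes above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Usage` field set, the `hasUsage` helper, and the `tracker` type are assumptions made for the example.

```go
package main

import "fmt"

// Usage is a hypothetical stand-in for the provider usage struct.
type Usage struct {
	InputTokens     int64
	OutputTokens    int64
	CacheReadTokens int64
}

// hasUsage reports whether any token field is positive, so a response with
// InputTokens == 0 (fully cached prompt) but OutputTokens > 0 still counts.
func hasUsage(u Usage) bool {
	return u.InputTokens > 0 || u.OutputTokens > 0 || u.CacheReadTokens > 0
}

// tracker is a hypothetical stand-in for the usage tracker.
type tracker struct {
	contextTokens int64
}

// SetContextTokens keeps the maximum value seen so the displayed context
// size never decreases.
func (t *tracker) SetContextTokens(n int64) {
	if n > t.contextTokens {
		t.contextTokens = n
	}
}

func main() {
	fmt.Println(hasUsage(Usage{OutputTokens: 42})) // true

	t := &tracker{}
	for _, n := range []int64{1200, 900, 1500} {
		t.SetContextTokens(n)
	}
	fmt.Println(t.contextTokens) // 1500
}
```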
- /new command now properly resets usageTracker stats when starting a fresh session
- Remove EstimateAndUpdateUsage fallback in updateUsageFromTurnResult()
- Remove EstimateAndUpdateUsage fallback in UpdateUsageFromResponse()
- Only use actual API-reported tokens for cost tracking (following opencode pattern)
- Estimation is inaccurate and should never be used for billing
Fixes issues with kimi-k2.5 and opencode token tracking where:
1. /new didn't reset token count/cost
2. Tokens never updated correctly due to estimation fallback
Change thinking content from H6 to Italic for a more subdued,
secondary visual appearance. This makes reasoning text less prominent
than main assistant responses.
- Change AlertNote label from "You" to "Info" for system/extension messages
- Update RenderUserMessage to use custom styling with "You" label
- This separates user messages ("You") from info messages ("Info")
Update RenderSystemMessage to use r.ty.Note() instead of r.ty.P()
for visual consistency with other herald-based message rendering.
This affects extension PrintInfo output and system messages.
- Add bottom margin to startup header (KVGroup)
- Add bottom margin to thinking/reasoning blocks
- Fix thinking block footer to appear on new line without extra spacing
- Update spawn_subagent tool output to use bash-style formatting
- Add blank line after extension startup messages for visual separation
The herald-based CodeBlock implementation didn't match the custom
styling we had for line numbers and gutters. Restoring the original
renderReadBody and renderCodeBlock functions with:
- Custom line number gutter styling
- Chroma syntax highlighting
- Truncation handling with footer preservation
- Replace MessageRenderer with herald-based implementation
- Use herald alerts (Note, Tip, Warning, Caution) for message types
- Use blockquote for thinking/reasoning content
- Use KVGroup for startup info display
- Add margin-bottom to all message types for visual separation
- Simplify Read tool with herald CodeBlock and line numbers
- Add detectLanguage helper for syntax highlighting
- Capture extension startup messages and print after startup banner
- Remove ~200 lines of custom rendering code
Extensions running via the SDK (without a fully-wired SetExtensionContext
call) would panic with 'reflect.Value.Call: call of nil function' when
calling any ctx method like ctx.PrintBlock().
normalizeContext() now replaces every nil function field in Context with
a safe no-op stub before storing it in the runner, so extension handlers
can never crash on a missing callback regardless of how Kit is embedded.
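One way to fill nil callbacks generically is with reflection, sketched below. The `Context` struct here is a trimmed-down, hypothetical stand-in, and `fillNilFuncs` is an illustrative name; the real normalizeContext() may well assign each stub by hand instead.

```go
package main

import (
	"fmt"
	"reflect"
)

// Context is a hypothetical subset of the extension context.
type Context struct {
	PrintBlock func(text string)
	GetModel   func() string
	Name       string
}

// fillNilFuncs replaces every nil func-typed field of the struct pointed to
// by ptr with a stub that ignores its arguments and returns zero values.
func fillNilFuncs(ptr any) {
	v := reflect.ValueOf(ptr).Elem()
	for i := 0; i < v.NumField(); i++ {
		f := v.Field(i)
		if f.Kind() != reflect.Func || !f.IsNil() || !f.CanSet() {
			continue
		}
		ft := f.Type()
		stub := reflect.MakeFunc(ft, func([]reflect.Value) []reflect.Value {
			out := make([]reflect.Value, ft.NumOut())
			for j := range out {
				out[j] = reflect.Zero(ft.Out(j))
			}
			return out
		})
		f.Set(stub)
	}
}

func main() {
	ctx := Context{GetModel: func() string { return "some-model" }}
	fillNilFuncs(&ctx)

	ctx.PrintBlock("hello")     // no-op stub, does not panic
	fmt.Println(ctx.GetModel()) // existing callback is left untouched
}
```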
The StreamComponent was truncating content to fit the viewport height before
caching it in renderCache. This caused GetRenderedContent() to return truncated
content when flushing to scrollback.
Changes:
- render() now caches FULL content without height clamping
- New viewContent() helper applies height clamping only for display
- View() calls both: render() for full content, viewContent() for visible slice
This follows the Pi TUI pattern: full buffer in memory, viewport slicing only
at display time. Long assistant messages are now fully preserved in scrollback.
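The split between caching and display can be illustrated with a small sketch. The struct below is a hypothetical simplification of StreamComponent, and showing the tail of the content as the visible slice is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// streamComponent is a simplified, hypothetical model of the component.
type streamComponent struct {
	lines       []string
	height      int
	renderCache string
}

// render caches the FULL content, with no height clamping.
func (s *streamComponent) render() string {
	s.renderCache = strings.Join(s.lines, "\n")
	return s.renderCache
}

// viewContent clamps to the viewport height for display only
// (assumed here to keep the last `height` lines).
func (s *streamComponent) viewContent(full string) string {
	if s.height <= 0 {
		return full
	}
	lines := strings.Split(full, "\n")
	if len(lines) > s.height {
		lines = lines[len(lines)-s.height:]
	}
	return strings.Join(lines, "\n")
}

// View renders the full content, then slices it for the viewport.
func (s *streamComponent) View() string {
	return s.viewContent(s.render())
}

// GetRenderedContent returns the cached full content for scrollback.
func (s *streamComponent) GetRenderedContent() string {
	return s.renderCache
}

func main() {
	s := &streamComponent{lines: []string{"a", "b", "c", "d"}, height: 2}
	fmt.Printf("%q\n", s.View())               // "c\nd"
	fmt.Printf("%q\n", s.GetRenderedContent()) // "a\nb\nc\nd"
}
```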
Remove the StepUsageEvent handler from subscribeSDKEvents. It was
calling UpdateUsage() for every individual tool-calling step as it
streamed, then updateUsageFromTurnResult() called UpdateUsage() again
with TotalUsage (fantasy's own aggregate of all steps). A turn with N
tool calls was counting every token N+1 times.
Fix updateUsageFromTurnResult to use a single, clean code path:
- UpdateUsage() called exactly once per turn using TotalUsage
- SetContextTokens() uses FinalUsage.InputTokens only (not +OutputTokens)
since input tokens of the last call = actual context window fill;
output tokens are the response length, not context occupancy
- Estimate fallback no longer early-returns before SetContextTokens
Verified with opencode/kimi-k2.5: cost accumulates linearly across
simple and multi-step tool-calling turns with no double-counting.
anthropic/claude-sonnet-4-6 correctly shows $0.00 for OAuth sessions.
- Update all Go dependencies to latest versions
- Remove internal/app/usage_test.go (import cycle)
- Add sanitizeToolCallID function to fix message tests
- All tests pass with race detection
Empty sessions (no messages) are now automatically cleaned up:
1. On shutdown: When kit exits cleanly, if the current session has no
messages, the session file is deleted.
2. On /resume: When listing sessions for the resume picker, any empty
session files are deleted and not shown in the list.
This prevents accumulation of orphaned empty session files when users
start sessions but don't send any messages.
Changes:
- internal/session/tree_manager.go: add IsEmpty() helper
- internal/app/app.go: delete empty session on Close()
- internal/session/store.go: filter and delete empty sessions in listSessionsInDir()
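The filter-and-delete step for the resume picker might look something like this sketch. The `sessionFile` type, its fields, and the injected `remove` callback are assumptions for illustration; real code would likely call os.Remove on the session path.

```go
package main

import "fmt"

// sessionFile is a hypothetical summary of a session on disk.
type sessionFile struct {
	Path         string
	MessageCount int
}

// IsEmpty reports whether the session has no messages.
func (s sessionFile) IsEmpty() bool { return s.MessageCount == 0 }

// listNonEmpty deletes empty sessions via remove (best effort) and returns
// only the sessions that should be shown.
func listNonEmpty(all []sessionFile, remove func(path string) error) []sessionFile {
	kept := make([]sessionFile, 0, len(all))
	for _, s := range all {
		if s.IsEmpty() {
			_ = remove(s.Path) // ignore errors; the file is hidden either way
			continue
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	sessions := []sessionFile{{"empty-session", 0}, {"real-session", 3}}
	kept := listNonEmpty(sessions, func(p string) error {
		fmt.Println("would delete:", p)
		return nil
	})
	fmt.Println(len(kept), kept[0].Path) // 1 real-session
}
```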
Change /new behavior to match Pi:
- Create a completely new session file instead of just resetting the leaf
- Previous session is closed and saved (accessible via /resume)
- New session starts with 0 entries, 0 messages - clean slate
- Update help text to reflect new behavior
Key fix: SwitchTreeSession now updates the kit SDK's tree session
reference so messages are persisted to the correct file.
Files changed:
- internal/app/app.go: update kit SDK session reference
- internal/ui/model.go: create new session file on /new
- internal/ui/model_test.go: add SwitchTreeSession stub
- Remove unused modelFamily variable in createOpenAICodexProvider
- Remove dead spark handling code (spark is rejected early with error)
- Simplify buildCodexProviderOptions to only handle regular codex models
- Remove redundant comments and simplify code structure
- Net reduction: 31 lines of code
Spark models are not accessible via ChatGPT OAuth and return Cloudflare
'Forbidden' errors. Add early detection and helpful error message directing
users to regular Codex models like 'openai/gpt-5.3-codex' instead.
Different Codex model families use different API formats:
- gpt-codex-spark: uses standard ProviderOptions (not Responses API)
- gpt-codex, gpt-codex-mini: use ResponsesProviderOptions
- Add detectCodexModelFamily() to determine model family from name
- Use standard ProviderOptions for spark models
- Use ResponsesProviderOptions for regular codex models
- Conditionally use WithUseResponsesAPI() based on model family
Note: gpt-5.3-codex-spark still gets a Cloudflare 'Forbidden' error and
may need additional headers or a different endpoint.
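The family detection could be as simple as a substring check, as in the sketch below. The commit names detectCodexModelFamily() but not its signature, so the return type, constants, and matching rule here are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// codexFamily is a hypothetical enum for the two families described above.
type codexFamily int

const (
	familyCodex codexFamily = iota // gpt-codex, gpt-codex-mini
	familySpark                    // gpt-codex-spark
)

// detectCodexModelFamily guesses the family from the model name.
// The "codex-spark" substring rule is an assumption for this sketch.
func detectCodexModelFamily(model string) codexFamily {
	if strings.Contains(strings.ToLower(model), "codex-spark") {
		return familySpark
	}
	return familyCodex
}

// useResponsesAPI reports whether the Responses API options should be used.
func useResponsesAPI(model string) bool {
	return detectCodexModelFamily(model) != familySpark
}

func main() {
	fmt.Println(useResponsesAPI("gpt-5.3-codex"))       // true
	fmt.Println(useResponsesAPI("gpt-5.3-codex-spark")) // false
}
```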
The Codex API doesn't support the max_output_tokens parameter, which was causing
"Unsupported parameter: max_output_tokens" errors.
- Add SkipMaxOutputTokens flag to ProviderResult
- Set flag when creating Codex OAuth provider
- Check flag in agent setup to skip WithMaxOutputTokens option
- This matches pi's behavior of not sending max_tokens to Codex API
- Upgrade charm.land/fantasy from v0.16.0 to v0.17.1
- Add buildCodexProviderOptions() to pass system prompt as 'instructions'
- The Codex API requires instructions as a top-level field, not as system message
- Set Store=false to prevent server-side conversation storage
- Use ResponsesProviderOptions.Instructions for system prompt
- Change base URL to /backend-api/codex for correct endpoint path
- Add browser-like User-Agent to avoid Cloudflare blocking
- Add Accept, Accept-Language, Cache-Control headers
- Match pi client headers more closely
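Injecting fixed headers is commonly done with a wrapping http.RoundTripper, sketched below. The header values are placeholders, not the ones the commit uses, and `headerTransport`, `newHeaderClient`, and `seenHeader` are illustrative names. `seenHeader` only exists to demonstrate the effect against a local test server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerTransport sets a fixed set of headers on every outgoing request.
type headerTransport struct {
	base    http.RoundTripper
	headers map[string]string
}

func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper must not modify the caller's request, so clone it first.
	r := req.Clone(req.Context())
	for k, v := range t.headers {
		r.Header.Set(k, v)
	}
	return t.base.RoundTrip(r)
}

func newHeaderClient(headers map[string]string) *http.Client {
	return &http.Client{Transport: &headerTransport{
		base:    http.DefaultTransport,
		headers: headers,
	}}
}

// seenHeader sends one GET through client to a throwaway local server and
// returns the named header as the server received it.
func seenHeader(client *http.Client, name string) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get(name)
	}))
	defer srv.Close()

	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	client := newHeaderClient(map[string]string{
		"User-Agent":      "Mozilla/5.0 (placeholder)", // placeholder value
		"Accept-Language": "en-US",                     // placeholder value
		"Cache-Control":   "no-cache",                  // placeholder value
	})
	ua, err := seenHeader(client, "User-Agent")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(ua)
}
```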
Implements OAuth authentication for OpenAI ChatGPT Plus/Pro (Codex) similar to pi:
- Add OpenAICredentials type with OAuth and API key support
- Add OpenAI OAuth client with correct endpoints (auth.openai.com)
- Implement PKCE-based OAuth flow with local callback server on :1455
- Add login/logout/status commands for openai provider
- Support both ChatGPT/Codex OAuth tokens (chatgpt.com/backend-api) and
regular OpenAI API keys (api.openai.com)
- Extract and store ChatGPT account ID from JWT token
- Add custom HTTP transport with required Codex headers:
- chatgpt-account-id, originator, OpenAI-Beta: responses=experimental
- Update provider selection to use correct endpoint based on auth type
Usage:
kit auth login openai # OAuth with ChatGPT account
kit auth logout openai
kit auth status
The implementation follows the same patterns as the existing Anthropic OAuth
support, with automatic token refresh and secure credential storage in
~/.config/.kit/credentials.json
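For reference, the PKCE part of such a flow (RFC 7636, S256 method) comes down to a random verifier and its hashed challenge. The sketch below shows only that step; the function names are illustrative and the rest of the flow (authorize URL, callback server, token exchange) is omitted.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// newVerifier returns a random PKCE code verifier: 32 random bytes encoded
// as 43 unpadded base64url characters.
func newVerifier() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// challengeS256 derives the code challenge for the S256 method:
// BASE64URL(SHA256(verifier)) without padding.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	verifier, err := newVerifier()
	if err != nil {
		fmt.Println("could not generate verifier:", err)
		return
	}
	// The challenge goes into the authorization request; the verifier is
	// sent later with the token exchange.
	fmt.Println("verifier: ", verifier)
	fmt.Println("challenge:", challengeS256(verifier))
}
```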
When displaying streaming bash output, show the initial command as a
muted header ($ <command>) before the output lines. This helps users
understand what command is currently executing.
Changes:
- Add streamingBashCommand field to AppModel
- Extract command from ToolCallStartedEvent for bash tools
- Render $ <command> header in renderStreamingBashOutput
- Clear command on ToolResultEvent when tool completes
- Add tests for command extraction and cleanup
When a steer message is consumed mid-turn via PrepareStep, no new
SpinnerEvent{Show: true} fires within that turn, so the message was
stuck in pendingUserPrints indefinitely and never rendered.
Branch the SteerConsumedEvent handler on m.state:
- stateWorking (mid-turn): flush live stream content, then print the
steering user messages to scrollback immediately via drainScrollback.
- idle/post-turn: keep the existing pendingUserPrints deferral so the
SpinnerEvent{Show: true} for the next turn orders things correctly.
Models from the opencode provider (like claude-opus-4-6 and gpt-5.3-codex)
have provider overrides in the models database that specify different npm
packages than the provider's default. The code was ignoring these overrides
and routing all models through openaicompat, causing "bad request" errors.
Changes:
- Added Provider field to modelsDBModel to capture model-specific overrides
- Added ProviderNPM field to ModelInfo registry struct
- Updated autoRouteProvider() to check for model-specific provider overrides
- Fixed URL path handling for anthropic provider (strip /v1 suffix to avoid
double /v1/v1 paths when using third-party anthropic-compatible APIs)
Fixes routing for:
- opencode/claude-opus-4-6 -> @ai-sdk/anthropic
- opencode/gpt-5.3-codex -> @ai-sdk/openai
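The /v1 handling mentioned above can be sketched as a small normalization step, assuming the anthropic client appends /v1 to the base URL itself. The function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAnthropicBaseURL strips a trailing "/v1" (and trailing slashes)
// so that a client which appends "/v1" itself does not produce "/v1/v1".
func normalizeAnthropicBaseURL(u string) string {
	u = strings.TrimRight(u, "/")
	return strings.TrimSuffix(u, "/v1")
}

func main() {
	fmt.Println(normalizeAnthropicBaseURL("https://example.com/v1"))  // https://example.com
	fmt.Println(normalizeAnthropicBaseURL("https://example.com/v1/")) // https://example.com
	fmt.Println(normalizeAnthropicBaseURL("https://example.com/api")) // https://example.com/api
}
```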
Previously, token usage and costs were only updated at the end of a complete
turn. For long-running multi-step tool-calling conversations, this meant the
status bar showed stale (or zero) costs during the entire interaction.
Now, after each complete step (tool call + result), the usage tracker is
updated with the actual token counts from that step. This provides real-time
cost accumulation visible in the status bar.
Changes:
- Add StepUsageHandler type and onStepUsage parameter to agent
- Emit StepUsageEvent from kit layer after each step completes
- Handle StepUsageEvent in app layer to update UsageTracker
- Add EventStepUsage constant and StepUsageEvent struct to events
The step usage is additive - each step's tokens are added to the running
session totals, just like the final turn usage was before.
When switching models (e.g., via /model command or ctx.SetModel), the usage
tracker now updates its model info to reflect the new model's:
- Pricing for cost calculations
- Context limits for percentage display
- OAuth status (to show $0 costs when using OAuth creds)
Previously, token costs and context percentages continued using the old
model's settings after a switch, causing incorrect display for:
- Users switching from paid to free/OAuth models
- Users switching between models with different pricing
Changes:
- Add UpdateModelInfo() method to UsageTracker
- Call UpdateModelInfo() in both SetModel callbacks (extension and UI)
- Add auth import for OAuth detection in root.go
Add width and count truncation to renderStreamingBashOutput to prevent
long-running commands from blowing up the TUI layout:
- Per-line width truncation via truncateLine() (ANSI-aware, matches final
bash tool renderer behavior)
- Display cap at maxBashLines (20) showing the tail (latest output)
- Truncation hint '...(N more lines above)' when lines are hidden
The buffer still accumulates up to 50 lines for context, but only the
last 20 are rendered during streaming. This is consistent with how the
final bash tool result is displayed.
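The tail-with-hint behavior can be sketched as below. Only the line-count cap is shown; the ANSI-aware width truncation done by truncateLine() is not reproduced here, and `tailWithHint` is an illustrative name.

```go
package main

import "fmt"

// maxBashLines is the display cap named in the commit.
const maxBashLines = 20

// tailWithHint returns at most max lines from the end of lines, preceded by
// a hint line when earlier lines were hidden.
func tailWithHint(lines []string, max int) []string {
	if len(lines) <= max {
		return lines
	}
	hidden := len(lines) - max
	out := make([]string, 0, max+1)
	out = append(out, fmt.Sprintf("...(%d more lines above)", hidden))
	return append(out, lines[hidden:]...)
}

func main() {
	var lines []string
	for i := 1; i <= 25; i++ {
		lines = append(lines, fmt.Sprintf("line %d", i))
	}
	out := tailWithHint(lines, maxBashLines)
	fmt.Println(out[0]) // ...(5 more lines above)
	fmt.Println(out[1]) // line 6
}
```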
Add maxLsLines (20) constant and truncate Ls output in the TUI to
prevent large directory listings from blowing up the layout. Shows a
'...(N more entries)' hint when truncated, consistent with all other
core tool renderers (Edit, Read, Write, Bash, Subagent).
- makeTheme() and fileConfigToTheme() now compute DiffInsertBg, DiffDeleteBg,
DiffEqualBg, DiffMissingBg, CodeBg, GutterBg, and WriteBg by blending the
theme's own Background with its Success/Error colors, so every theme gets
properly tinted diff backgrounds.
- Added color derivation helpers: parseHexColor, blendHex, deriveDiffBg.
- File-based themes still allow explicit diff color overrides; derived colors
are used only as fallbacks.
- formatToolParams() now skips body-content keys (content, old_text, new_text,
etc.) from the header line regardless of value length, preventing raw
unformatted code from appearing above the formatted body.
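The color derivation helpers named above might be implemented along these lines. The signatures, the linear blend, and the 0.15 ratio in `deriveDiffBg` are assumptions for illustration; the commit does not state them.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseHexColor parses "#rrggbb" into its three channels.
func parseHexColor(s string) (uint8, uint8, uint8, error) {
	s = strings.TrimPrefix(s, "#")
	if len(s) != 6 {
		return 0, 0, 0, fmt.Errorf("expected #rrggbb, got %q", s)
	}
	v, err := strconv.ParseUint(s, 16, 32)
	if err != nil {
		return 0, 0, 0, err
	}
	return uint8(v >> 16), uint8(v >> 8), uint8(v), nil
}

// blendHex mixes fg into bg linearly: ratio 0 yields bg, ratio 1 yields fg.
func blendHex(bg, fg string, ratio float64) (string, error) {
	ratio = math.Max(0, math.Min(1, ratio))
	br, bgr, bb, err := parseHexColor(bg)
	if err != nil {
		return "", err
	}
	fr, fgr, fb, err := parseHexColor(fg)
	if err != nil {
		return "", err
	}
	mix := func(a, b uint8) uint8 {
		return uint8(math.Round(float64(a)*(1-ratio) + float64(b)*ratio))
	}
	return fmt.Sprintf("#%02x%02x%02x", mix(br, fr), mix(bgr, fgr), mix(bb, fb)), nil
}

// deriveDiffBg tints the theme background with an accent color.
// The 0.15 ratio is a hypothetical choice.
func deriveDiffBg(background, accent string) (string, error) {
	return blendHex(background, accent, 0.15)
}

func main() {
	c, err := blendHex("#000000", "#ffffff", 0.5)
	fmt.Println(c, err) // #808080 <nil>

	insertBg, err := deriveDiffBg("#1e1e2e", "#a6e3a1")
	fmt.Println(insertBg, err)
}
```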
- Use process group isolation (Setpgid) so the entire process tree is
killed on timeout/cancellation, not just the direct child
- Set cmd.Cancel to kill the process group (-pgid) with SIGKILL
- Set cmd.WaitDelay (500ms grace period) to force-close pipes when
grandchild processes hold them open after the direct child exits
- Convert buffered path from cmd.Run() to explicit pipes + cmd.Start()
+ cmd.Wait() so WaitDelay can properly force-close pipe handles
- Reorder streaming path: cmd.Wait() before wg.Wait() so the WaitDelay
timer starts when the child exits, not after pipes close
- Add mutex for thread-safe chunk collection in streaming mode
- Add comprehensive tests for timeout, background processes, context
cancellation, and both buffered/streaming paths
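A minimal, Unix-only sketch of the process-group technique follows (it requires Go 1.20+ for cmd.Cancel and cmd.WaitDelay). For brevity it lets os/exec manage the output pipes through cmd.Stdout rather than using explicit pipes with Start and Wait as the commit does; `runWithTimeout` is an illustrative name.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// runWithTimeout runs script under sh and kills the whole process group
// when the timeout expires.
func runWithTimeout(timeout time.Duration, script string) (string, time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "sh", "-c", script)
	// Put the child in its own process group so descendants share its pgid.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	// On cancellation, signal the group (negative pid), not just the child.
	cmd.Cancel = func() error {
		return syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	}
	// Force-close output pipes if something still holds them open.
	cmd.WaitDelay = 500 * time.Millisecond

	var out bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &out

	start := time.Now()
	err := cmd.Run()
	return out.String(), time.Since(start), err
}

func main() {
	// The backgrounded sleep would keep the pipe open without group kill.
	out, elapsed, err := runWithTimeout(300*time.Millisecond, "echo start; sleep 30 & sleep 30")
	fmt.Printf("output=%q elapsed=%s err=%v\n", out, elapsed.Round(100*time.Millisecond), err)
}
```

Killing the negative pid targets every process in the group, which is what stops a backgrounded grandchild from outliving the timeout.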
Allow users to define custom models in ~/.kit.yml under the customModels
section. These models are automatically merged into the custom provider.
Example config:
customModels:
  my-model:
    name: "My Custom Model"
    reasoning: true
    temperature: true
    cost:
      input: 0.002
      output: 0.004
    limit:
      context: 128000
      output: 32000
Usage:
kit --model custom/my-model "Hello"
kit --provider-url "http://localhost:8080" --model custom/my-model "Hello"
Note: When --provider-url is specified without --model, kit defaults to
custom/custom. When --provider-url is specified WITH a custom model from
config, that model is used.
Bug fixes:
- Fixed kit.New() re-loading config file and overriding CLI-specified config
- Fixed models command to reload registry for custom models
When users pass --provider-url without --model, automatically default
to custom/custom instead of the saved model preference. This lets users
point kit at any OpenAI-compatible endpoint without needing a provider/model
pair from the database.
The custom/custom model has:
- Zero cost (input/output = 0)
- 262K context window, 65K output limit
- Reasoning and temperature support
- Routes through openaicompat fantasy provider
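The model selection rule described above reduces to a small precedence check, sketched here. `resolveModel` and its parameters are hypothetical names; only the custom/custom default comes from the commit.

```go
package main

import "fmt"

// resolveModel picks the model to use: an explicit --model wins, then
// custom/custom when only --provider-url is given, else the saved preference.
func resolveModel(providerURL, modelFlag, savedModel string) string {
	switch {
	case modelFlag != "":
		return modelFlag
	case providerURL != "":
		return "custom/custom"
	default:
		return savedModel
	}
}

func main() {
	fmt.Println(resolveModel("http://localhost:8080", "", "saved/model"))                // custom/custom
	fmt.Println(resolveModel("http://localhost:8080", "custom/my-model", "saved/model")) // custom/my-model
	fmt.Println(resolveModel("", "", "saved/model"))                                     // saved/model
}
```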
Previously, pressing ESC twice to cancel rolled back the entire tree
session to the pre-turn state, discarding the user message, completed
tool call/result pairs, and any streamed response. Content that had
already rendered in the TUI would vanish from the session history.
Now the cancellation path uses the same logic as the non-cancellation
error path: the user message (already persisted before generation) and
any completed step messages (fully-paired tool_use + tool_result from
OnStepFinish) are preserved. Only the in-progress pending message or
tool call is discarded.
This ensures that if a message has rendered in the TUI, it stays in
the history and session.
Add Background(theme.MutedBorder) to all text elements in reasoning
blocks: contentStyle, hintStyle, and footer styles. Previously these
only specified foreground colors, causing them to inherit the terminal's
default background instead of matching the box background.