- Include all token categories in context fill calculation:
  InputTokens + CacheReadTokens + CacheCreationTokens + OutputTokens
- With Anthropic/kimi prompt caching, InputTokens can be near-zero
  while CacheReadTokens holds the bulk of the context
- Include OutputTokens since assistant output becomes context next turn
- Remove max-only guard in SetContextTokens so context shrinks after
  compaction instead of staying stuck at the high-water mark
- Reset context tokens to 0 after compaction in both SDK and UI layers
- Use real API-reported token counts in ShouldCompact() instead of
  the chars/4 text heuristic, which misses system prompts and tool defs
This commit fixes several issues with token usage tracking:
1. Fix InputTokens-only validation bug: the check now passes when any
   token field is > 0, which handles OpenAI-compatible providers where
   cached prompts result in InputTokens=0 while OutputTokens>0
2. Remove per-step context token updates from recordStepUsage() - context
fill is now set once at turn completion via updateUsageFromTurnResult
using FinalUsage.InputTokens, preventing display jumps during multi-step
tool calls
3. Track maximum context seen in SetContextTokens() - prevents the status
bar from showing decreasing token counts when FinalUsage.InputTokens
reflects only the last step's input
4. Add comprehensive debug logging for token tracking at key points:
- StepUsageEvent emission
- recordStepUsage processing
- updateUsageFromTurnResult processing
5. Update tests to reflect new behavior:
- TestRecordStepUsage_updatesTracker: no longer expects context updates
- TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly: verifies
InputTokens-only tracking
All tests pass. Token tracking now correctly accumulates costs and shows
monotonically increasing context size.
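The validation change in item 1 could look roughly like the sketch below.
The Usage struct and the HasUsage name are hypothetical; the real check may
live inline in the tracker and cover a different set of fields.

```go
package main

// Usage is a hypothetical stand-in for the provider usage struct.
type Usage struct {
	InputTokens         int64
	OutputTokens        int64
	CacheReadTokens     int64
	CacheCreationTokens int64
}

// HasUsage reports whether the provider returned any usable token count.
// Checking only InputTokens > 0 would reject responses from providers that
// report InputTokens=0 for cached prompts while OutputTokens is positive.
func HasUsage(u Usage) bool {
	return u.InputTokens > 0 ||
		u.OutputTokens > 0 ||
		u.CacheReadTokens > 0 ||
		u.CacheCreationTokens > 0
}
```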
When switching models (e.g., via /model command or ctx.SetModel), the usage
tracker now updates its model info to reflect the new model's:
- Pricing for cost calculations
- Context limits for percentage display
- OAuth status (to show bash costs when using OAuth credentials)
Previously, token costs and context percentages continued using the old
model's settings after a switch, causing incorrect display for:
- Users switching from paid to free/OAuth models
- Users switching between models with different pricing
Changes:
- Add UpdateModelInfo() method to UsageTracker
- Call UpdateModelInfo() in both SetModel callbacks (extension and UI)
- Add auth import for OAuth detection in root.go
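A rough sketch of what UpdateModelInfo() enables. The ModelInfo fields, the
mutex, and the ContextPercent and Cost helpers are assumptions made for
illustration; only the UsageTracker and UpdateModelInfo names come from this
change.

```go
package main

import "sync"

// ModelInfo is a hypothetical bundle of the per-model settings the tracker
// needs: pricing, context limit, and OAuth status.
type ModelInfo struct {
	InputCostPerMTok  float64 // USD per million input tokens (assumed unit)
	OutputCostPerMTok float64 // USD per million output tokens (assumed unit)
	ContextLimit      int64
	IsOAuth           bool
}

// UsageTracker is a hypothetical minimal tracker.
type UsageTracker struct {
	mu            sync.Mutex
	info          ModelInfo
	contextTokens int64
}

// UpdateModelInfo swaps in the new model's settings so later cost and
// percentage calculations stop using the previous model's values.
func (t *UsageTracker) UpdateModelInfo(info ModelInfo) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.info = info
}

// SetContextTokens records the current context window size.
func (t *UsageTracker) SetContextTokens(n int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.contextTokens = n
}

// ContextPercent returns context fill as a percentage of the current limit.
func (t *UsageTracker) ContextPercent() float64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.info.ContextLimit <= 0 {
		return 0
	}
	return float64(t.contextTokens) / float64(t.info.ContextLimit) * 100
}

// Cost prices a token pair using the current model's rates.
func (t *UsageTracker) Cost(inputTokens, outputTokens int64) float64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return float64(inputTokens)/1e6*t.info.InputCostPerMTok +
		float64(outputTokens)/1e6*t.info.OutputCostPerMTok
}
```

Each SetModel callback would call UpdateModelInfo with the new model's
settings; without that call, ContextPercent and Cost keep reporting values
for the old model.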
- Fix context percentage: use FinalResponse.Usage (last API call) instead of
  TotalUsage (sum of all tool-calling steps), which overstated the context
  fill level
- Fix token count: display current context window tokens, not cumulative session
total, so the number and percentage tell a consistent story
- Fix script mode double-counting: app.updateUsage already updates the shared
tracker before sending StepCompleteEvent, so remove redundant
UpdateUsageFromResponse call
- Add sticky usage display in TUI: render in View() layout between stream and
separator instead of tea.Println so it updates in place
- Add usage display for non-interactive --prompt mode (non-quiet)
- Add SetContextTokens to UsageUpdater interface for separating billing tokens
(TotalUsage) from context utilization (FinalResponse.Usage)
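The billing/context split can be sketched as follows. The struct layouts,
the UpdateUsage method, and the applyTurn helper are hypothetical; counting
input plus output of the last call as the context size is also an assumption
of this sketch.

```go
package main

// Usage, Response, and TurnResult are hypothetical stand-ins for the
// agent's result types.
type Usage struct {
	InputTokens  int64
	OutputTokens int64
}

type Response struct {
	Usage Usage
}

type TurnResult struct {
	TotalUsage    Usage    // sum over every tool-calling step (billing)
	FinalResponse Response // last API call only (context utilization)
}

// UsageUpdater separates billing tokens from context utilization.
// UpdateUsage is an assumed method name; SetContextTokens is from the change.
type UsageUpdater interface {
	UpdateUsage(inputTokens, outputTokens int64)
	SetContextTokens(tokens int64)
}

// tracker is a minimal hypothetical implementation.
type tracker struct {
	billedIn, billedOut int64
	contextTokens       int64
}

func (t *tracker) UpdateUsage(in, out int64) {
	t.billedIn += in
	t.billedOut += out
}

func (t *tracker) SetContextTokens(n int64) {
	t.contextTokens = n
}

// applyTurn bills the summed usage but sizes the context from the last call,
// since summing every step would count the same prompt prefix repeatedly.
func applyTurn(u UsageUpdater, r TurnResult) {
	u.UpdateUsage(r.TotalUsage.InputTokens, r.TotalUsage.OutputTokens)
	last := r.FinalResponse.Usage
	u.SetContextTokens(last.InputTokens + last.OutputTokens)
}
```

For a turn with three steps of 1000/100, 1200/150, and 1500/200 input/output
tokens, billing sees 3700/450 while the context display sees 1700.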
- Delete dead ESC listener code and bubbletea/time imports from agent
- Remove internal/tokens/ package (empty stubs and trivial estimator)
- Inline token estimation into usage_tracker as unexported helper
- Remove dead EstimateAndUpdateUsageFromText method
- Remove 9 unsupported provider env var entries from registry
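For reference, an inlined unexported estimator might be as small as the
sketch below. It assumes the chars/4 heuristic mentioned elsewhere in this
log; the estimateTokens name and the use of rune counts are assumptions.

```go
package main

import "unicode/utf8"

// estimateTokens approximates a token count as characters divided by four.
// It only sees the text it is given, so it cannot account for system
// prompts or tool definitions sent alongside the conversation.
func estimateTokens(text string) int64 {
	return int64(utf8.RuneCountInString(text) / 4)
}
```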
* draft: rewrite single message when streaming (not full terminal)
* align the spinner better with dots in compact mode
* fix user messages
* handle usage display
* fix formatting
* bash highlighting
---------
Co-authored-by: Nate Woods <big.nate.w@gmail.com>